
Geostatistics and Advanced Property Modelling
Ashley Francis
Managing Director
Earthworks Environment & Resources Ltd

http://www.sorviodvnvm.co.uk

Technical Note: Coloured, Deterministic & Stochastic Inversion


2003 Earthworks Environment & Resources Ltd. All rights reserved.

INSTRUCTOR BIOGRAPHY
ASHLEY FRANCIS
Ashley is a geophysicist and geostatistician whose career has encompassed over 20 years of worldwide oil industry experience in exploration, development and production geophysics. He has also consulted to the nuclear and engineering sectors on subsurface definition and uncertainty. Since 1993 he has specialised in geostatistics in addition to geophysics.
Ashley has worked in or on behalf of service companies, consultancies and oil companies in North and South America, Europe, Africa, the Middle East, the Far East and Australia. He spent 5 years with LASMO plc in Technical Services assisting and advising asset teams worldwide in geophysics (particularly inversion), geostatistics, risk and uncertainty.
After leaving LASMO in 2001, Ashley founded Earthworks, a consultancy
specialising in subsurface geoscience, based in Salisbury, Wiltshire, UK. His
innovative ideas are now being developed in new and ultra-fast stochastic
seismic inversion software at Earthworks.
He lectured in Borehole Geophysics to Honours Graduates at the University of
the Witwatersrand, South Africa 1989-90 and was a Visiting Research Fellow
at the Post Graduate Institute in Sedimentology, University of Reading, UK
1995-97. He continues to teach geostatistics to MSc Petroleum Geoscience
students at Imperial College, London, a role he began in 1999. His current
geostatistical course has been running successfully since 1996 and the newer
inversion course since 2000. Both courses are also run as part of the
Geoscience Training Alliance consortium in London and in Houston and are
widely acclaimed by participants. Ashley is a committee member and regular
attendee at the SEG Development and Production Forum. He was Chairman
of the 2000 meeting and is Chairman of the forthcoming 2003 meeting to be
held in July in Big Sky, Montana. He presents widely at conferences on the
subjects of geophysics and geostatistics.
Ashley is a member of SEG, EAGE, IAMG, BSSS, IPSS, IAS, SPE and
PESGB. He is a Fellow of the Royal Astronomical Society.


LIST OF CONTENTS

Theory Manual
1. Statistics
2. Stochastic Theory
3. Stationarity
4. Estimation and Kriging
5. Stochastic Simulation
6. Deterministic, Stochastic and Best Estimate
7. Support
8. Entropy and Bayes Theorem
9. Cokriging

Advanced Property Modelling


10. Introduction and Framework
11. Stratton Field Introduction
12. Scale Up of Well Logs
13. Data Analysis
14. Facies Modelling
15. Variograms in Petrel
16. Stratton Field Facies Variogram Modelling
17. Seismic Variograms
18. Facies Simulation
19. Petrophysical Modelling
20. Seismic Inversion Constraints


STATISTICS
History of Geostatistics
Geostatistics arrived with the publication of two papers by Matheron in 1962 and 1963.
Matheron was a researcher at the Centre de Geostatistique, Fontainebleau, Paris and
is regarded as the founding father of geostatistics. His papers were a mathematical
formalism of earlier empirical work by Danie Krige. Krige, who published his Masters
thesis in 1951, worked on the problem of ore grade estimation in South African gold and
uranium mining. He became interested in the discrepancy between reserve estimation
reporting and the actual recovery of minerals. He was concerned about the magnitude
of the economic decisions that were based on the reserve estimation studies. His
work began a revolution in mining reserve evaluation methods that has continued to the
present day and includes our own industry.
Pierre Carlier coined the word krigeage to honour Krige's original work, a word that was later anglicised to kriging by Matheron in a paper published in 1963 for the benefit of
the English-speaking scientific community. Kriging is the name given in geostatistics
to a collection of generalised linear regression techniques for the estimation of spatial
phenomena.
For many years France dominated the world of geostatistics. Although still an important
research centre, other centres such as the Center for Reservoir Forecasting, Stanford,
USA and the Norwegian Computing Centre, Oslo, Norway are now achieving eminence,
particularly in hydrocarbon applications of geostatistics.


What is Geostatistics?
Geostatistics is to statistics as geophysics is to physics. If a theoretical geophysicist
were to read a treatise on geophysics he would risk getting the disappointing
impression that it is but a collection of particular cases concerning the theories of
field potential and of wave propagation. In the same way, a specialist in mathematical
statistics would not find much more in geostatistics than variances, covariances and
optimal linear estimators.
Geostatistics is a collection of generalised linear regression techniques for the
estimation of spatial phenomena. It is not statistics applied to geological data, nor is it
seismic attribute mapping, just another gridding algorithm or building of 3D reservoir
models. Geostatistics will have application in all these tasks and more, but they do
not define it.
Geostatistics is based on fundamental assumptions concerning the behaviour of
random variables in a spatial coordinate system. At its heart is a theory of regionalised
variables that postulates that the real world is a single realisation of a random
function. By making observations we can deduce some of the properties of the random
function model and use these properties to constrain our estimates of the realisation
at unmeasured locations. Unlike traditional spatial estimation methods, geostatistics
is not ad hoc: it is founded on clear hypotheses and assumptions about spatial variables.

Basic Definitions

A random variable or random function can be defined as a measurable quantity which can take any of a number of values xi, or a range of values, with a given probability distribution.
The Cumulative Distribution Function (CDF) F(x) is defined as the function which gives the probability of the random variable being less than (or alternatively greater than) a specified value x. In practice we approximate this function with the cumulative frequency.
The Probability Density Function (PDF) f(x) is defined as the function which gives the number of times one particular value xi, or range of values, of the random variable occurs. In practice we approximate this function with the histogram, a measure of frequency of occurrence within a class. A small sketch of both practical approximations follows below.
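As a small illustration of these two practical approximations (not part of the original notes), the following Python sketch builds a cumulative frequency curve and a class-frequency histogram from an arbitrary set of samples; the values in phih are invented for the example and numpy is assumed to be available.

    import numpy as np

    # hypothetical porosity-thickness samples (phiH, ft); any 1-D data set would do
    phih = np.array([5.6, 4.9, 7.2, 6.4, 5.1, 10.2, 3.7, 5.8, 6.1, 5.3])

    # empirical CDF: sort the data and attach cumulative frequencies i/n
    x_sorted = np.sort(phih)
    cum_freq = np.arange(1, len(x_sorted) + 1) / len(x_sorted)

    # histogram approximation of the PDF: relative frequency of occurrence per class
    counts, class_edges = np.histogram(phih, bins=5)
    rel_freq = counts / counts.sum()

    for value, f in zip(x_sorted, cum_freq):
        print(f"F({value:5.2f}) is approximately {f:.2f}")
    print("class boundaries:", class_edges)
    print("relative frequencies:", rel_freq)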

Feller (1950) describes a random variable as a function defined on a sample space. Examples of a random variable are:

Number of aces in a hand at Bridge
Multiple birthdays in a company of N people

We can also regard physical systems as random variables, such as:

Position of a particle under diffusion
Energy
Temperature

In fact we can extend this to include any physical system which we can measure. An
alternative expression would be a random function. The independent variable is a point
in the sample space which describes the outcome of an experiment. In the context of subsurface description, the earth is the outcome of the experiment comprising all of the natural (and man-made) processes operating on it up to that time, such as erosion, deposition, compaction and so forth.
[Figure: Histogram (frequency) and cumulative frequency plot of Porosity-Thickness, phiH (ft). Number of Data 55, mean 5.84, std. dev. 1.36, coef. of var 0.23, maximum 10.23, upper quartile 6.43, median 5.61, lower quartile 5.10, minimum 3.72.]

Assumptions in statistics
McBratney and Webster (1983) make the following observations concerning statistics and geostatistics:

Classical statistics assumes a non-increasing variogram with a sill equal to the population variance (the variogram will be explained below). In other words, classical statistics assumes that the samples are independent of each other.
Close observations of a random function will duplicate neighbouring information to some extent due to spatial correlation.
This means the effective number of samples is smaller than the actual count of samples.
Neighbouring samples minimise duplication if they are separated by a distance greater than the range of the variogram.

Univariate Statistics
Statistics is about summary description of data. Univariate statistics summarise a single variable. It is important to distinguish between the true population parameters (for example the population mean μ and variance σ²) and the statistics (such as the sample mean m and sample variance s²), which are estimates of the true population parameters based on data samples we have collected.

We call the summary parameters moments. The first moment is the mean or arithmetic average of the data which describes the central tendency (what statisticians call the


location) of the distribution. The most common way to calculate the arithmetic mean is to add up all the sample values and divide by the number of samples. We can write this as:

m = \frac{1}{n} \sum_{i=1}^{n} x_i

The terms mean, average and mathematical expectation are synonymous. The expectation can be denoted in a number of ways, including E[X], E(X) or μ.

The analogy between frequency of occurrence and probability allows us to make the following definition. Let X be a random variable assuming the values x1, x2, ... with corresponding probabilities p(x1), p(x2), .... The mean or expected value of X is defined by

E[X] = \sum_{i} x_i \, p(x_i)

provided that the series converges absolutely. If X is continuous rather than discrete then we would have an integral rather than a summation:

E[X] = \int x \, f(x) \, dx

The mean is the average of all the possible outcomes or measurements. It is our Best Guess at an unmeasured sample. For a closed data set, using the mean as a predictor will minimise the average squared difference between each value and the mean (least squares). If we extend this idea to estimating the mean from a subset of a larger population, then the mean estimated from the subset will minimise the squared prediction error if the population has a Gaussian (Normal) distribution.
The arithmetic mean is not the only way of describing the central tendency of a set of data or a distribution. Some variables, such as resistivity or permeability, do not average arithmetically and the geometric or harmonic mean may be more appropriate.
Other measures of location include the median. The median is the value of the middle
sample if the samples are arranged (ranked) in order. Half the values are above the
median and half below the median. In the case of an even number of samples the
median is conventionally defined as the arithmetic average of the two middle values.
The median is important in terms of probability density functions (pdfs) in that it defines
the value as likely to be exceeded as not (the P50). The median is often the estimator
for central tendency used in robust statistical methods because it is not sensitive to the
magnitude of the values, only to their order or rank.


The mode is the value that occurs most frequently. The tallest bar on a histogram
indicates where the mode is located. A data set may contain more than one mode.
However, for continuous sampled data the mode has no strict definition as its value
varies with the precision of the data values. In the case where no two samples
are exactly the same value, any sample could represent the mode. For categorical
variables, such as geological descriptors (sand, shale etc) the mode would be the most
frequently occurring class.
Permeability is the most important example in our industry of a non-arithmetic averaging property. For effective permeability (keff) the average permeability will range between the arithmetic mean and the harmonic mean (mh):

m_h = \frac{n}{\sum_{i=1}^{n} 1/k_i}

This is the blocks-in-series analogy to Ohm's law of electrical current. For angled flow, the effective permeability is close to the geometric mean (mg):

m_g = \left( \prod_{i=1}^{n} k_i \right)^{1/n}

if the probability density function (pdf) is log-normal. These examples are theoretical end members and in practice the effective permeability usually ranges between the arithmetic and geometric averages: for continuous layers keff is closer to the arithmetic mean and for random permeable blocks it tends towards the geometric mean. Marsily (1986) gives a good description of permeability estimation.
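As a purely illustrative sketch (the permeability values below are invented, not taken from the text), the arithmetic, geometric and harmonic means of a set of layer permeabilities can be compared directly; the ordering harmonic ≤ geometric ≤ arithmetic always holds for positive values.

    import numpy as np

    # hypothetical layer permeabilities in mD
    k = np.array([10.0, 150.0, 35.0, 500.0, 80.0])

    arithmetic = k.mean()                    # flow along continuous layers
    harmonic = len(k) / np.sum(1.0 / k)      # blocks in series (flow across layers)
    geometric = np.exp(np.mean(np.log(k)))   # typical of log-normal, "random" media

    print(f"arithmetic mean: {arithmetic:8.2f} mD")
    print(f"geometric mean:  {geometric:8.2f} mD")
    print(f"harmonic mean:   {harmonic:8.2f} mD")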


Declustering
The arithmetic mean is a sample estimate of the expected value. However, this is really only true if x1, x2, ... is a declustered sampling of a distribution, in which case the arithmetic mean is used as an estimate of the expected value. If we define the mean as a weighted sum then we can decluster it by appropriate assignment of weights.
In order to understand the assumptions in calculating the mean, a preferred way to write it is as

m = \sum_{i=1}^{n} w_i x_i

w_i = \frac{1}{n}

\sum_{i=1}^{n} w_i = 1

that is, as

a weighted linear combination, where each sample value xi is multiplied by an appropriate weight
an unbiasedness condition
a method of computing the appropriate weights

This definition is elegant because it clearly shows how the mean is a weighted linear estimator obtained by adding a weight contribution from each sample (first equation). The second equation illustrates that each sample is equally weighted. As we shall see, most algorithms for mapping work in the same way, except that the weights applied to each sample to calculate the estimate are not equal but depend on some other criteria. Equation two also shows that the weights are inversely proportional to the number of samples. Because of this it follows that the sum of the weights must be 1. This is the unbiasedness condition and is summarised in the final equation.
By continuing the analogy between frequency of occurrence and probability we can
consider the weights as probabilities. So if we have different groupings of samples we
may not want to give all samples equal weight in estimating the mean.
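A minimal numerical sketch of the weighted form of the mean may help here (the data, the grid cells and the cell-declustering weighting scheme are all assumptions for illustration, not a procedure prescribed by the text): equal weights reproduce the ordinary arithmetic mean, while giving each cell the same total weight down-weights clustered samples.

    import numpy as np

    # hypothetical sample values and the grid cell each sample falls in
    values = np.array([5.0, 5.2, 5.1, 9.0, 8.5])   # three clustered samples, two isolated ones
    cells = np.array(['A', 'A', 'A', 'B', 'C'])    # samples 0-2 share cell A

    # equal weights w_i = 1/n give the ordinary arithmetic mean
    n = len(values)
    equal_w = np.full(n, 1.0 / n)
    print("arithmetic mean:", np.sum(equal_w * values))

    # cell declustering: each occupied cell gets equal total weight, shared by its samples
    unique_cells, counts = np.unique(cells, return_counts=True)
    samples_in_cell = dict(zip(unique_cells, counts))
    w = np.array([1.0 / (len(unique_cells) * samples_in_cell[c]) for c in cells])
    assert abs(w.sum() - 1.0) < 1e-12              # unbiasedness condition: weights sum to 1
    print("declustered mean:", np.sum(w * values))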
The second moment of a distribution is the variance and it summarises the spread of
the values about the central tendency (expected value). The variance is the average
squared difference of the observed values from the mean:

\sigma^2 = \frac{1}{n} \sum_{i=1}^{n} (x_i - m)^2


Since it involves squared differences, a single extreme value in the data can have a
strong influence on the calculation of the variance. The variance is one of the most
important parameters used in geostatistics and we shall return to it in many guises
during the course.
The standard deviation is simply the square root of the variance. For a Normal
(Gaussian) distribution, the variance (and therefore standard deviation) have specific
meaning in terms of probabilities. A Normal distribution is completely specified by the
first and second moments (mean and variance).
Note that in geostatistics we use the 1/n definition of variance, which is the observed variation for a data set. If we require an unbiased estimate of the population variance then it can be obtained from

s^2 = \frac{1}{n-1} \sum_{i=1}^{n} (x_i - m)^2

The inter-quartile range (IQR) is another useful measure of the spread of observed
values. It is the difference between the upper and lower quartiles. Like the median,
and other percentile-based descriptors, the IQR is robust against the effect of extreme
values within the data. Some estimation techniques, such as fractal measures and
esoteric distributions such as Levy-Stable (of which the Normal is a special case), use
approximations based on discarding certain proportions of the tails of the distribution
in order to improve the estimation.
A measure of spread such as the standard deviation is also an estimate of the confidence we have in a prediction. We have already seen that the mean is a best estimator for an unmeasured value. The standard deviation is the confidence we have that a true value will be close to the prediction. For a variable with little variation, and thus a small standard deviation, our confidence will be high; but for a variable with a large variation our confidence that the true value is close to the prediction will be low. The standard deviation (or variance) has a specific meaning in terms of probability when we consider the Gaussian distribution (see below).
The coefficient of skewness captures the degree of asymmetry of a distribution:

\mathrm{skewness} = \frac{1}{n} \sum_{i=1}^{n} (x_i - m)^3 \, / \, \sigma^3

The numerator is the cubed average difference between data values and their mean and is the third moment. As such, the coefficient of skewness is very sensitive to erratic or outlying values. Consequently, sometimes the magnitude of skewness is ignored and
only the sign of the skewness is considered - positive skew indicating a long tail of high
values, negative skew a long tail of low values.
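The spread and shape statistics described above can be computed directly from a sample; a short sketch follows (illustrative values only, numpy assumed), using the 1/n variance definition favoured in geostatistics alongside the unbiased 1/(n-1) estimate, the inter-quartile range and the coefficient of skewness.

    import numpy as np

    x = np.array([3.7, 5.1, 5.3, 5.6, 5.8, 6.1, 6.4, 7.2, 10.2])   # illustrative data
    m = x.mean()

    var_n = np.mean((x - m) ** 2)                        # 1/n definition (observed variation)
    var_unbiased = np.sum((x - m) ** 2) / (len(x) - 1)   # unbiased population estimate
    std = np.sqrt(var_n)
    iqr = np.percentile(x, 75) - np.percentile(x, 25)
    skew = np.mean((x - m) ** 3) / std ** 3              # third moment scaled by sigma cubed

    print(f"mean {m:.3f}  variance(1/n) {var_n:.3f}  variance(1/(n-1)) {var_unbiased:.3f}")
    print(f"std dev {std:.3f}  IQR {iqr:.3f}  skewness {skew:.3f}")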


Central limit theorem


The central limit theorem is very important in statistics, but somewhat of a counterintuitive concept. The basic theorem provides the formal mathematical conditions
that support the experimental evidence that the distribution of sample means usually
approximates to a normal distribution, regardless of the nature of the distribution
function of the parent population.
What this means is that even if the distribution function of the parent data is distinctly
non-normal, if we take groups of samples and compute their means then the
distribution of means will tend towards a normal distribution. The consequence is that
we can formulate statistical tests to compare the sample mean against the population
or against means calculated for different sample sets. This allows us to establish
confidence intervals, for example to state the probability that two separate datasets are
from the same population.
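A quick numerical experiment (an illustration only, with an invented log-normal parent distribution) shows the theorem at work: the means of repeated groups of samples are far more symmetric than the parent and their spread shrinks roughly as sigma divided by the square root of the group size.

    import numpy as np

    rng = np.random.default_rng(0)
    parent = rng.lognormal(mean=0.0, sigma=1.0, size=100_000)   # strongly skewed parent

    group_size, n_groups = 50, 2000
    means = rng.choice(parent, size=(n_groups, group_size)).mean(axis=1)

    def skewness(a):
        return np.mean((a - a.mean()) ** 3) / a.std() ** 3

    print("parent skewness:        ", round(skewness(parent), 2))
    print("skewness of group means:", round(skewness(means), 2))
    print("parent std / sqrt(n):   ", round(parent.std() / np.sqrt(group_size), 3))
    print("std of group means:     ", round(means.std(), 3))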
The normal distribution is usually justified in representing the distribution of a sum of
additive effects possessing stochastic independence, such as instrumental errors,
regardless of the probability distribution of the individual effects. The independence
may be violated in many natural phenomena, thus accounting for the failure of many
spatial processes to follow normal distributions (Olea, 1991). For processes where the
effects are multiplicative rather than additive, a skewed distribution such as the lognormal may be more appropriate.
The statistician's infatuation with the normal distribution tends to focus interest away from the fact that, for real data, the normal distribution is often rather poorly realised, if it is realised at all. The main criticisms of assuming a normal distribution in real data cases are that it is a symmetrical distribution for additive processes (whereas earth science processes tend to be multiplicative) and that it is too light in the tails of the distribution. In other words, extreme values occur more frequently than predicted by a normal distribution of the same mean and variance as the data. Both of these issues can be tackled by normal score transforms (anamorphosis) and by richer families of
distributions such as the Levy-stable functions, of which the normal distribution is an
end member case.

Normal distribution
The normal or Gaussian distribution is often observed in the natural world, either for data as they come or for transformed data. The heights of a large group of people selected randomly will look normal in general, for example. The normal distribution is symmetric about its mean μ and has a standard deviation σ, which is such that practically all of the distribution (99.73%) lies inside the range μ − 3σ ≤ x ≤ μ + 3σ. The frequency function is:

f(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left( -\frac{(x-\mu)^2}{2\sigma^2} \right)


A standard normal distribution (written N(0,1)) is one for which the mean is 0 and the standard deviation is 1. To convert from a general normal variable x to a standard normal variable z, we set

z = \frac{x - \mu}{\sigma}

[Figure: Normal probability plot of Porosity-Thickness, phiH (ft), against cumulative probability (%).]


Normal score transform (anamorphosis)


We have already seen how we can represent any dataset as a cumulative distribution
function or cdf. The cumulative scale is essentially a linear probability scale between
0 and 1; in other words, it is a uniform distribution. Clearly if the data cdf followed a uniform distribution then it would plot as a straight line.
The probability points for a standard normal distribution N(0,1) are well known (they
can be easily obtained from statistical tables or a spreadsheet). It is therefore a
straightforward matter to take the original distribution, calculate the probability value on
the cdf for each sample and use this probability to compute the value with the same
probability in a standard normal distribution. We then replace each data value with its
standard normal value and have transformed the data from its original distribution to
a normal distribution. Clearly the process is reversible, although there is potential for
some slight loss of accuracy when back transforming the data.
Note that the normal score transform (sometimes called anamorphosis) does not
alter any spatial properties of the data, such as trends. Also, whilst the process is
reversible, if we work with our data and generate values which have a larger range than
the input data (as happens during a Monte Carlo process) then we have the problem of
how to back-transform values for which we do not know the correct cdf relation. This
problem is usually solved by making an assumption about the behaviour of the tails of
the distribution, such as by linear or power law extrapolation.
Although geostatistics does not rely on an assumption that the input variable distribution
be Gaussian, some procedures only work properly if the data are Gaussian. Similarly,
the term best estimate is only guaranteed to be best if the data are Gaussian. One
geostatistical procedure for which a Gaussian distribution is very important is sequential
Gaussian simulation (SGS).
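A minimal sketch of the transform, assuming scipy is available: each sample is ranked, the rank converted to a cumulative probability, and the probability mapped through the inverse standard normal CDF; the back transform interpolates against the sorted original values. Handling of tail values beyond the data range, discussed above, is deliberately left out.

    import numpy as np
    from scipy.stats import norm, rankdata

    def normal_score(x):
        # rank each value and convert the rank to a probability strictly inside (0, 1)
        p = (rankdata(x) - 0.5) / len(x)
        return norm.ppf(p)

    def back_transform(scores, original):
        # map the normal scores back by interpolating on the original cdf
        n = len(original)
        p_table = (np.arange(1, n + 1) - 0.5) / n
        return np.interp(norm.cdf(scores), p_table, np.sort(original))

    phih = np.array([3.7, 5.1, 5.3, 5.6, 5.8, 6.1, 6.4, 7.2, 10.2])
    z = normal_score(phih)
    print("normal scores:", np.round(z, 2))
    print("round trip:   ", np.round(back_transform(z, phih), 2))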

[Figure: Histogram, cumulative frequency plot and normal probability plot of Porosity-Thickness normal scores. Number of Data 55, mean 0.00, std. dev. 0.99, coef. of var undefined, maximum 2.36, upper quartile 0.67, median 0.00, lower quartile -0.67, minimum -2.36.]

Bivariate statistics
The scatterplot or crossplot is a graphical tool familiar to geoscientists, especially those
working with well and log data. The scatterplot is the most common way of visualising
bivariate data. Bivariate refers to data for which there are two measured attributes for
each sample. Multivariate data would have many attributes measured for each sample.


In simple terms, three types of variability can be observed on a scatterplot. The variables can be positively or negatively correlated, or uncorrelated. Positive correlation
is where larger values of one variable are associated with the larger values of the
second variable. A negative correlation is where the larger values of one variable
are associated with the smaller values of the second variable. If the variables are
uncorrelated, then an increase or decrease of one variable will appear to have no effect
on the other variable.
The most common statistic to summarise the relationship between two variables is the correlation coefficient:

\rho = \frac{\frac{1}{n}\sum_{i=1}^{n}(x_i - m_x)(y_i - m_y)}{\sigma_x \, \sigma_y}

The number of data is n; x1, ..., xn are the data values for the first variable, mx is their mean and σx is their standard deviation; similarly for y. The correlation coefficient may be given the symbol r or R as an alternative to ρ.
The numerator of the above equation is known as the covariance:

C_{xy} = \frac{1}{n}\sum_{i=1}^{n}(x_i - m_x)(y_i - m_y)

The covariance is often used as a summary statistic of a scatterplot and also forms an
integral component of geostatistical analysis, particularly in the key step of establishing
measures of spatial continuity. Note that the covariance between two variables
depends on the magnitude of the data, whereas the correlation coefficient is rescaled
to lie in the interval -1 to +1 by normalising with the standard deviations of the two
variables.
The correlation coefficient is only a direct measure of dependence or independence when it takes on either of its two extreme values. If X and Y are independent then r = 0. If r = ±1 then y and x are dependent (there exists a linear functional relationship between y and x). Note that r = 0 does not necessarily imply independence.
The magnitude of the correlation coefficient (r) between these two extremes has no special significance - it is a relative measure. The quantity r² is a useful approximation for the relative increase in dependence. R² should be interpreted as the amount of variability of one attribute that could be predicted from knowledge of the second attribute. Take note of this: it means that for a correlation of 0.71, only half of the variability can be explained from the other variable and half cannot. Similarly, for a correlation coefficient of 0.5, only 25% of the variability can be explained; 75% cannot be explained from knowledge of the second variable.
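As a short illustration (synthetic data, numpy assumed) of the covariance, the correlation coefficient and the r² interpretation just described:

    import numpy as np

    rng = np.random.default_rng(1)
    x = rng.normal(size=200)
    y = 0.7 * x + rng.normal(scale=0.7, size=200)      # two moderately correlated variables

    cov = np.mean((x - x.mean()) * (y - y.mean()))     # covariance (1/n definition)
    r = cov / (x.std() * y.std())                      # correlation coefficient
    print(f"covariance {cov:.3f}  correlation {r:.3f}  r squared {r * r:.3f}")
    # r squared is the proportion of the variability of one variable
    # that can be explained from knowledge of the other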


For example, one study showed that the r² between seismic relative acoustic impedance (AI) and porosity was 0.3, and that it increased to 0.6 when seismic AI plus a low frequency model was compared to porosity. This suggests that, to a first order approximation, the low frequency model is of about equal importance to the high frequency information extracted from the seismic in estimating AI.
It cannot be stressed too strongly that a statistical correlation cannot, under any
circumstances, be used to establish or infer a causal relationship between x and
y. At best a correlation can be used to guide the formulation of physical models or
hypotheses that might form a predictive tool whose operational constraints (accuracy/
precision) might be summarised in some statistical way. Note also that the correlation
coefficient is only indicative of a linear dependence between y and x. A non-linear
dependence is not considered.
An alternative and useful test that can check for correlation between two variables
irrespective of the form of any relation (linear or non-linear) is the rank correlation
coefficient. This is based on comparing the rank values instead of the measured
values for each attribute, using the following equation

An algorithm for computing rank correlation can be found in Press et al (1992). If two
variables exhibit a significant rank correlation then it can be taken as strong evidence
that some association exists between them, although not the form of the association.
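One common way to compute the rank correlation, consistent with the description above though not necessarily the exact formula the text refers to, is simply to replace each value by its rank and apply the ordinary correlation coefficient (scipy.stats.spearmanr gives the same answer); the cubic relation below is only an example of a monotonic but non-linear association.

    import numpy as np
    from scipy.stats import rankdata

    def rank_correlation(x, y):
        # Spearman rank correlation: the Pearson correlation of the ranks
        return np.corrcoef(rankdata(x), rankdata(y))[0, 1]

    x = np.linspace(0.1, 5.0, 50)
    y = x ** 3                                   # monotonic but strongly non-linear

    print("linear correlation:", round(np.corrcoef(x, y)[0, 1], 3))
    print("rank correlation:  ", round(rank_correlation(x, y), 3))   # exactly 1.0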

Computing correlated variables

It is straightforward to create a second variable Y correlated with an initial variable X as follows.

If x1 is an N[0,1] variable, we can create any other Gaussian variable X with mean μ and standard deviation σ using:

X = \mu + \sigma x_1

To create a second variable Y which is correlated with X, we need another independent N[0,1] variable x2 and combine them as follows, where ρ is the correlation coefficient:

Y = \rho x_1 + \sqrt{1 - \rho^2} \, x_2

X and Y will then be correlated with each other according to the chosen correlation coefficient ρ.
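A sketch of this construction with illustrative choices for the mean, standard deviation and target correlation; the sample correlation of the generated pairs should come out close to the chosen value of ρ.

    import numpy as np

    rng = np.random.default_rng(42)
    mu, sigma, rho = 10.0, 2.0, 0.8          # illustrative choices
    n = 10_000

    x1 = rng.standard_normal(n)              # N(0,1)
    x2 = rng.standard_normal(n)              # second, independent N(0,1)

    X = mu + sigma * x1                      # Gaussian variable with chosen mean and std dev
    Y = rho * x1 + np.sqrt(1.0 - rho ** 2) * x2   # N(0,1), correlated with X at rho

    print("sample correlation of X and Y:", round(np.corrcoef(X, Y)[0, 1], 3))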


Spurious Correlation
Whenever two variables are plotted against each other in the form of a crossplot, it is
common practice to calculate a line estimating the linear relation of the two variables
and to compute the correlation coefficient between the two variables.

The correlation coefficient measures the degree of dependence between the two
variables. It is used for a number of purposes such as selecting seismic attributes
to assist in predicting reservoir properties and for choosing a functional relationship
for time to depth conversion. The problem is that the correlation coefficient is often
estimated using only a small number of sample points - the data at well locations. As
with any statistical measure, as the sample size decreases so the range of the true
parameter, for a given confidence level, increases.
A test for spurious correlation is made using a two-tailed Student's t-test with Nwells − 2 degrees of freedom. This is very simple to calculate in Excel, and the formula is given below. Three cells are set up for entering the values for number of wells, correlation coefficient and number of attributes considered. A fourth cell contains the formula to calculate the probability of spurious correlation (Psc). The calculation cell contains the following entry:
=1-(1-(TDIST((R * (SQRT(Nwells-2))/(SQRT(1-R^2))), Nwells-2, 2)))^Nattr
where Nwells, R and Nattr are the 3 cell references where the values are entered.
The Psc may be expressed as a percentage. Alternatively, a Java Applet for the same
computation can be found at http://www.sorviodvnvm.co.uk.
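The same calculation can be written as a small Python function (assuming scipy is available); Excel's TDIST(t, df, 2) corresponds to the two-tailed tail probability 2 * t.sf(t, df). The example values are taken from the table that follows.

    from math import sqrt
    from scipy.stats import t

    def prob_spurious_correlation(r, n_wells, n_attributes=1):
        # probability of at least one spurious correlation of magnitude >= |r|
        t_stat = abs(r) * sqrt(n_wells - 2) / sqrt(1.0 - r ** 2)
        p_single = 2.0 * t.sf(t_stat, df=n_wells - 2)      # two-tailed Student's t test
        return 1.0 - (1.0 - p_single) ** n_attributes

    # example: R = 0.719 observed at 7 wells
    print(f"{prob_spurious_correlation(0.719, 7, 1):.3f}")   # about 0.068 (6.8%)
    print(f"{prob_spurious_correlation(0.719, 7, 3):.3f}")   # about 0.192 (19.2%)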


AI vs                 R (7 wells)    Rmin      Rmax     Psc (1 attribute)   Psc (3 attributes)
Porosity                 0.913       0.511     0.987         0.4%                1.2%
Porosity-Thickness       0.939       0.634     0.991         0.1%                0.5%
Thickness                0.719      -0.074     0.954         6.8%               19.2%

The number of attributes shows the effect of trying more and more correlations with
different attributes until a correlation is found. The chance of finding a spurious
correlation increases as the number of comparisons increases. The tables overleaf
show the Psc for a range of values of R and Nwells.

Table shows probability of spurious correlation for a given R (left hand column)
and number of samples (column headings).

Table shows minimum R having a probability of spurious correlation of 5% for given


number of attributes (left hand column) and number of samples (column headings)

Note that for spatially correlated variables the results given above will be overly optimistic, because the degrees-of-freedom calculation assumes there are N independent samples.
If the samples are close enough together then they may be spatially dependent and so
N is effectively reduced, rendering some data in the regression redundant and therefore
the chance of spurious correlation will increase. In other words, real data cannot be
better than given here and will often be worse!
Whenever a correlation is used for estimation, the correlation coefficient and the Psc (or
alternatively, the number of samples) must be reported if the meaning of the correlation
is to be communicated to the reader.
In addition to computing the Psc, we can also calculate, for a given confidence interval, the limits of the true correlation between the variables. This is an equivalent calculation: the Psc calculation computes the probability of this confidence interval including zero correlation. The correlation range is computed approximately using Fisher's Z-transform. The already mentioned Java Applet gives the minimum and maximum R for a 95% confidence interval about the input R.
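A sketch of the Fisher Z-transform interval calculation (the 1.96 multiplier assumes a two-sided 95% interval and the usual standard error 1/sqrt(n − 3)); the example values reproduce, approximately, the confidence intervals quoted in the reprinted article later in this chapter.

    import numpy as np

    def correlation_interval(r, n, z_crit=1.96):
        # approximate 95% confidence interval for a sample correlation r from n points
        z = np.arctanh(r)                    # Fisher Z-transform of r
        se = 1.0 / np.sqrt(n - 3)            # standard error of Z
        return np.tanh(z - z_crit * se), np.tanh(z + z_crit * se)

    print([round(v, 2) for v in correlation_interval(0.80, 10)])   # roughly [0.34, 0.95]
    print([round(v, 2) for v in correlation_interval(0.80, 5)])    # roughly [-0.28, 0.99]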

PREDICTION ERRORS

                        Actually uncorrelated                  Actually correlated
Select correlation      Type I Error                           Correct decision (no error occurs)
Reject correlation      Correct decision (no error occurs)     Type II Error

When choosing to use or ignore a correlation, there are four possible outcomes. If we
elect to use a correlation and the variables are genuinely correlated we have made
a correct choice and reduced uncertainty. If the variables are not correlated and we
reject the correlation we have correctly avoided being misled by an inappropriate
relation.
However, there are two possible prediction errors that we can make, depending on
our choice.

In a Type I error we elect to use a correlation but in fact the variables are not
correlated. This results in a prediction using the erroneous variable which is less
accurate but more precise. In other words, we are mistakenly led to reduce the uncertainty, resulting in an inaccurate prediction made with confidence.
With a Type II error we reject a correlation, even though it would have been
useful in reducing uncertainty. The result is a prediction which is both less
accurate and less precise than if we had used the correlation. We have a
prediction with more uncertainty than necessary because we have rejected
useful information.


In general, joint variation is too complex to be summarised by a single coefficient. In expressing a quality, it should be remembered that this is not the same as measuring it. For example, if X and Y are independent of each other then the correlation coefficient is zero. However, the converse is not necessarily true: if the correlation coefficient is zero it cannot be concluded that X and Y are necessarily independent. Finally, the correlation coefficient is a relative indicator of dependence. The most useful description is the R² value, which expresses the proportion of variance which is shared between the two variables.


Potential risks when using seismic attributes as predictors of reservoir properties
CYNTHIA T. KALKOMEY, Mobil E&P Technical Center, Dallas, Texas (reprinted from The Leading Edge, March 1997)

The advance of our ability to generate seismic attributes and the growing emphasis on production geophysics has led to the widespread use of seismic attributes as predictors of reservoir properties. In many
cases, we can show using seismic modeling or rock
physics a physically justifiable relationship between
a seismic attribute and the reservoir property of interest. When this is true, we are able to greatly reduce
the uncertainty of interwell predictions of reservoir
properties.
The simple flow diagram in Figure 1 illustrates the
basic process. The first critically important step is accurately tying the well and seismic both vertically and
areally. We then must choose which seismic attributes
we believe to be related to the reservoir property. Using
that attribute or set of attributes, we use the dense seismic data to guide the prediction of reservoir properties
between sparse well data. A number of methods can be
used for the prediction step: linear or nonlinear
regression, geostatistics, or neural networks. The purpose is estimating in-place hydrocarbon volumes or
making reservoir management decisions such as
location and number of wells, depletion strategy, or gas
and water injection operations. The problem with the
process is: How do we identify which seismic attributes to use?
The problem. All of the prediction methods (regression, geostatistics, and neural networks) require making inference from seismic data at a small number of wells to a larger population which we assume is represented by that sample. Enthusiastic practitioners can now generate 10s, even 100s, of seismic attributes.
An all too common practice is identifying which seismic attributes to use solely on the strength of their
observed correlations with reservoir properties measured at the wells. Often these correlations are estimated by a very small number of observations. As with
any parameter, when the sample size is small, the
uncertainty about the value of the true correlation can
be quite large.
To get an idea of how large this uncertainty can be,
consider a sample correlation (r) of 0.80 calculated from
10 data points. The 95% confidence interval estimate of
the true correlation is [0.34, 0.95]. The only statement
we can make with (95%) confidence is that the true correlation between the reservoir property and seismic
attribute is between 0.34 and 0.95. If we have only five
data points instead of 10, the 95% confidence interval
widens to [-0.28, 0.99]. Since this interval contains zero,
we cannot say with confidence that there is any correlation between the reservoir property and the seismic
attribute.
Most practitioners are already aware that the smaller the sample, the greater the uncertainty about the true value of the correlation. A lesser known phenomenon is something statisticians call experiment-wise error rates. This just means that as we generate an ever
increasing number of attributes, the greater the chance
of observing at least one spuriously large sample correlation. You can get the idea of this if you consider an
experiment where you draw a small number of observations from two completely unrelated variables, say the annual birthrate of elephants and pounds consumed of a food substance. If we randomly select five
years and observe the birthrate of elephants and
pounds of sugar consumed, we would probably correctly infer that there is no relationship between the
two variables. But if we continue to test this for all possible food types, by chance alone we will likely find at
least one food type that appears, on the basis of the
sample correlation, to be related to the birthrate of elephants. When this happens, we say we've encountered a spurious correlation.
Definition: A spurious correlation is a sample
correlation that is large in absolute value purely by
chance.

Figure 1. Simple flow diagram showing use of seismic attributes to predict reservoir properties.

For seismic attributes and reservoir properties this


means we have happened to sample at locations that
yield a large sample correlation when in truth they are



[Table 1. Probability of spurious correlation between a reservoir property and a single seismic attribute. Rows: sample correlation R; columns: sample size.]
actually uncorrelated. How likely is it that this could occur?

Probability of observing a spurious correlation. Consider a single seismic attribute as a possible predictor. The probability of observing the absolute value of the sample correlation (r) greater than some constant (R), given that the true correlation (ρ) is zero, can be found from

p_{sc} = \Pr(|r| \geq R) = \Pr\left( |t| \geq \frac{R\sqrt{n-2}}{\sqrt{1-R^2}} \right)

where n is the sample size or number of locations with measurements of both the reservoir property and the seismic attribute, and t is distributed as a Student's t variate with n − 2 degrees of freedom.
variate with n-2 degrees of freedom.
Note that the probability of a spurious sample correlation depends only on R (the magnitude of the spurious sample correlation) and n (the number of well
measurements). An assumption of this relationship is
that the n observations are drawn randomly. The reason for a random sample is to insure that the sample
is representative of the population under study and
that the amount of independent information available
to estimate the true correlation is maximized. If in fact
the data are spatially correlated, the relationship gives
a conservative estimate of the probability of a spurious
correlation. This is because the effective sample size will
be smaller than the actual sample size; and as n
decreases, the probability of a spurious correlation
increases. Table 1 shows the probability of observing
a spurious correlation for different magnitudes of the
sample correlation (R) and different sample sizes (n).
Now that we've quantified the probability of a spurious correlation when considering a single seismic
attribute, we can calculate the probability of at least one
spurious correlation when considering a set of k independent attributes. The probability of observing at least
one spurious correlation when considering a set of k
independent attributes is simply 1 minus the probability that none of the sample correlations are spurious. This
can be expanded to the summation

[Tables 2-5. Probability of observing at least one spurious sample correlation from a given number of independent seismic attributes. Table 2: five independent attributes; Table 3: ten; Table 4: twenty; Table 5: forty. Rows: sample correlation R; columns: sample size.]

Figure 2. Probability of observing at least one spurious sample correlation of magnitude (a) 0.6, and (b) 0.4, when actually the well data and seismic attributes are uncorrelated.

Figure 3. Possible outcomes when selecting a seismic attribute as a predictor of a reservoir property.

1 - (1 - p_{sc})^{k} = \sum_{i=1}^{k} p_{sc} (1 - p_{sc})^{i-1}

where p_sc is the probability of a spurious correlation for a single attribute and k is the number of independent seismic attributes considered.

From the summation we can see that the penalty for increasing the number of attributes considered from k − 1 to k is

p_{sc} (1 - p_{sc})^{k-1}

Tables 2-5 give the probability of observing a spurious sample correlation for one or more seismic attributes from a set of k = 5, 10, 20, and 40 independent
attributes. Each table gives the probability of observing
a sample correlation greater than or equal to 0.1 to 0.9
when the reservoir property is actually uncorrelated
with the attributes, given a sample size of 5, 10, 15, 20,
25, 35, 50, 75, or 100. Note, we are assuming that the
seismic attributes are independent. If in fact the attributes considered are correlated with each other, then the
probabilities of observing at least one spurious correlation shown in the tables will be too large. This is because
the effective number of independent seismic attributes will
be smaller than the actual number of attributes considered and as k decreases, the probability of at least one

spurious correlation decreases. To avoid this, one


should use the table where k equals the number of independent linear combinations of the set of seismic attributes considered.
These tables can be used to assess the risk of selecting a seismic attribute that is actually uncorrelated with
the reservoir property being predicted. For example,
look at the column headed 5 and row labeled 0.60 of
Table 3. We see that, given we have only five well measurements, there is a 96% chance of observing a sample
correlation coefficient of 0.60 for at least one of the 10
seismic attributes considered, when no correlation actually exists between the attributes and the property. If we
double the number of wells to 10, the probability of at
least one spurious correlation of magnitude 0.6 or more
drops to 0.50.
What happens if we increase the number of independent seismic attributes considered from 10 to 20?
Table 4 shows that the chance of observing at least one
sample correlation of 0.60 or more in absolute value
from a sample of size five, when no correlation actually
exists between the attributes and the reservoir property,
is almost 100% (it's actually 0.999). Even if we double the
number of wells to 10, in this case there is still a 75%
chance of at least one spurious correlation.
Factors influencing the probability of a spurious correlation. Figure 2 reveals the factors that influence the
probability of observing spurious correlations. Figure 2a
shows the probability of observing at least one sample
correlation greater than or equal to 0.6 when the actual
correlation is zero. Figure 2b presents the same results
for an observed sample correlation of 0.4.
From Figure 2, we see that the probability of observing one or more spurious correlation increases as
the number of independent well measurements
decreases, or
the number of independent seismic attributes considered as potential predictors increases, or
the absolute value of the observed sample correlation decreases.
Associated risks. Statistical decision theory defines risk as the expected loss due to our uncertainty about the true state of nature under the possible decision alternatives or actions. Figure 3 illustrates the possible outcomes which must be considered in order to assess the risk associated with choosing seismic attributes to predict reservoir properties.

A Type I Error occurs if no relationship truly exists between the seismic attribute and the reservoir property of interest, yet we select the seismic attribute as a predictor. A Type II Error occurs if a physical relationship does exist between the seismic attribute and the reservoir property of interest, but we fail to use the seismic attribute as a predictor.

The previous section quantified the probability of a Type I Error occurring when the selection criterion is based solely on the magnitude of the sample correlation between a seismic attribute and a reservoir property. (Calculating the probability of a Type II Error occurring requires additional assumptions about the true magnitude of the correlation between the well property and the seismic attribute.) To quantitatively assess risk, we would have to assign the cost or economic consequence of each of the four possible outcomes. The costs will obviously be situation dependent; however, we can generalize the cost in a qualitative sense.

The cost of a Type I Error (using a seismic attribute to predict a reservoir property when they are actually uncorrelated) is:
Inaccurate prediction biased by the attribute.
Inflated confidence in the inaccurate prediction; apparent prediction errors are small.

The cost of a Type II Error (rejecting a seismic attribute for use in predicting a reservoir property when in fact they are truly correlated) is:
Less accurate prediction than if we'd used the seismic attribute.
Larger prediction errors than if we'd used the attribute.

I believe that for most cases the economic consequences of making highly confident but inaccurate predictions are more severe than the consequences of a Type II Error.

Figure 4. Scatter plot of porosity versus seismic amplitude at the 10 well locations.
Figure 5. True spatial distributions. (a) True distribution of porosity ranging from 0 (dark blue) to 20% (red). Well locations are indicated by black dots. (b) Seismic attribute over the same area.
Figure 6. Scatter plot of porosity versus seismic amplitude at all locations.
Figure 7. Estimate of porosity generated using collocated cokriging with the seismic attribute as the covariate.
Figure 8. Estimation errors. (a) Areas overestimated by more than five porosity units. (b) Areas underestimated by more than five porosity units.
A simple example. To illustrate the consequences of a
Type I Error, consider a simple example. Suppose you
have measurements of porosity and a seismic attribute


extracted at 10 wells. Figure 4 shows a scatter plot of


porosity versus the seismic attribute with a sample correlation of 0.82. It could be very tempting to proceed
with the prediction using this relationship.
The true spatial distribution of porosity over the
area is shown in Figure 5a; Figure 5b shows a seismic
attribute over the same area. If we look at the complete
scatter plot over the entire area shown in Figure 6 we
see a shotgun pattern indicating the true correlation of
zero. By happenstance, the field has been drilled in
locations which makes it appear that there is a strong
relationship between this attribute and porosity when
in fact there is none. We have observed a spurious sample correlation.
If we use this attribute for prediction, our resulting
estimate of porosity will bear a strong resemblance to
the seismic attribute regardless of whether we used
regression, geostatistics or neural networks. Figure 7
shows such an estimate, generated using collocated
cokriging.
This prediction results in large actual estimation
errors. Figure 8a highlights areas where the porosity is
overestimated by 5-12 porosity units. Figure 8b shows
areas where the porosity is underestimated by 5-12 porosity units. As you can imagine, drilling locations chosen on the basis of this prediction will be less than optimal. This could also be disastrous for other reservoir management decisions. For example, if an injection strategy was designed on the basis of this prediction, it would likely be chosen about 90° from the optimal orientation.


Conclusions. There are two main points to remember
from this work. First, the probability of observing spurious sample correlations between a seismic attribute
and well data can be quite large if the number of independent well measurements is small or the number of
independent attributes considered is large.
Secondly, when the probability of a spurious correlation is large, then selection of seismic attributes
based solely on empirical evidence is risky: it can
lead to highly confident, but highly inaccurate predictions and thus poor business decisions. Therefore, it is
strongly recommended, especially when the number
of wells is small, that only those seismic attributes
that have a physically justifiable relationship with
the reservoir property be considered as candidates for
predictors.
Cynthia Kalkomey received a doctorate in statistics from Southern
Methodist University (1991). She joined Mobil in 1979 and is currently manager of Reservoir Characterization, Mobil Exploration and
Producing Technical Center.
Corresponding author: Cynthia Kalkomey, Mgr., Reservoir Characterization, Mobil Exploration and Producing Technical Center,
13777 Midway Rd., Dallas, TX 75244; phone 972-851-8598; fax
972-851-8703.


How many seismic attributes?


When using techniques such as multilinear regression or correlation, we should pause
to consider just how many attributes a seismic dataset may have. Chen and Sidney
(1997) published a comprehensive list of possible seismic attributes which could be
determined from seismic data. Clearly, there are an infinite number of attributes which
could be ingeniously defined, but seismic data cannot have an infinite information
content to be given up. This leads us to pose the question: how many independent seismic attributes are there?
In the seminal paper on complex trace attributes, Taner et al (1979) open by clearly
stating the application of complex trace analysis to seismic data as transformations of
data from one form to another to extract significant information. They clearly state the
benefit of these transformations: Interpreting data from different points of view often
results in new insight and the discovery of relationships not otherwise evident.
Applying a transformation to data does not result in generating new variables. For
example, if we have variables X and Y and multiply Y by a constant, we do not have
a new variable, merely a scaled version of the previous version of Y. If X and Y
were linearly related, then the new Y is also linearly related. However, consider that
if ln(Y) was linearly related to X then applying a log transform to Y would enable the
nature of the relationship to be more easily recognised. In other words, transforms
do not result in new variables, they merely assist in highlighting relationships between
different variables.
Given a seismic trace, when we take the Fourier transform we do not obtain additional
independent variables. Through the FFT, amplitude and phase are functionally (but not
linearly) related to the real and imaginary parts of the transformed trace. However, by
computing amplitude and phase spectra we can observe features of the trace data in a
way that is not possible in the time domain. Our data remains the same, only our view
on the data changes.
So for post-stack data, how many seismic attributes are there? Consider the basic equation for a seismic trace where the real seismic trace f(t) is expressed in terms of a time-dependent amplitude A(t) and a time-dependent phase θ(t):

f(t) = A(t)\cos\theta(t)

The trace clearly has two attributes, amplitude and phase. (Of course, pre-stack seismic data contains additional information in the offset data.) We can derive another
attribute, which is the time difference between events on the trace, providing they are
sufficiently far apart not to be autocorrelated. A wedge model illustrating tuning is a
good example of how the interval time becomes dependent when the two events are
close enough together that the autocorrelation (wavelet) becomes important.
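As an aside (not from the original text), the amplitude and phase of a post-stack trace are conventionally obtained from the analytic (complex) trace; the sketch below uses scipy.signal.hilbert on an invented synthetic trace and checks that the trace is recovered as A(t)cos(θ(t)).

    import numpy as np
    from scipy.signal import hilbert

    # synthetic "trace": a 30 Hz oscillation under a smooth envelope, 2 ms sampling
    t = np.arange(0.0, 1.0, 0.002)
    trace = np.exp(-((t - 0.5) ** 2) / 0.01) * np.cos(2 * np.pi * 30 * t)

    analytic = hilbert(trace)            # f(t) + i * H[f](t)
    amplitude = np.abs(analytic)         # instantaneous amplitude A(t), the envelope
    phase = np.angle(analytic)           # instantaneous phase theta(t)

    # the original trace is exactly A(t) * cos(theta(t))
    print("max reconstruction error:",
          float(np.max(np.abs(trace - amplitude * np.cos(phase)))))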
Consider that the most unsymmetrical material has a maximum of 21 elastic constants.
As the degree of symmetry increases, the number of independent elastic stiffnesses
decreases to 9 for an orthotropic material, 5 for a transversely isotropic material and 2
for an isotropic material (Gregory, 1977).
Clearly there is a benefit in transforming seismic through complex trace analysis in
trying to discover useful relationships. However, it should not be assumed that new

variables are gained in this way. The variables are all wholly dependent on each other,
although sometimes cryptically because of the nature of the transform itself. Methods
such as multi-linear regression rely on the input variables being independent. There is
no value in using two variables which are themselves correlated; almost certainly one will do.
There are other dangers in taking multiple derivatives of a seismic trace, through
complicated transforms. One is that the probability of generating a spurious correlation
increases the more comparisons are made. Also, high order derivatives are much
more sensitive to noise. It might be tempting in a geostatistical study to find reasons
to reject outlying samples caused by noise, which could give rise to a very high (but spurious!) correlation coefficient for the remaining samples.

STOCHASTIC THEORY
Random function model
There are essentially two models which we can use conceptually to describe data and
samples. We will illustrate the difference between these two models by considering
the quantity porosity.
The notion of porosity is simple to understand but it poses some problems if we want
to define it with precision. There are two accepted ways of defining local properties of
a medium: the notion of the representative elementary volume (REV) and that of the
random function (RF). These two alternative models implicitly influence descriptions
of spatial variation.

[Figure: the REV concept and the Random Function concept.]

The problem stems from the fact that porosity, like many quantities, cannot be defined
or measured at single points, since a porous medium is a conglomeration of solid grains
and voids. Below a certain scale of volume, porosity has no physical significance.
The REV model consists of saying that we give to one mathematical point in space the
porosity of a certain volume of material surrounding this point, the REV, which will be
used to define and possibly measure the mean property of the volume in question.
This concept involves an integration in space. It is obviously the first method that
comes to mind and behind it lies the idea of a sample of finite size which is collected
and from which the relevant property is estimated by measurement.
The size of the REV is defined by saying that it is:

Sufficiently large to contain a great number of pores, so as to allow us to define a mean global property while ensuring that the effect of the fluctuations from one pore to another is negligible.

Sufficiently small so that the parameter variations from one domain to the next may be approximated by continuous functions, in order that we may use the infinitesimal calculus, without introducing any error that might be detected by measuring instruments at the macroscopic scale.

It should be noted that in a fractured medium the size of the REV may be quite
astonishingly large in order to satisfy the first condition, thereby failing to satisfy the
second condition of a continuous function at the scale of the measurement. The

size of the REV is often linked to a flattening of a curve connecting the property with
the dimension of measurement. However, nothing allows us to assert that such
a stabilisation must exist. The size of the REV is thus arbitrary and there may be no
characteristic scale at which it can be said to exist.
The REV concept has other important limitations. Firstly, it is unsuited to handling
discontinuities in a medium. Second, and most importantly, is that it gives no basis for
studying the structure of the property in space. The most that can be said is that the
spatial variations of the studied property must be smooth.
The RF model is a more powerful concept. It consists of saying that the studied
porous medium is a realisation of a random process. A property like porosity is
then defined at a given geometrical point in space as the average over all possible
realisations of its point values, defined either as a grain (0) or a pore (1). We then have
an ensemble average of grain or pore at a point and can therefore refer to porosity at a point
as having a probabilistic interpretation. Thus the ensemble average is in fact the
expected value of the point porosities. Over the whole volume (e.g. the right hand side
of the picture) the expected value will be the same as the space average defined by the
whole sample - the REV. However, the ensemble average will be the same
at any point within the sample.
The significant advantage of the stochastic approach of the random function model is
that it becomes possible to study statistical properties of the ensemble other than just
the expected value.
Note that the real world from which the sample is drawn is just one complete realisation
of some random function and not the random function itself. In order to make this
concept work we require some additional assumptions, the most useful of which are
stationarity and ergodicity. (From de Marsily, 1986)

Simple stochastic example


We shall now demonstrate a simple stochastic model. The word stochastic means that
we are concerned with probabilities. Consider an experiment involving tossing a fair
coin (ie one that can only land heads or tails, not on its edge nor can it be grabbed by
passing birds, roll down cracks in the floor or any other outcome!)

Heads is denoted 0 and tails 1


The expected value Ev is (0.5 * 0) + (0.5 * 1) = 0.5
Ev is the probability term for the mean or average result
Our best estimate of the result of tossing the coin is always 0.5 because this
gives the smallest average square error (least squares)
The only possible realisations (outcomes) for tossing the coin are 0 or 1

When we consider a discrete event the distinction between expected value and
realisations is very clear. However, the same principle applies to continuous variables.
The mean is not a real value; it is a mathematical artifice describing the probability-
weighted sum of the possible outcomes.
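A quick numerical check of this point (a minimal Python sketch; the number of tosses and the seed are arbitrary): among all constant estimates of the next toss, 0.5 gives the smallest mean squared error, even though 0.5 can never be observed as an outcome.

    import numpy as np

    rng = np.random.default_rng(0)
    tosses = rng.integers(0, 2, size=100_000)   # realisations: 0 (heads) or 1 (tails)

    # Mean squared error of a constant estimate c against the realisations.
    candidates = np.linspace(0.0, 1.0, 101)
    mse = [np.mean((tosses - c) ** 2) for c in candidates]

    best = candidates[int(np.argmin(mse))]
    print(best)            # close to 0.5: the expected value minimises the squared error
    print(tosses.mean())   # the sample mean is also close to 0.5, yet no single toss equals 0.5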

Regionalised variables
A key difference between statistics and geostatistics is the recording and extracting of
information related to the location of every observation. Attributes that have a location
(coordinate reference) associated with each observation are called regionalised
variables. Geostatistics models regionalised variables as realisations of random
functions.
We start with the observation that two data points which are close together are much
more likely to have similar values than two points which are far apart. This phenomenon
is used quite naturally in procedures such as hand contouring of data. In geostatistics
we describe this property as spatial continuity or spatial correlation. Unless a variable is
spatially correlated, it is meaningless to attempt to map it.
Variography is a method for measuring and modelling spatial correlation. There are a
number of related statistical tools to describe spatial correlation:

h-scatterplots
Correlation function
Covariance function
Semivariogram

An h-scatterplot is constructed for particular lags. As in geophysics, a lag describes
the offset between two points. In the case of a spatial variable, the lag is the distance
between two points. The lag h is a vector, so it has both separation and direction.
To construct an h-scatterplot for a chosen lag value:

Find all the pairs of values which are separated by h (to within a given tolerance).
Construct a cross-plot of the second value in each pair against the first value in each pair.

Repeat this for any number of lags h.
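A minimal Python sketch of this pair search (the coordinates, values and tolerances are illustrative assumptions); the returned arrays are the first and second values of each qualifying pair, from which the cross-plot would be drawn:

    import numpy as np

    def h_scatter_pairs(coords, values, h, tol):
        """Return (first, second) value pairs for all sample pairs whose
        separation distance lies within h +/- tol (direction ignored)."""
        coords = np.asarray(coords, dtype=float)
        values = np.asarray(values, dtype=float)
        n = len(values)
        first, second = [], []
        for i in range(n):
            for j in range(i + 1, n):
                d = np.linalg.norm(coords[i] - coords[j])
                if abs(d - h) <= tol:
                    first.append(values[i])
                    second.append(values[j])
        return np.array(first), np.array(second)

    # Example: 50 random samples, pairs separated by roughly 100 m.
    rng = np.random.default_rng(1)
    xy = rng.uniform(0, 1000, size=(50, 2))
    z = rng.normal(size=50)
    head, tail = h_scatter_pairs(xy, z, h=100.0, tol=10.0)
    # A cross-plot of tail against head is the h-scatterplot for this lag.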

Variogram Calculation - Pair Combinations
[Figure: sample pair combinations for lags h = 100 m, 200 m, 300 m, 400 m, 500 m and 600 m.]

If we choose h to be very small, we would expect to see very little difference in the
values of the first and second data making up each pair. If h is large then the first and
second value of the pair may be quite different.

We can summarise a linear relation on the cross-plot using any of the common
statistics. For example we could use correlation coefficient, covariance (numerator of
the correlation coefficient) or semivariance (the deviation of the cross-plot values from
the 1:1 line).
If we calculate the correlation coefficient for each h-scatterplot and then plot the
correlation coefficients against h, we obtain the correlation function. Similarly, we can
obtain the covariance function by plotting the covariance against h, and the semi-variogram
by plotting the semivariance against h. In practice we rarely draw the h-scatterplots.
Instead we compute the semivariogram function directly using:

γ(h) = (1 / 2N(h)) Σ [z(x) - z(x + h)]^2

where the sum is taken over the N(h) pairs of samples separated by the lag h.
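A minimal Python sketch of this direct computation, binning pairs by separation distance (omnidirectional, with illustrative data):

    import numpy as np

    def experimental_semivariogram(coords, values, lag, nlags):
        """Omnidirectional experimental semivariogram:
        gamma(h) = 1/(2 N(h)) * sum of squared differences for the pairs in each lag bin."""
        coords = np.asarray(coords, dtype=float)
        values = np.asarray(values, dtype=float)
        sums = np.zeros(nlags)
        counts = np.zeros(nlags, dtype=int)
        n = len(values)
        for i in range(n):
            for j in range(i + 1, n):
                d = np.linalg.norm(coords[i] - coords[j])
                k = int(d // lag)                  # lag bin index
                if k < nlags:
                    sums[k] += (values[i] - values[j]) ** 2
                    counts[k] += 1
        with np.errstate(invalid="ignore", divide="ignore"):
            gamma = sums / (2.0 * counts)          # NaN where a bin has no pairs
        h = (np.arange(nlags) + 0.5) * lag         # bin centres
        return h, gamma, counts

    rng = np.random.default_rng(2)
    xy = rng.uniform(0, 1000, size=(100, 2))
    z = rng.normal(size=100)
    h, gamma, npairs = experimental_semivariogram(xy, z, lag=100.0, nlags=8)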

The variogram (a lazy term for the semi-variogram) is the main spatial analysis
function used in geostatistics. If we turn this function upside down, we have the
covariance function. If we normalise the covariance function to a range of 0 to 1,
we have the correlation function.

Classic variogram
The following terms are used to describe the variogram

As the separation distance between pairs increases, the variogram value also
increases (in the ideal case). Beyond a certain separation no further increase
occurs and the variogram reaches a plateau. The separation at which this
plateau is reached is known as the range (symbol a) and it defines the spatial
separation greater than which there is no further spatial correlation apparent
for the variable. Points separated by distances greater than the range only
contribute information about the mean value - this is the realm of classical
statistics and implies the data cannot be mapped.
The magnitude of the plateau which the variogram reaches at the range is
called the sill. Under certain special conditions, the sill is equal to the classical
variance of the data. The sill is given the notation C+C0, strictly the sill is C, but
the nugget (described below) is often added into the definition.
The strict value of semivariance for h=0 is zero. Any point compared to itself
will have no difference. However a number of factors including measurement
error and short scale variability can cause sample values separated by short
distances to be quite dissimilar. This gives rise to a discontinuity at the origin
referred to as the nugget effect. It is usually given the symbol C0.
[Figure: directional variogram binning scheme - direction vector at azimuth ang = 60 with angular tolerance atol = 22.5 degrees, lag distance xlag = 4.0, lag tolerance xltol = 2.0 and bandwidth bandw = 5.0; lags 1 to 6 are marked along the direction vector; axes are X (East) and Y (North).]

Omnidirectional and Directional Variograms


An omnidirectional variogram is a variogram where all sample pairs are used in its
construction and the direction element of the vector between a pair of samples is
ignored. The variogram is therefore based on the separation (distance) between pairs
only. Since omnidirectional variograms use all the available pairs they tend to be
robust, but clearly if the geological property has an anisotropic behaviour then the
omnidirectional variogram will average it.
A directional variogram is one in which the binning scheme only uses pairs oriented in a
given direction, such as N-S. This may be rather restrictive and no pairs may be found
that exactly meet this criterion, so it is usual to consider pairs for which the angle
between the two points is within some tolerance of the ideal angle, for example N-S
+/- 20 degrees. A typical scheme for directional variograms is illustrated in the diagram
(above right).
Note that a directional variogram for which the tolerance is set as 90 degrees is in fact
an omnidirectional variogram as all pairs will be included. Using an angular tolerance
of 90 degrees and then changing the direction will always give the same calculated
experimental variogram. To observe directional behaviour the angular tolerance
should be reduced to make the selected pairs specific to the direction being analysed,
but the tolerance should not be too narrow or few pairs will be found and the reliability
of the resulting experimental variogram will be impaired.
When computing directional variograms it is usual to analyse several directions,
typically orthogonal to each other or at regularly spaced angles around the compass.
Typically, four directions might be analysed, starting N-S (0 degrees), then NE-SW
(+45 degrees), E-W (+90 degrees) and NW-SE (+135 or -45 degrees). The angular
tolerance should usually be set no larger than 22.5 degrees to avoid double
counting of pairs. Ideally, if sufficient data are available, the angular tolerance could
be set to 5 - 10 degrees.
The objective of directional variograms is to investigate whether the geological property
exhibits anisotropic behaviour, that is, whether its covariance function is elliptical rather than
circular when observed in plan view. Remember that a directional variogram is a
radial cross-section from the centre out through the covariance function. The vertical
variogram will almost certainly show anisotropy compared to the horizontal variograms,
the vertical range being much shorter (typically a 20:1 to 30:1 ratio between horizontal
and vertical range).
The presence of a drift or trend will strongly mask any anisotropic behaviour, and
detecting anisotropy in these circumstances (other than vertical versus horizontal) will
be difficult. In many instances, a drift or trend is mistakenly interpreted as anisotropy.
If anisotropic behaviour is suspected, the directional variograms should be aligned with
it to enable the anisotropy to be fully identified and subsequently modelled.
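A minimal Python sketch of the directional selection test described above: a pair is accepted for a directional variogram only if the azimuth of its separation vector lies within the angular tolerance of the chosen direction (the bandwidth limit is omitted for brevity; coordinates and angles are illustrative).

    import numpy as np

    def in_direction(p1, p2, azimuth_deg, atol_deg):
        """True if the separation vector between two points lies within
        +/- atol_deg of the chosen azimuth (measured clockwise from north).
        Pairs are undirected, so the opposite azimuth counts as well."""
        dx, dy = p2[0] - p1[0], p2[1] - p1[1]
        pair_az = np.degrees(np.arctan2(dx, dy)) % 180.0    # fold to the 0-180 range
        diff = abs(pair_az - (azimuth_deg % 180.0))
        diff = min(diff, 180.0 - diff)                      # allow for wrap-around
        return diff <= atol_deg

    # Example: is this pair usable for the N-S (0 degree) variogram with +/- 22.5 degree tolerance?
    print(in_direction((0.0, 0.0), (30.0, 100.0), azimuth_deg=0.0, atol_deg=22.5))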
A Good Variogram
It will become clear from the above practical examples that a good stable variogram
requires the data both to be stationary and be comprised of a sufficient number of
samples. In fact, these two criteria are to some extent related, in that stationarity is a
function of low frequency aliasing, that is, under sampling.
Assuming a stationary function, a question often asked is how many values are
required to ensure a stable variogram? Certainly it is true that one can never have too
many samples, but it is common to see attempts to compute variograms on far too few
samples. Webster and Oliver (1992) investigated this question for a synthetic variable
of a soil property.
Firstly, they noted that the advice of Journel and Huijbregts (1978, p194) is misleading.
That advice suggests the number of pairs need be a minimum of 30 - 50, and that the
maximum h considered should be less than L/2 where L is the dimension of the field
from which the pairs are computed. If we take 45 pairs to be the total number of pair
combinations, we require only 10 samples to achieve this. In hydrocarbon exploration
when dealing with well data this may be the typical number of samples under
consideration. Unfortunately, variograms derived from this number of samples are
likely to be unreliable, difficult to interpret, or both. A better rule of thumb is to assume
that the average number of pairs contributing to each lag should be in the range 30
- 50.

Webster and Oliver observe that variograms computed on fewer than 50 data are of
little value, and that at least 100 samples are needed. They suggest that for a normally
distributed isotropic variable a variogram calculated from a sample of 150 points might
often be satisfactory, while one derived from 225 data will usually be reliable. As an
example of this, they constructed variograms from the synthetic dataset shown below,
which has a known variogram (exponential with range 40 units and a sill of 1), which
was generated from a consistent random function model.
Variograms generated by sampling 49 points on this surface are shown in (a) above.
Note how disparate the variograms are, and also how difficult it would be to determine
a reliable nugget effect. In (b) above, the variograms have been derived from the
synthetic data of the previous page using 225 points per variogram. The difference
between this result and (a) above should speak clearly of the problems of inadequately
sampling to generate variograms.

(a) Variograms from 49 points on a 7 x 7 grid at 15 unit intervals
(b) Variograms from 225 points on a 15 x 15 grid at 7 unit intervals

Remember that directional variograms constructed to adequately describe
an anisotropic variable will only have pair contributions in certain directions. For the
example variogram looked at previously for hand calculation, in the X direction there
are 61 pairs, an average of 10 per lag, but in the Y direction the total number of pairs
possible is 31, and there are no lags greater than 300 m. The total number of pairs
possible (omnidirectional) for this example of 24 samples is 276, with a maximum
range of 671 m (2 pairs).
Another aspect of good variograms not often considered is the behaviour close to the
origin at short lags. Perversely, with a very regularly sampled data set, such as from
a 3-D seismic survey, there are no lags at an offset less than the grid interval, and so
very little is known about the spatial behaviour at short lags. With a scattered data set,
the separations are more evenly distributed across many values of h, and so although
an overall stable variogram may be more difficult to interpret, there may be significantly
more pair information at very short offsets, which often is the most critical part of any
variogram analysis. This problem was considered in an excellent paper by Laslett
and McBratney (1990), and we shall briefly look at their recommendations concerning
sampling strategy in a short section later in this course.
Therefore, should we conclude that it is pointless to attempt to construct variograms
from too few samples? The only answer is to try it and see. A well behaved, simple,
continuous function may require only a few samples to describe it quite well, so it is
always worthwhile constructing a variogram just to see.

Variogram of 55 Wells

Variogram of 7 wells

Anisotropic Variograms 55 Wells

Spatial random function


We will define a very simple probabilistic spatial model. It is defined by a univariate
and a bivariate (joint) probability law, as follows:

Starting condition:
V(0) = 0 with probability = 0.50
V(0) = 1 with probability = 0.50

Joint probability law:
V(x) = V(x-1) with probability = 0.75
V(x) = 1 - V(x-1) with probability = 0.25

The univariate law shows that the possible outcomes are 0 or 1, with equal probability.
This law is independent of the location x. Given the probabilities and magnitudes of the
outcomes, it is trivial to calculate that the expected value is:

Ev = (0.5 * 0) + (0.5 * 1) = 0.5

and that the variance is:

Var = 0.5 * (0 - 0.5)^2 + 0.5 * (1 - 0.5)^2 = 0.25
The joint probability law shows that the value at the next location of x is the same as
the previous value with probability 0.75, or that it may take on the other value with
probability 0.25. Therefore two points at one unit distance apart have a joint probability
distribution. Note that this law does not depend on whether the previous value is 0 or
1, therefore it is independent of the x location.
Considering an adjacent pair, located at x and x+1, the set of possible outcomes is:
{(0,0), (0,1), (1,0), (1,1)}
and the associated probabilities are:
{3/8, 1/8, 1/8, 3/8}
There are two possible outcomes that we are interested in. These are when the pair
are the same and when the pair are different values:
(0,0) and (1,1) = 3/8 + 3/8 = 0.75
(0,1) and (1,0) = 1/8 + 1/8 = 0.25

which (not surprisingly!) are the joint probability values we defined. The semivariance
for h =1 is
(0.75 * 0 + 0.25 * 1)/2 = 0.125.
If we know the joint distribution of adjacent pairs, we can compute the joint distribution
between pairs separated by any separation. For example, considering the pairs at V(x)
and V(x+2) the following permutations are possible:
{(0,0,0), (0,0,1), (0,1,0), (0,1,1), (1,0,0), (1,0,1), (1,1,0), (1,1,1)}
and the probabilities are:
(0,0,0), (1,1,1): p = 9/32
(0,0,1), (0,1,1), (1,0,0), (1,1,0): p = 3/32
(0,1,0), (1,0,1): p = 1/32

If we group these according to the pairs separated by a lag of 2, we get the outcomes:
{(0,0), (0,1), (1,0), (1,1)}
and the corresponding probabilities are:
{10/32, 6/32, 6/32, 10/32}
Considering the two outcomes where the pair x and x+2 are the same number or the
opposite number, we get:
(0,0) and (1,1) = 10/32 + 10/32 = 20/32 = 0.625
(0,1) and (1,0) = 6/32 + 6/32 = 12/32 = 0.375
The semivariance for h =2 is
(0.625 * 0 + 0.375 * 1)/2 = 0.1875.
We can carry this process on to calculate the joint probability for any separation of
points. Given that the expected value can be computed from the probability multiplied
by the outcome and the variance is the squared difference of the outcome from the
expected value, we can calculate variance for any separation of points. The variance
plotted as a function of separation h has been calculated and is shown on the next
slide. This graph shows the characteristic semivariogram form. In this case it
reaches the sill asymptotically at infinite separation, where the sill attains a value of 0.25
(the global variance calculated from the univariate statistics).
To summarise the key points from this example, note that:

The expected value is only determined by the start condition.
The joint probability law results in spatial correlation.
There are an infinite number of possible realisations of this random function.

If we draw realisations of this model, ie actually input the two probability laws to
a spreadsheet and draw random numbers according to the laws, each time we
recalculate we obtain a different sequence. Three example sequences are shown
below for the function V(x). Note that each realisation has the same univariate and
joint probability. The joint probability law results in spatial correlation: locations
close together tend to have the same value. Each of these realisations has the same
variogram.
Consider now a change of parameter in our function V(x). We will change the value of
0.75 to 0.875. We will call this function U(x). This change means that adjacent values
are more likely to be the same. However, the univariate probability law is unchanged
and therefore the expected value and variance remain the same. If we calculate the
variogram, we now find it looks like the graph shown below and if we draw realisations
we find that they have a greater degree of spatial continuity, as we would expect.

Finally, consider the effect of changing the joint probability value to 0.5. This means
that adjacent values are just as likely to change as to stay the same. This law results in
no spatial correlation: it is as though a 1 or 0 is picked at random at each new location
of x. The variogram for this function is a constant value of 0.25 for all values of h,
and the realisations appear random. For an expanded discussion see Isaaks and
Srivastava (1989), from where this example came.
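The spreadsheet experiment described above can be reproduced with a short Python sketch (the sequence length and seed are arbitrary): realisations are drawn for a chosen stay-probability p (0.75 for V(x), 0.875 for U(x), 0.5 for the uncorrelated case) and the experimental semivariogram is computed along each sequence.

    import numpy as np

    def realisation(n, p_stay, rng):
        """One realisation of the binary process: V(0) is 0 or 1 with equal probability,
        and each subsequent value repeats the previous one with probability p_stay."""
        v = np.empty(n, dtype=int)
        v[0] = rng.integers(0, 2)
        for i in range(1, n):
            v[i] = v[i - 1] if rng.random() < p_stay else 1 - v[i - 1]
        return v

    def semivariogram_1d(v, max_lag):
        """gamma(h) = half the mean squared difference for each lag h along the sequence."""
        return np.array([0.5 * np.mean((v[h:] - v[:-h]) ** 2) for h in range(1, max_lag + 1)])

    rng = np.random.default_rng(3)
    for p in (0.75, 0.875, 0.5):        # V(x), U(x), and the uncorrelated case
        v = realisation(10_000, p, rng)
        gamma = semivariogram_1d(v, max_lag=10)
        # gamma(1) is close to 0.125 for p = 0.75, rises more slowly for p = 0.875,
        # and sits at about 0.25 (the sill) for all lags when p = 0.5.
        print(p, np.round(gamma, 3))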

STATIONARITY
Stationarity and ergodicity
Stationarity is the assumption that any of the statistical properties of the medium (mean,
variance, covariance or higher order moments) is stationary in space i.e. does not vary
from place to place. Note that this does not mean that the property is uniform, just
that its probability (expected value) at any point is invariant spatially. As we will see,
the assumption of stationarity is rather restricting. It is the hypothesis or decision of
stationarity which allows inference. Stationarity cannot be checked from data and is a
function of the sampling window dimensions.
Ergodicity implies that the unique realisation available behaves in space with the same
pdf as the ensemble of possible realisations. In other words, by observing the variation
in space of the property it is possible to determine the pdf of the random function for all
realisations. This is called the statistical inference of the pdf of the RF Z(x).
In stochastic terminology a phenomenon that is stationary and ergodic is called
homogeneous. The term uniform is used to describe a medium in which some property
does not vary in space - which would usually be called homogeneous by geologists or
geophysicists.
We use these properties along with the concept of a random function in many
scientific tasks, although we may be unaware of the implicit assumptions. For example,
calculating porosity from thin section by point counting assumes the sample is ergodic
and stationary. Moving the slide around is the equivalent of sampling multiple slides
at the same point.
Weak stationarity is where only the first two moments are stationary, i.e. the RF
Z(x) has an expected value (mean) and a covariance which are not a function of the
co-ordinate space. The covariance is then assumed to be a function of the distance
between points. This is known as second order stationarity.
The intrinsic hypothesis consists in assuming that even if the variance of Z is not
finite, the variance of the first order increments (lags) of Z is finite and that these
increments are themselves second-order stationary. The variance of the increment
defines a function called the variogram:

γ(h) = (1/2) E[(Z(x + h) - Z(x))^2]

which is half the mean squared difference in the property for points
separated by a distance h.
Stationarity is a property of the model, not of the data. The decision of stationarity
allows inference. We trade unavailable replication of the random function at a point for
another replicate elsewhere (in time/space) ie our conventional notion of collecting data
samples. This trade is the hypothesis or decision of stationarity. This decision cannot
be checked from data, is a function of the sampling window (scale of observations) and
to some extent contradicts fractal relations.
A simple illustration of stationarity can be given with a spreadsheet. Take a group of


cells and fill them with the formula =rand()*10. This formula instructs the program to
generate a uniformly distributed random number on the interval [0,1] and then multiply it
by 10, giving a random number uniformly distributed on the interval [0,10].
However, when we look at the spreadsheet we see the result of the computation: a
random number is displayed in each cell. The formula has the same expected value in
each cell (5) but the display shows a realisation of this formula: the random number.
An observer who is not allowed to examine the formula in any cell can only
assume the expected value is the same in every cell and approximate it by taking the
average of the results. To generate a new realisation (a new universe) we can press the
recalc button. The result is a different array of random numbers.
Most inversion examples demonstrating a particular contractor's methodology are based
on stationary models. These are the easy ones to solve. The real problem with
inversion is the reasonableness of the stationarity assumption and subsequently the
expected value predicted away from the well.

Global Warming
The stationarity argument is often at the root of scientific and technical disagreement.
For example, when different seismic interpreters put different maps on a wall based
on different geological concepts, they are often disputing the stationarity assumption
with each other. A more famous example is the scientific dispute over the presence
or otherwise of global warming.
The global temperature graph shows a steady rise of average temperature
over the past 100 years. It should be remembered that it is very difficult to
compute these estimates and they may not be reliable. It is also worth contemplating
that the graph falls towards the temperature minimum reached around 1893, and that it
does so for data prior to 1880, which is not often shown although it has the same
reliability back to about 1859.

Based on these data, it has been proposed that we are in a period of climate change
(which is indisputable, given our knowledge of the geological record) and that this is a
consequence of global warming caused by a buildup of greenhouse gases. We have
to consider very carefully:

Is this a real trend (a nonstationary process)?
Is this just natural fluctuation around a steady long-term value (a stationary process)?

Even if the first model were correct, it is still a further hypothesis that the cause is
the release of carbon dioxide from burning fossil fuels.
Those of us with longer memories will recall that in the mid 1970s a number of
prominent scientists became very concerned about climate change, based on the
average temperature data chart up to that point. In Britain, they included Sir Fred
Hoyle, then Astronomer Royal. They all pointed to the gentle downward trend in
temperature since 1940 as potentially indicative of the onset of a new ice age.
In 1976 the Australian Academy of Science prepared a report on climate change for the
Australian government. The introduction stated:
This report was prepared at the request of the Minister for Science
for advice on statements by some climatologists that the earth was
undergoing a continuing cooling trend. Examination of recorded
temperatures in the northern hemisphere suggest that, after nearly
100 years of slowly rising temperatures, there has been a fall since
1940, most pronounced at high latitudes and averaging -0.3 degrees
Celsius. Combined with extreme climatic events elsewhere during
the early 1970s, notably droughts in the Sahel and the Ukraine and
failures of the monsoon in India, this fall in temperature has led
some climatologists to suggest that the world's climate is progressing
rather rapidly towards another glacial phase, or at least another Little
Ice Age
The prominent British scientist Sir Fred Hoyle, Astronomer Royal, wrote articles for serious
newspapers arguing that we had to act quickly to prevent the ice age. He proposed
huge pumping systems to bring deep ocean water to the surface and stabilize the
temperature.
In September 1989 Scientific American devoted an entire issue to the theme Managing
Planet Earth. In introducing an article entitled The Changing Climate, it was stated
that:
The earth owes its hospitable climate to the greenhouse effect but
now the effect threatens to intensify, rapidly warming the planet. Rising
concentrations of carbon dioxide and other gases are the cause. The
danger of warming is serious enough to warrant prompt action.
(The above are taken from McTainsh and Boughton, 1993, p142).
In just 15 years we had swung from ice age to runaway heating. Even the language is
the same: extreme weather events are to be expected.
If we remain objective, we can see that this is an argument about stationarity
assumptions. If we look through different length time windows, we may draw different
conclusions. For example, the Intergovernmental Panel on Climate Change report
in 1995 (which stated there is a discernible human influence on global climate and
formed the basis of the Rio Summit) showed the temperature graph deduced for the
previous 1000 years. The Little Ice Age during the 17th century is clear, as is the
Medieval warm period, when temperatures were several degrees higher than they are
today. Based on this, some scientists might consider a stationary temperature model,
with fluctuations around a mean temperature similar to today.

If we take an even longer view, we might consider the gradual temperature decline of
the past 3000 years shown by the data of Keigwin (1996). A stationary temperature
assumption looks improbable, and a warming trend highly implausible.

ESTIMATION - KRIGING
Mapping and Prediction
Mapping is all about estimating the value of an attribute at an unmeasured location.
It is not the same procedure as undertaken by cartographers, whose objective is to
represent on a sheet of paper as accurately as the scale allows that which they already
know.
In order to make an estimate an assumption must be made concerning a suitable
predictive model. All estimation or prediction requires the use of a model. Many
commercial gridding packages use arbitrary models which are not visible to the user.
In selecting a model it is important to consider the processes which have given rise to
the phenomenon under consideration. For some processes the underlying model may
be so well understood that only a few sample values are required in order to estimate
the values at unmeasured locations. In this circumstance a deterministic model is the
most appropriate.
Very few earth science processes are sufficiently well understood to allow the use of
deterministic models in estimation. Instead we must accept that there is uncertainty
about the behaviour of a phenomenon between sample locations. A random function
model as used in geostatistics recognises this uncertainty and provides the tools to
estimate values at unmeasured locations based on assumptions about the statistical
characteristics of the phenomenon.
As well as estimating a value it is also desirable to know the quality of the estimate.
Without an exhaustive reference set (which would obviate the need for prediction!) the
goodness of the estimate is rather qualitative and depends greatly on the appropriateness
of the underlying model. Models are neither right nor wrong: they are appropriate or
inappropriate.
Methods of point estimation, such as polygons, triangulation or inverse distance
methods, may each be "best", that is, predict the true value, depending on the estimation
criteria. All of these estimation methods are linear and theoretically unbiased.
However, of all the linear estimators, kriging is known by the acronym B.L.U.E., which
stands for best linear unbiased estimator. Ordinary kriging is linear because its
estimates are weighted linear combinations of the available data; it is unbiased since it
attempts to have the mean residual, or error, equal to zero. So why is kriging best? It
is best because it also aims to minimise the variance of errors.
Actually, these aims (mean residual equal to zero and minimisation of error variance)
are not attainable for a sample dataset for which there is no final reference. Because
we never know the mean residual, nor the error variance, we cannot guarantee that we
have made them exactly zero, or minimised respectively. The best that can be done
is to build a model of the data and work with the average error and error variance of
the model. This is done in ordinary kriging by using a probability model in which the
bias and error variance can both be calculated, and then we choose weights for nearby
samples used for estimation to ensure that the average error for our model is exactly
zero and that the modelled error variance is minimised.

At every location we wish to predict a value, we will estimate the unmeasured value
using a weighted linear combination of the available samples. This set of weights will
change as we estimate new unknown values at different locations.

First we assume that for every point at which we want to estimate a value our model
is a stationary random function that consists of several random variables, one for each
surrounding sample used for estimation, plus the sample location to be estimated.
Each of these random variables has the same probability law and any pair of random
variables has a joint distribution that depends only on the separation between the two
points, and not on their locations (i.e. stationarity). The covariance between pairs of
random variables separated by a distance h is Cv(h).

Without going into the derivation (which we are trying to avoid in this course), based on
the assumption of a stationary random function, it can be shown firstly that in order to
satisfy the unbiasedness condition the expected value of the error is set to zero, and the
result of this is that the weights must sum to 1. Next, it can be shown
that to minimise the error variance, we can express the variance of error as a function
of the covariances of the pairs of points used for estimation. This gives an expression
for the error variance, based on the random function model parameters (variance
and covariance), as a function of n variables - the weights for each
sample used to estimate the unknown location value.
The minimisation of a function
of n variables is conventionally
accomplished by setting the n partial
first derivatives to 0. This produces
a system of n equations that can be
solved by any standard procedure
for solving simultaneous linear
equations. However, for ordinary
kriging, we imposed the additional
restriction that the set of weights
must sum to 1 and so we need to
convert our constrained minimisation
problem to an unconstrained one.
Without the constraint we have n
equations and n unknowns, but
the constraint means we have n+1
equations, but still only n unknowns. Solution of this problem is achieved using the
Lagrange Parameter, a conventional mathematical solution. The result is that we
have to solve n+1 simultaneous linear equations that will produce the set of weights
that minimises the error variance under the constraint that the weights sum to 1. In
addition, we obtain the Lagrange Parameter value, and this is used in calculating the
resulting minimised error variance.

The final set of equations to be solved are known as the ordinary kriging system, and
they describe the set of weights that minimise the error variance under the constraint
that they sum to 1 using the following n+1 equations stylised on the slide above
(actually, simple kriging is shown), where w are the weights, ij are the pair combinations
of known samples, m is the Lagrange Parameter, C is the covariance and 0 subscript
denotes the sample to be estimated.
So, in English, to estimate a sample we construct a table of covariances between each
pair of samples ij, and a vector of covariances i0.

The purpose of kriging is to obtain an estimate of a linear function of a Random
Function Z at any point.

Kriging provides Best Linear Unbiased Estimates defined by a linear combination
of available data.

The weights are determined by the requirements that the results are not biased
and that the error variance is minimised.

Kriging is an exact interpolator: at the data points the kriging results coincide
precisely with the measured values.

The weighting factors and the kriging estimate do not depend on the sill of
the variogram.

The variance of estimation is directly related to the sill of the variogram.

We do not need to know the exact values of the data points themselves to
determine the weighting factors. To one sample pattern there is one kriging matrix.

Finally, consider the estimation of the value at the centre of a circle from four samples
on its perimeter. If the samples are equally distributed around the perimeter then we
will logically assign equal weight to each sample. Both inverse distance and kriging
would assign a quarter of the total weight to each sample.

However, if the samples were arranged such that three are grouped very closely, then it
would be illogical to assign the weights in the same manner. The tightly clustered
samples carry very little independent information. However, inverse distance weighting
pays no heed to the inter-sample distances, only to the distances to the unknown
point, and will still assign a quarter of the weight to each sample. Kriging accounts for the
dependency and will ensure that approximately half the weight is assigned to the lone sample
and the remaining half is shared amongst the clustered samples, each receiving roughly 1/6.
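A minimal Python sketch of the ordinary kriging system described above - solve C w = D, with an extra row and column forcing the weights to sum to 1 via the Lagrange parameter - applied to the circle configuration. The exponential covariance model and the 200 m radius are illustrative assumptions, so the exact weights differ slightly from the idealised 1/2 and 1/6 values.

    import numpy as np

    def cov(h, sill=1.0, a=500.0):
        """Illustrative exponential covariance model C(h) = sill * exp(-3h/a)."""
        return sill * np.exp(-3.0 * np.asarray(h) / a)

    def ok_weights(sample_xy, target_xy):
        """Solve the ordinary kriging system: n+1 equations for the n weights
        plus the Lagrange parameter, with the constraint sum(w) = 1."""
        n = len(sample_xy)
        d_ij = np.linalg.norm(sample_xy[:, None, :] - sample_xy[None, :, :], axis=-1)
        d_i0 = np.linalg.norm(sample_xy - target_xy, axis=-1)
        C = np.ones((n + 1, n + 1))
        C[:n, :n] = cov(d_ij)
        C[n, n] = 0.0
        D = np.append(cov(d_i0), 1.0)
        sol = np.linalg.solve(C, D)
        return sol[:n], sol[n]            # weights, Lagrange parameter

    centre = np.array([0.0, 0.0])
    # Case 1: four samples evenly spread on a circle of radius 200 m.
    even = 200.0 * np.array([[0, 1], [1, 0], [0, -1], [-1, 0]], dtype=float)
    # Case 2: one lone sample plus three samples clustered closely together on the same circle.
    theta = np.radians([0.0, 85.0, 90.0, 95.0])
    clustered = 200.0 * np.column_stack([np.sin(theta), np.cos(theta)])

    for xy in (even, clustered):
        w, mu = ok_weights(xy, centre)
        # Case 1: all weights roughly 0.25.
        # Case 2: roughly 0.5 for the lone sample and roughly 1/6 for each clustered sample.
        print(np.round(w, 3))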

Simple kriging
So far we have really been describing ordinary Kriging, which is the general form of
Kriging. In addition, we should mention simple kriging. Simple kriging differs in that it
does not require that the weights sum to 1, but it does require strict stationarity
of the mean. The set of equations solved involves an (n) x (n) matrix of covariances and
a vector without the Lagrange Parameter. The weights are still determined to minimise
the error variance. In practical terms, simple kriging is the same as ordinary kriging,
but the extra row and column of 1s is omitted from the covariance matrix before solving
for the weights.

Block kriging
Most of our technical consideration so far has been concerned with the estimation of
point values. Sometimes, though, we may prefer to estimate the average value of a
variable within a block, or local area.
One way of estimating the average value of a block is to estimate many points within the
block and then average them. This is effective, but is not very efficient. Block kriging
achieves the same result by an elegant adaptation of kriging.
Basically, the kriging system is modified to compute average covariances between the
sample points and the block. This only requires changing of the covariance vector D,
not the C matrix. The covariance required is the average of the covariances between
the samples and every point within the block to be estimated. In order to establish the
average point-to-block covariances, we require just a few points to be estimated within
the block, perhaps 4. This means that block kriging is a very efficient and convenient
form of direct estimation, a feature not shared by other estimation methods. Block kriging
should be considered for use more frequently than it is.
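A minimal Python sketch of the modification described above: the entry of the covariance vector D for each sample becomes the average covariance between that sample and a few discretisation points inside the block (a 2 x 2 discretisation, i.e. four points, is used here). The spherical covariance model, block size and distances are illustrative assumptions.

    import numpy as np

    def cov(h, sill=1.0, a=500.0):
        """Illustrative spherical covariance model."""
        h = np.asarray(h, dtype=float)
        return np.where(h < a, sill * (1 - 1.5 * h / a + 0.5 * (h / a) ** 3), 0.0)

    def point_to_block_cov(sample_xy, block_centre, block_size, ndisc=2):
        """Average covariance between one sample and a block, approximated by
        discretising the block into ndisc x ndisc points."""
        half = block_size / 2.0
        offsets = (np.arange(ndisc) + 0.5) / ndisc * block_size - half
        disc = np.array([[block_centre[0] + dx, block_centre[1] + dy]
                         for dx in offsets for dy in offsets])
        d = np.linalg.norm(disc - np.asarray(sample_xy), axis=1)
        return cov(d).mean()

    # Example: average covariance between a sample 300 m away and a 100 m block.
    print(point_to_block_cov(sample_xy=(300.0, 0.0), block_centre=(0.0, 0.0), block_size=100.0))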

Neighbourhood search
A critical choice for any estimation process is to determine how many nearby samples
or neighbours should be used for predicting the value at an unsampled location. For
kriging, this will determine the number of linear equations that need to be solved
through matrix inversion.
A neighbourhood choice containing a large number of samples will be slow for kriging,
although this is unlikely to be a prohibitive factor with modern computing power.
However, selection of a large neighbourhood with a sparsely sampled data set will also
have a significant smoothing effect on the result. Similarly, small neighbourhoods will

have less of a smoothing effect, but if local values are erratic, may not result in reliable
estimates. Therefore, the choice of neighbourhood size will depend on the data type,
the smoothness of the underlying spatial continuity, the data sampling interval relative
to the gridding interval, the clustering of the data and many other factors.
The choice of neighbourhood search parameters is a very important decision in
estimation.

Cross Validation
Cross-validation is a technique which is often used as a means of comparing the
results of different neighbourhood choices before selecting final parameters for kriging.
It is a useful tool, but its importance is often over-emphasised, and its results can
sometimes be extremely misleading.
Cross-validation involves removing a single point from the data set at a time and then
re-calculating it using only the remaining points and the selected kriging parameters
and covariance model. The kriging estimate is then compared to the original value
and the difference calculated. By repeating the procedure for every point, successively
removing and replacing them in the kriging, a statistical summary of the differences
between all the points and their kriging estimates can be generated. Based on a criterion
such as minimising the error difference between the samples and the kriging estimates,
a neighbourhood choice is then determined.
Unfortunately, cross-validation can be adversely affected by the data arrangement, for
example clustering, and so the results of a cross-validation exercise are subjective and
require careful interpretation. Notwithstanding these reservations, a properly applied
cross-validation study is an important component of any geostatistical analysis.
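A minimal Python sketch of the leave-one-out bookkeeping; krige_estimate is a hypothetical placeholder for whatever kriging routine, neighbourhood and variogram choices are under test (a trivial nearest-neighbour stand-in is used here only to make the sketch runnable).

    import numpy as np

    def cross_validate(coords, values, krige_estimate):
        """Leave-one-out cross-validation: drop each sample in turn, re-estimate it
        from the remaining samples, and summarise the estimation errors.
        krige_estimate(coords, values, target) is a placeholder for the estimator under test."""
        coords = np.asarray(coords, dtype=float)
        values = np.asarray(values, dtype=float)
        errors = np.empty(len(values))
        for i in range(len(values)):
            keep = np.arange(len(values)) != i
            est = krige_estimate(coords[keep], values[keep], coords[i])
            errors[i] = est - values[i]
        return {"mean_error": errors.mean(),
                "mse": np.mean(errors ** 2),
                "errors": errors}

    # Trivial stand-in estimator: nearest-neighbour prediction.
    def nearest_neighbour(coords, values, target):
        return values[np.argmin(np.linalg.norm(coords - target, axis=1))]

    rng = np.random.default_rng(4)
    xy = rng.uniform(0, 1000, size=(30, 2))
    z = rng.normal(size=30)
    print(cross_validate(xy, z, nearest_neighbour)["mse"])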

Variogram modelling
We can see that a prerequisite of kriging is to obtain a variogram which is
representative of the spatial continuity of the phenomenon being studied. In order
to obtain a covariance value for any separation h we need a continuous covariance
function. We can obtain this by fitting a model to our variogram and then using the
model to define the covariance function at any separation by calculation from the model
parameters.
In fitting a model, certain constraints are applicable. If we simply interpolated between
the points on the experimental variogram, the solution of ordinary kriging equations
derived using these numbers may not exist, or if it does exist, may not be unique.
In order to ensure a stable matrix solution in the kriging, all the models used to fit
variograms must have the property of positive definiteness.
Many positive definite models are available with which to describe the variogram shape
for use in kriging. For most purposes, the set from which to select either individual
or combinations of models is quite small, and these standard models are described
below. In addition, a wide range of more exotic models are also available, but their use
is in special circumstances.

The Exponential model is a common transition model. It reaches its sill
asymptotically and so the practical range is defined as the distance at which the
variogram value is 95% of the sill. This model is linear at short separations, but
rises more steeply and flattens gradually. This model is appropriate for quite
rough phenomena, such as well logs.

The Spherical model is probably the most commonly used model of all. It
has a linear behaviour close to the origin, but flattens off at larger separations,
reaching a sill at the range a. This is an intermediate roughness model. If in
doubt, use this model.

The Gaussian model is also a transition model and is appropriate for the
modelling of extremely continuous (smooth) phenomena. Like the exponential
model, this also reaches its sill asymptotically and the practical range a is
defined as for the exponential model. The Gaussian model is parabolic close
to the origin and has an inflexion. Gaussian models should always be used with a
small nugget effect to guarantee stability of the kriging solution. The Gaussian
model is generally not recommended. The presence of a Gaussian model form
on an experimental variogram often indicates some form of smoothing process
has already been applied to the data.

The Linear model is not a transition model since it does not reach a sill, but
increases linearly with h. Sometimes useful as a nested component or for quick
and dirty results.

The Nugget effect model is sometimes not regarded as a model at all. A
nugget effect model can be thought of as a moving average type of filter, based
on the statistics of the local samples used in estimation. A very common nested
component. (A short sketch of these standard model functions follows.)
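A minimal sketch in Python of these standard model shapes, written as semivariogram functions of h with the sill and range parameters discussed above; nested models are obtained simply by adding components.

    import numpy as np

    def spherical(h, sill, a):
        h = np.asarray(h, dtype=float)
        return np.where(h < a, sill * (1.5 * h / a - 0.5 * (h / a) ** 3), sill)

    def exponential(h, sill, a):
        # practical range a: gamma reaches 95% of the sill at h = a
        return sill * (1.0 - np.exp(-3.0 * np.asarray(h) / a))

    def gaussian(h, sill, a):
        # parabolic near the origin; use with a small nugget for numerical stability
        return sill * (1.0 - np.exp(-3.0 * (np.asarray(h) / a) ** 2))

    def nugget(h, c0):
        h = np.asarray(h, dtype=float)
        return np.where(h > 0, c0, 0.0)

    def linear(h, slope):
        return slope * np.asarray(h, dtype=float)

    # A nested structure: nugget + spherical + exponential (cf. the nested examples below).
    h = np.linspace(0.0, 20.0, 101)
    gamma = nugget(h, 0.4) + spherical(h, 1.0, 5.0) + exponential(h, 1.0, 10.0)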
Nested example (1)
* Nugget model: sill = 0.4
* Gaussian model: sill = 1.0, range = 10

Nested example (2)
* Nugget model: sill = 0.4
* Spherical model: sill = 1, range = 5
* Exponential model: sill = 1, range = 10

Anisotropy
Anisotropy can be identified in variograms. Anisotropy can take one of two forms

geometric
zonal

Geometric anisotropy is where the variogram computed for different directions has the
same sill but the range is interpreted as different. This can be viewed as an elliptical
spatial correlation. The longer range direction will be in the strike direction of any
spatial features in the dataset.
* Direction 1: sill = 2200, range = 12
* Direction 2: sill = 2200, range = 5
* Nugget: sill = 300

* Direction 1: sill = 1600, range = 7
* Direction 2: sill = 2200, range = 7
* Nugget

Zonal anisotropy is often described in mining examples and corresponds to the situation
where the variogram in one direction has a higher sill than that in another direction.
This may occur because one direction of the variogram is across layers or zones,
whereas the other direction(s) is/are within a layer or zone. Zonal anisotropy is really
an artefact of the way zones are mixed and it is difficult to conceive of a property which
is implied to have some form of independence between different azimuths. A trend is
often mistakenly interpreted as zonal anisotropy.

Trend
The expression of a trend (or drift) in variograms is a source of much misunderstanding
amongst lay geostatisticians. A trend is frequently confused with anisotropy. The two
concepts are completely unrelated. A trend is where the values of the attribute show
a correlation with the coordinates of the samples. The correct identification of a trend
is important when considering the stationarity assumption to be used for the spatial
correlation model.
The presence of a trend causes the variogram to rise very rapidly. In the presence
of a trend, variograms are not a suitable choice of spatial analysis tool. We are no
longer in a situation where the intrinsic hypothesis applies and our simple stationarity
assumptions are clearly violated. Either the trend should be removed during the
analysis or higher order fitting methods specifically designed to deal with the trend in
the spatial correlation model should be used.

[Figure: experimental semivariogram of TWT (ms), direction 1 - semivariance rising steadily from 0 to about 2000 over distances of 0 to 12000, illustrating the effect of a trend.]
By far the most practical solution is to remove the trend, undertake the
geostatistical analysis and prediction on the residuals and then, after prediction,
add the trend back again. A number of geostatistical algorithms actually work in
this way. To estimate a linear trend, apply multi-linear regression of the variable
against the coordinates to obtain the regression coefficients. Compute the trend
from the coordinates using the coefficients and subtract this from the data. Work with
the resulting residual and then add the trend back after final prediction (kriging or
simulation).
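A minimal Python sketch of this workflow (illustrative synthetic data; the kriging or simulation of the residual itself is omitted): fit the linear trend by least-squares regression of the variable against the coordinates, work with the residual, then add the trend back after prediction.

    import numpy as np

    def fit_linear_trend(coords, values):
        """Least-squares fit of values = b0 + b1*x + b2*y using the coordinates."""
        coords = np.asarray(coords, dtype=float)
        A = np.column_stack([np.ones(len(coords)), coords[:, 0], coords[:, 1]])
        beta, *_ = np.linalg.lstsq(A, np.asarray(values, dtype=float), rcond=None)
        return beta

    def evaluate_trend(coords, beta):
        coords = np.asarray(coords, dtype=float)
        return beta[0] + beta[1] * coords[:, 0] + beta[2] * coords[:, 1]

    rng = np.random.default_rng(5)
    xy = rng.uniform(0, 1000, size=(50, 2))
    z = 2000.0 + 0.05 * xy[:, 0] - 0.02 * xy[:, 1] + rng.normal(scale=5.0, size=50)

    beta = fit_linear_trend(xy, z)
    residual = z - evaluate_trend(xy, beta)      # analyse and krige/simulate the residual...
    # ...then, after predicting residuals at the grid nodes, add evaluate_trend(grid_xy, beta) back.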
The alternative is to use higher order spatial model fitting techniques, so-called irf-k
or intrinsic random functions of order-k. For variograms, k=0. With a linear trend,
k=1. Some programs can also work with k=2. The only commercial software package
capable of explicitly working with these non-stationary problems is Isatis, produced
by Geovariances. The fitting is purely statistical (no graphics) and the benefits over
simply subtracting the trend and working with the residual are marginal. For a further
discussion of irf-k, see De Marsily (1986).
The following components of variogram models must be considered when fitting to
experimental variograms:

Slope at the origin
Nugget effect
Range
Sill
Anisotropy

A discussion of the effects of the model type (primarily determined by the slope at the
origin and the nugget effect) will be found later. However the following diagram shows
the consequence of kriging a sparse data set using different range spherical models.

DEM example
The data shown here are taken from a digital elevation map (DEM) of the south east
of England. The original data was kindly supplied by Nigel Press Associates Ltd from
their EuroDEM dataset. The DEM is supplied sampled on a 100 x 100 m cell size; this
has been re-scaled to give a 20 x 20 m cell size by simply dividing the coordinates in
X & Y by 5. An area of interest (AOI) has been chosen to simulate a tilted fault block
trap and is approximately 10.8 x 6.6 km in extent. This is shown as the white rectangle
on the accompanying figure. The elevation data has been shifted to a depth of about
2000 m. The grid definition for the AOI is
x origin 103400
y origin 28200
x mesh 100
y mesh 100
x nodes 109
y nodes 67
A perfect 3D migration with a 100 m Fresnel zone (50 m radius) has been simulated
by applying a 5 x 5 (25:1) moving average filter to the full dataset. The parameters
are equivalent to a velocity of 3,000 m/s and a dominant frequency of 60 Hz (Yilmaz,
1987). An enlargement of the AOI showing the raw and 25:1 smoothed data is shown
for comparison.
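A minimal Python sketch of this smoothing step, assuming the DEM is held as a 2-D numpy array (the grid here is a synthetic stand-in); scipy's uniform_filter applies the 5 x 5 moving average.

    import numpy as np
    from scipy.ndimage import uniform_filter

    rng = np.random.default_rng(6)
    dem = rng.normal(loc=-2000.0, scale=50.0, size=(200, 300))  # stand-in elevation grid (m)

    # 5 x 5 (25:1) moving average, simulating the Fresnel-zone smoothing of migration.
    smoothed = uniform_filter(dem, size=5, mode="nearest")

    # The variance is reduced by the averaging (the support effect discussed below).
    print(dem.var(), smoothed.var())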

(Data courtesy Nigel Press Associates. EuroDEM NPA 1997)

2D seismic acquisition has been simulated with seismic lines running in a grid pattern
approximately N-S and E-W. The nominal line spacing is 2 km with an along line
sample increment of 250 m. An exact fault line defining the updip closure for the raw
data has been added. This is simply the 2,000 m depth contour along the southern
edge of the escarpment.

Raw DEM Data

25:1 Smoothed Data

Exhaustive variograms for the raw and 25:1 smoothed data have been computed and
displayed. The exhaustive variograms appear very similar at first glance, apart from
a slight reduction in maximum variance. However, closer inspection of the short lags

shows that the smoothing changes the short offset behaviour
of the variogram very significantly. The inflection near the origin (like the Gaussian
variogram model) is characteristic of a smooth phenomenon. Variograms of seismic
time horizon data clearly exhibit a similar behaviour, even when snapped picks are
used. It is suggested that this is a Fresnel effect and it would be expected to be more
significant for 2D than for 3D seismic, due to migration of the former being unable to
collapse the out-of-plane Fresnel zone.
Support is a sample volume concept. Two measurements are at the same support
if they sample at the same volume size and dimensions in X, Y and Z. Sampling at
different support sizes results in some well known effects (which were noted by Krige,
back in 1951). Specifically, as the support increases so the variance decreases.
Using the variogram and the pseudo-2D seismic lines, the depths have been estimated
on a regular grid of points. The kriged map is the expected value at a given point, and
with it we obtain the standard deviation map, which indicates the confidence we have in
the estimate. The standard deviation map follows the data sampling arrangement. Our
confidence in the prediction is high close to a seismic line and reduces away from the
data control.


(Figures: kriged depth map and kriging standard deviation map)

Based on these results, we can test the accuracy of our depth predictions against the
original satellite DEM data. The results are as follows

                   MAE (m)   RMSE (m)   Variance (m2)
5x5 vs Raw            5          8            61
Kriged vs 5x5        11         16           248
Least squares        14         19           342
Inverse distance     13         19           351


Clearly all methods give reasonable prediction accuracy but Kriging is best in that it has
minimised the variance of error.
Next we will look at the gross rock volume prediction accuracy based on a cutoff of
2000 m.
The volumes are:

                   Support (FD, Hz)   GRV (Mm3)   Relative to Raw GRV (%)
Raw DEM                  265             296                 0
5x5                       55             261               -12
9x9                       30             227               -23
Kriged                    55             201               -23
Least squares             55             186               -29
Inverse distance          55             121               -54

Clearly we have serious bias in the volume estimation. The next section will seek to
explain what has gone wrong and how we can correct it.


STOCHASTIC SIMULATION
Uncertainty is subjective. It is not possible to quantify uncertainty, but we can use
geostatistical tools to explore uncertainty. Although we obtain a confidence interval
from kriging (the standard deviation map), there are aspects of its interpretation which
should be considered carefully. Firstly, its magnitude is solely a function of our model,
specifically the sill of the variogram, and this is often a subjective decision.
If we accept our kriging error variance maps as defining meaningful values, then we
could obtain an upper and lower bound for, say, the depth to a horizon by simply adding
or subtracting the confidence limit from the kriging estimate (our best estimate) to
obtain a minimum and maximum depth value for the horizon. Having done this, it may
be tempting to subsequently use the resulting best estimate, minimum and maximum
95% depth confidence maps to derive three volumetric scenarios - best estimate, lower
case and upper case.
An example of this approach is given in the DEM example. The kriging result has a
volume of about 190 Mm3 and adding and subtracting 1.28 standard deviations of depth
(giving P90 and P10 depth maps) results in GRV estimates of about 80 - 400 Mm3 ,
depending on the model chosen. The true volume is 261 Mm3, which makes the result
rather subjective and prone to interpretation as to whether the kriged map is a mean,
most likely or P50 map.
So what is the problem? Nothing really, except that we are making an assumption
which is not true. The error variance is related to the confidence of obtaining a depth
value. In other words, it is a confidence interval around an estimated (predicted)
depth. The estimation of a volume or area is a non-linear function of the error of depth
estimation.
Kriging (and other methods of mapping found in commercial software packages) is
a linear, unbiased estimator. The estimate is linear because it is a weighted linear
combination of sample values. It is just like the mean or average calculation but with
the weights adapted to account for the spatial positions of the samples. The resulting
prediction (expected value) is smoother than the actual values and represents our best
estimate of the true surface at that point. It is not the surface itself, which is unknown.
Kriging, like the mean, is unbiased. This means that the average error is about zero,
with positive and negative errors cancelling out.


If we plot the CDF of the data and of the kriged surface we see that the effect of
smoothing is a CDF which is narrower and steeper. The calculation of a volume does
not use all the data: it is an integration between limits of the height of each grid node
above the OWC. Depths below the OWC are ignored. Inspection of the detail of the
CDF shows that the kriged result has fewer samples shallower than the cutoff and that
their average value (column height) is also smaller. Multiplying the two gives the volume
(each grid node is 1 x 10^4 m2).
              % < 2000 m   Average column (m)   GRV (Mm3)
Raw Depth        13.3             30.5             296
Depth 25:1       13.2             27.1             261
Kriged           12.1             22.8             201

The problem lies in the non-linearity of estimating the value above a cut-off from
a surface estimate. Consider the problem in the diagram above. On the left, the
uncertainty never impinges on the cut-off and so the volume is correctly estimated,
but in the example on the right the volume will be zero for the deterministic case and
+0.25 for a Monte Carlo simulation. The deterministic (smooth) case will always
underestimate the volume (bias).

In the cartoon above, a series of measurements have been made of the sea depth
(yellow dots) between London and New York. A bathymetric cross section has been
constructed by interpolating between these points (kriging, shown in green). The error
envelope around the kriging is shown as the string of sausages (red dashed line) and
is zero at the data points. In order to navigate a submarine between London and New
York as close as possible to the bottom, the kriged result is the correct line to follow,
although a risk averse captain will want to take account of the error and pilot a course
closer to the shallower error line.
However, consider measuring the length of the kriged line, or even the error surfaces,
and using the longest of these lengths to establish the length of cable required to lay
a telephone line between New York and London. Clearly the estimate will always be
too short and the ship will run out of cable at about the position shown in the cartoon.
Different estimations require different methods of prediction.
Another analogy of the same type of problem is the classic measurement of the length
of coastline of the mainland UK. The length of the coastline depends on the length of
unit measurement used to determine it. As the unit of measurement decreases in size,
so the length increases. This is now a well known example of a fractal relationship,
and the idea is closely related to the subject of simulation. This is known as the
Steinhaus paradox.
Using a high side and low side map based on smooth estimation can never give
correct volumes. The variogram of the kriged surface is not the same as the modelled
variogram estimated using the observed data. The basic RF assumption of ergodicity
implies that a smooth mapped surface cannot be a possible realisation.
Secondly, to assume that all of the depth errors are positive or negative at the same time is
implausible: it is extremely improbable that all the depths are simultaneously high or
low. Therefore the high/low volume estimate approach must exaggerate the range of
GRV.
The common method of determining the spill point and then using higher and lower
contours on the same deterministic map is tantamount to assigning a degree of fill
range as being the volumetric uncertainty range. This is erroneous.
In order to correctly estimate volume we need to use the geostatistical technique of
conditional simulation. This requires that we generate possible surfaces which honour
the observed data points and the variogram model. This cannot be achieved uniquely,
and so we are required to generate many equi-probable realisations of the depth map,
estimate the volume for each and construct a GRV expectation curve. Additionally, by
converting each grid node to a Boolean result of above the OWC, and adding many
realisations together, we can also compute the isoprobability closure map, expressing
the likelihood at each node of being above OWC.
As in the simple case of tossing a coin or in the random function examples shown
earlier, we may not be interested in the expected value but in an outcome or realisation
of the variable. In the case of depth, this is the case if we wish to estimate volume
above a cutoff. The reason is that the estimation process obtains accuracy at the
expense of the higher order statistics. This means that the CDF is not reproduced.
Instead, the CDF of the estimate is narrower and steeper than that of the raw data
(which is a realisation). However, the CDF of the estimate is centred such that it is
unbiased.
(Figure: six example realisations for the DEM data)

Unfortunately, if we wish to investigate a property such as volume, we truncate the CDF
at a threshold and only consider part of the curve. The predictions will then be biased.
The solution is to work only with realisations for any procedure that requires the CDF
to be truncated. Because each realisation is not unique, we must generate many
realisations, analyse each one and then look at the average of the computations.
In the DEM example, we can generate realisations of the depth map. 100 maps were
computed, and six are illustrated. We compute the gross rock volume for all 100 maps,
sort the results into magnitude order and plot the CDF of GRV. The average of these
volumes is then our best estimate of the GRV.

In addition we can estimate the probability of a grid node being above closure. For
each map we test each node. If the node is above contact we set it to a 1, otherwise
to 0. We then sum the 1/0 maps and obtain the probability of the nodes being above
closure. This is known as the isoprobability closure map.
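As a minimal sketch of this bookkeeping (array names and shapes are illustrative assumptions, not part of the course data set), the GRV expectation values and the isoprobability closure map can be computed from a stack of simulated depth maps as follows:

    import numpy as np

    def grv_and_isoprobability(depths, owc, cell_area):
        # depths: (n_realisations, ny, nx) simulated depth maps (m, positive down)
        # owc: contact depth (m); cell_area: grid cell area (m^2)
        above = depths < owc                                         # 1 where a node is above contact
        grv = ((owc - depths) * above * cell_area).sum(axis=(1, 2))  # one GRV per realisation (m^3)
        iso = above.mean(axis=0)                                     # isoprobability closure map
        return np.sort(grv), iso                                     # sorted GRVs give the expectation curve

The mean of the sorted GRV values is then the best estimate of volume, and plotting them against their cumulative rank gives the GRV expectation curve described above.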


Simulation Algorithms
There are three important criteria to be met in generating acceptable realisations,
which are:

• Conditional (tied) to measured data points such as wells
• Reproduce the CDF (which will ensure the mean, variance, histogram and other statistics are honoured)
• Reproduce the spatial correlation behaviour represented by the variogram model.

For the continuous variable types considered so far in this course, there are three
basic algorithms by which a realisation can be generated. All of these methods are
maximum entropy solutions. In other words they generate a realisation that is the
most disorganised result whilst still reproducing the required spatial correlation function
defined by the variogram model.

Turning Bands
Turning bands is the oldest algorithm implemented for the generation of conditional
realisations. It is based on exact 1D simulations along lines which constitute a
regular partitioning of a 2D or 3D space. The method is very fast, but generates
non-conditional realisations. Each realisation is then conditioned to the actual data points
by extracting the simulated values at the positions of the original data. These values
are kriged and the resulting surface subtracted from the non-conditional realisation
giving zero values at the data points and a simulated residual everywhere else. The
simulated residual is then added to the surface obtained by kriging the real data points
to give the conditional realisation.
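Written out in symbols (with Z_nc the non-conditional realisation, Z*_nc the kriged surface of the non-conditional values extracted at the data locations, and Z* the kriged surface of the real data), this conditioning step is

    Z_c(x) = Z*(x) + [ Z_nc(x) - Z*_nc(x) ]

which by construction honours the data exactly, since the bracketed residual is zero at the data points.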
The additional kriging step makes turning bands comparable in speed to other methods
when used for conditional simulation. Turning bands suffers from a tendency to
generate linear artefacts (related to the lines constructed to partition the space) and this
can be a problem. The solution is to increase the number of lines used for partitioning,
but this can slow the method down so much as to be impractical.

Spectral Methods
A modern and very fast alternative to the turning bands method is to use a fast Fourier
transform (FFT). The covariance function, which in xyz space is the equivalent of the
autocorrelation function in geophysics, is transformed using the FFT. In the frequency
domain the covariance then gives the amplitude spectrum, with a zero phase spectrum.
A non-conditional realisation is obtained by adding random phase and performing an
inverse FFT. Conditioning is by kriging, as in turning bands.
By saving the weights for the kriging, the method can be made very fast. It also avoids
the artefacts of turning bands.
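A minimal 1D sketch of this spectral approach is given below; the exponential covariance model, grid size and scaling are illustrative assumptions rather than a description of any particular package, and conditioning would follow by kriging exactly as for turning bands:

    import numpy as np

    def fft_realisation_1d(n, dx, sill, corr_range, seed=0):
        # Unconditional 1D realisation by the spectral (FFT) method sketched above.
        rng = np.random.default_rng(seed)
        lags = np.minimum(np.arange(n), n - np.arange(n)) * dx   # lags on a periodic grid
        cov = sill * np.exp(-3.0 * lags / corr_range)            # exponential covariance model
        spec = np.maximum(np.fft.fft(cov).real, 0.0)             # spectral density of the covariance
        phase = np.exp(2j * np.pi * rng.random(n))               # random phase, zero-phase amplitude
        field = np.fft.ifft(np.sqrt(spec) * phase)               # back to the space domain
        return np.sqrt(2.0 * n) * field.real                     # scaled so the variance equals the sill

With pure random phase the simulated values are only approximately Gaussian; drawing complex Gaussian coefficients instead of unit-amplitude phases is a common variant.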


Sequential Gaussian Simulation

SGS is the most straightforward algorithm for generating conditional realisations. A
simple sequential method was encountered when we constructed the random function
models V(x) etc. SGS is now regarded by most vendors and geostatisticians as the
standard method of generating conditional realisations.
The method is very simple. Randomly select a grid node and krige using the input
data. This will give an expected value Z* (the kriged result) and a variance σ² (or,
usually, the standard deviation σ). Assuming a Gaussian distribution, draw a random
number from the distribution N[Z*, σ²] and place the random number on the grid.
Repeat the above procedure at a new randomly selected grid node. However, when
kriging the next position, the previously simulated node is included as though it were a
real data point. Thus the realisation is sequentially generated and converges to have
the appropriate CDF and spatial correlation function.
Clearly, being based on kriging, SGS is conditional by construction. There are detail
considerations in the algorithm. For example, the speed is usually improved by moving
the real data onto the nearest grid nodes as the first step. Including too many simulated
nodes will swamp the original data, resulting in a simulation with uncorrelated noise,
so the number of previously simulated nodes used is commonly a user-defined control parameter.
The method is strictly only valid for simple kriging and so some initial normalisation
steps are usually required to satisfy this condition. Finally, the long range behaviour
of a variogram model is best reproduced if the early grid nodes randomly selected are
forced to be far apart.
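A deliberately minimal 1D sketch of the SGS loop is given below, assuming simple kriging with a known zero mean (i.e. the data have already been normal-score transformed, as noted above); the exponential covariance, neighbourhood size and parameter names are illustrative assumptions:

    import numpy as np

    def sgs_1d(data_x, data_z, grid_x, sill=1.0, corr_range=500.0, n_neigh=12, seed=0):
        rng = np.random.default_rng(seed)
        cov = lambda h: sill * np.exp(-3.0 * np.abs(h) / corr_range)  # covariance model
        known_x, known_z = list(data_x), list(data_z)                 # wells first, simulated nodes added later
        sim = np.full(len(grid_x), np.nan)
        for idx in rng.permutation(len(grid_x)):                      # random path through the grid
            x0 = grid_x[idx]
            d = np.abs(np.array(known_x) - x0)
            use = np.argsort(d)[:n_neigh]                             # nearest known points only
            xk, zk = np.array(known_x)[use], np.array(known_z)[use]
            K = cov(xk[:, None] - xk[None, :])                        # data-to-data covariances
            k = cov(xk - x0)                                          # data-to-target covariances
            w = np.linalg.solve(K + 1e-10 * np.eye(len(xk)), k)       # simple kriging weights
            z_star, var = w @ zk, max(sill - w @ k, 0.0)              # kriged value and variance
            sim[idx] = rng.normal(z_star, np.sqrt(var))               # draw from N[Z*, sigma^2]
            known_x.append(x0); known_z.append(sim[idx])              # simulated node becomes data
        return sim

A back transform of the simulated values to the original distribution would follow in practice.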

Simulated Annealing
Although rarely used for something as straightforward as a conditional realisation
based on a variogram model, simulated annealing is a very adaptable (but slow)
method which deserves a mention.
To generate a conditional realisation on a regular grid, the original data are usually moved
to the nearest grid nodes and then the remaining grid nodes filled in with random
numbers so as to recreate the desired CDF. However, this then results in a realisation
comprised entirely of uncorrelated random noise.
The conditioning to a spatial function defined by the variogram model is generated
by a process rather similar to how materials cool. Firstly, the variogram of the grid is
computed and the difference between this and the required variogram is calculated.
The required variogram is known as the objective function. Next, two grid nodes are
selected at random and their values swapped over. (Grid nodes where real data
values were placed at setup are excluded from this selection.) After the swap has
been made, the variogram is recalculated and compared to the objective function. If
the result of the swap is to reduce the difference between the calculated variogram and
the objective function then the swap is retained. Otherwise the swap is rejected and
the values returned to their previous positions.
The procedure of swapping and comparison with the objective function is repeated until
an acceptable match between the calculated and desired variogram is obtained, at
which point an acceptable realisation is generated.
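The swap loop can be sketched as below: a greedy version of the procedure just described, with the objective function taken as the squared mismatch of the experimental variogram at a handful of lags (a full simulated annealing implementation would also accept some objective-worsening swaps under a cooling schedule). All names and parameters are illustrative assumptions:

    import numpy as np

    def swap_condition_1d(values, target_gamma, lags, n_swaps=20000, fixed=None, seed=0):
        # values already reproduces the desired CDF; fixed marks nodes holding real data
        rng = np.random.default_rng(seed)
        z = values.copy()
        fixed = np.zeros(len(z), bool) if fixed is None else fixed
        free = np.where(~fixed)[0]                                  # data nodes are never swapped

        def gamma(arr):                                             # experimental semivariogram at the lags
            return np.array([0.5 * np.mean((arr[l:] - arr[:-l]) ** 2) for l in lags])

        obj = np.sum((gamma(z) - target_gamma) ** 2)                # mismatch with the objective function
        for _ in range(n_swaps):
            i, j = rng.choice(free, 2, replace=False)
            z[i], z[j] = z[j], z[i]                                 # trial swap
            new_obj = np.sum((gamma(z) - target_gamma) ** 2)
            if new_obj < obj:
                obj = new_obj                                       # keep swaps that improve the match
            else:
                z[i], z[j] = z[j], z[i]                             # otherwise undo the swap
        return z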

The method is very slow and can suffer from periodic artefacts, depending on how the
calculations are made. For straightforward generation of conditional realisations, any
of the alternative methods are preferable. The specific advantage of the simulated
annealing technique lies in its flexibility. Providing an objective function can be defined,
any information can be used for conditioning.

Variogram model influence


If we map a small number of data points using different variogram model types, we
obtain maps of the expected value. As shown in the accompanying diagram, these
maps are quite simple in form and fairly similar, all comprising bullseyes around the
data points, dropping away to the mean value at greater distances.
However, if we generate a realisation from each variogram type the results are very
different. The realisation generated from a nugget variogram model comprises
uncorrelated noise which is standard Normal in its distribution. Then, in varying
degrees of smoothness, we progress through the exponential, spherical and Gaussian
variogram models. The PDF in each case is standard Normal; what differs is the
organisation of the points. If we cut the whole map at a given threshold, every map
would give the same volume (approximately), but the connectivity between the islands
where the samples are above the contact would be very different. The exponential
would give rise to a large number of small, disconnected islands whereas the Gaussian
would result in a few large islands. The spherical model result would lie between these
two cases. This can also be seen in the cross-sections through the realisations.

(Figures: kriged maps and simulated realisations for each variogram model type)


(Figure: cross sections through the realisations)

Connectivity
We can measure connectivity through the use of multipoint statistics. The variogram
is a two-point statistical measure. Journel et al (2000) discuss the difference between
Kriging and simulation in terms of connectivity.
A semivariogram or covariance model represents a measure of variability or
connectivity between any two points u and u+h separated by a given vector h (two-point
statistics). We can define connectivity by looking at indicator values, where a
threshold Zc is exceeded or not. We can compute the probability of n locations along a
given direction to be all valued above the threshold using:
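In the usual indicator notation this multiple-point probability can be written

    P(n; Zc) = Prob{ Z(u + jh) > Zc, j = 1, ..., n } = E[ i(u + 1h; Zc) · i(u + 2h; Zc) · ... · i(u + nh; Zc) ]

where i(u; Zc) is the indicator equal to 1 when Z(u) exceeds the threshold Zc and 0 otherwise; the variogram, by contrast, only ever involves pairs of points.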

The greater the spatial entropy of the field, the faster the decrease to zero of the
probability P(n;Zc) as n increases. Because of smoothing, kriging will generate
surfaces with too large P-connectivity functions for certain threshold values Zc. In
other words, calculating connectivity through any result which is Kriging-like in its
computation (ie a minimisation) will result in connectivity being significantly overestimated.
If we consider these statements along with the underestimation of volume previously
described, then we have the following, apparently contradictory, consequences of


prediction under uncertainty:

• The volume of an accumulation is underestimated (for a depth/OWC case)
• The area of an accumulation is underestimated
• The accumulation appears more contiguous/connected than it really is
The last point can be illustrated with 2D and 3D seismic mapping. Structures or fields
mapped with 2D seismic generally appear more compartmentalised when subsequently
mapped with 3D seismic.


DETERMINISTIC, STOCHASTIC AND BEST ESTIMATE


With the adoption of geostatistics and stochastic modeling it has become very
popular to refer to conventional maps as deterministic and to describe the results
of geostatistics as stochastic. Deterministic maps which have been calculated with
different parameters (eg different depth conversions) are sometimes referred to as
realizations. This is incorrect.

• Deterministic refers to a model from which predictions are determined directly via a functional relationship. Examples are Darcy's Law, chemical reaction rates and Newton's laws of motion.
• Best Estimate is a model which minimises the prediction error. This includes kriging, regression models and Wyllie's Equation.
• Stochastic is a model which generates non-unique solutions and honours higher order statistics such as spatial correlation and the histogram.

All estimation or prediction requires the use of a model. Very few earth science
processes are sufficiently well understood to allow the use of deterministic models in
estimation.

Deterministic model

• A functional relationship is known. Output results are directly determined from input parameters via a process model.
• Error on output is proportional to error on input.
• Few (no?) earth science processes can be modelled deterministically. Yet. Deterministic reservoir modelling schemes are the subject of research and development.

Best estimate model

• A model which minimises the prediction error.
• Prediction accuracy is obtained by estimator shrinkage: the output range is less than the input range.
• Estimator shrinkage means prediction accuracy is at the expense of higher order statistics (reduction of variance).
• Best estimate models are for prognosis, not higher order statistics.

Origin of uncertainty
Uncertainty arises because of

• Imprecise measurement, which is often disregarded, although it can be specified in variograms via the nugget component.
• Non-exhaustive sampling, which is the uncertainty we associate with spatial interpolation. This is the principal objective of kriging.
• Uncertainty = (Z* - Z) and is therefore not quantifiable. In other words, it is the difference between a known and an unknown value, which by definition is not known.


• Selection of an appropriate method to predict the unknown value of an attribute at an unmeasured location. "The method of analysis should capture the essential features of the data for prediction. It should not necessarily treat the data as exact, nor should unnecessary cosmetics that may have no physical basis, such as smoothness, be imposed on the prediction method." (Laslett and McBratney, 1990)
• Uncertainty arises from the decision of a model.

Cui and Blockley (1990) suggest that uncertainty contains three distinct components:
fuzziness, incompleteness and randomness (FIR).
Fuzziness is the imprecision of definition. It occurs because of the way we choose
to express the parameter with which we are dealing. For example, we could describe
top reservoir depth in a well as:
(a) Very deep
(b) 3,200 m
(c) 3,226.59 m
These three descriptions have varying degrees of precision associated with them. The
level of description (or fuzziness) must be chosen appropriately for the problem so that
extremes of spurious precision or unhelpful vagueness are avoided.
The traditional way of expressing a crossplot is by fitting a function (often a straight
line) through the points by a regression analysis. The function then expresses the
relation but the obvious scatter has to be accommodated. This is done by modelling
the difference between the function and the actual measurements as a random
quantity.
Incompleteness exists because there is that which we do not know. Since a model
is by definition not the reality, it is only partial and is incomplete. Incompleteness falls
into two groups: that which cannot be foreseen, which would include the existence of
phenomena which were previously unknown, and causes which are foreseeable but
are:


(a) ignored in error
(b) ignored by choice
(c) ignored because of lack of resource

All of the above are sometimes referred to as bias. Incompleteness resulting from lack
of resource is dramatically illustrated when a comparison of interpretations based on
2D and 3D seismic is made.
Randomness is the lack of a specific pattern in the observation or series of
measurements. If the variation in some measurement cannot be explained in terms of
a deterministic cause and effect it is usually referred to as being random. Statistics and
probability theory is about deriving patterns in populations of data where there is no
discernible systematic relationship.
Consider the depth to top reservoir in a well. If we make many measurements of the
depth under standard conditions they will vary because it is not possible to repeat
the measurement experiment exactly. If we plot a histogram of the results and fit a
curve such that the area under the curve is one then we have a probability density
function (pdf). The depth can now be thought of as a random variable. The pdf may
be represented by characteristics of the distribution such as the moments. The first
moment is the mean (expected value) and the second moment is the variance (square
of standard deviation).

Stochastic Model

• Prediction accuracy is sacrificed in order to accurately reproduce the higher order statistics such as the variance, CDF and spatial correlation properties.
• Non-linear properties of the input can be investigated without risk of bias. These operations include truncating the CDF with a cutoff or fluid flow calculations.
• Uncertainty can be parameterised and explored (but not quantified, since this requires knowing Z - Z*).
• The realisations are equi-probable.

Best estimate and stochastic models are complementary. We choose one or the other
depending on our estimation objectives: best estimate for prediction/prognosis (linear
problems) or stochastic for volumes/connectivity/fluid flow behaviour (non-linear
problems). The best estimate is the average of the (infinite) set of realisations.
Different best estimate cases produced with different parameters (for example,
different depth conversion layers) are not realisations. In geostatistics, we refer to the
best estimate as kriging - a minimum error variance estimate. In statistics, the best
estimator is the mean.


Best estimators and cutoffs


It was the observation that cutoff predictions based on best estimators result in bias that
led Danie Krige to his thesis and gave rise to a new science of geostatistics.
If we consider a simple cross-plot relation, such as the permeability-porosity example
shown below, we can predict permeability at unmeasured locations (such as on a
porosity log) by obtaining the linear regression line.

• The linear regression equation is the best predictor of permeability from porosity under certain assumptions, which include a Gaussian distribution of error.
• It is common to associate a cutoff on permeability with a porosity cutoff. In this instance 1 mD = 9.5% porosity.
• When we apply the porosity cutoff to the cross-plot, we are counting the samples in quadrants B+D. This defines 92.5% of the samples as belonging to net reservoir.
• The permeability cutoff really intends to classify samples in quadrants A+B as net reservoir, which would be 82.5%.
• It is only correct to apply the cutoff procedure by transform if r = 1!
• Cutoff calculations require higher order statistics to be honoured in order to avoid the risk of bias (a numerical sketch of this quadrant-counting effect follows below).
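The quadrant-counting effect can be reproduced with a few lines of synthetic data; the numbers below are purely illustrative and are not the course data set, and only the comparison of the two net fractions matters:

    import numpy as np

    rng = np.random.default_rng(1)
    n = 2000
    phi = rng.normal(0.18, 0.05, n)                           # synthetic porosity (fraction)
    logk = 3.0 + 25.0 * (phi - 0.18) + rng.normal(0, 1.2, n)  # noisy log10 permeability (mD)
    k = 10.0 ** logk

    k_cut = 1.0                                               # permeability cutoff, 1 mD
    a, b = np.polyfit(phi, np.log10(k), 1)                    # regression of log10(k) on porosity
    phi_cut = (np.log10(k_cut) - b) / a                       # porosity cutoff implied by the fit

    net_from_k = np.mean(k >= k_cut)        # quadrants A+B: what the permeability cutoff intends
    net_from_phi = np.mean(phi >= phi_cut)  # quadrants B+D: what the porosity transform counts
    print(net_from_k, net_from_phi)         # the two net fractions differ unless r = 1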


SUPPORT
The support of a measurement refers to the volume over which the
measurement has been made. For example, a core plug porosity is made
over a volume of about 20 cm3 of rock. A log measurement, such as bulk
density, is made over a larger volume of rock, dependent on the tool
resolution and depth of investigation. Seismic data measures the earth
response as an average over a large volume determined by the frequency
content and bandwidth.
The support of a measurement is a key issue in geostatistics and accounting
for changes in support is considered a key step in mining applications of
geostatistics. Unfortunately the support question is largely ignored in the oil
industry and very few software packages provide tools for correctly handling
prediction with change of support.
If we krige a grid of values using average porosity data then strictly we are
making our estimates at the same scale of support as our measurements.
During scale-up in reservoir models, the regular support obtained from well
log measurements is often changed to an irregular support within the cells of
the model, due to varying cell sizes, well geometries and the algorithms used
for simulation and estimation.
As the support of a measurement increases, so the variance decreases. For a linear
averaging property the mean is generally unaffected by a change of support.
Changes in support can change the form of the functional relation between two
properties, and this is particularly the case when the two properties average in
different ways, such as porosity and permeability. A change in the support of seismic
measurements will modify or even destroy a relation between a seismic attribute and a
property such as porosity.
The figure below is from Partyka et al (2000) and shows how the relation
between acoustic impedance and effective porosity changes with effective
scale of measurement, both in seismic bandwidth and reservoir thickness.
Seismic properties scale according to a Backus average when the effective
layers are below the resolution.
(Figure: modelled effects of seismic Backus averaging and reservoir thickness, from Partyka et al, 2000)

The key statements we can make concerning an increase in the support of a
measurement are:

• The observed variance is reduced
• The histogram (PDF) becomes narrower about the mean
• The CDF becomes steeper and narrower
• The sill of the variogram decreases
• The range of the variogram increases

The effect of the support (the volume over which we are averaging) will be
greatest for data which have no spatial correlation. Classical statistics then
tells us that the variance of block averages is inversely proportional
to their area or volume. Data sets where extreme values are poorly
connected have high entropy, conversely where extreme values are well
connected we have low entropy. Changing support therefore changes the
apparent connectivity (entropy). Note that most geostatistical methods such
as simulation tend to maximise entropy.
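The classical-statistics statement above is easy to check numerically for spatially uncorrelated values (with correlated data the variance reduction is smaller than 1/n); the numbers here are purely illustrative:

    import numpy as np

    rng = np.random.default_rng(0)
    points = rng.normal(0.0, 1.0, size=(10000, 25))   # 10,000 blocks, each averaging 25 point values
    blocks = points.mean(axis=1)                      # block-support averages
    print(points.var(), blocks.var())                 # ~1.0 for points, ~0.04 (= 1/25) for blocks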
The volume scale change from a well log sample to a reservoir model cell is in
the region of 500,000 : 1. Typically between 10 and 300 log samples are
averaged into a cell. This depends on the cell orientation or, more usually, the
deviated trajectory of the well passing through the cell. Scale-up in reservoir
modelling is largely arbitrary and the cells are unlikely to contain the
appropriate porosity and variability for the support defined by the cells using
currently available methods.

Although we will discuss scale-up in a later section, note that resampling into
cells (nearest point) leaves the support of the measurement unchanged.
Averaging is a smoothing operation and changes the effective scale of
measurement.
The scale change between core data and the reservoir model cell is very
large, perhaps as much as 100 million : 1. With properties such as
permeability it is important to remember that they do not average
arithmetically and so the relation between porosity and permeability defined at
core scale does not necessarily apply at log or cell scale.
For seismic data, the seismic attributes or impedance estimates generally
have a lower resolution than the cells of the model. For this reason seismic
generally provides us with estimates of a zone average property. Currently,
the standard algorithms in Petrel used to constrain estimation of properties
such as porosity using seismic data are incorrect.
Support Effect on Variograms
The change of support of a measurement can be observed on variograms and
is widely studied in the mining industry where ore grades are estimated at one
scale (core) and used to predict average grades over large blocks which are
actually mined out. Smoothing of data gives rise to two effects in variograms:

• An inflexion appears at the origin
• The range increases slightly as the support increases

From a variogram we can estimate the missing variance and the true point sample
variogram. To the right are the DEM variograms we met earlier, with and without
smoothing. The smoothed data variogram (blue) has a lower sill and an inflexion at the
origin.
Zooming into the origin, we can clearly see the change of support. The points marked
in yellow are the points which, when a straight line is fitted to them, give the steepest
slope and most negative intercept with the Y-axis. The fitted line has an intercept of
-107, the missing variance caused by the smoothing.

(Figure: exhaustive variograms of the raw and 25:1 smoothed DEM data, with a zoom at the origin)

Our true variance is estimated by adding this to the observed variance. It turns out to be
a slight underestimate: the true variance difference is about 140.

(Figure: straight line fitted to the short-lag points of the smoothed variogram: y = 1.3584x - 107.58)
There are a number of ways in which observations can be corrected for the support
effect. The basis of most methods is to leave the mean unchanged and correct the data
about the mean using a variance adjustment factor f, defined as the block to point
variance ratio.

Correction methods include:

• Affine correction
• Indirect lognormal correction
• Stochastic simulation based estimates

The affine correction is based on the data quantiles. The quantiles are
transformed (rather like a normal score transform) about the mean using the
following formula:

q' = √f (q - m) + m

where f is the variance adjustment factor and m is the mean. Affine correction, although
simple to apply, is not available in current reservoir modelling software.
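A sketch of the affine correction in code form (assuming f is supplied as the block-to-point variance ratio; as noted above, this is not a feature of current packages and the function name is illustrative):

    import numpy as np

    def affine_correction(values, f):
        # Rescale values about their mean so the variance is multiplied by f
        m = values.mean()
        return m + np.sqrt(f) * (values - m)

    # e.g. shrink point-support values towards the mean for a block variance 60% of the point variance:
    # block_values = affine_correction(point_values, f=0.6)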
The kriging implemented in Petrel (and most reservoir packages) is point
kriging. Geostatistics allows for block kriging, where the estimate is a linear
average over the cell volume. Changes of support are important in estimation,
and ignoring them is a major limitation of current reservoir modelling software.
Porosity and permeability scale changes could be handled much better if the
necessary tools were available. A particular problem is constraining
properties such as porosity to seismic data. Petrel appears to support this
through the collocated co-simulation and locally varying mean algorithms but
in fact their use is incorrect because they fail to take into account the severe
support differences between the upscaled cells used for modelling and the
seismic observations, which are closer to zone averages. This will be looked at
further in the petrophysical modelling section of the course.


ENTROPY AND BAYES THEOREM


Entropy
Although geostatistics relies heavily on the variogram we should consider that variance
does not necessarily act as an adequate descriptor of uncertainty. An alternative is the
quantity entropy.
Entropy H(X) is a statistic that quantifies the intrinsic variability of some variable X and
can be computed from the pdf p(X):
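In the standard Shannon form (with natural logarithms, consistent with the example values below) this is

    H(X) = - Σ p(x) ln p(x)

where the sum runs over the classes of the histogram (or becomes an integral over a continuous pdf).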

Note that entropy is computed from a histogram, not from individual values.
Consider a categorical variable X = {shale, sand}:

• If P(X) = {0.5, 0.5} then H = 0.693
• If P(X) = {0.9, 0.1} then H = 0.325

If entropy is reduced, then there is now less disorder, less uncertainty and therefore
more predictability.
If we consider a univariate variable X = {-2, -1, 0, 1, 2}:

• If P1(X) = {0.03, 0.44, 0.06, 0.44, 0.03}
  o Variance = 1.12
  o Entropy = 1.10
• If P2(X) = {0.09, 0.20, 0.42, 0.20, 0.09}
  o Variance = 1.12
  o Entropy = 1.44

Variance is a measure of deviation from central tendency and is not always sensitive
to uncertainty. P1 and P2 have the same variance but P2 has larger entropy and is
therefore more uncertain (Mukerji et al, 2001).
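These values are easy to verify; a short check, using natural logarithms as above:

    import numpy as np

    def entropy(p):
        p = np.asarray(p, float)
        p = p[p > 0]
        return -np.sum(p * np.log(p))          # natural log, matching H = 0.693 for {0.5, 0.5}

    x = np.array([-2, -1, 0, 1, 2])
    for p in ([0.03, 0.44, 0.06, 0.44, 0.03], [0.09, 0.20, 0.42, 0.20, 0.09]):
        p = np.array(p)
        mean = np.sum(p * x)
        var = np.sum(p * (x - mean) ** 2)
        print(round(var, 2), round(entropy(p), 2))   # prints 1.12 1.1 and 1.12 1.44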
The bounded pdf with maximum entropy is the uniform distribution. If the minimum and
maximum bounds of a uniform distribution are a and b then the entropy H is ln(b - a).
This decreases to -∞ as the interval [a, b] becomes narrower, corresponding to greater
certainty. For a given fixed variance, the unbounded pdf that maximises entropy is the
normal (Gaussian) distribution (Journel and Deutsch, 1993).


Bayes Theorem
Consider the following problem:

• A man is knocked down and injured by a taxi cab in a hit & run incident (the driver doesn't stop).
• In the city where it happens, taxis are either green or blue. 85% are green and 15% are blue.
• A witness reports seeing the man hit by a blue cab, but when tested later during the court case the witness was only able to distinguish correctly the colour of a cab 80% of the time.
• What is the chance that the cab was blue?

The solution to this type of conditional probability problem can be found by applying
Bayes' Theorem, named for Thomas Bayes, an 18th century English clergyman. This
can be stated as

P(A,B) = P(B|A) P(A)

which means that the joint probability P(A,B) of both A and B occurring is equal to the
conditional probability P(B|A) that B will occur given that A has occurred, multiplied by
the probability P(A) that A will occur. By a series of further steps (see Davis, 1986) we
can deduce the following general equation:

P(Bi|A) = P(A|Bi) P(Bi) / [ P(A|B1) P(B1) + ... + P(A|Bn) P(Bn) ]

The notation and terminology used with conditional probability is rather confusing, so a
worked example may help. We will illustrate the application of Bayes' Theorem using
the taxi problem posed above.
The confusion with probabilities usually arises because people do not work out the
outcomes properly in the form of a tree and then work the probabilities backwards. In the
taxi example, there is a temptation to think of four outcomes, trying to decide if the
witness was right or wrong and which colour cab hit the man. In fact there are just
two outcomes. The man was hit by a cab and it can only be blue or green. The
probabilities from the proportion of cabs of a particular colour in the total population of
cabs are:

• Hit by blue cab: P(B1) = 0.15
• Hit by green cab: P(B2) = 0.85

From the witness we conclude the conditional probabilities are as follows:

• It really was blue: P(A|B1) P(B1) = 0.80 * 0.15 = 0.12
• It really was green: P(A|B2) P(B2) = 0.20 * 0.85 = 0.17

Therefore the chance it was blue = 0.12 / (0.12 + 0.17) = 41.4%, which amounts to
normalising these two outcomes to sum to 1.
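The same arithmetic in a few lines of code:

    p_blue, p_green = 0.15, 0.85                          # prior proportions of cab colours
    p_correct = 0.80                                      # witness reliability
    p_blue_and_said_blue = p_correct * p_blue             # 0.12
    p_green_and_said_blue = (1 - p_correct) * p_green     # 0.17
    print(p_blue_and_said_blue / (p_blue_and_said_blue + p_green_and_said_blue))   # ~0.414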


CO-KRIGING AND MARKOV-BAYES


Full Co-Kriging
The external drift approach described in the previous section is largely based on linear
regression and indeed follows many of the assumptions used in linear regression. Co-kriging
is the simultaneous estimation of two or more variables under an assumption of
a joint spatial model of covariance.
Recall from our basic statistical theory that the correlation coefficient is simply a
normalised expression of the covariance between two variables. If we consider the
covariance between two variables, we perhaps think of a crossplot in order to visualise
the relationship. A conventional crossplot plots the value of one variable against
the value of a second variable measured on the same sample. In other words, the
conventional correlation coefficient (or covariance) is measured on a zero lag (h=0)
between the two variables. It is therefore the first point on a cross-variogram between
the two (or more) variables. We can compute the covariance between variables for
any other lag by taking pairs a vector h apart where one value of a pair is from one
variable and the second value of the pair the other variable. In doing so we obtain
the cross-covariance (or cross-semivariogram if we choose). This describes the joint
spatial behaviour of the two variables as a function of lag. Note that cross-variogram
can be negative as well as positive, because correlation between variables can also be
negative or positive.
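The experimental cross-semivariogram corresponding to this description is usually computed as

    γ_AB(h) = 1 / (2 N(h)) Σ [ A(u_i) - A(u_i + h) ] [ B(u_i) - B(u_i + h) ]

where the sum runs over the N(h) pairs of locations separated by the lag vector h; setting A = B recovers the ordinary semivariogram.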
In addition, we also need to consider the variogram or covariance function for the
variables themselves, in the usual manner which we have seen so far in the univariate
case. Thus to map a pair of variables simultaneously we require three variograms:
those of the primary and secondary variables and the cross-variogram. For three
variables we require six variograms: primary, secondary and tertiary, plus primary-secondary, primary-tertiary and secondary-tertiary cross-variograms.
When mapping two variables simultaneously, we use the cross-variogram to reduce
the uncertainty in mapping each variable using the other variable and use the individual
variograms in the usual way. The matrix inversion includes all these covariances
and results in weights for mapping primary and secondary variables being derived
simultaneously.
Full co-kriging is most appropriate for mapping variables such as porosity and
permeability. The procedure is as follows:

• Compute variograms of the primary and secondary data and the cross-variogram between the primary and secondary data.
• Fit an authorised joint model to the variograms.
• Co-krige.

• No approximation is made about the correlation coefficient or the spatial dependence of the cross relation between the variables. Rolls-Royce integration.
• Uncertainty of the primary and secondary data is accounted for.
• Uncertainty of the cross-relationships is accounted for.


There are some practical difficulties with co-kriging, in particular:

• Variograms of the primary variable and of the primary-to-secondary cross-relation may be difficult to estimate unless there is a lot of primary data.
• Modelling can be difficult, especially with more than two variables. Apparently acceptable models may not have kriging solutions.
• If there is plenty of primary data, the secondary data contributes very little and will receive very little or no weight in the kriging system.

The significance of the last point has led to a simplified co-kriging approach which will
be described next.

Collocated cokriging
It was observed in cokriging that where the secondary variable was available at the
target estimation position, then the only secondary data point which receives any
weight in the cokriging system is the value at that location, all other secondary points
have a weight of zero (which is generally logical, provided the secondary variable does
not contain a noise term). Therefore including secondary data in each estimation,
other than the secondary point coincident with the target node, is superfluous. This is
strictly applicable to problems involving surfaces or attributes obtained from 3D seismic
data.

• Collocated co-kriging is a modification of co-kriging in which only the secondary variable at the target node is retained.
• The method is very fast.
• It is similar to, but more rigorous than, kriging with external drift, because a joint model including the primary and secondary cross terms is used.

The variogram of the secondary variable is not required in this approach (because
there is no requirement to estimate the secondary variable since it is required to be
known at all target locations).

Markov-Bayes hypothesis.
There are a number of practical considerations which have been noted in the previous
methods.

• With 2D seismic data a collocated or external drift method is really not appropriate, as we can only estimate locations where the secondary variable is known.
• Interpolating the secondary variable to all target locations and then applying collocated or external drift methods ignores the uncertainty in predicting the secondary variable.
• Variography for co-kriging can be difficult because of too few data, and the fitted models may then be somewhat arbitrary.


An alternative is to use a co-kriging approach but simplify the model. By assuming a
Markov-Bayes model we can achieve some of this.

• The Markov-Bayes hypothesis is very useful for oil industry data integration problems.
• The secondary variogram is usually straightforward to compute (lots of samples).
• The primary variogram is a scaled version of the secondary variogram: the scalar is the variance ratio.
• The cross-variogram is computed from the variance ratio and the linear correlation coefficient, r.
• When r = 0.999, we have kriging with external drift.
• When r = 0.001, we have the primary variable only.

The model is defined from three inputs:

• Secondary variable covariance model
• Variance ratio between the primary and secondary variables
• Correlation coefficient, r

The covariance model for the primary variable and the primary-to-secondary
cross-covariance model are then calculated from these inputs as follows:
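One consistent way of writing these relations (a sketch only, since conventions differ between implementations), with C_S(h) the secondary covariance model and f taken as the primary-to-secondary variance ratio, is

    C_P(h)  = f C_S(h)
    C_PS(h) = r √f C_S(h)

so that at zero lag C_PS(0) = r σ_P σ_S, as the definition of the correlation coefficient requires.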

The effect of the weighting function on the primary and secondary variables under the
Markov-Bayes hypothesis is shown for various correlation coefficients. When the
correlation is effectively zero then the primary variable only is kriged, the secondary
receiving no weight. When the correlation is 1 the primary only is used at a primary
variable location, but the weight applied to the secondary builds up very rapidly when
moving only a short distance from the primary data point.


These diagrams were generated using a different derivation of Markov-Bayes proposed
by Doyen et al, 1996. In this method, which requires simple kriging and therefore strict
stationarity, the results are obtained by combining the primary and secondary kriging
and standard deviation maps post-calculation, blending them using the correlation
coefficient and the following equations:

Kriging estimate

Standard Deviation

The advantage of this approach is the computational speed is very high and the
blending, controlled by the correlation coefficient, can be done on the fly.


Petroleum Case Study


This case study shows the application of geostatistics to integrate a relative acoustic
impedance slice from a 3D seismic survey with porosity data at the wells. The example
is used by Geovariances as a demonstration data set. Originally the case study from
Amoco was presented in Yarus and Chambers (1994) using only the external drift
method. I am grateful to Richard Chambers for providing me with an original copy of
the full data set to work up the examples shown here.
The full data set comprises 55 wells at which porosity, thickness and porosity-thickness
are measured. A 3D seismic horizon slice through a relative acoustic impedance
volume is available.
Initially we are going to demonstrate some procedures with a subset of the data
comprising just 7 wells.


We proceed to kriging after:

• Exploratory data analysis
• Variogram analysis
• Variogram modelling

The variogram is difficult to estimate and so we have cheated by stealing the variogram
from the 55 wells later on in the analysis.


The result of mapping just the seven wells is shown first, along with the confidence we
have in our estimate, shown by the standard deviation map.

Next we will integrate the seismic information. We start by finding the nearest
impedance values to each well and then cross-plotting the porosity-thickness against
the seismic impedance, as we did for the external drift exercise.


We can test the correlation we have to see if it is significantly different from no
correlation. This can be done with a calculation of the probability of spurious correlation
using a Student t test, a standard statistical test (see Kalkomey, 1997). A Java applet to
make this calculation can be found at www.sorviodvnvm.co.uk. This is very simple to
calculate in a spreadsheet as well, and the formula is given below. Three cells are set up
for entering the values for the number of wells, the correlation coefficient and the number
of attributes considered. A fourth cell contains the formula to calculate the probability of
spurious correlation (Psc). The calculation cell contains the following entry:
=1-(1-(TDIST((R * (SQRT(Nwells-2))/(SQRT(1-R^2))), Nwells-2, 2)))^Nattr
where Nwells, R and Nattr are the 3 cell references where the values are entered. The
Psc may be expressed as a percentage.
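The same calculation can be written in Python (a sketch assuming SciPy is available; the function and parameter names are illustrative):

    from scipy import stats

    def prob_spurious_correlation(r, n_wells, n_attributes):
        t = abs(r) * (n_wells - 2) ** 0.5 / (1 - r ** 2) ** 0.5   # t statistic for the correlation
        p_two_sided = 2.0 * stats.t.sf(t, df=n_wells - 2)         # equivalent of TDIST(..., ..., 2)
        return 1.0 - (1.0 - p_two_sided) ** n_attributes          # Psc for the number of attributes tested

    # e.g. prob_spurious_correlation(r=0.9, n_wells=7, n_attributes=1)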

Next we use collocated Kriging under the Markov-Bayes hypothesis to integrate the
seismic impedance into the estimation. We can easily define the variograms from the
seismic impedance data and then use the Markov-Bayes relation to define the primary
and cross variograms.


Based on the spurious correlation calculation, we can also define the confidence limits
and thus the level of weighting we should apply to the seismic. The 95% confidence
interval on the observed correlation suggests the actual correlation could be as low
as 0.635 or as high as 0.991. Adjusting the weighting gives two alternative results,
representing a low and a high case confidence in the seismic impedance to help us
map porosity-thickness.

We can illustrate the effect of changing the weight on the mapped result by
systematically changing the correlation coefficient used in the Markov-Bayes and this is
shown next.


Finally we repeat the procedure, but this time using all 55 wells. The first observation
is that Kriging with all the wells results in a reduction in uncertainty, as shown by the
standard deviation map.

Regenerating the cross-plots but with all 55 wells shows we were misled by the 7
wells. There is no correlation of thickness with impedance at all and the best relation is
between impedance and porosity.

Finally, we show the Markov-Bayes model overlaid on the bivariate variography to
demonstrate that it is a quite reasonable choice of model, and also to show the primary,
secondary and cross-variograms.


INTRODUCTION & FRAMEWORK


INTRODUCTION
Geostatistics has found a range of applications within the oil industry in the
last 20 years. Initially it found favour in depth conversion and volumetrics,
then in combining seismic attributes with reservoir properties at wells. But by far
the most significant application has been the development of the 3D
stochastic reservoir model.
In the simplest sense a reservoir model is simply a set of grids, one for each
reservoir layer, containing estimates of reservoir properties. The grids may be
irregular and their geometry is often complicated by faults and structure. But
a 3D stochastic reservoir model is far more than that, providing as it does a
platform for the integration of geophysical, geological and engineering
knowledge and the ability to view the results in 3D. These are geologic
models which include geostatistical tools in their construction.
From a geostatistical perspective, a good reservoir model has:

• Layering selected according to flow units or facies
• A spatial correlation model
• The ability to honour the CDF and spatial geometries through realisations
• The ability to generate best estimates and realisations of poro-perm data

Many reservoir models are too ambitious in their scope. The principal
problems are defining a grid and layer definition at too high a resolution. This
leads to large models, well beyond what can be input to fluid flow simulation,
which are also slow to simulate. The second problem is specifying too many
facies. It is important to try and cut down the number of facies or flow units to
the fewest which will capture the essence of the reservoir. This does not
necessarily result in large, average blocks in the reservoir. A thin interval may
be important if it has extreme permeability characteristics: a low permeability
interval may act as a vertical barrier, a high permeability streak may give rise
to early breakthrough. Remember also that two different facies types may
have very similar flow properties, so it may be possible to combine them.
Caution should be exercised here as facies with similar flow properties may
have different geometries which may need capturing in the model.
There are five basic steps in constructing a geological model:

1. Structural framework and faults
2. Grid and layer definitions
3. Scale-up of well logs into grid cells
4. Categorical prediction of facies or flow units
5. Continuous prediction of petrophysical properties within facies




FRAMEWORK
For most 3D reservoir models, the first step is constructed directly from the
depth converted seismic interpretation. Few reservoir models include
uncertainty in the structural framework and this is a major limitation of the
approach used currently in many companies.
The zones are the main intervals representing the geological intervals. They
will usually be mapped as isochores between the depth converted seismic
horizons.
The layers are the fine detail sub-dividing the zones. The layers can be
constructed within each zone as:

• Proportional
• Base or top conformal
• Fractions (eg 1, 2, 1 proportions)

The grid and layer definitions are generally selected in relation to layer
characteristics of the geological setting and are a compromise between
capturing significant detail and creating an over large model with too many
cells.

Segments are areas of the model defined by faults. The main use of
segments is to allow different fluid contacts in different parts of the field.
The simbox is the name given to the layering after flattening the model to
remove faults and structural shape. Simbox geometries are essentially
continuous within the layers and used during the mapping and simulation of
the geology. They can also be selected as a visualisation option.



A question which arises is how thin to construct the layers. One approach is
to build coarse layers. This gives simple models which are quick to compute.
However, it should be noted that this involves cell averaging which may be
quite severe and is not the same as upscaling for permeability.
The alternative is to build fine-scale layered models and then upscale after
estimating or simulating the geological facies and petrophysical properties.
This has the advantage of offering more rigour for the treatment of scale
changes of permeability and allows sensitivity to upscaling to be tested, but
it may result in large models which are slow to load, save and compute.

STRATTON FIELD
STRATTON FIELD DATA SET
The data-set used here as the training set for stochastic modelling is the
Stratton Field 3D seismic and well log data package, prepared by the Bureau
of Economic Geology, Austin, Texas, USA (Levey et al, 1994).
Stratton Field is an onshore gas field producing from the Oligocene Frio
Formation in the NW Gulf Coast Basin, Gulf of Mexico. The Frio Formation is a
sediment-supply-dominated depositional sequence characterised by rapid
deposition and high subsidence rates and forms one of the major progradational
off-lapping stratigraphic units in the basin. Commercially, the Frio Formation
of Texas is volumetrically the largest gas producing interval of ten major
depositional stratigraphic packages in the Cenozoic of the Gulf Coast Basin.

Chronostratigraphy and lithostratigraphy of major Cenozoic episodes of the onshore Gulf Coast Basin (from Levey et al, 1994)

The top of the middle Frio Formation is at about 1,200 ms, corresponding to a
sub-sea elevation of around -4,500 ft. There is little faulting in this
interval and the formations are relatively undeformed and flat-lying.
Reservoir facies of the middle Frio are interpreted as multiple amalgamated
fluvial channel-fill and splay sandstones. The composite channel-fill deposits
range from 10 to 30 ft in thickness and show either an upward-fining or a
blocky log profile. The composite channel deposits can be up to 2,500 ft in
width. Splay deposits show typical thicknesses of 5 to 20 ft and are proximal
to the channel systems. Porosities in these fluvial reservoirs range from 15
to 25 %, with air permeabilities from less than 1 to greater than 4,000
millidarcies.
There are a total of 21 wells in the field. A well display for each of the wells is
included at the end of this section. Wells 01, 02, 03, 04, 05, 06 and 21 have
limited log suites and in particular do not have acoustic impedance data
available. Well 16 is only logged over a deeper section and does not have
any data over the modelled interval.

Base Map of Stratton Field showing well locations

Sands are generally indicated by high impedances and have typical velocities
of 12,000 ft s-1, a 30 ft sand thus being around 5 ms thick in time. In the
example model a simple two-facies definition has been adopted, comprising sand
and shale. Sand facies have been identified with a combination of ILM
(resistivity), Vclay and acoustic impedance cutoffs. An impedance cutoff of
8,000 m s-1 * g cm-3 gives a good seismic discrimination between sands and
shales (see histogram).

Histogram of acoustic impedance by facies type (frequency versus impedance in m/s * g/cc for the sand and shale classes)
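
A minimal sketch of this style of cutoff classification is given below (plain Python; the Vclay and ILM cutoff values and the log arrays are hypothetical, only the 8,000 impedance threshold comes from the text):

import numpy as np

def classify_facies(vclay, ilm, impedance,
                    vclay_max=0.25, ilm_min=2.0, ai_min=8000.0):
    """Return 1 for sand, 0 for shale, using simple cutoffs.

    vclay     : clay volume fraction log
    ilm       : medium induction resistivity log (ohm.m)
    impedance : acoustic impedance log (m/s * g/cc)
    The 8,000 impedance cutoff follows the text; the Vclay and ILM cutoffs
    are placeholders to be calibrated to the field.
    """
    vclay = np.asarray(vclay, float)
    ilm = np.asarray(ilm, float)
    impedance = np.asarray(impedance, float)
    sand = (vclay <= vclay_max) & (ilm >= ilm_min) & (impedance >= ai_min)
    return sand.astype(int)

# tiny synthetic example
print(classify_facies([0.1, 0.6, 0.2], [5.0, 1.0, 3.0], [8500, 7000, 9000]))
# -> [1 0 1]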

StrattonCleanModel.pet
All of the data has been loaded to Petrel and the model framework
constructed. Two seismic horizons have been picked from the coloured
impedance data and used to provide the structural framework. These are a
middle Frio pick (MFRIO) and a pick which corresponds to the top of the
Upper C38 formation.
A simple velocity model has been used for depth conversion. Seismic volumes,
including coloured impedance, deterministic inversion and stochastic inversion
data as well as probability cubes, have been loaded, depth converted and
re-sampled into the model. Two zones have been defined for the model and these
have been sub-divided into five layers.

Zone 1
o MFrio
o B46
Zone 2
o C38 Upper
o C38 Lower
o D11

East-west section showing Coloured Impedance seismic data and AI logs


East-west section showing Coloured Impedance seismic data and Vclay/ILM logs

East-west section showing layer definitions in StrattonCleanModel.pet, overlay Vclay/ILM logs


Typical well showing logs, zone and layer definitions

STRATTON WELL DISPLAYS

Well display plots for each of the 21 Stratton Field wells, one per page.

SCALE-UP LOGS
SCALE UP OF WELL LOGS
Prior to use, the loaded well data must be scaled up to the vertical
resolution of the 3D grid. Scale-up is usually based on some form of
averaging. Scale-up is applied to continuous measurements such as porosity,
but also to discontinuous or discrete logs. Scale-up of discrete measurements
such as facies is usually different to the averaging applied to continuous
measurements and is often based on selecting the most frequently occurring
value. After this type of scale-up it is particularly important to check that
the results are still consistent with the original well logs, in particular
that the scaled-up logs correctly reproduce the original facies proportions
and thickness distribution.
There are many options in reservoir modelling packages allowing different
approaches to scale-up. This includes biasing the method of property
averaging to a discrete log, usually the facies log. Petrel offers the following
algorithms:
Scale-up methods for discontinuous logs:

- Most of (greatest frequency)
- Median
- Minimum
- Maximum
- Arithmetic
- Weighted

Scale-up methods for continuous logs:

- Arithmetic mean
- Harmonic mean
- Geometric mean
- RMS
- Minimum
- Maximum

Scale-up in Petrel can be found under Property Modelling. For facies logs,
the scale-up is usually made using the Most Of algorithm. For porosity log
scale-up the arithmetic mean is usually used, with the additional criterion of
biasing the scale-up using the facies log.
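
To make the averaging options concrete, here is a minimal sketch of the common scale-up averages applied to the log samples falling within one cell (plain Python, independent of Petrel; the sample values are hypothetical):

import numpy as np
from collections import Counter

def scale_up(samples, method="arithmetic"):
    """Average the log samples falling within one grid cell."""
    x = np.asarray(samples, float)
    if method == "arithmetic":
        return x.mean()
    if method == "harmonic":                 # dominated by low values
        return len(x) / np.sum(1.0 / x)
    if method == "geometric":
        return np.exp(np.mean(np.log(x)))
    if method == "rms":                      # dominated by high values
        return np.sqrt(np.mean(x ** 2))
    if method == "most_of":                  # discrete logs, e.g. facies codes
        return Counter(samples).most_common(1)[0][0]
    raise ValueError(method)

poro = [0.18, 0.22, 0.20, 0.05]              # hypothetical porosity samples in a cell
facies = [1, 1, 1, 0]                        # hypothetical facies codes (1 = sand)
print(scale_up(poro, "arithmetic"), scale_up(poro, "harmonic"))
print(scale_up(facies, "most_of"))

Harmonic averaging is dominated by the low values and suits flow in series (e.g. vertical flow across layers), while the arithmetic mean suits flow parallel to layering; this is one reason simple log averaging into cells is not a substitute for proper permeability upscaling.
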
In addition to selecting the averaging method, a decision must also be made
on the method by which to populate the model cells penetrated by the well
path. In common with other reservoir modelling packages, Petrel offers
several methods. Although it depends on the method used, scale-up generally
causes a change in the PDF of the data. Thought must also be given to the
problem of well geometry. The averaging will not be regular within the cells
if there is significant well deviation. Typically, between 10 and 300 log
samples are averaged into a cell depending on the deviated trajectory of the
well passing through the cell.

We should also consider that the difference in measurement volume between a
well log sample and the modelled cell is of the order of 500,000 : 1. No
matter what method of scale-up is chosen, the cell will almost certainly not
contain the appropriate average for the scale of the cell. This has
implications both for the estimated PDF (histogram) and the predicted range of
uncertainty, and is a limitation of current reservoir modelling software.

The general rules for averaging and resampling are:

- Resampling by selecting a subset of data does not change the scale of
  measurement, so the CDF/PDF remain the same as for the original
  measurements. Populating model cells by this approach will result in values
  measured at log scale being used to populate cells defined at a much larger
  scale.
- Averaging is a smoothing operation and changes the effective scale of
  measurement. The PDF becomes narrower and the CDF steeper. For a reservoir
  model, simply averaging the logs into the cells does not result in the
  correct average for the cell size. Depending on cell size variation and
  deviated well path geometries, typically between 10 and 300 log samples will
  be averaged into a cell. This is not the correct scale-up and the result may
  also give rise to a variation of the PDF from cell to cell. For a data set
  with cells oriented perpendicular to the well path (e.g. low relief and
  vertical wells) the arithmetic scale-up is the best choice with current
  software.
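
A quick numerical illustration of the narrowing effect (synthetic, independent samples, purely illustrative):

import numpy as np

rng = np.random.default_rng(0)
logs = rng.normal(0.20, 0.05, size=(1000, 30))   # 1000 cells x 30 log samples each
cells = logs.mean(axis=1)                        # arithmetic scale-up per cell

print(round(logs.std(), 4), round(cells.std(), 4))
# the cell-scale standard deviation is roughly 0.05 / sqrt(30) ~ 0.009,
# so the upscaled histogram is much narrower than the log histogram

Real logs are vertically correlated, so the narrowing in practice is less extreme than this independent-sample sketch suggests, but the direction of the effect is the same.
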
The scale-up options available in Petrel to control which cells are populated
in the model include:

- Simple: all cells that the well trajectory goes through will be populated
  with values according to the scale-up method chosen.
- Through Cell: cells will only be populated if the well trajectory passes
  through two opposite walls of the cell, e.g. top and base.
- Neighbour Cell: averaging is based on all cells immediately adjacent to the
  traversed cell which are in the same layer.

After scale-up it is important to undertake quality control to ensure that
the scaled-up logs are still representative of the original well logs.
Depending on the log type being scaled up, these checks will include looking
at:

- Comparative log plots
- Facies proportions
- Property histograms
- Thickness histograms

As a minimum the original and scaled-up logs should be overlaid and their
histograms compared. In addition, basic statistics should be checked, looking
to ensure reproduction of the mean and to verify that facies proportions are
preserved.
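
As an example, a minimal check that facies proportions survive scale-up might look like this (plain Python; the facies arrays are hypothetical stand-ins for the original and upscaled logs):

import numpy as np

def facies_proportions(codes):
    """Return {facies code: proportion}, ignoring undefined (negative) codes."""
    codes = np.asarray(codes)
    codes = codes[codes >= 0]
    values, counts = np.unique(codes, return_counts=True)
    return dict(zip(values.tolist(), (counts / counts.sum()).round(3).tolist()))

original = [0, 0, 1, 1, 1, 0, 0, 0, 1, 0]    # hypothetical facies log (1 = sand)
upscaled = [0, 1, 1, 0, 0]                   # hypothetical upscaled cells
print(facies_proportions(original))          # {0: 0.6, 1: 0.4}
print(facies_proportions(upscaled))          # {0: 0.6, 1: 0.4}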

A further quality control check available in Data Analysis is to compare the
vertical proportion curves (vertical histograms) within each layer for the
original and upscaled logs. This will be illustrated in the Data Analysis
section.

STRATTON FIELD SCALE-UP
Facies Logs
Scale-up of well logs is found on the Processes tab under Property Modelling.
At the top of the menu choose create new property. In the centre select input
from well logs and select the logs as Facies.
The averaging method to use is Most Of, treating the logs as lines. As all
the wells are vertical, the Simple method for populating all cells penetrated
by a well will be a good choice.
After scale-up, it is important to check the model facies log against the
original facies log (see next page). In particular, it is important to check
that the scale-up manages to capture sufficient detail in the layering and
that there are no major busts between the original and upscaled logs. The most
likely problems will be encountered if the cell size of the model is too
coarse or if the facies changes are strongly cyclical.
Next a check should be made of the facies proportions. This can be done on
the Models tab. Double click the Facies (U) object, select the Disc. Stat. tab
on the pop-up dialogue and check that the upscaled cells for the zone contain
proportions close to those in the original logs. A visual comparison can be
made using the histogram tab.

A final quality check is to inspect the thickness histograms for each
reservoir layer. This can be done using the Data Analysis option in Property
Modelling. Compare the thickness histograms between the original and upscaled
logs. Note that after upscaling, the minimum thickness is determined by the
average cell thickness. Typically the thickness distribution for sands will
show an exponential decline, as seen for the B46 interval of Stratton Field.

Thickness histograms (original and upscaled) for the MFRIO, B46, C38 Upper, C38 Lower and D11 zones

Porosity Logs
Scale-up of the porosity logs is relatively straightforward. Again, create a
new property and select input from the original well logs. The important
element is to use the Use Bias option. This ensures that the samples averaged
come only from intervals with the same facies type as the upscaled cell.
An arithmetic average is an appropriate choice and, as for facies log
upscaling, the Simple method is fine for vertical wells.

Recall that averaging changes the shape of the distribution (the PDF and the
CDF). By comparing the original logs with the upscaled cells we can see how
averaging moves the samples towards the mean, causing the distribution to
become more peaked with fewer extreme values.

Comparison of original (left) and upscaled (right) porosity histograms

DATA ANALYSIS
After creating the structural, zone and layer framework and preparing the well
logs by the application of scale-up, the next stage in a reservoir modelling
study is to undertake exploratory data analysis.
Petrel distinguishes between data analysis for continuous properties (such as
porosity) and discontinuous properties (such as facies logs). For continuous
properties the tools take the form of Transformations. The data analysis
transformations available for continuous properties are:

- Input Truncation
- Output Truncation
- Logarithmic Transformation
- Cox-Box Transform
- 1D Trend
- 2D Trend
- 3D Trend
- Scale Shift
- Normal Score Transformation

For discrete variables, the tools available for analysis include:

- Proportion (Vertical Proportion Curves)
- Thickness
- Probability

Common to data analysis of both continuous and discontinuous data types is
variogram analysis. Variogram analysis can be tackled in various ways in
Petrel, not just in the Data Analysis module, and will be considered in a
separate section.
Discrete Properties
The Proportion tab allows analysis of the facies proportions vertically
through the model, or within each zone or layer of the model. The proportion
curve is a vertical histogram of facies type and is characteristic of the
depositional system within a sequence.
The usual approach would be to fit a function to the vertical proportion
curve in order to ensure that the facies have the correct vertical
distribution within each layer. In this example (right), most of the sand is
concentrated at the base of the interval and we would expect our model to
replicate this.


The vertical proportion curve can be analysed for the original logs or upscaled
logs and is a further useful quality control on the results of scale-up of the
facies logs.

Comparison of vertical proportion curves between original logs (left) and upscaled logs (right)

When modelling vertical proportions, either no trend may be applied (a regular
distribution throughout the interval) or, more likely, a vertical probability
or depth trend will be defined. The vertical probability modelling would
usually be applied separately for each layer. In this example (below) the
trend has been modelled as a distribution from the original logs (it could
also be modelled from the upscaled logs) and then smoothed.

Modelling vertical proportion curves: original (left); fitted (centre); smoothed (right)

Modelled vertical proportion curves for the B46, C38 Upper, C38 Lower and D11 zones
The Thickness tab is used to visualise the thickness distribution of the
facies. Its use has been described in the scale-up section as a quality
control tool. An example is given below comparing the original and upscaled
facies thickness. As noted above, remember that the minimum thickness in the
upscaled log is determined by the average cell thickness. Also note the
characteristic exponential shape of the sand thickness distribution in the
original logs below: a strong argument for not using symmetric distributions
for sand thicknesses in object modelling.

B46 Zone
The Probability tab can be used to constrain a facies model to a continuous
property such as a seismic attribute. It can be used to convert the seismic
attribute cube into a facies probability cube, ready for use in constraining
Facies Modelling using Sequential Indicator Simulation (SIS). The usual
functions for fitting either trends or probability distributions are
available in this panel.

Continuous Properties
The most commonly modelled continuous properties would be porosity or
permeability.
Input Truncation limits the minimum and maximum range of values used in
the analysis and modelling. Any values outside the range set will be set as
null.
Output Truncation is usually used as the final step after a Normal Score or
similar back transformation to ensure that the final modelled property values
stay within required (or physical) limits.
Logarithmic Transform is a straightforward logarithm of the data values,
particularly useful for permeability or other data with lognormal distributions.
Cox-Box Transform (sometimes referred to in the statistical literature as
Box-Cox) is a method of removing skewness from a set of data. The transform
is:

z = (x^lambda - 1) / lambda    for lambda not equal to 0
z = ln(x)                      for lambda = 0

where z is the transformed variable, x is the original variable and lambda is
the Cox-Box transformation parameter, expressing the degree of skewness. The
software will iteratively solve for the best value of lambda down to an
accuracy of 0.1.
A logarithmic transformation of a variable will remove skewness for a variable
following a log-normal PDF. This is the special case of Cox-Box where the
coefficient lambda = 0. Be aware that uncertainty is not transformed linearly
under a logarithmic transform, so a normal score transform may be preferred.
After a Cox-Box or logarithmic transform to remove skewness, the mean of the
variable may not be zero and the standard deviation may not be equal to one.
This can be corrected with the scale-shift transform.
Scale-Shift Transform will correct the data to have a mean of zero and a
standard deviation of one. This is achieved by subtracting the mean from all
data values (shifting to a mean of zero) and then dividing the shifted samples
by the standard deviation, which standardises the variability to a standard
deviation of one. Note that scale-shift does not change the shape of the
histogram.
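
A minimal sketch of the Cox-Box (Box-Cox) transform followed by scale-shift, assuming the standard Box-Cox form given above (plain Python, independent of Petrel; the permeability values are hypothetical):

import numpy as np

def box_cox(x, lam):
    """Standard Box-Cox transform; lam = 0 reduces to the natural log."""
    x = np.asarray(x, float)
    return np.log(x) if lam == 0 else (x ** lam - 1.0) / lam

def scale_shift(z):
    """Shift to zero mean, scale to unit standard deviation."""
    z = np.asarray(z, float)
    return (z - z.mean()) / z.std()

perm = np.array([0.5, 2.0, 8.0, 40.0, 900.0])    # hypothetical permeabilities, mD
z = scale_shift(box_cox(perm, lam=0))            # log transform then standardise
print(z.round(2))
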
Normal Score Transform is a complete approach which forces the variable
CDF to a standard normal distribution N[0,1]. Recall that a standard normal
distribution is essential as the input to Sequential Gaussian Simulation
(SGS), the algorithm which should usually be used for generating porosity and
permeability properties within the model.
The normal score transform should be considered as the standard approach
to be used in preference to other methods in most cases. The normal score
transform may be difficult to apply with very small data sets. In this case
consider a simpler method such as Cox-Box followed by scale-shift, or
manually edit the normal score transform. Care should always be taken with
modelling of the tails of the distribution and truncation of the output.
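
A minimal sketch of a rank-based normal score transform (plain Python using scipy's normal quantile function; the porosity values are hypothetical):

import numpy as np
from scipy.stats import norm, rankdata

def normal_score(x):
    """Map data to standard normal scores N(0, 1) by rank.

    Each value is replaced by the normal quantile of its (averaged) rank,
    so the transformed CDF is standard normal by construction.
    """
    x = np.asarray(x, float)
    p = (rankdata(x) - 0.5) / len(x)      # empirical CDF positions in (0, 1)
    return norm.ppf(p)

poro = np.array([0.08, 0.12, 0.15, 0.18, 0.21, 0.30])   # hypothetical porosities
print(normal_score(poro).round(2))

After simulation, the normal scores are mapped back through the original CDF and output truncation is applied to keep the results within physical limits.
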
Trend removal in Petrel can be in 1D, 2D or 3D. For a 1D trend removal the
direction X, Y or Z should be specified, although it is most common to remove
vertical trends. The distances used in trend removal can be based on the real
coordinates or the simbox coordinates.
Examples of trends might be:

- Vertical porosity-depth trends
- Horizontal map trends
- Spatial volume trends

Trends in 3D, such as those derived from seismic inversion cubes, are best
used as a secondary property in petrophysical modelling under a collocated
co-kriging or co-simulation framework.
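
As an illustration, fitting and removing a vertical (1D) porosity-depth trend can be sketched as follows (plain Python; the depth and porosity values are hypothetical):

import numpy as np

depth = np.array([4500., 4600., 4700., 4800., 4900.])   # hypothetical depths, ft
poro = np.array([0.24, 0.22, 0.21, 0.19, 0.18])          # hypothetical porosities

slope, intercept = np.polyfit(depth, poro, deg=1)         # linear porosity-depth trend
residual = poro - (slope * depth + intercept)             # detrended porosity

print(round(slope, 6), round(intercept, 3))
# the residuals are what goes on to the normal score transform and SGS;
# the trend is added back after simulation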

Important Note: Use of Inversion Products in Reservoir Modelling

It is very important to note that deterministic inversion data should NOT be
used to constrain a reservoir model. Deterministic inversion data MAY be
used if the low frequency component is filtered out to remove the influence of
the wells from the deterministic inversion. A seismic attribute or relative
impedance (such as coloured inversion) CAN be used. The ideal impedance
product to constrain a reservoir model would be impedance realisations
generated from a stochastic inversion. It is important to note that some
stochastic inversion methods (such as Promise) produce output similar to
deterministic inversion schemes and the same caution is advised before using
these as conditioning for the model.

A typical transformation sequence for porosity would include:

- Output truncation to limit minimum and maximum values in realisations
- Input truncation to remove outliers in the input data
- 1D trend removal to account for the porosity-depth trend
- Normal score transform for input to SGS

Stratton Field Porosity Transforms
In the Stratton data the supplied total porosity log has been defined from the
bulk density log. It contains erroneously high porosity estimates for any
non-sand interval. Therefore a first step has been to use input truncation on
the upscaled logs to set the total porosity values to null in the shale
facies. For a reservoir model, the shale facies porosity values would probably
be set to a low or zero value across the model after generating the facies.
Setting the porosity to null in this case allows the complete porosity maps,
generated using only the sand values, to be viewed for the entire model. This
is particularly useful for training purposes.
In the Stratton data set, the best approach appears to be to work over the
whole of the upscaled data and not zone by zone. Working with all the upscaled
cells gives a clear porosity-depth trend which can be modelled and gives a lot
of data for the subsequent normal score transform. Click the Zones button off
and select the sand facies.
1D Trend has been used to model the porosity-depth trend across all the zones.
It is usually worth noting down the depth trend coefficients in case you need
to use this function at a later stage of modelling, or report it. Remember to
select the Z button to model a vertical trend.

Normal Score Transform has been used to convert the residual porosity values
(after removal of the depth trend) to a Gaussian or normal distribution. By
using all zones a large number of upscaled values are available and this makes
the normal score transform more robust. Use the fit function button. In some
cases you may wish to edit the resulting function manually.
After performing the transforms use the copy button to copy the parameters,
select Use Zones and then paste the final transforms to all zones.
A relationship between porosity and impedance has been investigated using a
function window. Crossplotting coloured impedance with porosity suggests that
there is a weak correlation between the two, and this might lead to the use of
the seismic impedance data in constraining the porosity simulation.
However, care should be taken to properly understand the impedance
information. The crossplot only considers the upscaled cells (sensibly, since
these will be averaged to be closer to the seismic resolution). Unfortunately,
this means that we are selectively comparing only sands to the coloured
impedance data.

An inspection of the original log data shows that for clean sands there is a
clear functional relationship between impedance and porosity (Wyllie's
relation; this is Gulf of Mexico data), but the presence of shales means that
it would be incorrect to apply this relation to model porosity within our
model. This is because our simulated facies will not be constrained and could
then be populated with seismic-driven data which is inappropriate at a given
location. The crossplot of sand and shale shows that we must first constrain
the facies distribution in the model using the seismic; the histogram shows
there is discrimination of sand and shale using the seismic data.
Impedance versus porosity crossplot for clean sand and non-clean points (impedance in m/s * g/cc versus porosity in %)

Impedance histograms for sands and shales (frequency versus impedance in m/s * g/cc)

FACIES MODELLING
So far in this course all of the geostatistical examples we have considered involve
spatial estimation of continuous measurements. The variable can take on any value,
subject to some limitations concerning physical bounds.
Some types of data are not continuous measurements. For example, male and female.
We can convert any continuous variable into a categorical variable by classification.
Many sciences, including geology, are pre-occupied with classification of observations
into classes. An example in geology is classification by lithology such as sand or
shale. The problem with these classification systems is that they assume
black/white classes and there is ambiguity in how to handle the shades of grey
in between. The problem is usually solved by defining a cutoff: for example,
rocks with less than 25% shale volume are classified as sand, and those with
more as shale.
Most reservoir models are constructed by initially defining a series of facies (or
sometimes flow units). This is for two reasons:

- Different units may have different shape or form in the earth, such as channels
  or mouth bars, and it is important to capture this form in the model
- Different units may have different porosity-permeability characteristics, so
  the facies distribution is used to provide a template into which poro-perm data
  are mapped or simulated


The facies act as templates for the subsequent simulation of petrophysical properties,
notably porosity and permeability.
We will first consider the simplest problem of mapping classes or indicators using
indicator kriging (and its simulation counterpart). We will also consider some of
the limitations of the indicator methods and, in the following section, the
alternative of object modelling.

Indicator Kriging and Simulation


The original purpose of indicator Kriging was to provide a least-squares estimate of the
conditional cumulative distribution function at different cutoffs for a continuous variable.
However, indicator Kriging has been most used for estimating the indicator transform of
a variable or, even more familiarly, for estimating facies, and this is the application we
are going to consider here. Be aware that the implementation in Isatis is based on the
original purpose and we have to do some quick calculations in File -> Calculator after
running the indicator kriging in order to obtain the individual facies probabilities. The
Indicator simulation implementation in Isatis is set up for facies simulation in the usual
manner of most reservoir modelling packages.
If the variable is a binary categorical variable, e.g. set to 1 for sand and 0 if not
sand, then direct kriging of the ones and zeros provides a model for the probability
that the rock type represented by 1 prevails at the target location.

The following figure shows a basemap posting of sample locations and the rock type
present at each location.

0 = Shale
1 = Channel Sand
2 = Crevasse Splay
Indicator kriging is based on the principle of kriging of a binary indicator. If
we use 1 to indicate the presence of a particular rock type and 0 to indicate
the absence of that rock type then kriging of the 1 / 0 data set will give us the
probability (expected value) of the rock type represented by 1.
By repeating the process with several rock types, redefining the 1 / 0 definition
for each, we would obtain a set of probability maps, one for each rock type.
The sum of these maps will be 1.
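
A minimal sketch of building the indicator variables that feed this process (plain Python; the well facies codes are hypothetical and follow the legend above, 0 = shale, 1 = channel sand, 2 = crevasse splay):

import numpy as np

def indicator_transform(facies, codes=(0, 1, 2)):
    """Return one 1/0 indicator array per facies code.

    Kriging each indicator gives the probability of that facies at the target
    location; the kriged probabilities should sum to 1 (after normalisation).
    """
    facies = np.asarray(facies)
    return {c: (facies == c).astype(int) for c in codes}

wells = np.array([0, 1, 1, 2, 0, 1, 0, 2, 0, 0])    # hypothetical facies at wells
for code, ind in indicator_transform(wells).items():
    print(code, ind, "proportion:", ind.mean())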

Indicator kriging solves the multiple indicator (multiple rock type) problem by kriging
the cumulative distribution function cut by the indicators rather than the individual
1 / 0 data sets by category. The result is the same: from the cumulative we can
calculate the probabilities for the individual facies.
The question of what variogram model to use does arise. For facies-based indicator
kriging, we can define a variogram on each category, but for true indicator kriging,
where we are estimating the CDF, the variograms are defined on the cutoffs. A
simple alternative is to define a single variogram, either on the original data (i.e. the
facies classifications) or on the cutoff which splits the data into two halves (the median
cutoff). The implementation will generally be software dependent.

Indicator simulation of the categories (facies) is readily solved using a sequential
approach. The procedure for sequential indicator simulation (SIS) is as follows (a
simplified sketch in code follows the list):

- Randomly select a target grid node
- Krige each of the facies probabilities at the target grid node
- Check the sum of probabilities is 1, otherwise normalise
- Arrange the facies probabilities cumulatively from 0 to 1
- Draw a random number on the interval [0,1] and determine the facies at the
  target grid node
- Insert the facies code into the grid
- Repeat with a new target grid node, treating the previously simulated facies
  points as additional data points
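
The sketch below follows these steps on a small 2D grid (plain Python). For brevity the indicator kriging step is replaced by inverse-distance weighting of the nearest informed cells, so this illustrates the sequential mechanics rather than a working SIS; the grid size and well data are hypothetical.

import numpy as np

def sis_2d(nx, ny, wells, n_facies, rng, n_near=8, power=2.0):
    """Simplified sequential indicator simulation on a 2D grid.

    wells : dict {(i, j): facies_code} of conditioning cells.
    The kriging of indicator probabilities is replaced here by inverse-distance
    weighting of the nearest informed cells; a real SIS uses indicator kriging
    weights derived from the variogram model.
    """
    grid = -np.ones((nx, ny), dtype=int)             # -1 = not yet simulated
    for (i, j), code in wells.items():
        grid[i, j] = code
    path = [(i, j) for i in range(nx) for j in range(ny) if grid[i, j] < 0]
    rng.shuffle(path)                                # random visiting order
    for i, j in path:
        known = np.argwhere(grid >= 0)               # wells + already simulated cells
        d = np.hypot(known[:, 0] - i, known[:, 1] - j)
        order = np.argsort(d)[:n_near]               # nearest informed cells
        w = 1.0 / (d[order] ** power + 1e-9)
        p = np.zeros(n_facies)
        for idx, wt in zip(order, w):
            p[grid[known[idx, 0], known[idx, 1]]] += wt
        p = p / p.sum()                              # local facies probabilities
        grid[i, j] = rng.choice(n_facies, p=p)       # draw facies from local CDF
    return grid

rng = np.random.default_rng(42)
wells = {(2, 2): 1, (7, 8): 0, (5, 5): 2}            # hypothetical well facies
print(sis_2d(10, 10, wells, n_facies=3, rng=rng))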

Three example SIS realisations are shown below.

From looking at the indicator simulation realisations the principal drawback of the
method is apparent: the shapes of the facies do not look like geological structures.
The problem is that the variogram is a 2-point statistic and therefore cannot
characterise the geometry of the facies. Recall also that it is the maximum entropy
solution.

In the illustration above, the channel clearly has a complex geometry that cannot be
reproduced in the indicator simulation. Also, the facies proportions may be incorrect:
we bias our sampling with wells in order to target the better quality reservoir rock
such as might be found in the channel. In fact, the proportions of shale, sand and
splay lithologies in the above diagram are 0.63, 0.24 and 0.13, different to the 0.45,
0.35 and 0.20 sampled in the wells. Also, be aware that facies proportions are a
function of the size of the model area.
An improvement to indicator simulation that would allow more complex geometric
shapes to be generated would be to use higher order statistics such as 3-point or
multi-point. Direct solutions are not currently available. Existing methods, such as
SNESIM, use a template geometry and solve the problem by a sort of pattern matching.

The other alternative is to condition to implicit multi-point statistics such as
distributions of size and shape parameters. This is best achieved by the now widely
used object-based simulations.
Indicator simulation is an example of a pixel-based facies simulation method. For
many years there have been several competing pixel-based methods for simulating
facies, in contrast to the object methods.

Each of the pixel-based methods has its strengths and weaknesses (as does the
object method approach).
As we have already met indicator simulation, it is probably a good place to start in
comparing the available methods. Its advantages are:

- Spatial model honoured independently for each facies
- Sound theory
- Handles any facies organisation
- Widely available and fast

And its disadvantages:

- Realisations not very geological
- Entropy too high?

Indicator simulation was invented at Stanford. As you might expect, there is an
equivalent approach which was invented at Fontainebleau. It is called truncated
Gaussian. Its advantages are:

- Sound theory based on spatial correlation
- Hierarchical facies succession (may be a disadvantage too)
- Proportion curves an excellent geological descriptor

Disadvantages include:

- Realisations not very geological
- Entropy too high?
- Not widely available

It was recently extended with the so-called plurigaussian method, which has addressed
the facies hierarchy problem, but it still remains a method with limited availability. The
principal packages implementing it are Heresim (from IFP) and Isatis.
The truncated Gaussian and sequential indicator methods are somewhat similar and
both suffer from the same fundamental weakness: they are unable to generate the
complex geometric forms we associate with geology owing to their 2-point statistic
basis. They are essentially pixel-based simulation methods.

Object Simulation
The alternative to pixel-based methods is to attempt to model facies using objects.
The idea of object simulation was demonstrated as early as the mid 1970s (Bridges
and Leeder, 1976) with a simple 2D cross-section model of shales.
The idea of defining geological facies such as channels as objects came to the fore
with work from Norway in the 1990s. These models are now widely used for reservoir
modelling, relegating the pixel-based methods to a second line.
Object models are excellent at reproducing the complex geometries we associate with
geological objects, but this is also their weakness. Having a means to define them still
leaves the problem of determining the parameters. Rarely do we have sufficient data
from wells to reliably parameterise the objects.
The process of object simulation involves trial and error. A neutral background facies
is first set in all cells of the model (this is usually the shale, but sometimes it is another
type). Objects are generated according to probability distributions defining their
characteristics such as width and thickness of channels. As each object is generated,
the software attempts to insert it into the model. Its insertion must not result in passing
through a well where a different facies is coded, and ultimately all intervals coded in
the wells as channel should end up with a simulated channel passing through them. We
keep inserting objects until all wells with that facies have been included in an object. In
addition, we must ensure the volume proportion is correct. We cannot insert channels
if we end up with too high a proportion of sand in the model. Alternatively, if we insert
channels and satisfy our well conditioning data, we may have insufficient channel
proportion and have to simulate more non-conditional objects. This is controlled by a
stop criterion, usually the net:gross ratio.
In object modelling the wells are typically honoured first and then the interwell region
is simulated second. Care must be taken by the algorithm in simulating the interwell
region to avoid conflicts with the known sequence of lithologies in the wells.
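
A heavily simplified sketch of the accept/reject mechanism (plain Python; rectangular objects stand in for real channel geometries, and the well data, size distributions and net:gross target are hypothetical; unlike real implementations, the wells are not simulated first):

import numpy as np

def object_sim(nx, ny, wells, target_ng, rng, max_tries=10000):
    """Insert rectangular sand 'objects' into a shale background (0 = shale, 1 = sand).

    An object is rejected if it covers a well coded as shale; insertion stops
    once every sand well is inside an object and the net-to-gross target is met.
    """
    grid = np.zeros((nx, ny), dtype=int)
    sand_wells = [w for w, f in wells.items() if f == 1]
    shale_wells = [w for w, f in wells.items() if f == 0]
    for _ in range(max_tries):
        wells_honoured = all(grid[i, j] == 1 for i, j in sand_wells)
        if wells_honoured and grid.mean() >= target_ng:
            break                                     # stop criterion reached
        i0, j0 = rng.integers(0, nx), rng.integers(0, ny)
        w, l = rng.integers(2, 5), rng.integers(4, 10)  # hypothetical size distributions
        sl = (slice(i0, min(i0 + l, nx)), slice(j0, min(j0 + w, ny)))
        trial = grid.copy()
        trial[sl] = 1
        if any(trial[i, j] == 1 for i, j in shale_wells):
            continue                                  # reject: conflicts with a shale well
        grid = trial                                  # accept the object
    return grid

rng = np.random.default_rng(7)
wells = {(3, 3): 1, (10, 12): 1, (7, 7): 0}           # hypothetical well facies
model = object_sim(16, 16, wells, target_ng=0.26, rng=rng)
print(model.mean().round(2), model[3, 3], model[7, 7])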

(a) Background facies in model; (b) wells simulated first; (c) interwell simulation avoiding conflict at wells; (d) final object realisation stopped when N:G achieved

Object methods give excellent reproduction of entropy but they are relatively slow due
to the accept/reject mechanism. The main problem is parameterisation of the objects.
Some of the parameters are simply guessed at, as the following typical example of
object parameters from a reservoir modelling report shows:
Body Type:        Low sinuosity Channel
Shape:            Half-cylinder
Thickness:        Triangular Distribution    2 / 4 / 10 m
Width:            Expression                 Thickness * 25
Orientation:      Triangular Distribution    10 / 50 / 90 degrees
Amplitude:        Expression                 Width
Wavelength:       Triangular Distribution    1000 / 3000 / 5000 m
Stop Criterion:   Volume (Net:Gross)         26 %
Branch Points:    Uniform Distribution       10 - 20 / 1000 m
Branch Location:  Uniform Distribution       0 (default)

The limitations of object modelling are primarily the difficulty of parameterising the
models. To some extent it can be argued that indicator models are more data driven,
although the paucity of wells to produce reliable variograms negates this argument
somewhat. Choice of models and parameters can be assisted by knowledge of the
geological environment and could be supplemented by analogue databases comprising
outcrop or historical field data. Another source of data could be analysis of the
seismic expression of objects, such as channel widths and sinuosity, which might come
from analogous shallow intervals where seismic resolution is good.

Objects in Petrel
Object modelling in Petrel requires a series of parameters to be specified. For a
fluvial channel the channel direction (orientation of the channel axis) must be
specified along with the wavelength, amplitude, width and thickness distributions or
relations. Levees can also be defined and these will require additional width and
thickness parameters, as will the occurrence of crevasse splays.
In addition Petrel has a number of trend options available for use with fluvial channels.
These include:

- Areal probability maps
- Vertical probability curves
- Flow lines
- Source points

A pair of flow lines can be used to give an orientation and an envelope within which
channel objects will be inserted, to give bundles of objects. Source points are used
as the start points for channels and can be combined with flow lines (but not used in
conjunction with the probability map).

Most reservoir modelling packages, like Petrel, also offer a wide variety of other
facies body geometries including:

- Box
- Pipe and upper/lower half-pipes
- Ellipse (full, half and quarter)
- Deltaic/alluvial fan
- Aeolian sand dune
- Oxbow lakes

The facies bodies are generally defined by orientation, minor width, major/minor axis
ratio and thickness parameters.
Erosion rules allow the next simulated facies type to erode previously simulated
facies. In its simplest form, all subsequent facies erode the background facies, of
course. Typical erosion rules include:

- Replace NONE other facies
- Replace ALL other facies
- Replace ONLY the specified facies in the previous bodies
- Replace ONLY the following facies in the background property

Other options, which are generally slow to run, include only replacing previously
simulated bodies if they are completely within a new object, and replacing a facies
with itself (if this is not allowed, only isolated objects of that facies will be present).
The following pages show some examples of indicator kriging, indicator simulation
and object simulation of facies within a zone.

Indicator kriging of facies in layers 1-10 of a zone

Sequential indicator simulation of facies in layers 1-10 of a zone

Object simulation of facies in layers 1-10 of a zone

VARIOGRAMS IN PETREL
In Petrel there is more than one way to construct and model experimental
variograms. None of the schemes provided is ideal and variogram work in
Petrel should be regarded as generally sub-standard compared to other
industry packages.
The two routes for generating variograms in Petrel are:

- Model -> Properties -> Property Settings
- Data Analysis

Property Settings Route

The original method of estimating and modelling variograms in Petrel is
through the Property Settings. This route makes it more likely that the user
will successfully generate and correctly model variograms, but it is still
tedious, with a lot of repetitious work required by the user. Variogram
modelling using this method is accessed by double clicking on a property in
the model to open the Property Settings and scrolling to the last tab.
The first point to note is the check-box option labelled Isotropic (circled
orange, below right). This is very misleading. The presence or absence of
anisotropy is a property of the geological data which we are trying to detect
or establish through variogram analysis. The button should properly be
labelled Omnidirectional, to denote ignoring directional information in the
analysis.
To perform indicator simulation we need to generate experimental variograms
by facies and zone and model them individually. This can be done using the
property modelling route by double clicking on properties and, on the last
tab, setting the appropriate filter for facies and zone (see figure below). On
the variogram tab (shown right) turn on the Use property filter option.
The resulting variograms will be stored in the variograms folder on the Input
tab. If making several variograms remember to deselect the Overwrite last
option at the bottom of the panel.


Properties Filter: Select facies (left) and zone (right)

The Property Settings route is the most appropriate way in which to calculate
variograms from seismic properties. With the seismic attribute resampled into
the Petrel model, use the index filter to allow horizontal variograms to be
generated for each required K layer within the model. Again, remember to turn
on the Use property filter in the Settings dialogue.
For many zones or K layers the Property Settings route to generate
experimental variograms is slow, requiring the filter to be amended each time
a different zone or K layer is used for computation.

To model a variogram using this approach, first display the experimental
variogram in a Function Window. Then use the Define Variogram button on the
Petrel right-hand tool bar. This opens a menu for defining a variogram model
(see right). Define the appropriate settings and press Ok. The variogram model
is superimposed on the variogram window and can be edited interactively. An
example is shown below.


Data Analysis Route

The most commonly used approach is the Data Analysis route. This is the
newest method implemented in Petrel. The interface in Data Analysis has some
clever graphical gimmicks, such as implementing the GSLIB variogram diagram as
part of the user interface to select parameters, but it basically fails to
provide the essential functionality for what is the most basic geostatistical
step. In particular it has great potential for errors or bad parameter
choices, both through user ignorance and due to fundamental design flaws in
the interface.

Some of the key problems with the variogram facilities provided in Data
Analysis are:

- Confusion of omnidirectional and directional variogram principles
- Sill rescales for every variogram computation and direction
- User has to ensure the same parameters in different directions when recalculating
- Cannot compute multiple directions and display them together
- Cannot compare directions for models, especially horizontal
- No line joining variogram points
- No ability to mask samples to exclude them from variogram computations
- Must APPLY to keep settings, even when switching between zones!

The histogram displays the number of pairs used to calculate the experimental
variogram value at each lag, which is useful for determining the reliability
of the variogram. Generally, vertical variograms will be reliable whereas
horizontal variograms will be difficult to interpret and model. Seismic-derived
attributes may give limited additional information.
To generate an omnidirectional variogram in Data Analysis set the tolerance
angle to 90 degrees.

FACIES VARIOGRAMS
Stratton Field Facies Variograms
Assuming we want to attempt Sequential Indicator Simulation for facies
generation within our model, we will need to model a variogram for each zone
and facies modelled.
For a two-facies case (as we have here in the Stratton Field data set) we only
need to generate variograms and models for one of the facies: the facies codes
are binary complements and so the variograms will be identical for the two
facies. For three or more facies the variograms must be generated and modelled
separately for each facies.
Vertical Variograms
With well data it is usually best to begin with the vertical variogram. The
vertical variogram is usually straightforward to generate, with plenty of
samples available to generate sufficient pairs for a reliable variogram. Check
the vertical variograms in real mode rather than simbox mode.
Note that the Data Analysis variogram functionality always normalises the
total variance to 1. The true variance without normalisation would be related
to the facies proportions.
Above is an experimental facies (indicator) variogram for the MFRIO zone. The
variogram model in the diagram has the default parameters, with the sill
normalised to 1. Note how the experimental variogram rises and dips over the
shorter lags whilst staying below 1, and then rises to a variance above 1 at
the later lags before falling again. This is possibly indicative of
non-stationary behaviour. Examination of the vertical proportion curve for
this interval supports this interpretation (see right). The sands are clearly
clustered at the base and top of the interval. Short lags tend to be entirely
in sands or in shales. With long lags, pairs tend to be sand to sand (one
sample of the pair in the top sand, one in the base sand), causing the
reduction of variance at long lags.
One possibility in this data set is to consider the vertical variogram for all
zones simultaneously. Of course, in deciding to generate and model just one
variogram vertically we gain an advantage in that a stationary assumption is
quite reasonable, but this must be contrasted with the possibility that the
vertical variogram range could change between the zones, for example due to
compaction effects. Using the data from all zones does result in a
well-behaved variogram.
Using the data from all zones together, a vertical variogram has been
modelled. Choose the model type: in this case an exponential model provides a
good fit to the data, a common result for well-log-based variograms. The sill
is set to 1. Remove the nugget: we do not interpret the facies as having a
random noise component.
When satisfied with the model, copy and paste to all facies before turning on
the zones and pasting to all zones. Remember that with more than 2 facies,
each facies must be modelled separately.
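
For reference, a minimal sketch of the exponential variogram model and of an experimental variogram computed from a regularly sampled vertical indicator log (plain Python; the synthetic facies log is hypothetical and uncorrelated, and the 22 ft range matches the vertical range quoted later in this section):

import numpy as np

def exponential_model(h, sill=1.0, range_=22.0, nugget=0.0):
    """Exponential variogram model.

    With the factor of 3, range_ is the practical (effective) range, a common
    convention; some packages use the unscaled range instead.
    """
    h = np.asarray(h, float)
    return nugget + (sill - nugget) * (1.0 - np.exp(-3.0 * h / range_))

def experimental_variogram(values, dz, lags):
    """Experimental semivariogram of a regularly sampled vertical log."""
    values = np.asarray(values, float)
    gamma = []
    for lag in lags:
        k = int(round(lag / dz))                      # lag expressed in samples
        d = values[:-k] - values[k:]
        gamma.append(0.5 * np.mean(d ** 2))
    return np.array(gamma)

facies = np.random.default_rng(1).integers(0, 2, 500)   # hypothetical indicator log
lags = np.array([2.0, 4.0, 8.0, 16.0, 32.0])             # ft
# uncorrelated synthetic data, so the experimental values sit near the
# indicator variance p(1-p) at all lags
print(experimental_variogram(facies, dz=0.5, lags=lags).round(3))
print(exponential_model(lags, sill=1.0, range_=22.0).round(3))
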
Horizontal Variograms
After establishing the vertical variograms attention should turn to modelling
the horizontal variograms. Experimental horizontal variograms are likely to be
less reliable than vertical variograms unless sufficient wells are available.
Also, the essential short range information is usually missing unless wells are
very close together or have substantial horizontal well section.

The Stratton Field data contains only vertical wells, although the interwell
spacing is relatively short, being as close as several hundreds of feet. When
setting the experimental variogram calculation parameters, consider the
maximum pair separation distance. For Stratton, the maximum pairs are
separated by about L = 16,000 ft, so we would expect an experimental variogram
to have reliable lags to about 8,000 ft (L/2).
Begin with an omnidirectional variogram. Set the tolerance angle to 90 degrees
and a large bandwidth. It is usually better to work in simbox mode for the
horizontal variograms, the variogram calculation then following the layering.
The thickness is the vertical offset in distance units for a pair to be
accepted. To obtain the maximum information from the variogram, this should be
set to the sample size (either the log sample rate or the average vertical
cell thickness). Set the number of lags to give a sensible lag spacing: too
many lags and the statistics in each will be poor; too few lags and the
experimental variogram will be averaged too much. Use the same judgement as
you would to decide the bin widths for a histogram plot.
After considering the omnidirectional variogram and establishing the basic
parameters for calculating the experimental variogram, next analyse
directional variograms. As already discussed, at least 4 directions should be
considered. Set a tolerance angle of 5 to 20 degrees, depending on the data
availability. Remember that decreasing the bandwidth will also restrict the
number of pairs available for inclusion in the experimental variogram.
Although it is awkward to do so in Petrel, you need to compare the horizontal
directions in order to establish any evidence for anisotropy in the horizontal
plane. The only way to do this may be to set all the parameters, capture the
image for each direction and compare them in PowerPoint or some similar
manner. When switching from the major to the minor direction it is very
important in Petrel to ensure that you have consistent calculation parameters
so that valid comparisons are made.
If you set the variogram calculation parameters for one zone, it is
recommended to copy/paste these calculation parameters to all facies/zones for
convenience in calculating variograms for subsequent zones.

For Stratton Field it is very difficult to see any evidence for anisotropy in the
facies variograms. The horizontal plane has been modelled as isotropic, the
model being fitted to the omnidirectional variogram with a range of 1400 ft.
The vertical range was already modelled as 22 ft, giving a horizontal:vertical
anisotropy of around 60:1.

Directional horizontal indicator variograms for MFRIO

Omnidirectional horizontal indicator variogram and fitted model for MFRIO


Directional horizontal indicator variograms for B46

Omnidirectional horizontal indicator variogram and fitted model for B46


Directional horizontal indicator variograms for C38 Upper

Omnidirectional horizontal indicator variogram and fitted model for C38 Upper


Directional horizontal indicator variograms for C38 Lower

Omnidirectional horizontal indicator variogram and fitted model for C38 Lower


Directional horizontal indicator variograms for D11

Omnidirectional horizontal indicator variogram and fitted model for D11

SEISMIC VARIOGRAMS
SEISMIC RESAMPLING
Seismic SEGY data is usually loaded into Petrel in the time domain. A
velocity model is then required to depth convert the seismic SEGY data. A
number of depth conversion strategies are available within Petrel. In
preparing this course, an attempt was made to use the V0 + kZ method. This
has a serious error in its implementation within Petrel in that it does not
correctly construct the velocity model and depth convert in the presence of
negative elevation units. For Stratton Field a simple layer cake model has
been used for velocity modelling and depth conversion. Two flat time slices,
one at 1320 ms and the other at 1750 ms, have been used to define the model.
Average and interval velocities have been used from the well positions and
simply mapped to provide a smooth model for depth conversion. As Stratton
Field (in the shallower intervals we are considering) has almost no relief, the
depth conversion is very accurate. This is an important consideration as
misalignment of the seismic data with the model will seriously impair any
attempts to find relationships with seismic attributes or constrain the model
with seismic data.

East-west section showing Coloured Impedance seismic data and AI logs

It is strongly recommended that some time be spent, such as during an
inversion study, in ensuring a consistent well tie framework between the
seismic and well logs, and that this framework is correctly imported into the
model.
After constructing the velocity model and depth converting the seismic SEGY
data, the seismic can be resampled into the model cells. Seismic resampling
can be found in Property Modelling -> Geometrical Modelling.

In the Geometrical Modelling dialogue
select Seismic Resampling. Select the
appropriate seismic volume and its
accompanying property template to ensure
that the colour table is carried through to the
property resampled into the model.
There are a number of options available to
control how the cells are populated.
Closest takes the nearest or most central seismic sample and assigns it to
the cell. If there is a significant difference in resolution between the
seismic and the model grid this is not the best choice.

Interpolate uses a weighted interpolation of the four seismic cells closest to
the cell centre. It cannot be used with discrete data types such as a facies
volume.

Intersecting uses all the seismic samples within the cell boundary. No
correction is made for the intersection volume. If bringing in a discrete data
type such as a facies variable, make sure to use the most of or median
averaging options.

Exact is the same as Intersecting but with the addition of a volume correction.
This is the slowest method.

The averaging method chosen should be appropriate for the data type. For
continuous properties this may be the arithmetic or another mean calculation.
For discrete data types such as facies, most of (mode) or median should be
used.
There is almost certainly a difference in resolution between the seismic data
and the model. Vertical seismic resolution is likely to be lower than the
resolution of the cells within the model. Horizontal resolution may be similar,
although it could be larger or smaller. We call this the support of the
measurement in geostatistics.
The seismic time sample rate and trace spacing do not define the resolution
or support of the seismic measurements. It is the seismic amplitude spectrum
(frequency content and bandwidth) which defines the support. Resampling
data does not change the resolution. Upscaling (averaging) does change the
resolution to a larger support. The popular notion of downscaling is naive:
once a process is measured at a larger support, a change to a smaller scale of
support can only be made in a probabilistic (stochastic) sense.
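As a simple illustration of this support effect (a synthetic sketch with invented numbers, not part of the Petrel workflow), compare resampling with upscaling:

    import numpy as np

    rng = np.random.default_rng(0)
    fine = rng.normal(loc=9500.0, scale=800.0, size=10_000)   # synthetic impedance at fine support

    resampled = fine[::4]                          # resampling: take every 4th sample
    upscaled = fine.reshape(-1, 4).mean(axis=1)    # upscaling: average blocks of 4 samples

    print("fine std     :", round(fine.std(), 1))       # ~800
    print("resampled std:", round(resampled.std(), 1))  # still ~800 -> resolution unchanged
    print("upscaled std :", round(upscaled.std(), 1))   # ~400 -> variance reduced, larger support

Resampling preserves the spread of values, whereas averaging to a larger support reduces it; recovering the finer support afterwards can only be done stochastically.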

SEISMIC VARIOGRAMS
Given the difficulty of generating horizontal variograms from well data we
often need to consider alternative sources of information on variogram
parameters. One possible source is the use of analogue and outcrop data
bases. Another source, almost always to hand, is to analyse the seismic data.
The most appropriate data for this purpose is a relative impedance volume
such as that produced by coloured inversion. Deterministic or stochastic
inversion data should not be used for estimating variograms.
Having resampled the seismic attribute
such as coloured impedance into the
Petrel model, horizontal directional
variograms can be calculated by K layer
within model and used to confirm or
supplement the variogram analysis made
using the well data. To calculate
experimental variograms by K layer,
double click on properties and turn on
Index Filter and select K index filter and
define the appropriate layer.

To perform the variogram calculations, double click the appropriate property
in the model (in this case, seismic coloured impedance) and
select the Variogram tab (the last tab to the right).
Make sure the property filter is turned on to
filter only the values for the required K layer.
Choose Sample Variogram and Horizontally
and then set the direction and angular
tolerance. Recommended starting point
would be 4 regular directions with a narrow
angular tolerance of 5 degrees or less.
Ensure the Overwrite last option is
unchecked. Each direction and K Layer
experimental variogram will have to be
calculated manually. If the generation is
systematic and they are grouped properly in
the variogram folder then they will be coloured consistently for subsequent
display and analysis.

Coloured Impedance Horizontal Variograms (K layers 1, 20, 40, 47, 60, 80)

Coloured Impedance Horizontal Variograms (K layers 100, 120, 140, 160, 180, 191)

Inspection of the horizontal variograms from the seismic coloured
impedance suggests little evidence for anisotropy, with the short lag
element of the directional variograms of all layers appearing isotropic.
At longer lags the total variance exceeds 1, which suggests that there
may be some small non-stationary component present. This non-stationary
element cannot be a function of the mean impedance, though, as this is a
seismic attribute and has no low frequency component present due to the
limitations of seismic bandwidth.
As previously
described, a model
can be added to the
experimental
variograms using the
variogram function on
the right hand toolbar.
The generate
variogram dialogue
will be displayed,
allowing selection of
the model. After
pressing Ok the
chosen model will be
overlaid on the
function window.

The variogram displays on the following pages show the model fitted to
each of the K layer horizontal variograms. The model parameters are
given in the table below. There does not appear to be a pattern with
depth. In addition, the ranges are generally larger than those
estimated from horizontal well variograms.

K Layer    Function       Range (ft)    Sill
1          Exponential    2000          0.50
20         Exponential    2800          0.78
40         Spherical      2050          0.91
47         Exponential    3400          1.00
60         Spherical      1300          0.23
80         Exponential    3000          0.94
100        Exponential    2600          0.65
120        Exponential    2000          0.67
140        Exponential    2800          0.75
160        Exponential    3500          0.95
180        Exponential    1800          0.88
191        Exponential    1600          0.62
Table of variogram models fitted to horizontal variograms from coloured impedance

Coloured Impedance Horizontal Variograms and Fitted Models (K layers 1, 20, 40, 47, 60, 80)

Coloured Impedance Horizontal Variograms and Fitted Models (K layers 100, 120, 140, 160, 180, 191)


FACIES SIMULATION
SEQUENTIAL INDICATOR SIMULATION
The first step in performing any facies simulation is
to make a copy of the upscaled Facies object.
Rename the copy Facies SIS. All the data
analysis settings will be retained if you copy the
Facies object for which the data analysis was
performed.
Double click Facies Modelling to open the facies
modelling dialogue.

Select both facies from the facies tab. Of the three buttons, select the first
button (use variograms from data analysis) and the third button (use vertical
proportion curves from data analysis). Make sure that you did model the
vertical proportion curves for both sand and shale and that the variogram
models were copied to all facies.
Choose the modelling method as
Sequential Indicator Simulation.
For this first example we will
ignore any lateral trends.
There is a quality control question
here: there is no way to verify
which variogram models are
actually picked up by the program,
which is disconcerting.
Pressing Ok will run the SIS
simulation.

SIS for K Layers 41 - 54, Zone B46

QUALITY CONTROL OF SIS
When the simulation has completed a number of checks should be made of
the results:

Check facies proportions by zone


Check vertical proportions
Check thickness distribution by zone

Checking statistics and histogram reproduction for zone B46

Checking thickness distribution between (l-r) original logs, upscaled and SIS cells

SIS WITH SEISMIC CONSTRAINT
The sequential indicator simulation can also be run using a 3D probability
cube as a constraint. In the Stratton Field dataset a stochastic seismic
inversion has been run to generate impedance realisations. These
realisations have then been used to generate a probability cube based on an
impedance cutoff of 8,000 (described in the Stratton Field introduction section)

East-west section showing sand probability cube and Vclay/ILM logs

To use a probability cube with SIS in Petrel you must specify a probability
cube for each facies. The sum of the cubes should be 1. The sand probability
cube is turned into a shale probability cube in the calculator, where
Pshale = 1 - Psand. Copy the Facies property to a new property called
Facies SIS seismic.
Turn off the vertical proportion from Data Analysis button. For each facies
turn on the Probability from property option and assign the correct
probability property for each facies. Repeat for all zones and facies.
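A minimal sketch of the calculator step, using a synthetic stand-in for the resampled sand probability cube (the array shape and values are assumptions, not the Stratton data):

    import numpy as np

    rng = np.random.default_rng(1)
    p_sand = rng.beta(2.0, 8.0, size=(50, 50, 14))   # stand-in sand probability cube, values in [0, 1]

    p_shale = 1.0 - p_sand                           # calculator step: Pshale = 1 - Psand
    assert np.allclose(p_sand + p_shale, 1.0)        # the per-facies probability cubes must sum to 1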

Sand Probability from stochastic seismic inversion for K Layers 41 - 54, Zone B46

SIS with Seismic for K Layers 41 - 54, Zone B46

OBJECT MODELLING
For the Stratton Field we have just two facies to model. The depositional
environment is predominantly shale (88 - 95%, depending on zone), with fluvial
sands comprising only 5 - 12% of the facies. The first step in object
modelling is to select the background facies; for Stratton Field the shale is
clearly the appropriate background facies.
Next we should consider the objects to be modelled. We can summarise the
sands as:

Fluvial channels
East-west orientation
5 - 30 ft thickness
Composite widths of up to 2,500 ft

For the object modelling we have to set distributions (PDFs) describing the
channels. For thickness, we can check the actual sand thicknesses.

Thickness distribution of sands in B46 Original logs (left) and upscaled cells (right)

Note that the thickness distribution is exponential. This is quite a common
distribution for sand thickness. Remember that we need to consider the
thickness for our objects. Do the thicknesses refer to composite, amalgamated
channels or to individual channels? Which are we modelling?
Petrel does not offer an exponential distribution for modelling, but we can
approximate this distribution by using a truncated normal distribution.
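As a hedged sketch of this approximation (the parameter values below are illustrative, not the Stratton thicknesses), an exponential thickness distribution can be compared with a truncated normal one:

    import numpy as np
    from scipy.stats import expon, truncnorm

    target = expon(scale=8.0)                          # exponential thickness, mean 8 ft (illustrative)

    mu, sigma, lower, upper = 5.0, 8.0, 0.0, np.inf    # truncate at 0 ft so thickness stays positive
    a, b = (lower - mu) / sigma, (upper - mu) / sigma
    approx = truncnorm(a, b, loc=mu, scale=sigma)

    samples = approx.rvs(size=10_000, random_state=0)
    print("truncated normal mean/std:", samples.mean(), samples.std())
    print("exponential mean/std     :", target.mean(), target.std())

Comparing the two histograms (or the means and standard deviations, as above) is a quick way to judge whether the chosen truncated normal parameters are a reasonable stand-in.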

Although we have a geological description of channel complexes having
widths up to 2,500 ft, we should also consider any other evidence such as
seismic data to support this. An example channel is delineated in a stochastic
inversion realisation at K layer 47 in the display below. The channel appears
to vary in width between 400 and 1,200 ft.

K=47 Layer slice through stochastic inversion realisation #0008

Having decided on initial parameters we can set up the object simulation.


Again, copy the original property Facies to a new property Facies Objects and
double click to edit the facies simulation. Select object modelling. Next, on
the Background tab set the background facies to shale. Select the Facies
Bodies tab. Under the Settings (lower tab) select sand and use the estimated
fraction. Note other settings.

Next select the Layout (lower tab). Here we can set distributions describing
the orientation, amplitude and wavelength of the channel. The example
shown has selected normal distributions for each. Orientation is East-West
with 95% of the channels expected to be oriented between 80 and 110
degrees. The Drift parameter allows for change in the channel characteristics
along the channel.
On the Channel (lower tab) set the width and thickness distributions.

Finally, on the Trends (lower tab) ensure that you select to use the vertical
proportion curves from data analysis.
For the first run, the parameters
selected have been copied to all
zones. Select Ok (or Apply) to run the
object modelling.

QUALITY CONTROL OF OBJECT MODELLING
Check the results of object modelling with
care. Because the method is based on
accept/reject methods and the parameters
such as channel widths and thicknesses are
often difficult to estimate, the object
modelling may not converge, or may result in a
model that poorly reproduces the sand facies
and does not correctly honour the well data.
The first output from object modelling is a
listing showing the progress of the algorithm
in each layer. Inspect this carefully noting
the following:

Well match
Facies Fractions
Number objects inserted/rejected

The well match shows how many channel cells in the wells have been
modelled with a channel. In this example it is 40%, a bad result. Compare
the final facies proportions with the observed proportions or your desired
proportions (remember, this may be determined by you and related to the size
of model). Here we have a sand fraction of under 6% but the log data has a
proportion of nearly 10%. Check the number of objects inserted and rejected.
The results above indicate we have a problem and will need to reconsider our
parameters.
As for SIS you should also check the reproduction of the vertical proportions
and thickness distributions using Data Analysis. In addition, inspect the
results visually. Do the channels fit with your geological concept?
Because object methods involve filling large amounts of model space with
each object inserted, note there is a relation between the choice of
parameters for channel dimensions, facies proportion and the ability of the
algorithm to condition to both the facies proportions and to the well control
points. Any disparity in the results will suggest that the model parameters are
incompatible with the data in the wells and revision of the parameters is
required.
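As a minimal sketch of the two headline QC numbers (hypothetical 0/1 facies arrays, where 1 = channel sand; none of this is Petrel output):

    import numpy as np

    rng = np.random.default_rng(2)
    well_cells = (rng.random(500) < 0.10).astype(int)               # upscaled facies along the wells
    sim_at_wells = np.where(rng.random(500) < 0.6, well_cells, 0)   # simulated facies in the same cells
    sim_model = (rng.random((100, 100, 14)) < 0.06).astype(int)     # simulated facies in the full grid

    sand_in_wells = well_cells == 1
    well_match = (sim_at_wells[sand_in_wells] == 1).mean()   # fraction of well sand cells hit by a channel
    sand_fraction = sim_model.mean()                         # facies proportion in the model

    print(f"well match   : {well_match:.0%}")
    print(f"sand fraction: {sand_fraction:.1%}")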
Object Modelling Example Run 1
We will demonstrate object modelling and the requirement to update the
parameters in an iterative manner. We will begin with the following initial
parameters in all zones:


PDF
Layout Parameters
Normal
Orientation
Normal
Amplitude
Normal
Wavelength
Channel Parameters
Normal
Width
Thickness

Truncated Normal

Drift

Min

Mean

Max

0.1
0.1
0.1

90
1000
4000

5
200
500

800

500

0.1

Initial channel object parameters for all zones

For the purposes of this exercise we will only investigate the first two layers,
MFRIO and B46. In order to provide a realistic practical, we force an
incompatibility in the parameters by falsely setting the target proportion in the
B46 to be 5.5% (actually the proportion for the MFRIO) when the true
proportion is 9.6%. This can be done by first setting the MFRIO zone and
then copying parameters to all zones - an easy mistake to make!
After running the object modelling, we first inspect the algorithm listing output.
For the MFRIO interval the results look excellent, with a 100% match to the
wells and a facies proportion of 5.8 %, close to the target proportion of 5.5 %.

Object model listing for MFRIO (left) and B46 (right) after Run 1

For the B46 zone there is clearly a problem. Although we have reached the
target proportion, the well match is very poor with only 52% of well channel
cells intersected by a channel object. This suggests that the inserted objects
are either too large, causing the algorithm to stop when the facies proportion
is reached in the zone but without correctly conditioning to all the wells, or that
the facies proportion is wrong (which it is, of course).
Checks should also be made of the reproduction of the vertical proportion
curves and the thickness distribution. The various comparisons are shown
below. The results for the MFRIO are generally good, although the thickness
distribution might be improved.

Vertical proportion curve QC MFRIO (left) and B46 (right)

Checking thickness distribution between original logs, upscaled and object cells for MFRIO

For the B46 the vertical proportion curve is not properly reproduced, being
rather patchy. The thickness distribution is also poorly reproduced, with an
underestimate of the thicker part of the distribution.

Checking thickness distribution between original logs, upscaled and object cells for B46

Object Modelling Example Run 2


We will proceed by attempting to improve the well match in the B46 zone by
modifying the channel parameters. For our second run the following
parameters have been used in the B46, changing the width and thickness
parameters:
PDF
Layout Parameters
Normal
Orientation
Normal
Amplitude
Normal
Wavelength
Channel Parameters
Normal
Width
Thickness

Truncated Normal

Drift

Min

Mean

Max

0.1
0.1
0.1

90
1000
4000

5
200
500

300

100

0.1

Run 2 updated channel object parameters for B46 zones

With the modified parameters we obtain a better fit to the wells in B46, Run 2
resulting in a match of 78%.

The vertical proportion is
improved compared to Run1,
partly because more channel
has been inserted, intersecting
with the wells. The thickness
distribution is now much
improved, although thicknesses
could still be reduced.

Run 2 B46 Thickness distribution (left) and vertical proportions (right)

Object Modelling Example Run 3


To try and improve the match to the thickness distribution, the thickness range
has been reduced still further.
PDF
Layout Parameters
Normal
Orientation
Normal
Amplitude
Normal
Wavelength
Channel Parameters
Normal
Width
Thickness

Truncated Normal

Drift

Min

Mean

Max

0.1
0.1
0.1

90
1000
4000

5
200
500

300

100

0.1

Run 3 updated channel object parameters for B46 zones

The result is a 94% match to
the well data. The vertical
proportion curve reproduction is
improved and the thickness
distribution shifted as expected.

Run 3 B46 Thickness distribution (left) and vertical proportions (right)

On the following pages is a comparison of the object simulations through K
layers 41 - 54 in the B46 interval for the three different runs. Although the
final Run 3 improves the numerical fit, it clearly results in thin channels
that are not consistent with the geological model. This is a good moment to
check everything carefully and realise that the target facies proportion in
the B46 zone is set too low!

Object modelling Run 1 for K Layers 41 - 54, Zone B46

Object modelling Run 2 for K Layers 41 - 54, Zone B46

Object modelling Run 3 for K Layers 41 - 54, Zone B46

PETROPHYSICAL MODELLING
After simulating the facies we need to populate each facies with appropriate
estimates of porosity and permeability. The most common approaches to
used for porosity are:

Map porosity (kriging or functions)


Simulate porosity (SGS)
Incorporate trends such as those from seismic inversion or attributes

Because permeability is usually only available as core data, permeability
estimation is more problematic and includes:

Linear transform of porosity mapped in cells


Linear transform of logs or upscaled cells and then mapping
Simulation or co-simulation of above

For the purposes of training, we will run each algorithm using only the porosity
values defined for the sands, but view the results across all cells in the model.
We can then see the results of each method without any clipping by the facies
simulation. Of course in reservoir modelling, the porosity is estimated or
simulated for each facies and the cells in the model are populated with the
appropriate porosity for the facies simulated in that cell. In this case, a typical
approach would be to select an estimation method to populate the sand facies
and to assign a constant value of porosity to the shale facies.

Typical modelling by facies: Estimation within sands (left) and assign value to shales (right)
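A minimal sketch of this per-facies population step, with synthetic stand-ins for the facies model and the estimated sand porosity (the shale constant is an assumed value):

    import numpy as np

    rng = np.random.default_rng(3)
    facies = (rng.random((100, 100, 14)) < 0.10).astype(int)   # 1 = sand, 0 = shale (stand-in facies model)
    phi_sand = rng.normal(0.22, 0.04, size=facies.shape)       # stand-in estimated/simulated sand porosity
    PHI_SHALE = 0.05                                           # assumed constant porosity for shale

    porosity = np.where(facies == 1, phi_sand, PHI_SHALE)      # populate each cell according to its facies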

The methods for filling the cells are referred to as interpolation in Petrel and
include:

Functional methods eg inverse distance


Kriging
Sequential Gaussian Simulation

Although there are a whole range of methods within Petrel for creating
surfaces we will make comparisons between a subset comprising the
following algorithms:

Paraboloid Surface
Bilinear Surface
Kriging
Kriging (GS-LIB)
Sequential Gaussian Simulation (SGS)

It should be noted that there are two kriging implementations available within
Petrel. There is an internal implementation and an external implementation
which uses the GS-LIB code executable directly. The manual describes the
differences as:
Petrel internal Kriging implementation

Works in XYZ
Only considers data within the variogram range
Faster because no transfer of data to external algorithms

The hints menu for the internal kriging implementation states the following:
Since kriging in Petrel uses all input observations, the number of
observations should be limited. The algorithm will be slow with too many
observations (> 100).
This appears to directly contradict the claim that the internal algorithm is
faster. Also the statement that it only considers data within the variogram
range is a matter for concern: the manual notes that this may lead to strange
effects in areas with no data when trends have not been removed correctly. If
the algorithm really only uses data out to the range of the variogram, this is
likely to be undesirable behaviour that will give problems with the estimation of
the mean.
GS-LIB Kriging implementation

Works in IJK
Control of advanced settings especially neighbourhood search
Co-located cokriging available

The GS-LIB implementation gives
access to the choice between simple or
ordinary kriging, which is not usually too
critical. We recommend simple kriging as the best default, as it will ensure
the correct mean. The search parameters should ideally use all the samples
available within each K layer plus some vertical sample inclusion across the
K layers (but not across the zones). This is not straightforward to achieve
with the GS-LIB program.
The GS-LIB implementation is consistent
with the SGS implementation. It makes
sense to use the algorithms in
conjunction. There are some bad
interface bugs and the program does not
always allow the estimation to proceed
even though the parameters are
reasonable.

POROSITY VARIOGRAMS AND MAPPING


Before proceeding to
make a comparison of the
various algorithms, the
geostatistical options will
require the preparation of
variograms as a
prerequisite.
In Data Analysis we have
already described and
applied a 1D vertical
depth trend and normal
score transform to the
porosity data. In Data
Analysis we chose to
work with the porosity
from all zones together
which is a good choice for
this data but is not a
general recommendation
for all data sets.

Vertical porosity variogram for all zones

The vertical variogram for all zones is shown above. The variogram shows a
large peak at the intermediate lags, with the variance above the standardised
1.0 variance. This corresponds to the low porosity values in the mid zones of
the data set (see the 1D trend to the right).
As previously described, variogram analysis should include both
omnidirectional and directional variograms. We begin here by looking at these
variograms for all zones combined. The directional variograms are not
straightforward to analyse.

Vertical 1D trend in porosity

Directional horizontal porosity variograms for all zones

Further analysis looks at omnidirectional variogram, for all zones and by each
individual zone. The results are inconclusive. The vertical variogram is well
defined but not the horizontal. One alternative is to consider the seismic
properties as previously. But is this a proxy for facies, porosity or both?

Omnidirectional variogram: All zones

Omnidirectional variogram: MFRIO

Omnidirectional variogram: B46

Omnidirectional variogram: C38 Upper

Omnidirectional variogram: C38 Lower

Omnidirectional variogram: D11

Paraboloid Surface
Select Functional (Interpolation) and choose to follow layers (this would
be a usual practice). Select paraboloid and use the inverse
distance squared weighting. The
accompanying figures show a 3D
view and layer view of the resulting
interpolation. The key data to
consider, however, is what happens
to the histogram. All estimation
methods are smoothers and cause
the histogram to be squeezed
towards the mean. This poses a
problem both for volumetric
calculations and permeability
prediction.

3D View of paraboloid surface interpolation

Comparison of histograms from logs, upscaled cells and paraboloid surface

Porosity estimated using paraboloid surface, K layer 44
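The smoothing effect is easy to reproduce outside Petrel. In the sketch below (synthetic 1D data and inverse distance squared weights, i.e. an illustration rather than the Petrel algorithm), the estimate has a much narrower spread than the well values:

    import numpy as np

    rng = np.random.default_rng(4)
    x = rng.uniform(0.0, 1000.0, 50)              # well positions along a line
    z = rng.normal(0.20, 0.05, 50)                # porosity at the wells

    grid = np.linspace(0.0, 1000.0, 500)
    d = np.abs(grid[:, None] - x[None, :]) + 1.0  # distances from grid nodes to wells
    w = 1.0 / d**2                                # inverse distance squared weights
    est = (w * z).sum(axis=1) / w.sum(axis=1)     # weighted-average (smoothed) estimate

    print("well data std:", round(z.std(), 3))    # ~0.05
    print("estimate std :", round(est.std(), 3))  # much smaller: histogram squeezed towards the mean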

Bilinear Surface
Select Functional (Interpolation) and
use the follow layers option. Select
Bilinear surface and inverse
distance squared weighting. Result
is shown as 3D view to right and
layer map and histograms for
comparison are shown below.

3D View of bilinear surface interpolation

Comparison of histograms from logs, upscaled cells and bilinear surface

Porosity estimated using bilinear surface, K layer 44

Kriging (Petrel Implementation)
Select Kriging (Interpolation). There
is no follow layers option as this
implementation works in XYZ.
Select to use the variograms from
data analysis.
Note how the kriged surface
appears as a series of halos or
bulls-eyes around the well data
control points. We have discussed
the objectives of estimation
previously and described this
behaviour.
3D View of Kriging (Petrel) interpolation

Comparison of histograms from logs, upscaled cells and Kriging (Petrel)

Porosity estimated using Kriging (Petrel), K layer 44

Kriging (GS-LIB Implementation)
Select Kriging by Gslib
(Interpolation). There is no follow
layers option as this implementation
works in IJK (Simbox). Select to
use the variograms from data
analysis. The expert tab allows the
choice of simple or ordinary kriging
and options to edit the number of
cells to use.
The results seem to show some
artefacts, which may be related to the
XYZ domain used for calculation. Note
also that the mean is different to that
from the previous example. This is
due to neighbourhood search
differences.

3D View of Kriging (GS-LIB) interpolation

Comparison of histograms from logs, upscaled cells and Kriging (GS-LIB)

Porosity estimated using Kriging (GS-LIB), K layer 44

Sequential Gaussian Simulation
Select SGS. There is no follow
layers option as this implementation
works in IJK (Simbox). Select to
use the variograms from data
analysis. The expert tab allows the
choice of simple or ordinary kriging
and options to edit the number of
cells to use. This algorithm is
consistent with the Kriging (GS-LIB )
algorithm.
Remember that simulation will reproduce the variograms and the histogram
correctly (see below), so the colour range is larger. It is shown in full in
the 3D view and on the same range as previously for the layer view to
facilitate comparison.

3D View of SGS interpolation

Comparison of histograms from logs, upscaled cells and SGS

Porosity estimated using SGS K layer 44


For display surfaces, the user might consider any acceptable functional or
gridded surface. For the purposes of reservoir modelling and the resulting
fluid flow simulation, the only algorithm which should be considered is SGS.
As we have seen, this is the only approach which correctly reproduces the
variability (in the form of the histogram) and the spatial connectivity through
the variogram. Both of these are essential properties of an attribute to
reproduce if we intend to use the model for fluid flow simulation.

PERMEABILITY ESTIMATION
So far the analysis has considered only porosity estimation. In common with
many reservoir models, the Stratton Field data set does not have permeability
data loaded. The permeability data is generally only defined on core data and
is usually analysed through the use of permeability-porosity crossplot to
establish a relation between permeability and porosity.
Having established a relation, a permeability field can be estimated by
applying the transform at one of several points in the typical work flow.

Transform well log porosity to permeability


Transform mapped porosity to permeability
Transform simulated porosity to permeability

None of these methods is appropriate for constructing reservoir models suitable
for fluid flow simulation. Permeability should be co-simulated with porosity.
Permeability-Porosity Crossplots
Consider the crossplot to the
right. The line has been fitted
using least squares and is a
good estimator for predicting
permeability given porosity.
For a net reservoir calculation,
we might apply this relation to
the porosity well logs to
estimate a permeability log,
truncate at our cutoff of 1 mD
and estimate net pay.
This action is identical to using
the regression line to convert
our permeability cutoff to a
porosity cutoff (9.5%) and
truncate the porosity logs. The
net predicted from the well logs would be the same with either approach.
Unfortunately, although this is a widespread and almost universal practice, it is
wrong and the net prediction will be biased. If we apply the porosity cutoff of
9.5% to the crossplot we would count samples in quadrants B+D as
contributing to net, a proportion of 92.5% of the total samples. This is what
we are doing when we apply the transform to the well logs (whether converted
to permeability or remaining as porosity). If we were to apply the permeability
cutoff to the crossplot we would include samples in quadrants A+B as net pay.
This gives 82.5% net, a significant change. Cutoff calculations cannot be
applied to the output from an estimation procedure because the estimator
causes the histogram to change and leads to bias. In the diagram above,
using the transform to predict permeability will result in the lowest estimated
permeability being predicted from the lowest porosity. Similarly, our highest
predicted permeability will be from the highest porosity. A glance at the graph
shows that the predicted permeability range must be narrower than the true
permeability range, and hence the predicted permeability will have a narrower
histogram about the mean.
The problem is exacerbated because the function is fitted with symmetric error
terms under a log transform of porosity. This results in asymmetric errors on
permeability, with large permeability values significantly underpredicted as
compared to a slight overprediction of low permeability values.
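A small numerical sketch of this asymmetry (illustrative numbers only): symmetric residuals about a fitted log-permeability value back-transform to an asymmetric error in permeability, so the regression value sits well below the average of the permeabilities it stands for.

    import numpy as np

    rng = np.random.default_rng(5)
    log_k_fit = 2.0                                      # fitted log10(permeability), i.e. 100 mD
    resid = rng.normal(0.0, 0.5, size=100_000)           # symmetric residuals in log space (assumed sigma)

    k_from_fit = 10.0 ** log_k_fit                       # what the regression predicts: 100 mD
    k_true_mean = np.mean(10.0 ** (log_k_fit + resid))   # average of the permeabilities around the fit

    print(k_from_fit, round(k_true_mean, 1))             # ~100 mD versus ~190 mD: high values underpredicted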
The correct solution is to use a simulation method which, while following the
form of the relation defined by the transform, reproduces the variability and
hence the histogram of permeability. In addition, we need to preserve the
correlation between porosity and permeability.
The permeability-porosity data used for Stratton Field are shown here. Note
that permeability is not available for the Stratton data set in the public
domain and so these data have been borrowed from another field. The mean
porosity and permeability have been shifted to be consistent with the Stratton
Field well log porosity data. The correlation is good, with R = 0.78.

Permeability-porosity crossplot: Permeability (mD) versus Porosity (%), with fitted regression y = 0.0204e^(0.3007x), R² = 0.6087
The following layer displays
show the result of applying the
transform defined by the regression line shown on the crossplot to two of the
porosity estimations previously described:

Bilinear surface inverse distance


SGS

These displays will be identical to the porosity displays but the colour table
annotations have been changed to permeability units.

Permeability predicted by linear transform of bilinear porosity grid

Permeability predicted by linear transform of SGS porosity grid

From the cells in the model we can view the poro-perm relation for the above
two examples. Note that the scatter is not reproduced and the permeability
range is highly restricted, although there is more dynamic range in the SGS
result as it correctly reproduces the porosity histogram.


Modelled poro-perm relation: bilinear function (left) and SGS (right)

A proper approach to generating a suitable permeability field would be to
co-simulate it with an SGS simulated porosity. True co-simulation is not
available in Petrel, but the co-located co-simulation is an adequate
alternative. Simulating in this way would require permeability logs to be
available at the wells. It would not be correct to substitute linear
transforms of the porosity logs, as this forces an exact relation between
porosity and permeability at the wells.
Correlated Permeability
An alternative compromise, using just the permeability-porosity crossplot
information, is to simulate a separate but correlated permeability field (not
based on well logs, i.e. non-conditional). This is then coupled to a porosity field
produced using SGS via a joint relation. This method reproduces the form of
the linear transform as well as the correlation coefficient and allows the
permeability field to include a separate spatial correlation function.
The procedure is as follows:

Properties -> Filter Tab: set Upscaled to Exclude
Make a copy of the porosity called X2
Petrophysical Modelling: select Common and tick Use Filter
Select the variograms from data analysis
Turn off PDF and depth trend from data analysis
Simulate a normal distribution with a mean of 0 and standard deviation of 0.0385 (this is the variability of the residual porosity without the depth trend)
Set min and max to absolute and use values of 5

After applying the above, turn off the filter and then use the calculator to
combine the non-conditional porosity with the SGS porosity. Note we also have
to correctly account for the depth trend.
Recall from the statistics section that the correlated part is a combination,
weighted by R², of the residual parts only, without the mean. Here we are
modelling a non-stationary mean in the form of a vertical depth trend.
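A hedged sketch of building the correlated field in normal-score space (this is the standard two-field construction, independent of the exact calculator expression used in Petrel; rho below is an assumed target correlation, and the depth trend and back-transform still have to be handled as described above):

    import numpy as np

    rng = np.random.default_rng(6)
    z_sgs = rng.normal(size=100_000)     # normal-score SGS porosity residuals (stand-in)
    z_ind = rng.normal(size=100_000)     # independent non-conditional simulation (the Porosity_X2 step)

    rho = 0.78                           # assumed target correlation from the poro-perm crossplot
    z_corr = rho * z_sgs + np.sqrt(1.0 - rho**2) * z_ind   # correlated residual field (Porosity_X3 analogue)

    print(round(np.corrcoef(z_sgs, z_corr)[0, 1], 2))      # ~0.78, histogram still standard normal

Because the correlated field keeps a standard normal histogram, its back-transformed histogram remains consistent with the SGS porosity, as required.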


The result has been called Porosity_X3, which is a porosity field correlated
with the SGS porosity field. Porosity_X2 and Porosity_X3 can be compared
to SGS porosity to ensure they have consistent histograms.

Comparison of SGS, Porosity_X2 and Porosity_X3 histograms

Having created a correlated porosity field Porosity_X3, we now convert it to
permeability (Perm_X3) using our linear transform. A crossplot shows that we
now have a correlated permeability field rather than a linear functional
dependence on our simulated SGS porosity. We may not be entirely happy with
the reproduction of the permeability range or the correlation coefficient;
these can be adjusted empirically by adding a porosity correction (this could
be to the mean in the calculator option), by changing the standard deviation
in the non-conditional simulation, or by a combination of both.

Perm_X3 vs SGS (left) and Perm_X4 vs SGS (right)

Permeability histograms by transform: Paraboloid (left) and bilinear (right)

Permeability histograms by transform: Kriged (Petrel) (left) and Kriged (GS-LIB) (right)

Permeability histograms by transform: SGS (left), Perm_X3 (centre) and Perm_X4 (right)

GEOLOGICAL MODEL TO FLOW SIMULATION
The final problem with reservoir modelling which we should mention is what to
do after simulating, say, 100 realisations of a million cell model. With current
technology we cannot run fluid-flow simulation on such a large problem and
certainly not on so many realisations. After undertaking stochastic reservoir
modelling there is still a tendency to choose just one realisation for upscaling
and simulation (the first one generated, sometimes), or just a few realisations.
This poses a real problem: is there any point in generating multiple
realisations if we can only afford to perform fluid-flow calculations on one?
One solution is to select a few realisations on the basis of static properties
such as total volume or a connectivity index, choosing realisations with low,
intermediate or high values on these criteria. This gives some confidence that
a range of possible models is being tested. However, it should be remembered
that the reason for fluid-flow simulation is partly because static properties
are not good proxies for dynamic properties.
A strategy for reducing the number of realisations whilst testing many
parameters is to use the latin hypercube. By crossplotting static properties we
can reduce the number of tests if the static properties are correlated. This
principle is illustrated below. On the left, the two properties extracted from
the realisations are crossplotted. Clearly a low value of reservoir volume could
correspond to low, intermediate or high connectivity in the realisations. This
shows that we would have to test more realisations in order to investigate the
two parameters, perhaps 9 realisations.
On the right we have a clear correlation between the parameters and this
reduces the number of realisations we might test. If we run fluid-flow
simulation through a low reservoir volume realisation we are also testing a
low connectivity index realisation.
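As a small illustration of this idea (synthetic static measures, not the Stratton realisations), realisations can be ranked on one measure and low, intermediate and high cases selected; when the measures are well correlated, that selection also spans the second measure:

    import numpy as np

    rng = np.random.default_rng(9)
    volume = rng.normal(100.0, 10.0, size=100)                                  # static measure 1 per realisation
    connectivity = 0.8 * (volume - 100.0) / 10.0 + 0.6 * rng.normal(size=100)   # correlated measure 2

    print("correlation:", round(np.corrcoef(volume, connectivity)[0, 1], 2))

    order = np.argsort(volume)
    picks = order[[9, 49, 89]]        # approximate P10, P50 and P90 realisations by volume
    print("realisations selected for flow simulation:", picks)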


Conceptual example illustrating latin hypercube sampling


SEISMIC CONDITIONING
CO-KRIGING AND CO-SIMULATION
Petrel offers both co-kriging and co-simulation based on the GS-LIB
algorithm. The co-kriging or co-simulation is constrained by a secondary
variable. The weight placed on this
constraint is controlled by the correlation
coefficient under a Markov-Bayes model.
The secondary variable can be either a
property, horizontal surface or a vertical
function. Petrel offers two choices of
algorithms:

Locally varying mean


Co-located

In the co-located method the correlation coefficient is used to calculate the
influence of the secondary variable. No scalar information is used, so the
secondary variable can be in any units.
The only requirement is that there is a linear relation between the property
estimated and the secondary variable.
In the locally varying mean method, the constant mean assumed for a simple
kriging is replaced by the mean of the secondary variable.
The implementations in Petrel are directly from GS-LIB and both work
correctly. Problems usually arise because of misunderstanding over the
objectives of these methods.
Stratton Field Example
As an example, we will use co-located co-simulation to map
porosity using seismic coloured
impedance as a constraint. We
have previously observed a
correlation between porosity and
seismic coloured impedance.
The observed correlation
coefficient is R=-0.291.
However, Petrel correctly
requires the correlation
coefficient after both variables
are normal score transformed, which can be found using the estimate option.
The correct correlation coefficient is R=-0.275.
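A minimal sketch of that check, using a rank-based normal score transform on synthetic porosity and impedance values (the numbers are invented; only the procedure is the point):

    import numpy as np
    from scipy.stats import norm, rankdata

    def normal_score(x):
        """Rank-based transform of x to a standard normal distribution."""
        r = rankdata(x)                        # ranks 1..n
        return norm.ppf((r - 0.5) / len(x))    # plotting-position quantiles mapped to N(0, 1)

    rng = np.random.default_rng(7)
    porosity = rng.normal(0.20, 0.05, 300)
    impedance = -2000.0 * porosity + rng.normal(0.0, 400.0, 300)   # stand-in negative relation

    r_raw = np.corrcoef(porosity, impedance)[0, 1]
    r_ns = np.corrcoef(normal_score(porosity), normal_score(impedance))[0, 1]
    print(round(r_raw, 3), round(r_ns, 3))     # the normal-score value is the one entered in Petrel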


SGS without seismic constraint, K layer 44, B46 zone

SGS with co-located seismic constraint, K layer 44, B46 zone

Coloured impedance used as seismic constraint, K layer 44, B46 zone

The influence of the seismic property is clearly seen. Remember the
correlation is negative. The porosity around wells 01 and 16 is clearly
increased and the porosity west of wells 12 and 17 reduced by introducing the
seismic constraint. A check of the histograms shows that the distribution is
modified by the seismic constraint.

Comparison of SGS and co-located SGS histograms

LIMITATIONS OF CONSTRAINING TO SEISMIC DATA


The use of deterministic seismic inversion to generate quantitative seismic
impedance estimates is becoming common as an input to reservoir modelling.
Deterministic seismic inversion combines data from wells and seismic to
create a broad bandwidth impedance model of the earth. However, we must
always remember that the seismic resolution is at a coarser scale than the cell
size of most reservoir models and so the seismic is usually informing the
model of variations in the average properties over a zone, zones or part zone,
depending on the actual zone thickness and seismic resolution.
In the vertical direction, the typical averaging from seismic inversion is over a
thickness of perhaps 20 m (10 - 50 m, depending on resolution). In the
horizontal direction, seismic averages over a typical 50 x 50 m area. It is
important to note that the seismic time sample rate and trace spacing do NOT
define the resolution. Resolution is determined by frequency content and
bandwidth.
Seismic data is bandlimited. In particular, it
does not contain low frequencies and
therefore absolute impedances cannot be
recovered directly from the seismic trace.
All inversion schemes with an absolute
impedance output require a low frequency
model or constraint. The low frequency
scalar is usually obtained from interpolation
of well data, stacking velocities or a
combination of these. After inversion the
low frequency model is embedded in the deterministic inversion. Artefacts in
the low frequency model manifest themselves as equivalent artefacts in the
deterministic inversion. Conditioning a reservoir model using this type of data
is equivalent to conditioning to a map of the wells. For this reason
deterministic inversions should NOT be used to condition reservoir models.

Some inversion schemes amount to little more than a relative impedance (from
the seismic) added to a low frequency model. This can be easily demonstrated
by filtering the low frequency out of the inversion result and comparing with
relative or coloured impedance.
An example is shown in the
accompanying figures. A
commercial sparse spike
inversion algorithm has been
used to compute absolute
elastic impedance.
By applying a low pass filter,
the low frequency model is
obtained. Artifacts
(bullseyes) around the well
locations can be clearly seen.
These arise because the low
frequency is simply an
interpolation of the wells, just
like a 3D property estimated
using kriging or bilinear
interpolation in Petrel.
A further concern in this case
is revealed by the application
of a high pass filter. This
removes the well interpolation
element and shows the
contribution to the inversion
obtained from the seismic.
In this example, this seismic
contribution has been
compared to a simple relative
impedance estimate obtained
by simple processing of the
seismic data (similar to the
coloured impedance used in the Stratton Field example).
Apart from a slight smoothing
caused by the application of a
trace mix in the full
deterministic inversion, the two
displays are indistinguishable,
suggesting that the
deterministic inversion has
made no contribution in this
case.

The value of inversion is in the well matching, wavelet estimation and
deconvolution to zero phase. By integrating the seismic traces output from
these processes we obtain relative impedance which is not contaminated with
the artefacts often caused by efforts to obtain absolute impedance by
including a low frequency constraint.

Important Note - Use of Inversion Products in Reservoir Modelling


It is very important to note that deterministic inversion data should NOT be
used to constrain a reservoir model. Deterministic inversion data MAY be
used if the low frequency component is filtered out to remove the influence of
the wells from the deterministic inversion. A seismic attribute or relative
impedance (such as coloured inversion) CAN be used. The ideal
impedance product to constrain a reservoir model would be impedance
realisations generated from a stochastic inversion. It is important to note
that some stochastic inversion methods (such as Promise) produce output
similar to deterministic inversion schemes and the same caution is advised
before using these as conditioning for the model.

Our ideal for constraining a reservoir model with 3D seismic data would be to
have the model fully consistent with pre-stack seismic amplitude information.
Currently there are several possible approaches considered for solving this
problem.
Probability transforms from seismic to rock properties are simple to apply
and valid but do not take into consideration the spatial elements of the
property. This is essentially a 1D PDF method.
Stochastic seismic inversion, either pre- or post-stack can be used to
generate impedance realisations which correctly include the uncertainty in
seismic inversion. These impedance realisations can be included directly into
a Petrel model using the co-located option or converted to probability
volumes. With current technology this is the most practical method.


Forward modelling constraints, in which the Petrel model is converted to
seismic properties using a rock physics model and then forward modelled to
produce a synthetic seismic response. This response is then compared to the
observed seismic data. Differences will show where the model requires
updating. Currently, updating the Petrel model is very difficult to do.
Algorithmically, updating the model comprehensively and consistently requires
huge computer processing capacity. These methods, while ultimately our
objective, are not yet practical.

EXAMPLE PROBLEMS IN PETREL


The following examples were prepared by Chris Townsend and show the
problems of attempting to condition a reservoir model using seismic
impedance data. In this instance the constraining data comes from Promise.
Although Promise uses a stochastic methodology in the inversion, the results
input to Petrel are equivalent to a layer average map as would be obtained
from deterministic inversion.
Aside from the fact that the embedded well models contained in a
deterministic inversion mean this data should not be used to constrain a Petrel
model, attempts to map using a seismic constraint such as from deterministic
inversion (and this includes the Promise output) will also be compromised due
to the inversion data being a layer average property. The problem should
properly be addressed by accounting for the scale change between the model
cell estimates and the layer average from seismic. Petrel is not able to do
this.
The first action is to copy the seismic
constraint into the model. With a 2D
surface such as from Promise, this
should be copied to each layer within
the zone.
Using the co-kriging option, the SGS
porosity modelling is constrained by the
seismic data.
In map form, looking at the zone
average, the results appear quite
reasonable with the map following the
seismic trends. However, note that
when using the co-located option the
porosity magnitudes from the seismic
data are ignored, the porosity scaling
coming from the well data only.


Porosity average compared to seismic trend: zone average map (left), section (right)

Only when we look in cross-section do we see the problem, with the values
across all layers following the same trend. The algorithm is working
correctly. What we have is a classic problem of support. We really want only
the SGS average for the whole zone to follow the seismic average, not every
layer.
We should also note that the well data does not have the same histogram as
the porosity from the seismic property (see right). This is because the seismic
porosity map contains a smooth estimate due to kriging the prior constraint
(the low frequency part mapped from the wells).
The alternative might be to use the locally varying mean (LVM) option in
Petrel. If the well data, cells and seismic constraint were all on the same
support, this would be the preferred option. Because they are not, the results
are unsatisfactory.
Three LVM options are compared

LVM only
LVM plus Transform using input distribution
LVM plus variance reduction

Results are shown on the following page.


Comparison of different LVM options

Using LVM only gives a large range of values. Using the transform option
gives a good reproduction of the spatial trends from the seismic but with a
much reduced dynamic range observed in the histogram. The variance
reduction option allows some control over the final histogram but the
reproduction of the spatial trends is poor. The figures below compare the
histograms for the transform and variance reduction methods to the upscaled
cells.

Comparison of histograms: LVM with Transform (left) and with Variance reduction (right)

The following displays compare the LVM and co-located options with SGS
only and the seismic porosity constraint in both map and cross-section form.


Comparison of seismic conditioning methods displayed as zone average maps

Comparison of seismic conditioning method displayed in cross-section

APPROXIMATE SUPPORT CORRECTION
There is currently no algorithm available to correctly constrain SGS or kriging
to a seismic property where there is a large change of support. Current
methods are designed to work where the well samples and seismic data are
at the same support. They work correctly where a layer average in the wells
is mapped using the seismic property as a constraint where both are defined
over the same interval. The thickness of that interval is determined by the
seismic frequency content and bandwidth.
It is possible to devise an approximate workaround to this problem as follows:

Define an interval comprising a stack of K layers


At each well compute the vertical well average over the interval
At each well subtract the vertical well average from the individual
samples to obtain a residual in each cell
Use SGS to simulate the residuals into the 3D grid for the layer
Calculate the vertical average map of the simulated residuals
Krige a map of the well average property for the interval constrained by
the seismic property using co-located co-kriging or locally varying mean
For each cell subtract the value of the vertical average map in the
same vertical position and add the value of the kriged well average
map (from the previous step)

The result is a simulated property with the correct histogram of the original well
properties but with a vertical average corresponding to the trends from the
seismic property. This method assumes that the residuals are strictly
stationary and structured. The method can be run manually in Petrel.
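A hedged sketch of these steps with arrays standing in for the Petrel properties (the SGS residual grid and the co-kriged average map are random stand-ins here; in practice they come from the preceding steps):

    import numpy as np

    rng = np.random.default_rng(8)
    nx, ny, nk = 50, 50, 14

    well_logs = rng.normal(0.20, 0.05, size=(10, nk))      # stand-in well porosity over the interval
    well_avg = well_logs.mean(axis=1)                      # vertical well average per well
    well_resid = well_logs - well_avg[:, None]             # residuals that would be simulated with SGS

    sgs_resid = rng.normal(0.0, well_resid.std(), size=(nx, ny, nk))   # stand-in SGS residual grid
    kriged_avg_map = rng.normal(0.20, 0.02, size=(nx, ny))             # stand-in co-kriged average map

    # remove the vertical average of the simulated residuals, then add the seismically constrained average
    resid_avg_map = sgs_resid.mean(axis=2)
    corrected = sgs_resid - resid_avg_map[:, :, None] + kriged_avg_map[:, :, None]

    # by construction the vertical average of every column now equals the kriged, seismic-consistent map
    assert np.allclose(corrected.mean(axis=2), kriged_avg_map)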
It is very important to note that this method is a support correction only. In no
way does it change the previous comments concerning the use of deterministic
inversion data as a constraint to a reservoir model. Deterministic inversion
data (and this includes Promise output) should NOT be used to constrain a
reservoir model. The following seismic data CAN be used to constrain a
reservoir model, with an appropriate support correction:

High pass filtered deterministic inversion or Promise data


Seismic attributes
Relative or coloured impedance
Stochastic inversion impedance realisations
Stochastic inversion probability volumes

The final set of displays
clearly illustrates the
danger of using
Promise data as a
constraint to a reservoir
model. First map
shows the porosity
estimate for an interval
output from Promise.
Second display shows
the prior model (low
frequency) which is
input to Promise. It is a
map of the wells only
that gives the expected
value at each cell. This
cannot be updated in
the inversion because
the seismic does not
contain any low
frequencies and
therefore carries no
information about
variation in the mean.
This surface is already
estimated when we run
co-kriging in Petrel. To
include it in the seismic
constraint counts it
twice and substantially
underestimates the true
uncertainty.
The third map shows
the difference between
the Promise output and
the prior model. This is
the information which
the seismic contributes
and it is this that can be
used to constrain the
model.
Note that the residuals are computed as the prior minus the output, which
gives them the wrong sign.

A check of the crossplots between the well data and the maps at each stage
is quite revealing. The first crossplot (left) is between the well average and
the Promise output which is what would be observed in Petrel if using the
Promise data. The correlation is 0.935, giving the impression that the seismic
is a good predictor of porosity. The second crossplot (centre) is between the
prior model in Promise and the well average, demonstrating that the strong
correlation of the previous crossplot arises because we are largely comparing
the same information, not because the seismic is an excellent porosity
predictor. The final crossplot (right) shows the correlation between the well
average and the seismic property. The correlation is 0.494. With 29 wells this
is a statistically significant correlation and so there is justification to use this
residual as a constraint in our reservoir model (with a suitable support correction, of
course).
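As a quick check of that statement (assuming the usual t-test for a correlation coefficient), r = 0.494 with 29 wells gives a t statistic of about 2.95, which is significant at the 5% level:

    import math
    from scipy.stats import t

    r, n = 0.494, 29
    t_stat = r * math.sqrt(n - 2) / math.sqrt(1.0 - r**2)   # ~2.95
    p_value = 2.0 * t.sf(t_stat, df=n - 2)                  # two-sided p-value, ~0.007

    print(round(t_stat, 2), round(p_value, 4))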


SEISMIC INVERSION DATA


Overview of Seismic Inversion Workflow
A typical seismic inversion workflow, illustrated opposite, is based on a phased
approach. Subsequent phases depend on completion of previous phases of work.
The end of each phase is a convenient break-point at which the work and results can
be evaluated with the client and a decision made to proceed to more sophisticated
analysis in subsequent phases.
Phase 1 involves the most time-consuming aspects of any inversion study.
Objectives of Phase 1 include preparing the well logs, investigating relationships
between impedance and reservoir properties and tying the well logs to the seismic.
After tying to the seismic, the well log data is used to estimate a seismic wavelet. By
application of zero phase deconvolution a broad-band zero-phase dataset is obtained
which forms the input to coloured inversion (Lancaster and Whitcombe, 2000).
Coloured inversion converts the seismic data to a relative impedance data set. The
advantages of coloured inversion are the speed of calculation and avoidance of
artefacts that may be introduced by a model. Coloured inversion, whether acoustic
or elastic impedance (Connolly, 1999), is an excellent qualitative interpretation tool.
Phase 2 attempts to obtain more resolution from the seismic and to provide absolute
impedances through the procedure often referred to as deterministic inversion. A
model of impedance is built from the wells and seismic horizon interpretation and this
model used to constrain the subsequent inversion. Our model building method is
geostatistical and involves 3D anisotropic variogram analysis and kriging. We offer
two deterministic inversion approaches, both of which are delivered as standard to
the client. The first is band-limited inversion, in which a scaled relative impedance
from the seismic data is mixed with a low frequency element from the model. The
second approach is a model-based inversion technique (Russell and Hampson, 1991).
Phase 3 uses the geostatistical model constructed in phase 2 but instead of using
deterministic inversion to obtain the expected value of impedance we compute
alternative realisations of the impedance through our ultra-fast stochastic inversion
technology (Francis, 2001; 2002). The set of impedance cubes represents the
uncertainty or non-uniqueness in the inversion. The realisations can be calculated at
any resolution and reproduce the well log impedance distributions. This allows
relationships defined on the well logs to be applied directly to the impedance
realisations, something which cannot be done with deterministic inversion. In
addition, the forward convolution of each impedance realisation will match the
seismic traces. Application of classifiers or transforms from impedance to reservoir
properties, identified in Phase 1, across the impedance realisations gives the typical
output from the stochastic inversion in the form of a single seismic cube representing
the probability of sand, porosity or saturation distribution through the reservoir.


SEISMIC INVERSION DATA

Seismic Inversion Workflow

[Workflow diagram: Phase 1 - Well Log Preparation; Well/Seismic Tie & Wavelet
Estimation; Lithology, porosity, saturation prediction; Zero-phase Deconvolution;
Coloured Inversion. Phase 2 - Model Building & Resolution Analysis; Deterministic
Inversion (Band-limited and Model-based). Phase 3 - Stochastic Inversion; 100+
Impedance Realisations; Lithology / Porosity / Saturation Classification; Probability
Cubes; Volume Distribution Curves.]
SEISMIC INVERSION DATA


Phase 1 - Coloured Inversion
Well Log Preparation
After data loading and checking, well log preparation comprises a series of
procedures designed to prepare the log data for comparison with the seismic.
Typical procedures applied include log editing, application of Gassmann fluid
substitution, checkshot calibration and calculation of acoustic and elastic responses.
Impedance Analysis
In order to understand the predictive capabilities of impedance, forward modelling will
be used to investigate resolution issues and sensitivity of seismic to changes in
lithology and reservoir parameters. Cross-plotting of impedances and reservoir
parameters such as lithology, saturation or porosity allow quantitative relationships
and reservoir property predictors to be established. Predictors might be defined by
impedance cutoffs, probability density functions or fuzzy classifications and may
include near and far offset information. In the example (page 7, top left) porosity is
poorly discriminated, but an impedance cutoff (> 8,150 m s⁻¹ × g cm⁻³) or fuzzy
classification can discriminate clean sands, as shown by the separation on the
histograms (page 7, top right) (Francis, 1997). It is important to remember that these
relationships are defined at well log scale and can only be used for prediction based
on stochastic inversion, which reproduces the impedance distribution through
geostatistical conditional simulation.
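A hedged sketch of how such a cutoff or simple fuzzy predictor might be applied to impedance samples is given below. The 8,150 m s⁻¹ × g cm⁻³ threshold comes from the example above; the array names and the linear membership ramp (and its 500-unit width) are illustrative assumptions only.

    import numpy as np

    SAND_CUTOFF = 8150.0   # impedance cutoff from the example above (m/s x g/cc)

    def classify_clean_sand(impedance, cutoff=SAND_CUTOFF, transition=500.0):
        """Return a hard clean-sand flag (1 above the cutoff) and a simple fuzzy sand
        membership that ramps linearly from 0 to 1 over a transition zone centred on
        the cutoff. The transition width is an illustrative, uncalibrated choice."""
        hard = (impedance > cutoff).astype(int)
        fuzzy = np.clip((impedance - (cutoff - transition)) / (2.0 * transition), 0.0, 1.0)
        return hard, fuzzy

    # usage sketch on a few impedance samples
    z = np.array([6200.0, 7900.0, 8150.0, 8600.0, 9800.0])
    hard, fuzzy = classify_clean_sand(z)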
Well Ties
After preparing the well logs and reaching an understanding of the reliability of
impedance for predicting reservoir properties the wells are converted to the time
domain and tied to the seismic. Our well tie procedure includes phase independent
methods based on amplitude envelope and Backus upscaling. Backus upscaling
attempts to account for wave propagation through thin-layered formations and can be
a useful additional tool in improving the tie of the well to the seismic.
At Earthworks we never apply stretch or squeeze in order to arbitrarily improve well
ties. In our experience, these practices are unnecessary and tend to propagate
noise into the wavelet estimation. This in turn tends to result in inconsistent wavelets
estimated at the different wells.
An example well tie and some of the individual estimated wavelets at different wells
are shown on page 7 (centre).
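For reference, a minimal sketch of Backus-style upscaling for vertical P-wave propagation is shown below: the harmonic average of the P-wave modulus and the arithmetic average of density over a running window. The window length and log names are assumptions, and the anisotropy terms of the full Backus average are ignored.

    import numpy as np

    def backus_upscale(vp, rho, window=15):
        """Simplified Backus upscaling of sonic (vp) and density (rho) logs for
        normal-incidence P-waves: harmonic average of the P-wave modulus
        M = rho * vp**2 and arithmetic average of density over a running window
        of samples, then back to an effective velocity."""
        M = rho * vp**2
        kernel = np.ones(window) / window
        M_eff = 1.0 / np.convolve(1.0 / M, kernel, mode="same")   # harmonic mean of M
        rho_eff = np.convolve(rho, kernel, mode="same")           # arithmetic mean of rho
        return np.sqrt(M_eff / rho_eff), rho_eff                  # vp_eff, rho_eff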
Wavelet Estimation & Zero Phase Deconvolution
Wavelet estimation proceeds through a choice of three methods:

- Constant Phase
- Wiener-Levinson full phase
- Roy White method (White, 1980; Walden and White, 1984)

These are tested in turn to select the most suitable given the seismic data quality.
Examples are shown on page 7 (bottom), where the results of de-phasing using constant
phase (CP), full phase (FP) and Roy White (RW) wavelets are compared to the raw seismic.
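To give a flavour of the simplest of these, the sketch below scans constant phase rotations of a tied seismic trace against a well synthetic and keeps the rotation with the highest correlation. It is a generic constant-phase estimate under an assumed rotation sign convention, not the specific implementation used here.

    import numpy as np
    from scipy.signal import hilbert

    def constant_phase_scan(trace, synthetic, phases_deg=range(-180, 180, 5)):
        """Scan constant phase rotations of the seismic trace and return the angle
        giving the best zero-lag correlation with the well synthetic. Both inputs
        are assumed to be aligned in time and equally sampled."""
        analytic = hilbert(trace)                       # analytic signal of the trace
        best_phi, best_r = 0.0, -np.inf
        for phi in phases_deg:
            rotated = np.real(analytic * np.exp(-1j * np.deg2rad(phi)))
            r = np.corrcoef(rotated, synthetic)[0, 1]
            if r > best_r:
                best_phi, best_r = phi, r
        return best_phi, best_r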


[Figures (page 7): top left, crossplot of impedance (m/s × g/cc, roughly 5,000 to
13,000) against effective porosity (0 to 35%) for sand and shale samples; top right,
impedance histograms showing the separation of clean sand from non-clean
lithologies; bottom, raw seismic compared with de-phased sections using the
constant phase (CP), full phase (FP) and Roy White (RW) wavelets.]

SEISMIC INVERSION DATA


Phase 1 (continued)
Relative Impedance & Coloured Inversion
After obtaining a wavelet estimate, the wavelet is used to design an inverse operator
to zero-phase deconvolve the seismic. The deconvolution may be a de-phase only
or a full-inverse procedure to correct both the phase and the amplitude spectra. Pre- and post-deconvolution seismic is compared on page 9.
After zero phase deconvolution, a simple approach to estimate relative impedance is
by trace integration. The result obtained from this straightforward method is also
shown on page 9. Generally, we do not deliver this product to clients, preferring
instead to supply the coloured inversion result shown at the bottom of page 9.
Coloured inversion is reliable, quick to produce and may be generated for acoustic
and elastic impedances, thus conveniently capturing AVO effects.
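As an aside, the trace-integration estimate of relative impedance mentioned above can be sketched very simply; the band-pass step and its 6-80 Hz corners are illustrative assumptions used to suppress the low-frequency drift that running integration introduces.

    import numpy as np
    from scipy.signal import butter, filtfilt

    def relative_impedance_by_integration(trace, dt, f_lo=6.0, f_hi=80.0):
        """Crude relative impedance from a zero-phase trace: running integration
        (cumulative sum) of the reflectivity-like trace, followed by a band-pass
        filter to remove integration drift. dt is the sample interval in seconds."""
        integrated = np.cumsum(trace) * dt
        nyquist = 0.5 / dt
        b, a = butter(4, [f_lo / nyquist, f_hi / nyquist], btype="band")
        return filtfilt(b, a, integrated)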
In the Stratton Field data set used here for illustration (see notes below), there is
sand characterised by higher impedances in Well-08 at a time of 1275-1280 ms.
The sand can be qualitatively interpreted in the coloured inversion display as the
mainly red interval, tracking down into the 1280-1290 ms interval either side of the
well. The same interpretation can be made on the relative impedance, but the
resolution and clarity are poorer.
At Earthworks we consider coloured inversion to be the most cost-effective,
qualitative impedance product that we can deliver to our clients. Its advantages are
ease of interpretation and, being a seismic attribute, avoidance of artefacts which may be
introduced by models used to constrain deterministic inversions (Francis and Syed,
2001). However, coloured impedance is still a relative measure of impedance
changes and therefore is not suitable for use in quantitative estimation of reservoir
properties, as may have been indicated by analysis of the relationships between
lithology or reservoir properties and acoustic or elastic impedance.
It may be that for the target reservoir under study impedance cannot predict reservoir
properties quantitatively. If this is the case then consultation with the client and a
decision to stop further analysis may be taken. The coloured impedance cube would
then be the final deliverable for qualitative interpretation by the client.
Notes on Stratton Field 3D Seismic Data
The data-set used here to demonstrate stochastic inversion is the Stratton Field 3D seismic
and well log data package, prepared by the Bureau of Economic Geology, Austin, Texas,
USA (Levey et al, 1994).
Stratton Field is an on-shore gas field producing from the Oligocene Frio Formation in the
NW Gulf Coast Basin. The top of the middle Frio formation is about 1200 ms, the start of the
data shown here. There is little faulting in this interval. Depositional environment is multiple
amalgamated fluvial channel-fill and splay sandstones. The composite channel fill deposits
range from 10 to 30 ft in thickness and up to 2,500 ft in width. Splay deposits show typical
thicknesses of 5 to 20 ft and are proximal to the channel systems. Sands are generally
indicated by high impedances and have typical velocities of 12,000 ft/s, so a 30 ft sand is
around 5 ms thick in two-way time.
Each seismic cross-section is cross-line 154 from the 3D cube. The sand maps on pages 14
and 15 show the location of this section (as a red N-S line) and its relationship to the wells.


[Figures (page 9): original seismic, zero-phase deconvolved seismic, relative
impedance and coloured inversion sections; the relative panels are displayed on a
±1.0 scale.]

SEISMIC INVERSION DATA


Phase 2 - Deterministic Inversion
Model Building
Earthworks has developed its own geostatistical model-building scheme, writing
software specifically for this purpose. The basic framework is provided by the
seismic horizons. Between the horizons, we interpolate the well impedance values
using kriging, with layering either proportional between the seismic horizons or
conformable to the top or base horizon.
As our method is geostatistical and the same model will be used subsequently in
stochastic inversion, we analyse the data using variograms. Each layer (bounded by
the seismic horizons) is able to have its own 3D anisotropic variogram model
definition.
Vertical variogram analysis is made directly from the well log data (page 11, top left).
Horizontal variogram analysis is made from horizon slices through the relative
impedance data obtained from the coloured inversion completed in Phase 1 (page
11, top centre). If sufficient wells are available, the variogram model from the
horizontal slice analysis will be compared to the horizontal well variogram (page 11,
top right). The directional variograms define the model shape and range; the vertical
direction also defines the sill.
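For readers who want to reproduce the vertical variogram analysis on their own logs, a minimal sketch of the classical experimental semivariogram for a regularly sampled log is given below; the function name and maximum lag are assumptions.

    import numpy as np

    def experimental_variogram(log, dz, max_lag=50):
        """Classical semivariogram estimator for a regularly sampled well log:
        gamma(h) = 0.5 * mean((z(i+h) - z(i))**2) for lags h = 1..max_lag samples.
        Returns lag distances (in units of dz) and the semivariance values."""
        lags = np.arange(1, max_lag + 1)
        gamma = np.array([0.5 * np.mean((log[h:] - log[:-h]) ** 2) for h in lags])
        return lags * dz, gamma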
Deterministic Inversion
Conventional seismic inversion to absolute impedance is commonly referred to as
deterministic. We usually compute these cubes as a quality control step before
proceeding to stochastic inversion, but for some clients the deterministic impedance
cubes may be the final delivered product from an inversion study.
Band-limited inversion is a method in which the relative impedance component from
seismic is scaled and added to a low pass filtered (low frequency) model. The result
is an absolute impedance cube. Band-limited inversion is useful as a qualitative
interpretation tool, but suffers from some drawbacks. In particular, the wavelet is not
removed and resolution is not improved.
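The arithmetic of band-limited inversion is simple and is sketched below for a single trace; the low-pass corner frequency and the scalar are placeholders that would in practice come from the seismic bandwidth and a well-based calibration.

    import numpy as np
    from scipy.signal import butter, filtfilt

    def band_limited_inversion(rel_imp, model, dt, f_corner=8.0, scale=1.0):
        """Band-limited inversion for one trace: the relative impedance from the
        seismic (scaled to impedance units by 'scale') is added to the low-pass
        filtered (low frequency) impedance model. dt is in seconds; the 8 Hz corner
        and unit scale are illustrative assumptions."""
        nyquist = 0.5 / dt
        b, a = butter(4, f_corner / nyquist, btype="low")
        low_freq_model = filtfilt(b, a, model)
        return low_freq_model + scale * rel_imp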
In model-based inversion, the initial model is iteratively updated using generalised
linear inversion in order to obtain an optimal impedance solution whose forward
convolution is a good match to the seismic. It is a robust and reliable method, able to
remove the wavelet and, given a good initial model, improve resolution and remove
tuning effects.
There are two significant limitations of all deterministic inversion schemes (including
sparse-spike inversion). The first is that the model is embedded in the result and this
may introduce artefacts which can be misleading in interpretation (see Francis and
Syed, 2001; Francis, 2002). In order to reduce the risk of mis-interpretation we
always deliver the model in addition to the inversion results to the client and
recommend that horizon slices are compared between model and inversion.
The other limitation of deterministic schemes is that, because they produce optimal
solutions, they are unable to reproduce the full range of impedance observed in the
wells. This means that cutoffs or classifiers determined from well logs should not be
applied to the absolute impedances obtained from deterministic inversion schemes.


[Figures: initial impedance model and model-based inversion sections (impedance
colour scale 5,000 to 10,000) and band-limited inversion section (colour scale 6,100
to 7,600).]

SEISMIC INVERSION DATA


Phase 3 - Stochastic Inversion
The stochastic inversion technique developed at Earthworks is a hybrid approach
which enables the computation of stochastic impedance realisations using a
conventional inversion algorithm.
Using the same seismic horizon framework, well logs and 3D anisotropic variograms
defined for kriging the initial model for deterministic inversion in Phase 2, we use a
very fast FFT-based spectral simulation method to generate impedance realisations,
conditional to the well impedance data. As necessary we can also generate coupled
conditional realisations as may be required in joint inversion for near / far offset
impedance or time-lapse studies. The initial impedance realisations are then
updated by application of the generalised linear inversion (GLI) algorithm in order to
make them conditional to the observed seismic.
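To give a flavour of FFT-based spectral simulation, the sketch below generates an unconditional 1D Gaussian realisation with an exponential variogram by filtering white noise in the frequency domain. It is the generic textbook construction (unconditional and one-dimensional), not the Earthworks 3D conditional implementation.

    import numpy as np

    def spectral_simulation_1d(n, dz, sill, vrange, seed=None):
        """Unconditional 1D Gaussian simulation on n cells of size dz with an
        exponential covariance (given sill and practical range), using the FFT
        spectral method: white noise is filtered by the square root of the
        circular power spectrum of the covariance model."""
        rng = np.random.default_rng(seed)
        lag = np.minimum(np.arange(n), n - np.arange(n)) * dz     # wrapped lags
        cov = sill * np.exp(-3.0 * lag / vrange)
        power = np.maximum(np.real(np.fft.fft(cov)), 0.0)         # spectral density
        white = rng.standard_normal(n)
        return np.real(np.fft.ifft(np.sqrt(power) * np.fft.fft(white)))

    # e.g. a 1000-sample realisation with 1 ms cells and a 20 ms vertical range
    z = spectral_simulation_1d(1000, 1.0, sill=1.0, vrange=20.0, seed=42)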
The algorithm is ultra-fast, allowing routine calculation of 100+ impedance
realisations of large 3D seismic datasets without any special computer hardware
requirements. Using a conventional, high-specification two-processor PC the dataset
shown here takes under 5 hours to compute one hundred 3D realisations.
Some impedance realisations are shown on page 13. At the top is the band-limited
deterministic inversion from Phase 2 and to its right the original seismic data. Below
are shown four impedance realisations (left) and their forward convolution to a
synthetic seismic section (right). Comparison of the synthetic sections with the real
seismic at top confirms each realisation is conditional to the seismic.
Comparison between the impedance realisations shows how significant non-uniqueness
is in seismic inversion. All realisations share a common colour table, with
the high impedance sand previously described shown as the red to blue colours
around 1280 ms. The continuity and thickness variations between just these four
realisations are highly significant and their magnitude may be surprising.
Using the impedance criterion of > 8,150 m s⁻¹ × g cm⁻³ to indicate clean sands, the net
sand in the entire impedance cube has been computed from the model-based
deterministic inversion from Phase 2 and for each of the 100 stochastic impedance
realisations. The deterministic inversion gives a net sand of 8.5 % whereas the
impedance realisations range from 11.6 to 15.5 % with a mean value of 13.5 % net
sand. The wells show an average net sand of 13.2 %. The cumulative distribution
function of net sand is shown below (red curve).
[Figure: cumulative distribution function of net sand (fractional volume, 0.08 to
0.16) over the 100 stochastic realisations, with the well average (13.2%) and the
deterministic inversion value (8.5%) marked.]


[Figures (page 13): deterministic inversion and original seismic at top (impedance
colour scale 5,000 to 10,000); below, stochastic impedance realisation sections
(scale 4,500 to 11,500) and their forward convolutions to synthetic seismic.]

SEISMIC INVERSION DATA


Quantitative Analysis and Probabilistic Sand Prediction
The significant under-prediction of net sand from deterministic inversion is expected
from geostatistical theory. It was mentioned in Phase 2 (see page 10) that
deterministic inversion schemes are unable to reproduce the full range of impedance
as their output is optimal and therefore smoother than the impedance observed at the
wells. If we truncate the distribution and integrate samples above the cutoff (in this
case > 8,150 m s⁻¹ × g cm⁻³) we will find too few samples and hence systematically
underestimate net sand. The stochastic inversions, whilst uncertain and non-unique,
do reproduce the impedance distribution and therefore there is no bias when we
apply the cutoff. By looking at many realisations, the uncertainty in net sand variation
is quantified, as shown by the distribution curve on page 12.
We can make further predictive use of the impedance realisations. For each
stochastic inversion realisation we check if each time sample is classified as clean
sand. By counting the proportion of realisations which show the sample as clean
sand, we obtain the probability of sand at this sample. This resultant cube is
sometimes referred to as an iso-probability cube. Cross-line 154 from the cube
calculated from this data is shown on page 15 (top). The colour table shows a
chance of 50% or better of being clean sand. Using a voxel system to pick the
envelope of the sand around 1280 ms from this volume we can then compute the
isochron and hence thickness of the sand. As a comparison this has also been done
for the deterministic inversion and the two sand thickness maps are shown on page
15. The deterministic net sand map has less sand predicted and is generally thinner.
The channel in the top left (NW) corner of the P50 stochastic net sand map is nicely
defined but not evident at all in the deterministic net sand map.
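A hedged sketch of the two calculations described above (net sand per realisation and the per-sample probability of sand) is given below; the realisation stack is assumed to be a 4D array of impedance cubes and the cutoff is the one used earlier.

    import numpy as np

    SAND_CUTOFF = 8150.0   # m/s x g/cc, as used for the clean sand classification

    def sand_statistics(realisations, cutoff=SAND_CUTOFF):
        """Given a stack of impedance realisations of shape (n_real, nz, ny, nx),
        return the net-sand fraction of each realisation and the probability-of-sand
        cube obtained by counting, at every sample, the proportion of realisations
        classified as clean sand."""
        is_sand = realisations > cutoff              # boolean sand flag per sample
        net_sand = is_sand.mean(axis=(1, 2, 3))      # net sand fraction per realisation
        prob_sand = is_sand.mean(axis=0)             # iso-probability cube (nz, ny, nx)
        return net_sand, prob_sand

    # usage sketch: percentiles of net sand across the realisations
    # net_sand, prob_sand = sand_statistics(realisations)
    # p10, p50, p90 = np.percentile(net_sand, [10, 50, 90])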
The maximum sand probability map shown on page 15 is the peak probability within
the P50 sand thickness envelope. This is similar to a standard deviation map. There
is a very high probability of sand around wells with sand. From the isochron map
note the thick sand SE of Well-08. The maximum probability map shows this to be a
high probability of sand too: a clear candidate for infill drilling. To summarise, the
display below is the interpreted P50 stochastic net sand map. The channel at the top
is nicely defined, together with the thick depositional trend across the centre of the
area. A possible splay is interpreted around Well-17 and an abandoned meander
(ox-bow) adjacent to Well-13. The infill drilling target is clearly indicated.

[Figure: interpreted P50 stochastic net sand map, annotated "Thickest Predicted
Sand for Infill Drilling".]


[Figures (page 15): iso-probability sand prediction section (50 to 100% probability);
deterministic net sand and P50 stochastic net sand maps (0.0 to 48.0 ft); maximum
sand probability map (50 to 100%).]


SEISMIC INVERSION DATA

References
Connolly, P., 1999, Elastic Impedance. The Leading Edge, April 1999, pp 438-452
Francis, A. M., 1997, Acoustic impedance inversion pitfalls and some fuzzy analysis.
The Leading Edge, March 1997, pp 275-278
Francis, A. M., 2001, Understanding and improving acoustic impedance inversion.
Presented at SEG Development & Production Forum Taos, New Mexico.
Francis, A. M. and Syed, F.H., 2001, Application of relative acoustic impedance
inversion to constrain extent of E sand reservoir on Kadanwari Field. Presented
at SPE/PAPG Annual Technical conference, 7-8 November 2001, Islamabad,
Pakistan.
Francis, A. M., 2002, Deterministic Inversion: Overdue for Retirement? Presented at
PETEX 2002 Conference and Exhibition, London, UK.
Russell, B. and Hampson, D., 1991, A comparison of post-stack seismic inversion.
61st Ann. Internat. Mtg. Soc. of Expl. Geophys., pp 876-878
Lancaster, A. and Whitcombe, D., 2000, Fast-track coloured inversion. Presented
at SEG 2000 meeting Expanded Abstracts
Levey, R. A., Hardage, R. A., Edson, R. and Pedleton, V., 1994, 3-D Seismic and
well log data set: Fluvial Reservoir Systems, Stratton Field, South Texas, Bureau
of Economic Geology, University of Texas, Austin, Texas, 78713-7509, USA
White, R. E., 1980, Partial coherence matching of synthetic seismograms with
seismic traces. Geophysical Prospecting 28 pp 333-358
Walden, A.T. and White, R.E., 1984, On errors of fit and accuracy in matching
synthetic seismograms and seismic traces. Geophysical Prospecting 32 pp 871-891

NB: Copies of papers by Francis may be obtained either from our website at
http://www.sorviodvnvm.co.uk or by email to ashley.francis@sorviodvnvm.co.uk
