
NATIONAL UNIVERSITY, HCMC

BACH KHOA UNIVERSITY


FACULTY OF GEOLOGY & PETROLEUM ENGINEERING

GEOSTATISTICS
for Petroleum Engineers
Lecturer: Dr Ta Quoc Dung

CONDITIONAL SIMULATION IN
GEOSTATISTICS AND ITS APPLICATIONS

Group’s members:

o Huỳnh Thanh Toàn - 1652610


o Nguyễn Văn Nhân - 1652444
Topic:
WHAT IS CONDITIONAL SIMULATION IN
GEOSTATISTICS, AND ITS APPLICATIONS?

I. INTRODUCTION:

The methods that enable us to study seismic series from spatial as well as temporal or energetic

points of view can be classified into two types: deterministic and probabilistic methods. In this

study we suggest the use of probabilistic methods, founded in probability theory, because of the

lack of information about the causes of the sequence of seismic events (Kagan 1992); the

complexity of the cause-effect relations of the seismic series (De Miguel 1976; Posadas et al. 1993a,b); the fact that earthquakes are considered as non-linear, chaotic phenomena (Kagan 1997; Feng et al. 1997) that are self-similar with no scale variability (Kagan 1997);

the lack of a complete theory about the occurrence of earthquakes; the complexity and the

difficulty in interpreting the character of the data; the multitude of variables involved in an

earthquake (which implies that we are dealing with a multidimensional process); and various other

factors (Kagan & Jackson 1996).

Geostatistics offers various methods for quantifying and estimating many geological variables. All these methods consider the value of each point as well as the spatial and/or temporal position of each point with respect to the others. This approach yields results that approximate the real values better than those obtained with other methods.
II. Definition of Conditional Simulation in Geostatistics:

Conditional simulation has come to be used in the analysis and mapping of regionalized variables. An important advantage of the geostatistical approach to mapping lies in the modelling of spatial covariance that precedes interpolation; the semivariogram models derived from this step can make the final estimates sensitive to directional anisotropies present in the data. On the other hand, the smoothing property of kriging can also mean that one throws away detail at the mapping stage. A simulation is said to be 'conditional' when it honours the observed values of the regionalized variable (Haas 1990).

III. Important Concepts:


3.1/ Geostatistics concept:

Geostatistics has its basis in Matheron's (1965) Theory of Regionalized variables. A random
variable is one that has a variety of values in accordance with a particular probability distribution
(Journel & Huijbregts 1978). If the random variable is distributed in space and/or time we say
that it is a regionalized variable. These variables, due to their spatial-temporal character, have
a random as well as a structural component (Matheron 1963). At first sight, a regionalized variable
seems to be a contradiction. In one sense it is a random variable that locally does not have any
relation to the nearby variables. On the other hand, there is a structural aspect in the regionalized
variable that depends on the distance of separation of the variables. Both characteristics can be
described, however, using a random function for which each regionalized variable is a particular
realization. By incorporating the random as well as the structural aspects of a variable in a simple
function, the spatial variability can be accommodated on the basis of the spatial structure shown
by these variables (Carr 1983). In this sense, a regionalized variable is a variable that qualifies a phenomenon distributed through space and/or time and that presents a certain correlation structure. To date, regionalized variables have been used in geological disciplines to represent phenomena in a qualitative as well as quantitative manner; for example, the grade of a mineral, the thickness of strata, the pluviometry of a region, the impermeability of the land and the resistivity of the ground (Chica Olmo 1987).
3.2/ Variogram concept:

From a geostatistical point of view, the sequence of magnitudes of the earthquakes of a seismic
series can be considered as a regionalized variable. This variable is interpreted as a function mag(x)
that provides the magnitude mag of an earthquake x within the sequence of a seismic series. The
regionalized variable mag(x) behaves like a random variable, and an earthquake can be considered
as a particular realization of the random function mag(x), made up of a set of random variables
[mag(x1), mag(x2),… , mag(xn)], a sequence of earthquakes within the seismic series. The
transitional behaviour (between a deterministic and random state) is typical of a regionalized
variable (Herzfeld 1992). One way of examining the spatial structure of a regionalized variable is
to relate the changes between the variables to the distance that separates them. If the average
difference between the variables increases as the separation distance increases, there is a spatial
structure and the variables are regionalized. If, on the other hand, the average difference between the variables changes erratically, irrespective of the separation distance, the variable is random and there is no spatial structure. It is thus possible to identify analytically the spatial behaviour of a random variable (Carr 1983).

Taking into account the concept of statistical variance, and considering that the expectation of the random function at a point x is the mean m(x), the variance is itself an expectation (Journel & Huijbregts 1978):

Var[Mag(x)] = E{[Mag(x) - m(x)]²}.   (1)

For two regionalized variables Mag(x1) and Mag(x2), which have variances at x1 and x2, the covariance that relates them can be expressed (Journel & Huijbregts 1978) as

C(x1, x2) = E{[Mag(x1) - m(x1)][Mag(x2) - m(x2)]}.   (2)

The intrinsic geostatistical hypothesis assumes only second-order stationarity of the increments, so the data may perfectly well contain a drift. This aspect is very important because the majority of the phenomena analysed in the Earth Sciences have drifts/tendencies (Herzfeld 1992). Since under the intrinsic hypothesis m(x1) = m(x2), the variance of the increment between the variables Mag(x1) and Mag(x2) can be expressed as

Var[Mag(x1) - Mag(x2)] = E{[Mag(x1) - Mag(x2)]²}.   (3)

If the points x1 and x2 are separated by a distance h, then x1 = x, x2 = x + h, and

2γ(h) = E{[Mag(x) - Mag(x + h)]²}   (4)

and

γ(h) = ½ E{[Mag(x) - Mag(x + h)]²}.   (5)

Eq.(4) defines the analytical equation of the variogram (or variogram function), and eq.(5) defines the semi-variogram, although it is very common to call the semi-variogram the variogram too.
These equations describe the structural aspect of a regionalized variable, and the function γ
measures the spatial continuity, or, in other words, the spatial correlation. In this case, γ(h)
represents half of the average of the squares of the differences between the magnitudes of the
earthquakes separated by a step or distance h. Since this methodology originated in the solution of
mining problems, some of the original terms are retained, such as the case of the step h or distance.
In this way, it is deduced that γ(h) is a vectorial function depending on the modulus and the angle
of the vector of distance h. Considering the sequence of the magnitudes of a seismic series, the
vector distance h is linear (angle = 0°) and its modulus is a function of the order of the sequence.

The statistical inference of the direct variogram is obtained from the estimator γ*(h) of the variogram γ(h) of eq.(5):

γ*(h) = [1 / (2 Np(h))] Σ(i=1..Np(h)) [mag(xi) - mag(xi + h)]²,   (6)

where Np(h) is the number of pairs of earthquake magnitudes separated by a distance h; the equation represents half of the average squared increment between the earthquake magnitudes separated by a distance h.
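As an illustration, the estimator of eq.(6) can be sketched in a few lines of Python/NumPy for a 1-D sequence of magnitudes (the function name and the integer-lag convention are our own choices, not part of the source):

```python
import numpy as np

def experimental_variogram(mag, max_lag):
    """Estimator of eq. (6): gamma*(h) = (1 / 2Np(h)) * sum of squared
    increments between magnitudes separated by the (integer) lag h."""
    mag = np.asarray(mag, dtype=float)
    lags = np.arange(1, max_lag + 1)
    gamma = np.empty(len(lags))
    for i, h in enumerate(lags):
        diffs = mag[h:] - mag[:-h]      # the Np(h) pairs at distance h
        gamma[i] = 0.5 * np.mean(diffs ** 2)
    return lags, gamma
```

For a seismic series the "distance" h is simply the order in the sequence, so integer lags are the natural choice.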

IV. The Conditional Geostatistical Simulation Method for the Energetic Simulation of Seismic Series:
As with almost all of the fundamental concepts of geostatistics, the first steps in the geostatistical simulation method were taken by Matheron, who proposed and implemented the method in the form of the Turning Bands Method. Later, the theoretical bases were affirmed and supported (Guibal 1972; Journel 1974a,b; Chiles 1977; Journel & Huijbregts 1978; Alfaro 1979) and the first applications appeared (Deraisme 1978a,b).

The development of this methodology originally came from the mining industry, to solve the
problems found in design and mining planning in which the values estimated turned out to be
‘smoothed’ with respect to the real values, not reflecting the degree of variability and detail present
in the values of the mining variables (Armstrong & Dowd 1994).

In the 1990s many theoretical aspects were developed, including new, more efficient algorithms, as well as applications in multiple fields: basin analysis, image processing, modelling of karstic media, simulation of geological lithofacies, etc. (Armstrong & Dowd 1994). This development has been aided by the widespread use of computers, making it possible to deal with the numerous calculations required.

The values of the models obtained using geostatistical simulation agree with the experimental
information and reproduce the observed variability.

We have proved that the following functions of the real values mag(x) of a random function Mag(x) and of the values magCS(x) obtained by conditional simulation coincide:

the averages: E[Mag(x)] = E[MagCS(x)];

the variances: Var[Mag(x)] = Var[MagCS(x)];

the variograms: γ(h) = γCS(h);

and the histograms: F(mag) = FCS(mag).

Moreover, a conditioning of the simulated model to the experimental values is imposed; that is, at an experimental point or time x, the simulated and the experimental values coincide:

magCS(x) = mag(x) at every experimental point x.

The fact that the variograms of the simulated values and the real values coincide implies that both
sets of values have the same spatial and/or temporal variability.

4.1/ Experimental variographic analysis:


Variographic analysis begins with the calculation of the experimental variogram from the experimental data, using formula (6) for the experimental estimator. In the case of a temporal series, the calculations are in a 1-D space. However, the experimental variogram calculated from the data is not a true variogram, in the sense that it is not a positive-definite function (Journel & Huijbregts 1978), and therefore a theoretical variogram function must be fitted to the experimental one. This process is called variogram fitting, and it requires choosing the type of theoretical variogram function and the parameters that define it.

Once the experimental variogram has been calculated we have to find the theoretical variogram that fits it best, defining the type of function, its range, its sill and any additional parameters required by the chosen function. These parameters and the type of function give us the spatial and/or temporal information needed to handle the problems related to the estimation and simulation of the variables. The fitting method is based on an empirical process: the user selects the type and parameters of the variogram function after performing tests and drawing on experience. The fitted model is later validated by means of a simple cross-validation method. In addition to this classical method, there are more recent methods based on non-linear regression and weighting of variables. These latter methods are automatic, and the user's contribution is not needed. The selection of a method is a matter for the user, to be made in the light of experience; we have chosen the classical form for the present work.
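A minimal sketch of this fitting step, assuming SciPy is available and choosing a spherical model for illustration; the experimental lag/semi-variogram values below are invented for the example, not taken from the series analysed here:

```python
import numpy as np
from scipy.optimize import curve_fit

def spherical(h, nugget, sill, rang):
    """Spherical variogram model: grows from the nugget and levels off
    at nugget + sill once h reaches the range."""
    h = np.asarray(h, dtype=float)
    rising = nugget + sill * (1.5 * h / rang - 0.5 * (h / rang) ** 3)
    return np.where(h <= rang, rising, nugget + sill)

# Illustrative experimental values (lag, semi-variogram); real ones would
# come from the estimator of eq. (6).
h_exp = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
g_exp = np.array([0.08, 0.14, 0.18, 0.20, 0.21, 0.21])

# Least-squares fit of nugget, sill and range to the experimental points
params, _ = curve_fit(spherical, h_exp, g_exp, p0=[0.05, 0.15, 4.0])
nugget, sill, rang = params
```

The automatic least-squares fit here stands in for the empirical trial-and-error process described above; in the classical workflow the user would still inspect and adjust the fitted parameters.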

Variographic analysis is very important because it enables us to define the structure of the data,
and the degree of correlation. Note that the variogram function indicates the variability that exists
among the data in the set considered. In the case when the variogram is stationary, that is it reaches
a limiting value in its growth (sill) that represents the maximum variability or minimum correlation
among the data, the relation between the variogram function and the covariance function is given by the expression (Journel & Huijbregts 1978)

γ(h) = C(0) - C(h).   (7)

From this we can deduce that the variogram and the covariance functions are symmetrical if the phenomenon with which we are dealing is stationary.

From this expression we also define the correlogram, ρ(h), and the relative variogram, γr(h), which enable us to compare series of different data. Thus, if we normalize the previous expression with respect to the variance σ² = C(0), we have

γ(h)/σ² = 1 - C(h)/σ²,   (8)

from which

γr(h) = 1 - ρ(h),  with ρ(h) = C(h)/σ² and γr(h) = γ(h)/σ².   (9)
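Under the stationarity assumption above, the correlogram and relative variogram can be estimated directly from a magnitude sequence; this is a small sketch with a naive plug-in covariance estimator (the function name and estimator choice are ours):

```python
import numpy as np

def correlogram_relative_variogram(mag, max_lag):
    """Naive estimates of the correlogram rho(h) = C(h)/sigma^2 and the
    relative variogram gamma_r(h) = 1 - rho(h) for integer lags."""
    mag = np.asarray(mag, dtype=float)
    m, sigma2 = mag.mean(), mag.var()
    rho = np.empty(max_lag)
    for h in range(1, max_lag + 1):
        c_h = np.mean((mag[:-h] - m) * (mag[h:] - m))   # covariance at lag h
        rho[h - 1] = c_h / sigma2
    return rho, 1.0 - rho
```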

As Pardo-Igúzquiza (1992) indicates, this entire structural analysis can be performed at three moments of the conditional geostatistical simulation process: first, on the experimental data, to characterize their spatial variability; second, on the normalized values obtained when the experimental values have been transformed by the non-conditional simulation; and third, on the values obtained as a result of the conditional simulation, in order to test the goodness of the results.
4.2/ Conditioning of the simulation of the experimental values:
The number of possible simulations of a random function Mag(x) that fulfil the condition of being isomorphic to the experimental realization, i.e. having the same mean, the same variance and the same variogram function, is infinite.

The conditioning process enables us to choose, from all the possible realizations of the simulation, those that incorporate the experimental data points. To do this, starting from the non-conditional simulation, a series of operations is carried out to make the simulated values coincide with the experimental values. This is achieved using a technique called kriging (Matheron 1970; Journel & Huijbregts 1978):

ycs(x) = ys(x) + [y*k(x) - y*sk(x)],  with  y*k(x) = Σ(i=1..N) λi y(xi),

where ycs(x) is the conditionally simulated Gaussian datum, ys(x) is the non-conditionally simulated Gaussian datum, y(x) is the normalized experimental datum, y*k(x) is the value estimated by kriging from the data y(x), y*sk(x) is the value estimated by kriging from the data ys(x), λi are the weights of the kriging system, and N is the number of nearby points considered in the kriging.
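The conditioning step can be sketched as follows. For illustration we assume an exponential covariance model, simple kriging with a known zero mean, and data located exactly on grid nodes; all of these are simplifying assumptions of the sketch, not part of the method as stated above:

```python
import numpy as np

rng = np.random.default_rng(0)

def cov(h, sill=1.0, a=5.0):
    """Assumed exponential covariance model C(h) = sill * exp(-|h|/a)."""
    return sill * np.exp(-np.abs(h) / a)

def simple_krige(x_data, y_vals, x_grid):
    """Simple-kriging estimate (known zero mean) at every grid node."""
    K = cov(x_data[:, None] - x_data[None, :])   # data-to-data covariances
    k = cov(x_grid[:, None] - x_data[None, :])   # grid-to-data covariances
    return k @ np.linalg.solve(K, y_vals)        # kriging weights applied to data

# Normalized experimental data y(x) at three locations on a 1-D grid
x_data = np.array([2.0, 7.0, 13.0])
y_data = np.array([0.8, -0.4, 1.1])
x_grid = np.arange(20.0)

# Non-conditional Gaussian simulation ys via Cholesky of the grid covariance
K_grid = cov(x_grid[:, None] - x_grid[None, :]) + 1e-10 * np.eye(x_grid.size)
y_s = np.linalg.cholesky(K_grid) @ rng.standard_normal(x_grid.size)

# Conditioning step: ycs(x) = ys(x) + [y*k(x) - y*sk(x)]
idx = x_data.astype(int)                         # data points lie on grid nodes
y_k = simple_krige(x_data, y_data, x_grid)       # kriged from the real data
y_sk = simple_krige(x_data, y_s[idx], x_grid)    # kriged from the simulated data
y_cs = y_s + (y_k - y_sk)
```

Because kriging is an exact interpolator, y_cs reproduces the experimental values at the data locations while keeping the simulated variability everywhere else, which is exactly the conditioning property described above.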
V. Application of Conditional Simulation:

1. THE ENVIRONMENTAL REMEDIATION SCENARIO:

Conditional simulation might be applied in a remediation situation.


The problem involves sampling contaminated soils to select the portions of the site with concentrations in excess of a regulatory standard, which require remediation. We assume: an initial data set from a representative sub-area of the site; a specified action level applied to a remediation unit (RU) of a size appropriate to the remediation method (e.g., removal by front-end loader); and an economic objective function in which total cost equals the sum of the sampling cost, the remediation costs for all selected RUs, and the cost of residual contamination in all non-selected RUs.

The decision rule is that if the estimated value of an RU exceeds the action level, it will be remediated. Unit RU remediation cost is constant; unit RU non-remediation cost is proportional to concentration; and the two unit costs are defined to be equal at the action level. The objective is to estimate the number of samples in a single-phase campaign which would result in the lowest total project cost. This defines the optimal sampling density for the remainder of the site. We will use the following notation:
No  The optimum number of samples, at which the expected value of the total project cost is a minimum.

Ne  An estimate of No from the conditional simulation design procedure.

nc  The number of initial samples used to condition a simulation (step 2, below).

ns  The number of simulated samples taken in one iteration of the design procedure (step 4, below).

n   The number of samples to be taken in an actual field sampling campaign.

2. THE SAMPLING DESIGN PROCEDURE:

The procedure is a Monte Carlo resampling scheme which simulates the remediation operation, including data collection, interpolation, and selection.

1) Estimate the variogram model from the initial (conditioning) data set.

2) Generate a detailed simulated site model which is consistent with both the conditioning data and the variogram model. The site model is a dense array of possible sample values.

3) Compute "true" RU values from the site model. Each RU value is the mean of all simulated values within it.

3. Application of the Method to the Berja Seismic Series (Almería, South East Spain), 1993 December - 1994 March

The Berja (Almería, Spain) seismic series took place between 1993 December 23 and 1994 March 12, in an area of the southeast of the Iberian Peninsula (Fig. 1a) that has a high level of seismicity. The locations of the epicentres, classified according to magnitude, can be seen in Fig. 1(b). The seismic series began with the earthquake of greatest magnitude, 5.0 on the Richter scale; after this there were several aftershocks, two of which reached 4.0 on the same scale. The rest of the earthquakes were of lower magnitude.


Fig. 2(a) shows the sequential evolution of the magnitudes of the seismic series: the order of each earthquake in the sequence is shown on the x-axis and the corresponding magnitude on the y-axis. Fig. 2(b) shows the temporal evolution of the magnitudes, with the time of occurrence on the x-axis.
The previous steps have shown the calculations carried out for one of the cases in detail. The whole process of conditional geostatistical simulation is repeated 100 times for each case to be simulated, using the numbers 1 to 100 inclusive as random seeds. With all these data it is possible to calculate the cumulative histogram (Fig. 3), which contains the information from all these calculations in probability terms. As an example we consider a given magnitude of 2.5. The probability that an earthquake of magnitude greater than 2.5 will occur is 10 per cent, and the probability that this same earthquake will be of magnitude less than or equal to 2.5 is 90 per cent. In the example case, earthquake number 61 had a magnitude equal to 1.5.

Figure 3: Probability occurrence curve for earthquake number 61 of the Berja seismic series, calculated from 77 values.
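Reading probabilities off the cumulative histogram can be sketched as follows; the simulated magnitudes below are a hypothetical stand-in for the realizations described above, not the actual simulation output:

```python
import numpy as np

rng = np.random.default_rng(61)

# Hypothetical stand-in for the 100 conditionally simulated magnitudes
# of one earthquake (the real values come from the 100 simulation runs)
sims = rng.normal(loc=1.9, scale=0.4, size=100)

def occurrence_probabilities(sims, mag):
    """Exceedance and non-exceedance probabilities read off the
    cumulative histogram of the simulated magnitudes."""
    p_above = np.mean(sims > mag)
    return p_above, 1.0 - p_above

p_above, p_at_or_below = occurrence_probabilities(sims, 2.5)
```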
The conditional geostatistical simulation method is adequate to analyse the variability of the magnitudes of the earthquakes of a seismic series, and it permits the calculation of the values that the magnitude of an earthquake can reach at one specific moment in the development of a seismic series, expressed using the probability occurrence curve.

The statistical distribution of the magnitudes of the Berja (Almería, Spain) seismic series, 1993 December - 1994 March, is slightly biased to the right, which indicates a slight preponderance of earthquakes with magnitude greater than the median of the distribution.

The earthquakes that make up a seismic series are not independent phenomena, as they have a similar spatial and temporal genesis. Seismic series are naturally correlated data sets. It is this interdependence and correlation that allows us to know the structure that exists among the earthquakes of a seismic series, a structure that becomes apparent through the variogram function. It is this structure that makes it possible to calculate the magnitude of the earthquakes that will occur with greatest probability after a given moment.
4. OTHER APPLICATIONS:

Solute transport prediction is always subject to uncertainty owing to the scarcity of observation data. The data worth of limited measurements can be explored by conditional simulation, which is an efficient approach for modelling solute transport in a randomly heterogeneous aquifer.

Conditional simulation also provides optimal estimates while honouring the original data at their locations, so it can be used to map sharp boundaries in a domain. By comparison, inverse distance weighting, based on simple nearest-neighbour calculations, is probably the best non-geostatistical interpolation technique.

VI. Conclusion:

Conditional simulation is genuinely helpful in geostatistics and in other industries. It offers clear advantages, together with rules of use that help the engineer obtain good results in practice. Moreover, the applications of conditional simulation broaden the scientific background that supports the engineer's experimental work.


References

1. Deutsch, C.V. (1993). Conditioning reservoir models to well test information, in A. Soares (ed.), Geostatistics Tróia '92. Kluwer Academic Publishers, Dordrecht.

2. Isaaks, E. (1990). The Application of Monte Carlo Methods to the Analysis of Spatially Correlated Data. Unpublished PhD dissertation, Stanford University.

3. Murray, C.J. (1994). Identification and 3-D modeling of petrophysical rock types, in J.M. Yarus and R.L. Chambers (eds), Stochastic Modeling and Geostatistics. American Association of Petroleum Geologists, Tulsa.

4. Haas, T.J. (1990). Lognormal and moving window methods of estimating acid deposition. Journal of the American Statistical Association.
