
SPE-170636-MS

Integration of Principal-Component-Analysis and Streamline Information for the History Matching of Channelized Reservoirs
C. Chen, Shell International Exploration and Production; G. Gao, Shell Global Solutions US Inc.; J. Honorio, MIT;
P. Gelderblom, Shell Global Solutions International; E. Jimenez, Qatar Shell GTL Limited; T. Jaakkola, MIT

Copyright 2014, Society of Petroleum Engineers


This paper was prepared for presentation at the SPE Annual Technical Conference and Exhibition held in Amsterdam, The Netherlands, 27-29 October 2014.
This paper was selected for presentation by an SPE program committee following review of information contained in an abstract submitted by the author(s). Contents
of the paper have not been reviewed by the Society of Petroleum Engineers and are subject to correction by the author(s). The material does not necessarily reflect
any position of the Society of Petroleum Engineers, its officers, or members. Electronic reproduction, distribution, or storage of any part of this paper without the written
consent of the Society of Petroleum Engineers is prohibited. Permission to reproduce in print is restricted to an abstract of not more than 300 words; illustrations may
not be copied. The abstract must contain conspicuous acknowledgment of SPE copyright.

Abstract
Although Principal Component Analysis (PCA) has been widely applied to effectively reduce the number
of parameters characterizing a reservoir, its disadvantages are well recognized by researchers. First, PCA
may distort the probability distribution function (PDF) of the original model, especially for non-Gaussian
properties such as facies indicator or permeability field of a fluvial reservoir. Second, it smears the
boundaries between different facies. Therefore, the models reconstructed by PCA are generally unacceptable to geologists.
A workflow is proposed to seamlessly integrate Cumulative-Distribution-Function-based PCA (CDF-PCA) and streamline information for assisted history matching (HM) of a two-facies channelized reservoir. The CDF-PCA
is developed to reconstruct reservoir models using only a few hundred principal components. It inherits
the advantage of PCA to capture the main features or trends of spatial correlations among properties, and
more importantly, it can properly correct the smoothing effect of PCA. Integer variables such as facies
indicators are regenerated by truncating their corresponding PCA results with thresholds that honor the
fraction of each facies at first, and then real variables such as permeability and porosity are regenerated
by mapping their corresponding PCA results to new values according to the CDF curves of different
properties in different facies. Therefore, the models reconstructed by CDF-PCA preserve both geological
(facies fraction) and geostatistical (non-Gaussian distribution with multi-peaks) characteristics of their
original or prior models. Our preliminary results indicate that the history-matched model using the
CDF-PCA alone may not satisfy the requirement of geologists, e.g., some channels may become
disconnected during history-matching. Therefore, we propose a method of combining CDF-PCA together
with streamline information. Because velocity of the tracer in the streamline provides connectivity
information between injectors and producers, it enhances channel connectivity without over-correction on
cell-based permeability during the process of history matching.
The CDF-PCA method is applied to a real-field case with three facies to quantify the quality of the
models reconstructed. The history matching workflow is applied to a synthetic case. Our results show that
the geological facies, reservoir properties, and production forecasts of models reconstructed with CDF-PCA are consistent with those of the original models. The integrated HM workflow of CDF-PCA
with streamline information generates reservoir models that honor production history with minimal
compromise of geological realism.

Introduction
Both object-based and MPS-based (Multi-Point Statistics) models generate relatively more geologically
realistic channel bodies compared to conventional two-point geostatistics-based techniques. However,
conditioning such models to production data and correctly sampling the posterior probability distribution
are challenging problems. One of the major challenges is that the number of parameters to be tuned during
the process of history-matching (HM) is too large to be effectively handled by available HM workflows,
especially when adjoint gradient is unavailable. Another challenge is that the models obtained after HM
generally violate or distort the geological and geostatistical characteristics (e.g., channelized reservoir) of
the original or prior models.
A geological model that describes the depositional environment and is used to predict future
production is highly relevant to field development planning. To best predict future reservoir performance and reduce reservoir uncertainty, geological models need to be conditioned to all types of
available data, e.g. 3D seismic data, well log data, production data, time-lapse seismic data, core analysis
data, etc. There are many tools or algorithms to generate geological models subject to geostatistical
description; however, it is not easy to adjust geological models to match data and retain geological
consistency, especially when the geological model contains channels, lobes, or other non-Gaussian
object-based features.
In only a few cases, e.g. aeolian system, wave-dominated delta system or one facies problem, the
reservoir model parameters (e.g. permeability and porosity) can be assumed random Gaussian fields that
can be characterized by variogram. A Gaussian field can be modeled with two-point statistics [14], in
which a variogram is used to quantify how properties are correlated with each other spatially. Many
methods have been developed to quantify uncertainties of reservoir model parameters and production
forecasts through conditioning to production data, such as the Ensemble Kalman filter (EnKF) method [1] and the
randomized maximum likelihood (RML) method [16]. Both methods are formulated on the basis of
Bayesian framework under the assumption of prior Gaussian reservoir models. However, in most cases,
the prior reservoir model is non-Gaussian, e.g. a fluvial system, where the uncertain parameters are highly
relevant to structures and patterns that cannot be characterized by simple two-point correlations. For
different lithology/facies, rock properties could be sampled from very different probabilistic distributions
so that the overall probability distribution of rock properties is non-Gaussian. To model non-Gaussian
geology, object-based modeling [11] and Multi-Point-Statistics modeling [19] techniques are able to
provide clear shape and boundary of geological bodies and some detailed geological/channelized features.
The channelized geostatistical descriptions (e.g. lithology distribution, channel width, wavelength, orientation, etc.) are usually input to a geological modeling software to generate unconditional realizations
by sampling with random seeds. By running simulations of these realizations of reservoir models, one can
quantify the uncertainty of production forecasts. However, conditioning these non-Gaussian models to
various production data and properly sampling their posterior probability distribution are still very
challenging tasks.
Recently, the Karhunen-Loève (K-L) expansion [6] has been introduced to approximate geological models
with an infinite linear combination of orthogonal functions. As a linear K-L approximation, Principal
Component Analysis (PCA) has been applied to effectively reduce the number of parameters characterizing a reservoir and preserve major characteristics of two-point geostatistics. PCA has been applied in
many fields, such as face recognition, history matching [15, 21], and seismic interpretation [18]. Mohamed [15] and Chen [4] used PCA for model reparameterization and history matching of the synthetic
Brugge field. PCA decouples the selection of the basis functions and the estimation of coefficients. In
other words, the basis functions represent the parameter/model uncertainty in a library of the given prior
realizations [15, 21] and a given set of coefficients represents a realization of the reservoir model, e.g.
stochastic spatial distribution of properties. The coefficients can be calibrated with the observed dynamic
data. By applying PCA, a reservoir model can be reconstructed by a linear combination of eigenvectors
obtained with a given set of prior reservoir models using only a few principal components. In fact, PCA
can be regarded as a smoothing operator or tool that removes high frequency noise from the original data
sets. The fewer the number of principal components, the smoother the reconstructed parameters. The most
outstanding advantage of PCA is its capability of capturing the strong spatial correlations among
parameters, with only a few principal components. However, its smoothing effect also raises some issues
because some subtle but important statistical details (e.g., cumulative density or probability density
function) of prior models are no longer preserved. For the channelized facies problem, PCA transforms
discrete facies indicators to real numbers, smears boundaries between different facies, and distorts the
probability distributions of reservoir parameters, especially when the prior models present multi-peak
non-Gaussian PDF. Sarma et al. [17] applied a Kernel-PCA method, which is a nonlinear form of K-L
expansion, to address the limitations associated with the linear K-L expansion. The results of Kernel-PCA
are better than the results of PCA. However, as shown by Sarma et al. [17], the Kernel-PCA also generates
unclear boundaries between facies.
Khaninezhad et al. [12, 13] applied the K-SVD method to capture the channel features of fluvial
models, and compared their K-SVD results with results of PCA for several synthetic cases. Their results
indicate that by sparsifying the prior models to learn a geologic dictionary and using sparse reconstruction
techniques, it is feasible to represent and estimate complex non-Gaussian geologic features from limited
flow data. However, the computational cost of applying K-SVD to a large reservoir model is high.
One of our major objectives is to reconstruct channelized geological and reservoir models with far
fewer uncertain parameters so that we can condition these models to production data through
automatic/assisted history matching using model-based derivative-free optimization algorithms. The
geological and reservoir models obtained by HM have to capture major flow dynamics and are also
constrained by histogram of lithology distribution and correlations between reservoir properties of the
original model. The following four important questions must be addressed when regenerating geological
realizations using new parameters.
1. Are new parameters able to regenerate both static models (e.g. facies) and dynamic models (e.g.
permeability field)?
2. Are the models regenerated by new parameters still geologically realistic and relevant? In other
words, do the new geological and reservoir models satisfy geologists' requirements?
3. Are the realizations generated from sampling the probability distribution of these new uncertain
parameters (generally they are Gaussian and much easier to sample) equivalent to (and not
biased from) the realizations generated from sampling the prior probability distribution?
4. Is it feasible to match historical production data by tuning these new parameters, e.g., using
derivative-free optimization methods, especially when adjoint gradient is unavailable?
PCA makes it possible to capture the main features of non-Gaussian models, but loses some minor
features. As a model reconstructed with PCA tends to filter out high-frequency information and the
statistics after reconstruction deviate from the original statistics, a straightforward idea is to use the normal
score transformation [5] to obtain a non-Gaussian field. Several researchers have applied normal score
transform and/or truncated Gaussian to history matching problems with non-Gaussian prior models (e.g.,
channelized models). Zhao et al. [22] applied the truncated Plurigaussian models to generate facies models
and the Ensemble Kalman Filter to update the associated permeability field. Zhao et al. [22] considered
correcting water saturation, which tends to deviate from its original distribution at each assimilation
time step, using normal score transform; however, their results indicate that such a correction does not
yield significant improvement over standard EnKF. Zhou et al. [23-25] applied normal score transform on
each element of the state vector (e.g. pressure and saturation fields) before EnKF updating step and
back-transform after EnKF updating step.
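As an illustration of the normal score transform mentioned above, the following is a minimal rank-based sketch. It is a generic version and not the implementation of [22] or [23-25]; the function names and the plotting-position convention (rank + 0.5)/n are assumptions made here.

```python
import numpy as np
from statistics import NormalDist

def normal_score_transform(values):
    """Replace each value by the standard-normal quantile of its rank."""
    values = np.asarray(values, dtype=float)
    # Double argsort gives the rank (0 .. n-1) of every element.
    ranks = np.argsort(np.argsort(values, kind="stable"), kind="stable")
    # Assumed plotting position; keeps percentiles strictly inside (0, 1).
    p = (ranks + 0.5) / values.size
    inv_cdf = NormalDist().inv_cdf
    return np.array([inv_cdf(q) for q in p])

def back_transform(scores, reference):
    """Map Gaussian scores back onto the empirical distribution of `reference`."""
    reference_sorted = np.sort(np.asarray(reference, dtype=float))
    cdf = NormalDist().cdf
    p = np.array([cdf(z) for z in np.asarray(scores, dtype=float)])
    # Percentile -> index into the sorted reference sample.
    idx = np.minimum((p * reference_sorted.size).astype(int),
                     reference_sorted.size - 1)
    return reference_sorted[idx]
```

In an EnKF-type workflow the update would be applied to the transformed scores, and `back_transform` would then restore the original (possibly non-Gaussian) marginal distribution.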
Inspired by the normal score transform applied in these prior studies, we propose a new procedure
(CDF-PCA) that can reproduce exactly the same CDF of each original individual realization. Integer
variables such as facies indicators are regenerated by truncating their corresponding PCA results with
thresholds that honor the fraction of each facies at first, and real variables such as permeability and
porosity are regenerated by mapping their corresponding PCA results to new values according to the CDF
curves of different properties in different facies. Therefore, the models reconstructed by CDF-PCA
preserve both geological (facies fraction) and geostatistical (non-Gaussian distribution with multi-peaks)
characteristics of their original or prior models. If parameters defining the CDF curves are strongly
correlated to those new uncertain parameters (or PCA coefficients), analytical response-surface-models
using Radial-Basis-Function (RBF) [27] are constructed to represent those CDF transform mapping
functions. If they are weakly correlated to PCA coefficients, the facies fraction and the statistics of rock
properties are treated as independent uncertain variables which can be quantified by prior information and
to be tuned during history matching. By seamlessly integrating the CDF-based mapping functions with
PCA, the new method (also called CDF-PCA) inherits the advantage of PCA to capture the main features
or trends of spatial correlations among properties, and preserves the PDF of all geological and reservoir
properties by properly correcting the smoothing effect of the traditional PCA. For discrete properties (e.g.,
facies indicators), the CDF-mapping of CDF-PCA is similar to the truncated Gaussian method, and their
roots can be traced back to the theory of inverse cumulative density function [3]. Truncated Gaussian is
applied on a Gaussian random field. In contrast, the CDF-PCA is able to capture connectivity uncertainty
in the given training realizations; therefore the reconstructed field with CDF-PCA is not limited to a
Gaussian field.
Our preliminary results indicate that the history-matched model using the CDF-PCA alone may not
satisfy the requirements of geologists, e.g., some channels regenerated by the CDF-PCA during history matching may become broken or disconnected. To alleviate such negative impacts and thereby
improve the image quality, we propose a method of combining CDF-PCA with streamline
information, which is relevant to flow behavior. We split the history matching process into two steps. First,
we use a streamline simulator to generate the velocity map based on a model generated from PCA before
performing CDF-mapping. Although this model is not perfect, it is sufficient for the streamline simulator to
generate the velocity map. Because the velocity of the tracer [20] along the streamline provides connectivity
information between injectors and producers, it enhances channel connectivity without over-correction
of cell-based permeability during the process of history matching. Second, we use the velocity map as
multipliers to adjust the facies generated from PCA and then apply CDF-mapping. In the second step, cells
in high-velocity areas gain a higher weight to be sand, and vice versa.
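The second step can be pictured with the following sketch. The text does not give the exact form of the multiplier, so everything here is an assumption made for illustration: the function name, the min-max rescaling of the PCA indicators, the linear multiplier `1 + strength * (v - mean(v))`, and the `strength` parameter. The velocity map is taken as given (for example, exported from a streamline simulator) and is assumed to have a positive maximum.

```python
import numpy as np

def velocity_weighted_facies(y, velocity, sand_fraction, strength=1.0):
    """Hypothetical weighting: favor sand where tracer velocity is high."""
    y = np.asarray(y, dtype=float)
    velocity = np.asarray(velocity, dtype=float)
    # Rescale PCA indicators to [0, 1] so a multiplier acts monotonically.
    span = y.max() - y.min()
    y_norm = (y - y.min()) / span if span > 0 else np.zeros_like(y)
    # Multiplier > 1 in fast cells, < 1 in slow cells (0 <= strength <= 1).
    v = velocity / velocity.max()
    multiplier = 1.0 + strength * (v - v.mean())
    y_adj = (y_norm * multiplier).ravel()
    # Two-facies CDF-mapping: the top `sand_fraction` of cells become sand (1).
    n_sand = int(round(sand_fraction * y_adj.size))
    order = np.argsort(y_adj, kind="stable")
    facies = np.zeros(y_adj.size, dtype=int)
    facies[order[y_adj.size - n_sand:]] = 1
    return facies.reshape(y.shape)
```

With `strength=0` the function reduces to plain CDF-mapping of the PCA result, which makes it easy to compare realizations with and without the velocity weighting.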
We also demonstrate the feasibility of conditioning to production data by tuning the PCA coefficients
with the simultaneous perturbation and multivariate interpolation (SPMI) optimization algorithm [4, 8,
29], which does not require analytical or adjoint derivatives of the objective function with respect to
parameters to be tuned during the process of HM. The SPMI algorithm is a derivative-free optimization
tool and has been proven robust in In-situ Upgrading Process (IUP) production optimization problems [8,
29]. In each iteration, the SPMI will generate perturbation and searching points simultaneously, and a
quadratic model will be constructed by interpolating perturbation points and searching points iteration by
iteration, using both the value of the objective function and the available derivatives evaluated at each
point. SPMI is a massively parallelized model-based optimization method with or without derivatives.

Figure 1: CDF mapping. The left subfigure (a) shows a cumulative density function (CDF) for a prior realization of a 2-facies model. The right
subfigure (b) shows a CDF for a realization reconstructed with PCA. By taking the same CDF values, $y_1$ and $y_2$ in the right subfigure can be
mapped to 0 and 1, respectively, in the left subfigure.

Methodology
PCA with CDF mapping to reconstruct facies and properties
The mean reservoir model can be estimated from $N_e$ unconditional samples $m^{(j)}$ (for $j = 1, 2, \ldots, N_e$)
by

$$\bar{m} = \frac{1}{N_e} \sum_{j=1}^{N_e} m^{(j)},$$

where $m^{(j)}$ is an $N_m$-dimensional vector and $N_m \gg N_e$. Using PCA, we can approximate a reservoir
model $m$ by $y$,

$$y = \bar{m} + \Phi^T \xi, \tag{1}$$

where $\Phi$ is an $N \times N_m$ matrix that is composed of the first $N$ orthogonal basis vectors (corresponding
to the $N$ largest eigenvalues of $C_m$, the covariance matrix of the original model), and $\xi$ (called the
PCA-coefficients vector) is an $N$-dimensional random vector with zero mean and identity covariance
matrix. PCA maps $m$ from a high ($N_m$) dimensional domain to $\xi$ in a much lower ($N$) dimensional domain.
However, when applying PCA to discrete facies indicators, they become real numbers, boundaries
between different facies become unclear, and the probability distributions of reservoir parameters are
distorted.
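A minimal NumPy sketch of the PCA parameterization in Eq. 1 is given below. It assumes that the basis vectors are scaled by the square roots of the eigenvalues, so that coefficients with identity covariance reproduce the sample covariance; the function names and the use of a thin SVD instead of an explicit covariance matrix are choices made for this illustration.

```python
import numpy as np

def fit_pca_basis(samples, n_components):
    """samples: (N_e, N_m) array, one prior realization per row.

    Returns the mean model (N_m,) and a scaled basis of shape (N, N_m)
    such that y = mean + basis.T @ xi with xi ~ N(0, I).
    """
    n_e = samples.shape[0]
    m_bar = samples.mean(axis=0)
    centered = samples - m_bar
    # Thin SVD avoids forming the N_m x N_m covariance matrix explicitly.
    _, s, vt = np.linalg.svd(centered, full_matrices=False)
    # Eigenvalues of the sample covariance are s**2 / (N_e - 1).
    scale = s[:n_components] / np.sqrt(n_e - 1)
    phi = scale[:, None] * vt[:n_components]
    return m_bar, phi

def reconstruct(m_bar, phi, xi):
    """Eq. 1: map PCA coefficients back to a full-size model."""
    return m_bar + phi.T @ xi

def project(m_bar, phi, m):
    """Least-squares PCA coefficients for a given model m."""
    xi, *_ = np.linalg.lstsq(phi.T, m - m_bar, rcond=None)
    return xi
```

Keeping fewer components yields a smoother `reconstruct` output, which is the smoothing effect that the CDF mapping is meant to correct.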
The PCA model in Eq. 1 is derived by minimizing the mean-square-error (MSE) of facies
indicators and reservoir properties between the original prior model and the reconstructed model.
Therefore, it is reasonable to assume that the order of most data in the original model is preserved in
the reconstructed PCA model. If the value of a property (e.g., porosity) in the i-th gridblock is greater than
the value of the same property in the j-th gridblock for the original model, $\phi_{i,\text{prior}} > \phi_{j,\text{prior}}$, then it is most
probable that the same order holds for the PCA model, i.e., $\phi_{i,\text{PCA}} > \phi_{j,\text{PCA}}$. Based on this observation, we
propose a CDF mapping method that remaps the real-valued facies indicators obtained with PCA to discrete
facies indicators that honor the fraction of each facies of the prior model, and then remaps reservoir
properties (e.g., porosity and permeability) in each gridblock from their original PCA values to new values
such that the new model preserves the prior PDF or CDF of these properties.
Figure 2: The first step to identify facies, Egg example. (a) Original facies; (b) Reconstructed facies with PCA; (c) Reconstructed facies with CDF-PCA; (d) Comparison of cumulative density
function between (a)-(c) facies models. Subfigure (d) shows that the blue curve lies upon the black curve after applying CDF-mapping.

Schematically, the CDF mapping is shown in Figure 1. The dark blue curve in Figure 1(a) shows a
cumulative density function (CDF) for a realization of a 2-facies Egg model. As it has only two integer
values, 0 or 1, the CDF curve is a stair-like curve where no point appears between 0 and 1. However, the
CDF curve of the model reconstructed by PCA becomes a continuous curve, and the approximated values
of facies indicators in all gridblocks become real values ranging from -0.5 to 1.5; see the dark blue curve
shown in Figure 1(b). A natural way to obtain integer-type facies indicators is to simply truncate the PCA
results either to 1 (when $y > 0.5$) or to 0 (otherwise). However, the fraction of each facies (or the CDF
curve) of the truncated facies indicators is not guaranteed to be the same as that of the original model, because
the threshold for the truncation is fixed at 0.5. In contrast, when CDF mapping is applied, we always
guarantee that the fraction of each truncated facies indicator is exactly the same as that of the original
model. In Figure 1(b), we take one variable ($y_1$) and follow the path in red to find its percentile value
($F(y_1)$). Then we map $F(y_1)$ to the left subfigure and find its corresponding new value of 0 ($m_{1,\text{prior}} = 0$)
with the same percentile value on the prior CDF curve. For $y_2$, we follow the green lines and map to the
left subfigure to find its corresponding value of 1. After we repeat the procedure for all gridblocks (all $y_i$),
we obtain a new facies model which has the same CDF curve as the prior facies model. In fact, the
CDF-mapping for integer variables (facies) can be regarded as a truncation operator defined as $f(y; \theta) = 0$
for $y \le \theta$ and $f(y; \theta) = 1$ for $y > \theta$, where $\theta$ is the threshold that is predetermined by the prior model.
For the example shown in Figure 1, $\theta = 0.495$. The switching point is referred to as the fraction of the
first facies, which corresponds to the threshold. For example, the switching point in the left subfigure of
Figure 1 is 0.61.
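The truncation described above can be sketched as a rank-based operation: the cells with the smallest PCA values are assigned to facies 0 until the prescribed switching point (fraction of facies 0) is reached. The function name and the midpoint convention for reporting the threshold are assumptions of this sketch.

```python
import numpy as np

def facies_from_pca(y, fraction_facies0):
    """Truncate real-valued PCA indicators so that a prescribed fraction
    of cells becomes facies 0 (two-facies case)."""
    flat = np.asarray(y, dtype=float).ravel()
    n0 = int(round(fraction_facies0 * flat.size))
    order = np.argsort(flat, kind="stable")
    facies = np.ones(flat.size, dtype=int)
    facies[order[:n0]] = 0  # the n0 smallest values become facies 0
    # Report the implied threshold (midpoint between the two groups).
    if n0 == 0:
        theta = -np.inf
    elif n0 == flat.size:
        theta = np.inf
    else:
        theta = 0.5 * (flat[order[n0 - 1]] + flat[order[n0]])
    return facies.reshape(np.shape(y)), theta
```

For example, with ten cells and a switching point of 0.6, the six smallest values become facies 0 even if one of them exceeds 0.5, whereas a fixed threshold of 0.5 would honor the prior fraction only by coincidence.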
The CDF-mapping procedure discussed above can also be applied to real variables (e.g. permeability,
porosity, Net-to-Gross values, etc.). For each realization, we include both facies indicators and rock
properties in the original model $m_{\text{prior}}$ and the PCA approximate model $y$. We use two steps to reconstruct
the realization of facies and rock properties. First, we use CDF-mapping to reconstruct facies indicators.
Second, given the facies indicator for each gridblock, we calculate the CDF curves of the rock properties
for each individual facies from prior realizations and then apply the CDF-mapping procedure to rock
properties, i.e., we remap the value of a rock property in a gridblock (the corresponding element in the
vector $y$) to a new value according to the CDF curve of that property for the facies that has already been
regenerated in the same gridblock.
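The second step can be sketched as a per-facies quantile mapping that preserves the rank order of the PCA values. This is an illustrative version using empirical CDFs; the function names and the plotting-position convention are assumptions, and every facies present in the regenerated model is assumed to occur in the prior sample as well.

```python
import numpy as np

def cdf_map(y_values, prior_values):
    """Map y_values onto the empirical distribution of prior_values,
    keeping the rank order of y_values."""
    y_values = np.asarray(y_values, dtype=float)
    prior_sorted = np.sort(np.asarray(prior_values, dtype=float))
    ranks = np.argsort(np.argsort(y_values, kind="stable"), kind="stable")
    p = (ranks + 0.5) / y_values.size             # percentile of each cell
    idx = np.floor(p * prior_sorted.size).astype(int)
    return prior_sorted[idx]                      # value at the same percentile

def remap_property(y_prop, facies, prior_prop, prior_facies):
    """Apply cdf_map separately inside each regenerated facies."""
    y_prop = np.asarray(y_prop, dtype=float)
    facies = np.asarray(facies)
    prior_prop = np.asarray(prior_prop, dtype=float)
    prior_facies = np.asarray(prior_facies)
    out = np.empty_like(y_prop)
    for f in np.unique(facies):
        cells = facies == f
        out[cells] = cdf_map(y_prop[cells], prior_prop[prior_facies == f])
    return out
```

When a facies contains the same number of cells in the regenerated model and in the prior sample, the remapped values are a permutation of the prior values, so the per-facies CDF is reproduced exactly.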
Figures 2 to 4 show results obtained by applying CDF-mapping for both facies and permeability field
of the Egg model. Figure 2 shows the first step to identify facies. An unconditional realization generated
from the prior Egg model is shown in Figure 2a. For the purpose of comparison, we also show the
corresponding model obtained with PCA (Figure 2b) and the corresponding model obtained with
CDF-PCA (Figure 2c). We first obtain the basis matrix $\Phi$ from the covariance matrix of the prior model
that is estimated from a set of training realizations (1000 realizations for this example). Note that the
training realizations can be generated by any geological modeling method, e.g., object-based modeling,
multipoint statistics, etc. A model $y$ (Figure 2b) corresponding to the prior realization $m_{\text{prior}}$ shown in
Figure 2(a) is then reconstructed with PCA (Eq. 1) by solving for the corresponding principal components
$\xi$ from $m_{\text{prior}}$. We should note that the $m_{\text{prior}}$ shown in Figure 2(a) is not in the training
set. For the given $m_{\text{prior}}$, the fraction of facies 0 is known (0.42); see the black curve with stars shown in
Figure 2(d). Therefore, the threshold ($\theta = 0.49$) for facies is determined. For a realization that is to be
generated during the history matching procedure, the threshold or switching point can be treated either as a
function of the coefficients $\xi$ or as an uncertain parameter to be tuned. Using the PCA model ($y$) and the
threshold value ($\theta = 0.49$), the 2-facies model $m_{\text{CDF-PCA}}$ is reconstructed (as shown in Figure 2c).
Figure 2(d) shows the corresponding CDF curves for the prior model (black curve with stars), the PCA
model (the green curve), and the CDF-PCA model (the blue curve), respectively. The CDF
curve of the CDF-PCA model is exactly the same as that of the prior model, i.e., the blue curve is identical
to the black curve with stars. However, the CDF curve of the PCA model (the green curve) deviates from
the prior CDF curve significantly.
Figures 3(a) and (b) show the second step to obtain permeability for facies 0 and facies 1 using
CDF-mapping. In both subfigures, the black curve with stars, the blue curve, and the green curve
correspond to CDF curves of the normalized permeability for the original/prior realization, the CDF-PCA
realization, and the PCA realization, respectively. The fact that the black and the blue curves are identical
clearly validates that the CDF-mapping preserves the probability distribution of the original model.

Figure 3: The second step to reconstruct permeability, Egg example. For different facies, normalized permeability is adjusted based on prior
histogram information.

Figure 4: Comparison of normalized permeability, Egg example. (a) Original permeability; (b) Reconstructed permeability with PCA; (c)
Reconstructed permeability with CDF-PCA; (d) Comparison of cumulative density function (CDF) curves for three permeability models.

Figure 5: QA/QC for response surface between switching points and PCA coefficients, Egg example.

Figures 4(a), (b), and (c) illustrate the permeability fields of the prior model, the model reconstructed by PCA, and the model reconstructed by
CDF-PCA, respectively. The noticeable difference between Figure 4(a) and Figure 4(b) clearly indicates that PCA may distort the permeability field
significantly. In contrast, after CDF-mapping, the new model reconstructed with CDF-PCA shown in Figure 4(c) is quite similar to the original or prior
model shown in Figure 4(a). The difference between the CDF curve of PCA (the green curve) and the CDF curve of the prior (the black curve with stars)
is also noticeable, as shown in Figure 4(d). The original CDF curve of permeability (in black with stars) clearly indicates that the prior distribution is
non-Gaussian, and the original PDF curve has two peaks: one peak corresponds to the shale facies with
very low permeability with the normalized k being about 0.1 in Figure 4(d); whereas the other peak
corresponds to the sand facies with quite large permeability with the normalized k being about 1.0 in
Figure 4(d). In contrast, the CDF curve of permeability from the traditional PCA shown as the green curve
in Figure 4(d) looks like a normal (Gaussian) distribution, and its PDF curve has only one peak.
Determination of switching point and parameters defining CDF curves for random realizations
When performing CDF-mapping, we need to specify the CDF curves for reservoir properties or
switching point for facies indicators. Typically, different realizations have different CDF curves or
different switching points. For a model that is reconstructed corresponding to a prior realization generated
from geostatistical software, the switching points for facies and CDF curves for real valued reservoir
properties in each facies can be determined from the prior realization. However, when a new realization
is generated with CDF-PCA but it does not correspond to any of those prior realizations, the CDF curves
or switching points used for the CDF-mapping for this new realization are not readily available. One can
treat the switching point and the statistical parameters that define the CDF curves of rock properties as
independent random/uncertain variables. If we assume that rock properties in the same facies follow a
Gaussian distribution, then the statistical parameters that define the CDF curves of rock properties could
include the mean value ($\mu_{\text{CDF},i,j}$) and the standard deviation ($\sigma_{\text{CDF},i,j}$) of the i-th property for the j-th facies.
The mean value and the standard deviation of a specific property in a specific facies may vary between
realizations. Based on those prior realizations, we can quantify the uncertainty of these new uncertain
parameters, i.e., determine their mean values ($\bar{\mu}_{\text{CDF},i,j}$ and $\bar{\sigma}_{\text{CDF},i,j}$) and their covariance matrix. In such
a situation, the parameters to be adjusted during assisted history matching include the PCA coefficients, the
switching point, and the statistical parameters that define the CDF curve of each rock property for each facies.
For a 2-facies problem, the total number of parameters is $n_c + 1 + 4 n_r$, where $n_c$ is the number of PCA
coefficients, 1 is for the switching point, and $n_r$ is the number of rock properties. The factor 4 comes from two
parameters (mean value and standard deviation) for each type of rock property in each of the two facies.

Figure 6: Realizations generated with (a) PCA without hard data, (b) CDF-PCA without hard data, (c) PCA conditioned to hard data, (d) CDF-PCA
conditioned to hard data. Blue is facies 0 and yellow is facies 1. Hard data in the four corner gridblocks are known. The red open circle represents
facies 0 and the red solid dot represents facies 1.

In a more general case, one can assume that the CDF parameters are correlated with $\xi$, but with a correlation
coefficient not equal to 1. They depend on $\xi$ to some degree, but are not completely determined by $\xi$. For
example, as shown in Figure 5, a response surface with RBF is built between switching points and PCA
coefficients for the Egg example, and 1000 QA/QC switching points were used to check the quality of the
response surface. If the quality of the response surface is satisfactory, i.e., the points are close to the
diagonal straight line in this QA/QC plot, there is a strong correlation between switching points and the
PCA coefficients. One can use the predicted switching point as the mean value and use the history matching
procedure to tune the noise for the switching point. In this case, Support-Vector-Regression [26, 28]
could be applied to capture both their dependence on $\xi$ and their stochastic features, e.g., using the RBF
kernel. Research in this respect is beyond the scope of this paper.
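A response surface of this kind can be sketched with a small radial-basis-function interpolator. The Gaussian kernel, the `length_scale` and `ridge` parameters, and the function names below are assumptions made for illustration; the paper does not state which RBF variant of [27] was used.

```python
import numpy as np

def _gaussian_kernel(a, b, length_scale):
    # Pairwise squared distances between rows of a and rows of b.
    d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-d2 / (2.0 * length_scale ** 2))

def fit_rbf(xi_train, s_train, length_scale=1.0, ridge=1e-8):
    """Fit switching point s as a function of PCA coefficients xi."""
    k = _gaussian_kernel(xi_train, xi_train, length_scale)
    offset = s_train.mean()
    # A small ridge term keeps the linear system well conditioned.
    weights = np.linalg.solve(k + ridge * np.eye(len(xi_train)),
                              s_train - offset)
    return {"centers": xi_train, "weights": weights,
            "offset": offset, "length_scale": length_scale}

def predict_rbf(model, xi):
    k = _gaussian_kernel(xi, model["centers"], model["length_scale"])
    return model["offset"] + k @ model["weights"]

def qaqc_correlation(model, xi_check, s_check):
    """Correlation between predicted and actual switching points,
    i.e. how close the QA/QC cross-plot is to the diagonal."""
    return np.corrcoef(predict_rbf(model, xi_check), s_check)[0, 1]
```

A correlation close to 1 on held-out realizations would support using the predicted switching point directly; a weaker correlation suggests treating the switching point as an additional uncertain parameter.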
Conditioning reservoir models to hard data with CDF-PCA
Hard data here is referred to data obtained from direct measurements (e.g., well log data), typically
point data obtained from wells. The term hard data is used to emphasize the fact that the modeling method
should exactly reproduce these measured values at their corresponding locations. However, training
realizations can be generated by object-based models (e.g., channelized models) which may or may not
have been conditioned to hard data, e.g., when new hard data acquired from a new wells that are drilled
later. These realizations can be easily reconstructed by PCA and conditioned to hard data by minimizing
the objective function defined in Eq. 2.
(2)
where mhard is a column vector of dimension Nhard, the number of hard data, and Chard is the Nhard ×
Nhard covariance matrix for measurement errors of hard data. The matrix Chard becomes diagonal when all
the measurement errors of hard data are independent of each other. One should pre-calculate the indices of
the hard data in the vector m. Then the hard-data sub-vector of y in Eq. 2 consists of the values of y at the
same indices as mhard. The second term of Eq. 2 is a regularization term, scaled by a regularization
coefficient, that penalizes the squared distance between the PCA coefficients of an unconditional
realization, generated without conditioning to the new hard data, and the new coefficients of the
conditional realization obtained by conditioning to the new hard data. We set the regularization coefficient
to 1 when the inverse of Chard exists. When no measurement error is associated with the hard data, the
inverse of Chard does not exist; this case is equivalent to setting Chard to an identity matrix while setting
the regularization coefficient to a very small positive value, e.g., 10^-8 in our implementation. The new
coefficients are obtained by setting the gradient of the objective function with respect to the coefficients to zero,
(3)
where the sub-matrix in Eq. 3 is the sub-matrix of the PCA basis matrix (see Eq. 1) that corresponds to the
indices of the new hard data. After solving for the new coefficients, we can follow the CDF-PCA workflow
discussed above to generate the conditional realization. A comparison between unconditional and
conditional realizations is shown in Figure 6. In this example, the training realizations (or images) are
generated by a multi-point-statistics approach. An unconditional realization is regenerated with PCA and
CDF-PCA, as shown in Figure 6(a) and (b), respectively. For the unconditional realization shown in
Figure 6(b), the facies indicators at the four corners of (0,0), (60,0), (0,60) and (60,60) are 0, 0, 0 and 1,
respectively. After drilling 4 new wells that are located on the four corners of the reservoir domain, hard data (facies
indicators) in the four corner gridblocks are known. The two open red circles at the two top corners
indicate that the two wells are drilled through facies 0 and the two solid red dots at the two bottom corners
indicate that the other two wells are drilled through facies 1. Core samples reveal the true facies indicators
are 1, 1, 0 and 0. Clearly, the unconditional realization does not match the hard data. The conditional
realization obtained with PCA after conditioning to hard data using Eq. 3 is shown in Figure 6(c), and the
conditional CDF-PCA realization is shown in Figure 6(d). The conditional realization yields correct facies
at all four corners. Because no reservoir simulation is required for conditioning to hard data, it is quite
convenient to determine the new coefficients with Eq. 3 even if the number of PCA components is huge.
Therefore, CDF-PCA provides a very efficient way of generating conditional realizations by conditioning
to new hard data without losing information from the original unconditional realizations.
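The analytical conditioning step can be illustrated with a minimal sketch. It assumes a linear PCA model y = y_mean + phi * xi and a quadratic objective made of a hard-data mismatch term weighted by the inverse of Chard plus a regularization term weighted by lam; the symbols xi, phi, y_mean and lam, the function name, and all numbers are hypothetical and are not taken from Eq. 1 to Eq. 3.

```python
import numpy as np

def condition_to_hard_data(phi, y_mean, xi0, hard_idx, m_hard, c_hard=None, lam=1.0):
    """Closed-form minimizer of an assumed quadratic objective:
    0.5*(y_h - m_hard)^T C^-1 (y_h - m_hard) + 0.5*lam*(xi - xi0)^T (xi - xi0),
    with y_h = y_mean[hard_idx] + phi[hard_idx] @ xi (assumed linear PCA model).
    """
    phi_h = phi[hard_idx, :]                      # basis rows at the hard-data cells
    if c_hard is None:
        # No measurement error: identity covariance and a tiny regularization weight.
        c_inv = np.eye(len(hard_idx))
        lam = 1e-8
    else:
        c_inv = np.linalg.inv(c_hard)
    residual = m_hard - (y_mean[hard_idx] + phi_h @ xi0)
    lhs = phi_h.T @ c_inv @ phi_h + lam * np.eye(phi.shape[1])
    return xi0 + np.linalg.solve(lhs, phi_h.T @ c_inv @ residual)

# Toy problem (illustrative numbers only): 50 cells, 8 coefficients, 4 hard data.
rng = np.random.default_rng(0)
phi = rng.standard_normal((50, 8))
y_mean = rng.standard_normal(50)
xi0 = rng.standard_normal(8)                      # unconditional coefficients
hard_idx = np.array([0, 9, 40, 49])
m_hard = np.array([1.0, 1.0, 0.0, 0.0])

xi_new = condition_to_hard_data(phi, y_mean, xi0, hard_idx, m_hard)
y_new = y_mean + phi @ xi_new
print(np.round(y_new[hard_idx], 4))               # field values at the hard-data cells
```

Only a small linear system of the size of the coefficient vector is solved, which is why no reservoir simulation is needed for this step. In the CDF-PCA workflow the CDF mapping would still be applied to the conditioned field afterwards; that step is omitted here.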
Conditioning reservoir models to production data with CDF-PCA
A reservoir model usually consists of millions of gridblocks, with several unknown properties
(permeability, porosity, net-to-gross ratio, initial water saturation, etc.) at each gridblock. Due to reservoir
heterogeneity, history matching is typically an ill-posed (underdetermined) problem, i.e., the number of
unknown parameters is much larger than the number of available observed data. History matching is usually
posed as a mathematical problem of generating posterior samples by conditioning to production data within a
Bayesian framework. When CDF-PCA is applied for reparameterization, it has been proved that the new
uncertain parameter vector is Gaussian with zero mean and identity covariance matrix. Therefore, the
logarithmic-PPDF (posterior-probability-density-function) of the parameter vector after conditioning to
measurement data dobs is given by the Bayes rule:
(4)
where dobs is an Nd-dimensional observation data vector and CD represents the Nd × Nd covariance
matrix of measurement errors for dobs; g* is the corresponding predicted data vector, obtained by running
the reservoir simulation g with the reservoir model m reconstructed from the parameter vector. During the
process of history matching, we adjust the parameter vector to minimize the objective function O for the
maximum a posteriori (MAP) estimate.
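As an illustration of the MAP objective, the sketch below assumes the standard Bayesian form for a zero-mean, identity-covariance Gaussian prior: a prior term 0.5 * xi^T xi plus a data-mismatch term weighted by the inverse of CD. The symbol xi, the function names, and the linear stand-in for the reservoir simulator are assumptions made for this example only.

```python
import numpy as np

def map_objective(xi, d_obs, c_d_inv, simulate):
    """Assumed objective: 0.5*xi^T xi + 0.5*(g - d_obs)^T C_D^-1 (g - d_obs)."""
    mismatch = simulate(xi) - d_obs
    return 0.5 * float(xi @ xi) + 0.5 * float(mismatch @ c_d_inv @ mismatch)

# Hypothetical linear stand-in for the reservoir simulator (3 data, 2 parameters).
G = np.array([[1.0, 2.0], [0.0, 1.0], [3.0, -1.0]])

def simulate(xi):
    return G @ xi

d_obs = np.array([1.0, 0.5, 2.0])
c_d_inv = np.eye(3) / 0.1 ** 2                    # independent errors, std 0.1

o_start = map_objective(np.zeros(2), d_obs, c_d_inv, simulate)

# For a linear simulator the MAP estimate has a closed form.
xi_map = np.linalg.solve(np.eye(2) + G.T @ c_d_inv @ G, G.T @ c_d_inv @ d_obs)
o_map = map_objective(xi_map, d_obs, c_d_inv, simulate)
print(o_start, o_map)
```

With a real simulator no closed form exists, and the objective is minimized iteratively by an optimizer that calls the simulator once per evaluation.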
Adjustment of PCA-models using streamline information
To improve the quality of the history-matched model in terms of connectivity for object-based models,
we account for flow behavior information. Flow-relevant information should be used to help correct
incorrect facies indicators before any type of static-information-based correction. In other
words, the idea is to update reservoir properties within stream tubes before applying the CDF-mapping
correction.
The interstitial velocity indicates the speed of a particle moving from one gridblock to another.
Therefore, a high-velocity area implies that the particle moves in a high-permeability zone, and vice versa.
For example, suppose a gridblock in a prior model was identified as shale facies with low permeability, but
the velocity in this gridblock (e.g., obtained by running streamline simulation) is much higher than the
velocities in other nearby gridblocks with shale facies. This is a good indicator that the gridblock should
be connected to a nearby channel, and we should therefore adjust its facies from shale to sand with high
permeability: a relatively high velocity is more probable when the gridblock connects to a channel than
when it is deposited randomly on a shale background.
Figure 7. An example of combining streamline information with CDF-PCA.

Figure 7 illustrates an example of adjusting PCA-models using the streamline (or velocity) information.
Starting from the top left subfigure (the facies model obtained with PCA), there are two paths in Figure 7:
the top path uses streamline information to highlight channels and then performs the CDF-mapping;
the lower path applies the CDF-mapping directly to obtain the facies model. As shown in the upper left
subfigure (the original PCA model) and the bottom right subfigure (the CDF-PCA model) in Figure 7, the
channel branch near the top tends to be broken in the results generated from PCA. However, the velocity
map (the left of the two pictures in the upper middle subfigure in Figure 7) suggests that the top channel
branch should be connected. The right of the two pictures in the upper middle subfigure in Figure 7 is
obtained after correcting the original PCA results (the top left). The connectivity of the upper channel
branch is improved by correcting the original PCA-model using the velocity (or streamline) information.
By gradually increasing the weighting factor of velocity information, the upper channel branch becomes
connected again.
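The effect of the weighting factor can be illustrated with a toy one-dimensional sketch. The blending rule used here (a convex combination of the continuous PCA field and a normalized velocity map) and the fixed 0.5 truncation used as a stand-in for the CDF mapping are assumptions made for illustration; they are not the correction formula of this paper.

```python
import numpy as np

def velocity_correction(y_pca, velocity, weight):
    """Blend a min-max normalized velocity map into the PCA field (hypothetical rule)."""
    v = (velocity - velocity.min()) / (velocity.max() - velocity.min())
    return (1.0 - weight) * y_pca + weight * v

def truncate(y, threshold=0.5):
    """Stand-in for the facies mapping: 1 (sand) above the threshold, 0 (shale) below."""
    return (y > threshold).astype(int)

# Ten cells along a channel. Cell 2 is smeared by PCA, which breaks the channel,
# while the velocity there is almost as high as in the neighboring sand cells.
y_pca = np.array([0.9, 0.8, 0.3, 0.85, 0.9, 0.1, 0.1, 0.2, 0.1, 0.15])
velocity = np.array([5.0, 5.0, 4.0, 5.0, 5.0, 0.1, 0.1, 0.1, 0.1, 0.1])

for weight in (0.0, 0.2, 0.5):
    facies = truncate(velocity_correction(y_pca, velocity, weight))
    print(weight, facies.tolist())
```

In this toy example the gap at cell 2 persists for small weights and closes once the velocity information is weighted strongly enough, which mirrors the qualitative behavior described for the upper channel branch.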

Case Studies
Quantifying quality of reconstructed models for a field case
The CDF-PCA method is applied to a real field example to quantify the quality of the reconstructed
models and to demonstrate that the method can handle a large-scale, complex channelized model. The
reservoir model has 0.2 million gridblocks, and is composed of sand, levee and shale facies. 1000
unconditional realizations were generated with an object-based prior geological model, and they were used
to form the basis matrix (see Eq. 1) for both PCA and CDF-PCA. Two measures were used to
quantify the quality of the reconstructed data sets: the fraction of incorrect facies (Ef) and the
mean-square-error (MSE, or the ℓ2 norm) of permeability and porosity between the images of the original
and the reconstructed data sets (CDF-PCA). As the ℓ2 norm is not a good choice for evaluating the difference
of facies between two models, we use Ef for facies instead. The fraction of incorrect facies, Ef, is
the normalized ℓ0 norm, i.e., the number of gridblocks with incorrect facies divided by
the total number of gridblocks.
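The two quality measures can be sketched as follows. The function names are hypothetical, and any normalization of permeability and porosity before computing the MSE is not specified here, so the MSE is computed directly on the values passed in.

```python
import numpy as np

def fraction_incorrect_facies(facies_ref, facies_rec):
    """Ef: number of gridblocks with a wrong facies divided by the number of gridblocks."""
    facies_ref = np.asarray(facies_ref)
    facies_rec = np.asarray(facies_rec)
    return np.count_nonzero(facies_ref != facies_rec) / facies_ref.size

def mean_square_error(prop_ref, prop_rec):
    """MSE between an original and a reconstructed property field."""
    diff = np.asarray(prop_ref, dtype=float) - np.asarray(prop_rec, dtype=float)
    return float(np.mean(diff ** 2))

# Tiny illustrative fields: 0 = shale, 1 = levee, 2 = channel.
facies_prior = [0, 1, 2, 2, 1, 0, 0, 0]
facies_recon = [0, 1, 1, 2, 1, 0, 1, 0]
ef = fraction_incorrect_facies(facies_prior, facies_recon)   # 2 wrong cells out of 8

perm_prior = [1.0, 2.0, 3.0, 4.0]
perm_recon = [1.0, 2.0, 3.0, 6.0]
mse = mean_square_error(perm_prior, perm_recon)              # one error of size 2 over 4 cells
print(ef, mse)
```

Ef counts a cell as wrong regardless of how far apart the two facies indicators are, which is why it is preferred over the squared difference for categorical data.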
In this example, a set of basis matrices with different numbers of principal components (N = 25, 50,
100, 200, 400, and 800) was tested to show the impact of N on the quality of the reconstructed model.
Table 1 summarizes the quality analysis for models reconstructed with different methods for the real field
case. The results listed on the second and the third rows in Table 1 clearly show that CDF-PCA can
improve the quality of the reconstructed facies model significantly. The PCA solution minimizes the total
MSEs for facies, permeability, and porosity between the original and the reconstructed data sets. When
N is small (e.g., less than 200), the MSEs of CDF-PCA for permeability (the fourth row) and porosity
(the sixth row) become larger than those of PCA (the fifth row and the seventh row), because the CDF
correction makes the new solution deviate from the PCA solution that minimizes the total MSEs.

Table 1. Quality analysis of models reconstructed with different methods for the real field case

Number of principal components                   25      50      100     200     400     800
Fraction of incorrect facies (CDF-PCA)           0.1824  0.1654  0.1391  0.0972  0.0519  0.0007
Fraction of incorrect facies (PCA-truncation)    0.2332  0.2197  0.1905  0.1352  0.0745  0.0012
MSE for permeability (CDF-PCA)                   0.2455  0.2249  0.1911  0.1451  0.1014  0.0345
MSE for permeability (PCA)                       0.2075  0.1914  0.1714  0.1384  0.1051  0.0494
MSE for porosity (CDF-PCA)                       0.2275  0.2095  0.1810  0.1409  0.1028  0.0408
MSE for porosity (PCA)                           0.1925  0.1790  0.1612  0.1312  0.1010  0.0485

Figure 8 (a) Relative improvement in quality, and (b) impact of the number of PCA coefficients on quality of facies, permeability, and porosity
models reconstructed by CDF-PCA method.

However, when N is large enough (e.g., larger than 200), the increment in MSEs of CDF-PCA for
permeability and porosity becomes negligible. Therefore, the gain from improving the quality of the
reconstructed facies model and, more importantly, from preserving the non-Gaussian characteristics
with CDF-PCA far outweighs the loss due to the negligible increment in MSEs for
permeability and porosity; see Figure 8(a).
Figure 8(a) illustrates the relative comparison of the quality of facies (in black), permeability (in red), and
porosity (in blue) models reconstructed by the CDF-PCA method and the traditional PCA-truncation method.
When compared to the traditional truncation method (PCA-truncation), where the threshold for truncation
is fixed to 0.5, the CDF-PCA method reduces the fraction of incorrect facies by 22% to 42%; see the black
curve with open circles in Figure 8(a). Similarly, the negative relative improvement shown by the red (for
permeability) and blue (for porosity) curves indicates that the MSEs of CDF-PCA for permeability and
porosity are larger than those of PCA when N is less than 200. Plots shown in Figure 8(b) indicate
that the measures of model error between the prior model and the model reconstructed with CDF-PCA
(fraction of incorrect facies or MSEs for permeability and porosity) decrease as the number of principal
components increases. When N ≥ 200, the fraction of incorrect facies is less than 0.1, and the MSEs for
permeability and porosity are less than 0.15.
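The relative improvement can be recomputed from the values listed in Table 1. The sketch below assumes that the relative improvement is defined as (PCA error - CDF-PCA error) / PCA error; this definition is an assumption, chosen because it is consistent with the 22% to 42% range quoted for facies.

```python
# Values copied from Table 1, one entry per number of principal components.
n_components = [25, 50, 100, 200, 400, 800]
ef_cdf_pca = [0.1824, 0.1654, 0.1391, 0.0972, 0.0519, 0.0007]
ef_pca_trunc = [0.2332, 0.2197, 0.1905, 0.1352, 0.0745, 0.0012]
mse_perm_cdf_pca = [0.2455, 0.2249, 0.1911, 0.1451, 0.1014, 0.0345]
mse_perm_pca = [0.2075, 0.1914, 0.1714, 0.1384, 0.1051, 0.0494]

def relative_improvement(reference, candidate):
    """Percentage by which `candidate` lowers the error of `reference` (assumed definition)."""
    return [100.0 * (r - c) / r for r, c in zip(reference, candidate)]

facies_gain = relative_improvement(ef_pca_trunc, ef_cdf_pca)
perm_gain = relative_improvement(mse_perm_pca, mse_perm_cdf_pca)

for n, f, p in zip(n_components, facies_gain, perm_gain):
    print(f"N={n:4d}  facies: {f:6.1f}%  permeability: {p:6.1f}%")
```

With the tabulated values, the facies improvement grows monotonically with N, while the permeability figure is negative for the smaller values of N and changes sign between N = 200 and N = 400.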
In the following example, only 50 principal components are used to reconstruct the facies, permeability
and porosity models of the real field case. Figure 9 shows images of facies distribution in one layer of the
prior realization (the top left), the PCA realization (the top middle), and the CDF-PCA realization (the top
right). Clearly, the facies model reconstructed with PCA (the top middle subfigure) is unacceptable for
geologists, because there is no clear boundary between the three facies. It is very difficult to identify the
levee facies from the PCA facies image. Even though we have reduced the number of uncertain parameters
from 600,000 to 50, a reduction factor of 12,000, the facies model reconstructed by CDF-PCA is quite
similar to the original model. As expected, CDF-PCA preserves the fractions (or PDF) of all three
facies, as seen by comparing the bar plots shown on the bottom right (generated from the realization
reconstructed with CDF-PCA) with those shown on the bottom left (generated from the prior realization).
In contrast, except for the inactive cells (indicated by the left-most blue bar on the bottom middle
subfigure), the PDF of the realization reconstructed with PCA behaves more like a Gaussian
than three distinct bars.

Figure 9. Layer images of facies (the first row) and their corresponding PDF (the second row) for the prior model (the left column), and the models
reconstructed by PCA (the middle column) and CDF-PCA (the right column) with 50 principal components.
Figure 10 shows images of facies (subfigures on the top row), permeability (subfigures on the second
row), and porosity (subfigures on the bottom row) in a vertical cross-section plane. These images are
generated from a prior realization (subfigures on the left-most column), reconstructed with PCA
(subfigures on the middle column) and CDF-PCA (subfigures on the right-most column), respectively.
Compared to the prior model, the porosity and permeability fields reconstructed by PCA are much
smoother and they are unacceptable for geologists and reservoir engineers. In contrast, after CDF
correction, the porosity and permeability fields reconstructed by CDF-PCA are quite similar to their
original counterparts. When more principal components are added, the difference between the models
reconstructed by CDF-PCA and their original prior models becomes negligible; see the plots shown in
Figure 8(b).
For the prior realization, some levee facies lie within the channel facies (see the thin light green levee
in the red channel on the top left subfigure in Figure 9). However, such subtle features disappear in the
facies model reconstructed with CDF-PCA, shown on the top right subfigure in Figure 9. One may also
notice that levee facies (with indicator of 1) tend to be located between the shale facies (with indicator of 0)
and the channel facies (with indicator of 2) in both the prior model and the model reconstructed with
CDF-PCA. According to the geological or depositional process, levee facies behaves much like a
transitional facies between channel facies and shale facies, and we follow this natural order of the
depositional process to order the facies indicators, i.e., 0 for shale, 1 for levee, and 2 for channel.
However, in a more general case, such a natural order among facies may not exist, especially when
more facies co-exist. The CDF-PCA method presented in this paper fails to reproduce satisfactory facies
images for such a case. Because of this limitation, in the following sections of this paper we focus
on reconstructing models with only two facies. Inspired by these observations, we have
investigated an alternative method, CDF-based pluri-PCA, to overcome the limitations of CDF-PCA,
where all facies are reconstructed according to the rock-type-rules and the order of facies indicators does
not impact the results of the reconstructed models. Details of the CDF-based pluri-PCA and how to
automatically determine the rock-type-rules based on prior realizations are beyond the scope of this paper,
and they will be discussed in more detail in another paper to be published later.

Figure 10. Vertical cross-sectional images of facies (the first row), permeability (the second row), and porosity (the third row) for the prior model
(the left column), and the models reconstructed by PCA (the middle column) and CDF-PCA (the right column) with 50 principal components.

Table 2. Parameters for Egg model

Parameter                   Value
Sand fraction               0.42
Orientation mean            53
Amplitude mean              125 m
Wavelength mean             847 m
Width mean                  84.3 m
Thickness mean              3 m
Sand permeability mean      2,163 md
Sand permeability std       1,320 md
Shale permeability mean     2 md
Shale permeability std      0.5 md

Quantifying uncertainty of production forecasts for the Egg model
In this section, we use the Egg model to demonstrate the similarity of production uncertainty
quantification between the prior models and the models reconstructed with the CDF-PCA method. The Egg
model is a meandering channel system. The channels are populated with an object-based modeling approach,
and facies 0 (sand) is modeled with a channel fraction of 0.42. Facies 1 (shale) is the background lithology.
The orientation, amplitude, wavelength, width and thickness of channels are sampled from triangular
distributions and their mean values are listed in Table 2. The minimum (or maximum) value of each
triangular distribution is defined as the min_factor (or max_factor) multiplied by the mean value. The
porosity and permeability in the sand body are modeled with Sequential Gaussian Simulation. The
correlation lengths along the major (or long) principal axis, the minor (or short) principal axis, and the
vertical axis are, respectively, 173 m, 93.5 m and 3.2 m. The azimuth is -102.0 degrees. The Egg model
contains 60 × 60 × 10 gridblocks. The grid size is Δx = Δy = Δz = 4.572 m. There are 6 producers and 4
injectors in the reservoir. Note that the models have already been conditioned to hard data of facies and
well log data of permeability and porosity. The standard deviations of permeability and porosity are
estimated from well log data directly. Only water and oil phases are present in the system.
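The sampling of channel geometry can be sketched with triangular distributions bounded by min_factor and max_factor times the Table 2 means. The factor values 0.8 and 1.2 are hypothetical, since the actual values are not given here, and the mode is chosen so that the mean of each distribution equals the listed mean, which is an assumption about how the means are used.

```python
import random

# Mean values taken from Table 2 (geometric parameters with stated units only).
TABLE2_MEANS = {
    "amplitude_m": 125.0,
    "wavelength_m": 847.0,
    "width_m": 84.3,
    "thickness_m": 3.0,
}

def sample_channel_geometry(means, min_factor, max_factor, rng):
    """Draw one set of channel parameters from triangular distributions."""
    sample = {}
    for name, mean in means.items():
        low, high = min_factor * mean, max_factor * mean
        # Mean of a triangular distribution is (low + mode + high) / 3.
        mode = 3.0 * mean - low - high
        if not low <= mode <= high:
            raise ValueError("factors are incompatible with the requested mean")
        sample[name] = rng.triangular(low, high, mode)
    return sample

rng = random.Random(42)
geometry = sample_channel_geometry(TABLE2_MEANS, min_factor=0.8, max_factor=1.2, rng=rng)
for name, value in geometry.items():
    print(f"{name}: {value:.2f}")
```

For symmetric factors such as 0.8 and 1.2 the mode coincides with the mean; asymmetric factors shift the mode so that the mean is preserved.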

Figure 11. Initial Egg model with 10 wells.

Figure 12. Comparison of the first layers of facies models for the Egg model.

Figure 13. Reconstructed facies with random coefficients for the Egg model.

Figure 11 shows one realization of the permeability field with the well locations in the reservoir. The red
color indicates the sand deposition in channels with high permeability and the light blue color indicates the
matrix with low permeability. The gray wells are injectors and the golden wells are producers.
In this case, we treat the shale fraction and the statistical parameters that define the CDF curves of rock
properties as uncertain parameters to be solved in addition to the PCA coefficients. Figure 12 compares the
first layer of one realization of the original training model (the left subfigure) with the model
reconstructed with PCA (the middle subfigure) and the model reconstructed with CDF-PCA (the right
subfigure). In this example, 100 principal coefficients were used to reconstruct the model for both PCA
and CDF-PCA. After reducing the number of uncertain parameters from 108,000 to 100, a reduction
factor of 1,080, the facies model reconstructed with CDF-PCA is almost identical to the training
model. Figure 13 shows the PCA and CDF-PCA images reconstructed with a group of random coefficients
sampled from a Gaussian distribution with mean zero and variance 1. The CDF-PCA image shown in the
right subfigure in Figure 13 is randomly sampled, yet it still captures the main features (meandering
channels) of the training image shown in Figure 12, although the locations of channels in the two figures
are not exactly the same.

Figure 14. Comparison of probability density function of production data for Egg model. Black: Training, ensemble 1; Green: CDF-PCA, ensemble
1; Red: CDF-PCA, random; Blue: QA/QC, ensemble 2; Light blue: CDF-PCA, ensemble 2, 200 components; Purple: CDF-PCA, ensemble 2, 100
components.

Table 3. Comparison of P10-P50-P90 values of cumulative oil and cumulative water

                            Cumulative oil (STB)           Cumulative water (STB)
Scenario                    P10       P50       P90        P10       P50       P90
Ensemble-1: training        7.381e5   8.281e5   9.586e5    1.358e6   1.485e6   1.576e6
Ensemble-1: CDF-PCA-100     7.45e5    8.38e5    9.67e5     1.35e6    1.48e6    1.57e6
Ensemble-2: QA/QC           7.349e5   8.302e5   9.563e5    1.357e6   1.483e6   1.578e6
Ensemble-2: CDF-PCA-100     7.403e5   8.391e5   9.698e5    1.343e6   1.474e6   1.573e6
Ensemble-2: CDF-PCA-200     7.325e5   8.365e5   9.623e5    1.351e6   1.477e6   1.581e6
Random: CDF-PCA-100         7.130e5   8.184e5   9.473e5    1.366e6   1.496e6   1.600e6

For the purpose of quality analysis and quality control (QA/QC), we have generated two ensembles of
1000 realizations with object based modeling approach. The ensemble 1 is used as training realizations
and the ensemble 2 is used as QA/QC realizations. We compute PCA coefficients for both ensembles, and
then reconstruct their corresponding realizations using CDF-PCA.
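The projection of both ensembles onto a PCA basis built from the training ensemble can be sketched as follows. This is a plain PCA sketch under assumed conventions (basis from the SVD of the centered training matrix scaled by the square root of the number of realizations minus one); the CDF mapping step is omitted, and the function names and the synthetic low-rank data are hypothetical.

```python
import numpy as np

def build_pca_basis(realizations, n_components):
    """realizations: (n_cells, n_real). Returns mean, singular vectors, singular values."""
    y_mean = realizations.mean(axis=1)
    centered = (realizations - y_mean[:, None]) / np.sqrt(realizations.shape[1] - 1)
    u, s, _ = np.linalg.svd(centered, full_matrices=False)
    return y_mean, u[:, :n_components], s[:n_components]

def project(y, y_mean, u, s):
    """PCA coefficients of the fields stored as columns of y."""
    return (u.T @ (y - y_mean[:, None])) / s[:, None]

def reconstruct(xi, y_mean, u, s):
    """Fields reconstructed from the coefficient columns of xi."""
    return y_mean[:, None] + u @ (s[:, None] * xi)

rng = np.random.default_rng(1)
patterns = rng.standard_normal((40, 5))                 # hypothetical 5 spatial patterns
ensemble_1 = patterns @ rng.standard_normal((5, 30))    # training realizations
ensemble_2 = patterns @ rng.standard_normal((5, 10))    # QA/QC realizations

y_mean, u, s = build_pca_basis(ensemble_1, n_components=5)
xi_1 = project(ensemble_1, y_mean, u, s)
xi_2 = project(ensemble_2, y_mean, u, s)
rec_2 = reconstruct(xi_2, y_mean, u, s)

# Random realizations: coefficients drawn from a standard Gaussian.
rec_random = reconstruct(rng.standard_normal((5, 1000)), y_mean, u, s)
print(np.abs(rec_2 - ensemble_2).max(), rec_random.shape)
```

With this scaling, the training coefficients have zero sample mean and unit sample variance, which is what makes sampling new coefficients from a standard Gaussian meaningful.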
Figure 14 compares the probability density functions of different production forecast data
for the Egg model, including cumulative oil (a) and cumulative water (b), all evaluated at the end of
the 8-year production period. There are six curves in each subfigure. The black and blue curves
represent, respectively, the production forecasting results obtained from ensemble 1 (the training
realizations) and ensemble 2 (the QA/QC realizations). The green and purple curves are, respectively,
the production forecasting results obtained from the ensemble 1 and ensemble 2 realizations reconstructed
with CDF-PCA using 100 principal coefficients. The red curve corresponds to the PDF of production
forecasts obtained from 1000 realizations generated with CDF-PCA by randomly sampling the 100 principal
coefficients. For the purpose of QA/QC, we also use 200 coefficients to reconstruct ensemble 2
(the QA/QC realizations) and show its probability density function in light blue. As shown in Figure 14,
the black curves are quite close to the blue curves, which clearly indicates that using 1000
training realizations is reasonably satisfactory. The light blue curve is fairly close to the purple curve,
which indicates that using 100 principal coefficients is sufficient for this case. Table 3 lists the
P10-P50-P90 values of cumulative oil and cumulative water for these six scenarios. All
curves shown in the same subfigure in Figure 14 are very close to each other, and the
P10-P50-P90 values for different scenarios listed in the same column in Table 3 are also very close to each
other, which further validates that uncertainty quantification based on realizations of geological
and reservoir models reconstructed or regenerated by CDF-PCA using 100 principal coefficients is
almost as good as that based on the original realizations. In summary, CDF-PCA honors the uncertainty
characteristics not only of the original static models but also of their production forecasts.
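The closeness of the scenarios in Table 3 can be quantified with a short sketch. It assumes that P10, P50 and P90 denote the 10th, 50th and 90th percentiles of the ensemble (P10 being the low case, consistent with the ordering of the tabulated values) and measures each scenario by its largest relative deviation from the training ensemble; the function name and this deviation metric are choices made for this example.

```python
import numpy as np

def p10_p50_p90(values):
    """10th, 50th and 90th percentiles of an ensemble of forecasts."""
    return np.percentile(np.asarray(values, dtype=float), [10, 50, 90])

# Table 3 (STB): oil P10, P50, P90 followed by water P10, P50, P90.
table3 = {
    "Ensemble-1: training":    [7.381e5, 8.281e5, 9.586e5, 1.358e6, 1.485e6, 1.576e6],
    "Ensemble-1: CDF-PCA-100": [7.45e5, 8.38e5, 9.67e5, 1.35e6, 1.48e6, 1.57e6],
    "Ensemble-2: QA/QC":       [7.349e5, 8.302e5, 9.563e5, 1.357e6, 1.483e6, 1.578e6],
    "Ensemble-2: CDF-PCA-100": [7.403e5, 8.391e5, 9.698e5, 1.343e6, 1.474e6, 1.573e6],
    "Ensemble-2: CDF-PCA-200": [7.325e5, 8.365e5, 9.623e5, 1.351e6, 1.477e6, 1.581e6],
    "Random: CDF-PCA-100":     [7.130e5, 8.184e5, 9.473e5, 1.366e6, 1.496e6, 1.600e6],
}

reference = np.array(table3["Ensemble-1: training"])
max_rel_dev = {
    name: float(np.max(np.abs(np.array(row) - reference) / reference))
    for name, row in table3.items()
}
for name, dev in max_rel_dev.items():
    print(f"{name:26s} {100.0 * dev:5.2f}%")

# Usage of the percentile helper on a synthetic ensemble of 101 forecasts.
print(p10_p50_p90(range(1, 102)))
```

For the tabulated values, every scenario stays within a few percent of the training ensemble, with the randomly sampled ensemble showing the largest deviation.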
Assisted history matching by integration of CDF-PCA with streamline information for the Egg model
Figure 15. Normalized objective function versus iterations.
In this history matching example, an additional realization, which is not within the training data set,
was generated as the true model. This true model was used to generate true measurement data of
water injection rate in all water injectors and water production rates in all producers by running reservoir
simulation under the liquid rate constraints of 200 BBL/day for all producers and maximum bottom-hole
pressure constraints of 400 bars for all water injectors. The production period is 8 years. The observed data
is generated by adding synthetic measurement noise to the true measurement data. The standard
deviation of measurement error is 5 BBL/day for the rate measurement. The parameters to be tuned
include PCA coefficients, the switching point of facies and the statistical parameters that define the CDF
curves of permeability and porosity in each facies. The parallelized optimization tool SPMI is applied to
minimize the objective function defined by Eq. 4.
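The generation of synthetic observed data can be sketched as below. The noise standard deviation of 5 BBL/day is taken from the text; the rate history, the number of measurements, and the normalization of the data mismatch by the number of data are assumptions, since the normalization used for the reported objective function values is not stated here.

```python
import numpy as np

rng = np.random.default_rng(2014)
sigma = 5.0                                     # BBL/day, standard deviation of rate error
n_data = 2000                                   # hypothetical number of rate measurements
d_true = np.linspace(0.0, 150.0, n_data)        # hypothetical "true" water-rate history
d_obs = d_true + sigma * rng.standard_normal(n_data)

def normalized_mismatch(d_pred, d_obs, sigma):
    """Data-mismatch term divided by the number of data (assumed normalization)."""
    residual = (d_pred - d_obs) / sigma
    return 0.5 * float(residual @ residual) / d_obs.size

at_truth = normalized_mismatch(d_true, d_obs, sigma)
biased = normalized_mismatch(d_true + 20.0, d_obs, sigma)   # model off by 20 BBL/day
print(at_truth, biased)
```

Under this normalization, a model that reproduces the true data within the noise level gives a value near 0.5, which offers a rough yardstick for judging how far a matched model still is from the noise floor.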
Figure 15 shows the normalized objective function versus iterations. As shown by the red curve in
Figure 15, the normalized objective function converges to 14.5 after 25 iterations if only CDF-PCA is
applied. The blue curve in Figure 15 indicates that the normalized objective function converges to a
smaller value of 9.38 after 23 iterations when CDF-PCA is integrated with streamline correction. Even at
the first iteration, where the initial prior model was used, streamline correction reduces the normalized
objective function from 74 to 64, because the initial models obtained by CDF-mapping with and
without streamline correction are different. Figure 7 shows how streamline information can be used to
improve channel connectivity in the facies model. Visual inspection suggests that the facies model obtained
by combining streamline information and CDF-mapping is better than the model obtained with
CDF-mapping only. Although the channels in black circles in both subfigures are disconnected, the one
with streamline information correction shows better connectivity after CDF-mapping.
In this example, we use 500 PCA components to reconstruct the geological realization. Different
permeability realizations in the first and the tenth layers of the Egg model are illustrated in Figures 16 and 17.
The top left subfigure is the true model; the top middle one is the prior realization generated by the
object-based modeling method; the top right one is the model reconstructed with CDF-PCA corresponding to the
prior realization; the bottom left is the best model obtained after history matching using the CDF-PCA
approach; and the bottom right one is the best matched model using the hybrid approach of CDF-PCA and
streamline information. Because streamline correction enhances channel connectivity, the history-matched
model with streamline correction (the bottom right) looks more similar to the prior model than the one (the
bottom left) obtained with CDF-PCA alone.

Figure 16. Comparison of the 1st layer of models. Upper left: True; Upper middle: Initial generated by an object-based geological modeling tool;
Upper right: Initial by CDF-PCA; Lower left: Best model by CDF-PCA; Lower right: Best model by CDF-PCA plus streamline.

Figure 17. Comparison of the 10th layer of models. Upper left: True; Upper middle: Initial generated by an object-based geological
modeling tool; Upper right: Initial by CDF-PCA; Lower left: Best model by CDF-PCA; Lower right: Best model by CDF-PCA plus streamline.

Figure 18. Comparison of the cross-sections of models. Upper left: True; Upper middle: Initial generated by an object-based geological modeling
tool; Upper right: Initial by CDF-PCA; Lower left: Best model by CDF-PCA; Lower right: Best model by CDF-PCA plus streamline.

From Figures 16 and 17, the following two observations are made by comparing the prior
model (top middle) with the best matched model (bottom right) obtained by the hybrid approach of
CDF-PCA with streamline information:
1. The injector myinj3 and the producer myprod1 are most probably connected through a channel
in the upper layers of the reservoir;
2. The injector myinj1 and the producer myprod1 are most probably connected through a channel
in the lower layers of the reservoir.
This information could help geologists improve the geologic model, especially if justified by the
geologic concept and potentially by other soft information. Figure 18 compares cross-sections of the five
models. These models show that the channel connectivity in the vertical direction between layers is also
geologically reasonable.
Plots shown in Figures 19 and 20 include the observed data (green) of water production rate in each
producer (Figure 19) and water injection rate in each injector (Figure 20), and those predicted with the
initial model (red), the history-matched model with CDF-PCA alone (black) and the history-matched model
with CDF-PCA plus streamline correction (blue). Overall, the matching quality is improved significantly,
although we still see some gaps in some producers and injectors. Except for the producer myprod3 (see
the bottom left subfigure in Figure 19), the history-matched model obtained from the hybrid approach of
CDF-PCA with streamline correction yields predictions (blue curves) that match the observed data (green
curves) much better than those obtained by applying CDF-PCA alone. Because the streamline correction
enhances the connectivity between producers and injectors, it is most probable that the thin channel in the
black circle shown on the bottom right subfigure in Figure 17 connects the producer myprod3 and the
injector myinj4, and therefore results in earlier water breakthrough in myprod3. In our current
implementation, the weighting factor for streamline correction is fixed. We believe that introducing the
weighting factor as a new uncertain parameter to be tuned during the process of HM would further improve
the performance.

Figure 19. Production water rate matches for Egg model. Red: Prior model; Blue: Best match by CDF-PCA plus streamline; Black: Best match by
CDF-PCA only; Green: measurement data.

Figure 20. Injection water rate matches for Egg model. Red: Prior model; Blue: Best match by CDF-PCA plus streamline; Black: Best match by
CDF-PCA only; Green: measurement data.

Discussion and Conclusions


We propose a new procedure that reproduces exactly the same CDF curves of facies and reservoir
properties in each facies of each original individual realization. By seamlessly integrating the CDF-based
mapping functions with PCA, the new method (also called CDF-PCA) has the following advantages:
1. It inherits the advantage of PCA in capturing correct spatial correlations, and preserves the geostatistics
of all physical properties by properly correcting the smoothing effect of PCA.
2. It can be used for reconstructing training realizations of non-Gaussian fields, e.g., those
generated with multi-point-statistics modeling or object-based modeling tools.
3. It provides a convenient way of conditioning a non-Gaussian model (e.g., a channelized model) to hard
data with an analytical solution.
4. It honors the uncertainty characteristics not only of the original static models but also of
their production forecasts.

The CDF-PCA method is applied to a field case to quantify the quality of the reconstructed three-facies
models. When a natural order among facies exists, we show that the quality of the reconstructed model is
quite satisfactory. To overcome the limitation that CDF-PCA requires a natural
ordering among facies, an alternative method, CDF-based pluri-PCA, has been investigated, where all
facies are reconstructed according to the rock-type-rules and the order of facies indicators does not impact
the results of the reconstructed models. Details of the CDF-based pluri-PCA and how to automatically
determine the rock-type-rules based on prior realizations are beyond the scope of this paper, and they will
be discussed in more detail in another paper to be published later.
We also apply the CDF-PCA method to a synthetic case. Our results show that the geological facies,
reservoir properties, and production forecasts of models reconstructed with CDF-PCA are consistent
with those of the original models. The integrated HM workflow of CDF-PCA with streamline information
generates reservoir models that honor production history with minimal compromise of geological realism.
Usually, it is feasible to sample the prior uncertain model (or its PDF) by generating enough
unconditional realizations and then to quantify the uncertainty of production forecasting by running
reservoir simulation with these unconditional realizations. To sample the posterior uncertain model (or its
PDF), one needs to perform history matching using each unconditional realization as the initial guess, e.g.,
when applying the randomized maximum likelihood (RML) method. The high dimension of unknowns
makes it quite expensive to perform HM for such a large number of unconditional realizations, especially
when the adjoint-gradient is not available. The CDF-PCA approach can effectively reduce the number of
uncertain parameters without compromising the uncertainty quantification characteristics of the original
or prior model, which makes it feasible to generate posterior realizations by history matching production
data. Furthermore, the Gaussian character of the reduced set of uncertain parameters lends itself more
naturally to both EnKF and RML methods, which are formulated by assuming a Gaussian prior distribution.
The CDF-PCA plus streamline correction HM workflow can be easily integrated with EnKF and RML
methods to further quantify the uncertainty of reservoir properties and production forecasting by conditioning
to production data.

Acknowledgement
The authors would like to thank Shell International Exploration and Production Inc. for permission to
publish this paper. We also want to thank the following colleagues for their suggestions: Faruk O. Alpak,
Douglas Leyden, Matthew Wolinsky, Jim Jennings, Gosia Kaleta and Jeroen Vink. We also want to thank
Mr. Hai Vo for providing MPS training realizations in the section of Conditioning reconstructed model
to hard data.

References
[1] S.I. Aanonsen, G. Nævdal, D.S. Oliver, A.C. Reynolds, and B. Vallès. The Ensemble Kalman
Filter in Reservoir Engineering - A Review. SPE J., 14:393-412, 2009.
[2] M. Armstrong, A. Galli, H. Beucher, G. Loch, D.B. Doligez, R. Eschard, and F. Geffroy.
Plurigaussian Simulations in Geosciences. Springer.
[3] G.E.P. Box and Mervin E. Muller. A Note on the Generation of Random Normal Deviates. The
Annals of Mathematical Statistics, 29:610-611, 1958.
[4] C. Chen, L. Jin, G. Gao, D. Weber, J.C. Vink, D.F. Hohl, F.O. Alpak, and C. Pirmez. Assisted
History Matching Using Three Derivative-free Optimization Algorithms. SPE-154112-MS,
Proceedings of SPE Europec/EAGE Annual Conference, 2012.
[5] C.V. Deutsch and A.G. Journel. GSLIB: Geostatistical Software Library and User's Guide
(Applied Geostatistics Series). Oxford University Press.
[6] G.R. Gavalas, P.C. Shah, and J.H. Seinfeld. Reservoir History Matching by Bayesian Estimation.
SPE Journal, 16:337-350, 1976.
[7] G. Gao, G. Li, and A.C. Reynolds. A Stochastic Optimization Algorithm for Automatic History
Matching. SPE Journal, 12:196-208, 2007.
[8] G. Gao, J.C. Vink, F.O. Alpak, and W. Mo. An Efficient Optimization Workflow for Field-scale
In-situ Upgrading Developments. URTec-1885283, Proceedings of the Unconventional Resources
Technology Conference held in Denver, Colorado, USA, 25-27 August 2014, 2014.
[9] L. Ingber. Very Fast Simulated Re-annealing. Mathematical and Computer Modelling, 12(8):
967-973, 1989.
[10] L. Ingber. Simulated Annealing: Practice Versus Theory. Mathematical and Computer
Modelling, 18(11):29-57, 1993.
[11] M. Kelkar and G. Perez. Applied Geostatistics for Reservoir Characterization. Society of
Petroleum Engineers.
[12] M.M. Khaninezhad, B. Jafarpour, and L. Li. Sparse Geologic Dictionaries for Subsurface Flow
Model Calibration: Part I. Inversion Formulation. Advances in Water Resources, 39(0):106-121,
2012.
[13] M. M. Khaninezhad, B. Jafarpour, and L. Li. Sparse Geologic Dictionaries For Subsurface Flow
Model Calibration: Part ii. robustness to uncertainty. Advances in Water Resources, 39(0):122
136, 2012.
[14] J. Leguijt. Using Two-point Geostatistics Reservoir Model Parameters Reduction. Proceedings
of the 13th European Conference on the Mathematics of Oil Recovery, 2012, Biarritz, France,
2012.
[15] L. Mohamed, M. Christie, V. Demyanov, E. Robert, and D. Kachuma. Application of Particle
Swarms For History Matching in the Brugge Reservoir. Proceedings of the SPE Annual
Technical Conference and Exhibition, 2010.
[16] D.S. Oliver, A.C. Reynolds, and N. Liu. Inverse Theory for Petroleum Reservoir Characterization and History Matching. Cambridge University Press, 2008.
[17] P. Sarma, L.J. Durlofsky, and K. Aziz. Kernel Principal Component Analysis for Efficient,
Differentiable Parameterization of Multipoint Geostatistics. Mathematical Geosciences, 40(1):
332, 2008.
[18] J.R. Scheevel and K. Payrazyan. Principal Component Analysis Applied to 3d Seismic Data for
Reservoir Property Estimation. SPE Reservoir Evaluation Engineering, 4:64 72, 2001.
[19] S.B. Strebelle and A.G. Journel. Reservoir Modeling Using Multiple-Point Statistics. Proceedings of SPE Annual Technical Conference and Exhibition, 2001, New Orleans, Louisiana, 2001.
[20] M.R. Thiele. Streamline Simulation. Proceedings of the 6th International Forum on Reservoir
Simulation, Sep 3rd-7th, 2001, Schloss Fuschl, Austria, 2001.
[21] S. Yadav. History Matching Using Face-recognition Technique Based on Principal Component
Analysis. Proceedings of the SPE Annual Technical Conference and Exhibition, 2006.
[22] Y. Zhao, A. C. Reynolds, and G. Li. Generating Facies Maps by Assimilating Production Data
and Seismic Data with the Ensemble Kalman Filter. Proceedings of SPE Improved Oil Recovery
Symposium, 2008, Tulsa, USA, 2008.
[23] H. Zhou, J. J. Gmez-Hernndez, H. H. Franssen, and L. Li. An Approach to Handling
Non-gaussianity of Parameters and State Variables in Ensemble Kalman Filtering. Advances in
Water Resources, 34(7):844 864, 2011.

SPE-170636-MS

23

[24] H. Zhou, L. Li, H.H. Franssen, and J.J. Gmez-Hernndez. Pattern Recognition in a Bimodal
Aquifer Using the Normal-score Ensemble Kalman Filter. Mathematical Geosciences, 44(2):
169 185, 2012.
[25] H. Zhou, L. Li, and J. J. Gmez-Hernndez. Characterizing Curvilinear Features Using the
Localized Normal-score Ensemble Kalman Filter. Abstract and Applied Analysis:118, 2012.
[26] V. Vapnik. The Natural of Statistical Learning Theory. Springer-Verlag, New York, 1995.
[27] Buhmann, Martin D. (2003), Radial Basis Functions. Theory and Implementations, Cambridge
University Press.
[28] X.J. Yao, A. Panaye, J.P. Doucet, R.S. Zhang, H.F. Chen, M.C. Liu, Z.D. Hu, B.T. Fan.
Comparative Study of QSAR/QSPR Correlations Using Support Vector Machines, Radial Basis
Function Neural Networks, and Multiple Linear Regression. Journal of Chemical Information
and Computer Sciences, 44(4): 125766, 2004.
[29] F.O. Alpak, J.C. Vink, G. Gao, and W. Mo. Techniques for Effective Simulation, Optimization,
and Uncertainty Quantification of the In-Situ Upgrading Process. Journal of Unconventional Oil
and Gas Resources, 3-4: 114, 2013.
