
Intelligent Data Analysis 8 (2004) 183–196

IOS Press


Prediction of oil well production: A multiple-neural-network approach
H.H. Nguyen (a), C.W. Chan (a) and M. Wilson (b)
(a) Department of Computer Science/Energy Informatics Laboratory, University of Regina, Regina, Sask S4S 0A2, Canada
(b) Office of Energy and Environment, University of Regina, Regina, Sask S4S 0A2, Canada
Received 23 January 2003
Revised 2 March 2003
Accepted 2 May 2003
Abstract. This study presents an application using both single and multiple interval prediction models implemented with
artificial neural networks to estimate the future production performance of oil wells. The single interval prediction model was
developed using NOL (Gensym Corp., USA). The multiple neural network (MNN) model is a novel approach that combines a
group of neural networks, with each component neural network being responsible for predicting a different time period. The
approach is designed to improve the accuracy of long-term predictions. In addition to conducting both short and long term
prediction of oil production, the study also investigates different approaches for modeling the application domain parameters.
The MNN model for prediction of future well performance is applied to the time series data obtained from four pools of wells
in the southeastern region of Saskatchewan, Canada. The results showed that an MNN model performed better than a single
neural network model for long-term predictions.
Keywords: Neural networks, petroleum production prediction

1. Introduction
Estimations of future production and potential recoverable oil reserve of petroleum wells are important
for cost-effective operations in the petroleum industry. Such estimations, including optimizing introduction of secondary and tertiary recovery schemes, can assist petroleum engineers in project design,
facilities construction scheduling and economic forecasting. It is the objective of this study to investigate
the feasibility of using feed-forward neural networks and time-series modeling techniques to forecast
petroleum production.
Reservoir engineers typically predict primary performance through curve fitting to existing production
data. Experience from past production, particularly from wells within the same or similar pools (i.e.
pools with similar oil and geological characteristics) can lead to reasonable predictions of primary
performance. The decision to make the transition to secondary and tertiary production requires a more
time-consuming and complex use of reservoir simulators that utilize reservoir characteristics based on
core and log analysis, as well as historical production. This also requires significant computational

Corresponding author: E-mail: chan@cs.uregina.ca.

1088-467X/04/$17.00 © 2004 IOS Press and the authors. All rights reserved


H.H. Nguyen et al. / Prediction of oil well production: A multiple-neural-network approach

capacity. At the same time, there is a huge amount of data readily available from within companies and
from public sources that is barely used. The data can be used to enhance understanding of the production
process, to optimize timing for initiation of advanced recovery processes and, potentially, to identify
candidate wells for production or injection. Thus the set of time-series data can be used for building
a model to predict production opportunities. In this study, the historical data are analyzed to identify
data patterns and, assuming the patterns will continue into the future, they are extrapolated in order to
generate oil production forecasts.
In contrast to simulation models formulated with explicit reservoir characteristics such as porosity,
permeability and the presence of fracture systems, systems that adopt the Artificial Neural Network
(ANN) approach can learn from past data and implicitly recognize these reservoir characteristics.
Also, the use of neural network systems allows for ready comparisons between wells and between pools.
Such benchmarking can help to identify potential problems and may even point to solutions based on
actions taken on similar wells.
This study is an attempt to test the use of single and multiple neural networks to predict the primary
performance of wells using data collected by Saskatchewan Industry and Resources (Saskatchewan
Energy and Mines). A portion of the data available was retained for verification of the prediction results.
ANNs are characterized as computational models with particular abilities to adapt, learn, generalize,
recognize, cluster, and organize data. The advantages of ANNs include computational efficiency,
non-linear characteristics, generalization properties, fault tolerance, freedom from a priori selection of
mathematical models, and ease of working with high-dimensional data. This study presents two models
that estimate petroleum production based on historical data: a single artificial neural network model to
make single-step-ahead forecasts, and a multiple artificial neural network model to make multiple-step-ahead forecasts.
One common problem with time series forecasting models is the low accuracy of long-term forecasts.
The estimated value of a variable may be reasonably reliable for short terms into the future, but for
longer terms, the estimate is liable to become less accurate. There are several reasons for
this inaccuracy. One reason is that the environment in which the model was developed has changed
over time, and therefore the assumptions held valid during the development process are no longer true
after some time. Another reason is that the model itself was not well developed: the inaccuracy arises
from immature training or the lack of appropriate data for training. The third cause of inaccuracy is
the propagation of errors during recursive model predictions. Usually a model is built to predict one step ahead
and is used recursively 10 times when a 10-step-ahead prediction is required. Every model is likely to be
associated with an error. For short-term prediction, the error can be less than an acceptable threshold,
but for long-term prediction, this error accumulates and can increase beyond the threshold.
The multiple-neural-network (MNN) approach presented in this study attempts to deal with the third
problem by reducing the number of recursions necessary. In this approach, several neural networks built
to predict from short- to long-term are combined into one model. The ultimate goal of this combination
is to increase accuracy of long-term forecasts.
The paper is organized as follows. Section 2 presents some background literature on neural network
applications in the process industry and introduction of temporal neural networks. Section 3 provides a
brief introduction of the application problem domain of petroleum production prediction. Sections 4 and
5 present the two approaches to predict petroleum production using single and multiple neural network
models. Section 6 examines the results of the two models. Section 7 presents some discussions and
observations. Section 8 discusses some conclusions and possible directions for further research work.


2. Background literature
2.1. Applications of neural networks in the process industry
Artificial neural networks have broad applicability to real world business and industrial problems and
have already been successfully applied in many industries. Since neural networks are best at identifying
patterns or trends in data, they are well suited for prediction or forecasting problems including sales
forecasting, target marketing, risk management and industrial process control.
Some examples of applications of ANN for environment and energy process modeling and control
are listed as follows. Melas et al. [15] presented a back-propagation neural network (BPNN) model
developed for 24-hour prediction of photochemical pollutant levels. Reich et al. [16] proposed a BPNN
with momentum to identify the apportionment of a small number of sources from a data set of ambient
concentrations of a given pollutant. Kavchak and Budman [12] presented an adaptive radial basis
function neural network based on dilation adaptation. The neural network was incorporated into an
adaptive controller and tested on a continuous stirred tank reactor. De Veaux et al. [5] proposed a model
that includes both first principle differential equations and an artificial neural network to forecast and
control biological treatment for water contamination. Haiwen et al. [10] discussed simulation studies on
the application of Bayesian-Gaussian neural networks for predicting the dynamic behavior of a non-linear single-input single-output system, and for predicting the static performance and dynamic behavior of circulating
fluidized bed boilers. Bhrat and McAvoy [2] and Zhang and Stanley [19] introduced model-based
process control systems. Chih-Chou et al. [4] attempted to improve the performance of neural networks
by integrating them with a rule-based expert system. An integrated application was proposed
for prediction of power load at the Taiwan Power System. Other applications in the petroleum domain
are described below in greater detail.
Huang and William [11] developed a model for predicting porosity and permeability from well logs
using ANN techniques. Such a model is desirable because: (1) the number of core measurements is
limited and hence unable to provide a complete picture of the reservoir interval, and (2) porosity and
permeability estimated from well logs based only on theoretical physical models or empirical equations
from other basins are erroneous. To improve efficiency, the authors used a fast BPNN modified with the
Marquardt algorithm. Although the core measurements were not used for constructing training examples,
the predicted curves and the actual measurements agree except for a few data points.
Wong and Taggart [18] described a model similar to that in Huang and William [11] but which includes
information on lithofacies as input. The ANN approach was chosen because it can simultaneously handle both
discrete data such as lithofacies and continuous data such as porosity. Two BPNN models were used,
one of which predicted porosity and the other predicted permeability. Experimental results showed that
two separate ANN models gave better results than only one model with two output nodes. The results
showed that the standard neural network method gave lower root mean square error (RMSE) compared
to the simulated method but the simulated method produces better statistics of the actual data including
mean, standard deviation, coefficient of variation, maximum and minimum value.
Gharbi [9] presented a universal neural-network-based model as an alternative method for predicting
Pressure-Volume-Temperature (PVT) properties. Ideally, the PVT properties of crude oil are measured
on collected bottom-hole samples or recombined surface samples. However, sometimes these samples
cannot be obtained or are very expensive. The ANN model showed very high accuracy as compared
to any other correlation method; it also produced the lowest errors, the lowest standard deviation, and
highest correlation coefficient for both outputs.


Amizadeh [1] demonstrated the use of the ANN technique in estimating oil field reservoir parameters
from remote seismic data. The author demonstrated the use of k-fold cross validation technique to obtain
confidence bounds on an NN accuracy statistic from a finite sample set. An ANN model's classification
accuracy was dramatically improved by transforming data into a feature space that maximizes the linear
separation between classes.
The following conclusions can be drawn from the above literature:
- The prediction or classification accuracy was affected when some important predictors were either
  not included in the analysis due to unavailability or were under-represented.
- The accuracy was also reduced when the measurements of some data were local and not representative
  of a larger area.
- Data pre-processing is one of the most important steps in applying the ANN approach to geological
  problems.
2.2. Temporal artificial neural networks
This study attempts to use multiple neural networks for time series prediction. The term "temporal
neural networks" refers to a group of neural networks that deal specifically with time series. There
are several kinds of temporal neural networks. We review here two recently developed neural network
models.
The Time-Delay Neural Network (TDNN) was developed by Lang, Waibel and Hinton [13]. The model is
a modification of the multi-layer perceptron whose input consists of slices which are delayed through a
delay line. The slices are shifted each time step. The network input at a given time step is stored long
enough to influence subsequent inputs. Time-Delay Neural Networks have the ability to detect conserved
combinations of sequence elements in a set of larger sequence fragments without initial knowledge of
the location and sequence characteristics of the elements. Formally, time delays are identical to the sliding
windows adopted by the single neural network approach in this study.
The Elman networks [7,8,17] constitute a kind of Simple Recurrent Networks (SRN). In this architecture, the state of the hidden layer in the previous step is copied in a supplementary context input vector
in the next step. This architecture is able to memorize past sequences. Activation of a unit is calculated
as a function of past activations and past input values. The effect of time is implicit in the network's
internal representations. Elman (1990, 1991) trained an SRN to learn the lexical structure of a set of
sentences generated from a limited vocabulary and grammar, and then used it to predict the next word
in the sentence. Unlike a TDNN, an SRN can in theory model unlimited past values in its representation.
However, it has been argued that in practice this type of recurrent network cannot deal with an arbitrarily long
history [3].
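The recurrence described above can be written compactly. The block below gives the conventional textbook form of an Elman-style simple recurrent network rather than a formula taken from the cited papers. The symbols are notation assumed here: $x_t$ is the input, $h_t$ the hidden state, $y_t$ the output, $W_{xh}$, $W_{hh}$, $W_{hy}$ the weight matrices, $b_h$, $b_y$ the biases, and $f$, $g$ the activation functions.

```latex
\begin{aligned}
h_t &= f\left(W_{xh}\, x_t + W_{hh}\, h_{t-1} + b_h\right), \\
y_t &= g\left(W_{hy}\, h_t + b_y\right)
\end{aligned}
```

The term $W_{hh}\, h_{t-1}$ plays the role of the context input: it feeds the previous hidden state back into the current step, which is how past inputs influence the present output without an explicit delay line.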
3. Application problem domain
The estimation of monthly production from oil wells is important because it can potentially indicate problems with the productivity of a well and timing for implementation of secondary and tertiary
production methods. The artificial neural network technique was adopted to obtain functional relationships between production time series, core analysis and drill stem test (DST) results, which can
assist petroleum engineers in designing production equipment and surface facilities, planning future
production, and making economic forecasts.


Fig. 1. Well production history.

The petroleum industry acquires, stores and processes large volumes of geoscience data such as core,
DST and seismic data. Analysis and interpretation of these data are usually done by core analysts and
reservoir engineers with the support of numerical simulation programs. However, the process can be
technically difficult, time consuming and expensive in terms of both labor and computational resources,
and an alternative automation approach is desirable. The ANN approach was therefore adopted in this
application.
Time series modeling is also applied in conjunction with the ANN approach. This technique utilizes
the vast amount of production data that has been collected but hardly used in order to build a model of
production over time. In time series analysis, historical data are analyzed in an attempt to identify a
data pattern, and then assuming that the pattern will continue in the future, it is extrapolated to produce
forecasts.
The following are the key concepts from petroleum engineering relevant to this application domain:
- Petroleum: Petroleum includes oil and gas. Oil can be categorized into four crude types based on
  density: light, medium, heavy and bitumen. Since the oil type strongly influences production as
  well as fluid parameters, petroleum experts suggested that one model should be built for each crude
  type. In this study, only medium oil was considered.
- Reservoir: A petroleum reservoir is a volume of porous sedimentary rock that has been filled by
  petroleum and possibly other fluids. Oil, along with varying amounts of water and gas, resides in the
  porous spaces of the rock.
- Well: Many wells can be drilled to recover fluids, including oil and gas, inside the boundary of each
  reservoir. Wells can be categorized according to their usage. In this paper, the term "well" means
  producer well.
- Horizon: Each horizon is a formation layer with unique geological characteristics. The data used in
  this study were taken from one horizon to ensure all wells have the same geological characteristics.
- Production history: The production rate fluctuates with time as shown in Fig. 1. Fluctuation can be
  due to the following reasons. Activities such as well stimulation can create fractures in the near
  well-bore area, which increases production. Production decline is often caused by pressure decline in a
  reservoir or deterioration of the mechanical condition of the production wells. An effective way of
  slowing the decline may be supplemental recovery operations such as water flooding. Another method is
  to shut down the well for a period of time to regain pressure. While production drops to nil during the
  shut-in period, it usually goes up afterwards. Eventually, however, the decline will recur [6].


Fig. 2. Elimination of incomplete records.

Saskatchewan Energy and Mines supplied the data sets used in this study. The entire data set contains
14538 monthly production rates and 49 core analysis and pressure data points recorded from 49 oil
producer wells. These 49 wells are located in four independent reservoirs in the southeastern region of
Saskatchewan including Flat Lake, Hoffer, Neptune and Skinner Lake that produce the same medium
type of crude oil from the same horizon, the Ratcliffe.
The production data were collected over a period of about 30 years, from the 1960s to 1995. While there
were sufficient data to develop an accurate time series model, the total number of available data patterns
for core and DST analysis is only 49. Since insufficient core and DST data may mean that meaningful
correlation between production rate and analysis data cannot be found, it would be desirable to increase
the number of reservoirs to enhance validity of the model.
A number of preprocessing steps were taken as follows. The permeability and porosity data were
raw data obtained from core logs. Permeability was averaged from horizontal and vertical permeability.
For each well, permeability and porosity values were measured from core samples taken from different
depths. Values that fell below a cut-off value set by the petroleum engineering expert were
ignored. The remaining values taken from the same well were then averaged. Reservoir angle was
ignored in the calculation since it was relatively small in the area selected.
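The core-data preprocessing just described can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' code: the function names are hypothetical, the horizontal and vertical permeabilities are assumed to be combined with a plain arithmetic mean, and `cutoff` stands in for the expert-chosen threshold, whose value is not reported.

```python
from statistics import mean


def sample_permeability(k_horizontal, k_vertical):
    # Assumption: the two directional permeabilities are combined
    # with a plain arithmetic mean.
    return (k_horizontal + k_vertical) / 2.0


def well_average(values, cutoff):
    """Average one well's core measurements, ignoring values below the cut-off.

    `cutoff` is a hypothetical parameter; the study's cut-off was set by a
    petroleum engineering expert and its value is not reported.
    Returns None if every value falls below the cut-off.
    """
    kept = [v for v in values if v >= cutoff]
    return mean(kept) if kept else None
```

The same `well_average` helper would apply to both porosity and permeability, since the text describes one cut-off-then-average procedure for both.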
The monthly productions were scaled to 600 hours a month, which is the mean of the number of
production hours. All the months with zero production were eliminated. This process of elimination has
the drawback of producing discontinuity in the time series data since individual months with missing
production data were ignored. In this way, the number of long-term records is significantly reduced. A
record contains both the input and the expected output. For example, if production of the three months of
January, February and March is to be used to predict production two months ahead, the record should
contain production from January to May. If there was no production in the 7th month ($P_7 = 0$), $P_7$
was eliminated. Therefore the following records of five-month duration were also eliminated: $P_3$-$P_7$,
$P_4$-$P_8$, $P_5$-$P_9$, $P_6$-$P_{10}$, $P_7$-$P_{11}$. The eliminated records are shown as gray rows in Fig. 2. The longer the
duration of records, the more records are likely to be eliminated.
In the future, this drawback can be avoided if an attempt is made to repair the missing values based on
neighboring values instead of eliminating the entire record.
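The record construction and elimination rule can be made concrete with a short sketch. It is an illustration only: the function names are hypothetical, and the scaling formula (volume times 600 divided by the hours actually produced) is an assumption about how the 600-hour normalization was done. A record spans the input months plus the lead, and any record whose span contains a zero-production month is dropped.

```python
def scale_to_600_hours(volume, hours_on_production, reference_hours=600.0):
    # Assumption: production is rescaled proportionally to a 600-hour month.
    return volume * reference_hours / hours_on_production


def build_records(series, n_inputs=3, lead=2):
    """Build (first_month, inputs, target) records from a monthly series.

    series[0] is month 1. Each record spans n_inputs + lead consecutive
    months: the first n_inputs months are the inputs and the last month is
    the target. Records whose span contains a zero month are eliminated.
    """
    span = n_inputs + lead
    records = []
    for start in range(len(series) - span + 1):
        window = series[start:start + span]
        if any(v == 0 for v in window):
            continue  # the whole record is dropped, as in Fig. 2
        records.append((start + 1, window[:n_inputs], window[-1]))
    return records
```

With twelve months of data and a single zero in month 7, only the records starting in months 1, 2 and 8 survive, which matches the five eliminated records listed in the text.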


In our models, each data record contains either three or six inputs, and one output. All the values in a
record were derived from the same well. The records from all of the 49 wells were placed sequentially
one after another to form a data set.
The data set was divided into three subsets in the proportion of 60:20:20 for training, validation and
testing. The training set is used to train the neural network. The validation set is used to determine the
performance of a neural network on records that are not used during learning. Training and validation
occur simultaneously, and the two sets of data are used for exploration of parameter values of a network
configuration. When the error from validation runs begins to increase, training is halted because the increase
indicates that over-fitting has begun. The test set is used for a final check of the overall performance of
a neural network once parameter values have been determined in the model.
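A 60:20:20 partition of the record list might look like the sketch below. The text does not say whether the records were shuffled before splitting, so this version assumes a simple sequential split that preserves record order; the function name and the integer-percentage interface are hypothetical.

```python
def split_records(records, percentages=(60, 20, 20)):
    """Split records sequentially into training, validation and test sets.

    Assumption: records are split in their stored order (no shuffling).
    Integer arithmetic avoids floating-point rounding surprises; any
    remainder goes to the test set.
    """
    n = len(records)
    n_train = n * percentages[0] // 100
    n_val = n * percentages[1] // 100
    train = records[:n_train]
    validation = records[n_train:n_train + n_val]
    test = records[n_train + n_val:]
    return train, validation, test
```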

4. Prediction using NOL: Single interval prediction


The initial modeling was conducted using NeurOn-Line (NOL from Gensym Corporation, USA), a
tool-kit for neural network modeling. NOL supports fast development of a neural network application
to enable rapid assessment of whether a meaningful model can be built on the available data and whether
the set of chosen variables is suitable for the task.
4.1. NeurOn-Line Tool-kit
NOL is a complete graphical, object-oriented software tool kit for building neural network applications
and applying them to dynamic environments. NOL includes facilities for managing data sets, training
the network, testing the fit between model and data, and deploying the application in the operation
environment. Using the NOL tool kit to develop ANN applications is straightforward and typically
involves three steps: cloning blocks, connecting them and configuring their behavior. NOL is an
application layer built on top of the G2 Expert System shell. Thus it can be deployed and integrated
with G2. Hybrid neural, expert, and fuzzy logic systems are simple to configure in NOL.
However, NOL also has some limitations. NOL implements only four types of neural networks:
back-propagation networks, radial basis function networks, rho networks and auto-associative networks.
Since users are allowed to configure only a small number of parameters, their control over behaviors of
the network constructed is correspondingly limited. Also, while familiarity with G2 is not needed for
using NOL, it is a requirement for full utilization of the tool.
4.2. Development of a model of production time series and geoscience parameters
The first neural network model developed on NOL includes geoscience parameters as input. The
factors that have been identified to influence production include permeability, porosity, viscosity, density,
fluid compressibility, oil saturation, pressure and well location. However, since not all the associated
parameters for a well are available, only the following three parameters that are easily obtainable are
included in the model:
1. Permeability (k) describes the relative ease with which fluids can move through the reservoir and is
therefore a factor in determining well productivity.
2. Porosity (φ) is an expression of the volume of void space in the rocks and thus is related to the
volume of oil or gas that can be recovered from the reservoir.

3. First Shut-in Pressure (p) is used as a proxy variable for initial formation pressure.

Table 1
Network configuration model 1

# Hidden units    Training error    Validation error
2                 0.034             0.034
3                 0.034             0.033
4                 0.034             0.032
5                 0.034             0.034
6                 0.033             0.034

The permeability and porosity values are obtained from laboratory core analysis, and the first shut-in
pressure data are derived from drill stem test (DST) analysis. It should be noted that these parameters
reflect only the conditions in the immediate vicinity of the well itself and can change significantly
throughout the reservoir. In addition to the above parameters, production time series data were used
as a source of input. The production rates of the three months prior to the target prediction month are
included as input variables. If $P_t$ denotes the production of the target month $t$ for which a prediction is
made, then the productions of the three previous months are $P_{t-3}$, $P_{t-2}$ and $P_{t-1}$.
In the ANN model, the six conditional variables are permeability, porosity, pressure and the oil
production volumes of the three previous months, and the consequent variable is the production of the current
month. Since there are only six input variables and one output variable, it was assumed that the number
of hidden neurons should also be small. In practice, it is best to have as few hidden nodes as possible because
fewer weights need to be determined.
A scaled data set was used to select the best network configuration. The same training and validation
sets were applied to train and validate five different back-propagation networks with 2, 3, 4, 5 and 6
hidden units respectively. As can be seen in Table 1, the network with 4 hidden nodes produced the least
validation error, and was the best model found. The training error was 0.034 and the validation error was
0.032.
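The selection rule used here, choosing the configuration with the smallest validation error, is easy to state in code. The sketch below uses the figures reported in Table 1; the dictionary layout and function name are illustrative choices, not part of the original tool-kit workflow.

```python
# Errors from Table 1: hidden units -> (training error, validation error).
TABLE_1 = {
    2: (0.034, 0.034),
    3: (0.034, 0.033),
    4: (0.034, 0.032),
    5: (0.034, 0.034),
    6: (0.033, 0.034),
}


def select_hidden_units(results):
    """Return the hidden-unit count with the lowest validation error."""
    return min(results, key=lambda hidden: results[hidden][1])
```

Note that the training error alone would not discriminate between these networks, since it is almost identical across configurations; the validation error is what separates them.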
With the training and validation sets specified above, the back-propagation neural network was trained
three times with different initial weights. During training, the root mean square error (RMSE) on the
training set declined steadily but the amount of decrease became insignificant after the first few cycles.
The ANN was saved every 300 cycles. The validation error started to increase between cycles 300 and
600, which indicated over-fitting had occurred. The saved 600-cycle ANN was the final model. We
interpreted the fact that three training runs gave similar results to indicate that the global minimum had
been reached. The training error was 0.029 while the validation error was 0.04.
4.3. Development of a production time series model
A sensitivity test was conducted to measure the impact of each input variable on the output in the first
ANN model developed on NOL described in Section 4.2. The results showed that all the three geoscience
variables had limited (less than 5%) influence on the production prediction. This confirmed our concern
earlier about the limited amount of geoscience data and its applicability to only a small portion of the
reservoir. Hence a second model was developed in which the three input (conditional) variables are
the oil production volumes of the three previous months, and the output (consequent) variable is the
production of the current month.
A similar preprocessing, configuration, training and validation process was conducted as for the first
model. As can be seen in Table 2, the network with 3 hidden nodes produced the least validation error, and was
found to be the best model. The training error was 0.035 while the validation error was 0.03.


Table 2
Network configuration model 2

# Hidden units    Training error    Validation error
2                 0.033             0.036
3                 0.034             0.03
4                 0.035             0.031
5                 0.033             0.035
6                 0.034             0.034

5. Multiple neural networks for multiple interval prediction


Petroleum engineers often confront the question of how long it takes for a well to reach its economic
limit. To answer this question, forecasts of not only one but several months ahead need to be made. The
models presented in Section 4 could not make long-term predictions with reasonable accuracy. Therefore
we proposed the multiple neural network (MNN) approach for time series modeling to make long-term
predictions.
Time series models for long-term forecasts are usually not accurate for the following reasons. It is
often the case that a one-step-ahead model is developed and then used recursively to forecast multiple
steps ahead. Typically, the network is used to forecast the time series value of the immediate future and
then the initial output is used as part of the inputs for the second forecast and the output from the second
forecast is used as part of the inputs for the third round and so on. In other words, the network recursively
propagates the time series forward in time to make forecasts several steps ahead. The problem with
this approach is that the one-step-ahead model is likely to be associated with an error. For short-term
prediction, the error is likely to be less than an acceptable threshold, but for long-term prediction, this
error is accumulated and can increase beyond the threshold.
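The recursive use of a one-step-ahead model, and the way its error compounds, can be illustrated with a toy example. This is a sketch under stated assumptions: `one_step_model` is a hypothetical callable standing in for a trained network, and the window of three inputs mirrors the three-month input used in this study.

```python
def recursive_forecast(history, one_step_model, steps, n_inputs=3):
    """Forecast `steps` values ahead by feeding predictions back as inputs.

    one_step_model: hypothetical callable mapping the last n_inputs values
    to a prediction for the next time step.
    """
    window = list(history[-n_inputs:])
    forecasts = []
    for _ in range(steps):
        next_value = one_step_model(window)
        forecasts.append(next_value)
        # Slide the window: the prediction becomes an input for the next step.
        window = window[1:] + [next_value]
    return forecasts
```

For instance, if the true series declines by 2% per month but the model predicts a 1% decline, the one-step error is small, yet after ten recursive steps the gap between forecast and truth is several times larger, because each step builds on an already biased input.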
The multiple-neural-network (MNN) approach presented in this study attempts to deal with this
problem by reducing the number of recursions necessary. In the proposed approach, several neural
networks built to directly predict from short- to long-term are combined into one model.
An MNN model is a group of ANNs working together to perform a task. Each ANN is developed to
predict a different time period ahead. The prediction terms are powers of 2, that is, the first ANN predicts
1 unit ahead, the second predicts 2 units ahead, the third predicts 4 units ahead, and so forth. Hereafter an
ANN that predicts $2^n$ units ahead is referred to as an n-ordered ANN. There are two reasons to support
the choice of binary exponential as the variant in the model. First, big gaps between two consecutive
neural networks are not desirable. The smaller the gap is, the fewer steps the model needs to take in order
to make a forecast. Secondly, binary exponential does not introduce bias on the roles of networks. A
higher exponential model tends to use more lower-ordered neural networks in order to propagate ahead.
An MNN prediction model can be viewed as a single partially connected ANN as illustrated in Fig. 3.
However, unlike a complex single ANN requiring a long time to train, an MNN breaks down the training
into sub-ANNs and trains each component network separately. Figure 3 shows a sample MNN with two
sub-ANNs.
Calculation of an output with an MNN is a recursive procedure as follows.
1. Find the highest-ordered neural network necessary based on the prediction term
2. If all inputs of the neural network are observed values, use the inputs to calculate the output
3. Otherwise, for an input that is not an observed value, return to step 1 to calculate the predicted
value of the input.
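The three-step procedure above can be sketched as a short recursive function. This is an interpretation, not the authors' Java implementation: it assumes that each component network takes three consecutive values as input and predicts the value $2^n$ steps after the last of them, and that `models` is a hypothetical mapping from order n to a callable. A small cache avoids recomputing the same intermediate prediction.

```python
def networks_needed(lead_time):
    # Number of powers of two (2**0, 2**1, ...) not exceeding lead_time.
    return lead_time.bit_length()


def mnn_predict(history, models, horizon, n_inputs=3):
    """Predict the value `horizon` steps after the last observed value.

    history: observed values in time order (at least n_inputs of them).
    models:  hypothetical dict {order n: callable(inputs) -> prediction},
             where the n-ordered callable predicts 2**n steps past the
             last of its n_inputs consecutive inputs (an assumption).
    """
    last = len(history) - 1
    cache = {}

    def value_at(t):
        if t <= last:
            return history[t]          # observed value, use directly
        if t not in cache:
            steps_ahead = t - last
            # Step 1: highest-ordered network that does not overshoot.
            order = max(n for n in models if 2 ** n <= steps_ahead)
            base = t - 2 ** order
            # Steps 2 and 3: gather inputs, recursing for unobserved ones.
            inputs = [value_at(base - k) for k in range(n_inputs - 1, -1, -1)]
            cache[t] = models[order](inputs)
        return cache[t]

    return value_at(last + horizon)
```

Under these assumptions, `networks_needed(100)` evaluates to 7, and a 100-step forecast needs only a handful of chained predictions on each input path instead of 100 recursive one-step calls.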


Fig. 3. A sample MNN Model.

The training of high ordered ANNs depends on the previously trained low ordered ANNs. In the
development of an MNN, it is necessary to implement multi-step validation. One-step-ahead validation
does not take into account the model's sensitivity to errors that arise due to multi-step prediction [14].
In our model, multi-step validation was used, but the validation window for each ANN is different from
that of the others. The validation error of the n-ordered network is calculated as the average of the root mean
square errors (RMSE) of the $2^n$-step-ahead to the $(2^{n+1} - 1)$-step-ahead predictions. In order to calculate these
steps, a higher ordered ANN needs to use the prediction values of the lower ordered ANNs.
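The validation window of each component network can be written out explicitly. The sketch below is illustrative: `predict_at_step` and `targets_at_step` are hypothetical callables that return, for a given number of steps ahead, the model predictions and the matching observed values on the validation set.

```python
from math import sqrt


def validation_steps(order):
    """Steps ahead covered by the n-ordered network: 2**n .. 2**(n+1) - 1."""
    return list(range(2 ** order, 2 ** (order + 1)))


def rmse(predicted, target):
    return sqrt(sum((p - t) ** 2 for p, t in zip(predicted, target)) / len(target))


def validation_error(order, predict_at_step, targets_at_step):
    """Average the RMSE over every step in the network's validation window."""
    steps = validation_steps(order)
    total = sum(rmse(predict_at_step(h), targets_at_step(h)) for h in steps)
    return total / len(steps)
```

For example, the 2-ordered network (4 steps ahead) is validated on the 4-, 5-, 6- and 7-step-ahead predictions, so together the windows of consecutive orders cover every horizon exactly once.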
Since it is not convenient to modify the ANN structures within the NOL environment, we implemented
the MNN system in Java. A single ANN is equivalent to an MNN with one component ANN.
The MNN was trained with the following parameters:
- Maximum number of training cycles: 3000
- Validation error threshold: 5%
- Number of hidden units for each ANN: 5
- Number of input variables for each ANN: 3
- Lead time: 100 months
- Number of ANNs: 7
- Learning rate: 0.7
- Momentum: 0.3

The number of ANNs in the MNN was determined based on the length of the prediction term. Since
$2^{6} < 100 < 2^{7}$, seven ANNs were used to predict 100 months ahead. However, if there is insufficient
data available for training high ordered ANNs, this number can be set smaller. The weights of the first
ANN were initialized with small random values. The initial weights of subsequent ANNs were copied
from the previously trained ANNs.
Validation was done every four training cycles. An interval of four was selected to reduce the time
spent on validation. The training process halts under one of the following conditions:
- The number of cycles is equal to the maximum number of training cycles preset by the user.
- The training and validation errors are smaller than or equal to the validation error threshold set by
  the user.
- The values of the last n validation errors increase monotonically, which indicates that over-fitting is
  likely to have begun.
In our experiments, n = 10 was used, and the ANN that produced the least validation error was saved.

Fig. 4. Predicted vs. Target Model 1.

Fig. 5. Predicted vs. Target Model 2.

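The three halting conditions can be sketched as a small stopping rule that is consulted after every validation pass. The class below is a hypothetical illustration: it assumes that the training error, the validation error and the threshold are expressed in the same units (for example 0.05 for 5%), and it only records which cycle gave the lowest validation error, where a real trainer would save the network weights.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Sketch of the three halting conditions; names and error units are assumptions. */
class StoppingRuleSketch {
    private final int maxCycles;      // e.g. 3000
    private final double threshold;   // e.g. 0.05, in the same units as the errors
    private final int window;         // n, e.g. 10 (should be at least 2)
    private final Deque<Double> recent = new ArrayDeque<>();
    private double bestValidationError = Double.POSITIVE_INFINITY;
    private int bestCycle = -1;

    StoppingRuleSketch(int maxCycles, double threshold, int window) {
        this.maxCycles = maxCycles;
        this.threshold = threshold;
        this.window = window;
    }

    /** Call after each validation pass (every fourth cycle in the paper). */
    boolean shouldStop(int cycle, double trainError, double validationError) {
        if (validationError < bestValidationError) {
            bestValidationError = validationError;
            bestCycle = cycle;        // a real trainer would snapshot the weights here
        }
        recent.addLast(validationError);
        if (recent.size() > window) {
            recent.removeFirst();     // keep only the last `window` validation errors
        }
        boolean outOfCycles = cycle >= maxCycles;
        boolean accurateEnough = trainError <= threshold && validationError <= threshold;
        boolean overFitting = recent.size() == window && strictlyIncreasing();
        return outOfCycles || accurateEnough || overFitting;
    }

    /** True when every retained validation error is larger than the one before it. */
    private boolean strictlyIncreasing() {
        double previous = Double.NEGATIVE_INFINITY;
        for (double error : recent) {
            if (error <= previous) {
                return false;
            }
            previous = error;
        }
        return true;
    }

    /** Cycle at which the lowest validation error was observed so far. */
    int bestCycle() {
        return bestCycle;
    }
}
```

With the settings reported above, the rule would be constructed as `new StoppingRuleSketch(3000, 0.05, 10)` and queried every fourth training cycle.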
The first component ANN, which predicts one month ahead, was used as the single ANN in our
comparison.
6. Results
6.1. NOL models
The test data set was run with the saved ANN models. The testing error rates were 0.04 for the
first model developed on NOL, which incorporates both time series and geoscience data, and 0.033 for the
second model on NOL, which uses only time series data. Figures 4 and 5 show the predicted values of the
two models (indicated by the line) versus the target values (indicated by the squares).
Sensitivity tests were conducted over the two trained models to identify input variables that have
strong influence on an output variable, or inputs that have little or no influence on the output variable.
Sensitivity testing is useful for understanding the correlations in the data, which may lead to a greater
understanding of the physical causality of the process. Sensitivities (or influences of parameters) are
obtained by taking the average of the local derivative information.
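One way to picture a sensitivity obtained from averaged local derivatives is the finite-difference sketch below. It is not the procedure implemented in NOL, whose internals are not described here. The sketch assumes central differences, averages the absolute derivative of the model output with respect to each input over a set of samples, and rescales the averages to percentages. The rescaling, the use of absolute values and all names are assumptions made for the illustration.

```java
import java.util.function.ToDoubleFunction;

/** Sketch of sensitivities as averaged local derivatives; not the NOL procedure. */
class SensitivitySketch {
    /**
     * Returns, for each input, its share (in percent) of the summed average
     * absolute derivative of the model output over the given samples.
     */
    static double[] relativeSensitivities(ToDoubleFunction<double[]> model,
                                          double[][] samples, double eps) {
        int inputs = samples[0].length;
        double[] average = new double[inputs];
        for (double[] x : samples) {
            for (int j = 0; j < inputs; j++) {
                double[] up = x.clone();
                double[] down = x.clone();
                up[j] += eps;
                down[j] -= eps;
                // Central finite difference approximates the local derivative.
                double derivative =
                        (model.applyAsDouble(up) - model.applyAsDouble(down)) / (2 * eps);
                average[j] += Math.abs(derivative) / samples.length;
            }
        }
        double total = 0.0;
        for (double a : average) {
            total += a;
        }
        double[] share = new double[inputs];
        for (int j = 0; j < inputs; j++) {
            // ASSUMPTION: report each input as a percentage of the total.
            share[j] = total == 0.0 ? 0.0 : 100.0 * average[j] / total;
        }
        return share;
    }
}
```

For a trained network, `model` would wrap the network's forward pass and `samples` would be rows of the input data.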
In the first model, the sensitivities of the six inputs over production are as follows.



Table 3
Sensitivities model 1
$k$      $\phi$   $P$      $P_{t-3}$   $P_{t-2}$   $P_{t-1}$
3.5%     1.3%     0.6%     9.6%        21.9%       63%

Table 4
Sensitivities model 2
$P_{t-3}$   $P_{t-2}$   $P_{t-1}$
11.3%       27.6%       61.2%

Fig. 6. Test errors for MNN and Single ANN for different prediction periods.

As can be seen in Table 3, the influence of the core ($k$ and $\phi$) and DST ($P$) analysis on the
production is small (less than 5%). The production of the most recent month has the strongest effect at
63%, and the productions of the two months before it are also significant at 21.9% and 9.6%. From this
result, it was decided that the second model should include only production time series data.
The sensitivities of the three input variables over the output variable in the second model are similar
to the previous model as can be seen in Table 4.
6.2. Multiple-ANN and Single-ANN models
In order to facilitate comparison between a MNN and a single ANN, the same test set of data was
applied to the MNN and the single ANN to predict monthly production up to 100 months ahead. The
average RMSE was 0.053. Figure 6 illustrates the errors for different periods from 1 to 100 months
ahead.
Figure 6 indicates that the MNN generally performs slightly better than the single first-ordered ANN.
As the prediction term increases, the difference in performance becomes more pronounced. This indicates
that a MNN performs better than a single ANN in long term forecasts.
Figure 7 illustrates the desired and predicted outputs from the MNN and ANN models for a prediction
term of 100 months. With the exception of approximately the first 100 values in the graph, the predictions
from the ANN and MNN models are quite close to the desired results.
7. Discussion
The results of the models developed on NOL indicate that the production time series model performs
comparably to the mixed causal and time series model. The fact that geoscience data has insignificant
influence on production rates can be explained as follows. Firstly, core analysis is taken at the well bore
and may not represent the real permeability and porosity values over the entire well. Secondly, pressure
usually changes over a well's lifecycle but only information about the initial pressure is available for the
study. Thirdly, the time series data may already incorporate all the information of the core and DST
because there are correlations between previous productions and a well's parameters. Lastly, there may
not be sufficient core and DST analysis data points to study the influence of these parameters on the
production.

Fig. 7. Desired vs. Predicted Outputs.

The NOL tool-kit is a convenient and generic tool to develop and deploy an ANN application. However,
modifying network structures in the tool-kit, or deploying in an environment that involves non-Gensym
products, is more complicated. The MNN tool is a program developed for the specific purpose of
combining ANNs into a MNN. Currently, only the back-propagation ANN is included. However, it is
possible to add more network types into the system as Java is an object-oriented language.
It is observed that a MNN approach has some disadvantages. First, a MNN is more complex than a
single ANN, although only linearly so. Secondly, a high-ordered ANN requires more data to train and
validate.
In comparison with the simple recurrent neural network approach, the MNN approach suffers from
the weakness that it cannot include unlimited past values into its model. Instead, it only models past
values at exponential intervals. However, since the Elman networks have been criticized for being able
to model only limited past values in practice, the Elman model also suffers from the same weakness.
The advantage of the MNN approach is that its input is as close to the observed values as possible while
Elman networks usually depend more on the recent predicted values than the distant observed values.
8. Conclusion and future work
This paper investigates the applicability of the artificial neural network technique for short- and
long-term prediction of petroleum production, and the results show that the ANN technique is indeed useful for
the task. The ANN models are efficient and adaptable and can be used to aid petroleum engineers in
a variety of design, optimization and control tasks. It can be concluded from the experiment that core
analysis and DST data make little contribution to the model output of petroleum production and time
series data alone is sufficient to develop a meaningful model.
The MNN model shows superior performance over the single ANN model in long term prediction.
Aside from ANN, it is possible to incorporate other numerical prediction techniques in a multiple-order
model to perform long term prediction.


Future research includes testing the ANN and MNN methods using different sets of time series data
with dissimilar characteristics from the petroleum production data. Current work involves building a
model for prediction of gas consumption and constructing a graphical user interface for training and
testing the MNN model.
Acknowledgements
The authors would like to thank Saskatchewan Energy & Mines for providing us with both expert
discussions and the data sets used in this study. We thank Dr. Gary Zhao of the University of Regina
for useful discussions. We are also grateful to the Natural Sciences and Engineering Research Council of
Canada for their generous financial support and to Gensym Corporation USA for providing us with the
NOL toolkit.
References
[1] F. Aminzadeh, J. Barhen and N.B. Toomarian, Estimation of Reservoir Parameter using a Hybrid Neural Network,
     Journal of Petroleum Science and Engineering 24(1) (1999), 49-56.
[2] N. Bhat and T.J. McAvoy, Use of Neural Nets for Dynamic Modeling and Control of Chemical Process Systems,
     Computers & Chemical Engineering 14(4/5) (1990), 573-583.
[3] J.C. Chappelier and A. Grumbach, Time in Neural Networks, SIGART Bulletin 5(3) (1994), 3-11.
[4] C. Chih-Chou, K. Ling-Jing and D.F. Cook, Combining a Neural Network with a Rule-Based Expert System Approach
     for Short-Term Power Load Forecasting in Taiwan, Expert Systems with Applications 13(4) (1997), 299-305.
[5] R.D. De Veaux, R. Bain and L.H. Ungar, Hybrid Neural Network Models for Environmental Process Control (The 1998
     Hunter Lecture), Environmetrics 10(3) (1999), 225-236.
[6] A.J. Dikkers, Geology in petroleum production, Elsevier (1985).
[7] J.L. Elman, Distributed Representations, Simple Recurrent Networks and Grammatical Structure, Machine Learning 7
     (1991), 195-224.
[8] J.L. Elman, Finding Structure in Time, Cognitive Science 14 (1990), 179-211.
[9] R.B. Gharbi, A.M. Elsharkawy and M. Karkoub, Universal Neural-Network-Based Model for Estimating the PVT
     Properties of Crude Oil Systems, Energy & Fuels 13 (1999), 454-458.
[10] Y. Haiwen, N. Rainer and R. Lothar, A Bayesian-Gaussian Neural Network and Its Applications in Process Engineering,
     Chemical Engineering and Processing 37(5) (1998), 439-449.
[11] Z. Huang and M.A. William, Determination of Porosity and Permeability in Reservoir Intervals by Artificial Neural
     Network Modeling, Offshore Eastern Canada, Petroleum Geoscience 3(3) (1997), 245-258.
[12] M. Kavchak and H. Budman, Adaptive Neural Network Structures for Non-linear Process Estimation and Control,
     Computers and Chemical Engineering 23(9) (1999), 1209-1228.
[13] K.J. Lang, H. Waibel and G.E. Hinton, A Time-Delay Neural Network Architecture for Isolated Word Recognition,
     Neural Networks 3(1) (1990), 23-44.
[14] J. McNames, J.A.K. Suykens and J. Vandewalle, Winning Entry of the K. U. Leuven Time Series Prediction Competition,
     International Journal of Bifurcation and Chaos 9(8) (1999), 1485-1500.
[15] D. Melas, L. Kioutsiouskis and L.C. Ziomas, Neural Network Model for Predicting Peak Photochemical Pollutant Levels,
     Journal of the Air & Waste Management Association 50 (2000), 495-501.
[16] S.L. Reich, D.R. Gomez and L.E. Dawidowski, Artificial Neural Network for the Identification of Unknown Air Pollution
     Sources, Atmospheric Environment 33(18) (1999), 3045-3052.
[17] P. Rodriguez, J. Wiles and J.L. Elman, A recurrent neural network that learns to count, Connection Science 11 (1999),
     5-40.
[18] P.M. Wong and I.J. Taggart, Use of Neural Network Methods to Predict Porosity and Permeability of a Petroleum
     Reservoir, AI Applications 9(2) (1995), 27-37.
[19] Q. Zhang and S.J. Stanley, Real-Time Water Treatment Process Control with Artificial Neural Networks, Journal of
     Environmental Engineering 125(2) (1999), 153-160.
