
SPE 106075

Data Management in Reservoir Simulation


S.C. Gencer, ExxonMobil Upstream Research Co.; B.P. Ketcherside, ExxonMobil Global Services Co.; and
G.O. Morrell, E.L. Mulkay, and K.D. Wiegand, ExxonMobil Upstream Research Co.

Copyright 2007, Society of Petroleum Engineers


This paper was prepared for presentation at the 2007 SPE Reservoir Simulation Symposium held in Houston, Texas, U.S.A., 26-28 February 2007.

This paper was selected for presentation by an SPE Program Committee following review of information contained in an abstract submitted by the author(s). Contents of the paper, as presented, have not been reviewed by the Society of Petroleum Engineers and are subject to correction by the author(s). The material, as presented, does not necessarily reflect any position of the Society of Petroleum Engineers, its officers, or members. Papers presented at SPE meetings are subject to publication review by Editorial Committees of the Society of Petroleum Engineers. Electronic reproduction, distribution, or storage of any part of this paper for commercial purposes without the written consent of the Society of Petroleum Engineers is prohibited. Permission to reproduce in print is restricted to an abstract of not more than 300 words; illustrations may not be copied. The abstract must contain conspicuous acknowledgment of where and by whom the paper was presented. Write Librarian, SPE, P.O. Box 833836, Richardson, Texas 75083-3836 U.S.A., fax 01-972-952-9435.

Abstract
Data from a wide variety of sources are required for reservoir simulation. Simulation itself produces large quantities of data. Yet, good data management practices for reservoir simulation data are typically neither well understood nor widely investigated. This paper presents a specific architecture to manage reservoir simulation data, discusses experiences from six years of global use, explains adjustments to support changing workflows and outlines challenges that lie ahead.

The architecture consists of a Database Management System (DBMS) and files in managed file directories, called the Reservoir Input Output System (RIOS). All simulation input data and results are maintained by a Data Management System (DMS). The reservoir simulator reads input files written from the DBMS to RIOS and writes results to files in RIOS. The DBMS, RIOS and integrated management tools (DMS) make up the data management environment.

The environment has been in use inside ExxonMobil since late 2000 and now supports close to 500 users (85% of reservoir engineers). There are over 30 individual databases containing 2 TB of online data and about 6 TB of online RIOS data. The environment itself introduces some additional work: support staff is required for maintenance of databases and RIOS areas and for problem resolution; direct user manipulation of data is not permitted; and additional tools are required to access and interpret data.

The environment provides many benefits. While it ensures data integrity, security and consistency, it also automatically updates defaults, limits, associations, types, etc. This allows running of older simulations and generation of aggregate statistics and usage audit trails.

The architecture and experiences presented in this paper may be unique in the industry. The DMS was designed, developed and deployed over a ten-year period. It is a successful software story and is viewed, along with the simulator, as a key enabling technology for success with reservoir simulation within ExxonMobil.

Introduction
Reservoir simulation is inherently a data-intensive process. It starts with geological models and their properties, and assignment of phase behavior or equation-of-state data, relative permeability and capillary pressure information and geomechanical data. It requires layout of the surface facility network, subsurface configuration of wells, their attributes, pressure and rate limits and other production and optimization constraints. Very often, production history information, hydraulics tables, completion tables and logic for runtime management of wells and surface facilities are needed. Finally, special cases like thermal and fractured reservoir simulations require their own set of additional data.

During simulation, timestepping information, convergence parameters and well performance data can be logged and analyzed. Results, such as pressures and rates from wells and surface facilities and pressures and saturations from the simulation grid, can be monitored and recorded. The state of the simulator can be recorded at specified intervals to enable restart of a run at a later time.

This results in an abundance of data to analyze, visualize, summarize, report and archive. Over the years, many authors have tried to address one aspect or another of this data management problem, and many commercial and proprietary simulators have made allowances to simplify users' work in this area [1-3]. However, in general, data management has not been a widely investigated aspect of reservoir simulation.

Data management in reservoir simulation enables workflows and collaboration, ensures data integrity, security and consistency and expedites access to results. In today's computing environment, data management is an enabler to meet the growing need for reservoir simulation and to make simulation available to a wider audience of professionals, including many kinds of engineers and geoscientists.

With its EMpower™ reservoir simulator [4-5] (EMpower is a trademark owned by ExxonMobil Upstream Research Company), ExxonMobil spent considerable time and effort in developing, deploying, supporting and maintaining a data management environment surrounding the reservoir simulator. These experiences - and not the computational aspects of the reservoir simulator - are the subject of this paper.

Elements of the Data Management Environment
The data management environment encompasses all simulation input, results and restart data and a collection of software programs, tools and procedures for their management (DMS).

Simulation Data
The top-down view of the simulation data starts with a hierarchy of projects, models and cases (Figure 1). A project usually encompasses a particular reservoir study. Models are used to distinguish between different simulation approaches, which may require fundamentally different discretizations or fluid representations, such as black-oil vs. compositional simulation, fractured vs. non-fractured, etc. Cases within a given model are generally expected to represent minor changes in the input data or facility network representation, with most of the data being shared among them. Currently, approximately 1,000 projects with 5,000 models and 20,000 cases are managed worldwide.

Figure 1: Subset of data model showing project/model/case hierarchy and their relationships. Projects can contain one or more models, each of which can contain one or more cases.

All data needed for and produced by a simulation fall within one of three broad categories: arrays, granules and facility network data. Simulation cell and interface data such as pressure, mole fractions, fluxes, etc. fall into the first category. Granules are collections of parameters that are intended to be small in size while containing a variety of different data types. For instance, black-oil fluid parameters for a given domain comprise such a collection; solver parameters and timestep-control parameters are further examples. A facility network is a collection of physical facilities represented as nodes and connections. Example facilities are wells, platforms, separators, terminals and the pipelines that connect them. All facilities have attributes and constraints that describe them and their behavior. For example, all facilities have a name and active state, and all wells have a rate or pressure limit.

A key feature distinguishing ExxonMobil's current reservoir simulation system from its predecessors is its use of an extended surface facility network model that is fully integrated with the reservoir. This key feature contributes greatly to the complexity of the data model. All facilities in the network are directly accessible and can be manipulated by the reservoir engineer for maximum flexibility. In addition, users can add their own attributes and procedures to a given facility type. This capability is extremely important. Assume, for instance, that the reservoir engineer wants to model submersible pumps in a way that the current simulator version does not support. The needed variables and functionality can be added to the well facility type by the engineer and made a part of the timestep calculation. This flexibility is very powerful and allows rapid prototyping of new functionality.

Well Management Logic
Facilities are the most dynamic part of reservoir simulation. In EMpower, they are managed at runtime with user-defined logic called Well Management Logic. This is part of the input data, but it is such a distinctive concept that it deserves a more detailed description. The timeline of a reservoir simulation is usually divided into two segments: the first is history matching, the second is prediction. During history matching, the goal is to design a model that will match historical rates and pressures. During prediction, reservoir engineers want to experiment with various scenarios in order to approximate a good production profile for the field. For instance, the engineer may want to test whether it is sufficient to reduce high-GOR wells and increase production of low-GOR wells in order to maintain a given oil-production plateau while keeping the field's gas production in check, or whether it is necessary to work over some wells. While it is theoretically possible to hard-code scenarios like this, it is impossible to pre-conceive every possible strategy a reservoir engineer might want to try. Allowing the engineer to define such strategies using a programming environment greatly enhances the flexibility and utility of a reservoir simulator while complicating the data management environment.

Data Management System
In EMpower, the DMS is the central work environment for the simulation engineer. It is the single point of entry for preparing, running and analyzing simulations and therefore has several distinguishing characteristics and requirements.

First, it is data driven; all dialogs work from data definitions. Some can display the three data types (arrays, granules and facility network data) without knowledge of actual data content. Second, user access is controlled by login, and data access is controlled by user, group and world permissions. It is possible to completely hide projects, models and cases from other users, and it is also possible to set up a project, model or case for use by a specific group of users. Third, the DMS

ensures backward compatibility, interoperability and data integrity with tools that validate and upgrade data and check the integrity of arrays, granules and facility data. Finally, a set of administrative tools is supplied to test components of the data management environment, to support different access models (administrator, manager, user, etc.) and to provide functions like managing users, migration of data from one version to another and reporting of project, model and case statistics.

Simulation Workflow and Data Management
One of the great advantages of using a DMS is that it allows the definition of dependencies between input data, results data and simulation times. For instance, if a user changes input data at time t0, the system is able to determine what data become invalid at times t >= t0. Or assume that the user changes from black-oil to compositional simulation: the DMS is able to indicate what additional input data are needed and can provide appropriate defaults. Data validation options, such as checking fluid property tables or timestep controls, can prevent the user from wasting time by supplying ill-conditioned parameters to the simulator.

Figure 2: Subset of data model showing relationships of a variable to a case, domain and time. The relationships are managed with variable use class, and each case has a hashed list of variables which is managed by case to variable use class.

Data Sharing
The project/model/case hierarchy implies that the majority of the simulation input data is shared among cases within the same model. For instance, the user may try different permeability values during a history match or test different solver parameters to achieve better performance, and there is no need to create a complete new set of data that duplicates the simulation grid, input arrays, granules and facility network. However, as simple as the concept sounds, the data sharing code within the DMS can be quite complicated, since almost all input data can be time dependent. Mathematically, data sharing is established via a unique relationship (data-item to case, time and domain for variables; data-item to case, time and facility for facility attributes and constraints), where domain is a user-defined region inside the simulation model (Figure 2). An identical copy of a case does not duplicate any data, but triggers the creation of a second set of relationships. Selected data can then be unshared to facilitate differences between cases.

Variable Attribute Repository
Since the development of a reservoir simulator is an ongoing process with new features being added on a regular basis, care must be taken to avoid frequent changes in the data layout, which are costly. Therefore, early in the development process, it was decided to create a meta-layer between the data model and the data layout. This meta-layer is called the Variable Attribute Repository (VAR) and describes data items of all three categories mentioned earlier: arrays, granules and facility network data. Assume a new array needs to be added to the system. From a data layout perspective this is just another generic array that can be linked to a case, time and domain. The VAR, however (whose layout is fixed), will have an additional entry detailing the purpose of the array, its description, default value, etc. Facility data description is even more versatile: not only is it possible to define any kind of attribute for a facility type, new facility types can also be defined from a base set of facility types. For example, a separator node is similar to network nodes, but with some unique attributes of its own, such as temperature.

The VAR is extended as users define new facility attributes and arrays during their work. For example, new attributes can be calculated and used in well management logic, or new arrays can be created to modify transmissibilities. The definitions of these attributes and arrays are stored in a User VAR at the model level and are available to all cases the model contains in the same manner as regular VAR definitions.

Data Mining
Potentially the greatest benefit of managing reservoir simulation data, though, is the capability for data mining. The amount of data generated for and by simulation is significant. It is not easy to analyze results just for one study, let alone across many. However, with well-defined data management, automated tools can scan and analyze data areas to generate overall statistics and trends; this capability is known as data mining. Data mining enables a quick overview of just what kind of models are being worked on, as well as providing insight into the type of problems users run into. This improves quality control and opens the door to a self-learning system.

Architecture of the Data Management Environment
The simulation environment has been implemented as a heterogeneous, distributed, three-tier, client-server architecture. The DMS is the client software at the end user workstation. All reservoir simulation data are stored in the second tier, consisting of a database and file directories in a mass storage area called RIOS. The simulator, running on different compute servers, is the third component and represents the server side. Figure 3 summarizes this architecture.
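The sharing scheme described under Data Sharing can be sketched in miniature. The class and method names below are hypothetical illustrations, not the actual DMS code: a copied case initially shares every data item through a second set of relationships, and unsharing replaces one relationship with a private copy.

```python
import copy

class SharedStore:
    """Sketch of shared data items addressed by (case, variable, time, domain)."""

    def __init__(self):
        self.items = {}    # item id -> payload, stored once
        self.rels = {}     # (case, variable, time, domain) -> item id
        self._next_id = 0

    def put(self, case, variable, time, domain, payload):
        # Store the payload once and link this case to it.
        item_id = self._next_id
        self._next_id += 1
        self.items[item_id] = payload
        self.rels[(case, variable, time, domain)] = item_id

    def get(self, case, variable, time, domain):
        return self.items[self.rels[(case, variable, time, domain)]]

    def copy_case(self, src, dst):
        # An identical copy duplicates no data; it only creates a second
        # set of relationships pointing at the same stored items.
        for (case, var, t, dom), item_id in list(self.rels.items()):
            if case == src:
                self.rels[(dst, var, t, dom)] = item_id

    def unshare(self, case, variable, time, domain):
        # Replace one shared relationship with a private copy so the
        # case can diverge from its siblings.
        item_id = self.rels[(case, variable, time, domain)]
        self.put(case, variable, time, domain, copy.deepcopy(self.items[item_id]))
```

Copying a case is then cheap regardless of model size, and changing one permeability array in the copy duplicates only that single item.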

Figure 3: Diagram of the three-tier simulation environment architecture. The DMS is the client piece of the architecture and the simulator is the server piece. The database and RIOS comprise the second tier. Middleware (not detailed in this paper) manages communication between tiers.

This architecture explicitly decouples the simulator from the DMS and database. The simulator has completely different requirements that guide what platform it should run on. It has access to RIOS areas, reads its input from files in RIOS and writes all its output to files in RIOS.

Access control, security, data integrity and scalability features discussed above are inherently addressed by commercial databases. Databases are also ideal for managing and relating large sets of data. Therefore, for the second tier, a database was selected for managing primarily input data. When running a case, the DMS writes input data from the database to RIOS and launches the simulator. Results and restart data and runtime log files written by the simulator to RIOS are managed and used by the DMS as well.

Middleware
The three-tier architecture will not work without a middleware component. The middleware keeps track of running simulations, figures out where files are and enables communication between the DMS and simulator. It consists of one master network service per site, which manages services running on the different compute servers at that site. Each compute server service is aware of RIOS directory names and their mapping to simulation jobs and passes simple commands and their return codes between the DMS and simulator. The details of the middleware are not discussed in this paper.

Database
There are several choices for the type of database, including relational, which is highly pervasive in many industries. However, for the needs of this project - which include compatibility with the object-oriented paradigm of the DMS and the simulator, the ability to handle large cardinalities of relationships, and the implementation of the VAR concept and generic storage of arrays, granules and facility data - an object database was deemed the better choice.

Object Database
The object database provides many desired features, including transaction-oriented, multi-user access with object locking and rollback functionality. It manages the schema and object relationships and enables definition of the granularity of transactions based on user actions. Management of object relationships is probably its biggest strength; this is difficult and involved to implement with a relational or object-relational database. There are more than 100 unique object classes and approximately 150 distinct object relationships, some of which would have tens of millions of rows in a simple relational table implementation.

The database schema is a logical decomposition of the user's view into the data model. It stores parameters and works in parallel with locking. For the development team, a guiding principle was to minimize changes to the database schema, since each change requires migration of existing data, which is cumbersome and time consuming. Therefore, the database schema is kept relatively simple. The meta-schema, or VAR concept, is built on top of the database schema and enables definition of all granules, arrays, facility types and attributes without any database schema modification.

Relationship Management
Executing lookups of array, granule or facility data is a key performance issue; a quick response time is critical. The number of domains, arrays and granules in a case is on the order of hundreds of objects. Thousands of objects result when many cases in the same model share the same arrays and granules. Depending on the number of facilities and time-variant changes, the number of facility attribute and constraint objects managed by a case can reach into the tens of thousands. When multiple cases share the same facility network, the object count can reach hundreds of thousands and more. To simplify searches, facility data is looked up from a facility instead of a case, but this can still mean examining tens of thousands of relationships per facility. In a highly interactive environment, where hundreds to thousands of attributes and constraints may be looked up during a user action, slowness of this capability can be a major bottleneck. To maximize performance, a hashing technique based on cryptographic hashing keys was developed. With this technique, object use lookups are reduced to an average of one or two searches into hundreds of thousands of elements.

Although the database schema is relatively simple, the quantity of relationships and the number of objects in each make the database environment quite complex. Test programs were developed to exercise and validate functionality at the unit level, and maintenance programs were written to correct inconsistencies and problems with the database. Although the VAR concept has minimized the need for schema changes, programs had to be written to manage upgrades of data when schema changes occur.
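The paper does not disclose the actual hashing scheme used for Relationship Management; the sketch below illustrates the general idea with hypothetical names, keying an index on a cryptographic digest of the identifying tuple so that a lookup costs one or two probes instead of a scan over all relationships.

```python
import hashlib

def use_key(facility, attribute, time):
    # Stable cryptographic digest of the identifying tuple of an object use.
    text = f"{facility}|{attribute}|{time}"
    return hashlib.sha256(text.encode("utf-8")).hexdigest()

class RelationshipIndex:
    """Hypothetical sketch: hashed index over object-use relationships."""

    def __init__(self):
        self._index = {}   # digest -> object id

    def add(self, facility, attribute, time, object_id):
        self._index[use_key(facility, attribute, time)] = object_id

    def lookup(self, facility, attribute, time):
        # One hashed probe instead of examining tens of thousands
        # of relationships per facility.
        return self._index.get(use_key(facility, attribute, time))
```

In this sketch the digest serves only as a stable, collision-resistant key; any change to the facility, attribute name or time produces a different key, so shared and unshared uses never collide.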

RIOS
The RIOS concept was developed to enable sharing of data between the DMS and the simulator, as the simulator was designed to be independent of the database. Every case has a RIOS directory with a unique name, and the database case objects know about their RIOS directories. Every RIOS area is associated with a specific business unit. Access can be controlled in a fashion similar to database permissions, with system-level owner, group and world permissions. It is possible to completely hide a certain RIOS area by allowing only a specific group ownership and access to it. RIOS areas can be network accessible or local. When local, unless public access is granted explicitly by the user, the RIOS is only accessible by local simulation jobs.

A RIOS directory contains two types of files: (1) a collection of files that contain input, restart and results data and (2) a set of log files that store per-timestep runtime information and user-requested output generated by well management logic.

Input, Restart and Results Files
As explained earlier, before launching a simulation, the DMS generates an input file for the simulator with all the input granule, array and facility network data in the case RIOS directory. As the simulator is running, it appends its complete state information to this file at the restart times requested by the user, or the user may request a set of restart data be written on demand at any point during the run. The simulator also writes specified arrays, granules and facility data to the results file at user-requested times. These results can be monitored while the simulator is running.

The input/restart and results files are self-referencing; they can refer to data within the same file, or file A can refer to data in file B and vice versa. They can be ASCII or binary and are completely portable. The format and structure of these files have been developed in-house over many years and are proprietary.

Log Files
The log files record timestepping information, convergence parameters, well performance data and information on problem nodes. The presentation of log file data is extremely sophisticated, with a web-style interface that displays highly detailed tables, charts and graphs. The power of this interface is further enhanced by its ability to present user messages written from well management logic in these formats as well. A screenshot of this tool is presented in Figure 4.

Figure 4: The Log Browser tool provides an interactive, HTML interface to critical timestep information, like timestep cuts, timings, material balance, I/O, etc. with text information, tables and charts. Hyperlinks enable complete cross-referencing of this critical data.

When a case is run, the DMS first deletes any existing RIOS files. When a case is restarted, all RIOS files are truncated to the restart time. The DMS accesses results arrays, granules and facility data from RIOS files directly.

DMS
The DMS is probably the most visible component of the data management environment. Written completely in C++ on the Windows operating system, it brings together many home-grown applications, vendor applications, 2D and 3D visualization tools and other 3rd-party packages to enable engineers to do their work without worrying about system details. The DMS is where all the data and their unique characteristics and associations are exposed to the engineer as intuitively as possible. Therefore, it is an area of constant evolution. The front end, the main entry point for users, is shown in Figure 5 with sample viewers, along with the project/model/case tree and the data manager, which lists data items for the current case.

Figure 5: The Front-End is the main entry point for users to the data management environment. It presents the filtered project/model/case hierarchy, the data items for the current case, and the ability to manage the data items for both input and results.
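The truncate-on-restart rule (RIOS files cut back to the restart time when a case is restarted) can be sketched as follows. The record layout here is a hypothetical stand-in for the proprietary RIOS formats: each record is just a (time, payload) pair.

```python
def truncate_to_restart(records, restart_time):
    """Keep records written at or before the restart time; discard the rest.

    Sketch only: real RIOS files are proprietary ASCII/binary formats, but
    the rule is the same - results written after the restart point become
    invalid and are dropped before the simulator resumes appending.
    """
    return [(t, payload) for (t, payload) in records if t <= restart_time]
```

After truncation the simulator simply appends new timestep records from the restart point onward, so the file never contains two conflicting histories.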

Archiving
One of the crucial tools available in the DMS is the archiving capability. Models, which are self-contained, can be archived with or without their RIOS data to a selected destination, such as a LAN drive, a DVD drive or a local disk. The archive file is in XML format and is therefore completely device and application independent. This data can eventually be migrated to off-line storage, and the model, as well as the associated RIOS data, can be deleted from the database to free up disk space. This process is crucial because it guarantees problem-free access to data even years later, regardless of the version of the DMS, database schema or VAR, by applying all relevant changes necessary to upgrade the data at the time of restore.

Deployment and Usage
The data management environment was first released in late 2000 along with the deployment of ExxonMobil's EMpower reservoir simulator. It has been in use inside ExxonMobil since then and now supports close to 500 users. There are 32 individual databases containing 2 TB of online data and about 6 TB of online RIOS data corresponding to about 5,000 models. The environment has been successfully deployed in eight countries on five continents outside the United States, including Europe, Asia, Australia, Africa and South America. The system has gone through three major, five minor and three patch releases. Not including the simulator, there are approximately nine million lines of software code (with six million lines of 3rd-party vendor code and over half a million lines of database-related code) and 25,000 files (18,000 of which are 3rd-party vendor source code files).

Figure 6: Flow of a support issue through organizational components making up the support structure for the simulator and its data management environment. Development and business validation are in the Research organization, while all other components except users are in the Information Technology organization.

Support Issues
There is a significant support infrastructure built around the reservoir simulator and its components (Figure 6). The data management environment adds additional complexity to this work. The environment is developed and validated for business use in the research organization. After this, the Information Technology organization takes over the full deployment and training tasks. Besides users, system administrators, database administrators and user support staff must be trained. The deployment includes upgrades of existing databases and is a phased process which takes several months. Once deployed, the environment requires care and feeding by on-site user support staff and database administrators, who in turn can rely on central user support, application support and database administration organizations. The long cycles from development to deployment require maintenance of two or three versions of the environment concurrently at any one time. The issues faced in development and support of this data management environment are similar to those faced by larger software vendors: an increasing user base, multiple versions and data compatibility needs require substantial development and support time.

Data Complexity
Reservoir simulation data can cover a wide spectrum: the simulation grid can be one cell or millions of cells; the facility network can be a single well or tens of thousands of wells; the time range of simulation may cover milliseconds to millions of years; time dependence of data can range from none to every second, every minute or every few days; results can be reported so frequently, or a run can be so long, that the files written by the simulator can reach over 10 GB in size. A good simulator and its data management system must be flexible enough to handle any of these requirements.

Size
From the beginning of the project, this variability in quantity and spectrum of data has continuously taxed the data management environment. Schemes have been developed to compress RIOS files and to handle files greater than 2 GB in size. The log files have been compacted from an initial ASCII format to a compressed, delimited form. The restart data has been made less persistent by the implementation of disposable restarts, which keep track of the latest restart data only. None of these, however, have been as taxing as the work required on the database. There has always been a feature or an action not performing well enough, or not working at all, with a particular set of data or use of it. Either the implementation had not considered that kind of use, or the database design had reached its limits in dealing with the data. Some of the actions most fragile in this respect include copying cases, archiving models, creating restart cases and unsharing facility network data. These issues have all been addressed over time. The database was initially designed to be all-comprehensive and included loading of all facility results and meta-data for results arrays into the database as well. However, after several years of dealing with performance problems with results loading and deletion, RIOS files are now the single source of results and restart data. The database still stores all input data and has a pointer to the RIOS directory of each case to ensure consistency between RIOS and the database.
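The archive file layout itself is proprietary; the sketch below (hypothetical element names, Python standard library only) illustrates the idea described under Archiving: a self-contained model serialized to XML carrying a schema version, so that upgrades can be applied at restore time regardless of the software version that wrote it.

```python
import xml.etree.ElementTree as ET

# Hypothetical sketch of an XML model archive. Element names are invented;
# the real archive format is proprietary. The schema attribute lets the
# restore step apply data upgrades for older archives.

def archive_model(model_name, cases, schema_version="1.0"):
    root = ET.Element("model", name=model_name, schema=schema_version)
    for case_name, granules in cases.items():
        case_el = ET.SubElement(root, "case", name=case_name)
        for key, value in granules.items():
            item = ET.SubElement(case_el, "granule", name=key)
            item.text = str(value)
    return ET.tostring(root, encoding="unicode")

def restore_model(xml_text):
    root = ET.fromstring(xml_text)
    cases = {c.get("name"): {g.get("name"): g.text for g in c.findall("granule")}
             for c in root.findall("case")}
    return root.get("name"), root.get("schema"), cases
```

Because the archive is plain text, it survives database vendor changes and off-line storage migration; a restore tool would inspect the schema attribute and upgrade the content before loading it into the current database.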

Database Migration
The biggest data challenge came when business drivers demanded a change of database vendor. This was a big undertaking, especially since object databases are not standardized like relational databases, and the code had not been developed with a distinct, database-independent data access layer. However, the design was adequate, and a distinct separation of database transactions from other code is in place. This huge effort was brought to a successful completion and also ushered in the XML archiving scheme, which enabled archive and restore of models independent of database vendor, with backward compatibility. Upgrading models from older data schemas to the new data schema using archive/restore turned out to be a natural extension.

Today, the database provides a secure and consistent repository that works in a collaborative environment. However, it is an ongoing project that requires continuous improvement to support increasing data and flexibility requirements.

User Perspective
Users of the reservoir simulator are as varied as the simulation data. There are engineers who are new to the company and who are trying to use reservoir simulation for the first time; there are experienced engineers who use simulation only now and then; there are geoscientists who do studies of different dimensions; and then there are the experienced, hard-core, day-to-day users who know the simulation process inside and out. It is not possible to meet all the desires of this diverse set of users when building a system. Less experienced users prefer rigidly defined workflows that rely heavily on graphical user interfaces, while more experienced users find the graphical user interface limiting and want to do their own analysis using tools like SAS, Excel and MATLAB. The data management environment, initially designed to be all-encompassing and self-contained, is now asked to be more open. Some progress has been made to this end. Many functions of the DMS can now be driven from scripts, and a new API is being made available for accessing input and results/restart data. Further developments are in progress to connect the DMS with other applications using Windows Workflows.

Database/RIOS as Integral Part of Work
One of the most visible results of the data management environment is awareness of database and RIOS usage by both users and administrators. Consolidation of all relevant simulation data to these areas has clearly brought to light the quantity and nature of data that has to be managed. Disk space for both database and RIOS areas is cost-allocated to business units; therefore, there is a constant check for overuse. Users must always be aware, or are reminded, when they fill up the database or RIOS.

Help System
One of the least talked about but most appreciated components of the DMS is the online help system based on Microsoft HTMLHelp. All functionality within the DMS is clearly documented and explained. For many users, this is their first point of support. The help system is context sensitive and incorporates many hyperlinks to interrelated information (Figure 7). It contains over 1100 HTML pages. Users would like the help system to incorporate newer search capabilities, similar to those used in internet search engines.

Figure 7: The HTML-based Help System provides detailed information on program functionality and underlying science. It is context sensitive, and hypertext links and index and search capabilities facilitate finding information.

Developer View
There are on average eight developers devoted to development and maintenance of the data management environment. About three versions of the software must be maintained concurrently, and the same resources must also deal with nightly builds, regression tests and porting to new Windows operating systems. Developers on average spend about 50% of their time on maintenance and support issues.

Most of the developers are seconded from the Information Technology organization; however, the turnover of staff is quite high, as the IT career development process rotates staff every two to three years. This puts an extra burden on the project, as it takes at least four to six months for new staff to become fully productive. The overall complexity of the system and the interrelationships of its components do not make it any easier. However, the fact that the system has successfully gone through many releases and has continuously increased its user base is a positive aspect of the project, especially considering that many of the original developers have long ago moved on to other assignments.

Management of reservoir simulation data requires long-term commitment and continuous improvement. From a business perspective, it makes no sense to start anew from version to version. More and more changes must be made and new features added while ensuring compatibility with existing data.

The Road Ahead
The trend in reservoir simulation is towards more: more users, more models, more cells, more wells, more cases, more data and more integration.
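The vendor-independent XML archive/restore scheme described under Database Migration can be sketched in miniature: a case is serialized to XML tagged with its schema version, and on restore, older schemas are upgraded step by step. Everything here — the `schema_version` attribute, the property layout, the v1-to-v2 rename — is a hypothetical illustration, not the actual DMS archive format.

```python
import xml.etree.ElementTree as ET

SCHEMA_VERSION = 2  # current data schema version (illustrative)

def archive_case(case: dict) -> str:
    """Serialize a case to vendor-neutral XML, tagged with its schema version."""
    root = ET.Element("case", schema_version=str(SCHEMA_VERSION))
    for key, value in case.items():
        prop = ET.SubElement(root, "property", name=key)
        prop.text = str(value)
    return ET.tostring(root, encoding="unicode")

def upgrade(props: dict, version: int) -> dict:
    """Upgrade properties from an older schema to the current one, step by step."""
    if version < 2:
        # hypothetical v1 -> v2 change: a property was renamed
        if "perm" in props:
            props["permeability"] = props.pop("perm")
    return props

def restore_case(xml_text: str) -> dict:
    """Restore a case from XML, upgrading older schema versions on the fly."""
    root = ET.fromstring(xml_text)
    version = int(root.get("schema_version", "1"))
    props = {p.get("name"): p.text for p in root.iter("property")}
    return upgrade(props, version)
```

Because the archive carries its own schema version, a restore of an old archive into a newer release becomes the same operation as a schema migration, which is the "natural extension" noted above.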
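The do-it-yourself analysis that experienced users ask for, described under User Perspective, amounts to pulling simulation results into their own tools and summarizing them. A minimal sketch of that workflow is below; the CSV layout is an invented stand-in, since the actual RIOS file formats are not described in this paper.

```python
import csv
import io
import statistics

def summarize_rate(csv_text: str, column: str = "oil_rate") -> dict:
    """Compute simple statistics for one column of an exported results table.

    `csv_text` stands in for a results export from the DMS; the column
    name is a hypothetical example.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    values = [float(row[column]) for row in reader]
    return {
        "n": len(values),
        "mean": statistics.fmean(values),
        "max": max(values),
    }
```

The same few lines could equally be written in SAS, Excel or MATLAB; the point is that an open export format, rather than the graphical interface, is what enables this style of analysis.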
Bigger Grids and Facility Networks
Simulation engineers would like to be able to build reservoir models with several million cells and manage several thousand wells with more flexibility. Currently, hardware and software limit the DMS to models of a few million unstructured grid cells and a few thousand wells for comfortable operation. As grids get larger and larger and the number of facilities in the facility networks reaches the many thousands, both the database and the DMS will be taxed even further. To be able to handle this kind of load, they will continue to be the subject of continuous improvement. The computing environment is also changing to support this load: grid computing, high-end compute servers and 64-bit desktops for clients.

Movement of input data, at least large granules and arrays related to grid definition and properties, to the RIOS area is also being considered. This would eliminate duplicate storage of the same data in different forms and two separate sets of input and output routines (one to the database and one to the RIOS), and would allow greater flexibility for external programs to supply and/or modify simulator input data. A database would still be used to manage data relationships, but would not be burdened with having to manage huge arrays and millions of objects, which have been major bottlenecks to database performance.

Automated History Matching and Optimization
With the requirement to be able to run many slight variations of a base case, automated history matching and optimization add another dimension to "more." Optimization and history matching are two areas of increasing popularity and research interest. Both need efficient management of tens to hundreds of cases that have little variation. Users have to be able to design experiments easily, many time dependencies must be managed behind the scenes, and results must be presented in new ways. Data sharing was a good start ten years ago, but now scenario management becomes very important. This requires substantial work on the architecture of the data management environment and will involve extension of the data sharing concept to the RIOS files.

Data Mining
More simulation runs produce more data. Data mining will come into greater use to extract useful information from all this data. With standardized files in RIOS directories and consistent databases, finding the right interpretation will be the key. This is a recent area of development for ExxonMobil. Tools have been developed to go through databases and RIOS files, extract information and generate statistics; however, much more work is necessary in this area, and it may require use of another database to collect information for analysis.

Integration and Open Environment
Finally, there is more demand for integration and easier access to simulation input and results data. Efforts to integrate the subsurface work environment call for visualization and interpretation of simulation data with geoscience data, and for better exchange of data — such as hydraulics tables, PVT, K-value and EOS properties, and completion efficiency tables — with the source applications. More and more external applications want to talk to the database to get or change data with automated tools. Users want to be able to get to simulation data directly to analyze it using their favorite tools.

To meet all of these challenges, the data management environment must be able to bring in new components. It must become more open and communicate more easily with other applications. It must provide simple interfaces for users to get to data quickly. This is an area of ongoing work.

Conclusions
The heterogeneous, distributed and multi-tier data management environment that has been described allows engineers to work on reservoir models using logically centralized, physically decentralized data sources where integrity, security, consistency, etc. are managed. The environment was designed, developed and deployed over a ten-year period and has gone through several versions.

The environment increases development work and support load and can have implementation issues that take time to resolve. Nevertheless, it has proven its value: (1) it has enabled penetration of reservoir simulation into a much wider audience than ever before, (2) it has exposed the volume and diversity of data in use and the need for good data management of simulation information, and (3) it has opened doors to new ways of analyzing simulation data, including data mining.

The environment must continuously improve and adapt to changing requirements and workflows. Initially designed as an all-inclusive, self-sufficient solution, it must now open up to enable integration, to handle ever bigger models with an increasing number of facilities, and to enable automated history matching and optimization workflows.

The benefits of good data management are not obvious until the system is in place. It requires full backing and commitment of company management for success. The experiences discussed in this paper would not have been possible without such technology leadership.

Acknowledgments
The authors wish to acknowledge B.L. Beckner, B.A. Boyett, T.K. Eccles, J.D. Hindmon and C.J. Jett for their valuable assistance with this paper. The authors also acknowledge the management of ExxonMobil Upstream Research Company for permission to publish this paper.

Windows is a registered trademark of Microsoft Corporation in the United States and other countries.