Professional Documents
Culture Documents
Hasso Plattner
A Course in
In-Memory Data Management
The Inner Mechanics
of In-Memory Databases
October 8, 2013
213
214 34 Bypass Solution
ETL OLAP
ERP Engine
Traditional DB
w/ Cubes'
Traditional DB'
In the first step of this approach (see Figure 34.2), the IMDB is installed
and connected to the traditional database.
The only dierence will be in data representation: data will be stored in
columns. An initial load to the in-memory database (IMDB) creates a copy
of the existing system state with all business objects in the IMDB. In spite of
the huge volume of data to be reproduced, first experiments with massively
parallel bulk loads of customer data have shown that even for the largest
companies this one-time initialization can be done in only a few hours.
After the initial load, the two storages will be maintained in parallel,
every document and change in a business object is stored in both databases.
For this, established database replication technologies are used. The high
compression rate in a column store helps to decrease the amount of main
memory required for such parallel use of two databases, and hence it does
not lead to a significant waste of resources. At the same time, using the
parallel installation of the IMDB, we can estimate performance and memory
consumption benefits of this architecture for concrete business cases and
prove the need of moving the system to the new data storage.
STEP 1: Install and run the in-memory database in parallel
ETL OLAP
ERP Engine
Traditional DB
w/ Cubes'
IMDB
Traditional DB'
'
SSD'
ETL OLAP
ERP Engine
Traditional DB
w/ Cubes'
IMDB
Traditional DB'
'
New
Applications
SSD'
In a third step, which can be done in parallel with the previous ones, the
BI system is ported to the IMDB as illustrated in Figure 34.4. This will help to
achieve another gain in reporting performance in comparison to disk-based
OLAP systems.
The dierence to the traditional system is that all materialized data cubes
and aggregates will be removed. Instead, aggregates are computed on the fly
and all data-intensive operations are pushed to the database level. In com-
parison with storing all materialized aggregates and indices, this reduces
the amount of main memory, which is necessary for the OLAP system. It
immediately leads to the following benefits: the data cubes in a traditional
BI system are usually updated on a weekly basis or even less frequently.
However, executives, management, and all other decision makers often de-
mand up-to-date information. By running OLTP and OLAP systems on the
same IMDB platform, this information can be delivered in real time. Another
advantage is that the ETL process is radically simplified as the complex cal-
culation of aggregates is omitted. The ETL process can partly be replaced by
the same replication mechanism used between the traditional ERP database
and the parallel installation of the in-memory database. Therefore, we called
the replication in Figure 34.4 simply EL, since no transformation takes
place any longer, just extraction and load remain. Furthermore, the Business
Intelligence (BI) -system will be more flexible as complex cube management
and maintenance operations are abandoned and complexity is reduced.
In most cases, the migration to the in-memory database BI system can be
conducted automatically. This is relatively simple as existing materialized
views are replaced by non-materialized views. Analytical queries can be
rewritten by generators. However, in complex migrations scenarios, factors
such as dierent SQL dialects, data structures, or software parts that rely
on additional data in proprietary formats may pose an obstacle and might
require manual interventions.
The final step of the suggested solution can be executed when the cus-
tomer is comfortable with the parallel in-memory solution. In this step, the
traditional OLTP database is switched o. After that, the customer works
only with the consolidated in-memory enterprise system that is used for
both transactional and analytical queries (see Figure 34.5).
The BI system could theoretically be replaced in a setup with only one
ERP system. Reality shows that many OLTP systems exist and external data
has to be integrated into the analytical system. To this end, the traditional
BI system is used as a data platform consolidating data from these OLTP
systems and external sources.
Additionally, during system evolution, new extensions to the data model
are possible. Adding new tables and new attributes to existing tables in the
column store is done on the fly and this speeds up release cycles significantly.
Figures for Book: Step 3
EL OLAP
ERP Engine
New
Applications
SSD'
New
Applications
SSD'