OLAP AND DATA WAREHOUSE

BY

W. H. Inmon

The goal of informational processing is to turn data into information. Online analytical
processing (OLAP) is an important method by which this goal can be accomplished in the
data warehouse architecture. As data warehouse users' understanding of DSS
processing capabilities increases, and as the volume of data grows, the sophistication of
data warehouse use increases. Figure 1 depicts the data warehouse architecture's role in
turning the data into information, and some of the general differences between the
levels of the data warehouse architecture.

The different levels of data warehouse architecture may be described as:


Organizationally Structured: Also known as atomic, corporate, or current
detail data, this level (the heart of the data warehouse) is structured to meet
the informational requirements of the entire organization. Data that has been
archived (archived detail) is also considered to belong to this level.
Departmentally Structured: Data at this level of warehouse architecture is
structured to meet the focused informational requirements of a distinct group
identified by a specific business function. The data at this level has also been
referred to as lightly summarized or departmental data.
Individually Structured: Data at this level is structured to meet an even more
focused set of informational requirements, as defined by a specific management
function. The data at this level has also been referred to as highly summarized
or individual data.

Copyright 2000 by William H. Inmon, all rights reserved

OTHER NAMES FOR OLAP


The increasing sophistication of the end users in meeting their informational or
decision support (DSS) processing requirements leads to OLAP processing. OLAP is
a natural extension of the data warehouse. Indeed, the departmentally structured level
of the data warehouse, from the earliest descriptions of the data warehouse
architecture, is ideal for addressing OLAP processing; this level of data is also called the
OLAP or data mart level of DSS processing. Figure 2 shows the different names for the
OLAP level of data processing.

THE SOURCE OF OLAP DATA


The OLAP level of data originates from the organizationally structured level of data in the
data warehouse. This detailed, historical data is the heart of the data warehouse and
forms a perfect foundation for the OLAP level of data. The organizationally structured
level of data is fed by the operational environment and, in turn, feeds the OLAP. Figure
3 shows the relationship between the organizationally structured level of data in the
data warehouse and the OLAP level of data.

Some of the characteristics of the OLAP level of data are:


Smallness: Compared to the organizationally structured level of data, far
less data resides in OLAP. As a rule there are two to three orders of magnitude
less data.
Flexibility: OLAP processing is much more flexible than the processing that occurs
at the organizationally structured level of data warehouse processing. OLAP is flexible
because there is much less data to contend with and because the software found at
the OLAP level is designed for flexibility, in contrast to the software managing
the organizationally structured level, which is designed to manage large amounts of
data.
Limited History: The OLAP environment rarely contains more than six months to a
year's worth of history. The organizationally structured level of data contains from
five to ten years' worth of data.
Customized: The OLAP environment is customized by department to suit the
particular needs of the organization that owns and manages it. The organizationally
structured data in the data warehouse is truly corporate data.
Pre-Categorized: Departmentally structured data in the OLAP environment is
usually organized into pre-defined categories to facilitate the informational
requirements of a specific department, while the data in the organizationally
structured level of the data warehouse maintains all of the categories required for the
entire corporate structure.
Source: The source of OLAP data is the detailed data found in the organizationally
structured level of the data warehouse. The source of data for the organizationally
structured level is the operational environment.
DIFFERENCES BETWEEN OLAP AND ORGANIZATIONALLY STRUCTURED DW DATA
There are many significant differences between the departmentally structured (OLAP)
and the organizationally structured levels of the data warehouse.
One of the most important aspects of the OLAP environment is that it is customized by
department. Figure 4 shows that different instances of the OLAP environment can exist
for different departments.

In Figure 4 there is an OLAP instance for finance, a separate OLAP instance for
accounting, and yet another separate OLAP instance for marketing, all originating from
the organizationally structured level of the data warehouse.
Figure 5 shows the methods by which the customization for a department from data in
the organizationally structured level of the data warehouse to the OLAP level can be
achieved.

Departmental customization can take many forms, such as:


Subsets - Finance will select some detailed data, while marketing will select other
detailed data.
Aggregations - Accounting will summarize their data one way, while finance will
summarize theirs another way. These different approaches may apply to different
data being summarized, to different ways in which the aggregated results are
calculated, or to different sets of categories by which the aggregated data is
organized.
Supersets - One department will denormalize their OLAP data by joining data from
tables A and B, while another department will join data from tables B and C.
Indexing - One department will index their data on keys ABC and BCD, while another
department will index the same data on keys CDE and DEF, and so forth, to provide
more optimal search paths that meet their different departmental requirements for
informational processing.
Derivations - A department may want a particular metric precomputed and the
results stored in their OLAP environment. A similar metric may be stored at the
organizationally structured level, but the department wants to compare their
department-specific calculation to the organization-standard one.
Arrays - In order to make the data in their OLAP environment more useful, a
department may opt to create an array of data to assist them in their informational
goals. For example, data that is stored one record per month in the organizationally
structured detail may be required as an array of 13 months to represent a contiguous
year and facilitate current-year-previous-month analysis.
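
Three of the customization forms listed above (subsets, aggregations, and arrays) can be illustrated with a minimal sketch. The record layout, field names, and values below are assumptions made purely for illustration; they do not come from any particular warehouse, and the month array is shortened to three entries where the text describes thirteen.

```python
from collections import defaultdict

# Hypothetical corporate detail: one record per month, region, and product.
detail = [
    {"month": "2000-01", "region": "East", "product": "A", "amount": 100.0},
    {"month": "2000-01", "region": "West", "product": "B", "amount": 250.0},
    {"month": "2000-02", "region": "East", "product": "A", "amount": 120.0},
    {"month": "2000-02", "region": "West", "product": "A", "amount": 80.0},
]


def subset(rows, predicate):
    """Subset: keep only the detail rows one department cares about."""
    return [r for r in rows if predicate(r)]


def aggregate(rows, key):
    """Aggregation: sum amounts by a category chosen by the department."""
    totals = defaultdict(float)
    for r in rows:
        totals[r[key]] += r["amount"]
    return dict(totals)


def month_array(rows, months):
    """Array: turn one-record-per-month detail into a fixed-length list."""
    by_month = aggregate(rows, "month")
    return [by_month.get(m, 0.0) for m in months]


east_only = subset(detail, lambda r: r["region"] == "East")
by_product = aggregate(detail, "product")
trend = month_array(detail, ["2000-01", "2000-02", "2000-03"])
```

Each department would supply its own predicate, category, and list of months, while the detail underneath stays the same.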

There are as many ways to customize the data for a department as there are
departments. Indeed, a department is not limited to a single OLAP instance, but may
require several OLAP environments to meet all of the department's business
requirements for information.
The data that feeds the OLAP environment is the detailed data of the organizationally
structured level of the data warehouse. Because the data residing in the detailed portion
of the data warehouse is corporate, it is not optimized to suit the needs of any given
department.
One of the issues that naturally arise in the customization of the OLAP environments is
that of reconcilability. With each department taking its own perspective of the corporate
data found in the data warehouse, isn't there a problem with the loss of reconcilability of
data? The answer is no. The reason why there is no problem with reconcilability is
illustrated by the diagram in Figure 6.

Figure 6 shows that the same detailed data is looked at in many different ways by
different departments. Because all departments are operating from the same foundation
of detailed data, there is always reconcilability of data however the detailed data is
customized. In this way detailed data provides a very satisfactory foundation for OLAP
processing.
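
The reconcilability argument can be checked with a small sketch: two departments summarize the same detail along different categories, and both sets of totals add back up to the corporate total. The tuples and department names are hypothetical.

```python
# Hypothetical detail rows: (region, product, amount).
detail = [
    ("East", "A", 100),
    ("West", "B", 250),
    ("East", "B", 120),
    ("West", "A", 80),
]


def summarize(rows, position):
    """Sum the amount column, grouped by the column at the given position."""
    totals = {}
    for row in rows:
        key = row[position]
        totals[key] = totals.get(key, 0) + row[2]
    return totals


finance_view = summarize(detail, 0)    # finance groups by region
marketing_view = summarize(detail, 1)  # marketing groups by product
corporate_total = sum(amount for _, _, amount in detail)

# Both departmental views reconcile to the same corporate figure.
reconciles = (
    sum(finance_view.values()) == corporate_total
    and sum(marketing_view.values()) == corporate_total
)
```

The check holds only because both views are derived from the same detail; views built from separate sources would carry no such guarantee.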
Another way in which organizationally structured data provides a very good basis for
departmental OLAP processing is that the price of creating the detailed foundation needs
be paid only once. In other words, suppose an OLAP environment is to be created for the
finance planning department. It is no small task to create the proper detailed
foundation. But once the detailed foundation is created for the finance planning OLAP
effort, then the very same foundation can be used for the sales OLAP effort, for the
accounting OLAP effort, and so forth. Once the organizationally structured environment
is created, there is no further incremental cost to the usage of the detailed data found
therein. As many OLAP environments as desired can take advantage of the
organizationally structured data once built.
INDEXING THE TWO ENVIRONMENTS
One of the substantive differences between the OLAP and the detailed data warehouse
environment is that of indexing. Figure 7 shows that the OLAP environment can be
highly and generously indexed, while the detailed environment should be sparsely
indexed.

There may be as many as thirty or forty indexes in the OLAP environment while there
may be as few as two or three indexes in the detailed environment. There are several
reasons for this disparity in indexing. The first is in the volume of data found in the two
environments. Where there is a modest amount of data in an environment, such as the
OLAP environment, the luxury of having many indexes can be enjoyed. Where there is
an immodest amount of data, such as the organizationally structured environment, there
can only be a few indexes. But volume of data is not the only consideration in the
difference in indexing.
There is much direct end user access in the OLAP environment, while,
relatively speaking, there is little direct end user access to detailed data
(once the organizationally structured environment is mature). Because of the disparity in
direct end user access, there is a very real difference in the need for indexing in the
different environments.
One of the important distinctions made between the two environments is the end user
interface found at each level. Figure 8 shows the difference in interfaces.

Figure 8 shows that the OLAP interface is optimal for DSS access and analysis of
information. There is a direct end user interface to organizationally structured data but it
is a much cruder and much simpler interface. Much of the activity that occurs in the
organizationally structured data environment is the selection and gathering of data. Very
few analysts conduct detailed, heuristic analysis of data there. Those who do are
the corporate explorers, and they typically require more powerful and, correspondingly,
more difficult-to-use access tools. These analysts are typically very knowledgeable about the
organization's data, and are not only comfortable using procedural tools to interface with
the organizationally structured data, but discover things in this vast amount of detailed
data that were previously unknown or unsuspected.
EXPLORERS AND FARMERS
Because of the differences in the volume of data found in the two environments and the
difference in direct end user interfaces, there is a difference in the communities of
usage. As a rule, the organizationally structured data serves the explorer community
while the OLAP environment serves the farmer community. Figure 9 depicts these
differences.

The detailed data serves the explorer community because it is organizationally oriented
(corporate), supports random access, and because it is complete and historical. The
OLAP environment supports the farmer community because the data is customized
before the data is sent to the OLAP environment. In order to customize the data it is
necessary to know how the data is to be used, and it is the farmers of the world that are
able to foretell how data will be used.
There are some exceptions to this rule of the different communities of users. Because of
the limited amount of data found there, the large number of indexes, and the elegance
of the interface, some exploration can be done in the OLAP environment. But the OLAP
level exploration is cursory, looking at the broad picture, not the detailed one. For the
most part, the OLAP environment exists for and is optimal for the farmer community,
not the explorer community.
DRILL DOWN PROCESSING
One of the features of the OLAP environment is that it supports drill down processing.
Figure 10 shows the OLAP support of drill down processing.

There are two types of drill down processing that are relevant: inter-OLAP drill down
processing and OLAP-to-organizationally structured data drill down processing. Inter-OLAP
drill down processing is used to show the relationships of summarization between
the different instances of data within the OLAP environment.
A lower level of detail exists at the organizationally structured level of data, and that level
serves as a further level of drill down for the entire OLAP environment.
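
As a rough sketch of both kinds of drill down, the example below rolls hypothetical detail up to a regional summary, drills from one region to its stores (a lower summary within OLAP), and then to the underlying detail rows. The level names and figures are assumptions for illustration only.

```python
# Hypothetical detail held at the organizationally structured level.
detail = [
    {"region": "East", "store": "E1", "amount": 40},
    {"region": "East", "store": "E2", "amount": 60},
    {"region": "West", "store": "W1", "amount": 75},
]


def roll_up(rows, level):
    """Summarize amounts at the requested level."""
    totals = {}
    for r in rows:
        totals[r[level]] = totals.get(r[level], 0) + r["amount"]
    return totals


def drill_down(rows, level, value, next_level):
    """Break one summarized value down by the next lower level."""
    return roll_up([r for r in rows if r[level] == value], next_level)


by_region = roll_up(detail, "region")                         # highest summary
east_stores = drill_down(detail, "region", "East", "store")   # drill within OLAP
east_detail = [r for r in detail if r["region"] == "East"]    # drill to the detail
```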
OLAP WITH NO ORGANIZATIONALLY STRUCTURED DATA
One of the temptations the designer has in building the OLAP environment is to not build
the organizationally structured level of the data warehouse. It is a temptation to just
build OLAP immediately on top of the operational environment. After all, the
organizationally structured level of the data warehouse is:
expensive,
complex,
not easy to build.
Figure 11 depicts the problems associated with skipping the organizationally structured
level of the data warehouse in support of OLAP environments.

Building OLAP directly on top of the operational environment is a grave mistake for a
variety of reasons:
The operational environment is not designed to support integrated processing, but
the OLAP environment assumes data integration has been done somewhere prior to
it.
The operational environment contains only a limited amount of historical data. The
OLAP environment requires historical data.
Each OLAP environment must build its own customized interface to the operational
environment. The development effort to do this is not trivial.
Each OLAP environment puts a drag on the performance of the operational
environment. The collective drag of many OLAP environments is very significant.
For these reasons (and many more!) building the OLAP environment directly from the
operational environment is a very poor idea indeed.
METADATA FOR THE OLAP ENVIRONMENT
One of the more important aspects of the OLAP environment is that of metadata. OLAP
metadata is important because it is metadata that keeps track of what is in the OLAP
environment and where it came from. Upon doing an analysis or a new report, the end
user in the OLAP environment first turns to metadata in order to determine what data is
available as a basis for the analysis. Figure 12 shows metadata in the OLAP
environment.

The components of OLAP metadata are very similar to those found in the
organizationally structured level of the data warehouse. The OLAP components of
metadata include:
descriptive information about what is in the OLAP environment:
o content,
o structure,
o definition, etc.;
the source of the data (the organizationally structured data or external data);
the business and technical name of the data;

a description of the summarization, subset, superset, and/or denormalization
processes that describe the data's journey from the organizationally structured level
into the OLAP environment;
metrics that describe how much data of what type is found in the OLAP environment;
refreshment scheduling information, describing when data has been populated; and
modeling information, describing how the data in the OLAP environment relates to
the corporate data model and to the OLAP data model (if one exists).
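
One way to picture these components is as the fields of a catalog entry that an end user can search before starting an analysis. The sketch below is only an illustration: the class, its field names, and the sample entries are assumptions, not a description of any metadata product.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class OlapMetadataEntry:
    """One catalog entry; every field name here is an illustrative assumption."""
    business_name: str
    technical_name: str
    definition: str
    source: str            # organizationally structured data or external data
    derivation: str        # summarization, subset, superset, denormalization
    row_count: int         # metric: how much data is held
    last_refreshed: date   # refreshment information
    model_entity: str      # related entity in the corporate data model
    scope: str = "local"   # "local" to one department, or "global"


catalog = [
    OlapMetadataEntry(
        business_name="Monthly Sales by Region",
        technical_name="fin_sales_region_m",
        definition="Sum of invoiced sales per region per calendar month",
        source="organizationally structured level",
        derivation="summarization of sales detail by region and month",
        row_count=480,
        last_refreshed=date(2000, 1, 31),
        model_entity="SALE",
    ),
    OlapMetadataEntry(
        business_name="Department Cross-Reference",
        technical_name="glb_dept_xref",
        definition="How departmental OLAP instances relate to each other",
        source="organizationally structured level",
        derivation="subset of organization detail",
        row_count=12,
        last_refreshed=date(2000, 1, 31),
        model_entity="DEPARTMENT",
        scope="global",
    ),
]


def search(entries, term):
    """Return entries whose business name or definition mentions the term."""
    needle = term.lower()
    return [
        e for e in entries
        if needle in e.business_name.lower() or needle in e.definition.lower()
    ]


matches = search(catalog, "sales")
```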

Metadata in the OLAP environment is somewhat more complex than metadata found
elsewhere in the data warehouse environment because there is a need for a specialized
kind of metadata in the OLAP environment. Figure 13 illustrates the need for a unique
kind of metadata in the OLAP environment.

There is a need for both local and global metadata in the OLAP environment. Local OLAP
metadata relates immediately to the department that the OLAP instance serves. There
might be financial planning OLAP metadata, sales OLAP metadata, and marketing OLAP
metadata. At the same time there is a need for OLAP metadata that is global to the
OLAP environment. Global OLAP metadata might include descriptions of how different
departments relate to each other, how different sources of data differ, how data might
flow from one OLAP environment to another, and so forth.
As with metadata for any and all aspects of the entire data warehouse architecture,
OLAP metadata needs to be supported as an interactive part of the process of the OLAP
environment; an integral, important aspect of the OLAP environment, not an
afterthought.
One of the important uses of metadata in the OLAP environment is that it can (indeed,
should) be used interactively in the query process. Once the end user has examined
OLAP metadata to determine what the possibilities are, then the user is able to use the
metadata interactively in the query process.
MOVING DATA INTO OLAP
Moving data into the OLAP environment from the detailed data warehouse environment
is a nontrivial task. Figure 14 shows the positioning of the program required for the
transport of data into the OLAP environment.

There are several functions that are accomplished in this movement of data. These
functions are not necessarily mutually exclusive, and include:
selection of a subset of detailed data,
summarization of detailed data,
customization of the detailed data into a departmental format,
precategorization of detailed data to meet departmental requirements,
creation of supersets by merging and joining of detailed data,
creation or update of arrays of detailed data, and so forth.
Some of the issues that must be resolved in the creation and execution of the program
that feeds the OLAP environment from the detailed data warehouse environment are:
frequency of refreshment,
efficiency with which the detailed data is read (acquired),
amount of detailed data to be acquired,
platform on which sorting, joining, merging, etc., are performed,
ability to know what data already resides in the OLAP environment so that the same
record is not (unintentionally) created twice and
unnecessary records are not created,
whether data once processed is to be appended or updated into the OLAP
environment, and so forth.
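
Two of the issues above, avoiding duplicate records and choosing between append and update, can be sketched with a keyed load. This is a minimal illustration that assumes every record carries a unique key; the function name, the `mode` parameter, and the record layout are hypothetical.

```python
def load_records(target, incoming, mode="update"):
    """Load incoming records into target, a dict keyed by record key.

    mode="append": add only keys not already present (existing rows are kept).
    mode="update": add new keys and overwrite rows whose key already exists.
    Returns the number of records written.
    """
    if mode not in ("append", "update"):
        raise ValueError("mode must be 'append' or 'update'")
    written = 0
    for record in incoming:
        key = record["key"]
        if key in target and mode == "append":
            continue  # already loaded; do not create it twice
        target[key] = record
        written += 1
    return written


olap_table = {}
batch = [
    {"key": ("East", "2000-01"), "total": 100},
    {"key": ("West", "2000-01"), "total": 250},
]

first = load_records(olap_table, batch, mode="append")
second = load_records(olap_table, batch, mode="append")  # rerun of the same batch
corrected = [{"key": ("East", "2000-01"), "total": 110}]
third = load_records(olap_table, corrected, mode="update")
```

Rerunning the same batch in append mode writes nothing, so an accidental second run of the load program does not create duplicates.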
One of the interesting aspects of the program that loads the OLAP environment from the
organizationally structured data warehouse environment is its mutability. The
organizationally structured/OLAP interface is an unstable interface because the OLAP
environment supports informational processing, and informational processing inherently
implies instability because of the exploratory nature of how it is utilized. For this reason,
the interface needs to be as flexible as possible, because maintenance of the interface will
be an everyday occurrence.
Another issue of the program that loads the OLAP environment from the organizationally
structured data warehouse environment is the efficiency of operation. The first issue of
efficiency is that of simple access of organizationally structured data. Assuming indexes
are used wisely, the next issue is that of combining acquisition programs. Instead of
every OLAP instance having its own data acquisition program, if the same detailed data
is going to be accessed by more than one OLAP environment, then there needs to be a
single program acquiring organizationally structured data that feeds all the OLAP
instances that must be supported. By having a single pass done against the
organizationally structured environment, very efficient OLAP data acquisition processing
can be accomplished.
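
A single acquisition pass that feeds several OLAP instances might look like the sketch below. The feed definitions (a selector and a transform per instance) are an assumed design for illustration, not something prescribed by the text.

```python
def feed_olap_instances(detail_rows, feeds):
    """Read the detail once; route each row to every instance that selects it.

    feeds maps an instance name to a (selector, transform) pair.
    """
    loaded = {name: [] for name in feeds}
    for row in detail_rows:  # the only pass over the detailed data
        for name, (selector, transform) in feeds.items():
            if selector(row):
                loaded[name].append(transform(row))
    return loaded


detail = [
    {"id": 1, "region": "East", "amount": 100},
    {"id": 2, "region": "West", "amount": 250},
    {"id": 3, "region": "East", "amount": 40},
]

feeds = {
    "finance": (
        lambda r: r["amount"] >= 100,
        lambda r: {"id": r["id"], "amount": r["amount"]},
    ),
    "marketing": (
        lambda r: r["region"] == "East",
        lambda r: {"id": r["id"], "region": r["region"]},
    ),
}

# Passing an iterator shows that the detail is consumed only once.
loaded = feed_olap_instances(iter(detail), feeds)
```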
DATA MODELING FOR OLAP
The OLAP environment may or may not have a data model built for it, as shown in
Figure 15.

The usage of a data model in the OLAP environment is questionable because the OLAP
environment is subject to change at a moment's notice. The high degree of flexibility of
the OLAP environment is such that some types of data and results are created and
destroyed faster than they can be modeled. On the other hand, some of the data in the
OLAP environment is very stable and in fact should be modeled. Whether a model is
applicable or not depends on the kind of data that is being considered. There are several
important kinds of data found in the OLAP environment:
permanent detailed data,
nonpermanent detailed data,
static summary data, and
dynamic summary data.

Figure 16 shows the different types of data found in the OLAP environment.

Permanent detailed data is data that comes from the organizationally structured level
and is regularly and normally needed in OLAP processing. Permanent detailed data will
be detailed from the standpoint of the department that owns the OLAP platform. In
actuality, the OLAP permanent detailed data may well be summarized as it passes from
the organizationally structured level of the data warehouse into the OLAP environment.
In that respect, what is detailed in any one instance of the OLAP environment may be
summarized from the perspective of the corporate DSS analyst. Referring back to Figure
1, the data warehouse architecture supports maintaining the appropriate level of detail
and summarization to support the informational requirements of the entire organization,
as well as the different functional requirements of different departments within the
organization.
The second kind of data found in the OLAP environment is nonpermanent detailed data.
Nonpermanent detailed data is data that is brought into the OLAP environment on
a one-time-only or temporary basis. Nonpermanent data is used for special reports
and analyses. The data model for the OLAP environment applies to permanent detailed
data and does not apply to nonpermanent detailed data.
Static summary data is that data that can be recalculated repeatedly with the same
result, regardless of when the calculation is made. Nearly all of the data that is
summarized in the OLAP environment is static. As such, a data model can and should be
created that identifies the static summary data that belongs in the OLAP environment.
The farmers that constitute the OLAP community will tell the data modeler what
summarized data is needed. The database administrator (DBA), or whoever is
responsible for monitoring the activity against the data warehouse, will also be able to
provide input to the data modeler as to what detailed data should be summarized and
how, based on patterns of utilization.

There normally is some small amount of dynamic summary data that is found in the
OLAP environment. Three of the most common occurrences that affect whether or not
dynamic data will be found in the OLAP environment are:
1. changes in how the department wishes to manipulate OLAP data,
2. corrections to detailed data in the organizationally structured environment, and
3. changes in how the different levels of aggregation of a complex categorization are
organized.
One of the characteristics of the OLAP environment is that it is flexible. Some
departments, especially those engaged in what-if types of analyses such as marketing,
have extremely dynamic requirements for information. The OLAP environment,
representing the departmentally structured level of the data warehouse architecture, can
react and respond to these frequently and often radically changing requirements without
requiring a change to the underlying organizationally structured level of the data
warehouse.
While the data in the data warehouse is defined as being nonvolatile, there are
circumstances where, for whatever reason, organizationally structured data must be
corrected. The most common cause of these corrections is business processing rules
that fall outside of the business rules used to trigger data acquisition for the data
warehouse. While these situations usually do not have a significant impact on the data
customized in the OLAP environment, there may be exceptions.
The other, more common reason for summary data to be considered dynamic is changes
to the structure of complex categories. A department (or even the entire organization)
may have a business requirement to analyze historical data based on the new method of
organizing a category, such as sales or product hierarchies. If the data stored in the
organizationally structured level of the data warehouse is at the appropriate level of
detail, then resummarizing this data will not present a problem. The actual processing
of the resummarization may be considerable, but the ability to meet this requirement of
turning data into information will exist.
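
Resummarizing under a reorganized hierarchy can be sketched as follows, assuming the detail is kept at product level. The product codes and category names are hypothetical.

```python
# Hypothetical product-level detail: (product, amount).
sales_detail = [("P1", 10), ("P2", 20), ("P3", 30)]

# The same products, categorized before and after a reorganization.
old_hierarchy = {"P1": "Hardware", "P2": "Hardware", "P3": "Software"}
new_hierarchy = {"P1": "Hardware", "P2": "Services", "P3": "Software"}


def summarize(rows, hierarchy):
    """Sum amounts by the category each product maps to."""
    totals = {}
    for product, amount in rows:
        category = hierarchy[product]
        totals[category] = totals.get(category, 0) + amount
    return totals


before = summarize(sales_detail, old_hierarchy)
after = summarize(sales_detail, new_hierarchy)
```

Because the detail is stored below the level of the hierarchy, history can be restated under the new categories, and the grand total is unchanged.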
PHYSICAL DESIGN OF THE OLAP ENVIRONMENT
The data model that is created for the OLAP database design leads to a physical design.
The basis of physical design is a combination of properly normalized data and the star
schema. For those entities of data that are infrequently occurring, the data model and
normalization serve as the basis for physical design. For those entities that are
frequently occurring, the star schema serves as a basis for physical design. Figure 17
shows a star schema.

The star schema shown in Figure 17 has several components. At the center of the star
schema is the fact table. The fact table represents the entity that is most populous in
terms of data occurrences. The data in the fact table is made up of data elements from
the organizationally structured level that are additive in nature; that is, the values in
these data elements can be summed in a variety of ways without jeopardizing the
integrity of the data. Surrounding the fact table are the dimension tables. The
dimension tables are where constant related data are stored. The data in dimension
tables are descriptive in nature, not additive. There is a prejoined foreign key
relationship relating the fact table to the dimension tables. The purpose of the fact table
is to streamline the information processing that must make use of the numerous
occurrences of data found in the fact table.
The remainder of the data (i.e., the non-populous entities) makes use of the classical
data model as a basis for physical design.
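
A toy star schema makes the roles of the fact and dimension tables concrete. The tables, keys, and measures below are invented for illustration, and plain dictionaries stand in for database tables.

```python
# Dimension tables: descriptive attributes, keyed by an identifier.
product_dim = {
    1: {"name": "Widget", "category": "Hardware"},
    2: {"name": "Gadget", "category": "Hardware"},
    3: {"name": "Manual", "category": "Documentation"},
}
store_dim = {
    10: {"city": "Denver"},
    11: {"city": "Boston"},
}

# Fact table: many rows, each holding foreign keys plus additive measures.
sales_fact = [
    {"product_id": 1, "store_id": 10, "units": 5, "revenue": 50},
    {"product_id": 2, "store_id": 10, "units": 2, "revenue": 60},
    {"product_id": 3, "store_id": 11, "units": 4, "revenue": 20},
    {"product_id": 1, "store_id": 11, "units": 3, "revenue": 30},
]


def total_by(fact_rows, foreign_key, dimension, attribute, measure):
    """Sum an additive measure, grouped by a descriptive dimension attribute."""
    totals = {}
    for row in fact_rows:
        label = dimension[row[foreign_key]][attribute]
        totals[label] = totals.get(label, 0) + row[measure]
    return totals


revenue_by_category = total_by(
    sales_fact, "product_id", product_dim, "category", "revenue"
)
units_by_city = total_by(sales_fact, "store_id", store_dim, "city", "units")
```

The same fact rows are summed along either dimension without changing the measures, which is what additive means here.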
THE ORDER OF BUILDING THE COMPONENTS
There is a predictable order in which the various components of the architecture are
built. Figure 18 shows that order.

The organizationally structured level of the data warehouse is the first component of the
architecture that is built and populated. The organizationally structured data begins its
journey in the operational environment. From the operational environment, the data is
transformed and integrated. The organizationally structured portion of the data
warehouse is then populated. After a serious amount of detailed data has been
accumulated, the OLAP environment (departmentally structured) is begun. Only after a
significant amount of organizationally structured data has been gathered does it
make sense to start to build the departmentally structured environment.
The OLAP environment is populated by the internal data of the organization, as well as
external data sources, as seen in Figure 19.

EXTERNAL DATA AND OLAP


External data can come from any number of sources. It may be fed directly into the
OLAP environment or may be fed into the organizationally structured environment where
it can then be passed along to the OLAP environment. When external data is fed directly
into the OLAP environment, the implication is that there is no other corporate use of it
outside of the department that controls that OLAP instance (Figure 19.1).

When there is a corporate need for that external data, then the data is fed into the
organizationally structured portion of the data warehouse where it is then available to
any instance in the OLAP environment (Figure 19.2). This design may be the result of an
up-front decision, or may evolve over time from the previous example of just populating
an OLAP instance with external data.

External data may undergo some amount of refinement before being placed in the OLAP
environment. Some of the typical refinements include:
editing fields,
removing selected records,
joining records to other data,
summarizing external data, etc.
PLATFORMS AND OLAP
One of the important aspects of the data warehouse organizationally structured/OLAP
relationship is that there are many significant differences between the two
environments. If there are very stark differences, the two environments are best placed
on different platforms, as seen in Figure 20.

Figure 20 shows that there are more differences than just platform between the two
environments - there could be differences in which DBMS best supports each, in budget,
and in the type and number of end users. As a rule, the expenditures for the creation
and management of organizationally structured data warehouse data are placed in the
corporate IS budget, while the OLAP expenditures should be placed in each of the
departmental budgets that have a need for a departmentally structured environment.
Organizationally structured data users, for the most part, are the corporate explorers
who need to get at raw corporate data. The users of the OLAP environment are the
departmental analysts who have a parochial interest and perspective of the data found
in their OLAP instances. There are mostly farmers at the OLAP level.
When the organizationally structured level of the data warehouse is small, it is probably
most cost-effective to combine the organizationally structured data of the data
warehouse and the OLAP environment together onto a single platform. Once the
organizationally structured data blossoms into significant volumes, there is no real
possibility that the two environments can be subsumed into the same physical
environment.
In most cases, separating the organizationally structured and OLAP environments onto
separate platforms is an evolutionary process. An organization will begin with a single
physical implementation of its data warehouse environment, including the
organizationally structured and OLAP levels. Then, over time, various factors, each
present to a varying degree, will force the separation of the two
environments onto different physical platforms. Sometimes these factors can be
predicted during the design of the data warehouse, and consequently the overall design
takes into consideration a certain degree of physical separation between organizationally
structured and OLAP data and processing. Whether or not the initial implementation of the
data warehouse includes OLAP processing, and whether the two levels start out separated
or sharing a single platform, most data warehouses eventually evolve to the
point of requiring that the organizationally structured and departmentally structured
levels have their own dedicated platforms. Some of the factors that affect the decision
to separate or not to separate include:
Size: The size of the organizationally structured level of the data warehouse,
whether initially or through the natural growth process of a data warehouse (addition
of and changes to subject areas, mushrooming end user requirements for OLAP
capabilities, etc.), may require all of the resources of a given hardware platform.
Likewise, there may be real physical limitations in the RDBMS chosen to maintain
the organizationally structured level, requiring that any OLAP instances be housed
elsewhere.
Performance: The primary responsibility of the organizationally structured level of
the data warehouse is to maintain integrated data from a variety of sources in a
manner that facilitates informational processing for the entire organization. The
performance of fulfilling this requirement may be threatened by end user access to
the OLAP environment, or vice versa.
Number of Departments: The sheer number of departments requiring OLAP
capabilities may be mathematically impossible to support on a single platform.
Volatility: The kinds of changes to the underlying data structure of an
organizationally structured level necessary to support the information requirements of
the organization may be too disruptive to the OLAP environments, affecting their
stability and performance.
Geographic Location: For a distributed or decentralized organization, it may be
more efficient to physically locate an OLAP environment at geographically diverse
locations. These could be across the street, across the state, nation or world from the
organizationally structured level, or any combination thereof.
User Autonomy: It is not uncommon for end users to complain about sharing
resources with other departments, claiming that it negatively impacts their
performance. True or not, perception is reality, and in order for the users to feel that
they can turn their data into information they may require a physically separate
environment that they can call their own. There may also be security or other
confidentiality issues requiring that some OLAP data be physically separated from the
rest of the data warehouse environment. Whatever the reason, budgetary
independence plays a significant role in whether a separate OLAP environment can be
made available to these departments.
Platform Considerations: Some applications of OLAP processing capabilities may
need to take advantage of unique or physically diverse platforms. A multidimensional
database and/or server, rather than a multidimensional tool accessing a relational
database, may be the appropriate configuration for a given group, and that choice
requires separation.

For these reasons and because of the differences in budget and control of the different
environments, there usually is no problem with having separate platforms for the
different environments. Figure 20.1 represents a platform shared by both an
organizationally structured environment and an OLAP environment, while Figure 20.2
shows them separated. It is important to note that not all instances of an OLAP
environment within an organization's data warehouse will warrant a separate physical
environment. In some cases one or more OLAP instances must or should be physically
separate, while one or more others can continue to function on the same physical
platform as the organizationally structured level of the data warehouse.

OLAP AND THE WORKLOAD


One of the interesting benefits of building an OLAP environment is the
redistribution of workload. Before, the workload consists only of data access (queries)
against the organizationally structured data. Afterward, it is a combination of data
acquisition from the organizationally structured environment into the OLAP environment
and data access against both. Figure 21 depicts
these differences before and after the OLAP environment is built.

When there is no OLAP environment all queries MUST be run in the organizationally
structured environment. There simply is no other choice. Running all queries in the
organizationally structured environment may be no problem as long as there is not
much data there, or as long as not much processing is
occurring there.
But the instant that there is much data in the organizationally structured environment or
considerable processing against that environment to facilitate data access, then the
need for the OLAP environment becomes apparent. Some of the factors that accelerate
the need for an OLAP environment to facilitate more efficient data access in the quest to
turn data into information include:
preaggregating data for better performance,
precategorizing data to enhance understanding and usability by end users,
standardizing calculation, metrics and other derived data to ensure accuracy
throughout the organization,
one access point (organizationally structured data) vs. many (OLAP data), and so
forth.
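
As a minimal sketch of the first and third factors above, the following Python fragment pre-aggregates a few detail rows into a departmental summary and applies one shared margin calculation. The row layout, the sample values, and the names `preaggregate` and `gross_margin` are hypothetical, chosen only for illustration; they do not come from any particular product.

```python
from collections import defaultdict

# Hypothetical detail rows as they might sit at the organizationally
# structured level: (region, month, revenue, cost).
DETAIL_ROWS = [
    ("East", "2000-01", 100.0, 60.0),
    ("East", "2000-01", 50.0, 30.0),
    ("East", "2000-02", 80.0, 50.0),
    ("West", "2000-01", 200.0, 150.0),
]


def gross_margin(revenue, cost):
    """One shared definition of margin, so every department computes it the same way."""
    return (revenue - cost) / revenue if revenue else 0.0


def preaggregate(rows):
    """Roll detail rows up to (region, month) totals once, ahead of query time."""
    totals = defaultdict(lambda: [0.0, 0.0])
    for region, month, revenue, cost in rows:
        totals[(region, month)][0] += revenue
        totals[(region, month)][1] += cost
    return {
        key: {"revenue": rev, "cost": cost, "margin": gross_margin(rev, cost)}
        for key, (rev, cost) in totals.items()
    }


summary = preaggregate(DETAIL_ROWS)
```

A departmental query then reads `summary[("East", "2000-01")]` instead of rescanning the detail rows, and every consumer of the summary sees the same margin definition.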
Once the OLAP environment is created, the bulk of the queries are executed away from
the organizationally structured environment. Note that not all queries are shifted to the
OLAP environment. Even in the most mature DSS environments, there are always a
number of queries that simply cannot be done outside the organizationally structured
environment of the data warehouse. Corporate data explorers must certainly have
access to this detailed data. The level of detail or type of data that is needed is such that
ONLY queries at the organizationally structured level will suffice.
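
The split described here can be pictured as a routing decision based on the level of detail a query needs. The sketch below is only an illustration under stated assumptions: the list of grains, the assumption that a department's OLAP instance holds data down to the month level, and the `route_query` function are all hypothetical.

```python
# Hypothetical grains, ordered from most summarized to most detailed.
GRAINS = ["year", "month", "day", "transaction"]

# Assumption: this department's OLAP instance holds data down to the month level.
OLAP_FINEST_GRAIN = "month"


def route_query(requested_grain, olap_finest_grain=OLAP_FINEST_GRAIN):
    """Pick the environment able to answer a query at the requested grain."""
    if requested_grain not in GRAINS:
        raise ValueError(f"unknown grain: {requested_grain}")
    # Anything at or above the OLAP instance's finest grain stays in OLAP.
    if GRAINS.index(requested_grain) <= GRAINS.index(olap_finest_grain):
        return "olap"
    # Finer detail exists only at the organizationally structured level.
    return "organizationally_structured"
```

Under these assumptions a monthly trend query is answered in the OLAP environment, while an explorer asking for individual transactions is sent to the organizationally structured level.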
The shift to the OLAP environment from the organizationally structured environment has
much to be said for it:
it is economical,
it is highly flexible,
it allows customization of data to occur for a given department,
it takes advantage of different software to meet different requirements, residing on
the OLAP platform,
it allows significant portions of data to be isolated,
it allows subsets of data to be isolated, and so forth.
SUMMARY
The OLAP environment is sometimes called the data mart, or the departmental, lightly
summarized, or departmentally structured level of the data warehouse. The OLAP
environment is customized for the department that it serves. A subset of the detailed
data maintained in the organizationally structured level of the data warehouse is placed
in the OLAP environment, usually undergoing some form of pre-processing
(summarization, denormalization, etc.) as it moves from the organizationally structured
level. There can be many OLAP environments, at least one for each department needing
to do OLAP processing in meeting their objective of turning data into information. The
organizationally structured level of the data warehouse serves as a basis of
reconcilability for many departments that do OLAP processing.
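
One way to read "basis of reconcilability" is that any figure reported from a departmental OLAP instance can be re-derived from the detail held at the organizationally structured level. A minimal sketch of such a check follows; the sample data, the tolerance, and the `reconcile` function are hypothetical.

```python
# Hypothetical detail at the organizationally structured level: (department, amount).
DETAIL = [
    ("sales", 120.0),
    ("sales", 80.0),
    ("finance", 40.0),
]

# Hypothetical totals as reported by each department's OLAP instance.
OLAP_TOTALS = {"sales": 200.0, "finance": 45.0}


def reconcile(detail, olap_totals, tolerance=0.01):
    """Return departments whose OLAP total differs from the detail-derived total."""
    derived = {}
    for department, amount in detail:
        derived[department] = derived.get(department, 0.0) + amount
    mismatches = {}
    for department, reported in olap_totals.items():
        expected = derived.get(department, 0.0)
        if abs(reported - expected) > tolerance:
            # Record (reported by OLAP, derived from detail) for follow-up.
            mismatches[department] = (reported, expected)
    return mismatches


mismatches = reconcile(DETAIL, OLAP_TOTALS)
```

Here the sales figure reconciles, while the finance figure is flagged because it disagrees with the detail it was supposedly built from.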
The OLAP environment is highly indexed, in contrast to the organizationally structured
environment, which is sparsely indexed. The OLAP environment entails an elegant
interface, as opposed to the crude (or virtually nonexistent) interface to the detailed
data in the organizationally structured level. Drill-down processing is an integral part of
the OLAP environment. Building the OLAP environment directly from the operational
environment without first building the organizationally structured level of the data
warehouse is patently a mistake.
An important aspect of the OLAP environment is metadata. There are two types: local
and global OLAP metadata. The data model is used for the design of some of the data
found in the OLAP environment, while observation of how the organizationally structured
data is utilized provides other important insights into the OLAP design. Physical database
design centers on classical normalization and the star schema.
External data as well as internal data can be included in the OLAP environment.
