Professional Documents
Culture Documents
[hide]This article has multiple issues. Please help improve it or discuss these issues on the talk pag
This article needs additional citations for verification. (February 2008)
This article has an unclear citation style. (September 2013)
1 Types of systems
2 Software tools
3 Benefits
5 History
6 Information storage
6.1 Facts
7 Design methodologies
10 See also
11 References
12 Further reading
13 External links
Types of systems[edit]
Data mart
A data mart is a simple form of a data warehouse that is focused
on a single subject (or functional area), such as sales, finance or
marketing. Data marts are often built and controlled by a single
department within an organization. Given their single-subject
focus, data marts usually draw data from only a few sources.
The sources could be internal operational systems, a central
data warehouse, or external data.[1] Denormalization is the norm
for data modeling techniques in this system.
Online analytical processing (OLAP)
OLAP is characterized by a relatively low volume of
transactions. Queries are often very complex and involve
aggregations. For OLAP systems, response time is an
effectiveness measure. OLAP applications are widely used
by Data Mining techniques. OLAP databases store aggregated,
historical data in multi-dimensional schemas (usually star
schemas). OLAP systems typically have data latency of a few
hours, as opposed to data marts, where latency is expected to
be closer to one day.
Online Transaction Processing (OLTP)
OLTP is characterized by a large number of short on-line
transactions (INSERT, UPDATE, DELETE). OLTP systems
emphasize very fast query processing and maintaining data
integrity in multi-access environments. For OLTP systems,
effectiveness is measured by the number of transactions per
second. OLTP databases contain detailed and current data. The
schema used to store transactional databases is the entity
model (usually 3NF).[2] Normalization is the norm for data
modeling techniques in this system.
Predictive analysis
Software tools[edit]
The typical extract-transform-load (ETL)-based data
warehouse uses staging, data integration, and access
layers to house its key functions. The staging layer or
staging database stores raw data extracted from each
of the disparate source data systems. The integration
layer integrates the disparate data sets by transforming
the data from the staging layer often storing this
transformed data in an operational data store (ODS)
database. The integrated data are then moved to yet
another database, often called the data warehouse
database, where the data is arranged into hierarchical
groups often called dimensions and into facts and
aggregate facts. The combination of facts and
dimensions is sometimes called a star schema. The
access layer helps users retrieve data.[4]
This definition of the data warehouse focuses on data
storage. The main source of the data is cleaned,
transformed, cataloged and made available for use by
managers and other business professionals for data
mining, online analytical processing, market
research and decision support.[5] However, the means
to retrieve and analyze data, to extract, transform and
load data, and to manage the data dictionary are also
considered essential components of a data
warehousing system. Many references to data
warehousing use this broader context. Thus, an
expanded definition for data warehousing
includes business intelligence tools, tools to extract,
transform and load data into the repository, and tools to
manage and retrieve metadata.
Benefits[edit]
A data warehouse maintains a copy of information from
the source transaction systems. This architectural
complexity provides the opportunity to :
History[edit]
The concept of data warehousing dates back to the late
1980s[7] when IBM researchers Barry Devlin and Paul
Murphy developed the "business data warehouse". In
essence, the data warehousing concept was intended
to provide an architectural model for the flow of data
from operational systems to decision support
environments. The concept attempted to address the
various problems associated with this flow, mainly the
high costs associated with it. In the absence of a data
warehousing architecture, an enormous amount of
redundancy was required to support multiple decision
support environments. In larger corporations it was
typical for multiple decision support environments to
operate independently. Though each environment
served different users, they often required much of the
same stored data. The process of gathering, cleaning
and integrating data from various sources, usually from
long-term existing operational systems (usually referred
1975 Sperry
Univac introduces MAPPER (MAintain, Prepare,
and Produce Executive Reports) is a database
management and reporting system that includes
the world's first 4GL. First platform designed for
building Information Centers (a forerunner of
contemporary Enterprise Data
Warehousing platforms)
1995 The Data Warehousing Institute, a forprofit organization that promotes data warehousing,
is founded.
Information storage[edit]
Facts[edit]
A fact is a value or measurement, which represents a
fact about the managed entity or system.
Facts as reported by the reporting entity are said to be
at raw level. E.g. if a BTS (Business Transformation
Service) received 1,000 requests for traffic channel
allocation, it allocates for 820 and rejects the remaining
tch_req_total = 1000
tch_req_success = 820
tch_req_fail = 180
Design methodologies[edit]
This section needs additional citations for verification. Please help impr
article by adding citations to reliable sources. Unsourced material may be
removed. (July 2015)
Bottom-up design[edit]
In the bottom-up approach, data marts are first created
to provide reporting and analytical capabilities for
specific business processes. These data marts can
then be integrated to create a comprehensive data
warehouse. The data warehouse bus architecture is
primarily an implementation of "the bus", a collection
of conformed dimensions and conformed facts, which
are dimensions that are shared (in a specific way)
between facts in two or more data marts.
Top-down design[edit]
The top-down approach is designed using a normalized
enterprise data model. "Atomic" data, that is, data at
the lowest level of detail, are stored in the data
warehouse. Dimensional data marts containing data
needed for specific business processes or specific
departments are created from the data warehouse. [15]
Hybrid design[edit]
Data warehouses (DW) often resemble the hub and
spokes architecture. Legacy systems feeding the
warehouse often include customer relationship
management andenterprise resource planning,
generating large amounts of data. To consolidate these
various data models, and facilitate the extract transform
load process, data warehouses often make use of
an operational data store, the information from which is
parsed into the actual DW. To reduce data redundancy,
larger systems often store the data in a normalized
way. Data marts for specific reports can then be built on
top of the DW.
The DW database in a hybrid solution is kept on third
normal form to eliminate data redundancy. A normal
relational database, however, is not efficient for
business intelligence reports where dimensional
modelling is prevalent. Small data marts can shop for
See also[edit]
Accounting intelligence
Anchor Modeling
Business intelligence
Data integration
Data mart
Data mining
Data scraping
Semantic warehousing
Snowflake schema
Software as a service
Star schema
References[edit]
1.
Jump
up^ http://docs.oracle.com/html/E
10312_01/dm_concepts.htm Data
Mart Concepts
2.
Jump
up^ http://datawarehouse4u.info/
OLTP-vs-OLAP.html OLTP vs
OLAP
3.
Jump
up^ http://olap.com/category/bisolutions/predictive-analytics Pred
ictive Analysis
4.
6.
7.
8.
9.
Further reading[edit]
External links[edit]
Data
NDL: 00911488
Authority control
Categories:
Business intelligence
Data management
Data warehousing
Navigation menu
Create account
Not logged in
Talk
Contributions
Log in
Read
Edit
View history
Go
Main page
Contents
Featured content
Current events
Random article
Donate to Wikipedia
Wikipedia store
Interaction
Help
About Wikipedia
Community portal
Recent changes
Contact page
Article
Talk
Tools
Related changes
Upload file
Special pages
Permanent link
Page information
Wikidata item
Create a book
Download as PDF
Printable version
Languages
Azrbaycanca
Catal
etina
Dansk
Deutsch
Espaol
Franais
Hrvatski
Bahasa Indonesia
Italiano
Latvieu
Lietuvi
Lumbaart
Magyar
Nederlands
Norsk bokml
Polski
Portugus
Romn
Slovenina
Svenska
Trke
Ting Vit
Edit links
Privacy policy
About Wikipedia
Disclaimers
Contact Wikipedia
Developers
Mobile view
Skip Headers
Oracle9i Data Warehousing Guide
Release 2 (9.2)
Part Number A96520-01
1
Data Warehousing Concepts
This chapter provides an overview of the Oracle data warehousing implementation. It includes:
Note that this book is meant as a supplement to standard texts about data warehousing. This book
focuses on Oracle-specific material and does not reproduce in detail material of a general nature.
Two standard texts are:
The Data Warehouse Toolkit by Ralph Kimball (John Wiley and Sons, 1996)
Building the Data Warehouse by William Inmon (John Wiley and Sons, 1996)
A data warehouse is a relational database that is designed for query and analysis rather than for
transaction processing. It usually contains historical data derived from transaction data, but it can
include data from other sources. It separates analysis workload from transaction workload and
enables an organization to consolidate data from several sources.
In addition to a relational database, a data warehouse environment includes an extraction,
transportation, transformation, and loading (ETL) solution, an online analytical processing
(OLAP) engine, client analysis tools, and other applications that manage the process of gathering
data and delivering it to business users.
See Also:
Subject Oriented
Integrated
Nonvolatile
Time Variant
Subject Oriented
Data warehouses are designed to help you analyze data. For example, to learn more about your
company's sales data, you can build a warehouse that concentrates on sales. Using this
warehouse, you can answer questions like "Who was our best customer for this item last year?"
This ability to define a data warehouse by subject matter, sales in this case, makes the data
warehouse subject oriented.
Integrated
Integration is closely related to subject orientation. Data warehouses must put data from
disparate sources into a consistent format. They must resolve such problems as naming conflicts
and inconsistencies among units of measure. When they achieve this, they are said to be
integrated.
Nonvolatile
Nonvolatile means that, once entered into the warehouse, data should not change. This is logical
because the purpose of a warehouse is to enable you to analyze what has occurred.
Time Variant
In order to discover trends in business, analysts need large amounts of data. This is very much in
contrast to online transaction processing (OLTP) systems, where performance requirements
demand that historical data be moved to an archive. A data warehouse's focus on change over
time is what is meant by the term time variant.
Contrasting OLTP and Data Warehousing Environments
Figure 1-1 illustrates key differences between an OLTP system and a data warehouse.
One major difference between the types of system is that data warehouses are not usually in
third normal form (3NF), a type of data normalization common in OLTP environments.
Data warehouses and OLTP systems have very different requirements. Here are some examples
of differences between typical data warehouses and OLTP systems:
Workload
Data warehouses are designed to accommodate ad hoc queries. You might not know the
workload of your data warehouse in advance, so a data warehouse should be optimized to
perform well for a wide variety of possible query operations.
OLTP systems support only predefined operations. Your applications might be
specifically tuned or designed to support only these operations.
Data modifications
A data warehouse is updated on a regular basis by the ETL process (run nightly or
weekly) using bulk data modification techniques. The end users of a data warehouse do
not directly update the data warehouse.
In OLTP systems, end users routinely issue individual data modification statements to the
database. The OLTP database is always up to date, and reflects the current state of each
business transaction.
Schema design
Typical operations
A typical data warehouse query scans thousands or millions of rows. For example, "Find
the total sales for all customers last month."
A typical OLTP operation accesses only a handful of records. For example, "Retrieve the
current order for this customer."
Historical data
Data warehouses usually store many months or years of data. This is to support historical
analysis.
OLTP systems usually store data from only a few weeks or months. The OLTP system
stores only historical data as needed to successfully meet the requirements of the current
transaction.
Data Warehouse Architectures
Data warehouses and their architectures vary depending upon the specifics of an organization's
situation. Three common architectures are:
Figure 1-2 shows a simple architecture for a data warehouse. End users directly access data
derived from several source systems through the data warehouse.
In Figure 1-2, the metadata and raw data of a traditional OLTP system is present, as is an
additional type of data, summary data. Summaries are very valuable in data warehouses because
they pre-compute long operations in advance. For example, a typical data warehouse query is to
retrieve something like August sales. A summary in Oracle is called a materialized view.
Data Warehouse Architecture (with a Staging Area)
In Figure 1-2, you need to clean and process your operational data before putting it into the
warehouse. You can do this programmatically, although most data warehouses use a staging area
instead. A staging area simplifies building summaries and general warehouse management.
Figure 1-3 illustrates this typical architecture.
Although the architecture in Figure 1-3 is quite common, you may want to customize your
warehouse's architecture for different groups within your organization. You can do this by adding
data marts, which are systems designed for a particular line of business. Figure 1-4 illustrates an
example where purchasing, sales, and inventories are separated. In this example, a financial
analyst might want to analyze historical data for purchases and sales.
Figure 1-4 Architecture of a Data Warehouse with a Staging Area and Data Marts
Data marts are an important part of many warehouses, but they are not the
focus of this book.
See Also:
Data Mart Suites documentation for further information regarding data marts
SearchSQLServer
Get Started
Home
Database
WhatIs.com
This definition is part of our Essential Guide: Big data applications: Real-world strategies for managing big
data
Sponsored News
Check Your Mirror: Object Storage May Be Closer than It Appears NetApp
See More
Vendor Resources
E-Guide: Best practices for managing the data warehouse to support Big Data SAP America, Inc.
The Event Data Warehouse- Strategies for Improving Business Performance Sensage, Inc.
Exadata
vs.
Sybase IQ
HP Vertica
vs.
Microsoft Parallel Data Warehouse
Netezza
vs.
EMC Greenplum
Infobright
vs.
Teradata
A data warehouse is a federated repository for all the data that an enterprise's
various business systems collect. The repository may be physical or logical.
Hadoop 2 Upgrades: Ready to Take Advantage?
Download Now
By submitting your email address, you agree to receive emails regarding relevant topic offers from TechTarget and
itspartners. You can withdraw your consent at any time. Contact TechTarget at 275 Grove Street, Newton, MA.
You also agree that your personal information may be transferred and processed in the United States, and that you
have read and agree to the Terms of Use and the Privacy Policy.
Data warehousing emphasizes the capture of data from diverse sources for
useful analysis and access, but does not generally start from the point-of-view
of the end user who may need access to specialized, sometimes local
databases. The latter idea is known as the data mart.
There are two approaches to data warehousing, top down and bottom up. The
top down approach spins off data marts for specific groups of users after the
complete data warehouse has been created. The bottom up approach builds
the data marts first and then combines them into a single, all-encompassing
data warehouse.
Typically, a data warehouse is housed on an enterprise mainframe server or
increasingly, in the cloud. Data from various online transaction processing
(OLTP) applications and other sources is selectively extracted for use by
analytical applications and user queries.
PRO+
Content
Find more PRO+ content and other member only offers,here.
E-Handbook
The term data warehouse was coined by William H. Inmon, who is known as
the Father of Data Warehousing. Inmon described a data warehouse as being
a subject-oriented, integrated, time-variant and nonvolatile collection of data
that supports management's decision-making process.
This was first published in February 2015
Related Terms
data mining
Data mining is the analysis of data for relationships that have not previously been discovered.
For example, the sales records ... See complete definition
data preprocessing
Data preprocessing describes any type of processing performed on raw data to prepare it for
another processing procedure. See complete definition
meta
Metadata is a description of data. See complete definition
0 comments
Oldest
Register or Login
E-Mail
Username / Password
Comment
By submitting you agree to receive email from TechTarget and its partners. If you reside outside of the United
States, you consent to having your personal data transferred to and processed in the United States. Privacy
-ADS BY GOOGLE
#
Powered by:
Latest TechTargetresources
BUSINESS ANALYTICS
DATA CENTER
DATA MANAGEMENT
AWS
ORACLE
CONTENT MANAGEMENT
WINDOWS SERVER
SearchBusinessAnalytics
Oracle Advanced Analytics and other data analytics tools enable business users to explore large
volumes of data.
About Us
Contact Us
Privacy Policy
Videos
Photo Stories
Guides
Advertisers
Business Partners
Media Kit
Corporate Site
Experts
Reprints
Archive
Site Map
Events
E-Products
All Rights Reserved, Copyright 2005 - 2015, TechTarget
Data Warehousing
Data warehousing incorporates data stores and conceptual, logical, and physical
models to support business goals and end-user information needs. A data warehouse
(DW) is the foundation for a successful BI program.
Creating a DW requires mapping data between sources and targets, then capturing the
details of the transformation in a metadata repository. The data warehouse provides a
single, comprehensive source of current and historical information.
Data warehousing techniques and tools include DW appliances, platforms,
architectures, data stores, and spreadmarts; database architectures, structures,
scalability, security, and services; andDW as a service.
This Data Warehousing site aims to help people get a good high-level
understanding of what it takes to implement a successful data warehouse
project. A lot of the information is from my personal experience as a business
intelligence professional, both as a client and as a vendor.
This site is divided into six main areas:
- Tools: The selection of business intelligence tools and the selection of the
data warehousing team. Tools covered are:
Database, Hardware
ETL (Extraction, Transformation, and Loading)
OLAP
Reporting
Metadata
- Steps: This selection contains the typical milestones for a data warehousing
project, from requirement gathering, query optimization, toproduction
rollout and beyond. I also offer my observations on the data warehousing field.
- Business Intelligence: Business intelligence is closely related to data
warehousing. This section discusses business intelligence, as well as the
relationship between business intelligence and data warehousing.
- Concepts: This section discusses several concepts particular to the data
warehousing field. Topics include:
Dimensional Data Model
Star Schema
Snowflake Schema
Slowly Changing Dimension
Conceptual Data Model
Logical Data Model
Physical Data Model
Conceptual, Logical, and Physical Data Model
Data Integrity
What is OLAP
MOLAP, ROLAP, and HOLAP
Bill Inmon vs. Ralph Kimball
- Business Intelligence Conferences: Lists upcoming conferences in the
business intelligence / data warehousing industry.
- Glossary: A glossary of common data warehousing terms.
This site is updated frequently to reflect the latest technology, information, and
reader feedback. Please bookmark this site now.
Data Warehouse
Business Intelligence
Glossary
Site Map
Resources
SQL Tutorial
Privacy
Search