You are on page 1of 7

What is a data warehouse?

A data warehouse is a subject oriented, integrated, non-volatile, and time variant collection of derived data to describe an organizations activities and support of managements decisions.

Data Warehouse
A data warehouse is a repository of an organization's electronically stored data. Data warehouses are designed to facilitate reporting and analysis. This classic definition of the data warehouse focuses on data storage. However, the means to retrieve and analyze data, to extract, transform and load data, and to manage the data dictionary are also considered essential components of a data warehousing system. Many references to data warehousing use this broader context. Thus, an expanded definition for data warehousing includes business intelligence tools, tools to extract, transform, and load data into the repository, and tools to manage and retrieve metadata.

Benefits of data warehousing


A data warehouse provides a common data model for all data of interest regardless of the data's source. This makes it easier to report and analyze information than it would be if multiple data models were used to retrieve information such as sales invoices, order receipts, general ledger charges, etc. Prior to loading data into the data warehouse, inconsistencies are identified and resolved. This greatly simplifies reporting and analysis. Information in the data warehouse is under the control of data warehouse users so that, even if the source system data is purged over time, the information in the warehouse can be stored safely for extended periods of time. Because they are separate from operational systems, data warehouses provide retrieval of data without slowing down operational systems. Data warehouses facilitate decision support system applications such as trend reports (e.g., the items with the most sales in a particular area within the last two years), exception reports, and reports that show actual performance versus goals. Data warehouses can work in conjunction with and, hence, enhance the value of operational business applications, notably customer relationship management (CRM) systems.

Architectre

Data Warehouse

OLTP Vs OLAP

OLTP System Online Transaction Processing (Operational System) Source of data Operational data; OLTPs are the original source of the data. To control and run fundamental business tasks

OLAP System Online Analytical Processing (Data Warehouse) Consolidation data; OLAP data comes from the various OLTP Databases To help with planning, problem solving, and decision support Multi-dimensional views of various kinds of business activities Periodic long-running batch jobs refresh the data

Purpose of data

What the data

Reveals a snapshot of ongoing business processes


Short and fast inserts and updates initiated by end users Relatively standardized and simple queries Returning relatively few records

Inserts and Updates

Queries

Often complex queries involving aggregations

Processing Speed

Typically very fast

Depends on the amount of data involved; batch data refreshes and complex queries may take many hours; query speed can be improved by creating indexes

Space Requirements

Can be relatively small if historical data is archived

Larger due to the existence of aggregation structures and history data; requires more indexes than OLTP

Database Design

Highly normalized with many tables

Typically de-normalized with fewer tables; use of star and/or snowflake schemas

You might also like