
Dr. N.P. Singh, Professor (IT) 15.10.

13

Characteristics of an ODS: subject oriented, integrated, near-current data delivery, current data, detailed.

An ODS is an environment where data from different operational databases is integrated. Its purpose is to provide the end-user community with an integrated view of enterprise data. It enables users to address operational challenges that span more than one business function.

It is the right place to have a central version of reference data that can be shared among different application systems. One way could be that the applications access the data in the ODS directly. Another way is to replicate data changes from the ODS into the databases of the legacy systems. The ODS can help to integrate new and existing systems. The ODS may shorten the time required to populate a DW, because a part of the integrated data already resides in the ODS.

The ODS provides improved accessibility to critical operational data. With an ODS, organizations have a complete view of their financial metrics and customer transactions. This helps them understand the customer better and make well-informed business decisions. The ODS can provide the ability to request product and service usage data on a real-time or near-real-time basis. Operational reports can be generated with improved performance in comparison to the legacy systems.

Frequency is how often the ODS is updated, quite possibly from completely different legacy systems using distinct population processes; it also takes into account the volume of updates that occur. Velocity is the speed with which an update must take place: the time from the point a legacy system change occurs to the point it must be reflected in the ODS.
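The two measures can be sketched in a few lines of Python. This is only an illustration: the function names `update_velocity` and `updates_per_hour` and the sample timestamps are assumptions made for the example, not part of any ODS product.

```python
from datetime import datetime, timedelta


def update_velocity(source_changed_at, ods_applied_at):
    """Velocity: time elapsed between a legacy change and its reflection in the ODS."""
    return ods_applied_at - source_changed_at


def updates_per_hour(applied_times, window_hours):
    """Frequency: number of ODS updates observed, averaged over a window in hours."""
    return len(applied_times) / window_hours


# Hypothetical sample: a legacy change at 09:00:00 reaches the ODS at 09:00:02.
changed = datetime(2020, 1, 1, 9, 0, 0)
applied = datetime(2020, 1, 1, 9, 0, 2)
velocity = update_velocity(changed, applied)

# Hypothetical sample: three updates observed during a two-hour window.
observed = [applied, applied + timedelta(minutes=30), applied + timedelta(minutes=90)]
frequency = updates_per_hour(observed, window_hours=2)
```

Measuring both values per source system makes it easier to see which feeds need real-time treatment and which can be handled in batch.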

How to position the ODS within the BI architecture

ODS in DSS Environment - Corporate Information Factory

class I
Transactions are moved from the applications to the ODS immediately, within 1 to 2 seconds of the moment the transaction is executed in the operational environment. In this case, the end user can hardly tell the difference between an activity that occurred in the operational environment and the same activity as reflected in the ODS environment.

class II
Activities that occur in the operational environment are stored and forwarded to the ODS every four hours or so. In this case, there is a noticeable lag between the original execution of the transaction and the reflection of that transaction in the ODS environment. However, this class of ODS is much easier to build and to operate than a class I ODS.

class III
The time lag between execution in the operational environment and reflection in the ODS is not four hours or so, but overnight. In a class III ODS there is a noticeable time lag between the execution of the transaction in the operational environment and the reflection of the transaction in the ODS environment. This type of ODS is relatively easy to build.

class IV
A class IV ODS is fed from the data warehouse. Analysis created by the DSS analyst in the data warehouse environment is condensed down to a point where the results of the analytical processing fit comfortably in the ODS. The input to the ODS can be either regular or irregular. This class of ODS is very easy to build as long as the data warehouse has already been constructed.
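The four classes can be summarized as a small decision rule. The sketch below is illustrative only: the function name `ods_class` is hypothetical, and the hard cut-offs of 2 seconds and 4 hours are assumptions that turn the approximate figures above ("1 to 2 seconds", "four hours or so", "overnight") into thresholds.

```python
from datetime import timedelta


def ods_class(latency=None, fed_from_warehouse=False):
    """Return the ODS class ("I" to "IV") for a given feed.

    latency is the lag between execution in the operational environment
    and reflection in the ODS. The thresholds are illustrative assumptions.
    """
    if fed_from_warehouse:
        # Class IV is defined by its source (the data warehouse), not by latency.
        return "IV"
    if latency is None:
        raise ValueError("latency is required unless the ODS is fed from the warehouse")
    if latency <= timedelta(seconds=2):
        return "I"    # immediate transfer
    if latency <= timedelta(hours=4):
        return "II"   # store and forward every few hours
    return "III"      # overnight load
```

For example, `ods_class(timedelta(hours=12))` returns `"III"`, while `ods_class(fed_from_warehouse=True)` returns `"IV"` regardless of latency.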

Insurance
How can I provide an up-to-date view of insurance products owned by each customer for our Customer Relationship Management (CRM) system?
How can I consolidate all the information required to solve customer problems?
How can I decrease the turn-around time for quotes?
How can I reduce the time it takes to produce claim reports?

Retail
How can we give suppliers the ability to co-manage our inventory?
What inventory items should I be adjusting throughout the day?
How can my customers track their own orders through the Web?
What are my customers ordering across all subsidiaries?
What is the buying potential of my customer at the point of sale?

Banking
What is the complete credit picture of my customer, so I can grant an immediate increase?
How can we give customer service a consolidated view of all products and transactions?
How can we detect credit card fraud while the transaction is in progress?
What is the current consolidated profitability status of a customer?

Telecommunications
Can we identify what our Web customers are looking for in real time?
Which calling cards are being used for fraudulent calls?
What are the current results of my campaign?
How can we quickly monitor calling patterns after the merger?

How do I provide an up-to-date view of cross functional information for a particular business process when the data is spread across several disparate sources?

To maximize customer satisfaction and profitability, the ultimate data store would contain all of the organization's operational data. This, of course, is neither economical nor technically feasible in one place with one technology solution.

The ODS has the characteristics of an online transaction processing (OLTP) system while at the same time accommodating some of the attributes of a data warehouse (for example, integrating and transforming data from multiple sources).

Transferring the data
Data characteristics
The ODS environment
ODS administration and maintenance

Analyzing the business requirements
Defining the ODS type needed
Data modeling
Defining and describing the different ODS layers

The business scenarios were created from the following three business questions:
Banking/finance: What is my customer's entire product portfolio?
Retail: How can my customers track their own orders through the Web?
Telecommunications: Which calling cards are being used for fraudulent calls?

The figure describes the data flow for the order maintenance business scenario. Data is integrated and transformed from multiple heterogeneous data sources and used to populate the order maintenance ODS. An order maintenance application will be used by both customers and the customer service department to access and update the ODS. Changes made to the ODS through the order maintenance application will flow back to the source systems using a trigger and apply mechanism. Regularly scheduled reports will be created for the inventory management department.

The figure represents the data flow for the consolidated call information scenario. Data is integrated and transformed from multiple homogeneous data sources and used to populate the calling transaction ODS. This data flow into the ODS is real-time. A custom-built fraud application will be used to verify calls and alert customer service when a suspect call is identified. The existing customer service and billing applications will be migrated to the ODS, eliminating their data stores. A follow-on phase will eliminate the customer data store.

An ODS type A includes real-time (or near-real-time) legacy data access and localized updates (data modifications are not fed back to the legacy systems). The localized updates would typically include new data not currently captured in the operational systems.
An ODS type B includes the characteristics of an ODS type A along with a trigger and apply mechanism to feed data back to the operational systems. Typically these feedback requirements would be very specific to minimize conflicts.
An ODS type C is either fully integrated with the legacy applications or uses real-time update and access.

The ODS can be directly updated by front-end applications (such as Campaign Management, Customer Service, Call Center) or by the user directly through an application interface (such as a new Web application). The ODS can be a source of data for the warehouse. Batch processes will be used to populate the data warehouse. The ODS complements or extends the operational systems. It is not intended to replace them. Although most sources will be used to populate both the ODS and the data warehouse, two data acquisition streams will probably exist due to the temporal differences in the data required.

For example, the data warehouse may require a monthly inventory snapshot, whereas the ODS may require an up-to-the-minute inventory status.

Data flows from the operational systems to the ODS through the data acquisition layer. Updates to the ODS can be real-time, store and forward, and/or batch. In a real-time environment changes are applied to the ODS immediately, for example, using the same operational application. A store and forward scheme may use tools such as replication or messaging to populate the ODS. Changes which are only required daily, for example, could use a normal batch process. Operational systems are not updated from an ODS type A.
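The difference between the real-time and the store and forward paths can be sketched as follows. This is a minimal in-memory illustration: the class `OdsLoader`, its method names, and the use of a plain dictionary as the ODS are all assumptions made for the example, standing in for real replication or messaging tools.

```python
from collections import deque


class OdsLoader:
    """Toy data acquisition layer with a real-time path and a store-and-forward path."""

    def __init__(self):
        self.ods = {}           # stands in for the ODS tables
        self.pending = deque()  # changes stored until the next forward run

    def apply_realtime(self, key, value):
        """Real-time: the change is visible in the ODS immediately."""
        self.ods[key] = value

    def store(self, key, value):
        """Store and forward, step 1: queue the change without touching the ODS."""
        self.pending.append((key, value))

    def forward(self):
        """Store and forward, step 2: apply queued changes in arrival order."""
        applied = 0
        while self.pending:
            key, value = self.pending.popleft()
            self.ods[key] = value
            applied += 1
        return applied
```

A daily batch load behaves like `forward()` run once per day on a schedule; only the interval between runs differs.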

The ODS type B includes the characteristics of an ODS type A plus the additional feature of an asynchronous triggering mechanism. This triggering mechanism is used to send ODS changes back to the operational systems.
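One way to picture a trigger and apply mechanism is a database trigger that captures ODS changes into a change table, plus a separate apply step that replays them against the operational system later. The sketch below uses Python's built-in `sqlite3` module with one in-memory database standing in for both systems; the table names, the trigger name, and the functions `make_demo_db` and `apply_changes` are hypothetical.

```python
import sqlite3


def make_demo_db():
    """Create an ODS table, a legacy table, and a trigger that captures ODS updates."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
    CREATE TABLE ods_orders (order_id INTEGER PRIMARY KEY, status TEXT);
    CREATE TABLE legacy_orders (order_id INTEGER PRIMARY KEY, status TEXT);
    CREATE TABLE ods_changes (
        change_id INTEGER PRIMARY KEY AUTOINCREMENT,
        order_id INTEGER,
        status TEXT
    );
    -- Trigger: record every status change made in the ODS.
    CREATE TRIGGER capture_order_update AFTER UPDATE OF status ON ods_orders
    BEGIN
        INSERT INTO ods_changes (order_id, status)
        VALUES (NEW.order_id, NEW.status);
    END;
    INSERT INTO ods_orders VALUES (1, 'open');
    INSERT INTO legacy_orders VALUES (1, 'open');
    """)
    return conn


def apply_changes(conn):
    """Apply: replay captured changes to the legacy table, then clear them."""
    rows = conn.execute(
        "SELECT change_id, order_id, status FROM ods_changes ORDER BY change_id"
    ).fetchall()
    for change_id, order_id, status in rows:
        conn.execute(
            "UPDATE legacy_orders SET status = ? WHERE order_id = ?",
            (status, order_id),
        )
        conn.execute("DELETE FROM ods_changes WHERE change_id = ?", (change_id,))
    conn.commit()
    return len(rows)
```

The mechanism is asynchronous because the legacy table keeps its old value until `apply_changes` runs; the trigger only records what has to be sent back.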

In an ODS type C, data flows back and forth between the data sources and the ODS through the data acquisition layer on a real-time basis. The ODS becomes the single source for much of the corporation's key operational data.

ATTRIBUTE   ODS               DW
ORGN.       SUBJECT           SUBJECT
USERS       LARGE NUMBER      FEW
SIZE        SMALL             VERY LARGE
GROWTH      20 - 30 % p.a.    50 - 180 % p.a.
STRUCT.     NORMALIZED        DENORMALIZED
UPDATE      SEVERAL           NONE
VOLATILE    YES               NO
METADATA    YES               YES
DESIGN      PROCESS DRIVEN    DATA DRIVEN

Issue: How built
Operational: One application at a time in the legacy environment, or one subject area at a time in the ODS
Warehouse: One or more subject areas at a time

Issue: Requirements
Operational: Known
Warehouse: Vague

Issue: Data access
Operational: Smaller number of rows retrieved in a single call
Warehouse: Large set of data is scanned to retrieve results

Issue: Critical to
Operational: Daily business operation
Warehouse: Management decisions that may affect profitability

Issue: Tuning
Operational: Highly tuned for frequent access to small amounts of data
Warehouse: Tuned for infrequent access to larger quantities of data

Issue: Data volume
Operational: Volume needed for daily operation
Warehouse: Larger volumes needed to support statistical analysis, forecasting, ad hoc reporting, and querying

Issue: Data retention
Operational: Data retained to meet daily requirements
Warehouse: Data retained longer to support historical reporting, comparison, analysis, etc.

Issue: Data currency
Operational: Must be up to the minute

Issue: Availability
Warehouse: Usually does not require as high availability as the production environment unless worldwide access is necessary

Data Warehouse: Designed for analysis of business measures by categories and attributes
OLTP: Designed for real-time business operations

Data Warehouse: Optimized for bulk loads and large, complex, unpredictable queries that access many rows per table
OLTP: Optimized for a common set of transactions, usually adding or retrieving a single row at a time per table

Data Warehouse: Loaded with consistent, valid data; requires no real-time validation
OLTP: Optimized for validation of incoming data during transactions; uses validation data tables

Data Warehouse: Supports few concurrent users relative to OLTP
OLTP: Supports thousands of concurrent users
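The contrast in access patterns can be shown with two queries against the same table: an OLTP-style lookup that retrieves a single row by key, and a warehouse-style query that scans many rows to produce an aggregate. The sketch uses Python's built-in `sqlite3` module; the `sales` table, its rows, and the function names are made up for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sales (sale_id INTEGER PRIMARY KEY, region TEXT, amount REAL)"
)
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [(1, "north", 100.0), (2, "south", 250.0), (3, "north", 50.0)],
)


def lookup_sale(conn, sale_id):
    """OLTP style: fetch one row by its primary key."""
    return conn.execute(
        "SELECT sale_id, region, amount FROM sales WHERE sale_id = ?", (sale_id,)
    ).fetchone()


def totals_by_region(conn):
    """Warehouse style: scan all rows and aggregate them by category."""
    return dict(conn.execute("SELECT region, SUM(amount) FROM sales GROUP BY region"))
```

The first query touches one row however large the table grows; the cost of the second grows with the number of rows scanned, which is why the two workloads are tuned differently.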

On the one hand, the ODS is decidedly operational. It provides fast response time and high availability and is certainly qualified to act as the basis of mission-critical systems. On the other hand, the ODS has some very clear DSS features: it is integrated and subject oriented, and it supports some important kinds of decision support processing.

The ODS sits between the legacy applications and the DW. It is fed by integration and transformation (I/T) programs. These programs may be the same ones that feed the DW, or they may be different. The ODS feeds data into the data warehouse. Some operational data traverses directly to the DW through the I/T layer. Other data passes from the operational foundation into the I/T layer, then to the ODS, and on to the DW.

Complex Structure

The ODS enables integrated, collective online processing. It supports online updates. It integrates many applications. It provides a view of the enterprise. It provides decision support processing.

Underlying technology
Design
Monitoring & maintaining

Two types of users

Farmers: perform the same task repetitively, look for small amounts of data, and always get what they are looking for. They work in a structured world: structured data, structured processing, structured procedures, and so forth.

Explorers: the antithesis of farmers. They operate in a random manner, do not know what they are looking for, operate in a heuristic mode, and look at very large sets of data. They look for associations, patterns, and relationships not yet discovered. They often find nothing but occasionally find huge gold mines, and they work in an unstructured manner.

The design must satisfy the needs of both types of users.

Classical Design:
The DSS environment starts with a data model, which reflects the informational needs of the corporation.
Normalized tables are generated from the data model. These tables are known as the logical model.
The tables are combined into a form of physical design that can be termed a lightly normalized design.
Tables are combined on the basis of containing common keys and general common usage.
There is a fly in the ointment of this approach:
Performance suffers where many tables must be joined.
Performance suffers where there are many occurrences of the data.
Users may find it unnatural to join many tables.

Second Approach:
When the volume and usage of the data are factored into the design, a mutant form of normalization is achieved. The normalization turns into heavy normalization, and a structure called a star join is created.

Star Join:
A star join has two parts.
Fact tables hold the majority of the occurrences of the data. A fact table combines data and cross-reference keys from a variety of other tables.
Dimension tables contain data that is not terribly voluminous and are related to the fact table by foreign keys.
Fact tables are efficient to access because the data has been pre-joined into the table at the moment of loading.

Star Join:
Usage of the data must be known in advance. Without knowing the pattern of access and usage of the data, it is difficult to design the fact tables.
One department may look at the same data differently from another. A star join for finance may be different from a star join for production.
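A star join can be sketched with one fact table and two dimension tables. The example below uses Python's built-in `sqlite3` module; the table names, the sample rows, and the functions `build_star` and `quantity_by_city` are invented for the illustration and assume that sales quantity by store and product is the access pattern known in advance.

```python
import sqlite3


def build_star(conn):
    """Create two small dimension tables and one fact table that references them."""
    conn.executescript("""
    CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE dim_store (store_id INTEGER PRIMARY KEY, city TEXT);
    -- The fact table holds the bulk of the rows plus keys into each dimension.
    CREATE TABLE fact_sales (
        product_id INTEGER REFERENCES dim_product(product_id),
        store_id INTEGER REFERENCES dim_store(store_id),
        quantity INTEGER
    );
    INSERT INTO dim_product VALUES (1, 'pen'), (2, 'ink');
    INSERT INTO dim_store VALUES (10, 'Delhi'), (20, 'Pune');
    INSERT INTO fact_sales VALUES (1, 10, 5), (1, 20, 3), (2, 10, 7);
    """)


def quantity_by_city(conn):
    """Join the fact table to one dimension and total the quantity per city."""
    return conn.execute("""
        SELECT s.city, SUM(f.quantity)
        FROM fact_sales AS f
        JOIN dim_store AS s ON s.store_id = f.store_id
        GROUP BY s.city
        ORDER BY s.city
    """).fetchall()
```

A department that analyzes the same sales by supplier or by week would need different dimensions, and possibly a different fact table, which is the dependence on known usage described above.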

Normalized Structure
Inefficient to access
Holds modest amounts of data
Applicable to a wide audience
Handles updates

Star Structure
Efficient to access
Holds large amounts of data
Applicable to a restricted audience
Does not handle updates

Because the ODS environment serves both the operational and the DSS environment, the ODS is built with both a waterfall operational methodology and a spiral DSS methodology.

Waterfall methodology
Requirements gathering & assimilation
Analysis & systemization
Design
Programming
Testing
Implementation

Legacy systems
ETL tools
Operational data store
Access tools

Legacy systems: ERP, CRM, Web, or any other legacy system in which operational data is recorded.
ETL tools: used to extract, transform, and load data from the legacy systems into the operational data store.
Operational data store.
BI tools: used for analyzing the data and generating reports.

Legacy systems: data is extracted from e-mail, direct mail, telemarketing, kiosks, stores, call centers, and the Web using ETL tools and stored in the operational data store.
Operational data store.
Data warehouse.
BI tools or analytical tools.
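The extract, transform, and load steps named above can be sketched as three small functions. This is a toy illustration, not a description of any particular ETL tool: the record layout (`cust`, `name`), the cleaning rules, and the use of a dictionary as the operational data store are all assumptions.

```python
def extract(legacy_rows):
    """Extract: read the raw records from a legacy source."""
    return list(legacy_rows)


def transform(rows):
    """Transform: convert keys to integers and tidy up customer names."""
    cleaned = []
    for row in rows:
        cleaned.append(
            {
                "customer_id": int(row["cust"]),
                "name": row["name"].strip().title(),
            }
        )
    return cleaned


def load(ods, rows):
    """Load: insert or overwrite each record in the ODS, keyed by customer id."""
    for row in rows:
        ods[row["customer_id"]] = row
    return ods


# Hypothetical legacy record with a string key and untidy name.
legacy = [{"cust": "7", "name": "  asha rao "}]
ods = load({}, transform(extract(legacy)))
```

Because `load` overwrites by key, running the pipeline again with a corrected record updates the existing ODS entry instead of creating a duplicate.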

Gartner introduced the concept of a zero-latency strategy: any strategy that exploits the immediate exchange of information across technical and organizational boundaries to achieve business benefit. Organizations that can make decisions based on up-to-the-second information and apply those decisions to operational systems and business processes are known as zero-latency enterprises (ZLE).

Pull & Push Process
