
Using Pre-Packaged Data Models to Support Rapid BI Development

David Schoeff, Teradata Corp.
Jeff Hoffer, University of Dayton

Jeffrey A. Hoffer

Overall Agenda
Overview of iLDMs
Learnings from case studies of iLDM application
Workshop on using iLDMs in your organization


What We've Learned From Contrasting Case Studies


Jeffrey Hoffer, University of Dayton


Agenda
Learning Resources I've Used
Traditional Database Development Processes
  Life cycle
  Prototyping

Case Studies of Rapid Development with LDMs


  Overall process
  Data mapping
  More general rapid BI environment that made it feasible and successful.

Learning Resources
On www.teradata.com
  Search on "Hoberman" or "logical data models"; especially see "Leveraging the Industry Logical Data Model as Your Enterprise Data Model"
  Search on "agile business intelligence"

On www.beyenetwork.com
  See Dan Linstedt's blog and "The 2-Month Data Model" by Bill Inmon
  Search on "logical data models" or "industry data model"

On www.tdwi.org
  In White Papers, search on "agile business intelligence" or "industry data model"

Hay, D.C., 1996, Data Model Patterns: Conventions of Thought, and 2006, Data Model Patterns: A Metadata Map
Silverston, L., various dates, several volumes of The Data Model Resource Book and various articles from 2002 in DM Review
Moss, Larissa, President, Method Focus: see articles and seminars on agile BI
And, of course, there is Modern Database Management.

Traditional (Invented Here) Database Development Process


[Figure: database development life cycle]
Conceptual/Enterprise Data Modeling: scope, ISA, EDM
Conceptual Data Modeling: detailed metadata
Logical Data Modeling: integrate, normalize, integrity, security
Physical/technical database design: technology design, performance
Database Definition: schema, documentation, installation, training
Tuning: integrate new requirements, improve, fix (mini cycles of Analysis, Design, Implementation)


Database Development with Prototyping (A Learning Together Approach)


[Figure: prototyping cycle]
Steps: Identify Need; Develop Initial Prototype; Implement & Use Prototype; Revise & Enhance Prototype; Convert to Operational Form
Flow labels: Initial requirements; Working prototype; New requirements; Deficiencies; Next version; If prototype is inefficient
Database development activities:
Conceptual Data Modeling: preliminary CDM
Logical Database Design: detailed requirements
Physical Database Design: new database contents, structures, programs
Database Implementation: coding, integrate contents
Database Maintenance: evaluate and enhance
Database Maintenance: tune, improve for performance

Two Case Studies: LDMs Work in a Variety of Situations


Case Study A
On-line retailer
Young
Highly competitive, rapidly changing
Information-driven
Dynamic, immersed leadership team
Turbulent period, needed solution quickly
Business analysts embedded in units
LDM as golden model

Case Study B
Technology provider
Mature
Innovative, detail-oriented, comprehensive
Highly analytical
Decentralized leadership team
Constant pressure and environmental changes
Diversified structure for business analysts
Internal systems as golden model

Data Modeling Process Changes for Rapid BI: Case Study A


Background:
Hoffer, Watson, and Wixom
Large, on-line retailer
  >300 hourly/daily reports
  >400 Business Object IDs; also SAS, on a Teradata EDW platform

Critical need to get a BI environment up before the next Christmas buying season (core needs of marketing, merchandising, and auction parts of business met in 9 months)
Limited internal resources due, in great measure, to simultaneous implementation of a new ERP operational system.

Overview of Results: Case Study A


LDM was about 80% right before customization (used several LDMs for different industries represented by the company's offerings)
  Cost of an LDM is about one DBA for one year
  Saved time, improved quality, less rework

LDM did not allow them to develop the new environment piecemeal
  Needed a quick start with a solid foundation for the future of a rapidly changing business
  Enterprise perspective from the beginning
Collaboration of external consultants (3 for one month, 2 for another 5 months, 1 for another 6 months) and internal data analysts
  Key for short- and long-term success was to involve internal data analysts, who handle the evolution of data modeling

"Acquisition of the LDMs was one of the key strategic things (we) did to gain quick results and long-term success with data warehousing and BI." (DW Director)

Overview of Results: Case Study B


Why did they use LDMs?
Use data consistently throughout BI applications
Adhere to government regulations
Understand data across the organization using common names
  Supplier = Vendor, Commodity = Material
Comprehend transformations (part of LDMs)
Can combine and use for analytics data we didn't know could be analyzed together
Allows normalized data structures to be traversed from anywhere to anywhere without introducing reporting anomalies
Allows quicker building of dimensional star schemas (dependent data marts) because the data structures are easy to negotiate.

Database Development with LDMs


[Figure: LDM-based development process]
Boxes: Identify Need; Evaluate Alternative Packages; Customize LDM; Initial Infrastructure; Applications & Infrastructure Evolution; Evolve for New Needs (repeated)
Timing notes:
6 months from need to first application
2 weeks for data model
90 days for first application
9 months from need to all phase I applications
2-week release packages
Application package development overlaps

Observations About Customization


Identify the entities, attributes, and relationships in the LDM that you need for the future
  Concentrate on details for those you need first
  Create a phased roadmap (can use entity clustering to show this functional decomposition for data)

Customize LDM
  Rename data to local terms
  Refine LDM to local business rules
  Map LDM data to current databases (e.g., to design migration plans and load processes)

What Is Mapping?
The process of relating each LDM data element with a source
  Do we need it (now, later), from either current systems or the LDM?
  Where do we get it?
  When do we get it?
  How do we define it and what do we name it?
  Does it need to be transformed? Or do we need a more atomic source?
  Does the source system need to be improved?

It is NOT about resolving conflicts between source systems or fixing source systems
It is NOT about designing/writing the ETL.
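
One way to picture the mapping questions is as a worksheet with one row per LDM data element. The sketch below is only an illustration under stated assumptions: the class name, field names, and sample element names are all hypothetical and do not come from any vendor's LDM. The transformation field holds a plain-language rule, since mapping is not about writing the ETL.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ElementMapping:
    """One hypothetical worksheet row: an LDM data element and its source."""
    ldm_element: str                      # name in the packaged model
    local_name: str                       # what we name it locally
    needed: str                           # "now", "later", or "no"
    source: Optional[str] = None          # where we get it
    refresh: Optional[str] = None         # when we get it
    transformation: Optional[str] = None  # plain-language rule, not ETL code


def unresolved(mappings):
    """Return LDM elements that are needed now but still have no source."""
    return [m.ldm_element for m in mappings
            if m.needed == "now" and m.source is None]


# Illustrative rows only; every name here is made up for the example.
worksheet = [
    ElementMapping("Party.PartyName", "Customer Name", "now",
                   source="crm.customer.cust_nm", refresh="daily"),
    ElementMapping("Item.CommodityCode", "Material Code", "now"),
    ElementMapping("Party.CreditRating", "Credit Rating", "later"),
]

print(unresolved(worksheet))
```

Rows that are needed now but have no source are the ones to take back to the business: either a more atomic source exists somewhere, or the source system needs to be improved.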

Key Points About Mapping


Some elements will be missing in the LDM and in current databases; these become obvious because of the LDM
  Are mismatches really needed?
  Avoid the temptation to always accept current databases as the tie-breaker
  Encourage thinking of the possibilities from elements in the LDM that are not in current databases
Current databases are often poorly documented, which makes the process difficult
Watch for duplicate, inconsistent entries of the same data in different databases.
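
The mismatches described above can be listed mechanically once element names have been reconciled to common terms. The following is a minimal sketch, assuming both inventories are available as simple lists of names; the function name and the sample element names are hypothetical.

```python
def compare_elements(ldm_elements, current_elements):
    """Split element names into mapped, LDM-only, and current-only groups."""
    ldm, current = set(ldm_elements), set(current_elements)
    return {
        "mapped": sorted(ldm & current),
        "ldm_only": sorted(ldm - current),      # possibilities to consider
        "current_only": sorted(current - ldm),  # is each one really needed?
    }


# Illustrative inventories; names are made up for the example.
ldm_names = ["customer_id", "customer_name", "credit_rating", "party_role"]
current_names = ["customer_id", "customer_name", "fax_number"]

result = compare_elements(ldm_names, current_names)
print(result["ldm_only"])
print(result["current_only"])
```

Neither list settles anything by itself. The point of the comparison is to prompt the questions above rather than to let either side win automatically.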

Key Points About Mapping


The LDM is comprehensive in business rules (e.g., cardinalities and generalization) and can be complex; thus it is flexible to change
  Do you really need all this complexity?
  Do we need something more restrictive?
  Does comprehensiveness suggest opportunities?
Smartly tailor the LDM to the organization
LDM updates can react to changing standards and regulations
The current environment likely has different standards and regulations for different sources.

Key Points About Mapping


Engage users and managers early, because you have a validated prototype data model from the start
The LDM provides a visual, comprehensive checklist of possible questions
  Would we ever have a customer order with more than one customer?
  Might an employee also be a customer?
Give special attention to elements of LDMs that SMEs did not mention in interviews
  Will we ever go in that direction?
A basis for impertinence: it's all about the questions you ask!

Key Points About Mapping


Mapping is critical; can't afford to do a bad job
Mapping projects are great student projects in a capstone course: they require integration of data and systems knowledge and skills, with understanding of differences across platforms, ETL, timing, etc.

More on Customization
Even with good mapping, do data profiling to identify overloading, obsolescence, empty columns, hidden (undocumented) requirements, and outliers; the proof is in the data
Understand reasons for inconsistencies
  Poorly designed databases
  Poor accuracy of current data, which you do not want to migrate to the new database for analytics (a time for data cleansing)

Investigate reasons for missing data for mapped attributes
  Application software errors, human data entry errors, optional data (subtypes).

Data Profiling: A Must


Profiling = statistical analysis to uncover hidden patterns and flaws
Look for outliers
Sorting by date can reveal overloading and patterns for empty values, or when data moved columns over time, or shifts in data
Can match shifts in data to major system changes
Empty columns can imply entity subtypes
Wide tables can imply denormalization, which can encourage erroneous data
Can be used to identify flaws in current systems, need for cleanup efforts, and need to improve database design.
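
Two of the profiling checks above, empty columns and outliers, can be sketched with the Python standard library alone. This is an illustration, not a profiling tool: the column names and sample values are made up, and the outlier test (a modified z-score based on the median and the median absolute deviation, with a conventional cutoff of 3.5) is one assumed choice among several reasonable ones.

```python
import statistics


def empty_fraction(rows, column):
    """Fraction of rows in which the column is missing, None, or blank."""
    values = [row.get(column) for row in rows]
    empty = sum(1 for v in values if v is None or v == "")
    return empty / len(values) if values else 0.0


def outliers(values, threshold=3.5):
    """Values whose modified z-score (median/MAD based) exceeds the threshold."""
    med = statistics.median(values)
    mad = statistics.median([abs(v - med) for v in values])
    if mad == 0:
        return []  # no spread around the median, so nothing can be flagged
    return [v for v in values if 0.6745 * abs(v - med) / mad > threshold]


# Illustrative data only: a mostly empty column and one suspicious total.
totals = [20, 22, 19, 21, 23, 20, 950, 21]
codes = ["", None, "G1", "", None, "", "G2", ""]
rows = [{"order_total": t, "gift_wrap_code": c} for t, c in zip(totals, codes)]

print(empty_fraction(rows, "gift_wrap_code"))
print(outliers(totals))
```

A column that is empty in most rows is a candidate for an entity subtype, and a flagged value is a prompt to ask whether it is an error or a hidden requirement.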

A Chance to Verify Business Rules


Verify each business rule (in the LDM) for your organization
  Review metadata (names, definitions, data types, formats, lengths, cardinality, etc.) with the best SMEs
Business rules dictate transformations of operational data into the analytical database
Different operational systems may mean different business rules.
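
A cardinality rule can also be checked against existing data before asking the SMEs about it. The sketch below assumes a rule of the form "each child record has at most one parent" (for example, each order has one customer) and assumes the data is available as (child key, parent key) pairs; the function name and sample keys are hypothetical.

```python
from collections import defaultdict


def single_parent_violations(pairs):
    """Return child keys that are linked to more than one distinct parent key."""
    parents = defaultdict(set)
    for child, parent in pairs:
        parents[child].add(parent)
    return {child: sorted(found)
            for child, found in parents.items() if len(found) > 1}


# Illustrative (order, customer) pairs; order O3 appears with two customers.
order_customer = [("O1", "C1"), ("O2", "C1"), ("O3", "C2"), ("O3", "C7")]

print(single_parent_violations(order_customer))
```

Any violation found this way is a question for the SMEs: either the data is wrong, or the organization's real rule differs from the one in the LDM.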

Observations About Evolve


As new business needs arise, conduct mini customization projects to extend the current implementation from the LDM with a different focus (the LDM implementation easily scales as an architectural foundation for agile development)
Dynamic businesses will yield extensions to LDMs, so vendors like feedback
LDMs provide the flexibility and speed to react to (and to anticipate) new needs
BI systems are not complex (although the infrastructure is), which is why LDMs are valuable and agile development works.

PMI View of Agile Project Management

Source: Sliger, M., "A Project Manager's Guide to Going Agile," Rally Software Development Corp., 2006

Typical Evolve Scenario


An Environment Conducive to Rapid BI: Case Study A


Organizational Climate
Compelled to do rapid development of infrastructure and applications
  Business moves quickly: dot.com or "swarming" mentality when leadership turns its focus to it
  Attitude of "we've defined it, let's get it done, then move on"; perfection not critical

Leaders see firm as an information company
  An interaction of technology and retail
  Using technology and information well is a competitive advantage
  Needed a drastic change to jump-start the transformation: the LDMs
LDM also overcomes the hazards of swarming: lack of architecture/plan.

An Environment Conducive to Rapid BI: Case Study A


LDM and Organizational Fit
LDMs essentially modify the agile approach initially by making the business define core requirements up front (infrastructure), but still support iterative evolution
  A balance to swarming

Leadership team sets priorities and is willing to evolve in phases (normal agile "chunk" approach)
  Synergistic initiative gets greatest attention
  LDM supports iteration, which builds trust
  Incremental changes (2-week chunks of work) show continuing commitment (rather than a one-time, big-bang change), which also builds trust.

An Environment Conducive to Rapid BI: Case Study A


Need tech- and business-savvy people
  Business analysts are embedded in each business area (removes bureaucracy) and report to both the VP of the business area and the head of BI applications, which creates deep knowledge about the business and facilitates rapid development
  Business managers with strong technical aptitude and skills are a hiring priority.


Workshop Questions
To start, do you have any questions about the iLDM?
How does your ERD match up with the iLDM?
What difficulties do you have merging the iLDM with your ERD?
In your environment, which model trumps the other and why?
Is the iLDM more than you need? Why? How would you deal with that?
Are there things missing in the iLDM that you need in your environment?
What kinds of resistance would you get for using an iLDM?
How would you make use of an iLDM in your environment?
