
SQL Server Business Intelligence

Integrate
Data acquisition from source systems and integration
Data transformation and synthesis

Analyze
Data enrichment, with business logic, hierarchical views
Data discovery via data mining

Report
Data presentation and distribution
Data access for the masses

SQL Server Integration Services


Successor to Data Transformation Services
New architecture and implementation
Existing DTS packages can be run or migrated
Design time experience in VS 2005 or Business Intelligence Development Studio
Administration in SQL Server Management Studio
Extensibility through .NET code

What is it?
Microsoft's ETL solution bundled with SQL Server
E: Extract
T: Transform
L: Load

Source -> Read -> SSIS -> Write -> Destination
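To make the Source -> Read -> SSIS -> Write -> Destination flow concrete, here is a minimal sketch of the same extract, transform, load steps in plain Python. It is not SSIS code; the CSV content, the `orders` table, and the function names are assumptions made up for illustration.

```python
import csv
import io
import sqlite3

# Hypothetical source data; a real package would read a file or a database.
SOURCE_CSV = "client_id,amount\n100023, 19.99 \n100024,5\n"


def extract(text):
    """Extract: read rows from the CSV source as dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transform: trim stray whitespace and convert strings to typed values."""
    return [(int(r["client_id"]), float(r["amount"].strip())) for r in rows]


def load(rows, conn):
    """Load: write the transformed rows to the destination table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS orders (client_id INTEGER, amount REAL)"
    )
    conn.executemany("INSERT INTO orders VALUES (?, ?)", rows)
    conn.commit()


# In-memory SQLite stands in for the destination system.
conn = sqlite3.connect(":memory:")
load(transform(extract(SOURCE_CSV)), conn)
```

Each step only depends on the output of the previous one, which is the property that lets an ETL tool swap sources and destinations independently.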

Why?
Data exists in many places and in many types.
Data is more useful when consolidated.
How do we get data from multiple systems and locations to a central place?
How do we convert data?
How do we consolidate it?
How do we structure it so it matches our business domain?

Data gets dirty.
How do we make sure every insert, delete and update does not rot the data?
How do we fix dirty data?

Administrators are expensive.
How do we do more with less?

Reasons To Consider SSIS


Merging Data from Heterogeneous Data Stores
Populating Data Warehouses
Cleaning and Standardizing Data
Building Business Intelligence into a Data Transformation Process
Automating Administrative Functions and Data Loading
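As a rough illustration of the first and third reasons (merging heterogeneous stores, cleaning and standardizing), the sketch below consolidates a CSV source and a JSON source into one customer list. The two sources, their field names, and the "later source wins" rule are all assumptions for the example, not something SSIS prescribes.

```python
import csv
import io
import json

# Two hypothetical sources with different shapes and inconsistent formatting.
CSV_SOURCE = "id,name\n1, alice SMITH\n2,Bob Jones\n"
JSON_SOURCE = (
    '[{"customer_id": 2, "full_name": "bob jones"},'
    ' {"customer_id": 3, "full_name": "Carol  White"}]'
)


def standardize_name(name):
    """Collapse repeated whitespace and apply consistent capitalization."""
    return " ".join(part.capitalize() for part in name.split())


def from_csv(text):
    """Yield (id, name) pairs from the CSV source."""
    for row in csv.DictReader(io.StringIO(text)):
        yield int(row["id"]), standardize_name(row["name"])


def from_json(text):
    """Yield (id, name) pairs from the JSON source."""
    for item in json.loads(text):
        yield int(item["customer_id"]), standardize_name(item["full_name"])


def consolidate(*sources):
    """Merge all sources into one dict keyed by id; later sources win."""
    merged = {}
    for source in sources:
        for customer_id, name in source:
            merged[customer_id] = name
    return merged


customers = consolidate(from_csv(CSV_SOURCE), from_json(JSON_SOURCE))
```

Standardizing before merging is what lets customer 2 from both sources collapse into a single record.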

Reasons Not to Consider SSIS


Performance
  Consider SQL or BCP for simple imports
  File system performance

Data Latency
  SSIS is not a near-real-time solution
  No business rules support

SOA, ESB, B2B Integration
  Very basic queue support
  XML support limited

The SSIS Platform


SSIS provides a platform to component developers
Extensibility points
Control Flow
  Tasks, Loop enumerators, Event handlers, Log Providers
Data Flow
  Sources, Destinations, Transformations, Connection Managers

SSIS - Architecture
[Architecture diagram: Visual Studio / SQL BI Studio, SQL Management Studio, Package, IS Object Model, DTUtil, IS Service, Package Store, msdb, Cfg]

SSIS - Packages
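[Package diagram: a Package (XML) holds a Control Flow made of Containers and Tasks linked by Precedence, plus Variables, Event Handlers, Connections, Log Providers, and Configurations. A DataFlow Task in the Control Flow contains Source, Transform, and Dest components linked by Paths.]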

SSIS - Architecture
[Architecture diagram: SQL Management Studio (Package List, Monitor), DTExec, Package, IS Service, Package Store, msdb, Cfg]

What's New in 2008

Lookup transformation performance improvements and new caching options
ADO.NET source and destination components
Data profiling task and viewer
Wizard interface for defining source and destination
Scripts (for the Script Transform) are now written in Visual Studio and thus in .NET languages
New package format
Three new data types for working with times

What to Watch Out For?


RecordSet Destination
  SLOW (5 times slower than a raw file!)
Memory
  SSIS is an in-memory process.
SELECT *
  Exceptionally bad in SSIS
Use many small packages
Comments!!!
Understand the components
  Many do the same things in different ways with different trade-offs
  Lookup vs. Merge Join, or Execute SQL vs. Execute T-SQL
  Understand which components run asynchronously and which run synchronously
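One way to picture the Lookup side of the Lookup vs. Merge Join trade-off is a dictionary held in memory: the reference rows are cached once, and incoming rows can arrive in any order. This is a conceptual sketch in plain Python, not the SSIS component; the product names, the `build_cache` and `lookup` helpers, and the choice to drop unmatched rows are assumptions for illustration.

```python
def build_cache(reference_rows, key):
    """Load the whole reference set into memory, indexed by the join key."""
    return {row[key]: row for row in reference_rows}


def lookup(rows, cache, key):
    """Enrich each incoming row from the cache; input order does not matter."""
    for row in rows:
        match = cache.get(row[key])
        if match is not None:
            # Combine the incoming row with the matching reference row.
            yield {**row, **match}
        # Rows without a match are dropped in this sketch.


# Hypothetical reference data and incoming rows.
products = [
    {"ProductID": 432, "Name": "Widget"},
    {"ProductID": 233, "Name": "Gadget"},
]
order_lines = [
    {"OrderID": 200802223, "ProductID": 432},
    {"OrderID": 200833298, "ProductID": 567},
]

cache = build_cache(products, "ProductID")
enriched = list(lookup(order_lines, cache, "ProductID"))
```

The cost of this approach is memory proportional to the reference set, which ties back to the point that SSIS is an in-memory process.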

How Does the Data Flow Work?


In-memory ETL engine
Metadata set at design time
Data is moved through the pipeline in buffers
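The idea of moving data through a pipeline in buffers can be sketched with Python generators: rows are grouped into fixed-size batches and each transform handles one batch at a time. The buffer size, the row fields, and the function names are assumptions for the example; the real engine sizes its buffers on its own terms.

```python
from itertools import islice


def read_in_buffers(rows, buffer_size):
    """Group a row stream into lists ("buffers") of at most buffer_size rows."""
    iterator = iter(rows)
    while True:
        buffer = list(islice(iterator, buffer_size))
        if not buffer:
            return
        yield buffer


def add_line_total(buffers):
    """A transform that processes one buffer at a time."""
    for buffer in buffers:
        for row in buffer:
            row["total"] = row["qty"] * row["price"]
        yield buffer


# Five hypothetical rows flowing through the pipeline two at a time.
source = ({"qty": q, "price": 2} for q in range(1, 6))
buffers = list(add_line_total(read_in_buffers(source, buffer_size=2)))
```

Only one buffer per stage needs to be in memory at any moment, which is why the pipeline can handle inputs larger than available memory as long as no stage has to see every row first.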

Transforms and Buffer Copies


Row Based (synchronous)
  Logically works row by row
  Row Count, Derived Column
  Buffer is reused

Partially Blocking (asynchronous)
  Works with groups of rows
  Merge, Merge Join, Union All
  Data copied to new buffers

Blocking (asynchronous)
  Needs all input rows before producing any output rows
  Aggregate, Sort, Pivot
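The three transform classes can be imitated in plain Python to show why they behave differently with buffers. This is a conceptual sketch, not how the SSIS engine is implemented; the function names and sample rows are assumptions for illustration.

```python
import heapq


def derived_column(buffer):
    """Row based (synchronous): changes rows in place and reuses the buffer."""
    for row in buffer:
        row["total"] = row["qty"] * row["price"]
    return buffer  # the same list object, no copy


def merge_sorted(left, right, key):
    """Partially blocking: emits rows as soon as both sorted inputs allow it."""
    return heapq.merge(left, right, key=lambda row: row[key])


def sort_rows(rows, key):
    """Blocking: must read every input row before emitting the first one."""
    return sorted(rows, key=lambda row: row[key])  # builds a new list


buffer = [
    {"id": 3, "qty": 1, "price": 5},
    {"id": 1, "qty": 2, "price": 4},
]
same_buffer = derived_column(buffer)
sorted_rows = sort_rows(buffer, "id")
merged = list(merge_sorted(sorted_rows, [{"id": 2}], "id"))
```

The synchronous step hands back the buffer it was given, while the two asynchronous steps produce new sequences, which mirrors the "buffer is reused" versus "data copied to new buffers" distinction.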

The Data Pipeline


Source -> Transform -> Multicast -> Destination, Destination

ClientID  OrderID    ProductID
100023    200802223  432
100023    200809933  233
100023    200812492  222
100023    200833298  567
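The Multicast step in the pipeline above sends every row to more than one destination. A minimal Python sketch of that behavior, using the sample rows from the slide, might look like this; the `multicast` helper and the destination names `warehouse` and `audit_log` are assumptions for illustration.

```python
def multicast(rows, *destinations):
    """Send every input row to each destination."""
    for row in rows:
        for destination in destinations:
            destination.append(row)


# Sample rows from the slide.
orders = [
    {"ClientID": 100023, "OrderID": 200802223, "ProductID": 432},
    {"ClientID": 100023, "OrderID": 200809933, "ProductID": 233},
    {"ClientID": 100023, "OrderID": 200812492, "ProductID": 222},
    {"ClientID": 100023, "OrderID": 200833298, "ProductID": 567},
]

# Two hypothetical destinations, modeled as plain lists.
warehouse, audit_log = [], []
multicast(orders, warehouse, audit_log)
```

Both destinations receive the same four rows, so the source is read only once even though the data lands in two places.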

Designing Your Data Flow


[Data flow diagram: Orders and Customers are each sorted and fed into a Merge Join; Products is sorted and fed into a second Merge Join.]
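The Sort steps in front of each Merge Join exist because a merge join walks two inputs in key order. The sketch below shows an inner merge join over two pre-sorted lists in plain Python; it is a textbook sort-merge join, not the SSIS component itself, and the customer names and `merge_join` helper are assumptions for illustration.

```python
def merge_join(left, right, key):
    """Inner join of two inputs that are already sorted by key."""
    left, right = list(left), list(right)
    i = j = 0
    while i < len(left) and j < len(right):
        left_key, right_key = left[i][key], right[j][key]
        if left_key < right_key:
            i += 1
        elif left_key > right_key:
            j += 1
        else:
            # Emit one output row per right row that shares this key.
            scan = j
            while scan < len(right) and right[scan][key] == left_key:
                yield {**left[i], **right[scan]}
                scan += 1
            i += 1


# Hypothetical unsorted inputs.
customers = [
    {"ClientID": 100099, "Name": "B"},
    {"ClientID": 100023, "Name": "A"},
]
orders = [
    {"ClientID": 100023, "OrderID": 200809933},
    {"ClientID": 100023, "OrderID": 200802223},
    {"ClientID": 100050, "OrderID": 200812492},
]


def by_client(row):
    return row["ClientID"]


# Both inputs must be sorted on the join key before the merge join runs.
joined = list(
    merge_join(
        sorted(orders, key=by_client),
        sorted(customers, key=by_client),
        "ClientID",
    )
)
```

Because sorting is a blocking step, each Sort in the diagram has to read its whole input before the join can start, which is the main cost to weigh when designing a data flow this way.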

How Do I Deploy My Packages?

File System

SQL Server

Package Store

Deployment Utility

Deployment - SQL Server

Files are stored in MSDB
Execution typically done with SQL Agent
Make use of Proxy Accounts
Store configurations in SQL Server

Deployment - Package Store

Deploys to the SSIS service
Front end to File and SQL storage
