DQS BOOTCAMP
Microsoft
Top 3 impediments
Demand is on the rise. The overall market for DQ software was $800M in 2010, a 12.6% increase over 2009, with forecast growth of 16% per year over the next five years.
- Gartner, 2011
It's not only the breadth of functional capabilities. Focus on the business user. Leverage your business resources.
- Gartner, 2011
Business process
For data quality (and MDM) initiatives to succeed, they need to integrate with existing business processes.
20.1% 30.4%
Data Quality Issue
Are data elements consistently defined and understood?
Is all necessary data present?
Does the data accurately represent reality or a verifiable source?
Do data values fall within acceptable ranges?
Data appears several times
Before / After (sample customer records; values listed per field in the order they appear on the slide)
Name: John Doe; Jane Doe
Gender: Male
Street: Jonathan ln; E 60th St; Jonathan Lane
House #: 36; 45W; 36
Zip code: 10023; 10022; 10023
City: Poughkeepsy; New York; Poughkeepsie
State: NY; NY; NY
D.O.B: 21-dec-1954; 08/12/64; 12/21/54
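The before/after values above (for example "Jonathan ln" becoming "Jonathan Lane", "Poughkeepsy" becoming "Poughkeepsie", and "21-dec-1954" becoming "12/21/54") can be illustrated with a minimal Python sketch. This is not how DQS implements cleansing; the lookup tables, function names, and target date format are assumptions made only for illustration.

```python
from datetime import datetime

# Hypothetical domain knowledge; a real project would keep this in a knowledge base.
STREET_SUFFIXES = {"ln": "Lane", "st": "St", "ave": "Ave"}
CITY_CORRECTIONS = {"poughkeepsy": "Poughkeepsie"}
DATE_FORMATS = ("%d-%b-%Y", "%m/%d/%y", "%m/%d/%Y")


def standardize_street(street: str) -> str:
    """Replace a trailing street-suffix abbreviation using the lookup table."""
    words = street.split()
    if words:
        last = words[-1].rstrip(".").lower()
        words[-1] = STREET_SUFFIXES.get(last, words[-1])
    return " ".join(words)


def correct_city(city: str) -> str:
    """Apply a known spelling correction, otherwise keep the value."""
    return CITY_CORRECTIONS.get(city.strip().lower(), city.strip())


def standardize_date(value: str) -> str:
    """Try each known input format and emit MM/DD/YY (assumed target format)."""
    for fmt in DATE_FORMATS:
        try:
            parsed = datetime.strptime(value.strip(), fmt)
        except ValueError:
            continue
        return parsed.strftime("%m/%d/%y")
    return value  # unrecognized values are left for manual review


def cleanse(record: dict) -> dict:
    """Return a cleansed copy of the record; the input is not modified."""
    cleaned = dict(record)
    cleaned["street"] = standardize_street(record["street"])
    cleaned["city"] = correct_city(record["city"])
    cleaned["dob"] = standardize_date(record["dob"])
    return cleaned


before = {"street": "Jonathan ln", "city": "Poughkeepsy", "dob": "21-dec-1954"}
after = cleanse(before)
print(after)
```

Two-digit years are ambiguous, so a production rule would need an explicit century policy rather than the parser's default.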
Completeness
Accuracy
Conformity
Consistency
Uniqueness

Before / After (four sample records; values listed per field in the order they appear on the slide)
Name: John Smith; Margaret & John smith; Maggie Smith; John Smith
Address: 545 S Valley View Drive # 136; 545 Valley View ave unit 136; 545 S Valley View Dr; 545 Valley Drive St.
City: Anytown; Anytown; Anytown; NY City
State: New York; New York; New York; NY
Zip Code (Postal Code): 34563; 34563-2341; 34253
Cluster: 1; 1; 1; 2
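The cluster column (1, 1, 1, 2) shows near-duplicate records being grouped together. A minimal Python sketch of that idea follows. It is not the DQS matching algorithm: the token-overlap score, the 0.7/0.3 weights, the 0.6 threshold, and the abbreviation table are all assumptions chosen for this example.

```python
import re

# Hypothetical abbreviation table used to normalize address words.
ABBREVIATIONS = {"drive": "dr", "avenue": "ave", "street": "st"}


def address_tokens(address: str) -> set:
    """Lowercase the address, drop punctuation, and normalize abbreviations."""
    words = re.findall(r"[a-z0-9]+", address.lower())
    return {ABBREVIATIONS.get(word, word) for word in words}


def similarity(a: dict, b: dict) -> float:
    """Weighted score: Jaccard overlap of address tokens plus exact city match."""
    tokens_a = address_tokens(a["address"])
    tokens_b = address_tokens(b["address"])
    jaccard = len(tokens_a & tokens_b) / len(tokens_a | tokens_b)
    same_city = a["city"].lower() == b["city"].lower()
    return 0.7 * jaccard + 0.3 * same_city


def cluster(records: list, threshold: float = 0.6) -> list:
    """Single-link clustering with union-find; returns 1-based cluster labels."""
    parent = list(range(len(records)))

    def find(i: int) -> int:
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path compression
            i = parent[i]
        return i

    for i in range(len(records)):
        for j in range(i + 1, len(records)):
            if similarity(records[i], records[j]) >= threshold:
                parent[find(j)] = find(i)

    labels = {}
    return [labels.setdefault(find(i), len(labels) + 1) for i in range(len(records))]


records = [
    {"name": "John Smith", "address": "545 S Valley View Drive # 136", "city": "Anytown"},
    {"name": "Margaret & John smith", "address": "545 Valley View ave unit 136", "city": "Anytown"},
    {"name": "Maggie Smith", "address": "545 S Valley View Dr", "city": "Anytown"},
    {"name": "John Smith", "address": "545 Valley Drive St.", "city": "NY City"},
]
clusters = cluster(records)
print(clusters)
```

Single-link grouping means the second and third records land in the same cluster because each is similar enough to the first, even though they score below the threshold against each other.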
Cleansing: Amend, remove or enrich data that is incorrect or incomplete. This includes correction, enrichment and standardization.
Matching
Profiling: Analysis of the data source to provide insight into the quality of the data and to help identify data quality issues.
Monitoring: Tracking and monitoring the state of quality activities and the quality of data.
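As a rough illustration of profiling, the Python sketch below computes two simple per-column measures: completeness (share of non-empty values) and the number of distinct values. The function name, the choice of measures, and the sample rows are assumptions for illustration and do not reflect the statistics DQS itself reports.

```python
def profile(rows: list) -> dict:
    """Return completeness and distinct-value count for each column."""
    if not rows:
        return {}
    report = {}
    for column in rows[0]:
        values = [row.get(column) for row in rows]
        filled = [value for value in values if value not in (None, "")]
        report[column] = {
            "completeness": len(filled) / len(values),
            "distinct": len(set(filled)),
        }
    return report


# Illustrative rows only; the missing zip value is a made-up gap.
rows = [
    {"state": "New York", "zip": "34563"},
    {"state": "New York", "zip": "34563-2341"},
    {"state": "New York", "zip": None},
    {"state": "NY", "zip": "34253"},
]
report = profile(rows)
print(report)
```

In this sample, the state column is fully populated but has two spellings of the same state, which points to a consistency issue, while the zip column shows a completeness gap.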
INTRODUCE DQS
Multiple Secondaries
Reporting Alerts
Power View
Distributed Replay
Availability Groups
SharePoint Active Directory Support
T-SQL
Data Quality Services (DQS) is a knowledge-driven data quality solution that enables data stewards to easily improve the quality of their data.
High-quality data is critical to effective business intelligence and to business activities.
DQS is an on-premises data quality product in SQL Server 2012, extensible with knowledge from multiple parties through Azure DataMarket.
Richer DQ knowledge and capabilities in the cloud will make it even easier to provide high-quality data.
Based on a Data Quality Knowledge Base (DQKB)
Data Domains capture the semantics of your data
Acquires additional knowledge the more you use it
Add user-generated knowledge & 3rd party reference data providers
User experience designed for increased productivity
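The idea of a domain that captures valid values and acquires knowledge with use can be sketched in a few lines of Python. The `Domain` class, its method names, and its status strings are hypothetical and are not part of any DQS API; they only illustrate how an approved correction becomes reusable knowledge.

```python
class Domain:
    """Hypothetical stand-in for a knowledge-base domain."""

    def __init__(self, name: str, valid_values: list):
        self.name = name
        self.valid = set(valid_values)
        self.corrections = {}  # wrong value -> approved replacement

    def check(self, value: str) -> tuple:
        """Return (status, value); status is 'correct', 'corrected' or 'unknown'."""
        if value in self.valid:
            return "correct", value
        if value in self.corrections:
            return "corrected", self.corrections[value]
        return "unknown", value

    def learn(self, wrong: str, right: str) -> None:
        """Store a steward-approved correction so later runs apply it automatically."""
        if right not in self.valid:
            raise ValueError(f"{right!r} is not a valid {self.name} value")
        self.corrections[wrong] = right


city = Domain("City", ["New York", "Poughkeepsie"])
first = city.check("Poughkeepsy")           # not yet known to the domain
city.learn("Poughkeepsy", "Poughkeepsie")   # data steward approves the fix
second = city.check("Poughkeepsy")          # now corrected automatically
print(first, second)
```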
Knowledge Management
Build
Connect
Integrated Profiling
Knowledge Base
Use
DQ Projects
DQ Clients
DQS UI
Knowledge Discovery and Management
Interactive DQ Projects
DQ Server
DQ Engine
Data Exploration
Knowledge Discovery
Matching
Reference Data
DQ Projects Store
DQ Active Projects
MS Data Domains
Published KBs
Define
With DQS, the information worker (IW) or data expert can get actively involved in data quality initiatives.
Knowledge-Driven
Rich semantic Knowledge Base
Continuous improvement as knowledge is discovered
Build once, reuse for multiple DQ improvements

Focus on cloud-based Reference Data
User-generated knowledge
Integration with SSIS and MDS

Easy to use
Focus on productivity and user experience
Designed for business users
Out-of-the-box knowledge (DQ content)
http://northamerica.msteched.com
www.microsoft.com/teched
www.microsoft.com/learning
http://microsoft.com/technet
http://microsoft.com/msdn
DQS Blog (blogs.msdn.com/b/dqs)
Tips, tricks and guidance on best practices for using DQS, courtesy of the DQS team
DQS Movies
A set of getting-started movies for an easy introduction to DQS
DQS Forum
Come participate in DQS-related discussions in our DQS forum on MSDN