How to Meet the
Demand for
Integrated Data
Integrate and translate data into a business context for analysis
PRO+ E-guide
In this e-guide
Data integration platforms take users beyond ETL software
How to justify the purchase of a data integration tool
How to evaluate the features of data integration products
Selecting the right data integration tool for your needs
Getting more PRO+ essential content
In this e-guide:
The demand for data that has been integrated and translated
into a business context for analysis is making effective data
integration more important than ever to business success.
In this guide, learn how commercial data integration platforms
can help your organization manage and simplify the process of
sharing the increasing volumes of data being generated and
collected. Next, explore how other organizations are using
these platforms to meet their needs.
Our experts also reveal the must-have, should-have and nice-to-have
features for data integration tools, and provide
independent analysis on the top commercial and open source
data integration tools for organizations of all sizes.
Data integration platforms take users
beyond ETL software
Rick Sherman, Managing Partner - Athena IT Solutions
Whether we're discussing the impact that the tsunami of big data is having
on organizations or the cloud application takeover of traditional on-premises
applications, the common foundation of such trends is an increasing
demand for data. More accurately, there is a need for data that has been
integrated and translated into a business context for analysis. That demand
is making effective data integration, already a key component of data
warehouse environments, even more important to business success.
Data integration involves taking data, often from multiple sources, and
transforming it into meaningful information for business executives, data
analysts and other enterprise users. As the need to share the growing
volumes of data being generated and collected by organizations increases,
turning to commercial data integration platforms is one way to help manage,
and simplify, the process.
What are data integration platforms?
Packaged data integration software began with extract, transform and load
(ETL) tools designed to automate efforts to pull data from source systems,
convert it into a consistent format and load it into a data warehouse or other
target database. The first generation of ETL tools consisted of simple but
expensive code generators with limited functionality. Many of the companies
that evaluated these tools found it more effective to develop their own
custom integration code. Second-generation ETL software offered more
functionality, but was primarily batch-oriented and didn't perform well. Based
on those two sets of tools, many IT managers felt that ETL software wasn't
worth the cost or the effort to learn, as it wouldn't meet their performance
needs.
But, over the years, ETL tools have evolved in several key areas, including
development, operational processing and integration functionality. To make
them a more viable development platform, ETL vendors added support for
code management, version control, debugging and documentation
generation. For operational processing, the tools now have built-in
functionality such as error handling, recovery/restart, run-time statistics and
scheduling.
As the industry gained experience and sophistication in data integration,
best practices were developed that were then added into ETL tools as
prebuilt transformations. These transformations include mechanisms for
change data capture, slowly changing dimensions, hierarchy management,
data connectivity, data merging, reference lookups and referential integrity
checks. Data integration performance has increased significantly by
leveraging memory, parallelism and various data transport architectures.
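To make one of these prebuilt transformations concrete, here is a minimal Python sketch of a slowly changing dimension handled in the style often called Type 2, where a changed attribute closes the current row and adds a new one instead of overwriting history. The `DimRow` fields and the `apply_scd2` helper are hypothetical names chosen for illustration; they do not come from any vendor's product, which would normally expose this logic as a configurable component.

```python
from dataclasses import dataclass, replace
from datetime import date
from typing import List, Optional


@dataclass(frozen=True)
class DimRow:
    """One version of a customer in a dimension table (hypothetical layout)."""
    customer_id: int
    city: str
    valid_from: date
    valid_to: Optional[date] = None  # None marks the current version


def apply_scd2(history: List[DimRow], customer_id: int,
               new_city: str, as_of: date) -> List[DimRow]:
    """Return a new history list with the change applied, keeping old versions."""
    result = []
    changed = False
    found_current = False
    for row in history:
        if row.customer_id == customer_id and row.valid_to is None:
            found_current = True
            if row.city != new_city:
                # Close the current version instead of overwriting it.
                result.append(replace(row, valid_to=as_of))
                changed = True
                continue
        result.append(row)
    if changed or not found_current:
        # Add the new current version (also covers brand-new customers).
        result.append(DimRow(customer_id, new_city, as_of))
    return result


history = [DimRow(1, "Boston", date(2020, 1, 1))]
updated = apply_scd2(history, 1, "Denver", date(2021, 6, 1))
```

After the call, `updated` holds two rows for the customer: the closed Boston version and the open Denver version. Applying the same city again leaves the history unchanged.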
In addition, a variant of ETL tools emerged called extract, load and
transform (ELT). These tools eliminate the need for a separate application
server dedicated to ETL functionality and can be deployed at either the data
sources or target systems based on their capacity and configurations. The
ELT approach lets users store raw data as is and then transform all or
subsets of it as needed for specific business intelligence (BI) and analytics
applications.
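The ELT pattern can be sketched in a few lines of Python. This is an illustration only: the standard-library `sqlite3` module stands in for the target system, and the table names `raw_orders` and `sales_by_region` and the sample rows are assumptions made for the example.

```python
import sqlite3

# Source rows arrive as text, exactly as extracted (hypothetical sample data).
raw_orders = [
    ("1001", "east", "20.50"),
    ("1002", "west", "5.00"),
    ("1003", "east", "9.50"),
]

conn = sqlite3.connect(":memory:")

# Load: store the raw data as is, with no conversion on the way in.
conn.execute("CREATE TABLE raw_orders (order_id TEXT, region TEXT, amount TEXT)")
conn.executemany("INSERT INTO raw_orders VALUES (?, ?, ?)", raw_orders)

# Transform: the target database casts and aggregates a subset when needed.
conn.execute(
    """
    CREATE TABLE sales_by_region AS
    SELECT region, SUM(CAST(amount AS REAL)) AS total
    FROM raw_orders
    GROUP BY region
    """
)

totals = dict(conn.execute("SELECT region, total FROM sales_by_region"))
```

The raw table stays available for other transformations later, which is the main point of loading before transforming.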
ETL tools evolve into data integration platforms
Data integration needs also expanded beyond the core ETL use of loading
enterprise data warehouses; data marts; and BI data stores, such as OLAP
cubes, to include these tasks:
B2B integration
Cloud integration
Application and business process integration
Data migration
Data consolidation
Data quality and cleansing
Master data management
As that happened, the following integration categories emerged as well,
targeting specific uses and technologies:
Enterprise application integration (EAI). Often referred to simply as
application integration, this subcategory, which supports interoperability
among different applications, is enabled through Web or data services
created using service-oriented architecture and industry standards such as
electronic data interchange. An enterprise service bus is a common
architectural approach to implementing EAI functionality.
Enterprise messaging system (EMS). This technology focuses solely on
providing messaging among disparate applications using structured formats
such as extensible markup language and JavaScript Object Notation. EMS
tools offer a lightweight integration service that can effectively provide real-time
data updates from different data sources.
Enterprise information integration (EII). EII, initially known as data
federation, provided a virtual view of disparate data sources, but had
limited integration capabilities. The current generation, called data
virtualization software, provides both data abstraction and data services
layers to a wide variety of sources, including structured, semi-structured and
unstructured data.
Cloud-based integration. Also referred to as integration platform as a
service (iPaaS), cloud-based integration emerged to provide real-time
interoperability between cloud-based applications and databases. These
tools are deployed as a cloud service leveraging EAI and EMS functionality.
Eventually, vendors put the various pieces together and began offering full-fledged
data integration suites that provide hybrid capabilities spanning ETL,
application integration, cloud-based integration, real-time integration and
data virtualization, as well as data cleansing and data profiling tools. The
suites can support data integration processes in traditional batch mode, or
in real or near-real time through the use of Web services. They can also
handle both on-premises and cloud data and less structured information
(system logs, text and other forms of big data, for example) along with
structured transaction data.
Dispelling data integration tool myths
If used correctly, data integration platforms greatly improve user
productivity and integration flexibility, scalability and expandability over
custom manual coding (see sidebar). But hand coding, either by IT workers
writing SQL scripts or business people using spreadsheets, is still being
done extensively in organizations.
There are several reasons why IT groups believe they should manually write
code rather than use a data integration platform; however, these beliefs are
usually based on the following misconceptions or myths:
Integration tools are too expensive. There's a lingering perception left over
from the early days of ETL that expensive tools are the only choice, but
many data integration platforms priced for cost-sensitive budgets are now
available in the market.
Highly skilled resources are required. Another false perception is that an
enterprise looking to use commercial software needs data integration
developers experienced in the legacy ETL tools that required extensive
skills rather than the newer, easier-to-use data integration platforms.
Coding is cost-free. There's an inherent bias for the IT staff to generate SQL
code: They know SQL and can create code in it quickly, and there are no
license or subscription costs. But what starts out as a simple SQL script can
quickly snowball into numerous scripts or stored procedures, creating a
hodgepodge of often undocumented integration processes. Making
changes to that code takes longer as it gets more complex, consuming more
and more resources just to maintain it.
The data integration platform market
A variety of data integration platforms are available, but the market is led by
IBM, Informatica, Information Builders, Microsoft and Oracle. Other vendors
considered leaders either by market share or thought leadership include
Pentaho, SAP, SAS and Talend.
All of these vendors sell data integration products that are deployed on
premises, but can integrate data that resides on premises or in the cloud.
Also, both Pentaho and Talend offer open source versions of their products,
along with paid-for enterprise versions. Although pricing is a separate
discussion and will be covered more in-depth in a later article, Microsoft is
unique in that it bundles its data integration product with its database rather
than selling it separately.
Data integration continues to primarily be an IT-centric activity based on the
data, database and technology know-how needed. Typically, data
integration platforms are purchased, managed and used by IT groups
responsible for BI, data warehousing, master data management and other
data management programs. These groups should have the skills and
experience to successfully utilize the platforms. Some leading-edge
enterprises with multiple integration use cases and separate IT groups
addressing those uses have created integration competency centers to
manage their data integration platforms from an enterprise-wide perspective
in an effort to avoid integration and, ultimately, data silos.
How to justify the purchase of a data
integration tool
Rick Sherman, Managing Partner - Athena IT Solutions
The growing importance of business intelligence and data analytics
applications in driving business decision making has made data integration's
vital role in the enterprise crystal clear. From gathering data, transforming it
into useful information and delivering it to the business users or processes
that need it, data integration routines provide the crucial link between a
variety of source and target systems.
As the first article in this series examined, several types of packaged
software have emerged to meet the challenges of data integration. The
current generation of data integration tools consists of full-fledged suites
that support extract, transform and load (ETL) processes, application
integration, cloud-based and real-time integration, data virtualization, data
cleansing and data profiling.
How can you determine if your organization should invest in a data
integration tool? To help justify the purchase of data integration software,
let's explore how other organizations are using these platforms to meet their
needs.
How companies are using data integration tools
We've all been hearing about the explosion in data volumes; it's big data
wherever you look. But not only is there more data to be integrated, there
are more categories of data, including a mix of unstructured and semi-structured
data types in addition to traditional structured transaction data.
This makes data integration more critical than ever.
The most prevalent use for data integration today is integrating data from
multiple sources to enable BI and analytics applications, and it's typically
where packaged data integration software is introduced in an enterprise.
This use case can be further broken down into these three subcategories,
which enable data to be:
+ Integrated into a data warehouse or other analytical data store.
This is the use case that started it all with ETL: extract data from
various sources, transform it and load it into an enterprise data
warehouse (EDW). Data integration tasks account for the majority of
your work when setting up an EDW and populating it with data.
Traditionally, relational databases are most commonly used for an
EDW, but nonrelational technology such as Hadoop clusters and
columnar or NoSQL data stores are also increasingly being used to
create what are known variously as hybrid, extended or logical data
warehouse environments. That further adds to the data integration
workload.
+ Integrated into a BI data store dedicated to specific analytics uses.
In this case, the primary integration tasks are to transform data sets
from an EDW for use by specific business groups or areas of analysis
and then load them into a special-purpose data store, such as a data
mart, online analytical processing (OLAP) cube or columnar database.
Data from other sources may also need to be added to enrich the
information. Moving data from a relational EDW to a relational data
mart is straightforward, but additional transformation work is required
with an OLAP cube, columnar database or other nonrelational target
system.
+ Blended, prepared or wrangled for use in a BI platform. Although
some BI tools let users query data directly, many, such as data
discovery tools, work best with data models created using data
integration tools, which are then used to load into an in-memory
columnar model for analysis.
Organizations also need to be able to gather data from, and deliver it to, an
increasingly diverse mix of systems, databases and applications running on
premises and in both public and private clouds. Mobile and Internet of
Things (IoT) applications add to the complexity, as does the use of external
data sources to augment internal information. Typical use cases for data
integration tools beyond BI include:
Migrating, consolidating or converting data from one or more
applications to another application, database or device. The best practice
for application consolidation or migration tasks has shifted from custom
coding to using data integration tools. This change is due to the productivity
gains these tools provide as well as business requirements such as data
validation and documentation. The advantages of tools-based integration
include built-in processes for complicated business or technical
transformations, iterative testing and profiling data for historical data
conversion, and support for managing parallel testing of new and old
applications.
Acquiring and processing data for master data management (MDM).
Depending on the state of the data, its sources and uses, general-purpose
data integration tools may need to be augmented by special-purpose tools
to cleanse or enrich the data. A common example is when customer-related
data such as the names of people or businesses and their addresses need
to be matched, cleansed and enriched; that may call for leveraging things
such as text-based transformations, site or address lists and business entity
databases.
Synchronizing data between on-premises systems and cloud
applications or IoT devices. Although hailed as a means of lowering
technology costs, cloud applications typically must be integrated with
existing systems running on premises. The same applies to the oncoming
wave of data from IoT or smart devices such as sensors on industrial
equipment. All of this data from the cloud and IoT needs to be exchanged
and synchronized between applications. Data integration tool capabilities
have expanded to leverage various transport mechanisms and application
program interfaces (APIs) to replace the custom coding that previously was
the only method to perform this integration.
Exchanging data between different organizations. Much of the initial wave
of data exchanges
between companies and their suppliers, business partners, customers and
prospects were file-based transfers, but a data integration tool can
automate such exchanges, increasing productivity and lowering costs.
Delivering and processing data for complex event processing and stream
processing. Interoperability and data interaction demands between
operational processes such as applications, event streams, message
queues, Web services and sensors have steadily increased the need for
real-time data integration. As data integration platforms have added real-time
processing, more sophisticated workflow capabilities and support for a
wide variety of APIs, they can be used instead of the custom coding that
was previously required.
Virtually gathering and integrating data from disparate systems. Even
when an enterprise has a data warehouse or MDM hub, there are many
business scenarios when data virtualization should be used. First, sometimes
real-time access to disparate systems is crucial, such as when an account
manager or customer support representative is interacting with a customer
regarding their account or outstanding orders. Second, integrating data from
specific sources may occur infrequently or in an exploratory nature,
precluding the use and cost of integrating that data into a data warehouse (DW). Finally, there
may be data sources that have not yet been considered to be integrated into
a DW, but still need to be integrated for an operational process or analytical
analysis.
With the ever-increasing amount of data from disparate systems that needs
to be integrated to support business operational and analytical processes,
it's imperative to determine your organization's data integration needs and
use cases. Failing to identify data integration requirements will either result
in your organization not getting the data it needs or getting it in a very costly,
time-consuming and inefficient manner. And, as already mentioned, custom-coded
data integration may actually create data silos that increase data
inconsistency.
Once you've determined that your organization could benefit from a data
integration tool, the next step is to examine how the features and functions
of this software match your needs.
How to evaluate the features of data
integration products
Rick Sherman, Managing Partner - Athena IT Solutions
Selecting the right data integration product is critical to meeting the
increasing demand in companies for data that can help drive more informed
business decisions. The tool you choose to integrate and translate this data
into information that can generate actionable business insights must fulfill
your organization's requirements. Otherwise, it will become expensive,
unused shelfware. Even worse, custom manual coding of integration scripts,
with all its downsides, will prevail.
The data integration product evaluation process starts with gathering and
prioritizing requirements such as source and target systems, the types of
data you have to pull together and the forms of integration that will be
needed. There can be a lot of variables in those requirements. For example,
you may have a mix of structured and unstructured data to integrate. And
the data integration platforms now offered by vendors support a variety of
integration use cases: extract, transform and load processes; application
integration; cloud-based and real-time integration; data virtualization; data
cleansing and data profiling.
Once you have the requirements in hand, you can move on to creating a list
of specific features and functions to evaluate products against. Ultimately,
your organization needs to select the data integration tool that's the best fit
for its use cases and budget, and one that can be implemented given your
enterprise's resources and skills, not necessarily the most feature-laden
product, or one deemed the best by industry analysts.
Data integration product evaluation and selection
criteria
To simplify your selection process, classify the list of features and functions
you put together as must-haves, should-haves, nice-to-haves and will-not-use
items.
Must-have features should be unambiguous; if a product doesn't have them,
it should be eliminated from further consideration. Should-have features
occupy a gray area between absolutely must-have and merely nice-to-have
features, where certain capabilities can have a major impact on integration
productivity, scalability and maintainability. Although nice-to-have features
aren't required, they're often the differentiators in selecting a product. On
the other hand, if a data integration product includes features that aren't
going to be used, don't waste time examining those components of the
software.
When determining whether a product has a particular desired feature,
sometimes the answer is, "Yes, it meets the criteria, but." The buts include
such things as custom coding is required; an add-on product, possibly from
a third party, needs to be purchased; only specific product editions have that
feature; or the feature is slated to be added in a future release. These
exceptions generally mean additional time and expense are required for that
product to meet the criteria. The evaluators need to determine how to
handle such products, both to ensure an objective evaluation and to avoid
surprises if one of them is selected.
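One way to keep such an evaluation objective is to record the classification and the vendor answers in a simple scoring sheet. The Python sketch below is only an illustration of that idea: the weights, the half credit given to a "yes, but" answer and the `evaluate` helper are all assumptions for the example, and each evaluation team would set its own values.

```python
# Hypothetical weights: must-haves only gate eligibility, so they add no points.
WEIGHTS = {"must": 0, "should": 3, "nice": 1}
# Hypothetical credit: a "yes, but" answer earns half credit and is flagged.
ANSWER_CREDIT = {"yes": 1.0, "yes_but": 0.5, "no": 0.0}


def evaluate(requirements, answers):
    """Score one product.

    requirements maps feature -> "must" | "should" | "nice" | "unused".
    answers maps feature -> "yes" | "yes_but" | "no" (missing means "no").
    Returns (eligible, score, caveats).
    """
    score = 0.0
    caveats = []
    for feature, priority in requirements.items():
        if priority == "unused":
            continue  # will-not-use items are not examined at all
        answer = answers.get(feature, "no")
        if priority == "must" and answer == "no":
            return False, 0.0, caveats  # eliminated from consideration
        if answer == "yes_but":
            caveats.append(feature)  # extra time or expense to follow up on
        score += WEIGHTS[priority] * ANSWER_CREDIT[answer]
    return True, score, caveats


requirements = {
    "change data capture": "must",
    "version control": "should",
    "self-generating documentation": "nice",
    "mainframe adapter": "unused",
}
product_a = {"change data capture": "yes_but", "version control": "yes"}
result_a = evaluate(requirements, product_a)
```

Here `result_a` marks the product as eligible with a score of 3.0 and lists change data capture as a caveat, so the "but" is tracked rather than forgotten.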
An obvious response if a required feature is lacking is that you can do
custom coding to fill in the gaps. But beware: As part of the product
evaluation process, you must estimate the cost of the coding that would be
needed in terms of time, resources and opportunity loss, and then assess
whether it's better to just forgo missing features or choose a different
integration platform that offers them.
Compiling your list of data integration features
Each company’s laundry list of must-have items will differ based on its
detailed requirements. But the following core capabilities are generally
considered must-have data integration features for most organizations:
Access data from a mix of sources. The chosen data integration product
needs to directly access various data structures and types of information,
including:
+ Relational, columnar, in-memory and NoSQL databases, plus
multidimensional online analytical processing systems and other
specialized databases.
+ Flat files such as tab-delimited or comma-separated value files, or
spreadsheets.
+ Application messaging technologies such as enterprise messaging
systems, extensible markup language and JSON.
+ Industry-specific protocols such as Health Level Seven International
and Society for Worldwide Interbank Financial Telecommunication.
+ Enterprise application integration Web or data services.
+ Business applications such as ERP and customer relationship
management systems.
+ Software as a service applications.
+ Mobile applications.
+ Unstructured data such as social media data, email, website-related
data and documents.
+ Proprietary data protocols to communicate with specialized sensors,
devices and legacy systems such as mainframes.
Write data to target systems. Data integration software needs to be able to
insert, modify and delete data in the target systems of integration processes
(for example, data warehouses or operational databases that combine
data from various sources for transaction processing).
Interact with sources and targets. An integration tool must support a
variety of data capture and delivery methods, including batch acquisition and
delivery, bulk import and extract, and change data capture. Streaming and
near-real-time data ingestion should also be a standard feature of
integration software, along with time-based and event-based data
acquisition, the latter triggered by predefined processing events in
databases or applications.
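As a rough illustration of change data capture, the Python sketch below compares two snapshots of a source table, keyed by primary key, and reports which rows would have to be inserted, modified or deleted in the target. Snapshot comparison is just one simple way to detect changes and is used here as an assumption for the example; commercial tools may rely on other mechanisms, such as reading database logs. The `capture_changes` name and the sample data are hypothetical.

```python
def capture_changes(previous, current):
    """Compare two snapshots (dicts keyed by primary key) and return the changes."""
    # Rows present now but not before must be inserted in the target.
    inserts = {key: row for key, row in current.items() if key not in previous}
    # Rows present before but not now must be deleted from the target.
    deletes = {key: row for key, row in previous.items() if key not in current}
    # Rows present in both snapshots with different values must be modified.
    updates = {
        key: row
        for key, row in current.items()
        if key in previous and previous[key] != row
    }
    return inserts, updates, deletes


yesterday = {1: "Ann", 2: "Bob", 3: "Cy"}
today = {1: "Ann", 2: "Robert", 4: "Dee"}
inserts, updates, deletes = capture_changes(yesterday, today)
```

Only the three changed rows would be sent to the target, rather than reloading the whole table in a batch.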
Transform data. Basic data handling features are crucial, including data-type
conversions, date functionality, string handling, NULL processing and
mathematical functions. The same goes for data mapping capabilities such
as join, merge, lookup, aggregate and substitute, and for workflow support,
which enables the creation of an integration process with multiple source-to-target
mappings that are potentially interconnected based on data or
functional dependencies. In addition, integration software should provide
workflow orchestration that includes looping, if-then-else, case style and
passing variables.
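Several of these basic transformations can be shown together in a short Python sketch. It is a simplified illustration, not how any particular product works: the `REGION_LOOKUP` reference table, the field names and the `transform` function are assumptions made for the example.

```python
from collections import defaultdict

# Hypothetical reference table used for the lookup step.
REGION_LOOKUP = {"MA": "Northeast", "CA": "West"}


def transform(rows):
    """Convert, clean, look up and aggregate source rows into totals per region."""
    totals = defaultdict(float)
    for row in rows:
        raw_amount = row.get("amount")
        # NULL processing and data-type conversion: missing amounts count as 0.0.
        amount = float(raw_amount) if raw_amount not in (None, "") else 0.0
        # String handling and reference lookup: normalize the code, then map it.
        state = (row.get("state") or "").strip().upper()
        region = REGION_LOOKUP.get(state, "Unknown")
        # Aggregation: accumulate the amount for the region.
        totals[region] += amount
    return dict(totals)


source_rows = [
    {"state": " ma ", "amount": "10.5"},
    {"state": "CA", "amount": None},
    {"state": None, "amount": "2"},
    {"state": "MA", "amount": "1.5"},
]
region_totals = transform(source_rows)
```

In a packaged tool, each commented step would typically be a prebuilt component placed on a mapping rather than hand-written code.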
Enable effective design and development. Another key data integration
feature is a graphical design interface that simplifies the construction of
source-to-target mappings and integration workflows with data,
transformations and other elements displayed in design palettes. That
should be accompanied by software development management
functionality, such as version control; support for development, testing and
production environments; and the ability to attach comments or notes. Data
integration products also need to provide interactive testing and debugging
functionality, and the ability to create reusable and shareable components.
Support efficient operations. Features for managing and optimizing
integration processes are vital as well: for example, run-time process
monitoring; error, warning and condition handling; collection of run-time
statistics; and security management.
Provide multiple deployment options. A data integration platform must
support operating environments both on-premises and in the cloud, the
latter through either hosted deployments or integration platform as a service
offerings. Virtualized servers and distributed processing environments
should also be supported across a variety of operating systems.
The following features aren't necessarily must-haves, but they can
significantly enhance developer productivity in designing data
transformations:
+ Support for slowly changing dimensions, if used for business
intelligence or data warehousing.
+ Customized log, error and condition handling.
+ Text string parsing and matching.
+ Data set processing, such as time series and pivots.
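To make the first item concrete: a slowly changing dimension preserves history when a descriptive attribute of a record changes. The sketch below shows one common variant, usually called Type 2, in plain Python. The column names (customer_id, city, valid_from, valid_to) and the apply_scd2 helper are assumptions for illustration, not the API of any product discussed here.

```python
from datetime import date


def apply_scd2(dimension, incoming, key, tracked, today):
    """Return a new dimension list with Type 2 history applied.

    dimension: list of dict rows; the current version of a key has valid_to None.
    incoming:  list of dict rows from the source system.
    tracked:   attribute names whose changes should create a new version.
    """
    result = [dict(row) for row in dimension]  # copy so the input is not mutated
    current = {row[key]: row for row in result if row["valid_to"] is None}

    for new in incoming:
        old = current.get(new[key])
        if old is not None and all(old[col] == new[col] for col in tracked):
            continue  # nothing changed; keep the current version
        if old is not None:
            old["valid_to"] = today  # close out the superseded version
        new_row = {key: new[key], "valid_from": today, "valid_to": None}
        new_row.update({col: new[col] for col in tracked})
        result.append(new_row)
        current[new[key]] = new_row
    return result


if __name__ == "__main__":
    dim = [{"customer_id": 1, "city": "Boston",
            "valid_from": date(2015, 1, 1), "valid_to": None}]
    src = [{"customer_id": 1, "city": "Denver"},
           {"customer_id": 2, "city": "Austin"}]
    for row in apply_scd2(dim, src, "customer_id", ["city"], date(2016, 6, 1)):
        print(row)
```

A tool with built-in support typically generates this close-and-insert logic from a configuration dialog, which is where the productivity gain comes from.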
Other features that could be classified as should-haves include support for
team-based development and management, as well as release management
for both integration processes and the data structures that are being used.
Repository-based storage of and access to process, or run-time, metadata is
another, as it enhances the ability to analyze run-time performance to
identify bottlenecks and trends.
More nice-to-have features include:
+ Self-generating documentation with graphical representations of
workflows.
+ Where-used and what-if capabilities for analyzing the use of sources,
targets and transforms.
+ Data profiling tools to analyze the information in sources and targets.
+ Data quality tools to cleanse and enhance data.
+ Integration with other vendors' software development, management,
scheduling and monitoring tools.
+ Parallelization of integration processes and data loading.
Additional data integration tool selection criteria
The following are often included in evaluation criteria. But since they're
subjective, it's important to clearly weigh their applicability and importance
to your organization.
Loading performance. This will vary based on the integration complexity,
source systems accessed and data volumes involved. The best practice is to
create several prebuilt integration use cases and compare how each
product performs on these specific examples.
Scalability. You should supplement the loading performance tests with
stress tests that simulate anticipated growth in the number and size of your
sources and targets.
Ease of use. This will vary based on the knowledge and skills of the data
integration developers involved.
Training on the data integration product. This may include vendor in-
person classes; online classes, live or pre-recorded; or Web recordings for
specific features or processes.
Documentation and support. There should be separate criteria for
developer online help versus technical documentation. How the vendor
provides support -- online Q&A for common issues, online chat, in-person
discussions and on-site -- and the pricing of each should also be included in
the evaluation.
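One way to weigh subjective criteria like these is a simple weighted scoring matrix. The sketch below is illustrative only: the weights, the 1-to-5 scores and the product names "Tool A" and "Tool B" are made-up placeholders, not assessments of any real product.

```python
# Hypothetical weights: how much each criterion matters to one organization.
WEIGHTS = {
    "loading_performance": 0.30,
    "scalability": 0.25,
    "ease_of_use": 0.20,
    "training": 0.10,
    "documentation_and_support": 0.15,
}

# Placeholder scores on a 1-5 scale, as an evaluation team might assign them.
SCORES = {
    "Tool A": {"loading_performance": 4, "scalability": 5, "ease_of_use": 2,
               "training": 3, "documentation_and_support": 3},
    "Tool B": {"loading_performance": 3, "scalability": 3, "ease_of_use": 5,
               "training": 4, "documentation_and_support": 4},
}


def weighted_score(scores, weights):
    """Weighted average of one product's scores; weights need not sum to 1."""
    total_weight = sum(weights.values())
    return sum(scores[criterion] * w for criterion, w in weights.items()) / total_weight


def rank_products(all_scores, weights):
    """Return (product, score) pairs sorted from best to worst."""
    ranked = [(name, weighted_score(s, weights)) for name, s in all_scores.items()]
    return sorted(ranked, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    for name, score in rank_products(SCORES, WEIGHTS):
        print(f"{name}: {score:.2f}")
```

Changing the weights to reflect a different organization's priorities can reverse the ranking, which is the reason to agree on the weights before scoring any product.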
Selecting the right data integration tool for your needs
Rick Sherman, Managing Partner - Athena IT Solutions
As the need to pull together the growing volumes of data that's generated
and collected by organizations has increased, several types of data
integration software have emerged to help IT teams simplify and manage
the process. But with so many products to choose from, what's the best
approach to selecting the right data integration tool for your enterprise? It
isn't about picking the product with the most features, but rather the one
that best matches your integration requirements and enterprise profile.
Before you start evaluating data integration platforms, ask some questions
inside your organization to help guide the technology selection process.
Your inquiries should cover the following topics:
Source systems. How many do you have? Do you have overlapping
systems, such as multiple CRM or sales processing applications? Is there
unstructured or semi-structured data in addition to conventional structured
data? External data sources, in addition to internal ones? What are the data
volumes and frequency of updates?
Integration use cases. Do you need to integrate data for analytics
primarily through data warehousing? What about application consolidation?
Does your organization need to acquire or process data for master data
management (MDM)? What about synchronizing data between on-premises
systems and cloud applications or Internet of Things devices? Or
exchanging data between internal business processes or applications and
ones at other organizations? Do you have to capture and deliver data for
complex event processing or stream processing applications? Is there a
need to integrate data from disparate systems virtually, without moving it to
a central data store?
Enterprise size. What is your organization's annual revenue, how many total
employees does it have and what's the IT budget for data integration?
Resources and skill sets. Do you have dedicated IT resources to perform
data integration work? And what's the level of previous experience with data
integration tools?
Once you have answers to these questions, it's time to take a look at the 10
leading data integration products to see which one best matches your
needs and profile.
Data integration products for large enterprises
Large enterprises generally share the following characteristics:
+ A diverse set of source systems that often overlap, with high data
volumes. Structured data sources are dominant, but unstructured
data sources, such as social media, Web server logs and flat files, as
well as semistructured data sources, such as XML or message-oriented
data, also need to be integrated.
+ Multiple integration use cases.
+ IT budgets sufficient to purchase any of the available data integration
tools and supporting infrastructure as necessary. That doesn't mean
these enterprises have an open checkbook, but they have fiscal
means if justified.
+ A dedicated IT group with existing data integration specialists or the
budget to hire employees or consultants who have experience using
the chosen data integration tool.
Large enterprises that fit this profile should consider Informatica
PowerCenter and IBM InfoSphere Information Server for Data Integration,
as these products address the entire spectrum of integration use cases.
Both products also provide the scalability to handle the data complexity,
volume and velocity of large enterprises, and can be used across multiple
projects and with any size team. IBM and Informatica both offer MDM and
data cleansing capabilities. IBM's product addresses information analytics
and management needs, while Informatica concentrates on information
integration. But these robust tools come at a price. In addition to being
generally more expensive than their competitors, they require a more
extensive set of skills and experience to use. Also, they typically require
more extensive infrastructure and complex implementations than their
competitors.
Many of IBM's and Informatica’s competitors have significantly increased
their capabilities and features over the years, providing more alternatives for
large enterprises, especially those with less demanding integration needs
than outlined above. Data integration tools from SAP, Oracle and SAS
address a wide variety of data sources and integration use cases. Each of
these companies also offers enterprise applications such as enterprise
resource planning, CRM and analytics that are used extensively, especially
in large enterprises, and they have leveraged their own data integration tools
with those applications. If an enterprise has a significant investment in any of
these companies' applications, it's reasonable to consider that vendor's data
integration tools as well.
SAP Data Services and SAS Data Management Platform both provide
extensive data integration capabilities that support large enterprises. SAP
Data Services, although not limited to working with SAP's business applications,
is becoming more tightly integrated with the company's
software portfolio. This means that enterprises that are already SAP
customers should consider this integration product. Likewise, SAS
customers that are using the company's statistical and analytical products
should consider SAS Data Management Platform.
Tools for midsize enterprises with deep integration needs
Midsize enterprises generally have the following characteristics:
+ A variety of source systems that handle overlapping data subjects
and that may be on-premises or cloud-based. Data volumes will vary
based on industry or the products or services offered. Structured
data sources are still dominant, and any unstructured data that needs
to be integrated is generally limited in scope.
+ Extract, transform and load (ETL) and data warehousing are the
dominant integration use cases, although application integration may
arise in the future if data warehousing is addressed.
+ IT budgets are constrained.
+ A small IT group to perform both data integration work and business
intelligence development. Hiring specialists dedicated to specific tools
may not be fiscally possible.
Although midsize enterprises with this profile have significant integration
needs, they're operating with constrained resources in regard to people,
budget and time. These companies should consider data integration
products from Microsoft, Oracle, Information Builders, Talend or Pentaho.
Each of these tools provides capabilities to address the data variety, scope
of integration uses and resource constraints typically found in such
organizations.
Enterprises using Microsoft SQL Server that have developers with deep
SQL expertise should consider Microsoft's data-related products, such as
SQL Server Integration Services (SSIS). These tools share a common
development approach, enabling IT to work with multiple Microsoft tools
more effectively. Microsoft has been expanding the capabilities of SSIS to
handle more complex integration use cases, such as slowly changing
dimensions and fuzzy lookups, and a variety of data sources beyond flat files
and relational databases. Although Microsoft's sources and targets aren't
limited to its platform, deployment still remains limited to Windows.
Microsoft's tools have historically been on-premises, but the company has
made significant strides in moving capabilities to the cloud. On the
downside, SSIS lacks some of the robust integration transformations, workflows
and process management of its competitors, such as the ability to track and
manage processes using a repository or team-based development
administration functions.
As with Microsoft, enterprises currently using Oracle databases may wish
to consider Oracle Data Integrator. ODI is a robust data and application
integration tool that can handle a wide variety of data sources and
integration uses, including BI, MDM and application integration. It also
enables scalability in regard to data volumes and velocity. While the
product has numerous capabilities that can be leveraged, it's often used to
automate SQL scripts. ODI does require sufficient training to handle its
somewhat complex implementation. The product's ability to work in
conjunction with a variety of Oracle products expands its capabilities, but it
also increases the complexity of deployment, making it difficult to use for IT
staff with limited resources.
Information Builders' iWay Integration Suite can handle complex integration
uses such as MDM, data cleansing and data governance. iWay should be
considered when an enterprise is already using other Information Builders
products, as it offers tight integration with those products.
These tools are known for their scalability and ability to work in real time
with operational systems. One drawback: There's a limited pool of expertise
and experience with this product.
Talend's and Pentaho's namesake data integration tools can also handle a
variety of integration uses. Both products have open source versions that
enable an IT group to avoid any up-front licensing costs. The open source
versions offer solid data integration capabilities that fit well for enterprises
that don't have demanding integration needs or for IT groups that are
working on a shoestring budget. The enterprise versions of both of these
companies' products provide significantly more extensive capabilities.
What to consider for small enterprises with demanding integration needs
Smaller enterprises in this group generally have the following
characteristics:
+ A variety of source systems with primarily structured data sources.
+ ETL and data warehousing are the integration use cases.
+ IT budgets are very limited.
+ Limited IT staff that multitask in such areas as data integration, BI and
operational systems.
These enterprises may want to consider either data integration tools based
on the databases they already use -- i.e., Oracle or Microsoft -- or the
products from Talend or Pentaho. These tools are cost-effective, as SSIS
comes bundled with SQL Server and the open source versions of Talend or
Pentaho provide more data integration capabilities than many enterprises
even need. One caveat: Smaller enterprises should ensure that their IT
department has sufficient expertise to leverage these tools effectively.