PAGE 20
ROLE OF DATA INTEGRATION IN UNLOCKING THE VALUE OF BIG DATA SOLUTIONS
Oracle

PAGE 22
REAL-TIME INTEGRATION—THE FUTURE IS NOW
Kore Technologies

PAGE 23
EMBRACE DIGITAL DISRUPTION WITH DENODO PLATFORM IN THE CLOUD
Denodo

PAGE 24
10 NEW REQUIREMENTS FOR MODERN DATA INTEGRATION
SnapLogic

PAGE 25
DATA INTEGRATION: A STEPPING STONE TO MODERN DATA APPS
Cask

Best Practices Series
18 AUGUST/SEPTEMBER 2016 | DBTA

Data Integration for the Modern Enterprise:
Cloud Shifts the Balance
The rise of cloud computing has changed the whole concept of data integration. Previously, it was an information technology department concern, with IT and data staff sweating it out with manual scripts, product connectors, and middleware brokers in efforts to cobble together relevant applications, or to surface data trapped in legacy system silos and deliver it to modern front-end interfaces. Top-level efforts also included bringing data into a common place through master data management, or organizing information through ontologies.
The blood, sweat, and tears that go into enterprise data integration may not go away anytime soon, but lately, it has become easier to make data from any and all sources more available to the people and applications that need it. The cloud—and more specifically, database as a service (DBaaS)—has shifted the challenge of data integration.

Clouds and DBaaS offerings are gaining traction as an online means to manage and process data. The cloud offers a major venue for big data solutions, since they are faster to deploy and easier to operate and scale than typical on-premises systems.

At the same time, the rise of cloud and DBaaS as an information management environment has shifted the balance of responsibility for enterprise data integration away from the confines of data centers to the enterprise as a whole. Suddenly, the entire business has a stake in identifying, managing, and leveraging the way data flows through IT systems. A recent survey of 300 DBAs and IT professionals, conducted by Unisphere Research, a division of Information Today, Inc., finds growing interest in DBaaS as a viable approach to serving their enterprises' needs for greater agility and faster time to market with cloud computing. Many of the early hurdles in delivering enterprise capabilities for security and availability in the cloud become more evident with the reliance on hybrid cloud approaches and the need to move enterprise applications to the cloud and back on-premises based on the business requirements of the organization, their legacy investments, and regulatory requirements ("Database as a Service Enters the Enterprise Mainstream: 2016 IOUG Survey on Database Cloud," April 2016).

DBaaS is taking off, with adoption expected to triple over the next 24 months. There will be a significant amount of enterprise data shifting to the cloud over the next 24 months as well, as enterprises rethink data management in the cloud. Seventy-three percent of managers and professionals expect to be using DBaaS within their enterprises by that time, versus 27% at the present time.

What does it take to construct and sustain a viable DBaaS strategy? Here are some considerations:

SHOW THE BUSINESS POTENTIAL NEW WAYS DATA CAN BE LEVERAGED.
Moving to DBaaS is more than simply making data more accessible; it also opens new paths to innovation. Data is a tool, a means, to better engage customers and better understand markets. Plus, as new ideas and requirements arise, DBaaS—especially if delivered by a cloud provider—serves as a testbed to help accelerate innovation and experimentation, since cloud providers will likely have all required features and services in place. Shifting more activities to cloud providers or shared service environments frees up enterprises and their IT staffs to provide higher-level support to the business.

DEVELOP A DATA GOVERNANCE STRATEGY.
Data governance has long been a challenge for enterprises, and cloud or DBaaS doesn't make things any easier. Essential concerns such as data security, quality, and relevance will need to be dealt with at the enterprise level. In addition, there is a need to wrap governance around disparate data sources, which often were external to enterprises and therefore not under their purview. There are a range of business requirements that must be addressed, from real-time data streaming to analytics to customer relationship management. And organizations must be able to move large and varied datasets at a high velocity through their systems—which raises issues for the handlers of this data, such as who owns it, who has access to it, and how and where it should be stored.

GO HYBRID.
From a planning/spending perspective, the future belongs to more hybrid approaches. Many organizations continue to maintain an abundance of legacy or on-premises assets, and this is likely to be the case for some time to come. As long-standing legacy assets, these systems have proved their worth and resiliency, and continue to function well for their organizations. Mainframe systems, in particular, continue to be refreshed by IBM and are capable of supporting the largest cloud and DBaaS workloads and the latest protocols. As a result of this sizable legacy base, the largest percentage of organizations in the Unisphere/IOUG survey, 44%, see the establishment of hybrid cloud as their most important priority as they enter the cloud space.

TACKLE THE DATA SECURITY ISSUE HEAD-ON, AND AS EARLY AS POSSIBLE.
Data security isn't just about securing data from hackers; it also entails access control, as well as avoidance of any potential for third parties to mishandle the data. Half of the managers and professionals in the Unisphere-IOUG survey indicate that security and privacy concerns are the greatest inhibitors to their cloud initiatives. Data ownership and retention follow closely behind as the second-ranked concern. Often, trusting outside cloud providers with sensitive or mission-critical corporate data is seen as risky, not only in terms of potential breaches, but also in terms of the potential need for a relationship between a cloud provider and consumer to be modified or terminated. The fate of data held by a cloud provider may not be clear-cut.

STANDARDIZE.
The beauty of cloud and DBaaS is that multiple standards, devices, and interfaces are supported. However, it's still important that all parts of the enterprise be on the same page. On a high level, standards are emerging to help simplify cloud-based integration. For example, the Open Data Protocol (OData), a REST-based standard, promises to supplant older web services standards such as SOAP to enable greater interoperability between enterprises and across the cloud. In many ways, enterprises are becoming API-driven, meaning applications, functions, and services can be interconnected, on the fly, to share data as required by the business demands of the day.

ADOPT A SUPPORTIVE IT INFRASTRUCTURE.
For the private cloud, a robust enterprise infrastructure is essential. The internal systems—particularly storage—need to be highly adaptable and elastic for unpredictable workloads. Public cloud services also offer compelling storage options. Services such as Amazon S3, OpenStack Swift, Microsoft Azure Storage, and Google Cloud offer virtually unlimited storage resources that can serve as total storage sites or as bursting services for spikes in enterprise workloads. On-premises solutions—from open source frameworks to commodity hardware—also provide scalable storage solutions.

DEPLOY DBAAS AND CLOUD WHERE IT FINANCIALLY MAKES SENSE.
Not all business cases may be suitable for DBaaS or cloud. To determine the cost/benefit of cloud sites, enterprises need to weigh the traditional on-premises costs of maintaining hardware, systems, networks, and storage, versus that of a shared, multitenant enterprise private cloud, versus subscribing to a public cloud provider. There are also management and labor costs that pertain to project management, development, oversight, monitoring, and quality management—costs that are likely to still be incurred regardless of whether the data is managed on-premises or by an outside cloud provider. Even with the public cloud, there are time and expense requirements related to migration or integration between various systems, or between cloud and on-premises systems.

Other expenses that need to be measured include the cost of software licenses versus subscriptions, or the costs of upgrades and maintenance versus pay-by-the-sip monthly fees. In some cases, it may make sense just to leave things as they are; in other cases, the savings may be significant. But, ultimately, the true value comes from the enhanced opportunities for innovation and growth.

—Joe McKendrick
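The weighing exercise described above can be made concrete with a simple cost model. The sketch below compares multi-year on-premises ownership costs against a DBaaS subscription; every figure, rate, and category name is a hypothetical placeholder for illustration, not real vendor pricing.

```python
# Illustrative multi-year cost comparison: on-premises vs. DBaaS subscription.
# Every figure below is a hypothetical placeholder, not real vendor pricing.

def on_prem_cost(years, hardware=120_000, license_fee=60_000,
                 maintenance_rate=0.20, annual_ops=45_000):
    """Up-front hardware and licenses, plus yearly maintenance and staff time."""
    upfront = hardware + license_fee
    return upfront + years * (maintenance_rate * upfront + annual_ops)

def dbaas_cost(years, monthly_fee=6_000, migration=30_000, annual_ops=15_000):
    """One-time migration, then pay-by-the-sip subscription and reduced ops labor."""
    return migration + years * (12 * monthly_fee + annual_ops)

three_year_savings = on_prem_cost(3) - dbaas_cost(3)
```

Varying the horizon (`years`) shows why the answer differs case by case: the subscription avoids the up-front outlay, but recurring fees can eventually overtake a depreciated on-premises investment.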
Sponsored Content
…analytics in the data streams that deliver data to accelerate insights and action.

Streaming analytics that include event stream processing shortens the time from data creation to decision drastically. Stream analytics empowers a business audience in any industry that is looking to create solutions that embrace real-time, instant insight across data delivery infrastructures. A good stream analytics solution is designed to handle large data volumes with subsecond latency, while also providing a business-friendly, easy-to-use interface. They are designed with drag-and-drop interfaces that help model data streams to replicate business models and behaviors. Streaming analytics finds great use in scenarios that rely on very low latency business decisioning. Some examples include fraud detection in the financial industry to automate stock trading based on market movement, monitoring the vital signs of a patient and setting preventative triggers in health care, and detecting security issues and fraud in transportation industries by finding anomalous patterns as they happen to initiate immediate investigation.

Stream analytics combined with deep data storage provides the best of both worlds. Lambda architectures, as such combined streaming and deep data storage architectures are called, provide a single platform that enables enterprises to perform real-time streaming analytics, and to refine those analytics with insights from mining richer and more complex data recommendations from the deep data storage reservoir. Because big data technologies underpin the data storage, the cost-to-benefit ratio is extremely appealing for organizations looking to make a difference with their big data investments.

THE ORACLE ADVANTAGE
Oracle provides a wide range of products that help with all the moving parts of building a differentiated and forward-looking big data integration, management and analytics platform. As part of the data integration portfolio, Oracle GoldenGate and Oracle GoldenGate Cloud Service ensure real-time data capture and streaming from heterogeneous business-critical transactional systems with minimal impact to the performance of the source systems. Oracle GoldenGate provides the most secure and reliable big data delivery solution between the cloud and on-premise systems and applications. Oracle's Big Data Preparation Cloud Service is a next-generation data wrangling service that helps business users unlock data quickly from complex business data. Oracle Big Data Preparation Cloud Service is built on Apache Spark, combines natural language processing and machine learning, and bridges the line of business–IT divide when extracting insights from the data and operationalizing them into enterprise data integration flows. Meanwhile, the Oracle Stream Analytics platform provides a compelling combination of an easy-to-use visual façade to rapidly create and dynamically change real-time event stream processing applications, together with a comprehensive run-time platform to manage and execute these solutions. This tool is business user-friendly and solves the business dilemmas by completely hiding and abstracting the underlying technology platform. Oracle Data Integrator brings together big data platform portability, the ability to switch between multiple big data platforms seamlessly, and powerful data transformation capabilities to enterprise big data development teams. Oracle Data Integrator provides a unified and common interface to build data transformations irrespective of the underlying big data technology. This ensures the data integration developers can utilize the latest big data platform and language without having to compromise on productivity and with minimal disruption. Oracle's entire integration platform is governed and audited by Oracle Metadata Management. Oracle big data integration offerings are flexible, robust and complementary to maximize big data investments and unlock value from these investments now and in the future.

[Image: Oracle offers a full set of products for cloud, big data and on-premise data integration requirements]

ORACLE
oracle.com/goto/dataintegration
Real-Time Integration—
The Future is Now
With all the cloud-based, mobile and niche applications, plus the Internet of Things (IoT), it's clear that the days of one-size-fits-all, monolithic applications are over. It's now much easier to find a third-party solution to fill a specific need or to "go deep" into an area that cannot be fulfilled by your primary application. That's led to an increase in application integration. However, the growing need for instantaneous up-to-date data is motivating companies to transition from traditional batch-oriented techniques to real-time data integration. Although achieving real-time integration can be challenging, and accomplished using a variety of technologies, the goal is the same: to transfer accurate, timely data from point A to point B in real time so users can make better-informed business-critical decisions.

THE NEW REAL-TIME WORLD ORDER
One technology has emerged as the dominant Web service design model for real-time integration: REST (Representational State Transfer). Why? RESTful Web services are easier to use, and the resource-oriented model is more flexible than previous SOAP, RPC and WSDL-based interfaces. REST has already been adopted by providers such as Google, Netflix and Twitter, and many other enterprise organizations. RESTful architectures and implementations provide these characteristics/benefits:
• Easy Web integration – uses standard HTTP methods
• Increased scalability – stateless interaction and caching semantics
• Reliability – separation between client, server and data
• Security – via the transport layer (SSL) and message-level mechanisms
• Standard language – XML and JavaScript Object Notation (JSON)

SIMPLIFIED REST DEVELOPMENT
Kourier Integrator and Kourier's REST Gateway are Kore's easy-to-use and versatile REST integration solutions for MultiValue (MV) systems. Kourier Integrator streamlines and simplifies the process of building and testing bi-directional integrations using RESTful Web Services; the REST Gateway provides secure, real-time access to MV applications via REST APIs from outside the firewall.

Developers are more productive working within Kourier's REST framework because they can focus on the application interface instead of low-level protocol details such as data validation, resource security and transaction logging. REST APIs are primarily created via specification pages and typically require minimal programming. Powerful Event Handlers make it easy to leverage existing application business logic or add special instructions within REST resources at specific timing intervals.

Other developer-friendly features of Kourier's REST framework:
• Automatic data validation
• Standard HTTP status code support
• Dynamic query parameters
• Create REST APIs without coding:
  – JSON and XML
  – Query parameter validation
  – Query wildcards
  – Pagination of large result sets
  – Field limiting
  – Result filtering and sorting
• Automatic transaction logging, history and metrics
• API versioning

SECURE, RATED AND MEASURED ACCESS
The Kourier REST Gateway is a critical piece of the integration architecture because it's responsible for providing secure access to applications from outside the firewall while it monitors, manages and measures REST API usage. Connection pooling is supported for enhanced performance. Policies can limit (rate) the maximum number of requests per minute/hour/day for each user. Gateway administrators can feel confident about exposing their system to the outside world without worrying about a user consuming all of their resources. Kourier's REST Gateway makes it easy to:
• Create policies to manage resource access
  – Define the maximum number of requests per hour, minute and day
  – Define the availability for each resource (CRUD)
• Define users and server/database security
  – Associate policies and databases
  – Route requests to server/database
• Visualize performance with the interactive dashboard
  – Graph transaction history
  – View REST resource statistics
  – Drill down into request headers, parameters and timers

THE REST OF THE STORY
The quest for tighter integration between enterprise applications, third-party solutions and the IoT will drive companies to use more real-time integration via REST to extend and modernize their business operations. Developers will be able to retain the core functionality and value of their MV enterprise applications while extending it as needed via integration to other solutions. Kore is helping its partners and clients meet this challenge by implementing Kourier Integrator, our award-winning enterprise integration and data management solution, and the Kourier REST Gateway. These products facilitate the building, managing and deployment of secure, scalable, real-time integrations to best-in-class applications via RESTful Web Services.

To learn more about our integration solutions or to schedule a demonstration, please visit our website or call 866-763-KORE (5673).

KORE TECHNOLOGIES
www.koretech.com
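The Gateway's rate policies, a maximum number of requests per minute, hour and day for each user, can be illustrated with a fixed-window counter. This is a generic sketch of the rate-limiting technique, not Kore's implementation; the limits, timestamps and user names are invented.

```python
import time

class RateLimiter:
    """Fixed-window request limiter, per user and per window size in seconds.
    Illustrative only: a production gateway would also persist counters,
    expire old windows, and handle clock skew."""

    def __init__(self, limits):
        # limits: {window_seconds: max_requests}, e.g. minute/hour/day caps
        self.limits = limits
        self.counters = {}  # (user, window_seconds, window_index) -> count

    def allow(self, user, now=None):
        now = time.time() if now is None else now
        # Reject if any window's cap is already reached.
        for window, cap in self.limits.items():
            key = (user, window, int(now // window))
            if self.counters.get(key, 0) >= cap:
                return False
        # Otherwise count this request against every window.
        for window in self.limits:
            key = (user, window, int(now // window))
            self.counters[key] = self.counters.get(key, 0) + 1
        return True

# Example policy: 3 requests/minute, 100/hour, 1000/day per user.
limiter = RateLimiter({60: 3, 3600: 100, 86400: 1000})
results = [limiter.allow("alice", now=1000.0 + i) for i in range(5)]
```

Requests beyond the per-minute cap are refused until the next window begins, while counts for other users are tracked independently.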
Embrace Digital
Disruption with Denodo
Platform in the Cloud
In order to remain competitive in today's digital economy, data-driven businesses are seeking agile, rapid data integration capabilities that support real-time decision making. Data Virtualization provides the necessary speed and agility by accessing a wide variety of data from multiple internal and external data sources, and transparently combining (without moving) data to provide business users with a unified view of all relevant data for analysis.

Data Virtualization is becoming mainstream for digital enterprises. This is underscored in the May 2016 Cool Vendors in Pervasive Integration, 2016 report, in which Gartner states that the "Internet of Things (IoT), digital business and logical data warehouse use cases require data virtualization approaches for integration to achieve fast time to value for supporting analytics and operations."

Data Virtualization in the Cloud provides additional advantages, delivering unparalleled business agility, modernization and cost effectiveness. The Denodo Platform for Data Virtualization in the Cloud offers a unique combination of capabilities:
• Self-service data discovery and search empowers users with knowledge about the data, including data lineage.
• Real-time intelligence provides users with the ability to tap into real-time data for analysis.
• Multi-structured data support for a wide variety of data sources such as Hadoop, Spark, NoSQL, Relational and SaaS applications.
• Hybrid implementation model enables a single virtual view of data, combining on-premises and cloud data sources.
• Enterprise-ready data governance allows data-access management from a single point for all data sources.

WHO IS REAPING THE BENEFITS?
Two case studies serve to illustrate the benefits of the Denodo Platform for Data Virtualization in the Cloud.

Logitech
Logitech is a global provider of personal computer and tablet accessories. The company was seeking a cost-effective solution for moving its on-premises data to the cloud and integrating this data with cloud data sources. Logitech also faced hurdles associated with time-to-deliver, redundant data and siloed data, as well as security issues related to unauthorized access to underlying data sources, which raised governance concerns.

Logitech replaced its on-premises data warehouse with Amazon Redshift, and then implemented an LDW architecture utilizing Denodo Platform for Data Virtualization to unify data across on-premises and cloud data sources and provide a single virtual data access layer. Denodo Platform was used as a business layer provisioning data to all other enterprise analytical tools such as Tableau, Pentaho BA, and other data interfaces and Web services.

Denodo's LDW architecture is the cornerstone for enabling cloud-based reporting and analytics, and played a critical role in the success of Logitech's Cloud BI and analytics strategy, helping Logitech achieve faster access to data via data virtualization without traditional ETL; a business access layer for a unified view of on-premises and cloud data; governance enforcement through Denodo's single virtual data access layer; the flexibility to add new data assets; and data virtualization in the cloud supporting Logitech's cloud strategy to bring innovation and agility.

Mobile Device Protection Firm ("The Firm")
The Firm provides device protection and support services for smartphones, tablets, consumer electronics, appliances, and satellite receivers. Driving the challenges faced by The Firm were an initiative to move analytics from on-premises to the cloud on the Amazon Web Services (AWS) platform, and an initiative by the CSO organization to implement enterprise-ready authentication to manage data access with appropriate data security policy.

In order to enact these initiatives, The Firm needed a solution that would enable a business layer as well as a security access layer. The Firm was building a data lake in the cloud, for which a virtual data access layer on top of the data lake was needed to provision the data for analytics. The security access layer was needed to comply with enterprise-wide and legal data access requirements.

The Firm selected the Denodo Platform to satisfy both the business layer and security access layer requirements. The cloud analytic solution uses AWS S3 as the data lake environment and the Denodo Platform as the virtual data access and governance layers on top of the data lake to provision data to analytical tools such as Oracle Business Intelligence.

The Denodo Platform enabled The Firm to ensure that the "right people" have access to the "right data"; that governance requirements are met across cloud and on-premises environments; and was instrumental in the continued success of The Firm's cloud analytic strategy.

DENODO is the leader in Data Virtualization. Please contact us at info@denodo.com, or download Denodo Express using the link http://www.denodo.com/en/denodo-platform/denodo-express to get started.
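The core idea of a virtual data access layer, querying one unified view while the data stays in place in its sources, can be sketched in a few lines. This is a conceptual illustration of data virtualization in general, not the Denodo API; the source names and schema are invented.

```python
# Conceptual sketch of a virtual (federated) view: the "view" holds no
# data of its own, only references to sources, and combines rows at
# query time. Source names and fields are invented for illustration.

on_prem_orders = [  # pretend this lives in an on-premises warehouse
    {"order_id": 1, "customer": "acme", "total": 250},
    {"order_id": 2, "customer": "globex", "total": 400},
]
cloud_orders = [    # pretend this lives in a cloud data source
    {"order_id": 3, "customer": "acme", "total": 125},
]

class VirtualView:
    def __init__(self, *sources):
        self.sources = sources  # data is never copied or moved

    def select(self, predicate=lambda row: True):
        # Rows are pulled from each source only when a query runs.
        return [row for src in self.sources for row in src if predicate(row)]

all_orders = VirtualView(on_prem_orders, cloud_orders)
acme_total = sum(r["total"] for r in all_orders.select(
    lambda r: r["customer"] == "acme"))
```

The consuming tool sees a single logical table, while access control and governance can be enforced at this one layer instead of in every source.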
Data Integration:
A Stepping Stone to
Modern Data Apps
Modern data integration in the enterprise is about deriving meaningful insights from a variety of data sources and data structures that can be quickly turned into actionable business information. The data can be structured, semi-structured and/or unstructured; it can contain business, technical and operational metadata; and its origins can be from within or from outside the enterprise, including everything from ERP systems, legacy and modern databases, to social media posts (tweets, blogs, etc.), system and application logs, machine sensor data, etc. Data integration tools available on the market today are generally viewed as adequate for helping with ingesting, preparing, transforming and provisioning data for where it eventually is consumed in the business or where it needs to be fed into business processes. They have come a long way from supporting manual coding to providing more intuitive, visual interfaces, and to increasingly handling real-time data delivery in addition to bulk (batch) data movement. They typically now also facilitate the scale required for integrating data sources from cloud-based applications and data storage. But in a world where big data and applications are converging rapidly, is the value traditional data integration tools are providing to the business going far enough, or are they merely a stepping stone—albeit an important one—to modern data-driven applications?

Generally speaking, a good, modern data integration solution must deal with a wide variety of data types and sources, offer customizable data preparation and cleansing, and support different modes of data delivery (batch, micro-batch, streaming). It should offer easy and timely access to data, help with data governance, and support both existing and emerging use cases, such as IoT.

In the world of big data, one of the most commonly expressed challenges is how to get data from sources into an application or a data lake, where it can generate value. But the challenges don't end there: How do you track where the data goes, and who has access to it once it is in the data lake? And how do you accelerate the time to value for the business by not just solving the data integration and governance problems, but also promoting reuse of components to prevent repeated integration efforts? How do you get on a path to rich, data-driven applications—such as recommendation engines and anomaly detection systems—which require application and cloud integration and often turn out to take a long time due to siloed efforts within the enterprise? Merging the data flow with the application flow in one cohesive integration approach can help reduce the amount of custom coding and manual processes, while speeding up the development and deployment of modern data applications.

This is where the Cask Data Application Platform (CDAP) comes in. CDAP provides a unified integration platform that allows developers, data scientists and IT/operations teams to use a consistent set of tools for data integration, application integration, operations management and governance. As an integrated framework, it has been designed from the ground up for building, deploying and operating self-service data applications and data lakes on Hadoop and Spark. It is 100% open source and highly extensible, and it supports all major Hadoop distributions, offering complete portability within and between the distros.

[Image 1: Cask Data App Platform]

Through its open source extensions, Cask Hydrator and Cask Tracker, CDAP offers capabilities and user interfaces specifically designed to help with data integration challenges, such as how to quickly "hydrate" a data lake, and how to easily "track" data movement within data lakes and data applications. Cask Hydrator is a self-service and extensible framework designed to develop, run, automate and operate data pipelines. Its intuitive drag-and-drop interface integrates with Hadoop and non-Hadoop storage, and it has the ability to switch between different processing technologies—MapReduce, Spark, and Spark Streaming. Cask Hydrator can prepare, blend, aggregate and apply science to create a complete picture of an enterprise's business data in order to drive actionable insights.

Cask Tracker is another open source, self-service framework that automatically captures rich metadata and provides users with visibility into how data is flowing into, out of, and within a data lake. It enables IT to oversee changes, while delivering trusted, secure data and an audit trail for compliance in a complex data lake environment. Cask Tracker provides access to structured information that describes, explains, locates, and makes it easier to retrieve, use and manage datasets, including rich lineage and provenance information.

CDAP, with its extensions Cask Hydrator and Cask Tracker, is the de facto open source big data application and integration platform for building, deploying and operating data-centric applications and data lakes; it also enables IT organizations to implement well-governed data-as-a-service environments designed to quickly unlock the value of data. For more information about these products, please go to the Cask website, and to stay updated on product and company news, follow us on Twitter @caskdata.

CASK
www.cask.co
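The pipeline model described above, reusable stages that prepare, transform and aggregate records, can be illustrated generically. The sketch below shows composable pipeline stages in plain Python; it is not CDAP or Cask Hydrator code, and every stage and field name is invented for illustration.

```python
# Generic sketch of a composable data pipeline: each stage maps an
# iterable of records to a new iterable, so stages can be reused across
# pipelines instead of rebuilding each integration from scratch.

def prepare(records):
    """Drop malformed records that are missing required fields."""
    return (r for r in records if "user" in r and "bytes" in r)

def transform(records):
    """Normalize the unit: bytes -> kilobytes."""
    return ({**r, "kb": r["bytes"] / 1024} for r in records)

def aggregate(records):
    """Sum kilobytes per user."""
    totals = {}
    for r in records:
        totals[r["user"]] = totals.get(r["user"], 0) + r["kb"]
    return totals

def run_pipeline(source, *stages):
    data = source
    for stage in stages:
        data = stage(data)
    return data

raw = [
    {"user": "a", "bytes": 2048},
    {"user": "b", "bytes": 1024},
    {"user": "a", "bytes": 1024},
    {"bad": "record"},  # filtered out by prepare()
]
totals = run_pipeline(raw, prepare, transform, aggregate)
```

Because stages are plain functions over record streams, the same `prepare` or `aggregate` step can be reused in another pipeline, which is the reuse-of-components idea raised in the article.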