Marty Gubar
Big Data SQL PM
Oracle Corporation
October 2018
Copyright © 2018, Oracle and/or its affiliates. All rights reserved. | Confidential – Oracle Internal/Restricted/Highly Restricted
Safe Harbor Statement
The following is intended to outline our general product direction. It is intended for
information purposes only, and may not be incorporated into any contract. It is not a
commitment to deliver any material, code, or functionality, and should not be relied upon
in making purchasing decisions. The development, release, timing and pricing of any
features or functionality described for Oracle’s products may change and remains at the
sole discretion of Oracle Corporation.
Streaming
Data
Data Lake
Warehouse
Infrastructure
• Deploy in minutes
• Reduce runtime costs
• Support hybrid deployments
– On-premises to cloud
– Multi-cloud
Roadmap
Big Data SQL 3.2 Big Data SQL 4.0 Cloud Smart Scan
Big Data SQL Today
Big Data SQL Architecture
• Any application that queries Oracle Database is enhanced
– Seamlessly query external stores
Application interfaces: REST, Python, node.js, SQL, Java, R, Graph
Oracle Database with Big Data SQL
Big Data SQL-enabled platforms:
Oracle Big Data Appliance
Oracle Exadata
Oracle Big Data Cloud Service
Oracle Exadata Cloud Service
DIY Cloudera or Hortonworks
Oracle Database
Start With “Big Data Enabled” External Tables
Java (fragment):
      next = lineNext.getQuantity();
    }
    return state;
  }
}
return a.equals(b);
private boolean gt(String a, String b) {
  if (a.isEmpty() || b.isEmpty()) {
    return false;
  }
  return Double.parseDouble(a) > Double.parseDouble(b);
}
private boolean lt(String a, String b) {
  if (a.isEmpty() || b.isEmpty()) {
    return false;
  }
  return Double.parseDouble(a) < Double.parseDouble(b);
}
public String getState() {
  return this.state;
}
}
BagFactory bagFactory = BagFactory.getInstance();
@Override
public Tuple exec(Tuple input) throws IOException {
  long c = 0;
  String line = "";
  String pbkey = "";
  V0Line nextLine;
  V0Line thisLine;
  V0Line processLine;
  V0Line evalLine = null;
  //Object o = input.get(0);
  if (!(o instanceof DataBag)) {
    int errCode = 2114;

SQL:
PARTITION BY name ORDER BY time
MEASURES FIRST(x.time) AS first_x,
         LAST(z.time) AS last_z
ONE ROW PER MATCH
PATTERN (X+ Y+ W+ Z+)
DEFINE X AS (price < PREV(price)),
       Y AS (price > PREV(price)),
       W AS (price < PREV(price)),
       Z AS (price > PREV(price) AND
             z.time - FIRST(x.time) <= 7 ))
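The PATTERN (X+ Y+ W+ Z+) clause describes a W shape in the price series: one or more falls, then rises, then falls, then rises. To make that rule concrete, the sketch below encodes each price move as a letter and applies an ordinary regular expression. It is only an illustration of what the pattern means, not how Oracle Database evaluates it. The class and method names (WPatternSketch, encodeMoves, findWShapes) are hypothetical, and the time constraint on Z (z.time - FIRST(x.time) <= 7) is deliberately left out.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration only: mimics PATTERN (X+ Y+ W+ Z+) with a regex.
final class WPatternSketch {

    // D+ U+ D+ U+ mirrors X+ Y+ W+ Z+ (fall, rise, fall, rise).
    private static final Pattern W_SHAPE = Pattern.compile("D+U+D+U+");

    // Encode each row relative to the previous one:
    // 'D' = price fell, 'U' = price rose, '-' = unchanged.
    static String encodeMoves(double[] prices) {
        StringBuilder moves = new StringBuilder();
        for (int i = 1; i < prices.length; i++) {
            if (prices[i] < prices[i - 1]) {
                moves.append('D');
            } else if (prices[i] > prices[i - 1]) {
                moves.append('U');
            } else {
                moves.append('-');
            }
        }
        return moves.toString();
    }

    // Returns {firstRow, lastRow} (zero-based indexes into prices) per W shape.
    static List<int[]> findWShapes(double[] prices) {
        Matcher m = W_SHAPE.matcher(encodeMoves(prices));
        List<int[]> hits = new ArrayList<>();
        while (m.find()) {
            // Move k describes row k + 1, so a match over moves [start, end)
            // covers rows start + 1 through end.
            hits.add(new int[] {m.start() + 1, m.end()});
        }
        return hits;
    }

    public static void main(String[] args) {
        double[] prices = {10, 9, 8, 9, 10, 9, 8, 9, 10, 11};
        for (int[] hit : findWShapes(prices)) {
            System.out.println("W shape from row " + hit[0] + " to row " + hit[1]);
        }
    }
}
```

Even this reduced version needs its own encoding and index bookkeeping, and it still omits partitioning by name and the time window, which the SQL clause states declaratively.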
Data Visualization: No Changes to Query Kafka & Oracle
Securing Access to Data
• Support Source Security Rules
– Use access privileges defined on HDFS sources with multiuser authorization
• Extend Protection with Advanced Oracle Security Policies
– Redaction
– VPD
– Database Vault
– Database Security Assessment Tool
Diagram callouts:
Support Varied Application Authentication Methods
Add Oracle Advanced security options
Automatically use ACLs on protected files
Diagram labels: Single User, Application Users, LDAP / DB Users; Employee Dir, My HR, Direct Access; Oracle Big Data SQL; Salary, Emp; Hadoop Cluster
Demonstration
Seamlessly extend your warehouse
1. Review data in warehouse
2. Extend customer attributes with external data
3. Secure it
4. Add new detail facts – customer behavior information
5. Gain insights using advanced Oracle SQL
Big Data SQL 4.0
Roadmap
Big Data SQL 3.2 Big Data SQL 4.0 Cloud Smart Scan
Query Server: SQL on Hadoop
• An Oracle query engine deployed to a Hadoop Cluster
• Simple, zero maintenance
– Uses Hive metadata and Hadoop authorization
– Oracle data not saved to Query Server
• Included with Big Data SQL license
Diagram labels: REST, Python, node.js, SQL, R, Graph, Java; Oracle Database, Big Data SQL; Hive, Big Data SQL; Streaming
Performance Breakthroughs
Aggregation Offload: Major Performance Breakthrough
Single table Count(*), elapsed (sec): offload OFF 34.85, ON 5.38
  SELECT COUNT(*)
  FROM store_sales
[Chart: elapsed time by # of SUM columns (1, 2, 4), offload OFF vs ON. Data labels: OFF 159.0, 124.2; ON 8.9, 10.5, 14.3]
  SELECT ss_store_sk,
         sum(ss_wholesale_cost),
         sum(ss_list_price)
  FROM store_sales
  GROUP BY ss_store_sk
Multi-table: Join fact to dimension table
[Chart: elapsed seconds (x-axis: 1, 2, 4), offload OFF vs ON. Data labels: OFF 256.8, 181.1, 151.5; ON 17.1, 11.4, 15.2]
  SELECT d_dom,
         sum(ss_wholesale_cost),
         sum(ss_list_price)
  FROM store_sales, date_dim
  WHERE ss_sold_date_sk = d_date_sk
  GROUP BY d_dom
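The slide reports the speedup but does not say how aggregation offload works. One plausible reading, assumed here rather than taken from the deck, is classic two-phase aggregation: each storage-side node reduces its local rows to one partial sum per group, and the database merges only those small partial results. The sketch below illustrates that idea for a query shaped like the GROUP BY ss_store_sk example. The names Sale, partialSums, and mergePartials are hypothetical and do not correspond to any Big Data SQL API.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration of two-phase (partial, then final) aggregation.
final class AggregationOffloadSketch {

    // Stand-in for a store_sales row: grouping key plus one measure.
    static final class Sale {
        final int storeSk;
        final double listPrice;

        Sale(int storeSk, double listPrice) {
            this.storeSk = storeSk;
            this.listPrice = listPrice;
        }
    }

    // Phase 1, near the data: collapse local rows to one sum per group.
    static Map<Integer, Double> partialSums(List<Sale> localRows) {
        Map<Integer, Double> sums = new HashMap<>();
        for (Sale s : localRows) {
            sums.merge(s.storeSk, s.listPrice, Double::sum);
        }
        return sums;
    }

    // Phase 2, in the database tier: merge the per-node partial sums.
    static Map<Integer, Double> mergePartials(List<Map<Integer, Double>> partials) {
        Map<Integer, Double> totals = new HashMap<>();
        for (Map<Integer, Double> partial : partials) {
            partial.forEach((key, value) -> totals.merge(key, value, Double::sum));
        }
        return totals;
    }

    public static void main(String[] args) {
        List<Sale> nodeA = new ArrayList<>();
        nodeA.add(new Sale(1, 10.0));
        nodeA.add(new Sale(2, 5.0));
        nodeA.add(new Sale(1, 2.5));

        List<Sale> nodeB = new ArrayList<>();
        nodeB.add(new Sale(2, 1.0));
        nodeB.add(new Sale(3, 4.0));

        List<Map<Integer, Double>> partials = new ArrayList<>();
        partials.add(partialSums(nodeA));
        partials.add(partialSums(nodeB));

        // Only one entry per group and node crosses the network, not every row.
        System.out.println(mergePartials(partials));
    }
}
```

Under this assumption, the data sent to the database scales with the number of groups per node rather than the number of fact rows, which would be consistent with the gap between the OFF and ON timings.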
Support Object Store Sources
• Support data captured in object stores
– Oracle Object Storage, Amazon S3, Azure Blob Storage
• Use new ORACLE_BIGDATA driver
– Optimized C-mode driver support
– Support text, parquet, avro, json
Limitless, highly available, economical storage
Diagram labels: Oracle Database, Oracle Big Data SQL
Big Data SQL Cluster: Separate Compute from Storage
Post-Big Data SQL 4.0
• Support hybrid deployments
– Data local processing for Hadoop
– Separate compute and storage for other cloud deployments
• Improve performance and decrease costs
– Extends database processing to local data center
– Minimize data movement across cloud
Diagram labels: Oracle Database, Big Data SQL; Hive Metadata; Big Data SQL Cells; Big Data SQL Cell Cluster (three); HDFS, Oracle Object Store, Azure Blob Storage, Amazon S3
Important Investments on the Road to Autonomous DB
• Performance: Major breakthroughs in distributed database processing
• Data Sources: Added support for Object Stores
• Deployment: Option to separate compute and storage
• Metadata: Query Server automatically synchronizes external metadata
with Oracle Database query engine
Autonomous Database with
Cloud Smart Scan
Roadmap
Big Data SQL 3.2 Big Data SQL 4.0 Cloud Smart Scan
Autonomous Data Warehouse
• Easy
– Automated management
– Automated tuning: Simply load data and run
• Fast
– Based on Exadata technology
• Elastic
– Instant scaling of compute or storage with no downtime
Metadata Catalog
• Catalog scans and inventories data from across all locations
– Crawl sources and identify schemas
– Understand business definition, lineage and more
– Identify and tag sensitive data
• Metadata immediately available for query
– No need to define tables and parsing rules
Diagram labels: Autonomous Database, Metadata Catalog
Oracle Data Management Vision
PROCESSING
METADATA CATALOG
STREAMING PERSISTENCE
https://cloudcustomerconnect.oracle.com/posts/9aaeb6c91a
http://cloudcustomerconnect.oracle.com