
Data Extracts

By Sudharshan Seerapu
Customer Advocacy Manager
Feb 9th, 2016

Help people see and understand their data

Topics
Understanding Tableau Data Extracts
Why Use Tableau Data Extracts
Tips, Tricks and Best Practices

"Because Tableau can now do analytics so swiftly and gives people the choice to connect directly to fast databases or use Tableau's in-memory data engine, it has become much more powerful in respect of data exploration and data discovery. This leads to analytical insights that would most likely have been missed before."
Robin Bloor, Ph.D and founder of The Bloor

Understanding Tableau Data Extracts

What is a Tableau Data Extract (TDE)?
A compressed snapshot of data stored on disk and loaded into memory as required to render a Tableau viz.

Columnar Store - stores data tables as sections of columns of data rather than as rows of data
Architecture Aware - means that TDEs use all parts of your computer's memory, from RAM to hard disk, and put each part to work as best fits its characteristics
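
To make the row-versus-column distinction concrete, here is a minimal Python sketch. The sample table and its field names are made up for illustration, and the layout shown is not Tableau's actual file format.

```python
# Hypothetical sample table; the field names are illustrative only.
rows = [
    {"region": "East", "sales": 100},
    {"region": "West", "sales": 250},
    {"region": "East", "sales": 75},
]

# Row store: each record is kept together.
row_store = [(r["region"], r["sales"]) for r in rows]

# Columnar store: each column is kept together as its own sequence.
column_store = {
    "region": [r["region"] for r in rows],
    "sales": [r["sales"] for r in rows],
}

# An aggregate over one column only has to read that column.
total_sales = sum(column_store["sales"])
print(total_sales)  # 425
```

In the columnar layout, summing sales never touches the region values, which is the property that tends to help analytic queries.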

Rows vs Columnar

TDE Generation
Generates TDE
Retrieves data
Adds values for each column to file
Sorts
Compresses
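
The generation steps named above can be sketched roughly in Python. This is an illustration only: the slides do not say which compression the TDE uses, so dictionary encoding stands in for the compress step, and the function names `encode_column` and `build_extract` are hypothetical.

```python
def encode_column(values):
    """Dictionary-encode one column: sorted distinct values plus integer indexes."""
    dictionary = sorted(set(values))  # sort step
    position = {v: i for i, v in enumerate(dictionary)}
    # Compress step (assumed technique): small integers replace repeated values.
    indexes = [position[v] for v in values]
    return dictionary, indexes


def build_extract(rows, columns):
    """Take retrieved rows and store each column separately in encoded form."""
    extract = {}
    for i, name in enumerate(columns):
        column_values = [row[i] for row in rows]
        extract[name] = encode_column(column_values)
    return extract


# Hypothetical retrieved data.
rows = [("East", 100), ("West", 250), ("East", 100)]
extract = build_extract(rows, ["region", "sales"])
print(extract["region"])  # (['East', 'West'], [0, 1, 0])
```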

Why Use Tableau Data Extracts

7 Reasons for Using Tableau Data Extracts

Performance
Reduced load
Portability
Pre-aggregations
Materialization of calculated fields
Publishing to Tableau Public and Tableau Online
Support for missing functionality

Example Use Cases


Compare an aggregate for all rows in an underlying source with the same aggregate for a subset of the rows.
Create double aggregates.
Build a KPI-style dashboard that combines worksheets based on aggregated extracts with worksheets based on live connections.
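
The first two use cases can be illustrated outside Tableau with a small Python sketch. The order rows are invented sample data, and the arithmetic only mirrors the idea of the calculations, not how Tableau evaluates them.

```python
from collections import defaultdict

# Hypothetical order rows: (region, sales).
orders = [("East", 100), ("East", 300), ("West", 200), ("West", 400), ("North", 500)]

# Aggregate for all rows versus the same aggregate for a subset of rows.
total_all = sum(sales for _, sales in orders)
total_east = sum(sales for region, sales in orders if region == "East")
share_east = total_east / total_all

# Double aggregate: first SUM per region, then AVG of those sums.
sum_by_region = defaultdict(int)
for region, sales in orders:
    sum_by_region[region] += sales
avg_of_sums = sum(sum_by_region.values()) / len(sum_by_region)

print(total_all, total_east, avg_of_sums)  # 1500 400 500.0
```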

KPI Dashboard with Extracts & Live

Tips, Tricks and Best Practices

Things to Keep in Mind

Hidden fields when creating extracts
Location of extracts
Retaining the case sensitivity of the source data
Incremental or full refresh
Regularly defragmenting a Tableau Server's hard drive or using an SSD
Backgrounder needs enough disk space to store existing Tableau extracts
Tabcmd (a command-line utility) can be used to refresh extracts, as well as to publish TDEs to Tableau Server
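
As a rough illustration of scripting tabcmd, the Python sketch below only builds the command lines and prints them rather than running anything. The data source, project, and file names are hypothetical, and the flags shown (`--datasource`, `--project`, `--incremental`, `--name`, `--overwrite`) should be checked against the tabcmd reference for your Tableau Server version.

```python
import shlex


def refresh_extract_command(datasource, project=None, incremental=False):
    """Build a `tabcmd refreshextracts` argument list for a published data source."""
    cmd = ["tabcmd", "refreshextracts", "--datasource", datasource]
    if project:
        cmd += ["--project", project]
    if incremental:
        cmd.append("--incremental")
    return cmd


def publish_extract_command(tde_path, name, overwrite=False):
    """Build a `tabcmd publish` argument list for a .tde file."""
    cmd = ["tabcmd", "publish", tde_path, "--name", name]
    if overwrite:
        cmd.append("--overwrite")
    return cmd


# Hypothetical names; the commands are printed, not executed.
print(shlex.join(refresh_extract_command("Sales Extract", project="Finance")))
print(shlex.join(publish_extract_command("sales.tde", "Sales Extract", overwrite=True)))
```

To actually run a command, the returned list could be passed to `subprocess.run(cmd, check=True)` after a `tabcmd login` session has been established.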

Tips for Incremental Refreshes


Incremental extracts retrieve only new records from the data source, reducing the amount of time required to create an up-to-date extract.
When performing an incremental extract, existing records are not replaced.
When creating an incremental refresh against an Excel data source, only Date columns will be available to use for defining new rows.
When publishing an extract that will not or should not be refreshed, connect directly to the extract file as a data source before publishing.
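
The behaviour described in the first two tips can be sketched in Python as follows. This is a simplified model, assuming new rows are identified by a column value greater than the largest value already in the extract; the function name `incremental_refresh` and the sample rows are hypothetical.

```python
def incremental_refresh(extract_rows, source_rows, key):
    """Append source rows whose key is greater than the largest key already extracted."""
    high_water_mark = max((row[key] for row in extract_rows), default=None)
    new_rows = [
        row for row in source_rows
        if high_water_mark is None or row[key] > high_water_mark
    ]
    # Existing rows are kept as-is, even if they changed in the source.
    return extract_rows + new_rows


extract = [{"id": 1, "amount": 10}, {"id": 2, "amount": 20}]
source = [{"id": 1, "amount": 10}, {"id": 2, "amount": 99}, {"id": 3, "amount": 30}]
refreshed = incremental_refresh(extract, source, key="id")
print(refreshed)
```

Note that the changed amount for `id` 2 in the source is not picked up, which is one reason a periodic full refresh is still useful.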

Tips for Aggregated Extracts


Use caution with aggregations such as COUNTD or other non-additive aggregations.
Use caution with row level calculations that involve a parameter.
Number of Records is special in aggregated extracts. Its default aggregation is SUM.
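
A small Python sketch shows why a distinct count needs caution once data has been aggregated, while Number of Records still rolls up with SUM. The detail rows and field names are invented for illustration.

```python
from collections import defaultdict

# Hypothetical detail rows: (region, customer).
detail = [("East", "Ann"), ("East", "Bob"), ("West", "Ann"), ("West", "Ann")]

# Aggregated extract at the region level: one row per region.
customers_by_region = defaultdict(set)
records_by_region = defaultdict(int)
for region, customer in detail:
    customers_by_region[region].add(customer)
    records_by_region[region] += 1

aggregated = [
    {
        "region": r,
        "countd_customers": len(customers_by_region[r]),
        "number_of_records": records_by_region[r],
    }
    for r in customers_by_region
]

# Number of Records rolls up correctly with SUM.
total_records = sum(row["number_of_records"] for row in aggregated)
# Summing pre-computed distinct counts overcounts customers seen in several regions.
summed_countd = sum(row["countd_customers"] for row in aggregated)
true_countd = len({customer for _, customer in detail})

print(total_records, summed_countd, true_countd)  # 4 3 2
```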

Data Source Considerations


When extracting data from SAP BW, be aware that some limitations exist.
When creating an initial extract from a Salesforce.com data source, Tableau retrieves all objects, and the extract creation can be time-consuming (several hours in some cases).

Best Practices
Use full refreshes when possible
Incrementally refreshed extracts should be fully refreshed at regular intervals
Publish extracts to Data Server to help avoid redundant extracts in the Tableau Server environment
Hide unused columns before creating an extract in order to speed extract creation and to preserve storage space
Make sure there is enough contiguous free disk space for the largest extract in use in order to optimize extract performance

Questions?
sseerapu@tableau.com
(408) 506 6377
