You are on page 1of 49

1

<Insert Picture Here>

Get the Best Out of Oracle Data Pump Functionality


Dean Gagne (Oracle)
Viljo Hakala (Nokia)

Agenda

Moving large amounts of data with Transportable


Tablespaces
Filtering metadata using the INCLUDE and EXCLUDE
parameters
Restarting stopped/failed jobs
Hear about Nokia Corporation's database environment
How Nokia uses the REMAP_DATA parameter to
scramble data
How Nokia regenerates primary keys without having to
use additional software or scripts

What is Oracle Data Pump?


New feature starting in Oracle Database 10g Release 1
Enables very fast bulk data and metadata movement between
Oracle databases
High-speed, parallel Export and Import utilities (expdp and
impdp) as well as a Web-based Oracle Enterprise Manager
interface
Jobs can be restarted without loss of data, whether or not the
stoppage was voluntary or involuntary
Jobs support fine-grained object selection. Virtually any type of
object can be included or excluded
Supports the ability to load one instance directly from another
(network import) and unload a remote instance (network export)

Moving Large Amounts of Data


With Transportable Tablespace

What is Transportable Tablespace?

An Oracle Database feature that allows data file


transfer from a database to another via a simple os
copy and a light specific export/import
Data Pump will move metadata only
Data moves with data file copy
Much faster than using direct path or external tables

Moving Large Amounts of Data


With Transportable Tablespaces
Restrictions:
Tablespaces need to be self contained
All dependent objects must be included in tablespace set

Tablespaces need to be read only for duration of export and


datafile copying.
Not restartable
Must be privileged account
Notes:
Use RMAN CONVERT to change endianness (if needed)
Data Pump moves only metadata
Can use network link
For a full list of what is exported:
select unique seq_num, full_path
from datapump_paths
where het_type = 'TRANSPORTABLE_EXPORT
order by seq_num;

Transportable Tablespaces
Self Containment Check
Create table part_tab(id number) tablespace a
partition by range (id)
(partition low value less than (100) tablespace b,
partition hi values less than (1000) tablespace c);
Create index part_ind on part_tab.id tablespace d;
Transportable tablespace export requires all 4
tablespaces
(a, b, c, d)

DBMS_TTS.TRANSPORT_SET_CHECK procedure
to verify
accepts a comma separated list of tablespace names
SELECT * FROM transport_set_violations;

Transportable Tablespace Steps


Source system tasks:
Optionally run DBMS_TTS.TRANSPORT_SET_CHECK
Set tablespaces read only
Run expdp command with tablespace list
Data Pump will list required dumpfiles and datafiles in log file

Copy datafiles and dumpfiles to target system


Optionally set tablespaces read write
Target system tasks:
Run impdp command
Optionally set tablespaces to read write
9

Table Mode with Transportable=always


Can export/import tables, partitions, and indexes
regardless of tablespace closure
Cant have any storage in system, sysaux, temp tablespaces

Data Pump export will list tablespaces that need to be


read only if not already done
Data Pump export will list datafiles and dumpfiles that
need to be copied to target
Data Pump import can use filters dont have to
import the complete dumpfile set
Can import only one partition Data Pump creates a
non-partitioned table
Partition_options=departition creates non-partitioned
tables for every partition
10

Export Table Mode with Transportable=always


Expdp scott/tiger tables=pt1,pt2 transportable=always

Table filter needs to be complete tables or partitions from


the same table
ptab1,ptab2, etc or ptab1:lo,ptab1:hi
ORA-29335: tablespace 'USER1' is not read only
ORA-29335: tablespace 'USER2 is not read only
Dump file set for SCOTT.SYS_EXPORT_TABLE_01 is:
/oracle/work/tab_tts.dmp
Datafiles required for transportable tablespace USER1:
/oracle/dbs/user1.f
Datafiles required for transportable tablespace USER2:
/oracle/dbs/user2.f

11

Import Using Transportable Table Mode Dumpfile


Impdp scott/tiger tables=ptab2:hi
rename_table=ptab2:hi:my_tab1 transport_datafiles
Created table is my_tab1

Impdp scott/tiger tables=ptab2


partition_options=departition transport_datafiles
Created tables are ptab2_lo, ptab2_hi

Creates all tablespaces even if no objects created


due to import filters
Cleans up any unused tablespace segments that are
not imported

12

Filtering Metadata

13

Filtering Metadata Overview

Perform selective export/import job based on object


type
Exp/imp had limited filtering
Grants, index, triggers, statistics, constraints

Data Pump has almost complete filtering capabilities

14

Filtering Metadata Using


INCLUDE and EXCLUDE Parameters

If using exclude parameter, everything else is included


If using include parameter, everything else is excluded
Cant use exclude and include in the same Data Pump job
Specify complete path or partial path. Objects matching the
specified path will be excluded/included.
Query to find exclude/include object types:
select unique seq_num, full_path
from sys.datapump_paths
where het_type = 'DATABASE_EXPORT'
order by seq_num;
Job_type:
het_type:
FULL
SCHEMA
TABLE
TRANSPORTABLE

DATABASE_EXPORT
SCHEMA_EXPORT
TABLE_EXPORT
TRANSPORTABLE_EXPORT

15

Exclude Example:
expdp system/manager schema=hr exclude=statistics
vs
expdp system/manager schema=hr exclude=
SCHEMA_EXPORT/TABLE/STATISTICS
select unique seq_num, full_path
from sys.datapump_paths
where het_type = 'SCHEMA_EXPORT' AND full_path like '%STATISTICS%
order by seq_num;
77
78
221
222
223
225
226
227

SCHEMA_EXPORT/TABLE/INDEX/STATISTICS
SCHEMA_EXPORT/TABLE/INDEX/STATISTICS/INDEX_STATISTICS
SCHEMA_EXPORT/TABLE/INDEX/STATISTICS
SCHEMA_EXPORT/TABLE/INDEX/STATISTICS/FUNCTIONAL_AND_BITMAP
SCHEMA_EXPORT/TABLE/INDEX/STATISTICS/FUNCTIONAL_AND_BITMAP/INDEX_STATISTICS
SCHEMA_EXPORT/TABLE/STATISTICS
SCHEMA_EXPORT/TABLE/STATISTICS/TABLE_STATISTICS
SCHEMA_EXPORT/TABLE/STATISTICS/USER_PREF_STATISTICS

16

Include Examples:
impdp system/manager tables=hr.employees
vs
impdp system/manager schemas=hr
include=table:\"= \'EMPLOYEES\'\"
Same results
Includes all objects that have table in the path
vs
impdp system/manager schemas=hr
include=table/table:\"= \'EMPLOYEES\'\
Only includes the table
Some of the TABLE object paths
SCHEMA_EXPORT/TABLE/TABLE
SCHEMA_EXPORT/TABLE/TABLE_DATA
SCHEMA_EXPORT/TABLE/GRANT

17

Restarting Stopped/Failed Jobs

18

Restarting Data Pump Jobs


Restart jobs intentionally or unintentionally stopped
User stopped
Dumpfile exhausted
Resumable wait

Optionally change value of PARALLEL parameter


Helps to know the job name
System generated job names
SYS_IMPORT_FULL_01
SYS_EXPORT_TABLE_05

What jobs are restartable


Select * from dba_datapump_jobs
Select * from user_datapump_jobs

19

Restarting Stopped Export Jobs


Initial Export Command:
expdp system/manager job_name=datapump_exp

Restart command:
expdp system/manager attach=datapump_exp
Export> start_job

Resets dumpfile pointers to last completed object type


Restarts the job after the last completed object

20

Restarting Stopped Import Jobs


initial import command:
impdp system/manager job_name=datapump_imp
Restart command:
Impdp system/manager attach=system.datapump_imp
Import> Start_job[=skip_current]
Detects the last object created
If creating an object and the object exists, checks to see if object
was in progress in failed job
Data that was being loaded would have either been committed or
not. Committed data marked complete, uncommitted data will be
retried.

21

Restarting Stopped Network Import Jobs


Initial Import Command
impdp scott/tiger job_name=net_imp network_link=dbs1
Restart command:
Impdp scott/tiger attach=net_imp
Import> Start_job[=skip_current]
Restarts on object type
If creating object and object exists, checks to see if object was in
progress in failed job
Data that was being loaded would have either been committed or
not. Committed data marked complete, uncommitted data will be
retried.

22

Hear about Nokia Corporation's Database


Environment and how they use Oracle Data Pump
Viljo Hakala (Nokia)

23

Nokia
Nokia is a world leader in mobility
Head office in Finland; R&D, production, sales, marketing
activities around the world
Worlds #1 manufacturer of mobile devices, with estimated
40% share of global device market in 2009
Mobile device volumes 468 million units
Net sales EUR 50.7 billion
Operating profit EUR 5.0 billion
128 445 employees at year end (including Nokia Siemens
Networks)
Strong R&D presence in 16 countries
R&D investment EUR 6.0 billion
Sales in more than 150 countries

24
24

Oracle @ Nokia
20 DBAs, 2300 databases
Grid Control: used for almost 5 years, #1 tool for all
DBAs
Applications:
OLTP: 50%, Hybrid databases: 40%, DW: 10% +
Teradata
About 1800+ Oracle; also MySQL 200+, MS SQL 300+
11g: 4%
10g: 80%
9i: 15%
8i: 1%
Host platforms:
Solaris: 35%
HP-UX: 35%
Linux: 20%
Windows 10%
25
25

Oracle @ Nokia

26
26

Production databases: single, HA Oracle, RAC, HA MySQL


5 MB 7 TB range
Leverage virtually all Oracle database features
DBAs share responsibilities
Many federated databases: Streams, Advanced
Replication, Materialized views, DB links
Common Oracle platform, Common MySQL platform and
Common MS SQL platform
99% of databases backed up with RMAN
8 RMAN catalog databases
10 PB+ backed up / month

What is Data Masking?

27
27

Hints for Effective Data Masking


Be sure to mask all data that is not sensitive, but can be used to create
sensitive data.
You should not be able to reverse the masking for it to be secure. Never
should you be able to retrieve original values from masked values.
The masked or obfuscated data should follow the source data format.
Masking should protect data, but still allow realistic looking data to be
used for testing.
Integrity of relations in data should be followed. When masking a
primary key, the foreign keys should also be masked using the same
masking method.
You should be able to repeat the masking process for it to be an
effective day to day process.

28

Data Masking with Oracle Data Pump


One of the best uses of Oracle Data Pump is to load test
and development systems with data from production
systems for realistic testing purposes.
Due to company rules or legal regulations, sensitive data
in the non-production systems has to be replaced with
realistic looking data to enable effective testing.
Effective data masking routines should support
automation and Oracle Data Pump provides several ways
to automate data imports and exports with masking.

29
29

Data Masking with Oracle Data Pump


In addition to Data Masking Pack in Grid Control, Oracle
Data Pump provides a method to mask data: REMAP_DATA
parameter introduced in Oracle Database 11g.
Oracle Data Pumps REMAP_DATA feature uses a
remapping function to rewrite data.

For example, a column with phone numbers could be


replaced by a numbers generated by a REMAP_DATA
function.

30
30

Data Masking with Oracle Data Pump


REMAP_DATA allows transformation of columns data while
exporting (expdp) or importing (impdp) by using a remapping
function in the database.
REMAP_DATA with Data Pump is usually faster than a custom
UPDATE statement for masking data.
To mask multiple columns in the same process and command,
the REMAP_DATA parameter can be used multiple times.

31
31

Data Masking with Oracle Data Pump:


Quick example
REMAP_DATA=[schema.]tablename.column_name:
[schema.]pkg.function
impdp dumpfile=data.dmp
REMAP_DATA=scott.orders.customer_name:scott.maskpk
g.mask
expdp dumpfile=data.dmp
REMAP_DATA=scott.orders.customer_name:scott.maskpk
g.mask

32
32

Data Masking with Oracle Data


Pump:
Syntax
REMAP_DATA=
[schema1.]tablename.colum
n_name:
[schema2.]pkg.function
schema1 -- the schema with
the table to be remapped. By
default, this is the schema of
the user doing the export.
tablename -- the table which
column will be remapped.

33
33

column_name -- which is to
be remapped.
schema2 -- with the PL/SQL
package for remapping
function. As a default, this is
the schema of the user doing
the export.
pkg -- the name of the
PL/SQL package with the
remapping function.
function -- the name of the
remap function in the PL/SQL
package

Data Masking with Oracle Data


Pump: Restrictions
Data types must be same in the table column, masking function
parameter, and function return value.
No commits or rollbacks should be done in the masking function.
No direct path loads can be used in import process with
REMAP_DATA.
Note: Operation of long export/import data pump processes can
be monitored from the v$session_longops view, but the
estimated values do not take into account REMAP_DATA
operations.

34
34

Data Masking with Oracle Data


Pump:
Example
Create a table in the CUSTOMERS schema called
phones
SQL>
CREATE TABLE CUSTOMERS.PHONES
(
MODELNAME VARCHAR(20) NOT NULL,
PHONENUMBER VARCHAR2(50)
);
insert into CUSTOMERS.PHONES
values(N900,+3581234567);
insert into CUSTOMERS.PHONES values(N8,+3589817654);
insert into CUSTOMERS.PHONES values(N7,+3584834819);
Commit;
35
35

Data Masking with Oracle Data


Pump:
Example
We then need to create a function
for remapping

create or replace package customers.maskpkg


as
function masknumber(phonenumber
varchar2) return varchar2;
end;
/
create or replace package body
customers.maskpkg as
function masknumber (phonenumber
varchar2) return varchar2 is
begin return
substr(phonenumber,1,4)||
round(dbms_random.value (100,999))||
lpad(round(dbms_random.value
(1,9999)),4,'0');
end;
end;
/

36
36

The function masknumber


will accept a varchar2 type
and returns a random phone
number in varchar2 type

Data Masking with Oracle Data


Pump:
Example
This example will mask one
column: phonenumber to the
export output file
$ expdp customers/manager \
tables=customers.phones \
dumpfile=phones_masked.dmp
\
directory=dumpdir \
remap_data=\
customers.phones.phonenu
mber:\
customers.maskpkg.masknu
mber
37
37

Export the data from the


customers.phones table with
expdp utility and use the
REMAP_DATA option for
masking the data in the
dump file with the function
customers.maskpkg.masknu
mber created earlier
By default the owner of the
remap function is the user
running expdp/impdp
Now the dumpfile
phones_masked.dmp can be
used for testing environments

Data Masking with Oracle Data


Pump:
Example, before masking
SQL> select * from customers.phones
MODELNAME PHONENUMBER
---------- -------------------N900

38
38

+3591234567

N8

+3589817654

N7

+3584834819

Data Masking with Oracle Data


Pump:
import/replace table
This example will import the masked dump file
and replace the existing table with masked column data
$ impdp customers/manager \
tables=customers.phones \
dumpfile=phones_masked.dmp \
directory=dumpdir \
table_exists_action=REPLACE

39
39

Data Masking with Oracle Data


Pump:
Example, after import
SQL> select * from customers.phones
MODELNAME PHONENUMBER
---------- -------------------N900

40
40

+35815499474

N8

+3584800578

N7

+3581247839

Data Masking with Oracle Data


Pump:
Import
REMAP_DATA parameter can also be used in the import process
Use impdp if you have an existing dump file and you want to
mask the data when loading into the database
$ impdp customers/manager TABLE_EXISTS_ACTION=replace
dumpfile=phones.dmp directory=dumpdir
remap_data=customers.phones.phonenumber:customers.mask
pkg.masknumber

41
41

Primary Key Regeneration with


REMAP_DATA
To change primary keys after development phase of an
application, use REMAP_DATA parameter.
It is possible to use Data Pumps REMAP_DATA to change the
primary keys within a database by writing a remapping function
which follows the new primary key format.
Primary key conflicts can happen when loading data to an
existing table in a database.
With REMAP_DATA, conflicts can be avoided by changing the
column values during import.

42
42

When not to use Masking


It can be very difficult and expensive to mask data especially in
large BI/DWH systems due to the amount of data and refining
operations.
It is often easier to use a data model in which sensitive data can
be
limited to one location, referenced via surrogate keys, and
kept safe with strict access policies and auditing.
But when giving out data for simplified testing needs, masking is a
good tool!

43

Other Methods to Secure Data in


Oracle Database 11g Release 2
Oracle Virtual Private Database (VPD aka Fine Grained Access
Control) feature allows filtering of data at row-level for runtime SQL
statements according to a defined policy.
Data Masking Pack for Enterprise Manager. Centralized masking
solution within Grid Control.
Oracle Advanced Security Option for encrypting data at tablespace or
column level.
Oracle Label Security for restricting access to data based on policies.
Oracle Database Vault for separating roles and restricting access to
data per roles.

44

Oracle Data Pump at Nokia:


Summary
Oracle Data Pump is used extensively at Nokia to move data and
metadata between systems.
REMAP_DATA option is a great way to mask sensitive data for
security purposes.
REMAP_DATA option can also be used to change primary keys.
Oracle has many features for securing data at different levels.
Security for data is not just Obscurity. It is a process and requires
time.

45
45

Oracle Data Pump Summary


Transportable Tablespaces
Tablespace mode
Table mode

Filtering metadata using the INCLUDE and EXCLUDE


parameters
Restarting stopped/failed jobs
REMAP_DATA parameter to scramble data
How to generate primary keys without having to use additional
software or scripts

46

Questions?

Oracle Data Pump


Demogrounds Booth Moscone West W-025

47

48

49

You might also like