2 What is Data Pump?
- A replacement for the traditional export/import utilities?
- The evolution of the traditional export/import utilities?
- A completely new 10g utility serving a similar yet slightly different purpose?
3 Other Options for Moving Data: Traditional Export and Import
Pros:
- Easy to use; most DBAs have years of experience with these utilities
- Versatile; various options available, and you can specify what to include
- Platform independent
- Serial output
Cons:
- Comparatively slow
- Can be network intensive
- Not interruptible/resumable
- Limited filtering options (for example, can exclude just VIEWS)
- Limited remapping options (e.g., from one tablespace to another)
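To make the filtering contrast concrete, here is a hedged sketch (user, schema, and file names are illustrative) of the coarse switches available in traditional exp next to the finer-grained EXCLUDE clause that Data Pump introduced:

```
# Traditional exp: filtering is limited to a few coarse on/off switches
exp system/password file=scott.dmp owner=scott rows=y indexes=n constraints=n triggers=n

# Data Pump, for comparison: exclude by object type, or by name pattern
# (DATA_PUMP_DIR is the default directory object in 10gR2; yours may differ)
expdp system/password schemas=scott directory=DATA_PUMP_DIR dumpfile=scott.dmp \
      exclude=INDEX exclude=TABLE:"LIKE 'TMP%'"
```

Note that the quotes in the EXCLUDE clause are shown unescaped; in practice, putting EXCLUDE clauses in a parameter file (PARFILE) avoids shell-quoting problems.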
4 Other Options for Moving Data: Transportable Tablespaces
Pros:
- Undoubtedly the fastest way to move data
- Can use the traditional exp/imp or Data Pump to move the metadata
- Cross-platform support if the platform byte order is the same
Cons:
- Tablespaces must be made read-only
- Not selective (must move the entire tablespace)
- Flashback is not possible (the tablespace is read-only when copied)
- No physical reorganization is performed; datafile sizes remain constant
- Must use RMAN to convert the datafiles if migrating to a platform with a different byte order (check V$TRANSPORTABLE_PLATFORM)

5 Other Options Used Less Frequently
- Extraction to a flat file and loading with SQL*Loader
- Direct copy over database links (the SQL*Plus COPY command)
- Oracle Streams
- Third-party ETL or reorganization tools

6 Top 10 Reasons to Love Data Pump
10. Similar look and feel to the old exp/imp
9. Can filter on the full range of object types
8. Can remap datafiles and/or tablespaces on import
7. Estimates the export file size (space needed)
6. Parallelizable
5. Significantly faster than the traditional exp/imp
4. Programmable through a PL/SQL interface
3. A dump file is not actually required; can import through a network link
2. Progress can be tracked in V$SESSION_LONGOPS
1. Resumable (interruptible and restartable)

7 Top 10 Reasons Not to Love Data Pump
10. Still generates redo (unlike direct-path inserts)
9. Aggregation of exported data is not possible (sorting only)
8. Performance impact on the server
7. Harder to tell what it is doing at any given time
6. No equivalent to the STATISTICS option
5. Cannot be used with sequential media such as tapes and pipes (files are not read/written serially)
4. Only accesses files on the server, never the client
3. Oracle DIRECTORY objects are required in the database to access the files
2. Does not support COMMIT on import or CONSISTENT on export
1. If constraints are violated on import, the load is discontinued

8 Operation Fundamentals
Export/Import:
- These utilities basically connect to the Oracle database via Oracle Net and run queries or DDL/DML
- Processing of returned results and I/O operations is done on the client
Data Pump:
- The executables call PL/SQL APIs, so processing is done on the database server
- This can be an advantage or a disadvantage depending on the situation
- Self-tuning: no longer need to set BUFFER or RECORDLENGTH

9 Export Operation
[Diagram: exp.exe connects to the Oracle database over the network; the export file(s) are written on the client]

10 Data Pump Export Operation
[Diagram: expdp.exe connects to the Oracle database over the network; the export file(s) are written on the server]

11 Key Differences
- Dump and log files are on the server, not the client
- Must have a DIRECTORY object created in the Oracle database for I/O
- Permissions apply to the userid connecting to the instance, not the schemas being exported or imported
- Canceling the client process does not stop the job
- Does not automatically overwrite a dump file that already exists; returns an error instead
- Parameters (command line) are reported in the log file
- Exported objects are ordered by table size (descending) instead of alphabetically

12 Multiple Interfaces
1. Command-line utilities expdp and impdp
   - Similar to the familiar exp and imp in usage
   - Use HELP=Y for a list of commands
   - The Oracle documentation provides a comparison table to exp/imp
2. Enterprise Manager
3. PL/SQL
   - Can be used independently, but it is difficult
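As a hedged illustration of the PL/SQL route (the job name, file names, and directory object are made up), a minimal schema-mode export through DBMS_DATAPUMP looks roughly like this:

```sql
DECLARE
  h NUMBER;
BEGIN
  -- Open an export job in schema mode (job name is arbitrary)
  h := DBMS_DATAPUMP.OPEN(operation => 'EXPORT',
                          job_mode  => 'SCHEMA',
                          job_name  => 'JOB_01');

  -- Dump and log files go to a DIRECTORY object on the server
  DBMS_DATAPUMP.ADD_FILE(handle => h, filename => 'scott.dmp', directory => 'DP_DIR',
                         filetype => DBMS_DATAPUMP.KU$_FILE_TYPE_DUMP_FILE);
  DBMS_DATAPUMP.ADD_FILE(handle => h, filename => 'scott.log', directory => 'DP_DIR',
                         filetype => DBMS_DATAPUMP.KU$_FILE_TYPE_LOG_FILE);

  -- Restrict the job to a single schema
  DBMS_DATAPUMP.METADATA_FILTER(handle => h, name => 'SCHEMA_EXPR',
                                value => 'IN (''SCOTT'')');

  DBMS_DATAPUMP.START_JOB(h);
  DBMS_DATAPUMP.DETACH(h);  -- the job keeps running on the server after detaching
END;
/
```

This is a sketch, not a production script; error handling (DBMS_DATAPUMP raises exceptions such as ORA-31634 on job-name collisions) is omitted for brevity.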
All of these interfaces call the DBMS_DATAPUMP API, which uses Oracle Advanced Queuing and DBMS_METADATA.

13 Unload Mechanisms
- Data Pump automatically chooses to unload data using either:
  - Direct path
  - External tables (a new access driver called ORACLE_DATAPUMP)
    - The same external-tables mechanism that was introduced in Oracle9i
- When will it use external tables?
  - When parallelism can be used
  - When the table contains a complex data type or structure that prevents direct-path unloads
    - A lot of tables fall into this category; see the Oracle documentation for a complete list
- It does not really matter to us which method is used

14 Multiple Processes
Master control process:
- Spawns worker processes
- Populates the master table and the log file
- The master table can be queried to track the job's progress
- At the end of an export, the master table is written to the dump file and dropped from the database
Worker processes:
- Perform the loading/unloading
- The number of processes depends on the degree of parallelism (the PARALLEL option)

15 Detaching and Re-Attaching
- Issuing Ctrl-C from the Data Pump import client will detach
  - The import is running on the server, so it will continue
  - Brings you into interactive-command mode
- To re-attach, run impdp with the ATTACH= option
  - Example: impdp userid=system/oracle attach=JOB_01
  - Brings you back into interactive-command mode

16 New Views
DBA_DATAPUMP_JOBS and USER_DATAPUMP_JOBS
- Identify all jobs regardless of their state
- Identify any master tables not associated with an active job
DBA_DATAPUMP_SESSIONS
- Identifies user sessions that are attached to a job
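The two views above can be joined for a quick picture of running jobs and who is attached to them; a minimal monitoring sketch:

```sql
-- List Data Pump jobs with any sessions attached to them
SELECT j.owner_name, j.job_name, j.operation, j.job_mode,
       j.state, j.attached_sessions, s.saddr
FROM   dba_datapump_jobs     j
LEFT JOIN dba_datapump_sessions s
       ON s.owner_name = j.owner_name
      AND s.job_name   = j.job_name
ORDER BY j.owner_name, j.job_name;
```

SADDR can in turn be joined to V$SESSION to identify the SID of each attached session.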
- Data Pump sessions populate V$SESSION_LONGOPS
  - The documentation says this is 100% accurate for imports, but testing proves otherwise!

17 Security Considerations
- Still uses the EXP_FULL_DATABASE and IMP_FULL_DATABASE roles
- A privileged user has these two roles and can:
  - Export/import objects owned by other schemas
  - Export non-schema objects (metadata)
  - Attach to, monitor, and control jobs initiated by others
  - Perform schema, datafile, and tablespace remapping
- Similar to the traditional export/import
- Supports label security if the exporting user has the EXEMPT ACCESS POLICY privilege

18 Object Statistics
- From the Oracle documentation regarding Data Pump exports: "A parameter comparable to STATISTICS is not needed. Statistics are always saved for tables."
- From the Oracle documentation regarding Data Pump imports: "A parameter comparable to STATISTICS is not needed. If the source table has statistics, they are imported."

19 Other Random Points
- Can still use a parameter file, via the PARFILE command-line option
- Fully supports Automatic Storage Management (ASM)
- Can still flash back to a specified time or SCN
- Can still extract (or back up) DDL (metadata), using the SQLFILE option instead of the traditional INDEXFILE or SHOW options
- Full support for LOBs
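As a closing sketch of the SQLFILE option (the directory path, grantee, and file names are illustrative), extracting the DDL from a dump file without loading anything looks like this. First the one-time server-side setup, since Data Pump can only read and write through a DIRECTORY object:

```sql
-- Create a directory object and let the connecting user use it
CREATE DIRECTORY dp_dir AS '/u01/app/oracle/dpdump';
GRANT READ, WRITE ON DIRECTORY dp_dir TO scott;
```

Then the import client writes the DDL to a script instead of executing it:

```
# Write the DDL contained in the dump to ddl.sql; no objects are created,
# no rows are loaded
impdp system/password directory=dp_dir dumpfile=scott.dmp sqlfile=ddl.sql
```

This replaces both the INDEXFILE and SHOW workflows of the traditional imp, and the resulting file is a clean, runnable SQL script rather than commented-out output.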