
TSM 6.2 Family Update
Jim Smith Tivoli Storage Manager Architect
IBM Software

Optimizing the World's Infrastructure


26 May 2010 Stockholm

2010 IBM Corporation

Disclaimer
This presentation describes potential future enhancements to the IBM Tivoli Storage Manager family of products.
All statements regarding IBM's future direction and intent are subject to change or withdrawal without notice, and represent goals and objectives only.
Information in this presentation does not constitute a commitment to deliver the described enhancements or to do so in a particular timeframe.
IBM reserves the right to change product plans, features, and delivery schedules according to business needs and requirements.
This presentation uses the following designations regarding availability of potential product enhancements:
Future Candidate: Candidate for delivery in a future release (2011 or beyond)

The information on the new product is intended to outline our general product direction and it should not be relied on in making a purchasing decision. The information on the new product is for informational purposes only and may not be incorporated into any contract. The information on the new product is not a commitment, promise, or legal obligation to deliver any material, code or functionality. The development, release, and timing of any features or functionality described for our products remains at our sole discretion.

Abstract
This session provides a detailed preview of selected future enhancements in the next TSM release and beyond. Topics include deduplication, security and compliance, management of the server database and storage hierarchy, virtual tape library integration, and software deployment.

Agenda
Data Deduplication
Security and Compliance
Storage Hierarchy and Database
VTL Integration
Software Deployment
Backup Environments

Note: VMware and Hyper-V will be covered in a separate presentation at this conference and are therefore not included in this presentation

DATA DEDUPLICATION

Server-Side Data Deduplication


[Diagram: a TSM 6.1 client, a TSM 5.5 client, and an Exchange Server send files to a TSM 6.1 server with a deduplication-enabled disk storage pool, a hash index, and a non-deduplicated copy storage pool]

1. Data sent from clients to server and stored in primary storage pool
2. Identify Duplicates process creates chunks and pointers to hash index (deduplication index) in server database to relate files to chunks
3. Backup Stgpool operation copies data to non-deduplicated copy storage pool
4. Duplicate data chunks removed from primary storage pool during Reclaim operation

Currently Available
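The server-side steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not TSM's implementation: the fixed-size chunking, the in-memory dictionaries, and the names `CHUNK_SIZE`, `DedupPool`, `store`, `identify_duplicates`, `reclaim`, and `read` are all hypothetical. The slides only state that chunks are related to files through a hash index.

```python
import hashlib

CHUNK_SIZE = 4  # hypothetical tiny chunk size so the example is easy to follow


class DedupPool:
    """Toy deduplication-enabled pool: raw files plus a hash index."""

    def __init__(self):
        self.raw_files = {}    # file name -> bytes as first stored (step 1)
        self.hash_index = {}   # chunk digest -> chunk bytes (stored once)
        self.file_chunks = {}  # file name -> ordered list of chunk digests

    def store(self, name, data):
        # Step 1: data arrives from the client and is stored as-is.
        self.raw_files[name] = data

    def identify_duplicates(self):
        # Step 2: split stored files into chunks and point each file at
        # entries in the hash index; identical chunks share one entry.
        for name, data in self.raw_files.items():
            digests = []
            for i in range(0, len(data), CHUNK_SIZE):
                chunk = data[i:i + CHUNK_SIZE]
                digest = hashlib.sha1(chunk).hexdigest()
                self.hash_index.setdefault(digest, chunk)
                digests.append(digest)
            self.file_chunks[name] = digests

    def reclaim(self):
        # Step 4: drop the raw copies; only unique chunks remain.
        self.raw_files.clear()

    def read(self, name):
        # Rebuild a file from its chunk pointers, as a copy or restore would.
        return b"".join(self.hash_index[d] for d in self.file_chunks[name])


pool = DedupPool()
pool.store("file1", b"AAAABBBBCCCC")
pool.store("file2", b"BBBBCCCCDDDD")
pool.identify_duplicates()
pool.reclaim()
unique_bytes = sum(len(c) for c in pool.hash_index.values())
```

In this sketch, 24 bytes of client data are held as 16 bytes of unique chunks, because the two files share the chunks `BBBB` and `CCCC`.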

Client-Side Data Deduplication


[Diagram: a TSM 6.x client, an Exchange Server with the TSM 6.x API, and a FastBack Server with the TSM 6.x API send chunks to a TSM 6.x server with a deduplication-enabled disk storage pool, a hash index, and a non-deduplicated copy storage pool]

1. Client creates chunks
2. Client and server identify which chunks need to be sent
3. Client sends chunks and hashes so server can represent object in database
4. Entire file is reconstructed during Backup Stgpool operation to non-deduplicated storage pool

Chunks within pool are shared by client-side and server-side deduplication operations

Reduced space requirement in storage pools
Reduced consumption of network bandwidth

TSM 6.2
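The client-side exchange above can be sketched as follows. This is only an illustration of the idea: the fixed-size chunking, the `Server` class, and the function names are hypothetical, and the real client-server protocol is not described in the slides.

```python
import hashlib

CHUNK_SIZE = 4  # hypothetical chunk size, chosen only for readability


def chunk(data):
    """Step 1: the client splits the object into chunks and hashes each one."""
    pieces = [data[i:i + CHUNK_SIZE] for i in range(0, len(data), CHUNK_SIZE)]
    return [(hashlib.sha1(p).hexdigest(), p) for p in pieces]


class Server:
    def __init__(self):
        self.hash_index = {}  # digest -> chunk bytes
        self.objects = {}     # object name -> ordered digests

    def missing(self, digests):
        """Step 2: report which digests the server does not hold yet."""
        return {d for d in digests if d not in self.hash_index}

    def receive(self, name, digests, new_chunks):
        """Step 3: store only the new chunks and record the object layout."""
        self.hash_index.update(new_chunks)
        self.objects[name] = list(digests)

    def reconstruct(self, name):
        """Step 4: rebuild the whole object, e.g. for a non-deduplicated copy."""
        return b"".join(self.hash_index[d] for d in self.objects[name])


def backup(server, name, data):
    """Return the number of payload bytes actually sent over the network."""
    chunks = chunk(data)
    digests = [d for d, _ in chunks]
    needed = server.missing(digests)
    to_send = {d: p for d, p in chunks if d in needed}
    server.receive(name, digests, to_send)
    return sum(len(p) for p in to_send.values())


server = Server()
sent_first = backup(server, "file1", b"AAAABBBBCCCC")
sent_second = backup(server, "file4", b"BBBBEEEEFFFF")
```

The second backup sends only 8 of its 12 bytes, because the server already holds the chunk `BBBB` from the first object.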

Design Points for Client-Side Deduplication


Design point: Source-side (client-side)
  Reduces network traffic by deduplication of data before transfer to the server
  Effective for all data sent from 6.x Backup-Archive or API client to 6.x server

Design point: In-band
  No post processing required after data is stored in deduplication-enabled disk storage pool

Design point: Data compatibility with server-side data deduplication
  Both client-side and server-side require deduplication-enabled pool
  Client-side and server-side deduplication share data chunks in pool using a unified chunk index in the TSM database
  Client-side and server-side use the same algorithms/parameters for fingerprinting and chunk identification (using hashing)

Design point: Compatibility with client compression
  Client optionally compresses data after it has been chunked
  Server expands (decompresses) data when it needs to be reconstructed, such as for backup to a tape copy storage pool or for restore to a legacy client

Design point: Server and client controls over deduplication used
  Server enables each client node for client-side deduplication
  Client controls whether it actually uses client-side deduplication
  Client include/exclude options allow control of client-side deduplication at the file level
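The compression design point can be illustrated with a short sketch. It assumes that the fingerprint is computed over the raw chunk before compression, so the digest is the same whether or not a client compresses. That ordering is an assumption consistent with "compresses data after it has been chunked", and the function names and the use of `zlib` are hypothetical.

```python
import hashlib
import zlib


def prepare_chunk(chunk, compress):
    """Fingerprint the raw chunk first, then optionally compress the payload."""
    digest = hashlib.sha1(chunk).hexdigest()
    payload = zlib.compress(chunk) if compress else chunk
    return digest, payload, compress


def expand(payload, compressed):
    """Server side: recover the raw chunk when it must be reconstructed."""
    return zlib.decompress(payload) if compressed else payload


raw = b"repetitive data " * 64
d1, p1, c1 = prepare_chunk(raw, compress=True)
d2, p2, c2 = prepare_chunk(raw, compress=False)
```

Because both clients produce the same digest, a compressing client and a non-compressing client could still share chunks in the same pool.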

Design Points for Client-Side Deduplication (2)


Design point: Optimization
  Minimize network chat and database lookups due to chunk-index (deduplication-index) queries to the TSM server
  Maintain a local cache of hash values for each client

Design point: Avoidance of false matches
  SHA-1 digest for each chunk
  Comparison of chunk size
  MD5 digest for entire data object checked by client after restore

Design point: Security
  Enhanced client-server protocol to detect malicious activity
  Limit ability to use commands which display raw hash and index data
  SSL communication allows encryption of deduplicated data

Design point: Reporting
  Provide additional statistics with Backup-Archive client and API to indicate deduplication and data reduction savings
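The three false-match safeguards listed above (per-chunk SHA-1, chunk size comparison, and a whole-object MD5 checked after restore) can be sketched with Python's standard `hashlib`. The function names are hypothetical and the sketch does not reflect how TSM stores or exchanges these values.

```python
import hashlib


def chunk_fingerprint(chunk):
    """Identify a chunk by its SHA-1 digest together with its size."""
    return hashlib.sha1(chunk).hexdigest(), len(chunk)


def is_same_chunk(fp_a, fp_b):
    """Treat chunks as identical only if digest and size both agree."""
    return fp_a == fp_b


def object_digest(data):
    """MD5 digest over the entire data object, recorded at backup time."""
    return hashlib.md5(data).hexdigest()


def verify_restore(restored, expected_md5):
    """Client-side check after restore: recompute and compare the MD5."""
    return object_digest(restored) == expected_md5


original = b"important text" * 3
recorded = object_digest(original)
ok = verify_restore(original, recorded)
corrupted = verify_restore(original[:-1] + b"X", recorded)
```

Here `ok` is true for the intact object, while `corrupted` is false because one byte of the restored data differs from what was backed up.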

Comparison of TSM Data Reduction Methods


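Client compression
  How data reduction is achieved: Client compresses files
  Conserves network bandwidth? Yes
  Data supported: Backup, archive, HSM, API
  Scope of data reduction: Redundant data within same file on client node
  Avoids storing identical files renamed, copied, or relocated on client node? No
  Removes redundant data for files from different client nodes? No

Incremental forever
  How data reduction is achieved: Client only sends changed files
  Conserves network bandwidth? Yes
  Data supported: Backup
  Scope of data reduction: Files that do not change between backups
  Avoids storing identical files renamed, copied, or relocated on client node? No
  Removes redundant data for files from different client nodes? No

Subfile backup
  How data reduction is achieved: Client only sends changed subfiles
  Conserves network bandwidth? Yes
  Data supported: Backup (Windows only)
  Scope of data reduction: Subfiles that do not change between backups
  Avoids storing identical files renamed, copied, or relocated on client node? No
  Removes redundant data for files from different client nodes? No

Server-side deduplication
  How data reduction is achieved: Server eliminates redundant data chunks
  Conserves network bandwidth? No
  Data supported: Backup, archive, HSM, API
  Scope of data reduction: Redundant data from any files in storage pool
  Avoids storing identical files renamed, copied, or relocated on client node? Yes
  Removes redundant data for files from different client nodes? Yes

Client-side deduplication
  How data reduction is achieved: Client and server eliminate redundant data chunks
  Conserves network bandwidth? Yes
  Data supported: Backup, archive, API
  Scope of data reduction: Redundant data from any files in storage pool
  Avoids storing identical files renamed, copied, or relocated on client node? Yes
  Removes redundant data for files from different client nodes? Yes

Availability
  Client compression, incremental forever, subfile backup: Available prior to V5
  Server-side deduplication: Available 6.1
  Client-side deduplication: Available 6.2

All of these data reduction methods conserve storage pool space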

Expanded Deduplication Function


Deduplication for additional device classes
  Tape
  Random-access disk
  Virtual volumes

Expanded client-side deduplication capabilities
  LAN-free support
  Backup-Archive client on z/OS USS

Future Candidate

SECURITY AND COMPLIANCE


Deduplication and Encryption


Data encryption prior to deduplication processing can subvert data reduction

[Diagram: data sources 1, 2, and 3 each hold the same file, "Important text". Data source 1 uses no encryption; the other two encrypt the file with encryption key 1 and encryption key 2. After data deduplication, the data store holds the plain text and both scrambled versions.]

1. Three data sources have the same text file
2. After encryption, text files do not match
3. Deduplication processing does not detect redundancy
4. Text files are stored without data reduction
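The effect described in the four steps above can be reproduced with a toy example. The `toy_encrypt` function below is a hypothetical XOR keystream cipher built from SHA-256 purely for illustration. It is not secure and is not what TSM uses. The point is only that the same text encrypted under different keys no longer produces matching fingerprints.

```python
import hashlib


def toy_encrypt(key, plaintext):
    """XOR the plaintext with a SHA-256-derived keystream.

    Illustration only: this is NOT a secure cipher and not what TSM uses.
    Applying it twice with the same key returns the original text.
    """
    stream = b""
    counter = 0
    while len(stream) < len(plaintext):
        stream += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(p ^ s for p, s in zip(plaintext, stream))


def fingerprint(data):
    """Digest that a deduplication process would use to compare data."""
    return hashlib.sha1(data).hexdigest()


text = b"Important text"

# Step 1: three sources hold the same file.
# Step 2: source 1 sends it unencrypted, the others use their own keys.
source1 = text
source2 = toy_encrypt(b"encryption key 1", text)
source3 = toy_encrypt(b"encryption key 2", text)

# Steps 3 and 4: count the distinct fingerprints the data store would see.
with_encryption = {fingerprint(s) for s in (source1, source2, source3)}
without_encryption = {fingerprint(text) for _ in range(3)}
```

Without encryption the three copies collapse to one fingerprint; with per-source encryption the store sees three unrelated objects and keeps all of them.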

Introduction to Secure Socket Layer (SSL)


SSL allows the entire client/server session to be wrapped in an encrypted tunnel

Exploits asymmetric encryption (public/private keys) during client/server authentication
  Server's public key is widely distributed and used by the client to encrypt messages that only the server can decrypt
  Server's private key is known only to the server and is used to decrypt messages that have been encrypted by the client

Server's public key is distributed in a digital certificate
Certificate validation ensures that the certificate really came from the server
  Client can validate digital certificate using a trusted third party called a certificate authority (CA)
  Certificate can be self-signed by the server and delivered to each client using a secure mechanism

After initial authentication, a random symmetric key is negotiated for encrypting the remainder of the session

Enhanced TSM Support for SSL


Extended platform support
  Windows (available in TSM 5.5)
  AIX (available in TSM 5.5)
  Linux
  Solaris
  HP-UX

Alternatives for validation of TSM server certificates
  Manual, secure distribution of self-signed certificates (available in TSM 5.5)
  Acquire certificates signed by well-known certificate authority such as Thawte or Verisign
  Use certificate signed by customer's own certificate authority

256-bit AES encryption for in-flight data
Compatible with TSM server- or client-side deduplication
Simplified deployment and validation of TSM server certificates

TSM 6.2

Client-Server Communication Using SSL


[Diagram: SSL and non-SSL paths connecting TSM Client 1 and TSM Client 2 (B-A, HSM, or API), a Storage Agent, TSM Server 1, TSM Server 2, the Admin Client, the Admin Center, and a Web browser. Platforms shown: AIX, Windows, HP-UX, Linux, Solaris.]

Communication Types
1. Client-to-server (backup/recovery, file selection, data movement)
2. Admin command-line client (administrative commands)
3. Administration center
4. Web client (file selection for backup/restore)
5. Client-to-client (coordination for HSM, Copy Services)
6. Server-to-server (management tasks)
7. Storage-agent-to-server (LAN-free)

Validation Using Certificate Authority


[Diagram: the Certificate Authority provides the signed server certificate (public/private) to the TSM Server and the CA's root certificate to TSM Client 1 and TSM Client 2; the server presents its public certificate to each client]

Configuration
  Certificate request is signed by CA
  Certificate is installed on TSM server
  CA's root certificate is installed on the client

Runtime
  Client accepts any certificate signed by CA
  Client rejects all other certificates
  Client verifies server's identity

TSM 6.2
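The runtime behaviour above (accept certificates signed by the trusted CA, reject all others, verify the server's identity) is the standard TLS client validation model. As a generic illustration using Python's standard `ssl` module, and not TSM's own configuration, a client context might be built as below. The helper name `make_client_context` and the `ca_file` path are hypothetical.

```python
import ssl


def make_client_context(ca_file=None):
    """Build a TLS client context that trusts only the given CA bundle.

    `ca_file` is a hypothetical path to the CA's root certificate in PEM
    format. With no CA loaded, the context has no trust anchors and
    therefore cannot validate any server certificate.
    """
    context = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
    # PROTOCOL_TLS_CLIENT enables both settings by default; they are set
    # explicitly here so the intent is visible.
    context.check_hostname = True            # verify the server's identity
    context.verify_mode = ssl.CERT_REQUIRED  # reject unverifiable certificates
    if ca_file is not None:
        context.load_verify_locations(cafile=ca_file)
    return context


context = make_client_context()
```

Passing the CA's root certificate as `ca_file` corresponds to the configuration step of installing that certificate on the client.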

PVU Estimation Reporting


Estimated Processor Value Units (PVU) reporting for Backup-Archive client and API applications
  TSM clients will scan system and send processor data to TSM Server
  TSM server will store processor data and calculate PVU value

Ability to report on client-device and server-device at a node level
  Allow TSM administrator to change classification on a per-node basis

PVU summary report
Full-Capacity licensing only
  Virtualization Capacity (Sub-Capacity) customers are still required to use the IBM License Metric Tool (ILMT) to create, verify, adjust, sign and save reports

Future Candidate

Historical Audit Trail: Data Objects


Information regarding initial store of object
  When object was stored
  Who initiated store operation
  How store was initiated (schedule, GUI)
  Initial storage pool / volumes
  Transport mechanism for store (LAN-free)
  Encryption strength
  Client compression
  Initial management class
  Object size
  Client-side deduplication

Information regarding later operations on object
  Client restore/retrieve/recall attempts
  Outcome of client access operations
  Deletion (who/what initiated)
  Move/copy operations
  Management class rebinding
  Server-side deduplication

Historical information for each stored object is tracked in database
Object information can be queried for audit compliance or problem diagnosis

Improved tracking of historical information on data objects

Future Candidate

Historical Audit Trail: Server Configuration


Server configuration history
  New/changed constructs
    Policy definitions
    Schedules
    Storage pools/device classes
    Nodes
  Set commands
  Changes to server options
  Changes to server level

Historical information relating to server configuration is tracked in database
Server-configuration information can be queried for audit compliance or problem diagnosis

Improved tracking of historical information on server configuration

Future Candidate

STORAGE HIERARCHY AND DATABASE



TSM Database Backup with Multiple Streams


[Diagram: TSM server and TSM database with multiple DB2 backup/restore streams]

Database backup performance will enable sustained scalability improvement
  Parallel streams for backup/restore processing give improved throughput
  Reduced time for database backup/restore
  Increased scalability of TSM server without expanding database backup window

Reduced database backup window
Improved recovery time
Increased scalability of TSM server

Future Candidate
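The idea of parallel backup streams can be shown generically. The sketch below is not DB2 or TSM code. It splits a stand-in "database" into slices, processes each slice on its own worker thread (compression stands in for writing to a backup device), and reassembles the slices in order on restore. All names are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor
import zlib


def split(data, streams):
    """Divide data into contiguous slices, one per stream."""
    size = max(1, -(-len(data) // streams))  # ceiling division
    return [data[i:i + size] for i in range(0, len(data), size)]


def backup(data, streams):
    """Process each slice on its own worker; map() keeps slice order."""
    with ThreadPoolExecutor(max_workers=streams) as pool:
        return list(pool.map(zlib.compress, split(data, streams)))


def restore(parts):
    """Expand all slices in parallel and join them in the original order."""
    with ThreadPoolExecutor(max_workers=len(parts)) as pool:
        return b"".join(pool.map(zlib.decompress, parts))


database = bytes(range(256)) * 4  # 1024 bytes of stand-in database pages
parts = backup(database, streams=4)
restored = restore(parts)
```

With four streams the 1024-byte input is handled as four 256-byte slices, and the restored bytes match the original.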

Server-to-Server Metadata Export/Import


[Diagram: source server and target server, each with a DB2 database and a storage hierarchy; metadata is exported server-to-server and storage pool volumes are transferred between the storage hierarchies]

Metadata transferred between servers using export/import
Storage pool volumes physically moved (or replicated) to the target server
Especially attractive when used with shared libraries
Could be used for
  Splitting/balancing servers
  Consolidating servers, such as after upgrade to DB2

Reduced time and bandwidth consumption for export/import of object data
Ability to transfer data for individual nodes

Future Candidate

Remote Copy Storage Pool with Deduplication


[Diagram: at Site A, a TSM server with a DB2 database and a storage hierarchy containing a deduplicated storage pool, optionally with client-side deduplication. The database is replicated in near-realtime (HADR) to a DB2 database at Site B, and storage pool backup writes to a deduplicated copy pool at Site B over iSCSI/CIFS/NFS.]

No special hardware/software required
Deduplication gives storage/bandwidth savings
All data in primary hierarchy could be replicated, after initially being stored in deduplicated primary pool

Near-term documented solution combining existing technologies:
  TSM deduplication, probably client-side
  Copy pool with network-attached storage
  DB2 HADR

Node Replication with Deduplication


[Diagram: TSM Server A at Site A (Nodes A, B, and C) and TSM Server B at Site B (Nodes X and Y), each with a DB2 database and a storage hierarchy; metadata and deduplicated data are replicated between the servers]

TSM server would replicate all data and metadata for specified nodes to another server, ensuring node completeness and consistency of data/metadata
Incremental client data transfer with deduplication to minimize bandwidth consumption
Remote TSM server could be hot standby for primary server, for improved RTO
Native TSM solution with no dependency on specific storage device
Many-to-1 transfer to target server (recovery manageability)
Supports dissimilar hardware, configuration and retention at primary and remote sites

Remote vaulting without manual tape transfer
Efficient use of bandwidth through deduplicated replication
Allows hot standby at remote site

Future Candidate

Simultaneous Write for Client Store Operations


[Diagram: 1. Data sent from client to server; 2. Simultaneous write to multiple targets (primary storage pools, active-data pool, copy pool 1, copy pool 2); 3. Migration within the primary storage pools]

Data written synchronously to primary pool and one or more copy-pool or active-data-pool destinations
Avoids need for subsequent copy operations to active-data pool or copy pool
Requires that sufficient tape devices be available during client backup
Tape delays may extend client backup window
Not compatible with client-side deduplication

Currently Available

Simultaneous Write During Storage Pool Migration

[Diagram: 1. Data sent from client to server; 2. Data stored in primary storage pools; 3. Migration, with simultaneous copy to the active-data pool, copy pool 1, and copy pool 2]

Combines windows for migration, storage pool backup, and copy active data
  Reduces total time for these operations
  Frees server resources for other operations

Compared to existing simultaneous write function
  Reduces the need for tape devices during client store operations (backup, archive, client HSM)
  Can reduce client backup window

Efficient use of time and resources

TSM 6.2

VIRTUAL TAPE LIBRARY (VTL) INTEGRATION



VTL Basics
VTL emulates/virtualizes tape drives and library
Underlying media is magnetic disk
Performs like disk, with no delays for mount, dismount, locate, or rewind
Integrated capabilities may include
  Compression
  Encryption
  Shredding of data when no longer needed
  Data deduplication
  Remote replication
  Attachment of physical tape and data movement from virtual to physical tape

Considerations for VTL Use with TSM Today


TSM treats VTL as tape
TSM sequential-access disk offers many capabilities of VTL
Possible advantages of VTL in the TSM storage hierarchy
  Simplified setup and management of storage as compared to configuring native disk volumes
  Facilitates sharing of disk storage among TSM servers
  Offloading work from the TSM server to a VTL may improve scalability
  Simplified configuration for LAN-free operations as compared to LAN-free to sequential-access disk
  Ability to exploit integrated VTL capabilities such as compression, data deduplication and replication

TSM Enhancements to Increase VTL Awareness


Support classification/prioritization of sequential-access storage devices to distinguish VTL from physical tape
Enable concurrent access to VTL volumes
Enhance mount-point processing to better handle large numbers of virtual drives
For retrieval operations, enhance volume selection to differentiate VTL volumes from physical tape

[Diagram: concurrent access for VTL volumes (multiple read operations, one write operation): Restore Node 1, Restore Node 2, Backup storage pool, and Backup Node 3 all use the same VTL volume]

More effective use of VTL in TSM storage hierarchy

Future Candidate

SOFTWARE DEPLOYMENT


Client Deployment for Upgrade


Deployment of client software to upgrade existing clients
  Windows Backup-Archive client
  Client scheduler must be running
  Used for deployment of new version, release, modification (fix pack), interim fix
  Deployment across policy domains and for multiple TSM servers

TSM server and admin center must be at release 6.x or higher
Supported client releases
  Current client is 5.4 or higher
  Target client level is 6.x or higher

Client control via new option Autodeploy=Yes|No|NOREboot
  Yes: Automatically deploy the client even if computer restart is required
  NOREboot: Automatically deploy the client unless a computer restart is required
  No: Do not automatically deploy the client

Simplified deployment of client software

TSM 6.2
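The three Autodeploy values described above amount to a small decision rule. The function below is a hypothetical restatement of that rule in Python, not client code. The case-insensitive comparison and the `restart_required` parameter are assumptions made for the sketch.

```python
def should_deploy(autodeploy, restart_required):
    """Decide whether an automatic client upgrade proceeds.

    `autodeploy` is the option value (Yes, No, or NOREboot), compared
    case-insensitively here as an assumption. `restart_required` says
    whether the upgrade needs a computer restart.
    """
    value = autodeploy.strip().lower()
    if value == "yes":
        return True                  # deploy even if a restart is required
    if value == "noreboot":
        return not restart_required  # deploy unless a restart is required
    if value == "no":
        return False                 # never deploy automatically
    raise ValueError(f"unknown Autodeploy value: {autodeploy!r}")


decisions = {
    (value, restart): should_deploy(value, restart)
    for value in ("Yes", "NOREboot", "No")
    for restart in (False, True)
}
```

The `decisions` table covers all six combinations; only `NOREboot` changes its answer depending on whether a restart is needed.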

Client Deployment Flow


[Diagram: deployment flow among the TSM Server, the TSM Admin Center, and the TSM Client Machines, where the TSM client processes are the Client Scheduler and the Deployment Manager]

1. Acquire client packages from the FTP site
2. Make archive client packages available in storage pool
Define / update policy and schedule
Define the nodes to which package should be deployed
5. Retrieve client package and deployment manager from server
6. Start the deployment manager process
7. Unpack package and parse instructions
8. Run install script
Report update status to the server
10. Administrator views results

Additional Software Deployment Function


Deployment of non-Windows clients
Deployment of other components
  HSM client
  Storage agents
  Data protection agents

Automatic downgrading (regression) of client software
Initial client distribution and installation
Distribution without client scheduler running

Expanded software deployment function

Future Candidate

BACKUP ENVIRONMENTS


Windows System Writer Incremental Backup


Windows System Writer is often the largest component of the System State data, and includes
  Installed file system and application binaries
  Windows Side-by-Side directory contents
  PnP files and drivers
  User mode services and drivers
This component has grown over recent Windows releases, and comprises 50,000+ files (>7 GB) in Windows 2008
TSM currently backs up all System Writer files even if only one file has changed
TSM 6.2 uses progressive incremental backup for System Writer files on Windows 2003 and above

TSM 6.2
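The difference between backing up every System Writer file and a progressive incremental backup can be sketched as a comparison against the previous backup's file inventory. The function below is a hypothetical illustration. Using size and modification time as the change test, and the example paths, are assumptions. The slides do not say how the client detects changes.

```python
def changed_files(previous, current):
    """Return the files to send in an incremental backup.

    `previous` and `current` map a file path to a (size, mtime) tuple.
    A file is sent if it is new or if its attributes differ.
    """
    return sorted(
        path for path, attrs in current.items()
        if previous.get(path) != attrs
    )


last_backup = {
    "C:/Windows/a.dll": (100, 1000),
    "C:/Windows/b.sys": (200, 1000),
    "C:/Windows/c.inf": (300, 1000),
}
now = dict(last_backup)
now["C:/Windows/b.sys"] = (210, 2000)  # one file changed
now["C:/Windows/d.dll"] = (50, 2000)   # one file added

to_send = changed_files(last_backup, now)
```

Only the changed and the new file are selected, rather than all four files in the component.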

Other Client Enhancements


TSM will continue enhancements for backup in specific environments

Examples in TSM 6.2 include
  Backup/archive of GPFS data on Windows
  Online backup of Hyper-V guests from host using VSS
  Segmentation of extremely large SAP databases for efficient handling by TSM for ERP

Examples in future releases may include (Future Candidate)
  Journal-based backup for Linux
  Data reduction through metadata separation
  Simplified configuration of backup-archive clients in a cluster
  Automated System Recovery (ASR) for Windows 2008, Vista, and Windows 7

QUESTIONS?
