PART 2
INFORMATION
TECHNOLOGIES FOR
DISASTER RECOVERY
This part delves into information handling and processing
technologies and how they can be used to tune disaster
recoverability to the needs of the enterprise. The chapters of this
part follow a progression from longer recovery times (relying
primarily on offline data protection and recovery) to shorter ones
(relying on near-real-time replication of data and data manager
and application clustering).
08_137_170_DR1 4/2/02 2:52 PM Page 138
CHAPTER 8
Backup and Disaster Recovery
Never underestimate the bandwidth of a station wagon full of tapes
hurtling down the highway.
Andrew Tanenbaum
1. There are fuzzy file and database backup techniques that create copies of
changing data with limited currency and consistency guarantees. These can be
used to restore databases after certain failures; however, they have limited use as
durable business records.
2. Throughout this chapter, the term tape is used generically to refer to any of
the various recording media that can be used to store data off line, because tape
is by far the most frequently occurring medium in computing today.
[Figure: Backup architecture components: file system, backup
schedules (daily, weekly, quarterly), backup client, backup manager,
media server, tape drive, and tape media library. At the outset, all
backup functions are performed by the application/database server.
With growth, backup functions migrate to specialized servers as
performance or other operational needs dictate.]
[Figure: Backup commands and backup data flow among the application
servers (backup clients), the backup manager, and the media servers.]
When to Back Up
Deciding when to back up also requires both enterprise business
policy and system operations knowledge. System administrators
must balance between acceptable maximum backup age (which
determines the worst-case number of hours of data updates that
Backup Policies
All of the following parameters are typically bundled into an
abstraction called a backup policy:
Backup clients
File and directory lists
Eligible media servers, media types and pools, and device groups
Scheduling information
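As a rough illustration, the four groups of parameters listed above could be bundled into a single record along the following lines. This is only a sketch: the class names, field names, host names, and pool name are hypothetical and do not correspond to any particular backup product, and the age calculation ignores how long a backup job takes to run.

```python
from dataclasses import dataclass, field


@dataclass
class Schedule:
    name: str             # hypothetical label, e.g. "daily-incremental"
    backup_type: str      # "full" or "incremental"
    frequency_hours: int  # how often the job runs


@dataclass
class BackupPolicy:
    name: str
    clients: list[str]         # backup clients
    file_list: list[str]       # files and directories to back up
    media_servers: list[str]   # eligible media servers
    media_pool: str            # media type or pool to write to
    schedules: list[Schedule] = field(default_factory=list)

    def max_backup_age_hours(self) -> int:
        """Worst-case age of the newest backup, taken here as the
        shortest interval between scheduled jobs."""
        return min(s.frequency_hours for s in self.schedules)


# Example values are invented for illustration only.
policy = BackupPolicy(
    name="orders-db",
    clients=["app01.example.com"],
    file_list=["/data/orders"],
    media_servers=["media01.example.com"],
    media_pool="offsite-tape",
    schedules=[
        Schedule("weekly-full", "full", 168),
        Schedule("daily-incremental", "incremental", 24),
    ],
)
print(policy.max_backup_age_hours())
```

Grouping the parameters this way lets an administrator reason about one named object (the policy) rather than about clients, file lists, media, and schedules separately.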
INCREMENTAL BACKUP
Full and Incremental Backup
In most enterprise information services, only a small fraction of
online data changes between successive backups. In file-based
systems, only a small percentage of the files change. Incremental
backup techniques make use of this fact to minimize backup
resource requirements. An incremental backup is a copy of only
the files changed since the preceding backup. A backup client uses
file system metadata to determine which files have changed, and
copies only those. Figure 8-4 illustrates the difference between
full and incremental backup.
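One way a backup client might use file system metadata to pick files for an incremental backup is to compare each file's modification time with the time of the preceding backup. The sketch below assumes that approach; the function names are hypothetical, and real backup clients may rely on other metadata (archive bits, change journals, and so on).

```python
import os
import tempfile


def files_changed_since(root: str, last_backup_time: float) -> list[str]:
    """Return paths under root whose modification time is newer than
    the preceding backup (times are seconds since the epoch)."""
    changed = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.stat(path).st_mtime > last_backup_time:
                changed.append(path)
    return sorted(changed)


def incremental_demo() -> list[str]:
    """Create two files, one older and one newer than the last backup,
    and report which would go into the incremental backup."""
    with tempfile.TemporaryDirectory() as root:
        last_backup = 1_000_000.0
        for name, mtime in [("file_a", last_backup - 60),
                            ("file_b", last_backup + 60)]:
            path = os.path.join(root, name)
            with open(path, "w") as f:
                f.write(name)
            os.utime(path, (mtime, mtime))  # set access and modification times
        changed = files_changed_since(root, last_backup)
        return [os.path.basename(p) for p in changed]


print(incremental_demo())
```

Only the file modified after the preceding backup is selected; the unchanged file is skipped, which is where the resource savings come from.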
Incremental backup augments rather than replaces full backup.
An incremental backup contains files that have changed since some
point in time for which a full backup exists. To restore a set of
files from incremental backups, a full backup must first be restored
to establish a baseline. Incremental backups are then restored in age
order (oldest first), replacing changed files in the baseline.
Incremental backup reduces the frequency with which time-consuming
full backups must be performed.
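The restore order described above can be sketched in a few lines. Here each backup is modeled as a dictionary mapping file names to contents, and each incremental backup carries the time at which it was taken; this representation and the function name are assumptions made purely for illustration.

```python
def restore(full_backup: dict[str, str],
            incrementals: list[tuple[int, dict[str, str]]]) -> dict[str, str]:
    """Restore the full backup as a baseline, then apply incremental
    backups oldest first so that newer copies replace older ones."""
    restored = dict(full_backup)  # step 1: baseline
    for _taken_at, changed_files in sorted(incrementals, key=lambda inc: inc[0]):
        restored.update(changed_files)  # step 2: replace changed files
    return restored


full = {"File A": "a0", "File C": "c0", "File J": "j0"}
# Deliberately listed out of order; restore() sorts them by age.
incrementals = [
    (2, {"File J": "j2"}),
    (1, {"File C": "c1", "File J": "j1"}),
]
result = restore(full, incrementals)
print(result)
```

Because the newest incremental backup is applied last, the newest update to File J is the one left on line once all incremental backups have been restored.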
[Figure: Full and incremental backup of a file system containing
Files A through J. The incremental backup copy contains only the
files changed since the last backup.]

[Figure: Restoring from incremental backups. Step 1: the newest full
backup is restored as the baseline for incremental restores. Step 2:
incremental backups are restored, oldest first and newest last. The
newest update to File J is online when all incremental backups have
been restored. Result: the most up-to-date data possible from
backups.]
BACKING UP DATABASES
Database management systems are typically capable of producing
point-in-time database backups. The technology is similar to that
of file system snapshots. Database activity is halted momentarily
so that backup can be initiated, and then resumes. Each application
update while the backup is in progress causes a copy of the
updated object's prior contents to be saved. The backup program
reads these before images. All other programs read current data
object contents.
A backup made in this way represents the contents of the data-
base at the point in time at which the backup was initiated. This
technique, often called hot database backup, is well accepted and
widely used. Some enterprise backup managers are integrated with
database manager backup facilities so that hot database backups
can be scheduled as part of an overall enterprise backup strategy.
Hot database backup increases database I/O activity significantly,
due both to the backup itself and to the storing of database object
before images.
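The before-image mechanism can be modeled with a small in-memory sketch. The class and method names below are hypothetical and the "database" is just a dictionary; actual database managers implement this at the block or page level inside the storage engine.

```python
class HotBackup:
    """Toy model of a hot backup: the first update to each object after
    the backup starts saves that object's prior contents."""

    def __init__(self, database: dict[str, str]):
        self.database = database
        self.objects_at_start = set(database)  # objects existing at backup time
        self.before_images: dict[str, str] = {}

    def application_write(self, name: str, contents: str) -> None:
        # Save the prior contents only once, on the first update.
        if name in self.objects_at_start and name not in self.before_images:
            self.before_images[name] = self.database[name]
        self.database[name] = contents

    def backup_read(self, name: str) -> str:
        # The backup program prefers the before image if one exists.
        return self.before_images.get(name, self.database[name])

    def run_backup(self) -> dict[str, str]:
        return {n: self.backup_read(n) for n in sorted(self.objects_at_start)}


database = {"Table A": "a0", "Table B": "b0"}
backup = HotBackup(database)
backup.application_write("Table A", "a1")   # before image "a0" is saved
backup.application_write("Table A", "a2")   # no second before image
backup.application_write("Table C", "c0")   # created after backup started
image = backup.run_backup()
print(image)
print(database)
```

The backup image reflects the database as of the moment the backup began, while applications continue to see current contents. The extra write on each first update is one source of the additional I/O noted above.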
[Figure: Database snapshot. As applications update Tables A, B, and
C, "before images" of changed data are saved. The snapshot consists
of a map and the "before images" of changed blocks; storage for the
snapshot is allocated from file system free space.]

[Figure: Multiple snapshots T1, T2, and T3 of the main database
image. Each snapshot consists of a map and the "before images" of
blocks changed since that snapshot. The database can be backed up as
of T1, T2, or T3 or rolled back to its state at any of T1, T2, or T3.]
back the database to its state at the time of the snapshot. This can
be useful, for example, if an application error that causes database
corruption is discovered only after running for a period of time.
[Figure: Block-level incremental backup. The snapshot's changed block
map records the block addresses of changed data and indicates which
blocks need to be backed up. The backup manager reads the changed
blocks from the database image (not from the snapshot).]
ARCHIVES
Over time, an enterprise's stock of historical data grows. Monthly,
quarterly, and annual closing reports, sales, production, shipping,
and service records, and other data must be retained, but generally
need not be on line. Such data can be archived. Functionally,
archiving is identical to backup. Designated files are copied to backup
media on predefined schedules and catalogued so that they can be
located later. Archiving differs from backup, however, in that once
an archive job has completed, the archived files are deleted from
disk storage, freeing the space they occupied for other use.
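The copy, catalogue, and delete sequence can be sketched as follows. A directory stands in for the backup media and a dictionary stands in for the catalogue; both, along with the function names and the size check used as a stand-in for verification, are assumptions for illustration only.

```python
import os
import shutil
import tempfile


def archive_files(paths: list[str], media_dir: str, catalog: dict[str, str]) -> None:
    """Copy each file to the archive media, record it in the catalog,
    and only then delete it from online storage."""
    for path in paths:
        dest = os.path.join(media_dir, os.path.basename(path))
        shutil.copy2(path, dest)  # copy contents and timestamps
        if os.path.getsize(dest) != os.path.getsize(path):
            raise OSError(f"archive copy of {path} is incomplete")
        catalog[path] = dest      # so the file can be located later
        os.remove(path)           # free the online disk space


def archive_demo() -> tuple[bool, bool]:
    """Archive one report; return (still on disk?, present on media?)."""
    with tempfile.TemporaryDirectory() as disk, \
            tempfile.TemporaryDirectory() as media:
        report = os.path.join(disk, "march_rollup.txt")
        with open(report, "w") as f:
            f.write("monthly roll-up")
        catalog: dict[str, str] = {}
        archive_files([report], media, catalog)
        return os.path.exists(report), os.path.exists(catalog[report])


print(archive_demo())
```

Deleting the original only after the copy has been checked and catalogued is the design choice that matters here: until then, the online file is the only reliable copy.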
Figure 8-11 illustrates a file system in which database tables
occupy one directory, and monthly roll-up and report information
occupies another. The database directory is scheduled for regular
backup as discussed in preceding sections. Data in the monthly
roll-up directory is only of interest for a limited time, but must
be retained for regulatory or policy reasons. The directory con-
taining monthly roll-up data is therefore scheduled for regular
[Figure: Backup and archive. Tables A, B, and C in the operational
data directory are backed up, and the data remains on line after the
backup copy is made. Table D (monthly roll-up) is the subject of the
month-end archive.]
Parallel Backup
In systems with high-performance networks and storage volumes,
large backup jobs can be speeded up by distributing the backup
data across several tapes. This parallel backup can be effective,
for example, when full backups of large databases are made from
Flash Backup
So-called flash backup reads all of the disk blocks occupied by a file
system and writes them to tape without interpretation, including
blocks that are not allocated to files. Conventional backup managers
open files and copy them one by one, resulting in significant
file system overhead I/O. Flash backup reads disk block contents
as fast as physically possible, whether they represent user data, file
system metadata, or unallocated space.
To retrieve a file from a flash backup, a backup manager must
rebuild the file system metadata, and then retrieve the file from
potentially scattered areas on the tapes. Because data is
reconstructed at restore time rather than at backup time, backups are
dramatically faster, but restores take more time.
Sparsely populated file systems are usually not well suited for
flash backup, since the technique copies the contents of
unallocated disk space as well as data. The method works best with file
systems containing many small files, since these introduce the most
overhead I/O during backups.
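The essence of the technique, copying blocks sequentially without interpreting them, can be sketched as below. An ordinary file stands in for the disk volume and another for the tape; the function name and the 4,096-byte block size are assumptions, and a real flash backup would read a raw device and also record what is needed to rebuild file system metadata at restore time.

```python
import os
import tempfile


def flash_backup(volume_path: str, tape_path: str, block_size: int = 4096) -> int:
    """Copy every block of the volume image to tape, in order, without
    looking at what the block contains. Returns the number of blocks read."""
    blocks = 0
    with open(volume_path, "rb") as volume, open(tape_path, "wb") as tape:
        while chunk := volume.read(block_size):
            tape.write(chunk)
            blocks += 1
    return blocks


def flash_demo() -> tuple[int, bool]:
    """Back up a three-block image whose middle block is unallocated
    (all zeros); return (blocks copied, tape matches volume?)."""
    with tempfile.TemporaryDirectory() as work:
        volume = os.path.join(work, "volume.img")
        tape = os.path.join(work, "tape.img")
        image = b"D" * 4096 + b"\x00" * 4096 + b"M" * 4096
        with open(volume, "wb") as f:
            f.write(image)
        copied = flash_backup(volume, tape)
        with open(tape, "rb") as f:
            return copied, f.read() == image


print(flash_demo())
```

The unallocated block is copied along with the others, which shows why sparsely populated file systems gain little from this method: the backup spends time and tape on space that holds no data.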