
Database Recovery

By
Parteek Bhatia
Sr Lecturer
Deptt of Computer Science & Engineering
Thapar University
Patiala
parteek_bhatia@hotmail.com
Simplified Approach to DBMS
By Parteek Bhatia
Database recovery is the process of restoring the database to a
correct state following a failure. The failure may be the result of a
system crash due to hardware or software errors, a media failure, such as
a head crash, or a software error in the application, such as a logical error
in the program that is accessing the database. It may also be the result of
unintentional or intentional corruption or destruction of data. Whatever
the underlying cause of the failure, the DBMS must be able to recover
from the failure and restore the database to a consistent state. It is the
responsibility of the DBMS to ensure that the database is reliable and
remains in a consistent state in the presence of failures. In general,
backup and recovery refers to the various strategies and procedures
involved in protecting the database against data loss and
reconstructing the data so that nothing is lost after a failure.
Thus, the recovery scheme is an integral part of the database system. It is
responsible for restoring the database to the consistent state that
existed prior to the occurrence of the failure.
Data Storage
The storage of data generally includes four different types of
media with an increasing degree of reliability. These are:
Main memory
Magnetic disk
Magnetic tape
Optical disk.
Stable Storage
Stable storage is storage in which information is never lost.
True stable storage is theoretically impossible to
obtain, but we can use techniques to design a storage
system in which the chance of data loss is extremely low.
The information that is most important to the whole recovery
process must be kept in stable storage.
Causes of Failures
System Crashes
User Error
Carelessness
Sabotage (intentional corruption of data)
Statement Failure
Application software errors
Network Failure
Media Failure
Natural Physical Disasters
Recovery and Atomicity of a Transaction
To understand the atomicity of a transaction, consider a simplified
banking transaction T1 that transfers Rs 50 from account A to account B,
where the initial values of A and B are Rs 1000 and Rs 2000 respectively.
T1
Read(A,a)
a=a-50
Write(A,a)
Output(A_X)
Read(B,b)
b=b+50
Write(B,b)
Output(B_X)
In this example we assume that an output operation is performed after
each write operation; in other words, this is an example of a
force-output operation.
Suppose that a system crash occurs during the execution of
transaction T1 after Output(A_X) has taken place but before
Output(B_X) is executed. Here A_X and B_X are the buffer blocks
that contain the values of data items A and B. Since Output(A_X)
completed successfully, data item A has the value 950 on disk.
Output(B_X) did not complete, so B keeps the value 2000. The system
is now in an inconsistent state: the sum A + B has dropped from
Rs 3000 to Rs 2950, so Rs 50 has been lost.
There are two simple ways to try to recover from this problem:
(i) Re-execute T1
If transaction T1 is re-executed successfully, the value of A becomes
Rs 900 and B becomes Rs 2050. This is again incorrect, because
A should be Rs 950 rather than Rs 900.
(ii) Do not re-execute T1
The current state of the system is A = 950 and B = 2000, which is again
an inconsistent state.
Thus, in either case the database is left in an inconsistent state, so this
simple recovery scheme does not work. The reason for this
difficulty is that we have modified the database without any
assurance that the transaction will indeed commit.
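The two options can be replayed in a few lines of Python. This is only a minimal sketch: the dictionary `disk`, the function `run_t1` and its `crash_before_output_b` flag are illustrative names, not part of any DBMS.

```python
def run_t1(disk, crash_before_output_b=False):
    """Run T1 (transfer Rs 50 from A to B) with force-output after each write."""
    buffer = {}
    a = disk["A"]              # Read(A, a)
    a = a - 50
    buffer["A"] = a            # Write(A, a)
    disk["A"] = buffer["A"]    # Output(A_X)
    if crash_before_output_b:
        return                 # simulated crash: Output(B_X) never happens
    b = disk["B"]              # Read(B, b)
    b = b + 50
    buffer["B"] = b            # Write(B, b)
    disk["B"] = buffer["B"]    # Output(B_X)


disk = {"A": 1000, "B": 2000}

run_t1(disk, crash_before_output_b=True)
after_crash = dict(disk)       # option (ii): do not re-execute T1

run_t1(disk)                   # option (i): re-execute T1 from the start
after_rerun = dict(disk)

print(after_crash, sum(after_crash.values()))
print(after_rerun, sum(after_rerun.values()))
```

In both outcomes the total is Rs 2950 instead of Rs 3000, which is the inconsistency described above.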
Ways to achieve recovery of data:
Log Based Recovery
Shadow Paging
Log Based Recovery
The log is the most widely used structure for recording database
modifications. In log-based recovery, a log file is maintained for
recovery purposes.
The log file is a sequence of log records, which together keep a
record of all the update operations on the database. There are
several types of log record.
<Start> Log Record:
It records the start of a transaction and contains the transaction identifier,
which uniquely identifies the transaction that starts.
Representation:
<Ti, start>
<Update> Log Record:
It describes a single database write and has the following fields:
<Ti, Xj, V1, V2>
Here, Ti is the transaction identifier, Xj is the data item, V1 is the old value of the
data item and V2 is the new value of the data item Xj.
For example: <T0, A, 1000, 1100>
Here, transaction T0 performs a successful write operation on data item A, whose old
value is 1000 and whose value after the write operation is 1100.
<Commit> Log Record
When a transaction Ti is successfully committed or completed a <Ti,
commit> log record is stored in the log file.
<Abort> Log Record
When a transaction Ti is aborted due to any reason, a <Ti, abort>
log record is stored in the log file.
Whenever a transaction performs a write, it is essential that the log
record for that write be created before the database is modified. Once the
log record exists, we have the ability to undo the transaction's changes.
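The record types, and the rule that a write is logged before the database changes, can be sketched in Python. The class names `Start`, `Update`, `Commit` and `Abort`, the list `log`, the dictionary `db` and the helper `write` are assumptions made for illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Start:            # <Ti, start>
    txn: str


@dataclass(frozen=True)
class Update:           # <Ti, Xj, V1, V2>
    txn: str
    item: str
    old: int
    new: int


@dataclass(frozen=True)
class Commit:           # <Ti, commit>
    txn: str


@dataclass(frozen=True)
class Abort:            # <Ti, abort>
    txn: str


log = []
db = {"A": 1000}


def write(txn, item, new_value):
    # The log record is created first; only then is the database modified.
    log.append(Update(txn, item, db[item], new_value))
    db[item] = new_value


log.append(Start("T0"))
write("T0", "A", 1100)
log.append(Commit("T0"))

for record in log:
    print(record)
```

Because the update record keeps the old value, the write to A could still be undone from the log alone.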
There are two techniques for log-based recovery:
Deferred Database Modification
Immediate Database Modification
Deferred Database Modification
It ensures transaction atomicity by recording all database
modifications in the log, but deferring the execution of all write
operations of a transaction until the transaction partially
commits.
A transaction is said to be partially committed once its final
action has been executed. When a transaction
has performed all its actions, the information in the log
associated with the transaction is used to execute the deferred
writes. In other words, at partial commit time the logged updates
are applied to the database items.
The execution of transaction Ti proceeds as follows:
<Ti, start>: Before Ti starts its execution, this record is
written to the log file.
<Ti, A, V2>: Each write operation by Ti writes a
new record to the log. The record gives the new value of A,
i.e. V2, after the write operation performed by Ti.
<Ti, commit>: When Ti partially commits, this record is
written to the log.
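Example: let T1 be a transaction that transfers Rs 100 from account A
to account B. This transaction can be defined as follows:
T1
Read(A,a)
a=a-100
Write(A,a)
Read(B,b)
b=b+100
Write(B,b)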
Let T2 be a transaction that withdraws Rs 200 from account C. This
transaction can be defined as follows:
T2
Read (C,c)
c=c-200
Write(C,c)
Suppose that these transactions are executed serially, in the order T1
followed by T2, and that the values of accounts A, B and C before the
execution were Rs 1000, Rs 2000 and Rs 3000
respectively.
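The log that deferred modification produces for this serial execution can be traced with a small Python sketch. The tuple layout of the records and the helper `run_deferred` are assumptions made for illustration; writes go only to the log and are applied to `db` after the commit record has been logged.

```python
db = {"A": 1000, "B": 2000, "C": 3000}
log = []


def run_deferred(txn, body):
    """Run `body` under deferred modification (illustrative helper)."""
    log.append((txn, "start"))
    local = {}                           # the transaction's own pending writes

    def read(item):
        return local.get(item, db[item])

    def write(item, value):
        local[item] = value
        log.append((txn, item, value))   # <Ti, Xj, V2>: new value only

    body(read, write)
    log.append((txn, "commit"))          # Ti partially commits
    # Deferred writes: apply this transaction's logged updates to the database.
    for record in log:
        if record[0] == txn and len(record) == 3:
            db[record[1]] = record[2]


def t1(read, write):                     # transfer Rs 100 from A to B
    write("A", read("A") - 100)
    write("B", read("B") + 100)


def t2(read, write):                     # withdraw Rs 200 from C
    write("C", read("C") - 200)


run_deferred("T1", t1)
run_deferred("T2", t2)

for record in log:
    print(record)
print(db)
```

The update records carry no old value, since under this scheme nothing is written to the database before the commit record and so nothing ever has to be undone.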
Recovery procedure
The recovery procedure for deferred database modification is based
on the redo operation, as explained below:
Redo(Ti)
It sets the value of all data items updated by transaction Ti to the
new values recorded in the log.
After a failure has occurred, the recovery subsystem consults the log
to determine which transactions need to be redone. Transaction Ti
needs to be redone if and only if the log contains both the record <Ti,
start> and the record <Ti, commit>. Thus, if the system crashes
after the transaction completes its execution, the information in
the log is used to restore the system to a consistent
state.
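The redo rule above can be sketched as follows. The function `recover_deferred` and the tuple layout of the log are illustrative assumptions. The example log corresponds to a crash after T2 has logged its write but before its commit record was written.

```python
def recover_deferred(log, db):
    """Redo every transaction that has both a start and a commit record."""
    started = {rec[0] for rec in log if rec[1:] == ("start",)}
    committed = {rec[0] for rec in log if rec[1:] == ("commit",)}
    redo_set = started & committed
    for rec in log:                          # forward scan, in log order
        if len(rec) == 3 and rec[0] in redo_set:
            _, item, new_value = rec
            db[item] = new_value             # Redo: install the new value
    return redo_set


log = [
    ("T1", "start"),
    ("T1", "A", 900),
    ("T1", "B", 2100),
    ("T1", "commit"),
    ("T2", "start"),
    ("T2", "C", 2800),                       # no <T2, commit>: crash came first
]
# Assumed disk state at the crash: T1's deferred writes had not yet reached disk.
db = {"A": 1000, "B": 2000, "C": 3000}

redone = recover_deferred(log, db)
print(redone, db)

# Redo is idempotent: running recovery again leaves the same state.
recover_deferred(log, db)
```

T2's logged write is simply ignored, because under deferred modification it never reached the database.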
Conclusion
Redo: If the log contains both records <Ti,
start> and <Ti, commit>, the updates of this
transaction may or may not have been physically written to
disk. So use the redo operation to apply the new values
from the log records.
Re-execute: In all other cases, ignore the log records and
re-execute the transaction.
Immediate Database Modification
The immediate database modification technique allows database
modifications to be output to the database while the transaction is still in
the active state. Data modifications written by active transactions are
called uncommitted modifications.
If the system crashes or the transaction aborts, the old-value field of the
log records is used to restore the modified data items to the values they
had prior to the start of the transaction. This restoration is accomplished
through the undo operation. To understand the undo operation,
consider the format of the update log record:
<Ti, Xj, V_old, V_new>
Here, Ti is the transaction identifier, Xj is the data item, V_old is the old
value of the data item and V_new is the new value of the data item
Xj.
Undo (Ti):
It restores the value of all data items updated by transaction Ti to the
old values.
Before a transaction Ti starts its execution, the record <Ti, start> is
written to the log. During its execution, any write(X) operation by Ti is
preceded by the writing of the appropriate new update record to the log.
When Ti partially commits, the record <Ti, commit> is written to
the log.
Example:
Let us reconsider our simplified banking system. Let T1 be a transaction
that transfers Rs100 from account A to account B and T2 be a transaction
that withdraws Rs200 from account C. Suppose these transactions are
executed serially as in previous example with initial values of A, B and
C as Rs1000, Rs2000 and Rs3000 respectively.
After a failure has occurred, the recovery scheme consults the log
to determine which transactions need to be undone. Transaction Ti needs
to be undone if the log contains the record <Ti, start> but does not
contain the record <Ti, commit>. Such a transaction was interrupted by
the crash during its execution, so its updates must be undone.
Redo (Ti):
It sets the values of all data items updated by transaction Ti to the new
values, which can be found in the log records. After a failure has
occurred, the log is consulted to determine which transactions need to
be redone.
Transaction Ti needs to be redone if the log contains both the record <Ti,
start> and the record <Ti, commit>. Such a transaction had partially
committed before the crash, so its updates must be redone.
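The undo and redo rules can be combined into one recovery routine. The following is a minimal sketch: `recover_immediate`, the tuple layout of the log, and the disk state assumed at the crash are all illustrative. Undo scans the log backwards and redo scans it forwards.

```python
def recover_immediate(log, db):
    """Undo transactions without a commit record, redo those with one."""
    started = {rec[0] for rec in log if rec[1:] == ("start",)}
    committed = {rec[0] for rec in log if rec[1:] == ("commit",)}
    undo_set = started - committed
    redo_set = started & committed

    for rec in reversed(log):                # Undo: backward scan
        if len(rec) == 4 and rec[0] in undo_set:
            _, item, old_value, _ = rec
            db[item] = old_value

    for rec in log:                          # Redo: forward scan
        if len(rec) == 4 and rec[0] in redo_set:
            _, item, _, new_value = rec
            db[item] = new_value

    return undo_set, redo_set


log = [
    ("T1", "start"),
    ("T1", "A", 1000, 900),
    ("T1", "B", 2000, 2100),
    ("T1", "commit"),
    ("T2", "start"),
    ("T2", "C", 3000, 2800),                 # crash before <T2, commit>
]
# Assumed disk state at the crash: A and C were output, B was not.
db = {"A": 900, "B": 2000, "C": 2800}

undone, redone = recover_immediate(log, db)
print(undone, redone, db)
```

T2's write to C is rolled back to its old value, while T1's write to B, which had not reached disk, is installed from the log.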
Checkpoints
To recover the database after a failure, we must consult the
log to determine which transactions need to be undone and which need
to be redone. In principle, we need to search the entire log to determine this
information. There are two major problems with this approach:
The search process is time-consuming.
Most of the transactions that need to be redone have already written their
updates to the database. Although redoing them causes no harm,
it makes the recovery process take longer.
To reduce these types of overhead, we introduce checkpoints.
The system periodically performs checkpoints.
Actions Performed During Checkpoints
Output onto stable storage all log records currently
residing in main memory.
Output on the disk all modified buffer blocks.
Output onto stable storage a log record <checkpoint>.
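The three checkpoint actions can be sketched in Python. The structures `stable_log`, `log_buffer`, `disk` and `buffer_blocks` (with a dirty flag per block) are assumptions made for illustration; a real system manages these inside its buffer and log managers.

```python
stable_log = [("T1", "start"), ("T1", "A", 1000, 900)]   # already on stable storage
log_buffer = [("T1", "B", 2000, 2100), ("T1", "commit")] # still in main memory
disk = {"A": 1000, "B": 2000}
buffer_blocks = {"A": (900, True), "B": (2100, True)}    # value, dirty flag


def checkpoint():
    # 1. Output all log records residing in main memory onto stable storage.
    stable_log.extend(log_buffer)
    log_buffer.clear()
    # 2. Output all modified buffer blocks onto the disk.
    for name, (value, dirty) in list(buffer_blocks.items()):
        if dirty:
            disk[name] = value
            buffer_blocks[name] = (value, False)
    # 3. Output a <checkpoint> log record onto stable storage.
    stable_log.append(("checkpoint",))


checkpoint()
print(stable_log)
print(disk)
```

After the checkpoint, every update logged before the `("checkpoint",)` record is already on disk.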
Log-Record Buffering
So far we have assumed that every log record is output to stable
storage at the time it is created. This assumption imposes a
high overhead on system execution for the following
reasons:
Output to stable storage is performed in units of blocks.
In most cases, a log record is much smaller than a block.
Thus, the output of each log record translates to a much
larger output at the physical level.
The cost of performing the output of a block to storage is
sufficiently high that it is desirable to output multiple log
records at once.
Because of log buffering, a log record may reside only in main
memory (volatile storage) for a considerable time before it is output
to stable storage. Since such log records are lost if the system
crashes, we must impose additional requirements on the recovery
techniques to ensure transaction atomicity:
Transaction Ti enters the commit state after the <Ti, commit> log
record has been output to stable storage.
Before the <Ti, commit> log record can be output to stable storage,
all log records pertaining to transaction Ti must have been output to
stable storage.
Before a block of data in main memory can be output to the
database (in nonvolatile storage), all log records pertaining to data in
that block must have been output to stable storage.
The last of these rules is called the write-ahead logging (WAL) rule.
The Write-Ahead Log Protocol
Before writing a page to disk, every update log record that
describes a change to this page must be forced to stable
storage. This is accomplished by forcing all log records to
stable storage before writing the page to disk.
WAL is the fundamental rule that ensures that a record of
every change to the database is available while attempting to
recover from a crash.
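A buffered log that obeys the write-ahead rule can be sketched as follows. All names (`flush_log`, `write`, `output_block`, `commit`) are hypothetical, and each data item is treated as its own block to keep the example small.

```python
stable_log = []        # log records on stable storage
log_buffer = []        # log records still in main memory
disk = {"A": 1000}
buffer_blocks = {}     # in-memory copies of data blocks


def flush_log():
    """Force all buffered log records to stable storage, in order."""
    stable_log.extend(log_buffer)
    log_buffer.clear()


def write(txn, item, new_value):
    old_value = buffer_blocks.get(item, disk.get(item))
    log_buffer.append((txn, item, old_value, new_value))  # buffered, not forced
    buffer_blocks[item] = new_value


def output_block(item):
    flush_log()                        # WAL: log reaches stable storage first
    disk[item] = buffer_blocks[item]


def commit(txn):
    log_buffer.append((txn, "commit"))
    flush_log()                        # committed only once this record is stable


log_buffer.append(("T1", "start"))
write("T1", "A", 900)
stable_before_output = list(stable_log)   # nothing has been forced yet
output_block("A")
stable_after_output = list(stable_log)
commit("T1")

print(stable_before_output)
print(stable_after_output)
print(stable_log, disk)
```

Flushing the whole buffer is the simplest way to satisfy the rule; it also guarantees that all of a transaction's records are stable before its commit record is.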
Failure with Loss of Nonvolatile Storage
Until now, we have considered only the case where a failure results in
the loss of information residing in volatile storage while the content of
nonvolatile storage remains intact. Although failures in which the
content of nonvolatile storage is lost are rare, we nevertheless need to be
prepared to deal with this type of failure. In this section, we discuss only
disk storage.
The basic scheme is to dump the entire content of the database to stable
storage periodically, say once per day. For example, we may dump the
database to one or more magnetic tapes. If a failure occurs that results in
the loss of physical database blocks, the most recent dump is used to
restore the database to a previous consistent state. Once this restoration
has been accomplished, the system uses the log to bring the database
to the most recent consistent state.
More precisely, no transaction may be active during the dump procedure,
and a procedure similar to checkpointing must take place:
1. Output all log records currently residing in main memory onto stable
storage.
2. Output all buffer blocks onto the disk.
3. Copy the contents of the database to stable storage.
4. Output a log record <dump> onto the stable storage.
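The four steps, and the restore that follows a disk loss, can be sketched in Python. The functions `dump` and `restore` and the tuple layout of the log are illustrative assumptions. The restore step redoes the transactions that committed after the most recent dump, which is one way of "using the log" that follows the redo rule given earlier.

```python
import copy


def dump(disk, stable_log, log_buffer, buffer_blocks):
    stable_log.extend(log_buffer)        # 1. flush buffered log records
    log_buffer.clear()
    disk.update(buffer_blocks)           # 2. output all buffer blocks to disk
    archive = copy.deepcopy(disk)        # 3. copy the database to stable storage
    stable_log.append(("dump",))         # 4. output a <dump> log record
    return archive


def restore(archive, stable_log):
    """Rebuild the database from the latest dump, then redo from the log."""
    db = copy.deepcopy(archive)
    # Assumes the log contains at least one <dump> record.
    last_dump = max(i for i, rec in enumerate(stable_log) if rec == ("dump",))
    tail = stable_log[last_dump + 1:]
    committed = {rec[0] for rec in tail if rec[1:] == ("commit",)}
    for rec in tail:
        if len(rec) == 4 and rec[0] in committed:
            db[rec[1]] = rec[3]          # redo: install the new value
    return db


disk = {"A": 900, "B": 2100, "C": 3000}
stable_log, log_buffer, buffer_blocks = [], [], {}

archive = dump(disk, stable_log, log_buffer, buffer_blocks)

# T2 runs and commits after the dump, then the disk is lost.
stable_log += [("T2", "start"), ("T2", "C", 3000, 2800), ("T2", "commit")]
disk = None

recovered = restore(archive, stable_log)
print(recovered)
```

Because no transaction is active during the dump, every transaction found after the `("dump",)` record started after it, so scanning only the log tail is sufficient in this sketch.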
References
Parteek Bhatia, Simplified Approach to DBMS, Kalyani Publishers.
