
Database Systems

Recovery Control
Redo Logging

Disadvantage of Undo Logging

Cannot consider T committed, or write its <Commit T> log record,
until all of T’s updates are written to disk.

• This forces the DBMS to perform many I/Os
– Especially costly for small transactions
Rules for Redo Logging
• For every write action, generate a redo log record.
– <T, X, v>: Transaction T has modified X and the new value is v
• Flush the log at commit.
• Before modifying any value X on disk (Output(X)):
– All log records of T (including <Commit T>) must be on disk
• Write an <END T> log record after T’s DB modifications have been written to disk.
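
The rules above can be sketched as a small in-memory simulation. This is only an illustration under simplifying assumptions (a single thread, dictionaries standing in for disk and buffers); the class `RedoLogManager` and its method names are hypothetical and do not come from the slides.

```python
class RedoLogManager:
    """Toy model of redo logging; every name here is illustrative."""

    def __init__(self, disk):
        self.disk = disk        # dict: element -> value on disk
        self.buffer = {}        # modified values still in memory
        self.log_mem = []       # log records not yet flushed
        self.log_disk = []      # log records already on disk
        self.dirty = {}         # transaction -> elements not yet output

    def write(self, t, x, v):
        # Rule 1: every write generates a redo record <T, X, v>.
        if t not in self.dirty:
            self.log_mem.append(("START", t))
            self.dirty[t] = set()
        self.log_mem.append((t, x, v))
        self.buffer[x] = v
        self.dirty[t].add(x)

    def flush_log(self):
        self.log_disk.extend(self.log_mem)
        self.log_mem.clear()

    def commit(self, t):
        # Rule 2: flush the log at commit.
        self.log_mem.append(("COMMIT", t))
        self.flush_log()

    def output(self, t, x):
        # Rule 3: X may reach disk only after <COMMIT T> is on disk.
        if ("COMMIT", t) not in self.log_disk:
            raise RuntimeError(f"cannot output {x}: <COMMIT {t}> not on disk")
        self.disk[x] = self.buffer[x]
        self.dirty[t].discard(x)
        if not self.dirty[t]:
            # Rule 4: write <END T> once all of T's changes are on disk.
            self.log_mem.append(("END", t))
            self.flush_log()
```

Calling `output` before `commit` raises an error in this sketch, which mirrors the restriction that no Output may precede the flushed <Commit T> record.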

Example

• In a redo log record <T, X, v>, v is the new value of X
• No Output can be done until the log, containing all of T’s records and its <Commit T>, is flushed to disk

Redo Logging: Recovery Rules
Check the log
• T with no <Commit T>
– Can be ignored (do nothing)
– Because T did not write anything to disk

• T with <End T>
– Can be ignored (do nothing)
– Because T wrote all its data to disk

• T with <Commit T> but no <End T>
– Redo its actions (start from <Start T> and move forward)
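
A minimal sketch of these recovery rules follows. It assumes log records are Python tuples of the form `("START", T)`, `(T, X, v)`, `("COMMIT", T)` and `("END", T)`, and that the disk is a dictionary; the function name `redo_recover` is hypothetical.

```python
def redo_recover(log, disk):
    """Apply redo-logging recovery to `disk` using the on-disk `log`.

    Returns the set of transactions that were redone.
    """
    committed = {r[1] for r in log if r[0] == "COMMIT"}
    ended = {r[1] for r in log if r[0] == "END"}
    # Only committed transactions without <END T> need any work.
    to_redo = committed - ended

    # Forward pass: reapply the new value of every update by those transactions.
    for rec in log:
        if len(rec) == 3 and rec[0] in to_redo:
            _, x, v = rec
            disk[x] = v

    # Record that their changes are now fully on disk.
    for t in sorted(to_redo):
        log.append(("END", t))
    return to_redo
```

Transactions with no commit record are never touched, because under redo logging they cannot have written anything to disk.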

Example

<Commit T> is not yet written to disk
--> Do nothing

Example

<Commit T> is on disk, but there is no <End T>
--> Redo T
– Copy 16 to A
– Copy 16 to B
– Add <End T> to the log and write it to disk
Disadvantage of Redo Logging

Cannot write any of T’s updates to disk until T commits.

• Delayed I/Os
– All modified blocks must be kept in memory until T commits

• Especially bad for large transactions

Solution: Combine Undo and Redo Logging

Next: Undo/Redo Logging & Checkpoints

Undo/Redo Logging
• Stores more data in its log to offer more flexibility
– Log record: <Ti, X, oldVal, newVal>

Transaction Ti has updated X; the old value is oldVal and the new value is newVal

• In Recovery
– Some transactions will be undone (incomplete ones)
– Some transactions will be redone (complete ones)

Undo/Redo Logging
• For every write action, generate a log record.
– <T, X, old, new>: Transaction T has modified X from old to new

• Before writing any object X to disk (Output(X)), its log record must be on disk (flush the log first)

• Can we write X before <Commit T>? Yes
• Can we write X after <Commit T>? Yes
– A block containing X can now move to disk either before or after the commit
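
The single ordering constraint of undo/redo logging can be expressed as a small check. This is a sketch under the assumption that unflushed log records are kept in a Python list; the `Update` record type and the function `can_output` are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Update:
    """Undo/redo log record <T, X, old, new>."""
    txn: str
    elem: str
    old: object
    new: object


def can_output(elem, unflushed_log):
    """Return True if `elem` may be written to disk.

    The only requirement is that no update record for `elem` is still
    waiting in memory. Whether <COMMIT T> has been written plays no role.
    """
    return not any(
        isinstance(rec, Update) and rec.elem == elem for rec in unflushed_log
    )
```

Note that the check never looks for a commit record, which is exactly the flexibility undo/redo logging adds over pure undo or pure redo logging.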

Example
Action       t    Mem A  Mem B  Disk A  Disk B  Log
                                                <START T>
READ(A,t)    8    8             8       8
t:=t*2       16   8             8       8
WRITE(A,t)   16   16            8       8       <T,A,8,16>
READ(B,t)    8    16     8      8       8
t:=t*2       16   16     8      8       8
WRITE(B,t)   16   16     16     8       8       <T,B,8,16>
FLUSH LOG
OUTPUT(A)    16   16     16     16      8
                                                <COMMIT T>
OUTPUT(B)    16   16     16     16      16

• Once the update records are flushed to the log on disk, either A or B may be written to disk
• After <COMMIT T>, T is considered completed although some of its data is not yet written to disk

Undo/Redo Logging: Recovery Rules
• If <Commit T> is in the log (on disk)
– T is considered completed
– Some of its data may be written to disk, some may not
– Redo T (top-down)

• If <Commit T> is not in the log (not on disk)
– Some of its data may still have been written to disk
– Undo T (bottom-up)

Undo/Redo Logging: Recovery Rules
Redo Phase
1. Decide on which transactions to redo (can be many)
2. Take one forward pass (top-down) and redo them

Undo Phase
1. Decide on which transactions to undo (can be many)
2. Take one backward pass (bottom-up) to undo them

The order of the two phases may switch…
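
The two phases can be sketched as follows, in the order listed above (redo first, then undo). The sketch assumes log records are tuples `("START", T)`, `("COMMIT", T)` or `(T, X, old, new)`, that the disk is a dictionary, and that transactions do not overwrite each other's uncommitted data; the function name `undo_redo_recover` is hypothetical.

```python
def undo_redo_recover(log, disk):
    """Recover `disk` from an undo/redo `log`.

    Returns (redone, undone) as sets of transaction names.
    """
    started = {r[1] for r in log if r[0] == "START"}
    committed = {r[1] for r in log if r[0] == "COMMIT"}
    incomplete = started - committed

    # Redo phase: one forward pass writing the new values of committed transactions.
    for rec in log:
        if len(rec) == 4 and rec[0] in committed:
            _, x, _old, new = rec
            disk[x] = new

    # Undo phase: one backward pass restoring the old values of incomplete ones.
    for rec in reversed(log):
        if len(rec) == 4 and rec[0] in incomplete:
            _, x, old, _new = rec
            disk[x] = old

    return committed, incomplete
```

Each phase touches the log once, no matter how many transactions fall into its group.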

Example
Log on disk
<START T1>
<T1,X1,u1,v1>
<START T2>
<T2,X2,u2,v2>
<START T3>
<T1,X3,u3,v3>
<COMMIT T2>
<T3,X4,u4,v4>
<T1,X5,u5,v5>
crash

• Redo T2
• Undo T1, T3
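
As a worked illustration of this log, the sketch below derives the individual redo and undo actions. The tuple encoding of the records and the function name `recovery_plan` are assumptions made for the example; the symbolic values `u1`, `v1`, … are kept as strings.

```python
def recovery_plan(log):
    """Return (redo_actions, undo_actions) as lists of (element, value) pairs."""
    committed = {r[1] for r in log if r[0] == "COMMIT"}
    updates = [r for r in log if len(r) == 4]
    # Redo committed transactions top-down, installing new values.
    redo = [(x, new) for (t, x, old, new) in updates if t in committed]
    # Undo uncommitted transactions bottom-up, restoring old values.
    undo = [(x, old) for (t, x, old, new) in reversed(updates) if t not in committed]
    return redo, undo


log = [
    ("START", "T1"),
    ("T1", "X1", "u1", "v1"),
    ("START", "T2"),
    ("T2", "X2", "u2", "v2"),
    ("START", "T3"),
    ("T1", "X3", "u3", "v3"),
    ("COMMIT", "T2"),
    ("T3", "X4", "u4", "v4"),
    ("T1", "X5", "u5", "v5"),
]

redo, undo = recovery_plan(log)
print(redo)  # [('X2', 'v2')]
print(undo)  # [('X5', 'u5'), ('X4', 'u4'), ('X3', 'u3'), ('X1', 'u1')]
```

Only T2 has a commit record, so its single update is redone, while the updates of T1 and T3 are undone starting from the end of the log.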
Example
Action       t    Mem A  Mem B  Disk A  Disk B  Log
                                                <START T>
READ(A,t)    8    8             8       8
t:=t*2       16   8             8       8
WRITE(A,t)   16   16            8       8       <T,A,8,16>
READ(B,t)    8    16     8      8       8
t:=t*2       16   16     8      8       8
WRITE(B,t)   16   16     16     8       8       <T,B,8,16>
FLUSH LOG
OUTPUT(A)    16   16     16     16      8
                                                <COMMIT T>
OUTPUT(B)    16   16     16     16      16

• The <Commit T> record is not yet written to disk
• T is considered incomplete and will be undone
- Return B to 8
- Return A to 8
Example
Action       t    Mem A  Mem B  Disk A  Disk B  Log
                                                <START T>
READ(A,t)    8    8             8       8
t:=t*2       16   8             8       8
WRITE(A,t)   16   16            8       8       <T,A,8,16>
READ(B,t)    8    16     8      8       8
t:=t*2       16   16     8      8       8
WRITE(B,t)   16   16     16     8       8       <T,B,8,16>
FLUSH LOG
OUTPUT(A)    16   16     16     16      8
                                                <COMMIT T>
FLUSH LOG
OUTPUT(B)    16   16     16     16      16

• The <Commit T> record is on disk
• T is considered complete and will be redone
- Copy 16 to A
- Copy 16 to B
Database Systems
Recovery Control
Checkpointing

Checkpointing
• When restarting after a crash
– The Recovery Manager needs to check the entire log to decide
• Which transactions to undo (if Undo logging)
• Which transactions to redo (if Redo logging)
• Which transactions to undo or redo (if Undo/Redo logging)

• Checking the entire log is very expensive

• Need points before which everything is guaranteed to be correct
– These points are called “checkpoints”

Recovery is very, very SLOW!

log: First Record (1 year ago) ... T1 wrote A,B ... Last Record ... Crash

T1 committed a year ago --> STILL need to redo it after the crash!

Solution: Checkpoint
• Simple checkpoint
Periodically:
(1) Do not accept new transactions
(2) Wait until all transactions finish (commit or abort)
(3) Flush all log records to disk (log)
(4) Flush all buffers to disk (DB)
(5) Write “checkpoint” record on disk (log)
(6) Resume transaction processing
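
The six steps can be sketched on a toy, single-threaded model. Everything here is hypothetical: the class `ToyDB`, its fields, and the `finish` method, which stands in for "wait until the transaction commits or aborts" by simply committing it.

```python
class ToyDB:
    """Hypothetical stand-in for a DBMS, used only to illustrate the steps."""

    def __init__(self):
        self.accepting = True     # whether new transactions are admitted
        self.active = set()       # transactions currently running
        self.log_mem = []         # log records still in memory
        self.log_disk = []        # log records on disk
        self.buffers = {}         # modified values in memory
        self.disk = {}            # values on disk

    def finish(self, t):
        # Simplification: "waiting" for t is modeled as t committing now.
        self.log_mem.append(("COMMIT", t))
        self.active.discard(t)

    def checkpoint(self):
        self.accepting = False                   # (1) do not accept new transactions
        for t in sorted(self.active):            # (2) wait until all transactions finish
            self.finish(t)
        self.log_disk.extend(self.log_mem)       # (3) flush all log records to disk
        self.log_mem.clear()
        self.disk.update(self.buffers)           # (4) flush all buffers to disk
        self.log_disk.append(("CHECKPOINT",))    # (5) write the checkpoint record
        self.accepting = True                    # (6) resume transaction processing
```

After `checkpoint()` returns, every record before the `("CHECKPOINT",)` entry belongs to a finished transaction whose data is on disk.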

Example: what to do at recovery?

Undo or Redo log (disk):

... <T1,A,16> ... <T1,commit> ... Checkpoint ... <T2,B,17> ... <T2,commit> ... <T3,C,21> ... Crash

Check only back to the last checkpoint
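
A minimal sketch of limiting the scan to the records after the last checkpoint. The tuple encoding, the function name `records_since_checkpoint`, and the record order used in the example log (T1's records before the checkpoint, T2's and T3's after it) are assumptions made for illustration.

```python
def records_since_checkpoint(log):
    """Return the part of `log` that follows the last ("CHECKPOINT",) record."""
    for i in range(len(log) - 1, -1, -1):
        if log[i] == ("CHECKPOINT",):
            return log[i + 1:]
    # No checkpoint found: the whole log has to be examined.
    return list(log)


# Assumed ordering of the records in the example.
log = [
    ("T1", "A", 16),
    ("COMMIT", "T1"),
    ("CHECKPOINT",),
    ("T2", "B", 17),
    ("COMMIT", "T2"),
    ("T3", "C", 21),
]

tail = records_since_checkpoint(log)
print(tail)  # [('T2', 'B', 17), ('COMMIT', 'T2'), ('T3', 'C', 21)]
```

Under that assumed order, recovery only needs to consider T2 and T3; T1's records lie before the checkpoint and can be skipped.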

More Complex Checkpointing Mechanism
• The simple mechanism does not accept new transactions during checkpointing

• This may not be acceptable for some systems

• A more complex mechanism allows accepting new transactions while doing the checkpointing

The checkpointing mechanism is different for the different logging mechanisms

Explain the purpose of the checkpoint mechanism. How often should
checkpoints be performed? How does the frequency of checkpoints affect:
• System performance when no failure occurs?
• The time it takes to recover from a system crash?
• The time it takes to recover from a media (disk) failure?
Answer: Checkpointing is done with log-based recovery schemes to reduce the
time required for recovery after a crash. Without checkpointing, the entire
log must be searched after a crash, and all transactions undone or redone
from the log. If checkpointing has been performed, most of the log records
prior to the checkpoint can be ignored at recovery time.
Another reason to perform checkpoints is to clear log records from stable
storage as it fills up.
Since checkpoints cause some loss in performance while they are being taken,
their frequency should be reduced if fast recovery is not critical.
If fast recovery is needed, checkpointing frequency should be increased. If the
amount of stable storage available is small, frequent checkpointing is unavoidable.
Checkpoints have no effect on recovery from a disk crash; archival dumps are
the equivalent of checkpoints for recovery from disk crashes.
We Are Done…
