You are on page 1of 59

Chapter 17

Introduction to Transaction Processing


Concepts and Theory
Chapter Outline

Introduction to Transaction Processing


Transaction and System Concepts
Desirable Properties of Transactions
Transaction Support in SQL

Next Lecture
Schedules and Recoverabilityy
Characterizing Schedules based on Serializability
1 Introduction to Transaction Processing (1)

z Single-User System: At most one user at a time can


use the system.
z Multiuser System: Many users can access the system
concurrently.
z Concurrency
– Interleaved processing: concurrent execution of
processes is interleaved in a single CPU
– Parallel processing: processes are concurrently
executed in multiple CPUs.
Introduction to Transaction Processing (2)
z A Transaction: logical
l i l unit
it off database
d t b processing
i that
th t
includes one or more access operations (read -retrieval,
write - insert or update, delete).
z A transaction (set of operations) may be stand-
alone specified in a high level language like SQL
submitted interactively,
y, or may
y be embedded within a
program.
z Transaction boundaries: Begin and End transaction.
z An application program may contain several
transactions separated by the Begin and End
transaction boundaries.
boundaries
Introduction to Transaction Processing (3)
z Basic operations in a database are read and write
– read_item(X): Reads a database item named X into a
program variable. To simplify our notation, we assume
that the program variable is also named X.
– write_item(X): Writes the value of program variable X
into the database item named X.
z Basic unit of data transfer from the disk to the computer
main memory is one block.
Two sample transactions
transactions. (a) Transaction T1.
(b) Transaction T2.
Some p problems that occur when concurrent
execution is uncontrolled. (a) The lost update
problem.
Some p problems that occur when concurrent
execution is uncontrolled. (b) The temporary update
problem.
Some problems that occur when concurrent execution is
uncontrolled. (c) The incorrect summary problem.
Why Concurrency Control is needed?
z The Lost Update Problem.
Problem
This occurs when two transactions that access the
same database items have their operations
p
interleaved in a way that makes the value of some
database item incorrect.
z The Temporary Update (or Dirty Read) Problem
Problem.
This occurs when one transaction updates a
database item and then the transaction fails for some
reason. The updated item is accessed by another
transaction before it is changed back to its original
value.
value
Why Concurrency Control is needed?

Why Concurrency Control is needed (cont.):


z The Incorrect Summary Problem .
If one transaction is calculating an aggregate
summary function on a number of records while other
transactions are updating some of these records
records, the
aggregate function may calculate some values before
they are updated and others after they are updated.
What causes a Transaction to fail?
1. A computer failure (system crash): A hardware or
software error occurs in the computer system during
transaction execution. If the hardware crashes, the
contents of the computer’s internal memory may be
lost.
lost
2. A transaction or system error : Some operation in the
transaction may cause it to fail, such as integer overflow
or division by zero
zero. Transaction failure may also occur
because of erroneous parameter values or because of
a logical programming error. In addition, the user may
interrupt the transaction during its execution
execution.
What causes a Transaction to fail?
3. Local errors or exception conditions detected by the
transaction:
- certain conditions necessitate cancellation of the
transaction. For example, data for the transaction may
not be found. A condition, such as insufficient account
balance in a banking database, may cause a
transaction, such as a fund withdrawal from that
account, to be canceled.
- a programmed d abort
b t iin th
the ttransaction
ti causes it tto
fail.
4. Concurrency control enforcement: The concurrency
control
t l method
th d may decide
d id tto abort
b t th
the ttransaction,
ti tto
be restarted later, because it violates serializability or
because several transactions are in a state of
deadlock.
deadlock
What causes a Transaction to fail?
5. Disk failure: Some disk blocks may lose their data
because of a read or write malfunction or because of
a disk read/write head crash. This may happen during
a read or a write operation of the transaction.
6. Physical problems and catastrophes: This refers
to an endless list of problems that includes power or
air-conditioning
air conditioning failure
failure, fire
fire, theft
theft, sabotage
sabotage,
overwriting disks or tapes by mistake, and mounting
of a wrong tape by the operator.
Chapter Outline

Introduction to Transaction Processing


Transaction and System Concepts
Desirable Properties of Transactions
Transaction Support in SQL
2 Transaction and System Concepts (1)

A transaction is an atomic unit of work that is either


completed in its entirety or not done at all.
For recovery purposes
purposes, the system needs to keep track
of when the transaction starts, terminates, and
commits or aborts.
T
Transaction
ti states:
t t
z Active state
z Partiallyy committed state
z Committed state
z Failed state
z Terminated
T i dSState
State transition diagram illustrating the states for
transaction execution.
Transaction and System Concepts (2)
Recovery manager keeps track of the following
operations:
z begin_transaction: This marks the beginning of
transaction execution
execution.
z read or write: These specify read or write operations
on the database items that are executed as part of a
transaction.
z end_transaction: This specifies that read and write
transaction operations have ended and marks the
end limit of transaction execution. At this p
point it may
y
be necessary to check whether the changes
introduced by the transaction can be permanently
applied to the database or whether the transaction
h tto be
has b aborted
b t db because it violates
i l t concurrency
control or for some other reason.
Transaction and System Concepts (3)

z commit_transaction: This signals a successful end of


the transaction so that any changes (updates) executed
by the transaction can be safely committed to the
database and will not be undone.
) This signals
z rollback ((or abort): g that the transaction
has ended unsuccessfully, so that any changes or
effects that the transaction may have applied to the
database must be undone.
undone
Transaction and System Concepts (4)

Recovery techniques use the following operators:


z undo: Similar to rollback except p that it applies
pp to
a single operation rather than to a whole
transaction.
z redo: This specifies that certain transaction
operations must be redone to ensure that all the
operations
ti off a committed
itt d transaction
t ti have
h
been applied successfully to the database.
The System Log or Database Journal

z The log keeps track of all transaction operations that


affect the values of database items. This information
may be needed to permit recovery from transaction
failures.

z Th
The llog iis kkeptt on di
disk,
k so it iis nott affected
ff t d bby any ttype
of failure except for disk or catastrophic failure.

z The log is periodically backed up to archival storage


(tape) to guard against such catastrophic failures.
The System Log or Database Journal
Types of log record:
1. [start_transaction,T]: Records that transaction T has
started execution.
2. [[write_item,T,X,old
, , , _value,new
, _value]:
] Records that
transaction T has changed the value of database item
X from old_value to new_value.
3 [read_item,T,X]:
3. [read item T X]: Records that transaction T has read
the value of database item X.
4. [commit,T]: Records that transaction T has completed
successfully and affirms that its effect can be
successfully,
committed (recorded permanently) to the database.
5. [abort,T]: Records that transaction T has been aborted.
The System Log or Database Journal
z Many protocols for recovery do not require that
read operations be written to the system log,
whereas some protocols require these entries
for recovery.

z S
Strict
i protocolsl require
i simpler
i l write
i entries
i
that do not include new_value.
Recovery using log records
If the system crashes, we can recover to a consistent
database state by examining the log log.
1. Because the log contains a record of every write
operation that changes the value of some database
item, it is possible to undo the effect of these write
operations of a transaction T by tracing backward
through the log and resetting all items changed by a
write operation of T to their old_values.
2. We can also redo the effect of the write operations of a
transaction T byy tracingg forward through
g the log
g and
setting all items changed by a write operation of T (that
did not get done permanently) to their new_values.
Commit Point of a Transaction
z A transaction T reaches its commit point when all its
operations that access the database have been
executed successfully and the effect of all the
transaction operations on the database has been
recorded in the log
log.
z Beyond the commit point, the transaction is said to be
committed, and its effect is assumed to be
permanently recorded in the database.
– The transaction then writes an entry [commit,T] into the log.
z Roll Back of transactions is needed for transactions
that have a [start_transaction,T] entry into the log but
no commit entry [commit,T] into the log.
Commit Point of a Transaction
z Redoing transactions: Transactions that have written
their commit entry in the log must also have recorded
all their write operations in the log; otherwise they
would not be committed, so their effect on the
database can be redone from the log entries
entries.
– At the time of a system crash, only the log entries that have
been written back to disk are considered in the recovery
process because the contents of main memory may be lost lost.
z Force writing a log: before a transaction reaches its
commit point, any portion of the log that has not been
written to the disk yet must now be written to the disk
disk.
This process is called force-writing the log file before
committing a transaction.
Chapter Outline

Introduction to Transaction Processing


Transaction and System Concepts
Desirable Properties of Transactions
Transaction Support in SQL
3 Desirable Properties of Transactions (1)

ACID properties:
y A transaction is an atomic unit of
z Atomicity:
processing; it is either performed in its entirety
or not performed at all.

z Consistency preservation: A correct execution


of the transaction must take the database from
one consistent state to another.
Desirable Properties of Transactions (2)

z Isolation: A transaction should not make its updates visible


to other transactions until it is committed; this property, when
enforced strictly
strictly, solves the temporary update problem and
makes cascading rollbacks of transactions unnecessary.
– Level 0: No Overwrite to Dirty Read
– Level 1: No Lost Update
– Level 2: 0+1
– Level 3: 2+ Repeatable Reads
z Durability or permanency: Once
O a transaction changes the
database and the changes are committed, these changes
must never be lost because of subsequent failure.
Chapter Outline

Introduction to Transaction Processing


Transaction and System Concepts
Desirable Properties of Transactions
Transaction Support in SQL
Transaction Support in SQL

z A single
g SQLQ statement is always y considered to
be atomic. Either the statement completes
execution without error or it fails and leaves the
database unchanged
unchanged.
z With SQL, there is no explicit Begin Transaction
statement.
state e t. Transaction
a sact o initiation
t at o iss done
do e
implicitly when particular SQL statements are
encountered.
z Every transaction must have an explicit end
statement, which is either a COMMIT or
ROLLBACK.
ROLLBACK
Transaction Support in SQL
z Access mode: READ ONLY or READ WRITE. The
default is READ WRITE unless the isolation level of
READ UNCOMITTED is specified,
specified in which case
READ ONLY is assumed.

z Isolation level of a transaction determines what data


the transaction can see when other transactions are
running concurrently:
– SERIALIZABLE All statements of the current transaction
can only see rows committed before the first query or data-
modification statement was executed in this transaction.
– READ COMMITTED A statement can only see rows
committed before it began.
began This is the default.
default
– The SQL standard defines two additional levels, READ
UNCOMMITTED and REPEATABLE READ.
In PostgreSQL READ UNCOMMITTED is treated
as READ COMMITTED
COMMITTED, whileREPEATABLE READ is
treated as SERIALIZABLE.
Transaction Support in Postgresql
z BEGIN
– [ WORK | TRANSACTION ]
– [ISOLATION LEVEL { SERIALIZABLE | REPEATABLE
READ | READ COMMITTED | READ UNCOMMITTED }]
– [READ WRITE | READ ONLY]
z ROLLBACK - abort the current transaction
z p
SAVEPOINT - define a new savepoint within the current
transaction
z RELEASE [SAVEPOINT] - destroy a previously defined
savepoint
z ROLLBACK TO [SAVEPOINT] - roll back to a
savepoint
z COMMIT - commit
i the
h current transaction
i
Chapter Outline

Introduction to Transaction Processing


Transaction and System Concepts
Desirable Properties of Transactions
Transaction Support in SQL
Schedules and Recoverability
Schedules and Serializability
Testingg for Serializabilityy
Schedules
z Schedule – a sequences of instructions that specify the
chronological order in which instructions of concurrent
transactions
i are executedd
– a schedule for a set of transactions must consist of all
instructions of those transactions
– must preserve the order in which the instructions appear in
each individual transaction.
z Operations other than read and write instructions are ignored
– Assumption: Transactions may perform arbitrary
computations on data in local buffers in between reads and
writes.
z A transaction that successfully completes its execution will have
an explicit commit instructions as the last statement.
z A transaction that fails to successfullyy complete
p its execution
will have a rollback instructions as the last statement.
Serial Schedule
zLet T1 transfer $50
f
from A to
t B,
B andd T2
transfer 10% of the
balance from A to B.

zA
A serial schedule in
which T1 is followed by
T2:
Equivalent
q Schedule
zLet T1 and T2 be the
t
transactions
ti defined
d fi d
previously.

zFollowing schedule is
not a serial schedule,
but it is equivalent to
Schedule 1.
Recoverable Schedules
z Recoverable schedule — if a transaction Tj reads a data item
previously written by a transaction Ti , then the commit
operation of Ti appears before the commit operation of Tj.

z The above schedule is not recoverable if T9 commits


immediatelyy after the read.
z If T8 should abort, T9 would have read an inconsistent database
state.
z Database must ensure that schedules are recoverable.
Cascading
g Rollbacks
z Cascading rollback – a single transaction failure leads to a
series of transaction rollbacks.

z Above schedule is recoverable if none of the transactions has yet


committed.
z If T10 fails, T11 and T12 must also be rolled back.
z Can lead to undoing of a significant amount of work.
work
Cascadeless Schedules

zCascadeless schedules — for each p pair of


transactions Ti and Tj such that Tj reads a
data item ppreviously y written byy Ti, the
commit operation of Ti appears before the
p
read operation of Tj.
zEvery cascadeless schedule is also
recoverable.
zIt is desirable to restrict the schedules to
those that are cascadeless
Strict Schedules

z Strict
St i t Schedules:
S h d l A schedule
h d l iin which
hi h a transaction
t ti
Tj can neither read or write an item X until the last
transaction Ti that wrote X has committed
committed.
Chapter Outline

Introduction to Transaction Processing


Transaction and System Concepts
Desirable Properties of Transactions
Transaction Support in SQL
Schedules and Recoverability
Schedules and Serializability
Testingg for Serializabilityy
Serializabilityy
zA schedule is serializable if it is equivalent
to a serial schedule.
– Each transaction preserves database
consistency.
i
– Thus serial execution of a set of transactions
preserves database consistency
consistency.
zDifferent forms of schedule equivalence
give rise to the notions of:
1.Conflict Serializability
2 View Serializability
2.View
Conflicting
g Instructions
z Instructions li and lj of transactions Ti and Tj
respectively conflict if and only if there exists some
respectively,
item Q accessed by both li and lj, and at least one of
these instructions wrote Q.
1 li = read(Q),
1. read(Q) lj = write(Q).
write(Q) They conflict.
conflict
2. li = write(Q), lj = read(Q). They conflict
3. li = write(Q), lj = write(Q). They conflict

z A conflict between li and lj forces a (logical) temporal


order between them.
z If li and lj are consecutive in a schedule and they do not
conflict, their results would remain the same even if
they had been interchanged in the schedule.
schedule
Conflict Serializability

zIf a schedule S can be transformed into a


schedule S´ by a series of swaps of non-
conflicting instructions,
instructions we say that S and S

are conflict [free] equivalent.
– S = {r
{ 1(A),
(A) w1(A),
(A) r2(A),
(A) r1(B)}
– S´ = {r1(A), w1(A), r1(B), r2(A)}
zWe say that
h a schedule
h d l S isi conflict
i
serializable if it is conflict equivalent to a
serial
i l schedule.
h d l
Conflict Serializabilityy (Example)
( p )
z Schedule A can be transformed into a serial schedule,
by series of swaps of non-conflicting instructions.
– Therefore schedule A is conflict serializable.
Conflict Serializabilityy ((Example
p 2))

z We are unable to swap instructions in the above schedule


to obtain either the serial schedule < T3, T4 >, or the serial
schedule < T4, T3 >.
z Example of a schedule that is not conflict serializable:
Exercises
z In Class Quiz
– Whi
Which h off the
h ffollowing
ll i schedules
h d l are ((conflict)
fli )
serializable?
a) r1 (X); r3 (X); w1(X); r2(X); w3(X)
b) r1 (X); r3 (X); w3(X); w1(X); r2(X)
c) r3 (X); r2 (X); w3(X); r1(X); w1(X)
d) r3 (X); r2 (X); r1(X); w3(X); w1(X)
– For each serializable schedule, determine the
equivalent serial schedules.
z Home Task
– Exercise 17.26 (Use any programming language of
your choice)
View Serializabilityy
z Let S and S´ be two schedules with the same set of
transactions.
transactions
z S and S´ are view equivalent if the following three
conditions are met:
1 For
1. F eeach h ddata
t item
ite Q,
Q if transaction
t ti Ti reads
e d the iinitial
iti l value
l e
of Q in schedule S, then transaction Ti must, in schedule S´,
also read the initial value of Q.
2. For each data item Q if transaction Ti executes read(Q) in
schedule S, and that value was produced by transaction Tj (if
any), then transaction Ti must in schedule S´ also read the
value of Q that was produced by transaction Tj .
3 For
3. F each h ddata iitem Q,
Q the
h transaction
i (if any)) that
h performs
f
the final write(Q) operation in schedule S must perform the
final write(Q) operation in schedule S´.
View Serializabilityy (2)
( )
z A schedule S is view serializable it is view equivalent to a
serial schedule.
z Every conflict serializable schedule is also view serializable.

z Above schedule is view-serializable


– Is it conflict serializable?
– What serial schedule is it equivalent to?
z Every view serializable schedule that is not conflict serializable
has blind writes.
Chapter Outline

Introduction to Transaction Processing


Transaction and System Concepts
Desirable Properties of Transactions
Transaction Support in SQL
Schedules and Recoverability
Schedules and Serializability
Testingg for Serializabilityy
Precedence (Serialization) Graph
z For each
F h ttransaction
ti Ti participating
ti i ti in i the
th schedule
h d l S,
S createt a
node (vertices) labeled Ti
z For each case in S where Ti executes a write(X)( ) before Tj
executes a read(X), create an edge (Ti → Tj ) in the graph
z For each case in S where Ti executes a read(X) before Tj
executes
t a write(X),
it (X) create d (Ti → Tj ) in
t an edge i the
th graphh
z For each case in S where Ti executes a write(X) before Tj
executes a write(X), g ((Ti → Tj ) in the ggraph
( ), create an edge p
z The schedule is serializable if and only if the precedence graph
has no cycles.
Precedence Graphs
p Examples
p ((1))
Precedence Graphs
p Examples
p ((2))
Another example of serializability testing.
Yet another example of serializability testing
Topological Ordering or
P
Precedence
d G
Graphh
z Cycle-detection algorithms exist
which take order n2 time, where n is
the number of vertices in the graph.
– Better algorithms take order n + e
where e is the number of edges.
z If precedence graph is acyclic, the
serializability
i li bili order
d can beb obtained
b i d
by a topological sorting of the graph.
– This is a linear order consistent with
the partial order of the graph.
– For example, a serializability order for
given schedule would be
Ti → Tj → Tk → Tm
Exercise for Topological Ordering
Summary

zIntroduction to Transaction Processing


Concepts
p

zReading Material
– Chapter 17 (Elmasri)
– Chapter
Ch 15 (Silberschatz)
(Silb h )

You might also like