
BST 3101: DATABASE SYSTEMS II
Topic 3
Database Administration
Database administration refers to the whole set of activities performed by a
database administrator to ensure that a database is always available as needed. It
also includes related activities and roles such as database security, database
monitoring and troubleshooting, and planning for future growth. Database
administration can also be described as the function of managing and
maintaining database management systems (DBMS) software. It is an important
function in any organisation that depends on one or more databases.
Organisations that use DBMS software often hire specialised IT personnel known as
Database Administrators (DBAs), who require deep technical skills in the running
and operation of databases. They also need knowledge and training in the platform
(database engine and operating system) on which the database runs. The main role
of DBAs is to ensure maximum uptime for the database so that it is available when
needed. Other key roles are as follows:
(i) Designing, implementing, and maintaining the database system.
(ii) Database tuning - tweaking several parameters, such as server memory
allocation and file fragmentation, to optimise performance.
(iii) Maintaining database documentation and standards.
(iv) Enforcing security and maintenance of the database(s).
(v) Backup and recovery - ensuring the database has adequate backup and
recovery procedures in place to recover from any accidental or deliberate loss
of data.
(vi) Producing reports from queries.
(vii) Facilitating training of employees on use of the database.
There are three broad categories of database administrators: systems, development,
and application DBAs.
Systems DBAs, also known as physical, operations, or production support DBAs,
focus on the physical aspects of database administration, such as DBMS installation,
configuration, patching, upgrades, backups, restores, refreshes, performance
optimisation, maintenance, and disaster recovery.
Development DBAs focus on the logical and development aspects of database
administration, such as data model design and maintenance, DDL (data definition
language) generation, SQL writing and tuning, and coding stored procedures.
Application DBAs are usually found in organisations that have purchased third-party
application software such as ERP (enterprise resource planning) and CRM
(customer relationship management) systems. Application DBAs cut across the
DBMS and the application software and are responsible for ensuring that the
application is fully optimised for the database and vice versa.

Transaction Management

A transaction is one or more SQL statements that make up a unit of work performed
against the database. Either all the statements in a transaction are committed as a
unit, or all the statements are rolled back as a unit. For example, when a user uses
a computer to transfer money from one bank account to another, the request
involves a transaction: updating the values stored in the database for both accounts.
For the database changes to be made permanent, the transaction must be
completed in its entirety.
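
As a minimal sketch of this behaviour, the following Python snippet (using the
standard-library sqlite3 module; the accounts table and the amounts are
illustrative) performs such a transfer as a single unit of work, committing both
updates together or rolling both back:

    import sqlite3

    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE accounts (id INTEGER PRIMARY KEY, balance REAL)")
    conn.execute("INSERT INTO accounts VALUES (1, 500.0), (2, 100.0)")
    conn.commit()

    try:
        # Both updates form one unit of work: debit one account, credit the other.
        conn.execute("UPDATE accounts SET balance = balance - 50 WHERE id = 1")
        conn.execute("UPDATE accounts SET balance = balance + 50 WHERE id = 2")
        conn.commit()        # make both changes permanent together
    except sqlite3.Error:
        conn.rollback()      # undo both changes together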

Managing Commits in Transactions


Committing (and rolling back) transactions is slow because of the disk I/O and
potentially the number of network round trips required. The database must write to
disk every modification made by a transaction to the database. This is usually a
sequential write to a journal file (or log); nevertheless, it involves expensive disk I/O.

In most standards-based APIs, the default transaction commit mode is auto-commit.


In auto-commit mode, a commit is performed for every SQL statement that requires
a request to the database, such as Insert, Update, Delete, and Select statements.
When auto-commit mode is used, the application does not control when database
work is committed.
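
As one hedged illustration, Python's sqlite3 module expresses this choice through
the isolation_level argument: passing None puts the connection in auto-commit
mode, while the default leaves the application in control of when work is committed.
Other drivers expose the same idea differently (for example, JDBC's setAutoCommit):

    import sqlite3

    # Auto-commit mode: every statement is committed as soon as it executes.
    auto = sqlite3.connect(":memory:", isolation_level=None)
    auto.execute("CREATE TABLE t (x INTEGER)")
    auto.execute("INSERT INTO t VALUES (1)")   # already committed; no commit() call

    # Manual-commit mode (the default): the application controls the commit.
    manual = sqlite3.connect(":memory:")
    manual.execute("CREATE TABLE t (x INTEGER)")
    manual.execute("INSERT INTO t VALUES (1)")
    manual.commit()                            # work becomes permanent here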

Some database systems do not support auto-commit mode. For these databases,
the database driver, by default, sends a commit request to the database after every
successful operation (SQL statement). This request equates to a network round trip
between the driver and the database. The round trip to the database occurs even
though the application did not request the commit and even if the operation made
no changes to the database. For example, the driver makes a network round trip
even when a Select statement is executed.

In some applications, the unit of work is a series of Select statements, one executed
after the other, for example to return a column of balances. In most cases, for every
Select statement executed against the database, a lock is placed on rows to prevent
another user from updating that data. By holding locks on rows for longer than
necessary, active transactions can prevent other users from updating data, which
ultimately can reduce throughput and cause concurrency issues. In this case, one
may want to commit the Select statements in intervals (after every five Select
statements, for example) so that locks are released in a timely manner.
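
A hedged sketch of this interval-commit pattern, assuming an open connection conn,
an illustrative list account_ids, and a driver on which reads hold locks until commit:

    BATCH = 5
    balances = []
    for i, acct in enumerate(account_ids, start=1):
        row = conn.execute(
            "SELECT balance FROM accounts WHERE id = ?", (acct,)
        ).fetchone()
        balances.append(row[0])
        if i % BATCH == 0:
            conn.commit()    # end the transaction; release read locks so far
    conn.commit()            # release locks from any final partial batch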

Leaving transactions active consumes database memory. Committing a transaction
flushes the contents of the log and releases database memory. If an application
uses transactions that update large amounts of data (1,000 rows, for example)
without committing modifications, the application can consume a substantial
amount of database memory. In this case, you may want to commit after every
statement that updates a large amount of data.
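
A hedged sketch of the same idea for large updates, assuming conn as before and
an illustrative list pending_updates of (account id, amount) pairs; committing chunk
by chunk lets the database flush its log and release memory as it goes:

    CHUNK = 1000
    for start in range(0, len(pending_updates), CHUNK):
        for acct, delta in pending_updates[start:start + CHUNK]:
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE id = ?",
                (delta, acct),
            )
        conn.commit()    # flush the log and release memory for this chunk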

Isolation Levels
A transaction isolation level represents a particular locking strategy used in the
database system to improve data integrity. Most database systems support several
isolation levels, and the standards-based APIs provide ways for the user to set
isolation levels.
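
How a level is chosen depends on the API and the engine. As a hedged example,
many engines accept the standard SQL statement below, issued here through an
assumed open DB-API connection conn (the set of levels actually supported, and
where the statement may be issued, varies by product):

    # The four isolation levels defined by the SQL standard, weakest first:
    #   READ UNCOMMITTED, READ COMMITTED, REPEATABLE READ, SERIALIZABLE
    # Stricter levels mean more locking (or versioning) and less concurrency.
    cur = conn.cursor()
    cur.execute("SET TRANSACTION ISOLATION LEVEL SERIALIZABLE")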

Local Transactions versus Distributed Transactions


A local transaction is a transaction that accesses and updates data on only one
database. Local transactions are significantly faster than distributed transactions
because local transactions do not require communication between multiple
databases, which means less logging and fewer network round trips are required to
perform local transactions.

A distributed transaction is a transaction that accesses and updates data on
multiple networked databases or systems and must be coordinated among those
databases or systems. These databases may be of several types located on a single
server, such as Oracle, Microsoft SQL Server, and Sybase; or they may include
several instances of a single type of database residing on numerous servers.

Distributed transactions are substantially slower than local transactions because of
the logging and network round trips needed to communicate between all the
components involved in the distributed transaction.

The following occurs when the application requests that a distributed transaction be
committed:
1. The driver issues a commit request.
2. The transaction coordinator sends a pre-commit request to all databases
involved in the transaction.
(i) Each database executes the transaction up to the point where the database is
asked to commit, and each writes recovery information to its logs.
(ii) Each database replies with a status message to the transaction coordinator
indicating whether the transaction up to this point succeeded or failed.
3. The transaction coordinator waits until it has received a status message from
each database. If the transaction coordinator received a status message from all
databases indicating success, the following occurs:
(i) The transaction coordinator sends a commit message to all the databases.
(ii) Each database completes the commit operation and releases all the locks
and resources held during the transaction.
(iii) Each database replies with a status to the transaction coordinator
indicating whether the operation succeeded or failed.
4. The transaction coordinator completes the transaction when all
acknowledgments have been received and replies with a status to the driver
indicating whether the commit succeeded or failed.
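
These steps can be mirrored in a toy Python sketch of the coordinator's logic. This
is a model of the protocol only, not a real driver: each participant is assumed to
expose prepare(), commit(), and rollback() methods.

    def two_phase_commit(participants):
        """Toy model of the commit sequence above."""
        # Phase 1: pre-commit. Each database does the work, writes recovery
        # information to its log, and reports success or failure.
        ready = True
        for db in participants:
            try:
                ready = ready and db.prepare()   # True means "able to commit"
            except Exception:
                ready = False

        # Phase 2: commit everywhere only if every participant succeeded.
        if ready:
            for db in participants:
                db.commit()       # complete the commit; release locks and resources
            return "committed"
        for db in participants:
            db.rollback()         # any failure aborts the whole transaction
        return "rolled back"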
Database Transactions: the ACID Model
The database transaction ACID (Atomicity, Consistency, Isolation, Durability) rules,
or ACID model, refer to a set of properties that guarantee that database
transactions are processed reliably. It is one of the oldest and most important
concepts of database theory. The ACID model sets forward four goals that every
database management system must strive to achieve: atomicity, consistency,
isolation and durability. No database that fails to meet any of these four goals can
be considered reliable.

Atomicity
Atomicity states that database modifications must follow an all-or-nothing rule: it
refers to the ability of the database to guarantee that either all of the tasks of a
transaction are performed or none of them are. A transaction is said to be atomic
if, when one part of the transaction fails, the entire transaction fails.

It is critical that the database management system maintain the atomic nature of
transactions in spite of any DBMS, operating system or hardware failure.

Consistency
The consistency property ensures that the database is in a consistent state both
before the transaction starts and after the transaction is over (whether it succeeded
or failed). For example, in an online storefront, if inventory were allowed to fall
below zero, the database would give an inconsistent view of what is truly available
for purchase, making it impossible to offer more than an intent to complete a
purchase at checkout time.

Consistency states that only valid data will be written to the database. If, for some
reason, a transaction is executed that violates the database's consistency rules, the
entire transaction will be rolled back and the database will be restored to a state
consistent with those rules. On the other hand, if a transaction successfully
executes, it will take the database from one state that is consistent with the rules to
another state that is also consistent with the rules.
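
The storefront example can be made concrete with a CHECK constraint. In this
sqlite3 sketch (table and item names illustrative), a transaction that would drive
inventory below zero violates the rule, so it is rolled back and the database stays
in its previous consistent state:

    import sqlite3

    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE stock (item TEXT PRIMARY KEY, "
        "qty INTEGER CHECK (qty >= 0))"    # consistency rule: no negative inventory
    )
    conn.execute("INSERT INTO stock VALUES ('widget', 3)")
    conn.commit()

    try:
        # Attempt to sell 5 widgets when only 3 exist.
        conn.execute("UPDATE stock SET qty = qty - 5 WHERE item = 'widget'")
        conn.commit()
    except sqlite3.IntegrityError:
        conn.rollback()    # restored to a state consistent with the rules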

Isolation
Isolation refers to the requirement that other operations cannot access or see the
data in an intermediate state during a transaction. This constraint is required to
maintain performance as well as consistency between transactions in a
database. Thus, each transaction is unaware of other transactions executing
concurrently in the system.

Isolation requires that multiple transactions occurring at the same time do not
impact each other's execution. For example, if Joe issues a transaction against a
database at the same time that Mary issues a different transaction, both
transactions should operate on the database in an isolated manner. The database
should either perform Joe's entire transaction before executing Mary's, or
vice-versa. This prevents Joe's transaction from reading intermediate data produced
as a side effect of part of Mary's transaction that will not eventually be committed
to the database.
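
A small hedged demonstration with sqlite3 (an on-disk file is used because
in-memory databases are private to one connection): while Joe's update is
uncommitted, Mary's query still sees the original balance.

    import sqlite3, tempfile, os

    path = os.path.join(tempfile.mkdtemp(), "demo.db")   # shared on-disk database
    joe = sqlite3.connect(path)
    mary = sqlite3.connect(path)

    joe.execute("CREATE TABLE accounts (id INTEGER PRIMARY KEY, balance REAL)")
    joe.execute("INSERT INTO accounts VALUES (1, 100.0)")
    joe.commit()

    # Joe updates but has not yet committed: this is intermediate state.
    joe.execute("UPDATE accounts SET balance = 0 WHERE id = 1")

    # Mary is isolated from Joe's uncommitted work and sees the old value.
    print(mary.execute("SELECT balance FROM accounts WHERE id = 1").fetchone())
    # -> (100.0,)

    joe.commit()   # only now does Joe's change become visible to others
    print(mary.execute("SELECT balance FROM accounts WHERE id = 1").fetchone())
    # -> (0.0,)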

Durability

Durability refers to the guarantee that once the user has been notified of success,
the transaction will persist, and not be undone. This means it will survive system
failure, and that the database system has checked the integrity constraints and
won't need to abort the transaction. Many databases implement durability by
writing all transactions into a transaction log that can be played back to recreate
the system state right before a failure. A transaction can only be deemed
committed after it is safely in the log.

Durability ensures that any transaction committed to the database will not be lost.
Durability is ensured through the use of database backups and transaction logs that
facilitate the restoration of committed transactions in spite of any subsequent
software or hardware failures.

Durability does not imply a permanent state of the database. Another transaction
may overwrite any changes made by the current transaction without hindering
durability.

Concurrency Control
Concurrency control is the process of managing the simultaneous execution of
transactions in a shared database, so as to ensure the serialisability of transactions.
It ensures that database transactions produce correct results for concurrent
operations, without violating data integrity, and as quickly as possible.
Concurrent access to data is desirable when:
(i) The amount of data is sufficiently great that at any given time only a fraction of
the data can be held in primary memory, and the rest must be swapped in from
secondary memory as needed.
(ii) Even if the entire database can be held in primary memory, there may be
multiple processes accessing it.
Most high-performance transactional systems need to run transactions concurrently
to meet their performance requirements. Thus, without concurrency control, such
systems can neither provide correct results nor keep their databases consistent.
The main categories of concurrency control mechanisms are:
Optimistic - The optimistic approach to concurrency control assumes that conflicts
between transactions are rare. It delays checking whether a transaction meets the
isolation and other integrity rules until the transaction's end, without blocking any
of its (read, write) operations, and then aborts the transaction if committing it
would violate those rules. The method does not require locking: transactions
execute without restrictions, and the check for conflicts is done just before commit.
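
One common hedged realisation of the optimistic approach is a version column: a
transaction reads a row's version, works without taking locks, and at commit time
updates the row only if the version is unchanged; zero rows updated means another
transaction got there first. The accounts schema with a version column is
illustrative:

    def optimistic_update(conn, acct_id, delta):
        """Illustrative optimistic write: no locks are held while working."""
        balance, version = conn.execute(
            "SELECT balance, version FROM accounts WHERE id = ?", (acct_id,)
        ).fetchone()

        # ... arbitrary work happens here, without blocking other transactions ...

        cur = conn.execute(
            "UPDATE accounts SET balance = ?, version = version + 1 "
            "WHERE id = ? AND version = ?",   # the check, done just before commit
            (balance + delta, acct_id, version),
        )
        if cur.rowcount == 0:                 # someone else changed the row
            conn.rollback()
            raise RuntimeError("conflict detected; retry the transaction")
        conn.commit()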

Pessimistic - This method assumes that conflicts will happen, and involves
blocking an operation of a transaction if it may cause a violation of the rules, until
the possibility of violation disappears. Blocking operations typically comes with a
reduction in performance. Pessimistic concurrency control techniques detect
conflicts as soon as they occur and resolve them by blocking.
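
As one hedged example of this style, SQLite's BEGIN IMMEDIATE statement takes
the write lock up front, so a conflicting writer blocks (or times out) at the start
rather than at commit time; other systems express the same idea with constructs
such as SELECT ... FOR UPDATE. The file name is illustrative:

    import sqlite3

    conn = sqlite3.connect("demo.db", timeout=5.0)   # wait up to 5 s for locks
    try:
        conn.execute("BEGIN IMMEDIATE")              # take the write lock up front
        conn.execute("UPDATE accounts SET balance = balance - 50 WHERE id = 1")
        conn.commit()                                # lock released here
    except sqlite3.OperationalError:                 # lock not granted in time
        conn.rollback()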
Semi-optimistic - This method blocks operations in some situations, if they may
cause violation of some rules, and does not block in other situations, delaying rules
checking (if needed) to the transaction's end, as done with the optimistic approach.
When an object is locked by another transaction, the requesting transaction must
wait.
The disadvantages of locking include:
(i) Lock management overhead.
(ii) Deadlock detection/resolution.
(iii) Concurrency is significantly lowered when congested nodes are locked.
(iv) To allow a transaction to abort itself when mistakes occur, locks can't be
released until the end of the transaction, so concurrency is significantly lowered.
(v) When conflicts are rare, checking for locks at every access merely slows down
processing.
Methods of Concurrency Control
Locking (e.g., two-phase locking) - Controlling access to data by locks assigned
to the data. Access by a transaction to a data item (database object) locked by
another transaction may be blocked (depending on lock type and access operation
type) until the lock is released.
Serialisation graph checking - Checking for cycles in the schedule's graph and
breaking them by aborts.
Timestamp ordering - Assigning timestamps to transactions, and controlling or
checking access to data by timestamp order.
Commitment ordering (or commit ordering) - Controlling or checking
transactions' chronological order of commit events to be compatible with their
respective precedence order.
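
To make the first method concrete, below is a hedged toy sketch of strict two-phase
locking in Python. The LockManager is illustrative rather than taken from any real
DBMS: a transaction acquires locks only in its growing phase and releases them all
together at the end.

    import threading

    class LockManager:
        """Toy strict two-phase locking: locks are only acquired during a
        transaction's growing phase and are all released together at the end."""

        def __init__(self):
            self._locks = {}                  # data item -> threading.Lock
            self._guard = threading.Lock()    # protects the lock table itself

        def _lock_for(self, item):
            with self._guard:
                return self._locks.setdefault(item, threading.Lock())

        def acquire(self, txn_held, item):
            lock = self._lock_for(item)
            lock.acquire()                    # growing phase: block until granted
            txn_held.append(lock)

        def release_all(self, txn_held):
            for lock in txn_held:             # shrinking phase: only at commit/abort
                lock.release()
            txn_held.clear()

A transaction calls acquire for every item it reads or writes, does its work, and calls
release_all exactly once at commit or abort; because no lock is released before all
are acquired, the resulting schedules are serialisable, though deadlocks remain
possible (hence the detection/resolution overhead listed earlier).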
