
M.Tech - I Semester Regular/Supply Examination, February 2016


Subject Code: G0503/R13
DATABASE MANAGEMENT SYSTEMS
Detailed Answers
1. a) i)
These foreign key constraints are necessary:


Works: FOREIGN KEY (did) REFERENCES Dept(did)
Dept: FOREIGN KEY (managerid) REFERENCES Emp(eid)
When deleting a Dept tuple, we need to remove the respective Works
tuple(s).
This can be done with the ON DELETE CASCADE rule.
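A minimal DDL sketch of these constraints, assuming simplified Emp, Dept and Works schemas (column names other than eid, did and managerid are illustrative):

CREATE TABLE Emp (
  eid    INTEGER PRIMARY KEY,
  ename  VARCHAR(30),
  salary REAL
);
CREATE TABLE Dept (
  did       INTEGER PRIMARY KEY,
  dname     VARCHAR(30),
  managerid INTEGER NOT NULL,
  FOREIGN KEY (managerid) REFERENCES Emp(eid)
);
CREATE TABLE Works (
  eid INTEGER,
  did INTEGER,
  PRIMARY KEY (eid, did),
  FOREIGN KEY (did) REFERENCES Dept(did) ON DELETE CASCADE
  -- deleting a Dept row now removes the matching Works rows automatically
);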
ii)
managerid INT NOT NULL
iii)
UPDATE Emp E SET E.salary = E.salary * 1.10
b) The user of SQL has no idea how the data is physically represented in the machine; he or she relies entirely on the relation abstraction for querying, so physical data independence is assured. Since a user can define views, logical data independence can also be achieved by using view definitions to hide changes in the conceptual schema.
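For instance, a view can hide a change in the conceptual schema; the table and column names below are only illustrative:

-- Suppose the single Emp table is later split into Emp_personal and Emp_job.
-- Old queries keep working against a view with the original name and columns:
CREATE VIEW Emp AS
SELECT p.eid, p.ename, j.salary
FROM Emp_personal p, Emp_job j
WHERE p.eid = j.eid;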
2. a) Explanation with cardinality of the following operations:
Union
Intersection
Set Difference
Cross Product
Union: R ∪ S returns a relation instance containing all tuples that occur in either relation instance R or relation instance S (or both). R and S must be union-compatible, and the schema of the result is defined to be identical to the schema of R. The cardinality of the result is at most |R| + |S|.
Intersection: R ∩ S returns a relation instance containing all tuples that occur in both R and S. The relations R and S must be union-compatible, and the schema of the result is defined to be identical to the schema of R. The cardinality of the result is at most min(|R|, |S|).
Set-difference: R − S returns a relation instance containing all tuples that occur in R but not in S. The relations R and S must be union-compatible, and the schema of the result is defined to be identical to the schema of R. The cardinality of the result is at most |R|.
Cross-product: R × S returns a relation instance whose schema contains all the fields of R (in the same order as they appear in R) followed by all the fields of S (in the same order as they appear in S). The cardinality of the result is exactly |R| × |S|.
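The same operations can be written in SQL; a sketch, assuming two union-compatible tables R(a, b) and S(a, b):

SELECT a, b FROM R UNION SELECT a, b FROM S;      -- R ∪ S
SELECT a, b FROM R INTERSECT SELECT a, b FROM S;  -- R ∩ S
SELECT a, b FROM R EXCEPT SELECT a, b FROM S;     -- R − S (MINUS in Oracle)
SELECT * FROM R CROSS JOIN S;                     -- R × S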


b) Any three differences with examples and explanation.


Relational Algebra (RA) is a procedural language that tells the DBMS how to build a new relation from one or more relations in the database, whereas Relational Calculus (RC) is a non-procedural language that formulates the definition of a relation in terms of one or more database relations.
1. Relational algebra operations manipulate relations and build up expressions that act as queries, whereas relational calculus forms queries as logical formulas over tuple or domain variables.
2. RA has operators such as join, union, intersection, division, difference, projection and selection, whereas RC has tuple-oriented and domain-oriented expressions.
3. The expressive power of RA and (safe) RC is equivalent: any query that can be expressed in RA can be expressed by a formula in RC, and vice versa.
4. Relational algebra is easier to manipulate and understand than RC.
5. RA queries describe how the result is to be computed, so they map directly onto an evaluation plan, whereas RC queries only describe what the result must contain.
6. RC queries are written as well-formed formulas (WFFs) of predicate logic, whereas RA queries are compositions of operators, not formulas.
7. Evaluation of an RA query depends on the order of its operations, whereas an RC query does not specify any order of operations.
8. RA is domain independent, whereas an unrestricted RC formula can be domain dependent; only safe RC expressions are guaranteed to be domain independent.
9. RA specifies operations performed on existing relations to obtain new relations; in RC, queries on the relations are expressed in the form of formulas.
3. a) Explanation with examples.


A nested query (subquery) is a query within another SQL query, embedded within the WHERE clause.
A nested query is used to return data that will be used in the main query as a condition to further restrict the data to be retrieved.
Nested queries can be used with the SELECT, INSERT, UPDATE, and DELETE statements, along with operators such as =, <, >, >=, <=, IN, BETWEEN, etc.
There are a few rules that subqueries must follow:
A subquery can have only one column in the SELECT clause, unless multiple columns are compared in the main query against the columns selected by the subquery.
An ORDER BY cannot be used in a subquery, although the main query can use an ORDER BY. A GROUP BY can be used to perform the same function as the ORDER BY in a subquery.
Subqueries that return more than one row can only be used with multiple-value operators, such as the IN operator.
A subquery cannot be immediately enclosed in a set function.
The BETWEEN operator cannot be used with a subquery; however, the BETWEEN operator can be used within the subquery.
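A small illustrative example, assuming Emp(eid, ename, salary, did) and Dept(did, dname) tables:

-- Names of employees who work in the 'Research' department
SELECT E.ename
FROM Emp E
WHERE E.did IN (SELECT D.did
                FROM Dept D
                WHERE D.dname = 'Research');

-- Employees earning more than the average salary (single-value subquery)
SELECT E.ename
FROM Emp E
WHERE E.salary > (SELECT AVG(salary) FROM Emp);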

b) Explanation with importance of Trigger, Syntax and Example


A trigger is a statement that is executed automatically by the system as a side effect of a modification to the database. Triggers are fired implicitly; they are not called explicitly by the user the way procedures and functions are. To design a trigger mechanism, we must specify two things:
Specify the conditions under which the trigger is to be executed
Specify the actions to be taken when the trigger executes
Use of Database Triggers
To restrict access to a table to regular business hours or to predetermined weekdays
To keep track of modifications of data along with the user name, the operation performed and the time when the operation was performed
To prevent invalid transactions
To enforce complex security authorizations
Types of Triggers
Row Triggers: A row trigger is fired each time a row in the table is affected by the triggering statement. If the triggering statement affects no rows, the trigger is not executed at all.
Statement Triggers: A statement trigger is fired once on behalf of the triggering statement, independent of the number of rows affected by the triggering statement.
Creating a Trigger

Syntax:
CREATE OR REPLACE TRIGGER [schema.]<trigger_name>
{ BEFORE | AFTER }
{ DELETE | INSERT | UPDATE [ OF column1, ... ] }
ON [schema.]<table_name>
[ REFERENCING { OLD AS old, NEW AS new } ]
[ FOR EACH ROW [ WHEN condition ] ]
DECLARE
<variable declarations>;
<constant declarations>;
BEGIN
<PL/SQL sub-program body>;
EXCEPTION
<exception PL/SQL block>;
END;
Example:
CREATE OR REPLACE TRIGGER t_audit_trail
BEFORE DELETE OR UPDATE ON customer  -- the table name is missing in the original; "customer" is assumed from the :OLD columns
FOR EACH ROW
DECLARE
  oper VARCHAR2(8);
BEGIN
  IF UPDATING THEN
    oper := 'UPDATE';
  END IF;
  IF DELETING THEN
    oper := 'DELETE';
  END IF;
  INSERT INTO audit_cust
  VALUES (:OLD.custno, :OLD.fname, :OLD.lname, :OLD.address, oper, USER, SYSDATE);
END;
Normalization of Database
Normalization is a systematic approach of decomposing tables to eliminate data redundancy and undesirable characteristics like Insertion, Update and Deletion Anomalies. It is a multi-step process that puts data into tabular form by removing duplicated data from the relation tables.
Normalization is used mainly for two purposes:
Eliminating redundant (useless) data.
Ensuring data dependencies make sense, i.e. data is logically stored.
Problems Without Normalization
Without normalization, it becomes difficult to handle and update the database without facing data loss. Insertion, Update and Deletion Anomalies are very frequent if the database is not normalized. To understand these anomalies, let us take an example of a Student table.
S_id  S_Name  S_Address  Subject_opted
401   Adam    Noida      Bio
402   Alex    Panipat    Maths
403   Stuart  Jammu      Maths
404   Adam    Noida      Physics

Update Anomaly: To update the address of a student who occurs twice or more in the table, we will have to update the S_Address column in all those rows, else the data will become inconsistent.
Insertion Anomaly: Suppose for a new admission we have the Student id (S_id), name and address of a student, but if the student has not opted for any subjects yet then we have to insert NULL there, leading to an Insertion Anomaly.
Deletion Anomaly: If (S_id) 401 has only one subject and temporarily drops it, when we delete that row the entire student record will be deleted along with it.

Normalization Rules
Normalization rules are divided into the following normal forms:
1. First Normal Form
2. Second Normal Form
3. Third Normal Form
4. BCNF
First Normal Form (1NF)


A row of data cannot contain a repeating group of data, i.e. each column must hold a single atomic value. Each row of data must have a unique identifier, i.e. a primary key. For example, consider a table which is not in First Normal Form:
Student Table:
S_id  S_Name  subject
401   Adam    Biology
401   Adam    Physics
402   Alex    Maths
403   Stuart  Maths
You can clearly see here that the student name Adam is repeated in the table and the subject Maths is also repeated. This violates First Normal Form. To reduce the above table to First Normal Form, break it into two different tables:
New Student Table:
S_id  S_Name
401   Adam
402   Alex
403   Stuart
Subject Table:
subject_id  student_id  subject
10          401         Biology
11          401         Physics
12          402         Math
12          403         Math
In the Subject table, the concatenation of subject_id and student_id is the primary key. Now both the Student table and the Subject table are normalized to First Normal Form.
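A possible DDL sketch of the two normalized tables (the data types and the foreign key are assumptions, not part of the question):

CREATE TABLE Student (
  S_id   INTEGER PRIMARY KEY,
  S_Name VARCHAR(30)
);
CREATE TABLE Subject (
  subject_id INTEGER,
  student_id INTEGER REFERENCES Student(S_id),
  subject    VARCHAR(30),
  PRIMARY KEY (subject_id, student_id)
);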
Second Normal Form (2NF)
A table to be normalized to Second Normal Form should meet all the requirements of First Normal Form, and there must not be any partial dependency of any column on the primary key. This means that for a table with a concatenated primary key, each column that is not part of the primary key must depend upon the entire concatenated key for its existence. If any column depends only on one part of the concatenated key, then the table fails Second Normal Form. For example, consider a table which is not in Second Normal Form.
Customer Table:
customer_id  Customer_Name  Order_id  Order_name  Sale_detail
101          Adam           10        order1      sale1
101          Adam           11        order2      sale2
102          Alex           12        order3      sale3
103          Stuart         13        order4      sale4
In the Customer table, the concatenation of customer_id and Order_id is the primary key. This table is in First Normal Form but not in Second Normal Form because there are partial dependencies of columns on the primary key: Customer_Name is dependent only on customer_id, Order_name is dependent only on Order_id, and there is no link between Sale_detail and Customer_Name. To reduce the Customer table to Second Normal Form, break it into the following three tables.
Customer_Detail Table:
customer_id  Customer_Name
101          Adam
102          Alex
103          Stuart
Order_Detail Table:
Order_id  Order_Name
10        Order1
11        Order2
12        Order3
13        Order4
Sale_Detail Table:
customer_id  Order_id  Sale_detail
101          10        sale1
101          11        sale2
102          12        sale3
103          13        sale4
Now all three tables comply with Second Normal Form.
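A possible DDL sketch of this decomposition (the data types and foreign keys are assumptions):

CREATE TABLE Customer_Detail (
  customer_id   INTEGER PRIMARY KEY,
  Customer_Name VARCHAR(30)
);
CREATE TABLE Order_Detail (
  Order_id   INTEGER PRIMARY KEY,
  Order_Name VARCHAR(30)
);
CREATE TABLE Sale_Detail (
  customer_id INTEGER REFERENCES Customer_Detail(customer_id),
  Order_id    INTEGER REFERENCES Order_Detail(Order_id),
  Sale_detail VARCHAR(30),
  PRIMARY KEY (customer_id, Order_id)  -- every non-key column now depends on the whole key
);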
Third Normal Form (3NF)
Third Normal Form requires that every non-prime attribute of the table be dependent on the primary key and on nothing but the key; any transitive functional dependency should be removed from the table. The table must also be in Second Normal Form. For example, consider a table with the following fields.
Student_Detail Table:
Student_id  Student_name  DOB  Street  City  State  Zip
In this table Student_id is the primary key, but Street, City and State depend upon Zip. The dependency between Zip and the other fields is called a transitive dependency. Hence, to apply 3NF, we need to move Street, City and State to a new table, with Zip as its primary key.
New Student_Detail Table:
Student_id  Student_name  DOB  Zip

Address Table:
Zip  Street  City  State
The advantages of removing transitive dependency are:
The amount of data duplication is reduced.
Data integrity is achieved.
Boyce and Codd Normal Form (BCNF)
Boyce and Codd Normal Form is a stricter version of Third Normal Form. This form deals with a certain type of anomaly that is not handled by 3NF. A 3NF table which does not have multiple overlapping candidate keys is guaranteed to be in BCNF.
For a table R to be in BCNF, the following conditions must be satisfied:
R must be in Third Normal Form
and, for each non-trivial functional dependency (X -> Y), X should be a super key.
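As an illustration (a textbook-style example, not part of the question): in Enrol(student, course, instructor) with dependencies {student, course} -> instructor and instructor -> course, the table is in 3NF but not in BCNF, because instructor is not a super key. A BCNF decomposition sketch:

CREATE TABLE Instructor_Course (
  instructor VARCHAR(30) PRIMARY KEY,  -- instructor -> course: the determinant is now a key
  course     VARCHAR(30)
);
CREATE TABLE Student_Instructor (
  student    VARCHAR(30),
  instructor VARCHAR(30) REFERENCES Instructor_Course(instructor),
  PRIMARY KEY (student, instructor)
);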

Concurrent Execution of Transactions


The DBMS interleaves the actions of different transactions to improve performance, in terms of increased throughput or improved response times for short transactions, but not all interleavings should be allowed.


The schedule shown in Fig represents an interleaved execution of the two


transactions. Ensuring transaction isolation while permitting such concurrent
execution is difficult, but is necessary for performance reasons. First, while one
transaction is waiting for a page to be read in from disk, the CPU can process another
transaction. This is because I/O activity can be done in parallel with CPU activity in a
computer. Overlapping I/O and CPU activity reduces the amount of time disks and
processors are idle, and increases system throughput (the average number of
transactions completed in a given time). Second, interleaved execution of a short
transaction with a long transaction usually allows the short transaction to complete
quickly. In serial execution, a short transaction could get stuck behind a long
transaction leading to unpredictable delays in response time, or average time taken to
complete a transaction.
Serializability
A serializable schedule over a set S of committed transactions is a schedule
whose effect on any consistent database instance is guaranteed to be identical to that
of some complete serial schedule over S. That is, the database instance that results
from executing the given schedule is identical to the database instance that results
from executing the transactions in some serial order.
Executing the transactions serially in different orders may produce different
results, but all are presumed to be acceptable; the DBMS makes no guarantees about
which of them will be the outcome of an interleaved execution.
Reading Uncommitted Data (WR Conflicts)
The first source of anomalies is that a transaction T2 could read a database object A
that has been modified by another transaction T1, which has not yet committed. Such
a read is called a dirty read.

A simple example illustrates how such a schedule could lead to an inconsistent


database state. Consider two transactions T1 and T2, each of which, run alone,
preserves database consistency: T1 transfers $100 from A to B, and T2 increments
both A and B by 6 percent (e.g., annual interest is deposited into these two accounts).
Suppose that their actions are interleaved so that (1) the account transfer program T1
deducts $100 from account A, then (2) the interest deposit program T2 reads the
current values of accounts A and B and adds 6 percent interest to each, and then (3) the
account transfer program credits $100 to account B.
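A rough SQL sketch of that interleaving, assuming a hypothetical Accounts(acctno, balance) table; a real DBMS running at a serializable isolation level would block or reorder these steps, which is exactly what concurrency control is for:

-- Step 1 (T1, account transfer): debit account A
BEGIN;  -- T1
UPDATE Accounts SET balance = balance - 100 WHERE acctno = 'A';

-- Step 2 (T2, interest deposit, in a second session): add 6% to A and B, then commit
BEGIN;  -- T2
UPDATE Accounts SET balance = balance * 1.06 WHERE acctno IN ('A', 'B');
COMMIT; -- T2

-- Step 3 (T1): credit account B and commit.
-- Interest on A was computed after the debit but interest on B before the credit,
-- so the total interest paid differs from any serial order of T1 and T2.
UPDATE Accounts SET balance = balance + 100 WHERE acctno = 'B';
COMMIT; -- T1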
Unrepeatable Reads (RW Conflicts)
The second way in which anomalous behavior could result is that a transaction
T2 could change the value of an object A that has been read by a transaction T1, while
T1 is still in progress. This situation causes two problems. First, if T1 tries to read the
value of A again, it will get a different result, even though it has not modified A in the
meantime. This situation could not arise in a serial execution of two transactions; it is
called an unrepeatable read.
Overwriting Uncommitted Data (WW Conflicts)
The third source of anomalous behavior is that a transaction T2 could
overwrite the value of an object A, which has already been modified by a transaction
T1, while T1 is still in progress. Even if T2 does not read the value of A written by T1,
a potential problem exists.

When the system is restarted after a crash, the recovery manager proceeds in three
phases,


1. Analysis Phase
2. Redo Phase
3. Undo Phase
The Analysis phase has three main objectives.

It determines the point in the log at which to start the Redo pass.
It determines (a conservative superset of the) pages in the buffer pool that were
dirty at the time of the crash.
It identifies transactions that were active at the time of the crash and must be undone.

The Analysis phase begins by examining the most recent begin checkpoint record,
whose LSN is denoted as C in Fig 1, and proceeds forward in the log until the last log
record.

The Redo phase follows Analysis and redoes all changes to any page that might have
been dirty at the time of the crash; this set of pages and the starting point for Redo (the
smallest recLSN of any dirty page) are determined during Analysis. The Undo phase
follows Redo and undoes the changes of all transactions that were active at the time of
the crash; again, this set of transactions is identified during the Analysis phase.
Redo Phase
During the Redo phase, ARIES reapplies the updates of all transactions, committed or
otherwise. Further, if a transaction was aborted before the crash and its updates were
undone, as indicated by CLRs, the actions described in the CLRs are also reapplied.
This repeating history paradigm distinguishes ARIES from other proposed WAL-based recovery algorithms and causes the database to be brought to the same state that it was in at the time of the crash.
The Redo phase begins with the log record that has the smallest recLSN of all pages in

the dirty page table constructed by the Analysis pass because this log record identifies
the oldest update that may not have been written to disk prior to the crash. Starting
from this log record, Redo scans forward until the end of the log. For each redoable
log record (update or CLR) encountered, Redo checks whether the logged action must
be redone.
Undo Phase
The Undo phase, unlike the other two phases, scans backward from the end of the log.
The goal of this phase is to undo the actions of all transactions that were active at the
time of the crash, that is, to effectively abort them. This set of transactions is identified
in the transaction table constructed by the Analysis phase.
The Undo Algorithm
Undo begins with the transaction table constructed by the Analysis phase,
which identifies all transactions that were active at the time of the crash, and includes
the LSN of the most recent log record (the lastLSN field) for each such transaction.
Such transactions are called loser transactions. All actions of losers must be undone,
and further, these actions must be undone in the reverse of the order in which they
appear in the log.
7. Pseudo code for the deleting operation, with example and tracing.
(Algorithm 6M + Tracing with explanation 6M)
Deleting a Data Entry from a B+ Tree
1. Start at the root and find the leaf L where the entry belongs.
2. Remove the entry.
If L is at least half-full, done.
If L has only d-1 entries:
a) Try to re-distribute, borrowing from a sibling (an adjacent node with the same parent as L).
b) If re-distribution fails, merge L and the sibling.
3. If a merge occurred, the entry (pointing to L or the sibling) must be deleted from the parent of L.
4. Merging could propagate up to the root, decreasing the height of the tree.
Example:
Fig: initial B+ tree
Fig: B+ tree after deleting 19*
Fig: B+ tree after deleting 20*
Deleting 19* is easy.
Deleting 20* is done with re-distribution. Notice how the middle key is copied up.

Importance of Active Databases


ACTIVE DATABASES
A trigger is a procedure that is automatically invoked by the DBMS in
response to specific changes to the database, and is typically specified by the DBA. A
database that has a set of associated triggers is called an active database. An active
database is a database that includes an event-driven architecture (often in the form of
ECA rules) that can respond to conditions both inside and outside the database.

Fig 1: ADBMS Architecture

An ADBMS provides regular DBMS primitives, the definition of application-defined situations, and the triggering of application-defined reactions, as shown in the figure.
Active databases in Postgres
Rules
o allow the definition of extra or alternate actions on updates
Triggers
o allow the association of user-supplied procedures (functions) with database events
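A small PostgreSQL-flavoured sketch of the trigger route (the emp and emp_audit tables and the function name are assumptions for illustration):

-- Assumes emp(eid, salary) and emp_audit(eid, old_salary, new_salary, changed_at) exist
CREATE OR REPLACE FUNCTION log_salary_change() RETURNS trigger AS $$
BEGIN
  INSERT INTO emp_audit(eid, old_salary, new_salary, changed_at)
  VALUES (OLD.eid, OLD.salary, NEW.salary, now());
  RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER emp_salary_audit
AFTER UPDATE OF salary ON emp
FOR EACH ROW
EXECUTE FUNCTION log_salary_change();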
Importance of Hash Based Indexing and explaining any one of the three
techniques.
The basic idea of hash-based indexing is to use a hashing function, which maps values in a search field into a range of bucket numbers, to find the page on which a desired data entry belongs.
The Static Hashing scheme, like ISAM, suffers from the problem of long overflow chains, which can affect performance. Two solutions to the problem are Extendible Hashing and Linear Hashing.
The Extendible Hashing scheme uses a directory to support inserts and deletes
efficiently without any overflow pages. The Linear Hashing scheme uses a clever

policy for creating new buckets and supports inserts and deletes efficiently without the
use of a directory.

Fig: Static Hashing


The Static Hashing scheme is illustrated in the figure above. The pages containing the data can be viewed as a collection of buckets, with one primary page and possibly additional overflow pages per bucket. A file consists of buckets 0 through N-1, with one primary page per bucket initially.
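In systems that expose hash indexes directly (PostgreSQL, for example), a hash-based index on the search field can be requested explicitly; the emp table and eid column here are illustrative:

CREATE INDEX emp_eid_hidx ON emp USING HASH (eid);
-- An equality search such as the following can then hash eid to a bucket
-- instead of scanning the whole file:
SELECT * FROM emp WHERE eid = 1234;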
Explaining ISAM Index Structure with an example

Fig 1: Index Page

Fig 2: One Level Index Structure


Example of ISAM Tree

ISAM stands for Indexed Sequential Access Method, a method of indexing data for fast retrieval. Fig 1 shows an index page and Fig 2 shows a one-level index structure. Indexing permits access to selected records without searching the entire file.
Advantages:
Permits efficient and economical use of sequential processing techniques when the activity rate is high.
Permits quick access to records in a relatively efficient way when this activity is only a fraction of the workload.
Disadvantages:
Retrieval is slow when compared to other methods.
Does not use the storage space efficiently.
The hardware and software used are relatively expensive.
Prepared by
Dr P.Kiran Sree, Professor, Dept of CSE, SVECW, BVRM
