
DBMS

A database is a collection of related information stored so that it is available to many users for
different purposes. The management of data in a database system is done by means of a
general-purpose software package called a database management system (DBMS).
Some DBMS examples include MySQL, PostgreSQL, Microsoft Access, SQL Server and Oracle.

What is RDBMS?
RDBMS stands for Relational Database Management System. RDBMS is the basis for SQL,
and for all modern database systems like MS SQL Server, Oracle, MySQL, and Microsoft
Access.
A Relational database management system (RDBMS) is a database management system
(DBMS) that is based on the relational model as introduced by Edgar F. Codd.
What is table or relation?
The data in an RDBMS is stored in database objects called tables. A table is a collection of
related data entries and it consists of columns and rows.
The following is an example of a CUSTOMERS table:

ID   NAME     AGE   ADDRESS     SALARY
1    Ramesh   32    Ahmedabad   2000
     Anu      20    Delhi       1000
     Ram      23    Delhi       10020

What is field or attribute or domain?


Every table is broken up into smaller entities called fields. The fields in the CUSTOMERS
table consist of ID, NAME, AGE, ADDRESS and SALARY.
A field is a column in a table that is designed to maintain specific information about every
record in the table.
What is record or Tuple?
A record, also called a row of data, is each individual entry that exists in a table. For example,
there are 3 records in the above CUSTOMERS table. Following is a single row of data or
record in the CUSTOMERS table:

1    Ramesh   32    Ahmedabad   2000

A record is a horizontal entity in a table.


What is column?
A column is a vertical entity in a table that contains all information associated with a specific
field in a table.
For example, a column in the CUSTOMERS table is ADDRESS, which represents location
description and would consist of the following:
ADDRESS
Ahmedabad
Delhi
Delhi
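
The table, record and column terms above can be tried out with Python's built-in sqlite3 module. This is only a minimal sketch, assuming an in-memory SQLite database; the IDs 2 and 3 given to the second and third rows are assumed values for illustration.

```python
import sqlite3

# In-memory database; nothing is written to disk.
conn = sqlite3.connect(":memory:")

# A table (relation) whose fields (attributes) are the five columns below.
conn.execute(
    "CREATE TABLE customers ("
    " id INTEGER PRIMARY KEY, name TEXT, age INTEGER,"
    " address TEXT, salary INTEGER)"
)

# Each tuple becomes one record (row). IDs 2 and 3 are assumed values.
conn.executemany(
    "INSERT INTO customers VALUES (?, ?, ?, ?, ?)",
    [
        (1, "Ramesh", 32, "Ahmedabad", 2000),
        (2, "Anu", 20, "Delhi", 1000),
        (3, "Ram", 23, "Delhi", 10020),
    ],
)

# A record is a horizontal slice: all fields of one entry.
record = conn.execute("SELECT * FROM customers WHERE id = 1").fetchone()

# A column is a vertical slice: one field across every record.
address_column = [
    row[0] for row in conn.execute("SELECT address FROM customers ORDER BY id")
]

print(record)
print(address_column)
```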

Difference between DBMS and RDBMS


The key difference is that RDBMS (relational database management system) applications
store data in a tabular form, while DBMS applications store data as files. Does that mean
there are no tables in a DBMS? There can be, but there will be no relation between the
tables, as there is in an RDBMS. In a DBMS, data is generally stored in either a hierarchical form or
a navigational form. This means that a single data unit will have one parent node and zero,
one or more child nodes. It may even be stored in a graph form, as in
the network model.
In an RDBMS, the tables have an identifier called the primary key. Data values are
stored in the form of tables, and the relationships between these data values are stored in
the form of tables as well. Every value stored in a relational database is accessible and
can be updated by the system.

DBMS Users
There are mainly four types of users in a DBMS:
1. Database administrators
2. Database designers
3. End users
4. Application programmers

Database Administrator (DBA)


The Database Administrator (DBA) is a person or group of persons responsible for overall
control of the database system. In fact, all the activities in a database system are controlled by
the DBA.
Hence, Database Administrators (DBAs) are responsible for:
1. Authorizing access to the database.
2. Coordinating and monitoring its use.
3. Acquiring software and hardware resources as needed.
4. Monitoring performance and responding to changing requirements.

Database Designers
The database designers' task is undertaken before the database is actually implemented. Hence,
database designers are responsible for identifying the data to be stored in the database, and
choosing appropriate structures to represent and store this data.
End Users
End users are the users who use the applications developed. End users need not know about
the internal working, the database design, the access mechanism, etc. They just use the system to get their
task done.
Application Programmers
These users write application programs to interact with the database. Application programs
can be written in programming languages like C, C++, Java, etc. Such programs access
the database by issuing the appropriate request, typically an SQL statement, to the DBMS.

Three Levels of Data Abstraction Or Various views of data


Data abstraction is the process of representing the essential features without
including implementation details. Since many database-system users are not computer trained,
developers hide the complexity from users through several levels of abstraction, to
simplify users' interactions with the system.
1. Physical level or Physical schema
The lowest level of abstraction describes how the data are
actually stored. The physical level describes complex low-level data
structures in detail.
2. Logical level or Conceptual Schema
The next-higher level of abstraction describes what data are
stored in the database, and what relationships exist among those data. The
logical level thus describes the entire database in terms of a small number of
relatively simple structures.
3. View level or External Schema
The highest level of abstraction describes only part of the entire database. Even at the
logical level, complexity remains because of the variety of information stored in a large
database. Many users of the database
system do not need all this information; instead, they need to access only a part of
the database. The view level of abstraction exists to simplify their interaction with
the system.
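
The view level can be illustrated with an SQL view. The following is a small sketch using Python's sqlite3 module; the view name customer_contacts and the choice to hide the salary column are assumptions made for the example, not part of the notes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Logical level: the full table with every attribute.
conn.execute(
    "CREATE TABLE customers ("
    " id INTEGER PRIMARY KEY, name TEXT, address TEXT, salary INTEGER)"
)
conn.execute("INSERT INTO customers VALUES (1, 'Ramesh', 'Ahmedabad', 2000)")

# View level: a hypothetical view that exposes only part of the table.
conn.execute(
    "CREATE VIEW customer_contacts AS SELECT id, name, address FROM customers"
)

cursor = conn.execute("SELECT * FROM customer_contacts")
view_columns = [column[0] for column in cursor.description]
view_rows = cursor.fetchall()

print(view_columns)
print(view_rows)
```

A user who only queries customer_contacts never sees the salary column, although it is still stored in the underlying table.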

Data Independence
Data independence means that 'the application is independent of the storage structure and
access strategy of data'. In other words, it is the ability to modify the schema definition at one
level without affecting the schema definition at the next higher level.
There are two types of data independence:
1. Physical Data Independence
Physical data independence is the ability to modify the internal schema without having
to alter the conceptual schema or application programs. Alterations to the
internal schema might include:
Using new storage devices.
Using different file organizations or storage structures.
2. Logical Data Independence
Logical data independence is the ability to modify the conceptual schema without
having to alter external schemas or application programs. Alterations to the
conceptual schema may include the addition or deletion of entities, attributes or
relationships, and should be possible without having to alter existing external
schemas.
NOTE: Logical data independence is more difficult to achieve than physical data independence.

Data Models
A data model describes how the logical structure of a database is modeled. Data models are
fundamental to introducing abstraction in a DBMS. They define how data items are
connected to each other and how they are processed and stored inside the system.
Data models are of the following types:
1. Hierarchical Model
2. Network Model
3. Relational Model
Hierarchical Model
The hierarchical model uses the tree as its basic structure. A tree is a data structure that
consists of a hierarchy of nodes, with a single node, called the root, at the highest level. A node
can have any number of children, but each child may have only one parent node, on which it is
dependent. The parent-to-child relationship in a tree is thus one-to-many, while the
child-to-parent relationship is one-to-one.

Network Model
The network model was developed as an alternative to the hierarchical model. In the
network model, entities are organized in a graph in which some entities can be accessed
through several paths.

Relational Model
In the relational model of a database, all data is represented in terms of tuples, grouped
into relations. A database organized in terms of the relational model is a relational database.

Database Languages
A DBMS must provide appropriate languages and interfaces for each category of users to
express database queries and updates. Database languages are used to create and maintain
a database on a computer. Database languages can be categorized as follows:
1. Data Definition Language (DDL)
2. Data Manipulation Language (DML)
3. Data Control Language (DCL)
Data Definition Language
It is a language that allows users to define the database structure.
Some examples are:
CREATE - creates objects in the database
ALTER - alters the structure of the database
DROP - deletes objects from the database
TRUNCATE - removes all records from a table and also releases the space
allocated for the records
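
The DDL statements listed above can be sketched with Python's sqlite3 module. This is an illustration under the assumption of an in-memory SQLite database with a made-up customers table; SQLite has no TRUNCATE statement, so that command is left out here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# CREATE: define a new table.
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")

# ALTER: change the structure by adding a column.
conn.execute("ALTER TABLE customers ADD COLUMN age INTEGER")
# PRAGMA table_info returns one row per column; index 1 is the column name.
columns = [row[1] for row in conn.execute("PRAGMA table_info(customers)")]

# DROP: remove the table together with its definition.
conn.execute("DROP TABLE customers")
remaining_tables = conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table'"
).fetchall()

print(columns)
print(remaining_tables)
```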
Data Manipulation Language
It is a language that provides a set of operations to support the basic data manipulation
operations on the data in the database. It allows users to insert, update, delete and retrieve
data from the database.
Some examples are:
SELECT - retrieves data from a database
INSERT - inserts data into a table
UPDATE - updates existing data within a table
DELETE - deletes records from a table (all of them if no condition is given); the space for the records
remains allocated
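
The four DML statements can be seen working together in the following sketch, which uses Python's sqlite3 module. The table layout, the IDs and the new salary value are assumptions chosen for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT, salary INTEGER)"
)

# INSERT: add new rows.
conn.executemany(
    "INSERT INTO customers VALUES (?, ?, ?)",
    [(1, "Ramesh", 2000), (2, "Anu", 1000)],
)

# UPDATE: change existing data in the rows that match the condition.
conn.execute("UPDATE customers SET salary = 2500 WHERE id = 1")

# DELETE: remove only the rows that match the condition.
conn.execute("DELETE FROM customers WHERE id = 2")

# SELECT: retrieve what is left.
rows = conn.execute("SELECT id, name, salary FROM customers").fetchall()
print(rows)
```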
There are two types of DML:
1. Procedural DML
It requires the user to specify what data are needed and how to get those data.
e.g., PL/SQL
2. Declarative (Non-Procedural) DML
It requires the user to specify what data are needed without specifying how to get those
data.
e.g., SQL
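
The difference between the two styles can be sketched as follows. The SQL query is declarative; the Python loop is only a stand-in for a procedural language such as PL/SQL, used here as an analogy. The table contents are sample values.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (name TEXT, address TEXT)")
conn.executemany(
    "INSERT INTO customers VALUES (?, ?)",
    [("Ramesh", "Ahmedabad"), ("Anu", "Delhi"), ("Ram", "Delhi")],
)

# Declarative: state WHAT is wanted; the DBMS decides how to find it.
declarative = [
    row[0]
    for row in conn.execute(
        "SELECT name FROM customers WHERE address = 'Delhi' ORDER BY name"
    )
]

# Procedural style: spell out HOW, by scanning every row and testing it.
procedural = []
for name, address in conn.execute("SELECT name, address FROM customers"):
    if address == "Delhi":
        procedural.append(name)
procedural.sort()

print(declarative, procedural)
```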
Data Control Language
DCL statements control access to data and the database using statements such as GRANT and
REVOKE. A privilege can be granted to a user with the GRANT statement.
The privileges assigned can be SELECT, ALTER, DELETE, EXECUTE, INSERT, INDEX,
etc. In addition to granting privileges, you can also revoke them (take them back) by using the
REVOKE command.
Some examples are:
GRANT - gives users access privileges to the database
REVOKE - withdraws access privileges given with the GRANT command
COMMIT - saves the work done
ROLLBACK - restores the database to its state at the last COMMIT
(COMMIT and ROLLBACK control transactions rather than access, so they are often classified separately as Transaction Control Language, TCL.)
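
COMMIT and ROLLBACK can be tried out with Python's sqlite3 module, as in the sketch below. SQLite has no user accounts and therefore no GRANT or REVOKE, so only the two transaction commands are shown; the accounts table and its values are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (id INTEGER PRIMARY KEY, balance INTEGER)")
conn.execute("INSERT INTO accounts VALUES (1, 100)")
conn.commit()  # COMMIT: the inserted row is now saved

# This change belongs to a new transaction that is still open.
conn.execute("UPDATE accounts SET balance = 0 WHERE id = 1")
conn.rollback()  # ROLLBACK: undo everything since the last COMMIT

balance = conn.execute(
    "SELECT balance FROM accounts WHERE id = 1"
).fetchone()[0]
print(balance)
```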

Keys in DBMS
A key is a single field or a combination of multiple fields in a table. It is used to fetch or retrieve
records/data rows from a table according to a condition/requirement. Keys are also used
to create relationships among different database tables or views. The following are the various
types of keys available in a DBMS:
Super Key
A super key is a set of one or more fields that can be used to identify a record uniquely
in a table. Example: primary keys, unique keys and alternate keys are all super keys.
Candidate Key
A candidate key is a minimal set of one or more fields/columns that can identify a record uniquely in
a table. There can be multiple candidate keys in one table. Each candidate key can work as
the primary key. Example: in the diagram below, ID, RollNo and EnrollNo are candidate keys, since
each of these three fields can work as the primary key.
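
The idea of super keys and candidate keys can be sketched in Python. The student rows below are hypothetical values made up for illustration, and the helper names is_super_key and candidate_keys are not from any library. Note that uniqueness in sample data only suggests a key; in practice keys are declared from the meaning of the data.

```python
from itertools import combinations

# Hypothetical student rows; the values are made up for illustration.
students = [
    {"ID": 1, "RollNo": 101, "EnrollNo": "E-01", "Name": "Ram"},
    {"ID": 2, "RollNo": 102, "EnrollNo": "E-02", "Name": "Anu"},
    {"ID": 3, "RollNo": 103, "EnrollNo": "E-03", "Name": "Ram"},
]


def is_super_key(rows, fields):
    """True if no two rows share the same values for the given fields."""
    seen = {tuple(row[f] for f in fields) for row in rows}
    return len(seen) == len(rows)


def candidate_keys(rows):
    """Return the minimal super keys, smallest field sets first."""
    fields = sorted(rows[0])
    found = []
    for size in range(1, len(fields) + 1):
        for combo in combinations(fields, size):
            # A set that contains an already found key is not minimal.
            if any(set(key) <= set(combo) for key in found):
                continue
            if is_super_key(rows, combo):
                found.append(combo)
    return found


keys = candidate_keys(students)
print(keys)
```

Here ("ID", "Name") is a super key but not a candidate key, because ID alone is already enough to identify a row.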

Primary Key
A primary key is a set of one or more fields/columns of a table that uniquely identifies a record in
a database table. It cannot contain null or duplicate values. Only one candidate key can be chosen as the
primary key.
Alternate Key
An alternate key is a key that could work as the primary key. Basically, it is a candidate key that
is currently not the primary key.
Example: in the diagram below, RollNo and EnrollNo become alternate keys when we define ID as
the primary key.

Foreign Key
A foreign key is a field in a database table that is the primary key in another table. It can contain
null values and duplicate values. Example:
we can have a DeptID column in the Employee table that points to the DeptID column
in a Department table, where it is the primary key.
Secondary Key
It identifies tuples, but not uniquely. Example: Name and Address are secondary keys.
Unique Key
A unique key is a set of one or more fields/columns of a table that uniquely identifies a record in
a database table. It is like a primary key, but it can accept a null value (only one in some systems, such as SQL Server) and it cannot have
duplicate values. We can have more than one unique key in a table.

Functional Dependence (FD)


A functional dependency is an association between two attributes of the same
relational database table. One of the attributes is called the determinant and the other attribute
is called the determined. For each value of the determinant there is associated one and only
one value of the determined.
If A is the determinant and B is the determined, then we say that A functionally determines
B and represent this as A -> B. The expression A -> B can also be read as B is
functionally determined by A.
In other words, given the value of one attribute, we can obtain the value of another attribute.
Example: if we know the value of the customer account number, then we can obtain the value of
the customer balance. So we can say that customer balance is functionally dependent on customer
account number.
Example: A -> B holds,
since for each value of A there is associated one and only one value of B.
Example: A -> B does not hold,
since for A = 3 there is associated more than one value of B.

Functional dependency can also be defined as follows:


An attribute in a relational model is said to be functionally dependent on another attribute in
the table if it can take only one value for a given value of the attribute upon which it is
functionally dependent.
Trivial functional dependency: If an FD X -> Y holds where Y is a subset of X, then it is
called a trivial FD. Trivial FDs always hold.
For example: {Employee ID, Employee Address} -> {Employee Address} is trivial, since
{Employee Address} is a subset of {Employee ID, Employee Address}.
Non-trivial functional dependency: If an FD X -> Y holds where Y is not a subset of X,
then it is called a non-trivial FD.
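
The definitions above translate directly into a small check. The following Python sketch tests whether an FD holds in a set of rows and whether it is trivial; the function names fd_holds and is_trivial and the sample A/B values are assumptions made for illustration.

```python
def fd_holds(rows, lhs, rhs):
    """Check whether lhs -> rhs holds in the given rows (a list of dicts)."""
    seen = {}
    for row in rows:
        determinant = tuple(row[a] for a in lhs)
        determined = tuple(row[a] for a in rhs)
        # setdefault stores the first value seen and returns the stored one.
        if seen.setdefault(determinant, determined) != determined:
            return False
    return True


def is_trivial(lhs, rhs):
    """An FD X -> Y is trivial when Y is a subset of X."""
    return set(rhs) <= set(lhs)


# Made-up sample data: in the first table each A has exactly one B.
holds_rows = [{"A": 1, "B": 1}, {"A": 2, "B": 4}, {"A": 3, "B": 9}]
# In the second table A = 3 is paired with two different B values.
fails_rows = [{"A": 1, "B": 1}, {"A": 3, "B": 9}, {"A": 3, "B": 7}]

result_holds = fd_holds(holds_rows, ["A"], ["B"])
result_fails = fd_holds(fails_rows, ["A"], ["B"])
trivial = is_trivial(["Employee ID", "Employee Address"], ["Employee Address"])
print(result_holds, result_fails, trivial)
```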

Integrity Constraints
Constraints are used to limit the type of data that can go into a table. Integrity constraints are
used to ensure accuracy and consistency of data in a relational database. Constraints can be
specified when a table is created (with the CREATE TABLE statement) or after the table is
created (with the ALTER TABLE statement).

You can define integrity constraints to enforce the business rules you want to associate with
the information in a database. If any of the results of a DML statement execution violate an
integrity constraint, then Oracle rolls back the statement and returns an error.
Example: assume that you define an integrity constraint for the salary column of the
employees table. This integrity constraint enforces the rule that no row in this table can
contain a numeric value greater than 10,000 in this column. If an INSERT or UPDATE
statement attempts to violate this integrity constraint, then Oracle rolls back the statement and
returns an information error message.
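
The salary rule in this example can be sketched with a CHECK constraint. The snippet below uses Python's sqlite3 module as a stand-in for Oracle, so the error type shown is SQLite's; the table layout and the inserted values are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# The CHECK constraint encodes the rule "salary may not exceed 10,000".
conn.execute(
    "CREATE TABLE employees ("
    " id INTEGER PRIMARY KEY,"
    " salary INTEGER CHECK (salary <= 10000))"
)
conn.execute("INSERT INTO employees VALUES (1, 9000)")  # allowed

try:
    conn.execute("INSERT INTO employees VALUES (2, 12000)")  # breaks the rule
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM employees").fetchone()[0]
print(rejected, row_count)
```

The violating statement is undone as a whole, so only the valid row remains in the table.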
There are two types of integrity constraints:
1. Entity Integrity Constraints
2. Referential Integrity Constraints
Entity Integrity Constraint
The entity integrity constraint states that primary keys can't be null. There must be a proper
value in the primary key field.
This is because the primary key value is used to identify individual rows in a table. If there
were null values for primary keys, it would mean that we could not identify those rows.
On the other hand, fields other than the primary key can contain null values. A null value means
that one doesn't know the value for that field. A null value is different from a zero value or a space.
In the Car Rental database, each car in the Car table must have a proper and unique Reg_No.
There might be a car whose rate is unknown - maybe the car is broken or it is brand new - i.e.
the Rate field has a null value. See the picture below.
The entity integrity constraint ensures that a specific row in a table can be identified.
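
Entity integrity for the Car table can be sketched as follows, using Python's sqlite3 module. The registration number and rate values are made up, and the column names are assumed from the text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# NOT NULL is spelled out because SQLite, unlike most systems, would
# otherwise accept NULL in a non-integer primary key column.
conn.execute("CREATE TABLE car (reg_no TEXT PRIMARY KEY NOT NULL, rate INTEGER)")

# A NULL rate is fine: the rate is simply unknown. Values are made up.
conn.execute("INSERT INTO car VALUES ('ABC-111', NULL)")

errors = []
for reg_no in (None, "ABC-111"):  # a missing key, then a duplicate key
    try:
        conn.execute("INSERT INTO car VALUES (?, 50)", (reg_no,))
    except sqlite3.IntegrityError as exc:
        errors.append(str(exc))

car_count = conn.execute("SELECT COUNT(*) FROM car").fetchone()[0]
print(errors, car_count)
```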

Referential Integrity Constraint


The referential integrity constraint is specified between two tables and it is used to maintain
the consistency among rows between the two tables.
The rules are:
1. You can't delete a record from a primary table if matching records exist in a related
table.
2. You can't change a primary key value in the primary table if that record has related
records.
3. You can't enter a value in the foreign key field of the related table that doesn't exist in
the primary key of the primary table.
4. However, you can enter a Null value in the foreign key, specifying that the records are
unrelated.

Examples
Rule 1. You can't delete any of the rows in the CarType table that are visible in the picture
since all the car types are in use in the Car table.
Rule 2. You can't change any of the model_ids in the CarType table since all the car types are
in use in the Car table.
Rule 3. The values that you can enter in the model_id field in the Car table must be in the
model_id field in the CarType table.
Rule 4. The model_id field in the Car table can have a null value, which means that the car
type of that car is not known.
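
The four rules can be checked against a small CarType/Car pair. This is a sketch using Python's sqlite3 module; the column names follow the text, while the sample rows and the helper is_rejected are assumptions. SQLite enforces foreign keys only after PRAGMA foreign_keys = ON.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # off by default in SQLite

conn.execute("CREATE TABLE cartype (model_id INTEGER PRIMARY KEY, name TEXT)")
conn.execute(
    "CREATE TABLE car ("
    " reg_no TEXT PRIMARY KEY NOT NULL,"
    " model_id INTEGER REFERENCES cartype (model_id))"
)
conn.execute("INSERT INTO cartype VALUES (1, 'Compact')")
conn.execute("INSERT INTO car VALUES ('ABC-111', 1)")


def is_rejected(sql):
    """Run a statement and report whether it violated a constraint."""
    try:
        conn.execute(sql)
        return False
    except sqlite3.IntegrityError:
        return True


# Rule 1: deleting a car type that is still in use is refused.
rule1 = is_rejected("DELETE FROM cartype WHERE model_id = 1")
# Rule 2: changing a primary key value that is referenced is refused.
rule2 = is_rejected("UPDATE cartype SET model_id = 9 WHERE model_id = 1")
# Rule 3: a foreign key value with no matching primary key is refused.
rule3 = is_rejected("INSERT INTO car VALUES ('ABC-222', 7)")
# Rule 4: a NULL foreign key is accepted.
rule4 = is_rejected("INSERT INTO car VALUES ('ABC-333', NULL)")
print(rule1, rule2, rule3, rule4)
```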

Normalization
Normalization was developed by IBM researcher E. F. Codd in the 1970s. Database
normalization is a database schema design technique by which an existing schema is
modified to minimize redundancy and dependency of data. Redundancy means storing the same
data item in more than one place. Normalization splits a table into smaller tables and defines
relationships between them to increase clarity in organizing data.
While designing a database out of an entity-relationship model, the main problem existing in
the raw database is redundancy.
Problem Without Normalization
Without normalization, it becomes difficult to handle and update the database
without facing data loss. Insertion, update and deletion anomalies are very frequent
if the database is not normalized. To understand these anomalies, let us take the
example of a Student table.

S_Id   S_Name   S_Address   Subject_Opted
401    Adam     Noida       Bio
402    Alex     Panipat     Maths
403    Stuart   Jammu       Maths
404    Adam     Noida       Physics


Update Anomaly: To update the address of a student who occurs twice or more
in a table, we have to update the S_Address column in all of those rows, or else the data
will become inconsistent.
Insertion Anomaly: Suppose that for a new admission we have the student id (S_Id),
name and address of a student, but the student has not opted for any subjects yet. Then
we have to insert NULL there, leading to an insertion anomaly.
Deletion Anomaly: If the student with S_Id 401 has only one subject and temporarily drops it,
then when we delete that row, the entire student record is deleted along with it.
To solve this problem, the raw database needs to be normalized. This is a step-by-step
process of removing different kinds of redundancy and anomaly. At each step a
specific rule is followed to remove a specific kind of impurity, in order to give the database a
slim and clean look.

Un-Normalized Form (UNF)


If a table contains non-atomic values at each row, it is said to be in UNF. An atomic value is
something that can not be further decomposed. A non-atomic value, as the name suggests,
can be further decomposed and simplified. Consider the following table:
Emp-Id   Emp-Name   Month   Sales   Bank-Id   Bank-Name
E01      AA         Jan     1000    B01       SBI
                    Feb     1200
                    Mar     850
E02      BB         Jan     2200    B02       UTI
                    Feb     2500
E03      CC         Jan     1700    B01       SBI
                    Feb     1800
                    Mar     1850
                    Apr     1725


In the sample table above, there are multiple occurrences of rows under each key Emp-Id.
Although considered to be the primary key, Emp-Id cannot give us the unique identification
facility for any single row. Further, each primary key points to a variable length record (3 for
E01, 2 for E02 and 4 for E03).

First Normal Form (1NF)


A relation is said to be in 1NF if it contains no non-atomic values and each row can provide a
unique combination of values.

The above table in UNF can be processed to create the following table in 1NF.
Emp-Id   Emp-Name   Month   Sales   Bank-Id   Bank-Name
E01      AA         Jan     1000    B01       SBI
E01      AA         Feb     1200    B01       SBI
E01      AA         Mar     850     B01       SBI
E02      BB         Jan     2200    B02       UTI
E02      BB         Feb     2500    B02       UTI
E03      CC         Jan     1700    B01       SBI
E03      CC         Feb     1800    B01       SBI
E03      CC         Mar     1850    B01       SBI
E03      CC         Apr     1725    B01       SBI

As you can see, each row now contains a unique combination of values. Unlike in UNF, this
relation contains only atomic values, i.e. the values cannot be further decomposed, so the
relation is now in 1NF.
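
The step from UNF to 1NF amounts to flattening the repeating group. The following Python sketch shows this on the sample data; representing the UNF table as a list of dictionaries with a nested sales list is an assumption made for illustration.

```python
# The UNF table as nested data: one entry per employee with a repeating
# group of (month, sales) pairs.
unf = [
    {"emp_id": "E01", "emp_name": "AA", "bank_id": "B01", "bank_name": "SBI",
     "sales": [("Jan", 1000), ("Feb", 1200), ("Mar", 850)]},
    {"emp_id": "E02", "emp_name": "BB", "bank_id": "B02", "bank_name": "UTI",
     "sales": [("Jan", 2200), ("Feb", 2500)]},
    {"emp_id": "E03", "emp_name": "CC", "bank_id": "B01", "bank_name": "SBI",
     "sales": [("Jan", 1700), ("Feb", 1800), ("Mar", 1850), ("Apr", 1725)]},
]

# Flatten: repeat the employee and bank values once per (month, sales) pair,
# so that every cell holds a single atomic value.
first_nf = [
    (e["emp_id"], e["emp_name"], month, amount, e["bank_id"], e["bank_name"])
    for e in unf
    for month, amount in e["sales"]
]

for row in first_nf:
    print(row)
```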

Second Normal Form (2NF)


A relation is said to be in 2NF if it satisfies the following conditions:
It is in first normal form.
All non-key attributes are fully functionally dependent on the primary key.

Let us explain. Emp-Id is the primary key of the above relation. Emp-Name, Month, Sales
and Bank-Name all depend upon Emp-Id. But the attribute Bank-Name depends on Bank-Id,
which is not the primary key of the table. So the table is in 1NF, but not in 2NF. If this portion is
moved into another related relation, the table comes into 2NF.

Emp-Id   Emp-Name   Month   Sales   Bank-Id
E01      AA         JAN     1000    B01
E01      AA         FEB     1200    B01
E01      AA         MAR     850     B01
E02      BB         JAN     2200    B02
E02      BB         FEB     2500    B02
E03      CC         JAN     1700    B01
E03      CC         FEB     1800    B01
E03      CC         MAR     1850    B01
E03      CC         APR     1725    B01

Bank-Id   Bank-Name
B01       SBI
B02       UTI

After moving this portion into another relation, we store a smaller amount of data in two
relations without any loss of information. There is also a significant reduction in redundancy.
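
That no information is lost can be checked by joining the two relations back together. The Python sketch below uses the 1NF rows from the sample table; the variable names are assumptions.

```python
# The 1NF rows: (Emp-Id, Emp-Name, Month, Sales, Bank-Id, Bank-Name).
first_nf = [
    ("E01", "AA", "Jan", 1000, "B01", "SBI"),
    ("E01", "AA", "Feb", 1200, "B01", "SBI"),
    ("E01", "AA", "Mar", 850, "B01", "SBI"),
    ("E02", "BB", "Jan", 2200, "B02", "UTI"),
    ("E02", "BB", "Feb", 2500, "B02", "UTI"),
    ("E03", "CC", "Jan", 1700, "B01", "SBI"),
    ("E03", "CC", "Feb", 1800, "B01", "SBI"),
    ("E03", "CC", "Mar", 1850, "B01", "SBI"),
    ("E03", "CC", "Apr", 1725, "B01", "SBI"),
]

# Projection 1: drop Bank-Name from every row.
emp_sales = [row[:5] for row in first_nf]

# Projection 2: keep each (Bank-Id, Bank-Name) pair only once.
banks = {row[4]: row[5] for row in first_nf}

# Joining the two relations on Bank-Id rebuilds the original rows.
rejoined = [row + (banks[row[4]],) for row in emp_sales]

print(banks)
print(rejoined == first_nf)
```

Bank-Name is now stored twice (once per bank) instead of nine times (once per sales row).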

Third Normal Form (3NF)


A relation is said to be in 3NF if it satisfies the following conditions:
It is in second normal form.
There is no transitive functional dependency.

By transitive functional dependency, we mean we have the following relationships
in the table: A is functionally dependent on B, and B is functionally dependent on
C. In this case, A is transitively dependent on C via B.
Consider the following example:

In the table above, [Book ID] determines [Genre ID], and [Genre ID] determines
[Genre Type]. Therefore, [Book ID] determines [Genre Type] via [Genre ID] and
we have transitive functional dependency, and this structure does not satisfy third
normal form.
To bring this table to third normal form, we split the table into two as follows:

Now all non-key attributes are fully functional dependent only on the primary key.
In [TABLE_BOOK], both [Genre ID] and [Price] are only dependent on [Book
ID]. In [TABLE_GENRE], [Genre Type] is only dependent on [Genre ID].

Boyce-Codd Normal Form (BCNF)


A relation is said to be in BCNF if it satisfies the following conditions:
It is in third normal form.
Every determinant is a candidate key.

Consider the following example:


Table-ADVISOR

SID   Major   Fname
100   Math    Ram
150   Phy     Shyam
200   Math    Rohan
250   Math    Ram
200   Phy     Rohit

Here the FD Fname -> Major holds: any faculty member advises in only one
subject. Therefore, given the Fname, we can determine the Major. Thus Fname is a
determinant, but it is not a candidate key.
According to the definition, every determinant must be a candidate key. So this table can be
decomposed into two relations, STU-ADV(SID, Fname) and ADV-SUBJ(Fname, Major). These relations are in BCNF.
STU-ADV(SID, Fname)
Key(SID, Fname)

SID   Fname
100   Ram
150   Shyam
200   Rohan
250   Ram
200   Rohit

ADV-SUBJ(Fname, Major)
Key(Fname)

Fname   Major
Ram     Math
Shyam   Phy
Rohan   Math
Rohit   Phy
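
The reasoning for this example can be verified with a few lines of Python. The sketch below checks that Fname determines Major, that Fname is not a key of the original table, and that the two projections join back to the original rows; the variable names are assumptions.

```python
# Rows of Table-ADVISOR: (SID, Major, Fname).
advisor = [
    (100, "Math", "Ram"),
    (150, "Phy", "Shyam"),
    (200, "Math", "Rohan"),
    (250, "Math", "Ram"),
    (200, "Phy", "Rohit"),
]

# Fname -> Major holds if every Fname maps to exactly one Major.
majors_by_fname = {}
for sid, major, fname in advisor:
    majors_by_fname.setdefault(fname, set()).add(major)
fname_determines_major = all(len(m) == 1 for m in majors_by_fname.values())

# Fname is not a candidate key: "Ram" appears in more than one row.
fnames = [fname for _, _, fname in advisor]
fname_is_key = len(set(fnames)) == len(fnames)

# Decompose into the two projections (sets drop duplicate tuples).
stu_adv = {(sid, fname) for sid, _, fname in advisor}
adv_subj = {(fname, major) for _, major, fname in advisor}

# A natural join on Fname gives back exactly the original rows.
rejoined = {
    (sid, major, fname)
    for sid, fname in stu_adv
    for fname2, major in adv_subj
    if fname2 == fname
}

print(fname_determines_major, fname_is_key, rejoined == set(advisor))
```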

Fourth Normal Form (4NF)


A relation is said to be in 4NF if it satisfies the following conditions:
It is in BCNF.
It has no multi-valued dependency (MVD).

An MVD exists if:
A leads to multiple values of B.
A leads to multiple values of C.
B and C are independent of each other.

Take the following table structure as an example:
info(employee#, skills, hobbies)
Take the following table:

employee#   skills        hobbies
            Programming   Golf
            Programming   Bowling
            Analysis      Golf
            Analysis      Bowling
            Analysis      Golf
            Analysis      Gardening
            Management    Golf
            Management    Gardening



This table is difficult to maintain since adding a new hobby requires multiple new rows
corresponding to each skill. This problem is created by the pair of multi-valued dependencies
EMPLOYEE#--->SKILLS and EMPLOYEE#--->HOBBIES. A much better alternative
would be to decompose INFO into two relations:
skills(employee#, skill)

employee#   skills
            Programming
            Analysis
            Analysis
            Management

hobbies(employee#, hobby)

employee#   hobbies
            Golf
            Bowling
            Golf
            Gardening
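
The decomposition can be checked in Python by projecting the table and joining the projections back. The employee numbers are not legible in the table above, so the values 1 and 2 used below are assumptions, chosen so that the projections match the skills and hobbies tables shown.

```python
# info(employee#, skill, hobby); the employee numbers 1 and 2 are assumed.
info = {
    (1, "Programming", "Golf"), (1, "Programming", "Bowling"),
    (1, "Analysis", "Golf"), (1, "Analysis", "Bowling"),
    (2, "Analysis", "Golf"), (2, "Analysis", "Gardening"),
    (2, "Management", "Golf"), (2, "Management", "Gardening"),
}

# Project onto the two independent multi-valued facts.
skills = {(emp, skill) for emp, skill, _ in info}
hobbies = {(emp, hobby) for emp, _, hobby in info}

# A natural join on employee# pairs every skill with every hobby of the
# same employee, which is exactly what the original table stored.
rejoined = {
    (emp, skill, hobby)
    for emp, skill in skills
    for emp2, hobby in hobbies
    if emp2 == emp
}

print(len(info), len(skills), len(hobbies), rejoined == info)
```

Eight rows of the original table are replaced by four rows in each of the smaller relations, and a new hobby needs only one new row.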

Fifth Normal Form (5NF)


A relation is said to be in 5NF or Project-Join Normal Form (PJNF) if it satisfies the
following conditions:
It is in 4NF.
It cannot be losslessly decomposed into any number of smaller tables.


Example to understand 5NF
Take the following table structure as an example of a buying table. This is used to track
buyers, what they buy, and from whom they buy. Take the following sample data:

buyer   vendor          item
Sally   Liz Claiborne   Blouses
Mary    Liz Claiborne   Blouses
Sally   Jordach         Jeans
Mary    Jordach         Jeans
Sally   Jordach         Sneakers


Problem:- The problem with the above table structure is that if Claiborne starts to sell Jeans,
how many records must you create to record this fact? The problem is that there are pairwise
cyclical dependencies in the primary key. That is, in order to determine the item you must
know the buyer and vendor, to determine the vendor you must know the buyer and the
item, and finally to know the buyer you must know the vendor and the item.
Solution:- The solution is to break this one table into three tables: Buyer-Vendor, Buyer-Item,
and Vendor-Item. The following tables are in 5NF.

Buyer-Vendor

buyer   vendor
Sally   Liz Claiborne
Mary    Liz Claiborne
Sally   Jordach
Mary    Jordach

Buyer-Item

buyer   item
Sally   Blouses
Mary    Blouses
Sally   Jeans
Mary    Jeans
Sally   Sneakers

Vendor-Item

vendor          item
Liz Claiborne   Blouses
Jordach         Jeans
Jordach         Sneakers
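
That the three tables together carry the same information as the original can be checked by joining them. The Python sketch below builds the three projections from the sample data and joins them back; it also shows that joining only two of them is not enough. The variable names are assumptions.

```python
buying = {
    ("Sally", "Liz Claiborne", "Blouses"),
    ("Mary", "Liz Claiborne", "Blouses"),
    ("Sally", "Jordach", "Jeans"),
    ("Mary", "Jordach", "Jeans"),
    ("Sally", "Jordach", "Sneakers"),
}

# The three pairwise projections.
buyer_vendor = {(b, v) for b, v, _ in buying}
buyer_item = {(b, i) for b, _, i in buying}
vendor_item = {(v, i) for _, v, i in buying}

# Join all three: keep (buyer, vendor, item) only if every pair is present.
rejoined = {
    (b, v, i)
    for b, v in buyer_vendor
    for b2, i in buyer_item
    if b2 == b and (v, i) in vendor_item
}

# Joining only two of the projections produces spurious rows.
two_way = {
    (b, v, i)
    for b, v in buyer_vendor
    for b2, i in buyer_item
    if b2 == b
}

print(rejoined == buying, len(two_way))
```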
