UNIT I INTRODUCTION
Database and DBMS: characteristics, importance, advantages, evolution; Codd's
rules; database architecture; data organization: file structures and indexing
DATABASE
MANAGEMENT SYSTEM
UNIT I
DATABASE
A database is a collection of information
that is organized so that it can easily be
accessed, managed, and updated.
In one view, databases can be classified
according to types of content:
bibliographic, full-text, numeric, and
images.
DBMS
A database-management system (DBMS)
is a collection of interrelated data
and a set of programs to access
those data.
CHARACTERISTICS OF DBMS
Represents complex relationships between
data.
Controls data redundancy.
Enforces user defined rules.
Ensures data sharing.
It has automatic and intelligent backup and
recovery procedures.
It has a central dictionary to store information
pertaining to data and its manipulation.
It has different interfaces via which users can
manipulate the data.
Enforces data access authorization.
IMPORTANCE OF DBMS
ADVANTAGES OF DBMS
Redundancies and inconsistencies can be
reduced
Better service to the Users
Flexibility of the system is improved
Cost of developing and maintaining systems
is lower
Standards can be enforced
Security can be improved
Integrity can be improved
Enterprise requirements can be identified
Data model must be developed
EVOLUTION
MIS and early database concepts
SDC, a group of RAND Corp., adopted the term DBMS
CODASYL (Committee On Data SYstems Languages) and DBTG (Database Task
Group)
Played an important role in the development of DBMS
DBTG was the subgroup for promoting the DBMS idea
Technical savior for MIS
Ex: PRIME DBMS from PRIME Computer, IDS II from Honeywell, DMS 170 from CDC,
DMS II and DMS 1100 from UNISYS, and DBMS 10 and DBMS 11 from Digital
Equipment Corp.
CODD RULES
1. Information Rule - All information in a relational
database, including table names and column names,
is represented by values in tables.
2. Guaranteed Access Rule - Every piece of data in a
relational database can be accessed by using a
combination of a table name, a primary key value
and a column name.
3. Systematic Treatment of Nulls Rule - The RDBMS
handles records that have unknown or
inapplicable values in a pre-defined fashion.
4. Active On-line Catalog Based on the Relational
Model - The description of a database and its
contents is stored in database tables and can
therefore be queried on-line via the data
manipulation language.
5. Comprehensive Data Sub-language Rule - An RDBMS
may support several languages, but at least one of
them should allow the user to do all of the following:
define tables and views, query and update the
data, set integrity constraints, set
authorizations and define transactions.
6. View Updating Rule - Any view that is theoretically
updateable can be updated using the RDBMS.
7. High-Level Insert, Update and Delete - The RDBMS
supports insertion, update and deletion at a table
level.
8. Physical Data Independence - The execution of ad hoc
requests and application programs is not affected
by changes in the physical data access and
storage methods.
9. Logical Data Independence - Logical changes in tables
and views, such as adding/deleting columns or changing
field lengths, need not necessitate modifications in the
programs or in the format of ad hoc requests.
10. Integrity Independence - Like table/view definitions,
integrity constraints are stored in the on-line catalog
and can therefore be changed without necessitating
changes in the application programs.
11. Distribution Independence - Application programs and
ad hoc requests are not affected by changes in the
distribution of physical data.
12. Nonsubversion Rule - If the RDBMS has a language that
accesses the information a record at a time, this
language should not be used to bypass the integrity
constraints. This is necessary for data integrity.
DATABASE ARCHITECTURE
DATA ORGANIZATION
Data Value (Cell) - Contents of a field contained in a record.
Raw facts that can be recognized.
Relation - Table. Collection of related records.
Tuple - Record/Row. Collection of related fields.
Attribute - Field/Column. Group of characters representing
something.
Domain - Set of valid values of an attribute.
Degree - Number of columns in a table.
Cardinality - Number of rows in a table.
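The terms above can be tried out on a tiny relation. The sketch below is only an illustration: the attribute names and rows are made-up sample data, not taken from the slides.

```python
# A relation as a heading (attributes) plus a list of tuples (rows).
# All names and values here are hypothetical sample data.
attributes = ("emp_id", "name", "dept")
relation = [
    (1, "Asha", "Sales"),
    (2, "Ravi", "HR"),
    (3, "Meena", "Sales"),
]

degree = len(attributes)      # number of columns in the table
cardinality = len(relation)   # number of rows in the table

# Values of the "dept" attribute that actually occur; they must all
# come from the attribute's domain (the set of valid values).
dept_values = {row[attributes.index("dept")] for row in relation}
```

Adding a row changes the cardinality but not the degree; adding a column changes the degree.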
FREE LISTS
Fixed-length representation
POINTER METHOD
ORGANIZATION OF RECORDS IN
FILES
Heap: a record can be placed anywhere in
the file where there is space.
Sequential: store records in sequential
order, based on the value of the search key
of each record.
Hashing: a hash function is computed on
some attribute of each record; the result
specifies in which block of the file the record
should be placed.
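A minimal sketch of the hashing organization, assuming a file of four blocks and a simple modulo hash on an integer key. The block count, the hash function and the sample records are assumptions chosen for illustration.

```python
# Hypothetical sketch: hash file organization with a fixed number of blocks.
NUM_BLOCKS = 4  # assumed number of blocks in the file


def block_for(key: int) -> int:
    """Hash function on the search key; the result names the target block."""
    return key % NUM_BLOCKS


def insert(blocks, record) -> int:
    """Place the record in the block chosen by the hash of its key (field 0)."""
    b = block_for(record[0])
    blocks[b].append(record)
    return b


def lookup(blocks, key: int):
    """Search only the one block the hash function points to."""
    for record in blocks[block_for(key)]:
        if record[0] == key:
            return record
    return None


blocks = [[] for _ in range(NUM_BLOCKS)]
for rec in [(100, "Asha"), (101, "Ravi"), (104, "Meena")]:
    insert(blocks, rec)
```

Keys 100 and 104 hash to the same block here, which shows why a block must be able to hold more than one record.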
SEQUENTIAL FILE
ORGANIZATION
DATA DICTIONARY
INDEXING
An index is a small table having only
two columns.
The first column contains a copy of
the primary or candidate key of a
table and the second column
contains a set of pointers holding
the address of the disk block where
that particular key value can be
found.
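The two-column idea can be sketched as a sorted list of (key, block address) pairs. The keys, addresses and block contents below are hypothetical, and a real DBMS would keep the index on disk rather than in a Python list.

```python
import bisect

# Hypothetical index: sorted (key, block_address) pairs, i.e. a two-column table.
index = [(101, 0), (205, 1), (317, 1), (422, 2)]
keys = [k for k, _ in index]

# Hypothetical disk: block address -> records stored in that block.
disk_blocks = {
    0: [(101, "pen")],
    1: [(205, "ink"), (317, "pad")],
    2: [(422, "clip")],
}


def find(key: int):
    """Binary-search the index, then read only the block it points to."""
    i = bisect.bisect_left(keys, key)
    if i == len(keys) or keys[i] != key:
        return None  # key not present in the index
    _, address = index[i]
    for record in disk_blocks[address]:
        if record[0] == key:
            return record
    return None
```

The lookup touches the small index first and then a single block, instead of scanning every block of the main table.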
DISADVANTAGE
TYPES OF INDEX
PRIMARY INDEX
In a primary index, there is a one-to-one relationship between the entries
in the index table and the records in
the main table.
CLUSTERING INDEX
SECONDARY INDEX
DATA MODEL
[Figure: example data model diagrams with sample customer and account records]
[Figure: ER diagram with entities Teacher and Subject, relationship Supervises, and attributes TeacherId, FirstName, LastName, SubjectId, Title]
Entity          | Key attribute         | Other attributes
Item            | Item code             | Description (30 characters), Specification (20 characters), Unit of measure
Vendor          | Vendor Code           | Vendor Name, Vendor Location, Vendor Reg. No.
Purchase Order  | Purchase Order Number | Date, Vendor Name, Vendor Code
RELATIONSHIP TYPES
ONE-TO-ONE: PATIENT occupies BED
ONE-TO-MANY: HOSPITAL ROOM accommodates PATIENT (a PATIENT is assigned to one HOSPITAL ROOM)
MANY-TO-MANY: SURGEON operates PATIENT (a PATIENT is operated on by SURGEONs)
DATABASE DESIGN
... to meet an end user's requirement.
PHASES OF DATABASE DESIGN
It is a process of constructing a
model of information, which can then
be mapped into storage objects
supported by the Database
Management System.
This step involves:
Table Generation From ER Model
Normalization of Tables
ER DIAGRAM-RELATIONSHIPS
ENTITY
An entity is a thing in the real world with an
independent existence.
An entity set is the collection of all entities of a
particular entity type at any point in time.
Ex:
A company has many employees.
Employees are defined as entities (e1, e2, e3, ...).
These entities, which have the same attributes, are
defined under the ENTITY TYPE employee.
The set {e1, e2, ...} is called the entity set.
WEAK ENTITY
A weak entity is an entity that
cannot be uniquely identified by its
attributes alone; therefore, it must
use a foreign key in conjunction with
its attributes to create a primary key.
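A weak entity can be sketched with Python's built-in sqlite3 module. The employee/dependent tables below are a hypothetical example, not from the slides: a dependent's name is unique only within one employee, so the primary key combines the owner's key (a foreign key) with the dependent's own attribute.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript("""
CREATE TABLE employee (
    emp_id INTEGER PRIMARY KEY,
    name   TEXT
);
-- Weak entity: identified by the owner's key plus its own partial key.
CREATE TABLE dependent (
    emp_id   INTEGER NOT NULL REFERENCES employee(emp_id),
    dep_name TEXT    NOT NULL,
    PRIMARY KEY (emp_id, dep_name)
);
""")
conn.execute("INSERT INTO employee VALUES (1, 'Asha')")
conn.execute("INSERT INTO employee VALUES (2, 'Ravi')")
# The same dependent name may appear under different owners.
conn.execute("INSERT INTO dependent VALUES (1, 'Kiran')")
conn.execute("INSERT INTO dependent VALUES (2, 'Kiran')")


def try_insert(emp_id, dep_name):
    """Return True if the row is accepted, False if a constraint rejects it."""
    try:
        conn.execute("INSERT INTO dependent VALUES (?, ?)", (emp_id, dep_name))
        return True
    except sqlite3.IntegrityError:
        return False
```

A repeated (emp_id, dep_name) pair is rejected by the composite primary key, and a dependent whose emp_id matches no employee is rejected by the foreign key.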
READING AN ERD
NORMALIZATION
DDL
1. Create:
Syntax: Create table <table name> (<list of column definitions>);
Example:
Create table inventory
(
id int primary key,
product varchar(50),
quantity int,
price decimal(18,2)
);
2. Alter:
Syntax: Alter table <table name> <alteration>;
Example: Alter table department add primary key (dname);
3. Drop/Truncate:
Syntax: Drop table <table name>; or Truncate table <table name>;
Example: Drop table customer; or Truncate table employee;
DML
Used to retrieve, insert, delete and modify information from the database.
Retrieval:
Syntax: SELECT <field names> FROM table_name [WHERE condition];
Example: SELECT * FROM employee;
Insertion:
Syntax: INSERT INTO <tablename> (col1name, col2name... colxname)
VALUES (value1, value2... valuex);
Example: INSERT INTO citylist (name, state, population, zipcode) VALUES
('Argos', 'Indiana', '89', '46501');
Deletion:
Syntax: DELETE FROM table_name [WHERE condition];
Example: DELETE FROM employee WHERE id = 100;
Modification:
Syntax: UPDATE <tablename> SET column_1 = value1, column_2 = value2
WHERE condition;
Example: UPDATE StoreInformation SET Sales = 500 WHERE storename
= 'Los Angeles';
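The statements above can be tried without installing a database server. The sketch below runs them against an in-memory SQLite database through Python's sqlite3 module. It reuses the inventory table from the DDL example; the sample rows are made up, and other DBMS products may differ in data types and syntax details.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# DDL: create the inventory table from the example.
cur.execute("""
    CREATE TABLE inventory (
        id       INT PRIMARY KEY,
        product  VARCHAR(50),
        quantity INT,
        price    DECIMAL(18,2)
    )
""")

# DML: insertion (the rows are made-up sample data).
cur.executemany(
    "INSERT INTO inventory (id, product, quantity, price) VALUES (?, ?, ?, ?)",
    [(1, "pen", 10, 5.0), (2, "ink", 3, 40.0), (3, "pad", 7, 25.0)],
)

# DML: modification and deletion, each restricted by a WHERE condition.
cur.execute("UPDATE inventory SET quantity = 20 WHERE product = 'pen'")
cur.execute("DELETE FROM inventory WHERE id = 2")
conn.commit()

# DML: retrieval.
rows = cur.execute(
    "SELECT id, product, quantity FROM inventory ORDER BY id"
).fetchall()
```

The `?` placeholders pass values separately from the SQL text, which avoids quoting mistakes when values come from user input.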
DATABASE TRANSACTION
A transaction comprises a unit of work
performed within a database management
system (or similar system) against a
database, and treated in a coherent and
reliable way independent of other
transactions.
A transaction should have four properties:
atomic, consistent, isolated and durable.
ATOMICITY
Atomicity is one of the ACID transaction
properties. In an atomic transaction, a series
of database operations either all occur, or
nothing occurs.
An example of atomicity is ordering an airline
ticket where two actions are required:
payment, and a seat reservation. The
potential passenger must either:
both pay for and reserve a seat; OR
neither pay for nor reserve a seat.
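The airline example can be sketched with sqlite3. The table names, the passenger names and the already-taken seat are assumptions made for the illustration. The payment and the reservation run inside one transaction, so a failed reservation also undoes the payment.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE payment (passenger TEXT, amount INTEGER);
CREATE TABLE seat (seat_no TEXT PRIMARY KEY, passenger TEXT);
INSERT INTO seat VALUES ('12A', 'someone else');  -- seat 12A is already taken
""")


def book(passenger: str, seat_no: str, amount: int) -> bool:
    """Pay and reserve as one transaction: both happen or neither does."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute("INSERT INTO payment VALUES (?, ?)", (passenger, amount))
            conn.execute("INSERT INTO seat VALUES (?, ?)", (seat_no, passenger))
        return True
    except sqlite3.IntegrityError:
        return False


ok_booking = book("Asha", "14C", 500)      # both steps succeed
failed_booking = book("Ravi", "12A", 500)  # seat insert fails, payment is undone
payments = conn.execute("SELECT passenger FROM payment").fetchall()
```

After the second call, no payment row exists for the passenger whose seat could not be reserved.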
CONSISTENCY
Consistency states that only valid data will be
written to the database. If, for some reason, a
transaction is executed that violates the database's
consistency rules, the entire transaction will be
rolled back and the database will be restored to a
state consistent with those rules.
Ex: Suppose a consistency rule requires A + B = 100.
Assume that a transaction attempts to subtract 10
from A without altering B. Because consistency is
checked after each transaction, it is known that A +
B = 100 before the transaction begins. If the
transaction removes 10 from A successfully,
atomicity will be achieved. However, a validation
check will show that A + B = 90, which violates the
rule, so the transaction is rolled back.
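A minimal sketch of the A + B = 100 check, again with sqlite3. The starting balances (60 and 40) and the helper names are assumptions. The check is done by hand after the statements run, to mirror the validation step described above; it is not a built-in feature of the DBMS.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE account (name TEXT PRIMARY KEY, balance INTEGER);
INSERT INTO account VALUES ('A', 60), ('B', 40);
""")


def total() -> int:
    return conn.execute("SELECT SUM(balance) FROM account").fetchone()[0]


def run_checked(statements) -> bool:
    """Run the statements, then validate A + B = 100; roll back on violation."""
    for sql in statements:
        conn.execute(sql)
    if total() != 100:
        conn.rollback()  # restore the last consistent state
        return False
    conn.commit()
    return True


# Subtracts 10 from A without altering B: violates the rule, so it is rolled back.
bad = run_checked(["UPDATE account SET balance = balance - 10 WHERE name = 'A'"])
# A proper transfer keeps the total at 100 and is committed.
good = run_checked([
    "UPDATE account SET balance = balance - 10 WHERE name = 'A'",
    "UPDATE account SET balance = balance + 10 WHERE name = 'B'",
])
balance_a = conn.execute(
    "SELECT balance FROM account WHERE name = 'A'"
).fetchone()[0]
```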
ISOLATION
Isolation requires that multiple transactions occurring at
the same time do not impact each other's execution.
DURABILITY
Durability ensures that any transaction
committed to the database will not be lost.
Ex: Assume that a transaction transfers 10
from A to B. It removes 10 from A. It then
adds 10 to B. At this point, a "success"
message is sent to the user. However, the
changes are still queued in the disk buffer
waiting to be committed to the disk. Power
fails and the changes are lost. The user
assumes (understandably) that the changes have
been made, so durability has been violated.
CONCURRENCY CONTROL
RECOVERY
Data recovery is the process of salvaging
data from damaged, failed, corrupted, or
inaccessible secondary storage media
when it cannot be accessed normally.
SECURITY
Security risks to database systems include:
Unauthorized or unintended activity or misuse by authorized database
users, database administrators, or network/systems managers, or by
unauthorized users or hackers.
Malware infections causing incidents such as unauthorized access,
leakage or disclosure of personal or proprietary data, deletion of or damage
to the data.
Overloads, performance constraints and capacity issues resulting in
the inability of authorized users to use databases as intended.
Physical damage to database servers caused by computer room fires or
floods, overheating, lightning, accidental liquid spills.
Design flaws and programming bugs in databases and the associated
programs and systems, creating various security vulnerabilities.
Data corruption and/or loss caused by the entry of invalid data or
commands, mistakes in database or system administration processes,
sabotage/criminal damage etc.
BACKUP
ACID PROPERTIES
CONCURRENCY CONTROL
a) Locking
1. Pessimistic concurrency control
2. Optimistic concurrency control
3. Overly optimistic locking
b) Timestamps
LOCKING
A lock is used when multiple users need
to access a database concurrently. This
prevents data from being corrupted or
invalidated when multiple users try to
write to the database.
T1
read-lock(X)
Read(X)
write-lock(Y)
unlock(X)
Read(Y)
Y=Y+X
Write(Y)
unlock(Y)
T2
read-lock(X)
Read(X)
unlock(X)
write-lock(Y)
Read(Y)
Y=Y+X
Write(Y)
unlock(Y)
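The effect of the write lock on Y can be sketched with Python threads standing in for T1 and T2. This is a simplification under stated assumptions: X is never written, so the read lock on X is left out, and a plain `threading.Lock` plays the role of the exclusive write lock. It is not how a DBMS lock manager is actually built.

```python
import threading

data = {"X": 5, "Y": 0}
lock_y = threading.Lock()  # exclusive (write) lock guarding item Y


def transaction(times: int) -> None:
    """Repeat Y = Y + X, holding the lock on Y across the read and the write."""
    for _ in range(times):
        with lock_y:               # write-lock(Y)
            y = data["Y"]          # Read(Y)
            y = y + data["X"]      # Y = Y + X
            data["Y"] = y          # Write(Y)
        # leaving the with-block is unlock(Y)


t1 = threading.Thread(target=transaction, args=(1000,))
t2 = threading.Thread(target=transaction, args=(1000,))
t1.start()
t2.start()
t1.join()
t2.join()
```

Because each read-modify-write of Y happens entirely under the lock, no update from one thread can be overwritten by the other, and Y ends at 2 x 1000 x 5.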
TIMESTAMPS
DATABASE RECOVERY TECHNIQUES
a) DEFERRED UPDATE
TECHNIQUES
Do not update the database until the transaction reaches its commit point.
Before reaching the commit point, all transaction updates are
recorded in the local transaction workspace (or buffers).
During commit, the updates are first recorded persistently in the
log and then written to the DB.
If a transaction fails before reaching its commit point, no UNDO is
needed because it will not have changed the database anyway.
If there is a crash, it may be necessary to REDO the effects of
committed transactions from the Log because their effect may
not have been recorded in the database.
Deferred update is also known as the NO-UNDO/REDO algorithm.
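The steps above can be sketched in a few lines. The dictionary `database`, the list `log` and the `Transaction` class are hypothetical in-memory stand-ins for the stable database, the persistent log and the transaction workspace; a crash is simulated by skipping the database write.

```python
# Hypothetical in-memory stand-ins for the stable database and the log.
database = {"A": 100, "B": 50}
log = []  # records: (txn_id, item, new_value) or (txn_id, "COMMIT")


class Transaction:
    def __init__(self, txn_id: str):
        self.txn_id = txn_id
        self.workspace = {}  # local buffer; the database is not touched yet

    def write(self, item: str, value: int) -> None:
        self.workspace[item] = value

    def commit(self, crash_before_db_write: bool = False) -> None:
        # 1. Record the updates and the commit in the log first.
        for item, value in self.workspace.items():
            log.append((self.txn_id, item, value))
        log.append((self.txn_id, "COMMIT"))
        if crash_before_db_write:
            return  # simulated crash: the log survives, the database is stale
        # 2. Only then apply the updates to the database.
        database.update(self.workspace)


def recover() -> None:
    """REDO the writes of committed transactions; nothing needs UNDO."""
    committed = {rec[0] for rec in log if rec[1:] == ("COMMIT",)}
    for rec in log:
        if len(rec) == 3 and rec[0] in committed:
            _, item, value = rec
            database[item] = value


t1 = Transaction("T1")
t1.write("A", 90)
t1.commit(crash_before_db_write=True)  # committed, but never reached the DB
t2 = Transaction("T2")
t2.write("B", 0)                       # T2 fails before commit: just discard it
before_recovery = dict(database)
recover()
```

T2 never touched the database, so there is nothing to undo; T1's committed write is redone from the log.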
b) IMMEDIATE UPDATE
TECHNIQUES
Use two lists of transactions.
List of all committed transactions since the last
checkpoint
List of all active transactions since the last checkpoint
Undo the writes of all active transactions using the
undo policy
Redo the write operations of all the committed
transactions
Submit all active transactions again to the DBMS
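A minimal sketch of the undo and redo passes, assuming each log record carries both the old and the new value. The log contents and the on-disk state after the crash are invented for the illustration, and resubmitting the active transactions is left out.

```python
# Hypothetical log records: (txn, item, old_value, new_value) or (txn, "COMMIT").
log = [
    ("T1", "A", 100, 90),
    ("T2", "B", 50, 70),
    ("T1", "COMMIT"),
    ("T2", "C", 10, 0),   # T2 is still active when the crash happens
]
# Database state found on disk after the crash (assumed for illustration):
# T1's write to A was not yet flushed, T2's uncommitted writes were.
database = {"A": 100, "B": 70, "C": 0}


def recover(log, database):
    """Build the two lists, undo active transactions, redo committed ones."""
    committed = [r[0] for r in log if len(r) == 2 and r[1] == "COMMIT"]
    active = []
    for r in log:
        if r[0] not in committed and r[0] not in active:
            active.append(r[0])
    # Undo the writes of active transactions, newest first, using old values.
    for r in reversed(log):
        if len(r) == 4 and r[0] in active:
            database[r[1]] = r[2]
    # Redo the writes of committed transactions, oldest first, using new values.
    for r in log:
        if len(r) == 4 and r[0] in committed:
            database[r[1]] = r[3]
    return committed, active


committed, active = recover(log, database)
```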
c) SHADOW PAGING
This technique does not require a log in a single-user environment.
In a multi-user environment, a log may be needed for the concurrency control method.
Shadow paging considers:
The database is partitioned into fixed-length blocks referred to as
PAGES.
The page table has n entries.
Each entry contains a pointer to a page on disk.
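The slide only describes the page table. The sketch below fills in the usual textbook scheme as an assumption: a shadow page table is kept unchanged from the last commit while a current page table is modified copy-on-write. The page contents and function names are hypothetical.

```python
# Hypothetical disk: page id -> contents. Each page table entry points to a
# page on disk; there is one entry per logical page (n = 3 here).
disk = {0: "alpha", 1: "beta", 2: "gamma"}
tables = {
    "shadow": [0, 1, 2],   # page table as of the last commit
    "current": [0, 1, 2],  # working copy used by the running transaction
}


def write_page(logical_no: int, contents: str) -> None:
    """Copy-on-write: store the new version on a fresh page, repoint current."""
    new_id = max(disk) + 1
    disk[new_id] = contents
    tables["current"][logical_no] = new_id


def read_page(which: str, logical_no: int) -> str:
    return disk[tables[which][logical_no]]


def commit() -> None:
    """Install the current table as the new shadow table."""
    tables["shadow"] = list(tables["current"])


def abort() -> None:
    """Discard changes by going back to the shadow table; no log is read."""
    tables["current"] = list(tables["shadow"])


write_page(1, "BETA v2")
after_write = (read_page("current", 1), read_page("shadow", 1))
abort()
after_abort = read_page("current", 1)
write_page(2, "GAMMA v2")
commit()
after_commit = read_page("shadow", 2)
```

The old page is never overwritten, which is why recovery after a failure needs no undo or redo in this scheme.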
e) LOG-BASED RECOVERY
Logging is the most popular
mechanism for implementing recovery
algorithms.
The recovery manager implements:
Commit - by writing a commit record to the
log and flushing the log (satisfies the Redo
Rule)
Abort - by using the transaction's log
records to restore before-images
Restart - by scanning the log and undoing
and redoing operations as necessary
administration
of
is
managing
the
and
DBA RESPONSIBILITIES
Installation, configuration and upgrading of Database server
software and related products.
Evaluate Database features and Database related products.
Establish and maintain sound backup and recovery policies and
procedures.
Take care of the Database design and implementation.
Implement and maintain database security
Database tuning and performance monitoring.
Set up and maintain documentation and standards.
Plan growth and changes (capacity planning).
Work as part of a team and provide 24x7 support when required.
Do general technical troubleshooting and give cons.
Database recovery.
TYPES OF DBAS
Systems DBAs: focus on the physical aspects
such as DBMS installation, configuration,
patching, upgrades, backups, restores,
refreshes, performance optimization,
maintenance and disaster recovery.
Development DBAs: focus on the logical and
development aspects of database
administration.
Application DBAs: usually found in organizations
that have purchased 3rd party application
software such as ERP and CRM systems.
UNIT IV
DISTRIBUTED DATABASE
A logically interrelated
collection of shared data,
physically distributed over a
computer network, is called a
Distributed Database.
[Figure: DDBMS architecture. Sites connected by a computer network; labeled components: DDBMS, DC, LDBMS, GDD, DB]
PARALLEL DBMS
A DBMS running across multiple
processors and disks, designed to
execute operations in parallel whenever
possible, to improve performance.
Parallel DBMSs link multiple, smaller
machines to achieve the same throughput as
a single, larger machine, with greater
scalability and reliability.
TYPES OF DDBMS
Homogeneous DDBMS
All sites use same DBMS product.
Much easier to design and manage.
Approach provides incremental growth and allows
increased performance.
Heterogeneous DDBMS
Sites may run different DBMS products, with possibly
different underlying data models.
Occurs when sites have implemented their own
databases and integration is considered later.
Translations required to allow for:
Different hardware.
Different DBMS products.
Different hardware and different DBMS products.
An object-oriented database management system (OODBMS) is sometimes
shortened to ODBMS (object database management system).
UML
Unified Modeling Language (UML) is a standardized
general-purpose modeling language used to model a
software-intensive system under development.
a) USE CASE DIAGRAM
Used for capturing user requirements.
Works like a contract between end user and
software developers.
Relationships between use cases: <<include>>, <<extend>>
b) CLASS DIAGRAM
Name: Account_Name
Attributes: - Customer_Name, - Balance
Operations: +addFunds( ), +withDraw( ), +transfer( )
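The class box can be turned into code fairly directly. The sketch below is one possible reading: the class is called `Account`, the two attributes are private ("-" in the diagram), and the three operations are public ("+"). The method bodies, the validation rules and the `balance` property are assumptions, since the diagram only gives names.

```python
class Account:
    """Python rendering of the class box; the method bodies are assumptions."""

    def __init__(self, customer_name: str, balance: float = 0.0):
        # The diagram marks both attributes private ("-"); Python signals
        # that by convention with a leading underscore.
        self._customer_name = customer_name
        self._balance = balance

    @property
    def balance(self) -> float:
        return self._balance

    def addFunds(self, amount: float) -> None:
        if amount <= 0:
            raise ValueError("amount must be positive")
        self._balance += amount

    def withDraw(self, amount: float) -> None:
        if amount <= 0 or amount > self._balance:
            raise ValueError("invalid withdrawal amount")
        self._balance -= amount

    def transfer(self, other: "Account", amount: float) -> None:
        """Move funds to another account; withDraw validates before any change."""
        self.withDraw(amount)
        other.addFunds(amount)


a = Account("Asha", 100.0)
b = Account("Ravi")
a.transfer(b, 30.0)
```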
c) SEQUENCE DIAGRAM
Creation
Create message
Object life starts at that point
Activation
Symbolized by rectangular stripes
Place on the lifeline where object is activated.
Rectangle also denotes when object is deactivated.
Deletion
Shown by placing an X on the lifeline
Object's life ends at that point
d) COLLABORATION DIAGRAM
Shows the relationship between objects and the
order of messages passed between them.
The objects are listed as rectangles and arrows
indicate the messages being passed.
The numbers next to the messages are called
sequence numbers. They show the sequence of the
messages as they are passed between the objects.
Convey the same information as sequence
diagrams, but focus on object roles instead of the
time sequence.
e) STATE DIAGRAM
State diagrams show the sequences
of states an object goes through
during its life cycle in response
to stimuli, together with its
responses and actions; an
abstraction of all possible behaviors.
[Figure: state diagrams. Invoice example: Start, Invoice created, Unpaid, paying, Paid, Invoice destroying, End. Second example with states Red, Yellow, Green. Notation labels: State, Transition, Event]
UNIT V
CERTIFICATION PROGRAMS
OVERVIEW
Oracle Certified Professional Credential
(OCP)
Microsoft Certified Database
Administrator (MCDBA)
MySQL Certification
OCA:
Oracle9i DBA Certified Associate
Oracle9iAS Web Administrator Certified Associate
Oracle9i PL/SQL Developer Certified Associate
OCP:
Oracle8i DBA Certified Professional
Oracle9i DBA Certified Professional
Oracle6i Internet Application Developer Certified Professional
Oracle9i Forms Developer Certified Professional
OCM:
Oracle9i Database Administrator Certified Master
TRENDS IN DBMS
Multimedia Databases
Multimedia data typically means digital images, audio, video, animation
and graphics. Multimedia databases support the acquisition, generation,
storage and processing of multimedia data.
Distributed Database
A distributed database is a collection of multiple, logically interrelated
databases of the same system distributed over various sites of a computer
network.
Document-oriented Databases
Each record/document might have a different format (number and size of
fields). They do not store data in tables. Each record is stored as a document
that has certain characteristics. An XML database is a document-oriented
database.
Mobile & embedded Databases
Many daily-use devices contain databases: TVs, washing machines, mobile
phones (e.g. Android phones with the SQLite database). Embedded databases in
cars, airplanes etc. manage configurations and store sensor data. Ex: the db4o
object database used in the BMW Car IT system.