© Leo Mark
DATABASE CONCEPTS
Leo Mark
College of Computing
Georgia Tech
(January 1999)
Database Concepts
Course Contents
● Introduction
● Database Terminology
● Data Model Overview
● Database Architecture
● Database Management System Architecture
● Database Capabilities
● People That Work With Databases
● The Database Market
● Emerging Database Technologies
● What You Will Be Able To Learn More About
INTRODUCTION
● What a Database Is and Is Not
● Models of Reality
● Why use Models?
● A Map Is a Model of Reality
● A Message to Map Makers
● When to Use a DBMS?
● Data Modeling
● Process Modeling
● Database Design
● Abstraction
What a Database Is and Is Not
The word database is commonly used to refer
to any of the following:
● your personal address book in a Word document
● a collection of Word documents
● a collection of Excel Spreadsheets
● a very large flat file on which you run some
statistical analysis functions
● data collected, maintained, and used in airline
reservation
● data used to support the launch of a space shuttle
Models of Reality
[Figure: REALITY (structures, processes) represented in a DATABASE SYSTEM; labels: DATABASE, DDL, DML]
A Message to Map Makers
● A model is a means of communication
● Users of a model must have a certain amount
of knowledge in common
● A model only emphasizes selected aspects
● A model is described in some language
● A model can be erroneous
● A message to map makers: “Highways are
not painted red, rivers don’t have county lines
running down the middle, and you can’t see
contour lines on a mountain” [Kent 78]
When to Use a DBMS?
Use a DBMS when this is important:
● persistent storage of data
● centralized control of data
● control of redundancy
● control of consistency and integrity
● multiple user support
● sharing of data
● data documentation
● data independence
● control of access and security
● backup and recovery
Do not use a DBMS when:
● the initial investment in hardware, software, and training is too high
● the generality a DBMS provides is not needed
● the overhead for security, concurrency control, and recovery is too high
● data and applications are simple and stable
● real-time requirements cannot be met by it
● multiple user access is not needed
Data Modeling
[Figure: data modeling maps REALITY (structures, processes) to a MODEL in the DATABASE SYSTEM]
Database Design
Abstraction
It is very important that the language used for
data representation supports abstraction
● Classification
● Aggregation
● Generalization
Classification
In a classification we form a concept in a way
which allows us to decide whether or not a
given phenomenon is a member of the extension
of the concept.
[Figure labels: CUSTOMER; AIRPLANE, WING, COCKPIT, ENGINE]
Generalization
In a generalization we form a new concept by
emphasizing common aspects of existing concepts,
leaving out special aspects
[Figure: CUSTOMER as a generalization of 1ST CLASS, BUSINESS CLASS, and ECONOMY CLASS]
Generalization (cont.)
Subclasses may overlap
[Figure: CUSTOMER with overlapping subclasses 1ST CLASS and BUSINESS CLASS; labels: classification, extension]
Abstraction        Concretization
classification     exemplification
aggregation        decomposition
generalization     specialization
DATABASE TERMINOLOGY
● Data Models
● Keys and Identifiers
● Integrity and Consistency
● Triggers and Stored Procedures
● Null Values
● Normalization
● Surrogates - Things and Names
Data Model
A data model consists of notations for expressing:
● data structures
● integrity constraints
● operations
Data Model - Data Structures
All data models have notation for defining:
● attribute types
● entity types
● relationship types
[Figure: FLIGHT-SCHEDULE and DEPT-AIRPORT data structures]
Data Model - Integrity Constraints
● Static constraints apply to database state
● Dynamic constraints apply to change of database state
● E.g., “All FLIGHT-SCHEDULE entities must have precisely one DEPT-AIRPORT relationship”
[Figure: FLIGHT-SCHEDULE and DEPT-AIRPORT]
Data Model - Operations
declare C cursor for
select FLIGHT#, WEEKDAY
from FLIGHT-SCHEDULE
where AIRLINE=‘delta’;
open C;
repeat
fetch C into :FLIGHT#, :WEEKDAY;
do your thing;
until done;
close C;
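As a hedged illustration of tuple-at-a-time access, the fetch loop above can be sketched with Python's built-in sqlite3 module. The table name flight_schedule, the column names (flight_no stands in for FLIGHT#, which is not a legal SQL identifier), and the sample rows are assumptions made for this sketch only.

```python
import sqlite3

def delta_flights(conn):
    """Fetch matching rows one at a time through a cursor."""
    cur = conn.cursor()                       # plays the role of cursor C
    cur.execute(
        "SELECT flight_no, weekday FROM flight_schedule WHERE airline = ?",
        ("delta",),
    )
    rows = []
    while True:
        row = cur.fetchone()                  # fetch C into :FLIGHT#, :WEEKDAY
        if row is None:                       # "until done"
            break
        rows.append(row)                      # "do your thing"
    cur.close()                               # close C
    return rows

# Hypothetical sample data for the sketch.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE flight_schedule "
    "(flight_no INTEGER, airline TEXT, weekday TEXT, price INTEGER)"
)
conn.executemany(
    "INSERT INTO flight_schedule VALUES (?, ?, ?, ?)",
    [(101, "delta", "mo", 156), (545, "american", "mo", 110), (101, "delta", "fr", 156)],
)
print(delta_flights(conn))
```

The loop handles one tuple per iteration, in contrast to a set-oriented query that returns the whole result at once.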
Keys and Identifiers
Keys (or identifiers) are uniqueness constraints
● A key on FLIGHT# in FLIGHT-SCHEDULE will force all
FLIGHT#’s to be unique in FLIGHT-SCHEDULE
● Consider the following keys on DEPT-AIRPORT:
[Figure: FLIGHT-SCHEDULE and DEPT-AIRPORT tables illustrating alternative keys; visible sample rows: 242 usair mo 231; 242 bos]
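A minimal sketch of a key acting as a uniqueness constraint, using Python's built-in sqlite3 module. The table and column names are assumptions for illustration (flight_no stands in for FLIGHT#), and the try_insert helper is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# PRIMARY KEY declares flight_no as a key, so duplicate values are rejected.
conn.execute("CREATE TABLE flight_schedule (flight_no INTEGER PRIMARY KEY, airline TEXT)")
conn.execute("INSERT INTO flight_schedule VALUES (242, 'usair')")

def try_insert(flight_no, airline):
    """Return True if the row was accepted, False if the key rejected it."""
    try:
        conn.execute("INSERT INTO flight_schedule VALUES (?, ?)", (flight_no, airline))
        return True
    except sqlite3.IntegrityError:
        return False

duplicate_accepted = try_insert(242, "delta")   # same FLIGHT# as an existing row
new_accepted = try_insert(243, "delta")         # fresh FLIGHT#
print(duplicate_accepted, new_accepted)
```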
Triggers and Stored Procedures
● Triggers can be defined to enforce constraints on a
database, e.g.,
[Figure: FLIGHT-SCHEDULE and DEPT-AIRPORT]
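A hedged sketch of a trigger that enforces a constraint, written for SQLite through Python's sqlite3 module. It is not the trigger from the slide. The schema, the trigger name check_dept_airport, and the rule it checks (a departure airport may only be added for a known flight, and at most once per flight) are assumptions; enforcing "precisely one" would also need a check when flights are inserted or airports deleted.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE flight_schedule (flight_no INTEGER PRIMARY KEY, airline TEXT);
CREATE TABLE dept_airport (flight_no INTEGER, airport_code TEXT);

-- Reject a departure airport for an unknown flight, or a second one for the same flight.
CREATE TRIGGER check_dept_airport
BEFORE INSERT ON dept_airport
BEGIN
    SELECT CASE
        WHEN NOT EXISTS (SELECT 1 FROM flight_schedule WHERE flight_no = NEW.flight_no)
            THEN RAISE(ABORT, 'unknown flight')
        WHEN EXISTS (SELECT 1 FROM dept_airport WHERE flight_no = NEW.flight_no)
            THEN RAISE(ABORT, 'flight already has a departure airport')
    END;
END;

INSERT INTO flight_schedule VALUES (242, 'usair');
""")

def try_insert(flight_no, airport_code):
    """Return True if the trigger let the row through, False if it aborted."""
    try:
        conn.execute("INSERT INTO dept_airport VALUES (?, ?)", (flight_no, airport_code))
        return True
    except sqlite3.DatabaseError:
        return False

results = [try_insert(242, "bos"), try_insert(242, "atl"), try_insert(999, "bos")]
print(results)
```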
Null Values
[Figure: CUSTOMER]
Normalization
FLIGHT-SCHEDULE (redundant)
FLIGHT#  AIRLINE       WEEKDAY  PRICE
101      delta         mo       156
545      american      mo       110
912      scandinavian  fr       450
101      delta         fr       156
545      american      we       110
545      american      fr       110
FLIGHT-SCHEDULE (just right!)
FLIGHT#  AIRLINE       PRICE
101      delta         156
545      american      110
912      scandinavian  450
[Figure: decomposed FLIGHT#/WEEKDAY table; visible rows: 101 fr, 545 we, 545 fr]
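The decomposition above can be tried out with a small sketch using Python's sqlite3 module. The table name flight_weekday and the column names are assumptions, and its rows are derived from the redundant table rather than copied from the slide. The point of the sketch is that after normalization a price is stored once, so a single update is seen by every weekday row when the tables are joined.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE flight_schedule (flight_no INTEGER PRIMARY KEY, airline TEXT, price INTEGER);
CREATE TABLE flight_weekday (
    flight_no INTEGER REFERENCES flight_schedule(flight_no),
    weekday   TEXT,
    PRIMARY KEY (flight_no, weekday)
);
INSERT INTO flight_schedule VALUES (101, 'delta', 156);
INSERT INTO flight_schedule VALUES (545, 'american', 110);
INSERT INTO flight_schedule VALUES (912, 'scandinavian', 450);
INSERT INTO flight_weekday VALUES (101, 'mo');
INSERT INTO flight_weekday VALUES (101, 'fr');
INSERT INTO flight_weekday VALUES (545, 'mo');
INSERT INTO flight_weekday VALUES (545, 'we');
INSERT INTO flight_weekday VALUES (545, 'fr');
INSERT INTO flight_weekday VALUES (912, 'fr');
""")

# One update in one place; no risk of the three 545 rows disagreeing on price.
conn.execute("UPDATE flight_schedule SET price = 120 WHERE flight_no = 545")

rows_545 = conn.execute("""
    SELECT s.flight_no, s.airline, w.weekday, s.price
    FROM flight_schedule s JOIN flight_weekday w ON s.flight_no = w.flight_no
    WHERE s.flight_no = 545
    ORDER BY w.weekday
""").fetchall()
print(rows_545)
```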
Surrogates - Things and Names
[Figure: name-based representation: a customer in reality is represented by the values custom#, name, addr]
[Figure: surrogate-based representation: a customer in reality is represented by a customer surrogate with custom#, name, addr]
● name-based: a thing is what we know about it
● surrogate-based: “Das Ding an sich” [Kant]
● surrogates are system-generated, unique, internal identifiers
DATA MODEL OVERVIEW
● ER-Model
● Hierarchical Model
● Network Model
● Inverted Model - ADABAS
● Relational Model
● Object-Oriented Model(s)
ER-Model
● Data Structures
● Integrity Constraints
● Operations
[Figure: ER notation legend: attribute, composite attribute, multivalued attribute, derived attribute, relationship type, subset relationship type]
ER-Model - Integrity Constraints
[Figure: ER constraint notation: 1:n cardinality between E1 and E2 in R; (min,max) participation of E2 in R; total participation of E2 in R; weak entity type E2; identifying; subclasses E1, E2, E3 marked d (disjoint), x (exclusion), p (partition)]
ER Model - Example
[Figure: example ER diagram; visible labels: visa required, seat#]
ER-Model - Operations
● Several navigational query languages have
been proposed
● A closed query language as powerful as
relational languages has not been developed
● None of the proposed query languages has
been generally accepted
Hierarchical Model
● Data Structures
● Integrity Constraints
● Operations
Hierarchical Model - Data Structures
[Figure: hierarchy with root flight-sched (flight#) and record types flight-inst (date), dept-airp (airport-code), arriv-airp (airport-code), plus a customer-pointer (P)]
Hierarchical Model - Operations
[Figure: flight-sched (flight#); flight-inst (date), dept-airp (airport-code), arriv-airp (airport-code); customer (customer#, customer name)]
GET UNIQUE flight-sched (flight#=‘912’) [search flight-sched; get first such flight-sched]
customer (name=‘Jensen’) [for each customer with name=Jensen, get the first one]
flight-inst (date=‘102298’) [for each flight-inst with date=102298, get the first]
GET NEXT flight-inst [get the next flight-inst, whatever the date]
Network Model
● Data Structures
● Integrity Constraints
● Operations
Network Model - Data Structures
[Figure: type diagram with record types reservation (flight#, date, customer#) and customer (customer#, customer name) and set types FR and CR; occurrence diagram with reservations R1-R6 and customers C1 and C4]
● owner record types: flight-schedule, customer
● member record type: reservation
● DBTG-set types: FR, CR
● n-m relationships cannot be modeled directly
● recursive relationships cannot be modeled directly
Network Model - Integrity Constraints
● keys
● checks
● set retention options:
– fixed
– mandatory
– optional
● set insertion options:
– automatic
– manual
● FR and CR are fixed and automatic
[Figure: flight-schedule with key flight#; reservation (flight#, date, customer#, price) with check price>100; flight-schedule (flight#) owns reservation (flight#, date, customer#) via FR; customer (customer#, customer name) owns reservation via CR]
Network Model - Operations
● The operations in the Network Model are
generic, navigational, and procedural
[Figure: occurrence diagram with reservations R1-R6 linked to customers C1 and C4 through set CR]
Network Model - Operations
● navigation is cumbersome; tuple-at-a-time
● many different currency indicators
● multiple copies of currency indicators may be
needed if the same path is traveled twice
● external schemata are only sub-schemata
Inverted Model - ADABAS
● Data Structures
● Integrity Constraints
● Operations
Relational Model
● Data Structures
● Integrity Constraints
● Operations
Relational Model - Data Structures
● domains
● attributes
● relations
[Figure: relation flight-schedule annotated with its relation name, its attribute names (flight#, airline, weekday, price), and their domain names]
Relational Model - Integrity Constraints
● Keys
● Primary Keys
● Entity Integrity
● Referential Integrity
[Figure: flight-schedule (flight#) and customer (customer#, customer name), each marked p, with reservation (flight#, date, customer#) referring to both]
Relational Model - Operations
● Powerful set-oriented query languages
● Relational Algebra: procedural; describes
how to compute a query; operators like JOIN,
SELECT, PROJECT
● Relational Calculus: declarative; describes
the desired result, e.g. SQL, QBE
● insert, delete, and update capabilities
Relational Model - Operations
● tuple calculus example (SQL)
select flight#, date
from reservation R, customer C
where R.customer#=C.customer#
and customer-name=‘LEO’;
● algebra example (ISBL)
((reservation join customer) where customer-name=‘LEO’) [flight#, date];
● domain calculus example (QBE)
reservation   flight#   date   customer#
              .P        .P     _c
customer      customer#   customer-name
              _c          LEO
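To make the algebra expression concrete, here is a minimal sketch of JOIN, SELECT (where), and PROJECT over relations held as Python lists of dictionaries. The function names and the sample tuples are assumptions for illustration; this is not how ISBL or any DBMS implements the operators.

```python
def join(r, s):
    """Natural join: combine tuples that agree on all shared attribute names."""
    out = []
    for a in r:
        for b in s:
            shared = set(a) & set(b)
            if all(a[k] == b[k] for k in shared):
                out.append({**a, **b})
    return out

def where(r, predicate):
    """Selection: keep the tuples that satisfy the predicate."""
    return [t for t in r if predicate(t)]

def project(r, attrs):
    """Projection: keep the listed attributes and drop duplicate results."""
    seen, out = set(), []
    for t in r:
        key = tuple(t[a] for a in attrs)
        if key not in seen:
            seen.add(key)
            out.append(key)
    return out

# Hypothetical sample relations.
reservation = [
    {"flight#": 912, "date": "102298", "customer#": 1},
    {"flight#": 101, "date": "102398", "customer#": 2},
]
customer = [
    {"customer#": 1, "customer-name": "LEO"},
    {"customer#": 2, "customer-name": "JENSEN"},
]

# ((reservation join customer) where customer-name='LEO') [flight#, date]
result = project(
    where(join(reservation, customer), lambda t: t["customer-name"] == "LEO"),
    ["flight#", "date"],
)
print(result)
```

Each operator takes relations in and gives a relation out, which is what lets the expression nest the way the algebra example does.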
Object-Oriented Model(s)
● based on the object-oriented paradigm,
e.g., Simula, Smalltalk, C++, Java
● area is in a state of flux
class flight-instance {
type tuple (flight-date: tuple ( year: integer, month: integer, day: integer),
instance-of: flight-schedule,
passengers: set (customer) inv customer::reservations)
method add-passenger(new-passenger:customer):boolean,
/*adds to passengers; invokes customer.make-reservation */
remove-passenger(passenger: customer):boolean
/*removes from passengers; invokes customer.cancel-reservation*/
}
class customer {
type tuple (customer#: integer,
customer-name: tuple ( fname: string, lname: string),
reservations: set (flight-instance) inv flight-instance::passengers)
method make-reservation(new-reservation: flight-instance): boolean,
cancel-reservation(reservation: flight-instance): boolean}
Object-Oriented Model - Updates
O2-like syntax
class customer {
type tuple (customer#: integer,
customer-name: tuple ( fname: string, lname: string),
reservations: set (flight-instance) inv flight-instance::passengers)}
main () {
transaction::begin();
all-customers: set( customer); /*makes persistent root to hold all customers */
customer c= new customer; /*creates new customer object */
c= tuple (customer#: 111223333,
customer-name: tuple( fname: “Leo”, lname: “Mark”));
all-customers += set( c); /*c becomes persistent by attaching to root */
transaction::commit();}
Object-Oriented Model - Queries
O2-like syntax
“Find the customer#’s of all customers with first name Leo”
“Find passenger lists, each with a flight# and a list of customer names, for
flights out of Atlanta on October 22, 1998”
ANSI/SPARC 3-Level DB Architecture - separating concerns
[Figure: labels DML and data (repeated four times)]
ANSI/SPARC 3-Level DB Architecture
[Figure: external schema1, external schema2, and external schema3 above the conceptual schema, above the internal schema, above the database]
• external schema: use of data
• conceptual schema: meaning of data
• internal schema: storage of data
Conceptual Schema
● Describes all conceptually relevant, general,
time-invariant structural aspects of the universe
of discourse
● Excludes aspects of data representation and
physical organization, and access
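[Figure: conceptual schema CUSTOMER]
[Figure: external schema MALE-TEEN-CUSTOMER (NAME, ADDR) defined over CUSTOMER]
MALE-TEEN-CUSTOMER(X, Y) =
CUSTOMER(X, Y, S, A)
WHERE S=M AND 12<A<20;
[Figure: internal schema for CUSTOMER with a B+-tree on AGE and an index on NAME (NAME, PTR)]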
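One way to picture an external schema defined over the conceptual schema is an SQL view. The sketch below uses Python's sqlite3 module; the column names, the view name male_teen_customer, and the sample rows are assumptions chosen to mirror the teen-customer definition above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Conceptual level: the full CUSTOMER relation.
CREATE TABLE customer (name TEXT, addr TEXT, sex TEXT, age INTEGER);

-- External level: a view exposing only NAME and ADDR of male teenage customers.
CREATE VIEW male_teen_customer AS
    SELECT name, addr FROM customer
    WHERE sex = 'M' AND age > 12 AND age < 20;

INSERT INTO customer VALUES ('Al', 'Atlanta', 'M', 15);
INSERT INTO customer VALUES ('Bo', 'Boston', 'F', 16);
INSERT INTO customer VALUES ('Cy', 'Chicago', 'M', 30);
""")

view_rows = conn.execute("SELECT name, addr FROM male_teen_customer").fetchall()
print(view_rows)
```

Applications that read the view do not see SEX or AGE, and adding an index on the base table (the internal level) would not change the query.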
Physical Data Independence
[Figure: external schema1, external schema2, external schema3; conceptual schema; schema compiler; schemata]
• Catalog
• Data Dictionary
• Metadatabase
Query Transformer
Uses metadata to transform a query at the external schema level to a query at the storage level
[Figure: a DML query passes through the query transformer, which consults the metadata, to reach the data]
ANSI/SPARC DBMS Framework
[Figure: ANSI/SPARC DBMS framework. Roles: enterprise administrator, database administrator, application system administrator, user. Schema compiler: conceptual schema processor, internal schema processor, external schema processor. Metadata in the center. Query transformer: data storage/internal transformer, internal/conceptual transformer, conceptual/external transformer. Numbered interfaces: 1, 2, 3, 4, 5, 12, 13, 14, 21, 30, 31, 34, 36, 38]
Metadata - What is it?
● System metadata:
– Where data came from
– How data were changed
– How data are stored
– How data are mapped
– Who owns data
– Who can access data
– Data usage history
– Data usage statistics
● Business metadata:
– What data are available
– Where data are located
– What the data mean
– How to access the data
– Predefined reports
– Predefined queries
– How current the data are
ISO-IRDS - example
[Figure: levels metaschema, data dictionary, schema; tables relations (rel-name, att-name, dom-name) and access-rights (user, relation, operation)]
DATABASE MANAGEMENT SYSTEM ARCHITECTURE
● Teleprocessing Database
● File-Sharing Database
● Client-Server Database - Basic
● Client-Server Database - w/Caching
● Distributed Database
● Federated Database
● Multi-Database
● Parallel Databases
Teleprocessing Database
[Figure: three dumb terminals connected by communication lines to a central computer (OSTP, OSDB) holding the database (DB)]
Teleprocessing Database - characteristics
● Dumb terminals
● APs, DBMS, and DB reside on central computer
● Communication lines are typically phone lines
● Screen formatting transmitted via communication
lines
● User interface character oriented and primitive
● Dumb terminals are gradually being replaced by
micros
File-Sharing Database
[Figure: AP1, AP2, AP3 connected by LAN to a file-server micro (OSDB) holding the database (DB)]
File-Sharing Database - characteristics
● APs and DBMS on client micros
● File-Server on server micro
● Clients and file-server communicate via LAN
● Substantial traffic on LAN because large files
(and indices) must be sent to DBMS on
clients for processing
● Substantial lock contention for extended
periods of time for the same reason
● Good for extensive query processing on
downloaded snapshot data
● Bad for high-volume transaction processing
Client-Server Database - Basic
[Figure: AP1, AP2, AP3 on micros (OSNET) connected by LAN to a server micro or mainframe (OSNET, DBMS, OSDB) holding the database (DB)]
Client-Server Database - Basic - characteristics
● APs on client micros
● Database-server on micro or mainframe
● Multiple servers possible; no data replication
● Clients and database-server communicate via
LAN
● Considerably less traffic on LAN than with
file-server
● Considerably less lock contention than with
file-server
Client-Server Database - w/Caching
[Figure: AP1, AP2, AP3 on micros, each with DBMS, OSNET, and a local DB, connected by LAN to a server micro or mainframe (OSNET, DBMS, OSDB) holding the database (DB)]
Client-Server Database - w/Caching - characteristics
● DBMS on server and clients
● Database-server is primary update site
● Downloaded queries are cached on clients
● Change logs are downloaded on demand
● Cached queries are updated incrementally
● Less traffic on LAN than with basic client-
server database because only initial query
result is downloaded followed by change logs
● Less lock contention than with basic client-
server database for same reason
Distributed Database
[Figure: AP1, AP2, AP3 on micro(s) or mainframes running DDBMS and OSNET&DB, connected by a network; labels conceptual and internal; three DBs]
Distributed Database - characteristics
● APs and DDBMS on multiple micros or mainframes
● One distributed database
● Communication via LAN or WAN
● Horizontal and/or vertical data fragmentation
● Replicated or non-replicated fragment allocation
● Fragmentation and replication transparency
● Data replication improves query processing
● Data replication increases lock contention and
slows down update transactions
Distributed Database - Alternatives
[Figure: fragments A, B, C, D allocated to sites under three alternatives: partitioned and non-replicated; non-partitioned and replicated; partitioned and replicated. Axes (+ to -): increasing cost, complexity, difficulty of control, security risk; increasing parallelism, independence, flexibility, availability]
Federated Database
[Figure: AP1, AP2, AP3 on micro(s) or mainframes running DDBMS and OSNET&DB, connected by a network; federation schema; three DBs]
Federated Database - characteristics
● Each federate has a set of APs, a DDBMS,
and a DB
● Part of a federate’s database is exported,
i.e., accessible to the federation
● The union of the exported databases
constitutes the federated database
● Federates will respond to query and update
requests from other federates
● Federates have more autonomy than with a
traditional distributed database
Multi-Database
[Figure: three separate DBs]
Multi-Database - characteristics
● A multi-database is a distributed database
without a shared schema
● A multi-DBMS provides a language for
accessing multiple databases from its APs
● A multi-DBMS accesses other databases via
a network, like the WWW
● Participants in a multi-database may respond
to query and update requests from other
participants
● Participants in a multi-database have the
highest possible level of autonomy
Parallel Databases
● A database in which a single query may be
executed by multiple processors working
together in parallel
● There are three types of systems:
– Shared memory
– Shared disk
– Shared nothing
Parallel Databases - Shared Memory
● processors share memory via bus
● extremely efficient processor communication via memory writes
● bus becomes the bottleneck
● not scalable beyond 32 or 64 processors
[Figure: several processors (P) sharing one memory (M) and disk over a bus; legend: P processor, M memory, disk]
Parallel Databases - Shared Disk
● processors share disk via interconnection network
● memory bus not a bottleneck
● fault tolerance wrt. processor or memory failure
● scales better than shared memory
● interconnection network to disk subsystem is a bottleneck
● used in ORACLE Rdb
[Figure: four processor (P) and memory (M) pairs sharing disks over an interconnection network]
Parallel Databases - Shared Nothing
● scales better than shared memory and shared disk
● main drawbacks:
– higher processor communication cost
– higher cost of non-local disk access
● used in the Teradata database machine
[Figure: four nodes, each with its own processor (P) and memory (M)]
RAID - redundant array of inexpensive disks
● disk striping improves performance via parallelism (assume 4 disks worth of data is stored)
[Figure: disk array diagram]
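A minimal sketch of why striping helps, assuming simple round-robin block-level striping over four disks and assuming every disk can deliver one block per time step. Both assumptions, and the function names, are made up for this illustration; real RAID levels differ in stripe size and redundancy.

```python
NUM_DISKS = 4  # assumed array size for the sketch

def locate(block):
    """Map a logical block number to (disk, offset on that disk) under round-robin striping."""
    return block % NUM_DISKS, block // NUM_DISKS

def parallel_read_steps(num_blocks):
    """Time steps to read num_blocks when all disks read one block per step in parallel."""
    return -(-num_blocks // NUM_DISKS)  # ceiling division

def single_disk_read_steps(num_blocks):
    """Time steps to read the same blocks from one disk, one block per step."""
    return num_blocks

print(locate(5), parallel_read_steps(8), single_disk_read_steps(8))
```

Consecutive blocks land on different disks, so a sequential scan keeps all four disks busy at once.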
DATABASE CAPABILITIES
● Data Storage
● Queries
● Optimization
● Indexing
● Concurrency Control
● Recovery
● Security
Data Storage
● Disk management
● File management
● Buffer management
● Garbage collection
● Compression
Queries
SQL queries are composed from the following:
● Selection
– Point
– Range
– Conjunction
– Disjunction
● Join
– Natural join
– Equi join
– Theta join
– Outer join
● Projection
● Set operations
– Cartesian Product
– Union
– Intersection
– Set Difference
● Other
– Duplicate elimination
– Sorting
– Built-in functions: count, sum, avg, min, max
● Recursive (not in SQL)
Query Optimization
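select flight#, date
from reserv R, cust C
where R.cust#=C.cust#
and cust-name=‘LEO’;
reserv (flight#, date, cust#): 10,000 reserv blocks
customer (cust#, cust-name): 3,000 cust blocks, 30 “Leo” blocks
[Figure: two query trees over reserv and cust. Tree 1 joins reserv and cust on cust# (cost: 10,000x3,000) and then selects cust-name=Leo. Tree 2 selects cust-name=Leo on cust first (cost: 3,000) and then joins the result with reserv on cust# (cost: 10,000x30)]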
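The cost figures above can be compared with a few lines of arithmetic. This sketch assumes the costs are block accesses and that the total for the second plan is the sum of its two annotated steps; the variable names are made up for the illustration.

```python
RESERV_BLOCKS = 10_000   # blocks in reserv
CUST_BLOCKS = 3_000      # blocks in cust
LEO_BLOCKS = 30          # cust blocks left after selecting cust-name = 'Leo'

# Plan 1: join everything first, select afterwards.
join_first = RESERV_BLOCKS * CUST_BLOCKS

# Plan 2: scan cust once for the selection, then join reserv with the small result.
select_first = CUST_BLOCKS + RESERV_BLOCKS * LEO_BLOCKS

print(join_first, select_first, join_first // select_first)
```

Under these assumptions, pushing the selection below the join cuts the cost from 30,000,000 to 303,000 block accesses, roughly a factor of 99.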
Query Optimization
● Database statistics
● Query statistics
● Index information
● Algebraic manipulation
● Join strategies
– Nested loops
– Sort-merge
– Index-based
– Hash-based
Indexing
Why Bother?
● Disk access time: 0.01-0.03 sec
● Memory access time: 0.000001-0.000003 sec
● Databases are I/O bound
● Rate of improvement of
Indexing (cont.)
● Clustering vs. non-clustering
● Primary and secondary indices
● I/O cost for lookup:
– Heap: N/2
– Sorted file: log_2(N)
– Single-level index: log_2(n)+1
– Multi-level index; B+-tree: log_fanout(n)+1
– Hashing: 2-3
● View caching; incremental computation
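The lookup-cost formulas can be evaluated for a concrete file. The sketch assumes that N is the number of data blocks, n the number of index blocks, and that logarithms are rounded up to whole block accesses; the sizes used (10,000 data blocks, 100 index blocks, fanout 100) are made-up example values.

```python
def ceil_log(x, base):
    """Smallest k with base**k >= x, computed with integers to avoid float rounding."""
    k, power = 0, 1
    while power < x:
        power *= base
        k += 1
    return k

def lookup_costs(data_blocks, index_blocks, fanout):
    """Approximate block accesses per lookup for each file organization."""
    return {
        "heap": data_blocks // 2,                          # N/2
        "sorted file": ceil_log(data_blocks, 2),           # log_2(N)
        "single-level index": ceil_log(index_blocks, 2) + 1,   # log_2(n)+1
        "B+-tree": ceil_log(index_blocks, fanout) + 1,     # log_fanout(n)+1
    }

costs = lookup_costs(data_blocks=10_000, index_blocks=100, fanout=100)
print(costs)
```

With these example sizes the heap needs thousands of accesses per lookup while the B+-tree needs two, which is the motivation for indexing an I/O-bound system.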
Concurrency Control
flight-inst (flight#, date, #avail-seats)
reserv (flight#, date, customer#)
T1:                                        T2:
read(flight-inst(flight#,date))
seats:=#avail-seats
                                           read(flight-inst(flight#,date))
if seats>0 then {                          seats:=#avail-seats
seats:=seats-1                             if seats>0 then {
                                           seats:=seats-1
                                           write(reserv(flight#,date,customer2))
                                           write(flight-inst(flight#,date,seats))}
write(reserv(flight#,date,customer1))
write(flight-inst(flight#,date,seats))}
overbooking!
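The interleaving above can be replayed deterministically. This is a sketch only: the database is a plain dictionary, the two transactions are written out step by step in the order of the schedule, and all names are assumptions. It contrasts the interleaved schedule with a serial one.

```python
def interleaved(avail_seats):
    """Replay the T1/T2 schedule: both read the seat count before either writes."""
    db = {"avail_seats": avail_seats, "reserv": []}
    t1_seats = db["avail_seats"]            # T1: read flight-inst
    t2_seats = db["avail_seats"]            # T2: read flight-inst (same stale value)
    t1_ok = t1_seats > 0
    if t1_ok:
        t1_seats -= 1
    if t2_seats > 0:
        t2_seats -= 1
        db["reserv"].append("customer2")    # T2: write reserv
        db["avail_seats"] = t2_seats        # T2: write flight-inst
    if t1_ok:
        db["reserv"].append("customer1")    # T1: write reserv
        db["avail_seats"] = t1_seats        # T1: overwrites T2's update
    return db

def book(db, customer):
    """One whole reservation transaction, run without interruption."""
    seats = db["avail_seats"]
    if seats > 0:
        db["reserv"].append(customer)
        db["avail_seats"] = seats - 1

def serial(avail_seats):
    """Run T1 to completion, then T2."""
    db = {"avail_seats": avail_seats, "reserv": []}
    book(db, "customer1")
    book(db, "customer2")
    return db

print(interleaved(1))   # two reservations for one seat
print(serial(1))        # one reservation
```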
Concurrency Control (cont.)
ACID Transactions:
● An ACID transaction is a sequence of database
operations that has the following properties:
● Atomicity
– Either all operations are carried out, or none is
– This property is the responsibility of the concurrency
control and the recovery sub-systems
● Consistency
– A transaction maps a correct database state to another
correct state
– This requires that the transaction is correct, which is the
responsibility of the application programmer
Concurrency Control (cont.)
● Isolation
– Although multiple transactions execute concurrently, i.e.
interleaved, not parallel, they appear to execute
sequentially
– This is the responsibility of the concurrency control sub-system
● Durability
– The effect of a completed transaction is permanent
– This is the responsibility of the recovery manager
Concurrency Control (cont.)
● Serializability is a good definition of correctness
● A variety of concurrency control protocols exist
– Two-phase locking (2PL)
● deadlock and livelock possible
● deadlock prevention: wait-die, wound-wait
● deadlock detection: rollback a transaction
– Optimistic protocol: proceed optimistically; back up and
repair if needed
– Pessimistic protocol: do not proceed until knowing that no
back up is needed
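The two deadlock-prevention schemes named above differ only in who gives way when a lock request conflicts. A minimal sketch, assuming each transaction carries a start timestamp and a smaller timestamp means an older transaction; the function names and return strings are made up for the illustration.

```python
def wait_die(requester_ts, holder_ts):
    """Wait-die: an older requester waits; a younger requester is aborted (dies)."""
    return "requester waits" if requester_ts < holder_ts else "requester aborts"

def wound_wait(requester_ts, holder_ts):
    """Wound-wait: an older requester aborts (wounds) the holder; a younger requester waits."""
    return "holder aborts" if requester_ts < holder_ts else "requester waits"

# Old transaction (timestamp 1) and young transaction (timestamp 2) in conflict.
print(wait_die(1, 2), "|", wait_die(2, 1))
print(wound_wait(1, 2), "|", wound_wait(2, 1))
```

In both schemes only the younger transaction is ever aborted, so waits cannot form a cycle.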
Recovery
[Figure: tables reserv and flight-inst]
step                                                  flight-inst 102298   flight-inst 102398
change-reservation(DL212,102298,DL212,102398,C)       100                  50
read(flight-inst(DL212,102298))                       100                  50
#avail-seats:=#avail-seats+1                          100                  50
update(flight-inst(DL212,102298,#avail-seats))        101                  50
read(flight-inst(DL212,102398))                       101                  50
#avail-seats:=#avail-seats-1                          101                  50
update(flight-inst(DL212,102398,#avail-seats))        101                  49
update(reserv(DL212,102298,C,DL212,102398,C))         101                  49
Recovery (cont.)
Storage types:
● Volatile: main memory
● Nonvolatile: disk
Errors:
● Logical error: transaction fails; e.g. bad input, overflow
● System error: transaction fails; e.g. deadlock
● System crash: power failure; main memory lost, disk
survives
● Disk failure: head crash, sabotage, fire; disk lost
What to do?
Recovery (cont.)
● Deferred update (NO-UNDO/REDO):
– don’t change database until ready to commit
– write-ahead to log on disk
– change the database
● Immediate update (UNDO/NO-REDO):
– write-ahead to log on disk
– update database anytime
– commit not allowed until database is completely updated
● Immediate update (UNDO/REDO):
– write-ahead to log on disk
– update database anytime
– commit allowed before database is completely updated
● Shadow paging (NO-UNDO/NO-REDO):
– write-ahead to log on disk
– keep shadow page; update copy only; swap at commit
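A simplified sketch of UNDO/REDO recovery from a write-ahead log, under strong assumptions: the database is a dictionary, each update record stores both the old and the new value, there are no checkpoints, and concurrent transactions do not touch the same item. The record layout and the seat counts (taken from the change-reservation example) are used purely for illustration.

```python
def recover(disk, log):
    """Rebuild a consistent state from the on-disk database and the log after a crash."""
    db = dict(disk)
    committed = {txn for kind, txn, *rest in log if kind == "commit"}
    # REDO: reapply updates of committed transactions in log order.
    for kind, txn, *rest in log:
        if kind == "update" and txn in committed:
            item, old, new = rest
            db[item] = new
    # UNDO: roll back updates of uncommitted transactions in reverse log order.
    for kind, txn, *rest in reversed(log):
        if kind == "update" and txn not in committed:
            item, old, new = rest
            db[item] = old
    return db

# Crash after the first seat update reached disk but before the second one did.
disk_at_crash = {"DL212/102298": 101, "DL212/102398": 50}
log = [
    ("update", "T1", "DL212/102298", 100, 101),
    ("update", "T1", "DL212/102398", 50, 49),
]

undone = recover(disk_at_crash, log)                      # T1 never committed
redone = recover(disk_at_crash, log + [("commit", "T1")]) # T1 committed before the crash
print(undone, redone)
```

Either way the two flights end up consistent: both changes are rolled back, or both are applied.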
Security
DAC: Discretionary Access Control
● is used to grant/revoke privileges to users,
including access to files, records, fields (read,
write, update mode)
MAC: Mandatory Access Control
● is used to enforce multilevel security by
classifying data and users into security levels
and allowing users access to data at their
own or lower levels only
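The mandatory access rule stated above (users may access data at their own or lower levels only) can be written as a single comparison. The level names and their ordering are assumptions for this sketch; the slide does not name specific levels.

```python
# Assumed security levels, ordered from lowest to highest.
LEVELS = ["unclassified", "confidential", "secret", "top secret"]

def can_access(user_level, data_level):
    """Allow access only when the data's level is at or below the user's level."""
    return LEVELS.index(data_level) <= LEVELS.index(user_level)

print(can_access("secret", "confidential"), can_access("confidential", "secret"))
```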
PEOPLE THAT WORK WITH DATABASES
● System Analysts
● Database Designers
● Application Developers
● Database Administrators
● End Users
System Analysts
● communicate with each prospective database
user group in order to understand its
– information needs
– processing needs
● develop a specification of each user group’s
information and processing needs
● develop a specification integrating the
information and processing needs of the user
groups
● document the specification
Database Designers
● choose appropriate structures to represent
the information specified by the system
analysts
● choose appropriate structures to store the
information in a normalized manner in order
to guarantee integrity and consistency of data
● choose appropriate structures to guarantee
an efficient system
● document the database design
Application Developers
● implement the database design
● implement the application programs to meet
the program specifications
● test and debug the database implementation
and the application programs
● document the database implementation and
the application programs
Database Administrators
● Manage the database structure
– participate in database and application development
– assist in requirement analysis
– participate in database design and creation
– develop procedures for integrity and quality of data
– facilitate changes to database structure
– seek communitywide solutions
– assess impact on all users
– provide configuration control
– be prepared for problems after changes are made
– maintain documentation
Database Administrators (cont.)
● Manage data activity
– establish database standards consistent with data
administration standards
– establish and maintain data dictionary
– establish data proponencies
– work with data proponents to develop data access
and modification rights
– develop, document, and train staff on backup and
recovery procedures
– publish and maintain data activity standards
documentation
Database Administrators (cont.)
● Manage the database management system
– generate database application performance reports
– investigate user performance complaints
– assess need for changes in database structure or
application design
– modify database structure
– evaluate and implement new DBMS features
– tune the database
● Establish the database data dictionary
– data names, formats, relationships
– cross-references between data and application
programs
– (see metadata slide)
End Users
● Parametric end users constantly query and
update the database. They use canned
transactions to support standard queries and
updates.
● Casual end users occasionally access the
database, but may need different information
each time. They use sophisticated query
languages and browsers.
● Sophisticated end users have complex
requirements and need different information
each time. They are thoroughly familiar with
the capabilities of the DBMS.
THE DATABASE MARKET
● Prerelational vs. Relational
● Database Vendors
● Relational Database Products
● Relational Databases for PCs
● Object-Oriented Database Capabilities
Prerelational vs. Relational
[Figure: prerelational vs. relational market revenue in billion $ (axis 0 to 14), 1994-1999]
● Object-Oriented market revenue about 150 million/year
Database Vendors
[Figure: pie chart of database vendor revenue. Source: IDC, 1995. Total: $7,847M]
● Oracle ($1,755M)
● Informix (+Illustra) ($492M)
● NEC ($211M)
● Fujitsu ($186M)
● Software AG (ADABAS) ($136M)
● Hitachi ($117M)
● Other ($2,272M)
● Also labeled in the chart: Informix, CA, Sybase, Oracle
Relational Database Products
We compare the following products:
● ORACLE 7 Version 7.3
● Sybase SQL Server 11
● Informix OnLine 7.2
● Microsoft SQL Server 6.5
● IBM DB2 2.1.1
● CA-OpenIngres 1.2
Relational Database Products
COMPARISON CRITERIA | ORACLE7 VERSION 7.3 | SYBASE SQL SERVER 11 | INFORMIX ONLINE 7.2
Relational Model
Domains | no | no | no
Referential Integ. violation options | restrict, except cascading delete | restrict only | restrict, except cascading delete
Tailor referential messages | no | no | no
Referential WHERE clause | no | no | no
Updatable views w/check option | yes | yes | yes
Database Objects
User-defined data types | yes | yes | no
BLOBs | yes | yes | yes
Additional data types | image, video, text, messaging, spatial data types | binary, image, text, money, bit, varbinary | byte, text up to 2GB
Table structure | heap, clustered | heap, clustered | no choice
Index structure | B-tree, bitmap, hash | B-tree | B+-tree, clustered
Tuning facilities | table and index allocation | index pre-fetch, I/O buffer cache, block size, table partitioning | extents, table fragmentation by expression or round robin
Relational Database Products
COMPARISON CRITERIA | MICROSOFT SQL SERVER 6.5 | IBM DB2 2.1.1 | CA-OPENINGRES 1.2
Relational Model
Domains | no | no | no
Ref. integrity w/check option | restrict | restrict, cascade, set null | restrict only
Tailor referential messages | no | no | no
Referential WHERE clause | no | no | no
Updatable views w/check option | yes | yes, including union views | yes
Database objects
User-defined data types | yes | yes | yes
BLOBs | yes | yes | yes
Additional data types | | large objects | byte, longbyte, long varchar, spatial, varbyte, money
Table structure | no choice | no choice | B-tree, hash, heap, ISAM
Index structure | clustered | clustered | B-tree, hash, ISAM
Tuning facilities | fill factors, allocation | table & index allocation, cluster ratio, cluster factor | table & index alloc., fill factors, pre-allocation
Relational Database Products
COMPARISON CRITERIA | ORACLE7 VERSION 7.3 | SYBASE SQL SERVER 11 | INFORMIX ONLINE 7.2
Triggers
Level | row&set-based | set-based | row&set-based
Timing | before, after | after | before, after, each
Nesting | yes | yes | yes
Stored procedures
Language | PL/SQL | Transact-SQL | SPL
Nesting | yes | yes | yes
Cursors | yes | yes | yes
External calls | RPC | RPC | system calls
Events | yes | time-based | no
Queries
Locking level | table, row | table, page | db, table, page, row
ANSI SQL comply | entry level SQL92 | entry level SQL92 | entry level SQL92
Cursors | forward | forward | forward, backward
Outer join | yes | yes | yes
ANSI syntax | no | no | no
APIs | ODBC | DBLIB, CTLIB, ODBC | ESQL, TP/XA, CLI, ODBC
Relational Database Products COMPARISON MICROSOFT SQL IBM DB22.1.1 CA-
CRITERIA DERVER6.5 OPENINGRES1.2
Triggers
Level set-based set&row-based row-based
Timing after before,after after
Nesting yes yes yes
Stored procedures
Language Transact-SQL SQL, 3GL SQL-like
Nesting yes yes yes
Cursors yes yes no
External calls system call yes no(db events)
Events no user-def functions db event alerters
Queries
Locking level db,table, page,row db,table, page,row db,table,page
ANSI SQL comply entry level SQL92 entry level SQL92 entry level SQL92
Cursors forward,backward forward forward
,relative,absolute
Outer join yes yes yes
ANSI syntax no no yes
APIs ESQL,DBLIB,ODBC, ESQL,,ODBC ESQL,TP/XA,ODBC
Dist mgt objects
Relational Database Products
COMPARISON ORACLE7 SYBASE SQL INFORMIX
CRITERIA SERVER11 ONLINE7.2
Database Admin
Tools Oracle Enterp Mgr Sybase SQL Mgr SMI,DB/Cockpit,
Performance Pack SQL Monitor OnPerf
SNMP support yes yes no
Security C2(trusted Oracle) C2 C2,B1online secur
Partial backup & configurable configurable no
recovery
Internet
Internet support OracleWebServer web.sql ESQL,4GLCGI,
Interface Kit
Connectivity,
Distribution
Gateways to other DBMSs
ORACLE7: MVS source through EDA/SQL (Adabas, IDMS, SQL/DS, VSAM), any APPC source, AS/400, DRDA, DB2, Turboimage, Sybase, Rdb, RMS, Informix, CA-Ingres, SQL Server, Teradata
SYBASE SQL SERVER11: Adabas, AS/400, DB2, IDMS, IMS, Informix, Ingres, ISAM, SQL Server, Oracle, Rdb, RMS, seq. files, SQL/DS, Sybase SQL Server, Teradata, VSAM
INFORMIX ONLINE7.2: Oracle, Sybase, IMS, DB2
Distributed DBs part of base prod OmniConnect Online server
2PC protocol yes yes yes, presume abort
Heterogeneous gateways DirectConnect no
Optimization yes yes yes
RPC yes yes no
Relational Database Products
COMPARISON MICROSOFT SQL IBM DB2 2.1.1 CA-
CRITERIA SERVER6.5 OPENINGRES1.2
Database Admin
Tools Enterprise Mgr, DB Director,Perf IPM, VisualDBA,
Perf Monitor Monitor, IMA
Visual Explain
SNMP support yes yes yes
Security NT integrated three levels C2
Partial backup & per table yes per table
recovery
Internet
Internet support Internet Info Serv DB2 WWW CA-OpenIngres/
(WindowsNT) Connection ICE
Connectivity,
Distribution
Gateways to other DBMSs
MICROSOFT SQL SERVER6.5: no
IBM DB2 2.1.1: Oracle, Sybase, Informix, MS SQL Server
CA-OPENINGRES1.2: DB2, Datacom, IMS, IDMS, VSAM, Oracle, Rdb, Allbase, Informix, Oracle, Sybase
Distributed DBs no DataJoiner CA-OpenIngres*
2PC protocol n/a yes yes,automatic
Heterogeneous no DataJoiner through gateways
Optimization no yes yes
RPC yes no no
Relational Database Products
COMPARISON ORACLE SYBASE SQL INFORMIX
CRITERIA VERSION7.3 SERVER11 ONLINE7.2
Replication
Recording replic. log/trigger log buffer log
Hot standby yes yes yes
Peer-to-peer yes yes no
To other DBMSs through gateways DirectConnect no
Cascading yes yes no
Additional
restrictions
Name length 30 30 18
Columns 254 250 2767
Column size 2GB 1962 32,767
Tables n/a 2 billion 477 million
Table size n/a storage dependent 64 terabytes
Table width by column storage dependent 32,767
Platforms (OS)
ORACLE VERSION7.3: most UNIX, OS/2, VAX/VMS, MAC, WindowsNT, Windows95
SYBASE SQL SERVER11: most UNIX, OS/2, VAX/VMS, MAC, WindowsNT, Windows95
INFORMIX ONLINE7.2: most UNIX, WindowsNT, Windows95
Relational Database Products
COMPARISON MICROSOFT SQL IBM DB2 2.1.1 CA-
CRITERIA SERVER6.5 OPENINGRES1.2
Replication
Recording log log rules(triggers)
Hot standby yes yes yes
Peer-to-peer no yes yes
To other DBMSs through ODBC DataJoiner through gateways
Cascading no no yes
Additional
restrictions
Name length 30 18 32
Columns 250 255 300
Column size 255 4005, except LOB 2008 (BLOBs 2GB)
Tables 2 billion storage dependent n/a
Table size 2 terabytes 64GB n/a
Table width 2048 storage dependent 2008 (BLOBs 2GB)
Platforms (OS)
MICROSOFT SQL SERVER6.5: WindowsNT
IBM DB2 2.1.1: most UNIX, OS/2, VAX/VMS, MAC, WindowsNT, Windows95
CA-OPENINGRES1.2: most UNIX, VAX/VMS, WindowsNT, Windows95 (CA-OpenIngres/Desktop)
Relational Databases for PCs
Relational databases for PCs include:
● Microsoft FoxPro for Windows
● Microsoft FoxPro for DOS
● Borland’s Paradox for Windows
● Borland’s dBASE IV
● Paradox for DOS
● R:BASE
● Microsoft Access
Object-Oriented Database Capabilities
GemStone ONTOS ORION-2 Statice VERSANT
Primary Coop CAD/CAM CAD/CAM - Colab.
Use environ. OIS, MM engineer
Version yes yes yes limited yes
Mgt.
Recovery shadowp yes logs & REDO log -
shadowp
Transac. yes yes yes yes yes
Mgt.
Composite no no yes yes yes
Objects
Multiple no yes yes yes yes
Inherit. planned
Concur/ 3 locks 4 locks 5 locks 2PL 4 locks,
Locking optim 2PL
pessim
Distribute yes yes yes yes yes
Support
Dynamic yes yes yes - yes
Evolution limited limited all feature limited
Multimedia yes no yes yes no
Language C,C++,OPAL C++ LISP, C Common C, C++
Interface Smalltalk LISP
Platforms SUN3&4, SUN3&4 Symbolics, Symbolics SUN3&4
Apollo,PCs, OS/2 SUN3, HP,
VAX/VMS VAX/VMS DECstation,
Apollo
Special change Object SQL change browser, change
Feature notific. notific. dev. tools notific.
pri/sha db pri/sha db
O2 Starburst
Primary CAD/CAM, CAD/CAM,
Use GIS, OIS KBS
Version limited no
WHAT YOU WILL BE ABLE TO
LEARN MORE ABOUT