
Chapter 1: Introduction and Basic Concepts ([S] chp. 1)

Purpose of Database Systems
View of Data
Data Models
Data Definition Language
Data Manipulation Language
Transaction Management
Storage Management
Database Administrator
Database Users
Overall System Structure

Database System Concepts

1.1

Silberschatz, Korth and Sudarshan

A database represents some aspect of the real world, sometimes called the mini-world or the Universe of Discourse (UoD).

A database is a logically coherent collection of data with some inherent meaning.

A random assortment of data cannot correctly be referred to as a database. A database is designed, built, and populated with data for a specific purpose. It has an intended group of users and some preconceived applications in which these users are interested.


What Is a DBMS?

Database: a very large, integrated collection of data that models a real-world enterprise.
  Entities (e.g., students, courses)
  Relationships (e.g., Madonna is taking CS564)

A Database Management System (DBMS) is a software package designed to store and manage databases.


Database Management System (DBMS)


Collection of interrelated data
Set of programs to access the data
DBMS contains information about a particular enterprise
DBMS provides an environment that is both convenient and efficient to use
Database applications:
  Banking: all transactions
  Airlines: reservations, schedules
  Universities: registration, grades
  Sales: customers, products, purchases
  Manufacturing: production, inventory, orders, supply chain
  Human resources: employee records, salaries, tax deductions
Databases touch all aspects of our lives


Purpose of Database System

In the early days, database applications were built on top of file systems.
Drawbacks of using file systems to store data:
  Data redundancy and inconsistency
    Multiple file formats, duplication of information in different files
  Difficulty in accessing data
    Need to write a new program to carry out each new task
  Data isolation: multiple files and formats
  Integrity problems
    Integrity constraints (e.g., account balance > 0) become part of program code
    Hard to add new constraints or change existing ones
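
For contrast with the file-system approach above, here is a minimal sketch of how a DBMS lets the "account balance > 0" rule be declared once in the schema instead of being repeated in every program. It uses Python's standard sqlite3 module; the table layout, the helper try_withdraw, and the amounts are assumptions made for illustration, not something taken from the slides.

```python
import sqlite3

def try_withdraw(conn, account_number, amount):
    """Attempt a withdrawal; return True if the database accepted it."""
    try:
        with conn:  # commits on success, rolls back if an error is raised
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE account_number = ?",
                (amount, account_number),
            )
        return True
    except sqlite3.IntegrityError:
        # The CHECK constraint rejected the update; no program-side test was needed.
        return False

conn = sqlite3.connect(":memory:")
# The integrity constraint lives in the schema, not in application code.
conn.execute(
    "CREATE TABLE account ("
    " account_number TEXT PRIMARY KEY,"
    " balance INTEGER NOT NULL CHECK (balance > 0))"
)
conn.execute("INSERT INTO account VALUES ('A-101', 500)")  # hypothetical balance
conn.commit()

accepted = try_withdraw(conn, "A-101", 200)  # would leave 300: allowed
rejected = try_withdraw(conn, "A-101", 300)  # would leave 0: violates the CHECK
balance = conn.execute("SELECT balance FROM account").fetchone()[0]
```

Changing the rule later means altering one table definition rather than hunting through every program that touches the file.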


Purpose of Database Systems (Cont.)

Drawbacks of using file systems (cont.):
  Atomicity of updates
    Failures may leave the database in an inconsistent state with partial updates carried out
    E.g., a transfer of funds from one account to another should either complete or not happen at all
  Concurrent access by multiple users
    Concurrent access is needed for performance
    Uncontrolled concurrent accesses can lead to inconsistencies
    E.g., two people reading a balance and updating it at the same time
  Security problems
Database systems offer solutions to all the above problems


Why Use a DBMS?

Separation of the data definition and the program
Abstraction into a simple model
Data independence and efficient access
Reduced application development time; ad-hoc queries
Data integrity and security
Uniform data administration
Concurrent access, recovery from crashes
Support for multiple different views


Why Study Databases??

Shift from computation to information
  at the low end: scramble to webspace (a mess!)
  at the high end: scientific applications
Datasets increasing in diversity and volume
  Digital libraries, interactive video, Human Genome project, EOS project
  ... need for DBMS exploding
DBMS encompasses most of CS
  OS, languages, theory, AI, multimedia, logic


Levels of Abstraction

Many views, single conceptual (logical) schema and physical schema.

[Figure: View 1, View 2 and View 3 sit on top of the Conceptual Schema, which sits on top of the Physical Schema]

Views describe how users see the data.
Conceptual schema defines logical structure.
  Sometimes the conceptual level and the logical level are treated separately.
Physical schema describes the files and indexes used.

* Schemas are defined using DDL (Data Definition Language)
* Data is modified/queried using DML (Data Manipulation Language)

Levels of Abstraction

Physical level: describes how a record (e.g., customer) is stored.
Logical level: describes data stored in the database, and the relationships among the data.

    type customer = record
        name : string;
        street : string;
        city : string;
    end;

View level: application programs hide details of data types. Views can also hide information (e.g., salary) for security purposes.
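
A small sketch of the view level, assuming a hypothetical employee table and a view named employee_public (neither appears in the slides; the salary figures are made up). The view exposes only name and city, so a program written against it never sees the salary column. Note that SQLite has no per-user permissions, so here the view only illustrates the hiding; in a multi-user DBMS the view would be combined with access rights.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE employee (name TEXT, street TEXT, city TEXT, salary INTEGER);
INSERT INTO employee VALUES ('Johnson', 'Alma', 'Palo Alto', 90000);
INSERT INTO employee VALUES ('Smith', 'North', 'Rye', 75000);
-- The view is the external schema: it leaves out street and salary.
CREATE VIEW employee_public AS SELECT name, city FROM employee;
""")

cursor = conn.execute("SELECT * FROM employee_public ORDER BY name")
visible_columns = [column[0] for column in cursor.description]
rows = cursor.fetchall()
```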


Instances and Schemas

Similar to types and variables in programming languages
Schema: the logical structure of the database
  E.g., the database consists of information about a set of customers and accounts and the relationship between them
  Analogous to type information of a variable in a program
  Physical schema: database design at the physical level
  Logical schema: database design at the logical level
Instance: the actual content of the database at a particular point in time
  Analogous to the value of a variable
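
The type/variable analogy can be made concrete with a short sketch. The Customer class and the two "snapshots" are illustrative assumptions: the class plays the role of the schema, and each list is one instance of the database at a point in time.

```python
from dataclasses import dataclass, fields

@dataclass(frozen=True)
class Customer:
    """Plays the role of the schema: it fixes the structure, not the content."""
    name: str
    street: str
    city: str

# The schema is the same no matter what data is stored.
schema = [field.name for field in fields(Customer)]

# Two instances: the content at two different points in time.
instance_earlier = [Customer("Johnson", "Alma", "Palo Alto")]
instance_later = instance_earlier + [Customer("Smith", "North", "Rye")]
```

The instance changes with every update, while the schema changes rarely, if at all.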


student
  name    student number   class   major
  smith   17               1       cosc
  brown   8                2       cosc

course
  course name                 course number   credit hours   department
  intro to computer science   cosc 1310       4              cosc
  data structures             cosc 3320       4              cosc
  dis...                      math 2410       3              math
  database                    cosc 3380       3              cosc

prerequisite
  course number   prerequisite number
  cosc 3380       cosc 3320
  cosc 3330       math 2410
  cosc 3320       cosc 1310

section
  section identifier   course number   semester   year   instructor
  85                   math 2410       fall       86     king
  92                   cosc 1310       fall       86     anderson
  102                  cosc 3320       spring     87     knuth
  112                  math 2410       fall       87     chang
  119                  cosc 1310       fall       87     anderson
  135                  cosc 3380       fall       87     stone

grade_report
  student number   section identifier   grade
  17               112                  B
  17               119                  C
  8                85                   A
  8                92                   A
  8                102                  B
  8                135                  A


Physical (Storage) schema decisions

Mapping of entities to files (OS files)
Data representation and encoding (compression)
Access methods (direct, hashing, indexed)
Which indexes to maintain
Clustering of records
OS/DBMS issues (buffer management)


External (View) schema decisions

Which entities to present/filter
Data representation and encoding (compression)
Programming-language-dependent issues
Changes to names, order of attributes
Derived (computed) fields and joined tables


Data Independence

Physical data independence: the ability to modify the physical schema without changing the application programs
  Applications depend on the logical schema
  DBA may change the physical level (tuning) without affecting applications
  The DBMS automatically makes the required adjustments, and application programs are not changed (queries may need to be recompiled and optimized)

Logical data independence: the ability to modify the logical schema without changing the application programs
  Applications depend on the logical schema via the views
  Can be supported on a limited basis only (if the view is not affected)
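
Physical data independence can be sketched as follows: the same query text is run before and after a purely physical change (adding an index), and the application sees the same answer. The table contents, the index name idx_account_balance, and the threshold 700 are assumptions for illustration.

```python
import sqlite3

# The application only knows this query, which is written against the logical schema.
QUERY = (
    "SELECT account_number FROM account "
    "WHERE balance >= 700 ORDER BY account_number"
)

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE account (account_number TEXT, balance INTEGER);
INSERT INTO account VALUES ('A-101', 500);
INSERT INTO account VALUES ('A-215', 700);
INSERT INTO account VALUES ('A-201', 900);
""")

result_before = conn.execute(QUERY).fetchall()

# Physical-level change (tuning): the DBA adds an index.
# The query text above is not touched; the DBMS decides whether to use the index.
conn.execute("CREATE INDEX idx_account_balance ON account (balance)")

result_after = conn.execute(QUERY).fetchall()
```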


Data Models

A collection of modeling tools for describing
  data
  data relationships
  data semantics
  data constraints
Entity-Relationship model
Relational model
Other models:
  object-oriented model
  semi-structured data models (XML)
  Older models: network model and hierarchical model


Entity-Relationship Model

Example of schema in the entity-relationship model


Entity Relationship Model (Cont.)

E-R model of real world
  Entities (objects)
    E.g., customers, accounts, bank branch
  Relationships between entities
    E.g., account A-101 is held by customer Johnson
    Relationship set depositor associates customers with accounts
Widely used for database design
  Database design in the E-R model is usually converted to a design in the relational model (coming up next), which is used for storage and processing


Relational Model

Example of tabular data in the relational model (the column headings are the attributes)

  customer-id   customer-name   customer-street   customer-city   account-number
  192-83-7465   Johnson         Alma              Palo Alto       A-101
  019-28-3746   Smith           North             Rye             A-215
  192-83-7465   Johnson         Alma              Palo Alto       A-201
  321-12-3123   Jones           Main              Harrison        A-217
  019-28-3746   Smith           North             Rye             A-201


A Sample Relational Database


Data Definition Language (DDL)

Specification notation for defining the database schema
  E.g.

    create table account (
        account-number  char(10),
        balance         integer)

DDL compiler generates a set of tables stored in a data dictionary
Data dictionary contains metadata (i.e., data about data)
  database schema
Data storage and definition language
  language in which the storage structure and access methods used by the database system are specified
  Usually an extension of the data definition language
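
As a sketch of "DDL produces entries in a data dictionary", the snippet below runs the create table statement in SQLite and then reads the metadata back. SQLite's catalog table sqlite_master and the PRAGMA table_info command stand in for the data dictionary here; other systems expose their catalogs differently. The column is renamed account_number because a hyphen is not valid in an unquoted SQL identifier.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# DDL: define the schema.
conn.execute("CREATE TABLE account (account_number CHAR(10), balance INTEGER)")

# Metadata ("data about data") is itself stored in tables that can be queried.
tables = [
    row[0]
    for row in conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'")
]

# PRAGMA table_info returns one row per column: (cid, name, type, notnull, default, pk).
columns = [(row[1], row[2]) for row in conn.execute("PRAGMA table_info(account)")]
```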


Data Manipulation Language (DML)

Language for accessing and manipulating the data organized by the appropriate data model
  A declarative DML is also known as a query language
Two classes of languages
  Procedural: user specifies what data is required and how to get those data (DML)
  Nonprocedural: user specifies what data is required without specifying how to get those data (query language)
SQL is the most widely used query language
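
The procedural/nonprocedural distinction can be sketched by answering one question two ways: "which customers live in Rye?". The record list and the question itself are assumptions chosen for illustration. The first version spells out how to find the data (scan, test, collect); the second only states what is wanted and leaves the how to the DBMS.

```python
import sqlite3

# Hypothetical customer records: (name, street, city).
records = [
    ("Johnson", "Alma", "Palo Alto"),
    ("Smith", "North", "Rye"),
    ("Jones", "Main", "Harrison"),
]

# Procedural: the program says HOW to get the data, step by step.
procedural_result = []
for name, street, city in records:
    if city == "Rye":
        procedural_result.append(name)

# Nonprocedural: the query says only WHAT is required.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (name TEXT, street TEXT, city TEXT)")
conn.executemany("INSERT INTO customer VALUES (?, ?, ?)", records)
declarative_result = [
    row[0] for row in conn.execute("SELECT name FROM customer WHERE city = 'Rye'")
]
```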


SQL

SQL: widely used non-procedural language
  E.g., find the name of the customer with customer-id 192-83-7465

    select customer.customer-name
    from customer
    where customer.customer-id = '192-83-7465'

  E.g., find the balances of all accounts held by the customer with customer-id 192-83-7465

    select account.balance
    from depositor, account
    where depositor.customer-id = '192-83-7465' and
          depositor.account-number = account.account-number

Application programs generally access databases through one of
  Language extensions to allow embedded SQL
  Application program interface (e.g., ODBC/JDBC) which allows SQL queries to be sent to a database
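
The two queries above can be tried through an application program interface. The sketch below uses Python's sqlite3 module in the role that ODBC/JDBC play on the slide. Identifiers use underscores instead of hyphens so that they are valid SQL, the depositor rows follow the relational-model example table, and the balances are assumed values for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customer (customer_id TEXT, customer_name TEXT);
CREATE TABLE account (account_number TEXT, balance INTEGER);
CREATE TABLE depositor (customer_id TEXT, account_number TEXT);
INSERT INTO customer VALUES ('192-83-7465', 'Johnson');
INSERT INTO customer VALUES ('019-28-3746', 'Smith');
INSERT INTO account VALUES ('A-101', 500);  -- balances are assumed values
INSERT INTO account VALUES ('A-201', 900);
INSERT INTO account VALUES ('A-215', 700);
INSERT INTO depositor VALUES ('192-83-7465', 'A-101');
INSERT INTO depositor VALUES ('192-83-7465', 'A-201');
INSERT INTO depositor VALUES ('019-28-3746', 'A-215');
""")

customer_id = "192-83-7465"

# Query 1: name of the customer with the given id (id passed as a parameter).
name = conn.execute(
    "SELECT customer.customer_name FROM customer WHERE customer.customer_id = ?",
    (customer_id,),
).fetchone()[0]

# Query 2: balances of all accounts held by that customer (join via depositor).
balances = [
    row[0]
    for row in conn.execute(
        "SELECT account.balance FROM depositor, account "
        "WHERE depositor.customer_id = ? "
        "AND depositor.account_number = account.account_number "
        "ORDER BY account.balance",
        (customer_id,),
    )
]
```

Passing the id as a parameter rather than pasting it into the query string is the usual practice with such interfaces.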


Database Users

Users are differentiated by the way they expect to interact with the system
  Application programmers: interact with the system through DML calls
  Sophisticated users: form requests in a database query language
  Specialized users: write specialized database applications that do not fit into the traditional data processing framework
  Naive users: invoke one of the permanent application programs that have been written previously
    E.g., people accessing a database over the web, bank tellers, clerical staff


Database Administrator

Coordinates all the activities of the database system; the database administrator has a good understanding of the enterprise's information resources and needs.
Database administrator's duties include:
  Schema definition
  Storage structure and access method definition
  Schema and physical organization modification
  Granting user authority to access the database
  Specifying integrity constraints
  Acting as liaison with users
  Monitoring performance and responding to changes in requirements


Structure of a DBMS

A typical DBMS has a layered architecture.
The figure does not show the concurrency control and recovery components.
This is one of several possible architectures; each system has its own variations.

Layers, from top to bottom (these layers must consider concurrency control and recovery):
  Query Optimization and Execution
  Relational Operators
  Files and Access Methods
  Buffer Management
  Disk Space Management
  DB


Transfer money from account A to account B:

Begin Transaction
  SUBTRACT 100 FROM A
  CRASH!
  ADD 100 TO B
End Transaction

Abort, Commit, Rollback
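
A minimal sketch of the transfer as an atomic transaction, using Python's sqlite3 module. The exception SimulatedCrash, the function transfer, and the starting balances are assumptions for illustration; a real crash would be handled by the DBMS recovery component rather than by an exception, but the visible effect is the same: the partial update is undone.

```python
import sqlite3

class SimulatedCrash(Exception):
    """Stands in for a failure between the two updates."""

def transfer(conn, source, target, amount, crash_midway=False):
    # 'with conn' commits if the block finishes and rolls back if it raises.
    with conn:
        conn.execute(
            "UPDATE account SET balance = balance - ? WHERE name = ?",
            (amount, source),
        )
        if crash_midway:
            raise SimulatedCrash("failure after SUBTRACT, before ADD")
        conn.execute(
            "UPDATE account SET balance = balance + ? WHERE name = ?",
            (amount, target),
        )

def balances(conn):
    return dict(conn.execute("SELECT name, balance FROM account"))

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE account (name TEXT PRIMARY KEY, balance INTEGER);
INSERT INTO account VALUES ('A', 500);
INSERT INTO account VALUES ('B', 200);
""")

try:
    transfer(conn, "A", "B", 100, crash_midway=True)
except SimulatedCrash:
    pass
after_crash = balances(conn)      # the subtraction was rolled back

transfer(conn, "A", "B", 100)
after_commit = balances(conn)     # both updates applied together
```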


Two users book a seat at the same time:

  READ #SEATS                 READ #SEATS
  #SEATS = #SEATS - 1         #SEATS = #SEATS - 1
  WRITE #SEATS                WRITE #SEATS
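
The seat-booking example can be replayed deterministically to show what goes wrong. The sketch below is a toy simulator, not a DBMS: the function run, the transaction names T1 and T2, and the starting count of 10 seats are assumptions. Each transaction reads the counter into a private copy, decrements it, and writes it back.

```python
def run(schedule, seats):
    """Execute (transaction, action) steps against one shared seat counter."""
    database = {"seats": seats}
    private = {}  # each transaction's local copy of #SEATS
    for transaction, action in schedule:
        if action == "read":
            private[transaction] = database["seats"]
        elif action == "decrement":
            private[transaction] -= 1
        elif action == "write":
            database["seats"] = private[transaction]
    return database["seats"]

STEPS = ["read", "decrement", "write"]

# Serial schedule: T1 finishes before T2 starts.
serial = [("T1", step) for step in STEPS] + [("T2", step) for step in STEPS]

# Interleaved schedule: both read before either writes.
interleaved = [
    ("T1", "read"), ("T2", "read"),
    ("T1", "decrement"), ("T2", "decrement"),
    ("T1", "write"), ("T2", "write"),
]

serial_result = run(serial, 10)            # two seats sold
interleaved_result = run(interleaved, 10)  # one update is lost
```

In the interleaved run two seats were sold but the counter dropped by only one, which is the kind of inconsistency concurrency control is meant to prevent.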


Overall System Structure


Storage Management

Storage manager is a program module that provides the interface between the low-level data stored in the database and the application programs and queries submitted to the system.
The storage manager is responsible for the following tasks:
  interaction with the file manager
  efficient storing, retrieving and updating of data


Concurrency Control

Concurrent execution of user programs is essential for good DBMS performance.
  Because disk accesses are frequent, and relatively slow, it is important to keep the CPU humming by working on several user programs concurrently.

Interleaving actions of different user programs can lead to inconsistency: e.g., a check is cleared while the account balance is being computed.
DBMS ensures such problems don't arise: users can pretend they are using a single-user system.


Transaction Management

A transaction is a collection of operations that performs a single logical function in a database application

Transaction-management component ensures that the database remains in a consistent (correct) state despite system failures (e.g., power failures and operating system crashes) and transaction failures.

Concurrency-control manager controls the interaction among the concurrent transactions, to ensure the consistency of the database.


Transaction: An Execution of a DB Program

Key concept is the transaction, which is an atomic sequence of database actions (reads/writes).
Each transaction, executed completely, must leave the DB in a consistent state if the DB is consistent when the transaction begins.

  Users can specify some simple integrity constraints on the data, and the DBMS will enforce these constraints.
  Beyond this, the DBMS does not really understand the semantics of the data (e.g., it does not understand how the interest on a bank account is computed).
  Thus, ensuring that a transaction (run alone) preserves consistency is ultimately the user's responsibility!


Scheduling Concurrent Transactions

DBMS ensures that execution of {T1, ..., Tn} is equivalent to some serial execution of T1, ..., Tn (in some order).

  Before reading/writing an object, a transaction requests a lock on the object, and waits until the DBMS gives it the lock. All locks are released at the end of the transaction. (Strict 2PL locking protocol.)
  Idea: If an action of Ti (say, writing X) affects Tj (which perhaps reads X), one of them, say Ti, will obtain the lock on X first and Tj is forced to wait until Ti completes; this effectively orders the transactions.
  What if Tj already has a lock on Y and Ti later requests a lock on Y? (Deadlock!) Ti or Tj is aborted and restarted!
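
The locking idea and the deadlock case can be sketched with a toy lock manager. This is a simplified illustration, not how any particular DBMS implements Strict 2PL: the class LockManager and its methods are hypothetical, only exclusive locks are modeled, and a waiting transaction is simply told to wait instead of being queued. Deadlock is detected by following the chain of "waits for" edges.

```python
class LockManager:
    def __init__(self):
        self.owner = {}      # object -> transaction holding its lock
        self.waits_for = {}  # waiting transaction -> transaction it waits on

    def request(self, transaction, obj):
        """Return 'granted', 'wait', or 'deadlock' for a lock request."""
        holder = self.owner.get(obj)
        if holder is None or holder == transaction:
            self.owner[obj] = transaction
            return "granted"
        # Would waiting close a cycle? Follow the holder's wait-for chain.
        current = holder
        while current in self.waits_for:
            current = self.waits_for[current]
            if current == transaction:
                return "deadlock"
        self.waits_for[transaction] = holder
        return "wait"

    def release_all(self, transaction):
        """Strict 2PL: all locks are released together when the transaction ends."""
        for obj in [o for o, t in self.owner.items() if t == transaction]:
            del self.owner[obj]
        self.waits_for.pop(transaction, None)
        for waiter in [w for w, t in self.waits_for.items() if t == transaction]:
            del self.waits_for[waiter]

locks = LockManager()
events = [
    locks.request("Ti", "X"),  # Ti locks X
    locks.request("Tj", "Y"),  # Tj locks Y
    locks.request("Tj", "X"),  # Tj must wait for Ti
    locks.request("Ti", "Y"),  # Ti would wait for Tj, which waits for Ti
]
locks.release_all("Ti")                  # Ti is aborted and gives up its locks
retry = locks.request("Tj", "X")         # Tj can now proceed
```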


The importance of the Data Dictionary

Contains all definitions:
  DDL (logical schema)
  View definitions
  Physical schema definitions, including indexing and clustering information
  Integrity constraints, security rules, stored procedures (SQL)
Essential for query parsing and optimization
Contains other important documentation and programs (regulations, standards, codes, etc.)
There are companies that sell Data Dictionary tools as a separate product!


Logical Design and Data-Dictionary Tools
Loading
Physical Design and File Reorganization
Backup / Restore / Recovery
Performance Monitoring and Tuning


Application Architectures

Two-tier architecture: e.g., client programs using ODBC/JDBC to communicate with a database
Three-tier architecture: e.g., web-based applications, and applications built using middleware


Hierarchical (pre-historic): IMS
Network (historic): IDMS, ADABAS; led to object-oriented
Relational (current, 95% of the market): Oracle, Informix, SQL Server, Progress, IBM DB2, etc.
Object-oriented (current, a lot of hype but a very narrow market, mainly CAD and engineering): Objectivity, Versant, Jasmine
Object-relational (current / future): SQL3, Informix UDO, Oracle-9, IBM DB2


PRE-1960s
  1945: Magnetic tapes developed (the first medium to allow searching).
  1957: First commercial computer installed.
  1959: McGee proposed the notion of generalized access to electronically stored data.
THE 60s
  1961: The first generalized DBMS, GE's Integrated Data Store (IDS), designed by Bachman.
THE 70s: database technology experienced rapid growth.
  1970: The relational model is developed by Ted Codd, an IBM research fellow.
  1971: CODASYL Database Task Group Report.
  1975: ACM Special Interest Group on Management of Data organized the first SIGMOD international conference.
  1976: Entity-relationship (ER) model introduced by Chen.
THE 80s: DBMSs developed for personal computers (DBASE, PARADOX, etc.).
  1983: ANSI/SPARC survey revealed that more than 100 relational systems had been implemented by the beginning of the 80s.

  1985: Preliminary SQL standard published.
  Business world influenced by Fourth Generation Languages.
  Trends in the 80s: extendable database systems, object-oriented DBMSs, client-server architecture for distributed databases.
THE 90s
  Demand for extending DBMS capabilities to meet new applications.
  Emergence of commercial object-oriented DBMSs.
  Demand for developing applications utilizing data from a variety of sources.
  Demand for exploiting massively parallel processors (MPPs).
  Total victory by the relational model.
  SQL 3: object-relational systems.
THE 00s
  The emergence of XML and the integration of XML and relational databases.

Databases make these folks happy ...

End users and DBMS vendors
DB application programmers
  E.g., smart webmasters
Database administrator (DBA)
  Designs logical/physical schemas
  Handles security and authorization
  Data availability, crash recovery
  Database tuning as needs evolve

Must understand how a DBMS works!


Summary

DBMS used to maintain and query large datasets.
Benefits include recovery from system crashes, concurrent access, quick application development, data integrity and security.
Levels of abstraction give data independence.
A DBMS typically has a layered architecture.
DBAs hold responsible jobs and are well-paid!
DBMS R&D is one of the broadest, most exciting areas in CS.
Advanced databases course at the graduate level
