Purpose of Database Systems
View of Data
Data Models
Data Definition Language
Data Manipulation Language
Transaction Management
Storage Management
Database Administrator
Database Users
Overall System Structure
A database represents some aspect of the real world, sometimes called the mini-world or the Universe of Discourse (UoD).
What Is a DBMS?
A Database Management System (DBMS) is a software package designed to store and manage databases.
Manufacturing: production, inventory, orders, supply chain
Human resources: employee records, salaries, tax deductions
In the early days, database applications were built on top of file systems.
Drawbacks of using file systems to store data:
Data redundancy and inconsistency
    Multiple file formats, duplication of information in different files
Integrity problems
    Integrity constraints (e.g. account balance > 0) become part of program code
    Hard to add new constraints or change existing ones
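In contrast to a constraint buried in program code, the sketch below declares the balance rule once in the schema, so every program that touches the table is subject to it. It uses Python's built-in sqlite3 module with an in-memory database. The account table, its column names, and the sample values are illustrative assumptions, not taken from the slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# The rule "balance > 0" lives in the schema, not in each application program.
# Table and column names are hypothetical.
conn.execute(
    "CREATE TABLE account ("
    " account_number TEXT PRIMARY KEY,"
    " balance INTEGER NOT NULL CHECK (balance > 0))"
)


def try_insert(number, balance):
    """Return True if the row is accepted, False if the DBMS rejects it."""
    try:
        conn.execute("INSERT INTO account VALUES (?, ?)", (number, balance))
        return True
    except sqlite3.IntegrityError:
        return False


accepted = try_insert("A-101", 500)   # satisfies the constraint
rejected = try_insert("A-102", -40)   # violates balance > 0
row_count = conn.execute("SELECT COUNT(*) FROM account").fetchone()[0]
```

Changing the rule later means altering one table definition rather than hunting through every program that writes balances.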
Security problems
Separation of the data definition and the program
Abstraction into a simple model
Data independence and efficient access
Levels of Abstraction
[Figure: three levels of abstraction, showing View 1, View 2, View 3, the Conceptual Schema, and the Physical Schema]
Views describe how users see the data.
Conceptual schema defines the logical structure. Sometimes the conceptual level and the logical level are treated separately.
Physical schema describes the files and indexes used.
* Schemas are defined using DDL (Data Definition Language)
* Data is modified/queried using DML (Data Manipulation Language)
Database System Concepts 1.9 Silberschatz, Korth and Sudarshan
Levels of Abstraction
Physical level: describes how a record (e.g., customer) is stored.
Logical level: describes data stored in database, and the relationships among the data.
    type customer = record
        name : string;
        street : string;
        city : integer;
    end;
View level: application programs hide details of data types. Views can also hide information (e.g., salary) for security purposes.
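The point about views hiding information can be sketched with a SQL view that leaves out the salary column. This is a minimal illustration using Python's sqlite3 module. The employee table, the employee_public view, and the sample row are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Logical level: the full record, including the sensitive salary column.
# All names and values here are hypothetical.
conn.execute("CREATE TABLE employee (name TEXT, department TEXT, salary INTEGER)")
conn.execute("INSERT INTO employee VALUES ('Ann', 'Sales', 52000)")

# View level: users of this view never see the salary column.
conn.execute(
    "CREATE VIEW employee_public AS SELECT name, department FROM employee"
)

cursor = conn.execute("SELECT * FROM employee_public")
visible_columns = [column[0] for column in cursor.description]
rows = cursor.fetchall()
```

SQLite has no per-user permissions, so this only shows the hiding effect of the view itself. A multi-user DBMS would additionally restrict who may read the underlying table.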
Similar to types and variables in programming languages
Schema: the logical structure of the database
    e.g., the database consists of information about a set of customers and accounts and the relationship between them
    Analogous to type information of a variable in a program
    Physical schema: database design at the physical level
    Logical schema: database design at the logical level
student
    student number: 17, 8
    class: 1, 2

course
    courseName: data structures, dis, database
    coursenumber    credit hours    department
    cosc 1310       4               cosc
    cosc 3320       4               cosc
    math 2410       3               math
    cosc 3380       3               cosc

prerequisite
    coursenumber: math 2410, cosc 1310, cosc 3320, math 2410, cosc 1310, cosc 3380
    grade: B, C, A, A, B, A
    year: 86, 86, 87, 87, 87, 87
Mapping of entities to files (OS files)
Data representation and encoding (compression)
Access methods (Direct, Hashing, Indexed)
Which indexes to maintain
Clustering of records
OS/DBMS issues (buffer management)
Which entities to present/filter
Data representation and encoding (compression)
Programming language dependent issues
Changes to names, order of attributes
Derived (computed) fields and joined tables
Data Independence
Physical Data Independence: the ability to modify the physical schema without changing the application programs
    Applications depend on the logical schema
    DBA may change the physical level (tuning) without affecting applications
    The DBMS automatically makes the required adjustments, and application programs are not changed (queries may need to be recompiled and optimized)
Logical Data Independence: the ability to modify the logical schema without changing the application programs
    Applications depend on the logical schema via the views
    Can be supported on a limited basis only (if the view is not affected)
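Physical data independence can be illustrated by adding an index, which is a change to the physical schema, and checking that the application's query text and result are unchanged. The sketch uses Python's sqlite3 module. The customer table, the index name, and the sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (name TEXT, city TEXT)")
conn.executemany(
    "INSERT INTO customer VALUES (?, ?)",
    [("Jones", "Harrison"), ("Smith", "Rye"), ("Hayes", "Harrison")],
)

# The application only knows the logical schema: table and column names.
QUERY = "SELECT name FROM customer WHERE city = ? ORDER BY name"


def run_query():
    return conn.execute(QUERY, ("Harrison",)).fetchall()


before = run_query()

# Tuning step at the physical level: the query text above is not touched.
conn.execute("CREATE INDEX idx_customer_city ON customer (city)")

after = run_query()
```

Only the access path may change. The DBMS decides on its own whether to use the new index when it plans the query.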
Data Models
Entity-Relationship Model
Relational Model
[Figure: sample relation with its columns labeled "Attributes"; surviving values are Palo Alto, Harrison, Rye, 321-12-3123, 019-28-3746]
char(10), integer)
DDL compiler generates a set of tables stored in a data dictionary
Data dictionary contains metadata (i.e., data about data)
    database schema
Data storage and definition language
    language in which the storage structure and access methods used by the database system are specified
    Usually an extension of the data definition language
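As a rough illustration of DDL feeding a data dictionary, the sketch below issues one CREATE TABLE statement and then reads back the metadata that SQLite stores about it. It uses Python's sqlite3 module. SQLite's catalog table sqlite_master and the table_info pragma stand in for the data dictionary here, and the account table definition is an assumption made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# DDL: defines the schema. Table and column names are hypothetical.
conn.execute("CREATE TABLE account (account_number CHAR(10), balance INTEGER)")

# The catalog holds data about data: one row per schema object.
catalog = conn.execute("SELECT type, name, sql FROM sqlite_master").fetchall()

# Column-level metadata: each row is (cid, name, type, notnull, default, pk).
columns = [
    (row[1], row[2]) for row in conn.execute("PRAGMA table_info(account)")
]
```

Other systems expose the same kind of metadata through their own catalog tables or views. The names used here are specific to SQLite.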
Language for accessing and manipulating the data organized by the appropriate data model
A declarative DML is also known as a query language
Nonprocedural: the user specifies what data is required without specifying how to get those data (query language)
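The difference between saying what is wanted and spelling out how to get it can be sketched as follows. The first query states the condition and leaves the access strategy to the DBMS. The second walks through every row in application code. The example uses Python's sqlite3 module. The account table and its rows are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (account_number TEXT, balance INTEGER)")
conn.executemany(
    "INSERT INTO account VALUES (?, ?)",
    [("A-101", 500), ("A-215", 700), ("A-102", 400)],
)

# Nonprocedural: state WHAT is required; the DBMS decides how to find it.
declarative = conn.execute(
    "SELECT account_number FROM account WHERE balance > 450 "
    "ORDER BY account_number"
).fetchall()

# Procedural style: the program spells out HOW, row by row.
procedural = []
for number, balance in conn.execute("SELECT account_number, balance FROM account"):
    if balance > 450:
        procedural.append((number,))
procedural.sort()
```

Both produce the same answer, but only the declarative form lets the DBMS choose a better plan, for example by using an index.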
SQL
Database Users
Users are differentiated by the way they expect to interact with the system
Specialized users write specialized database applications that do not fit into the traditional data processing framework
Naive users invoke one of the permanent application programs that have been written previously
E.g. people accessing database over the web, bank tellers, clerical staff
Database Administrator
Coordinates all the activities of the database system; the database administrator has a good understanding of the enterprise's information resources and needs.
Database administrator's duties include:
    Schema definition
    Storage structure and access method definition
    Schema and physical organization modification
    Granting user authority to access the database
    Specifying integrity constraints
    Acting as liaison with users
    Monitoring performance and responding to changes in requirements
Structure of a DBMS
A typical DBMS has a layered architecture.
The figure does not show the concurrency control and recovery components.
This is one of several possible architectures; each system has its own variations.
These layers must consider concurrency control and recovery.
Layers, from top to bottom:
    Query Optimization and Execution
    Relational Operators
    Files and Access Methods
    Buffer Management
    DB
Begin Transaction
    SUBTRACT 100 FROM A    <- CRASH!
    ADD 100 TO B
End Transaction
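The transfer above can be sketched with Python's sqlite3 module, where a connection used as a context manager commits on success and rolls back if an exception escapes. The raised exception is only a stand-in for a crash, and the account table and starting balances are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (name TEXT PRIMARY KEY, balance INTEGER)")
conn.executemany("INSERT INTO account VALUES (?, ?)", [("A", 500), ("B", 200)])
conn.commit()


def transfer(amount, crash=False):
    """Move amount from A to B as one transaction."""
    try:
        with conn:  # commits on success, rolls back on exception
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE name = 'A'",
                (amount,),
            )
            if crash:
                # Hypothetical failure between the two updates.
                raise RuntimeError("simulated crash")
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE name = 'B'",
                (amount,),
            )
    except RuntimeError:
        pass  # the partial update has already been rolled back


def balances():
    return dict(conn.execute("SELECT name, balance FROM account"))


transfer(100, crash=True)
after_crash = balances()   # {'A': 500, 'B': 200}: the subtraction was undone

transfer(100)
after_success = balances() # {'A': 400, 'B': 300}
```

Either both updates take effect or neither does, so the money is never subtracted from A without being added to B.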
WRITE # SEATS
Storage Management
Storage manager is a program module that provides the interface between the low-level data stored in the database and the application programs and queries submitted to the system.
The storage manager is responsible for the following tasks:
    interaction with the file manager
    efficient storing, retrieving and updating of data
Concurrency Control
Concurrent execution of user programs is essential for good DBMS performance.
Because disk accesses are frequent, and relatively slow, it is important to keep the CPU humming by working on several user programs concurrently.
Interleaving actions of different user programs can lead to inconsistency: e.g., check is cleared while account balance is being computed.
DBMS ensures such problems don't arise: users can pretend they are using a single-user system.
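How interleaving can produce an inconsistent result is easiest to see in a small deterministic simulation. The sketch below is not the check-clearing example from the slide. It uses a simpler lost-update scenario in which two hypothetical transactions, T1 and T2, each read a balance and then write back the balance plus a deposit. The schedule format and the amounts are assumptions made for the example.

```python
def run(schedule, balance=100):
    """Execute (transaction, action) steps against a single shared balance.

    Each transaction reads the balance into a private copy, then writes
    back that copy plus its own deposit.
    """
    deposits = {"T1": 50, "T2": 30}  # hypothetical amounts
    private_copy = {}
    for txn, action in schedule:
        if action == "read":
            private_copy[txn] = balance
        else:  # "write"
            balance = private_copy[txn] + deposits[txn]
    return balance


# One transaction after the other: both deposits are applied.
serial = run(
    [("T1", "read"), ("T1", "write"), ("T2", "read"), ("T2", "write")]
)  # 180

# Interleaved: T2 reads before T1 writes, so T1's deposit is overwritten.
interleaved = run(
    [("T1", "read"), ("T2", "read"), ("T1", "write"), ("T2", "write")]
)  # 130
```

The interleaved schedule matches neither serial order, which is exactly the kind of outcome concurrency control is meant to rule out.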
Transaction Management
A transaction is a collection of operations that performs a single logical function in a database application
Transaction-management component ensures that the database remains in a consistent (correct) state despite system failures (e.g., power failures and operating system crashes) and transaction failures.
Concurrency-control manager controls the interaction among the concurrent transactions, to ensure the consistency of the database.
Key concept is transaction, which is an atomic sequence of database actions (reads/writes). Each transaction, executed completely, must leave the DB in a consistent state if DB is consistent when the transaction begins.
Users can specify some simple integrity constraints on the data, and the DBMS will enforce these constraints.
Beyond this, the DBMS does not really understand the semantics of the data (e.g., it does not understand how the interest on a bank account is computed).
Thus, ensuring that a transaction (run alone) preserves consistency is ultimately the user's responsibility!
DBMS ensures that execution of {T1, ... , Tn} is equivalent to some serial execution of the same transactions, in some order.
Before reading/writing an object, a transaction requests a lock on the object, and waits till the DBMS gives it the lock. All locks are released at the end of the transaction. (Strict 2PL locking protocol.)
Idea: If an action of Ti (say, writing X) affects Tj (which perhaps reads X), one of them, say Ti, will obtain the lock on X first and Tj is forced to wait until Ti completes; this effectively orders the transactions.
What if Tj already has a lock on Y and Ti later requests a lock on Y? (Deadlock!) Ti or Tj is aborted and restarted!
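The locking idea and the deadlock case can be traced with a toy lock table. This is a single-threaded simulation with exclusive locks only, meant to show the bookkeeping rather than how any real DBMS implements Strict 2PL. The LockManager class and its method names are hypothetical.

```python
class LockManager:
    """Toy lock table: exclusive locks only, no real blocking."""

    def __init__(self):
        self.owner = {}    # object -> transaction holding its lock
        self.waiting = {}  # transaction -> object it is waiting for

    def request(self, txn, obj):
        """Return 'granted', 'wait', or 'deadlock' for this lock request."""
        holder = self.owner.get(obj)
        if holder is None or holder == txn:
            self.owner[obj] = txn
            self.waiting.pop(txn, None)
            return "granted"
        self.waiting[txn] = obj
        return "deadlock" if self._in_cycle(txn) else "wait"

    def _in_cycle(self, start):
        # Follow waits-for edges: a transaction waits for the current
        # holder of the object it requested.
        current, seen = start, set()
        while current in self.waiting and current not in seen:
            seen.add(current)
            current = self.owner.get(self.waiting[current])
            if current == start:
                return True
        return False

    def release_all(self, txn):
        """Strict 2PL: locks are released only when the transaction ends."""
        self.waiting.pop(txn, None)
        for obj in [o for o, holder in self.owner.items() if holder == txn]:
            del self.owner[obj]


locks = LockManager()
steps = [
    locks.request("Ti", "X"),  # Ti locks X
    locks.request("Tj", "Y"),  # Tj locks Y
    locks.request("Tj", "X"),  # Tj must wait for Ti
    locks.request("Ti", "Y"),  # Ti waits for Tj, which waits for Ti
]
locks.release_all("Tj")        # abort Tj to break the deadlock
retry = locks.request("Ti", "Y")
```

Aborting Tj frees Y, so Ti can proceed and finish. Tj would then be restarted from the beginning.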
Contains all definitions:
    DDL (logical schema)
    View definitions
    Physical schema definitions, including indexing and clustering information
    Integrity constraints, security rules, stored procedures (SQL)
Essential for query parsing and optimization
Contains other important documentation and programs (regulations, standards, codes, etc.)
There are companies that sell data dictionary tools as a separate product!
Logical Design and Data-Dictionary Tools
Loading
Physical Design and File Reorganization
Backup / Restore / Recovery
Performance Monitoring and Tuning
Application Architectures
Two-tier architecture: e.g., client programs using ODBC/JDBC to communicate with a database
Three-tier architecture: e.g., web-based applications, and applications built using middleware
PRE-1960s
    1945: Magnetic tapes developed (the first medium to allow searching).
    1957: First commercial computer installed.
    1959: McGee proposed the notion of generalized access to electronically stored data.
THE 60s
    1961: The first generalized DBMS, GE's Integrated Data Store (IDS), designed by Bachman.
THE 70s
    Database technology experienced rapid growth.
    1970: The relational model is developed by Ted Codd, an IBM research fellow.
    1971: CODASYL Database Task Group Report.
    1975: ACM Special Interest Group on Management of Data organized the first SIGMOD international conference.
    1976: Entity-relationship (ER) model introduced by Chen.
THE 80s
    DBMSs developed for personal computers (DBASE, PARADOX, etc.).
    1983: ANSI/SPARC survey revealed that more than 100 relational systems had been implemented by the beginning of the 80s.
    1985: Preliminary SQL standard published.
    Business world influenced by Fourth Generation Languages.
    Trends in the 80s: extendable database systems, object-oriented DBMSs, client-server architecture for distributed databases.
THE 90s
    The emergence of XML and the integration of XML and relational databases
Summary