
1. Database

A database is a logically coherent collection of data with some inherent meaning. It represents some aspect of the real world and is designed, built and populated with data for a specific purpose.

2. DBMS

A DBMS is a collection of programs that enables users to create and maintain a database. In other words, it is general-purpose software that provides users with the processes of defining, constructing and manipulating the database for various applications.

3. Database system

The database and the DBMS software together are called a database system.

4. Advantages of DBMS

Redundancy is controlled. Unauthorized access is restricted. Multiple user interfaces are provided. Integrity constraints are enforced. Backup and recovery are provided.

5. Disadvantages of a File Processing System

Data redundancy and inconsistency. Difficulty in accessing data. Data isolation. Data integrity problems. Concurrent access is not possible. Security problems.

6. The three levels of data abstraction

Physical level: The lowest level of abstraction; it describes how the data are stored.
Logical level: The next higher level of abstraction; it describes what data are stored in the database and what relationships exist among those data.
View level: The highest level of abstraction; it describes only part of the entire database.

11. Data Independence

Data independence means that the application is independent of the storage structure and access strategy of the data. In other words, modifying the schema definition at one level should not affect the schema definition at the next higher level. There are two types of data independence:
Physical data independence: Modification at the physical level should not affect the logical level.
Logical data independence: Modification at the logical level should not affect the view level.
NOTE: Logical data independence is more difficult to achieve.

12. View & How is it related to data independence?

A view may be thought of as a virtual table, that is, a table that does not really exist in its own right but is instead derived from one or more underlying base tables. In other words, there is no stored file that directly represents the view; instead, a definition of the view is stored in the data dictionary. Growth and restructuring of the base tables need not be reflected in views. Thus a view can insulate users from the effects of restructuring and growth in the database, and hence it accounts for logical data independence.
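As a rough illustration of how a view insulates an application from growth of a base table, here is a minimal sketch using Python's built-in sqlite3 module. The table employee, the view employee_names and the sample rows are hypothetical names chosen for the example; they do not come from the notes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employee (emp_id INTEGER PRIMARY KEY, name TEXT, salary REAL);
    INSERT INTO employee VALUES (1, 'Asha', 52000), (2, 'Ravi', 61000);
    -- Only the view definition is stored; no rows are copied.
    CREATE VIEW employee_names AS SELECT emp_id, name FROM employee;
""")

def names_via_view(connection):
    """The 'application' query: it only knows about the view."""
    return connection.execute(
        "SELECT name FROM employee_names ORDER BY emp_id").fetchall()

def view_columns(connection):
    """Column names the view exposes to the application."""
    cursor = connection.execute("SELECT * FROM employee_names")
    return [d[0] for d in cursor.description]

before = names_via_view(conn)

# Growth of the base table: a new column and a new row.
conn.execute("ALTER TABLE employee ADD COLUMN dept TEXT")
conn.execute("INSERT INTO employee VALUES (3, 'Meena', 58000, 'Sales')")

after = names_via_view(conn)
print(before, after, view_columns(conn))
```

The application query is unchanged before and after the ALTER TABLE, and the view still exposes only emp_id and name. New rows do show up through the view, since the view is computed from the base table each time it is queried.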

13. Data Model

A collection of conceptual tools for describing data, data relationships, data semantics and constraints.

14. E-R model

This data model is based on a perception of the real world that consists of basic objects called entities and of relationships among these objects. Entities are described in a database by a set of attributes.

15. Object Oriented model

This model is based on a collection of objects. An object contains values stored in instance variables within the object. An object also contains bodies of code that operate on the object. These bodies of code are called methods. Objects that contain the same types of values and the same methods are grouped together into classes.

16. Entity

It is a 'thing' in the real world with an independent existence.

17. Entity type

It is a collection (set) of entities that have the same attributes.

18. Entity set

It is the collection of all entities of a particular entity type in the database.

19. Extension of entity type

The collection of entities of a particular entity type, grouped together into an entity set.

20. Weak Entity set

An entity set that does not have sufficient attributes to form a primary key is called a weak entity set. Its primary key comprises its partial key together with the primary key of its parent entity set.

21. Attribute

It is a particular property that describes the entity.

22. Relation Schema & Relation

A relation schema, denoted by R(A1, A2, ..., An), is made up of the relation name R and the list of attributes Ai that it contains. A relation is defined as a set of tuples. Let r be a relation that contains the set of tuples (t1, t2, t3, ..., tn). Each tuple is an ordered list of n values t = (v1, v2, ..., vn).

23. Degree of a Relation

It is the number of attributes of its relation schema.

24. Relationship

It is an association among two or more entities.

25. Relationship set

The collection (or set) of similar relationships.

26. Relationship type

A relationship type defines a set of associations, or a relationship set, among a given set of entity types.

27. Degree of Relationship type

It is the number of entity types participating.

25. DDL (Data Definition Language)

A database schema is specified by a set of definitions expressed in a special language called DDL.

26. VDL (View Definition Language)

It specifies user views and their mappings to the conceptual schema.

29. DML (Data Manipulation Language)

This language enables users to access or manipulate data as organised by the appropriate data model.
Procedural (low-level) DML: requires the user to specify what data are needed and how to get those data.
Non-procedural (high-level) DML: requires the user to specify what data are needed without specifying how to get those data.

30. Relational Algebra

It is a procedural query language. It consists of a set of operations that take one or two relations as input and produce a new relation as output.

37. Relational Calculus

It is an applied predicate calculus specifically tailored for relational databases, proposed by E.F. Codd. Examples of languages based on it are DSL ALPHA and QUEL.

38. Difference between tuple-oriented and domain-oriented relational calculus

The tuple-oriented calculus uses tuple variables, i.e., variables whose only permitted values are tuples of a given relation. E.g. QUEL.
The domain-oriented calculus uses domain variables, i.e., variables that range over the underlying domains instead of over relations. E.g. ILL, DEDUCE.

39. Normalization

It is the process of analysing the given relation schemas based on their functional dependencies (FDs) and primary keys to achieve two properties: minimizing redundancy, and minimizing insertion, deletion and update anomalies.

40. Functional Dependency

A functional dependency, denoted X -> Y, between two sets of attributes X and Y that are subsets of R specifies a constraint on the possible tuples that can form a relation state r of R. The constraint is that for any two tuples t1 and t2 in r, if t1[X] = t2[X] then t1[Y] = t2[Y]. This means the value of the X component of a tuple uniquely determines the value of the Y component.

41. When is a set of functional dependencies F said to be minimal?

Every dependency in F has a single attribute on its right-hand side.
We cannot replace any dependency X -> A in F with a dependency Y -> A, where Y is a proper subset of X, and still have a set of dependencies that is equivalent to F.
We cannot remove any dependency from F and still have a set of dependencies that is equivalent to F.
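The functional dependency constraint above (tuples that agree on X must agree on Y) can be checked against a relation state with a few lines of Python. This is only a sketch: the function name fd_holds and the sample enrolled rows are made up for illustration. Note that a relation state can refute a dependency, but it cannot prove that the dependency holds for the schema in general.

```python
def fd_holds(rows, lhs, rhs):
    """Return True if the functional dependency lhs -> rhs holds in rows.

    rows: list of dicts, one per tuple of the relation state r
    lhs, rhs: lists of attribute names (the sets X and Y)
    """
    seen = {}
    for t in rows:
        x = tuple(t[a] for a in lhs)
        y = tuple(t[a] for a in rhs)
        # Two tuples that agree on X must also agree on Y.
        if seen.setdefault(x, y) != y:
            return False
    return True

# Hypothetical relation state used only for this example.
enrolled = [
    {"sid": "s1", "cid": "c1", "cname": "Databases", "grade": "A"},
    {"sid": "s2", "cid": "c1", "cname": "Databases", "grade": "B"},
    {"sid": "s1", "cid": "c2", "cname": "Networks", "grade": "B"},
]

print(fd_holds(enrolled, ["cid"], ["cname"]))         # True
print(fd_holds(enrolled, ["sid"], ["grade"]))         # False: s1 has grades A and B
print(fd_holds(enrolled, ["sid", "cid"], ["grade"]))  # True
```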

42. Multivalued dependency

A multivalued dependency, denoted X ->> Y, specified on relation schema R, where X and Y are both subsets of R, specifies the following constraint on any relation r of R: if two tuples t1 and t2 exist in r such that t1[X] = t2[X], then two tuples t3 and t4 should also exist in r with the following properties:
t3[X] = t4[X] = t1[X] = t2[X]
t3[Y] = t1[Y] and t4[Y] = t2[Y]
t3[Z] = t2[Z] and t4[Z] = t1[Z], where Z = (R - (X U Y))

43. Lossless join property

It guarantees that spurious tuples are not generated when the relation schemas obtained by decomposition are joined back together.

44. 1NF (First Normal Form)

The domain of an attribute must include only atomic (simple, indivisible) values.

45. Full functional dependency

A functional dependency X -> Y is a full functional dependency if removal of any attribute A from X means that the dependency no longer holds. 2NF is based on this concept.

46. 2NF

A relation schema R is in 2NF if it is in 1NF and every non-prime attribute A in R is fully functionally dependent on the primary key.

47. 3NF

A relation schema R is in 3NF if it is in 2NF and for every nontrivial FD X -> A, either of the following is true:
X is a superkey of R.
A is a prime attribute of R.
In other words, every non-prime attribute is non-transitively dependent on the primary key.

48. BCNF (Boyce-Codd Normal Form)

A relation schema R is in BCNF if it is in 3NF and satisfies the additional constraint that for every nontrivial FD X -> A, X must be a superkey of R.

49. 4NF

A relation schema R is said to be in 4NF if for every multivalued dependency X ->> Y that holds over R, one of the following is true:
Y is a subset of X, or XY = R (the dependency is trivial).
X is a superkey.

50. 5NF

A relation schema R is said to be in 5NF if for every join dependency {R1, R2, ..., Rn} that holds over R, one of the following is true:
Ri = R for some i.
The join dependency is implied by the set of FDs over R in which the left side is a key of R.
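The superkey tests that appear in the 3NF and BCNF definitions are usually carried out with the attribute closure algorithm. The sketch below computes a closure, tests for a superkey, and lists the FDs that violate BCNF in the usual "left side is a superkey" form of the test. The schema and FDs are hypothetical, and the function names are chosen for this example only.

```python
def closure(attrs, fds):
    """Attribute closure of attrs under fds, a list of (lhs, rhs) frozenset pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # If we already have all of lhs, we can derive rhs.
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

def is_superkey(attrs, schema, fds):
    """attrs is a superkey if its closure contains every attribute of the schema."""
    return closure(attrs, fds) >= set(schema)

def bcnf_violations(schema, fds):
    """Nontrivial FDs in fds whose left side is not a superkey."""
    return [(lhs, rhs) for lhs, rhs in fds
            if not rhs <= lhs and not is_superkey(lhs, schema, fds)]

# Hypothetical schema: grades per (student, course), plus a course name.
schema = {"sid", "cid", "cname", "grade"}
fds = [
    (frozenset({"sid", "cid"}), frozenset({"grade"})),
    (frozenset({"cid"}), frozenset({"cname"})),
]

print(sorted(closure({"cid"}, fds)))              # ['cid', 'cname']
print(is_superkey({"sid", "cid"}, schema, fds))   # True
print(bcnf_violations(schema, fds))               # only cid -> cname is reported
```

Here cid -> cname violates BCNF because cid alone does not determine the whole schema. It also violates 2NF, since cname depends on only part of the key {sid, cid}.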

51. Atomicity and Aggregation

Atomicity: Either all actions of a transaction are carried out or none are. Users should not have to worry about the effect of incomplete transactions. The DBMS ensures this by undoing the actions of incomplete transactions.
Aggregation: A concept used to model a relationship between a collection of entities and relationships. It is used when we need to express a relationship among relationships.
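The atomicity property described above can be seen in a small experiment with Python's built-in sqlite3 module. This is a sketch under assumed names: the account table, the transfer function and the balances are invented for the example. A transfer is two updates; if the second one fails, the first is undone.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (id INTEGER PRIMARY KEY, "
             "balance INTEGER CHECK (balance >= 0))")
conn.executemany("INSERT INTO account VALUES (?, ?)", [(1, 100), (2, 50)])
conn.commit()

def transfer(connection, src, dst, amount):
    """Move amount from src to dst as one transaction; return True on success."""
    try:
        with connection:  # commits on success, rolls back if an exception escapes
            connection.execute(
                "UPDATE account SET balance = balance + ? WHERE id = ?",
                (amount, dst))
            connection.execute(
                "UPDATE account SET balance = balance - ? WHERE id = ?",
                (amount, src))
        return True
    except sqlite3.IntegrityError:
        return False

def balances(connection):
    return connection.execute(
        "SELECT id, balance FROM account ORDER BY id").fetchall()

ok = transfer(conn, 1, 2, 30)       # succeeds
failed = transfer(conn, 1, 2, 500)  # the debit would violate the CHECK constraint
print(ok, failed, balances(conn))
```

In the failed transfer the credit to account 2 has already been executed when the debit fails, yet the final balances show no trace of it: the incomplete transaction was rolled back as a whole.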

Database: A database is a collection of stored operational data used by the applications and/or users of a particular enterprise, or by a set of authorized outside applications and users.
Database Management System: A Database Management System (DBMS) is a software system that manages the execution of users' applications that access and modify database data, so that data security, data integrity and data reliability are guaranteed for each application, and each application can be written with the assumption that it is the only application active in the database.
What is data? Different viewpoints:
A sequence of characters stored in computer memory or storage.
An interpreted sequence of characters stored in computer memory or storage.
An interpreted set of objects.
A database supports concurrent access to the data.

File Systems: A file is an uninterpreted, unstructured collection of information.
File operations: delete, catalog, create, rename, open, close, read, write, find.
Access methods: Algorithms that implement the operations, along with the internal file organization.
Examples: a file of customers, a file of students. Access method: the implementation of a set of operations on a file of students or customers.

File Management System Problems:
Data redundancy.
Data access: every new request needs a new program.
Data is not isolated from the access implementation.
Concurrent program execution on the same file.
Difficulties with security enforcement.
Integrity issues.

Database Applications:
Airline reservation systems: Data items are single passenger reservations; information about flights and airports; information about ticket prices and ticket restrictions.
Banking systems: Data items are accounts, customers, loans, mortgages, balances, etc. Failures are not tolerable. Concurrent access must be provided.
Corporate records: Data items are sales, accounts, bill of materials records, employees and their dependents.

ADVANTAGES OF A DBMS:
Data independence: Application programs should be as independent as possible from details of data representation and storage. The DBMS can provide an abstract view of the data to insulate application code from such details.
Efficient data access: A DBMS utilizes a variety of sophisticated techniques to store and retrieve data efficiently. This feature is especially important if the data is stored on external storage devices.
Data integrity and security: If data is always accessed through the DBMS, the DBMS can enforce integrity constraints on the data. For example, before inserting salary information for an employee, the DBMS can check that the department budget is not exceeded. Also, the DBMS can enforce access controls that govern what data is visible to different classes of users.
Data administration: When several users share the data, centralizing the administration of data can offer significant improvements. Experienced professionals who understand the nature of the data being managed, and how different groups of users use it, can be responsible for organizing the data representation to minimize redundancy and for fine-tuning the storage of the data to make retrieval efficient.
Concurrent access and crash recovery: A DBMS schedules concurrent accesses to the data in such a manner that users can think of the data as being accessed by only one user at a time. Further, the DBMS protects users from the effects of system failures.
Reduced application development time: Clearly, the DBMS supports many important functions that are common to many applications accessing data stored in the DBMS. This, in conjunction with the high-level interface to the data, facilitates quick development of applications. Such applications are also likely to be more robust than applications developed from scratch because many important tasks are handled by the DBMS instead of being implemented by the application.

Data Levels and their Roles:

Physical corresponds to the first view of data: how data is stored, how it is accessed, how it is modified, whether it is ordered, how it is allocated to computer memory and/or peripheral devices, and how data items are actually represented (ASCII, EBCDIC, ...). The physical schema specifies additional storage details. Essentially, the physical schema summarizes how the relations described in the conceptual schema are actually stored on secondary storage devices such as disks and tapes. We must decide what file organizations to use to store the relations, and create auxiliary data structures called indexes to speed up data retrieval operations.

Conceptual corresponds to the second view of data: what we want the data to express, what relationships between data we must express, what story the data tells, and whether all the data necessary for the story is present. The conceptual schema (sometimes called the logical schema) describes the stored data in terms of the data model of the DBMS. In a relational DBMS, the conceptual schema describes all relations that are stored in the database. In our sample university database, these relations contain information about entities, such as students and faculty, and about relationships, such as students' enrollment in courses. All student entities can be described using records in a Students relation. In fact, each collection of entities and each collection of relationships can be described as a relation, leading to the following conceptual schema:
Students(sid: string, name: string, login: string, age: integer, gpa: real)
Faculty(fid: string, fname: string, sal: real)
Courses(cid: string, cname: string, credits: integer)
Rooms(rno: integer, address: string, capacity: integer)
Enrolled(sid: string, cid: string, grade: string)
Teaches(fid: string, cid: string)
Meets_In(cid: string, rno: integer, time: string)
The choice of relations, and the choice of fields for each relation, is not always obvious, and the process of arriving at a good conceptual schema is called conceptual database design.

View corresponds to the third view of data: what part of the data is seen by a specific application. External schemas, which usually are also in terms of the data model of the DBMS, allow data access to be customized (and authorized) at the level of individual users or groups of users. The external schema design is guided by end user requirements. For example, we might want to allow students to find out the names of faculty members teaching courses, as well as course enrollments. This can be done by defining the following view:
Courseinfo(cid: string, fname: string, enrollment: integer)

STRUCTURE OF A DBMS:

When a user issues a query, the parsed query is presented to a query optimizer, which uses information about how the data is stored to produce an efficient execution plan for evaluating the query. An execution plan is a blueprint for evaluating a query, and is usually represented as a tree of relational operators. The code that implements relational operators sits on top of the file and access methods layer. This layer includes a variety of software for supporting the concept of a file, which, in a DBMS, is a collection of pages or a collection of records. This layer typically supports a heap file, or file of unordered pages, as well as indexes. In addition to keeping track of the pages in a file, this layer organizes the information within a page. The files and access methods layer code sits on top of the buffer manager, which brings pages in from disk to main memory as needed in response to read requests.

The lowest layer of the DBMS software deals with management of space on disk, where the data is stored. Higher layers allocate, deallocate, read, and write pages through (routines provided by) this layer, called the disk space manager. The DBMS supports concurrency and crash recovery by carefully scheduling user requests and maintaining a log of all changes to the database. DBMS components associated with concurrency control and recovery include the transaction manager, which ensures that transactions request and release locks according to a suitable locking protocol and schedules the execution of transactions; the lock manager, which keeps track of requests for locks and grants locks on database objects when they become available; and the recovery manager, which is responsible for maintaining a log, and restoring the system to a consistent state after a crash. The disk space manager, buffer manager, and file and access method layers must interact with these components.

Data Models: A collection of tools for describing:

Data.
Data relationships.
Data semantics.
Data constraints.

Relational model.
Entity-Relationship data model (mainly for database design).
Object-based data models (object-oriented and object-relational).
Semistructured data model (XML).

Other older models:

Network model.
Hierarchical model.

Database Access from Application Programs: To access the database, DML statements need to be executed from the host language. There are two ways to do this:
By providing an application program interface (a set of procedures) that can be used to send DML and DDL statements to the database and retrieve the results. The Open Database Connectivity (ODBC) standard defined by Microsoft for use with the C language is a commonly used application program interface standard. The Java Database Connectivity (JDBC) standard provides corresponding features to the Java language.
By extending the host language syntax to embed DML calls within the host language program. Usually, a special character prefaces DML calls, and a preprocessor, called the DML precompiler, converts the DML statements to normal procedure calls in the host language.

Database Users and Administrators:

Naive users are unsophisticated users who interact with the system by invoking one of the application programs that have been written previously.
Application programmers are computer professionals who write application programs.
Sophisticated users interact with the system without writing programs. Instead, they form their requests in a database query language. They submit each such query to a query processor, whose function is to break down DML statements into instructions that the storage manager understands. Analysts who submit queries to explore data in the database fall in this category.
Specialized users are sophisticated users who write specialized database applications that do not fit into the traditional data-processing framework.

Database Administrator: A person who has central control over the system is called a database administrator (DBA). The functions of a DBA include:
Schema definition. The DBA creates the original database schema by executing a set of data definition statements in the DDL.
Storage structure and access-method definition.
Schema and physical-organization modification. The DBA carries out changes to the schema and physical organization to reflect the changing needs of the organization, or to alter the physical organization to improve performance.
Granting of authorization for data access.
Routine maintenance.

Data Model:
A data model is a collection of conceptual tools for describing data, data relationships, data semantics, and consistency constraints.
Entity: An entity is a thing or object in the real world that is distinguishable from all other objects. For example, each person in an enterprise is an entity.
Entity set: An entity set is a set of entities of the same type that share the same properties, or attributes. The set of all persons who are customers at a given bank, for example, can be defined as the entity set customer. Similarly, the entity set loan might represent the set of all loans awarded by a particular bank. An entity is represented by a set of attributes. Attributes are descriptive properties possessed by each member of an entity set. The designation of an attribute for an entity set expresses that the database stores similar information concerning each entity in the entity set; however, each entity may have its own value for each attribute.
Simple and composite attributes: Attributes that are not divided into subparts are called simple attributes. Attributes that can be divided into subparts are called composite attributes. For example, an attribute name could be structured as a composite attribute consisting of first-name, middle-initial, and last-name.
Single-valued and multivalued attributes: The loan-number attribute for a specific loan entity refers to only one loan number. Such attributes are said to be single valued. There may be instances where an attribute has a set of values for a specific entity. Consider an employee entity set with the attribute phone-number. An employee may have zero, one, or several phone numbers, and different employees may have different numbers of phones. This type of attribute is said to be multivalued.

Derived attribute: The value for this type of attribute can be derived from the values of other related attributes or entities. For instance, let us say that the customer entity set has an attribute loans-held, which represents how many loans a customer has from the bank. We can derive the value for this attribute by counting the number of loan entities associated with that customer.
Relationship Sets: A relationship is an association among several entities. A relationship set is a set of relationships of the same type.
Mapping Cardinalities: Mapping cardinalities, or cardinality ratios, express the number of entities to which another entity can be associated via a relationship set. Mapping cardinalities are most useful in describing binary relationship sets, although they can contribute to the description of relationship sets that involve more than two entity sets. For a binary relationship set between entity sets A and B, the mapping cardinality must be one of the following:

One to one. An entity in A is associated with at most one entity in B, and an entity in B is associated with at most one entity in A. One to many. An entity in A is associated with any number (zero or more) of entities in B. An entity in B, however, can be associated with at most one entity in A. Many to one. An entity in A is associated with at most one entity in B. An entity in B, however, can be associated with any number (zero or more) of entities in A. Many to many. An entity in A is associated with any number (zero or more) of entities in B, and an entity in B is associated with any number (zero or more) of entities in A.
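The four cases above can be told apart mechanically for a given set of relationship instances. The following sketch classifies a list of (a, b) pairs; the function name and the sample pairs are hypothetical. It reports only what the sample data exhibits, whereas a mapping cardinality is a design constraint on every legal instance.

```python
from collections import Counter

def mapping_cardinality(pairs):
    """Classify a relationship instance given as (a, b) pairs from A to B."""
    pairs = set(pairs)
    a_counts = Counter(a for a, _ in pairs)  # number of B partners per A entity
    b_counts = Counter(b for _, b in pairs)  # number of A partners per B entity
    a_many = any(n > 1 for n in a_counts.values())
    b_many = any(n > 1 for n in b_counts.values())
    if a_many and b_many:
        return "many to many"
    if a_many:
        return "one to many"
    if b_many:
        return "many to one"
    return "one to one"

# Hypothetical instances: department-offers-course and student-enrolls-course.
offers = [("CS", "c1"), ("CS", "c2"), ("EE", "c3")]
enrolls = [("s1", "c1"), ("s1", "c2"), ("s2", "c1")]

print(mapping_cardinality(offers))   # one to many
print(mapping_cardinality(enrolls))  # many to many
```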

Keys: A key allows us to identify a set of attributes that suffice to distinguish entities from each other. Keys also help uniquely identify relationships, and thus distinguish relationships from each other.
Superkey: A superkey is a set of one or more attributes that, taken collectively, allow us to identify uniquely an entity in the entity set. For example, the customer-id attribute of the entity set customer is sufficient to distinguish one customer entity from another. Thus, customer-id is a superkey. Similarly, the combination of customer-name and customer-id is a superkey for the entity set customer. The customer-name attribute of customer is not a superkey, because several people might have the same name.
Candidate key: If K is a superkey, then so is any superset of K. We are often interested in superkeys for which no proper subset is a superkey. Such minimal superkeys are called candidate keys. It is possible that several distinct sets of attributes could serve as a candidate key. Suppose that a combination of customer-name and customer-street is sufficient to distinguish among members of the customer entity set. Then, both {customer-id} and {customer-name, customer-street} are candidate keys. Although the attributes customer-id and customer-name together can distinguish customer entities, their combination does not form a candidate key, since the attribute customer-id alone is a candidate key.
Primary key: The term primary key denotes a candidate key that is chosen by the database designer as the principal means of identifying entities within an entity set. A key (primary, candidate, and super) is a property of the entity set, rather than of the individual entities. Any two individual entities in the set are prohibited from having the same value on the key attributes at the same time. The designation of a key represents a constraint in the real-world enterprise being modeled.
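The customer example above can be reproduced with a brute-force search for minimal superkeys. This is a sketch only: the sample rows are invented, the attribute names use underscores instead of hyphens, and the test looks at a single instance, so it can suggest candidate keys but cannot establish them as constraints of the enterprise.

```python
from itertools import combinations

def is_superkey(rows, attrs):
    """True if no two rows agree on all of attrs (checked on this instance only)."""
    projected = [tuple(r[a] for a in attrs) for r in rows]
    return len(set(projected)) == len(projected)

def candidate_keys(rows, attributes):
    """Minimal superkeys of the instance, smallest first."""
    keys = []
    for size in range(1, len(attributes) + 1):
        for combo in combinations(attributes, size):
            if any(set(k) <= set(combo) for k in keys):
                continue  # contains a smaller key, so it is not minimal
            if is_superkey(rows, combo):
                keys.append(combo)
    return keys

# Hypothetical customer instance: two customers share a name, two share a street.
customers = [
    {"customer_id": 1, "customer_name": "Hayes", "customer_street": "Main"},
    {"customer_id": 2, "customer_name": "Hayes", "customer_street": "North"},
    {"customer_id": 3, "customer_name": "Jones", "customer_street": "Main"},
]
attributes = ["customer_id", "customer_name", "customer_street"]

print(candidate_keys(customers, attributes))
```

On this instance the search returns {customer_id} and {customer_name, customer_street}. The pair {customer_id, customer_name} is a superkey but is skipped because it contains the smaller key {customer_id}.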
Weak Entity Sets:An entity set may not have sufficient attributes to form a primary key. Such an entity set is termed a weak entity set. An entity set that has a primary key is termed a strong entity set.

For a weak entity set to be meaningful, it must be associated with another entity set, called the identifying or owner entity set. Every weak entity must be associated with an identifying entity; that is, the weak entity set is said to be existence dependent on the identifying entity set. The identifying entity set is said to own the weak entity set that it identifies. The relationship associating the weak entity set with the identifying entity set is called the identifying relationship. The identifying relationship is many to one from the weak entity set to the identifying entity set, and the participation of the weak entity set in the relationship is total. In our example, the identifying entity set for payment is loan, and a relationship loan-payment that associates payment entities with their corresponding loan entities is the identifying relationship. Although a weak entity set does not have a primary key, we nevertheless need a means of distinguishing among all those entities in the weak entity set that depend on one particular strong entity. The discriminator of a weak entity set is a set of attributes that allows this distinction to be made. In E-R diagrams, a doubly outlined box indicates a weak entity set, and a doubly outlined diamond indicates the corresponding identifying relationship. In the figure, the weak entity set payment depends on the strong entity set loan via the relationship set loan-payment. The figure also illustrates the use of double lines to indicate total participation: the participation of the (weak) entity set payment in the relationship loan-payment is total, meaning that every payment must be related via loan-payment to some loan. Finally, the arrow from loan-payment to loan indicates that each payment is for a single loan. The discriminator of a weak entity set also is underlined, but with a dashed, rather than a solid, line.
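When a weak entity set such as payment is mapped to a table, a common approach is to form its primary key from the owner's primary key plus the discriminator. The sketch below uses Python's sqlite3 module. The column names (loan_number, payment_number, payment_amount) and the ON DELETE CASCADE choice are assumptions made for this example, not something stated in the notes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript("""
    CREATE TABLE loan (
        loan_number TEXT PRIMARY KEY,
        amount      INTEGER
    );
    -- payment is weak: payment_number (the discriminator) is unique only per loan.
    CREATE TABLE payment (
        loan_number    TEXT NOT NULL
            REFERENCES loan(loan_number) ON DELETE CASCADE,
        payment_number INTEGER NOT NULL,
        payment_amount INTEGER,
        PRIMARY KEY (loan_number, payment_number)
    );
    INSERT INTO loan VALUES ('L-17', 1000), ('L-23', 2000);
    INSERT INTO payment VALUES ('L-17', 1, 100), ('L-23', 1, 250);
""")

def orphan_payment_rejected(connection):
    """Total participation: a payment without an existing loan must be refused."""
    try:
        connection.execute("INSERT INTO payment VALUES ('L-99', 1, 50)")
    except sqlite3.IntegrityError:
        return True
    return False

rejected = orphan_payment_rejected(conn)

# Existence dependence: deleting a loan removes its payments.
conn.execute("DELETE FROM loan WHERE loan_number = 'L-17'")
remaining = conn.execute(
    "SELECT loan_number, payment_number FROM payment").fetchall()
print(rejected, remaining)
```

Both loans have a payment numbered 1, which is allowed because the discriminator only has to be unique among the payments of one loan.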

Specialization: An entity set may include subgroupings of entities that are distinct in some way from other entities in the set. For instance, a subset of entities within an entity set may have attributes that are not shared by all the entities in the entity set. The E-R model provides a means for representing these distinctive entity groupings. Consider an entity set person, with attributes name, street, and city. A person may be further classified as one of the following:
customer
employee
Each of these person types is described by a set of attributes that includes all the attributes of entity set person plus possibly additional attributes. For example, customer entities may be described further by the attribute customer-id, whereas employee entities may be described further by the attributes employee-id and salary. The process of designating subgroupings within an entity set is called specialization. The specialization of person allows us to distinguish among persons according to whether they are employees or customers.

Generalization: The design process may also proceed in a bottom-up manner, in which multiple entity sets are synthesized into a higher-level entity set on the basis of common features. The database designer may have first identified a customer entity set with the attributes name, street, city, and customer-id, and an employee entity set with the attributes name, street, city, employee-id, and salary. There are similarities between the customer entity set and the employee entity set in the sense that they have several attributes in common. This commonality can be expressed by generalization, which is a containment relationship that exists between a higher-level entity set and one or more lower-level entity sets. In our example, person is the higher-level entity set and customer and employee are lower-level entity sets. Higher- and lower-level entity sets also may be designated by the terms superclass and subclass, respectively. The person entity set is the superclass of the customer and employee subclasses. For all practical purposes, generalization is a simple inversion of specialization. We will apply both processes, in combination, in the course of designing the E-R schema for an enterprise. In terms of the E-R diagram itself, we do not distinguish between specialization and generalization. New levels of entity representation will be distinguished (specialization) or synthesized (generalization) as the design schema comes to express fully the database application and the user requirements of the database. Differences in the two approaches may be characterized by their starting point and overall goal. Generalization proceeds from the recognition that a number of entity sets share some common features (namely, they are described by the same attributes and participate in the same relationship sets).

Aggregation: Aggregation is an abstraction in which relationship sets (along with their associated entity sets) are treated as higher-level entity sets, and can participate in relationships.

Symbols used in the E-R notation: [figure not reproduced]

ER Model for a College DB: Assumptions:

A college contains many departments.
Each department can offer any number of courses.
Many instructors can work in a department.
An instructor can work in only one department.
For each department there is a Head.
An instructor can be head of only one department.
Each instructor can take any number of courses.
A course can be taken by only one instructor.
A student can enroll for any number of courses.
Each course can have any number of students.

Steps in ER Modeling:

Identify the entities.
Find the relationships.
Identify the key attributes for every entity.
Identify other relevant attributes.
Draw the complete E-R diagram with all attributes, including the primary key.

Step 1: Identify the Entities:

DEPARTMENT, STUDENT, COURSE, INSTRUCTOR

Step 2: Find the relationships:

One course is taken by multiple students and one student enrolls for multiple courses, hence the cardinality between course and student is many to many.
A department offers many courses and each course belongs to only one department, hence the cardinality between department and course is one to many.
One department has multiple instructors and one instructor belongs to one and only one department, hence the cardinality between department and instructor is one to many.
Each department has one Head of department and an instructor can be Head of only one department, hence the cardinality is one to one.
One course is taught by only one instructor, but an instructor teaches many courses, hence the cardinality between course and instructor is many to one.

Step 3: Identify the key attributes

- Deptname is the key attribute for the Department entity, as it identifies the department uniquely.
- Course# (CourseId) is the key attribute for the Course entity.
- Student# (Student Number) is the key attribute for the Student entity.
- Instructor Name is the key attribute for the Instructor entity.

Step 4: Identify other relevant attributes

- For the Department entity: location.
- For the Course entity: course name, duration, prerequisite.
- For the Instructor entity: room#, telephone#.
- For the Student entity: student name, date of birth.
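One possible way to carry the college E-R design above into tables is sketched below using Python's built-in sqlite3 module. This is only an illustration under stated assumptions: the table and column names (department, head_name, enrollment and so on) are hypothetical, not taken from the notes. Each one-to-many relationship becomes a foreign key on the "many" side, the many-to-many enrollment becomes a separate table, and the one-to-one Head relationship is approximated with a UNIQUE column.

```python
import sqlite3

# Hypothetical table and column names derived from the entities and keys above.
SCHEMA = """
CREATE TABLE department (
    dept_name TEXT PRIMARY KEY,
    location  TEXT,
    -- one-to-one: an instructor can head at most one department
    head_name TEXT UNIQUE REFERENCES instructor(instructor_name)
);
CREATE TABLE instructor (
    instructor_name TEXT PRIMARY KEY,
    room_no         TEXT,
    telephone_no    TEXT,
    -- one-to-many: each instructor works in exactly one department
    dept_name       TEXT NOT NULL REFERENCES department(dept_name)
);
CREATE TABLE course (
    course_id       TEXT PRIMARY KEY,
    course_name     TEXT,
    duration        TEXT,
    prerequisite    TEXT,
    dept_name       TEXT NOT NULL REFERENCES department(dept_name),
    -- many-to-one: a course is taken by only one instructor
    instructor_name TEXT REFERENCES instructor(instructor_name)
);
CREATE TABLE student (
    student_no    TEXT PRIMARY KEY,
    student_name  TEXT,
    date_of_birth TEXT
);
-- many-to-many: one row per (student, course) pair
CREATE TABLE enrollment (
    student_no TEXT REFERENCES student(student_no),
    course_id  TEXT REFERENCES course(course_id),
    PRIMARY KEY (student_no, course_id)
);
"""


def build_college_db():
    """Create an in-memory database holding the sketch schema."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only on request
    conn.executescript(SCHEMA)
    return conn
```

With this layout, trying to make the same instructor head of a second department is rejected by the UNIQUE constraint rather than by application code.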

ER model for Banking Business : Assumptions :

- There are multiple banks and each bank has many branches.
- Each branch has multiple customers.
- Customers have various types of accounts.
- Some customers have also taken different types of loans from these bank branches.
- One customer can have multiple accounts and loans.

Step 1: Identify the Entities

BANK BRANCH LOAN ACCOUNT CUSTOMER

Step 2: Find the relationships

- One Bank has many branches and each branch belongs to only one bank; hence the cardinality between Bank and Branch is one to many.
- One Branch offers many loans and each loan is associated with one branch; hence the cardinality between Branch and Loan is one to many.
- One Branch maintains multiple accounts and each account is associated with one and only one Branch; hence the cardinality between Branch and Account is one to many.
- One Loan can be taken by multiple customers, and each Customer can take multiple loans; hence the cardinality between Loan and Customer is many to many.
- One Customer can hold multiple accounts, and each Account can be held by multiple Customers; hence the cardinality between Customer and Account is many to many.

Step 3: Identify the key attributes

- BankCode (Bank Code) is the key attribute for the Bank entity, as it identifies the bank uniquely.
- Branch# (Branch Number) is the key attribute for the Branch entity.
- Customer# (Customer Number) is the key attribute for the Customer entity.
- Loan# (Loan Number) is the key attribute for the Loan entity.
- Account No (Account Number) is the key attribute for the Account entity.

Step 4: Identify other relevant attributes

- For the Bank entity, the relevant attributes other than BankCode would be Name and Address.
- For the Branch entity, the relevant attributes other than Branch# would be Name and Address.
- For the Loan entity, the relevant attribute other than Loan# would be Loan Type.
- For the Account entity, the relevant attribute other than Account No would be Account Type.
- For the Customer entity, the relevant attributes other than Customer# would be Name, Telephone# and Address.

E-R diagram with all attributes including Primary Key:

2.

Chapter 1: Introduction
- Purpose of Database Systems
- View of Data
- Database Languages
- Relational Databases
- Database Design
- Object-based and semistructured databases
- Data Storage and Querying
- Transaction Management
- Database Architecture
- Database Users and Administrators
- Overall Structure
- History of Database Systems
Silberschatz, Korth and Sudarshan, Database System Concepts, 5th Edition, May 23, 2005

Database Management System (DBMS)


- DBMS contains information about a particular enterprise
  - Collection of interrelated data
  - Set of programs to access the data
  - An environment that is both convenient and efficient to use
- Database Applications:
  - Banking: all transactions
  - Airlines: reservations, schedules
  - Universities: registration, grades
  - Sales: customers, products, purchases
  - Online retailers: order tracking, customized recommendations
  - Manufacturing: production, inventory, orders, supply chain
  - Human resources: employee records, salaries, tax deductions
- Databases touch all aspects of our lives


Purpose of Database Systems


- In the early days, database applications were built directly on top of file systems
- Drawbacks of using file systems to store data:
  - Data redundancy and inconsistency
    - Multiple file formats, duplication of information in different files
  - Difficulty in accessing data
    - Need to write a new program to carry out each new task
  - Data isolation: multiple files and formats
  - Integrity problems
    - Integrity constraints (e.g., account balance > 0) become buried in program code rather than being stated explicitly
    - Hard to add new constraints or change existing ones
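The integrity point above can be made concrete with a short sketch, assuming Python's built-in sqlite3 module and a hypothetical account table. Instead of every program re-checking the rule, the schema states it once as a CHECK clause and the database rejects rows that violate it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical account table; the CHECK clause states the rule once, in the schema.
conn.execute("""
    CREATE TABLE account (
        account_number TEXT PRIMARY KEY,
        balance        INTEGER NOT NULL CHECK (balance > 0)
    )
""")
conn.execute("INSERT INTO account VALUES ('A-101', 500)")


def try_insert(conn, account_number, balance):
    """Return True if the row is accepted, False if a constraint rejects it."""
    try:
        conn.execute("INSERT INTO account VALUES (?, ?)", (account_number, balance))
        return True
    except sqlite3.IntegrityError:
        return False
```

Changing the rule later means changing one table definition, not hunting through application code.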

Purpose of Database Systems (Cont.)

- Drawbacks of using file systems (cont.)
  - Atomicity of updates
    - Failures may leave database in an inconsistent state with partial updates carried out
    - Example: Transfer of funds from one account to another should either complete or not happen at all
  - Concurrent access by multiple users
    - Concurrent access needed for performance
    - Uncontrolled concurrent accesses can lead to inconsistencies
    - Example: Two people reading a balance and updating it at the same time
  - Security problems
    - Hard to provide user access to some, but not all, data
- Database systems offer solutions to all the above problems
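The funds-transfer example above can be sketched as follows, assuming Python's built-in sqlite3 module. The account table, the sample balances and the helper names are hypothetical. The credit is deliberately applied first, so a failed debit would leave a partial update behind if the two statements were not wrapped in one transaction.

```python
import sqlite3


def open_bank():
    """Create a small in-memory bank with two made-up accounts."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE account ("
        " account_number TEXT PRIMARY KEY,"
        " balance INTEGER NOT NULL CHECK (balance >= 0))"
    )
    conn.executemany("INSERT INTO account VALUES (?, ?)",
                     [("A-101", 500), ("A-215", 700)])
    conn.commit()
    return conn


def transfer(conn, src, dst, amount):
    """Move amount from src to dst; either both updates happen or neither."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute("UPDATE account SET balance = balance + ? "
                         "WHERE account_number = ?", (amount, dst))
            conn.execute("UPDATE account SET balance = balance - ? "
                         "WHERE account_number = ?", (amount, src))
        return True
    except sqlite3.IntegrityError:
        return False  # the debit would overdraw src, so the credit was undone too


def balances(conn):
    """Return the current balances as {account_number: balance}."""
    return dict(conn.execute("SELECT account_number, balance FROM account"))
```

A transfer that would overdraw the source account returns False and leaves both balances exactly as they were.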

Levels of Abstraction
- Physical level: describes how a record (e.g., customer) is stored.
- Logical level: describes data stored in database, and the relationships among the data.

    type customer = record
        customer_id : string;
        customer_name : string;
        customer_street : string;
        customer_city : string;
    end;

- View level: application programs hide details of data types. Views can also hide information (such as an employee's salary) for security purposes.
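The view-level point about hiding a salary can be illustrated with a small sketch, assuming Python's built-in sqlite3 module. The employee table, the sample row and the view name employee_public are hypothetical. SQLite has no per-user permissions, so the sketch only shows that the view exposes fewer columns; in a multi-user DBMS, access would additionally be granted on the view rather than on the base table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Logical level: the full record, including the sensitive column.
    CREATE TABLE employee (
        employee_id   TEXT PRIMARY KEY,
        employee_name TEXT,
        salary        INTEGER
    );
    INSERT INTO employee VALUES ('E-1', 'Hayes', 52000);

    -- View level: the same data with the salary column left out.
    CREATE VIEW employee_public AS
        SELECT employee_id, employee_name FROM employee;
""")

cursor = conn.execute("SELECT * FROM employee_public")
visible_columns = [col[0] for col in cursor.description]
rows = cursor.fetchall()
```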

View of Data
An architecture for a database system

Instances and Schemas


- Similar to types and variables in programming languages
- Schema: the logical structure of the database
  - Example: The database consists of information about a set of customers and accounts and the relationship between them
  - Analogous to type information of a variable in a program
  - Physical schema: database design at the physical level
  - Logical schema: database design at the logical level
- Instance: the actual content of the database at a particular point in time
  - Analogous to the value of a variable
- Physical Data Independence: the ability to modify the physical schema without changing the logical schema
  - Applications depend on the logical schema
  - In general, the interfaces between the various levels and components should be well defined so that changes in some parts do not seriously influence others.
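A rough illustration of schema versus instance, and of physical data independence, assuming Python's built-in sqlite3 module. The account table and the index name are hypothetical. Inserting rows changes the instance but not the schema; adding an index changes the physical design, yet the logical schema and the query results stay the same.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (account_number TEXT, balance INTEGER)")


def logical_schema(conn):
    """Column names and declared types of the account table."""
    # PRAGMA table_info yields (cid, name, type, notnull, dflt_value, pk) per column.
    return [(row[1], row[2]) for row in conn.execute("PRAGMA table_info(account)")]


def instance(conn):
    """The current content of the account table."""
    return conn.execute(
        "SELECT account_number, balance FROM account ORDER BY account_number"
    ).fetchall()


schema_before = logical_schema(conn)

# The instance changes; the schema does not.
conn.execute("INSERT INTO account VALUES ('A-101', 500)")
conn.execute("INSERT INTO account VALUES ('A-215', 700)")
instance_before = instance(conn)

# A physical-level change: add an access path. Queries are written the same way.
conn.execute("CREATE INDEX idx_account_number ON account(account_number)")
schema_after = logical_schema(conn)
instance_after = instance(conn)
```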

Data Models
- A collection of tools for describing
  - Data
  - Data relationships
  - Data semantics
  - Data constraints
- Relational model
- Entity-Relationship data model (mainly for database design)
- Object-based data models (object-oriented and object-relational)
- Semistructured data model (XML)
- Other older models:
  - Network model
  - Hierarchical model

Data Manipulation Language (DML)


- Language for accessing and manipulating the data organized by the appropriate data model
  - DML also known as query language
- Two classes of languages
  - Procedural: user specifies what data is required and how to get those data
  - Declarative (nonprocedural): user specifies what data is required without specifying how to get those data
- SQL is the most widely used query language
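The procedural/declarative distinction can be seen side by side in the sketch below, assuming Python's built-in sqlite3 module and a few made-up account rows. The first function spells out how to find the answer (loop, test, accumulate); the second only states what is wanted and leaves the how to the database.

```python
import sqlite3

# Made-up sample data: (account_number, balance)
accounts = [("A-101", 500), ("A-215", 700), ("A-305", 350)]


def procedural_total(rows, minimum):
    """Procedural style: the caller spells out each step."""
    total = 0
    for _number, balance in rows:
        if balance >= minimum:
            total += balance
    return total


def declarative_total(rows, minimum):
    """Declarative style: the query says what is wanted, not how to compute it."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE account (account_number TEXT, balance INTEGER)")
    conn.executemany("INSERT INTO account VALUES (?, ?)", rows)
    (total,) = conn.execute(
        "SELECT COALESCE(SUM(balance), 0) FROM account WHERE balance >= ?",
        (minimum,),
    ).fetchone()
    return total
```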

Data Definition Language (DDL)


- Specification notation for defining the database schema
  Example:
      create table account (
          account_number char(10),
          balance integer)
- DDL compiler generates a set of tables stored in a data dictionary
- Data dictionary contains metadata (i.e., data about data)
  - Database schema
  - Data storage and definition language
    - Specifies the storage structure and access methods used
  - Integrity constraints
    - Domain constraints
    - Referential integrity (references constraint in SQL)
    - Assertions
  - Authorization
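To see a data dictionary in practice, the sketch below runs the create table example through Python's built-in sqlite3 module and then reads the metadata back. It assumes the column is spelled account_number. The sqlite_master catalog table and PRAGMA table_info are specific to SQLite; other systems expose their dictionaries differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table account (account_number char(10), balance integer)")

# sqlite_master is SQLite's catalog: one row of metadata per table, index or view.
catalog = conn.execute(
    "SELECT type, name, sql FROM sqlite_master WHERE name = 'account'"
).fetchall()

# Column-level metadata for the same table; index 1 of each row is the column name.
columns = [row[1] for row in conn.execute("PRAGMA table_info(account)")]
```

Here catalog holds data about the table (its kind, name and defining statement), not the account rows themselves.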

Relational Model
- Example of tabular data in the relational model
  [Figure: example table, with its columns labeled "Attributes"]

A Sample Relational Database


SQL
- SQL: widely used non-procedural language
  - Example: Find the name of the customer with customer_id 192837465
        select customer.customer_name
        from customer
        where customer.customer_id = 192837465
  - Example: Find the balances of all accounts held by the customer with customer_id 192837465
        select account.balance
        from depositor, account
        where depositor.customer_id = 192837465
          and depositor.account_number = account.account_number
- Application programs generally access databases through one of
  - Language extensions to allow embedded SQL
  - Application program interface (e.g., ODBC/JDBC) which allow SQL queries to be sent to a database
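The two queries above can be sent to a database through an application program interface. The sketch below uses Python's built-in sqlite3 module in place of ODBC/JDBC; the table layouts and sample rows are made up for illustration, and the customer id is stored as text and passed as a query parameter.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer  (customer_id TEXT PRIMARY KEY, customer_name TEXT);
    CREATE TABLE account   (account_number TEXT PRIMARY KEY, balance INTEGER);
    CREATE TABLE depositor (customer_id TEXT, account_number TEXT);

    -- Made-up sample rows.
    INSERT INTO customer  VALUES ('192837465', 'Johnson'), ('019283746', 'Smith');
    INSERT INTO account   VALUES ('A-101', 500), ('A-201', 900), ('A-215', 700);
    INSERT INTO depositor VALUES ('192837465', 'A-101'), ('192837465', 'A-201'),
                                 ('019283746', 'A-215');
""")


def customer_name(conn, customer_id):
    """Name of the customer with the given id, or None if there is no such customer."""
    row = conn.execute(
        "select customer.customer_name from customer "
        "where customer.customer_id = ?", (customer_id,)
    ).fetchone()
    return row[0] if row else None


def account_balances(conn, customer_id):
    """Balances of all accounts held by the given customer, smallest first."""
    rows = conn.execute(
        "select account.balance from depositor, account "
        "where depositor.customer_id = ? "
        "and depositor.account_number = account.account_number "
        "order by account.balance", (customer_id,)
    )
    return [row[0] for row in rows]
```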

Database Design
The process of designing the general structure of the database:

- Logical Design: deciding on the database schema. Database design requires that we find a good collection of relation schemas.
  - Business decision: What attributes should we record in the database?
  - Computer Science decision: What relation schemas should we have and how should the attributes be distributed among the various relation schemas?
- Physical Design: deciding on the physical layout of the database

The Entity-Relationship Model

- Models an enterprise as a collection of entities and relationships
  - Entity: a thing or object in the enterprise that is distinguishable from other objects
    - Described by a set of attributes
  - Relationship: an association among several entities
- Represented diagrammatically by an entity-relationship diagram:

Object-Relational Data Models
- Extend the relational data model by including object orientation and constructs to deal with added data types.
- Allow attributes of tuples to have complex types, including non-atomic values such as nested relations.
- Preserve relational foundations, in particular the declarative access to data, while extending modeling power.
- Provide upward compatibility with existing relational languages.

XML: Extensible Markup Language

- Defined by the WWW Consortium (W3C)
- Originally intended as a document markup language, not a database language
- The ability to specify new tags, and to create nested tag structures, made XML a great way to exchange data, not just documents
- XML has become the basis for all new generation data interchange formats.
- A wide variety of tools is available for parsing, browsing and querying XML documents/data



Storage Management
- Storage manager is a program module that provides the interface between the low-level data stored in the database and the application programs and queries submitted to the system.
- The storage manager is responsible for the following tasks:
  - Interaction with the file manager
  - Efficient storing, retrieving and updating of data
- Issues:
  - Storage access
  - File organization
  - Indexing and hashing

Query Processing
1. Parsing and translation
2. Optimization
3. Evaluation

Query Processing (Cont.)

- Alternative ways of evaluating a given query
  - Equivalent expressions
  - Different algorithms for each operation
- Cost difference between a good and a bad way of evaluating a query can be enormous
- Need to estimate the cost of operations
  - Depends critically on statistical information about relations which the database must maintain
  - Need to estimate statistics for intermediate results to compute cost of complex expressions
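One way to watch a system choose between alternative evaluation strategies is to ask it for its plan. The sketch below assumes Python's built-in sqlite3 module, a hypothetical account table filled with synthetic rows, and SQLite's EXPLAIN QUERY PLAN statement. The exact plan wording varies between SQLite versions, so the code only records the description text and does not rely on a particular phrasing.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (account_number TEXT, balance INTEGER)")
# Synthetic rows: 5000 accounts with balances cycling through 0..999.
conn.executemany("INSERT INTO account VALUES (?, ?)",
                 [(f"A-{i}", i % 1000) for i in range(5000)])

QUERY = "SELECT account_number FROM account WHERE balance = 42"


def plan(conn):
    """Human-readable plan; the last column of each plan row is the description."""
    return " | ".join(row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + QUERY))


def answer(conn):
    """The query result, sorted so that two runs can be compared."""
    return sorted(row[0] for row in conn.execute(QUERY))


plan_without_index = plan(conn)      # no index yet, so the table must be scanned
answer_without_index = answer(conn)

conn.execute("CREATE INDEX idx_balance ON account(balance)")

plan_with_index = plan(conn)         # the optimizer can now use idx_balance
answer_with_index = answer(conn)
```

Both runs return the same rows; only the chosen evaluation strategy, and therefore the cost, differs.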

Transaction Management
- A transaction is a collection of operations that performs a single logical function in a database application
- Transaction-management component ensures that the database remains in a consistent (correct) state despite system failures (e.g., power failures and operating system crashes) and transaction failures.
- Concurrency-control manager controls the interaction among the concurrent transactions, to ensure the consistency of the database.

Database Architecture
The architecture of a database system is greatly influenced by the underlying computer system on which the database is running:

- Centralized
- Client-server
- Parallel (multi-processor)
- Distributed



Database Users
Users are differentiated by the way they expect to interact with the system
- Application programmers: interact with system through DML calls
- Sophisticated users: form requests in a database query language
- Specialized users: write specialized database applications that do not fit into the traditional data processing framework
- Naive users: invoke one of the permanent application programs that have been written previously
  - Examples: people accessing database over the web, bank tellers, clerical staff

Database Administrator
- Coordinates all the activities of the database system; the database administrator has a good understanding of the enterprise's information resources and needs.
- Database administrator's duties include:
  - Schema definition
  - Storage structure and access method definition
  - Schema and physical organization modification
  - Granting user authority to access the database
  - Specifying integrity constraints
  - Acting as liaison with users
  - Monitoring performance and responding to changes in requirements

Overall System Structure



History of Database Systems


- 1950s and early 1960s:
  - Data processing using magnetic tapes for storage
    - Tapes provide only sequential access
  - Punched cards for input
- Late 1960s and 1970s:
  - Hard disks allow direct access to data
  - Network and hierarchical data models in widespread use
  - Ted Codd defines the relational data model
    - Would win the ACM Turing Award for this work
    - IBM Research begins System R prototype
    - UC Berkeley begins Ingres prototype
  - High-performance (for the era) transaction processing

History (cont.)
- 1980s:
  - Research relational prototypes evolve into commercial systems
    - SQL becomes industrial standard
  - Parallel and distributed database systems
  - Object-oriented database systems
- 1990s:
  - Large decision support and data-mining applications
  - Large multi-terabyte data warehouses
  - Emergence of Web commerce
- 2000s:
  - XML and XQuery standards
  - Automated database administration
Database System Concepts, 5th Ed.
Silberschatz, Korth and Sudarshan. See www.db-book.com for conditions on reuse

End of Chapter 1

Figure 1.4

Figure 1.7
