
DBMS

Basic concepts: Data can be defined as unprocessed information. Information is data that is organized and communicated in a coherent (clear) and meaningful manner. Data is converted into information and information is converted into knowledge. Knowledge is information that has been evaluated and organized so that it can be used purposefully.
Flow of data: Data -> Information -> Knowledge -> Action

Disadvantages of File Management System: In an FMS the data is stored as a collection of operating system files. This approach has many drawbacks, including the following:
o Data redundancy and inconsistency: Multiple file formats and duplication of information in different files.
o Difficulty in accessing data: A new program has to be written to carry out each new task.
o Data isolation: Data is scattered across multiple files and formats.
o Integrity problems: Integrity constraints become buried in program code rather than being stated explicitly. It is hard to add new constraints or change existing ones.
o Atomicity of updates: Failures may leave the database in an inconsistent state with partial updates carried out. Example: A transfer of funds from one account to another should either complete or not happen at all.
o Concurrent access by multiple users: Concurrent access is needed for performance, but uncontrolled concurrent access can lead to inconsistencies. Example: Two people reading a balance and updating it at the same time.
o Limited data sharing and lengthy processing time.
The main drawback of FMS is storage and memory management.

Database: A structured or organized collection of related records or data that is stored in a computer system and can easily be accessed, managed and updated.

Database applications:
o Banking: All transactions
o Airlines: Reservations and schedules
o Universities: Registration, grades etc.
o Sales: Customers, products and purchases
o Online retailers: Order tracking, customized recommendations
o Human resources: Employee records, salaries etc.
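The funds-transfer example given under atomicity of updates can be sketched with Python's built-in sqlite3 module. This is only an illustration under assumptions: the account table, the account ids "A" and "B", and the rule that a balance may not go negative are hypothetical and not part of these notes. The credit is applied first on purpose, so that a failing debit shows the whole transfer being rolled back.

```python
import sqlite3


def make_bank():
    """Create an in-memory database with two hypothetical accounts."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE account ("
        " id TEXT PRIMARY KEY,"
        " balance INTEGER NOT NULL CHECK (balance >= 0))"
    )
    conn.executemany("INSERT INTO account VALUES (?, ?)", [("A", 100), ("B", 50)])
    conn.commit()
    return conn


def transfer(conn, src, dst, amount):
    """Move amount from src to dst; either both updates apply or neither."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE id = ?", (amount, dst)
            )
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE id = ?", (amount, src)
            )
        return True
    except sqlite3.IntegrityError:
        # The debit violated the CHECK constraint, so the credit was undone too.
        return False


def balances(conn):
    """Return {account id: balance} for all accounts."""
    return dict(conn.execute("SELECT id, balance FROM account ORDER BY id"))
```

With plain files, the program itself would have to detect the half-finished transfer and undo it; here the DBMS does that as part of the transaction.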

Database Management System: A DBMS is a database program or software system that stores, retrieves and modifies data in a database on request. Examples: MS Access, Oracle, SQL Server, FoxPro etc.

Three views of data / Levels of abstraction:
o Physical Level: Describes how a record is stored.
o Logical Level: Describes the data stored in the database and the relationships among the data.
o View Level: Application programs hide details of data types. Views can also hide information for security purposes.
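As a small illustration of the view level, the sketch below uses Python's sqlite3 module to define a view that hides one column of a table. The employee table, its columns, the sample rows and the view name employee_public are all assumptions made for this example.

```python
import sqlite3


def make_db():
    """Build a hypothetical employee table and a view that hides the salary."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE employee (emp_id INTEGER PRIMARY KEY, name TEXT, salary INTEGER);
        INSERT INTO employee VALUES (1, 'Jones', 52000), (2, 'Rama', 61000);
        -- View level: users of this view never see the salary column.
        CREATE VIEW employee_public AS SELECT emp_id, name FROM employee;
        """
    )
    return conn


def columns(conn, relation):
    """List the column names of a table or view (relation is a trusted name)."""
    cur = conn.execute(f"SELECT * FROM {relation}")
    return [col[0] for col in cur.description]
```

A program that queries employee_public works with the logical content (ids and names) without knowing how records are stored or that a salary column exists.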

Schema: The logical structure of the database.
o Physical Schema: Database design at the physical level.
o Logical Schema: Database design at the logical level.
Instance: The actual content of the database at a particular point of time.

Advantages of DBMS:
o Reduction of redundancies: In the database approach, data can be stored at a single place or with controlled redundancy under the DBMS, which saves space and does not permit inconsistency.
o Shared Data: A DBMS allows the sharing of the database under its control by any number of application programs or users.
o Data Independence: In a file system, a change in the structure of data may require alterations to programs. A DBMS separates the description of the data from the programs that use it, so programs are not affected by such changes. This is called data independence. The DBMS provides an abstract view of the data and hides the details.
o Improved Integrity: Data integrity means that the data should be accurate, valid and consistent. This is achieved by providing checks or constraints, which are consistency rules that the database is not permitted to violate. Constraints may apply to data items within a record or to relationships between records. Example: The age of every employee must be between 18 and 70 years.
o Efficient Data Access: A DBMS utilizes techniques to store and retrieve the data efficiently, at least for unforeseen queries. A complex DBMS should be able to provide services to end users so that they can retrieve the data almost immediately.
o Data Security: Data is of vital importance to an organization and may be confidential. Unauthorized persons must not access confidential data. The DBA, who has the ultimate responsibility for the data in the DBMS, can ensure that proper access procedures are followed, including proper authentication schemes for access to the DBMS and additional checks before permitting access to sensitive data. Different levels of security can be implemented for various types of data and operations.
o Backup and Recovery: A DBMS provides facilities for recovering from software and hardware failures. A backup and recovery subsystem is responsible for this. If a program fails, it restores the database to the state it was in before the execution of the program.
o Support for concurrent transactions: A DBMS allows multiple transactions to occur simultaneously.
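The integrity example above (employee age between 18 and 70) can be stated once, explicitly, as a constraint that the DBMS enforces. The following is a minimal sketch using Python's sqlite3 module; the employee table and its column names are assumptions made for illustration.

```python
import sqlite3


def make_db():
    """Create a hypothetical employee table with the age rule stated as a CHECK."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE employee ("
        " emp_id INTEGER PRIMARY KEY,"
        " name TEXT NOT NULL,"
        " age INTEGER NOT NULL CHECK (age BETWEEN 18 AND 70))"
    )
    return conn


def try_insert(conn, name, age):
    """Insert one employee; return False if the DBMS rejects the row."""
    try:
        with conn:
            conn.execute("INSERT INTO employee (name, age) VALUES (?, ?)", (name, age))
        return True
    except sqlite3.IntegrityError:
        return False
```

Because the rule lives in the schema rather than in application code, every program that inserts or updates employees is subject to it.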

Differences between File Processing System and Database Management System:

#  | File Processing System | Database Management System
1  | A file-processing system only coordinates physical access to the data | A database coordinates the physical and logical access to the data
2  | Cheaper | Costly
3  | Data dependent | Data independent
4  | Data redundancy | Controlled data redundancy
5  | Inconsistent data | Consistent data
6  | A file-processing system only allows pre-determined access to data | A DBMS is designed to allow flexibility in what queries give access to the data
7  | Integrity problems | Improved integrity
8  | Concurrent transactions are not supported | Supports concurrent transactions
9  | A file processing system is much more restrictive in simultaneous data access | A DBMS is designed to coordinate and permit multiple users to access data at the same time

Data Models:
A data model is a collection of tools for describing:
o Data
o Data relationships
o Data semantics
o Data constraints

Types of data models:
1) Relational Model
2) Entity Relationship Model
3) Object Based Data Models
   a) Object oriented
   b) Object relational
4) Semi-structured data model (XML)
5) Network Model
6) Hierarchical Model

Database design: The database design process is divided into six steps.
1) Requirement Analysis: Requirement analysis is an informal process that involves discussions with user groups, a study of the current operating environment and how it is going to change, and analysis of available documentation on existing applications.
2) Conceptual database design: The information gathered in the requirement analysis step is used to develop a high level description of the data to be stored in the database, along with the constraints known to hold over this data. This step is carried out using the ER model.
3) Logical database design: The task of the logical design step is to convert an ER diagram/ER schema into a relational database schema.
4) Schema Refinement: The fourth step in database design is to analyze the collection of relations in our relational database schema to identify potential problems and refine it.
5) Physical database design: This step may simply involve building indexes on some tables and clustering some tables, or it may involve a substantial redesign of parts of the database schema obtained from the earlier design steps.
6) Application and security design: Here we must identify the entities and processes involved in the application. We must describe the role of each entity in every process that is reflected in some application task, as part of the complete workflow for that task. For each role, we must identify the parts of the database that must be accessible and the parts that must not be accessible, and we must take steps to ensure that these access rules are enforced.

In the implementation phase, we must code each task in an application language using the DBMS to access data. Realistically, although we might begin with the six step process outlined here, a complete database design will probably require a subsequent tuning phase in which all six kinds of design steps are interleaved and repeated until the design is satisfactory.

Database Users: Users are differentiated by the way they expect to interact with the system.
o Application programmers: Interact with the system through DML calls.
o Sophisticated users: Form requests in a database query language.
o Specialized users: Write specialized database applications that do not fit into the traditional data processing framework.
o Naive users: People accessing the database over the web etc.

Database Administrator: The DBA coordinates all the activities of the database system and has a good understanding of the enterprise's information resources and needs. The DBA's duties include:
o Storage structure and access method definition
o Schema and physical organization modification
o Granting users authority to access the database
o Backing up of data
o Monitoring performance and responding to changes
o Database tuning

Entity Relationship Model


Features of ER Model:
o The entity relationship model is a high level conceptual data model.
o It allows us to describe the data involved in a real world enterprise in terms of objects and their relationships.
o It is widely used to develop an initial design of a database.
o It provides a set of useful concepts that make it convenient for a developer to move from a basic set of information to a detailed and precise description of information that can be easily implemented in a database system.
o The ER model describes data as a collection of:
   Entities / Entity sets
   Relationship sets
   Attributes

Entity:
o An entity is an object in the real world that is distinguishable from other objects. Example: car, table, book etc.
o An entity need not be a physical entity; it can also represent a concept in the real world. Example: project, loan etc.
o An entity represents a class of things, not any one instance. Example: the STUDENT entity has instances JONES, RAMA etc.
o An entity is denoted by a rectangle.
o Entity type/Entity set: A collection of entities of a similar kind is called an entity set or entity type.

Attribute:
o An attribute is a property used to describe a specific feature of an entity.
o Attributes are denoted by an ellipse.
Example: The STUDENT entity may be described by the attributes stud_name, age, address, course etc.

Domain:
o Each simple attribute of an entity type has a set of possible values that can be assigned to it. This is called the domain of the attribute.
o An attribute cannot have a value outside this domain.
o Example: For the PERSON entity, the person_Id attribute has a specific domain, say integer values from 1 to 100.

Types of Attributes:
1) Simple Attribute: An attribute that cannot be further divided into smaller parts and represents the basic meaning is called a simple attribute. Example: The name and age attributes of an entity PERSON are simple attributes.

2) Composite Attribute: An attribute that can be divided into subparts, where each individual unit has a specific meaning. Example: The attribute name could be structured as a composite attribute consisting of Firstname and Lastname.

3) Single Valued Attribute: An attribute that has a single value for a particular entity. Example: Age is a single valued attribute of a student entity.

4) Multivalued Attribute: An attribute that can have more than one value for a particular entity is called a multivalued attribute. It is represented by a double ellipse. Example: Consider an employee entity set with the attribute phone_number. An employee may have zero, one or several phone numbers. This type of attribute is said to be a multivalued attribute.

5) Stored Attribute: An attribute that is directly stored in the database. Example: The Birthdate attribute of a person.

6) Derived Attribute: An attribute derived from the values of other related attributes or entities. The value of a derived attribute is not stored but is computed when required. It is denoted by a dotted ellipse. Example: Age is calculated from date of birth; experience is calculated from DOJ (date of joining).

7) Null Attribute: An attribute takes a null value when an entity does not have a value for it.
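
The attribute types 1) to 7) above can be illustrated with a small Python sketch. It is not a database implementation, only a way to see the distinctions side by side. The Employee class, the field names and the spouse_name attribute are hypothetical choices made for this example.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional


@dataclass
class Name:
    """Composite attribute: divided into subparts that each carry meaning."""
    first_name: str
    last_name: str


@dataclass
class Employee:
    emp_id: int                      # simple, single valued attribute
    name: Name                       # composite attribute
    birthdate: date                  # stored attribute
    phone_numbers: List[str] = field(default_factory=list)  # multivalued: zero or more
    spouse_name: Optional[str] = None  # null when the entity has no value for it

    def age_on(self, today: date) -> int:
        """Derived attribute: computed from birthdate when required, never stored."""
        had_birthday = (today.month, today.day) >= (self.birthdate.month, self.birthdate.day)
        return today.year - self.birthdate.year - (0 if had_birthday else 1)
```

Computing the age on request avoids storing a value that would become stale every year, which is the usual reason for treating it as derived.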

8) Descriptive Attribute: Descriptive attributes are used to record information about a relationship.

Relationships: A relationship can be defined as:
o A connection or set of associations
o A rule for communication among entities
o An association among several entities
It is denoted by a diamond symbol.

Example: Association between student and course. STUDENT OPTS COURSE

Relationship Sets: A relationship set is a set of relationships of the same type. Example: Consider the relationship between two entity sets student and course. The collection of all the instances of the relationship OPTS forms a relationship set.

The degree of a relationship type is the number of participating entity types. A relationship between two entities is called a binary relationship.

Fig: An ER diagram with a binary relationship.

A relationship among three entities is called a ternary relationship.

Fig: ER diagram with a ternary relationship.

A relationship among n entities is called an n-ary relationship.

Roles: The entity sets of a relationship set need not be distinct. The function that an entity plays in a relationship is called its role. Roles are normally implicit and not specified. They are useful when the meaning of a relationship set needs clarification. Roles are indicated in ER diagrams by labeling the lines that connect diamonds to rectangles. Role labels are optional and are used to clarify the semantics of the relationship.

Cardinality Constraints: Cardinality specifies the number of instances of an entity associated with another entity participating in a relationship. Based on the cardinality, binary relationships can be further classified into the following categories:

One-to-one: An entity in A is associated with at most one entity in B, and an entity in B is associated with at most one entity in A. Example: Relationship between college and principal. One college can have at most one principal and one principal can be assigned to only one college.

One-to-many: An entity in A is associated with any number of entities in B. An entity in B is associated with at most one entity in A. Example: Relationship between department and faculty. One department can appoint any number of faculty members, but a faculty member is assigned to only one department.

Many-to-one: An entity in A is associated with at most one entity in B. An entity in B is associated with any number of entities in A. Example: Relationship between course and instructor. An instructor can teach various courses, but a course can be taught by only one instructor. Note that this is an assumption.

Many-to-many: Entities in A and B are associated with any number of entities from each other. Example: The Taught by relationship between course and faculty. One faculty member can be assigned to teach many courses and one course may be taught by many faculty members. Another example is the relationship between book and author. One author can write many books and one book can be written by more than one author.
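
One common way to carry these cardinalities into tables is sketched below with Python's sqlite3 module. The table and column names (college, principal, department, faculty, book, author, written_by) are assumptions chosen to mirror the examples above, and other mappings are possible.

```python
import sqlite3

SCHEMA = """
CREATE TABLE college (college_id INTEGER PRIMARY KEY, name TEXT);
-- One-to-one: UNIQUE stops two principals from pointing at the same college.
CREATE TABLE principal (
    principal_id INTEGER PRIMARY KEY,
    name TEXT,
    college_id INTEGER UNIQUE REFERENCES college(college_id)
);
CREATE TABLE department (dept_id INTEGER PRIMARY KEY, name TEXT);
-- One-to-many: each faculty row stores at most one dept_id,
-- while many faculty rows may share the same dept_id.
CREATE TABLE faculty (
    faculty_id INTEGER PRIMARY KEY,
    name TEXT,
    dept_id INTEGER REFERENCES department(dept_id)
);
CREATE TABLE author (author_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE book (book_id INTEGER PRIMARY KEY, title TEXT);
-- Many-to-many: one row per (book, author) pair.
CREATE TABLE written_by (
    book_id INTEGER NOT NULL REFERENCES book(book_id),
    author_id INTEGER NOT NULL REFERENCES author(author_id),
    PRIMARY KEY (book_id, author_id)
);
"""


def make_db():
    """Create the hypothetical schema with foreign key checking switched on."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves this off by default
    conn.executescript(SCHEMA)
    return conn


def run(conn, sql, params=()):
    """Execute one statement; return False if a constraint rejects it."""
    try:
        with conn:
            conn.execute(sql, params)
        return True
    except sqlite3.IntegrityError:
        return False
```

The many-to-one case is the one-to-many case seen from the other side, so it needs no separate table design.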

Recursive relationships: When the same entity type participates more than once in a relationship type, in different roles, the relationship type is called a recursive relationship.

Participation constraints: The participation constraints specify whether the existence of an entity depends on its being related to another entity via the relationship type. There are 2 types of participation constraints:
o Total Participation: When all the entities from an entity set participate in a relationship type, it is called total participation.
o Partial Participation: When it is not necessary for all the entities from an entity set to participate in a relationship type, it is called partial participation.

In the above ER diagram, every loan must have a customer associated with it through borrower. Therefore the participation of loan in borrower is total and the participation of customer in borrower is partial.

Strong Entity Set: Entity types that contain a key attribute are called strong entity types or regular entity types. Example: The Student entity has a key attribute RollNo which uniquely identifies it, hence it is a strong entity.

Weak Entity Set: Entity types that do not contain any key attribute, and hence cannot be identified independently, are called weak entity types. A weak entity can be identified uniquely only by considering some of its attributes in conjunction with the primary key attributes of another entity, which is called the identifying owner entity. A partial key is attached to a weak entity type and is used for unique identification of weak entities related to a particular owner entity. The following restrictions must hold:
o The owner entity set and the weak entity set must participate in a one-to-many relationship set. This relationship set is called the identifying relationship set of the weak entity set.
o The weak entity set must have total participation in the identifying relationship.

Enhanced Entity Relationship Model (EER)


Additional semantic concepts are incorporated into the original ER model; the result is called the EER Model. Examples of additional concepts of the EER model are:
a) Specialization
b) Generalization
c) Aggregation

Superclass: An entity type that includes one or more distinct subgroups of its occurrences.
Subclass: A distinct subgrouping of occurrences of an entity type.
The superclass/subclass relationship is one-to-one. A superclass may contain overlapping or distinct subclasses. Not all members of a superclass need to be members of a subclass.

Specialization:
1) Specialization is the process of identifying subsets of an entity set (the superclass) that share some distinguishing characteristics.
2) The superclass is defined first, the subclasses are defined next, and subclass attributes and relationship sets are then added.
3) It is a top-down design process; we designate subgroupings within an entity set that are distinctive from other entities in the set.
4) These subgroupings become lower-level entity sets that have attributes or participate in relationships that do not apply to the higher-level entity set.
5) It is depicted by a triangle component labeled ISA (e.g. customer is a person).
6) Attribute inheritance: a lower-level entity set inherits all the attributes and relationship participation of the higher-level entity set to which it is linked.

Example: In the ER diagram below, Person is the entity set having attributes Person_id, Name, Street, City. Person can be further classified as:
o Customer
o Employee
The Customer entity is described by the attribute credit_rating. The Employee entity is described by the attribute salary. Employee can be further classified as:
o Officer
o Teller
o Secretary
The process of designating subgroupings within an entity set is called specialization.
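
Attribute inheritance in the Person example can be mimicked with Python class inheritance, where ISA corresponds to subclassing. This is only an analogy under assumptions: the attribute types are guesses, and a Python object belongs to exactly one class, so the sketch behaves like a disjoint specialization.

```python
from dataclasses import dataclass, fields


@dataclass
class Person:
    """Higher-level entity set (superclass)."""
    person_id: int
    name: str
    street: str
    city: str


@dataclass
class Customer(Person):
    """Customer ISA Person: inherits every Person attribute, adds credit_rating."""
    credit_rating: int


@dataclass
class Employee(Person):
    """Employee ISA Person: inherits every Person attribute, adds salary."""
    salary: int


@dataclass
class Officer(Employee):
    """Officer ISA Employee, and therefore also ISA Person."""


def attribute_names(entity_type):
    """List the attributes of an entity type, inherited ones first."""
    return [f.name for f in fields(entity_type)]
```

Listing the attributes of Officer shows person_id, name, street and city from Person followed by salary from Employee, which is the inheritance described in point 6.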

Generalization:
1) A bottom-up design process.
2) Generalization consists of identifying some common characteristics of a collection of entity sets and creating a new entity set that contains entities possessing these common characteristics.
3) Subclasses are defined first, the superclass is defined next, and relationship sets involving the superclass are then defined.
4) Specialization and generalization are simple inversions of each other; they are represented in an E-R diagram in the same way.

Design Constraints on a Specialization/Generalization:
A. Constraint on which entities can be members of a given lower-level entity set.
   i. condition-defined. Example: all customers over 65 years are members of the senior-citizen entity set; senior-citizen ISA person.
   ii. user-defined
B. Constraint on whether or not entities may belong to more than one lower-level entity set within a single generalization.
   i. Disjoint: an entity can belong to only one lower-level entity set. Noted in the E-R diagram by writing disjoint next to the ISA triangle.
   ii. Overlapping: an entity can belong to more than one lower-level entity set.
C. Completeness constraint: specifies whether or not an entity in the higher-level entity set must belong to at least one of the lower-level entity sets within a generalization.
   i. total: an entity must belong to one of the lower-level entity sets
   ii. partial: an entity need not belong to one of the lower-level entity sets

Aggregation: Aggregation is an abstraction through which relationships are treated as higher level entities. Aggregation allows us to treat a relationship set as an entity set for purposes of participation in (other) relationships. Aggregation is used:
o When we have to model a relationship involving (entity sets and) a relationship set.
o When the relationship is distinct and has its own attributes.
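
The design constraints A to C on a specialization can be checked mechanically once entity sets are viewed as sets of entity identifiers. The following is a minimal sketch in Python; the function names and the sample identifiers are hypothetical.

```python
def condition_defined(ages, minimum_age_exclusive):
    """Condition-defined membership: entities whose age exceeds the given limit."""
    return {entity for entity, age in ages.items() if age > minimum_age_exclusive}


def is_disjoint(subclasses):
    """True if no entity belongs to more than one lower-level entity set."""
    seen = set()
    for members in subclasses.values():
        if seen & members:
            return False
        seen |= members
    return True


def is_total(superclass, subclasses):
    """True if every higher-level entity belongs to at least one lower-level set."""
    covered = set().union(*subclasses.values())
    return superclass <= covered
```

For instance, with persons p1, p2 and p3, the subclasses customer = {p1, p2} and employee = {p2, p3} are overlapping (p2 is in both) and total (every person is covered).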

Conceptual Design with the ER Model


It is most important to recognize that there is more than one way to model a given situation. Our next goal is to start to compare the pros and cons of common choices.

Should a concept be modeled as an entity or an attribute?
Consider the scenario in which we want to add address information to the Employees entity set. We might choose to add a single attribute address to the entity set. Alternatively, we could introduce a new entity set, Addresses, and then a relationship associating employees with addresses. What are the pros and cons? Adding a new entity set makes the model more complex, so it should only be done when there is a need for the complexity. For example, if some employees have multiple addresses to be associated with them, then the more complex model is needed. Also, representing addresses as a separate entity would allow a further breakdown, for example by zip code or city.

What if we wanted to modify the Works_In relationship to have both a start and an end date, rather than just a start date? We could add one new attribute for the end date; alternatively, we could create a new entity set Duration which represents intervals, and then the Works_In relationship can be made ternary (associating an employee, a department and an interval). What are the pros and cons? If the duration is described through descriptive attributes, only a single such duration can be modeled for each employee and department. That is, we could not express an employment history involving someone who left the department yet later returned.

Should a concept be modeled as an entity or a relationship?
Consider a situation in which a manager controls several departments. Let's presume that a company budgets a certain amount (budget) for each department. Yet it also wants managers to have access to some discretionary budget (dbudget). There are two corporate models. A discretionary budget may be created for each individual department; alternatively, there may be a discretionary budget for each manager, to be used as she desires. Which scenario is represented by the following ER diagram? If you want the alternate interpretation, how would you adjust the model?

Should we use binary or ternary relationships?
Consider the following ER diagram, representing insurance policies owned by employees at a company. Each employee can own several policies, each policy can be owned by several employees, and each dependent can be covered by several policies.

What if we wish to model the following additional requirements?
o A policy cannot be owned jointly by two or more employees.
o Every policy must be owned by some employee.
o Dependents is a weak entity set, and each dependent entity is uniquely identified by taking pname in conjunction with the policyid of a policy entity (which, intuitively, covers the given dependent).
The best way to model this is to switch away from the ternary relationship set, and instead use two distinct binary relationship sets.
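
One possible relational rendering of the two binary relationship sets is sketched below with Python's sqlite3 module. Only pname and policyid come from the text above; the table names, the ssn key for employees and the cost column are assumptions made for this example. Ownership is folded into the policies table, and dependents take their identity from the covering policy.

```python
import sqlite3

SCHEMA = """
CREATE TABLE employees (ssn TEXT PRIMARY KEY, name TEXT);
-- A single NOT NULL owner column: every policy has exactly one owning employee,
-- so it can be neither jointly owned nor unowned.
CREATE TABLE policies (
    policyid INTEGER PRIMARY KEY,
    cost INTEGER,
    ssn TEXT NOT NULL REFERENCES employees(ssn) ON DELETE CASCADE
);
-- Weak entity set: pname is only a partial key; the full key adds policyid.
CREATE TABLE dependents (
    pname TEXT NOT NULL,
    policyid INTEGER NOT NULL REFERENCES policies(policyid) ON DELETE CASCADE,
    PRIMARY KEY (pname, policyid)
);
"""


def make_db():
    """Create the hypothetical schema with foreign key checking switched on."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves this off by default
    conn.executescript(SCHEMA)
    return conn


def run(conn, sql, params=()):
    """Execute one statement; return False if a constraint rejects it."""
    try:
        with conn:
            conn.execute(sql, params)
        return True
    except sqlite3.IntegrityError:
        return False


def count(conn, table):
    """Number of rows in a table (table is a trusted, hard-coded name)."""
    return conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]
```

Deleting a policy removes its dependents through ON DELETE CASCADE, which reflects that a weak entity cannot exist without its identifying owner.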

Should we use aggregation?

Consider again the following ER diagram:

If we did not need the until or since attributes, we could model the identical setting using the following ternary relationship:
