
Provided by: CCS Global Tech

Data Processing
Basic DBMS concepts
Basic RDBMS concepts
Conceptual Database Design
ER Modeling Notations

Data collection, Recording, Sorting, Classifying, Calculating, Storing & retrieving, Summarizing, Communicating

Batch processing: Transactions are collected in a group & processed together.
On-line (interactive) processing: Transactions are processed as & when they appear.
Real-time processing: Processing runs in parallel with an ongoing activity, & the information produced is used to control that current (dynamic) activity.

A DBMS has to be persistent; that is, the data should remain accessible after the program that created it has ceased to exist, and even after the application that created it has been restarted. A DBMS also has to provide uniform methods, independent of any specific application, for accessing the stored information.
RDBMS stands for Relational Database Management System (relational DBMS). It adds the condition that the system supports a tabular structure for the data, with enforced relationships between the tables. This excludes databases that don't support a tabular structure or don't enforce relationships between tables.

Data is viewed as existing in two-dimensional tables known as relations.
An entity (table) consists of unique attributes (columns) and tuples (rows).
Tuples are unique.
Sometimes the value to be inserted into a particular cell may be unknown, or it may have no value. This is represented by a null.
Null is not the same as zero, blank or an empty string.
Relational Database: Any database whose logical organization is based on the relational data model.
RDBMS: A DBMS that manages the relational database.
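The difference between a null and an empty string can be seen in a small sketch. It uses Python's built-in sqlite3 module; the employee table, its columns and the sample rows are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (emp_id INTEGER PRIMARY KEY, name TEXT, phone TEXT)")
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?)",
    [(1, "Asha", "555-0100"), (2, "Ravi", ""), (3, "Mina", None)],
)

# An empty string is a known value; NULL means the value is unknown or absent.
empty_count = conn.execute(
    "SELECT COUNT(*) FROM employee WHERE phone = ''"
).fetchone()[0]
null_count = conn.execute(
    "SELECT COUNT(*) FROM employee WHERE phone IS NULL"
).fetchone()[0]
# Comparing with "= NULL" never matches, because NULL is not equal to anything.
eq_null_count = conn.execute(
    "SELECT COUNT(*) FROM employee WHERE phone = NULL"
).fetchone()[0]
```

Only the IS NULL test finds the row with the missing phone number; the empty string is treated as an ordinary value.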

Entity type, Attribute, Multivalued attribute, Relationship, Degree, Cardinality, Business Rule, Associative entity

Learn to draw Entity-Relationship (E-R) Diagrams
Review the role of conceptual data modeling in overall design and analysis of an information system
Distinguish between unary, binary, and ternary relationships, and give an example of each
Define four basic types of business rules in an E-R diagram

Representation of organizational data
Purpose is to show rules about the meaning and interrelationships among entities
Entity-Relationship (E-R) diagrams are commonly used to show how data are organized
Main goal of conceptual data modeling is to create accurate E-R diagrams
Methods such as interviewing, questionnaires and JAD are used to collect information
Consistency must be maintained between process flow, decision logic and data modeling descriptions

First step is to develop a data model for the system being replaced
Next, a new conceptual data model is built that includes all the requirements of the new system
In the design stage, the conceptual data model is translated into a physical design
Project repository links all design and data modeling steps performed during SDLC

Primary deliverable is the entity-relationship diagram
There may be as many as 4 E-R diagrams produced and analyzed during conceptual data modeling:

1. An E-R diagram that covers just the data needed in the project's application
2. An E-R diagram for the application system being replaced
3. An E-R diagram for the whole database from which the new application's data are extracted
4. An E-R diagram for the whole database from which data for the application system being replaced are drawn

Two perspectives

Top-down
Data model is derived from an intimate understanding of the business

Bottom-up
Data model is derived by reviewing specifications and business documents

Notation uses three main constructs

Data entities, Relationships, Attributes


Entity-Relationship (E-R) Diagram

A detailed, logical representation of the entities, associations and data elements for an organization or business

Entity

A person, place, object, event or concept in the user environment about which the organization wishes to maintain data. Represented by a rectangle in E-R diagrams.

Entity type

A collection of entities that share common properties or characteristics. The name of an entity type (or associative entity) is usually a singular noun with the first letter capitalized, shown as the label of a rectangle.

Attribute

A named property or characteristic of an entity that is of interest to an organization

Simple attribute: cannot be divided into simpler components. E.g. birth date of an employee.
Composite attribute: can be split into components. E.g. address of an employee, which can be split into Street, City, and State.

Single-valued: can take on only a single value for each entity instance. E.g. birth date of an employee; there can be only one value for this.
Multi-valued: can take many values. E.g. skill set of an employee, such as MATLAB, .NET development, etc.

Stored attribute: Attribute that needs to be stored permanently. E.g. name of an employee.
Derived attribute: Attribute that can be calculated from other attributes. E.g. years of service of an employee can be calculated from the date of joining and the current date.
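A derived attribute is computed when needed instead of being stored. A minimal sketch for years of service, assuming it means completed years; the function name and the sample dates are illustrative only.

```python
from datetime import date


def years_of_service(date_of_joining: date, today: date) -> int:
    """Derive completed years of service from the stored date of joining."""
    years = today.year - date_of_joining.year
    # Subtract one if this year's anniversary has not been reached yet.
    if (today.month, today.day) < (date_of_joining.month, date_of_joining.day):
        years -= 1
    return years


# One day before the ninth anniversary, only eight years are complete.
example = years_of_service(date(2015, 6, 15), date(2024, 6, 14))
```

Because the value depends on the current date, storing it would make it stale; only the date of joining needs to be kept.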

Regular entity: Entity that has its own key attribute. E.g. Employee, Student, Customer, Policy holder, etc.
Weak entity: Entity that depends on another entity for its existence and doesn't have a key attribute of its own. E.g. spouse of an employee.

A relationship type between two entity types defines the set of all associations between these entity types.
Each instance of the relationship between members of these entity types is called a relationship instance.

Primary Key, Unique Key, Composite Key, Foreign Key, Surrogate Key, Candidate Key, Alternate Key

Candidate keys and identifiers

Each entity type must have an attribute or set of attributes that distinguishes one instance from other instances of the same type

Candidate key

Attribute (or combination of attributes) that uniquely identifies each instance of an entity type

Identifier

A candidate key that has been selected as the unique identifying characteristic for an entity type

Selection rules for an identifier


1. Choose a candidate key that will not change its value
2. Choose a candidate key that will never be null
3. Avoid using intelligent keys
4. Consider substituting single value surrogate keys for large composite keys

Multi-valued Attribute

An attribute that may take on more than one value for each entity instance
Represented on E-R Diagram in two ways:
1. double-lined ellipse
2. weak entity

Relationship

An association between the instances of one or more entity types that is of interest to the organization
Association indicates that an event has occurred or that there is a natural link between entity types
Relationships are always labeled with verb phrases

Degree

Number of entity types that participate in a relationship

Three cases

Unary
A relationship between two instances of one entity type

Binary
A relationship between the instances of two entity types

Ternary
A simultaneous relationship among the instances of three entity types
Not the same as three binary relationships

Cardinality

The number of instances of entity B that can be associated with each instance of entity A

Minimum Cardinality

The minimum number of instances of entity B that may be associated with each instance of entity A

Maximum Cardinality

The maximum number of instances of entity B that may be associated with each instance of entity A

Relationship name is a verb phrase
Avoid vague names

Guidelines for defining relationships

Definition explains what action is being taken and why it is important
Give examples to clarify the action
Optional participation should be explained
Explain reasons for any explicit maximum cardinality

Not all instances of the entity type Employee participate in the relationship Head-of, because not every employee heads a department. So the Employee entity type is said to participate partially in the relationship. But every department is headed by some employee, so all instances of the entity type Department participate in this relationship. We say that there is total participation from the Department side.

Domain: The set of all data types and ranges of values that an attribute can assume

Several advantages
1. Verify that the values for an attribute are valid
2. Ensure that various data manipulation operations are logical
3. Help conserve effort in describing attribute characteristics
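One common way to approximate a domain in an RDBMS is a CHECK constraint, which lets the database verify values on insert. A minimal sketch with Python's sqlite3 module; the account table, the allowed account types and the non-negative balance rule are assumptions chosen for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE account (
        account_no   INTEGER PRIMARY KEY,
        -- Domain of account_type: one of two allowed labels.
        account_type TEXT NOT NULL CHECK (account_type IN ('savings', 'current')),
        -- Domain of balance: non-negative numbers (an assumed business rule).
        balance      REAL NOT NULL CHECK (balance >= 0)
    )
    """
)


def try_insert(row):
    """Return True if the row satisfies every domain, False if it is rejected."""
    try:
        conn.execute("INSERT INTO account VALUES (?, ?, ?)", row)
        return True
    except sqlite3.IntegrityError:
        return False


accepted = try_insert((1, "current", 10.0))
rejected_type = try_insert((2, "gold", 10.0))       # value outside the domain
rejected_balance = try_insert((3, "savings", -5.0))  # value outside the range
```

Declaring the rule once on the column means every application that writes to the table gets the same validation.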

Banks have customers. Customers are identified by name, custid, phone number and address.
Customers can have one or more accounts. Accounts are identified by an account number, account type (savings, current) and a balance.
Customers can avail loans. Loans are identified by loan id, loan type (car, home, personal) and an amount.
Banks are identified by a name, code and the address of the main office.
Banks have branches. Branches are identified by a branch number, branch name and an address.
Accounts and loans are related to the bank's branches.

Create an E-R diagram for a database to represent this application.

Each entity type becomes a table
Each single-valued attribute becomes a column
Derived attributes are ignored
Composite attributes are represented by components
Multi-valued attributes are represented by a separate table
The key attribute of the entity type becomes the primary key of the table

Here address is a composite attribute
Years of service is a derived attribute (can be calculated from date of joining and current date)
Skill set is a multi-valued attribute

The relational schema:
Employee (E#, Name, Door_No, Street, City, Pincode, Date_Of_Joining)
Emp_Skillset (E#, Skill)
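The Employee and Emp_Skillset schema above can be sketched as tables with Python's sqlite3 module. Column names are lower-cased versions of the ones in the schema (E# becomes e_no); the column types and the sample row are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript(
    """
    -- The composite attribute address is stored as its components.
    -- The derived attribute years of service is not stored at all.
    CREATE TABLE employee (
        e_no            INTEGER PRIMARY KEY,
        name            TEXT NOT NULL,
        door_no         TEXT,
        street          TEXT,
        city            TEXT,
        pincode         TEXT,
        date_of_joining TEXT
    );
    -- The multi-valued attribute skill set gets a table of its own.
    CREATE TABLE emp_skillset (
        e_no  INTEGER NOT NULL REFERENCES employee(e_no),
        skill TEXT NOT NULL,
        PRIMARY KEY (e_no, skill)
    );
    """
)
conn.execute(
    "INSERT INTO employee VALUES (1, 'Asha', '12A', 'MG Road', 'Pune', '411001', '2015-06-15')"
)
conn.executemany("INSERT INTO emp_skillset VALUES (?, ?)", [(1, "MATLAB"), (1, ".NET")])

skills = [
    row[0] for row in conn.execute("SELECT skill FROM emp_skillset WHERE e_no = 1")
]
columns = [row[1] for row in conn.execute("PRAGMA table_info(employee)")]
```

Each skill is one row in emp_skillset, so an employee can have any number of skills without changing the employee table.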

Weak entity types are converted into a table of their own, with the primary key of the strong entity acting as a foreign key in the table.
This foreign key, along with the key of the weak entity, forms the composite primary key of this table.
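A minimal sketch of the weak-entity rule, using Python's sqlite3 module and the spouse-of-employee example. The table and column names, and the choice of ON DELETE CASCADE to express existence dependence, are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript(
    """
    CREATE TABLE employee (
        e_no INTEGER PRIMARY KEY,
        name TEXT NOT NULL
    );
    -- Weak entity: a spouse row cannot exist without its employee.
    CREATE TABLE emp_spouse (
        e_no        INTEGER NOT NULL REFERENCES employee(e_no) ON DELETE CASCADE,
        spouse_name TEXT NOT NULL,  -- key of the weak entity
        PRIMARY KEY (e_no, spouse_name)
    );
    """
)
conn.execute("INSERT INTO employee VALUES (1, 'Asha')")
conn.execute("INSERT INTO emp_spouse VALUES (1, 'Kiran')")

# A spouse row that points at a non-existent employee is rejected.
try:
    conn.execute("INSERT INTO emp_spouse VALUES (99, 'Nobody')")
    orphan_allowed = True
except sqlite3.IntegrityError:
    orphan_allowed = False

# Deleting the owner removes the dependent row as well.
conn.execute("DELETE FROM employee WHERE e_no = 1")
remaining = conn.execute("SELECT COUNT(*) FROM emp_spouse").fetchone()[0]
```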

The way relationships are represented depends on the cardinality and the degree of the relationship

The possible cardinalities are:
1:1, 1:M, N:M

The degrees are:
Unary, Binary


Unary 1:1: Consider employees who are also a couple. The primary key field itself will become a foreign key in the same table.
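The couple example can be sketched with a self-referencing foreign key. This uses Python's sqlite3 module; the column name spouse_e_no and the UNIQUE constraint that keeps the relationship 1:1 are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.execute(
    """
    CREATE TABLE employee (
        e_no        INTEGER PRIMARY KEY,
        name        TEXT NOT NULL,
        -- Self-referencing foreign key; UNIQUE keeps the relationship 1:1.
        spouse_e_no INTEGER UNIQUE REFERENCES employee(e_no)
    )
    """
)
conn.execute("INSERT INTO employee VALUES (1, 'Asha', NULL)")
conn.execute("INSERT INTO employee VALUES (2, 'Ravi', 1)")
conn.execute("UPDATE employee SET spouse_e_no = 2 WHERE e_no = 1")

# Join the table with itself to list each couple once.
couples = conn.execute(
    """
    SELECT a.name, b.name
    FROM employee AS a
    JOIN employee AS b ON a.spouse_e_no = b.e_no
    WHERE a.e_no < b.e_no
    """
).fetchall()
```

Employees without a spouse in the company simply keep a null in spouse_e_no.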

Unary 1:N: The primary key field itself will become a foreign key in the same table, the same as for unary 1:1.

Unary M:N: There will be two resulting tables, one to represent the entity and another to represent the M:N relationship.

Binary 1:1

Case 1: Combination of participation types

The primary key of the partial participant will become a foreign key in the table of the total participant

Case 2: Uniform participation types

The primary key of either of the participants can become a foreign key in the other

Binary 1:N: The primary key of the relation on the 1 side of the relationship becomes a foreign key in the relation on the N side

Binary M:N: A new table is created to represent the relationship.
It contains two foreign keys, one from each of the participants in the relationship.
The primary key of the new table is the combination of the two foreign keys.
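A minimal sketch of a relationship table for an M:N relationship, using Python's sqlite3 module. The student, course and enrolls tables and their rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript(
    """
    CREATE TABLE student (s_no INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE course  (c_no INTEGER PRIMARY KEY, title TEXT NOT NULL);
    -- Relationship table: one foreign key per participant,
    -- and the pair of foreign keys is the primary key.
    CREATE TABLE enrolls (
        s_no INTEGER NOT NULL REFERENCES student(s_no),
        c_no INTEGER NOT NULL REFERENCES course(c_no),
        PRIMARY KEY (s_no, c_no)
    );
    INSERT INTO student VALUES (1, 'Asha'), (2, 'Ravi');
    INSERT INTO course  VALUES (10, 'Databases'), (20, 'Networks');
    INSERT INTO enrolls VALUES (1, 10), (1, 20), (2, 10);
    """
)

# One student has many courses ...
courses_of_asha = [
    row[0]
    for row in conn.execute(
        "SELECT c.title FROM enrolls e JOIN course c ON c.c_no = e.c_no "
        "WHERE e.s_no = 1 ORDER BY c.title"
    )
]
# ... and one course has many students.
students_in_databases = [
    row[0]
    for row in conn.execute(
        "SELECT s.name FROM enrolls e JOIN student s ON s.s_no = e.s_no "
        "WHERE e.c_no = 10 ORDER BY s.name"
    )
]
```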

Associative entity: An entity type that associates the instances of one or more entity types and contains attributes that are peculiar to the relationship between those entity instances

Well-structured table: contains minimal redundancy and allows users to insert, modify, and delete the rows without errors or inconsistencies. The possible anomalies could be:

Insertion Anomalies
Deletion Anomalies
Modification Anomalies

Normalization: The formal process that can be followed to achieve a good database design
Also used to check that an existing design is of good quality
The different stages of normalization are known as normal forms

An attribute B of a relation R is fully functionally dependent on an attribute (or set of attributes) A of R if it is functionally dependent on A and not functionally dependent on any proper subset of A.

An attribute B of a relation R is partially dependent on A if it is functionally dependent on some proper subset of A.

An attribute B of a relation R is transitively dependent on A if it is functionally dependent on an attribute C, which in turn is functionally dependent on A or on a proper subset of A.
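The definitions of full and partial dependency can be tried out on sample rows. The sketch below is illustrative only: the rows and attribute names are made up, and a check on sample data can only show that a dependency is not violated by those rows, not that it holds for the relation in general.

```python
from itertools import combinations


def holds(rows, lhs, rhs):
    """Return True if the functional dependency lhs -> rhs holds in rows."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[a] for a in rhs)
        # The same lhs value must always map to the same rhs value.
        if seen.setdefault(key, value) != value:
            return False
    return True


def fully_dependent(rows, lhs, rhs):
    """True if lhs -> rhs holds and no proper subset of lhs determines rhs."""
    if not holds(rows, lhs, rhs):
        return False
    for size in range(1, len(lhs)):
        for subset in combinations(lhs, size):
            if holds(rows, subset, rhs):
                return False  # rhs is only partially dependent on lhs
    return True


rows = [
    {"emp": 1, "project": "P1", "role": "dev", "shares": 100},
    {"emp": 1, "project": "P2", "role": "lead", "shares": 100},
    {"emp": 2, "project": "P1", "role": "test", "shares": 50},
]

role_is_full = fully_dependent(rows, ("emp", "project"), ("role",))
shares_is_full = fully_dependent(rows, ("emp", "project"), ("shares",))
```

Here role needs both emp and project, so it is fully dependent on the pair, while shares is already determined by emp alone and is therefore only partially dependent.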

An attribute of a relation R that belongs to any key of R is said to be a prime attribute, and one that doesn't is a non-prime attribute.

E.g. Report (S#, C#, Title, Lname, Room#, Marks)

S# is a prime attribute
C# is a prime attribute
Title is a non-prime attribute

A relation schema R is in 2NF if it is in 1NF and every non-prime attribute is fully functionally dependent on every key of R.
Consider the relational schema:

Empdetails( E#, Project#, Role, Number_Of_shares, Share_worth)

In this,
E#, Project# -> Role
E# -> Number_Of_shares
Number_Of_shares -> Share_worth
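In Empdetails, Number_Of_shares depends on E# alone, which is only part of the key {E#, Project#}, so the schema is not in 2NF. A minimal sketch of the usual fix, splitting the table in two, is shown below in plain Python. The sample rows are invented, and placing Share_worth in the same table as Number_Of_shares is an assumption.

```python
empdetails = [
    # (e_no, project_no, role, number_of_shares, share_worth)
    (1, "P1", "Developer", 100, 5000),
    (1, "P2", "Lead", 100, 5000),
    (2, "P1", "Tester", 50, 2500),
]


def project(rows, columns):
    """Relational projection: keep the given column positions, drop duplicates."""
    return sorted({tuple(row[c] for c in columns) for row in rows})


# Attributes that depend on the whole key stay with the key.
emp_project = project(empdetails, (0, 1, 2))
# Attributes that depend on e_no alone move to their own table.
emp_shares = project(empdetails, (0, 3, 4))

# A natural join on e_no rebuilds the original rows, so nothing is lost.
rejoined = sorted(
    (e, p, r, n, w)
    for (e, p, r) in emp_project
    for (e2, n, w) in emp_shares
    if e == e2
)
```

After the split, the share details of employee 1 are stored once instead of once per project.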

A relation schema R is in 3NF if it is in 2NF and every non-prime attribute is non-transitively dependent on every key of R

A relation R is in BCNF if, for every non-trivial functional dependency A -> B in it, it is true that A is a superkey of R.
In other words, every determinant is a candidate key.
BCNF is a stronger form of 3NF.
3NF states that every non-prime attribute must be non-transitively dependent on every key.
BCNF states that every attribute (prime or non-prime) must be non-transitively dependent on every key.

{Dept#, Course#} -> Lecturer#
{Dept#, Course#} -> Num-of_students
{Lecturer#, Course#} -> Num-of_students
Lecturer# -> Dept#

The candidate keys are:
{Dept#, Course#}
{Lecturer#, Course#}

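In the table, the only non-prime attribute is Num-of_students. It depends on every key of the table non-transitively, so the table is in 3NF.
But the fact that a lecturer (say L1) belongs to a department (say D1) is repeated in every row for that lecturer, which is redundancy. The cause is Lecturer# -> Dept#: the determinant Lecturer# is not a superkey, and Dept# depends on only part of the key {Lecturer#, Course#}. So the table is not in BCNF, and it is decomposed as follows: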

Course_Offering (Lecturer#, Course#, Num-of_students)
Lecturer (Lecturer#, Dept#)
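The candidate keys and the BCNF violation in this example can be checked mechanically with attribute closure. The sketch below is one straightforward way to do it, not a prescribed method; attribute names are written without the # and hyphen so they read as plain strings.

```python
from itertools import combinations

ATTRS = frozenset({"Dept", "Course", "Lecturer", "Num_students"})
FDS = [
    ({"Dept", "Course"}, {"Lecturer"}),
    ({"Dept", "Course"}, {"Num_students"}),
    ({"Lecturer", "Course"}, {"Num_students"}),
    ({"Lecturer"}, {"Dept"}),
]


def closure(attrs, fds):
    """Return every attribute determined by attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def candidate_keys(attrs, fds):
    """Return the minimal attribute sets whose closure is the whole relation."""
    keys = []
    for size in range(1, len(attrs) + 1):
        for combo in combinations(sorted(attrs), size):
            candidate = set(combo)
            # Skip supersets of keys already found: they are not minimal.
            if any(key <= candidate for key in keys):
                continue
            if closure(candidate, fds) == set(attrs):
                keys.append(candidate)
    return keys


keys = candidate_keys(ATTRS, FDS)

# BCNF requires every determinant to be a superkey of the relation.
bcnf_violations = [
    (lhs, rhs) for lhs, rhs in FDS if closure(lhs, FDS) != set(ATTRS)
]
```

The search finds the same two candidate keys listed above, and the only dependency whose left side is not a superkey is Lecturer -> Dept, which is the one removed by the decomposition.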

A caterers' association in a city wants to build a database of all the caterers who are members of the association. Every type of item has an item code, for example:

Idly - itemcode: bf1
Dosa - itemcode: bf2
Aloo paratha - itemcode: bf3
South Indian meals - itemcode: ln1

Each caterer may have outlets spread over the city. It is not necessary that all caterers, or all branches of a single caterer, provide all item types. Design the relational schema for the above requirement. Normalize the relations.
