Takahiko Saito
Spring 2005
CS 157A

Purpose of Normalization
• To reduce the chances for anomalies to occur in a database.
• Normalization prevents the possible corruption of databases stemming from what are called "insertion anomalies," "deletion anomalies," and "update anomalies."

Insertion Anomaly
• A failure to place a new database entry into all the places in the database where that new entry needs to be stored.
• In a properly normalized database, a new entry needs to be inserted into only one place in the database.

Deletion Anomaly
• A failure to remove an existing database entry when it is time to remove that entry.
• In a properly normalized database, an old entry that is to be removed needs to be deleted from only one place in the database.

Update Anomaly
• An update of a database involves modifications that may be additions, deletions, or both. Thus "update anomalies" can be either of the kinds of anomalies discussed above.

1st Normal Form
• A table (relation) is in 1NF if
  1. There are no duplicated rows in the table.
  2. Each cell is single-valued.
  3. Entries in a column are of the same kind.

2nd Normal Form
• A table is in 2NF if it is in 1NF and if all non-key attributes are fully dependent on each candidate key.
• A partial dependency occurs when a non-key attribute is dependent on only a part of the (composite) key.

1NF but not 2NF
• Supplier (supplier#, status, city, part#, quantity)
  – (supplier#, part#) -> quantity
  – supplier# -> status
  – supplier# -> city
  – city -> status
• => status and city are dependent on just part of the key, namely supplier#.

1NF but not 2NF (cont’d)
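The partial dependencies in the Supplier relation can also be found mechanically. The following is a minimal sketch, assuming the functional dependencies listed for Supplier and the key (supplier#, part#); the helper names `closure` and `partial_dependencies` are illustrative and not part of the original slides.

```python
from itertools import combinations

# Functional dependencies of the Supplier example, written as (lhs, rhs).
FDS = [
    ({"supplier#", "part#"}, {"quantity"}),
    ({"supplier#"}, {"status"}),
    ({"supplier#"}, {"city"}),
    ({"city"}, {"status"}),
]
KEY = {"supplier#", "part#"}
ATTRS = {"supplier#", "status", "city", "part#", "quantity"}


def closure(attrs, fds):
    """Return every attribute functionally determined by attrs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def partial_dependencies(key, attrs, fds):
    """Map each proper subset of the key to the non-key attributes it determines."""
    non_key = attrs - key
    found = {}
    for size in range(1, len(key)):
        for subset in combinations(sorted(key), size):
            determined = closure(subset, fds) & non_key
            if determined:
                found[subset] = sorted(determined)
    return found


print(partial_dependencies(KEY, ATTRS, FDS))
# prints {('supplier#',): ['city', 'status']}
```

A non-empty result means the relation is not in 2NF: here supplier# alone determines city and status, which is the partial dependency the decomposition removes.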
• Decomposition (into 2NF):
  – Supplier (supplier#, status, city)
  – Supplier_Part (supplier#, part#, quantity)

3rd Normal Form (3NF)
• A table is in 3NF if it is in 2NF and if it has no transitive dependencies.
• A transitive dependency is a functional dependency between non-key attributes.

2NF but not 3NF
• Supplier (supplier#, status, city)
  – supplier# -> status
  – supplier# -> city
  – city -> status
• => Lacks mutual independence among non-key attributes: status depends on city.

2NF but not 3NF (cont’d)
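One common way to remove the transitive dependency city -> status is to split Supplier into one table keyed by supplier# and one keyed by city. The sketch below assumes that split; the sample rows and the table names `supplier_city` and `city_status` are hypothetical, chosen only to illustrate that the split stores each city's status once and loses no information.

```python
# Hypothetical sample rows for Supplier(supplier#, status, city),
# chosen so that city -> status holds.
supplier = [
    {"supplier#": "S1", "status": 20, "city": "London"},
    {"supplier#": "S2", "status": 10, "city": "Paris"},
    {"supplier#": "S3", "status": 10, "city": "Paris"},
    {"supplier#": "S4", "status": 20, "city": "London"},
]


def project(rows, columns):
    """Project rows onto columns, dropping duplicate tuples."""
    seen = []
    for row in rows:
        projected = {c: row[c] for c in columns}
        if projected not in seen:
            seen.append(projected)
    return seen


def natural_join(left, right, on):
    """Join two lists of dict rows on a single shared column."""
    return [{**l, **r} for l in left for r in right if l[on] == r[on]]


# Assumed decomposition: each non-key fact is stored in exactly one place.
supplier_city = project(supplier, ["supplier#", "city"])
city_status = project(supplier, ["city", "status"])

# The status of each city now appears once instead of once per supplier.
print(len(city_status))  # prints 2

# Joining the two tables on city reproduces the original rows.
rejoined = natural_join(supplier_city, city_status, "city")
by_id = lambda row: row["supplier#"]
print(sorted(rejoined, key=by_id) == sorted(supplier, key=by_id))  # prints True
```

After the split, changing the status of a city is a single-row update in `city_status`, so the update anomaly described earlier cannot occur for that fact.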

Boyce-Codd Normal Form (BCNF)
• A table is in BCNF if every determinant is a candidate key.
• The definition of 3NF does not deal with a relation that:
  – has multiple candidate keys, where
  – those candidate keys are composite, and
  – the candidate keys overlap (i.e., have at least one common attribute)

3NF but not Boyce-Codd NF
• SUPPLIER_PART (supplier#, supplier_name, part#, quantity)
  – Two candidate keys:
    • (supplier#, part#) and (supplier_name, part#)
  – (supplier#, part#) -> quantity
  – (supplier#, part#) -> supplier_name
  – (supplier_name, part#) -> quantity
  – (supplier_name, part#) -> supplier#
  – supplier_name -> supplier#
  – supplier# -> supplier_name

Another example of Boyce-Codd NF
title          year  length  filmType  studioName  starName
Star Wars      1977  124     color     Fox         Fisher
Star Wars      1977  124     color     Fox         Hamill
Star Wars      1977  124     color     Fox         Ford
Mighty Ducks   1991  104     color     Disney      Esteves
Wayne’s World  1992  95      color     Paramount   Carvey
Wayne’s World  1992  95      color     Paramount   Meyers

Example (cont’d)
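Before reasoning about keys, it can help to check against the sample rows which columns are determined by which. The sketch below uses the six Movies rows from the table; the helper name `holds` is illustrative. A sample can only refute a functional dependency, never prove it, so a True result means "not contradicted by these rows".

```python
COLUMNS = ("title", "year", "length", "filmType", "studioName", "starName")
MOVIES = [
    ("Star Wars", 1977, 124, "color", "Fox", "Fisher"),
    ("Star Wars", 1977, 124, "color", "Fox", "Hamill"),
    ("Star Wars", 1977, 124, "color", "Fox", "Ford"),
    ("Mighty Ducks", 1991, 104, "color", "Disney", "Esteves"),
    ("Wayne’s World", 1992, 95, "color", "Paramount", "Carvey"),
    ("Wayne’s World", 1992, 95, "color", "Paramount", "Meyers"),
]


def holds(rows, columns, lhs, rhs):
    """Return True if no two rows agree on lhs but differ on rhs."""
    index = {name: i for i, name in enumerate(columns)}
    seen = {}
    for row in rows:
        key = tuple(row[index[c]] for c in lhs)
        value = tuple(row[index[c]] for c in rhs)
        # setdefault stores the first value seen for this key and returns it.
        if seen.setdefault(key, value) != value:
            return False
    return True


# title and year fix the film's own properties in these rows ...
print(holds(MOVIES, COLUMNS, ["title", "year"],
            ["length", "filmType", "studioName"]))  # prints True
# ... but not the star, since one film has several stars.
print(holds(MOVIES, COLUMNS, ["title", "year"], ["starName"]))  # prints False
```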
• {title, year, starName} as candidate key
• title, year -> length, filmType, studioName
• The above FD (Functional Dependency) violates the BCNF condition because title and year do not determine the sixth attribute, starName, and so {title, year} is not a candidate key.

Example (cont’d)
• We solve this BCNF violation by decomposing relation Movies into
  1. The schema with all the attributes of the FD
     {title, year, length, filmType, studioName}
  2. The schema with all attributes of Movies except the three that appear on the right of the FD
     {title, year, starName}

Summary of Boyce-Codd NF
• When there is more than one candidate key, a relational table may be in 3NF and anomalies may still result.
• This occurs when there is a composite primary key, and there are two equally valid candidates to make up part of this composite primary key. If there is an attribute (one or more columns) on which any other attribute is fully dependent, and this attribute is NOT itself a candidate key, then the table is not in Boyce-Codd Normal Form (BCNF).
• We fix this by breaking the table up into two tables, both in BCNF.
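
The BCNF condition and the fix can be expressed with attribute closures. The following is a minimal sketch applied to the Movies relation and its FD title, year -> length, filmType, studioName; the function names `closure`, `bcnf_violations`, and `decompose_once` are illustrative, and the sketch performs a single split rather than a full decomposition algorithm.

```python
def closure(attrs, fds):
    """Return every attribute functionally determined by attrs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def bcnf_violations(attrs, fds):
    """Return the nontrivial FDs whose left side does not determine all attributes."""
    return [
        (lhs, rhs)
        for lhs, rhs in fds
        if not rhs <= lhs and closure(lhs, fds) != attrs
    ]


def decompose_once(attrs, lhs, fds):
    """Split on one violating FD: (closure of lhs, lhs plus the remaining attributes)."""
    determined = closure(lhs, fds) & attrs
    return determined, lhs | (attrs - determined)


MOVIES = {"title", "year", "length", "filmType", "studioName", "starName"}
MOVIE_FDS = [({"title", "year"}, {"length", "filmType", "studioName"})]

violations = bcnf_violations(MOVIES, MOVIE_FDS)
lhs, _ = violations[0]
first, second = decompose_once(MOVIES, lhs, MOVIE_FDS)
print(sorted(first))   # prints ['filmType', 'length', 'studioName', 'title', 'year']
print(sorted(second))  # prints ['starName', 'title', 'year']
```

The two resulting schemas match the decomposition of Movies given in the example: one holds the attributes of the FD, the other holds title, year, and starName.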