
Database Normalization 1

Running head: DATABASE NORMALIZATION

Database Normalization

Lyndsey Parish

Thomas Edison State College

2009-07-CIS-311-OL009

Abstract

This paper describes what database normalization is, why it is important, what it does, and why it is

used when building a database. It also describes what happens to a database when it has

not been normalized. It then describes the first (1NF), second (2NF), and third (3NF) normal

forms.

Database Normalization

When designing a relational database, it is common at first to run into problems like data

redundancy and data anomalies. To decrease data redundancy and anomaly problems, database

designers use a process called normalization. The process removes data that is stored unnecessarily

in more than one table, and with it a common source of inconsistency. Normalization works through a series of

stages, which evaluate and correct the table structure. Database normalization is essentially the process

of organizing data in the database to increase consistency and integrity of the data.

Redundant data wastes time and space. If a piece of data exists in more than one place and needs to be

changed, it must be changed in exactly the same way in every place it exists. If the data is not changed

correctly, it will cause data inconsistency, meaning that a customer, for example, could have two

different addresses, leaving the end user unsure of which address is the correct one. Redundant data not

only creates update anomalies but can also create insertion and deletion anomalies. All of these

threaten the data integrity of the database.
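
The update anomaly described above can be sketched with Python's built-in sqlite3 module. This is only an illustration under assumed data: the orders table, the customer names, and the addresses are hypothetical and do not come from any cited source.

```python
import sqlite3

# Hypothetical unnormalized table: the customer's address is repeated on every order row.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (order_id INTEGER PRIMARY KEY, customer TEXT, address TEXT)"
)
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(1, "Ann", "12 Oak St"), (2, "Ann", "12 Oak St"), (3, "Bob", "9 Elm Ave")],
)

# The address is corrected on one row only, so the other copy is left stale.
conn.execute("UPDATE orders SET address = ? WHERE order_id = ?", ("40 Pine Rd", 1))

# Collect every distinct address now stored for the same customer.
ann_addresses = [
    row[0]
    for row in conn.execute(
        "SELECT DISTINCT address FROM orders WHERE customer = ? ORDER BY address",
        ("Ann",),
    )
]
print(ann_addresses)  # two different addresses for one customer
```

Because the address lives in two rows, a single incomplete update leaves the customer with two addresses, which is the inconsistency the paragraph describes.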

The normalization process ensures that all relations become well formed, but they

must have certain characteristics to be considered normalized. Each table must represent a single

subject, no data item may be stored unnecessarily in more than one table, all nonprime attributes in a table must be

dependent on the primary key, and every table must be free of insertion, update, and deletion anomalies. All these characteristics

make sure that the data is consistent and has integrity.

The different normalization stages are rules called “normal forms.” Seven normal forms are commonly

described, but only the first three are in common use. A relational database starts

with first normal form (1NF), and progresses to third normal form (3NF). First normal form is the least

restrictive, and by adding restrictions the database can progress then into second normal form. All

relations in second normal form are also in first normal form because they still satisfy

first normal form's requirements. All relations in third normal form are also in second normal form and

first normal form. The pattern continues as you move into the higher normal forms.

When normalizing the relational database, any table that is considered a relation is in 1NF. Each

cell can contain only a single value. Every entry in a column must be of the same kind, and every column

needs to have a unique name. The order of the columns and rows does not matter, and no two rows may be

identical. Because these requirements are so minimal, almost every table qualifies for and

begins at 1NF. A table in 1NF still leaves many possibilities for modification anomalies, so it must continue

to be normalized.
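
As a minimal sketch of the single-value rule, the Python below splits a cell that holds several phone numbers into one row per value. The table layout, the column names, and the helper to_first_normal_form are assumptions made for illustration only.

```python
# Hypothetical unnormalized rows: the "phones" cell holds several values at once.
unnormalized = [
    {"customer_id": 1, "name": "Ann", "phones": "555-0100, 555-0101"},
    {"customer_id": 2, "name": "Bob", "phones": "555-0199"},
]


def to_first_normal_form(rows):
    """Return one row per (customer, phone) so every cell holds a single value."""
    atomic_rows = []
    for row in rows:
        for phone in row["phones"].split(","):
            atomic_rows.append(
                {
                    "customer_id": row["customer_id"],
                    "name": row["name"],
                    "phone": phone.strip(),
                }
            )
    return atomic_rows


customer_phones = to_first_normal_form(unnormalized)
for record in customer_phones:
    print(record)
```

Every cell now holds one value, but the customer's name is repeated on each phone row, which is the kind of leftover redundancy the later normal forms address.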

Second normal form (2NF) adds a few more restrictions, focusing on the removal of duplicate

data. For a relation to be in 2NF, it must first meet the requirements of 1NF. 2NF creates separate tables

for the duplicated data, and relates these tables with a foreign key. Each component of a composite key that determines other attributes becomes the key of a new

table, so that every non-key attribute is dependent on the entire key. Removing these partial dependencies gets rid of many anomaly problems, but a table in

2NF can still contain attributes that depend on other non-key attributes. Because 2NF can still have anomalies, we

can continue to normalize the relational database.
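
The move to 2NF can be sketched in plain Python. The example assumes a hypothetical order_items table whose key is the pair (order_id, product_id); the product name depends on product_id alone, so it is moved into its own table. All names and values are illustrative.

```python
# Hypothetical 1NF table with composite key (order_id, product_id).
# product_name depends only on product_id, which is a partial dependency.
order_items = [
    # (order_id, product_id, product_name, quantity)
    (1, "P1", "Widget", 2),
    (1, "P2", "Gadget", 1),
    (2, "P1", "Widget", 5),
]

# New table keyed by product_id: each product name is stored exactly once.
products = {
    product_id: product_name
    for (_order_id, product_id, product_name, _quantity) in order_items
}

# Remaining table: quantity depends on the whole key (order_id, product_id).
# product_id now acts as a foreign key into products.
order_lines = [
    (order_id, product_id, quantity)
    for (order_id, product_id, _product_name, quantity) in order_items
]

print(products)
print(order_lines)
```

After the split, renaming a product means changing one entry in products rather than every order row that mentions it.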

Third normal form (3NF) continues to add more restrictions to eliminate fields that do not

depend directly on the key. For a relation to be in 3NF, it must also qualify for 2NF. A table in 2NF can

still have anomalies that are created by dependencies among its non-key attributes. Data in a record

that depend on something other than the record's key must be moved to a separate table, because they do not

belong there. This process eliminates transitive dependencies, putting the relational database in 3NF.

A table in 3NF may still harbor anomalies; however, moving on to higher normal forms may not be

practical for many databases. While stopping at 3NF may not create the perfect

database, it will usually not affect the functionality of one.
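
A transitive dependency and its removal can be sketched with sqlite3. The example assumes a hypothetical case in which an employee determines a department and the department determines its name, so the department name is kept in its own table. The schema and data are illustrative assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- dept_name depends on dept_id, not on emp_id, so it gets its own table.
    CREATE TABLE departments (
        dept_id   INTEGER PRIMARY KEY,
        dept_name TEXT NOT NULL
    );
    CREATE TABLE employees (
        emp_id   INTEGER PRIMARY KEY,
        emp_name TEXT NOT NULL,
        dept_id  INTEGER NOT NULL REFERENCES departments(dept_id)
    );
    INSERT INTO departments VALUES (10, 'Sales');
    INSERT INTO employees VALUES (1, 'Ann', 10);
    INSERT INTO employees VALUES (2, 'Bob', 10);
    """
)

# Renaming the department is a single-row update in the 3NF design.
conn.execute("UPDATE departments SET dept_name = ? WHERE dept_id = ?", ("Revenue", 10))

# A join reassembles the employee and department data for reading.
rows = conn.execute(
    "SELECT e.emp_name, d.dept_name "
    "FROM employees e JOIN departments d ON e.dept_id = d.dept_id "
    "ORDER BY e.emp_id"
).fetchall()
print(rows)
```

Every employee sees the new department name after one update, at the cost of a join when the data is read back.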

Normalizing a database to 3NF is usually the most practical choice because every time the tables are

normalized, more tables are created, and the added keys and tables can require more space. When normalization requires too much space, it

might not be worth the effort, because the database may use as much space as it did

when there was redundant data. If a database is queried often, the extra joins that heavy normalization requires can also reduce the

performance of the database. Therefore, most of the time, the database designer must find a balance

between performance and data integrity.

Database normalization is an important part of the database design process. The normal forms

determine to what degree the database is vulnerable to inconsistent data and data anomalies. The

higher the normal form, the less vulnerable the database is, meaning it has higher integrity and consistency.

However, too much normalization can reduce performance and increase the size of the

database. Therefore, the ideal and most commonly used normal form is third normal form. By using

third normal form, the database will have an ideal mix of data integrity and performance.

