
David M. Kroenke and David J. Auer
Database Processing
Fundamentals, Design, and Implementation

Chapter Four:
Database Design
Using Normalization
ING. HERNN QUITO
UNIVERSIDAD DE CUENCA 2017 - 2018
Chapter Objectives
To design updatable databases to store data received
from another source
To use SQL to access table structure
To understand the advantages and disadvantages of
normalization
To understand denormalization
To design read-only databases to store data from
updatable databases

KROENKE AND AUER - DATABASE PROCESSING, 13th Edition 4-2


2014 Pearson Education, Inc.
Chapter Objectives
To recognize and be able to correct common design
problems:
The multivalue, multicolumn problem
The inconsistent values problem
The missing values problem
The general-purpose remarks column problem

Chapter Premise
We have received one or more tables of
existing data.
The data is to be stored in a new
database.
QUESTION: Should the data be stored as
received, or should it be transformed for
storage?

How Many Tables?
SKU_DATA (SKU, SKU_Description, Buyer)
BUYER (Buyer, Department)

Where SKU_DATA.Buyer must exist in BUYER.Buyer

Should we store these two tables as they are, or should we combine them
into one table in our new database?

Assessing Table Structure

Counting Rows in a Table
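To count the number of rows in a table, use
the SQL COUNT(*) built-in function:

SELECT COUNT(*) AS NumRows
FROM SKU_DATA;

To count rows per group, add a GROUP BY clause. This query assumes
SKU_DATA has a Department column; in the two-table design shown earlier,
Department is in BUYER, so SKU_DATA would first have to be joined to BUYER:

SELECT Department, COUNT(SKU) AS cantidad
FROM SKU_DATA
GROUP BY Department;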
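The two counting queries can be tried on a small in-memory database. This is a minimal sketch, assuming SQLite through Python's standard sqlite3 module; the sample rows and the Quantity alias are hypothetical, and because Department is stored in BUYER in the two-table design, the per-department count joins the tables first.

```python
import sqlite3

# Hypothetical sample rows; table layouts follow the SKU_DATA and BUYER schemas.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE BUYER (Buyer TEXT PRIMARY KEY, Department TEXT);
    CREATE TABLE SKU_DATA (SKU INTEGER PRIMARY KEY, SKU_Description TEXT, Buyer TEXT);
    INSERT INTO BUYER VALUES ('Pete Hansen', 'Water Sports'), ('Nancy Meyers', 'Camping');
    INSERT INTO SKU_DATA VALUES
        (100100, 'Std. Scuba Tank, Yellow', 'Pete Hansen'),
        (100200, 'Std. Scuba Tank, Magenta', 'Pete Hansen'),
        (201000, 'Half-dome Tent', 'Nancy Meyers');
""")

# Total number of rows in SKU_DATA.
num_rows = conn.execute("SELECT COUNT(*) AS NumRows FROM SKU_DATA").fetchone()[0]

# Rows per department: join to BUYER to reach Department, then group.
per_department = conn.execute("""
    SELECT B.Department, COUNT(S.SKU) AS Quantity
    FROM SKU_DATA AS S JOIN BUYER AS B ON S.Buyer = B.Buyer
    GROUP BY B.Department
    ORDER BY B.Department
""").fetchall()

print(num_rows)
print(per_department)
```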

Examining the Columns
To determine the number and type of columns in
a table, use an SQL SELECT statement.
To limit the number of rows retrieved, use the
SQL TOP {NumberOfRows} expression:
SELECT TOP (10) *
FROM SKU_DATA;

SQL 89:
SELECT TOP 10 *
FROM SKU_DATA;
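TOP is not available in every DBMS. As a hedged sketch, assuming SQLite through Python's sqlite3 module (where LIMIT plays the role of TOP), the number and type of columns can be inspected like this; the sample row is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE SKU_DATA (SKU INTEGER PRIMARY KEY, SKU_Description TEXT, Buyer TEXT);
    INSERT INTO SKU_DATA VALUES (100100, 'Std. Scuba Tank, Yellow', 'Pete Hansen');
""")

# SQLite has no TOP; LIMIT restricts the number of rows retrieved.
cur = conn.execute("SELECT * FROM SKU_DATA LIMIT 10")
column_names = [d[0] for d in cur.description]  # column names of the result set
sample_rows = cur.fetchall()

# PRAGMA table_info yields one row per column: (cid, name, type, notnull, dflt_value, pk).
column_types = {row[1]: row[2] for row in conn.execute("PRAGMA table_info(SKU_DATA)")}

print(column_names)
print(column_types)
```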

Checking Validity of Assumed
Referential Integrity Constraints I
Given two tables with an assumed foreign
key constraint:
SKU_DATA (SKU, SKU_Description, Buyer)
BUYER (Buyer, Department)

Where SKU_DATA.Buyer must exist in BUYER.Buyer

Checking Validity of Assumed
Referential Integrity Constraints II
To find any foreign key values that violate the
foreign key constraint:
SELECT Buyer
FROM SKU_DATA
WHERE Buyer NOT IN
    (SELECT SKU_DATA.Buyer
     FROM SKU_DATA, BUYER
     WHERE SKU_DATA.Buyer = BUYER.Buyer);

A shorter form that lists each violating value once:
SELECT DISTINCT Buyer
FROM SKU_DATA
WHERE Buyer NOT IN
    (SELECT Buyer
     FROM BUYER);
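Both forms of the check can be compared on data that deliberately violates the assumed constraint. This is a minimal sketch, assuming SQLite through Python's sqlite3 module; the buyer 'Cindy Lo' and the sample rows are hypothetical, and no foreign key is declared so the bad rows can be loaded.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE BUYER (Buyer TEXT PRIMARY KEY, Department TEXT);
    CREATE TABLE SKU_DATA (SKU INTEGER PRIMARY KEY, SKU_Description TEXT, Buyer TEXT);
    INSERT INTO BUYER VALUES ('Pete Hansen', 'Water Sports');
    -- 'Cindy Lo' has no matching BUYER row (hypothetical violation).
    INSERT INTO SKU_DATA VALUES
        (100100, 'Std. Scuba Tank, Yellow', 'Pete Hansen'),
        (201000, 'Half-dome Tent', 'Cindy Lo'),
        (202000, 'Half-dome Tent Vestibule', 'Cindy Lo');
""")

# Join-based subquery: returns one row per violating SKU_DATA row.
join_form = conn.execute("""
    SELECT Buyer FROM SKU_DATA
    WHERE Buyer NOT IN
        (SELECT SKU_DATA.Buyer FROM SKU_DATA, BUYER
         WHERE SKU_DATA.Buyer = BUYER.Buyer)
""").fetchall()

# Shorter form: DISTINCT lists each violating value once.
short_form = conn.execute("""
    SELECT DISTINCT Buyer FROM SKU_DATA
    WHERE Buyer NOT IN (SELECT Buyer FROM BUYER)
""").fetchall()

print(join_form)
print(short_form)
```

An empty result from either query would mean every SKU_DATA.Buyer value exists in BUYER.Buyer.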
Type of Database
Is the database updatable or read-only?
For an updatable database, we normally want
tables in BCNF.
For a read-only database, we may choose not to use
BCNF tables.
26/09/2017

Designing
Updatable Databases

Normalization:
Advantages and Disadvantages

Non-Normalized Table:
EQUIPMENT_REPAIR

Normalized Tables:
ITEM and REPAIR
30/10/2017: Incl. server architecture and connection with .NET and Java,
plus program review for a telecom company

Copying Data to New Tables
To copy data from one table to another,
use the SQL INSERT INTO TableName
command:
INSERT INTO EQUIPMENT_ITEM
SELECT DISTINCT ItemNumber,
EquipmentType, AcquisitionCost
FROM EQUIPMENT_REPAIR;
INSERT INTO REPAIR
SELECT RepairNumber, ItemNumber,
RepairDate, RepairCost
FROM EQUIPMENT_REPAIR;
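The two INSERT INTO statements can be run end to end as follows. This is a sketch under stated assumptions: SQLite through Python's sqlite3 module, hypothetical column types, and hypothetical sample rows in EQUIPMENT_REPAIR.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Non-normalized source table with hypothetical sample rows.
    CREATE TABLE EQUIPMENT_REPAIR (
        ItemNumber INTEGER, EquipmentType TEXT, AcquisitionCost REAL,
        RepairNumber INTEGER PRIMARY KEY, RepairDate TEXT, RepairCost REAL);
    INSERT INTO EQUIPMENT_REPAIR VALUES
        (100, 'Drill Press', 3500.0, 2000, '2013-05-05', 375.0),
        (200, 'Lathe',       4750.0, 2100, '2013-05-07', 255.0),
        (100, 'Drill Press', 3500.0, 2200, '2013-06-19', 178.0);

    -- Normalized target tables.
    CREATE TABLE EQUIPMENT_ITEM (
        ItemNumber INTEGER PRIMARY KEY, EquipmentType TEXT, AcquisitionCost REAL);
    CREATE TABLE REPAIR (
        RepairNumber INTEGER PRIMARY KEY,
        ItemNumber INTEGER REFERENCES EQUIPMENT_ITEM (ItemNumber),
        RepairDate TEXT, RepairCost REAL);

    -- DISTINCT collapses the repeated item data to one row per item.
    INSERT INTO EQUIPMENT_ITEM
        SELECT DISTINCT ItemNumber, EquipmentType, AcquisitionCost
        FROM EQUIPMENT_REPAIR;
    INSERT INTO REPAIR
        SELECT RepairNumber, ItemNumber, RepairDate, RepairCost
        FROM EQUIPMENT_REPAIR;
""")

item_count = conn.execute("SELECT COUNT(*) FROM EQUIPMENT_ITEM").fetchone()[0]
repair_count = conn.execute("SELECT COUNT(*) FROM REPAIR").fetchone()[0]
print(item_count, repair_count)
```

The first INSERT needs DISTINCT because item data repeats once per repair; the second does not, since each source row is exactly one repair.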

Choosing Not To Use BCNF
BCNF is used to control anomalies from functional
dependencies.
There are times when BCNF is not desirable.
The classic example is ZIP codes:
ZIP codes almost never change.
Any anomalies are likely to be caught by normal business
practices.
Not having to use SQL to join data in two tables will speed up
application processing.

Multivalued Dependencies
Anomalies from multivalued dependencies
are very problematic.
Always place the columns of a
multivalued dependency into a separate
table (4NF).

Designing
Read-Only Databases

Read-Only Databases
Read-only databases are nonoperational
databases using data extracted from
operational databases (data warehouse).
They are used
in business intelligence (BI) systems
for querying, reporting, and data mining
applications.
They are never updated
(in the operational database sense; they may
have new data imported from time to time).
Denormalization
For read-only databases, normalization is
seldom an advantage.
Application processing speed is more
important.
File space is exceedingly cheap, nearly free.
Denormalization is the joining of the data
in normalized tables prior to storing the
data.
The data is then stored in nonnormalized
tables.
Normalized Tables

Denormalizing the Data
INSERT INTO STUDENT_ACTIVITY_PAYMENT_DATA
SELECT STUDENT.StudentID, StudentName,
ACTIVITY.Activity,
ActivityFee, AmountPaid
FROM STUDENT, PAYMENT, ACTIVITY
WHERE STUDENT.StudentID = PAYMENT.StudentID
AND PAYMENT.Activity = ACTIVITY.Activity;
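The denormalizing INSERT can be exercised on a toy data set. This is a minimal sketch, assuming SQLite through Python's sqlite3 module; the column types, the composite key on PAYMENT, and the sample rows are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Normalized source tables with hypothetical sample rows.
    CREATE TABLE STUDENT (StudentID INTEGER PRIMARY KEY, StudentName TEXT);
    CREATE TABLE ACTIVITY (Activity TEXT PRIMARY KEY, ActivityFee REAL);
    CREATE TABLE PAYMENT (StudentID INTEGER, Activity TEXT, AmountPaid REAL,
                          PRIMARY KEY (StudentID, Activity));
    INSERT INTO STUDENT VALUES (100, 'Jones'), (200, 'Davis');
    INSERT INTO ACTIVITY VALUES ('Golf', 65.0), ('Skiing', 200.0);
    INSERT INTO PAYMENT VALUES (100, 'Golf', 65.0), (100, 'Skiing', 0.0),
                               (200, 'Skiing', 200.0);

    -- Denormalized target table for the read-only database.
    CREATE TABLE STUDENT_ACTIVITY_PAYMENT_DATA (
        StudentID INTEGER, StudentName TEXT, Activity TEXT,
        ActivityFee REAL, AmountPaid REAL);

    -- Join the three tables once, then store the joined rows.
    INSERT INTO STUDENT_ACTIVITY_PAYMENT_DATA
        SELECT STUDENT.StudentID, StudentName, ACTIVITY.Activity,
               ActivityFee, AmountPaid
        FROM STUDENT, PAYMENT, ACTIVITY
        WHERE STUDENT.StudentID = PAYMENT.StudentID
          AND PAYMENT.Activity = ACTIVITY.Activity;
""")

rows = conn.execute("""
    SELECT * FROM STUDENT_ACTIVITY_PAYMENT_DATA
    ORDER BY StudentID, Activity
""").fetchall()
print(rows)
```

Queries against the stored table no longer need a join, at the cost of repeating StudentName and ActivityFee in several rows.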

Customized Tables I
Read-only databases are often designed with many copies of
the same data, but with each copy customized for a
specific application.

Consider the PRODUCT table:

Customized Tables II
PRODUCT_PURCHASING (SKU, SKU_Description, VendorNumber,
VendorName, VendorContact_1, VendorContact_2, VendorStreet,
VendorCity, VendorState, VendorZip)

PRODUCT_USAGE (SKU, SKU_Description, QuantitySoldPastYear,
QuantitySoldPastQuarter, QuantitySoldPastMonth)

PRODUCT_WEB (SKU, DetailPicture, ThumbnailPicture,
MarketingShortDescription, MarketingLongDescription, PartColor)

PRODUCT_INVENTORY (SKU, PartNumber, SKU_Description, UnitsCode,
BinNumber, ProductionKeyCode)

Common Design Problems

The Multivalue, Multicolumn Problem

The multivalue, multicolumn problem occurs when multiple values of an attribute
are stored in more than one column:
EMPLOYEE (EmployeeNumber, EmployeeLastName, EmployeeFirstName,
Email, Auto1_LicenseNumber, Auto2_LicenseNumber,
Auto3_LicenseNumber)

This is another form of a multivalued dependency.
Solution: like the 4NF solution for
multivalued dependencies, use a separate
table to store the multiple values.
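One way to move the license numbers into a separate table is sketched below. It assumes SQLite through Python's sqlite3 module; the table name EMPLOYEE_AUTO, its LicenseNumber column, and the sample rows are hypothetical and do not come from the slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE EMPLOYEE (
        EmployeeNumber INTEGER PRIMARY KEY, EmployeeLastName TEXT,
        EmployeeFirstName TEXT, Email TEXT,
        Auto1_LicenseNumber TEXT, Auto2_LicenseNumber TEXT, Auto3_LicenseNumber TEXT);
    INSERT INTO EMPLOYEE VALUES
        (1, 'Adams', 'Ann', 'ann@example.com', 'ABC123', 'XYZ789', NULL),
        (2, 'Baker', 'Bob', 'bob@example.com', 'JKL456', NULL, NULL);

    -- Hypothetical separate table: one row per (employee, license number).
    CREATE TABLE EMPLOYEE_AUTO (
        EmployeeNumber INTEGER REFERENCES EMPLOYEE (EmployeeNumber),
        LicenseNumber TEXT,
        PRIMARY KEY (EmployeeNumber, LicenseNumber));

    -- Stack the three columns into rows, skipping the empty slots.
    INSERT INTO EMPLOYEE_AUTO (EmployeeNumber, LicenseNumber)
        SELECT EmployeeNumber, Auto1_LicenseNumber FROM EMPLOYEE
        WHERE Auto1_LicenseNumber IS NOT NULL
        UNION ALL
        SELECT EmployeeNumber, Auto2_LicenseNumber FROM EMPLOYEE
        WHERE Auto2_LicenseNumber IS NOT NULL
        UNION ALL
        SELECT EmployeeNumber, Auto3_LicenseNumber FROM EMPLOYEE
        WHERE Auto3_LicenseNumber IS NOT NULL;
""")

autos = conn.execute("""
    SELECT EmployeeNumber, LicenseNumber FROM EMPLOYEE_AUTO
    ORDER BY EmployeeNumber, LicenseNumber
""").fetchall()
print(autos)
```

With the separate table, an employee can have any number of autos, and no row carries unused Auto2 or Auto3 columns.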
Inconsistent Values I
Inconsistent values occur when different
users, or different data sources, use
slightly different forms of the same data
value:
Different codings:
SKU_Description = 'Corn, Large Can'
SKU_Description = 'Can, Corn, Large'
SKU_Description = 'Large Can Corn'
Different spellings:
Coffee, Cofee, Coffeee
Inconsistent Values II
Particularly problematic are primary or foreign
key values.
To detect:
Use referential integrity check already discussed for
checking keys.
Use the SQL GROUP BY clause on suspected
columns.

Inconsistent Values III

SELECT SKU_Description, COUNT(*) AS NameCount
FROM SKU_DATA
GROUP BY SKU_Description;
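Run against data containing the differently coded descriptions, the GROUP BY query returns one row per distinct spelling, which makes the variants easy to spot. A minimal sketch, assuming SQLite through Python's sqlite3 module, hypothetical sample rows, and an ORDER BY added only to make the output deterministic:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE SKU_DATA (SKU INTEGER PRIMARY KEY, SKU_Description TEXT, Buyer TEXT);
    -- The same product entered under three different codings (hypothetical rows).
    INSERT INTO SKU_DATA VALUES
        (1, 'Corn, Large Can', 'A'),
        (2, 'Corn, Large Can', 'B'),
        (3, 'Can, Corn, Large', 'A'),
        (4, 'Large Can Corn', 'C');
""")

# One row per distinct description, with the number of times it occurs.
name_counts = conn.execute("""
    SELECT SKU_Description, COUNT(*) AS NameCount
    FROM SKU_DATA
    GROUP BY SKU_Description
    ORDER BY NameCount DESC, SKU_Description
""").fetchall()
print(name_counts)
```

The query only lists the distinct values; deciding which ones describe the same product still takes a human reviewer.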

Missing Values
A missing value or null value is a value
that has never been provided.

Null Values
Null values are ambiguous:
May indicate that a value is inappropriate;
DateOfLastChildbirth is inappropriate for a male.
May indicate that a value is appropriate but unknown;
DateOfLastChildbirth is appropriate for a female, but may be
unknown.
May indicate that a value is appropriate and known,
but has never been entered;
DateOfLastChildbirth is appropriate for a female, and may be
known but no one has recorded it in the database.

Checking for Null Values
Use the SQL keyword IS NULL to check
for null values:
SELECT COUNT(*) AS QuantityNullCount
FROM ORDER_ITEM
WHERE Quantity IS NULL;

06/11/2017 + review of group assignment 1
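The null count can be checked on a small table, along with the common mistake of comparing to NULL with the = operator. This is a sketch assuming SQLite through Python's sqlite3 module; the ORDER_ITEM columns other than Quantity and the sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE ORDER_ITEM (OrderNumber INTEGER, SKU INTEGER, Quantity INTEGER);
    INSERT INTO ORDER_ITEM VALUES (1000, 201000, 1), (1000, 202000, NULL), (2000, 101100, 4);
""")

# IS NULL is the correct test for missing values.
null_count = conn.execute("""
    SELECT COUNT(*) AS QuantityNullCount
    FROM ORDER_ITEM
    WHERE Quantity IS NULL
""").fetchone()[0]

# Comparing with = NULL never evaluates to true, so it matches no rows.
equals_null_count = conn.execute(
    "SELECT COUNT(*) FROM ORDER_ITEM WHERE Quantity = NULL"
).fetchone()[0]

print(null_count, equals_null_count)
```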

The General-Purpose Remarks Column

A general-purpose remarks column is a column with a name such as:
Remarks
Comments
Notes
It often contains important data stored in an
inconsistent, verbal, and verbose way.
A typical use is to store data on a customer's
interests.
Such a column may:
Be used inconsistently
Hold multiple data items
Task. Part 1

1. Identify possible multivalued dependencies in these tables.
2. Identify possible functional dependencies in these tables.
3. Determine whether each table is either in BCNF or in 4NF. State
your assumptions.
4. Modify each of these tables so that every table is in BCNF and
4NF. Use the assumptions you made in your answer to question 3.
5. Using these tables and your assumptions, recommend a design
for an updatable database.
6. Add a table to your answer to question 5 that would allow Elliot
Bay to assign members to particular classes. Include an
AmountPaid column in your new table.
7. Recommend a design for a read-only database that would
support the following needs:
a. Enable trainers to ensure that their clients are members of the club.
b. Enable the club to assess the popularity of various trainers.
c. Enable the trainers to determine if they are assisting the same client.
d. Enable class instructors to determine if the attendees of their classes have
paid.

Task Part 2

David Kroenke and David Auer
Database Processing
Fundamentals, Design, and Implementation
(13th Edition)

End of Presentation:
Chapter Four

All rights reserved. No part of this publication may be reproduced, stored in a
retrieval system, or transmitted, in any form or by any means, electronic,
mechanical, photocopying, recording, or otherwise, without the prior written
permission of the publisher. Printed in the United States of America.
