LOGICAL DATA MODEL

LDM1: Identify Major Entities
LDM2: Determine Relationships Between Entities
LDM3: Determine Primary and Alternate Keys
LDM4: Determine Foreign Keys
LDM5: Determine Key Business Rules
LDM6: Add Remaining Attributes
LDM7: Validate Normalization Rules
LDM8: Determine Domains
LDM9: Determine Other Attribute Business Rules (Triggering Operations)
LDM10: Combine User Views
LDM11: Integrate With Existing Data Models
LDM12: Analyze for Stability and Growth

RELATIONAL DATABASE DESIGN


RDD1: Identify Tables
RDD2: Identify Columns
RDD3: Adapt Data Structure to Product Environment
RDD4: Design for Business Rules about Entities
RDD5: Design for Business Rules about Relationships
RDD6: Design for Additional Business Rules About Attributes
RDD7: Tune for Scan Efficiency
RDD8: Define Clustering Sequences
RDD9: Define Hash Keys
RDD10: Add Indexes - Tune by Adding Indexes
RDD11: Add Duplicate Data
RDD12: Redefine Columns
RDD13: Redefine Tables

Handbook of Relational Database Design


by

Candace C. Fleming and Barbara von Halle

LOGICAL DATA MODEL

LDM1 Identify Major Entities

LDM1.1 Name, define, diagram, and document entities in the data dictionary (or design documentation).

LDM2 Determine Relationships Between Entities

LDM2.1 Name, define, diagram, and document relationships in the data dictionary (or design documentation).

LDM2.2 Classify relationships as either one-to-one (1:1) or one-to-many (1:N). For simplicity, reduce each many-to-many (M:N) relationship to a new entity type and two 1:N relationships.
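
The reduction of an M:N relationship can be sketched in SQL, run here through Python's built-in sqlite3 module. This is a minimal illustration, not an example from the handbook: the STUDENT, COURSE, and ENROLLMENT names and the sample rows are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE student (student_id INTEGER PRIMARY KEY, name  TEXT NOT NULL);
CREATE TABLE course  (course_id  INTEGER PRIMARY KEY, title TEXT NOT NULL);
-- Hypothetical new entity type that replaces the M:N relationship.
-- It is the child in two 1:N relationships (student 1:N, course 1:N).
CREATE TABLE enrollment (
    student_id INTEGER NOT NULL REFERENCES student(student_id),
    course_id  INTEGER NOT NULL REFERENCES course(course_id),
    PRIMARY KEY (student_id, course_id)
);
""")
conn.executemany("INSERT INTO student VALUES (?, ?)", [(1, "Ann"), (2, "Bob")])
conn.executemany("INSERT INTO course VALUES (?, ?)",
                 [(10, "Databases"), (20, "Networks")])
conn.executemany("INSERT INTO enrollment VALUES (?, ?)",
                 [(1, 10), (1, 20), (2, 10)])


def courses_for(student_id):
    """Follow the student -> enrollment -> course path."""
    rows = conn.execute(
        "SELECT c.title FROM enrollment e "
        "JOIN course c ON c.course_id = e.course_id "
        "WHERE e.student_id = ? ORDER BY c.title", (student_id,))
    return [title for (title,) in rows]


def students_in(course_id):
    """Follow the course -> enrollment -> student path."""
    rows = conn.execute(
        "SELECT s.name FROM enrollment e "
        "JOIN student s ON s.student_id = e.student_id "
        "WHERE e.course_id = ? ORDER BY s.name", (course_id,))
    return [name for (name,) in rows]
```

Each direction of the original M:N relationship becomes a walk through the new entity, and the composite primary key prevents the same pairing from being recorded twice.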

LDM2.3 For simplicity, reclassify a complex relationship as an entity, related through binary relationships to each of the original entities.

LDM2.4 Eliminate redundant relationships from the logical model.

LDM2.5 Establish 1:1 relationships between supertypes and subtypes. Establish a special type of 1:1 relationship, known as a category, between a supertype and a set of mutually exclusive subtypes.

LDM2.6 Represent a 1:N bill-of-materials relationship as a 1:N relationship from and to the same entity. Represent an M:N bill-of-materials relationship by creating a second entity type and relating it to the original (now parent) entity type through two 1:N relationships.
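
As a rough sketch of the M:N bill-of-materials case, the second entity type can carry two foreign keys back to the same parent. The PART and PART_STRUCTURE names, the quantity attribute, and the sample parts are all assumptions made for illustration; the recursive query is just one way to walk the structure in SQLite.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE part (part_id TEXT PRIMARY KEY);
-- Hypothetical second entity type: both foreign keys point at PART,
-- giving two 1:N relationships from the parent entity.
CREATE TABLE part_structure (
    assembly_id  TEXT NOT NULL REFERENCES part(part_id),
    component_id TEXT NOT NULL REFERENCES part(part_id),
    quantity     INTEGER NOT NULL,
    PRIMARY KEY (assembly_id, component_id)
);
""")
conn.executemany("INSERT INTO part VALUES (?)",
                 [("bike",), ("cart",), ("frame",), ("wheel",), ("spoke",)])
conn.executemany("INSERT INTO part_structure VALUES (?, ?, ?)", [
    ("bike", "frame", 1), ("bike", "wheel", 2),
    ("cart", "wheel", 4), ("wheel", "spoke", 32),
])


def where_used(component_id):
    """Assemblies that directly contain the given component."""
    rows = conn.execute(
        "SELECT assembly_id FROM part_structure "
        "WHERE component_id = ? ORDER BY assembly_id", (component_id,))
    return [assembly for (assembly,) in rows]


def explode(assembly_id):
    """All components at every level, with total quantities per assembly."""
    return conn.execute("""
        WITH RECURSIVE parts_needed(component_id, qty) AS (
            SELECT component_id, quantity
            FROM part_structure WHERE assembly_id = ?
            UNION ALL
            SELECT ps.component_id, pn.qty * ps.quantity
            FROM parts_needed pn
            JOIN part_structure ps ON ps.assembly_id = pn.component_id
        )
        SELECT component_id, SUM(qty) FROM parts_needed
        GROUP BY component_id ORDER BY component_id
    """, (assembly_id,)).fetchall()
```

Here a wheel is a component of two assemblies and is itself an assembly of spokes, which is what makes the relationship M:N rather than 1:N.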

LDM3 Determine Primary and Alternate Keys

LDM3.1 Choose one primary key for each entity.

LDM3.2 Identify alternate keys for each entity.

LDM3.3 Choose the primary key of an entity that is a subtype of another entity (known as a supertype) to be the same as the primary key of the supertype.

LDM3.4 Name, diagram, and document primary and alternate keys in the data dictionary (or design documentation).

LDM3.5 Establish naming standards that facilitate assignment of clear, descriptive, intuitive, and unique names to attributes (and to entities and relationships).

LDM3.6 For brevity and simplicity, use a standard abbreviation for any given word.

LDM4 Determine Foreign Keys

LDM4.1 For each relationship in the logical data model, identify the foreign key.

LDM4.2 Name, diagram, and document foreign keys in the data dictionary (or design documentation).

LDM5 Determine Key Business Rules

LDM5.1 Identify one insert rule for each relationship.

Insert Rules - We classify insert constraints into six types:
- Dependent: Permit insertion of child entity occurrence only when matching parent entity occurrence already exists.
- Automatic: Always permit insertion of child entity occurrence. If matching parent entity occurrence does not already exist, create it.
- Nullify: Always permit insertion of child entity occurrence. If matching parent entity occurrence does not exist, set foreign key in child to null.
- Default: Always permit insertion of child entity occurrence. If matching parent entity occurrence does not exist, set foreign key in child to a previously defined default value.
- Customized: Permit insertion of child entity occurrence only if certain customized validity constraints are met.
- No Effect: Always permit insertion of child entity occurrence. No matching parent entity occurrence need exist, and thus no validity checking need be done.
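
Four of the insert rules can be sketched as small routines over a parent and child table. This is only an illustration under assumptions: the DEPARTMENT and EMPLOYEE tables, the 'UNASSIGNED' default row, and the helper function names are hypothetical, and SQLite's FOREIGN KEY clause is used to stand in for the dependent rule.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
conn.executescript("""
CREATE TABLE department (dept_id TEXT PRIMARY KEY);
CREATE TABLE employee (
    emp_id  INTEGER PRIMARY KEY,
    dept_id TEXT REFERENCES department(dept_id)
);
INSERT INTO department VALUES ('SALES');
INSERT INTO department VALUES ('UNASSIGNED');  -- hypothetical default parent
""")


def parent_exists(dept_id):
    row = conn.execute("SELECT 1 FROM department WHERE dept_id = ?", (dept_id,))
    return row.fetchone() is not None


def insert_dependent(emp_id, dept_id):
    # Dependent: the foreign key rejects a child without a matching parent.
    conn.execute("INSERT INTO employee VALUES (?, ?)", (emp_id, dept_id))


def insert_automatic(emp_id, dept_id):
    # Automatic: create the parent first if it does not already exist.
    conn.execute("INSERT OR IGNORE INTO department VALUES (?)", (dept_id,))
    conn.execute("INSERT INTO employee VALUES (?, ?)", (emp_id, dept_id))


def insert_nullify(emp_id, dept_id):
    # Nullify: store a null foreign key when no parent matches.
    fk = dept_id if parent_exists(dept_id) else None
    conn.execute("INSERT INTO employee VALUES (?, ?)", (emp_id, fk))


def insert_default(emp_id, dept_id, default="UNASSIGNED"):
    # Default: fall back to a previously defined default parent.
    fk = dept_id if parent_exists(dept_id) else default
    conn.execute("INSERT INTO employee VALUES (?, ?)", (emp_id, fk))


def dept_of(emp_id):
    row = conn.execute("SELECT dept_id FROM employee WHERE emp_id = ?", (emp_id,))
    return row.fetchone()[0]


insert_dependent(1, "SALES")
try:
    insert_dependent(2, "NO_SUCH_DEPT")
    dependent_rejected = False
except sqlite3.IntegrityError:
    dependent_rejected = True
insert_automatic(3, "RESEARCH")
insert_nullify(4, "NO_SUCH_DEPT")
insert_default(5, "NO_SUCH_DEPT")
```

Customized and no-effect rules are not shown: the first depends on application-specific checks, and the second simply omits the foreign key constraint.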

LDM5.2 Identify one delete rule for each relationship.

Delete Rules - Six types of delete constraints govern valid deletion of a parent entity (or update of its primary key):
- Restrict: Permit deletion of parent entity occurrence only when there are no matching child entity occurrences.
- Cascade: Always permit deletion of parent entity occurrence and cascade the deletion to any matching child entity occurrences (i.e., delete all matching child entity occurrences).
- Nullify: Always permit deletion of parent entity occurrence. If any matching child entity occurrences exist, set their foreign keys to null.
- Default: Always permit deletion of parent entity occurrence. If any matching child entity occurrences exist, set their foreign keys to a previously defined default value.
- Customized: Permit deletion of parent entity occurrence only if certain customized validity constraints are met.
- No Effect: Always permit deletion of parent entity occurrence. Matching child entity occurrences may or may not exist, and thus no validity checking need be done.
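
In products that support declarative referential actions, the first four delete rules correspond closely to ON DELETE options. The sketch below uses SQLite through Python's sqlite3 module; the CUSTOMER and ORDERS tables, the default parent key 0, and the helper name are assumptions for illustration only.

```python
import sqlite3


def delete_parent_with(action):
    """Build a tiny schema using the given ON DELETE action, delete the
    parent row, and report what happened to the child row."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(f"""
    CREATE TABLE customer (cust_id INTEGER PRIMARY KEY);
    CREATE TABLE orders (
        order_id INTEGER PRIMARY KEY,
        cust_id  INTEGER DEFAULT 0
                 REFERENCES customer(cust_id) ON DELETE {action}
    );
    INSERT INTO customer VALUES (0);   -- hypothetical default parent
    INSERT INTO customer VALUES (7);
    INSERT INTO orders VALUES (100, 7);
    """)
    try:
        conn.execute("DELETE FROM customer WHERE cust_id = 7")
    except sqlite3.IntegrityError:
        return "rejected"
    return conn.execute("SELECT order_id, cust_id FROM orders").fetchall()


restrict_result = delete_parent_with("RESTRICT")     # Restrict rule
cascade_result = delete_parent_with("CASCADE")       # Cascade rule
nullify_result = delete_parent_with("SET NULL")      # Nullify rule
default_result = delete_parent_with("SET DEFAULT")   # Default rule
```

The default rule only works here because a parent row with the default key value exists, which is the same precaution the handbook later states for default foreign key values.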

LDM5.3 Avoid the use of nullify insert or delete rules. Favor default rules instead.

LDM5.4 Never define a nullify insert or delete rule when the foreign key also is part of the primary key of the child entity.

LDM5.5 Always define the insert rule for a supertype-subtype (or supertype-category) relationship as a tailored version of either automatic or dependent (tailored to enforce the 1:1 relationship). Define the delete rule for such relationships as cascade.

LDM6 Add Remaining Attributes

LDM6.1 Associate each attribute with the entity the entire primary key of which is necessary and sufficient to identify or determine it uniquely.

LDM6.2 Place nonkey attributes as high as possible in the logical data model (as long as the primary key of the entity uniquely identifies the attribute).

LDM6.3 If an attribute in an entity depends on the primary key but is multivalued (i.e., may have multiple values for one particular value of the primary key), reclassify the attribute as a new child entity type. If unique, that attribute constitutes the primary key of the new child entity. If not unique, that attribute plus the primary key of the original (now parent) entity constitute the primary key of the new child entity.

LDM6.4 Name, diagram, and document attributes in the data dictionary (or design documentation).

LDM6.5 If there are attributes that seem to describe a relationship (rather than an entity), reclassify the relationship as a new entity and as child to each of the original two entities.

LDM6.6 Avoid representing attributes in encoded form (e.g., 01 = red, 02 = blue) unless the codes are user-defined and are meaningful within the industry or business area.

LDM6.7 Do not include processing-oriented flags as attributes in the logical data model.

LDM6.8 If you must represent attributes in encoded form for business reasons, keep the coded values mutually independent.

LDM6.9 Optionally represent derived data as attributes within the logical data model when they have a significant business meaning, but indicate that they are derived.

LDM6.10 Use a special designation for subtype identifiers in the logical data model.

LDM6.11 Place attributes that are common to all occurrences of a supertype entity in the supertype rather than in each of its associated subtypes.

LDM6.12 In general, combine entities with the same primary key into one entity. Exceptions include entities with truly distinct business meanings.

LDM6.13 In general, combine into one subtype all subtypes having the same attributes and the same relationships. (Possibly include a new attribute representing the distinction among the original subtypes.)

LDM6.14 In general, combine with its associated supertype any subtype that spans the supertype.

LDM6.15 In general, combine entities containing no nonkey attributes with their child entities (if any).

Normalization

Logical database design is concerned with grouping the attributes identified during requirements analysis into relevant entities. It can be shown that certain rules must be followed if the data is to have desirable maintenance properties. These rules are based on the theory of NORMALIZATION.

Normalization is the process of deriving or checking a relational model. Data gathered but not formed into relations is termed unnormalized. Normalization progressively refines the data in three stages:
- First Normal Form (1NF)
- Second Normal Form (2NF)
- Third Normal Form (3NF)

[Figure: Normal Forms - Unnormalized, 1NF, 2NF, 3NF]

The rules for this process are explicit, but an understanding of the organization's business rules is imperative.
- 1NF: all repeating groups of attributes removed
- 2NF: in 1NF, and each non-key attribute depends on the entire key (remove partial dependencies)
- 3NF: in 2NF, and each non-key attribute depends only on the key (remove transitive dependencies)

Normalization Example
Unnormalized:
PRESCRIPTION (pres_num, pres_date, pat_name, pat_address, doc_name, drug_name, drug_quant, issue_status, drug_name, drug_quant, issue_status, ...)

First Normal Form:
PRESCRIPTION (pres_num, pres_date, pat_name, pat_address, doc_name)
PRESCRIBED_DRUG (pres_num, drug_name, drug_quant, issue_status)
A business rule establishes that issue_status (e.g., "free to pensioners") is always the same for a given drug, so that:

Second Normal Form:
PRESCRIPTION (pres_num, pres_date, pat_name, pat_address, doc_name)
PRESCRIBED_DRUG (pres_num, drug_name, drug_quant)
DRUG (drug_name, issue_status)
But patient address is dependent on patient name, so that:

Third Normal Form:
PRESCRIPTION (pres_num, pres_date, pat_name, doc_name)
PRESCRIBED_DRUG (pres_num, drug_name, drug_quant)
DRUG (drug_name, issue_status)
PATIENT (pat_name, pat_address)
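
The third normal form relations can be tried out directly as tables. The sketch below is an assumption-laden illustration: the slides do not mark the keys, so the primary keys chosen here (pres_num, pat_name, drug_name, and the pair pres_num plus drug_name) are inferred from the dependencies described, and all sample rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE patient (pat_name TEXT PRIMARY KEY, pat_address TEXT NOT NULL);
CREATE TABLE drug    (drug_name TEXT PRIMARY KEY, issue_status TEXT NOT NULL);
CREATE TABLE prescription (
    pres_num  INTEGER PRIMARY KEY,
    pres_date TEXT NOT NULL,
    pat_name  TEXT NOT NULL REFERENCES patient(pat_name),
    doc_name  TEXT NOT NULL
);
CREATE TABLE prescribed_drug (
    pres_num   INTEGER NOT NULL REFERENCES prescription(pres_num),
    drug_name  TEXT NOT NULL REFERENCES drug(drug_name),
    drug_quant INTEGER NOT NULL,
    PRIMARY KEY (pres_num, drug_name)   -- assumed composite key
);
""")
# Invented sample data.
conn.execute("INSERT INTO patient VALUES ('J. Smith', '1 High St')")
conn.executemany("INSERT INTO drug VALUES (?, ?)",
                 [("aspirin", "free to pensioners"), ("codeine", "charged")])
conn.executemany("INSERT INTO prescription VALUES (?, ?, ?, ?)",
                 [(1, "2024-01-05", "J. Smith", "Dr. Lee"),
                  (2, "2024-02-10", "J. Smith", "Dr. Lee")])
conn.executemany("INSERT INTO prescribed_drug VALUES (?, ?, ?)",
                 [(1, "aspirin", 20), (1, "codeine", 10), (2, "aspirin", 30)])

# The address is stored once, so one UPDATE corrects every prescription line.
conn.execute("UPDATE patient SET pat_address = '9 Low Rd' "
             "WHERE pat_name = 'J. Smith'")

# Joining the four tables rebuilds the original unnormalized view.
rows = conn.execute("""
    SELECT p.pres_num, pa.pat_address, pd.drug_name, d.issue_status
    FROM prescription p
    JOIN patient pa         ON pa.pat_name = p.pat_name
    JOIN prescribed_drug pd ON pd.pres_num = p.pres_num
    JOIN drug d             ON d.drug_name = pd.drug_name
    ORDER BY p.pres_num, pd.drug_name
""").fetchall()
```

In the unnormalized form the same address change would have to be applied to every prescription row for the patient, which is the maintenance problem normalization is meant to remove.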

LDM7 Validate Normalization Rules

LDM7.1 Reduce entities to first normal form (1NF) by removing repeating or multivalued attributes to another, child entity.

LDM7.2 Reduce first normal form entities to second normal form (2NF) by removing attributes that are not dependent on the whole primary key.

LDM7.3 Reduce second normal form entities to third normal form (3NF) by removing attributes that depend on other, nonkey attributes (other than alternate keys).

LDM7.4 Reduce third normal form entities to Boyce/Codd normal form (BCNF) by ensuring that they are in third normal form for any feasible choice of candidate key as primary key.

LDM7.5 Reduce Boyce/Codd normal form entities to fourth normal form (4NF) by removing any independently multivalued components of the primary key to two new parent entities. Retain the original (now child) entity only if it contains other, nonkey attributes.

LDM7.6 Reduce fourth normal form entities to fifth normal form (5NF) by removing pairwise cyclic dependencies (appearing with composite primary keys with three or more component attributes) to three or more new parent entities.

LDM7.7 In general, do not split fully normalized entities into smaller entities.

LDM7.8 Reevaluate the normalized data model in light of insert and delete rules and timing considerations. Introduce additional attributes or entities if necessary to prevent temporal integrity anomalies (loss of data due to historical events and timing differences).

LDM8 Determine Domains

LDM8.1 Associate each attribute with a domain or with a set of domain characteristics.

LDM8.2 Document the domain or set of domain characteristics for each attribute in the data dictionary or design documentation. Include data type, length, format or mask, allowable value constraints, meaning, uniqueness, null support, and default value if applicable.

LDM8.3 Define domains for primary keys to be consistent with the following rules:
- Primary keys are unique
- Components of composite primary keys are not unique
- Neither primary keys nor primary key components may be null
- Both primary keys and primary key components may accept default values (as long as primary key uniqueness still holds)

LDM8.4 Define domains for alternate keys to be consistent with the following rules:
- Alternate keys are unique
- Components of composite alternate keys are not unique
- Both alternate keys and alternate key components may be null (although use of default values is preferred)
- Both alternate keys and alternate key components may accept default values (as long as alternate key uniqueness still holds)

LDM8.5 Define domains for foreign keys to be consistent with the following rules:
- Data type, length, and format (mask) of foreign key components must be the same as data type, length, and format of corresponding primary key components in parent entities
- Uniqueness property for foreign keys must be consistent with relationship type (i.e., 1:1 relationship implies unique foreign key, 1:N relationship implies nonunique foreign key)
- Null support, default values, and allowable value constraints for foreign keys must be consistent with key business rules (insert/delete constraints), but may include additional constraints as needed

LDM8.6 Define domains for derived attributes to be consistent with the following rules:
- Allowable value constraints must include the derivation algorithm
- Data type must be the same as data type for the source attribute(s) unless specified otherwise by the derivation algorithm
- Meaning must be defined using the derivation algorithm and source attribute(s) meaning

LDM8.7 Define the domain for a subtype primary key (which is also a foreign key) to be a subset of the domain for the associated supertype primary key. Specifically,
- Data type, length, and format must be the same as those for the supertype primary key
- Allowable value constraints must be based on the subtype identifier (whether the subtype identifier is explicitly represented in the logical data model or not)
- Meaning must be similar to that of the supertype primary key but based on the subtype identifier
- Uniqueness must be specified (for the entire primary key)
- Nonuniqueness must be specified for component attributes in the case of a composite primary key
- Nulls must be prohibited (both for the entire primary key and for its component attributes)
- Default values can be specified as appropriate

LDM9 Determine Other Attribute Business Rules (Triggering Operations)

LDM9.1 For all business rules, define triggering operations that maintain integrity and consistency of attribute values.

LDM9.2 Document all triggering operations in the data dictionary or design documentation. Include documentation about the trigger:
- Event that initiates the triggering operation (i.e., insert, update, delete, or retrieval)
- Object of event (i.e., name of entity and/or attribute being modified or accessed)
- Condition under which the triggering operation is initiated
Also include documentation about the operation:
- Action to take place (such as reject event or trigger related event)

LDM9.3 Define triggering operations for all attributes that are sources for derived attributes, such that update of a source attribute triggers update of the derived attribute.
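
One possible realization of such a triggering operation is a database trigger, sketched here in SQLite. The ORDER_LINE table, its column names, and the choice to store the derived total in the same row are assumptions; prices are kept as integer cents only to keep the arithmetic exact.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE order_line (
    line_id          INTEGER PRIMARY KEY,
    quantity         INTEGER NOT NULL,          -- source attribute
    unit_price_cents INTEGER NOT NULL,          -- source attribute
    line_total_cents INTEGER NOT NULL DEFAULT 0 -- derived attribute
);
-- Event: insert. Object: order_line. Action: compute the derived value.
CREATE TRIGGER order_line_derive_ins AFTER INSERT ON order_line
BEGIN
    UPDATE order_line
    SET line_total_cents = NEW.quantity * NEW.unit_price_cents
    WHERE line_id = NEW.line_id;
END;
-- Event: update of a source attribute. Action: recompute the derived value.
CREATE TRIGGER order_line_derive_upd
AFTER UPDATE OF quantity, unit_price_cents ON order_line
BEGIN
    UPDATE order_line
    SET line_total_cents = NEW.quantity * NEW.unit_price_cents
    WHERE line_id = NEW.line_id;
END;
""")


def line_total(line_id):
    row = conn.execute(
        "SELECT line_total_cents FROM order_line WHERE line_id = ?", (line_id,))
    return row.fetchone()[0]


conn.execute("INSERT INTO order_line (line_id, quantity, unit_price_cents) "
             "VALUES (1, 3, 250)")
total_after_insert = line_total(1)

conn.execute("UPDATE order_line SET quantity = 4 WHERE line_id = 1")
total_after_update = line_total(1)
```

The comments on each trigger follow the documentation items listed for triggering operations: the initiating event, the object of the event, and the action taken.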

LDM9.4 Typically define triggering operations for subtypes such that, when a subtype occurrence is deleted, the corresponding supertype also is deleted.

LDM9.5 Define triggering operations for time-initiated integrity constraints. Specify the event initiating the operation (i.e., the trigger) as a change in a system current date/time variable.

LDM10 Combine User Views

LDM10.1 When combining user views, merge entities with the same primary key and equivalent primary key domains. Include in the merged entity all attributes from the original entities (eliminating redundant attributes).

LDM10.2 When combining user views, establish a supertype-subtype relationship between entities with the same primary key, where one primary key domain is a subset of the other. Eliminate from the new subtype any attributes that are also in the supertype.

LDM10.3 When combining user views, establish a common supertype to relate two entities with the same primary key and with primary key domains that differ in allowable value constraints but are otherwise equivalent.

LDM10.4 When combining user views, merge entities the primary keys of which serve as candidate keys for each other. Include in the merged entity all attributes from the original entities (eliminating redundant attributes).

LDM10.5 When combining user views, include without change (i.e., do not merge) all entities with different primary keys.

LDM10.6 When combining user views, retain all business rules about candidate keys from the original user views (e.g., primary and alternate key uniqueness and null support). Allow an exception for primary keys in the original user views that are reclassified as alternate keys in the combined user view: Consider whether the no-null constraint can be relaxed.

LDM10.7 When combining user views, merge relationships between entities where the entities themselves are the results of merging, but only when such relationships convey the same meaning. Apply to the resultant merged relationships a cardinality incorporating cardinalities of the original source relationships. If the resultant relationship is many-to-many (M:N), resolve it by defining a new entity type and two one-to-many (1:N) relationships.

LDM10.8 When combining user views, initially include without change (i.e., do not merge) all relationships with different meanings. Then identify and eliminate any redundant relationships.

LDM10.9 When combining user views, look for missing relationships between entities originating in different user views. Add these relationships to the combined user view.

LDM10.10 Correct all foreign keys in a combined user view to reflect primary keys (rather than the alternate keys) of parent entities.

LDM10.11 When combining user views, initially include key business rules (insert/delete constraints) as defined in the source user views. Add key business rules for new relationships. Then evaluate for inconsistencies. Resolve through discussions with users.

LDM10.12 When combining user views, merge attributes having the same meanings in the same entity. Reconcile or union their domains and triggering operations.

LDM10.13 When combining user views, eliminate or flag derived attributes.

LDM10.14 When combining user views, after merging, eliminating, and adding new relationships as appropriate, again normalize to eliminate any newly introduced redundancies.

LDM10.15 After combining user views, reexamine attributes to identify changes in domain characteristics such as null support.

LDM11 Integrate With Existing Data Models

LDM11.1 Integrate databases by comparing and defining mappings among the underlying logical data models.

LDM11.2 Evolve the business conceptual schema by integrating and incorporating each new logical data model that is developed.

LDM11.3 Identify the mappings between each logical data model and the business conceptual schema. Document these mappings in the data dictionary, including:
- Naming differences
- Operations performed on the conceptual schema to obtain a specific logical data model, such as selects, projects, joins, or aggregations of entities and summarization or other derivations of attributes
- Interrelation of business rules, such as additional constraints applied by a particular logical data model to the domains or triggering operations defined within the conceptual schema
- Actual conflicts between a particular logical data model and the conceptual schema, such as contradictions in data definitions or business rules

LDM12 Analyze for Stability and Growth

LDM12.1 Incorporate into, or at least document with, the logical data model those changes that are imminent, significant, and/or probable, for further evaluation during database design or future logical data modeling projects.

RELATIONAL DATABASE DESIGN

RDD1 Identify Tables

RDD1.1 In general, identify one table for each entity.

RDD1.2 Name and document each relational table in the catalog or data dictionary. If feasible, incorporate the entity name within the table name while accommodating specific product restrictions. Cross-reference each table to the entity it represents.

RDD2 Identify Columns

RDD2.1 In general, identify one column in the appropriate table for each attribute of an entity.

RDD2.2 Do not define multiple attributes as one (composite) column in a table.

RDD2.3 Name and document each column in the catalog or data dictionary. If feasible, incorporate the attribute name within the column name while accommodating specific product restrictions. Cross-reference each column to the attribute it represents.

RDD2.4 Diagram relational tables, columns, and selected implementation options in a format that aids understanding of the design by users and by developers.

RDD3 Adapt Data Structure to Product Environment

RDD3.1 Define left-to-right sequencing of columns in a table to optimize product-specific storage utilization and performance.

RDD3.2 Require users and programs to explicitly name (and therefore sequence) the columns to be returned by a query; i.e., discourage use of SQL SELECT * syntax (or equivalent). Also discourage INSERT commands that do not specify column names.
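
A small experiment shows why explicit column names matter once column sequence is tuned for storage. The CUSTOMER table and its two column orderings below are hypothetical; the point is only that SELECT * returns columns in table order, while a named column list does not depend on it.

```python
import sqlite3


def load(column_defs):
    """Create a customer table with the given left-to-right column order."""
    conn = sqlite3.connect(":memory:")
    conn.execute(f"CREATE TABLE customer ({column_defs})")
    # Naming the columns keeps this INSERT valid for either ordering.
    conn.execute("INSERT INTO customer (cust_id, name, city) "
                 "VALUES (1, 'Ann', 'Oslo')")
    return conn


original = load("cust_id INTEGER, name TEXT, city TEXT")
resequenced = load("cust_id INTEGER, city TEXT, name TEXT")

# SELECT * follows the physical column order, so results shift.
star_original = original.execute("SELECT * FROM customer").fetchone()
star_resequenced = resequenced.execute("SELECT * FROM customer").fetchone()

# Explicit column names give the same result under both orderings.
named_sql = "SELECT cust_id, name, city FROM customer"
named_original = original.execute(named_sql).fetchone()
named_resequenced = resequenced.execute(named_sql).fetchone()
```

A program that read position 1 of the SELECT * result as the customer name would silently start reading the city after the table was resequenced.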

RDD3.3 If feasible, allocate primary space large enough to contain the entire table.

RDD3.4 If feasible, allocate free space to accommodate all anticipated row inserts and updates that may occur after table load or reorganization.

RDD3.5 In general, assign to one database tables representing entities that are closely related in business meaning.

RDD3.6 When different groups of users consistently access different sets of tables, consider separating those sets by placing them in separate databases. This may facilitate greater concurrency or data availability, and may avoid database size limitations.

RDD3.7 Assign to each database a meaningful name that conveys general business meaning (grouping of entities) and conforms to product-specific restrictions.

RDD3.8 Document information about databases in the catalog or data dictionary.

RDD3.9 If feasible, set database locking parameters to effect locking of the least amount of data for the shortest duration.

RDD4 Design for Business Rules about Entities

RDD4.1 Enforce logical properties (uniqueness, minimality, and disallowance of nulls) of the entity's primary key through the relational implementation.

RDD4.2 If an entity has an alternate key, enforce the logical properties (uniqueness, minimality, and, if applicable, disallowance of nulls) of the alternate key through the relational implementation.

RDD4.3 Document primary and alternate keys for each relational table in the catalog or data dictionary, including mechanisms for enforcing the keys' properties.

RDD5 Design for Business Rules about Relationships

RDD5.1 Enforce business rules about relationships (key attribute insert, delete, and update constraints) through the relational implementation.

RDD5.2 Document foreign keys for each relational table in the catalog or data dictionary, including techniques used to enforce related business rules.

RDD6 Design for Additional Business Rules About Attributes

RDD6.1 Enforce business rules about attributes (domains and triggering operations) through the relational implementation.

RDD6.2 Document in the catalog or data dictionary business rules about attributes, including techniques used to enforce them.

RDD6.3 In general, do not permit null values for any column.

RDD6.4 In the relational implementation, favor use of default values over null values.

RDD6.5 Automate assignment of default values, if feasible.

RDD6.6 When defining default values for foreign keys, establish a corresponding primary key occurrence containing a matching value.

RDD6.7 When using default values, establish special mechanisms as needed to ensure correct execution of aggregate operations.
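
The aggregate problem with default values is easy to reproduce. In this sketch the READING table, the sentinel default of -999, and the view used as the "special mechanism" are all assumptions chosen for illustration; a view is only one of several ways to keep defaults out of aggregates.

```python
import sqlite3

MISSING = -999  # hypothetical default meaning "temperature not recorded"

conn = sqlite3.connect(":memory:")
conn.executescript(f"""
CREATE TABLE reading (
    reading_id INTEGER PRIMARY KEY,
    temp_c     INTEGER NOT NULL DEFAULT {MISSING}
);
-- Special mechanism: a view that hides rows still holding the default.
CREATE VIEW reading_known AS
    SELECT reading_id, temp_c FROM reading WHERE temp_c <> {MISSING};
""")
conn.execute("INSERT INTO reading (temp_c) VALUES (10)")
conn.execute("INSERT INTO reading (temp_c) VALUES (20)")
conn.execute("INSERT INTO reading DEFAULT VALUES")  # default assigned by the DBMS

# Aggregating the base table treats the default as a real measurement.
naive_avg = conn.execute("SELECT AVG(temp_c) FROM reading").fetchone()[0]

# Aggregating through the view ignores the default, as AVG would ignore a null.
safe_avg = conn.execute("SELECT AVG(temp_c) FROM reading_known").fetchone()[0]
```

With nulls the DBMS would skip the missing value automatically; with defaults that responsibility moves to the design, which is why the handbook calls for an explicit mechanism.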

RDD6.8 Enforce business rules about tables and columns through standard maintenance routines, always executed in the place of or in addition to native DML update commands, when the DBMS table definition cannot enforce these rules automatically.

RDD6.9 Make each standard maintenance routine table-specific. Embed calls to table-specific subroutines to avoid proliferation of duplicate code.

RDD6.10 Establish standard integrity audit routines for each table (or set of related tables) to detect business rule violations.

RDD7 Tune for Scan Efficiency

RDD7.1 Encourage scan processing for
- Small tables (e.g., six or fewer physical blocks)
- Medium and large tables (e.g., more than six physical blocks) when accessed to satisfy requests for which a large percentage of rows (e.g., 20 percent or more) qualify
- Any tables when accessed to satisfy batch or low-priority data requests, for which other access mechanisms are too costly

RDD7.3 In general, do not store multiple tables within the same DBMS storage structure.

RDD7.4 Accelerate scan processing where feasible (product-specific) by
- Facilitating parallel scan processing
- Using high-speed storage devices
- Employing high-speed scanning techniques
- Specifying appropriate numbers and sizes of data buffers

RDD8 Define Clustering Sequences

RDD8.1 In general, cluster medium to large tables (e.g., more than six physical blocks) that are any one of the following:
- Frequently sorted into the same sequence
- Frequently accessed based on selection criteria involving a range of values for a particular column or set of columns
- Frequently processed sequentially using the same sequence
and are
- Infrequently inserted into or deleted from
- Infrequently updated, where updates involve change to the columns that determine the clustering

RDD8.2 Do not cluster a table if the overhead is detrimental to other critical processing requirements (e.g., insert/update/delete processing).

RDD8.3 Do not cluster small tables.

RDD8.4 Consider clustering rows on columns involved in
- ORDER BY
- GROUP BY
- UNION, DISTINCT, and other operations involving sorts
- Joins
- Selection over a range of values

RDD8.5 When choosing a clustering sequence, evaluate the effect of clustering on concurrency.

RDD8.6 Consider cross-table clustering on primary and foreign keys to facilitate joins.

RDD8.7 Periodically execute statistics-gathering utilities on clustered tables if such utilities maintain clustering statistics used by the optimizer.

RDD8.8 Minimize cost of clustering through appropriate specification of free space and frequent table reorganizations.

RDD9 Define Hash Keys

RDD9.1 In general, hash medium to large tables (e.g., more than six physical blocks) for which you
- Frequently access individual rows in random order
- Typically specify discrete values of the same column or set of columns (known as the hash key) in your selection criteria
- Infrequently update values of the hash key

RDD9.2 Define hash keys and (if possible) hash algorithms that ensure a useful distribution of data.

RDD9.3 In general, avoid hash synonyms.

RDD9.4 Consider hashing the column or set of columns most frequently equated to discrete values in selection criteria.

RDD9.5 In general, do not hash on a set of columns if a subset (e.g., one column of a multicolumn hash key) is frequently referenced alone in selection criteria. Define a clustering sequence or ordered index instead.

RDD9.6 Avoid hashes on frequently updated columns.

RDD9.7 Test various hash algorithms to optimize data distribution.
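
One way to compare candidate hash algorithms (RDD9.2, RDD9.3, RDD9.7) is to hash a sample of real key values into the planned number of buckets and count how many buckets are used and how many keys become synonyms. The sketch below is an assumption-laden illustration in Python: both hash functions, the key format (CUST000 to CUST999), and the bucket count of 64 are hypothetical and are not taken from any DBMS.

```python
import hashlib
from collections import Counter

def naive_hash(key: str, buckets: int) -> int:
    # Hypothetical weak algorithm: sum of the character codes.
    return sum(ord(c) for c in key) % buckets

def digest_hash(key: str, buckets: int) -> int:
    # Hypothetical stronger algorithm: first 8 bytes of a SHA-256 digest.
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % buckets

def distribution(keys, hash_fn, buckets):
    # Returns (buckets used, synonym count, rows in the fullest bucket).
    counts = Counter(hash_fn(k, buckets) for k in keys)
    used = len(counts)
    synonyms = len(keys) - used  # keys that land in an already occupied bucket
    return used, synonyms, max(counts.values())

keys = [f"CUST{n:03d}" for n in range(1000)]
for fn in (naive_hash, digest_hash):
    used, synonyms, worst = distribution(keys, fn, 64)
    print(fn.__name__, "buckets used:", used, "synonyms:", synonyms, "fullest:", worst)
```

With these sample keys the character-sum function can reach only 28 of the 64 buckets, because the digit sums of 000 through 999 span just 28 values. This is the kind of skew that testing on representative key values is meant to expose.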

RDD9.8 Reevaluate domain characteristics (e.g., data type and length) of columns that are hash keys.

RDD10 Add Indexes - Tune by Adding Indexes

RDD10.1 In general, build indexes on medium to large tables (e.g., more than six physical blocks) to do either of the following:
- Facilitate access of a small percentage (e.g., less than 20 percent) of rows in a table
- Avoid table access altogether for requests involving a small subset of columns
assuming that the additional overhead imposed on updates is acceptable.
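
The first case in RDD10.1 can be checked by comparing access plans before and after an index is added. The sketch below assumes SQLite through Python's sqlite3 module. The orders table, its columns, and the index name ix_orders_customer are hypothetical, and the plan wording is specific to SQLite.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (order_id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL)"
)
# 5000 rows, 10 per customer, so one customer is 0.2 percent of the table.
conn.executemany(
    "INSERT INTO orders (customer_id, amount) VALUES (?, ?)",
    [(n % 500, float(n)) for n in range(5000)],
)

def plan(sql):
    # Return the detail column of SQLite's EXPLAIN QUERY PLAN output.
    return [row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]

query = "SELECT order_id, amount FROM orders WHERE customer_id = 42"
before = plan(query)  # no usable index yet, so a full table scan
conn.execute("CREATE INDEX ix_orders_customer ON orders (customer_id)")
after = plan(query)   # the optimizer can now search through the index
print("before:", before)
print("after: ", after)
```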

RDD10.2 Build clustering indexes to facilitate clustering of table rows (for products that require an index to accompany clustering).

RDD10.3 Build ordered indexes and specify ascending or descending (if supported by the product) to facilitate sequential or sorted access to a small percentage (e.g., less than 20 percent) of the rows.

RDD10.4 If you store multiple tables in the same DBMS storage structure, establish indexes on each table to minimize inappropriate multitable scanning.

RDD10.5 Avoid building indexes for which the overhead severely degrades critical processing requirements or for which the cost (related to index storage and maintenance) is excessive relative to the benefits. Consider
- Storage requirements
- Influence on inserts, updates, deletes
- Implications for load times
- Table reorganization performance
- Recovery times
- Backup performance
- Statistics gathering

RDD10.6 Do not create (nonunique) indexes on small tables.

RDD10.7 Consider building indexes on columns frequently involved in
- Selection or join criteria (SQL WHERE clause)
- ORDER BY
- GROUP BY
- UNION, DISTINCT, and other operations involving sorts

RDD10.8 Consider building indexes on foreign key columns.

RDD10.9 Consider building indexes to encourage index-only access when
- A reasonable subset of columns is required to satisfy certain requests
- The optimizer is smart enough to invoke index-only access
- Index-only access is more efficient than table access
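
A minimal sketch of index-only access, assuming SQLite (which reports it as a "covering index" in its plan output). The table, columns, and index name are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (order_id INTEGER PRIMARY KEY, customer_id INTEGER, "
    "amount REAL, note TEXT)"
)
# Hypothetical composite index holding every column the first query needs.
conn.execute("CREATE INDEX ix_orders_cust_amount ON orders (customer_id, amount)")

def plan(sql):
    # Return the detail column of SQLite's EXPLAIN QUERY PLAN output.
    return [row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]

# All referenced columns are in the index: the table itself is never read.
covered = plan("SELECT customer_id, amount FROM orders WHERE customer_id = 42")
# The note column is not in the index: the index is used, then the table is read.
not_covered = plan("SELECT customer_id, note FROM orders WHERE customer_id = 42")
print(covered)
print(not_covered)
```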

RDD10.10 Consider building indexes on columns in built-in functions, together with the (SQL GROUP BY) columns used with the built-in functions.

RDD10.11 Choose the sequence of columns in composite (multicolumn) ordered indexes to facilitate processing of as many types of requests as possible.
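
The effect of column sequence can be seen by checking which requests an optimizer will serve from a composite index. The sketch below assumes SQLite with no catalog statistics gathered. The shipments table and the index ix_ship_region_date are hypothetical, and other optimizers may behave differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE shipments (shipment_id INTEGER PRIMARY KEY, region TEXT, "
    "ship_date TEXT, weight REAL)"
)
# Hypothetical composite index with region as the leading column.
conn.execute("CREATE INDEX ix_ship_region_date ON shipments (region, ship_date)")

def plan(sql):
    # Return the detail column of SQLite's EXPLAIN QUERY PLAN output.
    return [row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]

by_region = plan("SELECT * FROM shipments WHERE region = 'EU'")
by_both = plan(
    "SELECT * FROM shipments WHERE region = 'EU' AND ship_date = '2024-01-01'"
)
by_date = plan("SELECT * FROM shipments WHERE ship_date = '2024-01-01'")
for name, p in (("region", by_region), ("both", by_both), ("date", by_date)):
    print(name, p)
```

Under these assumptions, requests on the leading column alone or on both columns use the index, while a request on the trailing column alone falls back to a table scan. Placing the column that appears in the most request types first therefore serves more requests.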

RDD10.12 Evaluate ways in which the DBMS optimizer uses indexes when you choose between a composite index and single-column indexes.

RDD10.13 Avoid indexes on frequently updated columns.

RDD10.14 Avoid indexes on columns with such an irregular distribution of values that the optimizer frequently misjudges index usefulness.

RDD10.15 In general, create one to four indexes per table, or perhaps more if the tables are rarely updated.

RDD10.16 In planning for the process of creating indexes, consider
- Availability implications for the table being accessed and perhaps for other tables or the DBMS catalog
- Efficiency implications of building the index during or after table load

RDD10.17 Where feasible, store indexes and indexed tables on different storage devices.

RDD10.18 Specify index locking and index free space options to minimize the effects of the index on concurrent processing and on updates.

RDD10.19 Update the catalog statistics by executing a utility (if supported by the DBMS) immediately after adding an index.
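
As one concrete example of such a utility, SQLite provides an ANALYZE command that stores index statistics in its sqlite_stat1 catalog table. The sketch below assumes SQLite through Python's sqlite3 module, and the table and index names are hypothetical. Other products provide their own statistics utilities.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (order_id INTEGER PRIMARY KEY, customer_id INTEGER)")
conn.executemany(
    "INSERT INTO orders (customer_id) VALUES (?)",
    [(n % 50,) for n in range(1000)],
)
conn.execute("CREATE INDEX ix_orders_customer ON orders (customer_id)")

# Gather statistics immediately after adding the index.
conn.execute("ANALYZE")

# The stat column begins with the number of rows in the index.
stats = conn.execute(
    "SELECT tbl, idx, stat FROM sqlite_stat1 WHERE idx = 'ix_orders_customer'"
).fetchall()
print(stats)
```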

RDD10.20 Evaluate dynamic drop and recreation of indexes to accommodate specific processing and performance requirements.

RDD10.21 Establish index names (if supported by the DBMS) and dictionary documentation to convey meaning and purpose of the indexes.

RDD10.22 For DBMS products that support multiple types of storage structures for indexes, choose the most efficient structure based on user access requirements. Regardless of structure, also tune for efficiency of index scans as you did for table scans (RDD7.2, RDD7.3, and RDD7.4).

RDD11 Add Duplicate Data

RDD11.1 Consider duplicating one or a few nonvolatile columns from a parent, ancestor, or one-to-one child table to expedite table lookups in frequent or critical requests.

RDD11.2 Specify the domain of a copied column to be consistent with that of the source column.

RDD11.3 Consider adding columns derived from existing, nonvolatile columns
- To reduce multitable access
- To improve performance of frequent or critical requests involving calculations (by performing the calculations in advance)
- To circumvent DML limitations that prevent expressing the derivation in DML

RDD11.4 Use special naming conventions to denote columns that are copied or derived from existing columns.

RDD11.5 Document copied and derived columns in the data dictionary or catalog, with reasons for duplication.

RDD11.6 Enforce consistency of source and copied or derived columns by
- Permitting users (and programs) to update only the source (not copied or derived) columns
- Establishing triggering operations on the source columns that automatically cascade updates to all copied or derived columns
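
A triggering operation of this kind might be implemented as a database trigger where the product supports one. The sketch below assumes SQLite through Python's sqlite3 module. The customer and orders tables, the copied column customer_name_cpy (the _cpy suffix is a made-up naming convention in the spirit of RDD11.4), and the trigger name are all hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (
        customer_id INTEGER PRIMARY KEY,
        name        TEXT NOT NULL              -- source column
    );
    CREATE TABLE orders (
        order_id          INTEGER PRIMARY KEY,
        customer_id       INTEGER NOT NULL REFERENCES customer (customer_id),
        customer_name_cpy TEXT NOT NULL        -- copy of customer.name
    );
    -- Cascade every change of the source column to all of its copies.
    CREATE TRIGGER customer_name_cascade
    AFTER UPDATE OF name ON customer
    BEGIN
        UPDATE orders
        SET customer_name_cpy = NEW.name
        WHERE customer_id = NEW.customer_id;
    END;
    INSERT INTO customer VALUES (1, 'Acme'), (2, 'Globex');
    INSERT INTO orders VALUES (10, 1, 'Acme'), (11, 1, 'Acme'), (12, 2, 'Globex');
""")

# Users update only the source column; the trigger maintains the copies.
conn.execute("UPDATE customer SET name = 'Acme Corp' WHERE customer_id = 1")
copies = conn.execute(
    "SELECT order_id, customer_name_cpy FROM orders ORDER BY order_id"
).fetchall()
print(copies)
```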

RDD11.7 Consider storing repeating groups across columns rather than down rows when
- The repeating group contains a fixed number of occurrences, each of which has a different meaning, or
- The entire repeating group is normally accessed and updated as one unit (i.e., individual occurrences are seldom updated, retrieved, or involved in selection criteria)

RDD11.8 Consider contriving a (shorter) column to substitute for a primary key and all associated foreign keys when the primary key
- Is very long (in bytes)
- Is composed of many columns
- Cannot effectively be indexed or hashed

RDD11.9 Instead of replacing a primary key by a contrived substitute, consider redefining an alternate key as the primary key.

RDD11.10 In general, do not allow exact copies of rows in the same table.

RDD11.11 Avoid storing rows derived from existing rows within the same relational table, because such derived rows destroy the association of every column with exactly one domain.

RDD11.12 To effect an outer join, consider adding rows to the joined tables, so that, in effect, all rows match. However, do so only after evaluating DML alternatives for simulating an outer join, such as
- UNION, NOT EXISTS syntax
- UNION, NOT IN syntax
- Insert into temporary table
- Insert into nonrelational file
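
The first DML alternative (UNION with NOT EXISTS) can be sketched as follows, assuming SQLite through Python's sqlite3 module and hypothetical customer and orders tables. SQLite also supports LEFT OUTER JOIN directly, which is used here only to check that the simulation returns the same rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (order_id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL);
    INSERT INTO customer VALUES (1, 'Acme'), (2, 'Globex'), (3, 'Initech');
    INSERT INTO orders VALUES (10, 1, 50.0), (11, 1, 75.0), (12, 2, 20.0);
""")

# Matching rows from the inner join, plus nonmatching customers padded with NULL.
simulated = conn.execute("""
    SELECT c.customer_id, c.name, o.order_id
    FROM customer c JOIN orders o ON o.customer_id = c.customer_id
    UNION
    SELECT c.customer_id, c.name, NULL
    FROM customer c
    WHERE NOT EXISTS (SELECT 1 FROM orders o WHERE o.customer_id = c.customer_id)
    ORDER BY 1, 3
""").fetchall()

# Native outer join, for comparison only.
native = conn.execute("""
    SELECT c.customer_id, c.name, o.order_id
    FROM customer c LEFT OUTER JOIN orders o ON o.customer_id = c.customer_id
    ORDER BY 1, 3
""").fetchall()
print(simulated)
```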

RDD11.13 To effect an outer join, consider contriving columns to substitute for the joined columns in nonmatching rows. (This will include nonmatching rows in the result.) Do so in preference to adding extra rows (RDD11.12) when the number of extra rows is excessive. However, consider DML alternatives (see RDD11.12) first.

RDD12 Redefine Columns

RDD12.1 Evaluate redefining a long or variable-length textual column as one of the following:
- Short column of abstracted text
- Long column in a separate table
- Short column in the existing table plus a long column in a separate overflow table
- Multiple occurrences (rows) of fixed-length columns in a separate table

RDD12.2 Consider selective redefinition of foreign keys to reference alternate rather than primary keys when
- Such redefinition eliminates multitable access and improves performance of critical requests
- Referenced alternate keys do not allow nulls
- Referential integrity still can be enforced

RDD13 Redefine Tables

RDD13.1 In general, eliminate (choose not to create) tables that
- Add no new information
- Are not referenced by any (known) data requests

RDD13.2 Consider adding duplicate (subset or derived) tables:
- To expedite frequent or critical requests
- To enable testing of new or ad hoc requests
- To facilitate requests involving only summarized or derived data

RDD13.3 Treat columns in duplicate tables as duplicate columns. Specifically, apply RDD11.5 and RDD11.6
- To document the duplicate columns in the data dictionary or catalog (RDD11.5)
- To permit update of only the source (not the duplicate) columns (RDD11.6)
- To establish triggering operations on source columns to automatically cascade updates to duplicate columns (RDD11.6)

RDD13.4 Consider segmenting a table vertically into separate tables if different users consistently access specific subsets of columns in the table. Consider not doing so if either of the following is true:
- Other frequent or critical requests would reference multiple table segments
- Update processing would span table segments

RDD13.5 In general, when segmenting a table vertically, store the primary key in every segment but each nonkey column in exactly one of the segments.

RDD13.6 When segmenting a table vertically, include every row in each segment to avoid the need for outer joins.
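
Vertical segmentation along the lines of RDD13.4 through RDD13.6 might look like the following sketch, which assumes SQLite through Python's sqlite3 module. The employee_core and employee_pay segments, their columns, and the employee view are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- The primary key appears in every segment (RDD13.5).
    -- Each nonkey column appears in exactly one segment.
    CREATE TABLE employee_core (emp_id INTEGER PRIMARY KEY, name TEXT, dept TEXT);
    CREATE TABLE employee_pay  (emp_id INTEGER PRIMARY KEY, salary REAL, bonus REAL);

    INSERT INTO employee_core VALUES (1, 'Ann', 'R&D'), (2, 'Bob', 'Sales');
    -- Every row is present in each segment, even when its nonkey values
    -- are unknown, so a plain inner join suffices (RDD13.6).
    INSERT INTO employee_pay VALUES (1, 90000, 5000), (2, NULL, NULL);

    -- View that reassembles the original table for requests spanning segments.
    CREATE VIEW employee AS
        SELECT c.emp_id, c.name, c.dept, p.salary, p.bonus
        FROM employee_core c JOIN employee_pay p ON p.emp_id = c.emp_id;
""")
rows = conn.execute("SELECT * FROM employee ORDER BY emp_id").fetchall()
print(rows)
```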

RDD13.7 Consider segmenting a table horizontally into separate tables if different users consistently access specific subsets of rows of the table. Consider not doing so if either of the following is true:
- Other frequent or critical requests would reference multiple table segments
- Update processing would span table segments

RDD13.8 In general, when segmenting a table horizontally, store each row in exactly one of the segments.

RDD13.9 When segmenting a table horizontally, include all columns in each segment to avoid the need for outer unions.
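
Horizontal segmentation along the lines of RDD13.7 through RDD13.9 can be sketched in the same way, again assuming SQLite through Python's sqlite3 module. The orders_open and orders_closed segments and the orders_all view are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Both segments carry all columns (RDD13.9), so a plain UNION ALL works.
    CREATE TABLE orders_open   (order_id INTEGER PRIMARY KEY, customer_id INTEGER, status TEXT);
    CREATE TABLE orders_closed (order_id INTEGER PRIMARY KEY, customer_id INTEGER, status TEXT);

    -- Each row lives in exactly one segment (RDD13.8).
    INSERT INTO orders_open   VALUES (1, 7, 'OPEN'), (3, 8, 'OPEN');
    INSERT INTO orders_closed VALUES (2, 7, 'CLOSED');

    -- View that reassembles the segments for requests spanning both.
    CREATE VIEW orders_all AS
        SELECT order_id, customer_id, status FROM orders_open
        UNION ALL
        SELECT order_id, customer_id, status FROM orders_closed;
""")
all_rows = conn.execute(
    "SELECT order_id, status FROM orders_all ORDER BY order_id"
).fetchall()
# Integrity check: no order may appear in both segments.
overlap = conn.execute(
    "SELECT COUNT(*) FROM orders_open o JOIN orders_closed c ON c.order_id = o.order_id"
).fetchone()[0]
print(all_rows, overlap)
```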

RDD13.10 Consider combining tables that
- Represent entities involved in 1:1 relationships
- Are frequently referenced together by users
- Are infrequently referenced separately
assuming the effect on performance, data availability, and storage is acceptable.

RDD13.11 Consider combining subtype tables that share the same supertype if the subtypes
- Have similar columns
- Are involved in similar relationships
- Are frequently accessed together
- Are infrequently accessed separately
assuming the effect on performance, data availability, and storage is acceptable.

RDD13.12 Consider combining a supertype table with (typically all of) its subtype tables if
- Users usually access the supertype and subtype together
- Users infrequently access the supertype and subtypes separately
- The effect on performance, data availability, and storage is acceptable

RDD13.13 Consider combining a parent table with a child table if
- Every parent row is associated with at least one child row
- Users often reference the parent table with the child
- Users infrequently reference one without the other
- The effect on performance, data availability, and storage is acceptable

RDD13.14 To effect an outer join, consider combining the tables into one table. However, do so only after evaluating other alternatives for simulating the outer join, such as
- UNION, NOT EXISTS syntax
- UNION, NOT IN syntax
- Insert into temporary table
- Insert into nonrelational file
- Adding extra rows so that in effect all rows match (RDD11.12)
- Contriving substitute columns to facilitate joining nonmatching rows (RDD11.13)

Relational Database Design Techniques

RDDT.1 Allow users to access data only through views. Do not permit users to access base tables directly.

RDDT.2 Create one master view for each table and build all other views on the master views (i.e., not directly on the base tables).

RDDT.3 Create views on master views as needed to simplify or restrict read-only access.
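
The layering in RDDT.2 and RDDT.3 might look like the sketch below, which assumes SQLite through Python's sqlite3 module. The base table employee_t, the master view employee_mv, and the restricted view employee_directory are hypothetical names. SQLite has no GRANT facility, so the access restrictions of RDDT.1 would be enforced with the security features of the target product.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Base table: not accessed directly by users (RDDT.1).
    CREATE TABLE employee_t (emp_id INTEGER PRIMARY KEY, name TEXT, dept TEXT, salary REAL);
    INSERT INTO employee_t VALUES (1, 'Ann', 'R&D', 90000), (2, 'Bob', 'Sales', 60000);

    -- One master view per table (RDDT.2).
    CREATE VIEW employee_mv AS
        SELECT emp_id, name, dept, salary FROM employee_t;

    -- Restricted read-only view built on the master view, not on the
    -- base table (RDDT.3). The salary column is hidden.
    CREATE VIEW employee_directory AS
        SELECT emp_id, name, dept FROM employee_mv;
""")
cursor = conn.execute("SELECT * FROM employee_directory ORDER BY emp_id")
cols = [d[0] for d in cursor.description]
directory = cursor.fetchall()
print(cols, directory)
```

Because every other view depends only on the master view, a later change to the base table can often be absorbed by redefining the master view alone.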

RDDT.4 Build the standard maintenance (insert, update, and delete) routines and the integrity audit routines to access master (or potentially more restricted) views rather than base tables.

RDDT.5 Name and document views in the data dictionary or catalog.

RDDT.6 Where possible, enforce user security through master views or through second-level security views defined on the master views.

RDDT.7 Consider creating a user authorization table to control access to row subsets by user ID.

RDDT.8 When using standard maintenance routines to enforce integrity, grant appropriate user access to the routines and prohibit use of native DBMS update commands.

RDDT.9 In a production environment, generally restrict access to base table definitions and storage structure specifications to designated database administrators. If users are permitted to create their own (e.g., temporary) tables, isolate such user definitions to their own space or even their own databases.

RDDT.10 Tune performance of data retrieval from a VLRDB by
- Selecting appropriate access mechanisms
- Implementing sample tables
- Implementing summarized tables

RDDT.11 Facilitate maximum throughput against a VLRDB by appropriate
- Choice of locking options
- Clustering
- Table partitioning or segmentation
- Commit processing
- Segmentation of updates

RDDT.12 Tune mass insert processing against a VLRDB by
- Using a load utility
- Disabling logging
- Temporarily dropping and subsequently recreating indexes
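
The third technique (temporarily dropping and recreating indexes) can be sketched as below, assuming SQLite through Python's sqlite3 module. The readings table, the index ix_readings_sensor, and the bulk_load helper are hypothetical. A production load against a very large database would more likely use the product's own load utility.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE readings (reading_id INTEGER PRIMARY KEY, sensor_id INTEGER, value REAL)"
)
conn.execute("CREATE INDEX ix_readings_sensor ON readings (sensor_id)")

def bulk_load(conn, rows):
    # Drop the index, load the rows, then rebuild the index once.
    # The with block commits on success and rolls back on error.
    with conn:
        conn.execute("DROP INDEX IF EXISTS ix_readings_sensor")
        conn.executemany(
            "INSERT INTO readings (sensor_id, value) VALUES (?, ?)", rows
        )
        conn.execute("CREATE INDEX ix_readings_sensor ON readings (sensor_id)")

bulk_load(conn, ((n % 100, n * 0.5) for n in range(20000)))
count = conn.execute("SELECT COUNT(*) FROM readings").fetchone()[0]
index_names = [
    r[0] for r in conn.execute("SELECT name FROM sqlite_master WHERE type = 'index'")
]
print(count, index_names)
```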

RDDT.13 Tune mass delete processing against a VLRDB by partitioning and segmenting.

RDDT.14 For VLRDBs, ensure that long-running requests can be recovered and restarted within reasonable timeframes.

RDDT.15 Accommodate utility processing against a VLRDB by
- Encouraging parallel utility execution
- Taking advantage of partial utilities
- Scheduling statistics-gathering utilities only as needed
- Assessing recovery times

RDDT.16 Be cognizant of the structural modifications that require dropping, redefining, and recreating existing objects (product-specific).

RDDT.17 Determine the effect that dropping a relational object will have on existing relational objects, applications, and users.

RDDT.18 Notify users prior to dropping an object.

RDDT.19 In environments where object drops automatically cascade to dependent objects or where objects cannot be dropped until all dependent objects have been dropped, establish mechanisms or procedures for identifying and subsequently recreating such dependent objects.

RDDT.20 When adding a table, identify its relationships with existing tables and enforce associated integrity rules.

RDDT.21 Make structural database changes as transparent as possible to existing users and applications.

RDDT.22 Use synonyms or views to rename objects (instead of using a RENAME command).

RDDT.23 When adding columns, consider the implications with respect to domain and sequencing of columns within a table.

RDDT.24 Document all design changes in the data dictionary. Include a description of the original design, the changed design, and the reason for the change.

RDDT.25 Create a scenario for incorporating changes before attempting to implement them.
