Professional Documents
Culture Documents
Spatial DBMS
Introduction
Computers and Information Technology have driven the human society towards a comfortable life where
thousands of activities are performed on a click of a button and help save a lot of time. Everyday activities such
as withdrawing money from a bank, making airline reservations, accessing a book from a library etc., are the
processes that involve computer programs that access databases and database systems to facilitate all these
services to the users.
The database systems have been growing and are involved in diverse applications such as:
Let us start understanding the world of databases with its functional unit known as data.
Data is defined as a collection of observations that could be in forms of facts, values, measurement, images etc.
The data when used with a context becomes the information. It can be further processed, organized or simply
summarized.
Given below is a list of numbers. This list of numbers is data since it communicates no meaning as such.
25
23
24
27
As soon as it is qualified that the numbers refer to the temperature recorded in degree Celsius in different cities
of India on a particular day, the numbers become the temperature information.
Date
23 October 2011
Cities
Delhi
Mumbai
Kolkata
Chennai
Temperature (C)
25
23
24
27
Data Storage
There are two approaches of storing data:
1. File based
2. Database
Unordered files
Ordered files
Index files
Unordered files
Also known as heap files, the unordered sequential files have the basic type of organization where records are
placed in the file in the order in which they are inserted i.e. new records are inserted at the end of the file.
Ordered files
The records in such a file can be ordered based on the values of one of their fields known as ordering fields. If
the ordering field is the field whose value are distinct for each individual entity of the file, then the field is known
as ordering key. Reading the records in order of ordering key values becomes efficient as no sorting is required.
Index Files
Indexes are additional access structures which are used to speed up the retrieval of the records in response to a
search condition. These provide alternative ways of accessing the records without affecting the physical
placement of records.
The figure above describes the electronic gadgets in day today use. We can see that flash is a child of mp 3
players, which is a child of portable electronics, which is a child of electronics. The topmost
element electronicshas no parent.
Tube, LCD, plasma, CD players and 2 way radios are leaf nodes (dont have any children).
Advantages
Easy to understand: The organization of database parallels a family tree understanding which is quite
easy.
Accessing records or updating records are very fast since the relationships have been predefined.
Disadvantages
Large index files are to be maintained and certain attribute values are repeated many times which lead to
data redundancy and increased storage.
The rigid structure of this model doesnt allow alteration of tables, therefore to add a new relationship
entire database is to be redefined.
The many to many relationships are easily implemented in a network data model.
Data access and flexibility in network model is better than that in hierarchical model. An application can
access an owner record and the member records within a set .
It enforces data integrity as a user must first define owner record and then the member records.
The model eliminated redundancy but at the expense of more complicated relationships.
Disadvantages
The network model has a complex structure that requires familiarity from users as well as programmers
end.
Advantages
The manager or administrator does not have to be aware of any data structure or data pointer. One can
easily add, update, delete or create records using simple logic.
Disadvantages
A few search commands in a relational database require more time to process compared with other
database models.
Relate
Relate defines a relationship between two tables. Relates are bidirectional which means both tables involved will
be able to use the relate regardless of which table owns the relate. The associated data isnt appended in the
table like it is in join. However one can access the related data by selecting a particular record and then going to
the related tables against that record.
STUDENT_INFO
STUDENTNAME STUDENTID AGE
ALICE
CE01
18
ANDREW
CE02
17
DAVID
CE03
18
DONA
CE04
18
SEX
F
M
M
F
The schema would define that STUDENT_INFO has four facts/attributes viz. STUDENTNAME, STUDENTID,
AGE, SEX. To ensure that correct data is filled in all the columns of the table one can also enforce rules and
constraints for data input.
Advantages of DBMS
Controlling Redundancy
Redundancy means storing the same data multiple times. DBMS checks redundancy and prevents duplication of
efforts, saves storage space and preserves the data files from becoming inconsistent.
A DBMS provides a security and authorization system, which the database administrator uses to create accounts
and to specify account restriction.
Database systems provide capabilities for efficient execution of queries and updates. Because the database is
typically stored on disk, it provides specialized data structures to speed up disk search for the desired records.
Auxiliary files called indexesare used for this purpose.
The backup and recovery subsystem of the DBMS helps in recovering from hardware or software failures.
Most database applications have certain integrity constraints that must be held for the data. A DBMS should
provide capabilities for defining and enforcing these constraints.
Along with the advantages, the DBMS usage involves overhead costs that are not incurred in conventional file
processing. These overhead costs are due to:
One may use regular files instead of a DBMS under the following circumstances:
The database and applications are simple, well defined and are not expected to change.
Multiple user access to data isnt required
Database Architecture
We can consider the database on three levels of abstraction:
1. External Level refers to users view of the database. It describes a part of the database for particular
group of users. Depending on their needs, different users access different parts of the database. It
employs a powerful and flexible security mechanism by hiding parts of the database from certain users.
2. Conceptual Level refers to the logical structure of the entire database. It describes data as well as the
relationships among the data.
3. Internal Level refers to the details of physical storage of the database on the computer. It consists of
description of storage space allocation for data and indexes, record placements and data compression.
Entity Record
An entity is a real world item that exists on its own. The set of all possible values for an entity is the entity type.
For example, a particular student such as Ravi Kumar is an entity record. Student is the entity type in this case.
In ER diagram we show entity type as a rectangle containing the type name.
Attribute
Properties that describe an entity are known as its attributes. The value of an attribute could be expressed in
numbers or in text. The set of all possible values of an attribute is known as attribute domain. In ER diagram
attributes are represented by ovals attached to the entity by a line.
Attributes can be classified as:
Key attributes: An attribute whose values are distinct for each individual entity record and are used for
identifying an individual entity record are known as key attributes. For example in the student entity type,
StudentID is the key attribute since no two students can have same StudentID.
A key attribute is underlined in ER diagram.
Non-key attributes : Attributes that are not unique but are used to describe the entities are known as non-key
attributes. Names, age, address of a student are the non key attributes.
Simple : Attributes that cant be divided into subparts are called simple attributes. For example StudentIDwhich
is just a number is a simple attribute.
Composite : Attributes that can be divided into subparts with each subpart having their own independent
meaning are composite attributes. For example Name of a student can be divided into two parts i.e. first
nameand last name. This could be illustrated by branching off the components of the attribute.
Single valued: Attributes that can hold only single value at a time are called single valued attributes. Age of a
student cant have more than one value and hence it is a single valued attribute.
Multiple valued: Attributes that can have more than one value are called multiple valued attributes. For
example the contact number of a student can have two or more than two phone numbers.
A multi valued attribute is shown as:
Derived attributes: The attributes that are derived using a mathematical formula and operations on other
attributes are called derived attributes.
Stored attributes: The attributes from which another attributes can be derived are called stored attributes.The
age of a student can be calculated by counting the number of years starting from his date of birth to the present
date. In this case age is the derived attribute and date of birth is the stored attribute. In ER diagram a derived
attribute is represented with a dotted oval and a line.
Relationship
A relationship is an association among entity types. It is represented as a diamond in ER diagram.
For example an entity student can be associated with another entity class as follows:
Cardinality
Cardinality denotes the occurrences of data on either side of a relation. The cardinality ratio for a binary
relationship specifies the maximum number of relationship instances an entity can participate in.
A one to one relationship indicates that a single instance of one entity is associated with a single instance in
the related entity.
A student associated with many projects indicates a one to many relationship, and many students associated
with a single project indicate a many to one relationship.
A many to many relationship indicates that either entity participating in the relationship may have many
instances.
Every student is attending one or more classes. Every class has one or more students.
Example: The diagram shown below represents the academic functioning of a college. There are five entities viz.
Department, Faculty, Student, Course, and Hostel. All the five entities have their own attributes.DNumber,
FacultyID, StudentID, CourseID, and HostelID are the key attributes of Department, Faculty, Student,
Course and Hostel respectively. The entities are related to each other and the respective relationships are
explained below:
A college has many departments. A department would have students as well as faculty. The one to many
relationship between department and students, and, department and faculty states that a department belongs to
many students and it employs many faculty members. Looking at these relationships in a reverse direction
conveys that a student as well as a faculty belongs to a single department and thus establishes one to one
relationship.
A student can register himself into various courses; similarly a course can be studied by many students. A
student lives in a single hostel but a hostel accommodates many students. A department offers many courses
but a particular course belongs to a particular department. A faculty teaches many courses but a particular
course is taught by a single faculty only.
Normalization
Normalization is a design technique which helps designing relational databases. The objective of normalization is
to create a set of relational tables that are free of redundant data and to make data consistent.
First normal form
Create separate tables for sets of values that apply to multiple records .
Relate these tables with a foreign key
The example shows a table (Table1) in which the group class is mentioned three times. Since one student has
several classes, these classes should be listed in a separate table. Another table in first normal form (Table2) is
created by eliminating the repeating group (Class#), as shown below:
First Normalization (No Repeating Groups)
Note the Class# values for each Student# value in the above table. Class# is not functionally dependent on
Student# (primary key), so this relationship is not in second normal form. Class# is separated from the first
normalization table and is placed in another table (Table4).
The attribute Adv-Room is functionally dependent on the Advisor attribute. It must be moved from the student
table to some other table lets say faculty table (Table6).
Third Normalization (Eliminate data not dependent on key)
Longley et al. (2001) list the following features of object data models which make them good for modeling GIS
systems:
Encapsulation: packaging together of the description of state and behavior in each object
Inheritance: ability to use some or all characteristics of one object in another object
Polymorphism: specific implementation of operations such as create, delete etc for each object.
References
http://wofford-ecs.org/dataandvisualization/ermodel/material.htm viewed on 8 November 2011 Oppel, A
2010,Databases demystified, 2 nd edn, McGraw Hill, United States
Goodchild, M.F., Longley, P.A., Maguire, D. J. & Rhind, D.W 2001, Geographic information systems and science,
John Wiley & Sons Ltd. , England.