
Database Management Systems

The Three-Tier Architecture of Web-Based Data Warehousing:

[Figure: three-tier architecture. Labels: Client (Web browser); Internet, Intranet / Extranet; Web server (Web pages); Application server; Data warehouse.]

Dr. K.T. Subhaschandra.


MBA – Coordinator,
Dept of Commerce and Management
Govt. R.C.College of Commerce
There is probably no segment of activity in the world attracting as
much attention at present as that of knowledge management.

Data, Information, Knowledge, and Wisdom
Russell Ackoff, professor of organizational change, categorized the
content of the human mind as:

– Data: symbols
– Information: data that are processed to be useful; provides
answers to "who", "what", "where", and "when" questions
– Knowledge: application of data and information; answers "how"
questions
– Understanding: appreciation of "why"
– Wisdom: evaluated understanding.

Ackoff indicates that the first four categories relate to the past; they deal
with what has been or what is known. Only the fifth category, wisdom, deals
with the future because it incorporates vision and design. With wisdom,
people can create the future rather than just grasp the present and past. But
achieving wisdom isn't easy; people must move successively through the
other categories.
A further elaboration of Ackoff's definitions follows:

• Data... data is raw. It simply exists and has no
significance beyond its existence (in and of itself).
It can exist in any form, usable or not. It does not
have meaning of itself. In computer parlance, a
spreadsheet generally starts out by holding data.

• Information... information is data that has been
given meaning by way of relational connection.
This "meaning" can be useful, but does not have
to be. In computer parlance, a relational database
makes information from the data stored within it.
Knowledge... knowledge is the appropriate collection of information,
such that its intent is to be useful. Knowledge is a deterministic
process. When someone "memorizes" information (as less-
aspiring test-bound students often do), then they have amassed
knowledge. This knowledge has useful meaning to them, but it
does not provide for, in and of itself, an integration such as would
infer further knowledge. For example, elementary school children
memorize, or amass knowledge of, the "multiplication table".
They can tell you that "2 x 2 = 4" because they have amassed that
knowledge (it being included in the table). But when asked what is
"1267 x 300", they cannot respond correctly because that entry is
not in their multiplication table. To correctly answer such a
question requires a true cognitive and analytical ability that is
only encompassed in the next level... understanding. In computer
parlance, most of the applications we use (modeling, simulation,
etc.) exercise some type of stored knowledge.
Understanding... understanding is an interpolative and
probabilistic process. It is cognitive and analytical. It is the
process by which I can take knowledge and synthesize new
knowledge from the previously held knowledge. The difference
between understanding and knowledge is the difference between
"learning" and "memorizing". People who have understanding
can undertake useful actions because they can synthesize new
knowledge, or in some cases, at least new information, from
what is previously known (and understood). That is,
understanding can build upon currently held information,
knowledge and understanding itself. In computer parlance, AI
systems possess understanding in the sense that they are able to
synthesize new knowledge from previously stored information
and knowledge.
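The contrast between memorizing and understanding drawn above can be sketched in a few lines of Python. This is only an illustration: the function names `recall` and `multiply_by_repeated_addition` are hypothetical, and treating "understanding" as a simple procedure is a deliberate simplification of Ackoff's idea.

```python
# "Knowledge" modelled as a memorized lookup table, and "understanding"
# modelled as a procedure that can derive answers never memorized.
# All names here are illustrative assumptions, not part of the source.

# The memorized multiplication table for 1..10 (the amassed knowledge).
memorized_table = {(a, b): a * b for a in range(1, 11) for b in range(1, 11)}


def recall(a, b):
    """Answer from memory only; return None if the entry was never memorized."""
    return memorized_table.get((a, b))


def multiply_by_repeated_addition(a, b):
    """Derive the product from what multiplication means (repeated addition)."""
    total = 0
    for _ in range(b):
        total += a
    return total


print(recall(2, 2))                              # found in the table
print(recall(1267, 300))                         # None: the entry is not in the table
print(multiply_by_repeated_addition(1267, 300))  # derived rather than recalled
```

The lookup fails for "1267 x 300" exactly as the schoolchild does, while the procedure produces an answer it was never given.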
Wisdom... wisdom is an extrapolative and non-deterministic, non-probabilistic
process. It calls upon all the previous levels of consciousness, and specifically
upon special types of human programming (moral, ethical codes, etc.). It
beckons to give us understanding about which there has previously been no
understanding, and in doing so, goes far beyond understanding itself. It is the
essence of philosophical probing. Unlike the previous four levels, it asks
questions to which there is no (easily achievable) answer, and in some cases,
to which there can be no humanly known answer, period. Wisdom is,
therefore, the process by which we also distinguish, or judge, between right
and wrong, good and bad. I personally believe that computers do not have,
and will never have, the ability to possess wisdom. Wisdom is a uniquely
human state, or as I see it, wisdom requires one to have a soul, for it resides as
much in the heart as in the mind. And a soul is something machines will never
possess (or perhaps I should reword that to say, a soul is something that, in
general, will never possess a machine).
The following diagram represents the transitions from data, to information, to
knowledge, and finally to wisdom, and it is understanding that supports the transition
from each stage to the next. Understanding is not a separate level of its own.
• Data represents a fact or statement of event without relation to
other things.
Ex: It is raining.
• Information embodies the understanding of a relationship of
some sort, possibly cause and effect.
Ex: The temperature dropped 15 degrees and then it
started raining.
• Knowledge represents a pattern that connects and generally
provides a high level of predictability as to what is described or
what will happen next.
Ex: If the humidity is very high and the temperature
drops substantially, the atmosphere is often unlikely to
be able to hold the moisture, so it rains.
• Wisdom embodies more of an understanding of the fundamental
principles embodied within the knowledge that are essentially the basis
for the knowledge being what it is. Wisdom is essentially systemic.
Ex: It rains because it rains. And this encompasses an understanding of all the
interactions that happen between raining, evaporation, air currents, temperature
gradients, changes, and raining.
References:
• Ackoff, R. L., "From Data to Wisdom", Journal of Applied Systems Analysis,
Volume 16, 1989, pp. 3-9.
• Gadomski, Adam Maria, Information, Preferences and Knowledge,
An Interesting Evolution in Thought
• Sharma, Nikhil, The Origin of the Data Information Knowledge Wisdom Hierarchy
DATABASE AND DATABASE MANAGEMENT SYSTEMS
Data is the key element that drives the information systems of any organization.
The objective of a Management Information System (MIS) is to transform data into
meaningful management information. To make decisions and plan for the future,
managers need information that originates from a database: the business
MIS model is built on a database foundation. The database is the mortar of MIS,
which runs all the information sub-systems of an organization.

Since the inception of electronic computers, one of the most challenging tasks for
managers has been Data Resource Management [DRM]. In performing DRM functions,
organizations have faced a lot of inconvenience in using electronic media. The
persistent efforts of the IT industry resulted in the invention of new
database management technologies as solutions for the inconveniences faced by
organizations. Database management system software [DBMS/RDBMS], distributed
databases, data warehousing and data mining, object-oriented databases, and
web-based hypermedia databases are the stage-by-stage growth of such inventions.
FILE MANAGEMENT METHODS
An efficient information system provides users with timely, accurate, and
relevant information. This information is stored in manual or computer files. In
computer-based management information systems, "file" means a computer file.
When files are properly maintained, users can easily access and retrieve the
information they need.
A computer system organizes data in a hierarchy that starts with bits and
bytes and progresses to fields, records, files, and databases. A bit is the smallest unit of
data [i.e., either 0 or 1] a computer can handle. A group of 8 bits is called
a byte, which represents a single character {i.e., A to Z, a to z, digits, special
characters, space, etc.}. A grouping of characters into a word, a group of words, or a
complete number (such as a person’s name, age, etc.) is called a field. A group of
related fields is called a record. A group of records of the same type is called a file.
A record describes an entity. An entity is a person, place, thing, or event
on which we maintain information. Each piece of information describing a
particular entity is called an attribute.
Every record in a file should contain at least one field that uniquely identifies
that record so that the record can be retrieved, updated, or sorted. Such an identifier
field is called a KEY FIELD. Finally, related files are organized into a database.
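The hierarchy described above (bit, byte, field, record, file, database) can be sketched in Python. This is a minimal illustration under stated assumptions: the class `EmployeeRecord`, the simplified ID values, and the in-memory lists standing in for files are all hypothetical, and real file systems store these structures on disk.

```python
from dataclasses import dataclass

# Bit and byte: one character is stored as a byte of 8 bits.
letter = "G"
bits = format(ord(letter), "08b")  # the 8 bits of the ASCII code for "G"


@dataclass
class EmployeeRecord:
    """One record: a group of related fields describing one entity."""
    emp_id: str  # the key field: uniquely identifies the record
    name: str
    dept: str
    shift: str


# A file: a group of records of the same type.
employee_file = [
    EmployeeRecord("09909", "Gopal", "A", "II"),
    EmployeeRecord("67890", "Ramu", "C", "I"),
    EmployeeRecord("09860", "Chandra", "D", "G"),
]

# The key field lets a record be located directly.
index_by_key = {record.emp_id: record for record in employee_file}

# A database: related files organized together (the other two left empty here).
database = {"employee": employee_file, "payroll": [], "welfare_benefit": []}

print(bits, index_by_key["67890"].name)
```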
ILLUSTRATIVE FIGURE OF DATA HIERARCHY IN COMPUTER FILE MANAGEMENT SYSTEMS

HIERARCHY OF DATA    EXAMPLE OF EMPLOYEE DATABASE

DATABASE    Employee File, Welfare Benefit File, Pay-Roll File

FILE        Emp_name      ID no.       Dept.   Shift   Address     Ph.no
            1. Gopal      Ab. 09909    A       II      …B’lore     3354126
            2. Ramu       Cl. 67890    C       I       ..mysore    48765
            3. Chandra    Dx. 09860    D       G       …B’lore.    2225760

RECORD      Emp_name      ID no.       Dept.   Shift   Address     Ph.no
            1. Gopal      Ab. 09909    A       II      …B’lore     3354126

FIELD       Gopal {Name field}

BYTE        G = 01000111 (ASCII code of the 1st letter of the name field, Gopal)

BIT         0

OPERATIONS REQUIRED FOR PROCESSING RECORDS IN A FILE ARE:

a. File creation, b. Locating a record, c. Adding a record, d. Deleting a record, e. Modifying a record
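The five operations listed above can be sketched against a keyed in-memory "file". The function names below are hypothetical and a Python dictionary stands in for the file; an actual file management system would perform these operations on disk storage.

```python
# A sketch of the five file operations, using a dict keyed by the key field.
# Function names and sample values are illustrative assumptions.

def create_file():
    """a. File creation: start with an empty keyed file."""
    return {}


def locate_record(file, key):
    """b. Locating a record by its key field (None if absent)."""
    return file.get(key)


def add_record(file, key, record):
    """c. Adding a record; the key field must stay unique."""
    if key in file:
        raise KeyError(f"duplicate key field: {key}")
    file[key] = dict(record)


def delete_record(file, key):
    """d. Deleting a record."""
    del file[key]


def modify_record(file, key, **changes):
    """e. Modifying fields of an existing record."""
    file[key].update(changes)


employees = create_file()
add_record(employees, "09909", {"name": "Gopal", "dept": "A"})
add_record(employees, "67890", {"name": "Ramu", "dept": "C"})
modify_record(employees, "09909", dept="B")
delete_record(employees, "67890")
print(locate_record(employees, "09909"))
```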
LIMITATION OF FILE MANAGEMENT
Data Redundancy: It means the presence of duplicate data in multiple data files.
This occurs when different divisions, functional areas, and groups in an
organization independently collect the same piece of information. The same
piece of information, collected and stored in different files for different
applications, can take on different meanings because programmers and analysts work in
isolation on different applications. This creates a lot of confusion.
Program-data dependence: Program-data dependence is the tight relationship
between data stored in files and the specific programs required to update and
maintain those files. Every computer program has to describe the location
and nature of the data with which it works. Any change in data requires a
change in all programs that access the data.
Lack of flexibility: The file management system can deliver routine scheduled
reports after extensive programming efforts, but it cannot deliver ad hoc
reports or respond to unanticipated information requirements in a timely
fashion.
Poor security: Because there is little control or management of data, access to
and dissemination of information are virtually out of control. What limits on
access exist tend to be the result of habit and tradition, as well as of the
sheer difficulty of finding information.
Lack of data sharing and availability: The lack of control over access to data in
this confused environment does not make it easy for people to obtain
information. Because pieces of information in different files and different parts
of the organization cannot be related to one another, it is virtually impossible
for information to be shared or accessed in a timely manner.
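The redundancy problem above can be made concrete with a small sketch: two applications keep their own copy of the same employee address, and only one copy is updated. The file names, the sample values, and the helper `find_inconsistencies` are hypothetical.

```python
# Two applications each keep a private file holding the same employee address.
payroll_file = {"09909": {"name": "Gopal", "address": "Bangalore"}}
benefits_file = {"09909": {"name": "Gopal", "address": "Bangalore"}}

# The payroll application records a change of address in its own file only.
payroll_file["09909"]["address"] = "Mysore"


def find_inconsistencies(file_a, file_b, field):
    """Return the keys whose value for `field` differs between the two files."""
    shared_keys = file_a.keys() & file_b.keys()
    return sorted(k for k in shared_keys if file_a[k][field] != file_b[k][field])


conflicts = find_inconsistencies(payroll_file, benefits_file, "address")
print(conflicts)  # keys of the employees whose two copies now disagree
```

Nothing in either file says which address is correct, which is the confusion the database approach is meant to remove.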
DATABASE & DATABASE MANAGEMENT SYSTEMS
The DATABASE Technology can cut through many of the problems created by
file management systems. It can be defined as follows.
A collection of interrelated data stored together with controlled
redundancy to serve one or more applications in optimal fashion;
the data are stored so that they are independent of programs which
use the data; a common and controlled approach is used in adding
new data and modifying and retrieving existing data within the data
base.
- Jerome Kanter, “Management Information Systems”, Third Edition, Prentice Hall of India
Private Ltd., New Delhi – 110 001, pp. 90-127

“DATABASE is a collection of data organized to serve many
applications efficiently by centralizing the data and minimizing
redundant data. Rather than storing data in separate files for each
application, data are stored physically to appear to users as being
stored in only one location. A single database services multiple
applications.”
- KENNETH C. LAUDON & JANE P. LAUDON, “Essentials of Management Information
Systems”, Third Edition, pp. 199-229.
The following figure illustrates an example of a database for
HUMAN RESOURCES MANAGEMENT.

[Figure: INTEGRATED HUMAN RESOURCE MANAGEMENT DATABASE. A single database, managed by the Database Management System [DBMS], serves three sets of application programs and their departments.]

Data held in the database:
Employees: Name, Address, Social security number, Position, Marital status
Payroll: Hours worked, Pay rate, Gross pay, Deductions, Net pay
Benefits: Life insurance, Group insurance, Health care plan, Provident fund, Retirement benefits

Application programs and the departments they serve:
Personnel application programs: Personnel department
Payroll application programs: Payroll department
Benefits application programs: Benefits department
DATABASE & DATABASE MANAGEMENT SYSTEMS
A database is a mechanized, formally defined, centrally
controlled collection of data in an organization. The data
structure is physically organized and stored to promote
shareability, availability, evolvability, and integrity.
– GORDON B. DAVIS & MARGRETHE H. OLSON, “Management Information Systems”, Third Edition, pp. 205-234

The database approach is made operational by a DATABASE
MANAGEMENT SYSTEM {DBMS}, A SOFTWARE SYSTEM WHICH
PERFORMS THE FUNCTIONS OF DEFINING, CREATING,
REVISING, AND CONTROLLING THE DATABASE. That is, a DBMS has
the specialized function of creating and maintaining a database and enabling
individual business applications to extract the data they need without
having to create separate files or data definitions in their computer
programs. The DBMS software provides facilities for retrieving data,
generating reports, revising data definitions, updating data, and
building applications. The DBMS ACTS AS AN INTERFACE BETWEEN
APPLICATION PROGRAMS AND THE PHYSICAL DATA FILES.
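A minimal sketch of the DBMS acting as the interface between applications and stored data, using Python's built-in `sqlite3` module as a stand-in DBMS. The table layout, the sample rows, and the two "application" functions are assumptions made for illustration only.

```python
import sqlite3

# The DBMS (SQLite here) owns the physical storage; applications only issue requests.
conn = sqlite3.connect(":memory:")

# Defining and creating the database.
conn.execute("""
    CREATE TABLE employee (
        emp_id      TEXT PRIMARY KEY,
        name        TEXT NOT NULL,
        pay_rate    REAL,
        health_plan TEXT
    )
""")
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?, ?)",
    [("09909", "Gopal", 120.0, "Plan A"), ("67890", "Ramu", 95.5, "Plan B")],
)


def payroll_view(db):
    """Payroll application: extracts only the data it needs."""
    return db.execute("SELECT name, pay_rate FROM employee ORDER BY emp_id").fetchall()


def benefits_view(db):
    """Benefits application: same database, different data."""
    return db.execute("SELECT name, health_plan FROM employee ORDER BY emp_id").fetchall()


before = payroll_view(conn)

# Revising the data definition and updating data through the DBMS.
conn.execute("ALTER TABLE employee ADD COLUMN dept TEXT")
conn.execute("UPDATE employee SET pay_rate = 130.0 WHERE emp_id = '09909'")

after = payroll_view(conn)
print(before, after, benefits_view(conn))
```

Neither application function had to change when the `dept` column was added, because neither describes the physical layout of the data itself.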
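Conceptual Model of DBMS

[Figure: conceptual model of a DBMS. Three routes lead to the database: the database query language; the database administration functions (database definition, database creation, database redefinition, data restructure, integrity controls); and the database programming language interface used by application programs.]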
As per the above database model, there are 3 types of users:

1. The Non-Programming Users: These users do not write any
program to use the database. Such a user is usually an analyst or end user with
special training, who programs ad hoc queries and reports using a
database query language.

2. The Programming Users: An applications programmer who does
the analysis and programming of applications. This user uses special database
interface instructions to program application access to the database
through the database management system. The instructions call the
database management system to request data, perform updates,
etc. The programming users can also use the database query
language for special assignments.

3. The Database Administrators: The DBA uses special instructions
and facilities of the database management system (a data definition
language or DDL) to define, create, redefine, and restructure the
database and to implement integrity controls.
Objectives of DATABASE & DATABASE MANAGEMENT SYSTEMS

1. Availability: Data should be made available for use by applications
(both current and future) and by queries.
2. Shareability: Data items prepared by one application are available to
all applications or queries. No data items are ‘owned’ by an
application.
3. Evolvability: The database can evolve as application usage and
query needs evolve.
4. Data Independence: The users of the database establish their view
of the data and its structure without regard to the actual physical
storage of the data.
5. Data Integrity: The database establishes a uniform high level of
accuracy and consistency. Validation rules are applied by the
database management system.
6. Reduced Redundancy: The duplicate data found across multiple
data files in a file-based structure is minimized and kept under control.
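The data integrity objective (validation rules applied by the DBMS rather than by each program) can be sketched with SQLite constraints. The table, the rules, and the helper `try_insert` are illustrative assumptions; other DBMS products express such rules in their own ways.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Validation rules are declared once, in the database definition.
conn.execute("""
    CREATE TABLE payroll (
        emp_id       TEXT PRIMARY KEY,                      -- no duplicate records
        hours_worked REAL NOT NULL CHECK (hours_worked >= 0),
        pay_rate     REAL NOT NULL CHECK (pay_rate > 0)
    )
""")
conn.execute("INSERT INTO payroll VALUES ('09909', 40, 120.0)")


def try_insert(row):
    """Return True if the DBMS accepts the row, False if a rule rejects it."""
    try:
        conn.execute("INSERT INTO payroll VALUES (?, ?, ?)", row)
        return True
    except sqlite3.IntegrityError:
        return False


accepted_valid_row = try_insert(("67890", 38, 95.5))
accepted_negative_hours = try_insert(("09860", -5, 110.0))
accepted_duplicate_key = try_insert(("09909", 40, 120.0))
row_count = conn.execute("SELECT COUNT(*) FROM payroll").fetchone()[0]
print(accepted_valid_row, accepted_negative_hours, accepted_duplicate_key, row_count)
```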
Three components of DBMS are :
A Data Definition Language: A formal language used by the programmers to specify the content
and structure of the database. DDL defines each data element as it appears in the database
before that data element is translated into the forms required by application programs.
A Data Manipulation Language: This language is used in conjunction with some conventional
third or fourth generation programming languages to manipulate the data in the database. This
language contains commands that permit end users and programming specialists to extract
data from the database to satisfy information requests and develop applications. The most
prominent data manipulation language today is Structured Query Language [SQL]. SQL is the
standard data manipulation language for relational database management systems (RDBMS).
A Data Dictionary: This is an automated or manual file, which includes definitions of data
elements and data characteristics such as usage, physical representation, ownership (person/s
responsible for maintaining the data in the organization), authorization, and security. Many data
dictionaries can produce lists and reports of data utilization, groupings, program locations, and
so on. A data element represents a field. In simple terms, a data dictionary is a repository of
information about data. It contains the following information about data:
– The name of the data item.
– A description of the data items.
– Sources of data i.e., various sources of input.
– Impact analysis i.e., users of the data including screens, reports, programs, and
organizational positions that access and use the data item.
– Key words used for categorizing and searching for data item descriptions.
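The three components can be seen side by side in a short `sqlite3` sketch. SQLite's built-in catalog (`sqlite_master` and `PRAGMA table_info`) is used here as a rough stand-in for an automated data dictionary; it records names and types only, not the ownership, authorization, or impact-analysis details listed above. Table and column names are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Data definition language (DDL): specify the content and structure.
conn.execute("""
    CREATE TABLE employee (
        emp_id TEXT PRIMARY KEY,
        name   TEXT NOT NULL,
        dept   TEXT
    )
""")

# Data manipulation language (DML): insert, update, and extract data.
conn.execute("INSERT INTO employee VALUES ('09909', 'Gopal', 'A')")
conn.execute("UPDATE employee SET dept = 'B' WHERE emp_id = '09909'")
dept = conn.execute(
    "SELECT dept FROM employee WHERE emp_id = '09909'"
).fetchone()[0]

# Catalog lookup: information about the data, kept by the DBMS itself.
tables = [row[0] for row in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table'"
)]
# Each table_info row is (cid, name, type, notnull, default, pk).
columns = [(row[1], row[2]) for row in conn.execute("PRAGMA table_info(employee)")]

print(dept, tables, columns)
```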
LOGICAL AND PHYSICAL VIEWS OF DATA:
DBMS separates logical and physical views of the data, relieving the programmer or end user from
the task of understanding where and how the data are actually stored.
LOGICAL VIEW: It is a representation of data as they would appear (be perceived by ) to an application
programmer or end user.
PHYSICAL VIEW: It shows how data are actually organized and structured on physical storage media .

Logical structure or model (i.e., relationship among data)

HIERARCHICAL DATA MODEL: This model presents data to users in a tree-like structure. An example is
IBM’s IMS (Information Management System). In this system a record is subdivided into
segments that are connected to each other in one-to-many or parent-child relationships.
Data are physically linked to one another by a series of pointers that form chains of related data
segments.
NETWORK DATA MODEL: This model allows many-to-many relationships among records (i.e., the
network model allows entry into a database at multiple points, because any data element or
record can be related to any number of other data elements).
RELATIONAL DATA MODEL: This model represents all data in the database as simple two-
dimensional tables called relations. In each table the rows (tuples) are unique records and the
columns (attributes) are fields (data elements). The relational data model can relate data stored
in one table to data in another as long as the two tables share a common data element (key
field).
In a relational database, three basic operations are used to develop useful sets of data,
viz., SELECT, PROJECT, and JOIN. The select operation creates a subset consisting of all
records in the file that meet stated criteria. The join operation combines relational tables to
provide the user with more information than is available in any individual table. The project operation
creates a subset consisting of columns in a table, permitting the user to create new tables that
contain only the information required.
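The three relational operations can be sketched in SQL through Python's `sqlite3` module. The two tables, their columns, and the department names are invented for illustration. Note that in SQL all three operations are written with the `SELECT` statement: the relational "select" corresponds to the `WHERE` clause and "project" to the column list.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employee (emp_id TEXT PRIMARY KEY, name TEXT, dept_code TEXT);
    CREATE TABLE department (dept_code TEXT PRIMARY KEY, dept_name TEXT);
    INSERT INTO employee VALUES ('09909', 'Gopal', 'A');
    INSERT INTO employee VALUES ('67890', 'Ramu', 'C');
    INSERT INTO employee VALUES ('09860', 'Chandra', 'A');
    INSERT INTO department VALUES ('A', 'Accounts');
    INSERT INTO department VALUES ('C', 'Commerce');
""")

# SELECT operation: the rows that meet stated criteria.
selected = conn.execute(
    "SELECT * FROM employee WHERE dept_code = 'A' ORDER BY emp_id"
).fetchall()

# PROJECT operation: only the required columns.
projected = conn.execute(
    "SELECT name FROM employee ORDER BY emp_id"
).fetchall()

# JOIN operation: combine tables through the shared key field dept_code.
joined = conn.execute("""
    SELECT e.name, d.dept_name
    FROM employee AS e
    JOIN department AS d ON e.dept_code = d.dept_code
    ORDER BY e.emp_id
""").fetchall()

print(selected, projected, joined)
```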
Recent Trends in Databases
Billions of bytes of business-critical data are created by organisations’
computer systems daily, yet only a small portion of them is used in business-related
analysis. As a result, most companies are “data rich” but “information poor”, because
their ability to manipulate data and deliver information lags far behind the growth rate of
data. To overcome this difficulty, there have been significant developments in hardware,
telecommunications, and database technologies, accompanied by an increased rate of
computer literacy among end-users. The following are some of the latest developments in
database technology.
Distributed Databases:
A distributed database is one that is stored in more than one physical
location. Some parts of the database are stored physically in one location and
other parts are stored and maintained in other locations. There are two main
ways of distributing a database:
1. Replicated database. It provides a duplicate of all data at all sites. This
approach is recommended if it is necessary for every location to have frequent
access to the same data.
2. Partitioned database. In this method the database is divided into segments
appropriate to particular locations, and those segments are distributed only to
those locations. For example, the database may be partitioned along functional
lines, viz., Financial, Logistic, Human-resource-management, Manufacturing,
Marketing, etc. Some data may be kept at the corporate office, and relevant
production and personnel data at each manufacturing plant and office site.
Partitioning may also be achieved along geographical lines. That is, all information (Financial, Logistic,
Human-resource-management, Manufacturing, Marketing, etc.) may be kept at each of the separate
locations of an organization.
Many organizations with many locations partition the database hierarchically. Detailed data, such as payroll
and sales data are kept close to their source-the local site. Regional and national locations receive increasingly
less detailed summaries of the detailed data as these data are transmitted up through the organisation’s
hierarchy.
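The distribution strategies above can be sketched with plain Python data structures. Site names, product names, and quantities are invented for illustration, and the helpers `summarize_site` and `replicate` are hypothetical; a real distributed DBMS also has to handle synchronization, which this sketch ignores.

```python
import copy

# Partitioned (hierarchical) approach: detailed rows stay at the local site.
local_sites = {
    "Bangalore": [("CEFORPROX", 120), ("LOMAC", 80)],
    "Mysore": [("CEFORPROX", 50), ("ASTHALIN", 30)],
}


def summarize_site(rows):
    """Summary sent upward: total quantity only, no detail rows."""
    return sum(quantity for _product, quantity in rows)


# The regional level receives one summary figure per site.
regional_summary = {site: summarize_site(rows) for site, rows in local_sites.items()}

# The national level receives an even less detailed figure.
national_total = sum(regional_summary.values())


def replicate(database, sites):
    """Replicated approach: every site holds its own full copy of all data."""
    return {site: copy.deepcopy(database) for site in sites}


replicas = replicate(local_sites, ["Bangalore", "Mysore"])
print(regional_summary, national_total)
```

The replicated copies are independent objects, so a change at one site is invisible to the others until it is propagated, which is the integrity risk the text attributes to distributed databases.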
Distributed database systems usually reduce costs because they reduce the transfer of data between remote
sites and the organisation’s headquarters. This system may also provide organizations with faster response
times for filling orders, answering customer requests, or providing managers with information. However,
distributed database systems also magnify the problems of databases. They compound the problems of control
over the database, increase problems of security for the database, increase data redundancy and the resulting
danger to data integrity, and increase the need for more computer resources. Unless the distribution of a
database is done very carefully, many of the advantages of having a database in the first place can be lost.
The increased power and use of microcomputers by managers and professionals have created additional
problems for database administrators. When managers download data from a centralized database to their
microcomputers, there is no longer a truly centralized database. Parts of the database are segmented and
distributed to these microcomputers.
Because of the backlog of requests to be filled at many management information systems departments, other
departments may become so frustrated that they decide to acquire their own minicomputers or microcomputers
to provide their own information services. When this happens, additional files and databases are established
throughout an organization, creating much of the same redundancy, inconsistency, and incompatibility.
An important type of distributed database system is the client/server system.
Distributed databases and distributed processing have significantly increased the awareness of data as a
key corporate resource and underscored the importance of its management, that is, Data Resource
Management (DRM). The success of the DRM function in a distributed environment can manifest in several different
ways. Success may be reflected by the degree to which preset DRM objectives are realized. The DRM
objectives relate to improvements in efficiency and effectiveness of the DRM function. Such objectives include
maintaining data integrity, accuracy, security, and availability; providing timely data; designing efficient data
distribution strategies; enhancing operational efficiency; setting and enforcing standards; facilitating enhanced
data sharing and reducing redundancy; developing strategic data plans; and training information systems
personnel and end-users, among others.
OBJECT-ORIENTED AND HYPERMEDIA DATABASES
Conventional database management systems are not well suited to handling
graphics-based or multimedia applications involving drawings, images, photographs,
voice, and full-motion video. An object-oriented database, on the other hand,
stores data and procedures as objects that can be automatically retrieved and
shared. Object-oriented database management systems (OODBMS) are becoming popular because
they can be used to manage the various multimedia components or Java applets
used in Web applications, which typically integrate pieces of information from a
variety of sources. OODBMS are also useful for storing data types such as
recursive data (an example would be parts within parts, as found in manufacturing
applications).
The hypermedia database: This is an approach to DBMS that organizes data as a
network of nodes linked in any pattern established by the user; the nodes can
contain text, graphics, sound, full-motion video, or executable programs.
Although object-oriented and hypermedia databases can store more complex types
of information than relational DBMS, they are relatively slow compared with
relational DBMS for processing large numbers of transactions. Hybrid
object-relational systems are now available to provide the capabilities of both object-oriented
and relational DBMS. A hybrid approach can be accomplished in three different
ways: by using tools that offer object-oriented access to relational DBMS, by using
object-oriented extensions to existing relational DBMS, or by using an object-relational
database management system.
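The idea of storing data and procedures together, and the "parts within parts" example, can be sketched with a small Python class. The class `Part`, its fields, and the costs are hypothetical; the sketch shows only the object model, not how an OODBMS persists such objects.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Part:
    """An object bundling data (name, cost, sub-parts) with a procedure."""
    name: str
    unit_cost: float
    components: List["Part"] = field(default_factory=list)  # parts within parts

    def total_cost(self) -> float:
        """Recursively add this part's own cost to that of its components."""
        return self.unit_cost + sum(part.total_cost() for part in self.components)


bolt = Part("bolt", 2.0)
wheel = Part("wheel", 50.0, [bolt, bolt])
cart = Part("cart", 100.0, [wheel, wheel])

print(wheel.total_cost(), cart.total_cost())
```

In a purely tabular design the same structure needs a self-referencing table and repeated joins, which is one reason the text singles out recursive data as a good fit for object storage.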
MULTI-DIMENSIONAL DATA ANALYSIS
Sometimes managers need to analyse data by viewing it from a
multi-dimensional angle. For example, Cipla Pharmaceutical Industry
manufactures different products and distributes these drugs in
different cities and towns. If the manager (CEO) needs to know actual sales by
product for each city or town, and also wants to compare them with
budgeted sales, multi-dimensional data analysis is required. For example, the
analysis for the products CEFORPROX, LARPOSE, FERTOMID, LOMAC,
ASTHALIN etc., distributed in cities viz., BANGALORE, MYSORE,
DHARWAD, DAVANGERE and towns viz., Kolar, Malur, Kengeri, Anekal
etc., is shown as follows.

[Figure: Area-wise Budgeted & Actuals for All Products (three-dimensional cube).
Budgeted & Actual for each product: BUDGETED: 1. Quantity, 2. Price, 3. Amount; ACTUALS: 4. Quantity, 5. Price, 6. Amount.
PRODUCTS: CEFORPROX, LARPOSE, FERTOMID, LOMAC, ASTHALIN.
CITIES & TOWNS: BANGALORE, MYSORE, DHARWAD, DAVANGERE, HOSKOTE, KENGERI, MALUR, KOLAR.]

To provide this type of information, organizations can use either a specialized multi-
dimensional database or a tool that creates multi-dimensional views of data in relational
databases. Another name used for multi-dimensional data analysis is On-Line Analytical
Processing [OLAP]. OLAP refers to the capability for manipulating and analyzing large volumes
of data from multiple perspectives.
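Viewing the same facts from multiple perspectives can be sketched as a roll-up over chosen dimensions. The sales amounts below are invented purely for illustration, and `roll_up` is a hypothetical helper, not the interface of any OLAP product.

```python
from collections import defaultdict

# Each fact: (product, place, measure, amount). Amounts are made-up sample values.
facts = [
    ("CEFORPROX", "BANGALORE", "budgeted", 1000),
    ("CEFORPROX", "BANGALORE", "actual", 900),
    ("CEFORPROX", "MYSORE", "budgeted", 400),
    ("CEFORPROX", "MYSORE", "actual", 450),
    ("LOMAC", "BANGALORE", "budgeted", 600),
    ("LOMAC", "BANGALORE", "actual", 650),
]

DIMENSION_POSITION = {"product": 0, "place": 1, "measure": 2}


def roll_up(rows, dimensions):
    """Total the amounts, grouped by the chosen dimensions only."""
    totals = defaultdict(int)
    for row in rows:
        key = tuple(row[DIMENSION_POSITION[d]] for d in dimensions)
        totals[key] += row[3]
    return dict(totals)


# The same facts viewed along two different pairs of dimensions.
by_product = roll_up(facts, ["product", "measure"])
by_place = roll_up(facts, ["place", "measure"])

# Budget versus actual for one product across all places.
variance = by_product[("CEFORPROX", "actual")] - by_product[("CEFORPROX", "budgeted")]
print(by_product, by_place, variance)
```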
DATA WAREHOUSES
Data warehousing is a multi-billion dollar industry that sprang up during the 1990s. Today most
Fortune 1000 firms and many smaller ones have their own data warehouses. The industry
resulted from a realization that conventional on-line transaction databases are not adequate for
decision support, data-mining, and customer relationship applications (i.e., e-CRM). Data
warehousing emerged to meet the requirements of DSS / GDSS, EIS, data-mining,
e-CRM, supply-chain relationship management, and other similar types of application.
Data warehousing and the Internet are the two key technologies that offer potential solutions for
managing corporate data. Data warehousing liberates information, and the Internet makes it easy
and less costly to access information from anywhere at any time.
Definition of data warehouse
A data warehouse is defined as “a subject-oriented, integrated, non-volatile, time-
variant collection of data organized to support management needs.”
A data warehouse differs from operational databases, which mainly support daily
business transactions and provide management information to managers in the form of
periodical reports. A data warehouse collects data from multiple (both internal and
external) sources and stores it in a fashion that allows end users to have faster,
easier, and more flexible access to key information. The data in the data
warehouse are standardized (cleaned) and are available for anyone to access as
needed, but cannot be altered.

WILLIAM H. INMON “Data warehouse, Marts, Metadata, OLAP/ROLAP & Data mining Glossary” Management Accounting –Castelluccio,
(4:78), 1996 pp-59-69.
Data marts:
Organizations can build enterprise-wide data warehouses, where a
central data warehouse serves the entire organization, or they can
create smaller, decentralized data warehouses called data marts. A
data mart is a subset of a data warehouse in which a
summarized, highly focused portion of the organization’s data is
placed in a separate database for a specific population of users
(functional or department-wise users).
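A data mart as a summarized, focused subset of the warehouse can be sketched with `sqlite3`. The table names, regions, dates, and amounts are invented for illustration; in practice a mart usually lives in its own database rather than beside the warehouse table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- The central warehouse holds detailed rows for the whole organization.
    CREATE TABLE warehouse_sales (
        sale_date TEXT, region TEXT, product TEXT, amount REAL
    );
    INSERT INTO warehouse_sales VALUES ('2024-01-05', 'South', 'LOMAC', 100);
    INSERT INTO warehouse_sales VALUES ('2024-01-09', 'South', 'LOMAC', 150);
    INSERT INTO warehouse_sales VALUES ('2024-01-11', 'South', 'ASTHALIN', 70);
    INSERT INTO warehouse_sales VALUES ('2024-01-12', 'North', 'LOMAC', 90);

    -- The data mart: a summarized portion for one group of users.
    CREATE TABLE south_sales_mart AS
        SELECT product, SUM(amount) AS total_amount
        FROM warehouse_sales
        WHERE region = 'South'
        GROUP BY product;
""")

mart_rows = conn.execute(
    "SELECT product, total_amount FROM south_sales_mart ORDER BY product"
).fetchall()
warehouse_row_count = conn.execute(
    "SELECT COUNT(*) FROM warehouse_sales"
).fetchone()[0]
print(mart_rows, warehouse_row_count)
```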
Software tools needed for data warehouse
1. Warehouse Construction Software: This software is required to extract
relevant data from both operational databases and external sources, to make sure
the data are “clean” (free from error), to transform the data into a usable form,
and to load the data into the data warehouse. Warehouse construction
software is available from IBM, Information Builders, Platinum
Technology, etc.
2. Warehouse Operation Software: This software is required for storing data and
managing the data warehouse. This is accomplished by DBMSs such as
Computer Associates’ CA-Ingres, IBM’s DB2, Oracle, and Sybase. Specialized
warehouse management software is offered by Hewlett-Packard, IBM,
Information Builders, NCR, Red-Brick, and others.
3. Warehouse Access and Analysis Software: The widest variety of
software tools is available in the warehouse access and analysis
area. Information catalog tools, such as Platinum Technology’s
Platinum Repository, tell the user what is in the warehouse.
Reporting tools enable a user to produce customized reports from the
data warehouse, perhaps on a regular basis. Information Builders’
FOCUS, a 4GL, is a widely used reporting tool.

Query tools, such as Brio-Technology’s Brio Query Enterprise,
make it easy for a user to query the warehouse. For more
sophisticated data analysis, specialized data-mining tools, such as
Thinking Machines’ Darwin, are available. Visualising the data may
be important, using a tool such as SAS Institute’s SAS/Insight, and
presenting the data through an Executive Information System (EIS)
can be done using Show Business Software’s Show Business EIS
or a similar tool. The figure on the next slide illustrates the data
warehouse environment.
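The job of warehouse construction software described above (extract, clean, transform, load) can be sketched as four small Python functions. This is only an illustration of the sequence of steps, not of any vendor's product: the source rows, the city-name mapping, and the `employee_dim` table are all hypothetical.

```python
import sqlite3

# Rows as they might arrive from an operational source (sample values only).
operational_rows = [
    {"emp_id": "09909", "name": " gopal ", "city": "B'lore"},
    {"emp_id": "67890", "name": "Ramu", "city": "mysore"},
    {"emp_id": "", "name": "Unknown", "city": "Kolar"},  # dirty: missing key field
]

# Assumed mapping used to standardize city names.
CITY_NAMES = {"b'lore": "Bangalore", "mysore": "Mysore", "kolar": "Kolar"}


def extract():
    """Extract: pull the relevant rows from the source."""
    return list(operational_rows)


def clean(rows):
    """Clean: drop rows that have no key field."""
    return [row for row in rows if row["emp_id"]]


def transform(rows):
    """Transform: tidy names and standardize city spellings."""
    return [
        (
            row["emp_id"],
            row["name"].strip().title(),
            CITY_NAMES.get(row["city"].lower(), row["city"]),
        )
        for row in rows
    ]


def load(db, rows):
    """Load: write the prepared rows into the warehouse table."""
    db.executemany("INSERT INTO employee_dim VALUES (?, ?, ?)", rows)


warehouse = sqlite3.connect(":memory:")
warehouse.execute(
    "CREATE TABLE employee_dim (emp_id TEXT PRIMARY KEY, name TEXT, city TEXT)"
)
load(warehouse, transform(clean(extract())))
loaded = warehouse.execute("SELECT * FROM employee_dim ORDER BY emp_id").fetchall()
print(loaded)
```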
[Figure: A Structural Model showing Components of Data Warehouse.
Data sources for the data warehouse: internal data (operational and historical data) and external data.
Warehouse construction software feeds the data warehouse (data-bank and information directory), which is run by warehouse operation software.
The data warehouse supplies Data mart 1 to Data mart 7.
Warehouse access and analysis software (queries and reports, OLAP, data mining) reaches users over the Internet and intranet.]
Components of Data Warehouse


A data warehouse is user driven. It provides greater flexibility
in using data than traditional information systems. Orr (1996)
identifies eight interconnected parts of data warehouse
architecture (DWA). The eight parts represent the overall
structure of data, communication, processing, and presentation
in a data warehouse environment. Among them, the information
access layer is the layer that end users deal with when using the
data warehouse. It includes the hardware and software that
make up the user interface to the data warehouse. The
major goal of the user interface is to make the raw data
available easily and seamlessly to the end users. Currently, most
organizations implement data warehousing in either a
standalone or a traditional client/server environment, and
most data warehousing applications implement their
information access layer using applications with a graphical
user interface (GUI) running on desktop computers.
Drawbacks of data warehousing environment
1. The client/server infrastructure is expensive to establish and maintain.

2. Using one user interface is no longer sufficient because of the increasing
number of mobile users. The Gartner Group estimated that by the year 2003,
137 million business users worldwide would regularly work outside the
boundaries of the enterprise and without continuous LAN or high-speed WAN
connections (Reinaner, 1998). The number has grown many times over since
then. Providing information and decision support to these users has become
an unavoidable challenge for organizations today.

3. System compatibility is always a problem in the traditional client/server
environment: deploying multiple computing platforms enormously increases
the cost of administration and maintenance.

4. Supply-chain management (SCM) has become increasingly important.
Successful SCM requires an organization to invest heavily in
inter-enterprise coordination, distribution and channel partnerships, and
customer responsiveness. Therefore, limiting information access to a
small number of highly trained specialists within an organisation is no longer
sufficient. Information access must be extended to include an
organisation’s internal users, suppliers, partners, and customers.
Web-Based Data warehousing
Web-based data warehousing involves accessing, analyzing, and distributing the information extracted from a
data warehouse through the Internet, an intranet, or an extranet, using a Web browser as the user interface. Because of
the explosive development in data warehousing and Web-related technology, Web-based data
warehousing is gaining popularity among organizations. Its most important contribution
is helping an organization create innovative relationships with its suppliers,
partners, and customers, and build better supply-chain relations so that it can react rapidly to opportunities.
ARCHITECTURE OF WEB-BASED DATA WAREHOUSING
The architecture of Web-based data warehousing is three-tiered and includes a client, a Web server, and an application
server. On the client side, all the user needs is an Internet connection and a Web browser (preferably Java- or
C#-enabled). The client computer can be of any platform, including PCs, Macintoshes, UNIX machines, network
computers, and so on. The Internet / intranet / extranet is the communication medium between client and servers. On
the server side, a Web server manages the inflow and outflow of information between client and server. The Web server
is backed by both a data warehouse and an application server; the application server houses downloadable Java
applications, Common Gateway Interface (CGI) programs, and other applications used to manipulate the data in the
data warehouse. The query results are displayed on Web pages that are constructed on the fly or by Java-based data
visualization tools. The three-tier architecture of Web-based data warehousing is shown below.

2.7 Lei-da Chen and Mark N. Frolick, “Web-Based Data Warehousing,”
Information Systems Management, Spring 2000, pp.
The Three-Tier Architecture of Web-Based Data Warehousing:

[Figure: Client (Web browser) <-> Internet / intranet / extranet <-> Web server
(Web pages) <-> application server and data warehouse]

Ref.: Lei-da Chen and Mark N. Frolick, “Web-Based Data Warehousing,”
Information Systems Management, Spring 2000.
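
The three tiers described above can be sketched in a few lines of Python using only the standard library. This is a minimal illustration under stated assumptions, not the architecture from Chen and Frolick's article: an in-memory SQLite table stands in for the data warehouse, a plain function plays the application server, and http.server plays the Web server. The table name sales, the function names, and the port number are all hypothetical.

```python
import sqlite3
from html import escape
from http.server import BaseHTTPRequestHandler, HTTPServer


def open_warehouse():
    """Data tier: an in-memory SQLite table standing in for the warehouse."""
    conn = sqlite3.connect(":memory:", check_same_thread=False)
    conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
    conn.executemany(
        "INSERT INTO sales VALUES (?, ?)",
        [("East", 100.0), ("West", 250.0), ("East", 50.0)],
    )
    return conn


def query_totals(conn):
    """Application tier: run the analysis query against the warehouse."""
    return conn.execute(
        "SELECT region, SUM(amount) FROM sales GROUP BY region ORDER BY region"
    ).fetchall()


def render_page(rows):
    """Build the result page on the fly as an HTML table."""
    body = "".join(
        f"<tr><td>{escape(region)}</td><td>{total:.2f}</td></tr>"
        for region, total in rows
    )
    return f"<html><body><table>{body}</table></body></html>"


WAREHOUSE = open_warehouse()


class ReportHandler(BaseHTTPRequestHandler):
    """Web tier: answers a browser's GET request with the rendered page."""

    def do_GET(self):
        page = render_page(query_totals(WAREHOUSE)).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/html; charset=utf-8")
        self.send_header("Content-Length", str(len(page)))
        self.end_headers()
        self.wfile.write(page)


def serve(port=8000):
    """Start the Web tier; a browser pointed at this port is the client tier."""
    HTTPServer(("localhost", port), ReportHandler).serve_forever()
```

Calling serve() and opening http://localhost:8000/ in any browser plays the role of the thin client: nothing is installed on the desktop, and all processing stays on the server side.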
Web-based data warehousing allows end users to
use Web browsers as the user interface to access
and manipulate data. Such applications can be Internet,
intranet, or extranet based. Web-based data warehousing
offers the following advantages:
1. In recent years, end users have become more experienced in using the Internet
for both business and leisure purposes. This makes the Web
browser an easy-to-use interface for users of all levels of computer skill.

2. Web-based data warehousing reduces establishment and management costs
by offering a thin-client solution. The thin-client solution moves most of the
application processing to the server; therefore, there is a reduced need for
hardware and software cost and support on the desktop. It brings the power of
many computers to one relatively simple desktop device connected to the network.
In many instances, pre-installation of software is not required, and future
upgrades and maintenance are performed only on the server, which serves as a
device with an enormous amount of resources.

3. The Internet provides a way to distribute data to a large number of users in a
low-cost and platform-independent environment.
Data Mining
Data mining forms the most crucial part of data
warehousing, as it acts like a catalyst that is
responsible for sifting gold out of useless
pebbles. It analyses all the information stored in
databases and extracts only the valuable data for
deriving productive results [2.8]. In data mining, the
data in a data warehouse are processed to
identify key factors and trends in historical
patterns of business activity. This can be used to
help managers make decisions about strategic
changes in business operations to gain
competitive advantages in the marketplace.
Ref. 2.8: Swasti Ohri, “Data Warehousing – Mining Holds the Key to Success,” ‘i.t’
Bureau, US. The Complete Magazine on Information Technology, Sept. 1999.
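
To make "identifying trends in historical patterns" concrete, here is a minimal sketch that fits a straight-line trend to each product line's sales history. Real data-mining tools use far richer techniques; this only illustrates the idea. The product names and quarterly figures are hypothetical.

```python
def linear_trend(values):
    """Least-squares slope of values against their position 0, 1, 2, ..."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two observations")
    mean_x = (n - 1) / 2
    mean_y = sum(values) / n
    # Covariance of position and value, and variance of position.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(values))
    var = sum((x - mean_x) ** 2 for x in range(n))
    return cov / var


# Hypothetical quarterly sales per product line.
history = {
    "laptops": [100, 110, 120, 130],
    "printers": [80, 75, 70, 65],
}

trends = {name: linear_trend(sales) for name, sales in history.items()}
for name, slope in trends.items():
    direction = "rising" if slope > 0 else "falling or flat"
    print(f"{name}: {slope:+.1f} per quarter ({direction})")
```

A positive slope flags a product line whose sales have been growing, and a negative slope flags one in decline. That is the kind of signal a manager might weigh when considering strategic changes.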
Managerial considerations for Data Resource Management
Because many different end users and a variety of application programs can
access the data in a database or data warehouse, those data are
considered a corporate resource. Hence it is necessary
to manage and control this corporate resource. The elements of data
resource management are briefly explained as follows:
1. Data Administration: This is a special organizational function for managing
the organisation’s data resources, concerned with information policy,
data planning, maintenance of data dictionaries, and data quality
standards. The fundamental principle of data administration is that all
data are the property of the organization as a whole. Data cannot belong
exclusively to any one business area or organizational unit. All data are
to be made available to any group that requires them to fulfill its mission.
Hence the organization needs to formulate an information policy that
specifies its rules for sharing, disseminating, acquiring, standardizing,
classifying, and inventorying information throughout the organization.
Information policy lays out specific procedures and accountabilities,
specifying which organizational units share information, where
information can be distributed, and who has responsibility for updating
and maintaining the information.
2. Data Planning and Modeling Methodology: The organizational
interests served by the DBMS are much broader than those in the
traditional file environment; therefore, the organization requires
enterprise-wide planning for data. Enterprise analysis, which
addresses the information requirements of the entire organisation (as
opposed to the requirements of individual applications), is needed to
develop databases. The purpose of enterprise analysis is to identify
the key entities, attributes, and relationships that constitute the
organisation's data.
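
The terms entity, attribute, and relationship can be illustrated with a very small schema. The sketch below is an assumption-laden example, not a model from the slides: the customer and purchase_order entities, their attributes, and the sample rows are hypothetical, and SQLite (through Python's sqlite3 module) stands in for the organisation's DBMS.

```python
import sqlite3

SCHEMA = """
CREATE TABLE customer (                 -- entity
    customer_id INTEGER PRIMARY KEY,    -- attribute (identifier)
    name        TEXT NOT NULL           -- attribute
);
CREATE TABLE purchase_order (           -- entity
    order_id    INTEGER PRIMARY KEY,
    order_date  TEXT NOT NULL,
    -- relationship: every order is placed by exactly one customer
    customer_id INTEGER NOT NULL REFERENCES customer(customer_id)
);
"""

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces relationships only when asked
conn.executescript(SCHEMA)

conn.execute("INSERT INTO customer VALUES (1, 'Acme Traders')")
conn.execute("INSERT INTO purchase_order VALUES (10, '2000-03-01', 1)")

# An order that points at a customer who does not exist violates the relationship.
try:
    conn.execute("INSERT INTO purchase_order VALUES (11, '2000-03-02', 99)")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

print("orphan order rejected:", orphan_rejected)
```

Declaring the relationship in the schema lets the DBMS, rather than each application program, guard the consistency of the organisation's data.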
3. Database Technology, Management and Users: Databases require
new software and a new staff specially trained in DBMS techniques,
as well as new management structures. Most corporations develop a
database design and management group within the corporate
information system division that is responsible for the more technical
and operational aspects of managing data. The functions it performs
are called database administration. This group does the following:
• Defines and organizes database structure and content
• Develops security procedures and safeguards the database
• Develops database documentation
• Maintains the database management software
In close cooperation with users, the design group
establishes the physical database, the logical relations
among elements, and the access rules and procedures.
A database serves a wider community of users than
traditional systems. Relational systems with fourth-generation
query languages permit employees who are not
computer specialists to access large databases. Users
include both trained computer specialists and non-specialists.
The database helps to optimize access for non-specialists,
and more resources must be devoted to training
end users. Professional systems workers must be retrained
in the DBMS language, DBMS application development
procedures, and new software practices.
Benefits and limitations of database management
• The database management approach provides managerial end
users with several important benefits. Database management reduces
the duplication of data and integrates data so that multiple programs and
users can access them. Programs are not dependent on the format of the
data and the type of secondary storage hardware being used. Users are
provided with an inquiry / response and reporting capability that allows them
to easily obtain information they need without having to write computer
programs. Computer programming is simplified, because programs are not
dependent on either the logical format of the data or their physical storage
location. Finally, the integrity and security of the data stored in databases
can be increased, since database management system software, a data
dictionary, and a database administrator function control access to data and
modification of the database.
• The limitations of database management arise from its increased
technological complexity. Developing large databases of complex types and
installing a DBMS can be difficult and expensive. More hardware capability
is required, since storage requirements for the organisation’s data, overhead
control data, and the DBMS programs are greater. Longer processing times
may result from these additional data and software. Finally, if an
organisation relies on centralized databases, its vulnerability to errors, fraud,
and failures is increased. Yet problems of inconsistency of data can arise if
a distributed database approach is used. Therefore, the security and
integrity of an organisation’s databases are major concerns of its
data resource management effort.
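
One of the benefits listed above is that programs do not depend on how the data are physically organised. The sketch below shows one common way to get that effect, a database view, using SQLite through Python's sqlite3 module. All table, view, and customer names are hypothetical, and a view is only one of several mechanisms a DBMS may offer for this purpose.

```python
import sqlite3


def customer_names(conn):
    """An 'application program' that depends only on the logical view."""
    return [row[0] for row in conn.execute(
        "SELECT name FROM customer_view ORDER BY name")]


conn = sqlite3.connect(":memory:")

# First storage layout: a single table, exposed through a view.
conn.executescript("""
CREATE TABLE customer_v1 (name TEXT, city TEXT);
INSERT INTO customer_v1 VALUES ('Alpha Stores', 'Mysore'), ('Beta Traders', 'Bangalore');
CREATE VIEW customer_view AS SELECT name, city FROM customer_v1;
""")
before = customer_names(conn)

# Reorganised layout: names and cities now live in separate tables.
conn.executescript("""
DROP VIEW customer_view;
CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE address (person_id INTEGER REFERENCES person(id), city TEXT);
INSERT INTO person (name) SELECT name FROM customer_v1;
INSERT INTO address
    SELECT p.id, c.city FROM person p JOIN customer_v1 c ON c.name = p.name;
DROP TABLE customer_v1;
CREATE VIEW customer_view AS
    SELECT p.name, a.city FROM person p JOIN address a ON a.person_id = p.id;
""")
after = customer_names(conn)

print(before == after)
```

The function customer_names is never edited, yet it keeps working after the tables are reorganised, because the database administrator redefined the view it reads from.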
Database Management System and
Its Significance in Management

Any Questions ?

Thank You !