- Information repository with knowledge discovery


Authors

Abstract

Organisations today are suffering from a malaise of data overflow. Developments in transaction processing technology have given rise to a situation where the amount and rate of data capture is very high, but the processing of this data into information that can be utilised for decision making is not developing at the same pace. Knowledge and data mining provide a technology that enables the decision-maker in the corporate sector or government to process this huge amount of data in a reasonable amount of time, and to extract intelligence/knowledge in near real time. The data warehouse, also called knowledge, allows the storage of data in a format that facilitates its access, but if the tools for deriving information and/or knowledge and presenting them in a format that is useful for decision making are not provided, the whole rationale for the existence of the warehouse disappears. Various technologies for extracting new insight from the data warehouse have come up, which we classify loosely as "Data Mining Techniques". Our paper focuses on the need for information repositories and the discovery of knowledge, and thence gives an overview of the much-hyped Data Warehousing and Data Mining.

Content Overview
• Introduction
• Warehouse with a database
• What is Data-Warehousing?
• Warehousing Functions
• Architecture Of Data Warehouse
• What is Data Mining?
• Warehousing and Mining
• Data Mining as a part of Knowledge Discovery
• Technologies used in Data Mining
• Goals of Data Mining and Knowledge Discovery
• Compendium
• Bibliography

Introduction

“Knowledge [no more Information] is not only power, but also has significant competitive advantage”

Organizations have lately realized that just processing transactions and/or information faster and more efficiently no longer provides them with a competitive advantage vis-à-vis their competitors for achieving business excellence. Information technology (IT) tools that are oriented towards knowledge processing can provide the edge that organizations need to survive and thrive in the current era of fierce competition. The increasing competitive pressures and the desire to leverage information technology techniques have led many organizations to explore the benefits of a newly emerging technology: "Data Warehousing and Data Mining". What is needed today is not just the latest information, updated to the nanosecond, but cross-functional information that can support decision making as an "on-line" process.

Evolution of Information Technology Tools

The evolution of information systems is characterized by a move from data maintenance systems to systems that transform the data into "information" for use in the decision making process. These systems supported the acquisition of information from the database of transactional data. The managerial knowledge acquisition function, however, was not directly supported by these systems. They could not directly reveal new patterns emerging in a changing scenario; the planner was supposed to do this from experience.

[Figure. Stages: Data Processing -> Information Processing -> Knowledge. Tool labels: Transactions Processing systems; Management; Data Mining Tools & On-Line Analytical Processing Tools.]

The Transformation of Data into Knowledge and associated tools.

Warehouse with a database

One thing that remains constant, especially in the corporate world, is “Change”.

These days, change is occurring at an ever-increasing rate. A key challenge is implementing an information infrastructure that allows your company to rapidly respond to change. One solution to this challenge is the data warehouse. Data warehousing is an information infrastructure based on detail data that supports the decision-making process and provides businesses the ability to access and analyze data to increase an organization's competitive advantage. Data warehousing is a process, not an off-the-shelf solution you buy: it is hardware, database and tools integrated into an evolving information infrastructure that changes with the dynamics of the business.

What is Data-Warehousing?

The data warehouse makes an attempt to figure out "what we need" before we know we need it.

What is it actually?

• A data warehouse stores current and historical data
• This data is taken from various, perhaps incompatible, sources and stored in a uniform format
• Several tools transform this data into meaningful business information for the purpose of comparisons, trends and forecasting
• Data in a warehouse is not updated or changed in any way, but is only loaded and accessed later on
• Data is organized according to subject instead of application.

In general a database is not a data warehouse unless it has the following two features:

• It collects information from a number of different disparate sources and is the place where this disparity is reconciled, and
• It allows several different applications to make use of the same information.
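
The points above (incompatible sources, a uniform format, load-only access) can be illustrated with a minimal Python sketch. Everything named here is hypothetical and chosen only for illustration: the two source layouts, the field names, and the `Warehouse` class are assumptions, not part of any real system described in this paper.

```python
from datetime import date, datetime

# Hypothetical extracts from two operational systems with incompatible layouts.
billing_rows = [{"cust": "C001", "amt": "120.50", "dt": "2024-01-15"}]
orders_rows = [{"customer_id": "c002", "total_cents": 9900, "order_date": "15/02/2024"}]


def from_billing(row):
    """Map a billing record onto the assumed uniform warehouse layout."""
    return {
        "customer_id": row["cust"].upper(),
        "amount": float(row["amt"]),
        "day": date.fromisoformat(row["dt"]),
        "source": "billing",
    }


def from_orders(row):
    """Map an order record onto the same layout, reconciling units and date format."""
    return {
        "customer_id": row["customer_id"].upper(),
        "amount": row["total_cents"] / 100,
        "day": datetime.strptime(row["order_date"], "%d/%m/%Y").date(),
        "source": "orders",
    }


class Warehouse:
    """Load-and-read store: rows are appended, never updated in place."""

    def __init__(self):
        self._rows = []

    def load(self, rows):
        self._rows.extend(rows)

    def rows(self):
        # An immutable copy, so every consuming application sees the same data.
        return tuple(self._rows)


warehouse = Warehouse()
warehouse.load(from_billing(r) for r in billing_rows)
warehouse.load(from_orders(r) for r in orders_rows)
```

The disparity between the sources is reconciled once, in the two mapping functions, so that any number of applications can read `warehouse.rows()` without knowing where a record came from.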

Conceptually, a Data Warehouse looks like this:

Information Sources always include the core operational systems, which form the backbone of day-to-day activities. It is these systems which have traditionally provided management information to support decision-making.

The Data Warehouse itself is the bridge between the operational systems and the decision support tools. It holds a copy of much of the operational system data in a logical structure which is more conducive to analysis. The Data Warehouse, which will be refreshed in scheduled bursts from operational systems and from relevant external data sources, provides a single, consistent view of corporate data, leaving operational systems unaffected.

Decision Support Tools are used to analyze the information stored in the warehouse, typically to identify trends and new business opportunities.

Data Warehouse Functions

The main function behind a data warehouse is to get the enterprise-wide data in a format that is most useful to end-users, regardless of their locations. Data warehousing is used for:

* Increasing the speed and flexibility of analysis.
* Providing a foundation for enterprise-wide integration and access.
* Improving or re-inventing business processes.
* Gaining a clear understanding of customer behavior.

Data Warehouse Architecture

Each implementation of a data warehouse is different in its detailed design (as shown in the figure below), but all are characterised by the following key components:

• A data model to define the warehouse contents.
• A carefully designed warehouse database, whether hierarchical, relational, or multidimensional. When choosing a DBMS, it must be kept in view that the database management system should be powerful enough to handle huge amounts of data, running up to terabytes.
• A front end for the Decision Support System (DSS) for reporting and for structured and unstructured analysis.
Data Mining

Database mining or data mining (DM), formally termed Knowledge Discovery in Databases, is a process that aims to use existing data to discover new facts and to uncover new relationships previously unknown even to experts thoroughly familiar with the data. It is based on the filtration and assaying of a mountain of data “ore” in order to get “nuggets” of knowledge. The data mining process is illustrated in the figure below.

[Figure: Data Sources and Data Warehouse -> Select -> Selected Data -> Transform -> Transformed Data -> Mine -> Extracted Information -> Assimilate -> Assimilated Information]

The Data Mining Process.
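
The four stages named in the figure (Select, Transform, Mine, Assimilate) can be sketched as a chain of small Python functions. This is only an illustrative sketch: the sample records, the choice of co-occurring item pairs as the "pattern", and the `min_count` threshold are all assumptions introduced here.

```python
from collections import Counter
from itertools import combinations

# Hypothetical warehouse rows: one purchase basket per row.
warehouse = [
    {"region": "north", "items": ["Bread", "milk"]},
    {"region": "north", "items": ["bread", "Milk", "eggs"]},
    {"region": "south", "items": ["tea"]},
    {"region": "north", "items": ["milk", "eggs"]},
]


def select(rows, region):
    """Select: pick the subset of warehouse data relevant to the question."""
    return [r for r in rows if r["region"] == region]


def transform(rows):
    """Transform: bring the selected data into a form suitable for mining."""
    return [frozenset(item.lower() for item in r["items"]) for r in rows]


def mine(baskets, min_count=2):
    """Mine: count item pairs and keep those seen at least min_count times."""
    pair_counts = Counter()
    for basket in baskets:
        pair_counts.update(combinations(sorted(basket), 2))
    return {pair: n for pair, n in pair_counts.items() if n >= min_count}


def assimilate(patterns):
    """Assimilate: turn extracted patterns into statements a decision-maker can use."""
    return [
        f"{a} and {b} were bought together {n} times"
        for (a, b), n in sorted(patterns.items())
    ]


findings = assimilate(mine(transform(select(warehouse, "north"))))
```

Each stage consumes only the output of the previous one, which mirrors the left-to-right flow of the figure.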

Data Mining and Data Warehousing

The goal of a data warehouse is to support decision making with data. Data mining can be used in conjunction with a data warehouse to help with certain types of decisions. Data mining can be applied to operational databases with individual transactions. To make data mining more efficient, the data warehouse should have an aggregated or summarized collection of data. Data mining helps in extracting meaningful new patterns that cannot necessarily be found by merely querying or processing data or metadata in the data warehouse.
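
To make the idea of an "aggregated or summarized collection of data" concrete, here is a minimal Python sketch that rolls individual transactions up to monthly totals per product. The transaction layout and the choice of month-by-product as the summary level are assumptions made for illustration.

```python
from collections import defaultdict

# Hypothetical individual transactions: (ISO date, product, amount).
transactions = [
    ("2024-01-03", "tea", 4.0),
    ("2024-01-19", "tea", 6.0),
    ("2024-02-02", "tea", 5.0),
    ("2024-01-07", "soap", 2.5),
]


def summarize_by_month(rows):
    """Aggregate transaction amounts into (month, product) totals."""
    totals = defaultdict(float)
    for day, product, amount in rows:
        month = day[:7]  # "YYYY-MM" prefix of the ISO date
        totals[(month, product)] += amount
    return dict(totals)


summary = summarize_by_month(transactions)
```

A mining tool that works on `summary` handles three rows instead of four here; on real transaction volumes the reduction is what makes repeated analysis affordable.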
operational databases with individual

Data Mining as a Part of the Knowledge Discovery Process

Knowledge Discovery in Databases, frequently abbreviated as KDD, typically encompasses more than data mining. The knowledge discovery process comprises five phases:

Selection: Create possible segmentation criteria for selecting data

Preprocessing: Normalize, rationalize, and cleanse the data

Transformation: Create general representation and tables of metadata

Extraction: Extract patterns from the data warehouse, turning data into knowledge

Interpretation and Evaluation: Evaluate utility of extracted and identified patterns
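
The five phases can be walked through on a toy data set. The sketch below is one possible reading, under stated assumptions: "normalize" is taken to mean min-max scaling, "cleanse" to mean dropping rows with missing values, the extracted "pattern" is a Pearson correlation between two fields, and the 0.8 usefulness threshold is arbitrary. The records and field names are hypothetical.

```python
import math

# Hypothetical customer records; one has a missing value.
records = [
    {"segment": "retail", "age": 25, "spend": 200.0},
    {"segment": "retail", "age": 40, "spend": None},
    {"segment": "retail", "age": 55, "spend": 800.0},
    {"segment": "corporate", "age": 45, "spend": 5000.0},
    {"segment": "retail", "age": 35, "spend": 400.0},
]


def selection(rows, segment):
    """Selection: keep only the rows that match a segmentation criterion."""
    return [r for r in rows if r["segment"] == segment]


def preprocessing(rows, fields=("age", "spend")):
    """Preprocessing: drop rows with missing values, then min-max scale each field."""
    clean = [dict(r) for r in rows if all(r[f] is not None for f in fields)]
    for f in fields:
        lo = min(r[f] for r in clean)
        hi = max(r[f] for r in clean)
        span = (hi - lo) or 1.0  # avoid division by zero for constant fields
        for r in clean:
            r[f] = (r[f] - lo) / span
    return clean


def transformation(rows, fields=("age", "spend")):
    """Transformation: build a plain table plus a small metadata record."""
    table = [tuple(r[f] for f in fields) for r in rows]
    metadata = {"fields": fields, "row_count": len(table)}
    return table, metadata


def extraction(table):
    """Extraction: here, the Pearson correlation between the two columns."""
    xs, ys = zip(*table)
    mean_x, mean_y = sum(xs) / len(xs), sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in table)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def interpretation(correlation, threshold=0.8):
    """Interpretation and Evaluation: is the pattern strong enough to act on?"""
    return abs(correlation) >= threshold


table, metadata = transformation(preprocessing(selection(records, "retail")))
correlation = extraction(table)
is_useful = interpretation(correlation)
```

Only the extraction step is data mining in the narrow sense; the other four functions are the surrounding work that KDD adds.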

Technologies Used in Data Mining:

Artificial neural networks: Non-linear predictive models that learn through training and resemble biological neural networks in structure.

Genetic algorithms: Optimization techniques that use processes such as genetic combination, mutation, and natural selection in a design based on the concepts of natural evolution.

Decision trees: Tree-shaped structures that represent sets of decisions. These decisions generate rules for the classification of a dataset.

Case Based reasoning (CBR): To forecast a situation, or to make a correct decision, such systems find the closest past analogs of the present situation and choose the same solution, which was the right one in those past situations. That is why this method is also called the nearest neighbor method.

Rule induction: The extraction of useful if-then rules from data based on statistical significance.

Data visualization: The visual interpretation of complex relationships in multidimensional data. Graphics tools are used to illustrate data relationships.
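
Of the techniques listed above, case based reasoning is the simplest to show in a few lines. The following Python sketch finds the closest past case and reuses its solution. The two numeric features, the Euclidean distance measure and the sample cases are assumptions made for illustration; real CBR systems typically use richer case descriptions and similarity measures.

```python
import math

# Hypothetical past cases: (feature vector, solution that proved right).
past_cases = [
    ((30.0, 2.0), "approve"),
    ((5.0, 9.0), "reject"),
    ((28.0, 3.0), "approve"),
]


def nearest_case(cases, query):
    """Return the stored case whose features are closest to the query."""
    return min(cases, key=lambda case: math.dist(case[0], query))


def decide(cases, query):
    """Reuse the solution of the nearest past case for the present situation."""
    _, solution = nearest_case(cases, query)
    return solution


decision = decide(past_cases, (27.0, 4.0))
```

Because the answer is copied from a single neighbour, the quality of the decision depends entirely on how well the stored cases cover the situations that arise.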

Goals of Data Mining and Knowledge Discovery

The goals of data mining fall into the following classes:

Prediction: Data mining can show how certain attributes within the data will behave in the future.

Identification: Data patterns can be used to identify the existence of an item, an event, or an activity.

Classification: Data mining can partition the data so that different classes or categories can be identified based on combinations of parameters.

Optimization: One eventual goal of data mining may be to optimize the use of limited resources such as time, space, money, or materials and to maximize output variables such as sales or profits under a given set of constraints.
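
As a small illustration of the prediction goal, the sketch below fits a straight-line trend to past values by ordinary least squares and extrapolates one step ahead. The monthly sales figures are invented for the example, and a linear trend is only one of many possible predictive models.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical history: sales for months 1 to 4.
months = [1, 2, 3, 4]
sales = [10.0, 12.0, 14.0, 16.0]

slope, intercept = fit_line(months, sales)
forecast = slope * 5 + intercept  # predicted sales for month 5
```

The same fitted attribute could feed the other goals, for example as one of the parameters used to classify customers or as an input to a resource optimization.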

Compendium

A data warehouse:
• Takes the organisation's operational data, historical data and external data
• Consolidates it into a separately designed database (which can be either relational or multi-dimensional in nature)
• Manages it into a format that is optimised for end users to access and analyse.

When a data warehouse has been constructed, it provides a complete picture of the enterprise. It provides an unparalleled opportunity to the management to learn about their customers. Data warehouse technology, together with online transaction processing and data mining, allows the management to provide better customer service, create greater customer loyalty and activity, focus customer acquisition and retention on the most profitable customers, increase revenue and reduce operating cost. It provides tools that facilitate sounder decision making, improves worker and management knowledge and productivity, spares the operational database from ad-hoc queries and the resulting performance degradation, and clears the legacy database system, while moving the corporate system architecture forward. With the incorporation of new data delivery and presentation techniques, such as Hypertext Markup Language (HTML) and Open Database Connectivity (ODBC), the database mining (Data & Text) operation has gained widespread recognition as a viable tool for business intelligence gathering. Advances in document mining technology (database mining of free-form text/data, in contrast to the “classical” approach to data mining of fixed-length records) are making data mining technology more powerful. Last but not least, the Internet has emerged as the largest data warehouse of unstructured and free-form data. The new technologies are geared towards mining this great data warehouse.

Bibliography

Using Information Technology by William Sawyers Hutchinson
Data Base System Concepts by Silberschatz, Korth and Sudarshan
Data Base Management Systems by Alexis Leon and Mathews Leon
http://www.technology-and-computers.com/
