
Data Mining

Definition

 Data mining, the extraction of hidden predictive information from large databases, is a powerful
new technology with great potential to help companies focus on the most important
information in their data warehouses.

Evolutionary Steps

 Data Collection (1960s)

 Data Access (1980s)

 Data Warehousing & Decision Support (1990s)

 Data Mining (Emerging Today)

Goals of Data Mining

 Prediction

 How certain attributes within the data will behave in the future.

 Identification

 Data patterns used to identify the existence of an item, an event or an activity.

 Classification

 Partitioning the data to identify different classes or patterns based on a combination of
parameters.

 Optimization

 Optimize the use of limited resources such as time, space, money, or material, and maximize
sales and profit under given constraints.

Prediction

 What customers will buy at a discount.

 How much sales value a store will generate in a given period.

 Whether deleting a sales line would yield more profit.

 Uses techniques such as regression and correlation.
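
As a rough illustration of the regression technique mentioned above, the sketch below fits a straight line by ordinary least squares and uses it to predict a future value. The discount and sales figures, and the function name fit_line, are made up for this example and are not taken from any real store.

```python
from statistics import mean

def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept

# Hypothetical data: discount offered (%) and units sold that week.
discounts = [0, 5, 10, 15, 20]
units_sold = [100, 110, 120, 130, 140]

slope, intercept = fit_line(discounts, units_sold)

# Predict units sold at a 25% discount (an extrapolation, so treat with care).
predicted = slope * 25 + intercept
print(slope, intercept, predicted)
```

Real sales data is rarely this clean, so a fitted line would normally be checked against held-out periods before being trusted for prediction.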


Identification

 Intruders trying to break into a computer system may be identified by the programs executed, files
accessed, and CPU time per session.

 The existence of a gene may be identified by certain sequences of nucleotide symbols present in the DNA
sequence.

 Authentication
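
One simple way to picture the intruder example above is to flag sessions whose CPU time is far from the typical value. The sketch below uses a z-score test. The session figures, the threshold of 2.0, and the function name flag_unusual_sessions are assumptions chosen for illustration, and a real intrusion detector would combine many more signals.

```python
from statistics import mean, stdev

def flag_unusual_sessions(cpu_seconds, threshold=2.0):
    """Return indexes of sessions whose CPU time deviates from the mean
    by more than `threshold` sample standard deviations."""
    mu = mean(cpu_seconds)
    sigma = stdev(cpu_seconds)
    if sigma == 0:
        return []  # all sessions identical, nothing stands out
    return [i for i, t in enumerate(cpu_seconds)
            if abs(t - mu) / sigma > threshold]

# Hypothetical CPU time per session, in seconds.
sessions = [12, 15, 11, 14, 13, 12, 16, 95]

suspicious = flag_unusual_sessions(sessions)
print(suspicious)
```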

Classification

 Customers can be classified as discount seekers, shoppers in a rush, loyal regular customers,
shoppers attached to name brands, etc.

 Classification can help in categorizing food as health food, party food, school lunch food etc.
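
To make the customer example above concrete, here is a minimal sketch of nearest-neighbor classification: a new customer receives the label of the most similar known customer. The two features, the labeled examples, and the function name nearest_neighbor are hypothetical.

```python
import math

def nearest_neighbor(train, query):
    """Return the label of the training point closest to `query` (1-NN)."""
    _, label = min(train, key=lambda row: math.dist(row[0], query))
    return label

# Hypothetical features: (share of purchases made on discount, visits per month).
train = [
    ((0.9, 2), "discount seeker"),
    ((0.8, 3), "discount seeker"),
    ((0.2, 12), "loyal regular"),
    ((0.1, 10), "loyal regular"),
]

label_a = nearest_neighbor(train, (0.85, 2))
label_b = nearest_neighbor(train, (0.15, 11))
print(label_a, label_b)
```

The two features here are on different scales, so in practice they would usually be normalized first so that visits per month does not dominate the distance.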

Levels of analysis

 Regression

 Decision Trees

 Nearest Neighbor Classification

 Neural Networks

 Rule Induction (if-then-else rules)
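
As a small sketch of how an if-then-else rule can be induced from data, the code below searches for the single threshold that best separates two classes. This is a one-rule "decision stump", the simplest building block shared by decision trees and rule induction, and not a full implementation of either. The data and the function name induce_threshold_rule are made up for illustration.

```python
def induce_threshold_rule(values, labels):
    """Find the threshold t for which the rule
    'if value > t then True else False' makes the fewest errors."""
    candidates = sorted(set(values))
    best_t, best_errors = None, len(values) + 1
    # Try the midpoint between each pair of adjacent distinct values.
    for low, high in zip(candidates, candidates[1:]):
        t = (low + high) / 2
        errors = sum((v > t) != y for v, y in zip(values, labels))
        if errors < best_errors:
            best_t, best_errors = t, errors
    return best_t, best_errors

# Hypothetical data: share of purchases made on discount, and whether
# the customer was labeled a discount seeker.
discount_share = [0.1, 0.2, 0.3, 0.7, 0.8, 0.9]
is_discount_seeker = [False, False, False, True, True, True]

threshold, errors = induce_threshold_rule(discount_share, is_discount_seeker)
print(f"if discount_share > {threshold} then discount seeker else other")
```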

Technological infrastructure required

 Depends on

 Size of the database

 Query complexity

 Relational database storage

 Extensive indexing capabilities

 Massively Parallel Processors (MPP)

Applications of Data Mining

 Marketing

 Consumer behavior based on buying pattern


 Determination of market strategy

 Targeted mailing

 Advertisement campaigns

 Design of catalogue

 Store layout

 Finance

 Analysis of trustworthiness of a client

 Segmentation of account receivables

 Performance analysis of financial investments like stocks, bonds & mutual funds

 Evaluation of financing options

 Fraud detection

 Manufacturing

 Optimization of resources like machine, manpower & material

 Optimal design of manufacturing processes

 Shop floor layout

 Product design

 Packaging design

 Healthcare

 Discovery of patterns in radiological images

 Analysis of microarray data to relate it to diseases

 Analysis of side effects of drugs.

 Effectiveness of certain treatments.

 Optimizing processes within hospitals.

 Relating patients' wellness data to doctors' qualifications.

 Web site personalization


 Credit card fraud detection

 SAS lie detector

 Market basket analysis
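
Analysis of buying patterns, as listed under Marketing, is commonly carried out as market basket analysis: counting which items appear together in the same purchase. The sketch below computes pair support and rule confidence. The baskets and the function names pair_support and confidence are hypothetical.

```python
from collections import Counter
from itertools import combinations

def pair_support(transactions):
    """Count how many transactions contain each unordered pair of items."""
    counts = Counter()
    for basket in transactions:
        for pair in combinations(sorted(set(basket)), 2):
            counts[pair] += 1
    return counts

def confidence(transactions, antecedent, consequent):
    """Estimate the fraction of baskets containing `antecedent`
    that also contain `consequent`."""
    with_antecedent = [b for b in transactions if antecedent in b]
    if not with_antecedent:
        return 0.0
    return sum(consequent in b for b in with_antecedent) / len(with_antecedent)

# Hypothetical shopping baskets.
transactions = [
    {"bread", "milk"},
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"milk", "eggs"},
]

support = pair_support(transactions)
bread_to_milk = confidence(transactions, "bread", "milk")
butter_to_bread = confidence(transactions, "butter", "bread")
print(support[("bread", "milk")], bread_to_milk, butter_to_bread)
```

Pairs with high support and high confidence are the usual candidates for decisions such as store layout or catalogue design.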
