
Data Mining

For
•Marketing
•Sales
•Customer Relationship Management
A producer wants to know….

•Which are our lowest/highest margin customers?
•Who are my customers and what products are they buying?
•What is the most effective distribution channel?
•Which customers are most likely to go to the competition?
•What product promotions have the biggest impact on revenue?
•What impact will new products/services have on revenue and margins?
Data Mining Evaluation

What is Data Mining?

•Interactive and iterative process
•To find underlying relationships and features in data
•Knowledge driven
•Enhances the value of existing information resources
•Predicts valuable information
•Data or knowledge discovery
•To predict future trends and behavior

Copyright KEYSOFT Solutions


What is Data Mining?
Other names for data mining:
•Information Harvesting
•Knowledge Mining
•Knowledge Discovery in Databases
•Data Dredging
•Data Archaeology
•Data Pattern Processing
•Database Mining
•Knowledge Extraction
•Siftware
The process of discovering meaningful new correlations, patterns, and trends by sifting
through large amounts of data, often previously unknown, using pattern recognition
technologies and statistical and mathematical techniques
(Thuraisingham 1998)
Knowledge Discovery Process
[Figure: the knowledge discovery process. Stage labels: Raw Data, Integration, Data Warehouse, Selection & Cleaning, Target Data, Transformation, Transformed Data, Data Mining, Patterns and Rules, Interpretation & Evaluation, Knowledge, Understanding]
Data mining is not a linear
process.
Data Mining is Not …

•Data warehousing
•SQL / Ad Hoc Queries / Reporting
•Software Agents
•Online Analytical Processing (OLAP)
•Data Visualization

Data Mining Applications

Data Mining Applications:
Retail
•Performing basket analysis
◦ Which items customers tend to purchase together. This knowledge can improve stocking, store layout strategies, and promotions.
•Sales forecasting
◦ Examining time-based patterns helps retailers make stocking decisions. If a customer purchases an item today, when are they likely to purchase a complementary item?
•Database marketing
◦ Retailers can develop profiles of customers with certain behaviors, for example, those who purchase designer-label clothing or those who attend sales. This information can be used to focus cost-effective promotions.
•Merchandise (goods) planning and allocation
◦ When retailers add new stores, they can improve merchandise planning and allocation by examining patterns in stores with similar demographic characteristics. Retailers can also use data mining to
Data Mining Applications:
Banking
•Card marketing
◦ By identifying customer segments, card issuers and acquirers can improve profitability with more effective acquisition and retention programs, targeted product development, and customized pricing.
•Cardholder pricing and profitability
◦ Card issuers can take advantage of data mining technology to price their products so as to maximize profit and minimize loss of customers. Includes risk-based pricing.
•Fraud detection
◦ Fraud is enormously costly. By analyzing past transactions that were later determined to be fraudulent, banks can identify patterns.
•Predictive life-cycle management
◦ DM helps banks predict each customer’s lifetime value and service each segment appropriately (for example, by offering special deals and discounts).
Data Mining Applications:
Telecommunication
•Call detail record analysis
◦ Telecommunication companies accumulate detailed call records. By identifying customer segments with similar use patterns, the companies can develop attractive pricing and feature promotions.
•Customer loyalty
◦ Some customers repeatedly switch providers, or “churn”, to take advantage of attractive incentives by competing companies. The companies can use DM to identify the characteristics of customers who are likely to remain loyal once they switch, thus enabling the companies to target their spending on customers who will produce the most profit.
Data Mining Applications:
Other Applications
•Customer segmentation
◦ All industries can take advantage of DM to discover discrete segments in their customer bases by considering additional variables beyond traditional analysis.
•Manufacturing
◦ Through choice boards, manufacturers are beginning to customize products for customers; therefore they must be able to predict which features should be bundled to meet customer demand.
•Warranties
◦ Manufacturers need to predict the number of customers who will submit warranty claims and the average cost of those claims.
•Frequent flier incentives
◦ Airlines can identify groups of customers that can be given incentives to fly more.
Data Mining in CRM
Every Company’s Big Unknown ... Customer Value

[Figure: customer value shown along three axes (Relationship Profitability, Number of Relationships, Relationship Duration), comparing Current Customer Value with Full Potential]
Customer Relationship Management Definition
[Figure: The Value of the Relationship. Vertical axis: Value ($); horizontal axis: Duration of Customer Relationship]
Targeting
•Who do we target
•What segments are most profitable
•What segments match our value proposition
•What is the best segmentation strategy for us / our industry

Acquisition
•What is the best channel for each segment
•What is the acquisition cost for a channel / segment
•Do certain channels deliver certain types of customers
•Cost-effective acquisition

Retention
•How can we improve retention
•What is our average customer relationship length
•How can we hold customers for as long as possible
•What is the most cost-effective method of retention

Expansion
•How many products does our average customer buy
•How can we induce our current base to buy more products
•Who are the prime targets for expansion
•What is the cost of expansion

Customer Relationship Management can be simply defined as everything involved with managing the customer relationship.
Data Mining in CRM:
Customer Life Cycle
•Customer Life Cycle
◦ The stages in the relationship between a customer and a business
•Key stages in the customer life cycle
◦ Prospects: people who are not yet customers but are in the target market
◦ Responders: prospects who show an interest in a product or service
◦ Active Customers: people who are currently using the product or service
◦ Former Customers: may be “bad” customers who did not pay their bills or who incurred high costs
•It’s important to know life cycle events (e.g. retirement)
Data Mining in CRM:
Customer Life Cycle
What marketers want: Increasing
customer revenue and customer
profitability
◦ Up-sell
◦ Cross-sell
◦ Keeping the customers for a longer
period of time
Solution: Applying data mining

Data Mining in CRM
DM helps to
◦ Determine the behavior
surrounding a particular lifecycle
event
◦ Find other people in similar life
stages and determine which
customers are following similar
behavior patterns

Importance of CRM

[Figure: CRM scope and depth]
Scope
•Customer management process threads: Marketing, Selling, Servicing
•Customer interaction channels: Broadcast, Mail, Field Personnel, Agents / Distributors, Call Center, Retail, Internet
•Back office process / systems
Depth
•Customer relationship strategies: Are we making the right level and type of marketing, sales, and service investments in each of our customer segments?
•Customer relationship structure: Are we taking a holistic approach to our customers across processes and channels?
•Customer relationship performance: Have we implemented best practices and technology in process / channel?

Data Mining in CRM (cont.)

[Figure: data mining in CRM. Labels: Data Warehouse, Data Mining, Customer Profile, Customer Life Cycle Info., Campaign Management]
What Tasks Can Be Performed
with Data Mining?
Classification
Estimation
Prediction
Association Rules
Clustering
Profiling
Data Mining Motivation

•Changes in the Business Environment
–Customers becoming more demanding
–Markets are saturated
•Decisions must be made rapidly
•Decisions must be made with maximum knowledge
•Databases are growing at an unprecedented rate
•Databases today are huge:
–More than 1,000,000 entities/records/rows
–From 10 to 10,000 fields/attributes/variables
Very Large Data Bases

•Terabytes -- 10^12 bytes: Walmart -- 24 Terabytes
•Petabytes -- 10^15 bytes: Geographic Information Systems
•Exabytes -- 10^18 bytes: National Medical Records
•Zettabytes -- 10^21 bytes: Weather images
•Yottabytes -- 10^24 bytes: Intelligence Agency Videos
Data Mining Motivation
“The key in business is to know something that nobody else knows.”
— Aristotle Onassis

PHOTO: LUCINDA DOUGLAS-MENZIES


PHOTO: HULTON-DEUTSCH COLL

“To understand is to perceive patterns.”
— Sir Isaiah Berlin

Data Mining Techniques

Clustering (Unsupervised Learning)

Clustering

•Unsupervised learning: used when historical data with class labels is not available
•Group/cluster existing customers based on the time series of their payment history, so that similar customers fall in the same cluster
•Key requirement: a good measure of similarity between instances
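One common choice of similarity measure is Euclidean distance, where a smaller distance means more similar instances. The sketch below is illustrative only: the customer names and the payment histories (days late on each of the last four bills) are made-up values, not data from the slides.

```python
import math

def euclidean_distance(a, b):
    """Distance between two equal-length numeric sequences (smaller = more similar)."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

# Hypothetical payment histories: days late on each of the last four bills.
payments = {
    "cust_a": [0, 0, 1, 0],
    "cust_b": [0, 1, 0, 0],
    "cust_c": [30, 45, 60, 30],
}

d_ab = euclidean_distance(payments["cust_a"], payments["cust_b"])
d_ac = euclidean_distance(payments["cust_a"], payments["cust_c"])
print(d_ab < d_ac)  # cust_a is closer to cust_b than to cust_c
```

In practice the features would usually be scaled first so that no single attribute dominates the distance.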
Clustering methods

Hierarchical clustering
◦agglomerative vs. divisive
◦single link vs. complete link

Partitional clustering
◦distance-based: K-means
◦model-based: GMM
◦density-based: DBSCAN
Clustering Techniques

Example
An insurance company groups similar customers based on various parameters, launches special promotional offers targeting each segment, and achieves a higher response rate.



Agglomerative Hierarchical clustering

Given: a matrix of similarity between every pair of points

Start with each point in a separate cluster and merge clusters based on some criterion:
◦Single link: merge the two clusters for which the minimum distance between two points, one from each cluster, is the smallest
◦Complete link: merge the two clusters for which the maximum distance between two points, one from each cluster, is the smallest
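The merge loop can be sketched in a few lines of Python. This is a naive illustration rather than an efficient implementation; it assumes a distance matrix (lower means more similar) instead of a similarity matrix, and the function names and the 1-D sample points are made up for the example.

```python
def linkage_distance(c1, c2, dist, link):
    """Cluster-to-cluster distance: min over point pairs (single) or max (complete)."""
    pair_dists = [dist[i][j] for i in c1 for j in c2]
    return min(pair_dists) if link == "single" else max(pair_dists)

def agglomerative(dist, n_clusters, link="single"):
    """Start with one cluster per point and merge the closest pair until n_clusters remain."""
    clusters = [[i] for i in range(len(dist))]
    while len(clusters) > n_clusters:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = linkage_distance(clusters[a], clusters[b], dist, link)
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]  # merge cluster b into cluster a
        del clusters[b]
    return [sorted(c) for c in clusters]

# Hypothetical 1-D points and their pairwise distance matrix.
points = [1.0, 2.0, 6.0, 7.0, 20.0]
dist = [[abs(p - q) for q in points] for p in points]

three = agglomerative(dist, n_clusters=3, link="single")
two = agglomerative(dist, n_clusters=2, link="complete")
print(three, two)
```

Stopping at a fixed number of clusters is one design choice; another is to keep merging and record the full merge hierarchy.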
Partitional methods: K-means

Criterion: minimize the sum of squared distances between each point and the centroid of its cluster.
Algorithm:
◦Randomly select K points as initial centroids
◦Repeat until stabilization:
  –Assign each point to the closest centroid
  –Generate new cluster centroids
◦Adjust clusters by merging/splitting
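A minimal sketch of the core loop, using only the standard library. The sample points, the `seed` parameter, and the helper names are assumptions for illustration, and the optional merge/split adjustment step is left out.

```python
import random

def squared_distance(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))

def mean_point(points):
    """Coordinate-wise mean of a non-empty list of points."""
    return tuple(sum(coords) / len(points) for coords in zip(*points))

def k_means(points, k, seed=0, max_iter=100):
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # randomly select K points as initial centroids
    assignment = None
    for _ in range(max_iter):
        # Assign each point to its closest centroid.
        new_assignment = [
            min(range(k), key=lambda j: squared_distance(p, centroids[j]))
            for p in points
        ]
        if new_assignment == assignment:  # stabilized: no point changed cluster
            break
        assignment = new_assignment
        # Recompute each centroid as the mean of its members.
        for j in range(k):
            members = [p for p, a in zip(points, assignment) if a == j]
            if members:
                centroids[j] = mean_point(members)
    return centroids, assignment

# Hypothetical 2-D data with two well-separated groups.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5), (8.0, 8.0), (9.0, 8.5), (8.5, 9.0)]
centroids, labels = k_means(points, k=2)
print(centroids, labels)
```

Because the result depends on the random starting centroids, K-means is often run several times with different seeds and the run with the lowest sum of squared distances is kept.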
K-Means

•Strength
–Easy to use.
–Efficient to calculate.

•Weakness
–Initialization problem
–Cannot handle clusters of different
densities.
–Restricted to data for which there is
a notion of a center/centroid.
Model-based methods: GMM

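Each data point is viewed as an observation from a mixture of Gaussian distributions:

$$P(x \mid \theta) = \sum_{j=1}^{K} w_j \, P_j(x \mid \theta_j), \quad \text{where } P_j(x \mid \theta_j) = \frac{1}{\sqrt{2\pi}\,\sigma_j} \, e^{-\frac{(x-\mu_j)^2}{2\sigma_j^2}}$$

The likelihood of the whole data set $X = \{x_1, \dots, x_m\}$ is

$$P(X \mid \theta) = \prod_{i=1}^{m} P(x_i \mid \theta)$$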
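To make the mixture formulas concrete, the sketch below evaluates the one-dimensional mixture density and the log of the data likelihood for fixed parameters. It does not fit the parameters (that is usually done with an algorithm such as EM); the weights, means, standard deviations, and data values are made-up assumptions.

```python
import math

def gaussian_pdf(x, mu, sigma):
    """Density of a 1-D Gaussian with mean mu and standard deviation sigma."""
    return math.exp(-((x - mu) ** 2) / (2 * sigma ** 2)) / (math.sqrt(2 * math.pi) * sigma)

def mixture_density(x, weights, mus, sigmas):
    """Weighted sum of the K component densities."""
    return sum(w * gaussian_pdf(x, m, s) for w, m, s in zip(weights, mus, sigmas))

def log_likelihood(xs, weights, mus, sigmas):
    """Log of the product over data points, computed as a sum of logs."""
    return sum(math.log(mixture_density(x, weights, mus, sigmas)) for x in xs)

# Hypothetical two-component mixture and data.
weights, sigmas = [0.5, 0.5], [1.0, 1.0]
data = [-0.2, 0.1, 9.8, 10.3]

good_fit = log_likelihood(data, weights, [0.0, 10.0], sigmas)
poor_fit = log_likelihood(data, weights, [5.0, 5.0], sigmas)
print(good_fit > poor_fit)  # means near the data give a higher likelihood
```

Working with the log-likelihood avoids numerical underflow when the product runs over many data points.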
Model-based methods

•Strength
–More general than K-means
–Better representation of cluster
–Satisfy the statistical assumptions
•Weakness
–Parameter estimation is computationally expensive
–Choosing the right model is difficult
–Sensitive to noise and outliers
Density based method: DBSCAN

Given the radius Eps and the threshold MinPts:

Core point: the number of points within the neighborhood of the point, defined by Eps, exceeds the threshold MinPts.

Border point: not a core point, but within the neighborhood of a core point.

Outlier: neither a core point nor a border point.

Density based method: DBSCAN

1.Label all points as core, border, or outlier points.
2.Eliminate outlier points.
3.Put an edge between all core points that are within Eps of each other.
4.Make each group of connected core points into a separate cluster.
5.Assign each border point to one of the clusters of its associated core points (ties may need to be resolved).
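The five steps can be sketched directly in Python. This is a naive version that compares all pairs of points; it follows the slide's wording that a core point's neighbor count must exceed MinPts (many implementations use "at least" instead), counts the point itself as part of its own neighborhood, and breaks ties for border points by taking the first core neighbor. The sample points are made up.

```python
import math

def dbscan(points, eps, min_pts):
    n = len(points)
    # Neighborhood of each point (includes the point itself).
    neighbors = [
        [j for j in range(n) if math.dist(points[i], points[j]) <= eps]
        for i in range(n)
    ]
    # Step 1: a core point has more than min_pts points in its neighborhood.
    core = [len(neighbors[i]) > min_pts for i in range(n)]
    labels = [None] * n  # None marks an outlier (step 2: outliers are never assigned)
    cluster_id = 0
    # Steps 3-4: connected groups of core points within eps form clusters.
    for i in range(n):
        if not core[i] or labels[i] is not None:
            continue
        labels[i] = cluster_id
        stack = [i]
        while stack:
            p = stack.pop()
            for q in neighbors[p]:
                if core[q] and labels[q] is None:
                    labels[q] = cluster_id
                    stack.append(q)
        cluster_id += 1
    # Step 5: a border point joins the cluster of its first core neighbor.
    for i in range(n):
        if labels[i] is None:
            core_neighbors = [q for q in neighbors[i] if core[q]]
            if core_neighbors:
                labels[i] = labels[core_neighbors[0]]
    return labels

# Hypothetical data: two dense squares, one border point, one far-away outlier.
points = [(0, 0), (0, 1), (1, 0), (1, 1), (2.2, 1),
          (10, 10), (10, 11), (11, 10), (11, 11), (50, 50)]
labels = dbscan(points, eps=1.5, min_pts=3)
print(labels)
```

The all-pairs neighbor search is what makes the method expensive on large data sets; spatial indexes are the usual remedy.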
Density-based methods

•Strength
–Relatively resistant to noise.
–Handle clusters of arbitrary shapes
and sizes.

•Weakness
–Problem with clusters having widely
varying densities.
–Density is more difficult to define
with high-dimensional data.
–Expensive in calculating all pairwise
proximities.
Why clustering?

A few good reasons ...

•Simplifications
•Pattern detection
•Useful in data concept construction
•Unsupervised learning process

Amazon.com Recommendations: Item-to-Item Collaborative Filtering
Classification (Supervised Learning)

How Models are built and
used?
Data Mining Process
Classification in Large Databases

•Classification: a classical problem extensively studied by statisticians and machine learning researchers
•Scalability: classifying data sets with millions of examples and hundreds of attributes with reasonable speed
Presentation of Classification Results

05/06/11 Data Mining: Concepts and Techniques


Visualization of a Decision Tree in SGI/MineSet 3.0



Interactive Visual Mining by Perception-Based
Classification (PBC)



SVM—Support Vector Machines
•A new classification method for both linear and
nonlinear data
•It uses a nonlinear mapping to transform the original
training data into a higher dimension
•With the new dimension, it searches for the linear
optimal separating hyperplane (i.e., “decision
boundary”)
•With an appropriate nonlinear mapping to a
sufficiently high dimension, data from two classes
can always be separated by a hyperplane
•SVM finds this hyperplane using support vectors
(“essential” training tuples) and margins (defined by
the support vectors)
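The idea of mapping to a higher dimension can be illustrated with a tiny example. The 1-D data below cannot be split by a single threshold, but after the assumed mapping x -> (x, x^2) a straight line separates the classes. The hyperplane here is hand-picked for the illustration and is not the output of an SVM solver; the data, the mapping `phi`, and the values of `w` and `b` are all assumptions.

```python
import math

def phi(x):
    """Hypothetical nonlinear mapping from 1-D input to a 2-D feature space."""
    return (x, x * x)

def decision_value(w, b, z):
    """Signed score w . z + b; its sign gives the predicted class."""
    return sum(wi * zi for wi, zi in zip(w, z)) + b

# Class +1 lies on both sides of class -1, so no single threshold on x works.
xs = [-3, -2, -1, 0, 1, 2, 3]
ys = [+1, +1, -1, -1, -1, +1, +1]

# Hand-picked separating hyperplane in feature space: x^2 = 2.5.
w, b = (0.0, 1.0), -2.5

norm_w = math.hypot(*w)
distances = [abs(decision_value(w, b, phi(x))) / norm_w for x in xs]
margin = min(distances)  # distance from the hyperplane to the closest points
predictions = [1 if decision_value(w, b, phi(x)) > 0 else -1 for x in xs]
support_vectors = [x for x, d in zip(xs, distances) if math.isclose(d, margin)]
print(predictions == ys, margin, support_vectors)
```

The points closest to the hyperplane play the role of support vectors: moving any other point slightly would not change the boundary.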

SVM—General Philosophy

Small Margin Large Margin


Support Vectors

SVM—Margins and Support
Vectors



Classification Techniques

Predictive Models
•Neural Nets
•Rule Induction
•Linear and Logistic Regression

Example
A credit card company can rate customers with a predictive model before issuing a card. Applications that the model rates low are rejected.



Lazy Learner: Instance-Based Methods

•Instance-based learning:
◦ Store training examples and delay the processing (“lazy evaluation”) until a new instance must be classified
•Typical approaches
◦ k-nearest neighbor approach: instances are represented as points in a Euclidean space
◦ Locally weighted regression: constructs a local approximation
◦ Case-based reasoning: uses symbolic representations and knowledge-based inference
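A k-nearest neighbor classifier shows the lazy approach in miniature: nothing is learned up front, and all the work happens when a query arrives. The training points, the labels, and the function name below are made up for illustration.

```python
import math
from collections import Counter

def knn_classify(training, query, k=3):
    """Label a query point by majority vote among its k nearest training points."""
    nearest = sorted(training, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Hypothetical stored examples: (feature vector, class label).
training = [
    ((1.0, 1.0), "low risk"), ((1.2, 0.8), "low risk"), ((0.9, 1.3), "low risk"),
    ((5.0, 5.2), "high risk"), ((5.5, 4.8), "high risk"), ((4.9, 5.1), "high risk"),
]

print(knn_classify(training, (1.1, 1.0)))
print(knn_classify(training, (5.1, 5.0)))
```

Sorting the whole training set for every query is simple but slow; larger data sets typically rely on an index structure for the neighbor search.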



www.alibaba.com
Association Rules (Unsupervised Learning)
Association rules

•Input: a set of transactions, each a group of items
•Goal: find all rules on itemsets of the form a-->b such that
–Support of a and b > threshold s
–Confidence (conditional probability) of b given a > threshold c

Transactions:
1. milk, cereal, bread
2. tea, milk, bread
3. milk, rice
4. cereal

•Example: milk --> bread
Support(milk, bread) = 2/4
Confidence(milk --> bread) = 2/3
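The support and confidence figures for the milk --> bread example can be checked with a short script. The four transactions are the ones shown on the slide; the function names are illustrative.

```python
from fractions import Fraction

# The four transactions from the slide.
transactions = [
    {"milk", "cereal", "bread"},
    {"tea", "milk", "bread"},
    {"milk", "rice"},
    {"cereal"},
]

def support(itemset, transactions):
    """Fraction of transactions that contain every item in the itemset."""
    hits = sum(1 for t in transactions if itemset <= t)
    return Fraction(hits, len(transactions))

def confidence(antecedent, consequent, transactions):
    """Estimated conditional probability of the consequent given the antecedent."""
    return support(antecedent | consequent, transactions) / support(antecedent, transactions)

print(support({"milk", "bread"}, transactions))       # 1/2, i.e. 2/4
print(confidence({"milk"}, {"bread"}, transactions))  # 2/3
```

Algorithms such as Apriori avoid computing these values for every possible itemset by pruning itemsets whose subsets already fall below the support threshold.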
Modeling Techniques

Association Rules
•Apriori
•GRI

Example
A leading supermarket applies association techniques to its transaction database, finds the items that are often purchased together, and comes up with new bundled offers to promote its slow-selling items.



What Is Prediction?
•(Numerical) prediction is similar to classification
–construct a model
–use the model to predict a continuous or ordered value for a given input
•Prediction is different from classification
–Classification predicts categorical class labels
–Prediction models continuous-valued functions
•Major method for prediction: regression
–model the relationship between one or more independent (predictor) variables and a dependent (response) variable
•Regression analysis
–Linear and multiple regression
–Non-linear regression
–Other regression methods: generalized linear model, Poisson regression, log-linear models, regression trees
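As a small illustration of numerical prediction, the sketch below fits a straight line by ordinary least squares with one predictor variable and uses it to predict a new value. The variable names and the data (which lie exactly on y = 2x + 1) are made up.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept with one predictor."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical data: advertising spend (predictor) and revenue (response).
ad_spend = [1.0, 2.0, 3.0, 4.0]
revenue = [3.0, 5.0, 7.0, 9.0]

slope, intercept = fit_line(ad_spend, revenue)
predicted = slope * 5.0 + intercept  # predict the response for a new input
print(slope, intercept, predicted)
```

Multiple regression extends the same idea to several predictor variables, usually solved with matrix methods rather than these closed-form sums.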

Prediction: Numerical Data



Prediction: Categorical Data



Problems of Data Mining Tools
Difficult to use
Need an expert to run the tool
Difficult to add new functionality
Difficult to interface with other systems
Limited number of algorithms
Need a lot of resources
www.KDnuggets.com
Data Mining Software Guide

Finally ….

Data mining lets you be proactive rather than retrospective



Thank You All
For Your
Attention

ASHRAF_SA77@YAHOO.COM
