Association Rules
Finds interesting association or correlation
relationships among large sets of data
Business Decision Making
Example: Market Basket Analysis
Items likely to be purchased
Advertising strategy, Catalog Design, Store
layout
Association Rules
Forming association rules
Universe: the set of all items
Boolean vector: each basket is encoded as one
Boolean per item (present or absent)
Example:
Computer ⇒ Accounting_Software
[support = 5%, confidence = 60%]
Minimum support and Confidence threshold
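A minimal sketch of how a rule such as Computer ⇒ Accounting_Software could be scored. The five baskets, the item names and the helper functions are made up for illustration; support is taken here as the fraction of baskets containing every item in the rule, and confidence as the fraction of baskets with the left-hand side that also contain the right-hand side.

```python
# Toy data: item names and baskets are hypothetical, chosen only for illustration.
ITEMS = ["computer", "accounting_software", "printer"]

transactions = [
    {"computer", "accounting_software"},
    {"computer", "printer"},
    {"computer", "accounting_software", "printer"},
    {"printer"},
    {"computer"},
]


def to_boolean_vector(transaction, items=ITEMS):
    """Encode a basket as one Boolean per item in the universe."""
    return [item in transaction for item in items]


def support(itemset, baskets):
    """Fraction of baskets that contain every item in itemset."""
    hits = sum(1 for basket in baskets if itemset <= basket)
    return hits / len(baskets)


def confidence(antecedent, consequent, baskets):
    """Fraction of baskets containing antecedent that also contain consequent."""
    return support(antecedent | consequent, baskets) / support(antecedent, baskets)


vectors = [to_boolean_vector(t) for t in transactions]
rule_support = support({"computer", "accounting_software"}, transactions)
rule_confidence = confidence({"computer"}, {"accounting_software"}, transactions)

print(vectors[0])
print(f"support = {rule_support:.0%}, confidence = {rule_confidence:.0%}")
```

A rule would then be reported only if both values reach the chosen minimum support and minimum confidence thresholds.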
Basic Concepts
I = {i1, i2, ..., im}: set of items
D: set of database transactions
T: a transaction, containing a set of items such that T ⊆ I
Association rule: A ⇒ B, where A ⊂ I, B ⊂ I and A ∩ B = ∅
support(A ⇒ B) = P(A ∪ B)
confidence(A ⇒ B) = P(B | A)
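Itemset: a set of items
k-itemset: an itemset that contains k items
Occurrence frequency of an itemset: the number of
transactions that contain the itemset
Also called frequency, support count (absolute
support) or count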
Itemset satisfies minimum support when count >=
min_sup * number of transactions in D
Minimum support count: min_sup * number of
transactions in D
Frequent itemset: an itemset that satisfies minimum
support
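The definitions above can be sketched in a few lines. The transaction database D, the item labels and the function names below are illustrative assumptions, and min_sup is given as a fraction of the transactions.

```python
import math

# Hypothetical transaction database; each transaction is a set of items.
D = [
    {"i1", "i2", "i5"},
    {"i2", "i4"},
    {"i2", "i3"},
    {"i1", "i2", "i4"},
    {"i1", "i3"},
]


def support_count(itemset, transactions):
    """Absolute support: number of transactions that contain the itemset."""
    return sum(1 for t in transactions if itemset <= t)


def min_support_count(min_sup, transactions):
    """Smallest whole count c with c >= min_sup * number of transactions."""
    return math.ceil(min_sup * len(transactions))


def is_frequent(itemset, transactions, min_sup):
    """An itemset is frequent when its count reaches min_sup * |D|."""
    return support_count(itemset, transactions) >= min_sup * len(transactions)


pair = {"i1", "i2"}  # a 2-itemset (k = 2)
print(len(pair), support_count(pair, D), min_support_count(0.4, D))
print(is_frequent(pair, D, 0.4), is_frequent({"i1", "i5"}, D, 0.4))
```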
Association Rule Mining Process
Find all frequent itemsets
Generate strong association rules from frequent
itemsets
Satisfy Minimum Support and Minimum
Confidence
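The two steps can be illustrated with a brute-force sketch that enumerates every possible itemset. This is only practical for tiny examples and is not an efficient mining algorithm; the baskets, thresholds and function names are assumptions made for illustration.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_count):
    """Step 1 (brute force): enumerate every itemset and keep the frequent ones."""
    items = sorted(set().union(*transactions))
    result = {}
    for k in range(1, len(items) + 1):
        for combo in combinations(items, k):
            candidate = frozenset(combo)
            count = sum(1 for t in transactions if candidate <= t)
            if count >= min_count:
                result[candidate] = count
    return result


def strong_rules(frequent, min_conf):
    """Step 2: split each frequent itemset into A => B and keep confident rules."""
    rules = []
    for itemset, count in frequent.items():
        for r in range(1, len(itemset)):
            for combo in combinations(sorted(itemset), r):
                antecedent = frozenset(combo)
                # Every subset of a frequent itemset is itself frequent,
                # so its count is already stored in `frequent`.
                conf = count / frequent[antecedent]
                if conf >= min_conf:
                    rules.append((set(antecedent), set(itemset - antecedent), conf))
    return rules


# Hypothetical baskets, minimum support count of 2, minimum confidence of 70%.
baskets = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]
frequent = frequent_itemsets(baskets, min_count=2)
rules = strong_rules(frequent, min_conf=0.7)
for antecedent, consequent, conf in rules:
    print(antecedent, "=>", consequent, f"confidence = {conf:.0%}")
```

Because the rules are built only from itemsets that already passed the support check, step 2 needs to test confidence alone.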
Itemsets
Complete Itemsets
Closed Frequent Itemset
X is closed in a data set S if there exists no proper
super-itemset Y such that Y has the same support
count as X in S
X is a closed frequent itemset if it is both closed
and frequent in S
Maximal Frequent Itemset
X is frequent and there exists no super-itemset Y
such that X ⊂ Y and Y is frequent in S
Example:
T = { {a1, a2, ..., a100}, {a1, a2, ..., a50} }, min_sup = 1
Closed frequent itemsets: both {a1, a2, ..., a100}: 1
and {a1, a2, ..., a50}: 2
Maximal frequent itemset: {a1, a2, ..., a100}
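A scaled-down version of this example can be checked by brute force. The sketch below assumes two transactions, {a1, a2, a3, a4} and {a1, a2}, with a minimum support count of 1; the function names are illustrative.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_count):
    """Brute-force enumeration of all frequent itemsets with their counts."""
    items = sorted(set().union(*transactions))
    result = {}
    for k in range(1, len(items) + 1):
        for combo in combinations(items, k):
            candidate = frozenset(combo)
            count = sum(1 for t in transactions if candidate <= t)
            if count >= min_count:
                result[candidate] = count
    return result


def closed_and_maximal(frequent):
    """Return (closed frequent itemsets, maximal frequent itemsets)."""
    closed, maximal = [], []
    for x, count in frequent.items():
        # Only frequent proper supersets matter: a superset with the same
        # count as a frequent X would be frequent too.
        supersets = [y for y in frequent if x < y]
        if all(frequent[y] != count for y in supersets):
            closed.append(x)
        if not supersets:
            maximal.append(x)
    return closed, maximal


T = [{"a1", "a2", "a3", "a4"}, {"a1", "a2"}]
frequent = frequent_itemsets(T, min_count=1)
closed, maximal = closed_and_maximal(frequent)
print(len(frequent), "frequent itemsets")
print("closed:", [(sorted(x), frequent[x]) for x in closed])
print("maximal:", [sorted(x) for x in maximal])
```

All 15 non-empty subsets of the longer transaction are frequent here, yet only two itemsets are closed and one is maximal, which is why these compact forms are useful.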
li[j]: the j-th item in li
Members of Lk-1 are joinable if their first (k-2)
items are common
Members l1 and l2 of Lk-1 are joinable if
(l1[1] = l2[1]) ∧ (l1[2] = l2[2]) ∧ ... ∧
(l1[k-2] = l2[k-2]) ∧ (l1[k-1] < l2[k-1])
Resulting itemset is l1[1], l1[2], ..., l1[k-1], l2[k-1]
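The join condition can be sketched as follows, assuming each itemset is stored as a tuple of items in sorted order. The set L2 and the function names are hypothetical, chosen for illustration.

```python
def join(l1, l2):
    """Join two sorted (k-1)-itemsets if they are joinable, else return None.

    Joinable means the first k-2 items agree and the last item of l1
    is smaller than the last item of l2 (which avoids duplicates).
    """
    if l1[:-1] == l2[:-1] and l1[-1] < l2[-1]:
        return l1 + (l2[-1],)
    return None


def join_step(L_prev):
    """Build candidate k-itemsets by joining L(k-1) with itself."""
    candidates = []
    for l1 in L_prev:
        for l2 in L_prev:
            joined = join(l1, l2)
            if joined is not None:
                candidates.append(joined)
    return candidates


# Hypothetical frequent 2-itemsets.
L2 = [
    ("I1", "I2"), ("I1", "I3"), ("I1", "I5"),
    ("I2", "I3"), ("I2", "I4"), ("I2", "I5"),
]
C3 = join_step(L2)
for candidate in C3:
    print(candidate)
```

The strict inequality on the last item means each candidate is generated once, from exactly one ordered pair.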
Prune Step
Ck is a superset of Lk: its members may or may not
be frequent
Determine the count of each candidate in Ck
To reduce the size of Ck: a candidate with any
(k-1)-subset that is not in Lk-1 cannot be frequent
and can be removed from Ck
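A minimal sketch of the prune step, assuming itemsets are stored as sorted tuples. The sets L2 and C3 and the function names are hypothetical, chosen for illustration.

```python
from itertools import combinations


def has_infrequent_subset(candidate, L_prev):
    """True if some (k-1)-subset of the candidate is missing from L(k-1)."""
    k = len(candidate)
    # combinations() of a sorted tuple yields sorted tuples, so the
    # subsets can be looked up directly in the set of (k-1)-itemsets.
    return any(subset not in L_prev for subset in combinations(candidate, k - 1))


def prune(candidates, L_prev):
    """Drop every candidate that has an infrequent (k-1)-subset."""
    L_prev = set(L_prev)
    return [c for c in candidates if not has_infrequent_subset(c, L_prev)]


# Hypothetical frequent 2-itemsets and the 3-item candidates joined from them.
L2 = [
    ("I1", "I2"), ("I1", "I3"), ("I1", "I5"),
    ("I2", "I3"), ("I2", "I4"), ("I2", "I5"),
]
C3 = [
    ("I1", "I2", "I3"), ("I1", "I2", "I5"), ("I1", "I3", "I5"),
    ("I2", "I3", "I4"), ("I2", "I3", "I5"), ("I2", "I4", "I5"),
]
pruned_C3 = prune(C3, L2)
print(pruned_C3)
```

Only the surviving candidates need to be counted against the database, which is where pruning saves work.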