SAK 5609

DATA MINING
Assoc. Prof. Dr. Md. Nasir bin Sulaiman
nasir@fsktm.upm.edu.my
03-89466507
012-6323430
Synopsis
Credit: 3 (3+0)
Contact hours: 3 x 1 hour per week
Semester: I
This course emphasizes the concepts of data mining. It covers
the principles of data mining, data mining functions,
data mining processes, and data mining techniques
such as K-nearest neighbour, clustering
algorithms, rule induction, decision tree
algorithms, association rule mining, neural
networks and genetic algorithms, together with data mining
examples. Industrial and scientific applications
will also be presented.
Assessment & References
Assessment:
Exercises (10%)
Project I (15%) + presentation I (5%) Week 7
Project II (15%) + presentation II (5%) Week 14
Mid-exam (20%, 1 hour) Week 6
Final exam (30%, 1.5 hours) Weeks 15-17

References:
Jiawei Han & Micheline Kamber (2001), Data Mining: Concepts
and Techniques, Morgan Kaufmann.
Michael J. A. Berry & Gordon S. Linoff (2004), Data Mining
Techniques (2nd edition), Wiley.
Other related articles


Course Contents
Chapter 1 Introduction
Motivation
Origin of data mining
What it is / isn't
The KDD process
Types of data
Data mining tasks
Association rule mining, sequential rules, clustering,
classification, anomaly detection
Chapter 2 Data issues
What is a data set?
Types of attributes
Transformation for different types
Types of data
Structured data, record data, data matrix, document
data, transaction data, graph data, ordered data
Data quality
Noise and outliers, missing values,
inconsistent/duplicate data
Chapter 3 Data preprocessing
Why Data Preprocessing?
Why Is Data Preprocessing Important?
Major Tasks in Data Preprocessing
Data Cleaning
Data integration
Data transformation
Data reduction
Data discretization
Chapter 4 Association rule mining
Introduction
The Model
Goal and Key Features
Mining Algorithms
Problems with the Association Rule Model
Issues of association rules
Other Main Works on Association Rules
Chapter 5 Classification
Overview
An example application
Definition
Classification Model
General Approach
Classification: A Two-Step Process
Classification Techniques
Evaluating classification methods
Decision tree based classification, rule-based classifiers, nearest
neighbor classifiers, etc.
Chapter 6 Clustering
Introduction
What is/is not cluster analysis?
Examples of clustering applications
Concepts of clustering
Types of data in clustering analysis
Types of clustering: hierarchical, partitional
Major Clustering Techniques
Types of clusters
Clustering algorithms
Chapter 7 Anomaly Detection
Applications
Causes of anomalies
Approaches to anomaly detection
Statistical
Proximity-based outlier detection
Density-based outlier detection
Clustering-based techniques
Issues dealing with anomalies

Chapter 8 Visualization
What is visualization?
Motivation for visualization
General categories of visualization
Representation
Arrangement
Selection
Dos and don'ts
Visualization techniques
Chapter 9 Text mining, web mining
Introduction
Text processing
Relevance judgement
Web Search
Search engines