JAI-HIND COLLEGE
sushiltry@yahoo.co.in
Social Networks : Example
Technology used
What is Data Mining?
DM Process & Example
DM Queries
DM Tasks and Methods
Relation & Data Warehouse
What is ETL?
Data Preprocessing
What is a Network?
[Figure: nodes connected by links]
A social network is a social structure of people, related (directly or indirectly) to each other through a common relation or interest.
Social Network Analysis
Social network analysis [SNA] is the mapping and measuring
of relationships and flows between people, groups,
organizations, computers or other information/knowledge
processing entities.
The nodes in the network are the people and groups while the
links show relationships or flows between the nodes.
A shift in approach: from ‘synthesis’ to
‘analysis’
Problems
• High cost of manual surveys
• Survey bias
- Perceptions of individuals may be incorrect
• Logistics
- Organizations are now spread across several countries.
[Figure: cognitive networks for A, B and C built from employee surveys (synthesis), contrasted with electronic communication (email analysis, web logs)]
Email
Blogs
Social networking software such as Orkut, Facebook, Flickr, etc.
SOCIAL NETWORK:
Profile & Platforms
USENET
Social Community
SOCIAL NETWORK: Growth
SOCIAL NETWORK : Growth Rate
Technology :
What is Your Network?
- When your connections invite their connections, your
Network starts to grow.
- Your Network is your connections, their connections, and
so on out from you at the center.
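The idea of a network growing outward from you can be sketched as a breadth-first search over a friendship graph. This is only an illustration: the names, the `connections` dictionary and the helper `network_by_degree` are hypothetical and are not part of the slides.

```python
from collections import deque

# Hypothetical friendship graph: each person maps to their direct connections.
connections = {
    "ME": ["Asha", "Ravi"],
    "Asha": ["ME", "Meena"],
    "Ravi": ["ME", "Meena", "John"],
    "Meena": ["Asha", "Ravi", "Sunil"],
    "John": ["Ravi"],
    "Sunil": ["Meena"],
}

def network_by_degree(graph, center):
    """Breadth-first search: map each reachable person to their distance from center."""
    degree = {center: 0}
    queue = deque([center])
    while queue:
        person = queue.popleft()
        for friend in graph.get(person, []):
            if friend not in degree:          # first visit gives the shortest distance
                degree[friend] = degree[person] + 1
                queue.append(friend)
    return degree

degrees = network_by_degree(connections, "ME")
print(degrees)
```

Here degree 1 holds your own connections, degree 2 their connections, and so on out from the center.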
[Figure: ME at the center, connected to FRIENDs]
ON ANY SOCIAL NETWORK
Profile of YOU and of each FRIEND:
Name
Gender
Age
Birth date/Home town
School attended
Interests/Hobbies
Photos
Friends
Activities
Audio clips
Video clips
After making a friend, I can access his/her friends, audios, videos, and share information.
A friend may be from any remote site.
SOCIAL NETWORK : Visualization
Between friends: How many of them?
Male vs. Female
Young vs. Old
Coffee Friends, Chocolate Friends
HOW MANY OF PRASHANT DAMLE'S FRIENDS LIKE [image not recovered]?
HOW MANY OF MADHURI DIXIT'S FRIENDS LIKE [image not recovered]?
FRIENDS OF A FRIENDS OF A FRIEND
SHOULD KNOW
How many friends use a social network
regularly?
How many friends send messages
frequently?
What is the mood of your friend list?
How many friends are vegetarian?
How many friends are close to or far from
you?
How many friends studied or are studying in
your school?
FRIENDS OF A FRIENDS OF A FRIEND
SHOULD KNOW
INTERESTING PATTERNS
FROM UNKNOWN DATA
DEFINE DATA MINING
Data Mining is:
Grow connections.
            Database                 Data Mining
Query       Well defined             Poorly defined
            SQL                      No precise query language
Data        Operational data         Not operational data
Output      Precise                  Fuzzy
            Subset of database       Not a subset of database
QUERY EXAMPLES
Database
– Find all credit applicants with first name of Sane.
– Identify customers who have purchased
more than Rs. 10,000 worth in the last month.
– Find all customers who have purchased milk.
Data Mining
– Find all credit applicants who are poor
credit risks. (classification)
– Identify customers with similar buying
habits. (Clustering)
– Find all items which are frequently
purchased with milk. (association rules)
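The contrast between the two kinds of query can be sketched in a few lines. The basket data, the variable names and the 0.5 confidence threshold below are assumptions made for illustration only; real association rule mining would normally also check support and use a dedicated algorithm.

```python
from collections import Counter

# Hypothetical market-basket data: one set of items per transaction.
transactions = [
    {"milk", "bread", "butter"},
    {"milk", "bread"},
    {"milk", "eggs"},
    {"bread", "jam"},
    {"milk", "bread", "eggs"},
]

# Database-style query: precise, and the answer is a subset of the stored data.
milk_transactions = [t for t in transactions if "milk" in t]

# Mining-style query: count what co-occurs with milk and derive rule confidence,
# i.e. the fraction of milk transactions that also contain the item.
co_counts = Counter(item for t in milk_transactions for item in t if item != "milk")
confidence = {item: n / len(milk_transactions) for item, n in co_counts.items()}

MIN_CONFIDENCE = 0.5  # assumed threshold, not taken from the slides
frequent_with_milk = sorted(i for i, c in confidence.items() if c >= MIN_CONFIDENCE)
print(frequent_with_milk)
```

The first result is simply rows that were already stored; the second is a derived pattern (milk → bread, milk → eggs) that appears nowhere in the database as such.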
ARE ALL THE ‘DISCOVERED’
PATTERNS INTERESTING?
Interestingness measures:
Bayes Theorem
Regression Analysis
EM Algorithm
K-Means Clustering
Time Series Analysis
Algorithm Design Techniques
Algorithm Analysis
Data Structures
Neural Networks
Decision Tree Algorithms
RELATION (r)
D1, D2, …, Dn are domains
r ⊆ D1 × D2 × … × Dn
EXAMPLE : r
D1 = {Ram, Shyam}, D2 = {24, 34}
r is a subset of D1 × D2
NAME
Ram
Employee
TUPLES OR ROWS : t
Instance of the relation is a tuple or row
Notation :
t < (a(1), a(2), a(3), …, a(n)) : a(i) ∈ A(i); i ∈ N >
Example: t < (Ram, 24) >
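The definition of a relation as a subset of a Cartesian product can be checked directly. In this small sketch the domains and the tuple (Ram, 24) come from the example; the second tuple (Shyam, 34) and the reading of D2 as ages are assumptions.

```python
from itertools import product

D1 = {"Ram", "Shyam"}   # domain of NAME
D2 = {24, 34}           # second domain (assumed here to be ages)

cartesian = set(product(D1, D2))      # D1 × D2: every possible (name, age) pair
r = {("Ram", 24), ("Shyam", 34)}      # one possible relation; Shyam's tuple is assumed

is_relation = r <= cartesian          # r ⊆ D1 × D2
t = ("Ram", 24)                       # a tuple (row) of r
print(len(cartesian), is_relation, t in r)
```

Any set of pairs drawn from D1 × D2, including the empty set, would qualify as a relation over these two domains.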
RELATION (r)
R     A1    A2    A3    ……   Ak    ……   An
      a11   a21   a31   ……   ak1   ……   an1
      a12   a22   a32   ……   ak2   ……   an2
a ki : value of the k th attribute of R in the i th tuple t
WHAT IS A
DATA WAREHOUSE?
Subject-oriented:
customers, patients, students,
products, time.
Time-variant:
Non-updatable:
Transform
Record-level:
Selection – data partitioning
Joining – data combining
Aggregation – data summarization
Field-level:
single-field – from one field to one field
multi-field – from many fields to one, or one field to many
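The record-level and field-level transforms listed above can be sketched on a tiny in-memory data set. All records, field names and values below are hypothetical; a real ETL job would read from source systems and write to the warehouse.

```python
from collections import defaultdict

# Hypothetical source records.
orders = [
    {"cust_id": 1, "amount": 250.0, "region": "West"},
    {"cust_id": 2, "amount": 90.0, "region": "East"},
    {"cust_id": 1, "amount": 400.0, "region": "West"},
]
customers = {
    1: {"first": "Ram", "last": "Sane"},
    2: {"first": "Shyam", "last": "Rao"},
}

# Record-level selection: keep only part of the rows (data partitioning).
west_orders = [o for o in orders if o["region"] == "West"]

# Record-level joining: combine each order with its customer record.
joined = [{**o, **customers[o["cust_id"]]} for o in orders]

# Record-level aggregation: summarize order amounts per customer.
totals = defaultdict(float)
for o in orders:
    totals[o["cust_id"]] += o["amount"]

# Field-level, single-field: one source field becomes one target field.
regions_upper = [o["region"].upper() for o in orders]

# Field-level, multi-field: many source fields become one target field.
full_names = {cid: f'{c["first"]} {c["last"]}' for cid, c in customers.items()}

print(dict(totals), full_names)
```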
Steps in data reconciliation
Data discretization
– Part of data reduction but with particular
importance, especially for numerical data
Forms of data preprocessing
DATA CLEANING
39 ? Yankees F
45 45,390 ? F
HOW TO HANDLE NOISY DATA?
Discretization : Smoothing techniques
Binning method:
- first sort the data and partition it into (equi-depth) bins
- then smooth by bin means, bin medians, bin boundaries, etc.
Clustering
- detect and remove outliers
Regression
- smooth by fitting the data to regression functions
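The binning method can be sketched as follows: sort, partition into equi-depth bins, then smooth each bin by its mean or by its boundaries. The values in `prices` and the function names are illustrative assumptions, not data from the slides.

```python
def equi_depth_bins(values, depth):
    """Sort the values and split them into consecutive bins of `depth` items."""
    ordered = sorted(values)
    return [ordered[i:i + depth] for i in range(0, len(ordered), depth)]

def smooth_by_means(bins):
    """Replace every value with the mean of its bin."""
    return [[sum(b) / len(b)] * len(b) for b in bins]

def smooth_by_boundaries(bins):
    """Replace every value with the nearer of its bin's minimum or maximum."""
    return [[b[0] if v - b[0] <= b[-1] - v else b[-1] for v in b] for b in bins]

prices = [4, 8, 15, 21, 21, 24, 25, 28, 34]   # illustrative values
bins = equi_depth_bins(prices, depth=3)

print(bins)                        # [[4, 8, 15], [21, 21, 24], [25, 28, 34]]
print(smooth_by_means(bins))       # [[9.0, 9.0, 9.0], [22.0, 22.0, 22.0], [29.0, 29.0, 29.0]]
print(smooth_by_boundaries(bins))  # [[4, 4, 15], [21, 21, 24], [25, 25, 34]]
```

Smoothing by means pulls every value toward the center of its bin, while smoothing by boundaries keeps the extremes and snaps interior values to the closer one.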
SIMPLE DISCRETISATION
METHODS: BINNING
Equal-width (distance) partitioning:
Equal-depth (frequency) partitioning:
1 {0,4} [ - , 10)
Equi-width binning: 0-10 10-20 20-30 30-40 40-50 50-60 60-70 70-80
Equi-depth binning: 0-22 22-31 32-38 38-44 44-48 48-55 55-62 62-80
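The difference between the two partitioning schemes can be sketched as below. The `ages` values and the function names are assumptions for illustration; the equal-width version also assumes the values are not all identical (otherwise the bin width would be zero).

```python
def equal_width_bins(values, k):
    """Split the range [min, max] into k intervals of identical width."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / k
    bins = [[] for _ in range(k)]
    for v in sorted(values):
        idx = min(int((v - lo) / width), k - 1)   # the maximum goes into the last bin
        bins[idx].append(v)
    return bins

def equal_depth_bins(values, k):
    """Split the sorted values into k bins holding (nearly) the same count."""
    ordered = sorted(values)
    n = len(ordered)
    return [ordered[i * n // k:(i + 1) * n // k] for i in range(k)]

ages = [0, 4, 12, 16, 16, 18, 24, 26, 30]   # illustrative values

width_bins = equal_width_bins(ages, 3)
depth_bins = equal_depth_bins(ages, 3)
print(width_bins)   # [[0, 4], [12, 16, 16, 18], [24, 26, 30]]
print(depth_bins)   # [[0, 4, 12], [16, 16, 18], [24, 26, 30]]
```

Equal-width bins cover identical intervals but may hold very different numbers of values; equal-depth bins hold the same number of values but cover intervals of different widths.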
THANK YOU !
Any Questions?
SUSHIL KULKARNI
sushiltry@yahoo.co.in