Pattern Discovery 11.1

Uploaded by

Rossana Cunha


Lecture 11. Advanced Topics on Pattern Discovery


Frequent Pattern Mining in Data Streams

Spatiotemporal and Trajectory Pattern Mining

Pattern Discovery for Software Bug Mining

Pattern Discovery for Image Analysis

Pattern Mining and Society: Privacy Issues

Looking Forward

Session 1. Frequent Pattern Mining in Data Streams

Challenges for Data Analysis in Data Streams

Data Streams
  Features: continuous, ordered, changing, fast, huge volume
  Contrast with traditional DBMS (finite, persistent data sets)
Characteristics
  Huge volumes of continuous data, possibly infinite
  Fast changing; requires fast, real-time response
  Data streams capture many of today's data processing needs
  Random access is expensive: single-scan algorithms (each item can be looked at only once)
  Store only a summary of the data seen thus far
  Most stream data are low-level and multi-dimensional in nature, and need multi-level, multi-dimensional processing

Architecture: Stream Data Processing

[Figure: SDMS (Stream Data Management System). Components shown: multiple input streams, a Stream Query Processor, Scratch Space (main memory and/or disk), and a User/Application that issues a Continuous Query and receives Results.]

Q: How to mine frequent patterns in data streams?

Stream Data Mining Tasks

Stream mining vs. stream querying
  Stream mining shares many difficulties with stream querying
    E.g., single scan, fast response, dynamic
  But it often requires less precision, e.g., no join, grouping, or sorting
  Patterns are hidden and more general than queries
Stream data mining tasks
  Pattern mining in data streams
  Multi-dimensional on-line analysis of streams
  Clustering data streams
  Classification of stream data
  Mining outliers and anomalies in stream data

Mining Approximate Frequent Patterns

Mining precise frequent patterns in stream data: unrealistic
  Cannot even store them in a compressed form (e.g., FP-tree)
Approximate answers are often sufficient for pattern analysis
  Ex.: A router is interested in all flows whose frequency is at least 1% (σ) of the entire traffic stream seen so far, and feels that an error of 1/10 of σ (ε = 0.1%) is comfortable
How to mine frequent patterns with good approximation?
  Lossy Counting Algorithm (Manku & Motwani, VLDB'02)
  Major idea: do not keep items with very low support counts
  Advantage: guaranteed error bound
  Disadvantage: keeps a large set of traces

Lossy Counting for Frequent Single Items

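Divide the stream into buckets (Bucket 1, Bucket 2, Bucket 3, ...); bucket size is 1/ε = 1000
First bucket of the stream: start from an empty summary and add the bucket's item counts
At the bucket boundary, decrease all counters by 1
Next bucket of the stream: add its item counts to the summary, then decrease all counters by 1 again at its boundary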
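The bucket procedure on this slide can be sketched in a few lines of Python. This is a minimal illustration of the simplified variant shown here (decrement every counter at each bucket boundary and drop counters that reach zero), not the exact data structure of the Manku & Motwani paper. The function names `lossy_count` and `frequent_items`, and the choice to pass the bucket size (1/ε) as an integer, are assumptions made for this example.

```python
from typing import Dict, Hashable, Iterable, List, Tuple


def lossy_count(stream: Iterable[Hashable],
                bucket_size: int) -> Tuple[Dict[Hashable, int], int]:
    """Single pass over the stream.

    bucket_size plays the role of 1/epsilon.
    Returns (approximate counts, number of items seen N).
    """
    counts: Dict[Hashable, int] = {}
    n = 0
    for item in stream:
        counts[item] = counts.get(item, 0) + 1
        n += 1
        if n % bucket_size == 0:
            # Bucket boundary: decrease all counters by 1 and
            # forget items whose counter drops to zero.
            counts = {k: c - 1 for k, c in counts.items() if c > 1}
    return counts, n


def frequent_items(counts: Dict[Hashable, int], n: int,
                   sigma: float, bucket_size: int) -> List[Hashable]:
    """Report items whose stored count reaches (sigma - epsilon) * N."""
    threshold = sigma * n - n / bucket_size  # epsilon = 1 / bucket_size
    return [k for k, c in counts.items() if c >= threshold]


if __name__ == "__main__":
    # Hypothetical stream: 10 buckets of 10 items each.
    stream: List[str] = []
    for i in range(10):
        stream += ["a"] * 5 + ["b"] * 3 + [f"u{i}_1", f"u{i}_2"]
    counts, n = lossy_count(stream, bucket_size=10)
    print(counts, n)
    print(frequent_items(counts, n, sigma=0.25, bucket_size=10))
```

Each counter loses at most 1 per bucket boundary, so items that occur only once per bucket never survive in the summary, while frequent items are undercounted by at most the number of buckets seen.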

Approximation Guarantee

Given: (1) support threshold σ, (2) error threshold ε, and (3) stream length N
Output: items with frequency counts exceeding (σ − ε)N
How much do we undercount?
  If stream length seen so far = N and bucket size = 1/ε,
  then frequency count error ≤ number of buckets = N / bucket size = N/(1/ε) = εN
Approximation guarantee
  No false negatives
  False positives have true frequency count at least (σ − ε)N
  Frequency count underestimated by at most εN
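As a worked instance of this bound, take the thresholds from the router example (support 1%, error 0.1%) and assume, purely for illustration, a stream of N = 10^6 items; the stream length is not given in the slides.

```latex
\begin{align*}
\text{bucket size} &= 1/\varepsilon = 1/0.001 = 1000 \\
\text{number of buckets} &= \frac{N}{1/\varepsilon} = \varepsilon N = \frac{10^{6}}{1000} = 1000 \\
\text{undercount per item} &\le \varepsilon N = 1000 \\
\text{output threshold} &= (\sigma - \varepsilon)\,N = (0.01 - 0.001)\cdot 10^{6} = 9000 \\
\text{truly frequent items:}\quad \text{true count} &\ge \sigma N = 10\,000
\end{align*}
```

An item with true count of at least 10,000 keeps a stored count of at least 9,000 and is therefore reported (no false negatives), and any reported item has a true count of at least 9,000.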

Other issues on pattern discovery in data streams
  Space-saving computation of frequent and top-k elements (Metwally, Agrawal, and El Abbadi, ICDT'05)
  Mining approximate frequent k-itemsets in data streams
  Mining sequential patterns in data streams
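The space-saving idea mentioned in the first item above can be illustrated with a short sketch. It follows the algorithm as it is commonly described (keep at most k counters; when a new item arrives and all counters are taken, it replaces the item with the smallest count and inherits that count plus one). The function name `space_saving` and the plain-dictionary representation are assumptions for this example; the paper uses a more efficient linked structure, so treat this as a sketch rather than the authors' implementation.

```python
from typing import Dict, Hashable, Iterable, Tuple


def space_saving(stream: Iterable[Hashable],
                 k: int) -> Tuple[Dict[Hashable, int], Dict[Hashable, int]]:
    """Track at most k items.

    Returns (counts, errors): counts[x] is an upper bound on the true
    frequency of x, and counts[x] - errors[x] is a lower bound.
    """
    counts: Dict[Hashable, int] = {}
    errors: Dict[Hashable, int] = {}
    for item in stream:
        if item in counts:
            counts[item] += 1
        elif len(counts) < k:
            counts[item] = 1
            errors[item] = 0
        else:
            # Evict the monitored item with the smallest count; the
            # newcomer takes over that count (its possible overestimate).
            victim = min(counts, key=counts.get)
            min_count = counts.pop(victim)
            errors.pop(victim)
            counts[item] = min_count + 1
            errors[item] = min_count
    return counts, errors


if __name__ == "__main__":
    counts, errors = space_saving("aabacaba", k=2)
    print(counts, errors)
```

Unlike the decrement-at-boundary scheme, memory here is fixed at k counters and estimates err on the high side; the linear scan for the minimum is kept only for brevity.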

Recommended Readings
G. Manku and R. Motwani, Approximate Frequency Counts over Data Streams, VLDB'02
A. Metwally, D. Agrawal, and A. El Abbadi, Efficient Computation of Frequent and Top-k Elements in Data Streams, ICDT'05
