Professional Documents
Culture Documents
UMass Amherst
Weifeng Chen
California University of Pennsylvania
Enterprise Internet
(university)
Packets
Monitor
Packet header traces
Used for networking research
Many public repositories (UMass, CAIDA, LBNL, …)
Adversarial model:
4
Bruno Ribeiro, Weifeng Chen, Gerome Miklau, and Don Towsley, Analyzing Privacy in Enterprise Packet Trace Anonymization
Outline
Our contributions
New attack on IP anonymization:
Attack overview
Defined as a tree editing distance problem
Worst-case analysis:
From a set of trace labels (information)
Assesses worst-case attack
Related work
Conclusions
5
Bruno Ribeiro, Weifeng Chen, Gerome Miklau, and Don Towsley, Analyzing Privacy in Enterprise Packet Trace Anonymization
Proposed attack overview
Adversary provides:
Labeled tree constructed using anonymized trace
Labeled tree constructed from probing enterprise
A cost (or distance) function (to deal with “mismatched” labels)
6
Bruno Ribeiro, Weifeng Chen, Gerome Miklau, and Don Towsley, Analyzing Privacy in Enterprise Packet Trace Anonymization
Full prefix preserving anonymization
7
Bruno Ribeiro, Weifeng Chen, Gerome Miklau, and Don Towsley, Analyzing Privacy in Enterprise Packet Trace Anonymization
Labeled trees
0 1
0 1 0 1
00 01 10 11 00 01 10 11
sets:
Match set:
{01}01, 10, 11}
00 maps to {00,
{10, 01,
10 maps to {00, 11}10, 11} 8
Bruno Ribeiro, Weifeng Chen, Gerome Miklau, and Don Towsley, Analyzing Privacy in Enterprise Packet Trace Anonymization
Imperfect information
Probed tree Trace tree
Probed IP leaf labels Trace IP leaf labels
Web server Traffic on port 80
Not a Web server No traffic on port 80
00 01 10 11 00 01 10 11
Mapping cost
Sum of all individual costs
Example:
Probed tree Trace tree
Total cost = 1
Cost = 1
0
0 0
Cost = 0
Cost = 1 1
10
Bruno Ribeiro, Weifeng Chen, Gerome Miklau, and Don Towsley, Analyzing Privacy in Enterprise Packet Trace Anonymization
Proposed attack
00 01 10 11 00 01 10 11
?
And our algorithm is fast
10 seconds (on this laptop) for all mappings of a network with 216 addresses
11
Bruno Ribeiro, Weifeng Chen, Gerome Miklau, and Don Towsley, Analyzing Privacy in Enterprise Packet Trace Anonymization
Experiment
Network: class B (64K addresses)
Labels
“Active host”
Active ports: FTP, SSH, Telnet, E-mail, Time, DNS, Web, POP3, SOCKS
Trace IP labels
“Active host” label – recorded any outgoing traffic
“Active ports” – Recorded traffic from ports 80, 22, ….
Probed IP labels
Probed over all network
“Active host” label – PING
“Active ports” – TCP SYN ACK reply from ports 80, 22, …
12
Bruno Ribeiro, Weifeng Chen, Gerome Miklau, and Don Towsley, Analyzing Privacy in Enterprise Packet Trace Anonymization
Experiment results
Trace collected: 2007, June 18th (9097 active IPs)
Network probed: 2007, June 18th
60%
Incorrect
50%
cumulative fraction of
matches
Data publisher’s view
40%
30%
Correct
BAD
20% matches
10%
0%
1 2 3 4 5 6 7 8
size of matching set
Uniquely re-identified
13
Bruno Ribeiro, Weifeng Chen, Gerome Miklau, and Don Towsley, Analyzing Privacy in Enterprise Packet Trace Anonymization
Worst-case analysis
14
Bruno Ribeiro, Weifeng Chen, Gerome Miklau, and Don Towsley, Analyzing Privacy in Enterprise Packet Trace Anonymization
Worst-case experiment
Full prefix preservation
June 18th experiment
Naïve attack
Worst-case
100%
cumulative fraction of
attack
hosts in the trace
80%
Data publisher’s view
60%
40%
BAD
20%
0%
1 2 3 4 5 6 7 8
size of matching set
Uniquely re-identified
15
Bruno Ribeiro, Weifeng Chen, Gerome Miklau, and Don Towsley, Analyzing Privacy in Enterprise Packet Trace Anonymization
Partial prefix preservation
Does not retain part of the address structure
Used in Pang et al., 2006
Solution also formulated as an instance of the tree edit distance problem
Anonymization
Probed
Probedtree
treeroot
root mapping Anonymized
Anonymizedtree
treeroot
root
Up to 256 addresses
16
Bruno Ribeiro, Weifeng Chen, Gerome Miklau, and Don Towsley, Analyzing Privacy in Enterprise Packet Trace Anonymization
Partial vs. Full prefix preservation
Full prefix
hosts in the trace
80% preservation
Data publisher’s view
60%
BAD
40%
20%
0%
1 2 3 4 5 6 7 8
size of matching set
17
Bruno Ribeiro, Weifeng Chen, Gerome Miklau, and Don Towsley, Analyzing Privacy in Enterprise Packet Trace Anonymization
Worst-case analysis (II)
Uniquely re-identified
18
Bruno Ribeiro, Weifeng Chen, Gerome Miklau, and Don Towsley, Analyzing Privacy in Enterprise Packet Trace Anonymization
Related work
19
Bruno Ribeiro, Weifeng Chen, Gerome Miklau, and Don Towsley, Analyzing Privacy in Enterprise Packet Trace Anonymization
Conclusions
Attack
Include global mapping restrictions
An instance of the tree edit distance problem
Indicates that full prefix preservation has flaws
Impact of late probing on the de-anonymization
Worst-case analysis
Can help future anonymization schemes
A tool for data publishers
Experiments indicate that:
Partial is much safer than full prefix preservation
But still not completely safe
20
Bruno Ribeiro, Weifeng Chen, Gerome Miklau, and Don Towsley, Analyzing Privacy in Enterprise Packet Trace Anonymization
Thanks
21
Bruno Ribeiro, Weifeng Chen, Gerome Miklau, and Don Towsley, Analyzing Privacy in Enterprise Packet Trace Anonymization