
A Million Mousetraps

Using Big Data and Little Loops to Build Better Defenses

Allison Miller, Tagged

Overview
- Protecting customers on an open platform
- Big data + Little loops enable automation via analytics
- Decisions as defenses

Putting your data to work

the interdepen

the porous

Scammers, Account takeover, Spam, DOS, Credential Theft, Fraud, Bots, Malware, Griefers, Phishing

so, about that

The Better Mousetrap


Automates defensive action x-platform

- Fast
- Accurate
- Cheap


Big Data & Little Loops

Fast: In Real Time; In Time to Minimize Loss
Accurate: Reasonable False Positives; As good as a human specialist
Cheap: Reduces More Loss than Cost Created; Cheaper than Manual intervention

123.123.123.123 - - [26/Apr/2000:00:23:48 -0400] "GET /pics/wpaper.gif HTTP/1.0" 200 6248 "http://www.jafsoft.com/asctortf/" "Mozilla/4.05 (Macintosh; I; PPC)"
123.123.123.123 - - [26/Apr/2000:00:23:47 -0400] "GET /asctortf/ HTTP/1.0" 200 8130 "http://search.netscape.com/Computers/Data_Formats/Document/Text/RTF" "Mozilla/4.05 (Macintosh; I; PPC)"
123.123.123.123 - - [26/Apr/2000:00:23:48 -0400] "GET /pics/5star2000.gif HTTP/1.0" 200 4005 "http://www.jafsoft.com/asctortf/" "Mozilla/4.05 (Macintosh; I; PPC)"
123.123.123.123 - - [26/Apr/2000:00:23:50 -0400] "GET /pics/5star.gif HTTP/1.0" 200 1031 "http://www.jafsoft.com/asctortf/" "Mozilla/4.05 (Macintosh; I; PPC)"
123.123.123.123 - - [26/Apr/2000:00:23:51 -0400] "GET /pics/a2hlogo.jpg HTTP/1.0" 200 4282 "http://www.jafsoft.com/asctortf/" "Mozilla/4.05 (Macintosh; I; PPC)"
123.123.123.123 - - [26/Apr/2000:00:23:51 -0400] "GET /cgibin/newcount?jafsof3&width=4&font=digital&noshow HTTP/1.0" 200 36 "http://www.jafsoft.com/asctortf/" "Mozilla/4.05 (Macintosh; I; PPC)"
[Tue Mar 9 22:02:41 2004] [info] created shared memory segment #10813446
[Tue Mar 9 22:02:41 2004] [notice] Apache/1.3.29 (Unix) mod_ssl/2.8.16 OpenSSL/0.9.7c configured -- resuming normal operations
[Tue Mar 9 22:02:41 2004] [info] Server built: Mar 7 2004 13:38:59
[Tue Mar 9 22:02:41 2004] [notice] Accept mutex: sysvsem (Default: sysvsem)
pausing [http://xmlrevenue.com/s.php?username=jenneypan&keywords=Online+Gambling] for 50000 ms
[Tue Mar 9 22:03:26 2004] [error] [client 218.93.92.137] mod_security:
[Tue Mar 9 22:04:16 2004] [error] [client 218.93.92.137] mod_security: Access denied with code 200. Pattern match "Basic" at HEADER.
[Tue Mar 9 22:07:16 2004] [error] [client 203.121.182.190] mod_security: Invalid character detected [4]
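Raw logs like these only become usable data once they are parsed into fields. The following is a minimal sketch, not part of the original slides, that assumes the access entries follow the common Apache "combined" layout; the names `ACCESS_RE` and `parse_access_line` are hypothetical.

```python
import re
from collections import Counter

# Assumed layout: host ident user [time] "method path proto" status size "referer" "agent"
ACCESS_RE = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) (?P<proto>[^"]+)" '
    r'(?P<status>\d{3}) (?P<size>\d+|-) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)

def parse_access_line(line):
    """Return a dict of fields, or None if the line is not an access entry."""
    m = ACCESS_RE.match(line)
    if m is None:
        return None
    rec = m.groupdict()
    rec["status"] = int(rec["status"])
    rec["size"] = 0 if rec["size"] == "-" else int(rec["size"])
    return rec

sample = [
    '123.123.123.123 - - [26/Apr/2000:00:23:48 -0400] "GET /pics/wpaper.gif HTTP/1.0" '
    '200 6248 "http://www.jafsoft.com/asctortf/" "Mozilla/4.05 (Macintosh; I; PPC)"',
    '123.123.123.123 - - [26/Apr/2000:00:23:50 -0400] "GET /pics/5star.gif HTTP/1.0" '
    '200 1031 "http://www.jafsoft.com/asctortf/" "Mozilla/4.05 (Macintosh; I; PPC)"',
    '[Tue Mar 9 22:02:41 2004] [notice] Accept mutex: sysvsem (Default: sysvsem)',
]

# Error-log lines do not match the access pattern and are skipped here.
records = [r for r in map(parse_access_line, sample) if r is not None]
requests_per_host = Counter(r["host"] for r in records)
```

Per-host or per-path counts such as `requests_per_host` are one simple starting point for the variables discussed later in the deck.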

Big Data & Little Loops

APPLIED RISK ANALYTICS


Use of technology, data, research & statistics to solve problems associated with losses or costs due to security vulnerabilities / gaps in a system -- resulting in the deployment of optimized detection, prevention, or response capabilities.

Decisions, Decisions
RESPONSE (columns) vs POPULATION (rows)

            Authorize                                   Block
Good        correct                                     false positive (Good Action Gets Blocked)
Bad         false negative (Bad Action Gets Through)    correct

Downstream Impacts

Incorrect decisions have a cost
Correct decisions are free (usually)
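To make the cost asymmetry concrete, here is a small sketch that labels each decision with its quadrant and totals the cost of the incorrect ones. The cost figures are hypothetical placeholders, not numbers from the talk.

```python
from collections import Counter

def classify(is_bad, blocked):
    """Name the quadrant of the response/population grid a decision falls in."""
    if is_bad and blocked:
        return "true_positive"
    if is_bad and not blocked:
        return "false_negative"   # bad action gets through
    if not is_bad and blocked:
        return "false_positive"   # good action gets blocked
    return "true_negative"

# Hypothetical per-decision costs; correct decisions are treated as free.
COST = {
    "false_positive": 5.0,
    "false_negative": 50.0,
    "true_positive": 0.0,
    "true_negative": 0.0,
}

def total_cost(decisions):
    """decisions: iterable of (is_bad, blocked) pairs."""
    counts = Counter(classify(is_bad, blocked) for is_bad, blocked in decisions)
    cost = sum(COST[kind] * n for kind, n in counts.items())
    return counts, cost

decisions = [(False, False), (False, True), (True, True), (True, False), (False, False)]
counts, cost = total_cost(decisions)
```

In practice the two error costs would come from measured losses and support or churn costs, which is why the slide qualifies "free" with "(usually)".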

Applying Decisions

Risk management is decision management


ACTOR ATTEMPTS ACTION -> SUBMIT -> RESULT: ACTION OCCURS

Between submit and result:
- WHAT IS THE REQUEST
- SHOULD WE HONOR?
- HOW TO HONOR THE REQUEST

For example:
ACTOR ATTEMPTS Payment -> SUBMIT -> Decision

Decision options:
- Authorize
- Review
- Refer
- Request Authentication
- Decline

p(actor attempting payment is accountholder) = f(variable A + variable B + ...)
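One way to read the payment example is as a score mapped onto a small set of responses. The sketch below assumes a logistic form for f and uses made-up weights and thresholds; none of these values come from the talk, and it covers only four of the listed responses.

```python
import math

def p_accountholder(features, weights, bias):
    """Logistic score: one assumed form of f(variable A + variable B + ...)."""
    z = bias + sum(weights[name] * value for name, value in features.items())
    return 1.0 / (1.0 + math.exp(-z))

def decide(p):
    """Map the probability onto a response. Thresholds are hypothetical."""
    if p >= 0.95:
        return "authorize"
    if p >= 0.70:
        return "request_authentication"
    if p >= 0.40:
        return "review"
    return "decline"

# Hypothetical weights: risky signals push the probability down.
weights = {"ip_country_mismatch": -1.5, "new_shipping_address": -1.0}
features = {"ip_country_mismatch": 1, "new_shipping_address": 1}

p = p_accountholder(features, weights, bias=3.0)
action = decide(p)
```

Where the thresholds sit is a business choice: moving them trades false positives against false negatives.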

Study history...
User IP Country <> Billing Country

Buying prepaid mobile phones


Add new shipping address in cart

However Buyer = Phone reseller, static machine ID

How much $$ is at risk?
What is normal for this customer?
What bad profiles does this match?

SHALL WE PLAY A GAME?

(SINCE WE CAN'T PLAY CLUE FOR EVERY LOGIN, TRANSACTION, NEW USER, MESSAGE, FRIEND REQUEST, ATTACHMENT, PACKET, WINK, POKE, CLICK, WE BUILD RISK MODELS)

Model Development Process


- Target -> Yes/No questions best
- Find Data, Variable Creation -> Best part
- Data Prep -> Worst part
- Model Training -> Pick an algorithm
- Assessment -> Catch vs FP rate
- Deployment -> Decisioning vs Detection
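The assessment step can be sketched as follows. This assumes "catch rate" means the share of bad events flagged and "FP rate" means the share of good events flagged; some teams instead measure false positives as a share of everything flagged. The scores and labels are invented for illustration.

```python
def catch_and_fp_rate(scores, labels, threshold):
    """Return (catch rate, FP rate) when flagging every score >= threshold.

    labels: 1 for bad, 0 for good.
    """
    flagged = [s >= threshold for s in scores]
    bad_total = sum(labels)
    good_total = len(labels) - bad_total
    caught = sum(1 for f, y in zip(flagged, labels) if f and y == 1)
    false_pos = sum(1 for f, y in zip(flagged, labels) if f and y == 0)
    catch_rate = caught / bad_total if bad_total else 0.0
    fp_rate = false_pos / good_total if good_total else 0.0
    return catch_rate, fp_rate

# Invented model scores with known outcomes.
scores = [0.9, 0.8, 0.6, 0.4, 0.3, 0.1]
labels = [1, 1, 0, 1, 0, 0]

strict = catch_and_fp_rate(scores, labels, threshold=0.5)
loose = catch_and_fp_rate(scores, labels, threshold=0.35)
```

Sweeping the threshold and comparing the pairs shows how much extra catch each additional false positive buys.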

User IP Country <> Billing Country
- Geolocate IP
- Convert geo to country code
- Flag on Mismatch

Buying prepaid mobile phones
- Cart Category
- TXN-$-AMT
- Merch Risk Level

Add new shipping address in cart
- Date Added
- Address Type
- String Matching

Buyer = Phone reseller, static machine ID
- Customer Profile
- Device ID
- Device History

How much $$ is at risk?
What is normal for this customer?
What bad profiles does this match?

Churn Risk, CL ... TXNs, logins Stolen CC,
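Variable creation for this example might look like the sketch below, which turns the observed patterns into model inputs. The `Transaction` fields and the `prepaid_mobile` category label are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass

@dataclass
class Transaction:
    ip_country: str                 # country code derived from geolocating the IP
    billing_country: str
    cart_category: str
    amount: float
    shipping_address_age_days: int  # 0 means the address was added in this session
    device_id: str
    known_device_ids: frozenset     # device history for this customer

def build_variables(txn):
    """Turn one transaction into flags and amounts a model can consume."""
    return {
        "ip_billing_mismatch": int(txn.ip_country.upper() != txn.billing_country.upper()),
        "prepaid_phone_cart": int(txn.cart_category == "prepaid_mobile"),
        "new_shipping_address": int(txn.shipping_address_age_days == 0),
        "known_device": int(txn.device_id in txn.known_device_ids),
        "txn_amount": txn.amount,
    }

# A reseller-like buyer: several risky flags, but a long-standing device.
reseller = Transaction(
    ip_country="us", billing_country="GB", cart_category="prepaid_mobile",
    amount=400.0, shipping_address_age_days=0,
    device_id="dev-1", known_device_ids=frozenset({"dev-1"}),
)
variables = build_variables(reseller)
```

The `known_device` flag is what lets a model separate the phone reseller from a stolen-card buyer who trips the same three risk flags.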

Dependent Variable

Independent Variables

Variance in dependent variable explained by independent variables

p-value of significance, throw out if > .05

Factor odds of dependent go up when independent var incremented

p-value should be < significance level (.05)
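Assuming the model behind these annotations is a logistic regression, the "factor odds go up" reading of a coefficient is its exponential. The sketch below applies the .05 cutoff and converts the surviving coefficients to odds factors; the fitted values are hypothetical, not results from the talk.

```python
import math

SIGNIFICANCE = 0.05

# Hypothetical fitted output: variable -> (coefficient, p-value).
fitted = {
    "ip_billing_mismatch": (1.2, 0.001),
    "new_shipping_address": (0.4, 0.20),
}

def significant_odds_factors(fitted, alpha=SIGNIFICANCE):
    """Keep variables with p-value below alpha and report exp(coefficient).

    exp(coefficient) is the factor by which the odds of the dependent
    variable change when the independent variable increases by one.
    """
    return {
        name: math.exp(coef)
        for name, (coef, p_value) in fitted.items()
        if p_value < alpha
    }

odds_factors = significant_odds_factors(fitted)
```

Here only `ip_billing_mismatch` survives the cutoff, and a mismatch multiplies the odds by roughly 3.3 under these made-up numbers.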

419 example: the 411


Trigger

- Contact receives a 419 from a (free) business email account and contacts the victim out of band (OOB)

Backtrack

- Password was changed (user had to go through reset process)
- Contacts, inbox, outbox deleted
- Nigerian IP login

Elaboration

- Reply-to: changed an i to an l (same ISP)
- Only takes Western Union

419 example: Reducing 911s


Variables

- New session variables: New login IP, new login IP country, new cookie/machine ID
- Change account variables: Change password, change secondary email, change name, change public profile
- New activity variables: Send to all contacts, # of accounts in cc or bcc, Edit/delete contacts en masse
- Association variables: New recipients, New reply-to fields, Similar accounts created/associated (fuzzy=more difficult)

User empowerment

- Stronger password reset options (SMS)
- Transparency: Other current sessions, past session history (IPs, logins)
- Auto-logout all other sessions upon password reset
- Reporting: Details of elaboration as well as cut and paste messages
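As a rough illustration of how such variables could drive an automated response, the sketch below sums weights for the signals seen in a session and picks a response tier. The signal names, weights, and thresholds are all hypothetical; a production system would more likely feed these variables into a trained model.

```python
# Hypothetical weights for account-takeover signals.
SIGNAL_WEIGHTS = {
    "new_login_country": 3,
    "new_machine_id": 2,
    "password_changed": 2,
    "reply_to_changed": 3,
    "contacts_deleted_en_masse": 3,
    "send_to_all_contacts": 3,
}

def takeover_score(events):
    """Sum the weights of distinct signals seen in a session; unknown events score 0."""
    return sum(SIGNAL_WEIGHTS.get(event, 0) for event in set(events))

def response(score):
    """Pick a response tier. Thresholds are hypothetical."""
    if score >= 8:
        return "logout_other_sessions_and_require_sms_reset"
    if score >= 4:
        return "notify_user"
    return "allow"

# Signals matching the backtrack of the 419 example.
session = ["new_login_country", "password_changed",
           "contacts_deleted_en_masse", "reply_to_changed"]
score = takeover_score(session)
action = response(score)
```

The strongest tier mirrors the user-empowerment options listed above (SMS reset, auto-logout of other sessions).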

Recap
Protecting customers requires understanding not just technology but also behavior. This requires:

- Activity data
- Clear definitions of good vs bad results
- Constant feedback
- Analysis

Big Data & Little Loops

Designing data-driven defenses

- Decisions that can be automated w/data
- Where/what data sets to use
- Business drivers to keep in mind

p(bad) = f(variable A + variable B + ...)

An example

"Prediction is very difficult, especially about the future" - Niels Bohr

Allison Miller allison@tagged.com

about.tagged.com/jobs
