
Black Ops

Dan Kaminsky Chief Scientist DKH dan@doxpara.com (2012)

Another Year, Another Talk


Good News and Bad News
Good News: We're going to fix this thing.
We have no choice. The global economy is based on Information Technology being trustworthy.
An economy where you have to be big enough to field a cyber army in order to participate is a broken economy indeed.
It's not like the big guys are doing a great job defensively.
Bad News: We're not going to fix it according to dogma.
How's the status quo working out for us?
There are many alternatives to dogma that are even worse.
How do we find those that are better?

A Riddle
What is the fundamental difference between attack and defense?

Answer
When an attack doesn't work, you can tell.

Offense has an inherent quality filter


Put up or shut up
Doesn't mean there aren't bugs in offensive disclosure
The Oracle "Critical" is just an unencrypted transport: if it's a bug, then Wireshark is dropping hundreds of 0day
Press Will Report Anything
But it's not the same

The Reality Of Defense


Too much dogma
Not enough science
You have to defend against every bug -> That's impossible -> You don't have to show you've defended against anything
Critiques of defenses aren't much better: nobody is measuring or critiquing effectiveness
So this is a talk about skepticism and the processes of finding effective defenses to the real and legitimate threats we cannot ignore
You shouldn't agree with everything I'm going to present
My goal is to show you some new ideas, and give you a framework to consider them as worthwhile or not
This is the only way we're going to get defense to work
Lest you think there's nothing concrete here...

The Fundamental Test


Take 2000 systems with a defense. Take 2000 systems without.
Come back in six months, and manually audit all 4000 systems.
Is there or is there not a statistically significant difference in the infection rate?
Even if we don't do the above, let us at least respect a gold standard when we see one!
The time may come when we spend as much money on security research as we do on medical research.
Medicine took hundreds of years to become scientific, and they had dead bodies to motivate them
We don't have dead bodies, or hundreds of years. We still need to fix these problems.
Some vendors out there care along these lines. Reward them!
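
One way to make "statistically significant difference in the infection rate" concrete is a two-proportion z-test. The sketch below is only an illustration: the choice of test, the function name and the audit counts are assumptions, not anything taken from the talk.

```python
import math

def two_proportion_z_test(infected_a, total_a, infected_b, total_b):
    """Return (z, two-sided p-value) for the difference in infection rates."""
    p_a = infected_a / total_a
    p_b = infected_b / total_b
    # Pooled rate under the null hypothesis that the defense changes nothing.
    pooled = (infected_a + infected_b) / (total_a + total_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / total_a + 1 / total_b))
    if std_err == 0:
        return 0.0, 1.0  # no infections anywhere (or everywhere): nothing to compare
    z = (p_a - p_b) / std_err
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided tail of a normal
    return z, p_value

# Hypothetical audit: 60 of 2000 defended systems infected, 100 of 2000 undefended.
z, p = two_proportion_z_test(60, 2000, 100, 2000)
print(f"z = {z:.2f}, p = {p:.4f}")
```

A small p-value says the two groups differ by more than chance would explain. It says nothing about why, so the audit still has to be done honestly.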

The Three Heads Of The Security Hydra


1) The inability to authenticate
2) The inability to write secure code
3) The inability to bust the bad guys

What we're not talking about today

Authentication
DNSSEC: no time, ask me in private (or wait a few months)
Busting the bad guys
Remarkable lack of consensus regarding which bad guys are most important
I tend to worry about the Aurora attack, which involved espionage against (let's face it) the entire Fortune 500, and about those raiding SMB payrolls, because that calls into question the very viability of SMB
Others have different priorities

What we are talking about


The inability to write secure code

An immediate clarification
It's not that it's impossible to write secure code
It's not impossible to deploy X.509 PKI
It's not impossible to bust the bad guys
It's just plainly and utterly improbable
At least in most organizations
Possible is not enough. Probable or bust.

What are we looking at today?


How do we address timing attacks?
How do we generate random numbers?
How do we suppress SQL Injection?
How do we detect network manipulation?
How do we scan the Internet?
These are all things that are possible today.
How do we make them more deployable, less expensive, more probable?

Timing Attacks
Many systems are modeled in terms of just what data they send
Not in terms of when they send it
Sometimes the timing leaks security sensitive data
Possible to distinguish 15-100 microseconds of latency over Internet, and 100 nanoseconds of latency over LAN (1000 samples)
"Opportunities and limits of remote timing attacks" (Scott A. Crosby, Rudolf H. Riedi, Dan S. Wallach)
Possible to exploit string comparison functions in widespread scripting languages, thus breaking HMAC compare (OpenID/OAuth)
"Exploiting timing attacks in widespread systems", Nate Lawson and Taylor Nelson @ Black Hat 2010

The Proposed Fix


Any time values need to be compared in a security critical context, compare them in constant time (so that there's no correlation between what's compared, and how long it takes)

    public static boolean isEqual(byte[] a, byte[] b) {
        if (a.length != b.length) {
            return false;
        }
        int result = 0;
        // OR together the XOR of every byte pair, so the loop never exits early
        for (int i = 0; i < a.length; i++) {
            result |= a[i] ^ b[i];
        }
        return result == 0;
    }

Looks good, right?
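
For comparison, the same idea in Python, where the standard library already ships a constant-time compare (`hmac.compare_digest`). This is a sketch of how an HMAC check might use it; `verify_mac` and its arguments are illustrative names, not from the talk.

```python
import hashlib
import hmac

def verify_mac(key: bytes, message: bytes, received_mac: bytes) -> bool:
    """Check an HMAC-SHA256 tag without leaking how many leading bytes matched."""
    expected = hmac.new(key, message, hashlib.sha256).digest()
    # compare_digest is designed so its running time does not depend on
    # where the first mismatching byte occurs.
    return hmac.compare_digest(expected, received_mac)

key = b"example key (hypothetical)"
tag = hmac.new(key, b"hello", hashlib.sha256).digest()
print(verify_mac(key, b"hello", tag))            # a valid tag is accepted
print(verify_mac(key, b"hello", b"\x00" * 32))   # a forged tag is rejected
```

The catch is the same as with the Java version: someone still has to remember to call it at every security critical comparison.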

The Problem
You have to remember to do this everywhere there's a security critical comparison
You don't get to do it all the time, because the performance impact is too high
You thus must actually identify all the security critical comparisons
It's possible. But it's not probable.

A Solution?
I seem to note that distinguishing against Internet noise yields less accuracy (15,000-100,000ns) than LAN noise (100ns)
That's two to three orders of magnitude!
And Internet noise is not actually random
What if we actually did have a random delay?

    tc qdisc change dev eth0 root netem delay 3ms 1ms

For all packets emitted from the first Ethernet interface, add a random amount of lag between 2,000,000ns and 4,000,000ns (netem reads "3ms 1ms" as 3ms ± 1ms)
"Boltzmann Filter"
At minimum, the LAN should be as secure as the Internet.
Maybe Internet attackers also are impacted.
This is a lot easier to deploy. That really matters.
But does it work?

What Could Go Wrong?


All timing noise can be averaged out eventually, so a global random delay can't work
Pretty much all password comparisons are done with non-constant time compares, so I guess all passwords are vulnerable?
Here's some SSH "0day":

    sys_auth_passwd(Authctxt *authctxt, const char *password)
    {
        /* Encrypt the candidate password using the proper salt. */
        encrypted_password = xcrypt(password,
            (pw_password[0] && pw_password[1]) ? pw_password : "xx");
        return (strcmp(encrypted_password, pw_password) == 0);
    }

strcmp is not constant time. So, you just offline brute force for passwords that have certain characters and see how far you get.
It is highly unlikely that the above attack actually works
Nanosecond differentials are too small to recover
Maybe not locally...hmmm

What We Really Need To Know


How much timing noise, of what nature, will permanently obscure how much timing signal beyond the point of infeasible return?
Somewhere between 1 nanosecond and 1 day there is an amount of noise that will indefinitely obscure an n nanosecond differential
There's likely to be an equation here
"CSI Enhance" has its limits
There is a limit to how much lag we can ask for, from the performance guys
It is higher for some requests than for others
We might require more lag than perf is willing to give (at least in general)
Need to discover these numbers
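
A first guess at that equation, as a sketch only. It assumes the injected jitter is independent and uniform, and that the attacker simply averages N samples per candidate and compares means. Under those assumptions the standard sample-size estimate is N >= 2 * (z * sigma / delta)^2, with sigma = width / sqrt(12) for uniform jitter. Real attackers filter more cleverly than averaging, so treat the output as a rough scale, not a security bound. `samples_needed` and its parameters are hypothetical names.

```python
import math

def samples_needed(jitter_width_ns: float, differential_ns: float, z: float = 3.0) -> int:
    """Rough samples per candidate to resolve a timing differential by averaging."""
    sigma = jitter_width_ns / math.sqrt(12)  # std dev of a uniform distribution
    # The difference of two N-sample means has std dev sqrt(2) * sigma / sqrt(N);
    # require the differential to stand z of those above the noise.
    return math.ceil(2 * (z * sigma / differential_ns) ** 2)

# Samples to see a 100ns differential under 1us, 2ms and 100ms of uniform jitter.
for width_ns in (1_000, 2_000_000, 100_000_000):
    print(width_ns, samples_needed(width_ns, differential_ns=100))
```

The cost to the attacker grows with the square of the jitter width, while the cost to the defender (added latency) grows only linearly. Whether that gap is wide enough in practice is exactly the number that needs discovering.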

What could actually go wrong


The distribution of lag from the interface may be easy to filter
Quantized into 1ms chunks?
Gaussian when it should be uniform, or uniform when it should be Gaussian
Could be filterable thanks to TCP timestamps (which have ~10ms accuracy, but also have sharp edges)
All of the above can be fixed, the question is if they need to be
The perfect (constant time comparisons) is the enemy of the good (interface-wide jitter)
Jitter does not need to apply to all packets; could be a TCP setsockopt or whatnot
Could also be applied at the end of a PHP script

Another Day, Another Time


"RSA is broken!"
No, not the thing with the smartcards that would (maybe, depending on vendor) leak their private key
No, not the thing with the SecurID seeds that were stolen
The thing with certificates with easily breakable RSA keys
Something like 1 in 200 RSA keys on the Internet failed!
Hughes and Lenstra announced first; Nadia Heninger had parallel research
At the time, the break was blamed on RSA itself
Two primes in RSA (p and q)
If either is repeated across keys (p and q1, p and q2), then all are easy to derive
Euclid's Greatest Common Divisor
"RSA is bad!"
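
To see why a repeated prime is fatal, here is a toy sketch with deliberately tiny primes (real moduli are 1024 bits or more; the numbers and the function name are made up for illustration). Two public moduli that share a prime give it up to a single GCD, with no factoring required.

```python
import math

def recover_shared_factor(n1: int, n2: int):
    """If two RSA moduli share exactly one prime, return (p, q1, q2); else None."""
    p = math.gcd(n1, n2)
    if p == 1 or p == n1 or p == n2:
        return None  # nothing shared, or the moduli are identical
    return p, n1 // p, n2 // p

# Toy example: both "keys" were generated with the same first prime.
p, q1, q2 = 101, 103, 107
n1, n2 = p * q1, p * q2
print(recover_shared_factor(n1, n2))  # (101, 103, 107)
```

Both private keys fall out immediately, which is why the problem is the random number generator that handed out the same prime twice, not RSA.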

Reality
Bad random number generators create trapdoor functions in all cryptosystems
Rather than breaking the crypto, you guess the key
Basic concept of 2011's Phidelius (expanded a password into a pseudorandom stream, which was then used to feed a key generator for RSA/DSA/ECC).
Bad RNG isn't a bug, it's a feature!
They thought they'd shown RSA was bad
They actually showed that RNGs are still broken
Debian's bug wasn't just Debian's
Weren't operating systems supposed to fix this?

Theory
Collecting and providing entropy is hard; let the operating system do it for you
/dev/random for good bits, /dev/urandom for best effort bits
If /dev/random runs out of bits, block until more are found
Sources for entropy
Hardware RNG
Keyboard
Mouse
Disk Rotation (as impacted by air)
Problem: Lots of environments don't have any of that

Actual Environments
Desktops
Humans w/ keyboards and mice
Often disks
Servers
Sometimes have disks
VMs
Embedded devices

The Reality of Hardware RNG


It's just not there.
Yes, I know Ivy Bridge is coming out with a Hardware RNG. In 2012.
That's top of the line gear now.
Yes, I know some TPMs are reported to have Hardware RNGs.
For some reason, people treat TPM hardware as unstable radioactive gunk
It's also rarely in embedded kit

What's Happening: An Analogy


Proteins cause cancer
http://ukpmc.ac.uk/abstract/MED/3007842/reload=0;jsessionid=3X3Cs6G7VbyRT1xEPcUX.4
Carbohydrates cause cancer
http://www.smh.com.au/lifestyle/diet-and-fitness/high-carbohydrate-diet-tied-to-cancer-20110616-1g4o9.html
Fats cause cancer
http://www.telegraph.co.uk/health/healthnews/5650141/High-fat-diet-can-increase-risk-of-deadly-cancer.html
Alcohol causes cancer
http://pubs.niaaa.nih.gov/publications/arh25-4/263-270.htm
So you don't consume proteins, carbohydrates, fats, or booze.
You starve to death.

What Actually Happens


How do I know? I actually asked some devs.
1) They have some code that depends on /dev/random
2) On initialization of their embedded device, the code tries to generate a key.
3) There's no human at the keyboard, no hand at the mouse, no disk to spin, and no hardware RNG. /dev/random blocks. The device is a brick.
Quite literally, starving for entropy
4) At best, they switch to /dev/urandom. At worst they switch to rand() and then they ship.
/dev/urandom is underseeded, though, and is still broken

A comparison
What perfectionists think will happen:
"It's broken! Surely they'll demand hardware RNG!"
What developers actually do:
"Security failed us again. Let's ship something that works."
Perfectionism caused (at least) 1 out of 200 RSA keys on the Net to be easily broken
It's almost certainly worse than that
Those are just the keys we can easily detect
We can do better.

TrueRand: An Old Hack [0]


Why do we like measuring keyboards and mice?
Humans and computers are not synchronized
Humans do not operate on nanosecond clocks like computers do
Human is slow clock, CPU is fast clock
Any system with two clocks has a Hardware Random Number Generator
Even if the error is one part per million, that's a bit per second per megahertz
The error is generally much larger than a part per million, just from thermal noise
(Not just thermal noise)

TrueRand: An Old Hack [1]


What TrueRand (from Matt Blaze and D.P. Mitchell, in 1996) does
Run the CPU in a tight loop (count++);
Every 16ms, fire an interrupt
On interrupt, shuffle the count variable, and integrate it into a buffer
The entropy comes in here: timer is slow clock, CPU is fast clock
After 11 shuffles, return the buffer as an integer
Hash two buffers together using SHA-1, return only the first byte
It ain't bad. But it's disowned.
That's too bad, because it would have prevented (at least) 1/200 keys from being broken.
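
The two-clock idea can be sketched in a few lines. This is not TrueRand: there is no interrupt handler and no shuffle step, just a tight counting loop cut off by a timer, with the counts hashed together. The function names and the use of `time.perf_counter` as the slow clock are assumptions for illustration, and nothing here has been evaluated as an entropy source.

```python
import hashlib
import time

def count_for(interval_s: float) -> int:
    """Fast clock: count loop iterations until the slow clock says stop."""
    deadline = time.perf_counter() + interval_s
    count = 0
    while time.perf_counter() < deadline:
        count += 1
    return count

def two_clock_byte(rounds: int = 11, interval_s: float = 0.016) -> int:
    """Hash several interval counts together and keep only the first byte."""
    h = hashlib.sha1()
    for _ in range(rounds):
        h.update(count_for(interval_s).to_bytes(8, "little"))
    return h.digest()[0]

print(two_clock_byte())
```

The count reached in each 16ms window varies from run to run, and that variation is the raw material. How much of it is genuinely unpredictable on a given platform is the open question.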

Why is it disowned?
(Literally, Matt Blaze was vaguely horrified that I'm revisiting this code)
Perfectionism
"We can't model its behavior. We don't know how good or bad it is, so we shouldn't do it at all."
This attitude has actually led to a reduction in available entropy in the Linux kernel
Used to look at interrupt counts from various devices
Now they aren't used, because they might be polluted

DakaRand 1.0 [0]


An update to the old model

Multiple generators
Sleepers: Measure usleep with
CLOCK_MONOTONIC
CLOCK_REALTIME
RDTSC (on x86 platforms): CPU counter; there are equivalents for ARM, MIPS
Incrementer: See how many times we can increment an integer within a certain time period (100% CPU)

DakaRand 1.0 [1]


RTC: Measure interrupts from the realtime clock using CLOCK_MONOTONIC (dedicated IRQ!)
128hz
8192hz
Threads: Measure the status of an integer modulated by a runaway thread (100% CPU)
Anyone who thinks computers are completely deterministic creations has never written threaded code ;)
Two Threads, One Int (one adds, one subtracts, main polls)
Two Threads, Two Ints (both add, main compares)
One Thread, One Int (one adds, main polls)
Possible addition: Noisier functions than add

DakaRand Flow
Short version
Push all bits into a SHA-256 Hash
Don't undercount entropy
Only count them as entropy when they pass Von Neumann's debiasing check
Count 1s to decide whether 0 or 1
Throw away 00 and 11, count only 01 and 10
Actually insert a 0 or a 1 when you count a bit
Don't overcount entropy
Scrypt (time/memory hard function) the resulting SHA-256 value
Make it miserable to guess entropy
Use the output of Scrypt as the input to AES-256-CTR, emit the resulting stream
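
The first half of that flow, sketched under assumptions. "Count 1s to decide whether 0 or 1" is read here as taking the parity of each sample's set bits, which is one interpretation and not confirmed by the slides. The scrypt and AES-256-CTR stages are left out. `von_neumann_bits` and `absorb` are illustrative names.

```python
import hashlib

def von_neumann_bits(raw_bits):
    """Yield one bit per unequal pair (01 -> 0, 10 -> 1); drop 00 and 11."""
    it = iter(raw_bits)
    for first, second in zip(it, it):  # consume the stream two bits at a time
        if first != second:
            yield first

def absorb(samples, needed_bits=256):
    """Hash every sample, but only count bits that survive debiasing."""
    samples = list(samples)
    h = hashlib.sha256()
    for sample in samples:
        h.update(sample.to_bytes(8, "little"))  # push everything into the hash
    parities = [bin(sample).count("1") & 1 for sample in samples]
    kept = list(von_neumann_bits(parities))
    for bit in kept:
        h.update(bytes([bit]))  # actually insert the counted bit
    if len(kept) < needed_bits:
        return None  # not enough counted entropy yet; caller keeps sampling
    return h.digest()

print(list(von_neumann_bits([0, 1, 1, 0, 0, 0, 1, 1, 1, 0])))  # [0, 1, 1]
```

Hashing all the raw samples avoids undercounting, while counting only debiased bits toward the target avoids overcounting.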

Attacking DakaRand
The game: Find a platform (Desktop/Server/VM/Embed) or an OS under which DakaRand provides poor entropy in one of its modes
Userspace/Hypervisor Scheduling
We're only called some number of times per second
These times per second may be at predictable intervals
If sufficiently predictable, they'll bias the output
Will they simultaneously and identically bias both clocked entities?
Autoclocking
If you time something against itself, you're going to have a bad time
Clocks are highly correlated to themselves
RTC and CLOCK_MONOTONIC could be the same underlying timer in a VM
VMs, more than anything else, should be exposing a random device (even if the random device itself uses clock differentials)
Still, this code seems to work on VMs

The VM Cloning Issue


/dev/random keeps bits around for a long time
When you clone an image, you end up with those bits being static for a long time
Meaning you keep generating the same entropy for a long time
DakaRand attempted guarantee: Each read is atomic
The results of the read may be used across multiple images
But two separate calls at two separate times MUST yield two uncorrelated streams
Can't do anything after the read is fully completed
During the read (which does last a second, due to scrypt) is already "after"
I actually don't think you can do better than this, though I was considering XORing the keystream with /dev/urandom anyway

Is The Underlying Use Of Crypto Safe?


Modified Von Neumann
We absorb a tremendous amount of data into our hash structure that has obvious patterns
If you have 100GB of 0s and 128 bits of actual randomness, output of hash has 128 bits of randomness
We do explicitly include the 0 and 1
Stream Function vs. Raw Output
Lots of raw output from a function tends to leak external state
So let's not leak external state. Cryptographic Stream Function
RNGs tend to have their own family of functions that are distinctly not cryptographically validated
Mersenne Twister, not AES-256 in Counter Mode
Is it in fact the case that strong (not RC4) cryptographic functions encompass all properties of RNGs?
Well, what does dieharder say?

DieHarder CipherSuite Test


About 16,000 CPU hours of DieHarder Entropy Tests were run across 21 ciphers, with inputs of either 16MB of zeros or (the same) 16MB of /dev/urandom output
About 24,000 different tests per cipher/content class
Thanks, Jamie Schwettman, who did all the work to make this sweep happen
No obvious statistical leanings to the data
Machine learning people are taking a look
Thanks, Prior Knowledge, Aleks Jakulin!
No conclusive findings yet
Releasing this data too

Neat tool, want it? csql: run SQL against CSV files

    $ cat pass2.csv | head -n 20000 | ./csql - "SELECT cipher, content, test, subtest, count(pv), avg(pv) from c group by cipher, content, test, subtest;" | head -n 10
    aes-128-cbc,urandom,dab_bytedistrib,0,10,0.0
    aes-128-cbc,urandom,dab_dct,256,10,0.47393035
    aes-128-cbc,urandom,diehard_2dsphere,2,10,0.627572674
    aes-128-cbc,urandom,diehard_3dsphere,3,10,0.664239991
    aes-128-cbc,urandom,diehard_birthdays,0,10,0.50850473
    aes-128-cbc,urandom,diehard_bitstream,0,10,0.017056331
    aes-128-cbc,urandom,diehard_count_1s_byt,0,10,0.441374983
    aes-128-cbc,urandom,diehard_count_1s_str,0,10,0.538731369
    aes-128-cbc,urandom,diehard_craps,0,20,0.0394997795
    aes-128-cbc,urandom,diehard_dna,0,10,0.396250338
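
The same trick is easy to approximate with Python's built-in `sqlite3` and `csv` modules. This is not csql itself, just a sketch of the idea. It assumes the CSV carries a header row naming the columns (the talk does not show the file's layout) and, like the query above, exposes the data as a table called `c`. `query_csv` is a hypothetical helper.

```python
import csv
import io
import sqlite3

def query_csv(csv_text: str, sql: str):
    """Load CSV text into an in-memory table named c and run one SQL query."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    header, data = rows[0], rows[1:]
    conn = sqlite3.connect(":memory:")
    # NUMERIC affinity lets number-like fields be stored as numbers.
    columns = ", ".join(f'"{name}" NUMERIC' for name in header)
    conn.execute(f"CREATE TABLE c ({columns})")
    marks = ", ".join("?" for _ in header)
    conn.executemany(f"INSERT INTO c VALUES ({marks})", data)
    return conn.execute(sql).fetchall()

sample = (
    "cipher,content,test,subtest,pv\n"
    "aes-128-cbc,urandom,dab_dct,256,0.25\n"
    "aes-128-cbc,urandom,dab_dct,256,0.75\n"
)
print(query_csv(sample, "SELECT cipher, content, test, subtest, count(pv), avg(pv) "
                        "FROM c GROUP BY cipher, content, test, subtest"))
```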

Kernel Recommendations
/dev/random MUST NOT block. Make an IOCTL if you must
Return data slowly if you like
CryptGenRandom on Windows does not appear to block
1 out of 200 RDP keys are not likely to be corrupt
Don't be so shy about interrupt sources
Care less about interrupt counts than interrupt timings
ftrace exposes microsecond timings, which might not be fine grained enough
Use nanosecond arrival times, as much as possible, from devices on foreign busses. The slower the foreign device is, the better.
You want to be measuring slow clocks against fast clocks
By definition, the kernel is interrupted at finer grain than userspace.
Obviously you don't have to include every last interrupt: it takes time to check the time.
Maybe consider this Modified Von Neumann construction

From The Bottom To The Top


Our biggest problems in security do not revolve around Random Number Generation
They revolve around languages
Language Theoretic Security: The hypothesis that security vulnerabilities are the consequence of the languages code is written in
Coined by Len Sassaman and Meredith Patterson
"Sapir-Whorf is true for code"
Corollary: If language got us into this mess, language can get us out
More important corollary: Languages are spoken or written by humans. Ignore their needs at your peril.

The Shift
One way to look at language theoretic security is through the lens of computability theory
Different classes of code have different amounts of power, and communication should be limited to the least amount of power necessary
Attacks expand power from Declarative, through Regular Expression, through Turing Complete
This is indeed a valid lens
Another lens

Diagramming Sentences: IT WAS ACTUALLY USEFUL

Injection Vulnerabilities: When Trees Disagree


Parsers, almost by definition, turn streams of bytes into trees
Injection Vulnerabilities exist when a sending language and a receiving language (which may or may not be the same) disagree on the nature of the tree sent
An extreme case of this is when bytes flow out into surrounding memory
But SQL Injection, LDAP Injection, XSS, etc. are all just situations where (generally) the sender thought it sent the user's data, but the receiver thought it received a peer's code
A purely declarative language can still (easily) be injected into, and complexity can remain declarative and still yield damage.
The attack is not in the increase of complexity, but in the transition of content from one identity/context to another through parse tree differentials.

So what?

We have to stop injection vulnerabilities
They're killing us
They're not l33t
They're totally effective
They're the vast majority of vulnerabilities ever written and discovered
We haven't actually fixed them
If we did fix them, they wouldn't still be costing billions of dollars
[Yes, we're going to revisit Interpolique. It's OK, we're going to bash it too]

What is the importance of another theoretical model?


It declares the rules of the game.
1) We want to synchronize parse trees.
2) We want developers to actually use our method.
A language unspoken has a term: A dead language
It explains what is surprisingly not understood
Why did XML become popular?
Instead of spending months figuring out just how to say hello, they have their code, you have your code, and it's self describing strings in each direction.
No fiddly "the eighth bit on the fourth byte changes everything"
Why did JSON become popular?
XML invented its own modes of being fiddly

The Hard Truth


Developers are in charge.
Not architects (they love ASN.1 and XML and WS-ZOMG)
Not academics (they love Haskell)
Not management (they love money)
Money is made by performance, reliability, maintainability, features, rapid development
Money is later lost by security, maybe
So, not us.

What is the #1 thing developers like?


Code working

Thus, the biggest explanation


Why is PHP so popular?
If you don't think it is, see here:

What is PHP incredibly good at?


Copy and paste code...and it works
We understand that CPAN makes Perl
We don't understand that PHP sample code makes PHP
Java Alternative: "Look how much code my IDE can write for me!"
Copy and paste with a suit on

The Language Success Metric


"What are the odds, if I try this, that it will work?"
Not, "when it fails, it fails fast!"
Surprisingly, nobody tracks this metric
(Except maybe Processing, which is incredible)
That's why all the successful languages tend to be the brainstorms of one guy
Art is science before we know what we're doing
PHP beats your favorite language
If we want to fix security, here is a good place to work

What's Wrong With ORMs?

Object Relational Mappers
Problems with SQL Injection? Don't use SQL!
Instead, the database just looks like your favorite language's native objects.
Great, right up until the moment you need to make a query.

Look at this. It matters.


+[,+[-[>+>+<<-]>[<+>-]+>>++++++++[<-------->-]<-[< [-

]>>>+[<+<+>>-] <[>+<-]<[<++>>>+[<+<->>-]<[>+<]]>[<]<]>>[-]<<<[[- ]<[>>+>+<<<-]>> [<<+>>]>>++++++++[<-------->-]<->>++++[<++++ ++++>-]<<[>>>+<<[ >+>[-]<<-]>[<+>]>[<<<<<+>>>>++++[<++++++++>-]>-]< <-<-]>[<<<<[]>>>>[<<<<->>>>-]]<<++++[<<++++++++>>-]<<[>>+>+<<<-]>>[<<+ >>-]+>>+++++[<----->-]<-[<[]>>>+[<+<->>-]<[>+ <-]<[<++>>>+[<+<+ >>-]<[>+<]]>[<]<]>>[-]<<<[[-]<<[>>+>+<<<-]> >[<<+>>-]+>-----------[<[-]>>>+[<+<->>-]<[>+<-]<[<++>>>+[<+<+>>-]< [>+<]]>[<]<]>>[-]< <<<<------------->>[[-]+++++[<<+++++>>]<<+>> ]<[>++++[<<+++++++ +>>-]<-]>]<[]++++++++[<++++++++>-]<+>]<.[ -]+>>+<]>[[-]<]<]

BrainF*ck's Rejoinder
There are more things in this world broken by punctuation than just BrainF*ck. Compare.

    $result = from('$name')->in($names) ->where('$name => strlen($name) < 5') ->select('$name');

32 characters of punctuation, deeply interspersed

    $result = query("SELECT $name FROM $names WHERE length($name)<5");

12 characters of punctuation (with large gaps)
Which would you rather write?
There's a reason SQL persists after all these years.
It's really expressive and surprisingly without noise.
Put another way: It's a language that's shockingly good for structured queries.
Turns out this matters.

The Classics
Escape?
mysql_real_escape_string: really? 25 characters?
Bigger problems:
Fails open: code still works if it's just missing
Greppability is huge: you can't grep for a missing escape!
Escapes are a blacklist. When's the last time you saw a blacklist work properly?
Parameterization
First you declare a template for a query
Then you link individual variables to the template, on a positional basis
This is the first argument
This is the second argument
MAYBE, if you're lucky, your language supports argument aliases.
The argument marked with :name should get the value of the variable name
One line of code becomes many
Resources need to be synced
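
What the three styles look like side by side, sketched with Python's built-in `sqlite3` rather than PHP and MySQL. The table `foo` and column `x` echo the examples later in this deck; the row and the hostile string are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE foo (x TEXT, y TEXT)")
conn.execute("INSERT INTO foo VALUES ('alice', 'admin')")

hostile = "nobody' OR '1'='1"

# String smashing: the receiver parses part of the user's data as code.
smashed = conn.execute(f"SELECT * FROM foo WHERE x = '{hostile}'").fetchall()

# Positional parameter: "this is the first argument".
positional = conn.execute("SELECT * FROM foo WHERE x = ?", (hostile,)).fetchall()

# Argument alias: ":name should get the value of the variable name".
named = conn.execute("SELECT * FROM foo WHERE x = :name", {"name": hostile}).fetchall()

print(smashed)     # [('alice', 'admin')]  (the OR clause matched every row)
print(positional)  # []
print(named)       # []
```

The parameterized forms keep both sides agreeing on the parse tree. The cost is exactly the extra ceremony described above, which is why it rarely gets written without pressure.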

Reality
Nobody has ever written a parameterized query without a gun to their head. We know, we hold the gun.
Even secure code, when audited, tends to be "safe things written quickly" and "we realized this was unsafe so we parameterized it"
That you have to threaten people with getting fired, is itself a data point.
For some strange reason, databases don't seem to provide mechanisms to disable unparameterized queries entirely
More interestingly, it's a crapshoot whether you get to parameterize at all
Just try to parameterize SELECT.
SQL, for all its elegance, builds a remarkably complex parse tree out of a mostly unpunctuated string
Some nodes in the parse tree can be filled by functions, some can be parameterized, etc.
It's a decent RNG to know what you can get away with

Interpolique [0]
Released in 2010 at HOPE
Concept for eliminating injection attacks while retaining dangerous (but developer preferred) coding styles
Both SQLi and XSS
Basic idea
SELECT * FROM foo where x=$x and y=$y
Humans can pretty easily see the separation between code and data. Data begins with $. Code does not.
The language throws that data away and just smashes strings together.
Does it have to?

Interpolique [1]
The original approach for Interpolique
First, use an alternate syntax to identify the desired variables
SELECT * FROM foo where x=^^x and y=^^y
Then, create a function that returns the code we'd have liked the developer to write.

    $stmt = $conn->prepare("SELECT * FROM foo where x=? and y=?");
    $stmt->bind_param("ss", $x, $y);
    $stmt->execute();

Finally, evaluate the generated code

    eval(b("SELECT * FROM foo where x=^^x and y=^^y"));

Eval is, surprisingly, the only way to retrieve the values of $x and $y from inside the function b().

What's Wrong With Interpolique?

What if the dev writes:

    eval(b("SELECT * FROM foo where x=$x and y=$y"));

If $x and $y are attacker controlled, he's not far from an eval that will run code in PHP's context!
The b() function is in a position to defend the code that ultimately enters eval, but now you're entirely dependent on b() knowing what PHP will do given arbitrary bytes.
GOOD LUCK WITH THAT
Highly greppable error case, but it's pretty scary

Building A Safe Interpolique


Eval only exists so that variables from the calling scope can be dereferenced
One approach is to implement create_selfscoped_function()
Returns a function that always runs in the scope of its parent
Could implement proxies so it can only read variables, and can't rewrite

    $rows = $mysql_safequery("select * from foo where x=^^x and y=^^y");

Requires a patch to PHP -- Daniel Zulla is working on this!

Code Rewriting?
If we know what we would have liked developers to have written, why don't we just transform code once?
Never really been a fan of this
Have you ever audited autogenerated code?
What do you do when the code looks like:

    $z = "SELECT * from foo where x=$x and y=$y;";
    $rows = mysql_query($z);

Static analysis can of course find such situations (thus knowing $x came in from a HTTP variable) but most devs don't have access to such static analysis tools
Should they?

Tainting
What if we actually marked every character that came in from an HTTP query as tainted?
Metadata, on a character by character basis
Would survive passing from function to function
Might even survive reasonable mangling by built in filters
Then, you could write something like:

    mysql_query_safe("select * from foo where x=$x and y=$y;");

Even though $x and $y would expand, the wrapper function would see that those particular characters were once tainted with the mark of the web, and could rewrite the unsafe query around it
This still works with mysql_query_safe($x) when $x was assembled elsewhere, even concatenated
Could have problems with silent failure with filtering functions
Requires a patch to PHP
Daniel Zulla also working on this

SuperEncoding as Explicit Tainting


Based on discussions with Zane Lackey and Nick Galbreath at Etsy, based on an approach they're already running in production
What if all variables from the web were encoded in a whitelisted format?
Simple hex encoding -- &%41 which, coincidentally, renders as an A in any HTML parser
All non-DB access would have to go through accessors
r($x) to read, w($x) to write
Surprisingly easy to grep for access that isn't wrapped
Could do two things
mysql_query_safe($x) could simply treat all superencoded characters as data and parameterize accordingly
mysql itself could have its lexer modified to handle HTML encoding, exposing such characters to less of the SQL parser ("this is just a string"): very LangSec

A Last Minute Alternative


Perhaps we've got this backwards
Rather than tainting data as data, we mark code as code.
SQL tends not to be passed around from function to function, let alone parsed in the frontend

    $sql = c("select * from foo where x=");
    $sql .= $x;
    $sql .= c("and y =");
    $sql .= $y;

Then either mysql_query_safe or mysql itself (cowardly) refuses to execute anything with unmarked code
Or, if this is baked into MySQL, it just doesn't see bytes as code if they're not deeply marked as code
Moderately greppable: you're basically finding all SQL in your code and wrapping it with some sort of taint
Either implicit as per Zulla, or explicit as per Etsy
Most likely failure mode is an attacker controlled variable somehow getting inside of c();
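
A toy version of "mark code as code", sketched in Python with `sqlite3` instead of PHP and MySQL. Everything here is an assumption made for illustration: the `Code` class, the `c()` helper and `query_safe()` are invented names, and the fragments are passed as a list because plain string concatenation in Python would silently drop the mark.

```python
import sqlite3

class Code(str):
    """A string fragment the developer explicitly marked as SQL code."""

def c(fragment: str) -> Code:
    return Code(fragment)

def query_safe(conn, parts):
    """Treat Code fragments as SQL text; bind everything else as a parameter."""
    sql, params = [], []
    for part in parts:
        if isinstance(part, Code):
            sql.append(part)
        else:
            sql.append("?")      # unmarked bytes never reach the SQL parser as code
            params.append(part)
    return conn.execute(" ".join(sql), params).fetchall()

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE foo (x TEXT, y TEXT)")
conn.execute("INSERT INTO foo VALUES ('alice', 'admin')")

x, y = "nobody' OR '1'='1", "admin"
print(query_safe(conn, [c("select * from foo where x ="), x, c("and y ="), y]))  # []
```

The failure mode named above survives here too: if attacker input ever ends up inside `c()`, it is trusted as code.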

This is what LangSec means


What are people trying to say?

How can we make it easier to say that?


How hard will it be for people to migrate?
What errors will they make when trying to use this?
Can we limit how much code might contain a bug?
CARE ABOUT YOUR DEVS OR THEY WILL NOT CARE ABOUT YOU

What's Going On With The Web?

It doesn't matter what code you write, if there are parties in the middle changing or blocking what you send
Content alteration and blocking is becoming a real thing
Verizon is claiming the first amendment right to rewrite Internet connections
Entire countries are silently blocking web pages
Indonesia's blocking a million porn sites in the run up to Ramadan

What Went Wrong With N00ter


N00ter was a really fun (and really powerful) mechanism for detecting network manipulation
Allowed a remote server and a cooperating client to pretend to have a conversation with anyone on the Internet, using any protocol
To any MITM, it would look like a real, unmodified conversation
So any alterations that might normally hit the real server, would hit this too
Unfortunately, N00ter does a lot of very low level packetcrafting, meaning (realistically) it requires custom hardware in front of user machines
This is not fun to deploy
Especially if you need to get between NAT and actual network connection
Not impossible. Definitely improbable.

What Else Can We Use?


Executable code on the client
OONI-Probe
Web Pages with Iframes
Herdict ("Herd Verdict")
Needs either user cooperation, or a Chrome extension, to know if content is up or down
Is it possible to determine whether content is up or not, from just a web page?
Can we crowdsource censorship data?
Maximize data per user
Minimize installation load per user

Imaging
Browsers' Same Origin Policy usually prevents web pages from doing much with one another
You wouldn't want Yahoo able to read from your Gmail account
But there is one exception
Any domain is allowed to load any other domain's images
Beyond that, it's allowed to know that the load was successful
Not merely that there was a file at that location, but that it was actually an image
You even get image dimensions (which you'd have to, because it resizes the page)
If a domain is being censored, the image will not load
What one image is on most domains?
Favicon.ico
(It's the picture to the left of "Google" in the tab)

So this is CensorSweeper
(Also by Joseph Van Geffen and Michael Tiffany) Written for Wall Street Journal Data Transparency Hackathon

What's going on

    img = new Image();
    img.onload = function(event) { }  // render favicon
    img.onerror = function(event) { validate(); }
    img.src = "http://somesite.com/favicon.ico";

The above is done in parallel, reading from a list of sites that have confirmed presence of favicon.ico
Six failures are required before a bomb is dropped on the map

Error Handling
Six failures isn't actually enough!
Web browsers provide remarkably little feedback to a developer to know what's failing, and why
Put simply, flow control hasn't really been implemented for the web
Everything's been designed around infinite bandwidth
For reliability, going to need to shut down all other traffic, and then do two simultaneous lookups
One for a known-up site, the other for the supposedly-down site
That being said, CensorSweeper works pretty well
Can we do better?
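The paired-lookup idea above (one known-up control site, one supposedly-down site) can be sketched as a small decision table. This is only an illustration in Python, not CensorSweeper's code: the names Verdict, classify and summarize are hypothetical, and reusing the six-failure threshold for paired trials is an assumption.

```python
from enum import Enum


class Verdict(Enum):
    REACHABLE = "reachable"
    BLOCKED = "possibly blocked"
    INCONCLUSIVE = "inconclusive"


def classify(control_ok, target_ok):
    """Interpret one paired lookup (control site vs. suspect site)."""
    if target_ok:
        return Verdict.REACHABLE
    if control_ok:
        # The control loaded, so the client's network works: the failure is
        # specific to the target.
        return Verdict.BLOCKED
    # Both failed: the client may simply be offline or congested.
    return Verdict.INCONCLUSIVE


def summarize(trials, required_failures=6):
    """Combine many (control_ok, target_ok) pairs into one verdict.

    required_failures=6 mirrors the threshold on the slide; applying it to
    paired trials is an assumption of this sketch.
    """
    verdicts = [classify(c, t) for c, t in trials]
    if Verdict.REACHABLE in verdicts:
        return Verdict.REACHABLE
    if verdicts.count(Verdict.BLOCKED) >= required_failures:
        return Verdict.BLOCKED
    return Verdict.INCONCLUSIVE
```

Failures where the control also failed are never counted toward the threshold, which is the point of pairing: a dead network connection should not drop a bomb on the map.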

Sockets
Once upon a time, web browsers could act like proxies, giving you connections anywhere
There were bugs in Flash and Java; we fixed them
They can now only create connections to IP addresses that invite them
But ~20% of the time there are transparent proxies between web servers and their users
See "Staring into the Abyss" by me, or "Socket Capable Browser Plugins Result In Transparent Proxy Abuse" by Robert Auger
This has been known, but not explored for mapping censorship!

HTTP Censorship Detection


1) Using Flash (or HaXe), create an HTTP connection back to your own IP on port 80
Host a socket policy file, so Flash allows this
2) Request anything, from any domain
If the request comes to you, there is no transparent proxy
Otherwise, the request will be hijacked by the proxy, serviced, and sent back to your Flash app
You now see what that user would see, if they browsed to that site!
You can then submit it back to yourself.
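The server side of the two steps above can be sketched roughly as follows. This is a minimal Python illustration, not the talk's implementation: it assumes the client puts a random token in the request path so the server can recognize its own probe, and build_probe, parse_request and reached_us_directly are hypothetical names.

```python
def build_probe(host, token):
    """Request the client sends to OUR IP on port 80, naming someone else's host."""
    request = "GET /%s HTTP/1.1\r\nHost: %s\r\nConnection: close\r\n\r\n" % (token, host)
    return request.encode("ascii")


def parse_request(raw):
    """Return (method, path, host) from raw request bytes, or None if malformed."""
    head = raw.split(b"\r\n\r\n", 1)[0].decode("latin-1")
    lines = head.split("\r\n")
    parts = lines[0].split(" ")
    if len(parts) != 3:
        return None
    method, path, _version = parts
    host = ""
    for line in lines[1:]:
        name, sep, value = line.partition(":")
        if sep and name.strip().lower() == "host":
            host = value.strip()
    return method, path, host


def reached_us_directly(raw, token):
    """True if a probe carrying our token arrived at our own listener.

    If it did, nothing on the path hijacked the connection based on the Host
    header. If the probe never arrives and the client still got a response,
    a transparent proxy serviced it instead.
    """
    parsed = parse_request(raw)
    return parsed is not None and parsed[1] == "/" + token
```

In the hijacked case the interesting data is on the client: whatever body came back is what that user would see for that site, and it can be posted back to the server over a separate request.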

HTTPS Certificate Extraction


Just as HTTP traffic on 80/tcp is hijacked, so may HTTPS traffic on 443/tcp
MITM may have an alternate certificate for you
But (if you're careful) it can't tell the difference between the browser starting SSL, and Flash/HaXe starting SSL
It has to know which domain to pretend to have a certificate for
The proxy can parse the Server Hello, with its certificate (It's your server saying hello)
The proxy can parse the Client Hello, with its Server Name Indication (It's your Flash app saying hello)
You can actually host the real Facebook certificate, or even proxy the real Facebook SSL endpoint
Hard to keep track of all of Facebook's IPs
It has to forge the certificate, before you have to prove you actually have Facebook's private key (assuming you aren't proxying)
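To make the Server Name Indication point concrete, here is a hedged Python sketch of the SNI extension as it is laid out inside a Client Hello (RFC 6066). It covers only the extension bytes, not a full handshake, and the function names are hypothetical. The hostname sits in plaintext, which is all a proxy needs to decide whose certificate to forge.

```python
import struct


def build_sni_extension(hostname):
    """Encode a server_name extension (type 0) for a single DNS hostname."""
    name = hostname.encode("ascii")
    entry = b"\x00" + struct.pack("!H", len(name)) + name  # name_type 0 = host_name
    name_list = struct.pack("!H", len(entry)) + entry
    return struct.pack("!HH", 0, len(name_list)) + name_list


def parse_sni_extension(ext):
    """Return the first host_name in a server_name extension, else None."""
    ext_type, ext_len = struct.unpack_from("!HH", ext, 0)
    if ext_type != 0 or ext_len != len(ext) - 4:
        return None
    (list_len,) = struct.unpack_from("!H", ext, 4)
    offset, end = 6, 6 + list_len
    while offset + 3 <= end:
        name_type, name_len = struct.unpack_from("!BH", ext, offset)
        offset += 3
        if name_type == 0:
            return ext[offset:offset + name_len].decode("ascii")
        offset += name_len
    return None
```

A Flash/HaXe client that emits the same extension a browser would gives the proxy nothing to distinguish the two on, which is the "if you're careful" condition above.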

Slight Annoyance
No normal way, via Browser DOM, to determine the certificate that provided content
This at least allows a page to query for its exposed certificates. Kinda cool!


Limitations
You can test anyone's certificate, as long as the attacker isn't interposing themselves via DNS hijacking
The Flash app sees what's at the named IP; if hijacking is at the DNS layer, then Flash won't get hijacked
You are able to test your own certificate, but then the attacker has already MITM'd you and can alter your security validation layer

Full Proxying
One of the goals of N00ter was seeing if everyday content was being altered or slowed down
One of the headaches with these custom probes is writing these custom probes
How do you look just like a real web browser trying to access YouTube?
Answer: Be a real web browser trying to access YouTube
The last time we played with Flash and Sockets, we created a full VPN
But now sockets are limited to a single destination
It turns out that it may still be possible/useful to proxy an entire browser (at the server) down to the Flash app (in the client), which will then make open connections back to the server, which will proxy them to the rest of the Internet
This will allow, at minimum, a protocol-correct sequence of messages for HTTP and HTTPS that are only incorrect by destination IP
So basically, if the intercepting server doesn't care about IP correctness, you get to interrogate its ruleset with no installed code on the client

Last but not least: Scanning Networks Quickly


Actionable Intelligence: What can an attacker do today, that he couldn't do yesterday, for what class of attacker, to what class of victim?
Rather related to this: How many potential victims are out there?
I've run two major scans this year (that I've talked about)
Telnet: Determining presence of Telnet Encryption support
Answer: Very rare
RDP: Determining presence of open RDP access
Answer: VERY common

My Process
Once upon a time, simply flooding TCP SYNs was enough to find out what was out there
Nowadays, many, many IP addresses will three-way handshake, but there won't actually be anything there
Solution: Split process
1) Identify candidate IP addresses that are listening on a given port
2) Given a candidate, actually connect to the IP

More Detail
Candidate collection
For each IP, incrementing the first byte first (1.1.1.1, 2.1.1.1, 3.1.1.1), send a TCP SYN on the required port (23 for telnet, 3389 for RDP)
In a separate window, log TCP SYN|ACKs with tcpdump
tcpdump -w log 'tcp[tcpflags] = (tcp-syn|tcp-ack)'
Scanrand was being buggy, this maximized logging
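The "first byte first" ordering can be sketched as a counter whose least significant byte is written into the first octet. A minimal Python illustration; counter_to_ip and sweep_order are hypothetical names, and skipping reserved or blacklisted ranges is left out.

```python
import ipaddress


def counter_to_ip(n):
    """Map a 32-bit counter to an address whose FIRST octet changes fastest."""
    # to_bytes(..., "little") puts the low byte first; IPv4Address reads the
    # four bytes as octets in order, so the low byte becomes the first octet.
    return str(ipaddress.IPv4Address(n.to_bytes(4, "little")))


def sweep_order(start, count):
    """Yield `count` addresses in sweep order, beginning at counter `start`."""
    for n in range(start, start + count):
        yield counter_to_ip(n % 2**32)
```

Starting at counter 0x01010101, this produces 1.1.1.1, 2.1.1.1, 3.1.1.1, so consecutive SYNs land in different /8s rather than hammering one network with a burst.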

Candidate Inspection
Telnet Encryption: nmap team whipped up a quick check, so I just fed the IP list to it
Very few found

RDP Sweep: Black Mamba


Probably the most pleasant environment for reasonable scale TCP probing ever devised
http://rootfoo.org/blackmamba

from blackmamba import *

def get(host, port=80):
    # each yield hands control back to blackmamba until the I/O completes
    msg = "GET / HTTP/1.1\r\nHost: %s\r\n\r\n" % host
    yield connect(host, port)
    yield write(msg)
    response = yield read()
    yield close()
    print(response)

def generate(host, count=100):
    # one get() task per connection
    for i in range(count):
        yield get(host)

run(generate('example.com'))

You end up getting ~3000 IPs a second
May need to increase ulimit -n
May need to alter hardcoded limits in blackmamba.py

Can We Get Faster?


Always wanted to write a userspace TCP stack
HD Moore kinda kicked me into working on one for critical.io, his mysterious new scanning project
I am not at all beyond being motivated by other people's awesome and mysterious projects
Especially when they give me CPU and Network Bandwidth
So. Scanrand3! A new scanner that doesn't just flood SYNs, but actually connects to every node and extracts data
Original plan: TCP stack with SQLite as the backend
SELECT * FROM sockets WHERE data_sent!=data_acked and now()-data_sent_time>3 (to find sockets where a retransmit is needed) is just funny!
SQLite, in memory-only mode, is really really fast
160K inserts/sec fast
Unfortunately, that speed disappears when you add indexes
20K inserts/sec with two indexes
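For a feel of what "TCP state in SQLite" looks like, here is a small Python sketch using the standard sqlite3 module in memory-only mode. The table layout is an assumption built from the column names in the query above; the current time is passed in as a parameter because SQLite has no now() function.

```python
import sqlite3


def make_socket_table():
    """Create an in-memory database holding one row per tracked connection."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE sockets ("
        " ip TEXT, port INTEGER,"
        " data_sent INTEGER, data_acked INTEGER, data_sent_time REAL)"
    )
    return db


def needs_retransmit(db, now, timeout=3):
    """Return (ip, port) for sockets with unacked data older than `timeout` seconds."""
    rows = db.execute(
        "SELECT ip, port FROM sockets"
        " WHERE data_sent != data_acked AND ? - data_sent_time > ?"
        " ORDER BY ip",
        (now, timeout),
    )
    return rows.fetchall()


db = make_socket_table()
db.executemany(
    "INSERT INTO sockets VALUES (?, ?, ?, ?, ?)",
    [
        ("10.0.0.1", 80, 100, 100, 0.0),  # fully acknowledged
        ("10.0.0.2", 80, 100, 40, 0.0),   # unacked and stale
        ("10.0.0.3", 80, 100, 40, 9.0),   # unacked but recent
    ],
)
stale = needs_retransmit(db, now=10.0)
```

Without an index on data_sent_time this query is a full table scan, which is where the insert-speed versus index tradeoff described above starts to bite.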

New Plan: Let The Servers Keep TCP State

Details! Details!
Scanrand didn't get its speed by keeping track of who it did or didn't send traffic to
Why should Scanrand3?
1) Send SYN
Maximum Segment Size==1460
Window Size==1460 (for all packets)
2) Upon receiving a SYN|ACK, reply with an ACK
Include GET / HTTP/1.0 payload
Yes, you can put a payload in the initial ACK!
3) Upon receiving an ACK, if there is a payload, ACK it
Save the payload
4) Upon receiving a FIN|ACK, RST
Save the payload, if any
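Steps 2 through 4 above amount to a pure function from an incoming segment to an outgoing one, with no per-connection memory. A hedged Python sketch of that function follows; it is not Scanrand3's code, the names respond and Reply are hypothetical, and actual packet sending, checksums and the MSS/window fields are omitted.

```python
from collections import namedtuple

FIN, SYN, RST, PSH, ACK = 0x01, 0x02, 0x04, 0x08, 0x10
REQUEST = b"GET / HTTP/1.0\r\n\r\n"

# flags/seq/ack/payload describe the segment to send; saved is data to log.
Reply = namedtuple("Reply", "flags seq ack payload saved")


def respond(flags, seq, ack, payload=b""):
    """Decide the reply to one incoming segment using only that segment."""
    if flags & RST:
        return None  # connection refused or torn down: nothing to send
    if flags & SYN and flags & ACK:
        # Step 2: finish the handshake and piggyback the request on the ACK.
        # Our next sequence number is whatever the server just acknowledged.
        return Reply(ACK, ack, (seq + 1) % 2**32, REQUEST, b"")
    if flags & FIN:
        # Step 4: the server is done; keep any last data and reset.
        return Reply(RST, ack, (seq + len(payload) + 1) % 2**32, b"", payload)
    if flags & ACK and payload:
        # Step 3: acknowledge exactly the bytes received and save them.
        return Reply(ACK, ack, (seq + len(payload)) % 2**32, b"", payload)
    return None  # bare ACK: nothing to acknowledge
```

Because every field of the reply is derived from the incoming segment, a retransmitted SYN|ACK simply produces the same ACK with the same request again, which is the property the next slide relies on.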

No Local State
If the first SYN is dropped: OK, nobody's around to retransmit it
May want to log RST|ACK to avoid future retransmits
If the SYN|ACK is dropped to the client, server retransmits SYN|ACK
If the ACK w/ initial payload is dropped to the server, server retransmits SYN|ACK, causing new ACK w/ payload
If any ACK w/ response payload is dropped to the client, server will retransmit ACK w/ response payload
Same with FIN|ACK
Window size of 1460 means we always know which particular packet to acknowledge: only one in flight (usually)

Performance
Relatively unoptimized code on a well hosted but underpowered server (cheap Dual Opteron)
50-80K servers/sec w/ full payloads
3.25M IPs takes 60-80 seconds, retrieves about 800MB of content
Task is embarrassingly parallelizable across threads, databases, etc.
Should be able to use multiple bpf filters to route packets to their appropriate thread with kernel filtering
Writing to a SQLite DB, and then backing up to disk, is really fast (substantially faster than fwrite, though haven't tested a large mmap yet)
You basically reassemble payloads in SQLite as a postprocess

Security
Scanrand pioneered inverse SYN cookies: you protect against spoofed responses by validating fields in the response against hashes of data plus a secret only you know
16 bits in source port + 32 bits in sequence number are possible
May be able to get another 32 bits out of TCP Timestamps, which are usually supported
Haven't implemented yet, so very easy to poison me
Sequence space becomes less secure, the more data you actually send
You do know the exact size of each payload, so you can say "I only accept responses with no payload seq, payload 1 seq, payload 2 seq, etc"
Technically the other side can ACK at any byte offset, but that doesn't mean they actually will
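The inverse SYN cookie idea can be sketched as follows: derive the probe's source port and initial sequence number from a keyed hash of the target, then check that a SYN|ACK echoes them back. This is an illustrative Python sketch, not Scanrand's actual scheme. HMAC-SHA256 and the function names are assumptions, and restricting source ports to 1024 and above gives slightly fewer than the 16 bits mentioned above.

```python
import hashlib
import hmac
import socket
import struct


def probe_fields(secret, dst_ip, dst_port):
    """Derive (source port, initial sequence number) for a probe to dst."""
    msg = socket.inet_aton(dst_ip) + struct.pack("!H", dst_port)
    digest = hmac.new(secret, msg, hashlib.sha256).digest()
    src_port = 1024 + int.from_bytes(digest[0:2], "big") % (65536 - 1024)
    isn = int.from_bytes(digest[2:6], "big")
    return src_port, isn


def is_genuine_synack(secret, peer_ip, peer_port, our_port, ack):
    """Check a SYN|ACK against what we would have sent to that peer.

    A real reply is addressed to the derived source port and acknowledges
    ISN + 1. Someone who never saw our SYN has to guess both values.
    """
    src_port, isn = probe_fields(secret, peer_ip, peer_port)
    return our_port == src_port and ack == (isn + 1) % 2**32
```

Nothing is stored per probe: the secret alone lets the receiver recompute what it must have sent. For later segments the expected acknowledgment numbers would be ISN + 1 plus the known payload sizes, as the slide notes.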

Some Notes
Kernels have actually gotten kind of fast
Non-blocking connect() plus epoll should be able to get pretty fast
Certainly easier to code for that model!
Didn't work for me (not sure why)
This approach ultimately becomes fastest
Probably need a writev call to spew many packets w/o a write for each

More Notes
Can also try more efficient stores than SQLite
Giant allocation of RAM with fixed offsets per IP
MemSQL: Neat project by ex-Facebookers, compiles SQL to C++
They think even with the indexes they can do +100K
Can have merged approaches too
Only start keeping state if I like the response from the server
Note that stateless client + stateless server = no retransmits

What should the coding model be?


Flat file / command line?

C?
JavaScript? Lua?
Could implement support for nmap scripts

Most Important Feature


Blacklist support
Most networks don't mind getting swept
They certainly are, already
Some do
Part of being a whitehat is you let people know who you are, and listen to their requests
So you end up with a pile of IP ranges not to sweep
It can actually take a substantial amount of CPU if you check the list naively
Need to compile it into a quickly queryable structure
I don't think firewall rules apply to spoofed traffic
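One way to compile the list into a quickly queryable structure is to merge the ranges into sorted, non-overlapping intervals and binary-search them. A minimal Python sketch under that assumption; the talk does not say which structure was used, and compile_blacklist and is_blacklisted are hypothetical names.

```python
import bisect
import ipaddress


def compile_blacklist(cidrs):
    """Turn CIDR strings into parallel sorted lists of range starts and ends."""
    ranges = sorted(
        (int(net.network_address), int(net.broadcast_address))
        for net in map(ipaddress.ip_network, cidrs)
    )
    merged = []
    for lo, hi in ranges:
        if merged and lo <= merged[-1][1] + 1:
            # Overlapping or adjacent: extend the previous interval.
            merged[-1][1] = max(merged[-1][1], hi)
        else:
            merged.append([lo, hi])
    starts = [lo for lo, _ in merged]
    ends = [hi for _, hi in merged]
    return starts, ends


def is_blacklisted(compiled, ip):
    """O(log n) membership test against the compiled intervals."""
    starts, ends = compiled
    n = int(ipaddress.IPv4Address(ip))
    i = bisect.bisect_right(starts, n) - 1
    return i >= 0 and n <= ends[i]
```

Each lookup costs a handful of comparisons regardless of how many opt-out requests have accumulated, which matters when the check runs tens of thousands of times per second.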

Simple Architectural Note


Don't try to interact with the Linux firewall
Just pick another IP on the LAN and send from there
Respond to ARP traffic for it
(Yes, it is an advantage of the socket model that you don't need to requisition another IP)

Whew!
Lots of stuff!

Hope you enjoyed!


This may not be how you try to fix stuff, but it's what I try to do
Thanks to everyone cited in the slides
Thanks also to Nick, Johnny, Blackstock, Alex, Allessandra, Allessandra, and Andrew of The Sub for putting up with me in DEFCON mode ;)
