You are on page 1of 11

Hash Functions - Bart Preneel

November 2014

Hash functions
X.509 Annex D

RIPEMD-160

MDC-2

SHA-256

MD2, MD4, MD5

SHA-512

SHA-3

SHA-1

Cryptographic Hash Functions


This is an input to a crypto
cryptographic hash function. The input
is a very long string, that is
reduced by the hash function to a
string of fixed length. There are
additional security conditions: it
should be very hard to find an
input hashing to a given value (a
preimage) or to find two colliding
inputs (a collision).

Bart Preneel
KU Leuven - COSIC
firstname.lastname@esat.kuleuven.be

1A3FD4128A198FB3CA345932

Univ. of Piraeus, 11 November 2014


Insert presenter logo
here on slide master

Applications

Agenda
Definitions

short unique identifier to a string


digital signatures
data authentication

Iterations (modes)

one-way function of a string

Compression functions

protection of passwords
micro-payments

Constructions

confirmation of knowledge/commitment
g

SHA-3
SHA 3

pseudo-random string generation/key derivation


entropy extraction
construction of MAC algorithms, stream ciphers, block
ciphers,

Conclusions

2005: 800 uses of MD5 in Microsoft Windows


3

Preimage resistance

Security requirements (n-bit result)


preimage

collision

2nd preimage

preimage

in a password file, one does not store

(username, password)

but
(username,hash(password))

h(x)

h(x)

2n

= h(x)
2n

h(x)

h(x)

h
h(x)

2n/2

this is sufficient to verify a password


an attacker with access to the
password file has to find a preimage

2n
5

Hash Functions - Bart Preneel

November 2014

Second preimage resistance


2nd preimage

Collision resistance
hacker Alice prepares two versions
of a software driver for the O/S
company Bob

Channel 1: high capacity and insecure

h(x)
Channel 2: low capacity but secure
( authenticated
(=
th ti t d cannott be
b modified)
difi d)

an attacker can modify x but not h(x)

h(x)

= h(x)

he can only fool the recipient if he


finds a second preimage of x

2n
7

Collision resistance (2/2)


in many cryptographic protocols,
Alice wants to commit to a value x
without revealing it
Alice picks a secret random string r
and sends y = h(x || r) to Bob
in a later phase of the protocol
protocol, Alice
reveals x and r to Bob and he
checks that y is correct
if Alice can find a collision, that is
(x,r) and (x,r) with x x she can
cheat
if Bob can find a preimage, he can
learn x and cheat

Alice submits x for inspection to Bob


if Bob is satisfied, he digitally signs
h(x) with his private key
Alice now distributes x to users of
the O/S; these users verify the
signature with Bobs public key

h
h(x)

h(x)

2n/2

this signature works for x and for x,


since h(x) = h(x)

The birthday paradox

collision

x is correct code
x contains a backdoor that gives Alice
access to the machine

collision

how many people r do I need to have in a room to


have a probability of p=50% to have at least 2
people with the same birthday?
answer: 23

what is the probability that the birthdays of r people are distinct?

h
h(x)

r terms
q = 1 - p = 1 . 364/365 . 363/365 . 362/365 (365-(r-1))/365
q = 1-p 0.5 for r = 23

h(x)

intuition: number of distinct pairs of people is 23.22/2 = 253; each pair has
probability 1/365 to have the same birthday

2n/2
9

exercise: how many people do you need in a room to have a probability


of 0.50 to have 3 people with the same birthday?

10

The birthday paradox - proof

The birthday paradox (2)


given a set with S elements
choose r elements at random (with replacements) with r S
the probability p that there are at least 2 equal elements (a
collision) 1 - exp - r(r-1)/2S)
more precisely, it can be shown that
p 1 - exp - r(r-1)/2S)
if r < 2S then p 0.6 r (r-1)/2S

r terms

q = 1-p = 1 . ((S-1)/S) . ((S-2)/S) . ((S-(r-1))/S)


or q = k=1r-1 (S-k/S)
ln q = k=1r-1 ln (1-k/S) k=1r-1 -k/S = -r(r-1)/2S
Taylor: if x 1: ln (1-x) x
summation: k=1r-1 k = r (r-1)/2

for a hash function with an n-bit result, a collision can be


found in time 2n/2 (and memory 2n/2)

hence p = 1 q = 1 - exp - r(r-1)/2S)


11

12

Hash Functions - Bart Preneel

November 2014

Brute force (2nd) preimage

Pseudo-random function
computationally indistinguishable from a random function

multiple target second preimage (1 out of many):

Advhprf = Pr [ K $ K: AhK(.) 1] - Pr [ f $ RAND(m,n): Af 1]

if one can attack 2t simultaneous targets, the effort to find a single


preimage is 2n-t

RAND(m,n): set of all functions from m-bit to n-bit strings

multiple target second preimage (many out of


many):
titime-memory trade-off
t d ff with
ith (2n) precomputation
t ti and
d
storage (22n/3) time per (2nd) preimage: (22n/3)

f
? or ?

[Hellman80]

This concept makes only


sense for a function with a
secret key

answer: randomize hash function with a parameter S


(salt, key, spice,)

13

14

Brute force attacks in practice

Quantum computers
in principle exponential parallelism

(2nd) preimage search


n = 128: 14 B$ for 1 year if one can attack
parallel

240

inverting a one-way function: 2n reduced to 2n/2

targets in

[Grover96]

collision search: can we do better than 2n/2 ?

parallel collision search: small memory using


cycle
y
finding
g algorithms
g
((distinguished
g
p
points))

2n/3 computation + hardware [Brassard-Hoyer-Tapp98] = 22n/3


[Bernstein09] classical collision search requires 2n/4 computation
and hardware (= standard cost of 2n/2 )

n = 128: 1 M$ for 5 hours (or 1 year on 60K PCs)


n = 160: 56 M$ for 1 year
need 256-bit result for long term security (30 years or more)

15

16

Properties in practice
collision resistance is not always necessary
other properties are needed:

PRF: pseudo-randomness if keyed (with secret key)


PRO: pseudo-random oracle property (formalization of security
properties when there is no key)
near-collision resistance
partial preimage resistance (most of input known)
multiplication freeness

Iteration

( d off compression
(mode
i ffunction)
ti )

how to formalize these requirements and the


relation between them?

17

18

18

Hash Functions - Bart Preneel

November 2014

How not to construct a hash function

Hash function: iterated structure

Divide the message into t blocks xi of n bits each

IV

Message block 1: x1

Message block 2: x2

x1

H1

x2

H2

H3

x4

x3

split messages into blocks of fixed length and hash them


block by block with a compression function f
need padding at the end

Message block t: xt

efficient and elegant. but

Hash value h(x)


19

Security relation between f and h

Security relation between f and h (2)

iterating f can degrade its security

solution: Merkle-Damgrd (MD) strengthening


fix IV, use unambiguous padding and insert length at the end

trivial example: 2nd preimage

IV

H1

x2

x1

IV = H1

H2

x2

f is collision resistant h is collision resistant

H3

[Merkle89-Damgrd89]

H2

H3

f is ideally 2nd preimage resistant h is ideally 2nd


preimage resistant [Lai-Massey92]

x4

x3

20

many other results

x4

x3

21

22

Attacks on MD-type iterations

Security relation between f and h (3)


length extension: if one knows h(x), easy to compute h(x || y) without knowing x or IV

IV

x2

x1
IV

H1

H1

H2

H3

Sec security degrades lineary with number 2t of message blocks


hashed: 2n-t+1 + t 2n/2+1
appending the length does not help here!

H4= h(x || y)

x3

x1

H1

x2

multi-collision
multi collision attack and impact on concatenation
herding attack

[J
[Joux04
04]

[Kelsey-Kohno06]

reduces security of commitment using a hash function from 2n


on-line 2n-t + precomputation 2.2(n+t)/2 + storage 2t

solution: output transformation


IV

long message 2nd preimage attack


[Dean-Felten-Hu'99], [Kelsey-Schneier05]

H3= h(x)

x3

x2

x1

H2

H2

x3

H3

x4

example: t=n/2 on-line 2n/2 + precomputation 2.23n/4 + storage 2n/2

g
23

24

Hash Functions - Bart Preneel

November 2014

How (NOT) to strengthen a hash function?

Multiple collisions multi-collision

[Coppersmith85][Joux04]

Assume ideal hash function h with n-bit result


(2n/2) evaluations of h (or steps): 1 collision

answer: concatenation
h1 (n1-bit result) and h2 (n2-bit result)

h(x)=h(x)

(r. 2n/2) steps: r2 collisions

iintuition:
t iti
th
the strength
t
th off g against
i t
collision/(2nd) preimage attacks is the
product of the strength of h1 and h2

if both are independent

h1

h(x1)=h(x1)) ; h(x2)=h(x2)) ; ; h(xr2)=h(xr2))

h2

(22n/3) steps: a 3-collision


h(x)= h(x)=h(x)

g(x) = h1(x) || h2(x)

(2n(t-1)/t) steps: a t-fold collision (multi-collision)

but.

h(x1)= h(x2)= =h(xt)


25

26

Multi-collisions on iterated hash function (2)


IV

x1, x1

H1
x2, x2

H2

x3, x3

H3

Multi-collisions [Coppersmith85][Joux 04]

finding multi-collisions for an iterated hash function is not

much harder than finding a single collision (if the size of the
internal memory is n bits)

algorithm
generate R =

x4, x4

f IV:
for
IV collision
lli i for
f block
bl k 1:
1 x1, x1

for H1: collision for block 2: x2, x2


for H2: collision for block 3: x3, x3

R
2n1/2-fold

multi-collision
lti lli i ffor h2
in R: search by brute
force for h1

h1

Time: n1. 2n2/2 + 2n1/2

for H3: collision for block 4: x4, x4

<< 2(n1 + n2)/2

now h(x1||x2||x3||x4) = h(x1||x2||x3||x4) = h(x1||x2||x3||x4) =


= h(x1||x2||x3||x4) a 16-fold collision (time: 4 collisions)

h2

g(x) = h1(x) || h2(x)

27

Multi-collisions [Coppersmith85][ Joux 04]

28

Improving MD iteration

consider h1 (n1-bit result) and h2 (n2-bit result), with n1 n2.


concatenation of 2 iterated hash functions (g(x)= h1(x) || h2(x))
is as most as strong as the strongest of the two (even if both
are independent)

salt + output transformation + counter + wide pipe


salt

IV

cost of collision attack against g at most


n1 . 2n2/2 + 2n1/2 << 2(n1 + n2)/2
cost of (2nd) preimage attack against g at most
n1 . 2n2/2 + 2n1 + 2n2 << 2n1 + n2
if either of the functions is weak, the attacks may work better

29

2n

x1

salt

H1

f
1

2n

x2

f
2

salt

H2
2
2n

x3

f
3

salt

salt

H3
2n

x4

security reductions well understood


many more results on property preservation
impact of theory limited

2n

|x|
4

30

Hash Functions - Bart Preneel

November 2014

Improving MD iteration

Tree structure: parallelism

degradation with use: salting (family of functions,


randomization)
or should a salt be part of the input?

[Damgrd89], [Pal-Sarkar03], [Keccak team13]

x1 x2

x3

x7

x8

PRO: strong output transformation g

x5 x6

x4

also solves length extension

long message 2nd preimage: preclude fix points

counter f fi [Biham-Dunkelman07]

multi-collisions, herding: avoid breakdown at 2n/2


with larger internal memory: known as wide pipe

e.g., extended MD4, RIPEMD, [Lucks05]

31

32

Permutation () based: sponge


x1

x2

x3

h1

x4

Modes: summary
growing theory to reduce security properties of
hash function to that of compression function
(MD) or permutation (sponge)

h2

H10
r

preservation of large range of properties

H20

relation between properties

it is very nice to assume multiple properties of the


compression function f, but unfortunately it is very
hard to verify these

squeeze

absorb

if result has n bits, H1 has r bits (rate), H2 has c bits (capacity) and
the permutation is ideal collisions
min (2c/2 , 2n/2)
2nd preimage min (2c/2 , 2n)
preimage
min (2c , 2n)
33

still no single comprehensive theory

34

Single block length: [Rabin78]

IV

Compression
p
functions

H1

x4

x3

x2

x1

H2

E
H2

H3

H4

H2

Merkles meet-in-the-middle: (2nd) preimage in time 2n/2


select 2n/2 values for (x1,x2) and compute forward H2
select 2n/2 values for (x3,x4) and compute backward H2
by the birthday paradox expect a match and thus a (2nd) preimage

35

35

36

Hash Functions - Bart Preneel

November 2014

Block cipher (EK) based: single block length

Permutation () based
small permutation
Grstl

parazoa

Miyaguchi-Preneel

Davies-Meyer

JH
Hi

Hi-1

xi

xi

Hi

H1i-1

H2i-1

Hi-1

H2i

Hi-1

xi

xi
H1i

output length = block length m; rate 1; 1 key schedule per encryption

Hi

12 secure compression functions (in ideal cipher model)

lower bounds: collision 2m/2, (2nd) preimage 2m


[Preneel+93], [Black-Rogaway-Shrimpton02], [Duo-Li06], [Stam09],
37

Block cipher (EK) based: double block length


(3n to 2n compression)

what is the best


collision/preimage security for
2 block cipher calls?

[Mennink12]

Number of
block cipher
calls

For optimal collision security:


what is the best preimage
security for s block cipher
calls? (upper bounds are
known)

[Jetchev-zen-Stam12]

(2)

2n/3

MDC-4
MDC-2

(2)

analysis of slightly more complex schemes very


difficult
MD versus sponge debate:

(4)

sponge is simpler
should xi and Hi-1 be treated differently?

trivial

(1)

n 5n/4

3n/2

2n

preimage
39

40

Hash function history 101


RSA

DES
1980

Hash function
constructions
t ti

HARDWARE

5n/8
3n/5
n/2

powerful tools available

single
block
length

1990

2000

SOFTWARE

(3)

Iteration modes and compression functions


security of simple modes well understood

Open problems:

collision

38

double
block
length
AES

small
permutations

ad hoc
schemes

Dedicated
MD2
MD4
MD5

security
reduction
for
factoring,
DLOG,
lattices

SNEFRU

SHA-1
RIPEMD-160
SHA-2
Whirlpool
SHA-3

2010
41

41

42

Hash Functions - Bart Preneel

November 2014

MDx-type hash function history


MD4

Ext. MD4

HAVAL

RIPEMD

x0

RIPEMD-160

SHA-2

92

94
95
02

SHA-3

12

A0

B0

A1

B1

C0

D0

C1

D1

C16

D16

93

SHA(-0)

[Rivest91]: 4 rounds of 16 steps

H i-1

90
91

MD5

SHA-1

MD5

x15

A16

B16

xp(0)

A17

B17

xp(15)

A32

B32

xq(0)

A33

B33

xq(15)

A48

B48

C48

D48

xr(0)

A49

B49

C49

D49

xr(15)

A64

B64

C17

D17

C32

D32

C33

D33

xi
K
i

C64

D64

+
Hi

43

44

The complexity of collision attacks

State updates in the MD4 family

brute force: 1 million PCs (1 year) or US$ 100,000 hardware (4 days)


MD4

SHA/SHA-1

SHA-256

90
80
70
60
50
40
30
20
10
0

Design principles copied in MD5, RIPEMD, HAVAL, SHA,


SHA-1, SHA-256, ...

19
92
19
92
19
94
19
96
19
98
20
00
20
02
20
04
20
06
20
08
20
10

All hash functions in use today

MD4
MD5
SHA-0
SHA-1
Brute force

45

46

Slide credit: C. Rechberger

SHA-1

Rogue CA attack

designed by NIST (NSA) in 94

[Sotirov-Stevens-Appelbaum-Lenstra-Molnar-Osvik-de Weger 08]

log2 complexity

request user cert; by special

[Wang+05]
[Mendel+08]
[Wang+04]
[Sugita+06]

[Manuel+09]
[Stevens12]

collision this results in a fake CA


cert (need to predict serial
number + validity period)

impact: rogue CA
that can issue certs
that are trusted by
all browsers

[McDonald+09]

Most attacks
unpublished/withdrawn

User1

Self-signed
root key
CA1

CA2

User2

User x

Rogue CA

6CAshaveissuedcertificatessignedwithMD5in2008:

prediction: collision for SHA-1 in the next 12 months

47

RapidSSL,FreeSSL(freetrialcertificatesofferedbyRapidSSL),TCTrustCenter
AG,RSADataSecurity,Verisign.co.jp
48

Hash Functions - Bart Preneel

November 2014

Upgrades

SHA-2 [FIPS180,NIST02]
SHA-224, SHA-256, SHA-384, SHA-512

RIPEMD-160 is good replacement for SHA-1

upgrading algorithms is always hard

non-linear message expansion


64/80 steps
SHA-384 and SHA-512: 64-bit architectures

SHA-256 collisions: 31/64 steps 265.5 [Mendel+13]

TLS uses MD5 || SHA


SHA-1
1 to protect algorithm
negotiation (up to v1.1)

free start collision: 52/64 steps (2128-) [Li+12]

non-randomness
d
4
47/64
/64 steps ((practical)
i l) [Bi
[Biryukov+11][Mendel+11]
k
11][M d l 11]

SHA-256 preimages: 45/64 steps (2256-) [Khovratovitch12]

upgrading negotiation algorithm is even


harder: need to upgrade TLS 1.1 to TLS 1.2

implementations today faster than anticipated


18 cycles/byte on Core 2 (2008) 7.8 cycles/byte on Haswell (2013)

adoption accelerated by other attacks on TLS

since 2013 deployment in TLS 1.2

49

50

NIST AHS competition (SHA-3)


SHA-3: 224, 256, 384, and 512-bit message digests
(similar to SHA-2)
Call:

02/11/07

Deadline (64): 31/10/08

SHA-3

Round 1 (51): 09/12/08

80

(bit and
(bits
db
bytes)
t )

60

Round 2 (14): 24/7/09

64
51

Final (5):

10/12/10

Selection:

02/10/12

40
14

20

Q4/10

Q4/12

0
Q4/08

Q3/09
round 1

final
52

The candidates

Preliminary cryptanalysis

53

Slide credit: Christophe De Cannire

round 2

51

51

54

Slide credit: Christophe De Cannire

Hash Functions - Bart Preneel

November 2014

End of Round 1 candidates

Round 2 candidates

55

Slide credit: Christophe De Cannire

56

Slide credit: Christophe De Cannire

Properties: bits and bytes

Reductions: 256-bit result

[Watanabe10]

Blake-256
Grstl-256
JH-256
K
Keccak-256
k 256
Skein-256

pre

sec

coll.

indiff. assumption

256
256
256
256
256

256
256-L
256
256
256

128
128
128
128
128

128
128
256
256
256

E ideal
, ideal
ideal
ideal
id l
E ideal
ideal

SHAKE-128

128

128

128

128

NIST

256

256-L

128

57

58

Software performance
eBash [Bernstein-Lange]

Reductions: 512-bit result

Blake-512
Grstl-512
JH-512
K
Keccak-512
k 512
Skein-512

pre

sec

coll.

512
512
256
512
512

512
512-L
256
512
512

256
256
256
256
256

256
256
256
512
256

E ideal
, ideal
ideal
ideal
id l
E ideal
ideal

logarithmic scale

indiff. assumption

SHAKE-512

256

256

256

256

NIST

512

512-L

256

slower

59

factor 4 in cycles/byte

60

10

Hash Functions - Bart Preneel

November 2014

Hardware: post-place & route results


ASIC 130nm [Guo-Huang-Nazhandali-Schaumont10]
Throughput
(Gbps)
20

Keccak

SHA256
Blake

Keccak

BMW

16

CubeHash
ECHO
Fugue

12

Grostl

Grstl

Hamsi
JH
Keccak

Luffa

Blake

Shabal

SIMD

nominal version:

Skein

Skein

5x5 array of 64 bits

0
0

permutation: 25, 50, 100, 200, 400, 800, 1600

SHAvite

JH

40,000

80,000

120,000

160,000

200,000

Area
(GateEqv)

24 rounds of 5 steps
61

62

Slide credit: Patrick Schaumont, Virginia Tech

Keccak: FIPS 202

Performance of hash functions [Bernstein-Lange]

(draft: 28 May 2014)

(cycles/byte) Intel Core 2 Quad Q9550; 4 x 2833MHz (2008)

append 2 extra bits for domain separation to allow

2001

flexible output length (XOFs or eXtendable Output Functions)


tree structure (Sakura) allowed by additional encoding

6 versions

SHA3-224: n=224; c =
SHA3-256: n=256; c =
SHA3 384: n=384; c =
SHA3-384:
SHA3-512: n=512; c =
SHAKE128: n=x; c =
SHAKE256: n=x; c =

448;
512;
768;
1024;
256;
512;

r = 1152
r = 1088
r = 832
r = 576
r = 1344
r = 1088

(72%)
(68%)
(52%)
(36%)
(84%)
(68%)

pad 01

pad 11 for XOF

if result has n bits, H1 has r bits (rate), H2 has c bits (capacity) and
the permutation is ideal collisions
min (2c/2 , 2n/2)
2nd preimage min (2c/2 , 2n)
preimage
min (2c , 2n)
(estimated)
63

64

Hash functions: conclusions

SHA-1 would have needed 128-160 steps


instead of 80
2004-2009 attacks: cryptographic meltdown but
not dramatic for most applications
theory is developing for more robust iteration
modes and extra features; still early for building
blocks
Nirwana: efficient hash functions with security
reduction
65

11

You might also like