hashing algorithm
Brian Murray
ISSM 533
Table of Contents
Executive Summary
Summary of MD5
Definitions
Input data
Step 1: Data padding
Step 2: Appending length
Step 3: Initialize the Message Digest buffer
Step 4: Process input
Step 5: Output
T Value Table
Technical Vulnerabilities
Signature attack with a hidden message
X.509 Certificate Signature attack
Conclusions
References
Executive Summary
This paper discusses the MD5 hashing algorithm. MD5 is a common hashing algorithm used in many
cryptographic schemes, and it is also used to verify file downloads. Part one of this paper outlines
how MD5 works. There has been considerable recent discussion about whether MD5 is still a sound
algorithm, and some less-informed commentators have stated that MD5 has been completely broken.
Part two investigates the current state of MD5 and its attack vectors. The final part draws
conclusions from part two.
Summary of MD5
Definitions
A word is a 32-bit group.
A byte is an 8-bit group.
'b' represents the length of the input data in bits.
'|' is a bitwise OR. '&' is a bitwise AND. '^' is a bitwise XOR. '~' is a bitwise NOT.
X <<< s is a circular left shift (rotation) of the 32-bit value X by s bit positions.
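The operators defined above can be sketched in Python. This is only an illustration: Python integers are unbounded, so the sketch assumes every result is masked back to 32 bits, and the names MASK32, rotl32, and not32 are made up for this example rather than taken from RFC 1321.

```python
MASK32 = 0xFFFFFFFF  # keeps a value inside one 32-bit word


def rotl32(x, s):
    """Circular left shift (X <<< s) of a 32-bit word x by s bit positions."""
    x &= MASK32
    s %= 32
    # Bits shifted out on the left re-enter on the right.
    return ((x << s) | (x >> (32 - s))) & MASK32


def not32(x):
    """Bitwise NOT (~) restricted to 32 bits."""
    return ~x & MASK32


print(hex(rotl32(0x12345678, 8)))  # 0x34567812
print(hex(not32(0x0000FFFF)))      # 0xffff0000
```

The mask matters only because of Python's integer model; in a language with a native unsigned 32-bit type it would not be needed.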
Input data
The data input into MD5 can be any number of bits. The length does not need to be a multiple of 8 bits,
and the input may also be 0 bits long.
A: 01 23 45 67
B: 89 ab cd ef
C: fe dc ba 98
D: 76 54 32 10
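The four values above are listed byte by byte, low-order byte first, as in RFC 1321. The following sketch shows what they look like as 32-bit words under that reading. The names INIT_BYTES, to_word, and INIT_WORDS are made up for this example.

```python
# Byte sequences exactly as listed in the text, low-order byte first.
INIT_BYTES = {
    "A": "01 23 45 67",
    "B": "89 ab cd ef",
    "C": "fe dc ba 98",
    "D": "76 54 32 10",
}


def to_word(hex_bytes):
    """Interpret a string of 4 hex bytes as one little-endian 32-bit word."""
    return int.from_bytes(bytes.fromhex(hex_bytes), "little")


INIT_WORDS = {name: to_word(raw) for name, raw in INIT_BYTES.items()}

for name, word in INIT_WORDS.items():
    print(f"{name} = 0x{word:08x}")
```

Read this way, A is 0x67452301, which is the constant that usually appears in MD5 source code.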
Next, a table of 64 elements is constructed. Element i is the integer part of 4294967296 (2^32) times
abs(sin(i)), where i is the element number, taken in radians. The table is numbered T[1 .. 64], so
T[1] = 0xD76AA478. These values could be calculated on the fly in each round, but since they never
change, it is most efficient to hard-code them, as was done in the reference implementation in RFC 1321.
Step 5: Output
We are left with an output of four words: A, B, C, and D. A is the low-order word and D is the
high-order word; the digest begins with the low-order byte of A and ends with the high-order byte of D.
From here, we can simply print them.
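As a rough illustration of the printing step, the sketch below serializes the four words low-order byte first and prints them as hexadecimal. The function name digest_hex is made up for this example, and the four sample words are chosen to correspond to the well-known MD5 digest of zero-length input.

```python
import struct


def digest_hex(a, b, c, d):
    """Format four 32-bit words as an MD5 digest string.

    Each word is written low-order byte first, in the order A, B, C, D.
    """
    return struct.pack("<4I", a, b, c, d).hex()


# Sample final state: the words that correspond to the digest of empty input.
print(digest_hex(0xD98C1DD4, 0x04B2008F, 0x980980E9, 0x7E42F8EC))
# d41d8cd98f00b204e9800998ecf8427e
```

Printing the words directly as 32-bit hexadecimal numbers would reverse the bytes inside each word, which is a common source of mismatched digests.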
T Value Table
The following table lists the values of T used in step 4. Each value is the integer part of
4294967296 times abs(sin(i)), where i is the element number, in radians.
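The formula can be checked with a short sketch that computes all 64 values. It assumes double-precision floating point is accurate enough for the integer part, which is how this table is commonly generated. The function name md5_t_table is made up for this example.

```python
import math


def md5_t_table():
    """Return a list T with T[i] = integer part of 2**32 * abs(sin(i)), i = 1..64."""
    # Index 0 is an unused placeholder so the list is numbered 1..64 as in the text.
    return [0] + [int(4294967296 * abs(math.sin(i))) for i in range(1, 65)]


T = md5_t_table()

# Print the table four values per row.
for row_start in range(1, 65, 4):
    print(" ".join(f"0x{T[i]:08X}" for i in range(row_start, row_start + 4)))
```

The first value printed is 0xD76AA478, matching T[1] as given in the description of step 4.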
Conclusions
Although there have been two attacks against MD5, it is my belief that MD5 is still a completely valid
algorithm. Both of these attacks are very case-specific.
For instance, the attack against Alice's boss requires malicious code to be inserted before the
signature is taken. Signing other kinds of documents, such as a PDF, makes it impossible to carry out
such an attack. Using the same method against signed code would require the attacker to place the
malicious code upstream, which would require a code maintainer to sign off on the change. At that point,
the attacker may as well add a separate command line switch, or check whether a file exists, to trigger
the attack, as that would be much simpler.
In the case of the X.509 certificates, the attack in fact has the opposite effect. First, the original
certificate owner must perform the attack before the certificate is signed, much like the Alice's boss
attack. However, if a duplicate signature is ever found, one can use the same method as described by
Lenstra, A. et al. to reverse engineer the certificate and recover the original private key and modulus.
This, in effect, defeats the security of the certificate in the first place, and leaves the original
attacker very open to an attack in return.
To date, there has been no attack against MD5 that creates files with identical MD5 hashes even though
their contents differ vastly. Alice's boss types of attacks require an attacker to prepare the malicious
code, and the signer to forgo due diligence in checking the contents. The X.509 certificate attack
simply allows two different, but similar, certificates to be created. However, if it is not known that a
second certificate exists, then it is impossible for a relying party to know whether the receiving party
is the actual recipient. Of course, the attacker would have needed to give the certificate to the other
party, and if that is the case, then the attacker may as well just give the data to the other party
freely as well.
In my opinion, both of these attacks require a breakdown in other systems to be feasible. The only
feasible MD5 attack is against passwords, where brute forcing becomes a possibility. Without any
current method to 'pick' a resultant MD5, whether based on an IV or not, MD5 is still a valid method for
verifying data.
References
Rivest, R., “The MD5 Message-Digest Algorithm”, RFC 1321, MIT and RSA Data Security, Inc., April 1992
Klima, V., “Tunnels in Hash Functions: MD5 Collisions Within a Minute”, April 2006
Daum, M. & Lucks, S., “Attacking Hash Functions by Poisoned Messages: The Story of Alice and her
Boss”, http://www.cits.rub.de/MD5Collisions/, June 2005
Lenstra, A., Wang, X., de Weger, B., “Colliding X.509 Certificates”, March 2005