
Exam questions
1. Describe the basic concepts of cryptanalysis. What science is
connected with cryptanalysis? Explain the main terms of this science
Cryptanalysis (from the Greek kryptós, "hidden", and analýein, "to loosen"
or "to untie") is the study of analyzing information systems in order to study
the hidden aspects of the systems. [1] Cryptanalysis is used to breach
cryptographic security systems and gain access to the contents of encrypted
messages, even if the cryptographic key is unknown.
The other side of cryptography, it is used to break codes by finding weaknesses
within them. In addition to being used by hackers with bad intentions, this
discipline is also often used by the military. It is also appropriately used by
designers of encryption systems to find, and subsequently correct, any weaknesses
that may exist in the system under design.
There are several types of attacks that a cryptanalyst may use to break a code,
depending on how much information he or she has. A ciphertext-only attack is one
where the analyst has a piece of ciphertext (text that has already been encrypted),
with no plaintext (unencrypted text). This is probably the most difficult type of
cryptanalysis, and calls for a bit of guesswork. In a known-plaintext attack, the
analyst has both a piece of ciphertext and the corresponding piece of plaintext.
Other types of attacks may involve trying to derive a key through trickery or theft,
such as in the "man-in-the-middle" attack. In this method, the cryptanalyst places a
piece of surveillance software in between two parties that communicate. When the
parties' keys are exchanged for secure communication, they exchange their keys
with the attacker instead of each other.
The ultimate goal of the cryptanalyst is to derive the key so that all ciphertext can
be easily deciphered. A brute-force attack is one way of doing so. In this type of
attack, the cryptanalyst tries every possible combination until the correct key is
identified. Although using longer keys makes the derivation less statistically likely
to succeed, faster computers continue to make brute-force attacks feasible.
Networking a set of computers together in a grid combines their strength, and their
cumulative power can be used to break long keys. Longer keys, such as 128-bit and
256-bit keys, remain the strongest and the least likely to fall to this type of attack.
At its core, cryptanalysis is a science of mathematics, probability, and fast
computers. Cryptanalysts also usually require some persistence, intuition,
guesswork and some general knowledge of the target. The field also has an
interesting historical element; the famous Enigma machine, used by the Germans
to send secret messages, was ultimately cracked by Polish cryptanalysts, whose
results were passed on to the British.
2. Determine basic idea of dictionary attack and write the possibility of
dictionary attack in cryptanalysis. Give an example.
In cryptanalysis and computer security, a dictionary attack is a technique
for defeating a cipher or authentication mechanism by trying to determine its
decryption key or passphrase by trying hundreds or sometimes millions of
likely possibilities, such as words in a dictionary.
A dictionary attack uses a targeted technique of successively trying all the
words in an exhaustive list called a dictionary (from a pre-arranged list of
values).[1] In contrast with a brute-force attack, where a large proportion of the
key space is searched systematically, a dictionary attack tries only those
possibilities which are most likely to succeed, typically derived from a list of
words, for example a dictionary (hence the phrase dictionary attack).
Generally, dictionary attacks succeed because many people have a tendency
to choose passwords which are short (7 characters or fewer), such as single
words found in dictionaries or simple, easily predicted variations on words,
such as appending a digit. However, such passwords are easy to defeat: adding a
single random character in the middle can make dictionary attacks untenable.
Unlike brute-force attacks, dictionary attacks are not guaranteed to succeed.
Dictionary attacks come in two forms:
Online: the attacker submits candidate passwords directly to the live system; the
rate of attempts is limited, and lockouts or delays after a few failures can stop the
attack.
Offline: the attacker has obtained a file of password hashes and can test candidate
words against it on his own hardware, limited only by computing power.
Because so many users choose dictionary words or simple variations of them, an
offline dictionary attack often recovers a large share of the passwords in a stolen
hash file.
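Below is a minimal sketch of an offline dictionary attack against a password hash.
The wordlist, the target password and the use of an unsalted SHA-256 hash are all
assumptions for illustration, not part of any real system.

import hashlib

# Hypothetical stolen hash of the password we want to recover (unsalted SHA-256).
stolen_hash = hashlib.sha256(b"dragon1").hexdigest()

# Tiny stand-in for a real wordlist such as rockyou.txt.
wordlist = ["password", "letmein", "dragon", "qwerty"]

def dictionary_attack(target_hash, words):
    for word in words:
        # Try the word itself plus simple predictable variations (appended digit).
        for candidate in [word] + [word + str(d) for d in range(10)]:
            if hashlib.sha256(candidate.encode()).hexdigest() == target_hash:
                return candidate
    return None  # unlike brute force, a dictionary attack may simply fail

print(dictionary_attack(stolen_hash, wordlist))  # -> dragon1

Note how the attack tries only likely candidates: the loop covers the wordlist plus
a digit appended to each word, exactly the kind of predictable variation described
above.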
3. Analyze how Brute force attack works . List advantages and
disadvantages of Brute Force attack. Give an example.
In computer science, brute-force search or exhaustive search, also known
as generate and test, is a very general problem-solving technique that
consists of systematically enumerating all possible candidates for the
solution and checking whether each candidate satisfies the problem's
statement.
A brute-force algorithm to find the divisors of a natural number n would
enumerate all integers from 1 to the square root of n, and check whether
each of them divides n without remainder. A brute-force approach for the
eight queens puzzle would examine all possible arrangements of 8 pieces on
the 64-square chessboard, and, for each arrangement, check whether each
(queen) piece can attack any other.
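As a sketch of the first example, here is the brute-force divisor search in Python
(enumerating candidates only up to the square root of n, since divisors come in
pairs):

def divisors(n: int) -> list:
    # Brute-force search: try every candidate d from 1 up to sqrt(n).
    result = []
    d = 1
    while d * d <= n:
        if n % d == 0:          # d divides n without remainder
            result.append(d)
            if d != n // d:     # also record the paired divisor n/d
                result.append(n // d)
        d += 1
    return sorted(result)

print(divisors(36))  # [1, 2, 3, 4, 6, 9, 12, 18, 36]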
While a brute-force search is simple to implement, and will always find a
solution if it exists, its cost is proportional to the number of candidate solutions
which in many practical problems tends to grow very quickly as the size of
the problem increases. Therefore, brute-force search is typically used when the
problem size is limited, or when there are problem-specific heuristics that can
be used to reduce the set of candidate solutions to a manageable size. The
method is also used when the simplicity of implementation is more important
than speed.
This is the case, for example, in critical applications where any errors in the
algorithm would have very serious consequences; or when using a computer
to prove a mathematical theorem. Brute-force search is also useful as a
baseline method when benchmarking other algorithms or metaheuristics.
Indeed, brute-force search can be viewed as the simplest metaheuristic.
Brute force search should not be confused with backtracking, where large
sets of solutions can be discarded without being explicitly enumerated (as in
the textbook computer solution to the eight queens problem above). The
brute-force method for finding an item in a table, namely checking all entries
of the table sequentially, is called linear search.
The most obvious advantage is that your chance of actually finding the
password is quite high since the attack uses so many possible answers.
Another advantage is that it is a fairly simplistic attack that doesn't require a
lot of work to set up or initiate.

The biggest disadvantage is that it is very hardware intensive. Brute force
attacks try as many possible answers as possible, and this takes a lot of
processing power. There is also the possibility that the system being attacked
has some other security measures. For instance, it might lock you out
after three failed attempts, and this extends the amount of time needed to crack
the code by a huge margin.
4. Describe types of attack, classify them. Which attack is most stable in
cryptanalysis, in your opinion?
5. Today DES is considered vulnerable to brute force attacks due to larger
computer power. Why were 3DES introduced and not 2DES to prevent
brute force attacks?
The Data Encryption Standard (DES) is an outdated symmetric-key method
of data encryption.
DES works by using the same key to encrypt and decrypt a message, so both
the sender and the receiver must know and use the same private key. Once
the go-to, symmetric-key algorithm for the encryption of electronic data,
DES has been superseded by the more secure Advanced Encryption
Standard (AES) algorithm.
In cryptography, Triple DES (3DES) is the common name for the Triple Data
Encryption Algorithm (TDEA or Triple DEA) symmetric-key block cipher,
which applies the Data Encryption Standard (DES) cipher algorithm three times to
each data block.

The original DES cipher's key size of 56 bits was generally sufficient when that
algorithm was designed, but the availability of increasing computational power
made brute-force attacks feasible. Triple DES provides a relatively simple method
of increasing the key size of DES to protect against such attacks, without the need
to design a completely new block cipher algorithm.
Double DES is not safe because of the meet-in-the-middle attack (see question 28),
which is why 3DES is used instead of 2DES.
Triple DES (3DES) was proposed in 1978 as a successor to DES: the 56-bit key of
the original DES had become too short to withstand exhaustive key search (in 1998
the Electronic Frontier Foundation built a machine, the DES Cracker, that found a
DES key in about three days). 3DES simply applies the DES algorithm three times,
so no new cipher had to be designed and existing DES implementations could be
reused, but it runs roughly three times slower than DES.
6. Specify the main concepts of the Differential analysis.

Differential cryptanalysis studies how a fixed difference (usually an XOR
difference) between a pair of plaintexts propagates through the rounds of a cipher
and shows up as a difference between the corresponding ciphertexts. A multi-round
differential characteristic predicts the intermediate differences round by round; its
probability is roughly the product of the one-round probabilities, and pairs that
follow it let the attacker test guesses of the last-round subkey.
The technique was presented publicly by Eli Biham and Adi Shamir around 1990,
who applied it to DES, FEAL and other ciphers; resistance to differential
cryptanalysis then became a standard design criterion for the ciphers of the 1990s
(AES, Camellia and others). Biham and Shamir observed that DES was surprisingly
resistant to the attack. In 1994 Don Coppersmith of IBM revealed why: IBM had
already discovered the technique in 1974 while designing DES, and the DES
S-boxes were deliberately chosen to resist it, but at the request of the NSA the
method was kept secret. Against weaker ciphers the attack is devastating: FEAL-4
can be broken with as few as 8 chosen plaintexts, and even the 31-round version of
FEAL is vulnerable. For the full 16-round DES the attack requires about 2^47
chosen plaintexts, which is faster than exhaustive search but impractical to collect.
Differential cryptanalysis is a general form of cryptanalysis applicable
primarily to block ciphers, but also to stream ciphers and cryptographic hash
functions. In the broadest sense, it is the study of how differences in
information input can affect the resultant difference at the output. In the case
of a block cipher, it refers to a set of techniques for tracing differences
through the network of transformations, discovering where the cipher
exhibits non-random behavior, and exploiting such properties to recover the
secret key.
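As an illustration of "tracing differences", the following Python sketch builds the
XOR difference distribution table of a toy 4-bit S-box (a table often used in
teaching examples; any 4-bit S-box would do). Large entries are exactly the
non-random behavior a differential attack exploits:

SBOX = [0xE, 0x4, 0xD, 0x1, 0x2, 0xF, 0xB, 0x8,
        0x3, 0xA, 0x6, 0xC, 0x5, 0x9, 0x0, 0x7]

def difference_distribution_table(sbox):
    n = len(sbox)
    ddt = [[0] * n for _ in range(n)]
    for x in range(n):
        for dx in range(n):
            dy = sbox[x] ^ sbox[x ^ dx]  # output difference for input difference dx
            ddt[dx][dy] += 1
    return ddt

ddt = difference_distribution_table(SBOX)
# For an ideal random permutation every entry would be small; an entry
# ddt[dx][dy] = c means the differential dx -> dy holds with probability c/16.
best = max((ddt[dx][dy], dx, dy) for dx in range(1, 16) for dy in range(16))
print("best one-round differential:", best)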
7. Define the main term of differential cryptanalysis and list the
possibilities of Differential analysis for DES
8. List the possibilities of Differential analysis for FEAL, write to which
cipher FEAL is similar
In cryptography, FEAL (the Fast data Encipherment ALgorithm) is a block
cipher proposed as an alternative to the Data Encryption Standard (DES), and
designed to be much faster in software. There have been several different revisions
of FEAL, though all are Feistel ciphers, and make use of the same basic round
function and operate on a 64-bit block. One of the earliest designs is now termed
FEAL-4, which has four rounds and a 64-bit key.
Unfortunately, problems were found with FEAL-4 from the start: Bert den Boer
related a weakness in an unpublished rump session at the same conference where
the cipher was first presented. A later paper (den Boer, 1988) describes an attack
requiring 100 to 10,000 chosen plaintexts, and Sean Murphy (1990) found an
improvement that needs only 20 chosen plaintexts. Murphy's and den Boer's
methods contain elements similar to those used in differential cryptanalysis.
The designers countered by doubling the number of rounds, FEAL-8 (Shimizu and
Miyaguchi, 1988). However, eight rounds also proved to be insufficient: in
1989, at the Securicom conference, Eli Biham and Adi Shamir described a
differential attack on the cipher, mentioned in (Miyaguchi, 1989). Gilbert and
Chassé (1990) subsequently published a statistical attack similar to differential
cryptanalysis which requires 10,000 pairs of chosen plaintexts.
In response, the designers introduced a variable-round cipher, FEAL-N
(Miyaguchi, 1990), where "N" was chosen by the user, together with FEAL-NX,
which had a larger 128-bit key. Biham and Shamir's differential cryptanalysis
(1991) showed that both FEAL-N and FEAL-NX could be broken faster than
exhaustive search for N ≤ 31. Later attacks, precursors to linear cryptanalysis,
could break versions under the known plaintext assumption, first (Tardy-Corfdir
and Gilbert, 1991) and then (Matsui and Yamagishi, 1992), the latter breaking
FEAL-4 with 5 known plaintexts, FEAL-6 with 100, and FEAL-8 with 2^15.
FEAL is a Feistel cipher similar in structure to DES. It operates on a 64-bit block
with a 64-bit key. The block is first XORed with key material and split into two
32-bit halves, after which N rounds follow (originally N = 4). In each round the
right half and a 16-bit subkey are fed into the round function F, the 32-bit result is
XORed into the left half, and the halves are swapped. Unlike DES, the round
function contains no S-box tables: F mixes its input with byte-wise addition and
rotations, which is what makes FEAL fast in software. After the N-th round the
halves are recombined and XORed with further key material to produce the 64-bit
ciphertext; decryption runs the same structure with the subkeys in reverse order.
9. Write the possibility of Differential analysis for AES
Rijndael was submitted to the AES competition with an explicit design rationale
against differential attacks [13]. For Rijndael/AES the following can be said:
1. The cipher follows the wide-trail strategy: the S-box has a low maximal
difference propagation probability, and the diffusion layers (ShiftRows and
MixColumns) force a large number of active S-boxes in any multi-round trail.
2. Consequently, no differential characteristic with a useful probability exists over
four or more rounds.
3. Classical differential cryptanalysis therefore does not threaten the full cipher;
only strongly reduced-round versions can be attacked by differential methods.
4. The most effective structural attack of this family is the Square attack (devised
for the cipher Square, a predecessor of AES [11]), which breaks several rounds of
Rijndael but not the full cipher.
5. Overall, full AES retains a large security margin against differential cryptanalysis.
10.Define the basic concept of differential cryptanalysis and write the
possibility of Differential analysis for GOST
GOST 28147-89 is a Soviet and Russian government standard block cipher adopted
in 1990 [1]. It is a 32-round Feistel network with a 64-bit block and a 256-bit key;
its S-boxes are not fixed by the standard and may be kept secret, which adds extra
key material. Developed in the 1970s, the algorithm was originally classified and
was published openly only in 1994; in the 2010s a successor standard was adopted,
but GOST 28147-89 is still widely deployed.
No practical differential attack on the full cipher is known [2]: differential (and
linear) cryptanalysis succeeds only against reduced-round versions or in related-key
settings [8]. Slide attacks and reflection attacks on the cipher have also been
studied [9], and the results depend strongly on the chosen S-boxes: with weak
S-boxes the cipher is considerably easier to break.
In 2011 the first single-key attacks on the full 32-round cipher were published; they
are faster than exhaustive search of the 256-bit key but require about 2^64 known
plaintext/ciphertext pairs, essentially the entire codebook, together with enormous
memory, so they remain purely theoretical [10][11]. With well-chosen secret
S-boxes, GOST is therefore still regarded as practically secure [4].
11.Explain how you can use Differential Cryptanalysis for Multi-Round
Ciphers in cryptanalysis?
A multi-round (iterated) cipher is exactly the DES case, so the technique is the
same as for DES: one-round differential characteristics are chained into a
multi-round characteristic whose probability is roughly the product of the
per-round probabilities. The attacker encrypts many chosen plaintext pairs with
the characteristic's input difference and uses the pairs that behave as predicted to
test and recover bits of the last-round subkey, then peels off the remaining rounds
the same way.
12.Describe the basic concepts of the Linear analysis.

Linear cryptanalysis is a known-plaintext attack invented by Mitsuru Matsui, who
presented it in 1993 (Eurocrypt '93) and applied it to DES and FEAL.
The attack exploits linear approximations of the cipher: XOR relations among
individual bits of the plaintext, ciphertext and key that hold with a probability P
different from 1/2. For the cipher under attack the cryptanalyst looks for an
effective relation of the form
P[i1] ⊕ P[i2] ⊕ ... ⊕ P[ia] ⊕ C[j1] ⊕ C[j2] ⊕ ... ⊕ C[jb] = K[k1] ⊕ K[k2] ⊕ ... ⊕ K[kc]   (1)
where P[n], C[n] and K[n] denote the n-th bit of the plaintext, the ciphertext and
the key respectively.
Matsui's Algorithm 1 then proceeds as follows. For N known plaintext/ciphertext
pairs, count the number T of pairs for which the left-hand side of (1) equals 0.
If T > N/2, guess that K[k1] ⊕ K[k2] ⊕ ... ⊕ K[kc] = 0 (when P > 1/2) or 1 (when P < 1/2).
If T < N/2, guess that K[k1] ⊕ K[k2] ⊕ ... ⊕ K[kc] = 1 (when P > 1/2) or 0 (when P < 1/2).
Each effective relation of the form (1) thus yields one bit of information about the
key. The reliability of the guess grows with the number of pairs N and with the
bias |P - 1/2| of the approximation: the further P is from 1/2, the fewer known
plaintexts are needed. Finding approximations with a large bias, and combining
several relations of the form (1), is the central task of linear cryptanalysis.
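A companion sketch to the description above: measuring the bias |P - 1/2| of every
linear approximation of a toy 4-bit S-box (the same example S-box as in question 6).
In a real attack the cryptanalyst chains such approximations through the rounds:

SBOX = [0xE, 0x4, 0xD, 0x1, 0x2, 0xF, 0xB, 0x8,
        0x3, 0xA, 0x6, 0xC, 0x5, 0x9, 0x0, 0x7]

def parity(x: int) -> int:
    return bin(x).count("1") & 1

def bias(in_mask: int, out_mask: int) -> float:
    # Bias P - 1/2 of the approximation <in_mask, x> = <out_mask, S(x)>.
    hits = sum(parity(x & in_mask) == parity(SBOX[x] & out_mask)
               for x in range(16))
    return hits / 16 - 0.5

best = max((abs(bias(a, b)), a, b)
           for a in range(1, 16) for b in range(1, 16))
print("largest bias |P - 1/2| =", best[0], "for masks", best[1:])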

13.Write the possibility of Linear analysis for DES

14. Define the basic concept of linear analysis and write possibility of
Linear analysis for AES
The Advanced Encryption Standard (AES), also known as Rijndael (pronounced
[rɛindaːl] [1]), is a symmetric block cipher with a 128-bit block and key lengths of
128, 192 or 256 bits, adopted as the AES standard and selected to replace DES.
The U.S. National Institute of Standards and Technology (NIST) announced AES
on November 26, 2001, after a competition among 15 candidate designs; AES
became effective as a federal standard on May 26, 2002. As of 2009 it was already
one of the most widely used symmetric ciphers [2][3]. Hardware support for AES
(the AES-NI instruction set) is available on Intel x86 processors beginning with the
Intel Core i7-980X Extreme Edition and the Sandy Bridge family.
Linear cryptanalysis (see question 12) is a known-plaintext attack, invented by
Mitsuru Matsui and presented in 1993 (Eurocrypt '93) against DES and FEAL,
that exploits linear (XOR) relations among plaintext, ciphertext and key bits
holding with probability P ≠ 1/2; the number of known plaintexts it needs grows
as the bias |P - 1/2| of the best relation shrinks.
AES was designed by the wide-trail strategy so that every linear trail over four
rounds has an extremely small bias: the S-box has low input/output correlation and
the diffusion layers force many S-boxes to be active. As a result, no linear attack
on the full AES is known; linear cryptanalysis applies only to strongly reduced
versions of the cipher.

15.Explain possibility of Linear analysis for GOST, write in short history
of GOST-who writes and where is developed
GOST 28147-89 is a Soviet and Russian government standard block cipher adopted
in 1990 [1]: a 32-round Feistel network with a 64-bit block and a 256-bit key, whose
S-boxes are not fixed by the standard and may be kept secret. It was developed in
the 1970s in the USSR as the Soviet counterpart of the American DES, was
originally classified, and was published openly only in 1994.
No practical linear attack on the full 32-round cipher is known: 32 rounds give a
large security margin, and linear (like differential) cryptanalysis succeeds only
against reduced-round versions or in related-key settings [8]. Slide attacks and
reflection attacks have also been applied to the cipher [9]; how well they work
depends heavily on the particular S-boxes used.
In 2011 single-key attacks on the full cipher were published that are faster than
exhaustive search of the 256-bit key, but they need about 2^64 plaintext/ciphertext
pairs (essentially the whole codebook) and vast memory, so they are of purely
theoretical interest [10][11]. With well-chosen S-boxes GOST is therefore still
considered practically secure [4].
16.Analyze the important steps of the S box designing and explain the main
purpose of S box
In cryptography, an S-box (substitution-box) is a basic component
of symmetric key algorithms which performs substitution. In block ciphers,
they are typically used to obscure the relationship between the key and
the ciphertext. In general, an S-box takes some number of input bits, m,
and transforms them into some number of output bits, n, where n is not
necessarily equal to m.[1] An m×n S-box can be implemented as a lookup
table with 2^m words of n bits each. Fixed tables are normally used, as in
the Data Encryption Standard (DES), but in some ciphers the tables are
generated dynamically from the key (e.g. the Blowfish and the Twofish
encryption algorithms).
17. Explain the main purpose of S box and analyze the design of S box for
DES
The Data Encryption Standard (DES) is an outdated symmetric-key method
of data encryption.
DES works by using the same key to encrypt and decrypt a message, so
both the sender and the receiver must know and use the same private key.
Once the go-to, symmetric-key algorithm for the encryption of electronic
data, DES has been superseded by the more secure Advanced Encryption
Standard (AES) algorithm.
Originally designed by researchers at IBM in the early 1970s, DES was
adopted by the U.S. government
The S-boxes are the only nonlinear part of DES and the main source of its
security. DES uses eight S-boxes, each mapping a 6-bit input to a 4-bit output, so
the 48-bit result of the expansion-and-key-mixing step is compressed back to
32 bits.
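A sketch of how a DES S-box lookup works, using DES's real S1 table: the outer
two bits of the 6-bit input select the row, the middle four bits select the column.

# DES S-box S1 (4 rows x 16 columns of 4-bit outputs).
S1 = [
    [14, 4, 13, 1, 2, 15, 11, 8, 3, 10, 6, 12, 5, 9, 0, 7],
    [0, 15, 7, 4, 14, 2, 13, 1, 10, 6, 12, 11, 9, 5, 3, 8],
    [4, 1, 14, 8, 13, 6, 2, 11, 15, 12, 9, 7, 3, 10, 5, 0],
    [15, 12, 8, 2, 4, 9, 1, 7, 5, 11, 3, 14, 10, 0, 6, 13],
]

def des_sbox_s1(six_bits: int) -> int:
    # Apply DES S1 to a 6-bit input, returning a 4-bit output.
    row = ((six_bits >> 4) & 0b10) | (six_bits & 0b01)  # outer bits b5, b0
    col = (six_bits >> 1) & 0b1111                      # middle bits b4..b1
    return S1[row][col]

print(des_sbox_s1(0b011011))  # row 01, column 1101 -> S1[1][13] = 5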

18.Describe the main purpose of S box and analyze the design of S box for
AES.
The Advanced Encryption Standard (AES) is a symmetric-key block
cipher algorithm and U.S. government standard for secure and
classified data encryption and decryption.

AES operates on a 4×4 matrix of bytes (the state) and applies four operations in
each round.
step 1: Add Round Key. For the whole 128-bit block: out = in ⊕ key,
where ⊕ is bitwise exclusive-or (XOR).
(For decryption, the inverse operation is identical.)
step 2: Substitute Bytes. Every byte of the state is replaced through the S-box.
step 3: Shift Rows. In the 4×4 byte matrix, rows 0-3 are rotated left by 0, 1, 2
and 3 positions respectively.
step 4: Mix Columns. Each column is multiplied by a fixed matrix over GF(2^8).
The steps Shift Rows, Mix Columns, and Add Round Key are linear operations
(and easy);
the S-box function is nonlinear due to the inverse operation in GF(2^8) (not
easy to compute).
The Rijndael S-Box was specifically designed to be resistant
to linear and differential cryptanalysis. This was done by minimizing the
correlation between linear transformations of input/output bits, and at the
same time minimizing the difference propagation probability.
In addition, to strengthen the S-Box against algebraic attacks, the affine
transformation was added. In the case of suspicion of a backdoor being
built into the cipher, the current S-box might be replaced by another one.
The authors claim that the Rijndael cipher structure should provide enough
resistance against differential and linear cryptanalysis, even if an S-Box
with "average" correlation / difference propagation properties is used.

19.Analyze the design of S box for GOST and write in short history of
GOST-who writes and where is developed
Block cipher GOST [5, 12] is a Feistel cipher with 32 rounds. Its block size
is 64 bits, and key-size is 256 bits. It is a Soviet and Russian
government standard symmetric key block cipher. Also based on this
block cipher is the GOST hash function.
The subkeys are chosen in a pre-specified order. The key schedule is very
simple: break the 256-bit key into eight 32-bit subkeys, and each subkey is
used four times in the algorithm; the first 24 rounds use the key words in
order, the last 8 rounds use them in reverse order.
The S-boxes accept a four-bit input and produce a four-bit output. The S-box
substitution in the round function consists of eight 4×4 S-boxes. The
S-boxes are implementation-dependent: parties that want to secure their
communications using GOST must be using the same S-boxes. For extra
security, the S-boxes can be kept secret. In the original standard where
GOST was specified, no S-boxes were given, but they were to be supplied
somehow. This led to speculation that organizations the government
wished to spy on were given weak S-boxes. One GOST chip manufacturer
reported that he generated S-boxes himself using a pseudorandom number
generator.
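A minimal sketch of the GOST round function described in this answer. The S-box
set shown is one widely published example set (the so-called test parameters); the
standard itself leaves the S-boxes open, so communicating parties must agree on a
set.

SBOXES = [
    [4, 10, 9, 2, 13, 8, 0, 14, 6, 11, 1, 12, 7, 15, 5, 3],
    [14, 11, 4, 12, 6, 13, 15, 10, 2, 3, 8, 1, 0, 7, 5, 9],
    [5, 8, 1, 13, 10, 3, 4, 2, 14, 15, 12, 7, 6, 0, 9, 11],
    [7, 13, 10, 1, 0, 8, 9, 15, 14, 4, 6, 12, 11, 2, 5, 3],
    [6, 12, 7, 1, 5, 15, 13, 8, 4, 10, 9, 14, 0, 3, 11, 2],
    [4, 11, 10, 0, 7, 2, 1, 13, 3, 6, 8, 5, 9, 12, 15, 14],
    [13, 11, 4, 1, 3, 15, 5, 9, 0, 10, 14, 7, 6, 8, 2, 12],
    [1, 15, 13, 0, 5, 7, 10, 4, 9, 2, 3, 14, 6, 11, 8, 12],
]

def gost_f(half: int, subkey: int) -> int:
    # The GOST round function on a 32-bit half-block.
    x = (half + subkey) & 0xFFFFFFFF              # add subkey modulo 2^32
    y = 0
    for i in range(8):                            # eight 4-bit S-box lookups
        y |= SBOXES[i][(x >> (4 * i)) & 0xF] << (4 * i)
    return ((y << 11) | (y >> 21)) & 0xFFFFFFFF   # rotate left by 11 bits

def gost_round(n1: int, n2: int, subkey: int):
    # One Feistel round: f-output XORed into the other half, then swap.
    return n2 ^ gost_f(n1, subkey), n1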
20.Describe the basic concepts of the Boomerang attack.
In cryptography, the boomerang attack is a method for the cryptanalysis of block
ciphers based on differential cryptanalysis. The attack was published in 1999 by
David Wagner, who used it to break the COCONUT98 cipher.

The boomerang attack has allowed new avenues of attack for many
ciphers previously deemed safe from differential cryptanalysis.
Refinements on the boomerang attack have been published: the amplified
boomerang attack, then the rectangle attack.

The idea of the attack is to treat the cipher E as a composition of two halves,
E = E1 ∘ E0, and to use one high-probability differential characteristic for E0 and
another, unrelated one for E1; neither characteristic has to cover the whole cipher.
For example, instead of one 16-round characteristic for DES the attacker can work
with two 8-round characteristics. The attacker forms quartets of texts: a pair of
plaintexts with the input difference of the first characteristic is encrypted, both
ciphertexts are shifted by the output difference of the second characteristic and
decrypted, and the attacker checks whether the two new plaintexts again differ by
the original input difference, i.e. whether the texts "come back like a boomerang".
For a random permutation this happens only rarely, so the boomerang property
distinguishes the cipher from random and can be used to recover subkey bits.
Because each characteristic only needs to cover about half of the rounds, ciphers
whose resistance to differential cryptanalysis was argued round by round can still
fall to the boomerang attack. The price is that the attack needs adaptively chosen
ciphertexts as well as chosen plaintexts, which is its main practical limitation.
21.Determine the main idea of the Amplified boomerang attack.


The amplified boomerang attack [2] is a variant of the boomerang attack that
requires only chosen plaintexts, not adaptively chosen ciphertexts, which makes it
applicable in more realistic settings.
The attacker encrypts a large number of plaintext pairs with the input difference of
the first differential characteristic and then searches the resulting ciphertexts for
quartets that simultaneously satisfy the conditions of both characteristics in the
middle of the cipher. The right quartets are no longer constructed directly: they
must be found by a birthday-style counting argument, so the amplified attack needs
considerably more chosen plaintexts than the basic boomerang attack.
A further refinement of the same idea, which counts over all K possible
intermediate differences at once, is known as the rectangle attack.
22.Define the basic concept of Boomerang attack and write possibility of
Boomerang attack for FEAL
The boomerang attack was discovered by David Wagner as a
way to extend the power of normal differential cryptanalysis. It
allows the analysis to use two unrelated differential
characteristics to attack the same cipher. By using one differential
to defeat the first half and another to defeat the second half, he
can beat the full cipher. There is no requirement that the
differentials share an intermediate value. The effect of this
discovery on cipher design is that the number of rounds that must
be secure against differential attacks effectively doubles. Whereas
normal differential cryptanalysis is a chosen-plaintext attack, the
boomerang technique is an adaptive-chosen ciphertext attack.
The analyst chooses two plaintext blocks which differ by some
input differential value. Next, he encrypts them using the target
cipher (and its secret keys, of course). He then takes each of the
corresponding ciphertexts and XORs them with another
differential to produce two new adaptive-chosen ciphertexts.
Next, he decrypts each of those new ciphertexts
using the target algorithm (with secret keys) to produce two
new plaintexts. Finally, the analyst must XOR those new plaintexts
together. If the differential produced matches the differential that
was used to produce the original plaintexts, it is almost certain
that the encrypt/decrypt functions used are identified (as FEAL-6
in this case).
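The quartet test just described can be written generically over any encrypt/decrypt
pair. The toy cipher below is a stand-in so the sketch runs; a real attack would plug
in FEAL-6 and differentials found by analysis:

def boomerang_test(encrypt, decrypt, p1: int, delta: int, nabla: int) -> bool:
    # One boomerang quartet: True if the differences 'come back'.
    p2 = p1 ^ delta                  # plaintext pair with the input difference
    c1, c2 = encrypt(p1), encrypt(p2)
    c3, c4 = c1 ^ nabla, c2 ^ nabla  # shift ciphertexts by the second difference
    p3, p4 = decrypt(c3), decrypt(c4)
    return (p3 ^ p4) == delta        # quartet returned with the input difference

# Toy stand-in cipher (XOR with a secret key). Being linear, it returns every
# quartet; a real cipher returns only along good differential characteristics.
KEY = 0xDEADBEEFCAFEBABE
toy_enc = lambda x: x ^ KEY
toy_dec = lambda x: x ^ KEY

print(boomerang_test(toy_enc, toy_dec, 0x0123456789ABCDEF, 0xFF, 0xF0F0))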

23.Write the possibility of Boomerang attack for AES


Note that this 24-round boomerang attack does not apply to the real
CAST-256 AES proposal.
AES has many truncated differentials with probability one spanning up to
three rounds; however, they are quite expensive in terms of probability when
trying to extend them at either end of the boomerang distinguisher.
AES therefore has a sufficient security margin against boomerang attacks:
boomerang attacks exist on 5-round and 6-round reduced versions of AES [4],
but they do not extend to the full cipher.
24.Analyze the possibility of Boomerang attack for GOST.


A boomerang attack on GOST is described in [15]; it exploits the simplicity of the
round function and the properties of the S-boxes, in particular of the S-box s8.
The basic boomerang distinguisher covers a reduced version of the cipher of about
25-26 rounds [15]. To reach further, the attacker guesses the subkeys of the last
rounds, first of rounds 32 and 31, and then of rounds 30, 29, ..., 25, applying the
distinguisher to the remaining inner rounds (peeling the rounds off one by one).
The attack stays short of the full 32 rounds and is of theoretical interest only; the
full cipher is not threatened by it.
25.Describe the basic concepts of the Slide attacks, analyze the importance of
the number of rounds

26.Write the possibility of the Slide attacks for AES

AES reduced to only nine rounds can be broken, but the full-strength cipher
still stands unbroken.
27.Describe the basic concepts of the Impossible Differential Cryptanalysis
In cryptography, impossible differential cryptanalysis is a form
of differential cryptanalysis for block ciphers. While ordinary
differential cryptanalysis tracks differences that propagate through the
cipher with greater than expected probability, impossible differential
cryptanalysis exploits differences that are impossible (having
probability 0) at some intermediate state of the cipher algorithm.
Biham, Biryukov and Shamir also presented a relatively efficient
specialized method for finding impossible differentials that they called
a miss-in-the-middle attack. This consists of finding "two events with
probability one, whose conditions cannot be met together."
In normal differential analysis, we try to find a differential
characteristic that holds true with some high probability. If this
characteristic holds from the plaintext to the input of the last round, we
can have a go at some straightforward key recovery. If you need a
refresher, I recommend checking out my page on the differential
cryptanalysis of FEAL-4. Impossible differentials are a natural idea:
instead of looking for high-probability differentials, we look for those
that never happen. This gives information about intermediate states in
the cipher that differ from a random permutation. Exploiting these
weaknesses can enable us to recover the key with less work than
exhaustive search.
28.Specify the main idea of the Meet-(miss) in the middle attacks. Analysis
of double and triple encryptions.

Meet-in-the-middle is a known attack that can exponentially reduce the
number of brute force permutations required to decrypt text that has been
encrypted by more than one key. Such an attack makes it much easier for
an intruder to gain access to data.
The meet-in-the-middle attack targets block cipher cryptographic functions.
The intruder applies brute force techniques to both
the plaintext and ciphertext of a block cipher. He then attempts to encrypt
the plaintext according to various keys to achieve an intermediate ciphertext
(a text that has only been encrypted by one key). Simultaneously, he
attempts to decrypt the ciphertext according to various keys, seeking a
block of intermediate ciphertext that is the same as the one achieved by
encrypting the plaintext. If there is a match of intermediate ciphertext, it is
highly probable that the key used to encrypt the plaintext and the key used
to decrypt the ciphertext are the two encryption keys used for the block
cipher.
The name for this exploit comes from the method. Because the attacker
tries to break the two-part encryption method from both sides
simultaneously, a successful effort enables him to meet in the middle of the
block cipher.
Although a meet-in-the-middle exploit can make the attacker's job easier, it
can't be conducted without a piece of plaintext and corresponding
ciphertext. That means the attacker must have the capacity to store all
possible intermediate ciphertext values from both the brute force encryption
of the plaintext and decryption of the ciphertext.
Meet-in-the-middle is a passive attack, which means that although the
intruder can access messages, in most situations he can not alter them or
send his own. The attack is not practical for the average hacker and is more
likely to be used in corporate espionage or some other venue that can
accommodate the storage required to carry it out.
A meet-in-the-middle attack is not the same thing as a man in the middle
attack.
Multiple encryption is the process of encrypting an already encrypted
message one or more times, either using the same or a different algorithm.
It is also known as cascade encryption, cascade ciphering, multiple
encryption. Picking any two ciphers, if the key used is the same for both,
the second cipher could possibly undo the first cipher, partly or entirely.
This is true of ciphers where the decryption process is exactly the same as
the encryption process; the second cipher would completely undo the first.
If an attacker were to recover the key through cryptanalysis of the first

encryption layer, the attacker could possibly decrypt all the remaining
layers, assuming the same key is used for all layers.
To prevent that risk, one can use keys that are statistically independent for
each layer .
Double-DES is two successive DES instances, while Triple-DES is three
successive DES instances.
We use 3DES and not 2DES because 2DES does not yield the security
increase that you would believe. Namely, 2DES uses 112 key bits (two 56-bit
DES keys) but offers a security level of about 2^57, not 2^112, because of a
"meet-in-the-middle attack" which is well explained there (not to be
confused with "man-in-the-middle", a completely different concept).
Similarly, 3DES uses 168 key bits, but offers "only" 2^112 security (which is
quite sufficient in practice). This also explains why 3DES is sometimes
used with a 112-bit key (the third DES key is a copy of the first): going to
168 bits does not actually make things more secure.
This can be summarized as: we use n-DES because a simple DES is too
weak (a 56-bit key can be brute-forced by a determined attacker), but in
order to really improve security, we must go to n ≥ 3. Of course, every
additional DES implies some computational overhead (simple DES is
already quite slow in software, 3DES thrice as much).
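A sketch of the meet-in-the-middle attack on double encryption, using a
hypothetical toy cipher with 8-bit keys so the whole key space fits in a table.
The structure mirrors the real attack on 2DES, just at miniature scale:

def toy_encrypt(block: int, key: int) -> int:
    # Toy 8-bit block cipher: XOR with the key, then rotate left by 3.
    x = (block ^ key) & 0xFF
    return ((x << 3) | (x >> 5)) & 0xFF

def toy_decrypt(block: int, key: int) -> int:
    x = ((block >> 3) | (block << 5)) & 0xFF
    return (x ^ key) & 0xFF

def double_encrypt(block: int, k1: int, k2: int) -> int:
    return toy_encrypt(toy_encrypt(block, k1), k2)

def meet_in_the_middle(plain: int, cipher: int):
    # Forward table: intermediate ciphertext -> keys k1 producing it.
    forward = {}
    for k1 in range(256):
        forward.setdefault(toy_encrypt(plain, k1), []).append(k1)
    # Backward pass: decrypt under every k2 and look for a match in the middle.
    candidates = []
    for k2 in range(256):
        mid = toy_decrypt(cipher, k2)
        for k1 in forward.get(mid, []):
            candidates.append((k1, k2))
    return candidates  # about 2*2^8 cipher operations instead of 2^16

k1, k2, p = 0x3A, 0xC5, 0x42
c = double_encrypt(p, k1, k2)
print((k1, k2) in meet_in_the_middle(p, c))  # True

A single plaintext/ciphertext pair leaves many false candidates; in practice a
second known pair is used to filter them, which is why the attacker needs known
plaintext, exactly as the text says.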
29.Analyze the main concepts of the Algebraic attack
Algebraic attacks are a class of techniques which rely for their success on
some block cipher exhibiting a high degree of mathematical structure.
An algebraic attack is a method of cryptanalysis against a cipher. It
involves:

- expressing the cipher operations as a system of equations
- substituting in known data for some of the variables
- solving for the key

The attacker can choose which algebraic system to use; for example,
against one cipher he might treat the text as a vector of bits and use
Boolean algebra, while for another he might choose to treat it as a vector of
bytes and use arithmetic modulo 2^8.
What makes these attacks impractical is a combination of the sheer size of
the system of equations and nonlinearity in the relations involved. In any
algebra, solving a system of linear equations is more-or-less
straightforward provided there are more equations than variables. However,
solving nonlinear systems of equations is far harder. Cipher designers
therefore strive to make their ciphers highly nonlinear.
One technique for introducing nonlinearity is to mix operations from
different algebraic systems, for example using both arithmetic and logical
operations within the cipher so it cannot readily be described with linear
equations in either normal or boolean algebra. Another is to use S-boxes,
which are lookup tables containing nonlinear data. See
the nonlinearity section of the block cipher article for discussion.
An algebraic attack is similar to a brute force attack or a code book
attack in that it can, in theory, break any symmetric cipher but in practice it
is wildly impractical against any reasonable cipher.

30.Describe the basic concepts of the Key-related attack
In cryptography, a related-key attack is any form of cryptanalysis where the
attacker can observe the operation of a cipher under several different keys
whose values are initially unknown, but where some mathematical relationship
connecting the keys is known to the attacker. For example, the attacker might
know that the last 80 bits of the keys are always the same, even though he
doesn't know, at first, what the bits are. This appears, at first glance,
to be an unrealistic model; it would certainly be unlikely that an
attacker could persuade a human cryptographer to encrypt plaintexts
under numerous secret keys related in some way. However, modern
cryptography is implemented using complex computer protocols,
often not vetted by cryptographers, and in some cases a related-key
attack is made very feasible.

31)Determine the main ideas of the Stream ciphers attacks, write which simple
operation is used here
A stream cipher is a symmetric key cipher where plaintext digits are combined
with a pseudorandom cipher digit stream (keystream). Stream ciphers, where
plaintext bits are combined with a cipher bit stream by an exclusive-or operation
(xor), can be very secure if used properly. However they are vulnerable to attack if
certain precautions are not followed:
-keys must never be used twice
-valid decryption should never be relied on to indicate authenticity
Stream ciphers are vulnerable to attack if the same key is used twice (depth of two)
or more.

Say we send messages A and B of the same length, both encrypted using the same
key, K. The stream cipher produces a string of bits C(K) the same length as the
messages. The encrypted versions of the messages then are:
E(A) = A xor C
E(B) = B xor C
where xor is performed bit by bit. Say an adversary has intercepted E(A) and E(B).
He can easily compute:
E(A) xor E(B)
However, xor is commutative and has the property that X xor X = 0 (self-inverse)
so:
E(A) xor E(B) = (A xor C) xor (B xor C) = A xor B xor C xor C = A xor B
If one message is longer than the other, our adversary just truncates the longer
message to the size of the shorter and his attack will only reveal that portion of the
longer message. In other words, if anyone intercepts two messages encrypted with
the same key, they can recover A xor B, which is a form of running key cipher.
Even if neither message is known, as long as both messages are in a natural
language, such a cipher can often be broken by paper-and-pencil methods. If one
message is known, the solution is trivial.
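A short sketch of this "two-time pad" failure. The keystream here is just random
bytes standing in for C(K); everything follows from XOR's self-inverse property:

import os

A = b"attack at dawn!!"
B = b"retreat at noon!"
C = os.urandom(len(A))                   # keystream reused for both messages

EA = bytes(a ^ c for a, c in zip(A, C))  # E(A) = A xor C
EB = bytes(b ^ c for b, c in zip(B, C))  # E(B) = B xor C

# The keystream cancels out: E(A) xor E(B) = A xor B
leak = bytes(x ^ y for x, y in zip(EA, EB))
assert leak == bytes(a ^ b for a, b in zip(A, B))

# If one plaintext becomes known, the other falls out immediately:
print(bytes(l ^ a for l, a in zip(leak, A)))  # b'retreat at noon!'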
32)Describe the basic concepts of the Attacks on hash functions, write any types of
hash functions in cryptography, which do you know
A cryptographic hash function is a hash function which takes an input (or
'message') and returns a fixed-size alphanumeric string, which is called the hash
value (sometimes called a message digest, a digital fingerprint, a digest or
a checksum).
The ideal cryptographic hash function has four main properties:

- it is easy to compute the hash value for any given message
- it is infeasible to generate a message that has a given hash
- it is infeasible to modify a message without changing the hash
- it is infeasible to find two different messages with the same hash.
A cryptographic hash function must be able to withstand all known types of
cryptanalytic attack. At a minimum, it must have the following properties:

Pre-image resistance
Given a hash h it should be difficult to find any message m such that h =
hash(m). This concept is related to that of one-way function. Functions that
lack this property are vulnerable to preimage attacks.
Second pre-image resistance
Given an input m1 it should be difficult to find another input m2 such
that m1 ≠ m2 and hash(m1) = hash(m2). Functions that lack this property are
vulnerable to second-preimage attacks.
Collision resistance
It should be difficult to find two different messages m1 and m2 such
that hash(m1) = hash(m2). Such a pair is called a cryptographic hash
collision. This property is sometimes referred to as strong collision
resistance. It requires a hash value at least twice as long as that required for
preimage-resistance; otherwise collisions may be found by a birthday attack.
These properties imply that a malicious adversary cannot replace or modify the
input data without changing its digest. Thus, if two strings have the same digest,
one can be very confident that they are identical.
A function meeting these criteria may still have undesirable properties. Currently
popular cryptographic hash functions are vulnerable to length-extension attacks:
given hash(m) and len(m) but not m, by choosing a suitable m' an attacker can
calculate hash(m || m'), where || denotes concatenation.
This property can be used to break naive authentication schemes based on hash
functions.
CRYPTOGRAPHIC HASH FUNCTIONS:
MD2, MD4, MD5, MD6, SHA-1, SHA-256, SHA-512, GOST, HAVAL.
33)Explain how works of the Birthday Attack, describe its main concept
The birthday attack is a statistical phenomenon relevant to information security
that makes the brute forcing of one-way hashes easier. It's based on the birthday
paradox, which states that in order for there to be a 50% chance that someone in a
given room shares your birthday, you need 253 people in the room.
If, however, you are looking for a greater than 50% chance that any two
people in the room have the same birthday, you only need 23 people.
This works because the matches are based on pairs. If I choose myself as one
side of the pair, then I need a full 253 people to get to the magic number of 253
pairs. In other words, it's me combined with 253 other people to make up all 253
sets.
But if I am only concerned with matches and not necessarily someone
matching me, then we only need 23 people in the room. Why? Because it only
takes 23 people to form 253 pairs when cross-matched with each other.
So the number 253 doesn't change. That's still the number of pairs required
to reach a 50% chance of a birthday match within the room. The only question is
whether each person is able to link with every other person. If so you only need 23
people; if not, and you're comparing only to a single birthday, you need 253
people.
This applies to finding collisions in hashing algorithms because
it's much harder to find something that collides with a given hash than it is to find
any two inputs that hash to the same value.
As an example, consider the scenario in which a teacher with a class of 30 students
asks for everybody's birthday, to determine whether any two students have the
same birthday (corresponding to a hash collision as described further [for
simplicity, ignore February 29]). Intuitively, this chance may seem small. If the
teacher picked a specific day (say September 16), then the chance that at least one
student was born on that specific day is 1 - (364/365)^30, about 7.9%. However,
the probability that at least one student has the same birthday as any other student
is around 70% for n = 30, from the formula 1 - 365!/((365-n)! * 365^n).
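The two probabilities just quoted can be checked directly with a few lines of
Python:

from math import prod

n = 30

# Chance that at least one of n students was born on one specific day:
print(1 - (364 / 365) ** n)                         # about 0.079

# Chance that some pair among the n students shares a birthday:
print(1 - prod((365 - k) / 365 for k in range(n)))  # about 0.706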
34)Describe the basic concepts of the Birthday Attack on hash function, write any
types of hash functions in cryptography, which do you know
Every cryptographic hash function is inherently vulnerable to collisions using
a birthday attack. Due to the birthday problem, these attacks are much faster than a
brute force would be. A hash of n bits can be broken in 2^(n/2) time (evaluations
of the hash function).
More efficient attacks are possible by employing cryptanalysis to specific
hash functions. When a collision attack is discovered and is found to be faster than
a birthday attack, a hash function is often denounced as "broken". The NIST hash
function competition was largely induced by published collision attacks against
two very commonly used hash functions, MD5 and SHA-1. The collision attacks
against MD5 have improved so much that it takes just a few seconds on a regular
computer. Hash collisions created this way are usually constant length and largely
unstructured, so cannot directly be applied to attack widespread document formats
or protocols.
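As a sketch of the 2^(n/2) behaviour, the following finds a collision in an
artificially truncated hash by the birthday method; truncation to 32 bits stands in
for a weak hash, since a birthday attack on full SHA-256 (2^128 work) is far out
of reach:

import hashlib
from itertools import count

def truncated_hash(msg: bytes, bits: int = 32) -> int:
    # Deliberately weakened hash: SHA-256 truncated to `bits` bits.
    digest = hashlib.sha256(msg).digest()
    return int.from_bytes(digest, "big") >> (256 - bits)

def birthday_collision(bits: int = 32):
    seen = {}  # hash value -> a message that produced it
    for i in count():
        msg = b"msg-%d" % i
        h = truncated_hash(msg, bits)
        if h in seen:
            return seen[h], msg  # expected after roughly 2^(bits/2) attempts
        seen[h] = msg

m1, m2 = birthday_collision()
print(m1, m2, hex(truncated_hash(m1)))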
35)Determine the main term of hash function and analyze what kind of attack on
MD5 you know
A hash function is any function that can be used to map digital data of
arbitrary size to digital data of fixed size, with slight differences in input data
producing very big differences in output data. Hash functions are related to (and
often confused with) checksums, check digits, fingerprints, randomization
functions, error-correcting codes, and ciphers. Although these concepts overlap to
some extent, each has its own uses and requirements and is designed and optimized
differently. The Hash Keeper database maintained by the American National Drug
Intelligence Center, for instance, is more aptly described as a catalog of file
fingerprints than of hash values.
A hash value can be used to uniquely identify secret information. This
requires that the hash function is collision resistant, which means that it is very
hard to find data that generate the same hash value. These functions are
categorized into cryptographic hash functions and provably secure hash functions.
Functions in the second category are the most secure but also too slow for most
practical purposes. Collision resistance is accomplished in part by generating very

large hash values. For example SHA-1, one of the most widely used cryptographic
hash functions, generates 160 bit values.
The security of the MD5 hash function is severely compromised. A collision
attack exists that can find collisions within seconds on a computer with a 2.6 GHz
Pentium 4 processor (complexity of 2^24.1).
In April 2009, a preimage attack against MD5 was published that breaks
MD5's preimage resistance. This attack is only theoretical, with a computational
complexity of 2^123.4 for full preimage.

36)Explain the main term of hash function and analyze what kind of attack on
SHA-1, SHA-2 you know
In cryptography, SHA-1 is a cryptographic hash function designed by the United
States National Security Agency and is a U.S. Federal Information Processing
Standard. In February 2005, attacks were announced that can find collisions in the
full version of SHA-1, requiring fewer than 2^69 operations.
SHA-2 is a set of cryptographic hash functions designed by the NSA. In 2005,
security flaws were identified in SHA-1, namely that a mathematical weakness
might exist, indicating that a stronger hash function would be desirable. [6] Although
SHA-2 bears some similarity to the SHA-1 algorithm, these attacks have not been
successfully extended to SHA-2.
Currently, the best public attacks break preimage resistance for 52 rounds of
SHA-256 or 57 rounds of SHA-512, and collision resistance for 46 rounds of
SHA-256, as shown in the table below.
Published in | Year | Attack method | Attack | Variant | Rounds | Complexity
New Collision Attacks Against Up To 24-step SHA-2 [25] | 2008 | Deterministic | Collision | SHA-256 | 24/64 | 2^28.5
 | | | | SHA-512 | 24/80 | 2^32.5
Preimages for step-reduced SHA-2 [26] | 2009 | Meet-in-the-middle | Preimage | SHA-256 | 42/64 | 2^251.7
 | | | | SHA-256 | 43/64 | 2^254.9
 | | | | SHA-512 | 42/80 | 2^502.3
 | | | | SHA-512 | 46/80 | 2^511.5
Advanced meet-in-the-middle preimage attacks [27] | 2010 | Meet-in-the-middle | Preimage | SHA-256 | 42/64 | 2^248.4
 | | | | SHA-512 | 42/80 | 2^494.6
Higher-Order Differential Attack on Reduced SHA-256 [2] | 2011 | Differential | Pseudo-collision | SHA-256 | 46/64 | 2^178
 | | | | SHA-256 | 46/64 | 2^46
Bicliques for Preimages: Attacks on Skein-512 and the SHA-2 family [1] | 2011 | Biclique | Preimage | SHA-256 | 45/64 | 2^255.5
 | | | | SHA-512 | 50/80 | 2^511.5
 | | | Pseudo-preimage | SHA-256 | 52/64 | 2^255
 | | | | SHA-512 | 57/80 | 2^511
Improving Local Collisions: New Attacks on Reduced SHA-256 [28] | 2013 | Differential | Collision | SHA-256 | 31/64 | 2^65.5
 | | | Pseudo-collision | SHA-256 | 38/64 | 2^37
Branching Heuristics in Differential Collision Search with Applications to SHA-512 [29] | 2014 | Heuristic differential | Pseudo-collision | SHA-512 | 38/80 | 2^40.5

37)Describe the algorithm SHA-3, explain main concept of hash functions


To be considered for the SHA-3 standard, candidate hash functions had to meet
four conditions set by NIST. If a candidate failed to meet these conditions, it was
disqualified:
The candidate hash function had to perform well regardless of
implementation. It should expend minimal resources even when hashing large
amounts of message text. Many proposed candidates were actually unable to meet
this requirement.
The candidate function had to be conservative about security. It should
withstand known attacks, while maintaining a large safety factor. It should emit the
same four hash sizes as SHA-2 (224-, 256-, 384-, or 512-bits wide), but be able to
supply longer hash sizes if need be.
The candidate function had to be subjected to cryptanalysis. Both source
code and analytical results were made public for interested third-parties to review
and comment. Any weaknesses found during analysis were to be addressed,
through tweaks or through redesign.
The candidate function had to exercise code diversity. It could not use the
Merkle-Damgård construction to produce the message hash.
The SHA-3 competition saw 51 candidate functions enter the first round of
evaluations. Out of those, 14 managed to advance to the second round. Round
three saw the candidates whittled down to five. And from those five, Keccak was
declared the winner.
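A quick sketch using Python's hashlib (which has included SHA-3 since Python
3.6) to show the avalanche behaviour expected of the winner, Keccak: a small
change to the input flips about half of the 256 output bits.

import hashlib

m1 = b"exam question 37"
m2 = b"exam question 38"  # differs from m1 by a couple of bits

d1 = hashlib.sha3_256(m1).digest()
d2 = hashlib.sha3_256(m2).digest()

# Count how many of the 256 output bits differ (expected: around 128).
diff_bits = sum(bin(a ^ b).count("1") for a, b in zip(d1, d2))
print(diff_bits, "of 256 output bits differ")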
38)Describe the basic concepts of the RSA attacks, and analyze on which
algorithm and theorems it works
RSA is one of the first practicable public-key cryptosystems and is widely used for
secure data transmission. In such a cryptosystem, the encryption key is public and
differs from the decryption key which is kept secret. In RSA, this asymmetry is
based on the practical difficulty of factoring the product of two large prime
numbers, the factoring problem. RSA stands for Ron Rivest, Adi Shamir
and Leonard Adleman, who first publicly described the algorithm in 1977.
It's now time to get into the details of attacks on RSA.
SEARCHING THE MESSAGE SPACE
One of the seeming weaknesses of public key cryptography is that one
has to give away to everybody the algorithm that encrypts the data. If

the message space is small, then one could simply try to encrypt every
possible message block, until a match is found with one of the
ciphertext blocks. In practice this would be an insurmountable task
because the block sizes are quite large.
GUESSING D
Another possible attack is a known-plaintext attack: this time the
attacker knows both the plaintext and ciphertext (they simply have to
encrypt something). They then try to crack the key to discover the
private exponent, d. This might involve trying every possible key in the
system on the ciphertext until it returns to the original plaintext.
Once d has been discovered it is easy to find the factors of n (for
example use the algorithm in chapter 8 of The Handbook of Applied
Cryptography). Then the system has been broken completely and all
further ciphertexts can be decrypted.
The problem with this attack is that it is slow. There are an enormous
number of possible ds to try. This method is a factorizing algorithm as it
allows us to factor n. Since factorizing is an intractable problem we
know this is very difficult. This method is not the fastest way to
factorize n. Therefore one is suggested to focus effort into using a more
efficient algorithm specifically designed to factor n. This advice was
given in the original paper.
CYCLE ATTACK
This attack is very similar to the last. The idea is that we encrypt the
ciphertext repeatedly, counting the iterations, until the original text
appears. This number of re-cycles will decrypt any ciphertext. Again
this method is very slow and for a large key it is not a practical attack. A
generalisation of the attack allows the modulus to be factored and it
works faster the majority of the time. But even this will still have
difficulty when a large key is used. Also, the use of strong primes aids
the security.
The bottom line is that the generalized form of the cipher attack is
another factoring algorithm. It is not efficient, and therefore the attack is
not good enough compared with modern factoring algorithms (e.g.
Number Field Sieve).
I noticed an improvement on this algorithm. The suggested way is to use
the public exponent of the public key to re-encrypt the text. However,
any exponent should work so long as it is coprime to (p-1)(q-1)
(where p, q are factors of the modulus). So I suggest using an
exponent such as 2^16 + 1. This number has only two 1s in its binary
representation. Using binary fast exponentiation, we use only 16
modular squarings and 1 modular multiplication. This is likely to be
faster than the actual public exponent. The trouble is that we cannot be
sure that it is coprime to (p-1)(q-1). In practice, many RSA systems
use 2^16 + 1 as the encrypting exponent for its speed.
COMMON MODULUS
One of the early weaknesses found was in a system of RSA where the
users within an organization would share the public modulus. That is to
say, the administration would choose the public modulus securely and
generate pairs of encryption and decryption exponents (public and
private keys) and distribute them all the employees/users. The reason for
doing this is to make it convenient to manage and to write software for.
However, Simmons shows how this would allow any eavesdropper to
view any messages encrypted with two keys; for example when a memo
is sent to several employees. DeLaurentis went further to demonstrate
how the system was at even more risk from insiders, who could break
the system completely, allowing them to view all messages and sign
with anybody's key.
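A sketch of the common-modulus attack just described, under the stated assumption
that the same message m is encrypted under two coprime exponents e1, e2 with one
shared modulus n (toy numbers; pow(x, -1, n) needs Python 3.8+):

def egcd(a: int, b: int):
    # Extended Euclid: returns (g, x, y) with a*x + b*y = g = gcd(a, b).
    if b == 0:
        return a, 1, 0
    g, x, y = egcd(b, a % b)
    return g, y, x - (a // b) * y

def common_modulus_attack(c1: int, c2: int, e1: int, e2: int, n: int) -> int:
    g, a, b = egcd(e1, e2)
    assert g == 1, "exponents must be coprime"
    if a < 0:
        c1, a = pow(c1, -1, n), -a  # replace c1 by its modular inverse
    if b < 0:
        c2, b = pow(c2, -1, n), -b
    # m^(a*e1 + b*e2) = m^1 = m (mod n)
    return pow(c1, a, n) * pow(c2, b, n) % n

n = 61 * 53                           # toy modulus
e1, e2, m = 17, 5, 42                 # gcd(17, 5) = 1
c1, c2 = pow(m, e1, n), pow(m, e2, n)
print(common_modulus_attack(c1, c2, e1, e2, n))  # 42, with no factoring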
FAULTY ENCRYPTION
Joye and Quisquater showed how to capitalise on the common modulus
weakness due to a transient error when transmitting the public key.
Consider the situation where an attacker, Malory, has access to the
communication channel used by Alice and Bob. In other words, Malory
can listen to anything that is transmitted, and can also change what is
transmitted. Alice wishes to talk privately to Bob, but does not know his
public key. She requests by sending an email, to which Bob replies. But
during transmission, Malory is able to see the public key and decides to
flip a single bit in the public exponent of Bob, changing (e,n) to (e',n).
When Alice receives the faulty key, she encrypts the prepared message
and sends it to Bob (Malory also gets it). But of course, Bob cannot
decrypt it because the wrong key was used. So he lets Alice know and
they agree to try again, starting with Bob re-sending his public key. This
time Malory does not interfere. Alice sends the message again, this time
encrypted with the correct public key.
Malory now has two ciphertexts, one encrypted with the faulty exponent
and one with the correct one. She also knows both these exponents and
the public modulus. Therefore she can now apply the common modulus
attack to retrieve Alice's message, assuming that Alice was foolish
enough to encrypt exactly the same message the second time.
A demonstration of the Common Modulus attack and the Faulty
Encryption attack can be found in this Mathematica notebook.
LOW EXPONENT

In the Cycle Attack section above, I suggested that the encrypting exponent
could be chosen to make the system more efficient. Many RSA systems use
e = 3 to make encryption faster. However, there is a vulnerability with this
choice. If the same message is encrypted 3 times with different keys (that is,
same exponent, different moduli) then we can retrieve the message. The attack
is based on the Chinese Remainder Theorem. See The Handbook of Applied
Cryptography for an explanation and algorithm.
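A sketch of this low-exponent broadcast attack for e = 3 (toy primes for brevity):
the same message sent to three recipients with pairwise-coprime moduli is
recovered with the Chinese Remainder Theorem and an integer cube root, no
factoring needed.

from functools import reduce

def crt(residues, moduli):
    # Chinese Remainder Theorem for pairwise-coprime moduli.
    N = reduce(lambda a, b: a * b, moduli)
    x = 0
    for r, n in zip(residues, moduli):
        Ni = N // n
        x += r * Ni * pow(Ni, -1, n)  # pow(x, -1, n) needs Python 3.8+
    return x % N

def icbrt(n: int) -> int:
    # Integer cube root by binary search.
    lo, hi = 0, 1 << (n.bit_length() // 3 + 2)
    while lo < hi:
        mid = (lo + hi) // 2
        if mid ** 3 < n:
            lo = mid + 1
        else:
            hi = mid
    return lo

e = 3
moduli = [61 * 53, 67 * 71, 73 * 79]  # three pairwise-coprime toy moduli
m = 42                                # same message to all three recipients
cts = [pow(m, e, n) for n in moduli]

x = crt(cts, moduli)  # equals m^3 exactly, because m^3 < n1*n2*n3
print(icbrt(x))       # 42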
FACTORING THE PUBLIC KEY
Factoring the public key is seen as the best way to go about cracking RSA.
39)Specify the NIST statistical tests, list types which you remember
NIST- National Institute of Standard and Technology
A total of fifteen statistical tests were developed, implemented and evaluated. The
following describes each of the tests.
FREQUENCY (MONOBITS) TEST
Description: The focus of the test is the proportion of zeroes and ones for the entire
sequence. The purpose of this test is to determine whether that number of ones and
zeros in a sequence are approximately the same as would be expected for a truly
random sequence. The test assesses the closeness of the fraction of ones to 1/2, that
is, the number of ones and zeroes in a sequence should be about the same.
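A sketch of the monobit computation as specified in NIST SP 800-22: map bits to
±1, sum, normalize, and convert to a p-value with the complementary error
function (0.01 is the suite's default significance level):

import math

def monobit_test(bits: str) -> float:
    # NIST frequency (monobit) test: p-value for a string of '0'/'1' characters.
    n = len(bits)
    s = sum(1 if b == "1" else -1 for b in bits)  # zeroes count as -1
    s_obs = abs(s) / math.sqrt(n)
    return math.erfc(s_obs / math.sqrt(2))

p = monobit_test("1011010101" * 100)  # 60% ones: clearly unbalanced
print(p, "random-looking" if p >= 0.01 else "non-random")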
RUNS TEST
Description: The focus of this test is the total number of zero and one runs in the
entire sequence, where a run is an uninterrupted sequence of identical bits. A run of
length k means that a run consists of exactly k identical bits and is bounded before and
after with a bit of the opposite value. The purpose of the runs test is to determine
whether the number of runs of ones and zeros of various lengths is as expected for a
random sequence. In particular, this test determines whether the oscillation between
such substrings is too fast or too slow.
LINEAR COMPLEXITY TEST
Description: The focus of this test is the length of a generating feedback register. The
purpose of this test is to determine whether or not the sequence is complex enough to
be considered random. Random sequences are characterized by a longer feedback
register. A short feedback register implies non-randomness.
SERIAL TEST
Description: The focus of this test is the frequency of each and every overlapping
m-bit pattern across the entire sequence. The purpose of this test is to determine
whether the number of occurrences of the 2^m m-bit overlapping patterns is
approximately the same as would be expected for a random sequence. The patterns
may overlap.

APPROXIMATE ENTROPY TEST


Description: The focus of this test is the frequency of each and every overlapping
m-bit pattern. The purpose of the test is to compare the frequency of overlapping
blocks of two consecutive/adjacent lengths (m and m+1) against the expected result
for a random sequence.
CUMULATIVE SUM (CUSUM) TEST
Description: The focus of this test is the maximal excursion (from zero) of the random
walk defined by the cumulative sum of adjusted (-1, +1) digits in the sequence. The
purpose of the test is to determine whether the cumulative sum of the partial
sequences occurring in the tested sequence is too large or too small relative to the
expected behavior of that cumulative sum for random sequences. This cumulative sum
may be considered as a random walk. For a random sequence, the random walk
should be near zero. For non-random sequences, the excursions of this random walk
away from zero will be too large.
RANDOM EXCURSIONS TEST
Description: The focus of this test is the number of cycles having exactly K visits in a
cumulative sum random walk. The cumulative sum random walk is found if partial sums
of the (0,1) sequence are adjusted to (-1, +1). A random excursion of a random walk
consists of a sequence of n steps of unit length taken at random that begin at and
return to the origin. The purpose of this test is to determine if the number of visits to a
state within a random walk exceeds what one would expect for a random sequence.
RANDOM EXCURSIONS VARIANT TEST
Description: The focus of this test is the number of times that a particular state occurs
in a cumulative sum random walk. The purpose of this test is to detect deviations from
the expected number of occurrences of various states in the random walk.

40)Describe main concept of block cipher. Give an overview of the modern
standard of block cipher
In cryptography, a block cipher is a deterministic algorithm operating on fixed-length groups of bits,
called blocks, with an unvarying transformation that is specified by a symmetric key. Block ciphers
are important elementary components in the design of many cryptographic protocols, and are widely
used to implement encryption of bulk data.
The modern design of block ciphers is based on the concept of an iterated product cipher. Iterated
product ciphers carry out encryption in multiple rounds, each of which uses a different subkey
derived from the original key. One widespread implementation of such ciphers is called a Feistel
network, named after Horst Feistel, and notably implemented in the DES cipher.[2] Many other
realizations of block ciphers, such as the AES, are classified as substitution-permutation networks.[3]
The publication of the DES cipher by the U.S. National Bureau of Standards (now National Institute
of Standards and Technology, NIST) in 1977 was fundamental in the public understanding of modern
block cipher design. In the same way, it influenced the academic development of cryptanalytic

attacks. Both differential and linear cryptanalysis arose out of studies on the DES design. Today,
there is a palette of attack techniques against which a block cipher must be secure, in addition to
being robust against brute force attacks.
A block cipher consists of two paired algorithms, one for encryption, E, and the other for
decryption, D.[4] Both algorithms accept two inputs: an input block of size n bits and a key of
size k bits; and both yield an n-bit output block. The decryption algorithm D is defined to be
the inverse function of encryption, i.e., D = E^(-1). More formally,[5][6] a block cipher is specified by an
encryption function
E: {0,1}^k × {0,1}^n → {0,1}^n,  C = E_K(P) = E(K, P),
which takes as input a key K of bit length k, called the key size, and a bit string P of length n,
called the block size, and returns a string C of n bits. P is called the plaintext, and C is termed
the ciphertext. For each K, the function E_K(P) is required to be an invertible mapping on {0,1}^n.
The inverse for E is defined as a function
D: {0,1}^k × {0,1}^n → {0,1}^n,  P = D_K(C) = D(K, C),
taking a key K and a ciphertext C to return a plaintext value P, such that
D_K(E_K(P)) = P for every key K and plaintext P.
For example, a block cipher encryption algorithm might take a 128-bit block of plaintext as input, and
output a corresponding 128-bit block of ciphertext. The exact transformation is controlled using a
second input, the secret key. Decryption is similar: the decryption algorithm takes, in this example,
a 128-bit block of ciphertext together with the secret key, and yields the original 128-bit block of plain
text.[7]
For each key K, E_K is a permutation (a bijective mapping) over the set of input blocks. Each key
selects one permutation from the set of (2^n)! possible permutations.[8]
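Since the Feistel network is the classic realization named above, here is a minimal
sketch of one. The round function is a toy stand-in; real designs such as DES use
S-boxes and a proper key schedule:

def round_function(half: int, subkey: int) -> int:
    # Toy round function on 32-bit values (NOT cryptographically strong).
    x = (half ^ subkey) & 0xFFFFFFFF
    return ((x * 0x9E3779B1) ^ (x >> 16)) & 0xFFFFFFFF

def feistel_encrypt(block: int, subkeys) -> int:
    left, right = block >> 32, block & 0xFFFFFFFF
    for k in subkeys:
        left, right = right, left ^ round_function(right, k)
    return (right << 32) | left  # undo the final swap, as is conventional

def feistel_decrypt(block: int, subkeys) -> int:
    # The same network run with the subkeys in reverse order.
    return feistel_encrypt(block, list(subkeys)[::-1])

keys = [0x11111111, 0x22222222, 0x33333333, 0x44444444]
p = 0x0123456789ABCDEF
c = feistel_encrypt(p, keys)
assert feistel_decrypt(c, keys) == p  # decryption inverts encryption
print(hex(c))

Note the defining property: decryption needs only the reversed subkey order,
never an inverse of the round function, which is why Feistel designs are so
convenient to implement.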



41)Give an overview of the AES attacks , list advantages of AES
The Advanced Encryption Standard (AES) is a specification for the encryption of electronic data
established by the U.S. National Institute of Standards and Technology (NIST) in 2001.[4]
AES is based on the Rijndael cipher [5] developed by two Belgian cryptographers,
Joan Daemen and Vincent Rijmen, who submitted a proposal to NIST during the
AES selection process.[6]

Rijndael is a family of ciphers with different key and block sizes.

For AES, NIST selected three members of the Rijndael family, each with a block size of 128 bits, but
three different key lengths: 128, 192 and 256 bits.
AES became effective as a federal government standard on May 26, 2002 after approval by
the Secretary of Commerce. AES is included in the ISO/IEC 18033-3 standard. AES is available in
many different encryption packages, and is the first publicly accessible and open cipher approved by
the National Security Agency (NSA) for top secret information when used in an NSA approved
cryptographic module (see Security of AES, below).

AES has a fairly simple algebraic description. [15] In 2002, a theoretical attack, termed the "XSL
attack", was announced by Nicolas Courtois and Josef Pieprzyk, purporting to show a weakness in
the AES algorithm due to its simple description. [16] Since then, other papers have shown that the
attack as originally presented is unworkable; see XSL attack on block ciphers.
During the AES process, developers of competing algorithms wrote of Rijndael, "...we are concerned
about [its] use...in security-critical applications." [17] However, in October 2000 at the end of the AES
selection process, Bruce Schneier, a developer of the competing algorithm Twofish, wrote that while
he thought successful academic attacks on Rijndael would be developed someday, he does not
"believe that anyone will ever discover an attack that will allow someone to read Rijndael traffic." [18]
On July 1, 2009, Bruce Schneier blogged [19] about a related-key attack on the 192-bit and 256-bit
versions of AES, discovered by Alex Biryukov and Dmitry Khovratovich,[20] which exploits AES's
somewhat simple key schedule and has a complexity of 2^119. In December 2009 it was improved to
2^99.5. This is a follow-up to an attack discovered earlier in 2009 by Alex Biryukov, Dmitry Khovratovich,
and Ivica Nikolić, with a complexity of 2^96 for one out of every 2^35 keys.[21]
Another attack was blogged by Bruce Schneier [22] on July 30, 2009 and released as a preprint [23] on
August 3, 2009. This new attack, by Alex Biryukov, Orr Dunkelman, Nathan Keller, Dmitry
Khovratovich, and Adi Shamir, is against AES-256 that uses only two related keys and 2^39 time to
recover the complete 256-bit key of a 9-round version, or 2^45 time for a 10-round version with a
stronger type of related subkey attack, or 2^70 time for an 11-round version. 256-bit AES uses 14
rounds, so these attacks aren't effective against full AES.
Side-channel attacks do not attack the underlying cipher, and thus are not related to security in that
context. They rather attack implementations of the cipher on systems which inadvertently leak data.
There are several such known attacks on certain implementations of AES.
In April 2005, D.J. Bernstein announced a cache-timing attack that he used to break a custom server
that used OpenSSL's AES encryption.[27] The attack required over 200 million chosen plaintexts.[28]
The custom server was designed to give out as much timing information as possible (the server reports back the number of machine cycles taken by the encryption operation); however, as
Bernstein pointed out, "reducing the precision of the server's timestamps, or eliminating them from
the server's responses, does not stop the attack: the client simply uses round-trip timings based on
its local clock, and compensates for the increased noise by averaging over a larger number of
samples."[27]
In October 2005, Dag Arne Osvik, Adi Shamir and Eran Tromer presented a paper demonstrating
several cache-timing attacks against AES. [29] One attack was able to obtain an entire AES key after
only 800 operations triggering encryptions, in a total of 65 milliseconds. This attack requires the
attacker to be able to run programs on the same system or platform that is performing AES.

Advantages of 3DES over AES:
• AES in Galois/Counter Mode (GCM) is challenging to implement in software in a manner that is both performant and secure.
• 3DES is easy to implement (and accelerate) in both hardware and software.
• 3DES is ubiquitous: most systems, libraries, and protocols include support for it.

Advantages of AES over 3DES:
• AES is more secure (it is less susceptible to cryptanalysis than 3DES).
• AES supports larger key sizes than 3DES's 112 or 168 bits.
• AES is faster in both hardware and software.
• AES's 128-bit block size makes it less open to attacks via the birthday problem than 3DES with its 64-bit block size (see the sketch after this list).
• AES is required by the latest U.S. and international standards.
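The birthday-problem item above can be checked numerically. A small sketch, assuming the usual estimate that ciphertext-block collisions become likely after roughly 2^(n/2) blocks (the function name is illustrative):

from math import log, log2, sqrt

def blocks_for_50pct_collision(block_bits: int) -> float:
    """Blocks processed before a ciphertext-block collision is ~50% likely."""
    n = 2.0 ** block_bits
    return sqrt(2 * n * log(2))           # ~1.177 * 2^(block_bits / 2)

for name, bits in (("3DES, 64-bit block", 64), ("AES, 128-bit block", 128)):
    print(f"{name}: ~2^{log2(blocks_for_50pct_collision(bits)):.1f} blocks")

For a 64-bit block the bound is only about 2^32 blocks (a few hundred gigabytes of data), which is why the small block size of 3DES matters in practice.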

42) Give an overview of the GOST attacks; write a short history of GOST: who wrote it and where it was developed
The GOST block cipher, defined in the standard GOST 28147-89, is a Soviet and Russian
government standard symmetric key block cipher. Also based on this block cipher is the GOST hash
function.
Developed in the 1970s, the standard had been marked "Top Secret" and then downgraded to
"Secret" in 1990. Shortly after the dissolution of the USSR, it was declassified and it was released to
the public in 1994. GOST 28147 was a Soviet alternative to the United States standard
algorithm, DES.[1] Thus, the two are very similar in structure.
GOST has a 64-bit block size and a key length of 256 bits. Its S-boxes can be secret, and they contain about 354 bits of secret information (log2((16!)^8) ≈ 354), so the effective key size can be increased to 610 bits; however, a chosen-key attack can recover the contents of the S-boxes in approximately 2^32 encryptions.[2]
GOST is a Feistel network of 32 rounds. Its round function is very simple: add a 32-bit subkey modulo 2^32, put the result through a layer of S-boxes, and rotate that result left by 11 bits. The result of that is the output of the round function.
The subkeys are chosen in a pre-specified order. The key schedule is very simple: break the 256-bit
key into eight 32-bit subkeys, and each subkey is used four times in the algorithm; the first 24 rounds
use the key words in order, the last 8 rounds use them in reverse order.
The S-boxes accept a four-bit input and produce a four-bit output. The S-box substitution in the round function consists of eight 4×4 S-boxes. The S-boxes are implementation-dependent: parties that want to secure their communications using GOST must be using the same S-boxes. For extra security, the S-boxes can be kept secret. In the original standard where GOST was specified, no S-boxes were given, but they were to be supplied somehow. This led to speculation that organizations the government wished to spy on were given weak S-boxes. One GOST chip manufacturer reported that he generated S-boxes himself using a pseudorandom number generator.
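The round function and key schedule described above are simple enough to sketch in Python. This is a minimal illustration, not a validated GOST implementation: the S-box table below is a dummy placeholder (the standard left S-boxes implementation-defined), and conventions such as word order and the final swap are glossed over.

MASK32 = 0xFFFFFFFF

# DUMMY S-boxes: eight 4-bit permutations used only for illustration.
SBOX = [[(7 * i + j) % 16 for i in range(16)] for j in range(8)]

def rol32(x: int, r: int) -> int:
    """Rotate a 32-bit word left by r bits."""
    return ((x << r) | (x >> (32 - r))) & MASK32

def round_function(r: int, subkey: int) -> int:
    t = (r + subkey) & MASK32               # add the subkey modulo 2^32
    out = 0
    for i in range(8):                      # eight 4-bit S-box lookups
        out |= SBOX[i][(t >> (4 * i)) & 0xF] << (4 * i)
    return rol32(out, 11)                   # rotate the result left by 11 bits

def gost_encrypt_block(left: int, right: int, key256: int):
    subkeys = [(key256 >> (32 * i)) & MASK32 for i in range(8)]
    schedule = subkeys * 3 + subkeys[::-1]  # 24 rounds in order, 8 reversed
    for k in schedule:                      # 32 Feistel rounds
        left, right = right, left ^ round_function(right, k)
    return right, left                      # undo the last swap, Feistel-style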

43) Analyze the main requirements for a modern cipher, list the ciphers which you know in cryptography
Most modern ciphers can be categorized in several ways:
• By whether they work on blocks of symbols usually of a fixed size (block ciphers), or on a continuous stream of symbols (stream ciphers).
• By whether the same key is used for both encryption and decryption (symmetric key algorithms), or if a different key is used for each (asymmetric key algorithms). If the algorithm is symmetric, the key must be known to the recipient and sender and to no one else. If the algorithm is an asymmetric one, the enciphering key is different from, but closely related to, the deciphering key. If one key cannot be deduced from the other, the asymmetric key algorithm has the public/private key property and one of the keys may be made public without loss of confidentiality.
Modern encryption methods can be divided by two criteria: by type of key used, and by type of input
data.
By type of key used ciphers are divided into:
• symmetric key algorithms (Private-key cryptography), where the same key is used for encryption and decryption, and
• asymmetric key algorithms (Public-key cryptography), where two different keys are used for encryption and decryption.
In a symmetric key algorithm (e.g., DES and AES), the sender and receiver must have a shared key
set up in advance and kept secret from all other parties; the sender uses this key for encryption, and
the receiver uses the same key for decryption. The Feistel cipher uses a combination of substitution
and transposition techniques. Most block cipher algorithms are based on this structure. In an
asymmetric key algorithm (e.g., RSA), there are two separate keys: a public key is published and
enables any sender to perform encryption, while a private key is kept secret by the receiver and
enables only him to perform correct decryption.
Ciphers can be distinguished into two types by the type of input data:
• block ciphers, which encrypt blocks of data of fixed size, and
• stream ciphers, which encrypt continuous streams of data (see the sketch below).
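As a toy illustration of the stream case (added here; Python's random module is not a secure keystream generator):

import random

def toy_stream_cipher(data: bytes, seed: int) -> bytes:
    """XOR each byte with a keystream byte; the same call encrypts and decrypts."""
    rng = random.Random(seed)                  # keystream generator (toy!)
    return bytes(b ^ rng.randrange(256) for b in data)

msg = b"stream ciphers work symbol by symbol"
ct = toy_stream_cipher(msg, seed=42)
assert toy_stream_cipher(ct, seed=42) == msg   # XOR with same keystream inverts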
44) Compare and classify the attacks: differential and linear cryptanalysis

Multi-round ciphers such as DES are clearly very difficult to crack. One property they
have is that even if one has some corresponding plaintext and ciphertext, it is not at all
easy to determine what key has been used.

DIFFERENTIAL CRYPTANALYSIS

However, if one is fortunate enough to have a large quantity of corresponding plaintext and ciphertext blocks for a particular unknown key, a technique called differential cryptanalysis, developed by Eli Biham and Adi Shamir, is available to obtain clues about some bits of the key, thereby shortening an exhaustive search.
After two rounds of DES, knowing both the input and output, it is trivial to determine the two subkeys used, since the outputs of both f-functions are known. For each S-box, there are four possible inputs to produce the known output. Since each subkey is 48 bits long, but the key is only 56 bits long, finding which of the four possibilities is true for each group of six bits in the subkeys is a bit like solving a crossword puzzle.
Once the number of rounds increases to four, the problem becomes much harder.
However, it is still true that the output depends on the input and the key. For a limited
number of rounds, it is inevitable, without the need for any flaws in the S-boxes, that
there will be some cases where a bit or a combination of bits in the output will have
some correlation with a simple combination of some input bits and some key bits.
Ideally, that correlation should be absolute with respect to the key bits, since there is
only one key to solve for, but it can be probabilistic with respect to the input and
output bits, since there need to be many pairs to test.
As the number of rounds increases, though, the simple correlations disappear.
Differential cryptanalysis represents an approach to finding more subtle correlations.
Instead of saying "if this bit is 1 in the input, then that bit will be 0 (or 1) in the
output", we say "changing this bit in the input changes (or does not change) that bit in
the output".
In fact, however, a complete pattern of which bits change and do not change in the input and in the output is the subject of differential cryptanalysis. The basic principle of differential cryptanalysis, in its classic form, is this: the cipher being attacked has a characteristic if there exists a constant X such that, given many pairs of plaintexts A, B with B = A xor X, then, provided a certain statement is true about the key, E(B,k) = E(A,k) xor Y holds for some constant Y with a probability somewhat above that given by random chance.
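This pair-counting view can be made concrete on a single S-box. A sketch (illustrative; the table is the first row of DES's S1) that tallies how often a fixed input difference X maps to each output difference Y:

from collections import Counter

# First row of DES's S1 as an example 4-bit S-box
SBOX = [14, 4, 13, 1, 2, 15, 11, 8, 3, 10, 6, 12, 5, 9, 0, 7]

X = 0x1                                    # fixed input difference
diffs = Counter(SBOX[a] ^ SBOX[a ^ X] for a in range(16))
for Y, count in diffs.most_common(3):
    print(f"input difference {X:#x} -> output difference {Y:#x} "
          f"with probability {count}/16")

Output differences that occur far more often than the uniform 1/16 are exactly the "characteristics" an attacker chains across rounds.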
LINEAR CRYPTANALYSIS

Linear cryptanalysis, invented by Mitsuru Matsui, is a different, but related, technique. Instead of looking for isolated points at which a block cipher behaves like something simpler, it involves trying to create a simpler approximation to the block cipher as a whole.
For a great many plaintext-ciphertext pairs, the key that would produce that pair from the simplified cipher is found, and key bits which tend to be favored are likely to have the value of the corresponding bit of the key for the real cipher. The principle is a bit like the summation of many one-dimensional scans to produce a two-dimensional slice through an object in computer-assisted tomography.
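The same toy S-box as above can illustrate what "a simpler approximation" means: a sketch (illustrative) that measures how far each linear mask pair (a, b) deviates from agreeing with the S-box half the time, i.e., its bias:

SBOX = [14, 4, 13, 1, 2, 15, 11, 8, 3, 10, 6, 12, 5, 9, 0, 7]

def parity(v: int) -> int:
    """Parity of the bits selected by a mask-AND."""
    return bin(v).count("1") & 1

def bias(a: int, b: int) -> float:
    """How far Pr[a.x == b.S(x)] deviates from 1/2 over all 16 inputs."""
    matches = sum(parity(a & x) == parity(b & SBOX[x]) for x in range(16))
    return matches / 16 - 0.5

best = max(((a, b, bias(a, b)) for a in range(1, 16) for b in range(1, 16)),
           key=lambda t: abs(t[2]))
print(f"best linear approximation: a={best[0]:#x}, b={best[1]:#x}, bias={best[2]:+.3f}")

The larger the bias, the fewer plaintext-ciphertext pairs are needed before the favored key bits stand out statistically.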

45) Classify the attacks: boomerang and slide attack


In cryptography, the boomerang attack is a method for the cryptanalysis of block ciphers based on differential cryptanalysis. The attack was published in 1999 by David Wagner, who used it to break the COCONUT98 cipher.
The boomerang attack has allowed new avenues of attack for many ciphers previously deemed safe
from differential cryptanalysis.
Refinements on the boomerang attack have been published: the amplified boomerang attack, then
the rectangle attack.
The slide attack is a form of cryptanalysis designed to deal with the prevailing idea that even
weak ciphers can become very strong by increasing the number of rounds, which can ward off
a differential attack. The slide attack works in such a way as to make the number of rounds in a
cipher irrelevant. Rather than looking at the data-randomizing aspects of the block cipher, the slide
attack works by analyzing the key schedule and exploiting weaknesses in it to break the cipher. The
most common one is the keys repeating in a cyclic manner.
The attack was first described by David Wagner and Alex Biryukov. Bruce Schneier first suggested
the term slide attack to them, and they used it in their 1999 paper describing the attack.
The only requirement for a slide attack to work on a cipher is that it can be broken down into multiple rounds of an identical F function. This probably means that it has a cyclic key schedule.
The F function must be vulnerable to a known-plaintext attack. The slide attack is closely related to
the related-key attack.
The idea of the slide attack has roots in a paper published by Edna Grossman and Bryant
Tuckerman in an IBM Technical Report in 1977. Grossman and Tuckerman demonstrated the attack
on a weak block cipher named New Data Seal (NDS). The attack relied on the fact that the cipher
has identical subkeys in each round, so the cipher had a cyclic key schedule with a cycle of only one
key, which makes it an early version of the slide attack. A summary of the report, including a
description of the NDS block cipher and the attack, is given in Cipher Systems.

46.

Describe the main parameters of computational complexity

The term computational complexity has two usages which must be distinguished. On the one hand, it refers to an algorithm for solving instances of a problem: broadly stated, the computational complexity of an algorithm is a measure of how many steps the algorithm will require in the worst case for an instance or input of a given size. The number of steps is measured as a function of that size.
The term's second, more important use is in reference to a problem itself. The
theory of computational complexity involves classifying problems according to
their inherent tractability or intractability, that is, whether they are easy or
hard to solve. This classification scheme includes the well-known
classes P and NP; the terms NP-complete and NP-hard are related to the
class NP.
To understand what is meant by the complexity of an algorithm, we must define
algorithms, problems, and problem instances. Moreover, we must understand how
one measures the size of a problem instance and what constitutes a step in an
algorithm. A problem is an abstract description coupled with a question requiring
an answer; for example, the Traveling Salesman Problem (TSP) is: Given a graph
with nodes and edges and costs associated with the edges, what is a least-cost
closed walk (or tour) containing each of the nodes exactly once? An instance of a
problem, on the other hand, includes an exact specification of the data: for
example, "The graph contains nodes 1, 2, 3, 4, 5, and 6, and edges (1, 2) with cost
10, (1, 3) with cost 14, and so on." Stated more mathematically, a problem can
be thought of as a function p that maps an instance x to an output p(x) (an answer).
An algorithm for a problem is a set of instructions guaranteed to find the correct
solution to any instance in a finite number of steps. In other words, for a
problem p, an algorithm is a finite procedure for computing p(x) for any given
input x. Computer scientists model algorithms by a mathematical construct called
a Turing machine, but we will consider a more concrete model here. In a simple
model of a computing device, a step consists of one of the following operations:
addition, subtraction, multiplication, finite-precision division, and comparison of
two numbers. Thus if an algorithm requires one hundred additions and 220
comparisons for some instance, we say that the algorithm requires 320 steps on
that instance. In order to make this number meaningful, we would like to express it
as a function of the size of the corresponding instance, but determining the exact
function would be impractical. Instead, since we are really concerned with how
long the algorithm takes (in the worst case) asymptotically as the size of an
instance gets large, we formulate a simple function of the input size that is a
reasonably tight upper bound on the actual number of steps. Such a function is
called the complexity or running time of the algorithm.
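A worked illustration (added here): brute-force TSP takes a number of steps proportional to (n-1)!, which is why the instance size matters so much. The cost matrix below is made up.

from itertools import permutations
from math import factorial

def tsp_brute_force(cost):
    """Examine all (n-1)! closed tours of a cost matrix and return the cheapest."""
    n = len(cost)
    best = None
    for tour in permutations(range(1, n)):      # fix node 0 as the start
        walk = (0,) + tour + (0,)
        total = sum(cost[a][b] for a, b in zip(walk, walk[1:]))
        best = total if best is None else min(best, total)
    return best

cost = [[0, 10, 14, 9],
        [10, 0, 7, 12],
        [14, 7, 0, 5],
        [9, 12, 5, 0]]
print(tsp_brute_force(cost))                    # cheapest tour cost for n = 4

for n in (5, 10, 15, 20):
    print(f"n = {n}: {factorial(n - 1)} tours to examine")

Already at n = 20 there are 19! ≈ 1.2 * 10^17 tours, so the worst-case step count grows far faster than any polynomial in the input size.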

47.
Describe the basic concepts of the Frequency attack, analyze who first used the frequency attack in history
Encryption is sometimes achieved by replacing one letter by another. To start deciphering the encryption it is useful to get a frequency count of all the letters. The most frequent letter may represent the most common letter in English, E, followed by T, A, O and I, whereas the least frequent are Q, Z and X. Common percentages in standard English, ranked in order, are:
e 12.7    m 2.4
t  9.1    w 2.4
a  8.2    f 2.2
o  7.5    y 2.0
i  7.0    g 2.0
n  6.7    p 1.9
s  6.3    b 1.5
h  6.1    v 1.0
r  6.0    k 0.8
d  4.3    x 0.2
l  4.0    j 0.2
u  2.8    q 0.1
c  2.8    z 0.1
Common pairs are consonants TH and vowels EA. Others are OF, TO, IN, IT, IS, BE, AS, AT, SO, WE, HE, BY, OR, ON, DO, IF, ME, MY, UP. Common pairs of repeated letters are SS, EE, TT, FF, LL, MM and OO. Common triplets of text are THE, EST, FOR, AND, HIS, ENT or THA.
If the results show that E followed by T are the most common letters then the ciphertext may be a transposition cipher rather than a substitution. If one of the characters has a frequency of about 20% then the language may be German, since German has a very high percentage of E. Italian has 3 letters with a frequency greater than 10%, and 9 characters occur less than 1% of the time.
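Such a frequency count is a few lines of Python (an illustrative sketch; the ciphertext below is a toy Caesar-shifted string):

from collections import Counter

def letter_frequencies(ciphertext: str) -> dict:
    """Percentage frequency of each letter, most common first."""
    letters = [c for c in ciphertext.upper() if c.isalpha()]
    return {c: 100 * n / len(letters) for c, n in Counter(letters).most_common()}

# Toy ciphertext: a Caesar shift of "THIS IS A SIMPLE CAESAR CIPHER EXAMPLE"
ct = "WKLV LV D VLPSOH FDHVDU FLSKHU HADPSOH"
for letter, pct in letter_frequencies(ct).items():
    print(f"{letter}: {pct:.1f}%")

Matching the most frequent ciphertext letters against the table above suggests candidate substitutions to try first.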
The first known recorded explanation of frequency analysis (indeed, of
any kind of cryptanalysis) was given in the 9th century by Al-Kindi,
an Arab polymath, in A Manuscript on Deciphering Cryptographic
Messages.[3] It has been suggested that close textual study of
the Qur'an first brought to light that Arabic has a characteristic letter
frequency.[4] Its use spread, and similar systems were widely used in
European states by the time of the Renaissance. By 1474, Cicco
Simonetta had written a manual on deciphering encryptions

of Latin and Italian text.[5] Arabic Letter Frequency and a detailed study
of letter and word frequency analysis of the entire book of Qur'an are
provided by Intellaren Articles.[6]
48.
Avalanche effect is a desirable property in cipher systems.
Describe what this effect is
In cryptography, the avalanche effect refers to a desirable property of cryptographic algorithms, typically block ciphers and cryptographic hash functions. The avalanche effect is evident if, when an input is changed slightly (for example, flipping a single bit), the output changes significantly (e.g., half the output bits flip). In the case of high-quality block ciphers, such a small change in either the key or the plaintext should cause a drastic change in the ciphertext. The actual term was first used by Horst Feistel,[1] although the concept dates back to at least Shannon's diffusion.

The SHA-1 hash function exhibits good avalanche effect. When a single
bit is changed the hash sum becomes completely different.
If a block cipher or cryptographic hash function does not exhibit the
avalanche effect to a significant degree, then it has poor randomization,
and thus a cryptanalyst can make predictions about the input, being
given only the output. This may be sufficient to partially or completely
break the algorithm. Thus, the avalanche effect is a desirable condition
from the point of view of the designer of the cryptographic algorithm or
device.

Constructing a cipher or hash to exhibit a substantial avalanche effect is one of the primary design objectives. This is why most block ciphers
are product ciphers. It is also why hash functions have large data
blocks. Both of these features allow small changes to propagate rapidly
through iterations of the algorithm, such that every bit of the output
should depend on every bit of the input before the algorithm terminates.
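This is easy to observe directly. A sketch using the standard hashlib module (illustrative): flip one bit of the input and count how many SHA-1 output bits change; for the 160-bit digest, about 80 changed bits are expected.

import hashlib

def hamming_distance(a: bytes, b: bytes) -> int:
    """Number of differing bits between two equal-length byte strings."""
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))

msg = bytearray(b"avalanche effect demo")
h1 = hashlib.sha1(bytes(msg)).digest()
msg[0] ^= 0x01                         # flip a single bit of the input
h2 = hashlib.sha1(bytes(msg)).digest()

print(f"{hamming_distance(h1, h2)} of 160 output bits changed")  # expect ~80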
49.
Suppose passwords are hashed and the hashes which are
16 bits long are saved in a database. Estimate the number of
passwords saved in the database when the probability that two
passwords have the same hash value will become larger than
50%
With 16-bit hashes there are N = 2^16 = 65536 possible hash values. By the birthday paradox, the probability that at least two of k stored passwords share a hash value is approximately 1 - e^(-k(k-1)/(2N)), which exceeds 50% once k ≈ 1.177·√N ≈ 302. So after roughly 300 passwords have been saved, a collision is more likely than not.
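A quick check of this estimate, assuming an ideal hash modelled as a uniformly random 16-bit value (sketch added here):

import random
from math import exp

N = 2 ** 16                        # 16-bit hash: 65536 possible values

def collision_probability(k: int) -> float:
    """Birthday-paradox approximation for k hashed passwords."""
    return 1 - exp(-k * (k - 1) / (2 * N))

k = 1
while collision_probability(k) <= 0.5:
    k += 1
print(f"probability first exceeds 50% at k = {k}")   # ~302

# Monte Carlo confirmation: draw k random 16-bit "hashes" per trial
trials = 2000
hits = sum(len({random.randrange(N) for _ in range(k)}) < k for _ in range(trials))
print(f"simulated collision rate: {hits / trials:.2f}")  # ~0.50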
A cryptographic hash function is a hash function which is considered practically impossible to invert, that is, to recreate the input data from its hash value alone. These one-way hash functions have been called "the workhorses of modern cryptography".[1] The input data is often called the message, and the hash value is often called the message digest or simply the digest.
The ideal cryptographic hash function has four main properties:
• it is easy to compute the hash value for any given message
• it is infeasible to generate a message that has a given hash
• it is infeasible to modify a message without changing the hash
• it is infeasible to find two different messages with the same hash.
Cryptographic hash functions have many information security applications, notably in digital signatures, message authentication codes (MACs), and other forms of authentication. They can also be used as ordinary hash functions, to index data in hash tables, for fingerprinting, to detect duplicate data or uniquely identify files, and as checksums to detect accidental data corruption. Indeed, in information security contexts, cryptographic hash values are sometimes called (digital) fingerprints, checksums, or just hash values, even though all these terms stand for more general functions with rather different properties and purposes.
PASSWORD VERIFICATION

A related application is password verification (first invented by Roger Needham). Storing all user passwords as cleartext can result in a massive security breach if the password file is compromised. One way to reduce this danger is to only store the hash digest of each password. To authenticate a user, the password presented by the user is hashed and compared with the stored hash. (Note that this approach prevents the original passwords from being retrieved if forgotten or lost, and they have to be replaced with new ones.) The password is often concatenated with a random, non-secret salt value before the hash function is applied. The salt is stored with the password hash. Because users have different salts, it is not feasible to store tables of precomputed hash values for common passwords. Key stretching functions, such as PBKDF2, Bcrypt or Scrypt, typically use repeated invocations of a cryptographic hash to increase the time required to perform brute-force attacks on stored password digests.
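A minimal sketch of salted, stretched password storage along these lines, using Python's standard-library PBKDF2 (the helper names and the iteration count are illustrative choices, not prescribed by the source):

import hashlib
import hmac
import os

def hash_password(password, salt=None):
    """Return (salt, digest) using salted, stretched PBKDF2-HMAC-SHA256."""
    salt = salt if salt is not None else os.urandom(16)  # random, non-secret salt
    digest = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 600_000)
    return salt, digest

def verify_password(password, salt, stored):
    """Re-hash the candidate password and compare in constant time."""
    _, candidate = hash_password(password, salt)
    return hmac.compare_digest(candidate, stored)

salt, stored = hash_password("correct horse battery staple")
assert verify_password("correct horse battery staple", salt, stored)
assert not verify_password("wrong guess", salt, stored)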
50.
Name a demand on a cipher system for it to achieve perfect
secrecy, i.e. be theoretically secure?
A cipher system is perfectly secure or, equivalently, satisfies a perfect secrecy constraint, if the message U and the ciphertext X are statistically independent: I(U; X) = 0.
Perfect Secrecy (or information-theoretic secure) means that the ciphertext
conveys no information about the content of the plaintext. In effect this
means that, no matter how much ciphertext you have, it does not convey
anything about what the plaintext and key were. It can be proved that any
such scheme must use at least as much key material as there is plaintext
to encrypt. In terms of probabilities, it means that the probability distribution
of the possible plaintexts is independent of the ciphertext.
We contrast this with semantic security, which I define by quoting the seminal 1984 paper by Goldwasser & Micali:

Whatever is efficiently computable about the cleartext given the cyphertext is also efficiently computable without the cyphertext.
For two examples, I quote my answer to this related question:
When used correctly, the One Time Pad (OTP) is information-theoretic
secure, which means it can't be broken with cryptanalysis. However, part of
being provably secure is that you need as much key material as you have
plaintext to encrypt. Such a key needs to be shared between the two
communicants, which basically means you have to give it to the other
person through a perfectly secure protocol (eg by hand/trusted courier). So,
actually it just allows you to have your trusted meeting in advance, rather
than at the time of transmitting the secret information.
To illustrate this, consider what happens if one tries to brute-force the OTP. Since you have allowed an attacker infinite computational resources, he can keep guessing keys and calculating the appropriate plaintext until every key has been tested. Supposing the message was b bits long, this would leave him with 2^b possible keys, each of which would generate a unique plaintext, making 2^b plaintexts. What is important here is that this means he would have candidate plaintexts corresponding to every possible bit-string of length b. This means, even if you knew the message was "Meet me at the stadium at 2?:15" (where ? is 0, 1, 2 or 3), you still wouldn't have any idea what the ? was, because the possible plaintexts would contain this string with every possible value of ?.
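A one-time-pad sketch (illustrative) that also shows why brute force fails: any candidate plaintext of the right length corresponds to some key, so the ciphertext favors none of them.

import os

def otp(data: bytes, key: bytes) -> bytes:
    """XOR data with the key; the same call encrypts and decrypts."""
    assert len(key) >= len(data), "the pad must be at least as long as the message"
    return bytes(d ^ k for d, k in zip(data, key))

msg = b"Meet me at the stadium at 2?:15"
key = os.urandom(len(msg))            # truly random, used once, never reused
ct = otp(msg, key)
assert otp(ct, key) == msg

# Perfect secrecy in action: for ANY candidate plaintext of the same
# length, there is a key that "decrypts" the ciphertext to it, so the
# ciphertext alone favors no plaintext over another.
fake = b"Meet me at the library at 23:15"
fake_key = bytes(c ^ f for c, f in zip(ct, fake))
assert otp(ct, fake_key) == fake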
Most cryptographic methods we use now are computationally secure.
There are lots of different ways to do this, and I'll just sketch at a few of
them. We might come up with a reduction to a problem conjectured to be
hard (eg the Diffie-Hellman Problem or Discrete Log Problem). That is, we
prove that "If you can break my cipher, you can solve [hard-problem]",
meaning our problem is at least a difficult to solve as [hard-problem]. So, if
the problem is indeed hard to solve, so must cracking our encryption be.
51.
State Kerckhoffs's principle for cryptography. Why is it reasonable to assume that it holds?
In cryptography, Kerckhoffs's principle (also called Kerckhoffs's desiderata, Kerckhoffs's assumption, axiom, or law) was stated by Auguste Kerckhoffs in the 19th century: A cryptosystem should be secure even if everything about the system, except the key, is public knowledge.
Stated simply, the security of a cryptosystem should depend solely on the
secrecy of the key and the private randomizer.[5] Another way of putting it is
that a method of secretly coding and transmitting information should be secure even if everyone knows how it works. Of course, despite the attacker's familiarity with the system in question, the attacker lacks
knowledge as to which of all possible instances is being presently
observed.
ADVANTAGE OF SECRET KEYS

Using secure cryptography is supposed to replace the difficult problem of keeping messages secure with a much more manageable one, keeping
relatively small keys secure. A system that requires long-term secrecy for
something as large and complex as the whole design of a cryptographic
system obviously cannot achieve that goal. It only replaces one hard
problem with another. However, if a system is secure even when the enemy
knows everything except the key, then all that is needed is to manage
keeping the keys secret.
There are a large number of ways the internal details of a widely used
system could be discovered. The most obvious is that someone could
bribe, blackmail, or otherwise threaten staff or customers into explaining
the system. In war, for example, one side will probably capture some
equipment and people from the other side. Each side will also use spies to
gather information.
If a method involves software, someone could do memory dumps or run the
software under the control of a debugger in order to understand the
method. If hardware is being used, someone could buy or steal some of the
hardware and build whatever programs or gadgets needed to test it.
Hardware can also be dismantled so that the chip details can be seen with
microscopes.
MAINTAINING SECURITY

A generalization some make from Kerckhoffs's principle is: "The fewer and
simpler the secrets that one must keep to ensure system security, the
easier it is to maintain system security." Bruce Schneier ties it in with a
belief that all security systems must be designed to fail as gracefully as
possible:

Kerckhoffs's principle applies beyond codes and ciphers to security systems in general: every secret creates a potential failure point. Secrecy, in other words, is a prime cause of brittleness, and therefore something likely to make a system prone to catastrophic collapse. Conversely, openness provides ductility.[6]
Any security system depends crucially on keeping some things secret.
However, Kerckhoffs's principle points out that the things kept secret ought
to be those least costly to change if inadvertently disclosed.
For example, a cryptographic algorithm may be implemented by hardware
and software that is widely distributed among users. If security depends on
keeping that secret, then disclosure leads to major logistic difficulties in
developing, testing, and distributing implementations of a new algorithm: it
is "brittle". On the other hand, if keeping the algorithm secret is not
important, but only the keys used with the algorithm must be secret, then
disclosure of the keys simply requires the simpler, less costly process of
generating and distributing new keys.
52.
What is the purpose of diffusion in the design of block
ciphers?
In diffusion the statistical structure of the plaintext is
dissipated into long-range statistics of the ciphertext.
That is achieved by having each plaintext digit affect
the value of many ciphertext digits, or equivalently
having each ciphertext digit be affected by many
plaintext digits.
The idea of mixing linear and nonlinear operations in order to obscure the relationship between the plaintext, the key, and the ciphertext underlies both diffusion and confusion.
Diffusion: To design a cipher according to the principle of diffusion means that one designs it to ensure that "the statistical structure of plaintext which leads to its redundancy is 'dissipated' into long term statistics" [53]. We state the principle of diffusion as follows: For virtually every key, the encryption function should be such that there is no statistical dependence between simple structures in the plaintext and simple structures in the ciphertext, and that there is no simple relation between different encryption functions. The principle of diffusion requires, for instance, that a block cipher should be designed to be "complete" [28], i.e., each bit of plaintext and each bit of key should influence each bit of ciphertext.
53.
What is the purpose of confusion in the design of block
ciphers?
The purpose of confusion is to make the relationship between the statistics of the ciphertext and the value of the encryption key as complex as possible, to thwart attempts to discover the key.
Confusion: To design a cipher according to the principle of confusion means that one designs it so as "to make the relation between the simple statistics of ciphertext and the simple description of key a very complex and involved one" [53]. We state the principle of confusion as follows: the dependence of the key on the plaintext and ciphertext should be so complex that it is useless for cryptanalysis. For example, the binary equations that describe the block cipher should be so "nonlinear" and "complex" that to solve for z from x and y = E(x, z) is infeasible.
54.
How many rounds does DES have, how big is the key and how big is the block?
See question 5.
DES has 16 rounds, a 56-bit key (each round uses a 48-bit subkey derived from it), and a 64-bit block.

55.
How big can the key in AES be, how many rounds does AES have for each key, and how big is the block?
See question 41.
The key can be 128, 192 or 256 bits; the block is always 128 bits; and AES runs 10, 12 or 14 rounds for those key sizes respectively.
56.
What are the two best-known general attacks against block ciphers?

See questions 6 and 11: the two best-known general attacks against block ciphers are differential and linear cryptanalysis.
57.
What is the probability of finding a collision for an ideal 60-bit hash function? What is the main reason for this probability?
Great, so this magic expression serves as our probability that all values are unique. Subtract it from one, and you have the probability of a hash collision:

    p ≈ 1 - e^(-k(k-1)/(2N))

For N = 2^32 (32-bit hash values), a 50% chance of collision occurs when the number of hashes is 77163. For an ideal 60-bit hash, N = 2^60, so a collision becomes more likely than not after about 1.177·√N ≈ 2^30.2 hashes; the main reason this number is so much smaller than 2^60 is the birthday paradox.
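Evaluating the formula above for the 60-bit case (a small sketch, assuming an ideal hash modelled as uniformly random):

from math import exp, log, log2, sqrt

N = 2 ** 60                              # ideal 60-bit hash

k_half = sqrt(2 * N * log(2))            # hashes needed for a 50% collision
print(f"50% collision after ~2^{log2(k_half):.1f} hashes")   # ~2^30.2

def p_collision(k: float) -> float:
    """Birthday-bound approximation 1 - e^(-k(k-1)/(2N))."""
    return 1 - exp(-k * (k - 1) / (2 * N))

print(f"probability after 2^30 hashes: {p_collision(2 ** 30):.2f}")  # ~0.39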

Collision resistance is a property of cryptographic hash functions: a hash function is collision resistant if it is hard to find two inputs that hash to the same output; that is, two inputs a and b such that H(a) = H(b) and a ≠ b.[1]:136

Every hash function with more inputs than outputs will necessarily have collisions.[1]:136 Consider a hash function such as SHA-256 that produces 256 bits of output from an arbitrarily large input. Since it must generate one of 2^256 outputs for each member of a much larger set of inputs, the pigeonhole principle guarantees that some inputs will hash to the same output. Collision resistance doesn't mean that no collisions exist; simply that they are hard to find.[1]:143

The "birthday paradox" places an upper bound on collision resistance: if a


hash function produces N bits of output, an attacker who computes "only"
2N/2 (or

) hash operations on random input is likely to find two matching

outputs. If there is an easier method than this brute force attack, it is


typically considered a flaw in the hash function. [2]
Cryptographic hash functions are usually designed to be collision resistant. But many hash functions that were once thought to be collision resistant were later broken. MD5 and SHA-1 in particular both have published techniques more efficient than brute force for finding collisions.[3][4] However, some hash functions have a proof that finding collisions is at least as difficult as some hard mathematical problem (such as integer factorization or discrete logarithm). Those functions are called provably secure.[2]
