
Jeremy Lennert Jail Cell Cipher – March 19, 2002 -1-

Jail Cell Cipher


Jeremy Lennert
(jeremy@mindflare.com)

March 19, 2002

Abstract
This paper investigates whether a cipher can be
constructed that is implementable using only paper
and a pencil while remaining secure against
computer-aided cryptanalytic attacks. A modified
version of the RC4 cipher that is believed to meet
these requirements is proposed. This result
challenges the belief, widely held in the
cryptographic community, that paper-and-pencil
ciphers are inherently insecure.

1. Introduction
A scenario is defined in which a cipher is required that can be implemented
without the benefit of a computer. Several existing ciphers are considered for this
application. A modified version of the RC4 cipher is proposed which is believed to be of
practical use in the scenario. The security of this cipher is analyzed in light of known
cryptanalytic attacks against RC4, and the practical paper-and-pencil implementation of
the cipher is considered.

1.1 Cryptographic Overview


A cipher, commonly known as a “secret code,” allows construction of a message
which can be read by the sender and intended recipient, but not by unintended recipients.
A secret shared by the sender and the intended receiver, called the key, allows them to
read the message. Cryptography is the science of creating ciphers, and cryptanalysis is
the science of “breaking” ciphers. Cryptology encompasses both sciences.
A message in its normal, readable state is referred to as a plaintext. When a
cipher is applied to a plaintext, it is said to be encrypted (or enciphered). The result is a
ciphertext. By applying the inverse operation to the ciphertext, it is decrypted (or
deciphered) to the original plaintext, and becomes readable again.
A computationally secure cipher is one which is reasonably safe from
cryptanalysis, usually such that the plaintext cannot be recovered (with a reasonable
amount of computation) with knowledge of only the cipher and ciphertext, and the key
cannot be recovered even with knowledge of the cipher, the ciphertext and the plaintext.
The definition of a “reasonable amount of computation” is not fixed, but generally an
attack is considered to be infeasible in practice if the computational complexity of the
attack exceeds about 2^80. The Electronic Frontier Foundation mounted an attack on the
Data Encryption Standard (DES) cipher with an attack complexity of 2^56 using a
custom-designed computer costing less than $250,000, completing the attack in three days [4].

1.2 The Scenario: Imprisoned


The following scenario was used to define the requirements of the cipher:
A man is searched to be certain that he is not carrying
anything of consequence, and he is then locked into a jail cell.
The jailor provides a pencil and a pad of paper, and instructs the
prisoner to write a message to someone outside, which will then be
delivered for him. The letter must be ready by dawn.
The prisoner wishes to write a message asking his friends to
break him out, providing certain information about his location
that will help them do so, but he does not wish the jailor to read
the message. Moreover, the jailor is known to be a skilled
cryptanalyst with a computer at his disposal.
Ignoring the possibility of steganography (hiding a secret
message within a larger one in such a way that others cannot
discern the presence or contents of the hidden message), is there a
way for the imprisoned man to send a message that the jailor will
not be able to read, and also to receive a reply that is equally
unreadable by his captor?
It might be useful in this scenario to use an asymmetric cipher, with intensive
computation on one side and very little on the other, so that a friend outside the jail cell
with access to a computer can perform the bulk of the calculation. However, asymmetric
ciphers are not examined here.

1.3 Cipher Requirements


There are three important requirements that must be met by a cipher to be
employed in this scenario.
First, because the prisoner is not allowed to carry anything in with him, the
algorithm of the cipher and its key must both be memorized. A random string is probably
the best form of key, but a string of bits (only zeroes and ones) would be a bad choice for
human memorization. Natural language (like the words in this sentence) has very low
entropy. Because the majority of random strings of letters do not form comprehensible
words, using strings that do limits the key to a very small fraction of all possible strings.
Natural language contains only roughly one bit of actual information per letter. A random
string is thought to be preferable to memorizing natural language and then extracting the
one bit of real information per letter. We assume that the key is a random series of letters
and numbers.
Second, the prisoner must perform the cipher without using any materials other
than paper and a pencil, since he has access to no other materials. Most importantly, he
cannot use a computer or other calculation aid, so all computation involved in the cipher
must be paper-and-pencil computable. Additionally, since the letter must be ready by
dawn, it must be possible to perform the cipher in less than one day.
Finally, in order to prevent the jailor from reading the message, the cipher must be
secure against computer-aided cryptanalysis.

2. Discarded Options
Various ciphers have been previously designed that can operate without the use of
computers. Several such ciphers were considered for use in this scenario and ultimately
rejected for various reasons. Most well-known paper-and-pencil ciphers, such as the
simple substitution cipher, are known to be extremely insecure and were not considered.

2.1 Rotors and Machine Ciphers


The first ciphers that were really computationally difficult to break were the rotor
ciphers used in World War II. The Enigma cipher is usually cited as the beginning of
modern cryptography. Various other ciphers include code wheels or other enciphering
machines. John Savard has even developed a code wheel designed to approach the power
of a single-rotor cipher [13].
These, however, are not paper-and-pencil ciphers, because they require machines
to perform them, and the jailor is most assuredly not going to allow the prisoner to bring
such a device in with him. If we are going to allow specialized equipment, it may as well
be a computer.

2.2 One-Time Pads and Book Ciphers


There is one known type of cipher which has easy encryption and absolute
security: the one-time pad. In this cipher, a long, genuinely random string is combined
with the message one letter at a time using some one-to-one function (a shift is usually
used for paper-and-pencil). If the key (the random string) is unknown to the attacker and
truly random, then the ciphertext might decrypt to absolutely anything under the right
key, and so it is impossible to learn anything about it other than its length.
Unfortunately, this cipher requires that the key be completely random and at least
as long as the message, and it is infeasible to memorize a string of random letters as long
as the combined length of every message you wish to send or receive. Sending two
messages with the same key is insecure. This is because each letter of the ciphertext is
simply one letter of the plaintext added to the keystream (P + K). If two plaintexts A and
B are sent with the same key, the cryptanalyst can subtract one from the other. The
keystream cancels itself out, leaving only the two plaintexts (A + K - (B + K) = A - B).
There is enough redundancy in natural language that recovering A and B from (A - B) is
easy. For this reason, two messages can never be sent using the same key.
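
The cancellation described above can be checked with a small sketch. The 26-letter alphabet, the sample key, and the two sample messages are all assumptions chosen for illustration; they do not appear in the scenario.

```python
# Sketch of the key-reuse problem: (A + K) - (B + K) = A - B.
# ASSUMPTION: a 26-letter alphabet (A-Z) with a shift as the combining function.
M = 26


def to_nums(text):
    """Map letters A-Z to the numbers 0-25."""
    return [ord(c) - ord("A") for c in text]


def add_key(plain, key):
    """Encrypt by adding the key letter by letter, modulo the alphabet size."""
    return [(p + k) % M for p, k in zip(plain, key)]


def subtract(x, y):
    """Subtract two equal-length sequences letter by letter, modulo M."""
    return [(a - b) % M for a, b in zip(x, y)]


key = to_nums("XMCKLQWERT")   # hypothetical shared key, used twice
a = to_nums("ATTACKDAWN")     # hypothetical plaintext A
b = to_nums("RETREATNOW")     # hypothetical plaintext B

cipher_a = add_key(a, key)
cipher_b = add_key(b, key)

# The attacker sees only the two ciphertexts, yet their difference
# equals the difference of the plaintexts: the key has dropped out.
difference = subtract(cipher_a, cipher_b)
print(difference == subtract(a, b))
```

The attacker never needs the key to reach A - B; what remains is the separation of two redundant natural-language texts.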
It is sometimes tempting to use a nonrandom key for a one-time pad, such as
natural language. For example, get some agreed-upon version of the Bible, pick a
random starting point, and use the text from that point forward as the key. Such “book
ciphers,” however, suffer from the two-time pad problem on the first message: natural
language is so redundant that separating the key and the message is easy. Additionally,
for our purposes, the jailor will no doubt enter the cell and take any written materials
which are suspected to have been used in the encryption, further reducing the uncertainty
of this separation.

2.3 Solitaire and Card Ciphers


Bruce Schneier developed a cipher called Solitaire that uses a deck of playing
cards [15], and others have since followed suit and developed their own card ciphers.
Solitaire was used in the novel Cryptonomicon by Neal Stephenson, although in the book
it is referred to by the name “Pontifex” to temporarily conceal the fact that it uses a deck
of cards. Paul Crowley has identified some problems with the cipher [3], specifically,
that the algorithm was not reversible, as Schneier intended it to be (non-reversible
pseudo-random number generators tend to have shorter periods and more easily exhibit
bias) and that successive outputs are identical with significantly higher probability than
they would be in a truly random output. However, these weaknesses have not yet been
extended to an attack against the Solitaire cipher.
Still, there are two reasons why Solitaire was not used. First, it is not a paper-and-
pencil cipher—it requires a deck of cards. One could perhaps tear a sheet of paper into
54 pieces (Solitaire requires two jokers), but removing the cryptographer’s reliance on the
cards takes the technology requirement for encryption down another step. A true paper-
and-pencil cipher could be performed using any available writing surface, and is one step
closer to eliminating the need for mechanical aid altogether.
Secondly, the key for Solitaire is the initial deck configuration, and since the deck
cannot be carried in, this needs to be memorized. Memorizing the order of an entire deck
of cards is beyond most people. There is a method presented by Schneier for turning a
memorized “passphrase” in natural English into a deck configuration, but because
Solitaire is a stream cipher, it is insecure to send more than one message using a single
key, and memorizing numerous passphrases is implausible. It may be possible to develop
a key schedule which would solve this problem.

2.4 TEA, a Tiny Encryption Algorithm


The Tiny Encryption Algorithm [17] was considered, given its promising name.
However, although the TEA algorithm is small, it relies on a large number of rounds (32
recommended) to be secure, making human computation impractical. Additionally, one
of the primary operations used in TEA is a binary shift (sliding all binary digits of a
number to either the left or the right) and binary operations are particularly difficult for
humans.

3. The Cipher
RC4 (which stands for “Ron’s Code #4” or “Rivest Cipher #4”) is a cipher
developed by Ron Rivest in 1987 for RSA Data Security, Inc. [14]. It is a stream cipher,
which means that it generates a string of pseudo-random numbers (called the keystream)
of the same length as the plaintext, and combines the two to obtain the ciphertext.
The operation for combining the two is usually a bitwise XOR (exclusive or), but
for the jail cell cipher, addition (modulo the size of the alphabet) may be used instead,
because it is computationally easier for humans. Any one-to-one function for mapping
the plaintext to the ciphertext using the keystream is equally secure.

A jail cell version of RC4 (hereafter JCRC4) is developed based on a generalized
definition of RC4. It is believed that JCRC4 fulfills the requirements for the scenario
defined in section 1.2 of this paper.

3.1 Generalized RC4


RC4 is usually defined such that the size (N) of the permutation table is always a
power of two (N = 2^n), with N = 2^8 = 256 being the most common size. Here N is
defined such that N = m, the size of whatever alphabet is used (regardless of whether this
is a power of two). For the proposed use, N = m = 37 is recommended.
The internal state of the keystream generator for generalized RC4 consists of a
permutation table of an alphabet, P[0..m-1] (where m is the size of the alphabet), and two
counters, i and j. For standard 8-bit RC4, the alphabet is the set of all binary strings of
length eight bits. The counters are both initialized to zero, and the initial state of the
permutation table is the key. To generate one keystream letter k:

(1) i = (i + 1) mod m
(2) j = (j + P[i]) mod m
(3) swap P[i] and P[j]
(4) output k = P[(P[i] + P[j]) mod m]

where P[x] denotes the value of the xth element of the permutation table P.
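
As a minimal sketch, the four steps can be written out as follows, together with the modular-addition combining function mentioned in section 3. Letters are represented here by the integers 0 to 36; the shuffled table standing in for a key and the sample message are assumptions for illustration only.

```python
import random

M = 37  # alphabet size recommended in the text


def keystream(table, count, m=M):
    """Yield `count` keystream values, working on a copy of the table."""
    p = list(table)
    i = j = 0
    for _ in range(count):
        i = (i + 1) % m                # step (1)
        j = (j + p[i]) % m             # step (2)
        p[i], p[j] = p[j], p[i]        # step (3)
        yield p[(p[i] + p[j]) % m]     # step (4)


def encrypt(plain, table, m=M):
    """Combine plaintext and keystream by addition modulo m."""
    return [(x + k) % m for x, k in zip(plain, keystream(table, len(plain), m))]


def decrypt(cipher, table, m=M):
    """Invert the combination by subtracting the same keystream."""
    return [(c - k) % m for c, k in zip(cipher, keystream(table, len(cipher), m))]


# ASSUMPTION: a randomly shuffled table stands in for a real key.
rng = random.Random(2002)
table = list(range(M))
rng.shuffle(table)

message = [7, 4, 11, 15, 36, 0, 12, 4]   # hypothetical plaintext values
cipher = encrypt(message, table)
print(decrypt(cipher, table) == message)
```

Because sender and receiver start from the same table, both generate the same keystream, and subtraction undoes addition.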

3.2 Key Scheduling


It is not secure to encrypt two messages with a stream cipher using the same key.
A stream cipher is effectively a weaker version of the one-time pad, and so reusing a key
is insecure for the same reason.
However, because it is often useful to send more than one message, a key-
scheduling algorithm is used that allows the cryptographer to transform one key into
another. The key-scheduling algorithm developed by Ron Rivest for RC4 initializes the
permutation table to the identity (P[x] = x) and then runs through N rounds of RC4
without generating outputs, but adding one of a series of key values to the j counter each
round, such that the second operation becomes:

(2) j = (j + P[i] + K[i]) mod m

with K[i] denoting the ith value of the key. Key values are all in the range [0, N-1].
After this, the i and j counters are once again initialized to zero and the cipher
begins generating the keystream. Therefore, a small change in the secret key produces a
pseudo-random change in the initial permutation table, allowing the cryptographer to use
closely-related keys with much less risk. However, several keys which are weak under
this key-scheduling algorithm have been observed [5, 12].
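
For comparison with what follows, the original key schedule described above can be sketched as below. This follows the common formulation of RC4 key scheduling, in which i runs over every table position in order and a key shorter than the table is repeated cyclically; both details are assumptions here, since the description above does not spell them out.

```python
def rc4_key_schedule(key, m=37):
    """Sketch of the original RC4 key schedule generalized to table size m.

    ASSUMPTIONS: i visits positions 0..m-1 in order, and a short key is
    reused cyclically (K[i mod len(key)]).
    """
    p = list(range(m))          # identity permutation, P[x] = x
    j = 0
    for i in range(m):
        j = (j + p[i] + key[i % len(key)]) % m   # modified step (2)
        p[i], p[j] = p[j], p[i]                  # step (3), no output
    return p


# Hypothetical key values in the range [0, m-1].
table = rc4_key_schedule([5, 7, 12, 30, 1, 22, 9])
print(sorted(table) == list(range(37)))
```

Each round costs a three-term modular addition and a swap, and all m rounds must be completed before any keystream is produced.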
For JCRC4, the original key schedule was thought to be infeasible due to limits
on human computational speed and accuracy. A different one, based on the idea of
double hashing [2], follows. The key-scheduling algorithm is shown in figure 1.
Keystream generation is shown in figure 2.

The initial permutation is derived from an array of key values K, where each
value is a nonzero alphabetical character or its numerical equivalent. For m = 37, K
needs to contain at least 13 values to achieve 64 bits of actual key information.
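
The figure of 13 key values can be checked with a short calculation, assuming each key value is drawn uniformly from the 36 nonzero values available when m = 37.

```python
import math

# ASSUMPTION: each key value is uniform over the 36 nonzero values (m = 37).
bits_per_value = math.log2(36)
values_needed = math.ceil(64 / bits_per_value)

print(round(bits_per_value, 2))   # information carried by one key value
print(values_needed)              # key values needed to reach 64 bits
```

Twelve values fall just short of 64 bits, which is why thirteen is the minimum.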
The first key value K[0] is taken to be the first alphabetical character to be
inserted into the permutation table. It is inserted into the slot of the permutation table
designated by the next key value K[1].
Each successive key value indicates the position relative to the last insertion (mod
m) where the next character should be inserted. For example, if the most recent character
A is inserted into slot 7, and the next key value is 12, then B is inserted into slot (7 + 12
mod 37) = 19. If this slot is occupied (for example, if X is already in slot 19), then the
position is displaced by the same key value again (19 + 12 mod 37 = slot 31). If m is
prime, and key values of 0 are disallowed, then all key values will be relatively prime to
m and therefore all table slots will eventually be probed if no open slot is found.
When the end of the key is reached, key values are reused, starting again with
K[0] (but using it as a displacement this time). Note that because zeroes cannot be used
in the key, zero cannot be the first character placed, nor can the first character be placed
in P[0]. This means that zero will appear in P[0] with probability 1/36 instead of 1/37.
This could be avoided if K[0] and K[1] were allowed to be zero, but if this is the case,
these key values cannot be reused, since zero is not relatively prime to m = 37.
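
The insertion procedure described above might be sketched as follows. This is an illustration under stated assumptions, not the algorithm of figure 1: in particular, it assumes that characters are inserted in cyclic alphabetical order starting from K[0] (as the A-then-B example suggests), and the sample key is hypothetical.

```python
def jcrc4_key_schedule(key, m=37):
    """Build the permutation table; return (table, collision_count).

    ASSUMPTION: characters are inserted in cyclic order starting at K[0].
    """
    if len(key) < 2 or any(not 0 < k < m for k in key):
        raise ValueError("need at least two key values, each in 1..m-1")
    table = [None] * m
    char = key[0]        # K[0] names the first character to insert
    slot = 0             # the first displacement is measured from zero
    collisions = 0
    for placement in range(m):
        # Displacements are K[1], K[2], ..., then K[0], K[1], ... again.
        step = key[(placement + 1) % len(key)]
        slot = (slot + step) % m
        while table[slot] is not None:       # occupied: displace again
            collisions += 1
            slot = (slot + step) % m
        table[slot] = char
        char = (char + 1) % m
    return table, collisions


# Hypothetical 13-value key; it starts by placing character 5 in slot 7,
# then character 6 in slot (7 + 12) mod 37 = 19, as in the example above.
key = [5, 7, 12, 30, 1, 22, 9, 16, 3, 28, 11, 35, 20]
table, collisions = jcrc4_key_schedule(key)
print(table[7], table[19], collisions)
```

Because m is prime and every step is nonzero, the inner loop visits every slot before repeating, so it always terminates while a slot remains open.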
Because RC4 is a stream cipher and the same permutation table configuration
cannot be used to send more than one message, K needs to be modified for each message.
The modification proceeds as follows: K[0] is incremented by 1 (mod m), and K[last] is
rotated to the front of the array (with all other values being moved down the list one slot).
This obviously does not rotate through all possible keys, but it does rotate through (m - 1)
* |K|, which for proposed JCRC4 is at least (36 * 13) = 468 keys, probably enough for
most realistic paper-and-pencil applications. Alternatively, the first thirteen keystream
outputs after the message is encrypted could be used as the key for the next message, but
this requires the cryptographer to memorize a completely new random string for every
message.
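
A sketch of the per-message key modification follows. Two details are assumptions: the increment is applied before the rotation, and the increment skips zero (wrapping from 36 back to 1), since key values of zero are disallowed and the count of (m - 1) * |K| keys suggests a cycle of m - 1 values per position.

```python
def next_key(key, m=37):
    """Derive the key for the next message.

    ASSUMPTIONS: increment first, then rotate; the increment wraps from
    m - 1 to 1 so that no key value ever becomes zero.
    """
    k = list(key)
    k[0] = k[0] % (m - 1) + 1       # increment K[0], skipping zero
    return [k[-1]] + k[:-1]         # rotate K[last] to the front


start = [5, 7, 12, 30, 1, 22, 9, 16, 3, 28, 11, 35, 20]   # hypothetical key
key = start
steps = 0
while True:
    key = next_key(key)
    steps += 1
    if key == start or steps > 36 * len(start):
        break
print(steps)   # number of modifications before this key recurs
```

Under these assumptions every position is incremented once per |K| messages, so the starting key recurs after at most (m - 1) * |K| modifications.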
For a key of length l, the (l - 1)th output and later are guaranteed to be influenced
by every word of the key, because the i counter forces the generator to utilize at least x
values from the permutation table in the first x outputs, and no value is inserted by the
first key value (that value determines the first value to be inserted). Therefore, if the first
(l - 2) outputs are discarded, the cipher should not be vulnerable to partial-key attacks,
where information about the key is obtained from outputs that are unaffected by some
parts of the key.
However, it should be noted that if an attacker can recover the full state of the
permutation table, reconstruction of the original key is trivial, and so all other messages
sent by rotating the key as described above will then be readable by the attacker.

4. Cipher Security
The security of the JCRC4 cipher is analyzed in light of previous cryptanalytic
attacks against RC4. RC4 distinguishers are considered, as well as attacks on the RC4
generator and original key schedule.

4.1 Distinguishers
Distinguishers attempt to distinguish between the keystreams output by a pseudo-
random number generator and a truly random string. The ability to make such a
distinction does not itself constitute an attack, but attacks can often be derived from them,
and they can also be used to help the attacker more accurately guess the plaintext of a
message from knowledge of the ciphertext.
Distinguishers for RC4 are presented in [6] and [7], and some other statistical
anomalies of the RC4 generator are noted in [9]. Golić [7] estimates that his
distinguisher can be used to differentiate between an 8-bit RC4 keystream and a random
string by looking at roughly 2^40 outputs. Fluhrer and McGrew [6] use information theory
to revise this figure slightly, proving that 2^44.7 outputs are necessary using Golić’s
distinguisher to reduce the chances of false positives and false negatives to 10%. They
also present a better distinguisher that uses the output bigram frequencies from RC4
which can achieve the same accuracy with only 2^30.6 outputs.
However, even their distinguisher still needs 2^18.62 outputs, or roughly 400,000, to
distinguish between 5-bit RC4 and randomness with the same accuracy (JCRC4 is
slightly larger than 5-bit RC4). It is unlikely that any messages exceeding four hundred
thousand characters will be encrypted with paper and pencil, so this poses no significant
danger to JCRC4.

4.2 Attacks on RC4 Generator


The best known attack against RC4 which is independent of the key-scheduling
algorithm is a “tracking algorithm” that works by modeling the generator itself. Every
time a value needed to continue computing the generator forward is unknown, it is
calculated from known information if possible and guessed otherwise. When the
attacker’s generator outputs a keystream value different from that used in the encryption,
the algorithm backtracks and guesses differently.
This attack has been implemented several times [10, 11], sometimes with minor
alterations. 8-bit RC4 is clearly beyond reach of this attack (estimated complexity 2^779
[10]), but the reduced size of JCRC4 makes it more vulnerable. The estimated
complexity of an attack on 5-bit RC4 (only slightly smaller than JCRC4) is about 2^53
according to Knudsen et al. [10]. According to Mister and Tavares [11], a 5-bit RC4 key
that outputs a string of all zeroes (a very unlikely occurrence, for which the ciphertext
would be identical to the plaintext) can be found with a complexity of 2^42 (determined
experimentally), but the estimated complexity for an arbitrary keystream is 2^69.
This attack was reimplemented in a Java applet, but because the applet displays its
progress in detail at rapid intervals, it runs considerably slower than it could. This
implementation searched less than 0.1% of its search space after one day. At this pace, it
would take years to complete the search.
Bob Jenkins [8] has also posted to the newsgroup sci.crypt source code for an
attack on RC4 which he says breaks 3-bit RC4 in less than a second and 4-bit RC4 in two
to ten minutes, using fewer than 256 keystream outputs. He says he has extrapolated the
time to break 5-bit RC4 to be about two weeks, but does not say how this was
extrapolated.

None of the above have experimentally broken 5-bit RC4, but it seems that it
might be within the reach of a determined attacker with a large budget to crack it in a
matter of days or weeks. However, this is a known-plaintext attack (that is, designed
assuming that the attacker already knows the plaintext and is only trying to recover the
key), and would increase considerably in complexity if the attacker had knowledge only
of the ciphertext. Additionally, depending upon the message, remaining secure for a
matter of days or weeks may be sufficient.

4.3 Attacks on the RC4 Key Schedule


Various weak keys have been noted in the RC4 key schedule [5, 12] that can be
exploited by an attacker. Even more dangerous attacks have been developed when part of
the key is known to the attacker [1, 5]. However, these weaknesses can be overcome by
“warming up” the generator before beginning encryption by throwing away the first X
outputs (for 8-bit RC4, discarding 256 outputs is recommended, to ensure all portions of
the permutation table have been modified before encryption begins).
In addition, all of these rely upon the original key-scheduling algorithm for RC4,
which is not used in JCRC4. None of these attacks can be used in any obvious way
against the new key-scheduling algorithm, although the new key-scheduling algorithm is
certainly not secure if part of the key is known to the attacker (therefore, appending or
prepending known initialization vectors to the key is not secure). There are no known
attacks against the new key-scheduling algorithm at this time, but it has not yet been
subjected to expert cryptanalysis.

5. Cipher Implementation
The necessary time to implement JCRC4 and its key-scheduling algorithm using
only paper-and-pencil computation is estimated from experimental measurements of
human computation speed. The error occurrence rate in JCRC4 is estimated from the
same experimental data, and procedures for limiting error occurrence are suggested.

5.1 Human Computation Speed


According to Verhaeghen et al. [16], the accuracy of mentally computing a single-digit
simple addition or subtraction becomes asymptotic with respect to time at about one
second. A two-digit modular addition includes two single-digit simple additions, possibly
a carry (about half the time), and possibly a two-digit simple subtraction (about half the
time). Such an operation is therefore estimated to take, on average, about five seconds to
compute, and therefore twelve such problems could be computed per minute.
Looking up a value in a data table or recording a value on paper is estimated to
take approximately one second.

Operation                     Estimated Time

one-digit simple addition     1 second
two-digit modular addition    5 seconds
table look-up                 1 second
data recording                1 second

To confirm this estimate, high school students were asked to perform a series of
randomly generated two-digit addition problems modulo 37. All students were given
three pages (340 problems) and ten minutes to complete as many as possible. They were
instructed to prioritize accuracy above speed, and were not allowed to use calculators or
other aids. Furthermore, testers were instructed to answer the problems in order, rather
than skipping more difficult problems.
Thirty-seven students took the test. Two tests were removed from the sample due
to failure to answer the problems in order, and one additional test was removed because it
was not signed by the tester. On one test, two problems answered with “37” which
should have been zero were marked correct, since the error was systematic and appeared
to result from a misunderstanding of the modulus operation. In other instances, an
answer larger than 36 was counted wrong even if it was equivalent (mod 37) to the
correct answer.
The speeds ranged from 2.1 problems/minute to 20.9 problems/minute (average =
8.9 problems/minute = 6.7 seconds/problem). The accuracies ranged from 83.9% to
100% (average = 95.0%).
The two math classes which took the test differed significantly in both accuracy
and speed. In the first class, out of nine testers, the speeds ranged from 6.4
problems/minute to 20.9 problems/minute (average = 12.5 problems/minute = 5.31
sec/problem) and the accuracies ranged from 95.7% to 100% (average = 97.7%). The
average speed for this class was very close to the estimated speed. In the second class,
out of twenty-three testers, speeds ranged from 2.1 problems/minute to 13.9
problems/minute (average = 7.3 problems/minute) and the accuracies ranged from 83.9%
to 100% (average = 93.8%).
Exactly one student in each class obtained an accuracy of 100%, but the student to
do so in the first class was faster than the student to do so in the second. On average, the
first class exceeded the speed of the second by 70% of the second class’s speed, and
exceeded their accuracy by an absolute margin of 3.9%.
Two factors were observed which may have contributed to this discrepancy. The
first is that the first class performed the experiment on the same day that they took the
American Mathematics Competition (AMC) test, and so may have been “warmed up”
from the additional mathematics practice earlier in the day. The second is that the second
class performed the experiment during the first period of the school day (prior to 9:00
A.M.) and may have been experiencing some residual sleepiness after awakening.
It should be noted that most of the test subjects had never performed the modulus
operation prior to this experiment. 72% of all errors occurred on problems where the
modulus operation changed the answer, even though such problems made up only 50% of
all problems on the test. It is thought that, with practice, the frequency of errors on these
problems could become as low as the frequency of errors on problems requiring no
modulus. If this occurred (without altering the error rate on problems requiring no
modulus) the average accuracy would rise from 95.0% to 97.2%.
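
The projected figure of 97.2% follows from the numbers above, as the following arithmetic sketch shows. It assumes, as the text does, that problems with and without a modulus step each make up half of the test.

```python
# Figures taken from the text.
error_rate = 100 - 95.0        # percent of all answers that were wrong
share_on_modulus = 0.72        # fraction of errors made on modulus problems

# Errors on the half of the test that needed no modulus step,
# expressed in percentage points of all answers.
errors_no_modulus = error_rate * (1 - share_on_modulus)

# If the modulus half reached the same error frequency, total errors
# would be twice that amount.
projected_errors = 2 * errors_no_modulus
projected_accuracy = 100 - projected_errors

print(round(projected_accuracy, 1))
```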
It is thought that this error rate could be further reduced with practice, though it is
not certain by what margin or with how much practice. Eventually, a practitioner could
construct a mental table for two-digit addition modulo 37, at which point the speed and
accuracy of the operation may become comparable to the speed and accuracy for single-digit
simple addition.


In the remainder of the paper, the current average accuracy of the first class to
take the test (97.7%) is used for purposes of estimation, based upon the assumption that
anyone actually experiencing the scenario outlined in section 1.2 will quickly become
“warmed up,” and will also experience alertness due to apprehension regarding his
predicament, thereby compensating for both observed occurrences thought to dampen the
results of the second class.

[Figure: Speed and Accuracy of Human Computation. Accuracy (82% to 100%) is plotted
against speed (0 to 24 problems/min.). Triangles represent testers in the first class
tested, circles represent testers in the second class tested, and diamonds represent
two additional volunteer testers.]

5.2 Encryption Computation Time


Each round of RC4 includes three table look-ups, a modular increment, three two-
digit modular additions (two to generate the keystream, and one to combine the
keystream output with the plaintext letter), and a swap function (two data recordings). If
the modular increment is treated as a single-digit simple addition, then using the above
time estimates, it will take approximately 21 seconds to encrypt one letter using JCRC4,
allowing encryption of roughly three letters per minute. At this rate, a 100-character
message could be encrypted in slightly more than half an hour.
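
The estimate can be reproduced from the operation counts and the per-operation times given in section 5.1. The dictionary keys below are names chosen for this sketch.

```python
# Estimated seconds per operation, from section 5.1.
SECONDS = {
    "table_lookup": 1,
    "single_digit_addition": 1,    # stands in for the modular increment
    "modular_addition": 5,
    "data_recording": 1,
}

# Operations needed to encrypt one letter, as listed in the text.
PER_LETTER = {
    "table_lookup": 3,
    "single_digit_addition": 1,
    "modular_addition": 3,
    "data_recording": 2,           # the swap
}

seconds_per_letter = sum(SECONDS[op] * n for op, n in PER_LETTER.items())
letters_per_minute = 60 / seconds_per_letter
minutes_for_100 = 100 * seconds_per_letter / 60

print(seconds_per_letter, round(letters_per_minute, 2), minutes_for_100)
```

The three modular additions account for 15 of the 21 seconds, so any practice that speeds up addition modulo 37 has the largest effect on the total.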

5.3 Key-Scheduling Algorithm Computation Time


The key-scheduling algorithm requires a series of two-digit modular additions,
table look-ups, and data recordings. The number of each of these operations required to
construct the permutation table depends upon the number of collisions that occur while
inserting values into the table.

5.3.1 Best Case Analysis


If no collisions occur, then placing each character requires a two-digit modular
addition to shift from the last value placed, a table look-up to confirm that slot is empty,
and a data entry to place the value into the table.
The first placement does not require the addition, because it is added to zero (the
identity). The last placement does not require the addition either, because the value is
simply placed in the last available slot.

5.3.2 Worst Case Analysis


Each collision during key set-up causes one additional two-digit modular addition
and one additional table look-up to be required. If all key values are independently
random (this is not quite the case, but the approximation dramatically simplifies the
analysis), then the maximum number of collisions for the ith placement is i – 2 (since it
cannot collide with itself, the value just placed, or a value that has yet to be placed).
No collisions occur on the last character placed, because it is simply placed in the
last open slot. The maximum number of collisions is therefore:
$$\sum_{i=3}^{36} (i - 2) = 595 \text{ collisions (maximum)}$$

The worst case scenario in practice is not quite this bad, because each placement
is not independent of the others. For example, it is not possible to experience the
maximum number of collisions for a placement on two consecutive placements.
However, the reuse of key values also subjects the placements to a very mild form of
secondary clustering, which may partially negate such effects. In any case, the
worst case scenario cannot be any worse than this, and may be significantly better.

5.3.3 Average-Case Analysis


To estimate average-case collisions, each insertion is modeled as being
independent and it is assumed that key values are not reused.
No collisions ever occur on placements #1, #2, or #37.
On placement #3, a maximum of 1 collision is possible, and occurs with
probability 1/36. The average number of collisions for this placement is 1/36.
On placement #4, a maximum of 2 collisions are possible. At least one collision
occurs with probability 2/36, and given that at least one collision occurs, the second
occurs with probability 1/35. The average number of collisions for this placement is
(2/36) + [(2/36) x (1/35)] = 2/35.
On placement #5, a maximum of 3 collisions are possible. At least one collision
occurs with probability 3/36. Given at least one collision occurs, a second occurs with
probability 2/35. Given that two occur, the third occurs with probability 1/34.
Average collisions: (3/36) + [(3/36) x (2/35)] + [(3/36) x (2/35) x (1/34)]
= (3/36) x (1 + (2/35) x (1 + (1/34)))
= 3/34

Exhaustive computation reveals that the average number of collisions for a single
placement is always equal to the maximum number of collisions (i – 2) divided by 37
minus the maximum number of collisions, that is, (i – 2)/(39 – i). This indicates that the
average number of total collisions in the key set-up is:

$$\sum_{i=3}^{36} \frac{i - 2}{39 - i} \approx 65 \text{ collisions (average)}$$
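
The per-placement pattern and both totals can be checked mechanically. The following Python sketch is illustrative only and is not part of the original analysis; it assumes the simplified model described above (independent insertions, no reuse of key values, 36 candidate cells for the first probe), and the function names are hypothetical.

```python
from fractions import Fraction

def max_collisions(i):
    """Maximum collisions on placement #i (none on #1, #2, or #37)."""
    return i - 2 if 3 <= i <= 36 else 0

def expected_collisions(i):
    """Expected collisions on placement #i under the simplified model.

    A k-th collision requires every earlier probe to have collided, so the
    expectation is the sum of the running products
    (c/36), (c/36)((c-1)/35), ... where c is the maximum for this placement.
    """
    c = max_collisions(i)
    expected = Fraction(0)
    chain = Fraction(1)  # probability that the first k probes all collide
    for k in range(c):
        chain *= Fraction(c - k, 36 - k)
        expected += chain
    return expected

worst_total = sum(max_collisions(i) for i in range(1, 38))
average_total = sum(expected_collisions(i) for i in range(1, 38))

# Closed form quoted in the text: (i - 2) / (39 - i) for placements 3..36.
closed_form_holds = all(
    expected_collisions(i) == Fraction(i - 2, 39 - i) for i in range(3, 37)
)

print(worst_total, float(average_total), closed_form_holds)
```

Exact rational arithmetic is used so that the comparison with the closed form is not affected by floating-point rounding.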

Experimentally, however, out of 500 randomly-selected keys for which the key-
scheduling algorithm was completed by computer, the average number of collisions was
closer to 85. This is likely due to secondary clustering, where the reuse of key values
tends to cause values to be inserted in relative positions more likely to cause longer
chains of collisions.

Case                          Operations                          Estimated Time

Best Case (no collisions)     35 two-digit modular additions      249 sec (4.15 min)
                              37 table look-ups
                              37 data entries

Worst Case                    630 two-digit modular additions     3,819 sec (63.65 min)
                              632 table look-ups
                              37 data entries

Average Case (calculated)     100 two-digit modular additions     639 sec (10.65 min)
                              102 table look-ups
                              37 data entries

Average Case (experimental)   120 two-digit modular additions     759 sec (12.65 min)
                              122 table look-ups
                              37 data entries
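
The time estimates in the table can be reproduced from the collision counts. The sketch below is an illustration, not the author's method: the per-operation times (5 seconds per two-digit modular addition, 1 second per table look-up, 1 second per data entry) are an assumption inferred from the table totals, which they match exactly, and `setup_seconds` is a hypothetical helper name.

```python
# Per-operation times in seconds.
# ASSUMPTION: inferred from the table totals, not quoted from this section.
ADD_S, LOOKUP_S, ENTRY_S = 5, 1, 1

def setup_seconds(collisions):
    """Estimated key set-up time for a given number of collisions.

    Each collision costs one extra modular addition and one extra look-up
    on top of the 35 additions, 37 look-ups, and 37 data entries of the
    collision-free case.
    """
    additions = 35 + collisions
    lookups = 37 + collisions
    entries = 37
    return additions * ADD_S + lookups * LOOKUP_S + entries * ENTRY_S

cases = [
    ("best", 0),
    ("worst", 595),
    ("average (calculated)", 65),
    ("average (experimental)", 85),
]
for label, collisions in cases:
    seconds = setup_seconds(collisions)
    print(f"{label}: {seconds} sec ({seconds / 60:.2f} min)")
```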

While only a very determined cryptographer would be likely to compute the
worst-case key by hand, the average case does not seem unreasonable.
If a computer is available when determining the initial key, a key might be
strategically chosen to limit the number of collisions and reduce computation time.
Alternatively, the sender and receiver could agree that if any key set-up is found to entail
more than a certain number of collisions, set-up is aborted and the next key in the rotation
is used.

5.4 Error Creation and Propagation


The RC4 algorithm is as follows:

(1) i = (i + 1) mod m
(2) j = (j + P[i]) mod m
(3) swap P[i] and P[j]
(4) output k = P[(P[i] + P[j]) mod m]
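
For reference, the four steps can be written as a short Python sketch. This is an illustration under stated assumptions: the table size m is taken from the length of the table passed in, both counters start at zero, and the initial permutation is supplied by the caller (the key-scheduling algorithm is not reproduced here). The name `rc4_keystream` is hypothetical.

```python
def rc4_keystream(perm, count):
    """Return `count` keystream words from a starting permutation table."""
    P = list(perm)          # work on a copy so the caller's table is untouched
    m = len(P)
    i = j = 0               # ASSUMPTION: both counters start at zero
    out = []
    for _ in range(count):
        i = (i + 1) % m                    # step (1)
        j = (j + P[i]) % m                 # step (2)
        P[i], P[j] = P[j], P[i]            # step (3)
        out.append(P[(P[i] + P[j]) % m])   # step (4)
    return out

# Demonstration with an unkeyed identity table of 37 entries.
print(rc4_keystream(range(37), 3))
```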

Since the cipher relies on human computation, there is likely to be a non-negligible
rate of introduction of errors into the cipher.
There are three basic types of errors possible when computing RC4:
(1) The first is if either the i counter or the j counter is set to an erroneous value.
Because both counters are set each round based, at least in part, upon their values from
the previous round, these errors are propagated throughout the rest of the keystream.
Such an error causes the entire keystream from that point onward to become invalid.
(2) The second occurs when one or more values in the permutation table become
inaccurate. On any round where the i counter points to an erroneous value, the error is
transmitted to the value of the j counter, creating an error of the first type, and
invalidating the entire keystream from that point onward.
On any round where the j counter points to an erroneous value, that word of the
output will be incorrect. Additionally, the incorrect value will be moved to the cell just
probed by the i counter, ensuring that it will not be pointed to by i (and propagated
throughout the rest of the keystream) for at least N rounds.
(3) The third type of error occurs when the output for the current round is
miscalculated, or the plaintext character being encrypted is inadvertently shifted by a
value other than the output (the intended shift). This type of error corrupts only that letter
of the message, and cannot propagate.
Because line one only involves a simple increment, it is unlikely that any errors
will occur there. However, if an error did occur on line one, this would create an error of
the first type and corrupt the keystream from that point forward.
If an error occurred on line two while setting the j counter, this would also create
an error of the first type and corrupt the keystream from that point forward.
An error on line three would create an error of the second type, causing one or
two improper values to be entered into the permutation table. This would corrupt the
current output, and might corrupt additional outputs later on (see above).
An error on line four, or while applying the keystream word to the plaintext as a
shift, would result in an error of the third type, corrupting only that character of the
message.
If an error is made while computing the key-scheduling algorithm, an insertion
may be made in the wrong position, creating an error of the second type. Because each
insertion is also used as the starting point for the next insertion, barring a second error
that happens to correct the first, all subsequent values inserted into the permutation table
will also be erroneous.
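
The difference between a propagating error and a local one can be seen by injecting a single slip into the j counter and comparing the result with a clean run. The sketch below is illustrative only: `fault_round` is a hypothetical parameter that adds 1 to j on one round to model a type 1 error, and the shuffled table is an arbitrary stand-in for a keyed permutation rather than the output of the JCRC4 key schedule.

```python
import random

def keystream(perm, count, fault_round=None):
    """Generate `count` outputs; optionally mis-set j by +1 on one round.

    `fault_round` is 1-based and models a single type 1 error.
    """
    P = list(perm)
    m = len(P)
    i = j = 0
    out = []
    for rnd in range(1, count + 1):
        i = (i + 1) % m
        j = (j + P[i]) % m
        if rnd == fault_round:
            j = (j + 1) % m  # the simulated arithmetic slip
        P[i], P[j] = P[j], P[i]
        out.append(P[(P[i] + P[j]) % m])
    return out

table = list(range(37))
random.Random(2002).shuffle(table)  # arbitrary stand-in for a keyed table

clean = keystream(table, 60)
faulty = keystream(table, 60, fault_round=10)
mismatches = sum(a != b for a, b in zip(clean, faulty))

print("outputs before the slip agree:", clean[:9] == faulty[:9])
print("mismatched outputs out of 60:", mismatches)
```

A type 3 error, by contrast, would change a single entry of the output list and leave the counters and the table untouched.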

5.5 Error Frequency Analysis


It is necessary to compute three two-digit modular additions to encrypt each letter
of the plaintext using JCRC4: setting the j counter, computing the output index, and
applying the shift to the plaintext letter (the increment, table look-up, and swap
operations are assumed to have negligible error rates). An error in the first of these
three operations (setting the j counter) results in a type 1 error, corrupting the
keystream from that point forward. An error in either of the other two results in a type 3
error, corrupting only that letter of the message. With an accuracy of 97.7% per addition,
more than 95% of those characters encrypted before the keystream is corrupted will be
encrypted correctly (0.977^2 ≈ 0.955), and this is high enough that the redundancy in
natural language should allow the recipient of the message to correct the errors.
Corruption of the keystream, however, presents a more significant problem. With
97.7% accuracy, only the first four outputs of the generator have a greater than 90%
chance of being encrypted before the keystream is permanently corrupted. The first nine
have an 80% chance of coming through. The first twenty-nine have a 50% chance of
encryption prior to a permanent error. These figures are made still more troublesome by
the fact that the first 11 outputs need to be discarded for security.
The key schedule requires even more operations. For a key with the calculated
average number of collisions, there exists about a 10% chance that a human with 97.7%
accuracy will compute the initial permutation table without error.
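
These figures follow from raising the per-operation accuracy to the number of error-prone operations. The sketch below is a rough check, assuming one keystream-corrupting addition (setting j) per output, independent errors, and the calculated average of 100 additions for the key schedule; the helper name `longest_run` is hypothetical.

```python
ACCURACY = 0.977  # per-addition accuracy used in the text

def longest_run(threshold, accuracy=ACCURACY):
    """Largest n such that n consecutive additions all succeed
    with probability of at least `threshold`."""
    n = 0
    while accuracy ** (n + 1) >= threshold:
        n += 1
    return n

# Outputs likely to be produced before the keystream is corrupted.
for threshold in (0.90, 0.80, 0.50):
    print(f"at least {threshold:.0%} chance: first {longest_run(threshold)} outputs")

# Two type 3 opportunities per character (output index and shift).
per_character = ACCURACY ** 2

# Key schedule with the calculated average of 100 modular additions.
key_schedule = ACCURACY ** 100

print(f"character correct: {per_character:.3f}, key schedule correct: {key_schedule:.3f}")
```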

5.6 Error Correction


If there is to be any reasonable chance that a message encrypted with JCRC4
without a computer is encrypted correctly, it is necessary to either greatly increase the
accuracy of computation or perform error-correction. It is recommended that after
calculating the key-scheduling algorithm once, the cryptographer should perform the
computations a second time (without referring to the first attempt), and then compare the
results. Upon finding the first discrepancy, determine which calculation was correct,
discard the flawed calculation, and compute the key-scheduling algorithm again.
Continue until two separate calculations of the initial permutation table are identical;
there is little chance that two erroneous configurations would be identical, so this can be
assumed to be correct.
Similarly, before actually encrypting the plaintext, compute the keystream
multiple times until accuracy is certain. Fortunately, computation for JCRC4 is
sufficiently fast to make multiple calculations practical.
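
The stopping rule described above (repeat until two separate calculations are identical) can be sketched as follows. This is a simplification and an assumption, not the exact procedure recommended: it does not locate the first discrepancy or decide which calculation was flawed, it simply accepts the first result that has been obtained twice. `compute_once` stands in for one independent hand calculation, and all names are hypothetical.

```python
def compute_until_two_agree(compute_once, max_attempts=10):
    """Repeat a calculation until some result has been obtained twice.

    Returns the agreed result and the number of attempts it took.
    """
    seen = set()
    for attempt in range(1, max_attempts + 1):
        result = tuple(compute_once())
        if result in seen:
            return result, attempt
        seen.add(result)
    raise RuntimeError("no two calculations agreed")

# Stand-in for hand calculation: the second attempt contains a slip.
attempts = iter([(3, 0, 2, 1), (3, 2, 0, 1), (3, 0, 2, 1)])
table, used = compute_until_two_agree(lambda: next(attempts))
print(table, used)
```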

6. Conclusion
A scenario was defined to determine the requirements of a jail cell cipher. It was
determined that most traditional paper-and-pencil ciphers, machine ciphers, one-time
pads, book ciphers, and card ciphers do not meet these requirements. A modified version
of the RC4 cipher was proposed with a reduced permutation table size and a new key-
scheduling algorithm which appears to fulfill all the requirements defined by the
scenario.
Additional cryptanalysis of the key-scheduling algorithm for JCRC4 is needed to
verify its security. Additional research is also required to determine the potential
improvement in the speed and accuracy of human computation with practice. However,
it is apparent that Jail Cell RC4 can be performed using reasonable human computational
abilities in less than an hour without error-checking, and in less than a day with sufficient
error-checking to compensate for the experimentally determined error rate.
There are also no known cryptanalytic attacks which can be used to recover the
key of JCRC4 without nontrivial computational resources. The best known attack is
sufficiently computationally complex that none of the cited implementations [8, 10, 11]
have actually performed the attack, and the attack becomes considerably more difficult if
the plaintext of the encrypted message is unknown to the attacker.

References

[1] Nikita Borisov, Ian Goldberg, David Wagner; “Intercepting Mobile Communications:
the Insecurity of 802.11”
[2] Thomas H. Cormen, Charles E. Leiserson, Ronald L. Rivest; Introduction to
Algorithms, The MIT Press, McGraw-Hill Book Company, 1990, p.232-237
[3] Paul Crowley; “Problems with Bruce Schneier’s ‘Solitaire,’” August 13, 2001,
http://www.ciphergoth.org/crypto/solitaire/
[4] Electronic Frontier Foundation; “Cracking DES,” January 19, 1999,
http://www.eff.org/descracker/
[5] Scott Fluhrer, Itsik Mantin, Adi Shamir; “Weaknesses in the Key Scheduling
Algorithm of RC4,” Lecture Notes in Computer Science Vol. 2259, SAC 2001
[6] Scott R. Fluhrer, David McGrew; “Statistical Analysis of the Alleged RC4 Stream
Cipher,” Fast Software Encryption Seventh International Workshop, Springer-
Verlag, March, 2000
[7] Jovan Dj. Golić; “Linear Statistical Weakness of Alleged RC4 Keystream Generator,”
EUROCRYPT 1997,
http://www.wisdom.weizmann.ac.il/~itsik/RC4/Papers/Golic.PDF
[8] Bob Jenkins; “solitaire, cryptonomicon” post on sci.crypt newsgroup, August 11,
1999; source code at http://burtleburtle.net/bob/crypto/partrc4.c and
http://burtleburtle.net/bob/c/brute.c
[9] Robert J. Jenkins Jr.; “ISAAC and RC4,” 1993-1996,
http://burtleburtle.net/bob/rand/isaac.html
[10] Lars R. Knudsen, Willi Meier, Bart Preneel, Vincent Rijmen, Sven Verdoolaege;
“Analysis Methods for (Alleged) RC4,” ASIACRYPT 1998,
http://www.wisdom.weizmann.ac.il/~itsik/RC4/Papers/Knudsen.ps
[11] S. Mister, S. Tavares; “Cryptanalysis of RC4-like Ciphers,” in the Workshop Record
of the Workshop on Selected Areas in Cryptography (SAC ’98), Aug. 17-18,
1998, p.136-148
[12] Andrew Roos; “A Class of Weak Keys in the RC4 Stream Cipher,”
http://www.achtung.com/crypto/roosattack.txt
[13] John J. G. Savard; “A Cryptographic Compendium,” 1998, 1999, 2000,
http://home.ecn.ab.ca/~jsavard/crypto/entry.htm
[14] Bruce Schneier; Applied Cryptography, Second Edition, John Wiley & Sons, 1996
[15] Bruce Schneier; “The Solitaire Encryption Algorithm,” Appendix, Cryptonomicon
(by Neal Stephenson), Avon Books, 1999, p.911-918
[16] Paul Verhaeghen, Reinhold Kliegl, Ulrich Mayr; “Sequential and coordinative
complexity in time-accuracy functions for mental arithmetic,” Psychology &
Aging Vol. 12(4), Dec., U.S.: American Psychological Assn. 1997, p. 555-564
[17] David Wheeler, Rodger Needham; “TEA, a Tiny Encryption Algorithm,” November
1994, http://www.ftp.cl.cam.ac.uk/ftp/papers/djw-rmn/djw-rmn-tea.html
