Graham Browne
CONTENTS
1. INTRODUCTION
1.1 Cryptography
1.2 Standard Cryptographic Techniques
1.3 Applied Cryptography
1.3.1 Secure Communications
1.3.2 Secure Data Storage
1.3.3 Identification and Authentication
1.3.4 Electronic Commerce
1.3.5 Certification
1.3.6 Key and Password Recovery
1.3.7 Secure Computer Access
1.4 Cryptography Standards
2. BASIC CONCEPTS
2.1 Traditional Secret-Key Cryptography
2.1.1 Block Ciphers
2.1.1.1 Electronic Code Book
2.1.1.2 Cipher Block Chaining
2.1.1.3 Cipher Feedback
2.1.1.4 Output Feedback
3. APPLICATIONS OF CRYPTOGRAPHY
3.1 Privacy
3.2 Password Encryption
3.3 Authentication
3.4 Key Agreement
3.5 Digital Envelopes
4. HARD PROBLEMS
4.1 One-Way Functions
4.2 The Factorization Problem
4.3 The Discrete Logarithm Problem
6. THE RSA ALGORITHM
7. THE DES ALGORITHM
7.1 Triple DES
7.2 Restricting DES Key Usage
7.3 Using DES
7.4 Some Examples
7.5 Key Check Values
8. DIFFIE-HELLMAN KEY AGREEMENT PROTOCOL
9. HASH ALGORITHMS
9.1 SHA-1
9.2 MD5
9.3 HMAC
10. KEY MANAGEMENT
10.1 Key Generation
10.2 Key Distribution
10.3 Key Storage
10.4 A Key Distribution Centre Protocol
11. PUBLIC KEY INFRASTRUCTURE
11.1 Shared Private Keys
11.2 Key Expiry
11.3 Loss or Compromise of Private Key
11.4 Key Recovery
12. PERSONAL IDENTIFICATION NUMBERS
13. CRYPTOGRAPHY STANDARDS
14. LEGAL ISSUES
14.1 Legal Disclaimer
14.2 Cryptographic Patents
GLOSSARY
Cryptography is mainly concerned with keeping communications private, and the protection of sensitive information has been the emphasis of the subject since ancient times. However, privacy is only one aspect of modern cryptography. Encryption transforms data into a form that is impossible to read without knowledge of a secret key, and it ensures privacy by hiding information from anyone for whom it is not intended. Decryption turns encrypted data back into its original form. Encryption and decryption usually require the use of secret information, known as a key. Some encryption methods require the same key for both encryption and decryption. Other methods use different (but related) keys for encryption and decryption. Methods using the same key for both encryption and decryption are called symmetric, and those that use different keys are called asymmetric. Authentication is the electronic equivalent of a signature, and cryptography provides techniques based on symmetric cryptosystems (Message Authentication Codes) and asymmetric cryptosystems (Digital Signatures and Digital Timestamps). Cryptographic methods can be used to protect data and control access. More advanced applications allow us to pay using electronic money or to prove we have information without revealing the information itself.
The purpose of this guide is to give a general and useful overview of basic techniques sufficient for most practical purposes. Theoretical aspects of the subject are either ignored completely or glossed over, and the algorithms considered are those most commonly used, namely DES, RSA, MD5 and SHA-1. A full treatment of the subject can be found in Applied Cryptography, and Secrets and Lies by Bruce Schneier.
Symmetric Cryptosystems
These are also known as secret-key systems, and require the same key for both encryption and decryption. The most widely used algorithm is the Data Encryption Standard (DES).
Asymmetric Cryptosystems
These are also known as public-key systems, in which each user has a public key and a private key. The public key is published while the private key is kept secret. The public key is used for encryption (or for verifying signatures) and the private key is used for decryption (or for creating signatures). The most widely used asymmetric algorithm is RSA, named after its inventors (Rivest, Shamir and Adleman).
Hashing Algorithms
These are used to reduce a long message to a digest in such a way that any change in the message will result in a different digest, and that it is infeasible to construct a different message that corresponds to the same digest. Typically, hashing algorithms are used when creating digital signatures. The most commonly used algorithms are MD5 and SHA-1.
Digital Signature Algorithm
This is also a public-key technique, but it can only be used for signatures, not encryption.
Diffie-Hellman Key Agreement Protocol
This is a widely used public-key technique for setting up secret keys over an insecure channel.
1.3
Applied Cryptography
Cryptography can be used in many applications, such as:
Secure Communications
Secure Data Storage
Authentication
Identification
Electronic Commerce
Certification
Secure Electronic Mail
Key and Password Recovery
Secure Computer Access
1.3.1 Secure Communications This is achieved by encrypting messages before sending and decrypting them after they are received. A third party intercepting the messages will be unable to decipher them, provided the algorithm is good and the key(s) are kept secure. Until the development of public-key cryptography, key management problems inhibited the widespread use of secure communications. However, public-key techniques now exist which make it possible to create large-scale networks of people able to communicate securely even if they have never communicated before.
1.3.2 Secure Data Storage This is essentially the same as secure communications except that the data is transferred to and from disk instead of being transmitted over a communications channel.
1.3.3 Identification and Authentication Identification is the process of verifying the identity of a person or thing. For example, when obtaining money from a bank via an ATM (automatic teller machine), you present a card that has a PIN (personal identification number) associated with it. On inserting the card into the ATM, you are prompted for the PIN. If you enter the correct PIN, the ATM identifies you as the rightful owner of the card and permits access to your account. Authentication is similar to identification, but merely determines whether a person is authorized for whatever action he is requesting.
1.3.4 Electronic Commerce A large amount of business is conducted over the Internet. Among the many examples are:
Internet shopping
Online banking
Online brokerage accounts
To pay for such services you need to be able to present your credit card number securely, or there is scope for fraud. The obvious solution to this is to encrypt the credit card number when it is entered online, or to encrypt the entire session. When the web server receives the encrypted information it decrypts it and proceeds with the transaction confident that no personal information has been compromised.
1.3.5 Certification Certification authorities are trusted agents that vouch for unknown users. Typically a user identifies himself to the certification authority in the traditional manner (birth certificate, photograph etc) and is issued with a certificate which can be presented to other users who are then convinced that the owner is indeed who he claims to be. Digital signatures are used to implement certification.
1.3.6 Key and Password Recovery This technology allows a key or password to be revealed without its owner revealing it. This is useful if a user loses or inadvertently deletes his key or password. This could also be useful to a law enforcement agency wishing to eavesdrop on a suspected criminal, but this particular application has met with some resistance.
1.3.7 Secure Computer Access Using passwords for secure access may not always provide sufficient security. For example, passwords can be forgotten, guessed or stolen. Many vendors supply cryptographic products for remote access. Typically such products use a challenge/response protocol in which the remote system presents the user with a number (the challenge) and the user responds with a number generated by his remote access device. If this is the expected response, access is granted.
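The challenge/response exchange described above can be sketched as follows. This is only an illustration under stated assumptions: the guide does not say how a remote access device derives its response, so the use of HMAC-SHA-256 over the challenge, the 6-digit response format, and the function names are all hypothetical choices made for this example.

```python
import hashlib
import hmac
import secrets


def make_challenge() -> bytes:
    """Host side: pick a fresh random challenge for this login attempt."""
    return secrets.token_bytes(8)


def compute_response(device_key: bytes, challenge: bytes) -> str:
    """Device side: derive a short numeric response from the challenge.

    Assumption: the device and the host share device_key in advance.
    """
    mac = hmac.new(device_key, challenge, hashlib.sha256).digest()
    # Reduce the MAC to 6 decimal digits so that a user could type it in.
    return f"{int.from_bytes(mac[:4], 'big') % 1_000_000:06d}"


def verify_response(device_key: bytes, challenge: bytes, response: str) -> bool:
    """Host side: recompute the expected response and compare in constant time."""
    expected = compute_response(device_key, challenge)
    return hmac.compare_digest(expected, response)


shared_key = secrets.token_bytes(16)   # provisioned into the device beforehand
challenge = make_challenge()           # sent from the host to the user
response = compute_response(shared_key, challenge)  # typed back by the user
print("access granted" if verify_response(shared_key, challenge, response) else "access denied")
```

Because the challenge changes on every attempt, a response captured by an eavesdropper is of no use for a later login.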
1.4
Cryptography Standards
Standards are required to enable systems to interoperate properly and in a uniform manner. Their main purpose is to ensure that technology and products from different sources are compatible. Standards help developers design new products because they can follow existing standards throughout the development without having to develop their own, and consumers can choose among competing products. Many organizations (including government and private industry) contribute to the standards on cryptography. Examples of these are ANSI, ISO, IEEE, and NIST.
Traditional cryptography uses a single key to encrypt and decrypt a message. An algorithm that uses the same key to encrypt and decrypt is called symmetric. This type of cryptography also deals with authentication, the main technique being the creation and verification of message authentication codes (MACs). The difficulty with secret-key cryptosystems is sharing a key between the sender and receiver without anyone else compromising it. In a system supporting a large number of users the key management problems can become very severe. The advantage of traditional cryptography is that it is usually much faster than public-key cryptography. The main techniques are:
Block Ciphers
Stream Ciphers
Message Authentication Codes
2.1.1 Block Ciphers A block cipher transforms a fixed-length block of plaintext into a block of ciphertext of the same length, using a secret key. To decrypt, the reverse process is applied to the ciphertext block using the same secret key. In the case of DES, the block size is 64 bits (8 bytes) and the key is 56 bits presented as 8 bytes, the low order bit of each byte being ignored. It is usual to set every 8th bit so that each byte contains an odd number of set bits. This process is known as DES key parity adjustment. To use a block cipher to encrypt data of arbitrary length, we can use one of the following techniques (or modes of operation):
Electronic Code Book (ECB)
Cipher Block Chaining (CBC)
Cipher Feedback (CFB)
Output Feedback (OFB)
Most good block ciphers transform the secret key into a number of sub keys and the data is encrypted by a process that has several rounds (iterations) each round using a different sub key. The set of sub keys is known as the key schedule. In the case of DES the secret key is transformed into 16 sub keys and consequently DES takes 16 rounds to perform an encryption.
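The DES key parity adjustment mentioned in section 2.1.1 can be sketched in a few lines. The function name is an assumption made for this example. It keeps the seven high-order bits of each key byte and sets the low-order bit so that the byte holds an odd number of set bits.

```python
def adjust_des_parity(key: bytes) -> bytes:
    """Return the key with the low-order bit of each byte set for odd parity."""
    adjusted = bytearray()
    for byte in key:
        high7 = byte & 0xFE              # the 7 bits that carry key material
        ones = bin(high7).count("1")
        # If the 7 key bits already hold an odd number of ones, the parity
        # bit must be 0; otherwise it must be 1.
        adjusted.append(high7 | (0 if ones % 2 else 1))
    return bytes(adjusted)


print(adjust_des_parity(bytes(8)).hex().upper())                         # 0101010101010101
print(adjust_des_parity(bytes.fromhex("0123456789ABCDEF")).hex().upper())  # 0123456789ABCDEF
```

The second key is unchanged because each of its bytes already has odd parity.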
In ECB mode, each block of data is encrypted independently. If we take e_K(D) to mean encrypt block D with key K, then the plaintext D1, D2, D3, ..., Dn is encrypted as e_K(D1), e_K(D2), ..., e_K(Dn). The trouble with ECB mode is that plaintext patterns show up in the ciphertext, because each identical block of plaintext gives an identical block of ciphertext. This can lead to attacks based on rearranging, deleting or repeating ciphertext blocks. ECB mode should only be used for encrypting very small blocks of data such as keys.
2.1.1.2
Cipher Block Chaining
In CBC mode each plaintext block is XORed with the previous ciphertext block before it is encrypted. Because there is no previous ciphertext for the first block, an 8-byte block known as the Initial Chaining Value (ICV) is used to start the process. Patterns in the plaintext are hidden by the exclusive-OR. The ICV should be different for any messages encrypted with the same key, but it does not have to be kept secret and can be transmitted with the encrypted text. If the total length of the plaintext is not a multiple of 8, it is necessary to deal with the final short block. The obvious way to do this is to pad out the last block to 8 bytes, but the final block must contain a count of the number of filler bytes, so the message length is always increased, by a maximum of 8 bytes. If this increase in length is not acceptable, a solution is to XOR the short block with the result of re-enciphering the last complete ciphertext block (or, if there isn't one, the ICV).
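The difference between ECB and CBC can be seen in a small experiment. The sketch below does not use DES (which is not in the Python standard library). Instead it assumes a toy 64-bit Feistel cipher built from SHA-256, invented purely for this illustration and not suitable for real use. It also assumes the data length is a multiple of 8 bytes, so no padding is handled.

```python
import hashlib

BLOCK = 8  # bytes, the same block size as DES


def _round(key: bytes, rnd: int, half: bytes) -> bytes:
    """Toy round function (assumption): 4 bytes of SHA-256 output."""
    return hashlib.sha256(key + bytes([rnd]) + half).digest()[:4]


def _xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def encrypt_block(key: bytes, block: bytes) -> bytes:
    left, right = block[:4], block[4:]
    for rnd in range(4):
        left, right = right, _xor(left, _round(key, rnd, right))
    return left + right


def decrypt_block(key: bytes, block: bytes) -> bytes:
    left, right = block[:4], block[4:]
    for rnd in reversed(range(4)):
        left, right = _xor(right, _round(key, rnd, left)), left
    return left + right


def ecb_encrypt(key: bytes, data: bytes) -> bytes:
    # Every block is encrypted independently of the others.
    return b"".join(encrypt_block(key, data[i:i + BLOCK])
                    for i in range(0, len(data), BLOCK))


def cbc_encrypt(key: bytes, icv: bytes, data: bytes) -> bytes:
    out, prev = [], icv
    for i in range(0, len(data), BLOCK):
        # XOR with the previous ciphertext block (or the ICV), then encrypt.
        prev = encrypt_block(key, _xor(data[i:i + BLOCK], prev))
        out.append(prev)
    return b"".join(out)


def cbc_decrypt(key: bytes, icv: bytes, data: bytes) -> bytes:
    out, prev = [], icv
    for i in range(0, len(data), BLOCK):
        block = data[i:i + BLOCK]
        out.append(_xor(decrypt_block(key, block), prev))
        prev = block
    return b"".join(out)


def blocks(data: bytes) -> list:
    return [data[i:i + BLOCK].hex() for i in range(0, len(data), BLOCK)]


key, icv = b"demo key", bytes(range(8))
plaintext = b"SAMEBLOK" * 3          # three identical 8-byte blocks
print("ECB:", blocks(ecb_encrypt(key, plaintext)))       # identical blocks repeat
print("CBC:", blocks(cbc_encrypt(key, icv, plaintext)))  # chaining hides the repeats
```

In the ECB output the same ciphertext block appears three times, which is exactly the pattern leak described for ECB mode. In the CBC output the chaining makes each block different.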
2.1.1.3
Cipher Feedback
In CFB mode the previous ciphertext block is encrypted and is XORed with the plaintext to give the current ciphertext block. As with CBC mode, an ICV is needed to start the process. As well as full 64-bit feedback, it is possible to define 1-bit, 2-bit, and up to 63-bit cipher feedback. In software implementations there is no advantage over CBC mode, though CFB is often used in link encryption devices.
2.1.1.4
Output Feedback
OFB is similar to CFB mode except that the block XORed with each plaintext block is independent of the plaintext and ciphertext and is produced by repeatedly encrypting the ICV. The advantage of OFB mode is that transmission errors are not propagated and do not affect decryption of blocks that follow. It is therefore a useful method for encryption of satellite links where re-transmission of a corrupted message would be
To solve the key management problems associated with traditional cryptography, Diffie and Hellman came up with the idea of public-key cryptography in 1976. In a public-key system, users have a pair of related keys, one called the public key (which is published) and the other the private key (which is kept secret). There is no need for the sender and receiver to share secret information because all communications involve the public keys, and the private keys are never shared. In a public-key system there is no need for a secure channel to share keys. All that is necessary is that the public keys are associated with their users in a trusted manner. Anyone possessing a user's public key can send an encrypted message, but only the intended recipient can decrypt it because only he possesses the corresponding private key. Public-key cryptography can also be used for authentication (digital signatures). In a public-key system, the private key is related mathematically to the public key. Therefore it is possible in principle to derive the private key from the public key. The trick is to make the derivation process very difficult. In the case of RSA, the derivation process requires the factorization of a large number, so the number is chosen to be so large that the factorization is computationally infeasible.
2.2.1 Encryption When I wish to send a message to Fred, I look up Fred's public key in a directory and use it to encrypt the message. I then send it to Fred, who then uses his private key to decrypt the message. No one eavesdropping can decrypt the message. Anyone can send Fred an encrypted message, but only Fred can read it.
2.2.2 Digital Signatures If I wish to sign a message, I use my private key and the message data to create a quantity called a digital signature. I then attach the signature to the message and send it to Fred. To verify the signature Fred needs the signature, the message, and my public key. If the result of Fred's calculation is correct according to the appropriate mathematical formula, the signature is taken to be genuine. If not, the signature is either fraudulent, or the message may have been changed.
The main advantage of public-key cryptography is better security and convenience because private keys are never revealed. In a secret-key system, keys have to be shared because the same key is used for encryption and decryption, and it may be possible for an enemy to discover the key during the sharing process. Public-key systems can support non-repudiation by use of digital signatures. Authentication in secret-key systems implies sharing of keys and sometimes requires involvement of a trusted third party as well. Therefore a sender can repudiate an authenticated message by claiming the key was compromised by one of the parties sharing it. Public-key systems prevent repudiation because each user is responsible for protecting his private key. Using public-key cryptography for encryption is usually very much slower than using secret-key cryptography. The best approach to encryption is to use a public-key technique to encrypt a key that is then used with a secret-key algorithm to encipher the data. In some cases secret-key cryptography is perfectly adequate, notably when small numbers of users can share keys in private, or in a closed system where a central key management authority can distribute the keys. When the number of users is very large, the administration problems for a key management authority may be insuperable. Public-key cryptography complements secret-key cryptography and makes it more secure. A particular use of public-key techniques is to establish keys in secret-key systems.
2.4
Hash Functions
A hash function is a process that takes data of an arbitrary length and returns a fixed-length string called the hash value. In cryptography, hash functions are usually required to have the following characteristics:
The input data may be of any length.
The output has a fixed length.
It is one-way, in the sense that the original data cannot be recovered from the hash.
It is collision-free, in the sense that it is computationally infeasible to find two different messages with the same hash value.
With these properties of the hash function, a hash value can be said to represent the longer message from which it was computed. The hash is often referred to as a message digest. Examples of hash functions are MD5 and SHA-1.
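The fixed-length and change-sensitivity properties are easy to observe with Python's standard hashlib module. The messages below are made up for the illustration. SHA-1 is used because it is one of the algorithms named in this guide.

```python
import hashlib

message = b"Pay Fred 100 pounds"
tampered = b"Pay Fred 900 pounds"   # a single character changed

# Inputs of very different lengths all produce a 20-byte (160-bit) digest.
for data in (b"", message, tampered, message * 1000):
    digest = hashlib.sha1(data).hexdigest()
    print(f"{len(data):6d} bytes -> {digest}")
```

The digests of the original and the tampered message bear no obvious relation to each other, even though the inputs differ in only one character.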
The simplest and most obvious application of cryptography is to implement privacy. The owner of the data simply encrypts it. The data is then only readable by someone with the key that was used to encrypt it.
3.2
Password Encryption
Sometimes data is encrypted in such a way that it cannot be decrypted. An example of this is a database of passwords where it is better to store hashes of the passwords rather than the passwords themselves. In this case the hash values are used to verify the passwords entered by the legitimate users.
3.3
Authentication
Cryptography can be used to verify the origin of a document and/or the identity of its sender, the integrity of its contents and other characteristics. The mechanisms available to achieve these are the digital signature and the message authentication code. The digital signature is based on the contents of the document and the sender's private key. It is usually created by hashing some or all of the document and encrypting the result with the private key. Since the only person with access to the private key is the sender, successful verification means that only the sender could have created the signature. A message authentication code is used to protect messages exchanged between two parties who trust each other. Both sender and receiver have agreed on a common secret key. This key is used to create a cryptographic checksum based on the contents of the message. Successful verification means that the message has not been altered in transit, and that the sender is in possession of the key. However, because both parties are in possession of the same piece of secret information, this technique does not support non-repudiation.
3.4
Key Agreement
Cryptography can be used to establish a secret key between two parties who have never communicated before, and this can be done securely even in an insecure environment. The most obvious method of doing this is to use the Diffie-Hellman key agreement protocol. The disadvantage of Diffie-Hellman is that it does not authenticate the users. To overcome this one could use digital envelopes, but that implies the existence of previously established secret information and possibly the involvement of a third party to provide certification of the public keys involved.
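As a rough illustration of how two parties can arrive at the same secret over an insecure channel, here is the textbook form of the Diffie-Hellman exchange with deliberately tiny numbers. The modulus 23, the base 5 and the two private values are assumptions chosen so the arithmetic can be followed by hand. Real use needs very large parameters and random private values, and (as noted above) some separate means of authenticating the parties.

```python
# Public parameters, known to everyone including an eavesdropper.
modulus = 23
base = 5

# Private values, never transmitted (fixed here for illustration only).
alice_private = 6
bob_private = 15

# Each party sends base^private mod modulus over the insecure channel.
alice_public = pow(base, alice_private, modulus)   # 8
bob_public = pow(base, bob_private, modulus)       # 19

# Each party combines its own private value with the other's public value.
alice_secret = pow(bob_public, alice_private, modulus)
bob_secret = pow(alice_public, bob_private, modulus)

print(alice_public, bob_public)   # 8 19
print(alice_secret, bob_secret)   # 2 2
```

Both sides end up with the same value, while an eavesdropper who sees only 8 and 19 would have to solve a discrete logarithm problem to recover either private value.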
3.5
Digital Envelopes
A digital envelope is a mechanism whereby secret keys (usually keys that are in use for a single message or for the duration of a communications session) are distributed encrypted under a previously established key known as a key-encrypting key (or KEK). The elements of a digital envelope are:
Secret Key encrypted using Key Encrypting Key
Message encrypted using Secret Key
The Key Encrypting Key can either be a symmetric key previously shared securely by the sender and the recipient, or it can be the public key of an asymmetric key pair.
A one-way function is one that is much easier to compute in one direction than in the opposite direction. For a really good one-way function it might be possible to compute in the forward direction in a few seconds, but impossible to compute in the reverse direction in less than several years. Sometimes a one-way function is easy to compute in the reverse direction provided a piece of extra information is given. This type of function is known as a trapdoor one-way function. Public-key systems rely on trapdoor one-way functions. In the case of RSA, encryption is easy knowing the public key and the modulus. However, decryption is very difficult unless the private key is known, and the private key can be derived if the factors of the modulus are known. Knowledge of these factors is the trapdoor in the system. The hard problem in this case is to factorize a very large number (the modulus). Nobody has yet proved that factorization is indeed a hard problem, but after hundreds of years of research into factorization it seems reasonable to assume that it is. However, a breakthrough in factorization could lead to systems like RSA becoming ineffective.
4.2
The Factorization Problem
All non-prime integers greater than 1 can be uniquely expressed as the product of a number of prime factors. Multiplying a set of integers together is straightforward, but the problem of recovering the prime factors of a given integer is very much harder. This (presumed) hard problem is the basis of several public-key cryptosystems, notably the RSA algorithm. It is not always true that large numbers are harder to factorize than smaller numbers. What is generally true is that numbers with large prime factors are harder to factorize than those with small prime factors. This is why the size of the RSA modulus determines the security of an actual implementation of the RSA cryptosystem. Factorization methods are the subject of much research among mathematicians. Algorithms for factorization are of two types: special-purpose and general-purpose.
Special-purpose algorithms work best for numbers with small factors, but the numbers used in cryptography do not usually have any small factors, so general-purpose methods are of more interest to us.
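Trial division is the simplest example of a method that works well only when the number has small factors. The sketch below is offered as an illustration of that point, not as one of the algorithms the guide has in mind. Its running time grows with the size of the second-largest prime factor, which is why it is hopeless against a modulus built from two very large primes.

```python
def trial_division(n: int) -> list:
    """Return the prime factors of n (n >= 2) in ascending order."""
    factors = []
    divisor = 2
    while divisor * divisor <= n:
        while n % divisor == 0:      # strip out every copy of this factor
            factors.append(divisor)
            n //= divisor
        divisor += 1
    if n > 1:                        # whatever remains is itself prime
        factors.append(n)
    return factors


print(trial_division(360))   # [2, 2, 2, 3, 3, 5]
print(trial_division(187))   # [11, 17]
```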
4.3
The Discrete Logarithm Problem
The discrete logarithm problem is to find the integer x that satisfies P = A^x mod M, where A, M and P are given. For example, the solution to 7^x mod 13 = 11 is x = 5. We can work this out easily enough by trial because the solution obviously has to lie between 1 and 12, but when the numbers are very large the solution to the discrete logarithm problem is believed to be very difficult. This is a one-way function in which the easy direction is to compute A^x mod M, and the hard direction is to recover x from a given result. Because of the supposed difficulty of the discrete logarithm problem, it has been used as the basis of a number of public-key systems, including ElGamal and DSS. The best algorithms for solving the discrete logarithm problem have running times comparable to those of the best factorizing algorithms.
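The small example above can be checked by exhaustive trial, which also shows why the approach does not scale: the loop may have to visit every candidate exponent below the modulus. The function name is an assumption made for this sketch.

```python
def discrete_log_by_trial(a: int, p: int, m: int):
    """Return the smallest x in 1..m-1 with a^x mod m == p, or None."""
    for x in range(1, m):
        if pow(a, x, m) == p:    # the easy direction: modular exponentiation
            return x
    return None


print(discrete_log_by_trial(7, 11, 13))   # 5
```

For a modulus of several hundred digits the same loop would need more steps than could ever be carried out, whereas computing A^x mod M for a known x remains fast.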
Cryptanalysis is concerned with cracking codes and breaking cryptographic protocols. The best way to test an algorithm or protocol is to try to crack it and then correct any weaknesses. This is why encryption algorithms should be made available for public scrutiny. For example, DES has been exposed to public scrutiny for many years, and is still resisting attack. It is therefore a well-trusted algorithm. It is a basic principle of cryptology that the security of an algorithm must not rely on its secrecy, because sooner or later it will be discovered and any weaknesses exploited. Attacks are classified into the following categories:
Ciphertext Only
The attacker obtains a sample of ciphertext and tries to determine the plaintext and/or the key.
Known Plaintext
The cryptanalyst obtains a sample of ciphertext and the corresponding plaintext. The objective is to determine the key.
Chosen Plaintext
The cryptanalyst is able to choose some plaintext and obtain the corresponding ciphertext. The objective is to find the key. A brute-force way to do this is to try all the possible keys until the right one is found. This is known as exhaustive key search, and could be a practical attack as machines become faster, particularly if a large number of machines can be used in parallel.
Adaptive Chosen Plaintext
This is a special case of the chosen plaintext attack in which the cryptanalyst can select plaintext samples dynamically and modify his choices in the light of previous encryptions.
Chosen Ciphertext
The cryptanalyst may choose the ciphertext and try to find the corresponding plaintext.
Adaptive Chosen Ciphertext
This is the adaptive version of the chosen ciphertext attack.
CHAPTER 6: THE RSA ALGORITHM
RSA is a public-key algorithm offering both encryption and digital signatures. The algorithm is named after its inventors, Rivest, Shamir, and Adleman. RSA uses two different but related keys for encryption and decryption.
6.1
Generating a Key Pair
Choose two large prime numbers p and q.
Compute N, the product of p and q. N is known as the modulus.
Choose a number e, relatively prime to (p-1)(q-1) and less than N.
Compute a number d such that ed = 1 mod (p-1)(q-1).
The number e is called the public exponent and the number d is called the private exponent. The public key is the pair (N,e) and the private key is the pair (N,d). Given the public key it is possible to derive the private key, but to do this we need to factorize N to find p and q, and this is believed to be an intractable problem for sufficiently large N. A quick method of factorizing large numbers would undermine the security of RSA.
6.2
Encryption
To encrypt a message M we simply perform a modular exponentiation to give the ciphertext C thus:
C = M^e mod N
Notice that M must be less than N. Also notice that this procedure is pretty useless if M^e turns out to be less than N. For that reason it is usual to ensure that M contains sufficient padding to ensure that M^2 is greater than N. Choose padding some of which is fixed and some random.
6.3
Decryption
The plaintext M is recovered from the ciphertext C by using d instead of e in the modular exponentiation:
M = C^d mod N
Note that if padding is used as recommended for encryption, this can be checked to determine whether the decryption has worked properly.
To sign a message M, perform a modular exponentiation using the private exponent d:
S = M^d mod N
As with encryption it is as well to pad M, though in general there is unlikely to be a problem (unless M happens to be 0 or 1) because d is usually of the same order of magnitude as N and consequently the chance that M^d is less than N is remote. Note that the only person who can sign M is the owner of the private exponent d. Anyone with the public exponent e can verify the signature by computing S^e mod N and recovering M.
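Signing and verification can be tried out with the same toy key that the guide uses in its simple example (N = 187, e = 13, d = 37). The value 42 below is an arbitrary stand-in for padded signature data and is an assumption made for this sketch. Numbers this small offer no security at all.

```python
N, e, d = 187, 13, 37        # toy key: N = 11 * 17, and e * d = 481 = 3 * 160 + 1

message = 42                 # stand-in for the (padded) data to be signed

# Signing uses the private exponent d.
signature = pow(message, d, N)

# Verification uses only the public exponent e and recovers the message.
recovered = pow(signature, e, N)
print(recovered == message)          # True

# A verifier holding an altered message sees a mismatch.
print(recovered == message + 1)      # False
```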
6.5
Speed of RSA
Because of the large number of arithmetic operations in modular exponentiation, and the huge numbers involved, RSA is very slow compared with a block cipher. To improve performance when encrypting or verifying, it is usual to choose e to be small and with just two bits set to 1 (common values are 3 and 65537). As a rough guide to software performance, and assuming a small public exponent, expect decryption to take about 10 times as long as encryption and key generation to take about 100 times as long as encryption. In general it is advisable to minimize the use of RSA by just using it for digital signatures and secret key transport (digital envelopes). Bulk encryption should be done with a symmetric cipher such as DES.
6.6
Breaking RSA
RSA is broken if you can determine the private exponent. To do this you need to factorize the modulus N, which is very difficult if N is chosen large enough with a sensible choice of p and q. Another possibility is to compute modular e-th roots of the ciphertext, but this does not appear to be a feasible method, particularly if the original message is padded to be of the same order of magnitude as N. Guessing plaintext is possible in some circumstances. Include some random padding in the message before encryption to defeat this attack.
6.7
Simple Example
Choose two prime numbers 11 and 17. The modulus is then 11 × 17 = 187. We can then choose e as 13 and compute d as 37. To encrypt 3, we have (3^13 mod 187) = 148. To decrypt 148, we have (148^37 mod 187) = 3. If you try this on a calculator you will find that the repeated multiplications quickly overflow the memory. To avoid this, divide by 187 after every multiplication and keep the remainder. This is precisely how the computation is done on a computer.
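The worked example can be reproduced in a few lines. The sketch reduces modulo N after every multiplication, as suggested above, and additionally uses the common square-and-multiply shortcut so that the loop runs once per bit of the exponent. That shortcut and the function name are choices made for this illustration, not something the guide prescribes.

```python
def modexp(base: int, exponent: int, modulus: int) -> int:
    """Compute base^exponent mod modulus, reducing after every multiplication."""
    result = 1
    base %= modulus
    while exponent > 0:
        if exponent & 1:                       # this bit of the exponent is set
            result = (result * base) % modulus
        base = (base * base) % modulus         # square for the next bit
        exponent >>= 1
    return result


p, q = 11, 17
N = p * q                        # 187, the modulus
phi = (p - 1) * (q - 1)          # 160
e = 13                           # public exponent, relatively prime to 160
# Private exponent: the value d with e * d = 1 mod (p-1)(q-1).
d = next(k for k in range(1, phi) if (e * k) % phi == 1)

ciphertext = modexp(3, e, N)
plaintext = modexp(ciphertext, d, N)
print(d, ciphertext, plaintext)   # 37 148 3
```

No intermediate value ever exceeds 186 × 186, so the calculation fits comfortably in ordinary integer arithmetic.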
6.8
Key Size
The size of an RSA key refers to the size of the modulus N expressed in bits. The primes p and q should be roughly the same length in order to make N harder to factorize than if one prime is much smaller than the other. Thus, for a modulus of 1024 bits, each prime should have a length of about 512 bits. The larger you choose your modulus, the greater the security but the slower the performance. For most purposes a key size of 1024 bits is perfectly adequate. A certification authority might be well advised to choose 2048 bits for its root key. However, 2048 bits will probably be a little too slow for most users, and certainly the time taken to generate a 2048-bit key is substantial. As technology moves on, it is advisable to increase your key length every few years. However, this does not help with data already encrypted with the shorter keys, so the key length should be chosen to protect the data for as long as that data is sensitive or valuable. Unfortunately, some data gets more sensitive with age, so re-encrypting on change of key might be advisable. These considerations fall into the category of risk assessment and are outside the scope of these notes.
6.9
In practice, a message to be signed is usually much longer than the modulus of the signing key. Therefore the message is hashed to a fixed length by use of an algorithm such as SHA-1 or MD5, and the result is padded out to a length approximating to that of the modulus. Typically the data to be signed looks something like this:
Header byte
Padding bytes
Message Digest (or hash)
Trailer byte
When generating RSA keys it is usual to choose the two primes so that the modulus always has its high order bit set to 1. In the signature data it is usual to choose a header byte with high order bit 0 to ensure that the signature data is less than the modulus. If we take an example where the modulus is 1024 bits (128 bytes) and we have used SHA-1 to create the message digest (20 bytes), we have to provide 106 bytes of padding. It is not uncommon to include other useful data in the padding, such as a time stamp. However, whatever is placed in the signature data needs to be available to the system that will have to verify the signature.
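The layout and the byte count can be checked with a short sketch. The guide does not give actual header, padding or trailer values, so the bytes 0x6A, 0xBB and 0xBC used here are purely hypothetical placeholders, as is the function name. The only points being illustrated are the 1 + 106 + 20 + 1 = 128 byte arithmetic and the header byte having its high order bit clear.

```python
import hashlib

HEADER = b"\x6a"     # hypothetical value; high order bit is 0
FILLER = b"\xbb"     # hypothetical padding byte
TRAILER = b"\xbc"    # hypothetical value


def build_signature_data(message: bytes, modulus_bytes: int = 128) -> bytes:
    """Lay out header | padding | SHA-1 digest | trailer to fill the modulus length."""
    digest = hashlib.sha1(message).digest()               # 20 bytes
    padding_len = modulus_bytes - len(HEADER) - len(digest) - len(TRAILER)
    return HEADER + FILLER * padding_len + digest + TRAILER


block = build_signature_data(b"message to be signed")
print(len(block))                  # 128
print(len(block) - 1 - 20 - 1)     # 106 bytes of padding
# Header high bit is 0, so the value is below any 1024-bit modulus whose
# high order bit is 1.
print(int.from_bytes(block, "big") < 2 ** 1023)   # True
```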
7.1
Triple DES
There has always been argument about whether a 56-bit DES key provides sufficient protection. Longer keys are possible if triple-DES is used. This means that each data block is processed three times. If we have three keys k1, k2, k3 (effectively a 168-bit key) the encryption process on a block P is defined as follows: encrypt with k1, decrypt with k2, then encrypt with k3, that is
C = e_k3(d_k2(e_k1(P)))
where e_k and d_k denote DES encryption and decryption under key k.
If you want to use a double length key (112 bits), set k3 equal to k1. Note that setting all three keys equal is equivalent to using a single 56-bit key. Thus it is possible to communicate with systems that only support single DES.
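The compatibility property (three equal keys behave like a single key) follows from the usual encrypt-decrypt-encrypt arrangement of triple DES, and can be demonstrated without DES itself. The sketch below assumes that arrangement and substitutes a toy 64-bit Feistel cipher built from SHA-256, invented for this illustration only. The function names are likewise assumptions.

```python
import hashlib


def _round(key: bytes, rnd: int, half: bytes) -> bytes:
    """Toy round function (assumption): 4 bytes of SHA-256 output."""
    return hashlib.sha256(key + bytes([rnd]) + half).digest()[:4]


def _xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def encrypt(key: bytes, block: bytes) -> bytes:
    left, right = block[:4], block[4:]
    for rnd in range(4):
        left, right = right, _xor(left, _round(key, rnd, right))
    return left + right


def decrypt(key: bytes, block: bytes) -> bytes:
    left, right = block[:4], block[4:]
    for rnd in reversed(range(4)):
        left, right = _xor(right, _round(key, rnd, left)), left
    return left + right


def triple_encrypt(k1: bytes, k2: bytes, k3: bytes, block: bytes) -> bytes:
    # Encrypt with k1, decrypt with k2, encrypt with k3.
    return encrypt(k3, decrypt(k2, encrypt(k1, block)))


def triple_decrypt(k1: bytes, k2: bytes, k3: bytes, block: bytes) -> bytes:
    # Undo the three steps in the reverse order.
    return decrypt(k1, encrypt(k2, decrypt(k3, block)))


k1, k2, k3 = b"key-one!", b"key-two!", b"key-333!"
block = b"8 bytes!"

# Triple-length key: round trip works.
print(triple_decrypt(k1, k2, k3, triple_encrypt(k1, k2, k3, block)) == block)   # True
# Double-length key: k3 is set equal to k1.
print(triple_decrypt(k1, k2, k1, triple_encrypt(k1, k2, k1, block)) == block)   # True
# All keys equal: the middle decryption cancels the first encryption.
print(triple_encrypt(k1, k1, k1, block) == encrypt(k1, block))                  # True
```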
7.2
Restricting DES Key Usage
It is a principle of modern cryptography that keys should only be used for one purpose. This gives rise to the idea of key type. For example, a KEK (key encrypting key) should only be used for encrypting keys, and not data. Data keys can be for encryption, decryption or both. MAC keys can be used for creating or verifying MACs (or both). In fact, when you start to consider all the things keys can be used for, you end up with an impressive list of different key types. In practice, key values are embedded in structures called key tokens along with information on key usage. The key tokens are themselves enciphered under a master key, and all cryptographic operations are carried out in tamper resistant hardware to prevent misuse of keys. If you don't have hardware this level of protection can be
7.3
Using DES
In general the following rules should be observed when using DES. There may be circumstances when it is necessary to break one or more of these rules, but they should be regarded as sound general guidelines:
Do not use single-length keys. Use 112-bit keys or (for preference) 168-bit keys.
Only use ECB mode for encrypting keys. Use CBC mode for encrypting data.
Change working keys frequently. Ideally use a different key for each message. Use digital envelopes to share keys.
Change key encrypting keys regularly, using an RSA digital envelope or the Diffie-Hellman protocol.
If possible, use tamper-resistant hardware to protect your high-level keys.
Consider keeping your high-level keys on diskette or some other token rather than storing them on your computer where they may be vulnerable to attack. The keys would only be loaded into the machine when needed.
7.4
Some Examples
If you are implementing DES it is a good idea to have a few test cases so that you can verify your implementation. To check the processing as opposed to the creation of sub-keys, here are some examples with key value 0000000000000000 (incidentally, the zero key is one of four weak keys which yield the same result for both encryption and decryption):
Plaintext             Ciphertext
0000000000000000 ---- 8CA64DE9C1B123A7
0123456789ABCDEF ---- 617B3A0CE8F07100
FEDCBA9876543210 ---- 9231F236FF9AA95C
Here are some examples with key value 0123456789ABCDEF:
Plaintext             Ciphertext
0000000000000000 ---- D5D44FF720683D0D
0123456789ABCDEF ---- 56CC09E7CFDC4CEF
FEDCBA9876543210 ---- 12C626AF058B433B
7.5
Key Check Values
To compute a Key Check Value (KCV), encrypt a zero block and retain the first 3 bytes of the result (this is a VISA standard). If a sender and receiver produce the same KCV, they can be sure (with a high degree of probability) that they share the same key. In section 7.4 it is seen that the KCV of 0123456789ABCDEF is D5D44F. Some systems use more than the recommended 3 bytes but this can increase the possibility of determining the key by exhaustive search.
User_1 generates a secret random number x.
User_2 generates a secret random number y.
User_1 sends A^x mod N to User_2.
User_2 sends A^y mod N to User_1.
User_1 computes (A^y mod N)^x mod N, which simplifies to A^(xy) mod N.
User_2 computes (A^x mod N)^y mod N, which simplifies to A^(xy) mod N.
Both users now share a common number from which they can establish secret keys.
An eavesdropper intercepting the numbers sent between the two users needs to recover x or y to obtain any useful information. To do this he has to solve the discrete logarithm problem, and he cannot do this if the numbers A and N are sufficiently large. The protocol can be undermined by a man-in-the-middle attack. The enemy intercepts the numbers sent between the two users and substitutes a number A^z mod N of his own. At the end of the exchange, the enemy shares one key with User_1 and another key with User_2 and can therefore intercept and read any traffic passing between the two legitimate users. A solution to this problem is to apply digital signatures to the exchanges. Since the man in the middle cannot forge the signatures, he cannot undermine the system. On the other hand, if the two users are already in possession of RSA key pairs they could use digital envelopes to establish their keys instead of the Diffie-Hellman protocol.
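The exchange described above can be sketched with Python's built-in modular exponentiation. This is a minimal illustration, not a hardened implementation: the function names are invented here, the values of A and N in the demo are assumptions chosen for convenience (N is the Mersenne prime 2^127 - 1, which is far too small and too special for real use), and no signatures are applied, so the sketch is open to the man-in-the-middle attack just described.

```python
import secrets


def public_value(A, N, secret):
    """Value a user sends to the other party: A^secret mod N."""
    return pow(A, secret, N)


def shared_value(received, N, secret):
    """Common number computed from the other party's public value."""
    return pow(received, secret, N)


def run_exchange(A, N):
    """Simulate both users and return the number each one ends up with."""
    x = secrets.randbelow(N - 2) + 1   # User_1's secret, in 1..N-2
    y = secrets.randbelow(N - 2) + 1   # User_2's secret, in 1..N-2
    sent_by_1 = public_value(A, N, x)  # A^x mod N
    sent_by_2 = public_value(A, N, y)  # A^y mod N
    key_1 = shared_value(sent_by_2, N, x)  # (A^y)^x mod N
    key_2 = shared_value(sent_by_1, N, y)  # (A^x)^y mod N
    return key_1, key_2


if __name__ == "__main__":
    N = 2**127 - 1  # illustrative prime modulus only
    A = 3           # illustrative base only
    key_1, key_2 = run_exchange(A, N)
    print(key_1 == key_2)  # both users hold A^(xy) mod N
```

With the small textbook values A = 5, N = 23, x = 6 and y = 15, the values sent are 8 and 19 and both sides arrive at the shared number 2.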
9.1
SHA-1
This is a revision of the SHA (secure hash algorithm) described in FIPS 180. The algorithm can process messages of less than 2^64 bits in length and produces a 20-byte message digest. Here is an example: abc (616263) hashes to A9993E364706816ABA3E25717850C26C9CD0D89D.
9.2
MD5
Ronald Rivest developed the MD5 algorithm in 1991. It predates SHA-1 and supersedes his earlier hash algorithms MD2 and MD4. MD5 processes a message of arbitrary length to produce a 128-bit (16 byte) message digest. Here is an example: abc (616263) hashes to 900150983CD24FB0D6963F7D28E17F72.
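The two example digests quoted for abc can be reproduced with Python's standard hashlib module, which is a convenient way to check an implementation of either algorithm. The helper names below are invented for this sketch.

```python
import hashlib


def sha1_hex(message: bytes) -> str:
    """Return the SHA-1 digest of message as upper-case hex (20 bytes)."""
    return hashlib.sha1(message).hexdigest().upper()


def md5_hex(message: bytes) -> str:
    """Return the MD5 digest of message as upper-case hex (16 bytes)."""
    return hashlib.md5(message).hexdigest().upper()


if __name__ == "__main__":
    message = b"abc"           # the three bytes 61 62 63
    print(sha1_hex(message))   # compare with the SHA-1 example above
    print(md5_hex(message))    # compare with the MD5 example above
```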
9.3
HMAC
HMAC is a keyed hash function and can be used in conjunction with any hash function (the obvious examples being MD5 and SHA-1). HMAC operates on a Key and a Message as follows:
If Key is longer than 64 bytes, replace it with its hash value.
Pad Key out to 64 bytes with binary zeros.
Create Key1 by exclusive-OR of Key with 64 bytes of value hex 36 (decimal 54).
Create Key2 by exclusive-OR of Key with 64 bytes of value hex 5C (decimal 92).
Append Message to Key1 and hash it to give Hash1.
Append Hash1 to Key2 and hash it to give the final result.
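The steps above translate almost line for line into code. The sketch below follows them using hashlib and, as a check, compares the result with Python's own hmac module. The function name `hmac_sketch` is invented for this example, and the 64-byte figure is assumed to apply because it is the block size of MD5 and SHA-1; other hash functions may use a different block size.

```python
import hashlib
import hmac

BLOCK_SIZE = 64  # bytes; the block size of MD5 and SHA-1


def hmac_sketch(key: bytes, message: bytes, hash_name: str = "sha1") -> bytes:
    """Compute HMAC step by step, following the description in the text."""
    # If Key is longer than 64 bytes, replace it with its hash value.
    if len(key) > BLOCK_SIZE:
        key = hashlib.new(hash_name, key).digest()
    # Pad Key out to 64 bytes with binary zeros.
    key = key.ljust(BLOCK_SIZE, b"\x00")
    # Key1 = Key XOR 0x36..., Key2 = Key XOR 0x5C...
    key1 = bytes(b ^ 0x36 for b in key)
    key2 = bytes(b ^ 0x5C for b in key)
    # Hash1 = H(Key1 || Message); result = H(Key2 || Hash1)
    hash1 = hashlib.new(hash_name, key1 + message).digest()
    return hashlib.new(hash_name, key2 + hash1).digest()


if __name__ == "__main__":
    key, message = b"secret key", b"abc"
    ours = hmac_sketch(key, message, "sha1")
    reference = hmac.new(key, message, "sha1").digest()
    print(ours == reference)  # the step-by-step result matches the library
```

In practice the library routine is the better choice; the manual version is only useful for seeing how the two keyed passes fit together.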
CHAPTER 10:
KEY MANAGEMENT
Key management encompasses the following:
Key generation
Key distribution
Key storage
It is important that key management be carried out securely since most practical attacks on cryptographic systems are aimed at the key management function.
10.1 Key Generation
Whatever type of cryptosystem is used, a good source of random numbers is needed for key generation. The random numbers used for this purpose must be unpredictable by potential enemies. The best source of random numbers is a white noise generator such as a noisy diode, but failing that one can base numbers on keyboard activity, mouse movement and other unpredictable events. Whatever the underlying mechanism, the results should be hashed or encrypted before being used to create keys. Satisfying statistical tests for randomness does not guarantee that a sequence of pseudo random numbers is a suitable basis for deriving keys, since the sequence may nevertheless be predictable.
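The advice to hash raw random material before using it as a key can be sketched as follows. Several things here are assumptions made for illustration: the operating system's generator (os.urandom) stands in for a hardware noise source, SHA-256 is used as the hash simply because it is available in the standard library, and the function names and the 16-byte (double-length) key size are choices made for this example.

```python
import hashlib
import os


def derive_key(raw_samples: bytes, key_length: int = 16) -> bytes:
    """Hash raw random samples and keep the first key_length bytes as the key."""
    if not 1 <= key_length <= 32:
        raise ValueError("key_length must be between 1 and 32 bytes")
    return hashlib.sha256(raw_samples).digest()[:key_length]


def generate_key(key_length: int = 16) -> bytes:
    """Collect raw samples (os.urandom as a stand-in source) and derive a key."""
    raw_samples = os.urandom(64)  # assumption: replaces a noisy-diode source
    return derive_key(raw_samples, key_length)


if __name__ == "__main__":
    key = generate_key()
    print(key.hex().upper())  # a fresh 16-byte key, different on every run
```

Hashing does not add unpredictability; it only spreads whatever unpredictability the samples contain evenly across the key bits, which is why the quality of the underlying source still matters.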
10.2 Key Distribution
In a symmetric key cryptosystem, high-level keys such as KEKs (Key-Encrypting-Keys) and KTKs (Key-Transport-Keys) are distributed by splitting them into several components, each component entrusted to a key-holder. The idea is that no single key-holder has enough information to deduce any part of the key itself. Of course, if all the key-holders decide to get together with a view to compromising the system, this method fails, so it is as well to appoint key-holders with no real understanding of cryptographic techniques. In general, key components are loaded individually into tamper-resistant hardware and combined by exclusive-OR. Some hardware devices require a key check value to be input with each component to ensure that the key is properly loaded. Once a KEK or KTK has been successfully established, other (lower-level) keys can be distributed encrypted under the KEK or KTK as appropriate. KEKs are used to distribute data keys, and KTKs are used to distribute KEKs. Systems that support KTKs are said to have a 3-level key hierarchy, while those that support KEKs only are said to be 2-level. If public-key techniques are available, the manual process of combining key parts may be replaced by a public-key protocol. Of course, this may not always be possible because you may be trying to share a key with a user who does not have the required technology.
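Splitting a key into components and recombining them by exclusive-OR can be sketched as below. The function names are invented for this illustration, and the sketch ignores the tamper-resistant hardware and key check values that a real system would involve. All components except the last are random, and the last is chosen so that the exclusive-OR of all of them equals the key, so any incomplete set of components reveals nothing about the key.

```python
import secrets


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Exclusive-OR two equal-length byte strings."""
    if len(a) != len(b):
        raise ValueError("inputs must have the same length")
    return bytes(x ^ y for x, y in zip(a, b))


def split_key(key: bytes, holders: int) -> list:
    """Split key into one component per key-holder."""
    if holders < 2:
        raise ValueError("at least two key-holders are needed")
    components = [secrets.token_bytes(len(key)) for _ in range(holders - 1)]
    last = key
    for component in components:
        last = xor_bytes(last, component)
    return components + [last]


def combine_components(components: list) -> bytes:
    """Recover the key by exclusive-OR of all components."""
    key = bytes(len(components[0]))  # start from an all-zero block
    for component in components:
        key = xor_bytes(key, component)
    return key


if __name__ == "__main__":
    kek = bytes.fromhex("0123456789ABCDEFFEDCBA9876543210")
    parts = split_key(kek, 3)
    print(combine_components(parts) == kek)  # all three parts recover the key
```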
In a public-key cryptosystem, users must be able to look up other people's public keys and to publish their own public key. Moreover users must be assured that the public keys they access are legitimate, because otherwise an enemy could change keys in a public directory, or pretend to be another user. To assure integrity of public key directories, certificates issued by a Certification Authority (CA) are used. An intending user takes his public key to the CA and proves that he is who he says he is. The CA then embeds the user's public key in an electronic document (the certificate) and signs it with the CA private key. All members of the club have access to the CA's public key and can therefore verify the digital signature on the certificate. This assures a user accessing the public key that the CA is satisfied that it is genuine. Generally, certificates include an expiry date, beyond which it is considered that the public key should not be trusted. If a user's private key is compromised, other users must be informed. For this purpose there is a mechanism known as a Certificate Revocation List (CRL). The administrative details can become somewhat messy.
10.3 Key Storage
The only place a clear key should be stored is in a tamper-resistant hardware device. Examples of such devices are HSMs (Host Security Modules), PCSMs (PC Security Modules), smart cards, and authentication devices such as Watchword and SecurID. Otherwise, keys are held encrypted under higher-level keys such as Key-Encrypting-Keys or a Local Master Key (LMK). The Local Master Key (or Host Master Key) is a clear key held in a tamper-resistant hardware device, and it ultimately protects all the keys in the system. The LMK is never distributed and must never leave its tamper-resistant environment. Some systems contain several LMKs or variations on a base LMK to implement key usage restrictions (one LMK for encryption, one for authentication, and so on). If you do not have tamper-resistant hardware, but rely on software only, you should nevertheless implement a key hierarchy based on the LMK concept: it provides a layer of security and eases migration to hardware should you ever wish to do so. Furthermore it provides a level of compatibility with other users.
10.4 A Key Distribution Centre Protocol
This is a commonly used protocol that allows two users to establish a shared key using a trusted third party (the Key Distribution Centre). User_A and User_B have previously shared KTKs with the Key Distribution Centre (KDC). Let us denote these keys by KA and KB. The protocol is:
User_A generates key KX and encrypts it with KA. He sends eKA(KX) to the KDC.
The KDC retrieves KX and re-enciphers it with KB. eKB(KX) is sent to
There is no single accepted definition of a PKI (public key infrastructure), but loosely speaking it is a collection of services, standards and protocols for supporting public key applications. Among the services a PKI can be expected to provide is the management of public keys, via the use of the following components:
Registration Authority (RA): registers the details of a new user of the PKI.
Certification Authority (CA): issues and/or cancels certificates for user public keys.
Verification Authority (VA): determines whether a certificate is valid and if so for what purpose.
It is unlikely that there will ever be a single global PKI. It is much more likely that there will be multiple independent PKIs and that these will inter-operate according to agreed standards. At present the standards allow a wide scope for interpretation, so the problem of achieving full inter-operability is severe and is unlikely to be solved in the near future.
11.1 Shared Private Keys
Users who share a private key can impersonate one another, so in general private keys should not be shared among users. However, some large organizations need to share private keys among several secure modules (for resilience or performance), so in some circumstances it is necessary to get the private key out of one tamper-resistant environment and load it into another. By definition, this can be problematical. In RSA, each person should have a unique private key, but the public exponent can be common to a group of users without loss of security. An example of this is in EMV (Europay, Mastercard and Visa) where the public exponent has been fixed as 3 (there was some argument over this because one of the members wanted to use 2, which is theoretically possible but practically not a very good idea).
11.2 Key Expiry
The longer a key is in use, the more chance there is of it being compromised. Therefore every key should have an expiry date after which it is no longer valid. The time to expiration must be shorter than the likely time for cryptanalysis. The key must be long enough to make the chance of cryptanalysis before the expiry date negligible. The expiry date may also depend on the key usage and the value of the information it protects. On expiry a new key should be chosen and the old key destroyed (after re-enciphering the information if appropriate). In general the new key should be longer
11.3 Loss or Compromise of Private Key
If a private key is lost, anything previously signed with the lost key is still valid. The CA should be notified so that the key can be revoked and added to a certificate revocation list to prevent illegitimate use. Key loss can occur due to hardware malfunction or loss of a security token (a smart card, for example). It is necessary to obtain a new key and distribute it immediately to minimize the number of messages created which can no longer be read. If the key is compromised as opposed to lost, steps should be taken to store the new key more securely to minimize the possibility of future key compromise.
11.4 Key Recovery
The subject of key recovery has been associated with the controversial issue of key escrow, which basically means you have to trust a "big brother" authority with your keys. The arguments for and against are beyond the scope of this introduction, but clearly some mechanism for recovering lost keys could be advantageous. A possible technique is to encipher all keys under the public key of a recovery authority. Note that the owner of the keys cannot recover them himself, but if need be they can be recovered by the recovery authority using its private key.
Personal Identification Numbers (PINs) are numbers between 4 and 12 digits long, used for customer identification. The most obvious and familiar example of PIN usage is obtaining money from an ATM (Automatic Teller Machine). PINs occupy a strange and rather eccentric position in cryptography, lying somewhere between data and keys. Like keys, once a PIN has entered the system it is encrypted and never appears in clear again, though it may be re-enciphered and possibly reformatted as it progresses through the banking system. A PIN may be selected by the user, assigned by the financial institution, or derived from the user's account data. To derive a PIN from account data, the procedure (in broad terms) is as follows:
Compress the account data into 8 bytes (each digit occupying 4 bits).
Encrypt the 8-byte block with a key, known as a PIN generating key.
Decimalize the result (convert the digits A-F to numeric digits).
Extract the required number of PIN digits from the decimalized result.
A variety of standards exist for the decimalization and extraction of the PIN digits. Once it enters the system a PIN is always held in encrypted form. Each PIN digit occupies 4 bits, so each byte can contain two PIN digits. Clearly, a PIN always occupies less than the full 8 bytes to be encrypted. Therefore the encrypted PIN is always held in an 8-byte PIN block. There are several standard PIN block formats:
ANSI X9.8, which is also known as ISO-0, VISA-1, VISA-4, and ECI-1
ISO-1, which is also known as ECI-4
ISO-2
VISA-2
VISA-3
IBM-4700
IBM-3624
IBM-3621
ECI-2
ECI-3
ANSI X9.8 incorporates the PIN length, padding and part of the account number. This is the preferred format for maximum security.
ISO-1 and ISO-2 incorporate the PIN length and random padding.
VISA-2 incorporates the PIN length (restricted to a maximum of 6) and uses decimal padding.
VISA-3 uses F to delimit the PIN. The rest of the block is filled with a fixed pad character.
IBM-4700 incorporates PIN length, padding and a sequence number.
IBM-3624 contains the PIN, delimited by a pad character that fills the rest of the block.
IBM-3621 contains a sequence number, the PIN, and a delimiter used to fill the rest of the block.
ECI-2 restricts the PIN to just 4 digits. The rest of the block is filled with random digits.
ECI-3 contains the PIN length (maximum 6) and a zero delimiter. The rest of the block is filled with random characters.
PIN blocks are always encrypted. Keys used for this purpose are known as PIN encrypting keys. As a PIN progresses through the banking system it may be re-enciphered under different PIN encrypting keys, and converted to different PIN block formats. A common method of user identification (particularly in telephone banking) is to ask the user for two of his PIN digits. However, since the PIN is never revealed, methods are needed to compare the user's response with the stored PIN and return a success/fail condition. Good cryptographic toolkits can be expected to provide many APIs for encrypting and formatting PIN blocks, and for the selection and verification of PIN digits.
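Two of the steps described in this chapter can be sketched without any encryption: decimalizing an encrypted block to extract PIN digits, and laying out a clear PIN block in the ANSI X9.8 (ISO-0) format before it is encrypted. The function names are invented for this illustration. The decimalization table 0123456789012345 is only one common choice (as noted above, several standards exist), and the sample encrypted block is simply a ciphertext value borrowed from the DES examples in section 7.4, not real account data. The ISO-0 layout used is: a control digit 0, the PIN length, the PIN, and F padding, combined by exclusive-OR with four zeros followed by the rightmost 12 account digits excluding the check digit.

```python
DECIMALIZATION_TABLE = "0123456789012345"  # assumed table: A-F map to 0-5


def decimalize(hex_block: str, table: str = DECIMALIZATION_TABLE) -> str:
    """Replace every hex digit of hex_block by its entry in the table."""
    return "".join(table[int(digit, 16)] for digit in hex_block)


def derive_pin_digits(encrypted_hex: str, pin_length: int = 4) -> str:
    """Decimalize an (already encrypted) block and keep the leading digits."""
    return decimalize(encrypted_hex)[:pin_length]


def iso0_pin_block(pin: str, account_number: str) -> str:
    """Build the clear ISO-0 PIN block as 16 hex digits (8 bytes)."""
    if not (pin.isdigit() and 4 <= len(pin) <= 12):
        raise ValueError("PIN must be 4 to 12 decimal digits")
    # PIN field: control digit 0, PIN length, PIN digits, then F padding.
    pin_field = f"0{len(pin):X}{pin}".ljust(16, "F")
    # Account field: 0000 plus the rightmost 12 digits, check digit dropped.
    account_field = "0000" + account_number[:-1][-12:].zfill(12)
    return f"{int(pin_field, 16) ^ int(account_field, 16):016X}"


if __name__ == "__main__":
    sample_block = "D5D44FF720683D0D"       # sample ciphertext, section 7.4
    print(decimalize(sample_block))         # 3534455720683303
    print(derive_pin_digits(sample_block))  # 3534
    print(iso0_pin_block("1234", "43219876543210987"))  # 0412AC89ABCDEF67
```

The resulting 8-byte block is what would then be encrypted under a PIN encrypting key; that step is omitted here.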
ANSI X9.9 is a US banking standard for authenticating financial transactions. The algorithm specified is based on DES in CBC mode with the Initial Chaining Value set to zero and the message padded out to a multiple of 8 bytes by appending binary zeros if necessary. The MAC is extracted from the final cipher-text block. ANSI X9.19 is a slightly extended version of X9.9. The equivalent ISO standards do not limit themselves to DES but allow the use of other MACs and block ciphers.
ANSI X9.17 is the Financial Institution Key Management standard, which defines protocols to be used by financial institutions to transfer secret keys using symmetric techniques. Financial institutions change encryption keys frequently and this precludes use of manual key transfer methods. A three-level key hierarchy is defined:
The Master Key (KKM). This is always manually distributed (typically in component form).
Key-Encrypting Keys (KEKs). These are distributed on-line enciphered under the KKM.
Low-level data keys (KDs). These are also distributed on-line enciphered under KEKs.
Data keys are used for encryption and authentication and are changed on a per-session or per-day basis. A limitation of ANSI X9.17 is the difficulty of communicating in a large system since each pair of communicating terminals must share a common key. ANSI X9.28 was developed to allow distribution of keys between terminal systems that lack a common key center.
ANSI X9.30 is the US standard for digital signatures based on the Digital Signature Algorithm (DSA), and ANSI X9.31 is the equivalent standard based on RSA.
X.509 specifies the authentication service for X.500 directories, as well as the X.509 certificate syntax. Version 3 (1995) addresses some of the security and flexibility concerns that were issues in the earlier versions. Directory authentication can be effected using secret-key or public-key methods. An X.509 v3 certificate consists of:
Version, Serial number, and Signature algorithm identifier
Issuer name, Validity period, User name, and User public key data
Issuer identifier, User identifier and Extensions
Signature on the above fields
This certificate is signed by the issuer to tie the user's name to his public key. The main difference between version 3 and earlier versions is the extensions field, which enables the certificate to contain more than just the user name and public key. As with some other standards, X.509 has in some ways tried to be too clever and intending users are well advised to browse the Internet for information on difficulties
CHAPTER 14:
LEGAL ISSUES
There are many legal and political issues associated with cryptography, among which are the following:
Government involvement
Patent and copyright issues
Import and export regulations
The rules and regulations vary from country to country and are frequently revised. It would be inappropriate to provide legal information that may soon be out of date, so if you are in any doubt about whether you may use cryptographic techniques, you should consult a lawyer.
14.1
Legal Disclaimer
The information in this document is intended as an introductory guide to cryptography. It is not intended to be complete or up to date with the latest developments and research. The authors cannot assume, and hereby disclaim, any liability to any person for any loss or damage caused by errors or omissions in the document resulting from negligence, accident or any other cause.
14.2
Cryptographic Patents
The patent for the RSA algorithm was issued on September 20, 1983, and licensed exclusively to RSA Security Inc. by the Massachusetts Institute of Technology, with an expiration date of September 20, 2000. On September 6, 2000, RSA Security made the RSA algorithm publicly available and waived its rights to enforce the RSA patent for any development activities that include the algorithm occurring after September 6, 2000. The patent for DES was assigned to IBM in 1976. IBM placed the patent in the public domain, offering royalty-free licenses conditional on adherence to the specifications of the standard. The patent expired in 1993. The patent for the Diffie-Hellman key agreement protocol was issued April 29, 1980 to Stanford University. This patent has now expired.
Glossary
Adaptive-chosen-ciphertext: A version of the chosen-ciphertext attack where the cryptanalyst can choose ciphertexts dynamically. A cryptanalyst can use this type of attack when he has access to decryption hardware, but cannot extract the decryption key from it.
Adaptive-chosen-plaintext: A special case of the chosen-plaintext attack where the attacker can choose plaintexts dynamically, and alter his choices based on the results of previous encryptions.
Adversary: Commonly used to refer to anyone wishing to compromise one's security.
Algorithm: A finite number of steps used to perform a task.
ANSI: American National Standards Institute.
API: Application Programming Interface.
Attack: An attempt at breaking part or all of a cryptosystem.
Authentication: Verifying identity, ownership or authorization.
Birthday attack: An attack used to find collisions. So called because the probability of two or more people in a group of 23 sharing the same birthday is greater than 1/2.
Bit: A binary digit, either 1 or 0.
Block: A sequence of bits of fixed length.
Block cipher: A symmetric cipher which encrypts a message a block at a time.
Block cipher based MAC: A MAC that is computed by using a block cipher.
Brute force attack: All possible values are tried until the right one is found. Exhaustive search.
Certificate: An electronic document binding items of information together, such as a user identity and his public key. Certification Authorities issue certificates.
Certificate revocation list (CRL): A list of certificates revoked before their expiration date.
Certification Authority (CA): An entity that creates certificates.
Chosen ciphertext attack: An attack where the attacker chooses the text to be decrypted.
Chosen plaintext attack: An attack where the attacker chooses the text to be encrypted.
Ciphertext: Encrypted data.
Ciphertext-only attack: The attacker has ciphertext only to work with.
Cryptanalysis: The process of breaking any form of cryptography.
Cryptography: The process of securing information by encryption.
Cryptology: The study of cryptography and cryptanalysis.
Decryption: The reverse of encryption.
DES: Data Encryption Standard.
Diffie-Hellman key exchange: A key agreement protocol allowing participants to establish a key over an insecure channel.
Digest: The output of a hash function.
Digital envelope: A key transport technique that uses a public or secret key cryptosystem to encrypt a secret key for use in a secret-key cryptosystem.
Digital signature: A message digest encrypted with a private key.
Distributed key: A key split into several parts shared among different participants.
Electronic commerce (e-commerce): Transactions carried out over the Internet.
Encryption: The transformation of plaintext into ciphertext.
Exclusive-OR: Combines bits a and b to give 1 if they are different or 0 if they are equal.
Exhaustive search: Trying every possibility until the correct value is found.
Expiration date: The date on which a certificate or key will cease to be trusted.
FIPS: Federal Information Processing Standards.