An Area Efficient Universal Cryptography Processor For Smart Cards

CHAPTER 1
INTRODUCTION
1.1 Introduction
Security is a broad topic and covers a multitude of sins. Most security problems are intentionally
caused by malicious people trying to gain some benefit or harm someone. The requirements of information
security have undergone two major changes in the last two decades. In earlier days, cabinets with a
combination lock were used for storing sensitive documents. With the introduction of the computer, the need
for automated tools for protecting files and other information became evident. This is especially important
for shared systems as well as for data networks and the Internet. The generic term for the collection of
tools designed to protect data and thwart hackers is computer security.
In today's digital world, the Internet is part of everyday activities such as banking, payments, and
other financial transactions, so the importance of network security keeps increasing. Security forms the
backbone of today's digital world.
1.1.1 Aim
To implement an area-efficient universal cryptography processor for smart cards.
1.1.2 Previous System

Data Encryption Standard (DES)
1. This is a well-established algorithm that has been used for more than two decades (since 1977) in
military and commercial data exchange and storage.
2. The algorithm is designed to encipher and decipher blocks of data consisting of 64 b using a 56-b key.
3. It uses two basic techniques of cryptography: confusion and diffusion. Confusion is achieved mainly
through S-box substitutions, and diffusion through permutations together with XOR and shift operations.
1.1.3 Present System
Advanced Encryption Standard (AES)
1. AES, also known as Rijndael, is a block encryption algorithm which encrypts blocks of 128 b using the
same secret key for both encryption and decryption.
JOGINPALLY M.N. RAO WOMENs ENGINEERING COLLEGE
2. Three versions of the algorithm are available differing only in the key generation procedure and in the
number of rounds the data is processed for a complete encryption (decryption).
3. The 128-b input data is treated as a 4×4 array of 8-b bytes (called the state in the algorithm).
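As a small illustration of item 3, the sketch below arranges a 16-byte block into the 4×4 state. It assumes the column-major layout used in the AES specification (FIPS-197), where input byte i goes to row i mod 4 and column i div 4. The function names are illustrative and are not taken from the project.

```python
def bytes_to_state(block: bytes) -> list:
    """Arrange a 16-byte block as a 4x4 AES state (column-major order)."""
    if len(block) != 16:
        raise ValueError("AES operates on 128-bit (16-byte) blocks")
    # state[r][c] = block[r + 4*c], the layout assumed from FIPS-197
    return [[block[r + 4 * c] for c in range(4)] for r in range(4)]


def state_to_bytes(state: list) -> bytes:
    """Flatten a 4x4 state back into a 16-byte block."""
    return bytes(state[r][c] for c in range(4) for r in range(4))


block = bytes(range(16))          # example input: 00 01 02 ... 0f
state = bytes_to_state(block)
for row in state:
    print(" ".join(f"{b:02x}" for b in row))
```

Each printed row holds every fourth input byte, so consecutive input bytes run down the columns of the state rather than along its rows.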
1.2 Objectives
The objective of this project is to find concurrent, structure-independent fault detection schemes that
reach reasonable fault coverage. Such schemes make the implementation of AES robust against fault attacks
and provide high efficiency with reasonable area and time overheads.
1.3 Literature Survey

Xilinx ISE
The Xilinx ISE is a design environment for FPGA products from Xilinx, and is tightly-coupled to the
architecture of such chips, and cannot be used with FPGA products from other vendors.[2] The Xilinx ISE is
primarily used for circuit synthesis and design, while the ModelSim logic simulator is used for system-level
testing. Other components shipped with the Xilinx ISE include the Embedded Development Kit (EDK), a
Software Development Kit (SDK) and ChipScope Pro.
Verilog:
Verilog HDL is an accepted IEEE standard. The original standard, IEEE 1364-1995, was approved in 1995.
IEEE 1364-2001 made significant improvements to the original standard, and a further revision was
published as IEEE 1364-2005.
1. Most popular logic synthesis tools support Verilog HDL. This makes it the language of choice for
designers.
2. Verilog HDL is a general-purpose hardware description language that is easy to learn and easy to use. It is
similar in syntax to the C programming language. Designers with C programming experience will find it
easy to learn Verilog HDL.
3. Verilog is both a behavioral and structural language.
1.4 Organization of Project

Chapter 2 explains the general theory related to the project.
Chapter 3 explains the hardware description of the project.
Chapter 4 explains the software description of the project.
Chapter 5 gives the result analysis of the project.
CHAPTER 2
GENERAL THEORY
2.1 Introduction to VLSI
Very-large-scale integration (VLSI) is the process of creating an integrated circuit (IC) by combining
thousands of transistors into a single chip. VLSI began in the 1970s when complex semiconductor and
communication technologies were being developed. The microprocessor is a VLSI device. Before the
introduction of VLSI technology, most ICs had a limited set of functions they could perform. An electronic
circuit might consist of a CPU, ROM, RAM and other glue logic. VLSI lets IC designers add all of these
into one chip.
Overview:
The first semiconductor chips held one transistor each. Subsequent advances added more and more
transistors, and as a consequence, more individual functions or systems were integrated over time. The first
integrated circuits held only a few devices, perhaps as many as ten diodes, transistors, resistors and
capacitors, making it possible to fabricate one or more logic gates on a single device. This level is now
known retrospectively as "small-scale integration" (SSI). Improvements in technique led to devices with
hundreds of logic gates, known as medium-scale integration (MSI), and then to large-scale integration (LSI),
i.e., systems with at least a thousand logic gates. Current technology has moved far past this mark, and
today's microprocessors have many millions of gates and hundreds of millions of individual transistors.
At one time, there was an effort to name and calibrate various levels of large-scale integration above
VLSI. Terms like ultra-large-scale integration (ULSI) were used. But the huge number of gates and
transistors available on common devices has rendered such fine distinctions moot. Terms suggesting greater
than VLSI levels of integration are no longer in widespread use. Even VLSI is now somewhat quaint, given
the common assumption that all microprocessors are VLSI or better.
As of early 2008, billion-transistor processors are commercially available, an example of which is
Intel's Montecito Itanium chip. This is expected to become more commonplace as semiconductor
fabrication moves from the current generation of 65 nm processes to the next 45 nm generation (while
experiencing new challenges such as increased variation across process corners). Another notable example is
NVIDIA's 280 series GPU.
This GPU is unique in that its 1.4 billion transistors, capable of a teraflop of performance, are almost
entirely dedicated to logic (Itanium's transistor count is largely due to its 24 MB L3 cache). Current
designs, as opposed to the earliest devices, use extensive design automation and automated logic synthesis
to lay out the transistors, enabling higher levels of complexity in the resulting logic functionality.
Certain high-performance logic blocks like the SRAM cell, however, are still designed by hand to ensure the
highest efficiency (sometimes by bending or breaking established design rules to obtain the last bit of
performance by trading stability).
What is VLSI?
VLSI stands for "Very Large Scale Integration". This is the field which involves packing more and
more logic devices into smaller and smaller areas.
1. Put simply, an integrated circuit is many transistors on one chip.
2. Design and manufacturing of extremely small, complex circuitry using modified semiconductor material.
3. An integrated circuit (IC) may contain millions of transistors, each a few micrometres or less in size.
4. Applications are wide ranging: most electronic logic devices.
2.2 History of Scale Integration:

1. Late 1940s: Transistor invented at Bell Labs
2. Late 1950s: First IC (JK-FF by Jack Kilby at TI)
3. Early 1960s: Small Scale Integration (SSI)
4. Late 1960s: Medium Scale Integration (MSI), 100s of transistors on a chip
5. Early 1970s: Large Scale Integration (LSI), 1000s of transistors on a chip
6. Early 1980s: VLSI, 10,000s of transistors on a chip (later 100,000s and now 1,000,000s)
7. Ultra LSI is sometimes used for 1,000,000s

The levels of integration and their approximate ranges are:
1. SSI - Small-Scale Integration (0 to 10^2)
2. MSI - Medium-Scale Integration (10^2 to 10^3)
3. LSI - Large-Scale Integration (10^3 to 10^5)
4. VLSI - Very Large-Scale Integration (10^5 to 10^7)
5. ULSI - Ultra Large-Scale Integration (>= 10^7)
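The classification above can be written as a short lookup. This is only a sketch: it reads the range bounds as powers of ten (10^2, 10^3, 10^5, 10^7), treats the count as the number of devices per chip, and makes each lower bound inclusive. All three choices are assumptions, and the function name is hypothetical.

```python
def integration_scale(device_count: int) -> str:
    """Return the integration level for a given number of devices per chip."""
    if device_count < 0:
        raise ValueError("device count cannot be negative")
    if device_count < 10**2:
        return "SSI"
    if device_count < 10**3:
        return "MSI"
    if device_count < 10**5:
        return "LSI"
    if device_count < 10**7:
        return "VLSI"
    return "ULSI"


for count in (50, 500, 20_000, 1_000_000, 1_400_000_000):
    print(count, integration_scale(count))
```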
2.3 Advantages of ICs over Discrete Components

While we will concentrate on integrated circuits, the properties of integrated circuits (what we can and
cannot efficiently put in an integrated circuit) largely determine the architecture of the entire system.
Integrated circuits improve system characteristics in several critical ways. ICs have three key advantages
over digital circuits built from discrete components:
Size:
Integrated circuits are much smaller: both transistors and wires are shrunk to micrometer sizes,
compared to the millimeter or centimeter scales of discrete components. Small size leads to advantages in
speed and power consumption, since smaller components have smaller parasitic resistances, capacitances, and
inductances.
Speed:
Signals can be switched between logic 0 and logic 1 much more quickly within a chip than they can between
chips. Communication within a chip can occur hundreds of times faster than communication between chips on
a printed circuit board. The high speed of on-chip circuits is due to their small size: smaller components
and wires have smaller parasitic capacitances to slow down the signals.
Power consumption:
Logic operations within a chip also take much less power. Once again, lower power consumption is
largely due to the small size of circuits on the chip: smaller parasitic capacitances and resistances
require less power to drive them.
VLSI and systems:
These advantages of integrated circuits translate into advantages at the system level:
Smaller physical size:
Smallness is often an advantage in itself: consider portable televisions or handheld cellular telephones.
Lower power consumption:
Replacing a handful of standard parts with a single chip reduces total power consumption. Reducing
power consumption has a ripple effect on the rest of the system: a smaller, cheaper power supply can be used;
since less power consumption means less heat, a fan may no longer be necessary.
Reduced cost:
Reducing the number of components, the power supply requirements, cabinet costs, and so on, will
inevitably reduce system cost. The ripple effect of integration is such that the cost of a system built from
custom ICs can be less, even though the individual ICs cost more than the standard parts they replace.
Understanding why integrated circuit technology has such a profound influence on the design of digital
systems requires understanding both the technology of IC manufacturing and the economics of ICs and digital
systems.
Applications:
1. Electronic systems in cars.
2. Digital electronics that control VCRs.
3. Transaction processing systems, ATMs.
4. Personal computers and workstations.
5. Medical electronic systems.
2.4 Applications of VLSI

Electronic systems now perform a wide variety of tasks in daily life. In some cases electronic systems
have replaced mechanisms that operated mechanically, hydraulically, or by other means; electronics are
usually smaller, more flexible, and easier to service. In other cases electronic systems have created totally
new applications. Electronic systems perform a variety of tasks, some of them visible and some hidden:
1. Personal entertainment systems such as portable MP3 players and DVD players perform sophisticated
algorithms with remarkably little energy.
2. Electronic systems in cars operate stereo systems and displays; they also control fuel injection systems,
adjust suspensions to varying terrain, and perform the control functions required for anti-lock
braking (ABS) systems.
3. Digital electronics compress and decompress video, even at high-definition data rates, on the fly in
consumer electronics.
4. Low-cost terminals for Web browsing still require sophisticated electronics, despite their dedicated
function.
5. Personal computers and workstations provide word processing, financial analysis, and games.
Computers include both central processing units (CPUs) and special-purpose hardware for disk access,
faster screen display, etc.
6. Medical electronic systems measure bodily functions and perform complex processing algorithms to
warn about unusual conditions. The availability of these complex systems, far from overwhelming
consumers, only creates demand for even more complex systems.
The growing sophistication of applications continually pushes the design and manufacturing of
integrated circuits and electronic systems to new levels of complexity.
Perhaps the most amazing characteristic of this collection of systems is its variety: as systems
become more complex, we build not a few general-purpose computers but an ever wider range of
special-purpose systems. Our ability to do so is a testament to our growing mastery of both integrated
circuit manufacturing and design, but the increasing demands of customers continue to test the limits of
design and manufacturing.
2.5 The main digital VLSI circuit testing problems

Because manufactured integrated circuit chips are so compact, direct physical probing of the designed
chip is not possible. Certain testing methods must therefore be followed to find the problems in a design.
Traditional methods, such as functional testing through the primary I/O ports of the circuit board, provide
only limited coverage and poor diagnostic facilities for the board under test. These methods are mainly
aimed at detecting the primary faults in the circuit board itself. The need to find errors while keeping
the cost of test equipment down has led to research into the main problems that occur while testing digital
VLSI circuits.
The major problems found so far are as follows:
Test generation problems.
The input combinatorial problems.
The gate to I/O pin ratio problems.
Test Generation Problems

Automatic test generation can take a very long computation time, sometimes weeks to months. This is
because of the large number of gates packed into a compact digital circuit. It affects the number of test
patterns and the computation cost, which is borne by the external testing equipment. The testing time
also increases for the same reason.
There is one more test generation problem. Computer algorithms can generate test patterns automatically,
but they are well suited only to combinatorial logic circuits. They are not well suited to sequential
logic circuits, which need more memory space and more difficult computation steps during evaluation.
The input combinatorial problems
For a combinatorial logic circuit with N inputs, the number of test vectors needed for exhaustive testing
is 2^N. For a larger device such as a 32-bit microprocessor, on the other hand, the number of test pattern
vectors required is not even well defined.
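The 2^N growth of exhaustive testing can be made concrete with a short sketch. It enumerates every input vector of a 3-input combinational circuit and checks the circuit against an arithmetic reference, then prints how the vector count scales. The 1-bit full adder used as the circuit under test is an illustrative choice and is not taken from the project.

```python
from itertools import product


def exhaustive_vectors(n_inputs: int):
    """Return an iterator over all 2**n_inputs input combinations (tuples of 0/1)."""
    return product((0, 1), repeat=n_inputs)


def full_adder(a: int, b: int, cin: int):
    """Gate-level model of a 1-bit full adder: returns (sum, carry_out)."""
    return a ^ b ^ cin, (a & b) | (cin & (a ^ b))


vectors = list(exhaustive_vectors(3))
print("vectors for N = 3:", len(vectors))

# Exhaustive test: compare the gate-level model with plain integer addition.
for a, b, cin in vectors:
    s, cout = full_adder(a, b, cin)
    assert a + b + cin == s + 2 * cout

# The number of vectors doubles with every extra input.
for n in (8, 16, 32, 64):
    print(f"N = {n:2d}: {2 ** n} vectors")
```

Three inputs need only 8 vectors, but a circuit with 64 inputs would already need more than 10^19, which is why exhaustive testing is limited to small combinational blocks.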
The gate to I/O pin ratio problems
The number of gates in ICs keeps growing. Because of this rapid increase, it is increasingly difficult
for the input pins of the IC to control internal signals, a property known as controllability, and for the
output pins to observe internal signals, a property known as observability. The pin count grows much
more slowly than the gate count, which degrades both controllability and observability.
Because of the problems stated above, design engineers were motivated to search for reliable testing
circuitry. An obvious solution is built-in self-test (BIST): special test circuitry inserted into the
digital VLSI circuit that is to be tested.
2.6 Introduction to Cryptography

Security is a broad topic and covers a multitude of sins. Most security problems are intentionally
caused by malicious people trying to gain some benefit or harm someone. The requirements of information
security have undergone two major changes in the last two decades. In earlier days, cabinets with a
combination lock were used for storing sensitive documents.
With the introduction of the computer, the need for automated tools for protecting files and other
information became evident. This is especially important for shared systems as well as for data networks and
the Internet. The generic term for the collection of tools designed to protect data and thwart hackers is
computer security.
Development of Cryptography
The history of cryptography begins thousands of years ago. Until recent decades, it has been the story
of what might be called classic cryptography, that is, of methods of encryption that use pen and paper, or
perhaps simple mechanical aids. In the early 20th century, the invention of complex mechanical and
electromechanical machines, such as the Enigma rotor machine, provided more sophisticated and efficient
means of encryption; and the subsequent introduction of electronics and computing has allowed elaborate
schemes of still greater complexity, most of which are entirely unsuited to pen and paper.
The development of cryptography has been paralleled by the development of cryptanalysis, the
"breaking" of codes and ciphers. The discovery and application, early on, of frequency analysis to the reading
of encrypted communications has on occasion altered the course of history. Thus the Zimmermann
Telegram triggered the United States' entry into World War I; and Allied reading of Nazi Germany's ciphers
shortened World War II, in some evaluations by as much as two years.
Until the 1970s, secure cryptography was largely the preserve of governments. Two events have since
brought it squarely into the public domain: the creation of a public encryption standard (DES), and the
invention of public-key cryptography.
Need of Cryptography
The main uses of cryptography are listed below:
1) Privacy or confidentiality
2) Data integrity
3) Authentication
4) Non-repudiation
1. Confidentiality is a service used to keep the content of information from all but those authorized to
possess it. Secrecy is a term synonymous with confidentiality and privacy. There are numerous
approaches to providing confidentiality, ranging from physical protection to mathematical algorithms
which render data unintelligible.
2. Data integrity is a service which addresses the unauthorized alteration of data. To assure data
integrity, one must have the ability to detect data manipulation by unauthorized parties. Data
manipulation includes such things as insertion, deletion, and substitution.
3. Authentication is a service related to identification. This function applies to both entities and
information itself. Two parties entering into a communication should identify each other. Information
delivered over a channel should be authenticated as to origin, date of origin, data content, time sent,
etc. For these reasons this aspect of cryptography is usually subdivided into two major classes: entity
authentication and data origin authentication. Data origin authentication implicitly provides data
integrity (for if a message is modified, the source has changed).
4. Non-repudiation is a service which prevents an entity from denying previous commitments or actions.
When disputes arise due to an entity denying that certain actions were taken, a means to resolve the
situation is necessary. For example, one entity may authorize the purchase of property by another
entity and later deny such authorization was granted. A procedure involving a trusted third party is
needed to resolve the dispute.
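Items 2 and 3 above can be illustrated with a keyed message authentication code. The sketch below uses HMAC-SHA-256 from the Python standard library as one possible mechanism; the text does not prescribe it, and the key and messages are made-up examples. A valid tag shows that the message came from a holder of the shared key and was not altered on the way.

```python
import hashlib
import hmac


def make_tag(key: bytes, message: bytes) -> bytes:
    """Compute an HMAC-SHA-256 authentication tag for the message."""
    return hmac.new(key, message, hashlib.sha256).digest()


def verify_tag(key: bytes, message: bytes, tag: bytes) -> bool:
    """Recompute the tag and compare it in constant time."""
    return hmac.compare_digest(make_tag(key, message), tag)


shared_key = b"example shared secret"      # hypothetical key known to both parties
message = b"PAY 100 TO ALICE"
tag = make_tag(shared_key, message)

print(verify_tag(shared_key, message, tag))               # True: untouched message
print(verify_tag(shared_key, b"PAY 900 TO ALICE", tag))   # False: substitution detected
```

Because both parties hold the same key, this kind of tag does not by itself give non-repudiation (item 4): either party could have produced it.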
A fundamental goal of cryptography is to adequately address these four areas in both theory and
practice. Cryptography is about the prevention and detection of cheating and other malicious activities,
and about securing sensitive information.
2.7 Basics of Cryptography

2.7.1 Encryption
In cryptography, encryption is the process of encoding messages or information in such a way that
only authorized parties can read it. Encryption does not of itself prevent interception, but denies the message
content to the interceptor. In an encryption scheme, the message or information, referred to as plaintext, is
encrypted using an encryption algorithm, generating ciphertext that can only be read if decrypted.[2] For
technical reasons, an encryption scheme usually uses a pseudo-random encryption key generated by an
algorithm. It is in principle possible to decrypt the message without possessing the key, but, for a well-designed encryption scheme, large computational resources and skill are required. An authorized recipient can
easily decrypt the message with the key provided by the originator to recipients, but not to unauthorized
interceptors.
Fig 2.1 Block diagram converting plain text into cipher text

2.7.2 Decryption
Decryption is the process of transforming data that has been rendered unreadable through encryption
back to its unencrypted form. In decryption, the system extracts and converts the garbled data and transforms
it to texts and images that are easily understandable not only by the reader but also by the system. Decryption
may be accomplished manually or automatically. It may also be performed with a set of keys or passwords.
One of the foremost reasons for implementing an encryption-decryption system is privacy. As
information travels over the World Wide Web, it becomes subject to scrutiny and access from unauthorized
individuals or organizations. As a result, data is encrypted to reduce data loss and theft. Some of the common
items that are encrypted include email messages, text files, images, user data and directories. The person in
charge of decryption receives a prompt or window in which a password may be entered to access encrypted
information.
Fig 2.2 Block diagram converting cipher text into plain text

What Is Cryptography?
Cryptography is the science of using mathematics to encrypt and decrypt data. Cryptography enables
you to store sensitive information or transmit it across insecure networks (like the Internet) so that it cannot be
read by anyone except the intended recipient.
While cryptography is the science of securing data, cryptanalysis is the science of analyzing and
breaking secure communication. Classical cryptanalysis involves an interesting combination of analytical
reasoning, application of mathematical tools, pattern finding, patience, determination, and luck. Cryptanalysts
are also called attackers. Cryptology embraces both cryptography and cryptanalysis.
A related discipline is steganography, which is the science of hiding messages rather than making
them unreadable. Steganography is not cryptography; it is a form of coding. It relies on the secrecy of the
mechanism used to hide the message. If, for example, you encode a secret message by putting each letter as
the first letter of the first word of every sentence, it's secret until someone knows to look for it, and then it
provides no security at all.
How Does Cryptography Work?
A cryptographic algorithm, or cipher, is a mathematical function used in the encryption and
decryption process. A cryptographic algorithm works in combination with a key (a word, number, or
phrase) to encrypt the plaintext. The same plaintext encrypts to different ciphertext with different keys. The
security of encrypted data is entirely dependent on two things: the strength of the cryptographic algorithm and
the secrecy of the key. A cryptographic algorithm, plus all possible keys and all the protocols that make it
work, comprise a cryptosystem. PGP is a cryptosystem.
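The role of the key can be shown with a deliberately simple toy cipher. The sketch below XORs the data with a repeating key; this is not a secure algorithm and is not the cipher used in the project. It only demonstrates that the same plaintext gives different ciphertext under different keys and that only the right key recovers it. The function name and the keys are made up for the example.

```python
from itertools import cycle


def xor_cipher(data: bytes, key: bytes) -> bytes:
    """Toy cipher: XOR each data byte with the repeating key (NOT secure)."""
    if not key:
        raise ValueError("key must not be empty")
    return bytes(d ^ k for d, k in zip(data, cycle(key)))


plaintext = b"smart card"
cipher_one = xor_cipher(plaintext, b"key-one")
cipher_two = xor_cipher(plaintext, b"key-two")

print(cipher_one.hex())
print(cipher_two.hex())                      # differs from the line above
print(xor_cipher(cipher_one, b"key-one"))    # right key recovers the plaintext
print(xor_cipher(cipher_one, b"key-two"))    # wrong key gives garbage
```

Because XOR is its own inverse, the same function encrypts and decrypts, which also makes this a minimal example of a secret-key (symmetric) scheme.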
2.8 Types of Cryptography

There are two main types of cryptography:
Secret key cryptography
Public key cryptography
In cryptographic systems, the term key refers to a numerical value used by an algorithm to alter
information, making that information secure and visible only to individuals who have the corresponding key
to recover the information.
2.8.1 Secret Key Cryptography
Secret key cryptography is also known as symmetric key cryptography. With this type of cryptography,
both the sender and the receiver know the same secret code, called the key. Messages are encrypted by the
sender using the key and decrypted by the receiver using the same key.
This method works well if you are communicating with only a limited number of people, but it
becomes impractical to exchange secret keys with large numbers of people. In addition, there is also the
problem of how you communicate the secret key securely.
Block diagram of secret key cryptography:
Plaintext -> Encryption -> Ciphertext -> Decryption -> Plaintext
Fig 2.3 Secret key cryptography

Types of Secret Key Cryptography:
Stream Ciphers:
Stream ciphers operate on a single bit (byte or computer word) at a time, and implement some form of
feedback mechanism so that the key is constantly changing.
Block Ciphers:
Block ciphers encrypt one block of data at a time using the same key on each block.
Stream Ciphers:
Stream ciphers come in several flavors but two are worth mentioning here :
Self-synchronizing stream ciphers calculate each bit in the keystream as a function of the previous n
bits in the keystream.
Synchronous stream ciphers generate the keystream in a fashion independent of the message stream
but by using the same keystream generation function at sender and receiver.
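A synchronous stream cipher can be sketched as follows. Sender and receiver run the same keystream generator from the same secret seed, and the keystream is XORed with the data without depending on the message. The generator here is a 16-bit linear feedback shift register with an illustrative tap choice; it is far too weak for real use, and the names are hypothetical.

```python
def lfsr_keystream(seed: int, nbytes: int) -> bytes:
    """Keystream bytes from a 16-bit Fibonacci LFSR (illustrative, NOT secure)."""
    if not 0 < seed < (1 << 16):
        raise ValueError("seed must be a non-zero 16-bit value")
    state = seed
    out = bytearray()
    for _ in range(nbytes):
        byte = 0
        for _bit in range(8):
            byte = (byte << 1) | (state & 1)      # output the lowest state bit
            feedback = (state ^ (state >> 2) ^ (state >> 3) ^ (state >> 5)) & 1
            state = (state >> 1) | (feedback << 15)
        out.append(byte)
    return bytes(out)


def stream_xor(data: bytes, seed: int) -> bytes:
    """Encrypt or decrypt by XORing the data with the keystream."""
    keystream = lfsr_keystream(seed, len(data))
    return bytes(d ^ k for d, k in zip(data, keystream))


seed = 0xACE1                       # shared secret seed (example value)
message = b"PIN 1234"
ciphertext = stream_xor(message, seed)
print(ciphertext.hex())
print(stream_xor(ciphertext, seed))  # receiver regenerates the same keystream
```

The keystream depends only on the seed, so both ends stay in step as long as no ciphertext bytes are lost or inserted.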
Block Ciphers
Block ciphers can operate in one of several modes; the following four are the most important:
1. Electronic Codebook (ECB) mode
2. Cipher Block Chaining (CBC) mode
3. Cipher Feedback (CFB) mode
4. Output Feedback (OFB) mode
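The difference between the first two modes can be seen in a short sketch. ECB encrypts every block independently, so equal plaintext blocks give equal ciphertext blocks. CBC XORs each plaintext block with the previous ciphertext block (or an initialization vector for the first block) before encrypting, which hides such repetition. The 8-byte block cipher used here (XOR with the key, then a byte rotation) is a made-up stand-in with no security value; AES would take its place in practice.

```python
BLOCK = 8  # toy block size in bytes (AES uses 16)


def xor_bytes(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def toy_encrypt_block(block: bytes, key: bytes) -> bytes:
    """Stand-in block cipher: XOR with the key, then rotate left by one byte."""
    mixed = xor_bytes(block, key)
    return mixed[1:] + mixed[:1]


def toy_decrypt_block(block: bytes, key: bytes) -> bytes:
    """Inverse of toy_encrypt_block: rotate right by one byte, then XOR."""
    return xor_bytes(block[-1:] + block[:-1], key)


def split_blocks(data: bytes) -> list:
    if len(data) % BLOCK:
        raise ValueError("data length must be a multiple of the block size")
    return [data[i:i + BLOCK] for i in range(0, len(data), BLOCK)]


def ecb_encrypt(data: bytes, key: bytes) -> bytes:
    return b"".join(toy_encrypt_block(b, key) for b in split_blocks(data))


def cbc_encrypt(data: bytes, key: bytes, iv: bytes) -> bytes:
    out, prev = [], iv
    for block in split_blocks(data):
        prev = toy_encrypt_block(xor_bytes(block, prev), key)
        out.append(prev)
    return b"".join(out)


def cbc_decrypt(data: bytes, key: bytes, iv: bytes) -> bytes:
    out, prev = [], iv
    for block in split_blocks(data):
        out.append(xor_bytes(toy_decrypt_block(block, key), prev))
        prev = block
    return b"".join(out)


key = b"8bytekey"
iv = bytes(range(8))
plaintext = b"ABCDEFGH" * 2            # two identical plaintext blocks

ecb = ecb_encrypt(plaintext, key)
cbc = cbc_encrypt(plaintext, key, iv)
print(ecb[:BLOCK] == ecb[BLOCK:])      # True: ECB repeats the ciphertext block
print(cbc[:BLOCK] == cbc[BLOCK:])      # False: CBC chaining hides the repetition
print(cbc_decrypt(cbc, key, iv))
```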
Symmetric Key Cryptographic Algorithms:

The symmetric key cryptographic algorithms are as follows:
i. DES
ii. Triple DES
iii. Blowfish
iv. IDEA
v. RC4
vi. RC5
vii. Twofish
2.8.2 Public Key Cryptography

Public key cryptography, also called asymmetric encryption, uses a pair of keys for encryption and
decryption. With public key cryptography, keys work in pairs of matched public and private keys.
The public key can be freely distributed without compromising the private key, which must be kept secret by
its owner. Because these keys work only as a pair, encryption initiated with the public key can be decrypted
only with the corresponding private key. The following example illustrates how public key cryptography
works:
Ann wants to communicate secretly with Bill. Ann encrypts her message using Bill's public key
(which Bill made available to everyone) and sends the scrambled message to Bill.
When Bill receives the message, he uses his private key to unscramble the message so that he can read
it.
When Bill sends a reply to Ann, he scrambles the message using Ann's public key.
When Ann receives Bill's reply, she uses her private key to unscramble his message.
The major advantage asymmetric encryption offers over symmetric key cryptography is that senders and
receivers do not have to exchange secret keys in advance; secure communication is possible using only the
public keys.
Block diagram of public-key cryptography:
Plaintext -> Encryption (public key) -> Ciphertext -> Decryption (private key) -> Plaintext
Fig 2.4 Public-key cryptography

Public Key Cryptographic Algorithms:
The asymmetric (public key) cryptographic algorithms are as follows:
a. RSA
b. Diffie-Hellman
c. Elliptic curve
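Taking RSA from the list above as an example, the exchange between Ann and Bill can be sketched with textbook RSA on very small primes. The primes, the exponent e = 17, and the byte-by-byte encoding are illustrative assumptions only. Real systems use keys of 2048 bits or more together with a padding scheme, and this sketch offers no security.

```python
def make_keypair(p: int, q: int, e: int = 17):
    """Build a toy RSA key pair from two small primes (NOT secure)."""
    n = p * q
    phi = (p - 1) * (q - 1)
    d = pow(e, -1, phi)              # private exponent: inverse of e modulo phi
    return (n, e), (n, d)            # (public key, private key)


def encrypt(message: bytes, public_key) -> list:
    """Encrypt each byte separately with the recipient's public key."""
    n, e = public_key
    return [pow(byte, e, n) for byte in message]


def decrypt(blocks: list, private_key) -> bytes:
    """Recover the bytes with the matching private key."""
    n, d = private_key
    return bytes(pow(c, d, n) for c in blocks)


bill_public, bill_private = make_keypair(61, 53)
ann_public, ann_private = make_keypair(89, 97)

# Ann -> Bill: encrypted with Bill's public key, opened with Bill's private key.
to_bill = encrypt(b"Hello Bill, it is Ann", bill_public)
print(decrypt(to_bill, bill_private))

# Bill -> Ann: encrypted with Ann's public key, opened with Ann's private key.
to_ann = encrypt(b"Hello Ann", ann_public)
print(decrypt(to_ann, ann_private))
```

Neither party ever sends a private key: only the public pairs (n, e) need to be published.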
How PGP Works?

To encrypt a message, PGP creates a session key, which is a one-time-only secret key. This key is a random
number generated from the random movements of your mouse and the keystrokes you type. The session key works
with a very secure, fast conventional encryption algorithm to encrypt the plaintext; the result is ciphertext.
Once the data is encrypted, the session key is then encrypted to the recipient's public key. This public
key-encrypted session key is transmitted along with the ciphertext to the recipient.
Block diagram of encryption:
Plaintext -> encrypted with session key -> Ciphertext + encrypted session key
Fig 2.5 Encryption
Decryption works in the reverse. The recipient's copy of PGP uses his or her private key to recover the
session key, which PGP then uses to decrypt the conventionally encrypted ciphertext.
Block diagram of the decryption process:
Encrypted message = encrypted session key + ciphertext
Recipient's private key used to decrypt session key -> session key used to decrypt ciphertext -> original plaintext
Fig 2.6 Decryption

The combination of the two encryption methods combines the convenience of public-key encryption
with the speed of conventional encryption. Conventional encryption is about 10,000 times faster than
public-key encryption. Public-key encryption in turn provides a solution to key distribution and data
transmission issues. Used together, performance and key distribution are improved without any sacrifice in
security.
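The hybrid scheme described above can be sketched in a few lines. A random session key encrypts the bulk data with a fast symmetric operation, and only the short session key is encrypted with the recipient's public key. Everything concrete in the sketch is an assumption made for illustration: the toy RSA key built from two small Mersenne primes, the SHA-256 counter keystream standing in for the conventional cipher, and the 8-byte session key. It is not PGP's actual format or algorithm choice and is not secure.

```python
import hashlib
import secrets

# Toy RSA key pair of the recipient (small Mersenne primes; NOT secure).
P, Q = 2**61 - 1, 2**31 - 1
N = P * Q
E = 65537
D = pow(E, -1, (P - 1) * (Q - 1))


def keystream(key: bytes, length: int) -> bytes:
    """Stand-in for a fast conventional cipher: SHA-256 run in counter mode."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(4, "big")).digest()
        counter += 1
    return out[:length]


def symmetric(data: bytes, key: bytes) -> bytes:
    """XOR with the keystream; the same call encrypts and decrypts."""
    return bytes(d ^ k for d, k in zip(data, keystream(key, len(data))))


def hybrid_encrypt(plaintext: bytes, n: int, e: int):
    session_key = secrets.token_bytes(8)               # one-time session key
    ciphertext = symmetric(plaintext, session_key)     # bulk data, fast path
    wrapped_key = pow(int.from_bytes(session_key, "big"), e, n)  # public-key step
    return wrapped_key, ciphertext


def hybrid_decrypt(wrapped_key: int, ciphertext: bytes, n: int, d: int) -> bytes:
    session_key = pow(wrapped_key, d, n).to_bytes(8, "big")      # private-key step
    return symmetric(ciphertext, session_key)


message = b"Bulk data is encrypted with the session key only."
wrapped_key, ciphertext = hybrid_encrypt(message, N, E)
print(hybrid_decrypt(wrapped_key, ciphertext, N, D))
```

The slow public-key operation runs once on 8 bytes, whatever the message length, which is where the speed advantage of the combination comes from.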
Keys:
1. A key is a value that works with a cryptographic algorithm to produce a specific ciphertext. Keys are
basically very large numbers. Key size is measured in bits; the number representing a
2048-bit key is enormous. In public-key cryptography, the bigger the key, the more secure the
ciphertext.
2. However, public key size and conventional cryptography's secret key size are totally unrelated. A
conventional 80-bit key has the equivalent strength of a 1024-bit public key. A conventional 128-bit
key is equivalent to a 3000-bit public key. Again, the bigger the key, the more secure, but the
algorithms used for each type of cryptography are very different and thus comparison is like that of
apples to oranges.
Digital Signatures:
1. A major benefit of public key cryptography is that it provides a method for employing digital
signatures. Digital signatures let the recipient of information verify the authenticity of the
information's origin, and also verify that the information was not altered while in transit. Thus, public
key digital signatures provide authentication and data integrity. These features are every bit as
fundamental to cryptography as privacy, if not more.
2. A digital signature serves the same purpose as a seal on a document, or a handwritten signature.
However, because of the way it is created, it is superior to a seal or signature in an important way. A
digital signature not only attests to the identity of the signer, but it also shows that the contents of the
information signed have not been modified. A physical seal or handwritten signature cannot do that.
However, like a physical seal that can be created by anyone with possession of the signet, a digital
signature can be created by anyone with the private key of that signing keypair.
3. Some people tend to use signatures more than they use encryption. For example, you may not care if
anyone knows that you just deposited $1,000 in your account, but you do want to be sure it was
the bank teller you were dealing with.
4. The basic manner in which digital signatures are created is shown in the following figure. The
signature algorithm uses your private key to create the signature and the public key to verify it. If the
information can be decrypted with your public key, then it must have originated with you.
Block diagram of signing with the private key and verifying with the public key:
Original Text -> Signing (private key) -> Signed Text -> Verifying (public key) -> Verified Text
Fig 2.7 Private Key and Public Key
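The signing and verifying steps can be sketched with textbook RSA applied to a hash of the message. The private key creates the signature and the public key checks it. The toy key built from two small Mersenne primes, the use of SHA-256, and the reduction of the digest modulo n are assumptions made for illustration. Real signature schemes add padding and use much larger keys.

```python
import hashlib

# Toy RSA key pair of the signer (small Mersenne primes; NOT secure).
P, Q = 2**61 - 1, 2**31 - 1
N = P * Q
E = 65537                                   # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))           # private exponent


def digest_as_int(message: bytes) -> int:
    """Hash the message and reduce it so that it fits below the toy modulus."""
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % N


def sign(message: bytes, d: int = D, n: int = N) -> int:
    """Create a signature with the private key."""
    return pow(digest_as_int(message), d, n)


def verify(message: bytes, signature: int, e: int = E, n: int = N) -> bool:
    """Check a signature with the public key."""
    return pow(signature, e, n) == digest_as_int(message)


document = b"Deposit $1,000 into account 42"
signature = sign(document)

print(verify(document, signature))                           # True
print(verify(b"Deposit $9,000 into account 42", signature))  # False: content changed
```

Anyone holding the public pair (N, E) can run `verify`, but only the holder of D can produce a signature that passes, which is what ties the signed text to its originator.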
The advantages of public-key cryptography compared with secret-key cryptography are as follows:
i. The primary advantage of public-key cryptography is increased security and convenience: private keys
never need to be transmitted or revealed to anyone. In a secret-key system, by contrast, the secret keys
must be transmitted (either manually or through a communication channel), and there may be a chance
that an enemy can discover the secret keys during their transmission.
ii. Another major advantage of public-key systems is that they can provide a method for digital
signatures. Authentication via secret-key systems requires the sharing of some secret and sometimes
requires trust of a third party as well. As a result, a sender can repudiate a previously authenticated
message by claiming that the shared secret was somehow compromised by one of the parties sharing
the secret. For example, the Kerberos secret-key authentication system involves a central database that
keeps copies of the secret keys of all users; an attack on the database would allow widespread forgery.
Public-key authentication, on the other hand, prevents this type of repudiation; each user has sole
responsibility for protecting his or her private key. This property of public-key authentication is often
called non-repudiation.
The disadvantages of public-key cryptography compared with secret-key cryptography are as follows:
i.
A disadvantage of using public-key cryptography for encryption is speed: there are popular secret-key
encryption methods that are significantly faster than any currently available public-key encryption
method. Nevertheless, public-key cryptography can be used with secret-key cryptography to get the
best of both worlds. For encryption, the best solution is to combine public- and secret-key systems in
order to get both the security advantages of public-key systems and the speed advantages of secret-key
systems. The public-key system can be used to encrypt a secret key which is used to encrypt the bulk
of a file or message. Such a protocol is called a digital envelope.
ii.
Public-key cryptography may be vulnerable to impersonation, however, even if users' private keys
are not available. A successful attack on a certification authority will allow an adversary to
impersonate whomever the adversary chooses to by using a public-key certificate from the
compromised authority to bind a key of the adversary's choice to the name of another user.
iii.
In some situations, public-key cryptography is not necessary and secret-key cryptography alone is
sufficient. This includes environments where secure secret-key agreement can take place, for example
by users meeting in private. It also includes environments where a single authority knows and
manages all the keys, e.g., a closed banking system. Since the authority knows everyone's keys
already, there is not much advantage for some to be "public" and others "private." Also, public-key
cryptography is usually not necessary in a single-user environment. For example, if you want to keep
your personal files encrypted, you can do so with any secret-key encryption algorithm using, say, your
personal password as the secret key. In general, public-key cryptography is best suited for an open
multi-user environment.
iv.
Public-key cryptography is not meant to replace secret-key cryptography, but rather to supplement it,
to make it more secure. The first use of public-key techniques was for secure key exchange in an
otherwise secret-key system; this is still one of its primary functions. Secret-key cryptography remains
extremely important and is the subject of much ongoing study and research. Some secret-key
cryptosystems are discussed in the sections on block ciphers and stream ciphers.
Why Three Encryption Techniques?

The three encryption techniques are used for the following reasons:
a. Hash functions: for data integrity
b. Secret-key cryptography: ideally suited to encrypting messages
c. Public-key cryptography: for key exchange
Examples:
1. The ABC Company maintains payroll information for a variety of organizations. This payroll
information is frequently transmitted over the Internet from participating companies. For security
reasons, the ABC Company conducts all of its Internet transactions using public key cryptography.
The company owns both a public and a private encryption key.
The public key is made available to all participating organizations and in fact is openly
available to anyone who wants to download it from the ABC website. The private key is kept secure in
a bank vault at ABC headquarters. When the XYZ Company wants to transmit its payroll data to the
ABC company, it first encrypts the data using the ABC company's public key. Once it's encrypted, the
scrambled payroll data is transmitted securely over the Internet to the ABC company's processing
department.
If the information is intercepted along the way, all the interceptors will see is scrambled
information. Even if they have the public key, which is very possible, they will not be able to
unscramble the information. Only the private key can do that. Once the information is received by
ABC, the private key is used to unscramble the information, allowing the processing department to
process the payroll.
2. Using symmetric cryptography, the ABC Company would have to deliver, through some secure means
(such as a courier), a copy of its one and only secret key. Since the same key is used to both encrypt
and decrypt the information, both sender and receiver must have a copy.
So if XYZ is a new client for ABC, ABC must send XYZ a copy of the secret key so that XYZ
can then encrypt its payroll information and transmit it to ABC. ABC, using the same key, decrypts
XYZ's information and processes the payroll data. Since a system is only as strong as its weakest link,
key security during transmission becomes as important for XYZ as encrypting the data.
3. As mentioned earlier, public key cryptography lends itself to a new technology called digital
signatures. Digital signatures involve a reversing of the normal public/private encryption/decryption
process. Here is an example that demonstrates its use. Suppose Mary wants to send the ABC company
a request for a special document. Before the ABC company can send that document, they must be
assured that the requestor is actually Mary.
A digital signature can verify Mary's validity to ABC in the following way. Mary first encrypts
her name using her private key. She then encrypts the request along with the encrypted name using the
ABC company's well-known public key. When the ABC company receives the message, it decrypts
the request using its private key and then decrypts the signature using Mary's well-publicized public
key. If the name decrypts successfully, then it must be Mary's signature since she is the only one who
could have encrypted it with her secret private key. The request can be safely processed.
4. Digital signatures are gaining popularity in many Internet transactions involving signature verification
such as contracts and other legal negotiations as well as court documents. Recent enhancements to
digital signatures include digital timestamps. Digital timestamps apply a "when" criterion to a digital
signature by attaching a widely publicized summary number to the signature.
That summary number is only produced at some given point in time, essentially linking that
signature to a certain date/time. It's an especially effective technology since it doesn't rely on the
security of keys.
5. As mentioned earlier, for large documents the use of public key cryptography is prohibitive because
public key encryption is so slow. By using something called a digital envelope, the best of both
symmetric (speed) and public key (security) cryptography can be used. Here is an
example of how a digital envelope works. Mary wants to send a very large document to her main
office overseas. Because of its sensitivity, Mary believes it should be sent using public key
cryptography but knows she can't because it's too large. She decides to use a digital envelope.
6. Mary first creates a special session key and uses this key to symmetrically encrypt her document. That
is, she uses a symmetric cryptographic algorithm. She then encrypts the session key with her
organization's public key. So now the document is encrypted using symmetric cryptography and the
key that encrypted it is encrypted using public key cryptography. The encrypted key is called the
digital envelope. She then transmits both the key and the document to the main office.
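Mary's digital envelope can be sketched as follows. This is a minimal illustration, not a production design: the RSA numbers are toy values, the symmetric cipher is a stand-in built from a SHA-256 keystream (an assumption made to keep the sketch self-contained), and the function names `keystream_xor`, `seal` and `open_envelope` are hypothetical.

```python
import hashlib
import secrets

# Toy RSA key pair of the receiving office (assumption: insecure textbook numbers).
N, E, D = 3233, 17, 2753


def keystream_xor(session_key: int, data: bytes) -> bytes:
    """Stand-in symmetric cipher: XOR the data with a SHA-256 based keystream."""
    out = bytearray()
    for offset in range(0, len(data), 32):
        pad = hashlib.sha256(f"{session_key}:{offset // 32}".encode()).digest()
        out.extend(b ^ p for b, p in zip(data[offset:offset + 32], pad))
    return bytes(out)


def seal(document: bytes):
    """Sender: encrypt the document with a fresh session key, then wrap that key."""
    session_key = secrets.randbelow(N - 2) + 2    # random value in [2, N)
    ciphertext = keystream_xor(session_key, document)
    wrapped_key = pow(session_key, E, N)          # the "digital envelope"
    return wrapped_key, ciphertext


def open_envelope(wrapped_key: int, ciphertext: bytes) -> bytes:
    """Receiver: unwrap the session key with the private key, then decrypt."""
    session_key = pow(wrapped_key, D, N)
    return keystream_xor(session_key, ciphertext)


document = b"a very large document " * 100
wrapped_key, ciphertext = seal(document)
print(open_envelope(wrapped_key, ciphertext) == document)
```

Only the short session key goes through the slow public-key operation; the bulk of the document is handled by the fast symmetric step, which is the point of the envelope.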
CHAPTER 3
HARDWARE DESCRIPTION
3.1 Advanced Encryption Standard
The Advanced Encryption Standard (AES), also referenced as Rijndael (its original name), is a
specification for the encryption of electronic data established by the U.S. National Institute of Standards and
Technology (NIST) in 2001.
AES is based on the Rijndael cipher developed by two Belgian cryptographers, Joan
Daemen and Vincent Rijmen, who submitted a proposal to NIST during the AES selection process. Rijndael
is a family of ciphers with different key and block sizes.
For AES, NIST selected three members of the Rijndael family, each with a block size of 128 bits, but
three different key lengths: 128, 192 and 256 bits.
AES has been adopted by the U.S. government and is now used worldwide. It supersedes the Data
Encryption Standard (DES), which was published in 1977. The algorithm described by AES is a
symmetric-key algorithm, meaning the same key is used for both encrypting and decrypting the data.
In the United States, AES was announced by the NIST as U.S. FIPS PUB 197 (FIPS 197) on November
26, 2001. This announcement followed a five-year standardization process in which fifteen competing designs
were presented and evaluated, before the Rijndael cipher was selected as the most suitable.
AES became effective as a federal government standard on May 26, 2002 after approval by
the Secretary of Commerce. AES is included in the ISO/IEC 18033-3 standard. AES is available in many
different encryption packages, and is the first publicly accessible and open cipher approved by the National
Security Agency (NSA) for top secret information when used in an NSA approved cryptographic module.
The name Rijndael (Dutch pronunciation: [ˈrɛindaːl]) is a play on the names of the two inventors (Joan
Daemen and Vincent Rijmen). It is also a combination of the Dutch name for the Rhine river and a dale.
Block diagram of AES.
Fig 3.1 Block diagram of AES

i.
AES is a block cipher with a block length of 128 bits.
ii.
AES allows for three different key lengths: 128, 192, or 256 bits. Most of our discussion will assume
that the key length is 128 bits. [With regard to using a key length other than 128 bits, the main thing
that changes in AES is how the key schedule is generated from the key. The notion of a key schedule
in AES is explained later.]
Block diagram of Advanced Encryption Standards.
iii.
Encryption consists of 10 rounds of processing for 128-bit keys, 12 rounds for 192-bit keys, and 14
rounds for 256-bit keys.
iv.
Except for the last round in each case, all other rounds are identical.
v.
Each round of processing includes one single-byte based substitution step, a row-wise permutation
step, a column-wise mixing step, and the addition of the round key. The order in which these four
steps are executed is different for encryption and decryption.
vi.
To appreciate the processing steps used in a single round, it is best to think of a 128-bit block as
consisting of a 4 × 4 matrix of bytes, arranged as follows:
    byte0   byte4   byte8    byte12
    byte1   byte5   byte9    byte13
    byte2   byte6   byte10   byte14
    byte3   byte7   byte11   byte15
vii.
Therefore, the first four bytes of a 128-bit input block occupy the first column in the 4 × 4 matrix of
bytes. The next four bytes occupy the second column, and so on.
viii.
The 4 × 4 matrix of bytes is referred to as the state array.
ix.
AES also has the notion of a word. A word consists of four bytes, that is, 32 bits. Therefore, each
column of the state array is a word, as is each row.
x.
Each round of processing works on the input state array and produces an output state array.
xi.
The output state array produced by the last round is rearranged into a 128-bit output block.
xii.
Unlike DES, the decryption algorithm differs substantially from the encryption algorithm. Although,
overall, the same steps are used in encryption and decryption, the order in which the steps are carried
out is different, as mentioned previously.
xiii.
AES, notified by NIST as a standard in 2001, is a slight variation of the Rijndael cipher invented by
two Belgian cryptographers Joan Daemen and Vincent Rijmen.
xiv.
Whereas AES requires the block size to be 128 bits, the original Rijndael cipher works with any block
size (and any key size) that is a multiple of 32 bits, from 128 up to 256 bits. The state array for the
different block sizes still has only four rows in the Rijndael cipher. However, the number of columns
depends on the size of the block. For example, when the block size is 192, the Rijndael cipher requires a
state array to consist of 4 rows and 6 columns.
xv.
DES was based on the Feistel network. On the other hand, what AES uses
is a substitution permutation network in a more general sense. Each round of processing in AES
involves byte-level substitutions followed by word-level permutations. Speaking generally, DES also
involves substitutions and permutations, except that the permutations are based on the Feistel notion of
dividing the input block into two halves, processing each half separately, and then swapping the two
halves.
xvi.
The nature of substitutions and permutations in AES allows for a fast software implementation of the
algorithm.
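The column-wise arrangement of the state array and the round counts listed above can be sketched as follows. The names `ROUNDS_BY_KEY_BITS`, `block_to_state` and `state_to_block` are hypothetical helpers introduced only for this illustration.

```python
# Number of rounds for each AES key length, as listed above.
ROUNDS_BY_KEY_BITS = {128: 10, 192: 12, 256: 14}


def block_to_state(block: bytes):
    """Arrange a 16-byte block column by column into a 4 x 4 state array."""
    if len(block) != 16:
        raise ValueError("an AES block is exactly 16 bytes")
    # state[row][col] holds input byte number row + 4 * col
    return [[block[row + 4 * col] for col in range(4)] for row in range(4)]


def state_to_block(state) -> bytes:
    """Read the state array back out column by column."""
    return bytes(state[row][col] for col in range(4) for row in range(4))


state = block_to_state(bytes(range(16)))
for row in state:
    print(row)
```

With the input bytes numbered 0 to 15, the first row printed is [0, 4, 8, 12], since each group of four consecutive input bytes fills one column (one word) of the state array.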
The Encryption Key and Its Expansion

i.
Assuming a 128-bit key, the key is also arranged in the form of a matrix of 4 × 4 bytes. As with the input
block, the first word from the key fills the first column of the matrix, and so on.
ii.
The four column words of the key matrix are expanded into a schedule of 44 words. (How exactly
this is done is discussed later, in Section 3.3.5.) Each round consumes four words from the key
schedule.
iii.
The figure below depicts the arrangement of the encryption key in the form of 4-byte words and the
expansion of the key into a key schedule consisting of 44 4-byte words.
Block diagram shows the four words of the original 128-bit key being expanded into a key schedule
consisting of 44 words
Fig 3.2 The four words of the original 128-bit key being expanded into a key schedule consisting of 44
words
The Overall Structure Of AES

i.
The overall structure of AES encryption/decryption is shown in Figure 3.3.
ii.
The number of rounds shown in Figure 3.3, namely 10, is for the case when the encryption key is 128 bits long.
iii.
Before any round-based processing for encryption can begin, the input state array is XORed with the
first four words of the key schedule. The same thing happens during decryption except that now we
XOR the ciphertext state array with the last four words of the key schedule.
iv.
For encryption, each round consists of the following four steps: 1) Substitute bytes, 2) Shift rows, 3)
Mix columns, and 4) Add round key. The last step consists of XORing the output of the previous three
steps with four words from the key schedule.
v.
For decryption, each round consists of the following four steps: 1) Inverse shift rows, 2) Inverse
substitute bytes, 3) Add round key, and 4) Inverse mix columns. The third step consists of XORing the
output of the previous two steps with four words from the key schedule. Note the differences between
the order in which substitution and shifting operations are carried out in a decryption round vis-a-vis
the order in which similar operations are carried out in an encryption round.
vi.
The last round for encryption does not involve the Mix columns step. The last round for decryption
does not involve the Inverse mix columns step.
Block diagram of overall structure of AES
Fig 3.3 The overall structure of AES for the case of 128-bit encryption key
3.2 Overview of the AES Algorithm

High-level description of the algorithm
1. KeyExpansion: round keys are derived from the cipher key using Rijndael's key schedule. AES
requires a separate 128-bit round key block for each round plus one more.
2. InitialRound
1. AddRoundKey: each byte of the state is combined with a byte of the round key using bitwise
XOR.
3. Rounds
1. SubBytes: a non-linear substitution step where each byte is replaced with another according to
a lookup table.
2. ShiftRows: a transposition step where the last three rows of the state are shifted cyclically a
certain number of steps.
3. MixColumns: a mixing operation which operates on the columns of the state, combining the
four bytes in each column.
4. AddRoundKey
4. Final Round (no MixColumns)
1. SubBytes
2. ShiftRows
3. AddRoundKey.
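The ordering of the steps in the high-level description can be written down as a small schedule generator. This sketch only lists step names and performs no encryption; the function name `encryption_schedule` is a hypothetical helper.

```python
def encryption_schedule(key_bits: int = 128):
    """Return the ordered list of (stage name, steps) for one AES encryption."""
    rounds = {128: 10, 192: 12, 256: 14}[key_bits]
    schedule = [("initial round", ["AddRoundKey"])]
    for r in range(1, rounds):
        schedule.append(
            (f"round {r}", ["SubBytes", "ShiftRows", "MixColumns", "AddRoundKey"])
        )
    # The final round omits MixColumns.
    schedule.append((f"round {rounds}", ["SubBytes", "ShiftRows", "AddRoundKey"]))
    return schedule


for name, steps in encryption_schedule(128):
    print(name, "->", ", ".join(steps))
```

Counting the AddRoundKey entries gives one more than the number of rounds, which is why the key expansion has to supply a round key for each round plus one more.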
3.3 Individual blocks
The Four Steps In Each Round Of Processing

The figure below shows the different steps that are carried out in each round except the last one.
Fig 3.4 One round of encryption is shown at left and one round of decryption at right
3.3.1 The SubBytes step

In the SubBytes step, each byte $a_{i,j}$ in the state matrix is replaced with a SubByte $S(a_{i,j})$ using an
8-bit substitution box, the Rijndael S-box. This operation provides the non-linearity in the cipher. The S-box
used is derived from the multiplicative inverse over GF(2^8), known to have good non-linearity properties. To
avoid attacks based on simple algebraic properties, the S-box is constructed by combining the inverse function
with an invertible affine transformation. The S-box is also chosen to avoid any fixed points (and so is
a derangement), i.e., $S(a_{i,j}) \neq a_{i,j}$, and also any opposite fixed points, i.e.,
$S(a_{i,j}) \oplus a_{i,j} \neq \mathrm{FF}_{16}$.
While performing the decryption, the Inverse SubBytes step is used, which requires first taking the inverse
of the affine transformation and then finding the multiplicative inverse. In the SubBytes step, each byte in
the state is replaced with its entry in a fixed 8-bit lookup table, S: $b_{i,j} = S(a_{i,j})$.
Block diagram of SubByte step
Fig 3.5 Sub-byte step

3.3.2 The ShiftRows step
The ShiftRows step operates on the rows of the state; it cyclically shifts the bytes in each row by a
certain offset. For AES, the first row is left unchanged. Each byte of the second row is shifted one to the left.
Similarly, the third and fourth rows are shifted by offsets of two and three respectively. For blocks of sizes
128 bits and 192 bits, the shifting pattern is the same. Row n is shifted left circular by n-1 bytes. In this way,
each column of the output state of the ShiftRows step is composed of bytes from each column of the input
state. (Rijndael variants with a larger block size have slightly different offsets.) For a 256-bit block, the first
row is unchanged and the shifting for the second, third and fourth row is 1 byte, 3 bytes and 4 bytes
respectively; this change only applies for the Rijndael cipher when used with a 256-bit block, as AES does
not use 256-bit blocks. The importance of this step is to avoid the columns being encrypted independently, in
which case AES degenerates into four independent block ciphers.
In the ShiftRows step, bytes in each row of the state are shifted cyclically to the left. The number of
places each byte is shifted differs for each row.
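The row offsets described above, including the different offsets for a 256-bit Rijndael block, can be sketched as follows. The state is assumed to be a list of four rows, and the names `SHIFT_OFFSETS` and `shift_rows` are hypothetical.

```python
# Left-rotation offsets per row, keyed by the number of columns in the state
# (4 columns = 128-bit block, 6 = 192-bit block, 8 = 256-bit block).
SHIFT_OFFSETS = {4: (0, 1, 2, 3), 6: (0, 1, 2, 3), 8: (0, 1, 3, 4)}


def shift_rows(state):
    """Cyclically shift each row of the state to the left by its offset."""
    offsets = SHIFT_OFFSETS[len(state[0])]
    return [row[k:] + row[:k] for row, k in zip(state, offsets)]


state = [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9, 10, 11], [12, 13, 14, 15]]
for row in shift_rows(state):
    print(row)
```

After the shift, every column holds one byte from each column of the input, so the following MixColumns step mixes bytes that originally sat in different columns.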
Fig3.6 Shift-row step

3.3.3 The MixColumns Step

In the MixColumns step, the four bytes of each column of the state are combined using an
invertible linear transformation. The MixColumns function takes four bytes as input and outputs four bytes,
where each input byte affects all four output bytes. Together with ShiftRows, MixColumns provides
diffusion in the cipher.
During this operation, each column is multiplied by a fixed matrix:
$$\begin{bmatrix} b_0 \\ b_1 \\ b_2 \\ b_3 \end{bmatrix} = \begin{bmatrix} 2 & 3 & 1 & 1 \\ 1 & 2 & 3 & 1 \\ 1 & 1 & 2 & 3 \\ 3 & 1 & 1 & 2 \end{bmatrix} \begin{bmatrix} a_0 \\ a_1 \\ a_2 \\ a_3 \end{bmatrix}$$
Matrix multiplication is composed of multiplication and addition of the entries, and here the
multiplication operation can be defined as this: multiplication by 1 means no change, multiplication by 2
means shifting to the left, and multiplication by 3 means shifting to the left and then performing XOR with the
initial unshifted value. After shifting, a conditional XOR with 0x1B should be performed if the shifted value
is larger than 0xFF, with the result truncated to 8 bits. (These are special cases of the usual multiplication
in GF(2^8).) Addition is simply XOR.
In a more general sense, each column is treated as a polynomial over GF(2^8) and is then multiplied
modulo $x^4 + 1$ with a fixed polynomial $c(x) = \mathtt{0x03}\,x^3 + x^2 + x + \mathtt{0x02}$. The coefficients
are displayed in their hexadecimal equivalent of the binary representation of bit polynomials from GF(2)[x].
The MixColumns step can also be viewed as a multiplication by the particular MDS matrix shown above in the finite
field GF(2^8).
In the MixColumns step, each column of the state is multiplied with a fixed polynomial c(x).
Fig 3.7. Mix-Column step
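The multiplication rules for 1, 2 and 3 can be sketched for a single column as follows. The helper names `xtime`, `mul3` and `mix_column` are hypothetical; the sample column is a commonly quoted MixColumns test vector.

```python
def xtime(b: int) -> int:
    """Multiply a byte by 2 in GF(2^8): shift left, then reduce if it overflowed."""
    b <<= 1
    if b > 0xFF:
        b = (b ^ 0x1B) & 0xFF   # conditional XOR with 0x1B, keep the low 8 bits
    return b


def mul3(b: int) -> int:
    """Multiply a byte by 3: multiply by 2, then XOR with the unshifted value."""
    return xtime(b) ^ b


def mix_column(col):
    """Apply the fixed MixColumns matrix to one four-byte column."""
    a0, a1, a2, a3 = col
    return [
        xtime(a0) ^ mul3(a1) ^ a2 ^ a3,
        a0 ^ xtime(a1) ^ mul3(a2) ^ a3,
        a0 ^ a1 ^ xtime(a2) ^ mul3(a3),
        mul3(a0) ^ a1 ^ a2 ^ xtime(a3),
    ]


print([hex(b) for b in mix_column([0xDB, 0x13, 0x53, 0x45])])
# prints ['0x8e', '0x4d', '0xa1', '0xbc']
```

Because every output byte depends on all four input bytes, changing one byte of a column changes the whole column after this step.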

3.3.4 The AddRoundKey Step

In the AddRoundKey step, the subkey is combined with the state. For each round, a subkey is derived
from the main key using Rijndael's key schedule; each subkey is the same size as the state. The subkey is
added by combining each byte of the state with the corresponding byte of the subkey using bitwise XOR.
In the AddRoundKey step, each byte of the state is combined with a byte of the round subkey using
the XOR operation (⊕).
Fig 3.8 Addroundkey step
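The AddRoundKey step is a plain byte-wise XOR, which can be sketched as below. The function name `add_round_key` is hypothetical, and the state and round key are assumed to be 16-byte strings in the same byte order.

```python
def add_round_key(state: bytes, round_key: bytes) -> bytes:
    """Combine each byte of the state with the matching byte of the round key."""
    if len(state) != len(round_key):
        raise ValueError("state and round key must have the same size")
    return bytes(s ^ k for s, k in zip(state, round_key))


state = bytes.fromhex("3243f6a8885a308d313198a2e0370734")
round_key = bytes.fromhex("2b7e151628aed2a6abf7158809cf4f3c")
mixed = add_round_key(state, round_key)
print(mixed.hex())  # prints 193de3bea0f4e22b9ac68d2ae9f84808
```

Since XOR is its own inverse, applying the same round key a second time restores the original state, which is why decryption can reuse this step unchanged.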

Optimization of the cipher
On systems with 32-bit or larger words, it is possible to speed up execution of this cipher by
combining the SubBytes and ShiftRows steps with the MixColumns step by transforming them into a sequence
of table lookups. This requires four 256-entry 32-bit tables, and utilizes a total of four kilobytes (4096 bytes)
of memory (one kilobyte for each table). A round can then be done with 16 table lookups and 12 32-bit
exclusive-or operations, followed by four 32-bit exclusive-or operations in the AddRoundKey step.[11]
If the resulting four-kilobyte table size is too large for a given target platform, the table lookup
operation can be performed with a single 256-entry 32-bit (i.e. 1 kilobyte) table by the use of circular rotates.
Using a byte-oriented approach, it is possible to combine the SubBytes, ShiftRows,
and MixColumns steps into a single round operation.
3.3.5 The Key Expansion Algorithm

i.
Each round has its own round key that is derived from the original 128-bit encryption key in the
manner described in this section. One of the four steps of each round, for both encryption and
decryption, involves XORing of the round key with the state array.
ii.
The AES Key Expansion algorithm is used to derive the 128-bit round key for each round from the
original 128-bit encryption key. The logic of the key expansion algorithm is designed to
ensure that if you change one bit of the encryption key, it should affect the round keys for several
rounds.
iii.
In the same manner as the 128-bit input block is arranged in the form of a state array, the algorithm
first arranges the 16 bytes of the encryption key in the form of a 4 × 4 array of bytes.
iv.
The first four bytes of the encryption key constitute the word w0, the next four bytes the word w1, and
so on.
v.
The algorithm subsequently expands the words [w0,w1,w2,w3] into a 44-word key schedule that can
be labeled w0, w1, w2, w3, ..., w43.
vi.
Of these, the words [w0,w1,w2,w3] are bitwise XORed with the input block before the round-based
processing begins.
vii.
The remaining 40 words of the key schedule are used four words at a time in each of the 10 rounds.
viii.
The above two statements are also true for decryption, except for the fact that we now reverse the
order of the words in the key schedule. The last four words of the key schedule are bitwise XORed
with the 128-bit ciphertext block before any round-based processing begins. Subsequently, the
remaining 40 words of the key schedule are used four words at a time in each of the ten rounds of
processing.
ix.
Now comes the difficult part: how does the Key Expansion Algorithm expand the four words
w0, w1, w2, w3 into the 44 words w0, w1, w2, ..., w43?
x.
The key expansion algorithm will be explained in the next subsection with the help of Figure 3.9. As
shown in the figure, the key expansion takes place on a four-word to four-word basis, in the sense that
each grouping of four words decides what the next grouping of four words will be.
The block diagram of key expansion algorithm
Fig 3.9 The key expansion takes place on a four-word to four-word basis.
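The four-word to four-word expansion can be sketched as follows. The details used here (rotating the last word, substituting its bytes through the S-box and XORing a round constant) follow the standard FIPS 197 procedure for 128-bit keys and are not spelled out in the text above, so this is an illustrative sketch; all helper names are hypothetical.

```python
def gf_mul(a: int, b: int) -> int:
    """Multiply two bytes in GF(2^8) modulo x^8 + x^4 + x^3 + x + 1."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11B
        b >>= 1
    return result


def rotl8(x: int, k: int) -> int:
    """Rotate a byte left by k bits."""
    return ((x << k) | (x >> (8 - k))) & 0xFF


def build_sbox():
    """Build the S-box: multiplicative inverse followed by the affine mapping."""
    sbox = []
    for x in range(256):
        inv = 0 if x == 0 else next(y for y in range(1, 256) if gf_mul(x, y) == 1)
        sbox.append(inv ^ rotl8(inv, 1) ^ rotl8(inv, 2) ^ rotl8(inv, 3) ^ rotl8(inv, 4) ^ 0x63)
    return sbox


SBOX = build_sbox()


def expand_key_128(key: bytes):
    """Expand a 16-byte key into the 44 four-byte words w0 .. w43."""
    words = [list(key[4 * i:4 * i + 4]) for i in range(4)]
    rcon = 0x01
    for i in range(4, 44):
        temp = list(words[i - 1])
        if i % 4 == 0:                        # first word of each new group of four
            temp = temp[1:] + temp[:1]        # rotate the word by one byte
            temp = [SBOX[b] for b in temp]    # substitute each byte
            temp[0] ^= rcon                   # add the round constant
            rcon = gf_mul(rcon, 2)
        words.append([a ^ b for a, b in zip(words[i - 4], temp)])
    return words


key = bytes.fromhex("2b7e151628aed2a6abf7158809cf4f3c")
schedule = expand_key_128(key)
print(bytes(schedule[4]).hex(), bytes(schedule[43]).hex())
```

For the sample key from the FIPS 197 appendix, w4 comes out as a0fafe17 and w43 as b6630ca6. Each new word depends on the previous word and on the word four positions back, so a one-bit change in the key spreads through the later round keys.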
3.4 Construction of the 16 × 16 Arrays

a) The SubBytes Step
1. We first fill each cell of the 16 × 16 table with the byte obtained by joining together its row
index and the column index. [The row index of this table runs from hex 0 through hex F.
Likewise, the column index runs from hex 0 through hex F.]
2. For example, for the cell located at row index 2 and column index 7, we place hex 0x27 in
the cell. So at this point the table simply holds the values 0x00 through 0xFF in row-major order.
3. We next replace the value in each cell by its multiplicative inverse in GF(2^8) based on the
irreducible polynomial $x^8 + x^4 + x^3 + x + 1$.
4. The hex value 0x00 is replaced by itself since this element has no multiplicative inverse.
5. After the above step, let's represent a byte stored in a cell of the table by $b_7b_6b_5b_4b_3b_2b_1b_0$,
where $b_7$ is the MSB and $b_0$ the LSB. For example, the byte stored in the cell (9, 5) of the
above table is the multiplicative inverse (MI) of 0x95, which is 0x8A. Therefore, at this point,
the bit pattern stored in the cell with row index 9 and column index 5 is 10001010, implying
that $b_7$ is 1 and $b_0$ is 0. [Verify the fact that the MI of 0x95 is indeed 0x8A. The polynomial
representation of 0x95 (bit pattern: 10010101) is $x^7 + x^4 + x^2 + 1$, and the same for 0x8A (bit
pattern: 10001010) is $x^7 + x^3 + x$. Now show that the product of these two polynomials
modulo the polynomial $x^8 + x^4 + x^3 + x + 1$ is indeed 1.]
For bit scrambling, we next apply the following transformation to each bit $b_i$ of the byte stored
in a cell of the lookup table:
$$b'_i = b_i \oplus b_{(i+4) \bmod 8} \oplus b_{(i+5) \bmod 8} \oplus b_{(i+6) \bmod 8} \oplus b_{(i+7) \bmod 8} \oplus c_i$$
where $c_i$ is the ith bit of a specially designated byte c whose hex value is 0x63
($c_7c_6c_5c_4c_3c_2c_1c_0 = 01100011$).
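6. The above bit-scrambling step is better visualized as the following vector-matrix operation.
$$\begin{bmatrix} b'_0 \\ b'_1 \\ b'_2 \\ b'_3 \\ b'_4 \\ b'_5 \\ b'_6 \\ b'_7 \end{bmatrix} = \begin{bmatrix} 1&0&0&0&1&1&1&1 \\ 1&1&0&0&0&1&1&1 \\ 1&1&1&0&0&0&1&1 \\ 1&1&1&1&0&0&0&1 \\ 1&1&1&1&1&0&0&0 \\ 0&1&1&1&1&1&0&0 \\ 0&0&1&1&1&1&1&0 \\ 0&0&0&1&1&1&1&1 \end{bmatrix} \begin{bmatrix} b_0 \\ b_1 \\ b_2 \\ b_3 \\ b_4 \\ b_5 \\ b_6 \\ b_7 \end{bmatrix} \oplus \begin{bmatrix} 1 \\ 1 \\ 0 \\ 0 \\ 0 \\ 1 \\ 1 \\ 0 \end{bmatrix}$$
Note that all of the additions in the product of the matrix and the vector are actually XOR
operations. [Because of the $A\vec{x} + \vec{b}$ appearance of this transformation, it is commonly
referred to as the affine transformation.]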
7. The very important role played by the c byte of value 0x63: Consider the following two
conditions on the SubBytes step: (1) In order for the byte substitution step to be invertible, the
byte-to-byte mapping given to us by the 16 × 16 table must be one-one.
That is, for each input byte, there must be a unique output byte. And, to each output
byte there must correspond only one input byte. (2) No input byte should map to itself, since a
byte mapping to itself would weaken the cipher.
8. Taking multiplicative inverses in the construction of the table does give us unique entries in the
table for each input byte except for the input byte 0x00 since there is no MI defined for the
all-zeros byte. What is interesting is that if it were not for the c byte, the bit scrambling step would
also leave the input byte 0x00 unchanged. With the affine mapping shown above, the 0x00
input byte is mapped to 0x63. At the same time, it preserves the one-one mapping for all other
bytes.
9. In addition to ensuring that every input byte is mapped to a different and unique output byte,
the bit-scrambling step also breaks the correlation between the bits before the substitution and
the bits after the substitution.
10. The 16 × 16 table created in this manner is called the S-Box. The S-Box is the same for all the
bytes in the state array.
11. The steps that go into constructing the 16 × 16 lookup table are reversed for the decryption
table, meaning that you first apply the reverse of the bit-scrambling operation to each byte, as
explained in the next step, and then you take its multiplicative inverse in GF(2^8).
12. For bit scrambling for decryption, you carry out the following bit-level transformation in each
cell of the table:
$$b'_i = b_{(i+2) \bmod 8} \oplus b_{(i+5) \bmod 8} \oplus b_{(i+7) \bmod 8} \oplus d_i$$
where $d_i$ is the ith bit of a specially designated byte d whose hex value is 0x05
($d_7d_6d_5d_4d_3d_2d_1d_0 = 00000101$). Finally, you replace the byte in the cell by its
multiplicative inverse in GF(2^8).
13. The bytes c and d are chosen so that the S-box has no fixed points. That is, we do not want
S-box(a) = a for any a. Neither do we want S-box(a) = $\bar{a}$, where $\bar{a}$ is the bitwise
complement of a.
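The construction of the encryption and decryption tables can be sketched as follows, using the bytes c = 0x63 and d = 0x05 from the steps above. The multiplicative inverse is computed here by raising to the power 254, which is one possible method chosen for brevity; all function names are hypothetical.

```python
def gf_mul(a: int, b: int) -> int:
    """Multiply two bytes in GF(2^8) modulo x^8 + x^4 + x^3 + x + 1."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11B
        b >>= 1
    return result


def gf_inv(a: int) -> int:
    """Multiplicative inverse computed as a^254; the byte 0x00 maps to itself."""
    result = 1
    for _ in range(254):
        result = gf_mul(result, a)
    return result


def bit(byte: int, i: int) -> int:
    """Bit number (i mod 8) of a byte."""
    return (byte >> (i % 8)) & 1


def affine(byte: int) -> int:
    """Bit scrambling for encryption, with the constant byte c = 0x63."""
    out = 0
    for i in range(8):
        b = (bit(byte, i) ^ bit(byte, i + 4) ^ bit(byte, i + 5)
             ^ bit(byte, i + 6) ^ bit(byte, i + 7) ^ bit(0x63, i))
        out |= b << i
    return out


def inverse_affine(byte: int) -> int:
    """Bit scrambling for decryption, with the constant byte d = 0x05."""
    out = 0
    for i in range(8):
        b = bit(byte, i + 2) ^ bit(byte, i + 5) ^ bit(byte, i + 7) ^ bit(0x05, i)
        out |= b << i
    return out


SBOX = [affine(gf_inv(x)) for x in range(256)]
INV_SBOX = [gf_inv(inverse_affine(y)) for y in range(256)]

print(hex(gf_mul(0x95, 0x8A)), hex(SBOX[0x00]), hex(SBOX[0x95]))
# prints 0x1 0x63 0x2a
```

The printed values confirm that 0x8A is the multiplicative inverse of 0x95 and that 0x00 is mapped to 0x63. The same tables can be used to check the one-one property and the absence of fixed points and opposite fixed points.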
b) The Shift Rows Step

This is where the matrix representation of the state array becomes important.
i.
The ShiftRows transformation consists of (i) not shifting the first row of the state array at
all; (ii) circularly shifting the second row by one byte to the left; (iii) circularly shifting the
third row by two bytes to the left; and (iv) circularly shifting the last row by three bytes to
the left.
ii.
This operation on the state array can be represented by
$$\begin{bmatrix} s_{0,0} & s_{0,1} & s_{0,2} & s_{0,3} \\ s_{1,0} & s_{1,1} & s_{1,2} & s_{1,3} \\ s_{2,0} & s_{2,1} & s_{2,2} & s_{2,3} \\ s_{3,0} & s_{3,1} & s_{3,2} & s_{3,3} \end{bmatrix} \Longrightarrow \begin{bmatrix} s_{0,0} & s_{0,1} & s_{0,2} & s_{0,3} \\ s_{1,1} & s_{1,2} & s_{1,3} & s_{1,0} \\ s_{2,2} & s_{2,3} & s_{2,0} & s_{2,1} \\ s_{3,3} & s_{3,0} & s_{3,1} & s_{3,2} \end{bmatrix}$$
iii.
Recall again that the input block is written column-wise. That is, the first four bytes of the
input block fill the first column of the state array, the next four bytes the second column, etc.
As a result, shifting the rows in the manner indicated scrambles up the byte order of the input block.
iv.
For decryption, the corresponding step shifts the rows in exactly the opposite fashion. The
first row is left unchanged, the second row is shifted to the right by one byte, the third row
to the right by two bytes, and the last row to the right by three bytes, all shifts being
circular.
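How the row shifts scramble the byte order of a column-wise block, and how the decryption step undoes it, can be sketched directly on a 16-byte block. The function names `shift_rows_block` and `inv_shift_rows_block` are hypothetical.

```python
def shift_rows_block(block: bytes) -> bytes:
    """ShiftRows on a 16-byte block stored column by column."""
    # The byte at row r, column c of the output comes from row r, column (c + r) mod 4.
    return bytes(block[r + 4 * ((c + r) % 4)] for c in range(4) for r in range(4))


def inv_shift_rows_block(block: bytes) -> bytes:
    """Inverse ShiftRows: rotate row r to the right by r bytes."""
    return bytes(block[r + 4 * ((c - r) % 4)] for c in range(4) for r in range(4))


block = bytes(range(16))
shifted = shift_rows_block(block)
print(list(shifted))
# prints [0, 5, 10, 15, 4, 9, 14, 3, 8, 13, 2, 7, 12, 1, 6, 11]
print(inv_shift_rows_block(shifted) == block)  # prints True
```

The printed order shows that each output column now draws one byte from every input column, even though no byte value has changed.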
c) The Mix Columns Step

This step replaces each byte of a column by a function of all the bytes in the same column.
i.
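More precisely, each byte in a column is replaced by two times that byte, plus three times
the next byte, plus the byte that comes next, plus the byte that follows. [Additions in
GF(2^8) mean the same thing as XOR. So "plus" implies XOR.] The words "next" and "follow"
refer to bytes in the same column, and their meaning is circular, in the sense that the byte
that is next to the one in the last row is the one in the first row. [By "two times" and
"three times", we mean multiplications in GF(2^8) by the bit patterns 00000010 and 00000011,
respectively.]
ii.
For the bytes in the first row of the state array, this operation can be stated as
$$s'_{0,j} = (\mathtt{0x02} \times s_{0,j}) \oplus (\mathtt{0x03} \times s_{1,j}) \oplus s_{2,j} \oplus s_{3,j}$$
iii.
For the bytes in the second row of the state array, this operation can be stated as
$$s'_{1,j} = s_{0,j} \oplus (\mathtt{0x02} \times s_{1,j}) \oplus (\mathtt{0x03} \times s_{2,j}) \oplus s_{3,j}$$
iv.
For the bytes in the third row of the state array, this operation can be stated as
$$s'_{2,j} = s_{0,j} \oplus s_{1,j} \oplus (\mathtt{0x02} \times s_{2,j}) \oplus (\mathtt{0x03} \times s_{3,j})$$
v.
And, for the bytes in the fourth row of the state array, this operation can be stated as
$$s'_{3,j} = (\mathtt{0x03} \times s_{0,j}) \oplus s_{1,j} \oplus s_{2,j} \oplus (\mathtt{0x02} \times s_{3,j})$$
vi.
More compactly, the column operations can be shown as
$$\begin{bmatrix} 02 & 03 & 01 & 01 \\ 01 & 02 & 03 & 01 \\ 01 & 01 & 02 & 03 \\ 03 & 01 & 01 & 02 \end{bmatrix} \times \begin{bmatrix} s_{0,0} & s_{0,1} & s_{0,2} & s_{0,3} \\ s_{1,0} & s_{1,1} & s_{1,2} & s_{1,3} \\ s_{2,0} & s_{2,1} & s_{2,2} & s_{2,3} \\ s_{3,0} & s_{3,1} & s_{3,2} & s_{3,3} \end{bmatrix} = \begin{bmatrix} s'_{0,0} & s'_{0,1} & s'_{0,2} & s'_{0,3} \\ s'_{1,0} & s'_{1,1} & s'_{1,2} & s'_{1,3} \\ s'_{2,0} & s'_{2,1} & s'_{2,2} & s'_{2,3} \\ s'_{3,0} & s'_{3,1} & s'_{3,2} & s'_{3,3} \end{bmatrix}$$
where, on the left hand side, when a row of the leftmost matrix multiplies a column of the state
array matrix, additions involved are meant to be XOR operations.
vii.
The corresponding transformation during decryption is given by
$$\begin{bmatrix} 0E & 0B & 0D & 09 \\ 09 & 0E & 0B & 0D \\ 0D & 09 & 0E & 0B \\ 0B & 0D & 09 & 0E \end{bmatrix} \times \begin{bmatrix} s_{0,0} & s_{0,1} & s_{0,2} & s_{0,3} \\ s_{1,0} & s_{1,1} & s_{1,2} & s_{1,3} \\ s_{2,0} & s_{2,1} & s_{2,2} & s_{2,3} \\ s_{3,0} & s_{3,1} & s_{3,2} & s_{3,3} \end{bmatrix} = \begin{bmatrix} s'_{0,0} & s'_{0,1} & s'_{0,2} & s'_{0,3} \\ s'_{1,0} & s'_{1,1} & s'_{1,2} & s'_{1,3} \\ s'_{2,0} & s'_{2,1} & s'_{2,2} & s'_{2,3} \\ s'_{3,0} & s'_{3,1} & s'_{3,2} & s'_{3,3} \end{bmatrix}$$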
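The MixColumns transformation and its decryption counterpart can be checked against each other with a general GF(2^8) multiplication. The coefficient matrices used below are the standard AES values (02 03 01 01 for encryption and 0E 0B 0D 09 for decryption, each rotated row by row); the names `MIX`, `INV_MIX` and `apply_matrix` are hypothetical.

```python
def gf_mul(a: int, b: int) -> int:
    """Multiply two bytes in GF(2^8) modulo x^8 + x^4 + x^3 + x + 1."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11B
        b >>= 1
    return result


MIX = [
    [0x02, 0x03, 0x01, 0x01],
    [0x01, 0x02, 0x03, 0x01],
    [0x01, 0x01, 0x02, 0x03],
    [0x03, 0x01, 0x01, 0x02],
]
INV_MIX = [
    [0x0E, 0x0B, 0x0D, 0x09],
    [0x09, 0x0E, 0x0B, 0x0D],
    [0x0D, 0x09, 0x0E, 0x0B],
    [0x0B, 0x0D, 0x09, 0x0E],
]


def apply_matrix(matrix, column):
    """Multiply a 4 x 4 coefficient matrix by one column; additions are XOR."""
    out = []
    for row in matrix:
        acc = 0
        for coeff, byte in zip(row, column):
            acc ^= gf_mul(coeff, byte)
        out.append(acc)
    return out


column = [0xDB, 0x13, 0x53, 0x45]
mixed = apply_matrix(MIX, column)
print([hex(b) for b in mixed])             # prints ['0x8e', '0x4d', '0xa1', '0xbc']
print(apply_matrix(INV_MIX, mixed) == column)  # prints True
```

Applying the decryption matrix to the mixed column returns the original bytes, which shows that the two matrices are inverses of each other over GF(2^8).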
CHAPTER 4
SOFTWARE DESCRIPTION
4.1 Introduction to Xilinx
The Xilinx ISE is a design environment for FPGA products from Xilinx, and is tightly-coupled to the
architecture of such chips, and cannot be used with FPGA products from other vendors.[2] The Xilinx ISE is
primarily used for circuit synthesis and design, while the ModelSim logic simulator is used for system-level
testing. Other components shipped with the Xilinx ISE include the Embedded Development Kit (EDK), a
Software Development Kit (SDK) and ChipScope Pro.
The main challenging areas in VLSI are performance, cost, testing, area, reliability and power. The
demand for portable computing devices and communication systems is increasing rapidly. These applications
require low power dissipation for VLSI circuits [1]. The ability to design, fabricate and test Application
Specific Integrated Circuits (ASICs) as well as FPGAs with gate count of the order of a few tens of millions
has led to the development of complex embedded SOCs. Hardware components in a SOC may include one or
more processors, memories and dedicated components for accelerating critical tasks, and
interfaces to various peripherals. One of the approaches for SOC design is the platform based approach. For
example, the platform FPGAs such as Xilinx Virtex II Pro and Altera Excalibur include custom designed
fixed programmable processor cores together with millions of gates of reconfigurable logic devices.
In addition to this, the development of Intellectual Property (IP) cores for FPGAs for a variety of
standard functions, including processors, enables a multimillion-gate FPGA to be configured to contain all the
components of a platform-based FPGA. Development tools such as the Altera System-On-Programmable
Chip (SOPC) builder enable the integration of IP cores and user-designed custom blocks with the Nios II
soft-core processor. Soft-core processors are far more flexible than hard-core processors, and they can be
enhanced with custom hardware to optimize them for a specific application. Power dissipation is a challenging
problem for today's System-on-Chip (SOC) design and test.
Evolution of Computer-Aided Digital Design
Digital circuit design has evolved rapidly over the last 25 years. The earliest digital circuits were
designed with vacuum tubes and transistors. Integrated circuits were then invented where logic gates were
placed on a single chip. The first integrated circuit (IC) chips were SSI (Small Scale Integration) chips where
the gate count was very small. As technologies became sophisticated, designers were able to place circuits
with hundreds of gates on a chip. These chips were called MSI (Medium Scale Integration) chips. With the
advent of LSI (Large Scale Integration), designers could put thousands of gates on a single chip. At this point,
design processes started getting very complicated, and designers felt the need to automate these
processes. Electronic Design Automation (EDA) techniques began to evolve. Chip designers began to use
circuit and logic simulation techniques to verify the functionality of building blocks of the order of about 100
transistors. The circuits were still tested on the breadboard, and the layout was done on paper or by hand on a
graphic computer terminal.
Technically, the term Computer-Aided Design (CAD) tools refers to back-end tools that perform
functions related to place and route, and layout of the chip. The term Computer-Aided Engineering (CAE)
tools refers to tools that are used for front-end processes such as HDL simulation, logic synthesis, and timing
analysis. Designers used the terms CAD and CAE interchangeably. Today, the term Electronic Design
Automation is used for both CAD and CAE. For the sake of simplicity, this report refers to all design tools as
EDA tools.
With the advent of VLSI (Very Large Scale Integration) technology, designers could design single chips with
more than 100,000 transistors. Because of the complexity of these circuits, it was not possible to verify these
circuits on a breadboard. Computer-aided techniques became critical for verification and design of VLSI
digital circuits. Computer programs to do automatic placement and routing of circuit layouts also became
popular. The designers were now building gate-level digital circuits manually on graphic terminals. They
would build small building blocks and then derive higher-level blocks from them. This process would
continue until they had built the top-level block. Logic simulators came into existence to verify the
functionality of these circuits before they were fabricated on chip.
As designs got larger and more complex, logic simulation assumed an important role in the design process.
Designers could iron out functional bugs in the architecture before the chip was designed further.
Emergence of HDLs
For a long time, programming languages such as FORTRAN, Pascal, and C were being used to
describe computer programs that were sequential in nature. Similarly, in the digital design field, designers felt
the need for a standard language to describe digital circuits. Thus, Hardware Description Languages (HDLs)
came into existence. HDLs allowed the designers to model the concurrency of processes found in hardware
elements. Hardware description languages such as Verilog HDL and VHDL became popular. Verilog HDL
originated in 1983 at Gateway Design Automation. Later, VHDL was developed under contract from
DARPA. Both Verilog and VHDL simulators to simulate large digital circuits quickly gained acceptance from
designers.
Even though HDLs were popular for logic verification, designers had to manually translate the HDL-based
design into a schematic circuit with interconnections between gates. The advent of logic synthesis in the
late 1980s changed the design methodology radically. Digital circuits could be described at a register transfer
level (RTL) by use of an HDL. Thus, the designer had to specify how the data flows between registers and
how the design processes the data. The details of gates and their interconnections to implement the circuit
were automatically extracted by logic synthesis tools from the RTL description.
Thus, logic synthesis pushed the HDLs into the forefront of digital design. Designers no longer had to
manually place gates to build digital circuits. They could describe complex circuits at an abstract level in
terms of functionality and data flow by designing those circuits in HDLs. Logic synthesis tools would
implement the specified functionality in terms of gates and gate interconnections.
HDLs also began to be used for system-level design. HDLs were used for simulation of system boards,
interconnect buses, FPGAs (Field Programmable Gate Arrays), and PALs (Programmable Array Logic). A
common approach is to design each IC chip, using an HDL, and then verify system functionality via
simulation.
Today, Verilog HDL is an accepted IEEE standard. The original standard, IEEE 1364-1995, was
approved in 1995. IEEE 1364-2001 made significant improvements to the original standard and was followed
by a further revision, IEEE 1364-2005.
4.2 Typical Design Flow

A typical design flow for designing VLSI IC circuits is shown in Figure 4.1. Unshaded blocks show
the level of design representation; shaded blocks show processes in the design flow.
Block diagram of typical design flow.
Fig 4.1. Typical Design Flow

The design flow shown in Figure 4.1 is typically used by designers who use HDLs. In any design,
specifications are written first. Specifications describe abstractly the functionality, interface, and overall
architecture of the digital circuit to be designed. At this point, the architects do not need to think about how
they will implement this circuit.
A behavioral description is then created to analyze the design in terms of functionality, performance,
compliance to standards, and other high-level issues. Behavioral descriptions are often written with HDLs.
New EDA tools have emerged to simulate behavioral descriptions of circuits. These tools combine the
powerful concepts from HDLs and object oriented languages such as C++. These tools can be used instead of
writing behavioral descriptions in Verilog HDL.
The behavioral description is manually converted to an RTL description in an HDL. The designer has
to describe the data flow that will implement the desired digital circuit. From this point onward, the design
process is done with the assistance of EDA tools.
Logic synthesis tools convert the RTL description to a gate-level netlist. A gate-level netlist is a
description of the circuit in terms of gates and connections between them. Logic synthesis tools ensure that
the gate-level netlist meets timing, area, and power specifications. The gate-level netlist is input to an
Automatic Place and Route tool, which creates a layout. The layout is verified and then fabricated on a chip.
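As an illustration of the RTL-to-netlist step, the sketch below describes a half adder twice: once at the RTL level and once as a hand-written gate-level netlist of the kind a synthesis tool might produce. The module names (half_adder_rtl, half_adder_gates, half_adder_check) are hypothetical and chosen only for this example. A real tool would normally emit technology-specific library cells rather than the generic xor and and primitives used here.

```verilog
// RTL description: states what is computed, not which gates are used.
module half_adder_rtl (
    input  wire a,
    input  wire b,
    output wire sum,
    output wire carry
);
    // The 2-bit left-hand side makes the addition 2 bits wide,
    // so the carry-out is captured in the upper bit.
    assign {carry, sum} = a + b;
endmodule

// Gate-level netlist: the same function expressed as gates and wires.
// Hand-written for illustration; not the output of any particular tool.
module half_adder_gates (
    input  wire a,
    input  wire b,
    output wire sum,
    output wire carry
);
    xor g1 (sum,   a, b);
    and g2 (carry, a, b);
endmodule

// Small self-checking stimulus that compares the two descriptions
// over all four input combinations.
module half_adder_check;
    reg  a, b;
    wire s_rtl, c_rtl, s_gate, c_gate;
    integer i;

    half_adder_rtl   u_rtl   (.a(a), .b(b), .sum(s_rtl),  .carry(c_rtl));
    half_adder_gates u_gates (.a(a), .b(b), .sum(s_gate), .carry(c_gate));

    initial begin
        for (i = 0; i < 4; i = i + 1) begin
            {a, b} = i[1:0];
            #1;
            if ({c_rtl, s_rtl} !== {c_gate, s_gate})
                $display("MISMATCH: a=%b b=%b", a, b);
        end
        $display("comparison finished");
        $finish;
    end
endmodule
```

Simulating half_adder_check in any Verilog simulator exercises both descriptions side by side, which is essentially what a post-synthesis simulation does on a larger scale.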
Thus, most digital design activity is concentrated on manually optimizing the RTL description of the
circuit. After the RTL description is frozen, EDA tools are available to assist the designer in further processes.
Designing at the RTL level has shrunk the design cycle times from years to a few months. It is also possible to
do many design iterations in a short period of time.
Behavioral synthesis tools have begun to emerge recently. These tools can create RTL descriptions
from a behavioral or algorithmic description of the circuit. As these tools mature, digital circuit design will
become similar to high-level computer programming. Designers will simply implement the algorithm in an
HDL at a very abstract level. EDA tools will help the designer convert the behavioral description to a final IC
chip.
It is important to note that, although EDA tools are available to automate the processes and cut design
cycle times, the designer is still the person who controls how the tool will perform. EDA tools are also
susceptible to the "GIGO: Garbage In, Garbage Out" phenomenon. If used improperly, EDA tools will lead to
inefficient designs. Thus, the designer still needs to understand the nuances of design methodologies, using
EDA tools to obtain an optimized design.
Importance of HDLs
HDLs have many advantages compared to traditional schematic-based design.
i.
Designs can be described at a very abstract level by use of HDLs. Designers can write their RTL
description without choosing a specific fabrication technology. Logic synthesis tools can automatically
convert the design to any fabrication technology. If a new technology emerges, designers do not need
to redesign their circuit.
ii.
When a new technology emerges, designers simply input the RTL description to the logic synthesis
tool and create a new gate-level netlist using the new fabrication technology. The logic synthesis tool
will optimize the circuit in area and timing for the new technology.
iii.
By describing designs in HDLs, functional verification of the design can be done early in the design
cycle. Since designers work at the RTL level, they can optimize and modify the RTL description until
it meets the desired functionality. Most design bugs are eliminated at this point. This cuts down design
cycle time significantly because the probability of hitting a functional bug at a later time in the
gate-level netlist or physical layout is minimized.
iv.
Designing with HDLs is analogous to computer programming. A textual description with comments is
an easier way to develop and debug circuits. This also provides a concise representation of the design,
compared to gate-level schematics. Gate-level schematics are almost incomprehensible for very
complex designs.
v.
HDL-based design is here to stay.[3] With rapidly increasing complexities of digital circuits and
increasingly sophisticated EDA tools, HDLs are now the dominant method for large digital designs.
No digital circuit designer can afford to ignore HDL-based design.
vi.
New tools and languages focused on verification have emerged in the past few years. These languages
are better suited for functional verification. However, for logic design, HDLs continue as the preferred
choice.
Popularity of Verilog HDL

Verilog HDL has evolved as a standard hardware description language. Verilog HDL offers many useful
features:
i.
Verilog HDL is a general-purpose hardware description language that is easy to learn and easy to use.
It is similar in syntax to the C programming language. Designers with C programming experience will
find it easy to learn Verilog HDL.
ii.
Verilog HDL allows different levels of abstraction to be mixed in the same model. Thus, a designer
can define a hardware model in terms of switches, gates, RTL, or behavioral code. Also, a designer
needs to learn only one language for stimulus and hierarchical design.
iii.
Most popular logic synthesis tools support Verilog HDL. This makes it the language of choice for
designers.
iv.
All fabrication vendors provide Verilog HDL libraries for post-logic-synthesis simulation. Thus,
designing a chip in Verilog HDL allows the widest choice of vendors.
v.
The Programming Language Interface (PLI) is a powerful feature that allows the user to write custom
C code to interact with the internal data structures of Verilog. Designers can customize a Verilog HDL
simulator to their needs with the PLI.
Trends in HDLs
The speed and complexity of digital circuits have increased rapidly. Designers have responded by
designing at higher levels of abstraction. Designers have to think only in terms of functionality. EDA tools
take care of the implementation details. With designer assistance, EDA tools have become sophisticated
enough to achieve a close-to-optimum implementation.
The most popular trend currently is to design in HDL at an RTL level, because logic synthesis tools
can create gate-level net lists from RTL level design. Behavioral synthesis allowed engineers to design
directly in terms of algorithms and the behavior of the circuit, and then use EDA tools to do the translation
and optimization in each phase of the design. However, behavioral synthesis did not gain widespread
acceptance. Today, RTL design continues to be very popular. Verilog HDL is also being constantly enhanced
to meet the needs of new verification methodologies.
Formal verification and assertion checking techniques have emerged. Formal verification applies
formal mathematical techniques to verify the correctness of Verilog HDL descriptions and to establish
equivalency between RTL and gate-level netlists. However, the need to describe a design in Verilog HDL will
not go away. Assertion checkers allow checking to be embedded in the RTL code. This is a convenient way to
do checking in the most important parts of a design.
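A minimal sketch of a check embedded in RTL code is given below. It uses a plain procedural test with $display as a stand-in for a dedicated assertion language, so it is an approximation of the idea rather than the method of any particular assertion checker. The module decade_counter, its ports, and the synchronous active-high reset are assumptions made for the example. The translate_off / translate_on comments are a commonly supported synthesis directive; whether a given tool honours them should be confirmed in that tool's documentation.

```verilog
// Decade counter (counts 0..9) with a simulation-only check embedded
// next to the logic it guards. All names are hypothetical.
module decade_counter (
    input  wire       clk,
    input  wire       reset,   // synchronous, active high (assumed)
    output reg  [3:0] count
);
    // Design logic: wrap to 0 after 9.
    always @(posedge clk) begin
        if (reset || count == 4'd9)
            count <= 4'd0;
        else
            count <= count + 4'd1;
    end

    // Embedded check: the counter must never hold a value above 9.
    // synthesis translate_off
    always @(posedge clk) begin
        if (!reset && count > 4'd9)
            $display("CHECK FAILED at time %0t: count = %0d", $time, count);
    end
    // synthesis translate_on
endmodule
```

Placing the check inside the module keeps it close to the property it protects, so it travels with the design whenever the module is reused in another test bench.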
New verification languages have also gained rapid acceptance. These languages combine the
parallelism and hardware constructs from HDLs with the object oriented nature of C++. These languages also
provide support for automatic stimulus creation, checking, and coverage. However, these languages do not
replace Verilog HDL. They simply boost the productivity of the verification process. Verilog HDL is still
needed to describe the design.
For very high-speed and timing-critical circuits like microprocessors, the gate-level net list provided
by logic synthesis tools is not optimal. In such cases, designers often mix gate-level description directly into
the RTL description to achieve optimum results. This practice is opposite to the high-level design paradigm,
yet it is frequently used for high-speed designs because designers need to squeeze the last bit of timing out of
circuits, and EDA tools sometimes prove to be insufficient to achieve the desired results.
Another technique that is used for system-level design is a mixed bottom-up methodology where the
designers use either existing Verilog HDL modules, basic building blocks, or vendor-supplied core blocks to
quickly bring up their system simulation. This is done to reduce development costs and compress design
schedules.
For example, consider a system that has a CPU, graphics chip, I/O chip, and a system bus. The CPU
designers would build the next-generation CPU themselves at an RTL level, but they would use behavioral
models for the graphics chip and the I/O chip and would buy a vendor-supplied model for the system bus.
Thus, the system-level simulation for the CPU could be up and running very quickly and long before the RTL
descriptions for the graphics chip and the I/O chip are completed.
Hierarchical Modeling Concepts
Before we discuss the details of the Verilog language, we must first understand basic hierarchical
modeling concepts in digital design. The designer must use a "good" design methodology to do efficient
Verilog HDL-based design. In this chapter, we discuss typical design methodologies and illustrate how these
concepts are translated to Verilog. A digital simulation is made up of various components. We talk about the
components and their interconnections.
Learning Objectives
i.
Understand top-down and bottom-up design methodologies for digital design.
ii.
Explain differences between modules and module instances in Verilog.
iii.
Describe four levels of abstraction - behavioral, data flow, gate level, and switch level - to represent
the same module.
iv.
Describe components required for the simulation of a digital design. Define a stimulus block and a
design block. Explain two methods of applying stimulus.
Design Methodologies
There are two basic types of digital design methodologies: a top-down design methodology and
a bottom-up design methodology. In a top-down design methodology, we define the top-level block and
identify the sub-blocks necessary to build the top-level block. We further subdivide the sub-blocks until we
come to leaf cells, which are the cells that cannot further be divided.
Block diagram of Top design Methodology
Fig 4.2 Top-down Design Methodology
In a bottom-up design methodology, we first identify the building blocks that are available to us. We
build bigger cells, using these building blocks. These cells are then used for higher-level blocks until we build
the top-level block in the design. This is the reverse of the top-down process shown in Figure 4.2.
Typically, a combination of top-down and bottom-up flows is used. Design architects define the
specifications of the top-level block. Logic designers decide how the design should be structured by breaking
up the functionality into blocks and sub-blocks. At the same time, circuit designers are designing optimized
circuits for leaf-level cells. They build higher-level cells by using these leaf cells. The flow meets at an
intermediate point where the switch-level circuit designers have created a library of leaf cells by using
switches, and the logic level designers have designed from top-down until all modules are defined in terms of
leaf cells.
To illustrate these hierarchical modeling concepts, let us consider the design of a negative
edge-triggered 4-bit ripple carry counter, described below.
4-bit Ripple Carry Counter
The ripple carry counter shown in Figure 4.3 is made up of negative edge-triggered toggle flipflops
(T_FF).
Fig 4.3. Ripple Carry Counter

Each of the T_FFs can be made up from negative edge-triggered D-flipflops (D_FF) and inverters
(assuming q_bar output is not available on the D_FF), as shown in Figure 4.4.
Block diagram of T-Flip Flop
Fig 4.4. T-flipflop

Thus, the ripple carry counter is built in a hierarchical fashion by using building blocks.
The block diagram of the design hierarchy.
Fig 4.5. Design Hierarchy

In a top-down design methodology, we first have to specify the functionality of the ripple carry
counter, which is the top-level block. Then, we implement the counter with T_FFs. We build the T_FFs from
the D_FF and an additional inverter gate. Thus, we break bigger blocks into smaller building sub-blocks until
we decide that we cannot break up the blocks any further. A bottom-up methodology flows in the opposite
direction. We combine small building blocks and build bigger blocks; e.g., we could build the D_FF from AND
and OR gates, or we could build a custom D_FF from transistors. Thus, the bottom-up flow meets the
top-down flow at the level of the D_FF.
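The hierarchy described above can be sketched in Verilog as three modules, one per level. The block names T_FF and D_FF come from the figures; the port names (q, d, clk, reset) and the asynchronous active-high reset are assumptions made for this sketch.

```verilog
// Top-level block: 4-bit ripple carry counter built from four T_FFs.
// Each stage is clocked by the output of the previous stage.
module ripple_carry_counter (q, clk, reset);
    output [3:0] q;
    input        clk, reset;

    T_FF tff0 (q[0], clk,  reset);
    T_FF tff1 (q[1], q[0], reset);
    T_FF tff2 (q[2], q[1], reset);
    T_FF tff3 (q[3], q[2], reset);
endmodule

// Middle level: toggle flipflop built from a D_FF and an inverter.
module T_FF (q, clk, reset);
    output q;
    input  clk, reset;
    wire   d;

    D_FF dff0 (q, d, clk, reset);
    not  n1   (d, q);             // feed back the inverted output
endmodule

// Leaf cell: negative edge-triggered D flipflop.
// Asynchronous active-high reset is an assumption of this sketch.
module D_FF (q, d, clk, reset);
    output q;
    input  d, clk, reset;
    reg    q;

    always @(posedge reset or negedge clk)
        if (reset)
            q <= 1'b0;
        else
            q <= d;
endmodule
```

Read from the top, the code follows the top-down view (counter, then T_FF, then D_FF); read from the bottom, it follows the bottom-up view in which the D_FF is the available building block.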
Modules
We now relate these hierarchical modeling concepts to Verilog. Verilog provides the concept of
a module. A module is the basic building block in Verilog. A module can be an element or a collection of
lower-level design blocks. Typically, elements are grouped into modules to provide common functionality
that is used at many places in the design. A module provides the necessary functionality to the higher-level
block through its port interface (inputs and outputs), but hides the internal implementation. This allows the
designer to modify module internals without affecting the rest of the design.
In Figure 4.5, the ripple carry counter, T_FF, and D_FF are examples of modules. In Verilog, a module is
declared by the keyword module. A corresponding keyword endmodule must appear at the end of the module
definition. Each module must have a module_name, which is the identifier for the module, and
a module_terminal_list, which describes the input and output terminals of the module.
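The general shape of a module definition is sketched below with a small parity generator. The name parity4 and its ports are hypothetical and serve only to show where the module name, the terminal list, and the closing keyword appear.

```verilog
// module <module_name> (<module_terminal_list>);
module parity4 (data, odd);
    input  [3:0] data;   // input terminal
    output       odd;    // output terminal

    // Internals are hidden behind the port interface:
    // odd is 1 when data contains an odd number of 1 bits.
    assign odd = ^data;
endmodule                // closes the module definition
```

A higher-level block sees only data and odd, so the reduction XOR could later be replaced by an explicit gate network without changing any block that uses parity4.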
Verilog is both a behavioral and a structural language. Internals of each module can be defined
at four levels of abstraction, depending on the needs of the design.
The module behaves identically with the external environment irrespective of the level of abstraction
at which the module is described. The internals of the module are hidden from the environment. Thus, the
level of abstraction to describe a module can be changed without any change in the environment. The levels
are defined below.
i.
Behavioral or algorithmic level

This is the highest level of abstraction provided by Verilog HDL. A module can be
implemented in terms of the desired design algorithm without concern for the hardware
implementation details. Designing at this level is very similar to C programming.
ii.
Dataflow level
At this level, the module is designed by specifying the data flow. The designer is aware of how
data flows between hardware registers and how the data is processed in the design.
iii.
Gate level
The module is implemented in terms of logic gates and interconnections between these gates.
Design at this level is similar to describing a design in terms of a gate-level logic diagram.
iv.
Switch level
This is the lowest level of abstraction provided by Verilog. A module can be implemented in terms
of switches, storage nodes, and the interconnections between them. Design at this level requires
knowledge of switch-level implementation details.
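To make the four levels concrete, the sketch below describes the same 2-to-1 multiplexer at each of them. The module and signal names (mux2_*, out, s, i0, i1) are assumptions chosen for the example. All four modules have the same port interface, so any one of them could be substituted for another in a larger design.

```verilog
// i. Behavioral level: an algorithmic description.
module mux2_behavioral (output reg out, input wire s, input wire i0, input wire i1);
    always @(s or i0 or i1)
        if (s)
            out = i1;
        else
            out = i0;
endmodule

// ii. Dataflow level: a continuous assignment describes the data flow.
module mux2_dataflow (output wire out, input wire s, input wire i0, input wire i1);
    assign out = s ? i1 : i0;
endmodule

// iii. Gate level: logic gates and their interconnections.
module mux2_gate (output wire out, input wire s, input wire i0, input wire i1);
    wire sbar, y0, y1;
    not (sbar, s);
    and (y0, i0, sbar);
    and (y1, i1, s);
    or  (out, y0, y1);
endmodule

// iv. Switch level: MOS switches and CMOS transmission gates.
module mux2_switch (output wire out, input wire s, input wire i0, input wire i1);
    wire    sbar;
    supply1 vdd;
    supply0 gnd;

    // CMOS inverter producing sbar = ~s
    pmos (sbar, vdd, s);   // conducts when s = 0
    nmos (sbar, gnd, s);   // conducts when s = 1

    // cmos (output, data, n_control, p_control)
    cmos (out, i0, sbar, s);   // passes i0 when s = 0
    cmos (out, i1, s, sbar);   // passes i1 when s = 1
endmodule
```

The behavioral and dataflow versions are the ones a logic synthesis tool would normally accept as RTL; the switch-level version is mainly useful for simulation of custom circuits.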
Verilog allows the designer to mix and match all four levels of abstractions in a design. In the digital
design community, the term register transfer level (RTL) is frequently used for a Verilog description that uses
a combination of behavioral and dataflow constructs and is acceptable to logic synthesis tools.
If a design contains four modules, Verilog allows each of the modules to be written at a different level
of abstraction. As the design matures, most modules are replaced with gate-level implementations.
Normally, the higher the level of abstraction, the more flexible and technology-independent the
design. As one goes lower toward switch-level design, the design becomes technology-dependent and
inflexible. A small modification can cause a significant number of changes in the design. Consider the
analogy with C programming and assembly language programming. It is easier to program in a higher-level
language such as C. The program can be easily ported to any machine. However, if you design at the
assembly level, the program is specific for that machine and cannot be easily ported to another machine.
Instances
A module provides a template from which you can create actual objects. When a module is invoked,
Verilog creates a unique object from the template. Each object has its own name, variables, parameters, and
I/O interface.
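The template-and-object relationship can be sketched as follows. The module shift_reg, its parameter WIDTH, and the wrapper two_instances are hypothetical names introduced for this example. The same template is instantiated twice, and each instance has its own name, its own parameter value, and its own connections.

```verilog
// Template: a serial-in shift register whose width is a parameter.
module shift_reg #(parameter WIDTH = 4) (
    input  wire             clk,
    input  wire             din,
    output reg  [WIDTH-1:0] q
);
    always @(posedge clk)
        q <= {q[WIDTH-2:0], din};   // shift left, new bit enters at bit 0
endmodule

// Two independent objects created from the same template.
module two_instances (
    input  wire       clk,
    input  wire       din,
    output wire [3:0] qa,
    output wire [7:0] qb
);
    shift_reg              u_short (.clk(clk), .din(din), .q(qa)); // default WIDTH = 4
    shift_reg #(.WIDTH(8)) u_long  (.clk(clk), .din(din), .q(qb)); // overridden WIDTH = 8
endmodule
```

Within a simulation the two registers are addressed by their instance names, u_short and u_long, and changing the state of one has no effect on the other.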
Components of a Simulation
Once a design block is completed, it must be tested. The functionality of the design block can be tested
by applying stimulus and checking results. The block that applies the stimulus and checks the results is called
the stimulus block. It is good practice to keep the stimulus and design blocks separate. The stimulus block can
be written in Verilog. A separate language is not required to describe stimulus. The stimulus block is also
commonly called a test bench. Different test benches can be used to thoroughly test the design block.
Two styles of stimulus application are possible. In the first style, the stimulus block instantiates the
design block and directly drives the signals in the design block.
Block diagram of Stimulus block Instantiates design
Fig 4.6. Stimulus Block Instantiates Design Block
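A minimal sketch of this first style is given below. The design block counter4 is a hypothetical 4-bit counter included only so that the example is self-contained, and the delays are arbitrary. The stimulus module instantiates the design block and drives its clk and reset inputs directly.

```verilog
// Design block (hypothetical): 4-bit counter, negative edge-triggered,
// with an asynchronous active-high reset.
module counter4 (q, clk, reset);
    output [3:0] q;
    input        clk, reset;
    reg    [3:0] q;

    always @(negedge clk or posedge reset)
        if (reset)
            q <= 4'd0;
        else
            q <= q + 4'd1;
endmodule

// Stimulus block: has no ports; it instantiates the design block
// and drives its signals directly.
module stimulus;
    reg        clk, reset;
    wire [3:0] q;

    counter4 dut (q, clk, reset);

    // Clock with a period of 10 time units.
    initial clk = 1'b0;
    always #5 clk = ~clk;

    // Reset sequence, then end of simulation.
    initial begin
        reset = 1'b1;
        #15  reset = 1'b0;
        #180 reset = 1'b1;
        #10  reset = 1'b0;
        #20  $finish;
    end

    // Display the output whenever it changes.
    initial $monitor($time, " q = %d", q);
endmodule
```

Because the stimulus module is the top of the hierarchy, it is the module that is selected for simulation.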

The second style of applying stimulus is to instantiate both the stimulus and design blocks in a top-level
dummy module. The stimulus block interacts with the design block only through the interface. This style
of applying stimulus is shown in Figure 4.7. The stimulus module drives the signals d_clk and d_reset, which
are connected to the signals clk and reset in the design block. It also checks and displays signal c_q, which is
connected to the signal q in the design block. The function of top-level block is simply to instantiate the
design and stimulus blocks.
Block diagram of the Stimulus and Design Blocks Instantiated in a Dummy Top-Level Module
Fig 4.7 Stimulus and Design Blocks Instantiated in a Dummy Top-Level Module
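The second style can be sketched as follows. The signal names d_clk, d_reset, c_q, clk, reset, and q follow the description above; the module names design_block, stimulus_block, and top, the 4-bit width of q, and all delays are assumptions made for this example.

```verilog
// Design block (hypothetical 4-bit counter).
module design_block (q, clk, reset);
    output [3:0] q;
    input        clk, reset;
    reg    [3:0] q;

    always @(negedge clk or posedge reset)
        if (reset)
            q <= 4'd0;
        else
            q <= q + 4'd1;
endmodule

// Stimulus block: drives d_clk and d_reset and observes c_q,
// all through its own port interface.
module stimulus_block (d_clk, d_reset, c_q);
    output       d_clk, d_reset;
    input  [3:0] c_q;
    reg          d_clk, d_reset;

    initial d_clk = 1'b0;
    always #5 d_clk = ~d_clk;

    initial begin
        d_reset = 1'b1;
        #15  d_reset = 1'b0;
        #200 $finish;
    end

    initial $monitor("time %0t: c_q = %d", $time, c_q);
endmodule

// Dummy top-level module: its only job is to instantiate and
// connect the stimulus and design blocks.
module top;
    wire       clk_w, reset_w;
    wire [3:0] q_w;

    stimulus_block stim (.d_clk(clk_w), .d_reset(reset_w), .c_q(q_w));
    design_block   dut  (.q(q_w), .clk(clk_w), .reset(reset_w));
endmodule
```

Since the two blocks communicate only through wires in top, either one can be replaced without editing the other, provided the port interface is unchanged.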
FPGA Development Board

The Digilent Digilab IIE board (available from www.digilentinc.com) will be used in this tutorial;
however, other boards utilizing a Xilinx FPGA can easily be substituted. The Digilab IIE board contains a
Xilinx Spartan-IIE device (XC2S200E) with the equivalent of approximately 200,000 gates. The Digilab IIE
board contains a minimal number of prototyping devices, but it contains six 40-pin I/O headers for attaching
daughter boards with additional functionality. The Spartan-IIE devices are identical to those in Xilinx's
Virtex-E family of FPGAs. In fact, the product model stored on the device itself is the Virtex-E model number
(for the FPGA on the Digilab board the on-chip device name is XCV200E).
VHDL Design Entry
The Xilinx ISE tools allow the design to be entered several ways including graphical schematics, state
machine diagrams, VHDL, and Verilog. This tutorial will focus on VHDL entry, but the other methods are
similar and can be easily explored once the reader is comfortable with the ISE software.
Starting A Project
Start the Xilinx ISE Project Navigator. Choose File -> New Project. A popup dialog box will appear. Enter
tutor1 for Project Name. For the Project Location, select the directory where the project will be stored (e.g.,
Z:\XProj\tutor1).
Next, the FPGA that will be used with this project needs to be specified. The compilation process is
device specific, so the complete device specification (including package type) must be entered when creating
a new project. Look on the top of the FPGA you are using. The device model, package type, and speed grade
will be printed on it. The Spartan-IIE device used on the Digilab IIE board is shown in Figure L1.2. For the
Digilab IIE board, set the Device Family to Spartan2E. The device model is XC2S200E, the package type is
PQ208, and the speed grade is 6, so set the Device to xc2s200e-6pq208. Finally, set the Design Flow to XST
VHDL and click OK.
A project can be retargeted (i.e., compiled for another FPGA device) after the project is created by
selecting the xc2s200e-6pq208-XST VHDL item in the Module View window and then choosing Source ->
Properties.
The programmable logic boards used for CIS 372 are Xilinx Virtex-II Pro development systems. The
centerpiece of the board is a Virtex-II Pro XC2VP30 FPGA (field-programmable gate array), which can be
programmed via a USB cable or compact flash card. The board also features PS/2, serial, Ethernet, stereo
audio and VGA video ports, user buttons, switches and LEDs, and expansion ports for connecting to other
boards.
1. Preliminaries
Each Klab station contains a Windows machine on the left and a Linux machine on the right. The
software for programming the FPGA (Xilinx ISE Project Navigator) is on the Windows machine. Open ISE
from Start -> All Programs -> Xilinx ISE 8.2i -> Project Navigator.
On the Windows machine, your eniac account is mounted on the S: drive. Xilinx tools have to access
many files, and they become very slow when they have to access those files over Samba. It is recommended that
you keep a copy of your project in your eniac account, copy the project directory to the local drive (C:user is the
only writeable directory, so somewhere under there), use the local copy while in the lab, copy the project back
to your eniac account when you are done, and delete the local copy, making sure to empty the recycling bin.
2. Creating a new project in ISE
1. First, ISE may have opened a previously used project. If so, close the project using File -> Close
Project.
2. An ISE project contains all the files needed to design a piece of hardware and download it to the
FPGA. Go to File -> New Project to create a new ISE project. Give the project a location on your
mapped Eniac drive and enter a name for the project, such as "tutorial". Set the Top-Level Source
Type to HDL and click Next.
3. The following screen allows you to set the properties for the FPGA you will be downloading your
design to. For our boards, the correct settings are Family = "Virtex2P", Device = "XC2VP30",
Package = "FF896", and Speed = "-7". Set the Synthesis Tool to "XST (VHDL/Verilog)" and
Simulator to "Modelsim-XE Verilog" and click Next.
4. On the next screen, click the New Source button. Select Verilog Module from the list and give the
module the file name "switch", then click Next. A Verilog module is a self-contained hardware unit
with an interface of inputs and outputs, which are specified on the next screen.
5. This screen takes your inputs and outputs and automatically generates code for your module.
6. Click Next and click Finish on the next screen. This will bring you back to the New Project window;
click Next twice and then Finish once to generate your module.
3. Coding your switch module

1. Once your module is generated, the main ISE Project Navigator view appears. There are four main
windows in the Project Navigator: Sources, Processes, Console output, and the editor. The Design
Summary tab on the editor window will be selected after you generate your new module; for now, you
can close this tab with the X button in the upper-right corner.
2. Select the switch.v tab in the editor window. If this tab ever gets closed, you can open the file again by
double-clicking on it in the Sources window. The switch module will link the four user input switches
on the board to the four LEDs next to the switches, so toggling the switches will turn the LEDs on and
off. The values of the switches are inputs to the module, and the signals to the LEDs are outputs from
the module.
3. To "compile" your Verilog code, make sure the switch.v file is highlighted in the Sources window,
expand the Synthesize-XST item in the Processes window and double-click Check Syntax.
4. The console should not display any errors under the HDL Compilation section, and a success icon
(check mark) should appear next to Check Syntax. If you get compilation errors, resolve them (ask a TA for
help if necessary) before continuing.
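A minimal sketch of what switch.v might contain after step 2 is shown below. The port names SWITCHES and LEDS and the count of four switches and four LEDs are taken from the tutorial text; the exact declaration style generated by ISE and the electrical polarity of the LEDs on the board are not stated there, so both are assumptions.

```verilog
// switch.v: connect the four user switches straight to the four LEDs.
module switch (
    input  wire [3:0] SWITCHES,   // values of the user input switches
    output wire [3:0] LEDS        // signals driving the LEDs
);
    // Each LED follows the switch with the same index.
    // If the LEDs on a given board are active low, invert here
    // (assign LEDS = ~SWITCHES;) - this depends on the board.
    assign LEDS = SWITCHES;
endmodule
```

The module contains no clock or storage, so the LEDs respond to the switches combinationally once the ports are assigned to pins.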
4. Assigning ports to pins
1. For the ports in the module (SWITCHES and LEDS) to control the components on the board, they
must be connected to pins on the FPGA. To do this, click on switch.v in the Sources window, expand
the User Constraints item in Processes and double-click on Assign Package Pins. When prompted to
add a UCF file to your project, click Yes.
2. Xilinx PACE will open up. On the left, the ports of your module will be listed, and on the right is a
diagram of the FPGA. Click the Package View tab at the bottom of this window to see a diagram of
the unconnected pins on the FPGA.
3. Each component on the board is connected to a pin on the FPGA. Connect the pins to your module's
ports by clicking in the Loc box next to each port and typing in the proper pin, as shown in the image
below. You should see each pin location fill in with blue on the pin diagram to the right. The other
information (I/O Std., Drive Str., etc.) does not have to be filled in. Your list of ports should look like
this when you are done.
4. When you are done entering the pin information, click Save. You may be prompted to choose a Bus
Delimiter; choose the top option, XST Default: < >, and click OK. You can then close PACE.
5. Go back to the ISE Project Navigator. In the Sources window, expand the hierarchy for the switch
module to see the new file, switch.ucf, that has been added to the project. By double-clicking on Edit
Constraints (Text) in the Processes window, you can see the format of the UCF file. If you made a
mistake in your pin locations or want to change them in the future, you can directly edit the UCF file
instead of using PACE.
5. Generating a programming file
1. Click on switch.v in the Sources window. In the Processes window, scroll down and double-click on
Generate Programming File. This will run all of the processes necessary to create a file that can be
downloaded onto the board to program the FPGA. Running these processes may take several minutes;
progress is indicated by the spinning icon and output to the console. When a process completes, a
success icon (check mark) appears next to it. If any errors occurred during the process, an error icon will
appear next to it. All errors must be resolved before a programming file can be generated. Errors are
output to the console, and can be more easily seen by clicking on the Errors tab. Warnings cause
a warning icon to appear next to the process and can be seen under the Warnings tab.
2. All processes have run successfully when a success icon appears next to Generate Programming File.
You can scroll up in the Processes window and double-click on View Design Summary to see a report of
your design and links to more detailed reports.
Design Simulation
Verifying Functionality using Behavioral Simulation
Create a test bench waveform containing input stimulus you can use to verify the
functionality of the counter module. The test bench waveform is a graphical view of a test bench.
Create the test bench waveform as follows:
i. Select the counter HDL file in the Sources window.
ii. Create a new test bench source by selecting Project > New Source.
iii. In the New Source Wizard, select Test Bench WaveForm as the source type, and type counter_tbw in
the File Name field.
iv. Click Next.
v. The Associated Source page shows that you are associating the test bench waveform with the source
file counter. Click Next.
vi. The Summary page shows that the source will be added to the project, and it displays the source
directory, type and name. Click Finish.
vii. You need to set the clock frequency, setup time and output delay times in the Initialize Timing dialog
box before the test bench waveform editing window opens.
viii. The blue shaded areas that precede the rising edge of the CLOCK correspond to the Input Setup Time
in the Initialize Timing dialog box. Toggle the DIRECTION port to define the input stimulus for the
counter design as follows:
1. Click on the blue cell at approximately 300 ns to assert DIRECTION high so that the counter
will count up.
2. Click on the blue cell at approximately 900 ns to assert DIRECTION low so that the counter
will count down.
ix. Save the waveform.
x. In the Sources window, select the Behavioral Simulation view to see that the test bench waveform file
is automatically added to your project.
xi. Close the test bench waveform.
4.3 Program Code

Program for Cipher:
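module aes_cipher_top(clk, rst, ld, done, key, text_in, text_out );
input clk, rst;
input ld;
output done;
input [127:0] key;
input [127:0] text_in;
output [127:0] text_out;

// Local Wires
wire [31:0] w0, w1, w2, w3;
reg [127:0] text_in_r;
reg [127:0] text_out;
reg [7:0] sa00, sa01, sa02, sa03;
reg [7:0] sa10, sa11, sa12, sa13;
reg [7:0] sa20, sa21, sa22, sa23;
reg [7:0] sa30, sa31, sa32, sa33;
wire [7:0] sa00_next, sa01_next, sa02_next, sa03_next;
wire [7:0] sa10_next, sa11_next, sa12_next, sa13_next;
wire [7:0] sa20_next, sa21_next, sa22_next, sa23_next;
wire [7:0] sa30_next, sa31_next, sa32_next, sa33_next;
wire [7:0] sa00_sub, sa01_sub, sa02_sub, sa03_sub;
wire [7:0] sa10_sub, sa11_sub, sa12_sub, sa13_sub;
wire [7:0] sa20_sub, sa21_sub, sa22_sub, sa23_sub;
wire [7:0] sa30_sub, sa31_sub, sa32_sub, sa33_sub;
wire [7:0] sa00_sr, sa01_sr, sa02_sr, sa03_sr;
wire [7:0] sa10_sr, sa11_sr, sa12_sr, sa13_sr;
wire [7:0] sa20_sr, sa21_sr, sa22_sr, sa23_sr;
wire [7:0] sa30_sr, sa31_sr, sa32_sr, sa33_sr;
wire [7:0] sa00_mc, sa01_mc, sa02_mc, sa03_mc;
wire [7:0] sa10_mc, sa11_mc, sa12_mc, sa13_mc;
wire [7:0] sa20_mc, sa21_mc, sa22_mc, sa23_mc;
wire [7:0] sa30_mc, sa31_mc, sa32_mc, sa33_mc;
reg done, ld_r;
reg [3:0] dcnt;

// Misc Logic
// dcnt is loaded with 11 on ld and counts down; done pulses when it reaches 1.
always @(posedge clk)
    if(!rst) dcnt <= #1 4'h0;
    else
    if(ld) dcnt <= #1 4'hb;
    else
    if(|dcnt) dcnt <= #1 dcnt - 4'h1;

always @(posedge clk) done <= #1 !(|dcnt[3:1]) & dcnt[0] & !ld;
always @(posedge clk) if(ld) text_in_r <= #1 text_in;
always @(posedge clk) ld_r <= #1 ld;

// Initial Permutation (AddRoundKey)
// On ld_r the state is loaded with text_in XOR round key 0; otherwise it takes the round output.
always @(posedge clk) sa33 <= #1 ld_r ? text_in_r[007:000] ^ w3[07:00] : sa33_next;
always @(posedge clk) sa23 <= #1 ld_r ? text_in_r[015:008] ^ w3[15:08] : sa23_next;
always @(posedge clk) sa13 <= #1 ld_r ? text_in_r[023:016] ^ w3[23:16] : sa13_next;
always @(posedge clk) sa03 <= #1 ld_r ? text_in_r[031:024] ^ w3[31:24] : sa03_next;
always @(posedge clk) sa32 <= #1 ld_r ? text_in_r[039:032] ^ w2[07:00] : sa32_next;
always @(posedge clk) sa22 <= #1 ld_r ? text_in_r[047:040] ^ w2[15:08] : sa22_next;
always @(posedge clk) sa12 <= #1 ld_r ? text_in_r[055:048] ^ w2[23:16] : sa12_next;
always @(posedge clk) sa02 <= #1 ld_r ? text_in_r[063:056] ^ w2[31:24] : sa02_next;
always @(posedge clk) sa31 <= #1 ld_r ? text_in_r[071:064] ^ w1[07:00] : sa31_next;
always @(posedge clk) sa21 <= #1 ld_r ? text_in_r[079:072] ^ w1[15:08] : sa21_next;
always @(posedge clk) sa11 <= #1 ld_r ? text_in_r[087:080] ^ w1[23:16] : sa11_next;
always @(posedge clk) sa01 <= #1 ld_r ? text_in_r[095:088] ^ w1[31:24] : sa01_next;
always @(posedge clk) sa30 <= #1 ld_r ? text_in_r[103:096] ^ w0[07:00] : sa30_next;
always @(posedge clk) sa20 <= #1 ld_r ? text_in_r[111:104] ^ w0[15:08] : sa20_next;
always @(posedge clk) sa10 <= #1 ld_r ? text_in_r[119:112] ^ w0[23:16] : sa10_next;
always @(posedge clk) sa00 <= #1 ld_r ? text_in_r[127:120] ^ w0[31:24] : sa00_next;

// Round Permutations
// ShiftRows: row r of the state is rotated left by r bytes.
assign sa00_sr = sa00_sub;
assign sa01_sr = sa01_sub;
assign sa02_sr = sa02_sub;
assign sa03_sr = sa03_sub;
assign sa10_sr = sa11_sub;
assign sa11_sr = sa12_sub;
assign sa12_sr = sa13_sub;
assign sa13_sr = sa10_sub;
assign sa20_sr = sa22_sub;
assign sa21_sr = sa23_sub;
assign sa22_sr = sa20_sub;
assign sa23_sr = sa21_sub;
assign sa30_sr = sa33_sub;
assign sa31_sr = sa30_sub;
assign sa32_sr = sa31_sub;
assign sa33_sr = sa32_sub;

// MixColumns, one state column per call
assign {sa00_mc, sa10_mc, sa20_mc, sa30_mc} = mix_col(sa00_sr,sa10_sr,sa20_sr,sa30_sr);
assign {sa01_mc, sa11_mc, sa21_mc, sa31_mc} = mix_col(sa01_sr,sa11_sr,sa21_sr,sa31_sr);
assign {sa02_mc, sa12_mc, sa22_mc, sa32_mc} = mix_col(sa02_sr,sa12_sr,sa22_sr,sa32_sr);
assign {sa03_mc, sa13_mc, sa23_mc, sa33_mc} = mix_col(sa03_sr,sa13_sr,sa23_sr,sa33_sr);

// AddRoundKey
assign sa00_next = sa00_mc ^ w0[31:24];
assign sa01_next = sa01_mc ^ w1[31:24];
assign sa02_next = sa02_mc ^ w2[31:24];
assign sa03_next = sa03_mc ^ w3[31:24];
assign sa10_next = sa10_mc ^ w0[23:16];
assign sa11_next = sa11_mc ^ w1[23:16];
assign sa12_next = sa12_mc ^ w2[23:16];
assign sa13_next = sa13_mc ^ w3[23:16];
assign sa20_next = sa20_mc ^ w0[15:08];
assign sa21_next = sa21_mc ^ w1[15:08];
assign sa22_next = sa22_mc ^ w2[15:08];
assign sa23_next = sa23_mc ^ w3[15:08];
assign sa30_next = sa30_mc ^ w0[07:00];
assign sa31_next = sa31_mc ^ w1[07:00];
assign sa32_next = sa32_mc ^ w2[07:00];
assign sa33_next = sa33_mc ^ w3[07:00];

// Final text output (the last round skips MixColumns)
always @(posedge clk) text_out[127:120] <= #1 sa00_sr ^ w0[31:24];
always @(posedge clk) text_out[095:088] <= #1 sa01_sr ^ w1[31:24];
always @(posedge clk) text_out[063:056] <= #1 sa02_sr ^ w2[31:24];
always @(posedge clk) text_out[031:024] <= #1 sa03_sr ^ w3[31:24];
always @(posedge clk) text_out[119:112] <= #1 sa10_sr ^ w0[23:16];
always @(posedge clk) text_out[087:080] <= #1 sa11_sr ^ w1[23:16];
always @(posedge clk) text_out[055:048] <= #1 sa12_sr ^ w2[23:16];
always @(posedge clk) text_out[023:016] <= #1 sa13_sr ^ w3[23:16];
always @(posedge clk) text_out[111:104] <= #1 sa20_sr ^ w0[15:08];
always @(posedge clk) text_out[079:072] <= #1 sa21_sr ^ w1[15:08];
always @(posedge clk) text_out[047:040] <= #1 sa22_sr ^ w2[15:08];
always @(posedge clk) text_out[015:008] <= #1 sa23_sr ^ w3[15:08];
always @(posedge clk) text_out[103:096] <= #1 sa30_sr ^ w0[07:00];
always @(posedge clk) text_out[071:064] <= #1 sa31_sr ^ w1[07:00];
always @(posedge clk) text_out[039:032] <= #1 sa32_sr ^ w2[07:00];
always @(posedge clk) text_out[007:000] <= #1 sa33_sr ^ w3[07:00];

// Generic Functions
function [31:0] mix_col;
input [7:0] s0,s1,s2,s3;
reg [7:0] s0_o,s1_o,s2_o,s3_o;
begin
mix_col[31:24]=xtime(s0)^xtime(s1)^s1^s2^s3;
mix_col[23:16]=s0^xtime(s1)^xtime(s2)^s2^s3;
mix_col[15:08]=s0^s1^xtime(s2)^xtime(s3)^s3;
mix_col[07:00]=xtime(s0)^s0^s1^s2^xtime(s3);
end
endfunction

// Multiply by 2 in GF(2^8): shift left, then reduce with 8'h1b if the top bit was set.
function [7:0] xtime;
input [7:0] b; xtime={b[6:0],1'b0}^(8'h1b&{8{b[7]}});
endfunction

// Modules
aes_key_expand_128 u0(
.clk(clk),
.kld(ld),
.key(key),
.wo_0( w0),
.wo_1( w1),
.wo_2( w2),
.wo_3( w3));
aes_sbox us00(.a(sa00), .d(sa00_sub));
aes_sbox us01(.a(sa01), .d(sa01_sub));
aes_sbox us02(.a(sa02), .d(sa02_sub));
aes_sbox us03(.a(sa03), .d(sa03_sub));
aes_sbox us10(.a(sa10), .d(sa10_sub));
aes_sbox us11(.a(sa11), .d(sa11_sub));
aes_sbox us12(.a(sa12), .d(sa12_sub));
aes_sbox us13(.a(sa13), .d(sa13_sub));
aes_sbox us20(.a(sa20), .d(sa20_sub));
aes_sbox us21(.a(sa21), .d(sa21_sub));
aes_sbox us22(.a(sa22), .d(sa22_sub));
aes_sbox us23(.a(sa23), .d(sa23_sub));
aes_sbox us30(.a(sa30), .d(sa30_sub));
aes_sbox us31(.a(sa31), .d(sa31_sub));
aes_sbox us32(.a(sa32), .d(sa32_sub));
aes_sbox us33(.a(sa33), .d(sa33_sub));
endmodule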
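The xtime and mix_col functions in the cipher listing can be cross-checked against a small software model before simulation. The sketch below is an assumption on our part, not part of the original design: it mirrors the two Verilog functions in Python and checks them against the well-known MixColumns example column db 13 53 45, which maps to 8e 4d a1 bc.

```python
def xtime(b: int) -> int:
    """Multiply a byte by 2 in GF(2^8), reducing by x^8 + x^4 + x^3 + x + 1."""
    shifted = (b << 1) & 0xFF
    # Mirrors the Verilog: {b[6:0],1'b0} ^ (8'h1b & {8{b[7]}})
    return shifted ^ (0x1B if b & 0x80 else 0x00)


def mix_col(s0: int, s1: int, s2: int, s3: int) -> tuple:
    """Return the four MixColumns output bytes for one state column."""
    return (
        xtime(s0) ^ xtime(s1) ^ s1 ^ s2 ^ s3,
        s0 ^ xtime(s1) ^ xtime(s2) ^ s2 ^ s3,
        s0 ^ s1 ^ xtime(s2) ^ xtime(s3) ^ s3,
        xtime(s0) ^ s0 ^ s1 ^ s2 ^ xtime(s3),
    )


if __name__ == "__main__":
    column = mix_col(0xDB, 0x13, 0x53, 0x45)
    print(" ".join(f"{byte:02x}" for byte in column))
```

The same input column can be applied to the Verilog mix_col function in a test bench, and the two results compared byte by byte.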
Test bench
module aes_crypto_processor_tb;
reg clk;
reg rst;
reg kld;
reg [31:0]bus_key;
reg [31:0]bus_text;
wire [31:0]bus_textout;
initial
begin
clk = 1'b0;
rst = 1'b0;
kld = 1'b0;
bus_key = 32'h0000000a;
bus_text = 32'h0000000a;
#10 rst = 1'b1;
#10 bus_key = 32'h0000000a;
bus_text = 32'h0000000b;
Inverse cipher:
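module aes_inv_cipher_top(clk, rst, kld, ld, done, key, text_in, text_out );
input clk, rst;
input kld, ld;
output done;
input [127:0] key;
input [127:0] text_in;
output [127:0] text_out;

wire [31:0] wk0, wk1, wk2, wk3;
reg [31:0] w0, w1, w2, w3;
reg [127:0] text_in_r;
reg [127:0] text_out;
reg [7:0] sa00, sa01, sa02, sa03;
reg [7:0] sa10, sa11, sa12, sa13;
reg [7:0] sa20, sa21, sa22, sa23;
reg [7:0] sa30, sa31, sa32, sa33;
wire [7:0] sa00_next, sa01_next, sa02_next, sa03_next;
wire [7:0] sa10_next, sa11_next, sa12_next, sa13_next;
wire [7:0] sa20_next, sa21_next, sa22_next, sa23_next;
wire [7:0] sa30_next, sa31_next, sa32_next, sa33_next;
wire [7:0] sa00_sub, sa01_sub, sa02_sub, sa03_sub;
wire [7:0] sa10_sub, sa11_sub, sa12_sub, sa13_sub;
wire [7:0] sa20_sub, sa21_sub, sa22_sub, sa23_sub;
wire [7:0] sa30_sub, sa31_sub, sa32_sub, sa33_sub;
wire [7:0] sa00_sr, sa01_sr, sa02_sr, sa03_sr;
wire [7:0] sa10_sr, sa11_sr, sa12_sr, sa13_sr;
wire [7:0] sa20_sr, sa21_sr, sa22_sr, sa23_sr;
wire [7:0] sa30_sr, sa31_sr, sa32_sr, sa33_sr;
wire [7:0] sa00_ark, sa01_ark, sa02_ark, sa03_ark;
wire [7:0] sa10_ark, sa11_ark, sa12_ark, sa13_ark;
wire [7:0] sa20_ark, sa21_ark, sa22_ark, sa23_ark;
wire [7:0] sa30_ark, sa31_ark, sa32_ark, sa33_ark;
reg ld_r, go, done;
reg [3:0] dcnt;

// Misc Logic
// dcnt counts rounds upward from 1 while go is set; done pulses at round 11.
always @(posedge clk)
    if(!rst) dcnt <= #1 4'h0;
    else
    if(done) dcnt <= #1 4'h0;
    else
    if(ld) dcnt <= #1 4'h1;
    else
    if(go) dcnt <= #1 dcnt + 4'h1;

always @(posedge clk) done <= #1 (dcnt==4'hb) & !ld;

always @(posedge clk)
    if(!rst) go <= #1 1'b0;
    else
    if(ld) go <= #1 1'b1;
    else
    if(done) go <= #1 1'b0;

always @(posedge clk) if(ld) text_in_r <= #1 text_in;
always @(posedge clk) ld_r <= #1 ld;

// Initial Permutation
always @(posedge clk) sa33 <= #1 ld_r ? text_in_r[007:000] ^ w3[07:00] : sa33_next;
always @(posedge clk) sa23 <= #1 ld_r ? text_in_r[015:008] ^ w3[15:08] : sa23_next;
always @(posedge clk) sa13 <= #1 ld_r ? text_in_r[023:016] ^ w3[23:16] : sa13_next;
always @(posedge clk) sa03 <= #1 ld_r ? text_in_r[031:024] ^ w3[31:24] : sa03_next;
always @(posedge clk) sa32 <= #1 ld_r ? text_in_r[039:032] ^ w2[07:00] : sa32_next;
always @(posedge clk) sa22 <= #1 ld_r ? text_in_r[047:040] ^ w2[15:08] : sa22_next;
always @(posedge clk) sa12 <= #1 ld_r ? text_in_r[055:048] ^ w2[23:16] : sa12_next;
always @(posedge clk) sa02 <= #1 ld_r ? text_in_r[063:056] ^ w2[31:24] : sa02_next;
always @(posedge clk) sa31 <= #1 ld_r ? text_in_r[071:064] ^ w1[07:00] : sa31_next;
always @(posedge clk) sa21 <= #1 ld_r ? text_in_r[079:072] ^ w1[15:08] : sa21_next;
always @(posedge clk) sa11 <= #1 ld_r ? text_in_r[087:080] ^ w1[23:16] : sa11_next;
always @(posedge clk) sa01 <= #1 ld_r ? text_in_r[095:088] ^ w1[31:24] : sa01_next;
always @(posedge clk) sa30 <= #1 ld_r ? text_in_r[103:096] ^ w0[07:00] : sa30_next;
always @(posedge clk) sa20 <= #1 ld_r ? text_in_r[111:104] ^ w0[15:08] : sa20_next;
always @(posedge clk) sa10 <= #1 ld_r ? text_in_r[119:112] ^ w0[23:16] : sa10_next;
always @(posedge clk) sa00 <= #1 ld_r ? text_in_r[127:120] ^ w0[31:24] : sa00_next;

// Round Permutations
// InvShiftRows: row r of the state is rotated right by r bytes.
assign sa00_sr = sa00;
assign sa01_sr = sa01;
assign sa02_sr = sa02;
assign sa03_sr = sa03;
assign sa10_sr = sa13;
assign sa11_sr = sa10;
assign sa12_sr = sa11;
assign sa13_sr = sa12;
assign sa20_sr = sa22;
assign sa21_sr = sa23;
assign sa22_sr = sa20;
assign sa23_sr = sa21;
assign sa30_sr = sa31;
assign sa31_sr = sa32;
assign sa32_sr = sa33;
assign sa33_sr = sa30;

// AddRoundKey on the InvSubBytes outputs
assign sa00_ark = sa00_sub ^ w0[31:24];
assign sa01_ark = sa01_sub ^ w1[31:24];
assign sa02_ark = sa02_sub ^ w2[31:24];
assign sa03_ark = sa03_sub ^ w3[31:24];
assign sa10_ark = sa10_sub ^ w0[23:16];
assign sa11_ark = sa11_sub ^ w1[23:16];
assign sa12_ark = sa12_sub ^ w2[23:16];
assign sa13_ark = sa13_sub ^ w3[23:16];
assign sa20_ark = sa20_sub ^ w0[15:08];
assign sa21_ark = sa21_sub ^ w1[15:08];
assign sa22_ark = sa22_sub ^ w2[15:08];
assign sa23_ark = sa23_sub ^ w3[15:08];
assign sa30_ark = sa30_sub ^ w0[07:00];
assign sa31_ark = sa31_sub ^ w1[07:00];
assign sa32_ark = sa32_sub ^ w2[07:00];
assign sa33_ark = sa33_sub ^ w3[07:00];

// InvMixColumns, one state column per call
assign {sa00_next, sa10_next, sa20_next, sa30_next} = inv_mix_col(sa00_ark,sa10_ark,sa20_ark,sa30_ark);
assign {sa01_next, sa11_next, sa21_next, sa31_next} = inv_mix_col(sa01_ark,sa11_ark,sa21_ark,sa31_ark);
assign {sa02_next, sa12_next, sa22_next, sa32_next} = inv_mix_col(sa02_ark,sa12_ark,sa22_ark,sa32_ark);
assign {sa03_next, sa13_next, sa23_next, sa33_next} = inv_mix_col(sa03_ark,sa13_ark,sa23_ark,sa33_ark);

// Final Text Output
always @(posedge clk) text_out[127:120] <= #1 sa00_ark;
always @(posedge clk) text_out[095:088] <= #1 sa01_ark;
always @(posedge clk) text_out[063:056] <= #1 sa02_ark;
always @(posedge clk) text_out[031:024] <= #1 sa03_ark;
always @(posedge clk) text_out[119:112] <= #1 sa10_ark;
always @(posedge clk) text_out[087:080] <= #1 sa11_ark;
always @(posedge clk) text_out[055:048] <= #1 sa12_ark;
always @(posedge clk) text_out[023:016] <= #1 sa13_ark;
always @(posedge clk) text_out[111:104] <= #1 sa20_ark;
always @(posedge clk) text_out[079:072] <= #1 sa21_ark;
always @(posedge clk) text_out[047:040] <= #1 sa22_ark;
always @(posedge clk) text_out[015:008] <= #1 sa23_ark;
always @(posedge clk) text_out[103:096] <= #1 sa30_ark;
always @(posedge clk) text_out[071:064] <= #1 sa31_ark;
always @(posedge clk) text_out[039:032] <= #1 sa32_ark;
always @(posedge clk) text_out[007:000] <= #1 sa33_ark;

// Generic Functions
function [31:0] inv_mix_col;
input [7:0] s0,s1,s2,s3;
begin
inv_mix_col[31:24]=pmul_e(s0)^pmul_b(s1)^pmul_d(s2)^pmul_9(s3);
inv_mix_col[23:16]=pmul_9(s0)^pmul_e(s1)^pmul_b(s2)^pmul_d(s3);
inv_mix_col[15:08]=pmul_d(s0)^pmul_9(s1)^pmul_e(s2)^pmul_b(s3);
inv_mix_col[07:00]=pmul_b(s0)^pmul_d(s1)^pmul_9(s2)^pmul_e(s3);
end
endfunction

// Constant multipliers in GF(2^8), built from repeated xtime: 0e = 8^4^2, 09 = 8^1, 0d = 8^4^1, 0b = 8^2^1.
function [7:0] pmul_e;
input [7:0] b;
reg [7:0] two,four,eight;
begin
two=xtime(b);four=xtime(two);eight=xtime(four);pmul_e=eight^four^two;
end
endfunction

function [7:0] pmul_9;
input [7:0] b;
reg [7:0] two,four,eight;
begin
two=xtime(b);four=xtime(two);eight=xtime(four);pmul_9=eight^b;
end
endfunction

function [7:0] pmul_d;
input [7:0] b;
reg [7:0] two,four,eight;
begin
two=xtime(b);four=xtime(two);eight=xtime(four);pmul_d=eight^four^b;
end
endfunction

function [7:0] pmul_b;
input [7:0] b;
reg [7:0] two,four,eight;
begin
two=xtime(b);four=xtime(two);eight=xtime(four);pmul_b=eight^two^b;
end
endfunction

function [7:0] xtime;
input [7:0] b;xtime={b[6:0],1'b0}^(8'h1b&{8{b[7]}});
endfunction

// Key Buffer
// Round keys are written from index 10 down to 0 as they are generated, then read back by dcnt.
reg [127:0] kb[10:0];
reg [3:0] kcnt;
reg kdone;
reg kb_ld;

always @(posedge clk)
    if(!rst) kcnt <= #1 4'ha;
    else
    if(kld) kcnt <= #1 4'ha;
    else
    if(kb_ld) kcnt <= #1 kcnt - 4'h1;

always @(posedge clk)
    if(!rst) kb_ld <= #1 1'b0;
    else
    if(kld) kb_ld <= #1 1'b1;
    else
    if(kcnt==4'h0) kb_ld <= #1 1'b0;

always @(posedge clk) kdone <= #1 (kcnt==4'h0) & !kld;
always @(posedge clk) if(kb_ld) kb[kcnt] <= #1 {wk3, wk2, wk1, wk0};
always @(posedge clk) {w3, w2, w1, w0} <= #1 kb[dcnt];

// Modules
aes_key_expand_128 u0(
.clk(clk),
.kld(kld),
.key(key),
.wo_0( wk0),
.wo_1( wk1),
.wo_2( wk2),
.wo_3( wk3));
aes_inv_sbox us00(.a(sa00_sr), .d(sa00_sub));
aes_inv_sbox us01(.a(sa01_sr), .d(sa01_sub));
aes_inv_sbox us02(.a(sa02_sr), .d(sa02_sub));
aes_inv_sbox us03(.a(sa03_sr), .d(sa03_sub));
aes_inv_sbox us10(.a(sa10_sr), .d(sa10_sub));
aes_inv_sbox us11(.a(sa11_sr), .d(sa11_sub));
aes_inv_sbox us12(.a(sa12_sr), .d(sa12_sub));
aes_inv_sbox us13(.a(sa13_sr), .d(sa13_sub));
aes_inv_sbox us20(.a(sa20_sr), .d(sa20_sub));
aes_inv_sbox us21(.a(sa21_sr), .d(sa21_sub));
aes_inv_sbox us22(.a(sa22_sr), .d(sa22_sub));
aes_inv_sbox us23(.a(sa23_sr), .d(sa23_sub));
aes_inv_sbox us30(.a(sa30_sr), .d(sa30_sub));
aes_inv_sbox us31(.a(sa31_sr), .d(sa31_sub));
aes_inv_sbox us32(.a(sa32_sr), .d(sa32_sub));
aes_inv_sbox us33(.a(sa33_sr), .d(sa33_sub));
endmodule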
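The inv_mix_col function and its pmul_e, pmul_9, pmul_d and pmul_b helpers in the inverse cipher listing can likewise be checked in software. The following is a minimal sketch under the assumption that a Python model is acceptable as a reference; it reuses the function names from the Verilog and confirms that the column 8e 4d a1 bc maps back to db 13 53 45.

```python
def xtime(b: int) -> int:
    """Multiply a byte by 2 in GF(2^8), reducing by x^8 + x^4 + x^3 + x + 1."""
    return ((b << 1) & 0xFF) ^ (0x1B if b & 0x80 else 0x00)


def _powers(b: int) -> tuple:
    """Return 2*b, 4*b and 8*b in GF(2^8), as the Verilog two/four/eight regs do."""
    two = xtime(b)
    four = xtime(two)
    eight = xtime(four)
    return two, four, eight


def pmul_e(b: int) -> int:  # multiply by 0x0e = 8 ^ 4 ^ 2
    two, four, eight = _powers(b)
    return eight ^ four ^ two


def pmul_9(b: int) -> int:  # multiply by 0x09 = 8 ^ 1
    _, _, eight = _powers(b)
    return eight ^ b


def pmul_d(b: int) -> int:  # multiply by 0x0d = 8 ^ 4 ^ 1
    _, four, eight = _powers(b)
    return eight ^ four ^ b


def pmul_b(b: int) -> int:  # multiply by 0x0b = 8 ^ 2 ^ 1
    two, _, eight = _powers(b)
    return eight ^ two ^ b


def inv_mix_col(s0: int, s1: int, s2: int, s3: int) -> tuple:
    """Return the four InvMixColumns output bytes for one state column."""
    return (
        pmul_e(s0) ^ pmul_b(s1) ^ pmul_d(s2) ^ pmul_9(s3),
        pmul_9(s0) ^ pmul_e(s1) ^ pmul_b(s2) ^ pmul_d(s3),
        pmul_d(s0) ^ pmul_9(s1) ^ pmul_e(s2) ^ pmul_b(s3),
        pmul_b(s0) ^ pmul_d(s1) ^ pmul_9(s2) ^ pmul_e(s3),
    )


if __name__ == "__main__":
    column = inv_mix_col(0x8E, 0x4D, 0xA1, 0xBC)
    print(" ".join(f"{byte:02x}" for byte in column))
```

The helper _powers is hypothetical and exists only to avoid repeating the three xtime calls; the Verilog functions compute the same intermediate values inline.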
CHAPTER 5
RESULT ANALYSIS
5.1 Specifications
Software : Xilinx 9.2i
Family : Spartan
Device : XC3S200
Package : FT256
Speed Grade : -4
Top-Level Source Type : HDL
Synthesis Tool : XST (Verilog)
Simulator : ISE Simulator (Verilog)
Preferred Language : Verilog
5.1.2 Input/Output Specifications

Input:
Plain text : 128 bits (h00112233445566778899aabbccddeeff)
Key : 128 bits (h000102030405060708090a0b0c0d0e0f)
Output:
Output Message : 128 bits (h00112233445566778899aabbccddeeff)
5.2 Encryption Results:

5.2.1 Simulation Waveforms of Encryption:
Fig 5.1 Simulation Result Waveform of Encryption

The simulation result shows the input h00112233445566778899aabbccddeeff encrypted to
h69c4e0d86a7b0430d8cdb78070b4c55a.
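The plain text, key and cipher text reported here are the same values as the AES-128 example vector in FIPS-197 Appendix C.1, so the simulation result can be checked independently of the hardware. The sketch below is an illustrative software model, not the project's implementation: it assumes a byte-oriented AES-128 encryption written from the standard's description, and all function names in it are our own.

```python
def xtime(b):
    """Multiply a byte by 2 in GF(2^8)."""
    b <<= 1
    return (b ^ 0x11B) & 0xFF if b & 0x100 else b


def gmul(a, b):
    """Multiply two bytes in GF(2^8) by shift-and-add."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a = xtime(a)
        b >>= 1
    return result


def build_sbox():
    """Build the S-box: multiplicative inverse followed by the affine transform."""
    sbox = []
    for x in range(256):
        inv = 0 if x == 0 else next(y for y in range(1, 256) if gmul(x, y) == 1)
        s = inv
        for shift in range(1, 5):
            s ^= ((inv << shift) | (inv >> (8 - shift))) & 0xFF
        sbox.append(s ^ 0x63)
    return sbox


def expand_key(key, sbox):
    """Expand a 16-byte key into 11 round keys of 16 bytes each."""
    words = [list(key[4 * i:4 * i + 4]) for i in range(4)]
    rcon = 1
    for i in range(4, 44):
        temp = list(words[i - 1])
        if i % 4 == 0:
            temp = [sbox[b] for b in temp[1:] + temp[:1]]  # RotWord, then SubWord
            temp[0] ^= rcon
            rcon = xtime(rcon)
        words.append([a ^ b for a, b in zip(words[i - 4], temp)])
    return [sum(words[4 * r:4 * r + 4], []) for r in range(11)]


def mix_columns(state):
    """Apply MixColumns to a 16-byte state stored column by column."""
    out = []
    for c in range(4):
        s0, s1, s2, s3 = state[4 * c:4 * c + 4]
        out += [
            xtime(s0) ^ xtime(s1) ^ s1 ^ s2 ^ s3,
            s0 ^ xtime(s1) ^ xtime(s2) ^ s2 ^ s3,
            s0 ^ s1 ^ xtime(s2) ^ xtime(s3) ^ s3,
            xtime(s0) ^ s0 ^ s1 ^ s2 ^ xtime(s3),
        ]
    return out


def encrypt_block(plaintext, key):
    """Encrypt one 16-byte block with AES-128."""
    sbox = build_sbox()
    round_keys = expand_key(key, sbox)
    state = [p ^ k for p, k in zip(plaintext, round_keys[0])]
    for rnd in range(1, 11):
        state = [sbox[b] for b in state]                             # SubBytes
        state = [state[(i + 4 * (i % 4)) % 16] for i in range(16)]   # ShiftRows
        if rnd < 10:
            state = mix_columns(state)                               # skipped in the last round
        state = [s ^ k for s, k in zip(state, round_keys[rnd])]      # AddRoundKey
    return bytes(state)


if __name__ == "__main__":
    plaintext = bytes.fromhex("00112233445566778899aabbccddeeff")
    key = bytes.fromhex("000102030405060708090a0b0c0d0e0f")
    reported = "69c4e0d86a7b0430d8cdb78070b4c55a"  # value from the simulation waveform
    print(encrypt_block(plaintext, key).hex() == reported)
```

The S-box is computed rather than tabulated to keep the sketch short; a table lookup, as an aes_sbox module would typically use, gives the same values.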
5.2.2 Top Level Module of Encryption
Fig 5.2 Top Level Module of Encryption
5.2.3 Synthesis Report for Encryption
Fig 5.3 Synthesis Report for Encryption

5.2.4 RTL Schematic Of Encryption
Fig 5.4 RTL Schematic Of Encryption
The diagram above shows the RTL schematic of the encryption module.
5.3 Decryption Results

5.3.1 Simulation Waveforms of Decryption
Fig 5.5 Simulation Waveforms of Decryption
The simulation above shows the encrypted result converted back into the original plain text
h00112233445566778899aabbccddeeff.
5.3.2 Top Level Module of Decryption
Fig 5.6 Top Level Module of Decryption

5.3.3 Synthesis Report of Decryption
Fig 5.7 Synthesis Report of Decryption
5.3.4 RTL Schematic of Decryption
Fig 5.8 RTL Schematic of Decryption
The diagram above shows the RTL schematic of the decryption module.
ADVANTAGES AND DISADVANTAGES

a) Advantages
1. It hides the message, so your privacy is protected.
2. No one can read the message without the key to the code.
3. You can write whatever you want, however you want (any theme or symbol for the code),
to keep your code a secret.
4. You are able to use cryptography during lessons without the teacher knowing.
b) Disadvantages
1. It takes a long time to figure out the code.
2. It takes a long time to create the code.
3. In the past, a coded message sent to another person took a long time to reach that person.
4. Overall, cryptography is a long process.
APPLICATIONS
1. Communications : GSM, Payphones
2. Entertainment : Pay-TV, Public event access control
3. Health care : Insurance data, Personal file
4. Government : Identification, Passport, Driving license
5. E-banking : Access to accounts, Transactions, Sharing
6. E-commerce : Sale of tickets, Reservations
7. Education : Student database, Personal data such as results
8. Biometric : Fingerprint recognition
CONCLUSION
This design presents, for the first time, a universal cryptography processor for smart-card applications
that supports both private and public key cryptography algorithms. This is achieved by expressing the
primitives of three important algorithms for smart cards (DES, AES, and ECC) in terms of simple logical
operations that maximize the number of common blocks among them. This approach resulted in a crypto
processor that meets the power consumption and performance specifications of smart cards and occupies
2.25 mm² in 0.18-µm CMOS when SRAM memory blocks are used. This area represents just 9% of the
maximum available smart-card chip area of 25 mm². Using FeRAM instead of SRAM memory blocks
provides nonvolatile configuration at no extra area overhead.
FUTURE SCOPE
DNA Cryptography:
DNA cryptography is a newly emerging cryptographic field that arose from research on DNA computing,
in which DNA is used as the information carrier and modern biological technology is used as the
implementation tool.
The vast parallelism and extraordinary information density inherent in DNA molecules are explored
for cryptographic purposes such as encryption, authentication, signature, and so on.
Quantum Cryptography:
Quantum cryptography attempts to achieve the same security of information as other forms of
cryptography but through the use of photons, or packets of light. The process, though still in
experimental stages, makes use of the polarization of light and is proving to be a very
promising defence against eavesdropping.