GSF Version of Report

Fast Prototyping of Constraint Satisfaction Problem Architecture Using FPGAs
Alan Cheng
4/1/2012
Table of Contents
I. Abstract
1. Introduction
2. Review of Literature
  2.1 Logic Synthesis
    2.1.1 Boolean Algebra and Switching Algebra
    2.1.2 Truth Tables
    2.1.3 Karnaugh Maps
    2.1.4 Commonly Used Gates
      2.1.4.1 AND Gate
      2.1.4.2 OR Gate
      2.1.4.3 NOT Gate
      2.1.4.4 XOR Gate
      2.1.4.5 Multiplexer
    2.1.5 Arithmetic Gates
      2.1.5.1 Adder
      2.1.5.2 Multiplier
    2.1.6 Clock
    2.1.7 Flip-Flops and Latches
      2.1.7.1 D-Latch
      2.1.7.2 D-Flip Flop
  2.2 Constraint Satisfaction Problems and Oracles
    2.2.1 Constraint Satisfaction Problem
      2.2.1.1 Satisfiability
      2.2.1.2 Graph Coloring
      2.2.1.3 Cryptarithmetic Problems
    2.2.2 Oracle in Hardware
  2.3 Cryptography
    2.3.1 An Introduction to Cryptology
    2.3.2 Cryptography in History
    2.3.3 Modern Cryptography
  2.4 Field Programmable Gate Array and Verilog-HDL
    2.4.1 Field-Programmable Gate Array
    2.4.2 Hardware Description Languages
    2.4.3 Verilog-HDL
3. Goal, Question, and Hypothesis
  3.1 Goal
  3.3 Hypothesis
4. Design and Setup of Hardware
  4.1 Constraints of Hardware
  4.2 Terasic DE2-115 Development Board
  4.3 ModelSim
  4.4 Quartus II
5. TWO+TWO=FOUR Oracle
  5.1 Overview of the Oracle
  5.2 TWO+TWO=FOUR Problem
  5.3 The Selector
  5.4 The Permuter
  5.5 Arithmetic Checker
  5.6 The RAM Module
  5.7 The RAM Counter
  5.8 The LCD Display
  5.9 Verification for the TWO+TWO=FOUR Oracle
  5.10 TWO+TWO=FOUR Problem Experiment
  5.11 Conclusion for the TWO+TWO=FOUR Problem
6. Oracle for Arbitrary Cryptarithmetic Problems
  6.1 Overview of the Oracle
  6.2 The Input of the Oracle
  6.3 The Counter
  6.4 The Input Validity Checker
  6.5 The Arithmetic Checker
  6.6 Software Variation of the Oracle for Arbitrary Cryptarithmetic Problems
  6.7 Verification of the Oracle for Arbitrary Cryptarithmetic Problems
  6.8 Oracle for Arbitrary Cryptarithmetic Problems Experiment
    6.8.1 TWO+TWO=FOUR
    6.8.2 ONE+ONE=TWO
    6.8.3 TOO+TOO=LONG
    6.8.4 OOP+LLP=PRGM
  6.9 Conclusion for the Oracle for Arbitrary Cryptarithmetic Problems
  6.10 Future Extensions for the Oracle for Arbitrary Cryptarithmetic Problems
7. Oracle for Single Solutions of Arbitrary Cryptarithmetic Problems
  7.1 Overview of the Problem
  7.2 Efficiency of Current Oracle Design
  7.4 Discussion of Different Designs
  7.5 Random-Start Algorithm
  7.6 Calculation Time Analysis of the Current Oracle Design and the Random-Start Algorithm
    7.6.1 TWO+TWO=FOUR
    7.6.2 ONE+ONE=TWO
    7.6.3 TOO+TOO=LONG
    7.6.4 OOP+LLP=PRGM
  7.7 Conclusion of Oracle for Single Solutions of Arbitrary Cryptarithmetic Problems
  7.8 Future Extensions
8. Conclusion
9. References
Appendix
I. Abstract
Cryptography is a major area of government research. Secret codes, however, are becoming harder to decode with standard software, so a different method is needed. The purpose of this project is to use Field-Programmable Gate Arrays (FPGAs) for fast prototyping of oracles that solve constraint satisfaction problems, a class that includes problems in cryptography. Cryptarithmetic problems are a subset of constraint satisfaction problems and are commonly used to test hardware. A cryptarithmetic problem consists of letters, each of which can be replaced by any digit 0-9. The goal is to find an encoding that satisfies the problem. The hardware used is an FPGA, a very flexible development platform that can be reconfigured (by programming in hardware description languages) to implement new circuits. The hardware was tested on two problems: the specific problem TWO+TWO=FOUR, and arbitrary cryptarithmetic puzzles. The hardware performed over three million times faster than the software on the TWO+TWO=FOUR problem, and 50 times faster on the arbitrary cryptarithmetic problems. In conclusion, hardware is a faster approach to solving cryptarithmetic problems (and constraint satisfaction problems), and it was faster by a large margin in both experiments.
1. Introduction
Cryptography has long been an important area of research for governments, militaries, and large corporations. During World War II, cryptography played a vital role in winning the war. The Allies were able to decode German secret codes, which revealed the plans of the German military. Had this not been done, the war would likely have lasted much longer. Today there is still a risk of war. If a war occurs, the lives of many civilians could potentially be saved by uncovering information about planned attacks. However, current software technologies are slow and cannot decode strongly secured messages in a practical amount of time. As a result, other technologies are required in order to decode such messages in a reasonable amount of time. Field-Programmable Gate Arrays and other hardware technologies are one alternative to these limitations of software.
2. Review of Literature
2.1 Logic Synthesis
2.1.1 Boolean Algebra and Switching Algebra
Boolean algebra is a two-valued algebraic system that was introduced by George Boole in the book An Investigation of the Laws of Thought. It is a variant of algebra in which the only values operated on are 0 (for false) and 1 (for true). It is used in mathematical logic, computer programming, and digital logic. In the 1930s, Claude Shannon observed that Boolean algebra can be applied to logic circuits and gates. He therefore introduced switching algebra, a variant of Boolean algebra used for analyzing logic circuits. In logic synthesis, both Boolean algebra and switching algebra are commonly used.
2.1.2 Truth Tables

Truth tables are a way to graphically display the function of a circuit. In a standard truth table, the inputs are on the left and the outputs are on the right. The input columns contain one row for every combination of input values, normally listed in binary counting order. The output column shows the output for the corresponding input values on the left.

Inputs      Output(s)
A  B  C     X
0  0  0     0
0  0  1     1
0  1  0     0
0  1  1     0
1  0  0     0
1  0  1     0
1  1  0     1
1  1  1     0

(The input columns list all possible combinations of input values; the output column gives the corresponding output for each combination.)
Figure 2.1.2: Sample truth table with description of each part.
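As an illustration of how such a table can be enumerated, the following is a minimal Python sketch. The helper `truth_table` and the function `sample` are hypothetical names introduced here; `sample` is simply chosen so that its output column matches the sample table above.

```python
from itertools import product


def truth_table(func, n_inputs):
    """Return (inputs, output) rows with the inputs in binary counting order."""
    return [(bits, func(*bits)) for bits in product((0, 1), repeat=n_inputs)]


def sample(a, b, c):
    # Hypothetical function: X is 1 only for ABC = 001 and ABC = 110.
    return int((a, b, c) in {(0, 0, 1), (1, 1, 0)})


for (a, b, c), x in truth_table(sample, 3):
    print(a, b, c, "|", x)
```

A function of n inputs always produces 2^n rows, which is why truth tables become impractical for circuits with many inputs.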
2.1.3 Karnaugh Maps

Karnaugh maps (K-maps) are another graphical way to display functions. Unlike truth tables, however, they are very efficient at showing how to synthesize circuits graphically. To find the product terms of a function in a K-map, one only needs to find groups of 1s (or groups of 0s for a product-of-sums, POS, form) of size $2^n$. An example of finding a function from a K-map is shown in Figure 2.1.3.2.

[Figure: Karnaugh map for the inputs a, b, c, d and the output X. The input values for ab and for cd are listed in Gray-code order (00, 01, 11, 10), and each cell holds the output value for the corresponding inputs.]
Figure 2.1.3.1: Karnaugh Map example with descriptions.

cd \ ab   00  01  11  10
00         1   1   1   0
01         1   1   1   0
11         0   0   0   0
10         1   0   0   1

Groups: $\bar{a}\bar{c}$ (cd = 00, 01 and ab = 00, 01), $b\bar{c}$ (cd = 00, 01 and ab = 01, 11), and $\bar{b}c\bar{d}$ (cd = 10 and ab = 00, 10).

$F = \bar{a}\bar{c} + b\bar{c} + \bar{b}c\bar{d}$

Figure 2.1.3.2: Example Karnaugh Map with product terms labeled and function at bottom.
Notice that in Figure 2.1.3.2, the group of six 1s is not grouped as one term. This is because in binary logic there are $2^n$ combinations for n bits, and 6 is not a power of 2. Notice also that groups can wrap around the edge of the Karnaugh map. This is allowed because 00 and 10 are at a Hamming distance of one.
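The two rules used above (group sizes must be powers of two, and neighbouring labels differ in exactly one bit) can be checked with a short Python sketch. The names `hamming_distance`, `GRAY_ORDER`, and `is_valid_group_size` are illustrative and not taken from the report.

```python
def hamming_distance(a, b):
    """Number of positions in which two equal-length bit strings differ."""
    return sum(x != y for x, y in zip(a, b))


# Row/column labels of a 4-variable K-map in Gray-code order.
GRAY_ORDER = ["00", "01", "11", "10"]

# Distance between each label and the next one, including the wrap-around
# pair ("10" back to "00"), which is why groups may cross the map edge.
neighbour_distances = [
    hamming_distance(GRAY_ORDER[i], GRAY_ORDER[(i + 1) % 4]) for i in range(4)
]


def is_valid_group_size(size):
    """A K-map group must contain 2**n cells for some n >= 0."""
    return size > 0 and size & (size - 1) == 0


print(neighbour_distances)      # all entries are 1
print(is_valid_group_size(6))   # a group of six is not allowed
```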
2.1.4 Commonly Used Gates

The following subsections will describe logic gates which are commonly used in logic synthesis.
2.1.4.1 AND Gate
The AND gate is one of the most basic gates. It is commonly used in deriving larger and more complex gates and circuits. The output of the AND gate is 1 when both inputs are 1, and 0 otherwise. In the equation representation of the AND operator, the multiplication sign is used (often written simply by placing the variables next to each other).

A B | X
0 0 | 0
0 1 | 0
1 0 | 0
1 1 | 1

X = AB
Figure 2.1.4.1: (Left) The graphical symbol for the AND gate, (Center) the truth table for the AND gate, (Right) the equation representation of the AND gate.
2.1.4.2 OR Gate
The OR gate is another basic gate. Like the AND gate, it is commonly used in deriving larger and more complex gates and circuits. The output of the OR gate is 1 when either input (or both) is 1, and 0 otherwise. In the equation representation of the OR operator, the plus sign + is used.

A B | X
0 0 | 0
0 1 | 1
1 0 | 1
1 1 | 1

X = A + B
Figure 2.1.4.2: (Left) The graphical symbol for the OR gate, (Center) the truth table for the OR gate, (Right) the equation representation of the OR gate.
2.1.4.3 NOT Gate
The NOT gate (or inverter) is another basic gate. Unlike the AND and OR gates, the NOT gate has only one input. The function of the NOT gate is to invert the input: if the input is 0, the output is 1, and vice versa. There are several different notations for the NOT operation, some of which are shown in Figure 2.1.4.3.1.

A | X
0 | 1
1 | 0

$X = \bar{A}$, X = ~A, X = A'
Figure 2.1.4.3.1: (Left) The graphical symbol for the NOT gate, (Center) the truth table for the NOT gate, (Right) the equation representations of the NOT gate.
Sometimes, when a NOT gate is appended to the input or output of a logic gate, it is denoted with an empty circle.
Figure 2.1.4.3.2: Different notations of a NOT gate appended to an AND gate.
2.1.4.4 XOR Gate
Although the XOR gate (also called the EXOR gate or the Exclusive OR gate) is a combination of AND, OR, and NOT gates (shown in Figure 2.1.4.4.1), it is fundamental to several logic synthesis methods such as ESOP minimization. The XOR gate is also one of the most basic gates in quantum technologies. The output of the XOR gate is 1 when exactly one of the inputs has the value 1, and 0 otherwise. The notation of the XOR operator is ⊕.

Figure 2.1.4.4.1: Decomposition of XOR gate to primitive AND, OR, and NOT gates.

A B | X
0 0 | 0
0 1 | 1
1 0 | 1
1 1 | 0

X = A ⊕ B
Figure 2.1.4.4.2: (Left) The graphical symbol for the XOR gate, (Center) the truth table for the XOR gate, (Right) the equation representation of the XOR gate.
2.1.4.5 Multiplexer
The multiplexer is another gate produced by combining the primary gates (AND, OR, NOT). The multiplexer acts as a selector: there are two data inputs, and a third input wire (the control) determines which of the two data inputs is passed to the output. Figure 2.1.4.5.1 shows the decomposition of the multiplexer into primary gates.

Figure 2.1.4.5.1: Decomposition of multiplexer to its primary gates.

A B Control | X
0 0    0    | 0
0 0    1    | 0
0 1    0    | 0
0 1    1    | 1
1 0    0    | 1
1 0    1    | 0
1 1    0    | 1
1 1    1    | 1

Figure 2.1.4.5.2: (Left) The symbol for the multiplexer, (Right) the corresponding truth table to the multiplexer.
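The gates described in this section can be modelled in a few lines of Python. This is only a sketch: the function names are introduced here for illustration, and the XOR and multiplexer decompositions shown are common textbook ones that may differ in detail from the circuits drawn in the figures.

```python
def and_gate(a, b):
    return a & b


def or_gate(a, b):
    return a | b


def not_gate(a):
    return 1 - a


def xor_gate(a, b):
    # Assumed decomposition: (A AND NOT B) OR (NOT A AND B)
    return or_gate(and_gate(a, not_gate(b)), and_gate(not_gate(a), b))


def mux(a, b, control):
    # control = 0 passes A, control = 1 passes B, as in the multiplexer truth table
    return or_gate(and_gate(a, not_gate(control)), and_gate(b, control))


for a in (0, 1):
    for b in (0, 1):
        print(a, b, "| AND", and_gate(a, b), "OR", or_gate(a, b), "XOR", xor_gate(a, b))
```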
2.1.5 Arithmetic Gates

2.1.5.1 Adder
The adder is a gate which performs arithmetic addition. There are two types of adder designs, the half adder and the full adder. The half adder takes two 1-bit inputs and produces a 2-bit output (a half sum and a carry-out). The circuit is shown in Figure 2.1.5.1.1.

A B | HS CO
0 0 |  0  0
0 1 |  1  0
1 0 |  1  0
1 1 |  0  1

Figure 2.1.5.1.1: Half Adder with truth table

A full adder adds two bits together with a carry-in bit, which allows inputs of arbitrary width to be added. When full adders are chained so that the carry-out of one stage feeds the carry-in of the next, they form a ripple adder, which can add numbers of multiple bits.

X Y CIN | S COUT
0 0  0  | 0  0
0 0  1  | 1  0
0 1  0  | 1  0
0 1  1  | 0  1
1 0  0  | 1  0
1 0  1  | 0  1
1 1  0  | 0  1
1 1  1  | 1  1

Figure 2.1.5.1.2: Full Adder with truth table

[Figure: four full adders connected in a chain, with inputs X3 Y3, X2 Y2, X1 Y1, X0 Y0, carries C0 through C4, and sum outputs S3, S2, S1, S0.]
Figure 2.1.5.1.3: Ripple Adder using full adders
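The behaviour of the half adder, full adder, and ripple adder can be sketched in Python as follows. The function names and the least-significant-bit-first list convention are choices made for this illustration, and building the full adder from two half adders is one common construction, not necessarily the one in the figure.

```python
def half_adder(x, y):
    """Return (half sum, carry-out) for two 1-bit inputs."""
    return x ^ y, x & y


def full_adder(x, y, cin):
    """Return (sum, carry-out), built here from two half adders and an OR."""
    s1, c1 = half_adder(x, y)
    s, c2 = half_adder(s1, cin)
    return s, c1 | c2


def ripple_adder(xs, ys):
    """Add two equal-length bit lists given least-significant bit first."""
    carry, out = 0, []
    for x, y in zip(xs, ys):
        s, carry = full_adder(x, y, carry)  # carry ripples to the next stage
        out.append(s)
    return out, carry


# 11 + 5 = 16 with 4-bit inputs: the sum bits are all 0 and the carry-out is 1.
print(ripple_adder([1, 1, 0, 1], [1, 0, 1, 0]))
```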
2.1.5.2 Multiplier
The function of the multiplier is to perform the multiplication operation. The circuit is based on carrying out multiplication through addition, so the circuit shown in Figure 2.1.5.2.1 is constructed using half adders. Note that the multiplication operates on binary inputs rather than decimal integers.

[Figure: 2-bit multiplier with inputs X2 X1 and Y2 Y1, built from AND gates and half adders, producing the outputs Z4 Z3 Z2 Z1.]
Figure 2.1.5.2.1: Decomposition of multiplier to primitive gates.
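A 2-bit by 2-bit multiplier of this kind can be sketched in Python as below. The wiring is an assumption (the standard arrangement of four AND gates and two half adders), and the bit naming follows the figure labels with X1, Y1, and Z1 taken to be the least significant bits.

```python
def half_adder(x, y):
    """Return (sum, carry-out) for two 1-bit inputs."""
    return x ^ y, x & y


def multiply_2bit(x2, x1, y2, y1):
    """Multiply X = x2 x1 by Y = y2 y1 and return (z4, z3, z2, z1)."""
    z1 = x1 & y1                            # lowest partial product
    z2, c = half_adder(x2 & y1, x1 & y2)    # middle column
    z3, z4 = half_adder(x2 & y2, c)         # top column plus carry
    return z4, z3, z2, z1


# 3 x 3 = 9, which is 1001 in binary.
print(multiply_2bit(1, 1, 1, 1))
```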
2.1.6 Clock
A clock is a waveform generator. It is an input to the circuit which continuously switches between the values 0 and 1. Several logic gates used in state machines require a clock pulse, where the gate is activated by a rising edge (from 0 to 1) or a falling edge (from 1 to 0). Figure 2.1.6.1 describes the clock waveform.

[Figure: square clock waveform with one period marked $t_{per}$, where $t_{per}$ is the period and $1/t_{per}$ is the frequency.]
Figure 2.1.6.1 Clock waveform with description.
2.1.7 Flip-Flops and Latches

Flip-flops and latches are the basis of sequential circuits. They are used in the majority of larger circuits (circuits used in industry). Flip-flops and latches can be used as gates which store memory. Following standard conventions, flip-flops are devices which change the output only at clock pulses, while latches are devices which change the outputs whenever the input is changed. As only synchronous circuits will be used, the discussion of latches will be limited.
2.1.7.1 D-Latch
The D-latch is a latch which simply stores a bit of information. The circuit is derived from the SR latch and is shown in Figure 2.1.7.1.1. It is used in asynchronous circuits and can also be used to build D flip-flops.

Figure 2.1.7.1.1: Decomposition of D-latch to primitive gates.

En D | Q       Q'
0  0 | Last Q  Last Q'
0  1 | Last Q  Last Q'
1  0 | 0       1
1  1 | 1       0

Figure 2.1.7.1.2: D-latch with truth table.
2.1.7.2 D-Flip Flop
The D flip-flop is simply a D-latch in synchronous logic. The input is stored on each clock pulse instead of whenever the input changes. This flip-flop is derived from D-latches, with the addition of a circuit that detects the clock edge (positive edge or negative edge). The notch in Figure 2.1.7.2.1 represents a clock input.

Clk     D | Q       Q'
X       0 | Last Q  Last Q'
X       1 | Last Q  Last Q'
Rising  0 | 0       1
Rising  1 | 1       0

Figure 2.1.7.2.1: D Flip-flop with truth table.
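The difference between the level-sensitive latch and the edge-triggered flip-flop can be made concrete with a small behavioural model in Python. The classes `DLatch` and `DFlipFlop` are hypothetical names for this sketch; they model behaviour only, not the gate-level circuits in the figures.

```python
class DLatch:
    """Level-sensitive storage: Q follows D while enable is 1."""

    def __init__(self):
        self.q = 0

    def update(self, d, enable):
        if enable:
            self.q = d
        return self.q


class DFlipFlop:
    """Edge-triggered storage: Q takes the value of D only on a rising clock edge."""

    def __init__(self):
        self.q = 0
        self._prev_clk = 0

    def update(self, d, clk):
        if clk == 1 and self._prev_clk == 0:  # rising edge detected
            self.q = d
        self._prev_clk = clk
        return self.q


ff = DFlipFlop()
print(ff.update(1, 0), ff.update(1, 1), ff.update(0, 1))  # D is ignored while clk stays high
```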
2.2 Constraint Satisfaction Problems and Oracles

2.2.1 Constraint Satisfaction Problem
Constraint satisfaction problems (CSPs) are mathematical problems in which a set of objects must be placed in a state that satisfies a number of constraints. These problems are a common subject of research in artificial intelligence. The most common CSPs are described below.
2.2.1.1 Satisfiability
Satisfiability (SAT) is the problem of determining whether there is an assignment of values to the variables of a Boolean function that makes the output true (1). Satisfiability is an NP-complete problem, meaning that no algorithm is known which can solve it in polynomial time. Figure 2.2.1.1 shows an example of the problem.
Figure 2.2.1.1: Example satisfiability problem.
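To make the problem concrete, the sketch below checks satisfiability by exhaustive search in Python. The formula `example` is a hypothetical one chosen for illustration (it is not the formula from the figure), and the search tries all 2^n assignments, which is exactly why the approach does not scale.

```python
from itertools import product


def find_satisfying_assignment(formula, n_vars):
    """Try every assignment; return the first that makes the formula true, else None."""
    for bits in product((False, True), repeat=n_vars):
        if formula(*bits):
            return bits
    return None


def example(a, b, c):
    # Hypothetical formula: (a OR b) AND (NOT a OR c) AND (NOT b OR NOT c)
    return (a or b) and ((not a) or c) and ((not b) or (not c))


print(find_satisfying_assignment(example, 3))
print(find_satisfying_assignment(lambda a: a and not a, 1))  # unsatisfiable, so None
```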
2.2.1.2 Graph Coloring
Graph coloring is a problem consisting of the assignment of colors to nodes. Given a graph (as shown in Figure 2.2.1.2), arbitrary colors can be assigned to any node, but no two adjacent nodes can be assigned the same color. The goal is to use the fewest number of colors. Like satisfiability, the decision version of graph coloring is NP-complete.
Figure 2.2.1.2: Example graph coloring problem
2.2.1.3 Cryptarithmetic Problems
Cryptarithmetic problems are mathematical problems consisting of an equation with unknown numbers. These numbers are displayed in the form of letters. The goal of the problem is to find a successful encoding for each letter. Note that the same letter always has the same value, while different letters must have different values.

    S E N D
  + M O R E
  ---------
  M O N E Y

Figure 2.2.1.3: Example cryptarithmetic problem.
2.2.2. Oracle in Hardware

In general usage, an oracle is "a person (as a priestess of ancient Greece) through whom a deity is believed to speak." In some cases, the oracle would only answer yes or no. The oracle in hardware matches that description. It gives either a 1 (yes) or a 0 (no) for any set of inputs to a given problem. For example, an oracle for a cryptarithmetic problem gives a 1 if the inputs satisfy the problem and a 0 if they do not. Figure 2.2.2.1 shows a general oracle for this example.

[Figure: Inputs, followed by gates that compare the inputs to the constraints (adders, equality gates), followed by a global AND, followed by the output.]
Figure 2.2.2.1: General oracle for a cryptarithmetic problem.
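A software analogue of such an oracle can be sketched in Python for the SEND+MORE=MONEY example. The function names are illustrative, and the rule that leading letters may not be zero is an assumption of this sketch; the final `and` of the individual checks plays the role of the global AND gate.

```python
def to_number(word, assignment):
    """Convert a word to an integer using a letter-to-digit mapping."""
    value = 0
    for letter in word:
        value = value * 10 + assignment[letter]
    return value


def oracle(assignment, addends=("SEND", "MORE"), total="MONEY"):
    """Return 1 if the assignment satisfies the puzzle, otherwise 0."""
    digits = list(assignment.values())
    distinct = len(set(digits)) == len(digits)
    # Assumption of this sketch: a number may not start with 0.
    no_leading_zero = all(assignment[w[0]] != 0 for w in (*addends, total))
    sums_match = sum(to_number(w, assignment) for w in addends) == to_number(total, assignment)
    return int(distinct and no_leading_zero and sums_match)


solution = {"S": 9, "E": 5, "N": 6, "D": 7, "M": 1, "O": 0, "R": 8, "Y": 2}
print(oracle(solution))  # 9567 + 1085 = 10652
```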
2.3 Cryptography
2.3.1 An Introduction to Cryptography
Cryptography is the study and practice of secure communication. It draws on mathematics, electrical engineering, and computer science. Typically, cryptography relies on a key, which acts as a translator between the original text and the encoded text. Currently, cryptography is used in many applications, such as banking, ATM cards, and computer passwords.
2.3.2 Cryptography in History
Cryptography has always been important to governments, to the military, and now to the internet. Although handwritten cryptography was practiced by the Greeks and in the Middle Ages, it was not until World War I that cryptography became strategically important. More systematic methods of encryption (making the secret codes) and decryption (decoding the secret codes) were developed in the 19th century. Later, in World War I, the United Kingdom's military was able to decrypt German naval codes. The most notable case of cryptography in this period, however, was the decryption of the Zimmermann Telegram, which helped prompt the United States' entry into World War I. World War II is the most prominent example of the use of cryptography. The Germans had started to use an electrical rotor machine known as Enigma. The Allies, however, were able to decrypt Enigma and advanced the technology of decryption through the work of many cryptographers, including Alan Turing (a founder of modern computer science). America was also able to decrypt Japanese naval codes, which led to the famous victory at the Battle of Midway.
2.3.3 Modern Cryptography

Although it is not widely known, cryptography is used in many everyday devices. Cryptography is especially important in large financial organizations such as banks, because financial data must be kept secret from the public. With the introduction of the internet, information is encoded so that it stays secret. However, the spread of data came with the spread of decryption methods, which organizations outside of the government also obtained. Currently, much of the work on cryptography is done in secret, mostly by government organizations.
2.4 Field Programmable Gate Array and Verilog-HDL

2.4.1 Field-Programmable Gate Array
Field-programmable gate arrays, commonly known as FPGAs, are integrated circuits which are re-programmable. FPGAs are also cost efficient, and can be programmed using hardware description languages. These features of an FPGA can overcome the difficulty of using application-specific integrated circuits (ASICs). The interior of an FPGA commonly consists of logic elements. These logic elements consist of several logic gates (AND, OR) and also memory elements (flip-flops). For educational use, development boards for FPGAs are made. These boards contain many I/O (input and output) devices (switches, LEDs) and also on-board memory (RAM).
2.4.2 Hardware Description Languages
Hardware description languages are used to design electrical circuits. They are often used in programming FPGAs. A program in these languages can describe circuit operation, design, and organization. The most common hardware description languages are Verilog-HDL and VHDL.
2.4.3 Verilog-HDL
Verilog-HDL is a hardware description language used to describe electronic systems. Its syntax is based on the popular language C. Verilog was created by Phil Moorby and Prabhu Goel in 1984. Since then, Verilog has become recognized as an IEEE standard, with the most recent extension in 2005.
3. Goal, Question, and Hypothesis

3.1 Goal
The goal of this project is to create an oracle which is able to solve arbitrary cryptarithmetic problems. The oracle must be able to give all possible solutions or a single solution for a user-given problem (the input). The oracle must also compute its answers in a practical amount of time.
3.2 Hypothesis
I hypothesize that making an efficient oracle for arbitrary cryptarithmetic problems is possible. The oracle on a Field-Programmable Gate Array would also be faster than the computer program because the FPGA implements the oracle directly in hardware. Working directly on the circuit would make it more efficient than software on a computer, which reaches the hardware only through layers of instructions. Also, with an FPGA, calculations can be done in parallel. Software, by contrast, executes largely sequentially, and even if multi-threading were implemented, there would still not be as much parallelism as in a logic circuit.
4. Design and Setup of Hardware

4.1 Constraints of Hardware
In order for the project to be practical, constraints are imposed. The following bulleted list shows the constraints of the project.
Constraints:
- Cost: Hardware cannot be expensive (high-performance FPGAs cost over $1000)
- Computer: HP Pavilion dm4 with Core i5 processor (2.4 GHz)
- Features on FPGA: Must be able to display the solutions of the given cryptarithmetic problem
4.2 Terasic DE2-115 Development Board

The FPGA used in this project is on a Terasic DE2-115 development board (developed and manufactured by Terasic Inc). The main reason this board was used is its cost-to-performance ratio. It has a recent (2009) Altera Cyclone IV E FPGA for only $499. The development board also contains many inputs and an LCD output. The list below shows some specifications of the Terasic DE2-115.

- 114,480 logic elements (LEs)
- 3,888 Kbits of embedded memory
- 266 embedded 18 x 18 multipliers
- 4 general-purpose PLLs
- 528 user I/Os
- 128MB (32Mx32bit) SDRAM
- 2MB (1Mx16) SRAM
- 8MB (4Mx16) Flash with 8-bit mode
- 32Kbit EEPROM
- 18 switches and 4 push-buttons
- 18 red and 9 green LEDs
- Eight 7-segment displays
- 16x2 LCD module
4.3 ModelSim
Mentor Graphics ModelSim is an integrated development environment (IDE) as well as a simulator for hardware description languages. It is useful for writing and debugging modules, because the simulator provides waveforms that one can analyze to find errors in a circuit early. However, even when a module works in simulation, it may not work on the actual hardware.
Figure 4.3.1: Screenshot of ModelSim's IDE with waveform
4.4 Quartus II
Altera's Quartus II is a compiler for Altera FPGAs. The software compiles a Verilog-HDL program and downloads the result onto the FPGA. The Quartus II software is used in this project for running Verilog programs (written and simulated in ModelSim) on the Terasic DE2-115 development board.
Figure 4.4.1: Screenshot of Quartus II IDE
5. TWO+TWO=FOUR Oracle
5.1 Overview of the Oracle
The purpose of the TWO+TWO=FOUR oracle is to find all possible solutions specifically for the TWO+TWO=FOUR problem (presented in section 5.2). The answers to the problem are displayed on an LCD display. There are three major parts of the oracle: the selector, the permuter, and the arithmetic checker. The oracle also contains smaller circuits that store and display the combinations. These are the RAM (memory) module, the RAM counter, and the LCD display module.
Figure 5.1.1: Flowchart of the TWO+TWO=FOUR Oracle
5.2 TWO+TWO=FOUR Problem

The TWO+TWO=FOUR Problem is a cryptarithmetic problem. Each letter of the problem can be substituted with a value 0-9. As stated before, the goal is to find all possible solutions to the problem. Figure 5.2.1 shows the problem.

    T W O
  + T W O
  -------
  F O U R

Figure 5.2.1: TWO+TWO=FOUR Problem
5.3 The Selector
The purpose of the selector is to select values for each different letter. In numerical terms, this means that five out of nine possible numbers (0-9 excluding 1, explained in section 5.5) are selected. To do this operation, the selector is split into two sections: the counter and the multiplexers.
[Figure: block diagram of the selector, showing the counter driving the multiplexers.]
Figure 5.3.1: Diagram for the selector circuit.
The counter portion of the selector simply counts up in the given modulo. This portion can be broken into five smaller counters. In the TWO+TWO=FOUR problem, since there are five unique letters to assign (excluding F), the modulo input is 4 (selection states 000, 001, 010, 011, and 100). Figure 5.3.2 shows a sample count sequence for the counter.

Time (t)   Output Combination
T=0        00000
T=1        10000
T=2        11000
T=3        11100
T=4        11110
T=5        11111
T=6        20000

Figure 5.3.2: Sample count sequence for selector counter.
In the actual circuit, each counter output is 3 bits wide, so a value of 2 (at T=6) is actually binary 010. The outputs of each counter are directly connected to the control inputs of the multiplexers. Each multiplexer has only five inputs, and the inputs to each multiplexer are arranged to match the counter. Figure 5.3.3 shows the multiplexers, and Figure 5.3.4 shows the resulting output of the full selector.
Figure 5.3.3: Connection of multiplexers to input.

Time (t)   Output Combination
T=0        56789
T=1        46789
T=2        45789
T=3        45689
T=4        45679
T=5        45678
T=6        36789

Figure 5.3.4: Sample count sequence for full selector.
5.4 The Permuter

The purpose of the permuter is to permute all of the values given by the selector. This allows every possible number-to-letter combination to be tested (as the selector does not repeat any set of numbers). Much like the selector, the circuit has a counter part and a multiplexer part. The counter part consists of normal counters which select which of the inputs to use. The multiplexer part receives each of the numbers chosen by the selector. The values then pass through 4-of-5, 3-of-4, and 2-of-3 selection stages, so that each successive multiplexer has fewer inputs. Figure 5.4.2 shows a sample sequence of the permuter.

[Figure: block diagram of the permuter, from inputs to outputs.]
Figure 5.4.1: Diagram for the permuter circuit.

Time (t)   Output Combination
T=0        12345
T=1        12354
T=2        12435
T=3        12453
T=4        12534
T=5        12543
T=6        13245

Figure 5.4.2: Sample sequence for the permuter.
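In software, the combined effect of the selector and the permuter can be imitated with Python's `itertools`. This is only an analogue: the generator `candidate_assignments` is a hypothetical name, and the order in which it visits digit sets differs from the hardware counter sequence shown for the selector. The lexicographic order of `itertools.permutations`, however, does match the sample permuter sequence.

```python
from itertools import combinations, islice, permutations

# Digits available for the five letters other than F (1 is reserved for F).
DIGITS = [0, 2, 3, 4, 5, 6, 7, 8, 9]


def candidate_assignments(digits=DIGITS, k=5):
    """Yield every ordered choice of k distinct digits."""
    for chosen in combinations(digits, k):     # role of the selector
        for ordering in permutations(chosen):  # role of the permuter
            yield ordering


# First seven orderings of the inputs 1..5, as in the sample permuter sequence.
first_seven = ["".join(map(str, p)) for p in islice(permutations([1, 2, 3, 4, 5]), 7)]
print(first_seven)
```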
5.5 The Arithmetic Checker

The purpose of the arithmetic checker is to check the arithmetic validity of the problem for the given inputs. The arithmetic checker was designed from equations derived from the TWO+TWO=FOUR problem. Figure 5.5.1 shows the derived equations.
Figure 5.5.1: Equations derived from the TWO+TWO=FOUR problem.
The equations shown in Figure 5.5.1 are the decomposition of each column of the TWO+TWO=FOUR problem. In this problem, it is assumed that F has to equal 1, not 0, because if the value were 0 there would be no need for the letter. Carry positions are also added, as in the handwritten method of addition. The arithmetic checker strictly follows the equations, replacing the +, *, and = operators with their respective gates.

[Figure: circuit built from adders, multipliers, and equality comparators, one group per column of the problem.]
Figure 5.5.2: Complete arithmetic checker circuit.
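The column-by-column check can be sketched in Python as follows. The equations used here are an assumption: they are the usual column decomposition of TWO+TWO=FOUR with carries c1 and c2 and may not match the figure term for term. The function name `arithmetic_checker` is likewise introduced only for this illustration.

```python
def arithmetic_checker(t, w, o, u, r, f=1):
    """Return 1 if the digits satisfy TWO + TWO = FOUR column by column, else 0."""
    c1, r_digit = divmod(o + o, 10)            # ones column:     O + O      = R + 10*c1
    c2, u_digit = divmod(w + w + c1, 10)       # tens column:     W + W + c1 = U + 10*c2
    f_digit, o_digit = divmod(t + t + c2, 10)  # hundreds column: T + T + c2 = O + 10*F
    return int(r_digit == r and u_digit == u and o_digit == o and f_digit == f)


print(arithmetic_checker(9, 3, 8, 7, 6))  # 938 + 938 = 1876
```

Each comparison corresponds to an equality gate, and the final conjunction corresponds to the global AND of the oracle.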
5.6 The RAM Module

Although the circuits described in sections 5.3 to 5.5 make a complete oracle, there is no way for a user to know what the successful combinations are. In order to show the combinations, there first needs to be some type of memory to store them. This is done on the FPGA by utilizing its RAM (Random Access Memory).
The circuit used for the RAM of the FPGA is based on the concept of vectors. In a vector, information can be inserted as well as removed according to a reference number. The circuit first stores the successful combinations (combinations with an arithmetic checker output of 1) into the vector. In the post-processing stage, the combinations in the vector can be displayed to the user by means of a counter, which is explained in section 5.7.

[Figure: vector with pop-in and pop-out operations and removal by address.]
Figure 5.6.1: Concept of a vector used for memory in the TWO+TWO=FOUR oracle.
5.7 The RAM Counter

In order to cycle through all of the solutions to the problem, a counter is required. The counter used for this task is very similar to a normal counter, incrementing its value by 1, but it has an adjustable limit. This limit ensures that the counter does not reach empty cells in the vector, so that it counts only through the stored solutions.
5.8 The LCD Display

The LCD Display is required in order to display the solutions. Without it, the user would not be able to see the calculated answers. The circuit for the LCD is fairly simple: the output from the RAM is displayed on the LCD. Only the initialization of special ports (to turn on the backlight and other functions) is required.
5.9 Verification of the TWO+TWO=FOUR Oracle

In order to verify that the TWO+TWO=FOUR oracle works properly, the solutions output by the oracle are compared to the solutions from a software program. This software program always finds all possible solutions using exhaustive search (refer to section 6.9). If the oracle works, the solutions produced should be identical. Notice that all solutions considered here have F=1.

Solutions Found for TWO+TWO=FOUR

Oracle on FPGA    Program on Computer
938+938=1876      765+765=1530
734+734=1468      836+836=1672
867+867=1734      846+846=1692
928+928=1856      867+867=1734
846+846=1692      928+928=1856
836+836=1672      938+938=1876
765+765=1530      734+734=1468

Figure 5.9.1: Table showing the solutions of both methods for TWO+TWO=FOUR. Note that the solutions always have F=1. To see all solutions when F=0, refer to section 6.7.
The solutions in Figure 5.9.1 are identical for both methods. The only difference is the order in which they are found, due to a different number generator. The data show that the TWO+TWO=FOUR oracle works and generates all solutions for the problem.
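The solution set can be cross-checked independently with a short exhaustive search in Python. This sketch is not the report's software oracle (which is described in a later section); the function name `solve_two_two_four` is introduced here, and F is fixed to 1 as in the hardware oracle.

```python
from itertools import permutations


def solve_two_two_four():
    """Exhaustively search for TWO + TWO = FOUR with F = 1 and distinct digits."""
    solutions = []
    digits = [0, 2, 3, 4, 5, 6, 7, 8, 9]  # 1 is reserved for F
    for t, w, o, u, r in permutations(digits, 5):
        two = 100 * t + 10 * w + o
        four = 1000 + 100 * o + 10 * u + r
        if two + two == four:
            solutions.append(f"{two}+{two}={four}")
    return sorted(solutions, reverse=True)


for line in solve_two_two_four():
    print(line)
```

The search visits 15120 candidate assignments, the same number the hardware steps through.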
5.10 TWO+TWO=FOUR Problem Experiment

The TWO+TWO=FOUR experiment is a comparison between two methods: a hardware-optimized implementation of the oracle, and a software implementation of an arbitrary oracle. The hardware implementation is run on a Terasic DE2-115 board, while the software implementation is run on a laptop with an Intel Core i5 CPU clocked at 2.40 GHz. The methods are compared for speed, that is, the time it takes to find all solutions for the same given problem. Timing counters are implemented in both the hardware and software oracles, yielding more accurate results. The answers of the hardware oracle are verified by the software program. Refer to section 6.9 for further explanation of the software. The solutions are shown below.

938+938=1876
928+928=1856
867+867=1734
846+846=1692
836+836=1672
765+765=1530
734+734=1468
Since the selector, permuter, and arithmetic checker all operate within one clock pulse, in theory (without glitches in the circuit) it takes n clock pulses to go through n combinations. The circuit has to go through 9C5 combinations, and for each of those combinations it has to go through 5P5 permutations. Multiplying the two gives the number of clock pulses needed to complete the search.

$_9C_5 \times {}_5P_5 = 126 \times 120 = 15120$

Figure 5.10.1 Derivation of number of clock pulses required to solve the problem
The number of clock pulses is then converted into time. 1 MHz is equivalent to 1000 kHz. Since the clock on the DE2-115 is 50 MHz, it is equivalent to 50(1000) = 50,000 kHz. The period in milliseconds is the reciprocal of the frequency in kilohertz, so each cycle takes 1/50,000 milliseconds. Multiplying the number of cycles, 15120, by this factor gives 0.3024, so the predicted computational time is 0.3024 milliseconds.
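The same arithmetic can be reproduced in a few lines of Python. The variable names are illustrative, and the calculation assumes, as the text does, exactly one candidate per clock cycle at 50 MHz.

```python
from math import comb, perm

CLOCK_HZ = 50_000_000  # 50 MHz clock on the DE2-115

cycles = comb(9, 5) * perm(5, 5)        # 126 selections x 120 permutations
time_ms = cycles / CLOCK_HZ * 1000      # seconds converted to milliseconds

print(cycles, time_ms)
```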
Hardware vs Software Computational Time of TWO+TWO=FOUR

           Hardware (Predicted)   Hardware (Actual)   Software
           In milliseconds        In milliseconds     In milliseconds
Trial 1    0.3024                 0.2998              1105886
Trial 2    0.3024                 0.2998              1136322
Trial 3    0.3024                 0.2999              1136897
Average    0.3024                 0.2999              1126368

Table 5.10.1: Comparison of an oracle for hardware and software in terms of computational time for TWO+TWO=FOUR.
[Graph: "Comparison of Computational Time for TWO+TWO=FOUR". Bar graph of computational time (ms) for Trial 1, Trial 2, and Trial 3, with bars for Hardware (Predicted), Hardware (Actual), and Software.]
Graph 5.10.1: Bar graph comparing computational time of a hardware and software implementation of an oracle for the TWO+TWO=FOUR Problem.

[Graph: "Comparison of Average Computational Time for TWO+TWO=FOUR". Bar graph of average computation time (ms) for Hardware (Predicted), Hardware (Actual), and Software.]
Graph 5.10.2: Bar graph comparing the average computational time of the hardware and software implementations for the TWO+TWO=FOUR Problem.
The data in Table 5.10.1 and the graphs show that the software was significantly slower than both the predicted and the actual hardware. The predicted and actual hardware times were not identical, but they differed only slightly. On average, the software took 1126368 milliseconds, while the hardware took only 0.2999 milliseconds. Thus, from the data, the hardware has a speed-up of about 3.7 million times (1126368 / 0.3024 ≈ 3724762 when computed against the predicted hardware time).
5.11 Conclusion for the TWO+TWO=FOUR Problem

In conclusion, in the TWO+TWO=FOUR experiment the hardware implementation of the oracle performed significantly faster than the software implementation in terms of computational time. For all trials tested, the hardware performed over three million times faster than the software equivalent. The predicted and actual hardware times were very close to each other, with each trial taking around 0.3 milliseconds. The actual time was slightly lower than predicted because some of the combinations are processed in parallel by the selector-permuter setup. However, the difference is so small that an approximate computational time for the FPGA can be worked out by hand. There were no significant sources of error while taking data, as the timers were implemented on their respective systems. However, several improvements could be made to the experiment. First of all, the software tested is built for arbitrary problems, which makes it very inefficient for this specific problem. Therefore, in order to have a fairer comparison, a software oracle specific to the TWO+TWO=FOUR problem needs to be made. Also, the clocks of the two systems are different (50 MHz on the FPGA versus 2.4 GHz on the computer). To measure the true ratio of FPGA speed-up, the clocks would have to be equivalent.
6. Oracle for Arbitrary Cryptarithmetic Problems

6.1 Overview of the Oracle
The purpose of the arbitrary cryptarithmetic solver is to find all possible answers to a problem entered by the user. The answers are then displayed on an LCD display. There are three parts to the oracle: the counter, the Input Validity Checker, and the arithmetic checker. These parts differ from those of the specific TWO+TWO=FOUR oracle because the number of variables used is not known in advance, and therefore it is not known how many numbers need to be selected and permuted. Although a counter is very inefficient and generates many garbage combinations, it is a general way to generate all possible combinations for an arbitrary problem. The results from the oracle are then stored in memory and displayed by postprocessing circuits.
Figure 6.1.1: Flowchart for the oracle for arbitrary cryptarithmetic problems.
6.2 The Input of the Oracle

The user inputs the desired problem into the oracle. However, constraints are imposed on the input in order to prevent the user from inputting faulty problems. This is done by limiting the input to only 3 letters in all rows. The layout is shown below in figure 6.2.1.
         X1    X2    X3
  +      X4    X5    X6
  X10    X9    X8    X7
Figure 6.2.1: Layout of the arbitrary oracle inputs.
Notice that in the last row of the problem, Xn increments from right to left (instead of left to right). This is because X10 can take only two values, 0 and 1, and it is therefore logical to make it the last Xn value (for the counter described in section 6.3). In the input, each Xn can be replaced with any letter A through Z or with 0, where 0 represents an empty space. An empty space can be placed anywhere in the first two rows, but not in the last row (since the sum of any two numbers is never empty). In the oracle, this empty space is treated the same as a constant zero. The given input is then encoded to binary from its ASCII value. This allows the values to be compared with each other, which produces the following control values: 00 meaning inequality, 01 meaning equality, and 10 meaning that the letter is a 0 (empty space). The control values are used in the Input Validity Checker circuit.
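The control values described above can be illustrated with a short software sketch. This is not the hardware encoder itself. The names `control_value` and `control_table` are hypothetical, and it is assumed here that a pair is marked 10 whenever either of its two slots is empty.

```python
from itertools import combinations

EMPTY = "0"  # an empty space is entered as 0, as described in the text


def control_value(a, b):
    """Return the 2-bit control value for one pair of input slots."""
    if a == EMPTY or b == EMPTY:
        return 0b10  # at least one slot is an empty space (assumption)
    return 0b01 if a == b else 0b00  # 01 = equality, 00 = inequality


def control_table(slots):
    """Map every pair (i, j) of slots X1..X10 to its control value."""
    pairs = combinations(range(len(slots)), 2)
    return {(i + 1, j + 1): control_value(slots[i], slots[j]) for i, j in pairs}


# Slots X1..X10 for TWO+TWO=FOUR, following the layout in figure 6.2.1:
# X1 X2 X3 = T W O, X4 X5 X6 = T W O, X10 X9 X8 X7 = F O U R.
table = control_table("TWOTWORUOF")
print(table[(1, 4)])  # X1 and X4 are both T
print(table[(1, 2)])  # X1 is T, X2 is W
```

With 10 slots there are 45 pairs, so 45 control values are produced for each problem.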
6.3 The Counter

The counter is used to generate all possible combinations for any arbitrary problem in the given domain (shown above in figure 6.2.1). This counter consists of 10 smaller counters, each of which counts from 0 to 9. This is because each letter Xn can have any value from 0 to 9, and there are 10 letters (X1 to X10) in total. The value of each letter is directly connected to the output of the corresponding smaller counter in the main counter. The pattern in which the counter counts is shown below in figure 6.3.1.
Time (t)    Output Combination
T = 0       0000000000
T = 1       0000000001
T = 9       0000000009
T = 10      0000000010
T = 11      0000000011
Figure 6.3.2: Sample count sequence of the counter.
When the first counter increments past its maximum of 9, it increments the counter of the next higher place value (the one on its left) by 1. This is done in the circuit by connecting the enable of each counter (the input which allows the counter to count) to the overflow of the previous counter (the counter to its right). Thus, when a counter overflows, it signals a 1 to the enable of the next counter, allowing it to increment. However, an AND gate is required on the enable of the 3rd to 10th counters from the right. This is because the overflow signal of the previous counter (the 2nd from the right, for instance) stays at 1 until the next time that counter increments, which happens only when the counter before it (the 1st counter from the right) overflows. Since an enabled counter increments on every clock pulse, it would keep incrementing for as long as its enable is on. The sequence of the faulty counter (with a direct overflow-to-enable connection) is shown below in figure 6.3.3.
Time (t)    Output Combination
T = 90      0000000090
T = 91      0000000191
T = 99      0000000999
T = 100     0000001000
T = 101     0000001001
Figure 6.3.3: Incorrect counter sequence.
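The difference between the gated and the direct enable connection can be reproduced with a small software simulation. This is a sketch only, not the Verilog used on the board. It assumes that the overflow signal of a decade counter is 1 while that counter holds 9, and that the AND gate combines this overflow with the enable of the previous counter. The names `step` and `count` are illustrative.

```python
def step(digits, gated):
    """Advance the 10-digit counter by one clock pulse.

    digits[0] is the rightmost (least significant) decade counter.
    """
    enables = [True]  # the rightmost counter counts on every pulse
    for i in range(1, len(digits)):
        overflow = digits[i - 1] == 9  # previous counter sits at its maximum
        if gated:
            # AND gate: count only if every counter to the right is also rolling over
            enables.append(overflow and enables[i - 1])
        else:
            # direct overflow-to-enable connection (the faulty design)
            enables.append(overflow)
    return [(d + 1) % 10 if en else d for d, en in zip(digits, enables)]


def count(pulses, gated):
    """Return the counter output after the given number of clock pulses."""
    digits = [0] * 10
    for _ in range(pulses):
        digits = step(digits, gated)
    return "".join(str(d) for d in reversed(digits))


for t in (90, 91, 99, 100, 101):
    print(t, count(t, gated=True), count(t, gated=False))
```

With `gated=False` the simulation follows the faulty sequence of figure 6.3.3 (for example 0000000191 at T = 91), while with `gated=True` it counts normally.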
6.4 The Input Validity Checker

The role of the Input Validity Checker circuit is to verify that the inputs are valid. It does not check whether the problem is satisfied arithmetically, but checks that different letters do not hold the same value and that repeated letters do. The Input Validity Checker consists of many equality and inequality checkers, which are reconfigured for each new problem. Each equality or inequality checker outputs a 0 if its condition is not satisfied, and a 1 if it is satisfied. All of the outputs are then fed into a global AND gate. Figure 6.4.2 shows an example checker for the case of TWO+TWO=FOUR.
Letter      T     W     O     T     W     O     F     O     U     R
Variable    X1    X2    X3    X4    X5    X6    X10   X9    X8    X7
Figure 6.4.1: Inequality and equality comparison cases for the TWO+TWO=FOUR problem
[Circuit diagram: comparators on the pairs (X1, X2), (X1, X3), (X1, X4), ..., (X8, X9), (X8, X10), (X9, X10), all feeding a global AND gate whose output tells if the inputs are valid]
Figure 6.4.2: Subset of circuit for the scenario described in figure 6.4.1.
It can be seen that for every pair of identical letters there is an equality checker, for every pair of different letters there is an inequality checker, and for all other cases the output is automatically 1 (not applicable for the TWO+TWO=FOUR problem). The cases where the output is automatically 1 are those where a letter value is being compared with a null value.
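The behaviour of the Input Validity Checker can be sketched in software as follows. This is an illustrative model rather than the circuit itself. The function name `inputs_valid` is hypothetical, and it is assumed that any pair involving an empty slot is automatically satisfied.

```python
from itertools import combinations

EMPTY = "0"  # empty space in the input


def inputs_valid(slots, digits):
    """Model of the checker: one comparator per pair, then a global AND.

    slots  -- the letters placed in X1..X10
    digits -- the current counter values for X1..X10
    """
    outputs = []
    for i, j in combinations(range(len(slots)), 2):
        if EMPTY in (slots[i], slots[j]):
            outputs.append(True)                    # null comparison: always 1
        elif slots[i] == slots[j]:
            outputs.append(digits[i] == digits[j])  # equality checker
        else:
            outputs.append(digits[i] != digits[j])  # inequality checker
    return all(outputs)                             # global AND gate


SLOTS = "TWOTWORUOF"  # X1..X10 for TWO+TWO=FOUR

# 734+734=1468 written in slot order X1..X10
print(inputs_valid(SLOTS, [7, 3, 4, 7, 3, 4, 8, 6, 4, 1]))
# Same values, but X10 (F) set equal to X1 (T)
print(inputs_valid(SLOTS, [7, 3, 4, 7, 3, 4, 8, 6, 4, 7]))
```

The first call passes every comparator, while the second fails the inequality check between F and T.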
6.5 The Arithmetic Checker

The purpose of the arithmetic checker is to verify that the inputs satisfy the problem arithmetically. This check is performed regardless of the output of the Input Validity Checker, so there may be combinations for which the Input Validity Checker is not satisfied while the arithmetic checker is. The arithmetic checker can be derived from the set of equations shown in figure 6.5.1.
Figure 6.5.1: Equations for the Arithmetic Checker Circuit
The circuit can be derived by strictly following the equations. There is an adder for each + in the equations, a multiplier for each term such as 10*C3, and a comparator for each =.
[Circuit diagram: adders (+), multipliers (X), and comparators (=) connected according to the equations in figure 6.5.1]
Figure 6.5.2: The Arithmetic Checker Circuit
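Since the equations of figure 6.5.1 are not reproduced in this text, the following sketch uses one set of column equations that is consistent with the layout in figure 6.2.1. It is an assumption, not necessarily the author's exact formulation: X3 + X6 = X7 + 10*C1, X2 + X5 + C1 = X8 + 10*C2, X1 + X4 + C2 = X9 + 10*C3, and C3 = X10. The function name `arithmetic_ok` is illustrative.

```python
def arithmetic_ok(x):
    """Check the column-by-column sum for slot values x = [X1, ..., X10].

    Assumed layout (figure 6.2.1):  X1 X2 X3 + X4 X5 X6 = X10 X9 X8 X7.
    """
    x1, x2, x3, x4, x5, x6, x7, x8, x9, x10 = x
    c1, r1 = divmod(x3 + x6, 10)       # ones column and its carry
    c2, r2 = divmod(x2 + x5 + c1, 10)  # tens column and its carry
    c3, r3 = divmod(x1 + x4 + c2, 10)  # hundreds column and its carry
    return r1 == x7 and r2 == x8 and r3 == x9 and c3 == x10


# 734+734=1468 in slot order X1..X10
print(arithmetic_ok([7, 3, 4, 7, 3, 4, 8, 6, 4, 1]))
# 123+456=0579 satisfies the arithmetic even though it is not a valid
# letter assignment for TWO+TWO=FOUR
print(arithmetic_ok([1, 2, 3, 4, 5, 6, 9, 7, 5, 0]))
```

The second call shows a combination that passes the arithmetic checker but would be rejected by the Input Validity Checker, which is why both outputs are needed.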
6.6 Software Variation of the Oracle for Arbitrary Cryptarithmetic Problems

In order to compare against a software implementation of the oracle, such an implementation must be created. Therefore, in order to have a more accurate comparison, the same circuit is implemented in software, where programming shortcuts (such as if-else statements) are used when applicable. The software variation is able to find all possible solutions, as an exhaustive search over all combinations is performed, and each module is verified extensively (using different methods). See Appendix D for the program code.
6.7 Verification of the Oracle for Arbitrary Cryptarithmetic Problems

In order to verify that the oracle for arbitrary cryptarithmetic problems works as intended, the solutions generated were analyzed. As stated in section 6.6, a software program running an exhaustive search of the problem would not miss any solutions. Therefore, the oracle can be verified as working if the solutions calculated by the two methods are the same. Four cryptarithmetic problems were tested: TWO+TWO=FOUR, ONE+ONE=TWO, TOO+TOO=LONG, and OOP+LLP=PRGM.
Solutions Found for TWO+TWO=FOUR

Oracle on FPGA    Program on Computer
765+765=1530      765+765=1530
836+836=1672      836+836=1672
346+346=0692      346+346=0692
846+846=1692      846+846=1692
357+357=0714      357+357=0714
867+867=1734      867+867=1734
132+132=0264      132+132=0264
418+418=0836      418+418=0836
173+173=0346      173+173=0346
428+428=0856      428+428=0856
928+928=1856      928+928=1856
438+438=0876      438+438=0876
938+938=1876      938+938=1876
193+193=0386      193+193=0386
459+459=0918      459+459=0918
469+469=0938      469+469=0938
479+479=0958      479+479=0958
234+234=0468      234+234=0468
734+734=1468      734+734=1468
Figure 6.7.1: Solutions found by both methods for TWO+TWO=FOUR.
Solutions Found for ONE+ONE=TWO

Oracle on FPGA    Program on Computer
065+065=0130      065+065=0130
085+085=0170      085+085=0170
206+206=0412      206+206=0412
271+271=0542      271+271=0542
231+231=0462      231+231=0462
281+281=0562      281+281=0562
236+236=0472      236+236=0472
286+286=0572      286+286=0572
291+291=0582      291+291=0582
452+452=0904      452+452=0904
407+407=0814      407+407=0814
457+457=0914      457+457=0914
417+417=0834      417+417=0834
467+467=0934      467+467=0934
427+427=0854      427+427=0854
432+432=0864      432+432=0864
482+482=0964      482+482=0964
608+608=1216      608+608=1216
658+658=1316      658+658=1316
618+618=1236      618+618=1236
678+678=1356      678+678=1356
638+638=1276      638+638=1276
643+643=1286      643+643=1286
648+648=1296      648+648=1296
854+854=1708      854+854=1708
809+809=1618      809+809=1618
859+859=1718      859+859=1718
814+814=1628      814+814=1628
864+864=1728      864+864=1728
819+819=1638      819+819=1638
869+869=1738      869+869=1738
829+829=1658      829+829=1658
839+839=1678      839+839=1678
Figure 6.7.2: Solutions found by both methods for ONE+ONE=TWO.
Solutions Found for TOO+TOO=LONG

Oracle on FPGA    Program on Computer
877+877=1754      877+877=1754
377+377=0754      377+377=0754
Figure 6.7.3: Solutions found by both methods for TOO+TOO=LONG.
Solutions Found for OOP+LLP=PRGM

Oracle on FPGA    Program on Computer
881+771=1652      881+771=1652
771+881=1652      771+881=1652
881+551=1432      881+551=1432
551+881=1432      551+881=1432
771+661=1432      771+661=1432
661+771=1432      661+771=1432
881+661=1542      881+661=1542
661+881=1542      661+881=1542
Figure 6.7.4: Solutions found by both methods for OOP+LLP=PRGM.
From the data shown in figures 6.7.1 to 6.7.4, it can be seen that the answers from both implementations of the oracle are identical. This indicates that the oracle in hardware (FPGA) is working properly, and supports the claim that it can find all possible solutions to any given problem.
6.8 Oracle for Arbitrary Cryptarithmetic Problems Experiment

The experiment for the oracle for arbitrary cryptarithmetic problems compares two implementations of an oracle: a hardware implementation, and a software implementation of the same oracle. The hardware implementation is run on a Terasic DE2-115 board, while the software implementation is run on a laptop with an Intel Core i5 CPU clocked at 2.40 GHz. The methods are compared for speed, that is, the time each takes to find all solutions to the same cryptarithmetic problem. A counter is implemented in both the hardware and software oracles for more accurate timing results. The software variant of the oracle is also used to verify that the oracle in hardware has generated all solutions (for reasons explained in section 6.6). Much like the TWO+TWO=FOUR oracle, the time of the hardware implementation of the oracle can be calculated. There are 2,000,000,000 combinations that the oracle will go through (X10 takes 2 values, and each of X1 to X9 takes 10 values). The calculation time is derived using the steps shown in figure 6.8.1 (refer to section 5.10 for a more detailed explanation).
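The predicted time can be worked out in the same way as in section 5.10. The sketch below assumes one combination per clock pulse on the 50 MHz clock, with ten possible values for each of X1 to X9 and two for X10. The variable names are illustrative.

```python
from math import prod

CLOCK_HZ = 50_000_000  # 50 MHz clock on the DE2-115

# Number of values each slot can take: X1..X9 count 0-9, X10 is 0 or 1.
slot_ranges = [10] * 9 + [2]

combinations = prod(slot_ranges)
predicted_ms = combinations * 1000 // CLOCK_HZ  # one combination per pulse

print(combinations)
print(predicted_ms)
```

This gives 2,000,000,000 combinations and a predicted time of 40000 milliseconds, the value used in the Hardware (Predicted) column of the tables in this section.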
Figure 6.8.1: Steps for calculating the calculation time of the oracle implemented in hardware.
For this experiment, four different problems are tested: TWO+TWO=FOUR, ONE+ONE=TWO, TOO+TOO=LONG, and OOP+LLP=PRGM.
6.8.1 TWO+TWO=FOUR

           Hardware (Predicted)   Hardware (Actual)   Software
           In milliseconds        In milliseconds     In milliseconds
Trial 1    40000                  40254               225086
Trial 2    40000                  40302               225371
Trial 3    40000                  40187               225463
Average    40000                  40247.67            225306.7
Table 6.8.1.1: Comparison of an oracle for hardware and software in terms of computational time for TWO+TWO=FOUR.

[Bar graph; vertical axis: Computational Time (ms); groups: Trial 1, Trial 2, Trial 3; series: Hardware (Predicted), Hardware (Actual), Software]
Graph 6.8.1.1: Bar graph comparing computational time of a hardware and software implementation of an oracle for the TWO+TWO=FOUR Problem.
[Bar graph; vertical axis: Average Computation Time (ms); series: Hardware (Predicted), Hardware (Actual), Software]
Graph 6.8.1.2: Bar graph comparing the average computational time of the hardware and software implementations for the TWO+TWO=FOUR Problem.
From the data shown above, the hardware calculation times were much faster than those of the software counterpart. Also, the predicted and actual times for the hardware oracle were slightly different, indicating that the real hardware oracle has a slight delay in calculation. In the average comparison, the software took 225306.7 milliseconds (almost 4 minutes) to calculate, while the FPGA took only 40247.67 milliseconds (around 40 seconds).
6.8.2 ONE+ONE=TWO
Hardware vs Software Computational Time of ONE+ONE=TWO

           Hardware (Predicted)   Hardware (Actual)   Software
           In milliseconds        In milliseconds     In milliseconds
Trial 1    40000                  40260               232735
Trial 2    40000                  40183               247241
Trial 3    40000                  40255               255811
Average    40000                  40232.67            245262.3
Table 6.8.2.1: Comparison of an oracle for hardware and software in terms of computational time for ONE+ONE=TWO.
Comparison of Computational Time for ONE+ONE=TWO
[Bar graph; vertical axis: Computational Time (ms); groups: Trial 1, Trial 2, Trial 3; series: Hardware (Predicted), Hardware (Actual), Software]
Graph 6.8.2.1: Bar graph comparing computational time of a hardware and software implementation of an oracle for the ONE+ONE=TWO Problem.
Comparison of Average Computation Time for ONE+ONE=TWO
[Bar graph; vertical axis: Average Computational Time (ms); series: Hardware (Predicted), Hardware (Actual), Software]
Graph 6.8.2.2: Bar graph comparing the average computational time of the hardware and software implementations for the ONE+ONE=TWO Problem.
From the data shown above, the hardware calculation times were much faster than those of the software counterpart. Again, the predicted and actual times for the hardware oracle were slightly different, with the actual time being slightly higher. In the average comparison, the software took 245262.3 milliseconds to calculate, while the FPGA took only 40232.67 milliseconds.
6.8.3 TOO+TOO=LONG
Hardware vs Software Computational Time of TOO+TOO=LONG

           Hardware (Predicted)   Hardware (Actual)   Software
           In milliseconds        In milliseconds     In milliseconds
Trial 1    40000                  40230               235015
Trial 2    40000                  40268               242760
Trial 3    40000                  40227               286393
Average    40000                  40241.67            254722.7
Table 6.8.3.1: Comparison of an oracle for hardware and software in terms of computational time for TOO+TOO=LONG.
Comparison of Computational Time for TOO+TOO=LONG
[Bar graph; vertical axis: Computational Time (ms); groups: Trial 1, Trial 2, Trial 3; series: Hardware (Predicted), Hardware (Actual), Software]
Graph 6.8.3.1: Bar graph comparing computational time of a hardware and software implementation of an oracle for the TOO+TOO=LONG Problem.
Comparison of Average Computation Time for TOO+TOO=LONG
[Bar graph; vertical axis: Average Computational Time (ms); series: Hardware (Predicted), Hardware (Actual), Software]
Graph 6.8.3.2: Bar graph comparing the average computational time of the hardware and software implementations for the TOO+TOO=LONG Problem.
From the data shown in the graphs and table, the hardware implementation again performed much faster than the software implementation. Again, the predicted and actual calculation times for the hardware implementation of the oracle were very close, but never equal. Comparing the averages of both methods, the hardware implementation took only 40241.67 milliseconds, while the software implementation took 254722.7 milliseconds.
6.8.4 OOP+LLP=PRGM
*Note: OOP and LLP are types of programs, and PRGM is an abbreviation for program.
Hardware vs Software Computational Time of OOP+LLP=PRGM

           Hardware (Predicted)   Hardware (Actual)   Software
           In milliseconds        In milliseconds     In milliseconds
Trial 1    40000                  40159               223127
Trial 2    40000                  40205               228621
Trial 3    40000                  40237               289617
Average    40000                  40200.33            247121.7
Table 6.8.4.1: Comparison of an oracle for hardware and software in terms of computational time for OOP+LLP=PRGM.
Comparison of Computational Time for OOP+LLP=PRGM
[Bar graph; vertical axis: Computational Time (ms); groups: Trial 1, Trial 2, Trial 3; series: Hardware (Predicted), Hardware (Actual), Software]
Graph 6.8.4.1: Bar graph comparing computational time of a hardware and software implementation of an oracle for the OOP+LLP=PRGM Problem.
Comparison of Average Computational Time for OOP+LLP=PRGM
[Bar graph; vertical axis: Average Computation Time (ms); series: Hardware (Predicted), Hardware (Actual), Software]
Graph 6.8.4.2: Bar graph comparing the average computational time of the hardware and software implementations for the OOP+LLP=PRGM Problem.
From the data shown above, the hardware calculation times again were faster than those of the software counterpart. As before, the actual calculation time for the hardware oracle was slightly higher than the predicted time. In the comparison of the average times, the software took 247121.7 milliseconds to calculate the solutions, while the FPGA took only 40200.33 milliseconds.
6.9 Conclusion for the Oracle for Arbitrary Cryptarithmetic Problems

In conclusion, in the experiment on the oracle for arbitrary cryptarithmetic problems, the hardware implementation of the oracle is considerably faster than its software counterpart. For all trials of the four problems tested, the hardware implementation performed between roughly 5.5 and 7.2 times faster than the software variant. The hardware implementation was very consistent, with every trial taking between 40159 and 40302 milliseconds regardless of the problem. The small excess over the predicted 40000 milliseconds is most likely due to small delays within the electronic circuit. The software implementation of the oracle, however, was less stable, ranging from 223127 to 289617 milliseconds. Similar to the TWO+TWO=FOUR problem, there were no significant sources of timing error, since the timers are implemented on both systems. However, improvements could be made to make the experiment more accurate. As discussed with the TWO+TWO=FOUR problem, one limitation is that the clock speeds of the two methods are different. If the FPGA had a clock as fast as the computer's, it would perform much faster still.
6.10 Future Extensions for the Oracle for Arbitrary Cryptarithmetic Problems
Since the exhaustive search performed slowly, a different method is needed in order to solve larger problems. The current method already goes through all the combinations in the least possible time (1 combination per clock pulse), which leaves artificial intelligence techniques as the remaining option. By using artificial intelligence, the search space becomes drastically smaller. One artificial intelligence algorithm which could be applied to solve cryptarithmetic problems is the breadth-first search. In this search, the algorithm goes through all cases in the current layer before going deeper into a new layer. Each node contains a letter-to-number mapping, and each node is expanded into all possible number mappings for the next letter. After a node is reached, local minimization is done in order to simplify the problem. If the current node is found to be invalid by local minimization, the rest of that node's branches are cut, saving the time needed to search those combinations.
TWO+TWO=FOUR
    T=0: 0WO+0WO=FOUR
    T=1: 1WO+1WO=FOUR
    T=2: 2WO+2WO=FOUR
    ...
    T=9: 9WO+9WO=FOUR
Figure 6.10.2: The first layer of the breadth-first search for TWO+TWO=FOUR. The tree shows several nodes which are already determined incorrect by local minimization.
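A layer-by-layer search with pruning can be sketched in software as follows. This is only an illustration of the idea, not the proposed hardware design. A simple column-consistency check stands in for the local minimization step. Letters are expanded in right-to-left column order (rather than starting with T as in the figure) so that the check can prune early. Leading zeros are allowed, as in the solution lists of section 6.7. The names `consistent` and `solve_bfs` are hypothetical.

```python
from collections import deque


def consistent(mapping, a, b, c):
    """Return False only if an already-decidable column is violated."""
    carry = 0
    for k in range(1, max(len(a), len(b), len(c)) + 1):
        column = [w[-k] if k <= len(w) else None for w in (a, b, c)]
        if any(ch is not None and ch not in mapping for ch in column):
            return True  # this column is not fully assigned yet
        da, db, dc = (mapping[ch] if ch is not None else 0 for ch in column)
        total = da + db + carry
        if total % 10 != dc:
            return False  # prune: this branch can never become a solution
        carry = total // 10
    return carry == 0


def solve_bfs(a, b, c):
    """Breadth-first search over letter-to-digit mappings for a + b = c."""
    order = []  # letters in the order the columns need them
    for k in range(1, max(len(a), len(b), len(c)) + 1):
        for w in (a, b, c):
            if k <= len(w) and w[-k] not in order:
                order.append(w[-k])
    layer = deque([{}])
    for letter in order:  # one layer of the tree per letter
        next_layer = deque()
        while layer:
            node = layer.popleft()
            used = set(node.values())
            for digit in range(10):
                if digit in used:
                    continue  # different letters need different digits
                child = {**node, letter: digit}
                if consistent(child, a, b, c):
                    next_layer.append(child)
        layer = next_layer
    return list(layer)


solutions = solve_bfs("TWO", "TWO", "FOUR")
print(len(solutions))
```

For TWO+TWO=FOUR this sketch returns the same 19 mappings listed in figure 6.7.1 while expanding far fewer nodes than the 2,000,000,000 combinations of the counter.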
7. Oracle for Single Solutions of Arbitrary Cryptarithmetic Problems
7.1 Overview of the Problem

The designs of the oracles described in sections 5 and 6 mainly focus on finding all possible solutions to a problem. However, in many situations one does not need all solutions; sometimes even one solution is sufficient. Reducing the search space of the problem reduces the time needed for calculation, but creates new problems for the design of the oracle. In the original problem of finding all possible solutions, it can be shown (sections 6.8.1 to 6.8.4) that the calculation time is relatively constant. Since the counter's speed depends heavily on clock speed, the only way to speed up the calculation is to reduce the search space. However, when only one solution is required, several different methods to solve the problem arise. In this section, the methods of flat search, mutation, and a genetic algorithm will be discussed.
7.2 Efficiency of Current Oracle Design

Although the design of the oracle shown in section 6 is shown to have a faster calculation time than the software oracle, the result is not surprising. It is a well-known result that hardware is faster than software. However, it is also shown that solving a problem of the form 5 letters + 5 letters = 6 letters with the current design is impractical (635 years). Therefore, better methods are needed to bring the calculation time down to a polynomial scale.
[Plot; vertical axis: Oracle Output; horizontal axis: Current Value of Counter, from 000000000 to 199999999; the search starts at the left end of the axis]
Figure 7.2.1: Example search space of the current oracle design.
7.3 Discussion of Different Designs

The single-solution calculation times of the current oracle design (reported in section 7.6) are not very quick. This is because the calculation time of the oracle depends on the location of the first solution. If the X10 value of the solution is in the 0-5 range, the oracle will perform adequately fast. However, if the X10 value of the solution is within the 6-9 range, the oracle will perform noticeably slowly (close to three minutes). One way to solve this problem is to start counting from a random point. This is because from the starting point of 0000000000, it takes 12345678 clock pulses to get to the first valid solution. By selecting a starting point in the middle of all combinations, the counter could potentially begin closer to a solution. Another way to solve this problem is to use artificial intelligence. By using genetic algorithms, the hardware can make intelligent moves which bring the current values closer to the solution. In this method, the location of the solution will not matter as much as it does for counter-based oracles.
7.4 Random-Start Algorithm

In the mutation stage of a genetic algorithm, a random position in the chromosome is chosen, and a random value is generated for that gene. The random-start algorithm simulates mutation, in which the chromosomes (value of the counter) are randomly generated every several cycles.
The random-start algorithm first generates a random initial value for the counter (generated by a linear feedback shift register). The counter then increments from that value for a predefined number of clock pulses, after which a new value is generated for the counter. This is somewhat similar to mutation, where after a period of time, a mutation occurs. Figure 7.5.1 shows an example search space of the Random-Start Algorithm. The blue peaks represent the solutions (0 for no, and 1 for yes), and the horizontal arrows show an example search sequence for the algorithm. The orange vertical peak is the selected solution, and the numbers over the arrows show the order of the search.
[Plot; vertical axis: Oracle Output (0 or 1); horizontal axis: counter value from 000000000 to 199999999; search arrows numbered 1 and 2]
Figure 7.5.1: Example search space of the Random-Start Algorithm
It can be noted that the Random-Start algorithm does not have a general calculation time. This is because the calculation time is heavily reliant on the pseudo-random number generator. If the random number generator generates the right sequence, then the problem is quickly solved. However, if the generated starting points are not close to a solution, then the algorithm takes a longer time to solve the problem.
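The random-start idea can be sketched in software as follows. This is an illustration under stated assumptions, not the circuit used in the experiment. It uses a textbook 16-bit Fibonacci LFSR (taps 16, 14, 13, 11) because the report does not specify the LFSR width or taps. The oracle is passed in as a function, and the names `lfsr16`, `next_word`, and `random_start_search` are hypothetical.

```python
def lfsr16(state):
    """One step of a 16-bit Fibonacci LFSR with taps 16, 14, 13, 11."""
    bit = (state ^ (state >> 2) ^ (state >> 3) ^ (state >> 5)) & 1
    return (state >> 1) | (bit << 15)


def next_word(state):
    """Clock the LFSR 16 times so that a fresh 16-bit word is produced."""
    for _ in range(16):
        state = lfsr16(state)
    return state


def random_start_search(oracle, space_size, burst, seed, max_bursts):
    """Count upward from pseudo-random starting points until oracle(value) is true.

    oracle     -- function returning True when a counter value is a solution
    space_size -- number of counter values (2,000,000,000 in the report)
    burst      -- clock pulses spent counting before a new start is drawn
    seed       -- nonzero initial LFSR state
    """
    state = seed
    for _ in range(max_bursts):
        high = state = next_word(state)
        low = state = next_word(state)
        start = ((high << 16) | low) % space_size  # new random counter value
        for offset in range(burst):
            value = (start + offset) % space_size
            if oracle(value):
                return value
    return None  # no solution found within the allowed bursts


# Toy oracle for demonstration only: every multiple of 7 counts as a solution.
found = random_start_search(lambda v: v % 7 == 0, 1000, 10, 0xACE1, 5)
print(found)
```

As in the hardware, the time to a solution depends on where the generated starting points land, and a search may end without finding a solution if the bursts never pass over one.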
7.6 Calculation Time Analysis of the Current Oracle Design and the Random-Start Algorithm
7.6.1 TWO+TWO=FOUR

           Original Oracle (ms)   Random-Start Algorithm (ms)
Trial 1    5505                   1334
Trial 2    5546                   207
Trial 3    5588                   1286
Average    5546.333               942.3333
Table 7.6.1.1: Computation time comparison of the original oracle and random-start algorithm methods for TWO+TWO=FOUR.

[Bar graph; vertical axis: Computational Time (ms); groups: Trial 1, Trial 2, Trial 3; series: Original Oracle, Random-Start]
Graph 7.6.1.1: Bar graph comparing computation time of the original oracle and random-start algorithm methods for TWO+TWO=FOUR.
[Bar graph; vertical axis: Average Computational Time (ms); series: Original Oracle, Random-Start]
Graph 7.6.1.2: Bar graph comparing the average computation time of the original oracle and random-start algorithm methods for TWO+TWO=FOUR.
From the data shown above, the random-start algorithm has a significantly faster calculation time. The average of the original oracle is 5546.3 milliseconds, while the average of the random-start algorithm is 942.3 milliseconds, yielding a 5.9 times speed-up. However, the calculation times of the random-start algorithm are not stable, ranging from 207 to 1334 milliseconds.
7.6.2 ONE+ONE=TWO
Original Oracle vs Random-Start Computational Time of ONE+ONE=TWO

           Original Oracle (ms)   Random-Start Algorithm (ms)
Trial 1    2818                   1317
Trial 2    2761                   359
Trial 3    2818                   40
Average    2799                   572
Table 7.6.2.1: Computation time comparison of the original oracle and random-start algorithm methods for ONE+ONE=TWO.
Comparison of Computational Time for ONE+ONE=TWO
[Bar graph; vertical axis: Computational Time (ms); groups: Trial 1, Trial 2, Trial 3; series: Original Oracle, Random-Start]
Graph 7.6.2.1: Bar graph comparing computation time of the original oracle and random-start algorithm methods for ONE+ONE=TWO.
Comparison of Average Computational Time for ONE+ONE=TWO
Graph 7.6.2.2: Bar graph comparing the average computation time of the original oracle and random-start algorithm methods for ONE+ONE=TWO.
From the data shown above, the random-start algorithm has a significantly faster calculation time. The average of the original oracle is 2799 milliseconds, while the average of the random-start algorithm is 572 milliseconds, which is a 4.9 times speed-up for the random-start algorithm. Again, the calculation times of the random-start algorithm are not stable, ranging from 1317 down to just 40 milliseconds.
7.6.3 TOO+TOO=LONG
Original Oracle vs Random-Start Computational Time of TOO+TOO=LONG

           Original Oracle (ms)   Random-Start Algorithm (ms)
Trial 1    15289                  40130
Trial 2    15299                  9630
Trial 3    15242                  25242
Average    15276.67               25000.67
Table 7.6.3.1: Computation time comparison of the original oracle and random-start algorithm methods for TOO+TOO=LONG.
Comparison of Computational Time for TOO+TOO=LONG
[Bar graph; vertical axis: Computational Time (ms); groups: Trial 1, Trial 2, Trial 3; series: Original Oracle, Random-Start]
Graph 7.6.3.1: Bar graph comparing computation time of the original oracle and random-start algorithm methods for TOO+TOO=LONG.
Comparison of Average Computational Time for TOO+TOO=LONG
Graph 7.6.3.2: Bar graph comparing the average computation time of the original oracle and random-start algorithm methods for TOO+TOO=LONG.
From the data shown above, the random-start algorithm actually has a slower calculation time than the original oracle. The average of the original oracle is 15276.67 milliseconds, while the average of the random-start algorithm is 25000.67 milliseconds. This corresponds to a 0.6 times speed-up from using the random-start algorithm (equivalently, a 1.6 times speed-up from using the original oracle). Also, the random-start algorithm has unpredictable results, performing better only in trial 2. As an observation for this problem, there were numerous trials in which the random-start algorithm could not find a solution. These trials are omitted.
7.6.4 OOP+LLP=PRGM
*Note: OOP and LLP are types of programs, and PRGM is an abbreviation for program.
Original Oracle vs Random-Start Computational Time of OOP+LLP=PRGM

           Original Oracle (ms)   Random-Start Algorithm (ms)
Trial 1    28106                  13315
Trial 2    28889                  5068
Trial 3    28907                  1438
Average    28634                  6607
Table 7.6.4.1: Computation time comparison of the original oracle and random-start algorithm methods for OOP+LLP=PRGM.
Comparison of Computational Time for OOP+LLP=PRGM
[Bar graph; vertical axis: Computational Time (ms); groups: Trial 1, Trial 2, Trial 3; series: Original Oracle, Random-Start]
Graph 7.6.4.1: Bar graph comparing computation time of the original oracle and random-start algorithm methods for OOP+LLP=PRGM.
Comparison of Average Computational Time for OOP+LLP=PRGM
[Bar graph; vertical axis: Average Computational Time (ms); series: Original Oracle, Random-Start]
Graph 7.6.4.2: Bar graph comparing the average computation time of the original oracle and random-start algorithm methods for OOP+LLP=PRGM.
From the data shown above, the random-start algorithm again has a significantly faster calculation time. The average of the original oracle is 28634 milliseconds, while the average of the random-start algorithm is 6607 milliseconds, which is a 4.3 times speed-up for the random-start algorithm. Again, the calculation times of the random-start algorithm are not stable, ranging from 13315 down to 1438 milliseconds.
7.7 Conclusion of Oracle for Single Solutions of Arbitrary Cryptarithmetic Problems

As a result, the random-start algorithm overall performed better than the existing oracle (explained in section 6). For the problems TWO+TWO=FOUR, ONE+ONE=TWO, and OOP+LLP=PRGM, it performed over 4 times faster than the existing oracle design. However, for the TOO+TOO=LONG problem, the existing oracle design performed better by about 1.6 times. The calculation time of the random-start algorithm varied. Sometimes the algorithm would take as little as 40 milliseconds, and sometimes it would take as long as 40000 milliseconds. This is because randomness is a factor in this algorithm. The calculation time depends heavily on the initial value of the pseudo-random number generator. For the TOO+TOO=LONG problem, the existing oracle design performed better, and there is a clear explanation for this. When solving for only one solution, the number of solutions to the problem becomes a factor. The more solutions there are to the problem, the higher the probability that a random sequence is generated near a solution. The problems TWO+TWO=FOUR, ONE+ONE=TWO, and OOP+LLP=PRGM each have at least 8 solutions. However, TOO+TOO=LONG has only two solutions, which explains the slower speed of the random-start algorithm.
7.8 Future Extensions

A genetic algorithm is currently being implemented. In the genetic algorithm, several chromosomes are first initialized from a pseudo-random number generator. Then, the cost function of each chromosome is evaluated. The cost function is related to the number of constraints satisfied. Next, crossover and mutation operators are applied to the gene pool. Finally, the new offspring chromosomes are evaluated with the cost function again. Figure 7.8.1 shows a flowchart of this process.
[Flowchart: Initialization of Population (10 Chromosomes) -> Evaluation of Fitness Values (Input Validity Checker and Arithmetic Checker) -> Is Problem Satisfied? If yes: End. If no: Crossover and Mutation on Selected Chromosomes, then back to Evaluation of Fitness Values.]
Figure 7.8.1: Flowchart of genetic algorithm implementation for solving cryptarithmetic problems.
A genetic algorithm might perform better than the previous methods because it follows a logical path to the solution. The solution has the highest cost function value, and with each generation the cost function value of the population tends to move closer to that of the solution. Compared to counting from randomly chosen starting points, the genetic algorithm is more reliable. However, as with many other artificial intelligence algorithms, there is a risk of a loss of data.
[Plot of a sample search space; labels: Start, Cost Function.]
Figure 7.8.2: Sample search space of a genetic algorithm.
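The loop in Figure 7.8.1 can be sketched in software as follows. This is a minimal Python illustration under stated assumptions, not the implementation in progress: the report does not specify the chromosome encoding, the selection rule, or the operators, so the six-digit chromosome `[T, W, O, F, U, R]`, the single-point crossover, the single-gene mutation, the keep-the-fitter-half selection, and all function names are hypothetical. The cost function here counts satisfied constraints for TWO+TWO=FOUR (four column checks plus the all-distinct check), and leading zeros are not penalized.

```python
import random

MAX_FITNESS = 5  # four column constraints plus the all-distinct constraint


def fitness(chrom):
    """Count satisfied constraints; chrom = [T, W, O, F, U, R]."""
    t, w, o, f, u, r = chrom
    score = 0
    carry = 0
    # Column checks from the units column upward (arithmetic checker).
    for addend, result in ((o, r), (w, u), (t, o)):
        total = addend + addend + carry
        score += int(total % 10 == result)
        carry = total // 10
    score += int(carry == f)  # thousands column holds only the carry
    score += int(len(set(chrom)) == len(chrom))  # input validity checker
    return score


def crossover(a, b, rng):
    """Single-point crossover: head of `a` joined to tail of `b`."""
    cut = rng.randrange(1, len(a))
    return a[:cut] + b[cut:]


def mutate(chrom, rng):
    """Replace one randomly chosen gene with a random digit."""
    child = list(chrom)
    child[rng.randrange(len(child))] = rng.randrange(10)
    return child


def evolve(pop_size=10, max_generations=2000, seed=0):
    """Run the loop; return (best chromosome, generations used)."""
    rng = random.Random(seed)
    population = [[rng.randrange(10) for _ in range(6)]
                  for _ in range(pop_size)]
    for generation in range(max_generations):
        population.sort(key=fitness, reverse=True)
        if fitness(population[0]) == MAX_FITNESS:  # problem satisfied
            return population[0], generation
        parents = population[: pop_size // 2]  # keep the fitter half
        children = [
            mutate(crossover(rng.choice(parents), rng.choice(parents), rng), rng)
            for _ in range(pop_size - len(parents))
        ]
        population = parents + children
    population.sort(key=fitness, reverse=True)
    return population[0], max_generations
```

Note that the all-zero chromosome already scores 4 out of 5 under this cost function, which shows how a count of satisfied constraints can reward candidates that are far from any real solution. There is no guarantee that the sketch converges within the generation limit.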
8. Conclusion
From the data obtained in the experiments, the goal was shown to be achievable and the hypothesis was supported. The goal, which was to create an oracle able to solve arbitrary cryptarithmetic problems, was met. The oracle was also able to give all possible solutions to a given problem. Calculation time results from the project also supported the expectation that hardware is faster than software.
For both oracles, the TWO+TWO=FOUR oracle and the arbitrary cryptarithmetic problem oracle, all answers were reported. For the TWO+TWO=FOUR oracle, the software and the hardware output the same solutions, but in a different order, because the two implementations use different methods. For the oracle for arbitrary cryptarithmetic problems, however, the same answers were reported in the same order, because similar methods were used to construct both the software and the hardware.
In the scenario of a specific oracle for the TWO+TWO=FOUR problem, the hardware implementation of the oracle was much faster than the software implementation. The oracle in hardware was in fact over three million times faster than the oracle in software. Also, the calculation time of the hardware oracle was constant; it always took around 0.3024 milliseconds to calculate all answers. Theoretically, the circuit for the oracle is always the same and operates directly on clock pulses. As the clock pulses are always steady, the calculation time never changes. However, with slight errors in the physical model, there is always a slight difference from the predicted time. The software oracle, on the other hand, varied in calculation time. Since the speed of the processor in the computer regularly varies depending on the operation being calculated, the calculation time of the software oracle is not constant.
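The constant hardware time follows from simple arithmetic: a circuit that needs a fixed number of clock cycles at a fixed clock frequency always takes the same time. The short Python sketch below illustrates this relationship. The 50 MHz clock frequency and the resulting cycle count are assumptions made for illustration; this section states only the measured time of about 0.3024 milliseconds. The names `run_time_ms` and `cycles_for` are hypothetical.

```python
ASSUMED_CLOCK_HZ = 50_000_000  # assumption: a 50 MHz clock


def run_time_ms(cycles, clock_hz=ASSUMED_CLOCK_HZ):
    """Run time in milliseconds of a circuit needing a fixed cycle count."""
    return cycles * 1000 / clock_hz


def cycles_for(time_ms, clock_hz=ASSUMED_CLOCK_HZ):
    """Clock cycles that fit in the given time, rounded to an integer."""
    return round(time_ms * clock_hz / 1000)
```

Under the assumed clock, `cycles_for(0.3024)` gives 15120 cycles, and `run_time_ms(15120)` gives the same 0.3024 milliseconds on every call. A software loop has no such fixed relation between work and time, because the processor schedules other operations in between.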
In the final scenario of an oracle for solving arbitrary cryptarithmetic problems, the hardware implementation of the oracle was still much faster than the software implementation. Although this oracle was much slower in hardware than the TWO+TWO=FOUR oracle because it processes more combinations, it still proved to be 6.5 times faster than its software counterpart. Again, the calculation time of the hardware oracle was constant at 40000 milliseconds, since the circuit again runs directly on clock pulses. The software oracle had varied times, all of them between 23000 and 28000 milliseconds.
Several improvements would make the experiment more accurate. First, in both experiments, the speed-up ratio could be measured more fairly by using hardware and software running at the same clock speed. Second, in the TWO+TWO=FOUR experiment, the comparison should be made between two optimized designs, instead of an optimized circuit in hardware and a non-optimized program in software.
The problem of solving an arbitrary cryptarithmetic problem for a single solution was also presented. This problem introduced new constraints and factors. Two hardware realizations of an oracle were compared in terms of speed-up. The first was an adaptation of the oracle for solving arbitrary cryptarithmetic problems (discussed above). The second was the random-start algorithm, which uses a random starting point to reduce search time. In this experiment, the random-start algorithm performed better overall. However, for problems with few solutions (TOO+TOO=LONG), the random-start algorithm performed poorly. Therefore, a genetic algorithm is proposed as an extension of the problem.
The next step of the project is to improve the circuit realized in hardware. Currently, the oracle for arbitrary cryptarithmetic problems processes 2000000000 combinations. However, for the largest problem of the accepted size, only 30240 of those combinations are possible solutions. As a result, only about 0.001512% of the total combinations (a fraction of 0.00001512) are useful. By using a selector and permuter for arbitrary cryptarithmetic problems, the oracle would be able to calculate answers to much larger problems. Another step to decrease the calculation time of the hardware implementation is to apply artificial intelligence techniques in hardware. One way is to use a depth-first search algorithm in hardware, which could drastically reduce the search size of the problems.
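To show how a depth-first search can shrink the search size, the following is a minimal software sketch in Python. It is not a hardware design and not part of the report's implementation: the report names depth-first search only as a possible next step. The ordering of letters by column, the pruning rule (reject a partial assignment as soon as the fully assigned low-order columns disagree), the no-leading-zero rule, and the name `solve_dfs` are all assumptions of this sketch.

```python
def solve_dfs(a, b, result):
    """Depth-first search for digits satisfying a + b = result.

    Returns (assignment dict or None, number of digit assignments tried).
    """
    words = (a, b, result)
    width = max(len(w) for w in words)
    leading = {w[0] for w in words}  # assumption: no leading zeros
    # Order letters by the column (from the right) where they first appear.
    order = []
    for col in range(width):
        for w in words:
            if col < len(w) and w[-1 - col] not in order:
                order.append(w[-1 - col])
    assign = {}
    tried = 0

    def low_value(word, k):
        """Value of the lowest k digits of `word` under `assign`."""
        return sum(assign[ch] * 10**i
                   for i, ch in enumerate(reversed(word[-k:])))

    def consistent():
        # k = number of low-order columns whose letters are all assigned.
        k = 0
        while k < width and all(k >= len(w) or w[-1 - k] in assign
                                for w in words):
            k += 1
        if k == 0:
            return True
        total = low_value(a, k) + low_value(b, k)
        if k == width:  # every column is known: exact check
            return total == low_value(result, k)
        return total % 10**k == low_value(result, k) % 10**k

    def dfs(i):
        nonlocal tried
        if i == len(order):
            return dict(assign)
        letter = order[i]
        for digit in range(10):
            if digit in assign.values() or (digit == 0 and letter in leading):
                continue
            assign[letter] = digit
            tried += 1
            if consistent():  # prune branches that already fail
                found = dfs(i + 1)
                if found is not None:
                    return found
            del assign[letter]
        return None

    return dfs(0), tried
```

Calling `solve_dfs("TWO", "TWO", "FOUR")` returns one valid assignment together with the number of digit assignments tried, which can be compared against the number of candidates an exhaustive counter would test. Because a wrong units digit is rejected before any higher column is explored, whole subtrees of the search space are never visited.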
9. References
Chu, P. P. (2008). FPGA Prototyping by Verilog Examples. Hoboken, NJ: John Wiley & Sons.
Haskell, R. E. (2010). Digital Design Using Digilent FPGA Boards (3rd ed.). Rochester Hills, MI: LBE Books LLC.
Ishaque, B., Haider, B., Wasid, M., Alaul, S., Hassan, K., Ahsan, T., ... Alam, M. (2004). An Evolutionary Algorithm to Solve Cryptarithmetic Problem. Transactions on Engineering, Computing and Technology, VI, 305-313.
Prata, S. (2004). C++ Primer Plus (5th ed.). Indianapolis, IN: Sams Publishing.
Reza, A. (2009). Solving Cryptarithmetic Problems Using Parallel Genetic Algorithm. Second International Conference on Computer and Electrical Engineering,
Vahid, F., & Lysecky, R. (2007). Verilog for Digital Design. Hoboken, NJ: Wiley & Sons.
Wakerly, J. F. (2006). Digital Design (4th ed.). Upper Saddle River, NJ: Pearson Prentice Hall.
*Title page image from: http://www.presseagentur.com/media/2577/INK_DE2115_Base_Board.JPG
