
ARITHMETIC LOGIC UNIT (ALU) AND BOOTH'S MULTIPLIER FOR ARM7 MICROPROCESSOR


ABSTRACT

This paper presents three designs. First, an arithmetic logic unit (ALU) is designed to support every conditional instruction of the 32-bit ARM7 microprocessor. The ALU has two major tasks: to generate the result of arithmetic and logic operations, and to generate the condition signals. Its functions are move, pass, add, add carry, subtract, subtract carry, bit-wise OR, bit-wise AND, bit-wise XOR, and bit-wise AND NOT. The adder in the ALU is implemented as a Carry Select Adder. Second, a multiplier that supports the multi-cycle multiply instruction of the ARM7 is designed using Booth's algorithm. Finally, a voltage regulator is designed to meet its specifications (output voltage tolerance, output current delivery, and input polarity).


TABLE OF CONTENTS

1. INTRODUCTION
   1.0 Introduction
   1.1 Description of Booth's multiplier
   1.2 Description of ALU
   1.3 Description of Voltage Regulator
2. DESIGN PROCEDURE
   2.1 Booth's multiplier
   2.2 ALU
   2.3 Voltage Regulator
3. DESIGN DETAILS
   3.1 Booth's multiplier
   3.2 ALU
   3.3 Voltage Regulator
4. DESIGN VERIFICATION
   4.1 Booth's multiplier
   4.2 ALU
   4.3 Voltage Regulator
5. COSTS
   5.1 Booth's multiplier and ALU
   5.2 Voltage Regulator
6. CONCLUSION


1. INTRODUCTION

1.0 Introduction

The high-level block diagram of the 32-bit ARM7 microprocessor is shown in Fig. 1 below. This processor supports 11 instructions: Data Processing Transfer, Multiply, Single Data Swap, Single Data Transfer, Undefined, Block Data Transfer, Branch, Coproc Data Transfer, Coproc Data Operation, Coproc Register Transfer, and Software Interrupt. Our task is to design the ALU and the multiplier module and to implement them down to chip level using a synthesis tool (Synopsys, with a 0.25 µm, 3.3 V technology). Others will design the rest of the processor.

[Figure: block diagram containing the Address Register, Address Incrementer, Register Bank (31 x 32-bit registers, 6 status registers), Booth's Multiplier, Barrel Shifter, 32-bit ALU, Instruction Decoder & Control Logic, Write Data Register, and Instruction Pipeline & Read Data Register]

Fig. 1. High-level block diagram of ARM7

1.1 Description of Booth's multiplier

The multiplier module takes the inputs A and B (32 bits each), Enable, and the system clock (Sysclk). It produces Ready and the least significant 32 bits of the product of the two 32-bit operands. The Enable signal indicates when a multiplication begins, and the Ready signal indicates when it ends. The results of a signed and of an unsigned multiplication of 32-bit operands differ only in the upper 32 bits; the lower 32 bits are identical. The same module can therefore be used for both signed and unsigned multiplies.

Block diagram of Booth's multiplier: inputs Input A, Input B, Enable, Sysclk; outputs Ready, Multiply_Result.
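The claim that signed and unsigned products share their low 32 bits can be checked numerically. The following Python sketch is only an illustration and is not part of the design; the helper names `to_signed32` and `low32_products` are our own, and the operands are the ones used in the worked example of Section 3.1.

```python
MASK32 = 0xFFFFFFFF

def to_signed32(x):
    """Interpret a 32-bit pattern as a two's-complement signed integer."""
    x &= MASK32
    return x - (1 << 32) if x & 0x80000000 else x

def low32_products(a_bits, b_bits):
    """Return the low 32 bits of the unsigned and of the signed product."""
    unsigned_low = (a_bits * b_bits) & MASK32
    signed_low = (to_signed32(a_bits) * to_signed32(b_bits)) & MASK32
    return unsigned_low, signed_low

# Operands from the example in Section 3.1: -10 (or 4294967286) times 20.
u, s = low32_products(0xFFFFFFF6, 0x00000014)
print(hex(u), hex(s))  # prints: 0xffffff38 0xffffff38
```

Both interpretations give the same low word because they differ only by multiples of 2^32, which is exactly what the mask discards.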

Booth's multiplier takes multiple cycles. The performance specification allows at most 66 cycles, with a cycle length as small as possible (typically less than 5 ns, which is equivalent to 200 MHz). There is a tradeoff between the number of cycles and the cycle length. With a large number of short cycles, the total time required for a multiply may be good, but the short cycle length may affect the other circuits that interface with the multiplier. The opposite choice (fewer but longer cycles) also has its own advantages and disadvantages. Overall, the number of cycles and the cycle length are both important variables in the design of Booth's multiplier.

1.2 Description of ALU

The ALU is a combinational logic unit. It takes two 32-bit inputs and performs logic and arithmetic operations on them. The operation is selected by the 5-bit Alu_Control signal. The logic operations are bit-wise AND, bit-wise OR, and bit-wise XOR. The arithmetic operations are add, add with Carry_in, subtract, subtract with Carry_in, move from coprocessor register, and pass data. The ALU has two outputs: the 32-bit Output and the 4-bit Flag_signals. The 32-bit Output is the result of the selected operation on Input1 and Input2. The 4-bit Flag_signals support every conditional instruction in ARM7. They are broken down as [N, Z, C, V]. N indicates whether Output is negative. Z indicates whether Output is equal to zero. C indicates whether the operation produced a carry out (arithmetic operations only). V indicates whether an overflow occurred (again, arithmetic operations only).
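To make the meaning of the four flags concrete, here is a minimal Python sketch of a 32-bit add that also derives [N, Z, C, V]. It is an illustrative model only, not the ALU's Verilog; the function name `add_with_flags` is hypothetical, and the overflow rule used (operands of equal sign producing a result of the opposite sign) is the standard two's-complement definition, which we assume matches the design.

```python
MASK32 = 0xFFFFFFFF

def add_with_flags(in1, in2, carry_in=0):
    """32-bit add; returns (output, (N, Z, C, V))."""
    in1 &= MASK32
    in2 &= MASK32
    full = in1 + in2 + carry_in          # up to 33 bits wide
    out = full & MASK32
    n = (out >> 31) & 1                  # N: sign bit of the result
    z = int(out == 0)                    # Z: result is zero
    c = (full >> 32) & 1                 # C: carry out of bit 31
    # V: both operands share a sign and the result's sign differs
    v = ((~(in1 ^ in2) & (in1 ^ out)) >> 31) & 1
    return out, (n, z, c, v)

print(add_with_flags(0x7FFFFFFF, 1))     # prints: (2147483648, (1, 0, 0, 1))
print(add_with_flags(0xFFFFFFFF, 1))     # prints: (0, (0, 1, 1, 0))
```

The first call overflows the signed range without a carry out, while the second produces a carry out without a signed overflow, which is why C and V have to be kept as separate flags.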

Block diagram of ALU: inputs Input1, Input2, Alu_Control, Carry_in; outputs Output, Flag_signals.

The performance specification for the ALU is to make its area, an important variable in our design, as small as possible, which leads to a lower cost.

1.3 Voltage Regulator

The voltage regulator must accept an input voltage greater than 15 V of either polarity and always generate an output voltage of 13 V with a tolerance of 3% and an output current of less than 2 A.

Block diagram of Voltage Regulator (Fig. 1.2):

[Figure: Vin (either polarity) -> Diode Network -> 15 V Voltage Regulator -> Resistive Network -> Vout = 13 V, Iout < 2 A]

Figure 1.2: Voltage Regulator

2. DESIGN PROCEDURE

2.1 Booth's Multiplier

There are several algorithms for implementing a multiplier. The ARM7 uses Booth's algorithm, which is a desirable choice because it is simple to design. An alternative approach is to implement a parallel multiplier or a Wallace tree multiplier to gain better performance. However, these designs are much more complex. We divide the design process into several steps:

First step: Draw the ASM (Algorithmic State Machine) chart based on Booth's algorithm, which is the major design algorithm for the multiplier. The ASM of the multiplier is shown in Figure 1.3.

Second step: Based on the ASM, write the behavioral Verilog code for the multiplier, then run the simulation to check that the multiplier performs its function correctly.

Third step: After verifying that the multiplier works correctly, work out the architecture of the multiplier (the actual hardware components used in the algorithm). Then write the mixed Verilog code for the multiplier (a combination of behavioral code and actual hardware). As always, run the simulation to check that the mixed Verilog code agrees with the behavior code written earlier.

Final step: Build the whole design from actual components, with no behavior code involved. This code is called structural code because it describes the design structurally. Again, verify that the structural code gives the same result as the behavior and mixed code. Once this is done, use Synopsys to synthesize the structural code into the real circuit, which is ready to be fabricated at the chip level.

2.2 Arithmetic Logic Unit (ALU)

As with the multiplier above, we go through all the steps except the mixed stage. The ALU is a combinational logic circuit, which does not need a mixed stage, so we can go directly from the behavior code to the structural code and synthesize from there. An alternative approach is to use a conditional sum adder or a binary look-ahead carry adder instead of the carry select adder for better performance. However, these designs are considerably more complex. The major design equations for the ALU implement the following functions, selected by Alu_Control:

Output = Input1                          (Pass Input1)
Output = Input2                          (Pass Input2)
Output = Input1 + Input2                 (Add)
Output = Input1 + Input2 + Carry_in      (Add Carry_in)
Output = Input1 - Input2                 (Subtract)
Output = Input1 - Input2 - Carry_in      (Subtract Carry_in)
Output = Input2 - Input1
Output = Input2 - Input1 - Carry_in
Output = Input1 & Input2                 (Logical AND)
Output = Input1 | Input2                 (Logical OR)
Output = Input1 ^ Input2                 (Logical XOR)
Output = Input1 & (~Input2)              (Logical AND NOT)
Output = ~Input1                         (Logical NOT)

ASM FOR BOOTH'S MULTIPLIER

Inputs (32 bits): A, B
Outputs: Out (32 bits), Overflow (1 bit)

Initial state: R1 <- A, R2 <- {0, B}, Count = 0, Prev <- 0, READY <- 1; wait for Enable.

R2[0]   Prev   Operation
0       0      Shift
0       1      Add A & Shift
1       0      Subtract A & Shift
1       1      Shift

While Count <= 31:
  Prev == 1 and R2[0] == 1: Prev <- 1
  Prev == 1 and R2[0] == 0: R2[63:32] <- R2[63:32] + R1, Prev <- 0
  Prev == 0 and R2[0] == 1: R2[63:32] <- R2[63:32] - R1, Prev <- 1
  Prev == 0 and R2[0] == 0: Prev <- 0
  Then: R2 <- {R2[63], R2[63:1]}, Count = Count + 1, READY <- 0
When Count > 31: Out <- R2[31:0]

Fig. 1.3. ASM for Booth's Multiplier

2.3 Voltage Regulator

We use a diode bridge to make sure that the input voltage to the 15 V voltage regulator is always positive regardless of the polarity of the supply voltage. We then use a voltage divider to obtain the 13 V output voltage. The voltage divider equation is

$$V_{out} = \frac{R_2}{R_1 + R_2} V_{in} \qquad (1)$$

and the output current is

$$I_{out} = \frac{V_{out}}{R_2} \qquad (2)$$

The voltage regulator must accept an input voltage greater than 15 V of either polarity and always generate an output voltage of 13 V with a tolerance of 3% and an output current of less than 2 A.

Block diagram of Voltage Regulator (Fig. 1.4):

[Figure: Vin (either polarity) -> Diode Network -> 15 V Voltage Regulator -> Resistive Network -> Vout = 13 V, Iout < 2 A]

Figure 1.4: General form of Voltage Regulator

3. DESIGN DETAILS

3.1 Booth's Multiplier

To understand Booth's algorithm, consider the multiplication of two operands:

Operand A     Operand B     Result
0xFFFFFFF6    0x00000014    0xFFFFFF38

If the operands are treated as signed, operand A has the value -10, operand B has the value 20, and the result has the value -200, which is correctly represented by 0xFFFFFF38. If the operands are treated as unsigned, operand A has the value 4294967286, operand B has the value 20, and the result is 85899345720, which is equal to 0x13FFFFFF38, so the least significant 32 bits are again 0xFFFFFF38.

Suppose that we multiply A*B, where A is the multiplicand and B is the multiplier. The key to Booth's insight is to classify groups of bits of the multiplier into three kinds: the beginning, the middle, or the end of a run of 1s. More specifically, the following table explains how Booth's algorithm works:

B[i]    B[i-1]    Operation
0       0         Do nothing but shift
0       1         Add A, and shift
1       0         Subtract A, and shift
1       1         Do nothing but shift

Based on the ASM (Fig. 1.3), we write the behavior code attached below:

booth.txt
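The table and the ASM of Fig. 1.3 can be sketched as a small register-level model in Python. This is an illustrative sketch under our own assumptions, not the attached Verilog: the function name `booth_multiply` is hypothetical, the registers R1, R2, and Prev follow the ASM, and the loop counts iterations rather than clock cycles.

```python
MASK32 = 0xFFFFFFFF

def booth_multiply(a, b):
    """Radix-2 Booth multiply; returns the low 32 bits of a * b."""
    r1 = a & MASK32               # R1 <- A (multiplicand)
    r2 = b & MASK32               # R2 <- {0, B}: 64-bit register, upper half zero
    prev = 0                      # Prev <- 0
    count = 0
    while count <= 31:
        bit = r2 & 1              # R2[0], the current multiplier bit
        high = r2 >> 32           # R2[63:32]
        if bit == 1 and prev == 0:
            high = (high - r1) & MASK32   # beginning of a run of 1s: subtract A
        elif bit == 0 and prev == 1:
            high = (high + r1) & MASK32   # end of a run of 1s: add A
        prev = bit
        r2 = (high << 32) | (r2 & MASK32)
        sign = r2 >> 63           # arithmetic shift right: replicate R2[63]
        r2 = (r2 >> 1) | (sign << 63)
        count += 1
    return r2 & MASK32            # Out <- R2[31:0]

print(booth_multiply(55, 11))                        # prints: 605
print(hex(booth_multiply(0xFFFFFFF6, 0x00000014)))   # prints: 0xffffff38
```

Because a run of 1s costs one subtraction at its start and one addition at its end, the middle of the run needs only shifts, which is the saving the table expresses.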

From the behavior code, we create a mixed code:


booth_mix.txt

Finally, the structural code for the multiplier:


booth_structural.txt

The entire circuit for the multiplier is generated by Synopsys:

mult.ps, mult_arch.ps, mult_arch_compare.ps, mult_arch_con.ps, mult_arch_counter.ps, mult_arch_mux3_1.ps, mult_arch_reg.ps, mult_control.ps

3.2 Arithmetic Logic Unit (ALU)

The behavior code for ALU is


alu.txt

The structural code for ALU is


alu_structural.txt

The following is the high level circuit of ALU from Synopsys:


alu_arm7.ps

The following components are the main modules used to build the high-level ALU:

comparator.ps, csa.ps, csa_adder4.ps, csa_adder4_cla4.ps, csa_adder4_pg4.ps, csa_adder4_sum4.ps, csa_adder5.ps, csa_adder5_cla5.ps, csa_adder5_pg5.ps, csa_adder5_sum5.ps, csa_adder6.ps, csa_adder6_cla6.ps, csa_adder6_pg6.ps, csa_adder6_sum6.ps, csa_adder7.ps, csa_adder7_cla7.ps, csa_adder7_pg7.ps, csa_adder7_sum7.ps, bitwise_and.ps, bitwise_or.ps, bitwise_xor.ps, busmux6_1.ps, mux12.ps

To understand the idea of the Carry Select Adder, consider the following diagram (Figure 1.4), which shows how an 8-bit Carry Select Adder works. A 32-bit Carry Select Adder is obtained by cascading further stages in the same manner.

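[Figure: a 4-bit carry adder cell adds A0-A3 and B0-B3, producing S0-S3 and the carry Co. Two further 4-bit carry adder cells add A4-A7 and B4-B7, one with carry-in 0 (outputs S4-S7_0 and C8_0) and one with carry-in 1 (outputs S4-S7_1 and C8_1). A selector with inputs 0 and 1 produces SUM4_7. The diagram also labels a C8_bar signal.]

Fig. 1.4. The Carry Select Adder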
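The carry select idea can be modeled in a few lines of Python. The sketch below is a simplification under stated assumptions: it uses uniform 4-bit ripple-carry cells, whereas the module list above suggests the actual design uses cells of 4 to 7 bits built from look-ahead logic. The function names `ripple_add` and `carry_select_add` are our own.

```python
def ripple_add(a, b, carry_in, width):
    """Add two width-bit values one bit at a time; return (sum, carry_out)."""
    total = 0
    carry = carry_in
    for i in range(width):
        x = (a >> i) & 1
        y = (b >> i) & 1
        total |= (x ^ y ^ carry) << i
        carry = (x & y) | (carry & (x ^ y))
    return total, carry

def carry_select_add(a, b, carry_in=0, width=32, block=4):
    """Carry select add; width is assumed to be a multiple of block."""
    result = 0
    carry = carry_in
    mask = (1 << block) - 1
    for lo in range(0, width, block):
        a_blk = (a >> lo) & mask
        b_blk = (b >> lo) & mask
        if lo == 0:
            # Lowest cell: the carry-in is known, so one adder is enough.
            s, carry = ripple_add(a_blk, b_blk, carry, block)
        else:
            # Upper cells: compute both candidates, then select by the
            # carry that arrives from the cell below.
            s0, c0 = ripple_add(a_blk, b_blk, 0, block)
            s1, c1 = ripple_add(a_blk, b_blk, 1, block)
            s, carry = (s1, c1) if carry else (s0, c0)
        result |= s << lo
    return result, carry

print(carry_select_add(0x9C4, 0x3E8))   # prints: (3500, 0)
```

In hardware the two candidate sums of every upper cell are computed in parallel, so the incoming carry only has to pass through a selector instead of rippling through the cell; the sequential loop here models the result, not the timing.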

3.3 Voltage regulator

We have the equation

$$V_{out} = \frac{R_2}{R_1 + R_2} V_{in}$$

where Vout = 13 V (specification) and Vin = 15 V, taken from the output of the 15 V voltage regulator. We choose R2 = 1 kΩ, and from the above equation we have

$$R_1 = R_2 \left( \frac{V_{in}}{V_{out}} - 1 \right) = 1\ \text{k}\Omega \times \left( \frac{15}{13} - 1 \right) \approx 154\ \Omega$$

We also have the equation

$$I_{out} = \frac{V_{out}}{R_2} = \frac{13}{1000} = 13\ \text{mA} < 2\ \text{A} \quad \text{(as specified)}$$

The diode network is arranged as a bridge to make sure that the input voltage to the 15 V voltage regulator is always positive regardless of the polarity of the supply input. Below is the whole circuit for the voltage regulator.

Figure 1.5 Voltage Regulator
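The resistor calculation above can be reproduced with a short Python sketch. It is a numerical check only; the function names `divider_r1` and `divider_vout` are hypothetical, and the divider is treated as ideal (no load on the output and no regulator error), which is an assumption on our part.

```python
def divider_r1(v_in, v_out, r2):
    """Solve Vout = R2 / (R1 + R2) * Vin for R1."""
    return r2 * (v_in / v_out - 1)

def divider_vout(v_in, r1, r2):
    """Output voltage of an ideal, unloaded resistive divider."""
    return v_in * r2 / (r1 + r2)

R2 = 1000.0                             # chosen value, 1 kOhm
r1 = divider_r1(15.0, 13.0, R2)         # exact value, about 153.8 Ohm
v_out = divider_vout(15.0, 154.0, R2)   # with R1 rounded to 154 Ohm
i_out = v_out / R2                      # current through R2

print(round(r1, 1))                     # prints: 153.8
print(abs(v_out - 13.0) / 13.0 < 0.03)  # prints: True
print(i_out < 2.0)                      # prints: True
```

Rounding R1 up to 154 Ω moves the ideal output by only about 2 mV, well inside the 3% tolerance.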


4. DESIGN VERIFICATION

4.1 Booth's Multiplier

To test our multiplier, we applied the following test vector to our behavior, mixed, and structural Verilog codes: input A = 55, input B = 11, expected output = 605 (in hexadecimal, input A = 0x00000037, input B = 0x0000000b, and expected output = 0x0000025d). All three codes should generate the same answer. Below are the simulated waveforms for the three types of code.

Simulated waveform for behavior code:
booth_behave.ps

Simulated wave form for mixed code:


booth_mix.ps

Simulated wave form for structural code:


booth_structural.ps

Based on these results, we use Synopsys to optimize our multiplier to run at a 2.2 ns cycle length, which corresponds to approximately 455 MHz. The total time required to produce the output is 64 cycles x 2.2 ns = 140.8 ns. These results meet all the criteria specified above. Below is the timing report, which shows that our multiplier runs at a 2.2 ns cycle time.
report.txt
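As a quick check of these timing figures against the specification in Section 1.1 (at most 66 cycles, cycle length under 5 ns), the arithmetic can be written out. The 330 ns bound is simply our own product of the two specification limits, not a figure stated in the specification.

```latex
\begin{align*}
T_{\text{total}} &= 64 \times 2.2\ \text{ns} = 140.8\ \text{ns} \\
f_{\text{clk}} &= \frac{1}{2.2\ \text{ns}} \approx 454.5\ \text{MHz} \\
T_{\text{spec,max}} &= 66 \times 5\ \text{ns} = 330\ \text{ns} > T_{\text{total}}
\end{align*}
```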

4.2 Arithmetic Logic Unit (ALU)

We apply the input vectors Input1 = 0x000009c4, Input2 = 0x000003e8, and Carry_in = 1, and vary Alu_Control from 0 to 12 in order to test all the functions (such as ADD, SUB, OR, AND, etc.) of the ALU. The results for the behavior and structural code should be the same.

The simulation for behavior code is
waves_behave.ps

The simulation for structural code is


waves_structural.ps


As stated above, we try to make the area of the ALU as small as possible. Below is the area report for the ALU:
alu_area_report.txt

4.3 Voltage regulator

After the circuit is built, we use a voltage probe to measure the voltage across resistor R2 (this voltage is Vout). The result is Vout = 12.97 V, which is almost equal to the specified voltage (13 V). The deviation is (13 - 12.97) / 13 = 0.231%, which is less than the 3% tolerance in the specification. The output current is Iout = 12.97 / 1000 = 12.97 mA, which is less than the 2 A in the specification. When we reverse the polarity of the supply input, Vout and therefore Iout stay the same. This shows that the polarity of the supply input can be changed without changing the outputs. In other words, we have met all specifications.


5. COSTS

5.1 Labor costs for the ALU & the multiplier

Labor cost = ideal salary (hourly rate) x actual hours spent x 2.5

Tam Nguyen: ($20/hr) x (8 hrs/week) x 12 weeks x 2.5 = $4,800
Long Pham: ($20/hr) x (8 hrs/week) x 12 weeks x 2.5 = $4,800

All the software we used for compiling and simulating the Verilog code, as well as Synopsys, was provided to us, and is therefore not included in our cost calculation.

5.2 Cost for the voltage regulator

The cost of the components of the voltage regulator is approximately $5.00 (diodes, resistors, voltage regulator).

Labor cost:
Tam Nguyen: ($8/hr) x 1.5 hrs x 2.5 = $30
Long Pham: ($8/hr) x 1.5 hrs x 2.5 = $30


6. CONCLUSION

We are confident that our design works properly. Furthermore, we have met the specifications stated in the introduction. Our Booth's multiplier runs at 455 MHz in a 0.25 µm, 3.3 V technology. For a future design, however, a faster multiplier should be considered, obtained by shifting 2 bits per cycle instead of 1 bit per cycle. The area of the ALU is optimized to the smallest area generated by Synopsys. Equation (1) allows different resistor values for the voltage regulator; we have chosen values that give low power dissipation, which increases the reliability of the product.
