
EE275 Final Project Report

Fall 2015

Joey Kuo
009431409
Executive Summary
This project implements a MIPS architecture on which the dot product and the factorial can be calculated.
The final design performs both the dot product and factorial operations successfully on a 5-stage pipeline.
The entire project was completed using the Xilinx Vivado Design Suite.
Assembly Code Reference
Dot Product

Factorial
Results
Figure 1: Dot Product Assembly Code

Empty instructions are added to prevent data hazards. Dependent instructions are separated by a gap of at least 4 instructions,
as seen between the LDW R6, 0(R2) instruction and the MUL R8, R6, R7 instruction.
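The spacing rule above can be sketched in Python. This is an illustrative model, not the project's tooling: the `(op, dest, sources)` tuple encoding, the `insert_nops` helper, and the reading of the rule as "at least 4 instructions between a producer and its consumer" are all assumptions.

```python
NOP = ("NOP", None, ())  # hypothetical encoding of an empty instruction


def insert_nops(program, gap=4):
    """Pad `program` with NOPs so that at least `gap` instructions sit
    between any instruction and a later one that reads its result."""
    out = []
    for op, dest, sources in program:
        needed = 0
        # Only the last `gap` emitted instructions can be too close.
        window = out[max(0, len(out) - gap):]
        for distance, (_, prev_dest, _) in enumerate(reversed(window), start=1):
            if prev_dest is not None and prev_dest in sources:
                # `distance - 1` instructions already separate the pair.
                needed = max(needed, gap - (distance - 1))
        out.extend([NOP] * needed)
        out.append((op, dest, sources))
    return out


padded = insert_nops([
    ("LDW", "R6", ("R2",)),
    ("MUL", "R8", ("R6", "R7")),
])
```

Here the MUL reads R6 immediately after the LDW writes it, so the sketch places four empty instructions between them.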

Forward chaining was implemented only for the multiplication instruction: the product is sent straight to the ALU using a mux and a comparator that
compares current_reg_dest with next_reg_source.
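A minimal behavioral sketch of that comparator and mux, written in Python rather than HDL. The function name `select_alu_operand` and the `current_is_mul` flag are hypothetical; only `current_reg_dest` and `next_reg_source` come from the report.

```python
def select_alu_operand(regfile_value, product, current_reg_dest,
                       next_reg_source, current_is_mul):
    """Return the operand the ALU should see for the next instruction."""
    # Comparator: does the next instruction read the register that the
    # multiplication in flight is about to write?
    forward = current_is_mul and current_reg_dest == next_reg_source
    # Mux: take the fresh product on a match, else the register file value.
    return product if forward else regfile_value
```

When the comparison fails, or the instruction in flight is not a multiplication, the operand comes from the register file as usual, which is why the other instructions still need empty instructions between them.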
Figure 2: Dot Product Loop Simulation Overview

The AVECTOR and BVECTOR used are shown to the left.


The above figure shows the simulation reaching the BNE R4, R0 instruction 5 times.
After N = 0, the condition_met signal is no longer high and the STW instruction is reached, as shown by the single memory_write signal at the end of the simulation.
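The loop behavior described above can be checked against a small Python model. The vectors below are made-up placeholders, since the actual AVECTOR and BVECTOR values are only given in the figure, and the mapping of N to the loop counter is an assumption.

```python
def dot_product_trace(avector, bvector):
    """Model the loop: count how often the BNE is reached and record
    every memory write. Expects non-empty vectors of equal length."""
    n = len(avector)          # loop counter N
    acc = 0
    bne_reached = 0
    memory_writes = []
    i = 0
    while True:
        acc += avector[i] * bvector[i]   # load, load, multiply, accumulate
        i += 1
        n -= 1
        bne_reached += 1                 # BNE is evaluated once per pass
        if n == 0:                       # condition no longer met
            break
    memory_writes.append(acc)            # the single STW after the loop
    return acc, bne_reached, memory_writes


# Hypothetical 5-element vectors, not the ones used in the report.
result, branches, writes = dot_product_trace([1, 2, 3, 4, 5], [5, 4, 3, 2, 1])
```

For 5-element vectors the model reaches the branch 5 times and performs exactly one write, matching the pattern described for the simulation.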
Figure 3: Factorial Assembly Code

The above assembly code is nearly identical to the reference assembly code.
Stall instructions are added because forward chaining is not available for these instructions.
Instructions are reordered to minimize stalling.
Figure 4: Factorial Loop Simulation Overview

From the start of the simulation, n = 5. Since n != 1 for the first 4 loops, the BEQ instruction does not branch and the BNE is reached the first four times.
This can be seen from the two closely spaced branch signals at the bottom.
The jal instruction being executed can be seen from the jump signal after each double-branch signal.
When n = 1, the BEQ R4, R1 condition is met, and the return1 subroutine is called.
The instructions restart at the link address and the second half of the continue subroutine is called.
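The control flow described above can be sketched in Python with an explicit stack. The event names and the exact ordering inside each pass are assumptions inferred from the description, not taken from the assembly listing.

```python
def factorial_trace(n):
    """Model the recursive factorial: descend with jal until n == 1,
    then unstack and multiply. Returns (result, list of events)."""
    if n < 1:
        raise ValueError("n must be at least 1")
    stack = []
    events = []
    while n != 1:
        events.append("beq_not_taken")   # BEQ R4, R1 falls through
        events.append("bne")             # BNE is reached
        stack.append(n)                  # save n before the call
        events.append("jal")             # recursive call
        n -= 1
    events.append("beq_taken")           # n == 1: return1 is entered
    result = 1
    while stack:
        result *= stack.pop()            # multiplication while unstacking
        events.append("jr")              # return to the link address
    return result, events


value, events = factorial_trace(5)
```

For n = 5 the sketch records four jal events before the BEQ is taken once, then four returns during unstacking.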
Figure 5: Factorial Loop Multiplication

The above image shows the MULT R2, R4, R2 instruction that occurs during the unstacking portion of the factorial instructions.
rs (reg_source) = 4 and rt (reg_secondary) = 2 are circled in blue.
The values in rs (data1out) and rt (data2out) are circled in white.
The product and the jr instruction jump signal are circled in red.
Instruction Signal Paths
Figure 6: Annotated ADD instruction signal path

o One additional note is that the CTRL module outputs its control signals at every change in opcode.
o The Branch/Jump? module at the top is just a few gates that output high when a jump/branch condition is met.
o Instruction addresses are 16-bit values so that they can be represented by the immediate values in I-instructions.
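As an illustration of the last note, the sketch below packs a 16-bit address into the immediate field of an I-type word. It assumes the standard MIPS I-type layout (6-bit opcode, 5-bit rs, 5-bit rt, 16-bit immediate); the project's own encoding may differ, and the helper names and field values are hypothetical.

```python
def fits_in_immediate(address, bits=16):
    """True if `address` can be carried in an unsigned `bits`-wide field."""
    return 0 <= address < (1 << bits)


def encode_i_type(opcode, rs, rt, immediate):
    """Pack the fields into a 32-bit word: opcode | rs | rt | immediate."""
    if not fits_in_immediate(immediate):
        raise ValueError("immediate does not fit in 16 bits")
    return (opcode << 26) | (rs << 21) | (rt << 16) | immediate


def decode_immediate(word):
    """Recover the low 16 bits, i.e. the immediate field."""
    return word & 0xFFFF


# Illustrative field values only: a branch-style word targeting address 32.
word = encode_i_type(opcode=0x04, rs=4, rt=1, immediate=32)
```

Any address below 65536 round-trips through the immediate field, while a wider address would be rejected.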
Figure 7: ADD instruction signal path
Figure 8: ADDI instruction signal path
Figure 9: LDW instruction signal path

o While not shown, the Register File and Memory both write the input data to the input address on every clock cycle in which their write signal is high. By
the next clock edge after the memory access cycle, the writedata, write signal, and write address of the register file are all ready.
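A behavioral sketch of that write rule in Python. The class name, the `clock` method, and the 32-register size are assumptions for illustration; the report does not give the module's interface.

```python
class RegisterFile:
    """Write-on-clock model: data is stored only when the write signal is high."""

    def __init__(self, size=32):           # size is an assumed value
        self.regs = [0] * size

    def clock(self, write_signal, write_address, write_data):
        # Called once per clock edge; a low write signal leaves state unchanged.
        if write_signal:
            self.regs[write_address] = write_data

    def read(self, address):
        return self.regs[address]


rf = RegisterFile()
rf.clock(write_signal=False, write_address=6, write_data=99)  # ignored
rf.clock(write_signal=True, write_address=6, write_data=17)   # stored
```

The same pattern would apply to the memory module, with the memory write signal in place of the register write signal.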
Figure 10: STW instruction signal path

o Here, the contents of the rs register are output from data2out of the register file and routed to memory_writedata. The signal switch was
implemented to keep all instructions consistent in which bits represent rs.
o rd is sent to the ALU together with the immediate offset value to compute the destination memory address.
Figure 11: BEQ instruction signal path

o When branching or jumping, we want the next instruction as soon as possible, so modules were added to bypass all stage-transition flip-flops.
o As mentioned before, a dead instruction is needed after every branch/jump instruction because it takes 1 clock cycle to set the branch/jump
address.
Ex. Current instruction address is 12. Since the branch opcode has not gone past the IR flip-flop, the next instruction address is set to 16
instead of branch address 32. After the instruction is clocked through, the address going into the PC is set to 32. However, instruction 16
has already entered the pipeline and will be executed.
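The example above can be reproduced with a small fetch model in Python. This is a sketch under the assumptions that instructions are 4 addresses apart (as in the example) and that a taken branch takes effect exactly one fetch late; the function name `fetch_trace` is hypothetical.

```python
def fetch_trace(start, branch_at, target, count):
    """Return the first `count` fetched addresses when the branch at
    `branch_at` redirects the PC to `target` one cycle late."""
    pc = start
    pending = None        # branch target waiting to be loaded into the PC
    trace = []
    for _ in range(count):
        trace.append(pc)
        if pending is not None:
            next_pc, pending = pending, None   # branch address now takes effect
        else:
            next_pc = pc + 4                   # default: next sequential address
        if pc == branch_at:
            pending = target   # opcode not yet past the IR, so it applies next cycle
        pc = next_pc
    return trace


addresses = fetch_trace(start=12, branch_at=12, target=32, count=3)
```

The trace is 12, 16, 32: the instruction at 16 is fetched before the redirect lands, which is why it has to be a dead instruction.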
Figure 12: JAL instruction signal path

o The JAL_inst register at the very bottom captures the return address every time the jal signal from the CTRL module goes high. The signal is
captured during the decode stage because that is when the CTRL module receives the opcode and sends out the jal signal.

Ex. Current instruction address is 12. The address into the JAL_inst register is 16, but since the jal opcode has not gone past the IR
flip-flop, the register does not save that address. After the jal instruction is clocked through the IR register, the address going into the
JAL_inst register is 20. Therefore, the return instruction address will be written as 20, and instruction 16 needs to be a dead
instruction.
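The capture timing in this example can be sketched as a clocked register in Python. The class name `JalLinkRegister` and its `clock` method are hypothetical; the two cycles below follow the example, with the jal at address 12.

```python
class JalLinkRegister:
    """Captures its input address only on cycles where the jal signal is high."""

    def __init__(self):
        self.value = None

    def clock(self, address_in, jal_signal):
        if jal_signal:
            self.value = address_in


link = JalLinkRegister()
# Fetch cycle: jal at 12 is not yet decoded, so the input (16) is not saved.
link.clock(address_in=16, jal_signal=False)
# Decode cycle: CTRL raises jal while the input has advanced to 20.
link.clock(address_in=20, jal_signal=True)
return_address = link.value
```

The saved return address is 20, two instructions past the jal, so the instruction at 16 executes once on the way out and is skipped on return.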
Figure 13: JR instruction signal path
Waveforms of Pipelined Instructions
Figure 14: ADD instruction waveform
Figure 15: LDW instruction waveform
Figure 16: ADDI instruction waveform
Figure 17: BEQ instruction waveform
Figure 18: JAL instruction waveform
Figure 19: JR instruction waveform
