
The datapath we discussed for the single-cycle implementation can be broken into several steps, each of which takes one clock cycle.

Multicycle Implementation

The multicycle implementation allows a functional unit to be used more than once per instruction, in different clock cycles; that is, one instruction takes multiple cycles. This sharing helps reduce the amount of hardware. In a multicycle implementation, different instructions take different numbers of clock cycles, and functional units can be shared within the execution of a single instruction.

Figure 5.13 shows the abstract model of the datapath for the multicycle implementation:
o A single memory unit is used for both instructions and data.
o A single ALU is used, instead of an ALU and two adders.
o One or more registers are added after every major functional unit to hold the output between two consecutive clock cycles.

Therefore, the datapath is partitioned into a memory access, a register file access and the ALU operation.
o For memory access:
  Instruction Register (IR): to keep the output from the memory for the instruction fetch.
  Memory Data Register (MDR): to keep the output from the memory for the data read.
  Both values are needed during the same clock cycle.
o For register file access:
  A and B registers: to keep the two operands from the RF.
o For ALU:
  ALUOut register: to keep the result of the ALU.

The IR is needed until the end of the instruction. Therefore the IR needs to hold data for multiple cycles, and a write control signal is needed. The other registers are updated at each clock cycle, so a write control signal is not necessary for them.

To share the functional units within a single instruction:
An additional MUX is added for the first ALU input to select between the A register and the PC.
The MUX for the second ALU input is replaced with a 4-to-1 MUX that selects among:
o the B register (R-type)
o the constant 4 (PC+4)
o the sign-extended immediate field (memory address)
o the sign-extended immediate field shifted left by 2 bits (branch address)

See the control signals for the multicycle datapath.

See the multicycle datapath and its control unit (in the abstract view).

Figure 5.34 lists the description of the control signals.

1-bit control signals:

RegDst
o Deasserted: The write address is rt (I[20-16])
o Asserted: The write address is rd (I[15-11])
RegWrite
o Deasserted: None
o Asserted: Enable the write operation
ALUSrcA
o Deasserted: The 1st ALU operand is the PC
o Asserted: The 1st ALU operand is the A register
MemRead
o Deasserted: None
o Asserted: Enable the read of the data from the memory
MemWrite
o Deasserted: None
o Asserted: Enable the write of the data to the memory
MemtoReg
o Deasserted: The data written to the RF comes from the ALUOut
o Asserted: The data written to the RF comes from the MDR
IRWrite
o Deasserted: None
o Asserted: The output of the memory is written into the IR
PCWrite
o Deasserted: None
o Asserted: The PC is written; the source is controlled by PCSource
PCWriteCond
o Deasserted: None
o Asserted: The PC is written if the Zero status output is active

2-bit control signals:

ALUOp
o 00: Add
o 01: Subtract
o 10: The function field determines the ALU operation
ALUSrcB
o 00: The 2nd ALU operand is the B register
o 01: The 2nd ALU operand is the constant 4
o 10: The 2nd ALU operand is the sign-extended immediate field of the IR
o 11: The 2nd ALU operand is the sign-extended immediate field shifted left by 2 bits
PCSource
o 00: The next PC is the ALU output (PC+4)
o 01: The next PC is the ALUOut (for branch)
o 10: The next PC is {(PC+4)[31:28], Instruction[25:0], 00} (for jump)
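The MUX selections described above can be sketched in a few lines of Python. This is only an illustrative model, not part of the original notes: the function names (`sign_extend16`, `alu_operands`, `next_pc`) are hypothetical, and values are plain Python integers rather than 32-bit hardware words.

```python
def sign_extend16(value):
    """Sign-extend a 16-bit field to a (possibly negative) Python int."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def alu_operands(alu_src_a, alu_src_b, pc, a, b, imm16):
    """Return the two ALU inputs selected by ALUSrcA and ALUSrcB."""
    first = a if alu_src_a == 1 else pc          # ALUSrcA: 0 -> PC, 1 -> A
    ext = sign_extend16(imm16)
    second = {
        0b00: b,          # B register (R-type, branch compare)
        0b01: 4,          # constant 4 (PC + 4)
        0b10: ext,        # sign-extended immediate (memory address)
        0b11: ext << 2,   # sign-extended immediate shifted by 2 (branch address)
    }[alu_src_b]
    return first, second


def next_pc(pc_source, alu_result, alu_out, pc, instr):
    """Return the value offered to the PC for a given PCSource setting."""
    if pc_source == 0b00:
        return alu_result                         # PC + 4 from the ALU
    if pc_source == 0b01:
        return alu_out                            # branch target in ALUOut
    if pc_source == 0b10:
        # jump: upper 4 bits of the PC, then the 26-bit field, then 00
        return (pc & 0xF0000000) | ((instr & 0x03FFFFFF) << 2)
    raise ValueError("unsupported PCSource value")
```

For instance, `alu_operands(0, 0b01, pc, a, b, imm)` yields the pair used to compute PC + 4 during instruction fetch.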

Stage 1: Instruction Fetch
1. IR = Memory[PC]
   o MemRead, IRWrite
   o IorD = 0
2. PC = PC + 4
   o ALUSrcA = 0
   o ALUSrcB = 01
   o ALUOp = 00
   o PCSource = 00
   o PCWrite

Stage 2: Instruction Decode and Register Fetch
1. A = Reg[IR[25-21]]
2. B = Reg[IR[20-16]]
   (Pre-fetch register content; no additional control is needed)
3. ALUOut = PC + (Sign-Extend(IR[15-0]) << 2)
   (Calculate branch target in advance)
   o ALUSrcA = 0
   o ALUSrcB = 11
   o ALUOp = 00

Stage 3: Execution, Memory Address and Branch Address Computation
a. Jump:
   1. PC = PC[31-28] || (IR[25-0] << 2)
      o PCWrite
      o PCSource = 10
b. Branch:
   1. if (A == B) PC = ALUOut (PC may be written twice)
      o ALUSrcA = 1
      o ALUSrcB = 00
      o ALUOp = 01
      o PCWriteCond
      o PCSource = 01
c. R-type:
   1. ALUOut = A op B
      o ALUSrcA = 1
      o ALUSrcB = 00
      o ALUOp = 10
d. Memory Access (Load or Store):
   1. ALUOut = A + Sign-Extend(IR[15-0])
      o ALUSrcA = 1
      o ALUSrcB = 10
      o ALUOp = 00

Stage 4: Memory Access and R-type Instruction Write Back
a. R-type Write Back
   1. Reg[IR[15-11]] = ALUOut
      o RegDst = 1
      o RegWrite
      o MemtoReg = 0
b. Memory Access
   i. Load:
      1. MDR = Memory[ALUOut]
         o MemRead
         o IorD = 1
   ii. Store:
      1. Memory[ALUOut] = B (B is read twice in this instruction)
         o MemWrite
         o IorD = 1

Stage 5: Memory Read Completion (for load instruction)
1. Reg[IR[20-16]] = MDR
   o MemtoReg = 1
   o RegWrite
   o RegDst = 0

The summary of the multicycle datapath:

Stage   R-type    Memory Access          Branch    Jump
1       1.1 and 1.2 (all instruction classes)
2       2.1, 2.2 and 2.3 (all instruction classes)
3       3.c.1     3.d.1                  3.b.1     3.a.1
4       4.a.1     4.b.i.1 or 4.b.ii.1    -         -
5       -         5.1 (load only)        -         -
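The stages above can be walked as a small state machine. The sketch below is one possible encoding, not the controller from the notes: the state names (`fetch`, `decode`, `memaddr`, and so on) and the helper functions are hypothetical, and the signal values are copied from the stage descriptions above.

```python
# Control signals asserted in each state; mux selects are small integers.
# State names are illustrative labels, not taken from the original notes.
SIGNALS = {
    "fetch":    dict(MemRead=1, IRWrite=1, IorD=0, ALUSrcA=0, ALUSrcB=0b01,
                     ALUOp=0b00, PCSource=0b00, PCWrite=1),
    "decode":   dict(ALUSrcA=0, ALUSrcB=0b11, ALUOp=0b00),
    "jump":     dict(PCWrite=1, PCSource=0b10),
    "branch":   dict(ALUSrcA=1, ALUSrcB=0b00, ALUOp=0b01,
                     PCWriteCond=1, PCSource=0b01),
    "rexec":    dict(ALUSrcA=1, ALUSrcB=0b00, ALUOp=0b10),
    "memaddr":  dict(ALUSrcA=1, ALUSrcB=0b10, ALUOp=0b00),
    "rwrite":   dict(RegDst=1, RegWrite=1, MemtoReg=0),
    "memread":  dict(MemRead=1, IorD=1),
    "memwrite": dict(MemWrite=1, IorD=1),
    "loadwb":   dict(MemtoReg=1, RegWrite=1, RegDst=0),
}

# Stage 3 state chosen after decode, per instruction class.
AFTER_DECODE = {"jump": "jump", "branch": "branch", "r-type": "rexec",
                "load": "memaddr", "store": "memaddr"}


def next_state(state, kind):
    """Return the state that follows `state` for instruction class `kind`."""
    if state == "fetch":
        return "decode"
    if state == "decode":
        return AFTER_DECODE[kind]
    if state == "rexec":
        return "rwrite"
    if state == "memaddr":
        return "memread" if kind == "load" else "memwrite"
    if state == "memread":
        return "loadwb"
    return "fetch"  # jump, branch, rwrite, memwrite and loadwb end the instruction


def trace(kind):
    """List the states one instruction of class `kind` passes through."""
    states = ["fetch"]
    while True:
        following = next_state(states[-1], kind)
        if following == "fetch":
            return states
        states.append(following)
```

Because each state takes one clock cycle, `len(trace(kind))` gives the cycle count for that instruction class, and `SIGNALS[state]` gives the control settings for each cycle.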

Defining the Controller

Two possible implementations of the controller:
1. Finite State Machine
2. Microprogramming

Finite State Machine

All read/write signals that are not explicitly asserted are deasserted. The MUX controls have to be specified. Each state takes 1 clock cycle. The number of clock cycles for each instruction class, and its ratio in a specified benchmark program, are shown as follows:

Instruction     Cycles   Ratio
Load            5        23%
Store           4        13%
ALU (R-type)    4        43%
Branch          3        19%
Jump            3        2%

The average (effective) CPI is

CPI = 0.23 x 5 + 0.13 x 4 + 0.43 x 4 + 0.19 x 3 + 0.02 x 3 = 4.02

This is better than the worst-case CPI of 5, which would apply if every instruction took the same number of clock cycles.
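The weighted sum above can be checked with a short Python sketch. The dictionary and function names are illustrative only; the cycle counts and instruction mix are the ones used in the CPI formula above.

```python
# Cycles per instruction class and the benchmark mix from the CPI formula.
CYCLES = {"load": 5, "store": 4, "alu": 4, "branch": 3, "jump": 3}
MIX = {"load": 0.23, "store": 0.13, "alu": 0.43, "branch": 0.19, "jump": 0.02}


def average_cpi(cycles, mix):
    """Weighted average of cycle counts, weighted by instruction frequency."""
    return sum(mix[kind] * cycles[kind] for kind in mix)


cpi = average_cpi(CYCLES, MIX)
worst_case = max(CYCLES.values())   # every instruction padded to 5 cycles
speedup = worst_case / cpi          # gain over the fixed 5-cycle design
```

Changing the entries of `MIX` shows how sensitive the average CPI is to the share of loads, which are the only 5-cycle instructions.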

Microprogramming

The FSM provides a visual (graphical) representation of the control design. It becomes impractical as the number of instructions (number of clocks) increases. The instructions are divided into microinstructions. Each microinstruction defines
1. datapath control signals and
2. instruction sequence (state transition of the FSM).

The microinstruction format

Simply a symbolic representation of the control. From pages 383, 384, each microinstruction can be divided into 7 fields and an optional label:

Field              Operation
Label              Any string
ALU Control        Add, Subt, Func code
SRC1               PC, A
SRC2               B, 4, Extend, Extshft
Register Control   Read, Write ALU, Write MDR
Memory             Read PC, Read ALU, Write ALU
PCWrite Control    ALU, ALUOut-Cond, Jump address
Sequencing         Seq, Fetch, Dispatch i (1 or 2)

Dispatch is a table look-up operation.
For Dispatch 1:
1. Mem1: for memory-reference instructions
2. Rformat1: for R-type instructions
3. BEQ1: for the branch equal instruction
4. JUMP1: for the jump instruction
And for Dispatch 2:
1. LW2: for load word instruction
2. SW2: for store word instruction

The microprogram (a "-" marks a blank field):

Label      ALU Control   SRC1   SRC2      Register Control   Memory      PCWrite Control   Sequencing
Fetch      Add           PC     4         -                  Read PC     ALU               Seq
           Add           PC     Extshft   Read               -           -                 Dispatch 1
Mem1       Add           A      Extend    -                  -           -                 Dispatch 2
LW2        -             -      -         -                  Read ALU    -                 Seq
           -             -      -         Write MDR          -           -                 Fetch
SW2        -             -      -         -                  Write ALU   -                 Fetch
Rformat1   Func code     A      B         -                  -           -                 Seq
           -             -      -         Write ALU          -           -                 Fetch
BEQ1       Subt          A      B         -                  -           ALUOut-Cond       Fetch
JUMP1      -             -      -         -                  -           Jump address      Fetch
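Since dispatch is a table look-up, the two dispatch tables can be sketched as Python dictionaries. This is an illustrative model only: the notes do not give opcode values, so the keys below are the standard MIPS opcodes for lw, sw, R-type, beq and j, supplied here as an assumption, and the `dispatch` helper is hypothetical.

```python
# Opcode values are standard MIPS encodings, not taken from the notes.
OP_RTYPE, OP_J, OP_BEQ, OP_LW, OP_SW = 0x00, 0x02, 0x04, 0x23, 0x2B

# Dispatch 1: chosen after instruction decode.
DISPATCH_1 = {
    OP_LW: "Mem1",        # memory-reference instructions
    OP_SW: "Mem1",
    OP_RTYPE: "Rformat1",
    OP_BEQ: "BEQ1",
    OP_J: "JUMP1",
}

# Dispatch 2: chosen after the memory address has been computed.
DISPATCH_2 = {
    OP_LW: "LW2",
    OP_SW: "SW2",
}


def dispatch(table_number, opcode):
    """Return the label of the next microinstruction for this opcode."""
    table = {1: DISPATCH_1, 2: DISPATCH_2}[table_number]
    return table[opcode]
```

A load therefore follows the labels Fetch, Mem1, LW2: `dispatch(1, OP_LW)` selects Mem1 and `dispatch(2, OP_LW)` selects LW2.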

Exceptions

Examples of exceptions:
o Undefined instruction
o Overflow
o I/O request
o Hardware malfunction

When an exception happens,
1. EPC (a 32-bit register) is used to hold the address of the affected instruction. Because the PC already holds (PC + 4) after instruction fetch, EPC = (PC + 4) - 4.
   o EPCWrite
   o ALUSrcA = 0
   o ALUSrcB = 01
   o ALUOp = 01
2. Cause (a 32-bit register) is used to record the cause of the exception, e.g.,
   o Undefined instruction: 0
     CauseWrite, IntCause = 0
   o Arithmetic overflow: 1
     CauseWrite, IntCause = 1
3. The exception address (e.g., C0000000) replaces the PC so that execution continues in the exception handler.
   o PCWrite
   o PCSource = 11
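The three register updates above can be sketched as a single Python function. This is a simplified model under stated assumptions: the function and constant names are hypothetical, the handler address is the example value C0000000 from the text, and `pc` is assumed to be the already-incremented PC.

```python
EXCEPTION_ADDRESS = 0xC0000000   # example handler address from the text
UNDEFINED_INSTRUCTION = 0        # IntCause = 0
ARITHMETIC_OVERFLOW = 1          # IntCause = 1


def take_exception(pc, int_cause):
    """Return (EPC, Cause, new PC) for an exception.

    `pc` is the incremented PC (address of the affected instruction + 4),
    so the ALU subtracts 4 (ALUSrcA = 0, ALUSrcB = 01, ALUOp = 01).
    """
    epc = (pc - 4) & 0xFFFFFFFF      # EPCWrite: address of the affected instruction
    cause = int_cause                # CauseWrite: record why the exception occurred
    new_pc = EXCEPTION_ADDRESS       # PCWrite with PCSource = 11
    return epc, cause, new_pc
```

For an overflow detected while the incremented PC is 0x00400008, the function returns an EPC of 0x00400004, a Cause of 1, and the handler address as the new PC.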
