Professional Documents
Culture Documents
Instruction-Set Architecture
Jan. 2011
Slide 1
Revised
July 2004 Jan. 2009
Revised
June 2005 Jan. 2011
Revised
Mar. 2006
Revised
Jan. 2007
Jan. 2011
Slide 2
CPU execution time = Instructions CPI / (Clock rate) Performance = Clock rate / ( Instructions
Try to achieve CPI = 1 with clock that is as high as that for CPI > 1 designs; is CPI < 1 feasible? (Chap 15-16) Design memory & I/O structures to support ultrahigh-speed CPUs
Jan. 2011
CPI )
Define an instruction set; make it simple enough to require a small number of cycles and allow high clock rate, but not so simple that we need many instructions, even for very simple tasks (Chap 5-8)
Computer Architecture, Instruction-Set Architecture
Design hardware for CPI = 1; seek improvements with CPI > 1 (Chap 13-14)
CPU execution time = Instructions CPI / (Clock rate) Performance = Clock rate / ( Instructions
Assembly line analogy
CPI )
Single-cycle (CPI = 1)
Faster
Items that take longest to inspect dictate the speed of the assembly line
Jan. 2011
Faster
up to 2 30 words ...
Memory
m 2 32
Loc Loc m8 m4
EIU
(Main p roc.)
$0 $1 $2
FPU
(Cop 1 roc. )
$0 $1 $2
Floatingpoint unit
$31
ALU
Integer mul/div
Hi Lo
FP arith TMU
$31
Chapter 10
Chapter 11
Chapter 12
BadVaddr Trap & (Cop 0 Status memory roc. ) Cause unit EPC
Figure 5.1
Jan. 2011
Data Types
B yte Byte = 8 bits Hal fword Halfword = 2 bytes W bytes Word = 4ord
Doubleword = 8 bytes Dou ble word Used only for floating-point data, so safe to ignore in this course
MiniMIPS registers hold 32-bit (4-byte) words. Other common data sizes include byte, halfword, and doubleword.
Jan. 2011 Computer Architecture, Instruction-Set Architecture Slide 8
$0 $1 $2 $3 $4 $5 $6 $7 $8 $9 $10 $11 $12 $13 $14 $15 $16 $17 $18 $19 $20 $21 $22 $23 $24 $25 $26 $27 $28 $29 $30 $31
$zero $at Reserved for assembler use $v0 Procedure results $v1 $a0 Procedure $a1 Saved arguments $a2 $a3 $t0 $t1 $t2 Temporary $t3 values $t4 $t5 $t6 $t7 $s0 $s1 Saved $s2 across $s3 Operands procedure $s4 calls $s5 $s6 $s7 More $t8 temporaries $t9 $k0 Reserved for OS (kernel) $k1 $gp Global pointer $sp Stack pointer Saved $fp Frame pointer $ra Return address
A 4-b yte word sits in consecutive memory addresses according to the big-endian order (most significant byte has the lowest address) Byte numbering:
3 2 1
3 2 1 0
Register Conventions
When loading a byte into a register, it goes in the low end Byte
Word Doublew ord
A doubleword sits in consecutive registers or memory locations according to the big-endian order (most significant word comes first)
Jan. 2011
8 operand registers
Change Wallet
Keys
Figure 5.2
Jan. 2011
(partial)
Register file
Register file
$17 $18
ALU
$24
Register readout
Operation
Data read/store
Register writeback
Figure 5.3
Jan. 2011
Decimal and hex constants Decimal Hexadecimal 25, 123456, 2873 0x59, 0x12b4c6, 0xffff0000
Machine instruction typically contains an opcode one or more source operands possibly a destination operand
Jan. 2011 Computer Architecture, Instruction-Set Architecture Slide 12
op
6 bits Opcode
25
rs
5 bits Source register 1
20
rt
5 bits Source register 2
15
rd
5 bits
10
sh
5 bits Shift amount
fn
6 bits Opcode extension
Destination register
15
31
op
6 bits Opcode
25
rs
5 bits Source or base
20
rt
5 bits
operand / offset
16 bits Immediate operand or address offset
Destination or data
31
op
6 bits Opcode
25
1 0 0 0 0 0 0 0 0 0 0 0 26 bits 0 0 0 0 0 0 1 1 1 1 0 1 0 0
Figure 5.4 MiniMIPS instructions come in only three formats: register (R), immediate (I), and jump (J).
Jan. 2011 Computer Architecture, Instruction-Set Architecture Slide 13
# # # # # #
rt
to to to to to to
10
31
op
ALU instruction
rs
20
fn
add = 32 sub = 34
Figure 5.5 The arithmetic instructions add and sub have a format that is common to all two-operand ALU instructions. For these, the fn field specifies the arithmetic/logic operation to be performed.
Jan. 2011 Computer Architecture, Instruction-Set Architecture Slide 14
31
op
25
rs
20
rt
15
operand / offset
Figure 5.6 Instructions such as addi allow us to perform an arithmetic or logic operation for which one operand is a small constant.
Jan. 2011 Computer Architecture, Instruction-Set Architecture Slide 15
op
lw = 35 sw = 43
25
rs
Base register
20
rt
Data register
15
operand / offset
Offset relative to base Note on base and offset: The memory address is the sum of (rs) and an immediate value. Calling one of these the base and the other the offset is quite arbitrary. It would make perfect sense to interpret the address A($s3) as having the base A and the offset ($s3). However, a 16-bit base confines us to a small portion of memory space.
1 0 x 0 1 1 1 0 0 1 1 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 1 0 0 0
Memory
A[0] A[1] A[2]
. . .
lw lw
$t0,40($s3) $t0,A($s3)
Address in base register
A[i]
Figure 5.7 MiniMIPS lw and sw instructions and their memory addressing convention that allows for simple access to array elements via a base address and an offset (offset = 4i leads us to the i th word).
Jan. 2011 Computer Architecture, Instruction-Set Architecture Slide 16
# load mem[40+($s3)] in $t0 # store ($t0) in mem[A+($s3)] # ($s3) means content of # The immediate value 61 is # loaded in upper half of $s0 # with lower 16b set to 0s
15
31
25
rs
Unused
20
rt
operand / offset
Figure 5.8 The lui instruction allows us to load an arbitrary 16-bit value into the upper half of a register while setting its lower half to 0s.
Jan. 2011 Computer Architecture, Instruction-Set Architecture Slide 17
Initializing a Register
Example 5.2 Show how each of these bit patterns can be loaded into $s0: 0010 0001 0001 0000 0000 0000 0011 1101 1111 1111 1111 1111 1111 1111 1111 1111 Solution The first bit pattern has the hex representation: 0x2110003d lui ori $s0,0x2110 $s0,0x003d # put the upper half in $s0 # put the lower half in $s0
Same can be done, with immediate values changed to 0xffff for the second bit pattern. But, the following is simpler and faster: nor
Jan. 2011
$s0,$zero,$zero # because (0 0) = 1
Computer Architecture, Instruction-Set Architecture Slide 18
verify $ra
31
# go to mem loc named verify # go to address that is in $ra; # $ra may hold a return address
op
25
0 0 0 0 1 0 j=2
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 0 1
x x x x 0 0 0 0 0 0 1 1 1 1 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
(incremented) op
From PC
31
rs
Source register
20
rt
Unused
15
rd
Unused
10
sh
Unused
fn
jr = 8
0 0 0 0 0 0 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 ALU instruction
Figure 5.9 The jump instruction j of MiniMIPS is a J-type instruction which is shown along with how its effective target address is obtained. The jump register (jr) instruction is R-type, with its specified register often being $ra.
Jan. 2011 Computer Architecture, Instruction-Set Architecture Slide 19
31
op
bltz = 1
25
rs
Source
rt
Zero
15
operand / offset
Relative branch distance in words
0 0 0 0 0 1 1 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 0 1
31
op
beq = 4 bne = 5
25
rs
Source 1
20
rt
Source 2
15
operand / offset
Relative branch distance in words
0 0 0 1 0 x 1 0 0 0 1 1 0 0 1 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 0 1
# # # # #
20
if ($s2)<($s3), set $s1 to 1 else set $s1 to 0; often followed by beq/bne if ($s2)<61, set $s1 to 1 else set $s1 to 0
15
31
25
rs
Source 1 register
rt
Source 2 register
rd
10
sh
Unused
fn
slt = 42
0 0 0 0 0 0 1 0 0 1 0 1 0 0 1 1 1 0 0 0 1 0 0 0 0 0 1 0 1 0 1 0 Destination
op
slti = 10
25
rs
Source
20
rt
15
operand / offset
Immediate operand
0 0 1 0 1 0 1 0 0 1 0 1 0 0 0 1 0 0 0 0 0 0 0 0 0 0 1 1 1 1 0 1 Destination
Forming if-then constructs; e.g., if (i == j) x = x + y bne $s1,$s2,endif add $t1,$t1,$t2 endif: ... # branch on i j # execute the then part
If the condition were (i < j), we would change the first line to: slt beq
Jan. 2011
$t0,$s1,$s2 $t0,$0,endif
# # # # # # #
j<i? (inverse condition) if j<i goto else part begin then part: x = x+1 z = 1 skip the else part begin else part: y = y1 z = z+z
Slide 23
Operand
Immediate
Register
Base
Reg base
PC-relative
Constant offset
Incremented PC
Pseudodirect
PC
$t0,0($s1) $t1,$zero,0 $t1,$t1,1 $t1,$s2,done $t2,$t1,$t1 $t2,$t2,$t2 $t2,$t2,$s1 $t3,0($t2) $t4,$t0,$t3 $t4,$zero,loop $t0,$t3,0 loop
# # # # # # # # # #
initialize maximum to A[0] initialize index i to 0 increment index i by 1 if all elements examined, quit compute 2i in $t2 compute 4i in $t2 form address of A[i] in $t2 load value of A[i] into $t3 maximum < A[i]? if not, repeat with no change # if so, A[i] is the new
Instruction Copy
Load upper immediate Add Subtract Set less than Add immediate Set less than immediate AND OR XOR NOR AND immediate OR immediate XOR immediate Load word Store word Jump Jump register Branch less than 0 Branch equal Branch not equal
Usage
lui add sub slt addi slti and or xor nor andi ori xori lw sw j jr bltz beq bne rt,imm rd,rs,rt rd,rs,rt rd,rs,rt rt,rs,imm rd,rs,imm rd,rs,rt rd,rs,rt rd,rs,rt rd,rs,rt rt,rs,imm rt,rs,imm rt,rs,imm rt,imm(rs) rt,imm(rs) L rs rs,L rs,rt,L rs,rt,L
op fn
15 0 0 0 8 10 0 0 0 0 12 13 14 35 43 2 0 1 4 5
Slide 26
Arithmetic
op
6 bits O pc od e
25
32 34 42 36 37 38 39
rs
5 bits
20
rt
5 bits
15
rd
5 bits
10
sh
5 bits
fn
6 bits
O pc od e ex tens ion
0
31
op
6 bits O pc od e
rs
5 bits S ourc e or bas e
20
rt
5 bits
15
o p era n d / o ffs et
16 bits Im m ediate o pe ran d or ad dres s o ffs et
D es tination or data
31
op
6 bits O pc od e
25
ju m ptarg e t ad d res s
Table 5.1
Jan. 2011
MiniMIPS instructions for procedure call and return from procedure: jal proc # jump to loc proc and link; # link means save the return # address (PC)+4 in $ra ($31) # go to loc addressed by rs
Computer Architecture, Instruction-Set Architecture Slide 28
jr
Jan. 2011
rs
PC
jal
proc
Restore
jr $ra
Figure 6.1
Jan. 2011
$0 $1 $2 $3 $4 $5 $6 $7 $8 $9 $10 $11 $12 $13 $14 $15 $16 $17 $18 $19 $20 $21 $22 $23 $24 $25 $26 $27 $28 $29 $30 $31
$zero $at Reserved for assembler use $v0 Procedure results $v1 $a0 Procedure $a1 Saved arguments $a2 $a3 $t0 $t1 $t2 Temporary $t3 values $t4 $t5 $t6 $t7 $s0 $s1 Saved $s2 across $s3 Operands procedure $s4 calls $s5 $s6 $s7 More $t8 temporaries $t9 $k0 Reserved for OS (kernel) $k1 $gp Global pointer $sp Stack pointer Saved $fp Frame pointer $ra Return address
A 4-b yte word sits in consecutive memory addresses according to the big-endian order (most significant byte has the lowest address) Byte numbering:
3 2 1
3 2 1 0
When loading a byte into a register, it goes in the low end Byte
Word Doublew ord
A doubleword sits in consecutive registers or memory locations according to the big-endian order (most significant word comes first)
Jan. 2011
|($a0)|
In practice, we seldom use such short procedures because of the overhead that they entail. In this example, we have 3-4 instructions of overhead for 3 instructions of useful computation.
Jan. 2011 Computer Architecture, Instruction-Set Architecture Slide 31
PC
jal
abc
abc
Procedure xyz
Restore
jr $ra jr $ra
Figure 6.2
Jan. 2011
b a Pop x
sp
c b a
sp sp = sp 4 mem[sp] = c
b a
x = mem[sp] sp = sp + 4
Figure 6.4
push: addi sw
Jan. 2011
Hex address
00000000 00400000
Reserved
Program
10000000 10008000 1000ffff
Stack
7ffffffc
80000000
Stack segment
Figure 6.3
Jan. 2011
z y . . .
c b a . . .
Figure 6.5
Jan. 2011
# # # # #
put top stack element in $s0 put 2nd stack element in $ra restore $sp to original state restore $fp to original state return from procedure
Jan. 2011
Slide 36
doubleword doubleword
ASCII Characters
Table 6.1
0 1 2 3 4 5 6 7 8 9 a b c d e f 0 NUL f SOH STX ETX EOT ENQ ACK BEL BS HT LF VT FF CR SO SI
Jan. 2011
aMore symbols
8-bit ASCII code (col #, row #)hex e.g., code for + is (2b) hex or (0010 1011)two
# # # # #
20
load rt with mem[8+($s3)] sign-extend to fill reg load rt with mem[8+($s3)] zero-extend to fill reg LSB of rt to mem[A+($s3)]
15
31
rs
Base register
rt
Data register
immediate / offset
Address offset
1 0 x x 0 0 1 0 0 1 1 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0
Figure 6.6
Jan. 2011
Figure 6.7 A 32-bit word has no inherent meaning and can be interpreted in a number of equally valid ways in the absence of other cues (e.g., context) for the intended meaning.
Jan. 2011 Computer Architecture, Instruction-Set Architecture Slide 40
A[i] A[i + 1]
A[i] A[i + 1]
Figure 6.8 Stepping through the elements of an array using the indexing method and the pointer updating method.
Jan. 2011 Computer Architecture, Instruction-Set Architecture Slide 41
Selection Sort
Example 6.4
To sort a list of numbers, repeatedly perform the following: Find the max element, swap it with the last item, move up the last pointer
A
first first max x
A
first
Start of iteration
Maximum identified
End of iteration
Figure 6.9
Jan. 2011
A
first
A
y Outputs from proc max last
In $a0 In $a1
last
In $v0
last
In $v1
y
Start of iteration
Maximum identified
End of iteration
# # # # # #
single-element list is sorted call the max procedure load last element into $t0 copy the last element to max loc copy max value to last element decrement pointer to last
Slide 43
# # # # #
rs
20
op
ALU instruction
25
rd
Unused
10
sh
Unused
fn
mult = 24 div = 26
Figure 6.10
R
op
ALU instruction
25
rs
Unused
20
rt
Unused
15
rd
10
sh
Unused
fn
mfhi = 16 mflo = 18
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 1 0 0 x 0 Destination register
Figure 6.11 MiniMIPS instructions for copying the contents of Hi and Lo registers into general registers .
Jan. 2011 Computer Architecture, Instruction-Set Architecture Slide 44
Logical Shifts
MiniMIPS instructions for left and right shifting: sll srl sllv srlv $t0,$s1,2 $t0,$s1,2 $t0,$s1,$s0 $t0,$s1,$s0
31
# # # #
20
op
ALU instruction
25
rs
Unused
rd
sh
Shift amount
fn
sll = 0 srl = 2
0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 1 0 1 0 0 0 0 0 0 1 0 0 0 0 0 x 0 Destination register
31
op
ALU instruction
25
rs
Amount register
20
rt
Source register
15
rd
10
sh
Unused
fn
sllv = 4 srlv = 6
0 0 0 0 0 0 1 0 0 0 0 1 0 0 0 1 0 1 0 0 0 0 0 0 0 0 0 0 0 1 x 0 Destination register
Figure 6.12
Jan. 2011
addiu $t0,$s0,61
To make MiniMIPS more powerful and complete, we introduce later: sra $t0,$s1,2 srav $t0,$s1,$s0 syscall
Jan. 2011
# sh. right arith (Sec. 10.5) # shift right arith variable # system call (Sec. 7.6)
Slide 46
Instruction
Move from Hi Move from Lo Add unsigned Subtract unsigned Multiply Multiply unsigned Divide Divide unsigned Add immediate unsigned Shift left logical Shift right logical Shift right arithmetic Shift left logical variable Shift right logical variable Shift right arith variable Load byte Load byte unsigned Store byte Jump and link System call
Usage
mfhi rd mflo rd addu rd,rs,rt subu rd,rs,rt mult rs,rt multu rs,rt div rs,rt divu rs,rt addiu rs,rt,imm sll rd,rt,sh srl rd,rt,sh sra rd,rt,sh sllv rd,rt,rs srlv rt,rd,rs srav rd,rt,rd lb rt,imm(rs) lbu rt,imm(rs) sb rt,imm(rs) jal L syscall
op fn
0 0 0 0 0 0 0 0 9 0 0 0 0 0 0 32 36 40 3 0
Slide 47
16 18 33 35 24 25 26 27 0 2 3 4 6 7
op
6 bits O pc od e
25
rs
5 bits
20
rt
5 bits
15
rd
5 bits
10
sh
5 bits
fn
6 bits
O pc od e ex tens ion
0
31
op
6 bits O pc od e
rs
5 bits S ourc e or bas e
20
rt
5 bits
15
o p era n d / o ffs et
16 bits Im m ediate o pe ran d or ad dres s o ffs et
Shift
D es tination or data
31
op
6 bits O pc od e
25
ju m ptarg e t ad d res s
12
Usage
lui add sub slt addi slti and or xor nor andi ori xori lw sw j jr bltz bne rt,imm rd,rs,rt rd,rs,rt rd,rs,rt rt,rs,imm rd,rs,imm rd,rs,rt rd,rs,rt rd,rs,rt rd,rs,rt rt,rs,imm rt,rs,imm rt,rs,imm rt,imm(rs) rt,imm(rs) L rs rs,L rs,rt,L
Instruction
Move from Hi Move from Lo Add unsigned Subtract unsigned Multiply Multiply unsigned Divide Divide unsigned Add immediate unsigned Shift left logical Shift right logical Shift right arithmetic Shift left logical variable Shift right logical variable Shift right arith variable Load byte Load byte unsigned Store byte Jump and link System call
Usage
mfhi rd mflo rd addu rd,rs,rt subu rd,rs,rt mult rs,rt multu rs,rt div rs,rt divu rs,rt addiu rs,rt,imm sll rd,rt,sh srl rd,rt,sh sra rd,rt,sh sllv rd,rt,rs srlv rd,rt,rs srav rd,rt,rs lb rt,imm(rs) lbu rt,imm(rs) sb rt,imm(rs) jal L syscall
Slide 48
Assembler
Figure 7.1 Steps in transforming an assembly language program to an executable program residing in memory.
Jan. 2011 Computer Architecture, Instruction-Set Architecture Slide 50
Loader
Linker
Memory content
Symbol Table
Assembly language program
a ddi$ s0,$z ero, 9 s ub $ t0, 0,$s0 $s a dd $ t1, ero, zero $z $ te st: b ne $ t0, 0,don e $s a ddi $ t0, 0,1 $t a dd $ t1, 0,$ze ro $s j t est do ne: s w $ t1, sult( $gp) re
Location
0 4 8 12 16 20 24 28
00 10000 00001 00000 00000 00000 01001 00 00001 00001 00000 10000 00001 00010 00 00000 10010 00000 00000 00001 00000 00 01010 10001 00000 00000 00000 01100 00 10000 10000 10000 00000 00000 00001 00 00001 00000 00000 10010 00001 00000 00 00100 00000 00000 00000 00000 00011 10 10111 11000 10010 00000 00111 11000 op rs rt rd sh fn
Field b oundaries shown to facilitate understanding
Symbol table
28 248 12
Figure 7.2 An assembly-language program, its machine-language version, and the symbol table created during the assembly process.
Jan. 2011 Computer Architecture, Instruction-Set Architecture Slide 51
Assembler directives provide the assembler with info on how to translate the program but do not lead to the generation of machine instructions
.macro .end_macro .text ... .data .byte 156,0x7a .word 35000 .float 2E-3 .double 2E-3 .align 2 .space 600 .ascii a*b .asciiz xyz .global main
Jan. 2011
# # # # # # # # # # #
start macro (see Section 7.4) end macro (see Section 7.4) start programs text segment program text goes here start programs data segment name & initialize data byte(s) name & initialize data word(s) # name short float (see Chapter 12) # name long float (see Chapter 12) align next item on word boundary reserve 600 bytes = 150 words name & initialize ASCII string null-terminated ASCII string # consider main a global name
Slide 52
7.3 Pseudoinstructions
Example of one-to-one pseudoinstruction: The following
not $s0 # complement ($s0)
# # # #
copy x into $t0 is x negative? if not, skip next instr the result is 0 x
Slide 54
MiniMIPS Pseudoinstructions
Pseudoinstruction Copy
Move Load address Load immediate Absolute value Negate Multiply (into register) Divide (into register) Remainder Set greater than Set less or equal Set greater or equal Rotate left Rotate right NOT Load doubleword Store doubleword Branch less than Branch greater than Branch less or equal Branch greater or equal
Usage
move la li abs neg mul div rem sgt sle sge rol ror not ld sd blt bgt ble bge regd,regs regd,address regd,anyimm regd,regs regd,regs regd,reg1,reg2 regd,reg1,reg2 regd,reg1,reg2 regd,reg1,reg2 regd,reg1,reg2 regd,reg1,reg2 regd,reg1,reg2 regd,reg1,reg2 reg regd,address regd,address reg1,reg2,L reg1,reg2,L reg1,reg2,L reg1,reg2,L
Slide 55
Arithmetic
Table 7.1
Shift Logic Memory access Control transfer
Jan. 2011
7.4 Macroinstructions
A macro is a mechanism to give a name to an often-used sequence of instructions (shorthand notation)
.macro name(args) ... .end_macro # macro and arguments named # instrs defining the macro # macro terminator
If the macro is used as mx3r($t0,$s0,$s4,$s3), the assembler replaces the arguments m, a1, a2, a3 with $t0, $s0, $s4, $s3, respectively.
Jan. 2011 Computer Architecture, Instruction-Set Architecture Slide 57
MicrosoftResearch
Formerly:Professor,CSDept.,Univ.Wisconsin-Madison
Jan. 2011
Slide 59
($v0)Function
1Print integer
Output
4Print string
Pointer in $a0
Input
5Read integer 6Read point 7Read float floatingdoublePointer in $a0, length in $a1
Cntl
Jan. 2011
8Read string
PCSpim
File Sim lat W dow He u or in lp
?
EPC = 00000000 Cause = 00000000 HI = 00000000 LO = 00000000 General Registers R8 (t0) = 0 R16 (s0) = 0 R24 R9 (t1) = 0 R17 (s1) = 0 R25
Sim lat u or
ClearRegist ers Reinit ze iali Reload Go Brea k Cont inue Single St ep Mult St ... iple ep Brea kpoint ... s Set Value... D Sym Table isp bol Setings ... t
TextSeg ent m
[0x00400000] [0x00400004] [0x00400008] [0x0040000c] [0x00400010] 0x0c100008 0x00000021 0x2402000a 0x0000000c 0x00000021 jal 0x00400020 [main] addu $0, $0, $0 addiu $2, $0, 10 syscall addu $0, $0, $0 ; ; ; ; ; 43 44 45 46 47
W dow in
Tile 1Me ssage s 2Te Seg ent xt m 3 D a Seg ent at m 4Regist ers 5Con sole ClearConso le Toolba r St us bar at
Messag s e
See the file README for a full copyright notice. Memory and registers have been cleared, and the simulator rei D:\temp\dos\TESTS\Alubare.s has been successfully loaded
Figure 7.3
Jan. 2011
Status bar
Slide 61
op
6 bits O pc od e
25
rs
5 bits
20
rt
5 bits
15
rd
5 bits
10
sh
5 bits
fn
6 bits
All of the same length Fields used consistently (simple decoding) Can initiate reading of registers even before decoding the instruction
O pc od e ex tens ion
0
31
op
6 bits O pc od e
rs
5 bits S ourc e or bas e
rt
5 bits
o p eran d / o ffset
16 bits Im m ediate o pe ran d or ad dres s o ffs et
31
op
6 bits O pc od e
25
Jan. 2011
Machine
Pentium
Instruction
MOVS
Effect
Move one element in a string of bytes, words, or doublewords using addresses specified in two pointer registers; after the operation, increment or decrement the registers to point to the next element of the string Count the number of consecutive 0s in a specified source register beginning with bit position 0 and place the count in a destination register Compare and swap: Compare the content of a register to that of a memory location; if unequal, load the memory word into the register, else store the content of a different register into the same memory location Polynomial evaluation with double flp arithmetic: Evaluate a polynomial in x, with very high precision in intermediate results, using a coefficient table whose location in memory is given within the instruction
PowerPC
cntlzd CS
IBM 360-370
Digital VAX
POLYD
Jan. 2011
Slide 64
cntlzd
6 leading 0s
Destination string
POLYD
Coefficients
(Move string)
Jan. 2011 Computer Architecture, Instruction-Set Architecture Slide 65
MOVS
More complex format (slower decoding) Less flexible (one algorithm for polynomial evaluation or sorting may not be the best in all cases) If interrupts are processed at the end of instruction cycle, machine may become less responsive to time-critical events (interrupt handling)
Slide 66
Operand
Immediate
Register
Base
Reg base
PC-relative
Constant offset
PC
Pseudodirect
PC
Table 6.2
Instruction
Load upper immediate Add Subtract Set less than Add immediate
Instruction
Move from Hi Move from Lo Add unsigned Subtract unsigned Multiply Multiply unsigned Divide Divide unsigned Add immediate unsigned Shift left logical Shift right logical Shift right arithmetic Shift left logical variable Shift right logical variable Shift right arith variable Load byte Load byte unsigned Store byte Jump and link System call
Usage
mfhi rd mflo rd addu rd,rs,rt subu rd,rs,rt mult rs,rt multu rs,rt div rs,rt divu rs,rt addiu rs,rt,imm sll rd,rt,sh srl rd,rt,sh sra rd,rt,sh sllv rd,rt,rs srlv rd,rt,rs srav rd,rt,rs lb rt,imm(rs) lbu rt,imm(rs) sb rt,imm(rs) jal L syscall
Slide 68
Set less than immediate AND OR XOR NOR AND immediate OR immediate XOR immediate Load word Store word Jump Jump register Branch less than 0 BranchJan. 2011 equal Branch not equal
Instruction
Operand
Indirect
PC Memory Mem addr This part maybe replaced with any Mem addr, other form of address specif ication 2nd access
Figure 8.1 Schematic representation of more elaborate addressing modes not supported in MiniMIPS.
Jan. 2011 Computer Architecture, Instruction-Set Architecture Slide 69
# # # #
# entry
Slide 70
Form at
Opcode
12 syscall
Description of operand(s)
One im plied operand in register$v0 Jum p target addressed (in pseudodirect form ) Two s ource registers address ed, destination im plied Destination and two source registers addressed
j
24 mult 32 add
0 rs rt 0 rs rt rd
Figure 8.2 Examples of MiniMIPS instructions with 0 to 3 addresses; shaded fields are unused.
Jan. 2011 Computer Architecture, Instruction-Set Architecture Slide 71
Polish string: a b + d c
If a variable is used again, you may have to push it multiple times Special instructions such as Duplicate and Swap are helpful
Jan. 2011 Computer Architecture, Instruction-Set Architecture Slide 72
May have to store accumulator contents in memory (example above) No store needed for a + b + c + d + . . . (accumulator)
Jan. 2011 Computer Architecture, Instruction-Set Architecture Slide 73
Two-Address Architectures
Two addresses may be used in different ways: Operand1/result and operand 2 Condition to be checked and branch target address Example: Evaluating the expression (a + b) (c d) load add load subtract multiply $1,a $1,b $2,c $2,d $1,$2
A variation is to use one of the addresses as in a one-address machine and the second one to specify a branch in every instruction
Jan. 2011
Slide 74
Opcode (1-2 B)
ModR/M
SIB
Opcode
PUSH JE MOV XOR ADD TEST
Description of operand(s)
3-bit register specification 4-bit condition, 8-bit jum p offset 8-bit register/mode, 8-bit offset 8-bit register/mode, 8-bit base/index, 8-bit offset 3-bit register spec, 32-bit imm ediate 8-bit register/mode, 32-bit imm ediate
Figure 8.3 Example 80x86 instructions ranging in width from 1 to 6 bytes; much wider instructions (up to 15 bytes) also exist
Jan. 2011 Computer Architecture, Instruction-Set Architecture Slide 76
Performance objectives
Implementation
?
Tuning & bug fixes Feedback
Figure 8.4
Jan. 2011
Jan. 2011
Reading 100 pages per hour, as opposed to 10 pages per hour, may not allow you to finish the same reading assignment in 1/10 the time
Jan. 2011 Computer Architecture, Instruction-Set Architecture Slide 81
# # # # #
dest,src1,src2 label
# # # # # # #
at1 = 0 at1 = -(src1) at1 = -(src1)(src2) dest = 0 dest = -(at1) at1 = 0 at1 = -1 to force jump
Slide 84
URISC Hardware
Word 1 Word 2
Source 2 / Dest
Word 3
Jump target
Source 1
PC in
MDR in
MAR in Read
Write
Adder N in Z in
N Z
P C
M D R
M A R
Memory unit
R in
1 Mux 0
PCout
Figure 8.5
Jan. 2011