2 Memory Organization
Laundry Example
from David Patterson
[Figure: sequential laundry timeline. Task order A, B, C, D; each load takes stages of 30, 40, and 20 minutes, and the loads run one after another.]
Sequential laundry takes 6 hours for 4 loads
If they learned pipelining, how long would laundry take?
Pipelined Laundry
Start work ASAP
[Figure: pipelined laundry timeline from 6 PM to midnight. Task order A, B, C, D; the durations along the time axis are 30, 40, 40, 40, 40, and 20 minutes.]
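The two timelines can be checked with a short calculation. The sketch below is only an illustration: it takes the stage times (30, 40, 20 minutes) from the figures and assumes each stage handles one load at a time and that a finished load can wait between stages. The function names are hypothetical.

```python
# Stage times in minutes, as read from the laundry timelines.
STAGE_MINUTES = [30, 40, 20]

def sequential_minutes(loads, stages=STAGE_MINUTES):
    """Total time when each load finishes all stages before the next starts."""
    return loads * sum(stages)

def pipelined_minutes(loads, stages=STAGE_MINUTES):
    """Total time when a load enters a stage as soon as the load is ready
    and the stage is free (assumption: loads may wait between stages)."""
    stage_free = [0] * len(stages)  # time at which each stage becomes free
    finish = 0
    for _ in range(loads):
        t = 0  # time at which this load is ready for its next stage
        for i, duration in enumerate(stages):
            start = max(t, stage_free[i])
            t = start + duration
            stage_free[i] = t
        finish = t
    return finish

print(sequential_minutes(4) / 60)  # 6.0 hours
print(pipelined_minutes(4) / 60)   # 3.5 hours
```

Under these assumptions the pipelined total is 30 + 4 × 40 + 20 = 210 minutes, which matches the durations shown on the pipelined timeline: the slowest stage (40 minutes) sets the pace.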
Instruction Fetch
Instruction Decoding
Operand Fetch
Execute
Store Result
5 Steps of MIPS Datapath
[Figure: MIPS datapath in five stages: Instruction Fetch; Instr. Decode / Reg. Fetch; Execute / Addr. Calc; Memory Access; Write Back. Labeled components: Next PC, Next SEQ PC, Adder (+4), instruction Memory (Address, Inst), Reg File (RS1, RS2, RD), Sign Extend (Imm), MUXes, ALU, Zero?, Data Memory (LMD), WB Data.]
5 Steps of MIPS Datapath
Figure 3.4, page 134, CA:AQA 2e
[Figure: the same five-stage MIPS datapath (Instruction Fetch; Instr. Decode / Reg. Fetch; Execute / Addr. Calc; Memory Access; Write Back) with pipeline registers IF/ID, ID/EX, EX/MEM, and MEM/WB between the stages. Next SEQ PC and RD are carried forward through the pipeline registers.]
[Figure: pipeline diagram. Four instructions in instruction order, each passing through Ifetch, Reg, ALU, DMem, Reg, with each instruction starting one cycle after the previous one.]
Pipeline Time Analysis
With Pipeline
k-segment pipeline
n tasks
t_p: clock cycle time
k t_p: time to complete the first task T1
(n-1) t_p: time to complete the remaining n-1 tasks
k + (n-1): clock cycles to complete n tasks
(k + n - 1) t_p: time to complete n tasks
Without pipeline
t_n: time to complete each task
n t_n: time to complete n tasks
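The cycle count k + (n-1) can be confirmed by stepping an ideal pipeline one clock at a time. This is a minimal sketch that assumes no stalls and one new task issued per cycle; the function name is hypothetical.

```python
def pipeline_cycles(k, n):
    """Count clock cycles for n tasks to pass through an ideal k-segment
    pipeline (assumption: no stalls, one task enters per cycle)."""
    stages = [None] * k          # stages[i] holds the task id in segment i
    issued = done = cycles = 0
    while done < n:
        # Every task moves one segment forward.
        stages = [None] + stages[:-1]
        # A new task enters the first segment if any remain.
        if issued < n:
            stages[0] = issued
            issued += 1
        cycles += 1
        # A task occupying the last segment completes at the end of this cycle.
        if stages[-1] is not None:
            done += 1
    return cycles

print(pipeline_cycles(4, 3))  # 6, i.e. k + (n - 1) = 4 + 2
```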
Pipeline Time Analysis
Speedup of pipelining
$$S = \frac{n\,t_n}{(k + n - 1)\,t_p}, \qquad \lim_{n \to \infty} S = \frac{t_n}{t_p}$$
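To get a feel for how the speedup approaches its limit, the formula can be evaluated for growing n. The numbers used here (k = 4, t_p = 20 ns, t_n = 80 ns) are hypothetical values chosen for illustration, not taken from the slides.

```python
def speedup(k, n, tp, tn):
    """Speedup of a k-segment pipeline over non-pipelined execution:
    S = n*tn / ((k + n - 1) * tp)."""
    return (n * tn) / ((k + n - 1) * tp)

# Hypothetical example: 4 segments, 20 ns clock, 80 ns per task unpipelined.
for n in (1, 10, 100, 10_000):
    print(n, round(speedup(4, n, 20, 80), 3))
# As n grows, S approaches tn / tp = 4.
```

With a single task the speedup is 1 (no benefit); the benefit appears only when many tasks keep the pipeline full.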
[Figure: pipeline diagram. Instruction order: Load, Instr 1, Instr 2, Instr 3, Instr 4; each instruction passes through Ifetch, Reg, ALU, DMem, Reg.]
Data Hazard on R1
[Figure: pipeline diagram over time (clock cycles) with stages IF, ID/RF, EX, MEM, WB; each instruction passes through Ifetch, Reg, ALU, DMem, Reg.]
Instruction order:
add r1,r2,r3
sub r4,r1,r3
and r6,r1,r7
or r8,r6,r9
xor r10,r1,r11
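The read-after-write dependences in this sequence can be listed mechanically. The sketch below assumes the three-operand form `op dest,src1,src2` used in the listing; the helper names are hypothetical. Whether a given dependence actually stalls the pipeline depends on the distance between the two instructions and on whether forwarding is available, which this sketch does not model.

```python
PROGRAM = [
    "add r1,r2,r3",
    "sub r4,r1,r3",
    "and r6,r1,r7",
    "or r8,r6,r9",
    "xor r10,r1,r11",
]

def parse(instr):
    """Split 'op dest,src1,src2' into (op, dest, [src1, src2])."""
    op, operands = instr.split(None, 1)
    dest, *srcs = [r.strip() for r in operands.split(",")]
    return op, dest, srcs

def raw_dependences(program):
    """Return (producer_index, consumer_index, register) for every source
    register whose most recent writer is an earlier instruction."""
    last_writer = {}
    deps = []
    for i, instr in enumerate(program):
        _, dest, srcs = parse(instr)
        for reg in srcs:
            if reg in last_writer:
                deps.append((last_writer[reg], i, reg))
        last_writer[dest] = i
    return deps

for producer, consumer, reg in raw_dependences(PROGRAM):
    print(f"{PROGRAM[consumer]!r} reads {reg} written by {PROGRAM[producer]!r} "
          f"(distance {consumer - producer})")
```

For the sequence above, `sub`, `and`, and `xor` all read r1 produced by `add`, at distances 1, 2, and 4; the closer the consumer, the more likely it reads r1 before `add` has written it back.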
Control Hazard due to Branches
[Figure: pipeline diagram; each instruction passes through Ifetch, Reg, ALU, DMem, Reg.]
Instruction order (address: instruction):
10: beq r1,r3,36
14: and r2,r3,r5
18: or r6,r1,r7
22: add r8,r1,r9
36: xor r10,r1,r11
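The fetch order in the branch example can be reproduced with a small model. This is a sketch under an assumption read off the figure: the branch outcome becomes known three cycles after the branch is fetched, so three sequential instructions enter the pipeline before the target does. The function and parameter names are hypothetical.

```python
def fetch_trace(branch_pc, target_pc, resolve_delay, step=4):
    """Addresses fetched, in order, when the branch at branch_pc is taken and
    the target is known only resolve_delay cycles after the branch is fetched
    (assumption: the pipeline keeps fetching sequentially in the meantime)."""
    trace = [branch_pc]
    pc = branch_pc + step
    for _ in range(resolve_delay):
        trace.append(pc)   # fetched before the branch outcome is known
        pc += step
    trace.append(target_pc)
    return trace

trace = fetch_trace(branch_pc=10, target_pc=36, resolve_delay=3)
print(trace)          # [10, 14, 18, 22, 36]
print(trace[1:-1])    # [14, 18, 22]: fetched while the branch was unresolved
```

If the branch is taken, the instructions at 14, 18, and 22 should not have executed, which is the control hazard the figure illustrates.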
Solutions
Instruction Reordering
Branch Prediction
Parallel Processing
Concurrent data processing
Possibilities
Fetch next instruction while current instruction is executed in ALU
System may have more than one ALU
System may have more than one CPU
Overall goal is to increase throughput
Multiple Functional Units
Parallel Processing Classifications
Classification of parallel processing can be considered based on
Internal organization of processors
Interconnection structure between processors
Flow of information through system
Flynn's classification
SISD: Single Instruction, Single Data
SIMD: Single Instruction, Multiple Data
MISD: Multiple Instruction, Single Data
MIMD: Multiple Instruction, Multiple Data
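The four names follow directly from two yes/no questions: is there more than one instruction stream, and is there more than one data stream? A minimal sketch of that mapping (the function name is hypothetical):

```python
def flynn_class(multiple_instruction_streams, multiple_data_streams):
    """Build the Flynn category name from the two stream counts."""
    instr = "M" if multiple_instruction_streams else "S"
    data = "M" if multiple_data_streams else "S"
    return f"{instr}I{data}D"

print(flynn_class(False, False))  # SISD
print(flynn_class(False, True))   # SIMD
print(flynn_class(True, False))   # MISD
print(flynn_class(True, True))    # MIMD
```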
SISD
Single computer with
Control Unit
CPU, and
Memory
Instructions are executed sequentially
Parallel processing achieved by
Multiple functional units
Pipeline processing
SIMD
Multiple processing units supervised by a common control unit
All processors:
Receive the same instruction from the control unit
Operate on different data
Shared memory unit must have multiple modules so that each processor can access its own memory module simultaneously
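The SIMD organization can be modeled in a few lines: one instruction is broadcast, and each processing unit applies it to its own memory module. This is only a conceptual sketch; the Python loop runs the units one after another, whereas real SIMD hardware would run them at the same time. All names and the three-word module layout are hypothetical.

```python
def simd_execute(instruction, modules):
    """Control unit: broadcast one instruction to every processing unit.
    Each unit operates only on its own memory module."""
    for module in modules:      # conceptually simultaneous
        instruction(module)

def add_first_two(module):
    """Example instruction: module[2] = module[0] + module[1]."""
    module[2] = module[0] + module[1]

# One memory module per processing unit (hypothetical data).
modules = [[1, 2, 0], [10, 20, 0], [100, 200, 0]]
simd_execute(add_first_two, modules)
print(modules)  # [[1, 2, 3], [10, 20, 30], [100, 200, 300]]
```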
MIMD
Computer system that simultaneously executes many programs
Category for most multiprocessor and multicomputer systems