
Hardware Support for Exposing More Parallelism at Compile Time


BY
N.R.Rejin Paul
Lecturer/CSE Dept.

Conditional or Predicated Instructions


Most common form is the conditional move
      BNEZ  R1, L
      MOV   R2, R3
L:
becomes
      CMOVZ R2, R3, R1
Other variants
Conditional loads and stores
ALPHA, MIPS, SPARC, PowerPC, and P6 all have simple
conditional moves
IA-64 supports full predication for all instructions

Effect is to eliminate simple branches

Moves the dependence resolution point from early in the pipe
(branch resolution) to late in the pipe (register write), so
forwarding is more possible
Also changes a control dependence into a data
dependence

Net win, since in global scheduling the control
dependence fence is the key limiting complexity
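
A minimal sketch of the branch-to-conditional-move transformation, written as a Python model rather than real MIPS code. The function names are hypothetical, and the sketch assumes that CMOVZ R2, R3, R1 copies R3 into R2 only when R1 is zero.

```python
def branch_version(r1, r2, r3):
    """BNEZ R1, L ; MOV R2, R3 ; L:   returns the final value of R2."""
    if r1 != 0:        # BNEZ R1, L: the branch skips the move
        return r2
    return r3          # MOV R2, R3

def cmovz_version(r1, r2, r3):
    """CMOVZ R2, R3, R1: always issued; writes R2 only if R1 == 0."""
    # The control dependence on the branch is now a data dependence on R1.
    return r3 if r1 == 0 else r2

# Both forms leave the same value in R2 for zero and non-zero conditions.
for r1 in (0, 1, -5):
    assert branch_version(r1, 10, 20) == cmovz_version(r1, 10, 20)
```

The Python conditional expression only models the select that the hardware performs at register write; it is not meant to suggest that CMOVZ branches.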

Conditional Instruction in Superscalar

Original sequence on a 2-issue machine (one memory slot, one ALU slot):

First slot (Mem)         Second slot (ALU)
LW   R1, 40(R2)          ADD  R3, R4, R5
                         ADD  R6, R3, R7
BEQZ R10, L
LW   R8, 20(R10)
LW   R9, 0(R8)

Wastes a memory operation slot in the 2nd cycle
Data dependence stall if the branch is not taken

With a conditional load (LWC) moved into the empty slot:

First slot (Mem)         Second slot (ALU)
LW   R1, 40(R2)          ADD  R3, R4, R5
LWC  R8, 20(R10), R10    ADD  R6, R3, R7
BEQZ R10, L
LW   R9, 0(R8)

Conditional Instruction Limitations

Precise Exceptions
If an exception happens prior to conditional evaluation, it must
be carried through the pipe
Simple for register accesses, but consider a memory protection
violation or a page fault

Long conditional sequences: if-then with a big then body
If the task to be done is complex, better to evaluate the
condition once and then do the big block

Conditional instructions are most useful when the
condition can be evaluated early
If there is a data dependence in determining the condition, they help less

Conditional Instruction Limitations (Cont.)

Wasted resource
Conditional instructions consume real resources
Tends to work well in the superscalar case
In our simple 2-way model, even with no conditional instruction,
the other resource is wasted anyway

Cycle-time or CPI Issues
Conditional instructions are more complex
Danger is that they may consume more cycles or a longer
cycle time
Note that the utility is mainly to correct short control flows
Hence use may not be for the common case
Things had better not slow down for the real common case to support
the uncommon case

Compiler Speculation with HW Support

Ideal view
Do conditional things in advance of the branch
(and before the condition evaluation)
Nullify them if the branch goes the wrong way
Also implies the need to nullify exception behavior
as well

Limits
Speculated values can't clobber any real results
Exceptions cannot cause any destructive activity

To Speculate Ambitiously
Ability of the compiler to find instructions that
can be speculatively moved and not affect the
program data flow
Ability of HW to ignore exceptions in
speculated instructions, until we know that
such exceptions should really occur
Ability of HW to speculatively interchange
loads and stores, or stores and stores, which
may have address conflicts

HW Support for Preserving Exception Behavior

How to make sure that a mispredicted speculated
instruction (SI) cannot cause an exception?
Four methods
HW and OS cooperatively ignore exceptions for SI
SI that never raise exceptions are used, and checks are
introduced to determine when an exception should occur
Poison bits are attached to the result registers written by SI
when SI cause exceptions. The poison bits cause a fault
when a normal instruction attempts to use the register
A mechanism to indicate that an instruction is speculative,
and HW buffers the instruction result until it is certain that
the instruction is no longer speculative

Exception Types
Indicate a program error and normally cause
termination
Memory protection violation
Should not be handled for an SI on a misprediction
Exceptions cannot be taken until we know the
instruction is no longer speculative

Handled and normally resumed
Page fault
Can be handled for SI just as if they were normal
instructions
Only have a negative performance effect on a misprediction

HW-SW Cooperation for Speculation

Return an undefined value for any terminating exception
The program is allowed to continue, but will almost certainly generate
incorrect results
If the excepting instruction is not speculative: the program is in
error
If the excepting instruction is speculative: the program is correct, but the
speculative result will simply be unused (no harm)
Never causes a correct program to fail, no matter how much
speculation
An incorrect program, which formerly might have received a
terminating exception, will get an incorrect result
Acceptable if the compiler can also generate a normal version of the
program (no speculation, and receives a terminating exception)

HW-SW Cooperation for Speculation (Cont.)

if (A==0) A=B; else A=A+4
A is at 0(R3), B is at 0(R2)

      LD    R1, 0(R3)     ; load A
      BNEZ  R1, L1        ; test A
      LD    R1, 0(R2)     ; if (then) clause
      J     L2            ; skip else
L1:   DADDI R1, R1, #4    ; else
L2:   SD    R1, 0(R3)     ; store A

Assume the then clause is almost always executed

Compiler-based speculation

      LD    R1, 0(R3)     ; load A
      LD    R14, 0(R2)    ; speculative load B
      BEQZ  R1, L3        ; other branch of the if
      DADDI R14, R1, #4   ; else
L3:   SD    R14, 0(R3)    ; store A

R14 is used to avoid destroying R1 when B is loaded
No need to know which instruction is speculative
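
The two sequences can be compared with a small Python model. This is a sketch under stated assumptions, not a simulator: memory is a dictionary keyed by the hypothetical names "A" and "B" instead of addresses, and an unmapped location is modelled by a missing key, so that a faulting load shows up as a KeyError.

```python
def original(mem):
    r1 = mem["A"]              # LD    R1, 0(R3)    ; load A
    if r1 == 0:                # BNEZ  R1, L1 falls through when A == 0
        r1 = mem["B"]          # LD    R1, 0(R2)    ; then clause
    else:
        r1 = r1 + 4            # L1: DADDI R1, R1, #4
    mem["A"] = r1              # L2: SD R1, 0(R3)

def speculated(mem):
    r1 = mem["A"]              # LD    R1, 0(R3)    ; load A
    r14 = mem["B"]             # LD    R14, 0(R2)   ; B is loaded before the test
    if r1 != 0:                # BEQZ  R1, L3 skips the DADDI when A == 0
        r14 = r1 + 4           # DADDI R14, R1, #4
    mem["A"] = r14             # L3: SD R14, 0(R3)

def faults_when_b_unmapped(sequence):
    """Run with A != 0 and no B; report whether the load of B faulted."""
    try:
        sequence({"A": 3})
        return False
    except KeyError:
        return True

# Same result on both paths when B is a legal location.
for a in (0, 3):
    m1, m2 = {"A": a, "B": 99}, {"A": a, "B": 99}
    original(m1)
    speculated(m2)
    assert m1 == m2
```

In this model `faults_when_b_unmapped(original)` is False while `faults_when_b_unmapped(speculated)` is True: the speculated load can raise an exception that the original program never would, which is the exception-behavior problem this part of the deck is concerned with.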


Non-Terminating Speculative
Instructions + Exception Checking

      LD     R1, 0(R3)     ; load A
      sLD    R14, 0(R2)    ; speculative load B
      BNEZ   R1, L1        ; test A
      SPECCK 0(R2)         ; speculation check
      J      L2            ; skip else
L1:   DADDI  R14, R1, #4   ; else
L2:   SD     R14, 0(R3)    ; store A

sLD: speculative load without termination
SPECCK: speculation checking
Note:
Requires maintaining a basic block for the THEN case
Checking for a possible exception requires extra code

Poison Bits
Tracks exceptions as they occur but postpones
any terminating exception until a value is
actually used.
Incorrect programs that caused termination
without speculation will still cause exceptions
when instructions are speculated.
A poison bit for every register, and a bit in every instruction to indicate whether it is an SI
The poison bit of a destination register is set
when an SI results in a terminating exception.
All other exceptions are handled immediately

If an SI uses a poisoned register, its destination register is poisoned as well

Poison-Bit Example

      LD    R1, 0(R3)     ; load A
      sLD   R14, 0(R2)    ; speculative load B; on an exception, R14 is poisoned
      BEQZ  R1, L3        ; other branch of the if
      DADDI R14, R1, #4   ; else
L3:   SD    R14, 0(R3)    ; store A; if R14 is poisoned, SD faults

If sLD generates a terminating exception, the poison bit of R14 will be
turned on. When SD occurs, it will raise an exception if the poison bit for
R14 is on.
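
The poison-bit behavior in this example can be sketched in Python. Everything here is an illustrative assumption: `RegFile`, `PoisonFault`, `sld` and `run` are hypothetical names, memory is a dictionary keyed by "A" and "B", and a missing key stands in for a terminating exception such as a protection violation.

```python
class PoisonFault(Exception):
    """Raised when a normal instruction uses a poisoned register."""

class RegFile:
    def __init__(self):
        self.val = {}
        self.poison = set()      # registers whose poison bit is on

    def write(self, reg, value):
        # A normal write produces a good value and clears the poison bit.
        self.val[reg] = value
        self.poison.discard(reg)

    def read(self, reg):
        # A normal instruction reading a poisoned register faults.
        if reg in self.poison:
            raise PoisonFault(reg)
        return self.val[reg]

def sld(regs, dest, mem, key):
    """Speculative load: a bad access poisons dest instead of faulting."""
    if key in mem:
        regs.write(dest, mem[key])
    else:
        regs.val[dest] = 0       # undefined value; 0 is only a placeholder
        regs.poison.add(dest)

def run(mem):
    regs = RegFile()
    regs.write("R1", mem["A"])                    # LD    R1, 0(R3)
    sld(regs, "R14", mem, "B")                    # sLD   R14, 0(R2)
    if regs.read("R1") != 0:                      # BEQZ  R1, L3
        regs.write("R14", regs.read("R1") + 4)    # DADDI R14, R1, #4
    mem["A"] = regs.read("R14")                   # L3: SD R14, 0(R3)
    return mem["A"]
```

With B unmapped, `run({"A": 3})` completes because the DADDI overwrites R14 before the store, while `run({"A": 0})` raises `PoisonFault` at the store. That matches the unspeculated program, which would only fault on the path that really loads B.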


Boosting
How to deal with exceptions? Similar to poison bits?
Reduces the number of registers used
Provide separate shadow resources for boosted
instruction results
If the condition resolves selecting the boosted path,
then these results are committed to the real registers

      LD    R1, 0(R3)     ; load A
      LD+   R1, 0(R2)     ; boosted load B; result is never written to
                          ; R1 if branch is not taken
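
A rough Python sketch of the shadow-register idea follows. The slide shows only the two loads, so the rest of the sequence is an assumption: it reuses the BEQZ / DADDI / SD tail from the compiler-based speculation slide, with R1 in place of R14. The dictionaries `regs` and `shadow` and the function name are hypothetical.

```python
def boosted_sequence(mem):
    regs = {"R1": mem["A"]}            # LD    R1, 0(R3)   ; load A
    shadow = {"R1": mem["B"]}          # LD+   R1, 0(R2)   ; result held in shadow R1
    branch_taken = regs["R1"] == 0     # BEQZ  R1, L3      ; reads the real R1
    if branch_taken:
        regs.update(shadow)            # boosted path chosen: commit shadow results
    else:
        regs["R1"] = regs["R1"] + 4    # DADDI R1, R1, #4  ; shadow result dropped
    mem["A"] = regs["R1"]              # L3: SD R1, 0(R3)
    return mem["A"]
```

Because the boosted result waits in the shadow copy, the sequence no longer needs a second architectural register such as R14 to protect A while the branch is unresolved.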


HW (and OS) ignores exception until instruction commits
Rely on a hardware mechanism that operates
like ROB
Instructions are marked by the compiler as
speculative and include an indicator of how
many branches the instruction was
speculatively moved across and what branch
action (taken/not taken) the compiler
assumed
The original location of an SI is marked by a
sentinel, which tells HW that the earlier SI is no
longer speculative and values may be committed

HW (and OS) ignores exception until instruction commits (Cont.)
ROB tracks when instructions are ready to
commit and delays the write-back portion of
any SI
SI are not allowed to commit until the
branches that have been speculatively moved
over are also ready to commit, or, alternatively,
until the corresponding sentinel is reached
We know whether SI should have been executed
or not

If a ready-to-commit SI should have been


HW Support for Memory Reference Speculation

Try to move loads across stores: any
address conflict?
HW: use a special instruction to check for
address conflicts
The special instruction is left at the original
location of the load instruction (acts like a
guardian), and the load is moved up across
one or more stores
When a speculated load is executed, HW saves
the address of the accessed memory location


HW Support for Memory Reference Speculation (Cont.)

Speculation failure handling
Speculation fails if a store writes the load's saved address before
the check instruction
If only the load instruction was speculated, redo
the load at the point of the check instruction
If additional instructions that depended on the
load were also speculated, then a fix-up sequence
that re-executes all the SI starting with the load is
needed
Penalties!!
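
The load-only case can be sketched in Python as follows. The `SpecLoadTable` class, its method names and the addresses 100 and 200 are hypothetical; the sketch only assumes what the slides state: a speculated load records its address, a store to that address before the check invalidates the speculation, and the check then redoes the load.

```python
class SpecLoadTable:
    """Hypothetical table of addresses saved by speculated loads."""
    def __init__(self):
        self.saved = {}                       # dest register -> load address

    def spec_load(self, regs, dest, mem, addr):
        regs[dest] = mem[addr]                # load moved up across stores
        self.saved[dest] = addr               # HW saves the accessed address

    def store(self, mem, addr, value):
        mem[addr] = value
        # A store to a saved address means the early load read stale data.
        for reg in [r for r, a in self.saved.items() if a == addr]:
            del self.saved[reg]

    def check(self, regs, dest, mem, addr):
        """Check instruction left at the load's original location."""
        if self.saved.pop(dest, None) is None:
            regs[dest] = mem[addr]            # conflict: redo the load here
            return False
        return True

def demo(store_addr):
    mem = {100: 1, 200: 2}
    regs = {}
    table = SpecLoadTable()
    table.spec_load(regs, "R5", mem, 100)     # load hoisted above the store
    table.store(mem, store_addr, 55)          # the store it was moved across
    ok = table.check(regs, "R5", mem, 100)    # check at the original load site
    return ok, regs["R5"]
```

`demo(200)` returns `(True, 1)` because the store does not conflict, while `demo(100)` returns `(False, 55)` after the check reloads the value. The fix-up sequence for dependent speculated instructions is not modelled.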


4.6 HW Versus SW Speculation Mechanisms

To speculate extensively, we must be able to
disambiguate memory references: easy for
HW (Tomasulo)
HW speculation works better when control
flow is unpredictable, and when HW branch
prediction is superior to SW branch prediction
done at compile time
Misprediction rate for 4 major integer SPEC92 programs:
16% (SW) vs. 10% (HW)

HW speculation maintains a completely


HW Versus SW Speculation
Mechanisms (Cont.)
HW speculation with dynamic scheduling does
not require different code sequences to
achieve good performance for different
implementations of an architecture
HW speculation requires complex and
additional HW resources
Some designers have tried to combine the
dynamic and compiler-based approaches to
achieve the best of each
