
VLSI Architecture :: MEL G642

Dr. A. Amalin Prince
BITS Pilani, K.K. Birla Goa Campus
Department of Electrical, Electronics and Instrumentation Engineering

MEL G642

Contents
MAC fundamentals
MAC implementations
A MAC case study
MAC integration


Datapath in a DSP processor


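[Figure: datapath in a DSP processor. The data path (DP) contains RF, ALU and MAC. The control path (CP) and the addressing path (AGU, with AGU1 and AGU2) sit alongside it. The memories PM, DM1 and DM2 connect through the processor memory and register buses.]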



MAC general


MAC instructions
Multiplication arithmetic
MAC and iterative instructions
Double-precision arithmetic instructions
Move data from and to MAC
Data format conversions
Other instructions


Why MAC
MAC: Multiplication and accumulation unit
Performs convolution-based algorithms
o FIR, IIR, autocorrelation, cross-correlation

Supports most transform algorithms
o FFT and DCT need MAC hardware
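[Figure: 5-tap FIR filter. The input x(n) passes through four Z^-1 delay elements giving x(n-1), x(n-2), x(n-3) and x(n-4); the taps are weighted by c(0) to c(4) and summed to produce y(n).]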


Why MAC
$y(n) = \sum_{i=0}^{m-1} x(n-i)\, c(i)$

Data x(n) is shifted through a FIFO buffer consisting of 4 registers, so that x(n) becomes x(n-1) and x(n-1) becomes x(n-2) in the next clock cycle
All arithmetic operations are mapped to hardware in parallel
There are five multipliers (one per coefficient c(0) to c(4)) and four adders
A sample of y(n) is computed per clock cycle
y(n) = x(n)*c(0) + x(n-1)*c(1) + x(n-2)*c(2) + x(n-3)*c(3) + x(n-4)*c(4)
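
The FIR sum can also be computed sequentially on a single MAC, one tap per step, instead of fully in parallel. The following is a minimal software sketch of that idea; the function name `fir_mac` and the use of unbounded Python integers (no word length, no saturation) are assumptions made for illustration only.

```python
def fir_mac(samples, coeffs):
    """Direct-form FIR: one multiply-accumulate per tap for every output sample."""
    delay_line = [0] * len(coeffs)      # holds x(n), x(n-1), ..., x(n-m+1)
    outputs = []
    for x in samples:
        # Shift the FIFO: x(n) becomes x(n-1), x(n-1) becomes x(n-2), ...
        delay_line = [x] + delay_line[:-1]
        acc = 0                         # accumulator cleared per output sample
        for i, c in enumerate(coeffs):
            acc += delay_line[i] * c    # one MAC step: acc = acc + x(n-i) * c(i)
        outputs.append(acc)
    return outputs


if __name__ == "__main__":
    # An impulse at the input reproduces the coefficients at the output.
    print(fir_mac([1, 0, 0, 0, 0, 0], [1, 2, 3, 4, 5]))
```

With m taps this needs m MAC steps per output sample, which is the trade-off against the parallel structure with one multiplier per coefficient.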


MAC basics
MAC: Multiplication and accumulation unit
Adder = accumulator; ACR = accumulating register

[Figure: two MAC structures. Blocks shown: multiplier, pipeline, accumulator, ACR, flag circuit. Signal labels: MOA, MOB, AOA, AOB. (a) MAC without pipeline. (b) MAC with pipeline.]

MUL circuit


Multiplications
How to manage double precision? How to manage signed?

Hardware multiplication
o Fractional multiplication or integer multiplication
o Signed multiplication or unsigned multiplication
o Result with double precision or result with single precision



Multiplications
How to manage double precision? How to manage signed?

16-bit signed and 16-bit unsigned multiplication can both be implemented on a 17b × 17b signed multiplier. In general, an (N+1) × (N+1)-bit signed multiplier provides N-bit signed and N-bit unsigned multiplication.


Basic multiplication instructions


No.  Specification on the result
M1   Signed integer multiplication, double-precision result
     ACR[31:0] <= {A[15], A[15:0]} * {B[15], B[15:0]}
M2   Signed-unsigned integer multiplication, double-precision result
     ACR[31:0] <= {A[15], A[15:0]} * {0, B[15:0]}
M3   Unsigned-signed integer multiplication, double-precision result
     ACR[31:0] <= {0, A[15:0]} * {B[15], B[15:0]}
M4   Unsigned-unsigned integer multiplication, double-precision result
     ACR[31:0] <= {0, A[15:0]} * {0, B[15:0]}
M5   Signed integer multiplication, single-precision result, no rounding
     ACR[31:16] <= SAT(2^16 * ({A[15], A[15:0]} * {B[15], B[15:0]}))
M6   Signed fractional multiplication, double-precision result
     ACR[31:0] <= SAT(2 * ({A[15], A[15:0]} * {B[15], B[15:0]}))
M7   Signed fractional multiplication, single-precision rounded result
     ACR[31:16] <= SAT(Round(2 * ({A[15], A[15:0]} * {B[15], B[15:0]})))
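
As an illustration of the two fractional rows (M6 and M7), the sketch below models them on Q15 operands held as signed Python integers. It assumes that Round means adding half an LSB of the upper word (2^15) before the upper 16 bits are taken; the slides do not spell out the rounding rule, and the function names are hypothetical.

```python
def sat(value, bits):
    """Clamp value to the range of a signed two's-complement word of `bits` bits."""
    lo, hi = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    return max(lo, min(hi, value))


def mul_frac_double(a, b):
    """M6: Q15 x Q15 -> Q31. The factor 2 removes the duplicated sign bit."""
    return sat(2 * a * b, 32)


def mul_frac_single(a, b):
    """M7: Q15 x Q15 -> Q15, rounded; returns the value held in ACR[31:16]."""
    rounded = 2 * a * b + (1 << 15)     # assumed rounding: add half an LSB
    return sat(rounded, 32) >> 16


if __name__ == "__main__":
    half = 0x4000                       # 0.5 in Q15
    print(hex(mul_frac_double(half, half)))
    print(hex(mul_frac_single(half, half)))
```

The only operand pair that needs saturation here is (-1) * (-1), whose exact result +1 is not representable in a signed fractional format.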

Multiplication of long data


ACR <= X[31:0] * Y[15:0]

ACR[47:0] <= {X[31], X[31:16]} * {Y[15], Y[15:0]} + 2^-16 * ({0, X[15:0]} * {Y[15], Y[15:0]});

ACR <= X[31:0] * Y[31:0]

ACR[63:0] <= {X[31], X[31:16]} * {Y[31], Y[31:16]} + 2^-16 * ({0, X[15:0]} * {Y[31], Y[31:16]}) + 2^-16 * ({X[31], X[31:16]} * {0, Y[15:0]}) + 2^-32 * ({0, X[15:0]} * {0, Y[15:0]});
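
The decomposition can be checked numerically. The sketch below splits each 32-bit operand into a signed high half and an unsigned low half, so every partial product fits a 17 × 17 signed multiplier. Unlike the expressions above, it weights the high-order terms by 2^16 and 2^32 instead of weighting the low-order terms by 2^-16 and 2^-32, so that everything stays an integer; that rescaling and the function names are assumptions of this sketch.

```python
def split32(x):
    """Split a signed 32-bit value into a signed high half and an unsigned low half."""
    return x >> 16, x & 0xFFFF


def mul32x16(x, y):
    """32-bit by 16-bit signed product built from two partial products."""
    xh, xl = split32(x)
    return ((xh * y) << 16) + xl * y


def mul32x32(x, y):
    """32-bit by 32-bit signed product built from four partial products."""
    xh, xl = split32(x)
    yh, yl = split32(y)
    return ((xh * yh) << 32) + ((xl * yh) << 16) + ((xh * yl) << 16) + xl * yl


if __name__ == "__main__":
    x, y = -123456789, 987654321
    print(mul32x32(x, y) == x * y)
```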


An example of MUL


MAC circuit


MAC instructions


Guard Operations In MAC


MAC circuit


Control Signals

Note: check carefully, mistakes are possible.



MAC instructions
Single step (signed) MAC
o Integer
o Fractional

(Signed) Convolution
o Integer
o Fractional

Difference between MAC and convolution
o In the control path, not shown here


Double-Precision Arithmetic


Double-Precision Arithmetic in MAC


No.  Specification on the result
D1   Double-precision data add/sub double-precision data
     Saturate(ACRx[39:0] ± ACRy[39:0])
D2   Double-precision data add/sub single-precision data, aligned to LSB
     Saturate(ACRx[39:0] ± {24{OPB[15]}, OPB[15:0]})
D3   Double-precision data add/sub single-precision data, aligned to MSB
     Saturate(ACRx[39:0] ± {8{OPB[15]}, OPB[15:0], 16'b0})
D4   Double-precision data add/sub single-precision immediate
     Saturate(ACRx[39:0] ± {24{immediate[15]}, immediate[15:0]})
D5   Absolute operation on double-precision data
     if ACRx[39]: Saturate(INV(ACRx[39:0]) + 1) else ACRx
D6   Compare two double-precision data and set flags
     set flags from Saturate(ACRx[39:0] - ACRy[39:0])
D7   Simple scale by MUX instead of by shift logic
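
A few of these rows can be modelled directly. The sketch below treats ACR values as signed Python integers limited to 40 bits, so sign extension of a 16-bit operand is implicit and INV(x) + 1 is simply -x. Only the add variants and the absolute value are shown; the function names are hypothetical.

```python
ACR_BITS = 40


def sat40(value):
    """Clamp value to the signed 40-bit accumulator range."""
    lo, hi = -(1 << (ACR_BITS - 1)), (1 << (ACR_BITS - 1)) - 1
    return max(lo, min(hi, value))


def d1_add(acrx, acry):
    """D1: double precision plus double precision."""
    return sat40(acrx + acry)


def d2_add_lsb(acrx, opb):
    """D2: sign-extended 16-bit operand aligned to the LSB."""
    return sat40(acrx + opb)


def d3_add_msb(acrx, opb):
    """D3: 16-bit operand placed in bits [31:16], with 16 zeros below it."""
    return sat40(acrx + (opb << 16))


def d5_abs(acrx):
    """D5: absolute value; the most negative input saturates."""
    return sat40(-acrx) if acrx < 0 else acrx


if __name__ == "__main__":
    print(hex(d1_add(2**39 - 1, 1)))    # stays at the positive limit
    print(d3_add_msb(0, -1))
```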

Scaling in DSP :: MAC


With Double Precision Arithmetic


Control Signals


Move / change data types


Move data from MAC and to MAC


Basic load

Load one half of ACRn and keep the other half.
o When the higher part is loaded, also fill in the guards

Load one half of ACRn and clear the other half.
o When the higher part is loaded, also fill in the guards

Load both the lower and the higher part of ACRn.
o Fill in the guards using the sign of the higher part


Move data to MAC


No.  Specification on the result
L1   ACRn <= {8{A[15]}, A[15:0], ACRn[15:0]}   // keep lower part
L2   ACRn <= {8{A[15]}, A[15:0], 16'h0000}     // clear lower part
L3   ACRn <= {ACRn[39:16], A[15:0]}            // keep higher part
L4   ACRn <= {24{A[15]}, A[15:0]}              // sign-extend into higher part
L5   ACRn <= {8{A[15]}, A[15:0], B[15:0]}      // load A and B from RF
L6   ACRn <= {A[7:0], ACRn[31:0]}              // restore guards
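
The bit-field concatenations in this table can be tried out on plain integers. The sketch below holds ACRn as a 40-bit unsigned bit pattern and implements L1 to L4 and L6; the function names are hypothetical, and L5 is omitted because it follows the same pattern as L1 with B in place of ACRn[15:0].

```python
MASK40 = (1 << 40) - 1


def sign_fill(word16, count):
    """Return `count` copies of bit 15 of word16 as an integer bit field."""
    return ((1 << count) - 1) if (word16 >> 15) & 1 else 0


def load_high_keep_low(acr, a):         # L1
    a &= 0xFFFF
    return (sign_fill(a, 8) << 32) | (a << 16) | (acr & 0xFFFF)


def load_high_clear_low(a):             # L2
    a &= 0xFFFF
    return (sign_fill(a, 8) << 32) | (a << 16)


def load_low_keep_high(acr, a):         # L3
    return (acr & MASK40 & ~0xFFFF) | (a & 0xFFFF)


def load_low_sign_extend(a):            # L4
    a &= 0xFFFF
    return (sign_fill(a, 24) << 16) | a


def restore_guards(acr, a):             # L6
    return ((a & 0xFF) << 32) | (acr & 0xFFFFFFFF)


if __name__ == "__main__":
    print(hex(load_high_keep_low(0x1234, 0x8001)))
    print(hex(load_low_sign_extend(0x8000)))
```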


Move data to MAC: Logic


MAC Modified


Move data from MAC

No.  Specification on the result
1    Rn <= ACRn[31:16]                                          // Rn is a register in RF
2    Rn <= ACRn[15:0]                                           // Rn is a register in RF
3    Rn <= ACRn[31:16]; Rn+1 <= ACRn[15:0]                      // Rn and Rn+1 in RF
4    M1 <= ACRn[31:16]; M2 <= ACRn[15:0]                        // M1, M2: memories
5    Rn <= ACRn[31:16]; Rn+1 <= ACRn[15:0]; Rn+2 <= ACRn[39:32]
6    Rn <= {8'h00, ACRn[39:32]}                                 // guard to register file RF


MAC integration


Flags in MAC
Control code is usually implemented using ALU instructions
Flags in the MAC are not used much
o Mainly for exceptions

The MAC has
o Saturation flag (FMO)
o Sign flag (FMS)
o Zero flag (FMZ)


Data operation sequence


Very important: the steps are carried out in this order
o Add guard bits
o Operation (iteration) and scaling
o Round after iteration
o Saturation and removing guard bits
o Truncation and output
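
The order of these steps can be traced on a small fractional dot product. The sketch below assumes Q15 operands, a 32-bit result widened by 8 guard bits, and rounding by adding 2^15 before the upper word is taken; none of these widths is fixed by this slide, and the function name `dot_q15` is hypothetical. Scaling is left out to keep the sketch short.

```python
def dot_q15(xs, cs, guard_bits=8):
    """Q15 dot product in the order: guard, iterate, round, saturate, truncate."""
    acc_bits = 32 + guard_bits
    acc_min, acc_max = -(1 << (acc_bits - 1)), (1 << (acc_bits - 1)) - 1
    acc = 0                                     # accumulator widened by guard bits
    for x, c in zip(xs, cs):                    # iteration
        acc += 2 * x * c                        # fractional product in Q31
        assert acc_min <= acc <= acc_max, "guard bits exhausted"
    acc += 1 << 15                              # round once, after the iteration
    acc = max(-(1 << 31), min((1 << 31) - 1, acc))  # saturate, drop guard bits
    return acc >> 16                            # truncate to the 16-bit output


if __name__ == "__main__":
    big = 0x7FFF
    # The running sum exceeds 32 bits after two terms, then comes back in range.
    print(dot_q15([big, big, -0x8000], [big, big, big]))
```

Saturating after every accumulation instead would clip the intermediate sum in the example and give a different final result, which is why saturation is deferred until the guard bits are removed.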


Physical critical path


What is physical critical path?


Physical critical path

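[Figure: physical critical path on the MAC input side. Sources shown: D-mem 1, D-mem 2, D-mem 3, D-mem 4, RF OPB (32-to-1), RF OPA (32-to-1) and a constant. Annotations: "Long wires" (twice), "As MAC input", "Very heavy fan out here!"]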


Physical critical path

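[Figure: physical critical path on the MAC output side. Labels: registers ACR1, ACR2, ACRm, ACRn with register select logic; "Heavy fanout for MAC internal logic"; "Long wire"; data select logic with inputs "From a port" and "From RF"; data memory.]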


Pipeline

[Figure: three MAC pipelining options. Blocks shown: multiplier, accumulator, ACR, flag circuit. (a) MAC in one clock cycle. (b) MAC using two clocks. (c) MAC using three clocks.]


Example :: MAC Design


Design a MAC unit capable of the following operations:
o OP0: No operation
o OP1: ACR = 0
o OP2: ACR = A * B (fractional multiplication, signed)
o OP3: ACR = A * B + ACR (fractional multiplication, signed)
o OP4: ACR = 1.25 * ACR (scaling)
o OP5: Load ACR with a fractional value from a register
o OP6: ACR = SATURATE(ROUND(ACR))
o OP7: RF = ACR[7:0]
o OP8: RF = ACR[15:8]
o OP9: RF = SIGNEXTEND(ACR[19:16])

Constraints:
o A and B are 8 bits; registers are 8 bits.
o ACR is 20 bits (including 4 guard bits).
o Only one multiplier may be used. Select the smallest multiplier that suffices, and annotate whether it is signed or unsigned.
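
Before drawing hardware, it can help to pin down the intended arithmetic with a behavioural model. The sketch below covers only a subset of the operations and is not the solution from the slides. It rests on several assumptions: A and B are Q7 values held as signed integers, the fractional product is 2 * A * B in Q15, the 1.25 scaling is done as ACR + ACR/4 by shift and add, and ROUND adds 0x80 so that the result can be read from ACR[15:8]. The class name `Mac8` and its method names are hypothetical.

```python
def wrap20(value):
    """Wrap value into the signed 20-bit ACR range (16 bits plus 4 guard bits)."""
    value &= 0xFFFFF
    return value - (1 << 20) if value & 0x80000 else value


class Mac8:
    """Behavioural sketch of part of the example MAC; not a hardware design."""

    def __init__(self):
        self.acr = 0

    def clear(self):                    # OP1
        self.acr = 0

    def mul(self, a, b):                # OP2: fractional Q7 x Q7 -> Q15
        self.acr = wrap20(2 * a * b)

    def mac(self, a, b):                # OP3
        self.acr = wrap20(self.acr + 2 * a * b)

    def scale_1_25(self):               # OP4, assumed as ACR + ACR / 4
        self.acr = wrap20(self.acr + (self.acr >> 2))

    def round_sat(self):                # OP6, assumed rounding to ACR[15:8]
        rounded = self.acr + 0x80
        self.acr = max(-0x8000, min(0x7FFF, rounded))

    def read_high(self):                # OP8: ACR[15:8]
        return (self.acr >> 8) & 0xFF


if __name__ == "__main__":
    m = Mac8()
    m.mul(64, 64)                       # 0.5 * 0.5
    m.mac(64, 64)                       # + 0.5 * 0.5
    m.scale_1_25()
    m.round_sat()
    print(m.read_high())                # Q7 result byte
```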


Example :: MAC Design


Example :: MAC Design


The End :: Thank you for your attention

Questions?
