You are on page 1of 60

VLSI Programming: Lecture 2

Course 2IN35
Course:

Lab:

Kees van Berkel

c.h.v.berkel@tue.nl

Rudolf Mak

r.h.mak@tue.nl

Kees van Berkel


Rudolf Mak
Alok Lele, Hrishikesh Salunkhe

www:

http://www.win.tue.nl/~cberkel/2IN35/

Lecture 2

pipelining, retiming, J-slow, parallel

4/23/2013

VLSI Programming: time table 2013


date

hour 5
hour 6
out
introduction, DSP representations,
bounds
April 23
unfolding (cntd), look-ahead,
May 7 T1 + T2 strength reduction
T3 + T4

hour 7
hour 8
out
pipelining, retiming, transposition,
J-slow, unfolding
T1 + T2
(have FPGA tools installed)
FPGA + Verilog intros
L1
FPGA lab/L1: audio filter
simulation
May 14 T3 + T4 systolic computation
T5
May 21 T5
folding
FPGA lab/L2: audio filter on XUP board
L2
May 28
DSP processors FPGA lab/L3: sequential FIR, strength-reduced FIR
L3
May 30
FPGA lab/L3: sequential FIR, strength-reduced FIR (cntd)
FPGA lab/L4: audio sample rate convertor
June 4 L3
deadline report L3
L4
June 6
FPGA lab/L4: audio sample rate convertor (cntd)
FPGA lab/L5: audio sample rate convertor "1024x"
June 11 L4
deadline report L4
L5
June 13
FPGA lab/L5: audio sample rate convertor "1024x" (cntd)
in

June 18 L5

deadline report L5

4/23/2013

FPGA IC on a Xilinx XUP Board (Atlys)

FPGA
Xilinx
Spartan
6

4/23/2013

Atlys board, based on Xilinx Spartan 6

FPGA
Xilinx
Spartan
6
4
4/23/2013

Preparation for Lab work

Prepare your notebook for lab work

See preparation link on 2IN35 web-site

Install the required tools and test them


2 weeks from now (May 7):
Hrishikesh and Alok will be around for Q&A

First Lab exercises: Tue May 7

Find a partner (team size is max 2)

4/23/2013

Note on course literature


Lectures VLSI programming are loosely based on:

Keshab K. Parhi. VLSI Digital Signal Processing Systems, Design and


Implementation. Wiley Inter-Science 1999.
This book is recommended, but not mandatory

Accompanying slides can be found on:

http://www.ece.umn.edu/users/parhi/slides.html
http://www.win.tue.nl/~cberkel/2IN35/

Mandatory reading:
reading

Edward A. Lee and David G. Messerschmitt. Synchronous Data


Flow. Proc. of the IEEE, Vol. 75, No. 9, Sept 1987, pp 1235-1245.

Keshab K. Parhi. High-Level Algorithm and Architecture


Transformations for DSP Synthesis. Journal of VLSI Signal
Processing, 9, 121-143 (1995), Kluwer Academic Publishers.
4/23/2013

Outline Lecture 2
Transformations of DFGs and SFGs:

(commuting of an SFG)

lecture 1

pipelining of a DFG

Parhi3.pdf

transposition of an SFG

Parhi3.pdf

retiming of a DFG

Parhi4.pdf

K-slow transformation of a DFG

Parhi4.pdf

unfolding of a DFG

Parhi5.pdf

assignments

Parhi3.pdf

4/23/2013

4/23/2013

car assembly line; Henry Ford [1908]


1914: Ts = 3min; latency = 93 min

4/23/2013

10

4/23/2013

11

4/23/2013

12

4/23/2013

13

4/23/2013

every

by 0

14

4/23/2013

15

4/23/2013

16

4/23/2013

17

4/23/2013

18

4/23/2013

19

4/23/2013

20

4/23/2013

Retiming and pipelining

21

Review slides Parhi3.pdf

Parhi follows a graph-theoretic approach to compute


optimal pipelining/retiming

For our purposes moving delays around is sufficient:


Node retiming (Parhi4.pdf, slide 2)
Introduction of a delay at all inputs (or all outputs)

4/23/2013

Parhi 95, Fig 3a


1

2
Critical path is 10 time units long
(transposed version: 8 time units)
22

4/23/2013

Parhi 95, Fig 3a / retiming step 1

Critical path is 10 time units long


23

4/23/2013

Parhi 95, Fig 3a / retiming step 2

Critical path is 10 time units long


24

4/23/2013

Parhi 95, Fig 3a / retiming step 3

Critical path is 7 time units long


25

4/23/2013

Parhi 95, Fig 3a / retiming step 4

Critical path is 7 time units long


26

4/23/2013

Parhi 95, Fig 3a / retiming step 5

Critical path is 4 time units long


27

4/23/2013

Parhi 95, Fig 3a / retiming step 6

Critical path is 4 time units long


28

4/23/2013

Parhi 95, Fig 3a / retiming step 7

Critical path is 3 time units long


29

4/23/2013

Parhi 95, Fig 3a / retiming step 8

Critical path is 3 time units long


30

4/23/2013

Parhi 95, Fig 3a / retiming step 9

Critical path is 2 time units long


31

4/23/2013

Parhi 95, Fig 3a after retiming = Fig 3b

Critical path is 2 time units long


32

4/23/2013

33

4/23/2013

34

4/23/2013

35

4/23/2013

36

4/23/2013

37

4/23/2013

38

4/23/2013

39

4/23/2013

x(2(k-1))

x(10(k-1))

40

4/23/2013

Unfolding, L=2

Parhis paper, Fig 1/2, paper p123/124


y(n) = ax(n) + bx(n-1) + cx(n-2)
y(2k) = ax(2k) + bx(2k-1) + cx(2k-2)
y(2k+1) = ax(2k+1) + bx(2k) + cx(2k-1)
Rewrite all indices in equations to the form
(L(k - i) + j), with 0 j < L
y(2k) = ax(2k) + bx(2(k-1)+1) + cx(2(k-1))
y(2k+1) = ax(2k+1) + bx(2k) + cx(2(k-1)+1)
41

4/23/2013

= Fig 2

Unfolding, L=3

Same FIR
y(3k) = ax(3k )
y(3k+1) = ax(3k+1)
y(3k+2) = ax(3k+2)

+ bx(3k-1)

+ cx(3k-2)

+ bx(3k )

+ cx(3k-1)

+ bx(3k+1)

+ cx(3k )

Rewrite all indices in equations to the form


(L(k - i) + j), with 0 j < L
y(3k) = ax(3k )
y(3k+1) = ax(3k+1)
y(3k+2) = ax(3k+2)
42

+ bx(3(k-1)+2)+ cx(3(k-1)+1)

4/23/2013

+ bx(3k )

+ cx(3(k-1)+2)

+ bx(3k+1)

+ cx(3k

43

4/23/2013

44

4/23/2013

45

4/23/2013

46

4/23/2013

47

4/23/2013

48

4/23/2013

Parhi 5, slide 2

Original program:

y(n)

= a x(n) + b y(n-2)

2-unfolded version y(2k)

= a x(2k) + b y(2k-2)
y(2k+1) = a x(2k+1) + b y(2k-1)

Rewrite all indices in equations to the form


(L(k - i) + j), with 0 j < L
2-unfolded version y(2k)

= a x(2k) + b y(2(k-1))
y(2k+1) = a x(2k+1) + b y(2(k-1)+1)

49

4/23/2013

Parhi 5, slide 3 (Fig 5.3, pp 123)

Original program:

v(n)

4-unfolded version v(4k)

v(4k+1)

v(4k+2)

v(4k+3)
4-unfolded,

50

= u(n-37)
= u(4k-37)
= u(4k-36)
= u(4k-35)
= u(4k-34)

v(4k)

= u(4(k-10) +3)

v(4k+1)

= u(4(k-9))

v(4k+2)

= u(4(k-9)+1)

v(4k+3)

= u(4(k-9)+2)

4/23/2013

51

4/23/2013

Parhi5, slide 4 (Fig 5.4, pp 123)

v(n)
v(3k)
v(3k+1)
v(3k+2)
v(3k)
v(3k+1)
v(3k+2)

= u(n-1)

+ t(n-6)

+ v(n-12)

= u(3k-1)

+ t(3k-6)

+ v(3k-12)

= u(3k)

+ t(3k-5)

+ v(3k-11)

= u(3k+1)

+ t(3k-4)

+ v(3k-10)

= u(3(k-1)+2) + t(3(k-2))
= u(3k)

+ t(3(k-2)+1) + v(3(k-4)+1)

= u(3k+1)

+ t(3(k-2)+2) + v(3(k-4)+2)

= Fig 5.4b
52

+ v(3(k-4))

4/23/2013

53

4/23/2013

Parhi5, slide 6 (Fig 5.6, pp 129)

u(n)

= p(n)

+ (s*u(n-3) + t*u(n-2))

u(2k) = p(2k) + (s*u(2k-3) + t*u(2k-2))


u(2k+1) = p(2k+1) + (s*u(2k-2) + t*u(2k-1))
u(2k) = p(2k) + (s*u(2(k-2)+1) + t*u(2(k-1))
u(2k+1) = p(2k+1) + (s*u(2(k-1)) + t*u(2(k-1)+1)

54

4/23/2013

55

4/23/2013

56

4/23/2013

FIR assignment

Consider FIR: y(n) = a*x(n) + b*x(n-1) + c*x(n-3)


Assume add and multiply times: 2 and 5 nsec resp.

1. Draw DFG of FIR, calculate throughput.


2. Pipeline and retime FIR for maximal throughput.
3. Unfold FIR J=2; draw the unfolded DFG. Throughput?
4. pipeline and retime unfolded FIR; draw DFG. Throughput?
5. Same for J=3 (draw DFG), and J=16 (no need to draw DFGs).
Throughput?

57

Return deadline: Tuesday May 7, 13:45

4/23/2013

IIR assignment

Consider IIR: y(n) = x(n) + a*y(n-2)


Assume add and multiply time: 2 and 5 nsec resp.

1. Draw DFG of IIR, calculate throughput.


2. Pipeline and retime IIR for maximal throughput.
3. Unfold IIR J=2; draw the unfolded DFG. Throughput?
4. pipeline and retime unfolded IIR; draw DFG. Throughput?
5. Same for J=3 (draw DFG), and J=16 (no need to draw DFGs).
Throughput?

58

Return deadline: Tuesday May 7, 13:45

4/23/2013

VLSI Programming: Feb 28

59

Parhi,
More unfolding, parallelism
Strength reduction

4/23/2013

THANK YOU

You might also like