You are on page 1of 38

FPGA Co-Processing

for DSP
Agenda
„ Applications of FPGA-Based Co-Processors for
DSP
„ Development Tools & Methodologies Available
from Altera to Build Co-Processors
− SOPC Builder
− DSP Builder
„ FPGA Co-Processor Development Examples
− QAM Modulator
− FIR Filter
Growing Demand for MIPS &
Memory Bandwidth
Application
„ Growth Drivers Requirements
− Algorithm Complexity
− Security
Processing (DSP) MIPS &

− Multiple Users Digital Signal


Memory Bandwidth

Processors
Digital Signal

Time
DSP System Architecture Options

DSP DSP DSP DSP

DSP DSP DSP DSP


Performance (MACs/Sec)

DSP
DSP DSP DSP DSP

DSP DSP DSP DSP


DSP

Stand-Alone Processor Array Processor + Dedicated Hardware


Processor Co-Processor Architecture
Exploring the DSP Design Space
Custom Altera Th
FPGA e
ASIC
Algorithm Fl
ex
in Hardware ib
i li
ty
Altera Zo
Digital Signal
FPGA ne
Processor
Co-Processor
Performance

Altera
Conventional
FPGA
Digital Signal Processor
Co-Processor
Processor
with
Altera
Co-Processor FPGA with
Nios &
Co-Processor

Digital Signal
Processor
Farm Digital Signal
Processor Conventional
Processor

Flexibility
Co-Processing on FPGAs

Processor
Processor on
on FPGA
FPGA Processor
Processor External
External to
to FPGA
FPGA

FPGA
FPGA FPGA

FIR FIR
Processor
Processor IQ IQ
Map Map

NCO NCO
Memory
Memory Memory
Memory

Memory
Memory
When Do FPGA Co-Processors
Reduce System Cost?
„ Off-Loading Algorithms to Co-Processor Reduces
Number or Cost of Digital Signal Processors
DSP DSP DSP DSP
Expensive Array
DSP DSP DSP DSP
to DSP + FPGA
DSP DSP DSP DSP FIR
DSP DSP DSP DSP DSP

$$$ DSP IQ
Map
Expensive DSP to
Inexpensive DSP + FPGA NCO
Memory
Memory Memory
Memory

„ Applications
− Algorithms with Large Amount of Digital Signal Processing &
Small Amount of Control Processing
FPGA Co-Processor Applications
„ Wireless „ Wireline Communications
− 2.5-G EDGE Equalization − Encryption
− 3-G Baseband Processing − Framer
z HSDPA − Traffic Management
z 1xEVDV − TCP/IP
− 3-G RF Linearization
„ Computer & Storage
„ Consumer − Data Analysis & Routing
− Broadcast - Studio & Cable Engine
Plant − Digital Imaging
− Digital Entertainment –
„ Military & Aerospace
MPEG2 & MPEG4
„ Security
„ Medical
− Imaging
DSP Development
Tools & Design
Methodology
FPGA Co-Processor Design Tools
SOPC Builder DSP Builder

Stand-Alone Processor + Co- Dedicated Hardware


Processor Processor Architecture
DSP Builder Overview
DSP Builder
Creates Creates Creates Download Verify
HDL Code Simulation Test Bench Processor Design to in
Plug-In Development Hardware
Board

HDL
Synthesis
DSP Builder Library Components
„ Arithmetic
„ Bus Manipulation
„ Complex Signals
„ Logical Components
„ SOPC Ports
„ Storage
„ MegaCore® IP
„ Rate Change
„ State Machine
„ Altera Library
„ DSP Board
MegaCore IP in DSP Builder

DSP
DSP Builder
Builder
MegaCore®® Functions
MegaCore Functions
•• FIR
FIR
•• FFT
FFT
•• Viterbi
Viterbi
•• Turbo
Turbo
•• Reed
Reed Solomon
Solomon
•• NCO
NCO
DSP Builder Support for Multiple
Clock Domains
„ Inherit Sampling Frequency (FS)
„ Adhere to Clock Design Rules
„ All Sample Times Match One of
Phase-Locked Loops (PLLs)
Output Clock Periods
„ Simplify Analysis &
Implementation of
Multi-Rate System
− Up Sampling
− Down Sampling
State Machine Builder
Sub-System Builder
„ Import Existing VHDL Design into Simulink
„ Simulink Simulation Options
− Convert into DSP Builder Blocks or MATLAB Functions
− Treat VHDL Design as Black Box
Signal Compiler
„ Generates VHDL Design Files
„ Generates Tool
Command Language
(Tcl) Scripts
„ Generates Testbench
„ Enables Parameterization
of IP Blocks
„ Launch Hardware
Compilation from
Simulink Cockpit
Stratix DSP Development Board
80-Pin I/O Connector

Altera® Nios®
Expansion
14-Bit, Prototype
165-MHz D/A Connector

Prototyping
Area
12-Bit,
125-MHz A/D Push-Button
Switches
SMA
Connector
9-Pin RS232
5-V Power
Connector
Supply

40-Pin Connector for Analog LEDs Texas Instruments Connectors Can


Devices Evaluation Boards Be Found on Underside of Board
Altera DSP Development Kits
Contains Everything You Need
to Develop High-Performance DSP
Designs on FPGAs

30-Day Evaluation
Version

System Reference
Designs
QAM Modulator Co-
Processor Design
Example
Modem Reference Design
„ Installed with Code Composer Studio As a Tutorial
− C:\ti\tutorial\dsk6711\modem
„ Used to Demonstrate
CCS Functionality
„ 16 QAM TX Modem
Modem Design Code Profile
Modem Design Hardware/ Software
Petition
Main
100%

Initialize Add Noise Signal


3% 0.5%

Hardware
Accelerator
Modem
Transmitter Shaping Filter 82%
96.5%
Modulation 8%

Sine Lookup 2.5%

Cosine Lookup 2.5%


Modulator Co-Processor
DSP Builder Used to Build Hardware DSP Data Path
Avalon™ Interface in SOPC Builder
DSP Builder— SOPC Builder Import
Import DSP Builder Generated Co-Processor
into SOPC Builder
SOPC Builder Integration
Modem Co-Processor (fir_comp)
Integrated with TI EMIF I/F (TIMaster)
FIR Filter Co-Processor
Design Example
Driving Down System Costs

Multi-Processing Digital Signal Processor +


DSP FPGA Co-Processor

FPGA
FPGA
DSP DSP FIR
DSP DSP FIR
DSP DSP
DSP DSP DSP
Memory
Memory
FIR Co-Processor Design
Example
„ FIR Parameters
− 128-Tap
− 16-Bit Data, 14-Bit Coefficients
„ Four FIR Implementations for Comparison
− TI C6711-Optimized TI DSPLib Function
− TI C6416-Optimized TI DSPLib Function
− Altera Eight-Cycle FIR Co-Processor
− Altera One-Cycle FIR Co-Processor
TI Filtering Library (DSP Lib)
„ C-CallableOptimized Assembly Routines
„ TI C67x DSPLib: Fir Filter (Radix 8)
− Formula: Nh * Nr /2 + 13
z Nh = Number of Coefficients
z Nr = Number of Samples

− ~1 Sample/ 64 Cycles (128 Tap Filter)


„ TI C64x DSPLib: FIR Filter (Radix 8)
− Formula: Nh * Nr/4 + 17
− ~ 1 Sample/ 32 Cycles (128 Tap Filter)
Filter Co-Processor Design
Example
„ Current Implementation
− 100 MHz, 32-Bit, Asynchronous EMIF on DSK
− TI Writes 300 Samples to Co-Processor (Input Data)
− Filter & Send Output to TI

C6000

Input FIFO

Compiler
Digital
Memory I/F

TI I/F Core
External

FIR
Signal
QDMA

Processor

FPGA Co-Processor
FIR Filter Example* – 16X
Cost/Performance Improvement
Device Solution FIR Performance Device Cost per
(MHz) Cost**** FIR MHz

TI C6713-200 64-Cycles** 3.125 $24.59 $7.87


at 200 MHz
TI C6416-600 32-Cycles** 18.75 $160 $8.53
at 600 MHz
Altera 7-Cycles*** 28 $14 $0.50
EP1C3-8 at 200 MHz
Altera 1-Cycle*** 170 $84 $0.49
EP1C12-8 at 170 MHz
** FIRFIR 128
128 Tap,
Tap, 16-Bit
16-Bit Data,
Data, 14-Bit
14-Bit Coefficients
Coefficients
**
** DSPLib Optimized Assembly Libraries from
DSPLib Optimized Assembly Libraries from Texas
Texas Instruments
Instruments
***
*** Optimized
Optimized MegaCore
MegaCore FIR
FIR Compiler
Compiler from
from Altera
Altera
****
**** Pricing
Pricing in
in Quantity
Quantity of
of 100
100 at
at Arrow
Arrow 6/25/03
6/25/03
14X Reduction in System Costs
FPGA
FPGA
DSP DSP
DSP DSP FIR
FIR
DSP DSP
DSP DSP DSP
Memory
Memory

Architecture FIR Total FIR Device Total


Performance Performance Costs Cost
(MHz) (MHz)
9 * TI C6416-600 9 * 18.75 167 9* $1,440
$160
Altera EP1C12-8 + 170 + 3 173 $84 + $110
1 TI c6713-200 $25
Summary
„ FPGA Co-Processors for DSP Offer Many
Advantages
− 10X Performance Boost
z More Channels
z More Complex Algorithms
z Increased System Throughput
− 10X Cost Reduction
z Fewer Components
− Complementary to DSP-Based Systems
z Offloads Existing DSP
z Integrates Into Existing DSP IDE
z Evolution Not Revolution
Altera Code: DSP Solutions
„ FPGAs
− Stratix, Stratix GX, Cyclone
„ Development Tools
− DSP Builder, SOPC Builder
„ Intellectual Property
− FIR, FFT, Viterbi, Turbo, Reed Solomon, NCO
− AMPP Third-Party Partners
„ Development Kits
− Altera
− Third-Parties
„ Design Services
− ACAP Third-Party Partners
„ Training