
12/29/2017 The Most Under-rated FPGA Design Tool Ever | EE Times


The Most Under-rated FPGA Design Tool Ever
Michael Parker & Mark Jervis, Altera
9/10/2015 01:50 PM EDT

There is a design tool that is being quietly adopted by FPGA
engineers because, in many cases, it produces results that are
better than hand-coded counterparts.

FPGAs keep getting larger, the designs more complex, and the
need for high level design (HLD) flows never seems to go away.
C-based design for FPGAs has been promoted for over two decades
and several such tools are currently on the market. Model-based
design has also been around for a long time from multiple vendors.
OpenCL for FPGAs has been getting lots of press in the last couple
of years. Yet, despite all of this, 90+% of FPGA designs continue to
be built using traditional Verilog or VHDL.

No one can deny the need for HLD. New FPGAs contain over 1
million logic elements, with thousands of hardened DSP and
memory blocks. Some vendors' devices can even support
floating-point as efficiently as fixed-point arithmetic. Data
converter and interface protocols routinely run at multiple GSPS
(giga samples per second), requiring highly parallel or vectorized
processing. Timing closure, simulation, and verification become
ever-more time-consuming as design sizes grow. But HLD adoption
still lags, and FPGAs are primarily programmed by hardware-centric
engineers using traditional hardware description languages (HDLs).

The primary reason for this is quality of results (QoR). All high-level
design tools have two key challenges to overcome. One is to
translate the designer's intent into implementation when the design
is described in a high-level format. This is especially difficult when
software programming languages are used (C++, MATLAB, or
others), which are inherently serial in nature. It is then up to the
compiler to decide by how much and where to parallelize the
hardware implementation. This can be aided by adding special
intrinsics into the design language, but this defeats the purpose
of a high-level description. OpenCL addresses this by having the
programmer describe serial dependencies in the datapath, which is
why OpenCL is often used for programming GPUs. It is then up to
the OpenCL compiler to decide how to balance parallelism against
throughput in the implementation. However, OpenCL programming
is not exactly a common skillset in the industry.

The second key challenge is optimization. Most FPGA hardware
designers take great pride in their ability to optimize their code to
achieve the maximum performance in a given FPGA, in terms of
design Fmax, the maximum achievable frequency of the system
clock. This requires closing timing across the entire design, which
means setup and hold times have to be met for every circuit in the
programmable logic and every routing path in the design. The
FPGA vendors provide automated synthesis, fitting, and routing
tools, but the achievable results are heavily dependent upon the
quality of the Verilog and/or VHDL source code. This requires both
experience and design iteration. The timing closure process is
tedious and sometimes compared to "Whack-a-Mole," meaning that
when a timing problem is fixed in one location of the design, a
different problem often surfaces at another location.

An oft-quoted metric for a high-level design tool is to achieve
results that are no more than 10% degraded from a high-quality
hand-coded design, both in terms of Fmax and the utilization of
FPGA resources, typically measured in "LEs" (logic elements) or
"LCs" (logic cells). In practice, very few tools can reliably deliver
such results, and there is considerable skepticism among the
FPGA design community when such a tool is promoted by EDA or
FPGA vendors.

Having said this, there is a design tool that is being quietly adopted
by FPGA engineers precisely because it not only closes this
QoR gap but, in most cases, reverses it, meaning that the tool
produces results that are usually better than their hand-coded
counterparts.

Figure 1. Simulink high level design to optimized hardware.

This tool is called DSP Builder Advanced Blockset (the marketing
folks were obviously not at their best when naming this tool). This is
a model-based design tool, meaning that design entry is
accomplished using models in the MathWorks Simulink
environment. The tool was first introduced to the market in 2007.

There are other model-based tools on the market, such as HDL
Coder, Synplify, and System Generator; however, only DSP Builder
Advanced Blockset offers the combination of the following ten
features:

Decoupling of system data rates from FPGA clock rates; native multi-channel capabilities.
Automated timing closure at high Fmax, including auto-pipelining.
Deterministic latency and data throughput.
Optimal usage of FPGA hard block features.
Design portability across FPGA families.
Fixed- or floating-point numerical implementation.
Support for vector manipulation.
Math.h library.
System simulation in the MathWorks environment.
Hardware simulation from the MathWorks environment.

This combination is what allows the tool to deliver superior QoR
along with the productivity advantages of a high level simulation,
design, and verification tool flow. Let's look at each of these
features in a little more detail...

Decoupling of system data rates from FPGA clock rates


Using DSP Builder, the user specifies the desired design clock rate.
The data rate can be higher or lower than the clock rate, sometimes
dramatically so. The tool will automatically parallelize the data and
represent the data buses as vectors in cases where the data rate is
higher than the clock rate. Integer ratios work most efficiently (4, 8,
12, 16, 32...) but any ratio will work and the control path will insert
empty data into some of the vectors to accommodate this.

This capability supports very high data rates of many GSPS using
realistic FPGA clock rates of several hundred MHz, depending upon
the FPGA family.
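
As a rough illustration of the ratio described above, the number of
samples handled per clock can be estimated by dividing the data rate
by the clock rate and rounding up; when the ratio is not an integer,
some vector slots carry empty data. The Python sketch below is only a
back-of-the-envelope model under that assumption. The function names
and the example rates are hypothetical and are not part of DSP Builder.

```python
import math


def vector_lanes(data_rate_msps: float, clock_mhz: float) -> int:
    """Samples that must be handled per clock cycle (at least one)."""
    return max(1, math.ceil(data_rate_msps / clock_mhz))


def lane_utilization(data_rate_msps: float, clock_mhz: float) -> float:
    """Fraction of vector slots that carry real (non-empty) samples."""
    lanes = vector_lanes(data_rate_msps, clock_mhz)
    return data_rate_msps / (lanes * clock_mhz)


if __name__ == "__main__":
    # Illustrative numbers only: (data rate in MSPS, clock in MHz).
    for rate, clock in [(4000, 500), (3000, 400)]:
        print(rate, clock, vector_lanes(rate, clock),
              lane_utilization(rate, clock))
```

With these example figures, a 4000 MSPS stream at 500 MHz needs 8
lanes with every slot filled, while 3000 MSPS at 400 MHz also needs 8
lanes but leaves some slots empty.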

Figure 2. From FFT, to parameter file, to parameterizable design



Within DSP Builder, the designer builds the datapath, often
containing various rate FIR filters, memory blocks, NCOs, mixers,
saturate and round blocks, and so forth. However, the designer
need only lay down a single channel datapath assuming the design
clocks at the required rate, regardless of the actual data rate. DSP
Builder will build the data path with the specified number of
channels, and vectorize (or parallelize) the design to achieve the
needed data throughput. This is specified in a parameter file, which
means it is easily changed, with the only effort being a recompile.
The tool generates all needed control logic to handle multi-channel
and higher data rates, even for complex datapaths. Further, all
configuration or coefficient registers can be read or written, with the
addressing and accessing logic auto-generated. This register-access
logic operates at a lower clock rate than the datapath.
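
To make the parameter-driven idea concrete, here is a simplified
Python sketch of how a channel count and per-channel sample rate
might translate into parallel lanes and time-shared channels. It is
an assumed model for illustration only: `DatapathParams`, `plan`, and
the field names are hypothetical and do not reflect the tool's actual
parameter file format or its internal algorithm.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DatapathParams:
    """Hypothetical stand-in for the parameter file described above."""
    clock_mhz: int
    sample_rate_msps: int  # per-channel sample rate
    channels: int


def plan(p: DatapathParams) -> dict:
    """Estimate lanes and channels per lane for the given parameters."""
    total_msps = p.sample_rate_msps * p.channels
    # Ceiling division: lanes needed so that lanes * clock >= total rate.
    lanes = -(-total_msps // p.clock_mhz)
    # Channels that share one lane by time-division multiplexing.
    channels_per_lane = -(-p.channels // lanes)
    return {"lanes": lanes, "channels_per_lane": channels_per_lane}


if __name__ == "__main__":
    print(plan(DatapathParams(clock_mhz=400, sample_rate_msps=50, channels=8)))
    print(plan(DatapathParams(clock_mhz=400, sample_rate_msps=50, channels=16)))
```

In this model, changing `channels` from 8 to 16 moves the plan from
one lane to two without touching the single-channel description,
which is the spirit of editing a parameter and recompiling.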

Continue reading the next page on EE Times' sister site,
Embedded.com: Model-based FPGA design tool quietly gains
adherents

Comments

Re: Love HLS
traneus 12/28/2017 8:17:38 PM

ost wrote: "You probably don't care about latency." For virtual
reality (VR) and augmented reality (AR), latency is critical.

Re: integer multiply
traneus 12/28/2017 8:14:45 PM

I looked up in "The Designer's Guide to VHDL, Third Edition" by
Peter J. Ashenden, page 36, and found that the integer multiply
operator * returns a value of the same type as the operands.
Thus, to multiply two 32-bit integers together and get the 64-bit
product, I have two choices: 1. Build the multiplier manually
using the resources in the FPGA (which might be not too bad
nowadays). 2. Copy the 32-bit operands to 64-bit operands,
multiply using *, and get the desired 64-bit result (which might
synthesize a 64x64 multiplier, which would not be too bad
nowadays). The last time I had to do this (in 1998), I was
programming a 64-bit MIPS microprocessor in assembly
language. Sign-extending the operands to 64 bits, doing a
64x64 multiply, and keeping the result, was faster than doing a
32x32 multiply and combining the two halves of the product into
one 64-bit register. The 32x32 multiply was faster than the
64x64 multiply, but combining the halves of the product was
slow.
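
The second option in the comment above (sign-extend to 64 bits,
multiply, keep the 64-bit result) can be checked numerically. The
Python sketch below emulates fixed-width registers with bit masks,
since Python integers are arbitrary precision; the helper names are
made up for this illustration and say nothing about what a given
synthesizer would infer.

```python
MASK64 = (1 << 64) - 1


def to_signed(value: int, bits: int) -> int:
    """Interpret the low `bits` bits of value as two's complement."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value


def mul32_via_64(a_bits: int, b_bits: int) -> int:
    """Sign-extend two 32-bit patterns, multiply, keep the low 64 bits."""
    a64 = to_signed(a_bits, 32) & MASK64  # sign-extended 64-bit pattern
    b64 = to_signed(b_bits, 32) & MASK64
    return to_signed((a64 * b64) & MASK64, 64)


if __name__ == "__main__":
    # The wrapped 64-bit result matches the exact signed product.
    for a, b in [(0xFFFFFFFF, 2), (0x80000000, 0x80000000)]:
        exact = to_signed(a, 32) * to_signed(b, 32)
        print(mul32_via_64(a, b) == exact)
```

The low 64 bits of the widened multiply are enough because the exact
product of two signed 32-bit values always fits in 64 bits.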

Love HLS
ost 12/28/2017 5:20:58 PM

I've been using HLS a bit and I love it. I do not ever want to go
back to VHDL, which I find more like a [horrible] connecting
language than functional. As an example, I can simulate video
output rendering on screen in a bitmap window instead of
looking at waves and numbers, comparing with an ideal file.
And I can keep the numbers and put breakpoints in there too if
I want to look at the math for one specific pixel. When I am
happy with the result, the code needs to be "massaged" so it
synthesizes like you want it, and if I worry something got
broken, I can just rerun a visual testbench. For those working
with video, imagine being able to change bus width between 1, 2
or 4 pixels per clock with a #define. You probably don't care
about latency, and the HLS synthesizer adds what it needs to get
the job done at the chosen speed. And the output code is free of
redundant LUTs, FFs and block RAM. Yes, you do need some
C++ knowledge. And you will scratch your head wondering why
the tool did this stuff to latency and initiation interval
during synth, and where that happens in your code, but
IMO it's worth it. As time goes by, you can write better code the
first time.

integer multiply
traneus 12/31/2015 10:15:33 PM

When two finite-length integers (signed or unsigned) are
multiplied together, the product's length is the sum of the
lengths of the two factors. Do VHDL or Verilog handle multiply
this way?

In my limited experience with C, the product is truncated to the
length of one factor. Thus, to get the full-length product, one
must first cast the two factors to double-original-length integers
and then do the multiply.
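
The width rule in the comment above is easy to confirm with a small
experiment. The Python sketch below multiplies the largest unsigned
32-bit value by itself, checks that the exact product needs 64 bits,
and then masks the result to 32 bits to mimic the truncation the
comment describes. The function name is hypothetical, and the masking
is only a model of fixed-width wraparound, not a statement about any
particular compiler or synthesizer.

```python
def truncated_mul(a: int, b: int, bits: int) -> int:
    """Multiply, then keep only the low `bits` bits of the product."""
    return (a * b) & ((1 << bits) - 1)


if __name__ == "__main__":
    a = b = 0xFFFFFFFF            # largest unsigned 32-bit value
    full = a * b                  # Python ints do not overflow
    print(full.bit_length())      # bits needed for the exact product
    print(truncated_mul(a, b, 32) == full)   # truncation loses high bits
    print(truncated_mul(a, b, 64) == full)   # widening first keeps them
```

Widening the operands before multiplying, as the comment suggests, is
what preserves the upper half of the product.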

Re: Existing HDLs Provide Similar Solutions
Kevin Neilson 9/14/2015 1:14:31 PM

I agree. I would prefer that all of this tool development effort
would instead go into adding more SystemVerilog construct
support in the synthesizer.

Re: Quaternion/matrix math
traneus 9/14/2015 11:25:22 AM

xeinth, you state the tool does vector/complex math. Does the
tool do quaternion or matrix math? These are useful for
rotations in three dimensions.

Re: Existing HDLs Provide Similar Solutions
xeinth 9/13/2015 4:41:48 AM

Cognoscan,

I understand where you are coming from, but this flow does
allow you to automatically pipeline a design and do
vector/complex math dynamically. While you could probably
add vector math support in some way to HDL languages, you
really cannot automatically pipeline an algorithm with HDL.

Consider Arria 10 and Stratix 10; they have top speeds of
roughly 500 MHz and 1 GHz respectively. If you had a simple
algorithm which runs at, say, 250 MHz and needs to be folded to
timeshare resources, then optimally pipeline the registers in a
way that works with Hyperflex, and then update the
state machines to account for the latency delays of various
memories and multipliers in an automated fashion, would you
really write your HDL in a way to account for all of that?

I suppose you could claim it's possible, but no one really ever
does it because it's too much work, and the nuances of every
FPGA family end up being unique, requiring some hand tuning.

PS/Disclaimer: I work for Altera, but not part of this team, just
have used the tool for IP development.

But does it work for Xilinx or Microsemi?
elizabethsimon 9/11/2015 11:19:31 AM

The company I work for uses FPGAs from all major vendors so
we have a requirement that common blocks be written to be
vendor agnostic.

How well does this tool play with other pre-written blocks?

Re: Existing HDLs Provide Similar Solutions
Gregory.Nash_#1 9/11/2015 10:10:31 AM

While it's certainly true that HDL can be parameterized for
different architectures, that would still require that the engineer
writing that HDL be aware of all current and future
architectures.

And it's certainly true that engineers can write their own GUIs.

And they can easily link them to simulations...

But this tool already does all that, without requiring knowledge
of many FPGA families and clairvoyance into the future, and
doesn't require engineers to waste time reinventing tools.

The ability to abstract algorithm from implementation and have a
transparent link to simulation is not provided by HDL.

Existing HDLs Provide Similar Solutions
Cognoscan 9/10/2015 4:25:13 PM

VHDL-2008 and SystemVerilog introduced a number of useful
expansions to their respective languages that make so-called
"HLS" tools unnecessary. Modules can be parameterized to
provide the same level of configuration, and can be easily
chained together when using interfaces in SystemVerilog or
records in VHDL. The architecture-specific optimization can be
turned on through synthesis tools, or made explicit through
parameters / defines. In short, there is practically nothing here
that can't be replicated through good use of generic modules.

There are, I suppose, three actual benefits: simulation can be
faster, you can use a graphical tool, and the modules are fully
optimized. These are harder to overcome, but it's a matter of
time, rather than something fundamental to the tools.
Simulations can be optimized by ditching wire-accurate models,
graphical tools can be written (if any FPGA Designer really
wants one), and module optimization is something that, by the
article's own admission, engineers are very good at.

So as a time/money tradeoff, it makes sense to use these tools
upfront, but they don't have much benefit in the long term. And
if someone ever releases a good set of generic
modules/entities that replicate the MATLAB blocks, then this
tool carries no real benefit at all.
