
45nm NEW ASICs

Source: IBS

(Gate Count)

                       Standard Cell ASIC          Structured ASIC        The NEW ASIC
Up-front Costs         Very High                   High                   Low
Customization Layers   All layers                  Multiple layers        One Via layer
Design Flow            Very complex verification   Complex verification   Simple FPGA-like flow
Design Risk            Very High                   High                   Low
Manufacturing Cycle    12-16 weeks                 10-16 weeks            8-10 weeks
Wafer Sharing          No                          No                     Yes

Time to Market
"In addition to the substantial system cost saving, we were pleasantly surprised at how quick we received working devices and how much lower the power consumption was compared to the low density FPGA we were using." Kevin Chiang, R&D Manager of AVTECH

Power, Performance, Price


"Fujitsu Advanced Technologies Ltd is pleased to see that eASIC is delivering a 45nm product. This New ASIC is able to deliver the right combination of performance, power and price combined with low up-front cost. We find the power reduction especially important as we look to add more functionality to our world-class ICT infrastructure products." Akira Itoh, Director of the Fujitsu Circuit Technology Center of Fujitsu Advanced Technologies Ltd

Power, Cost, Tools, Support


"We successfully ported an existing production FPGA design to eASIC's Nextreme, validating the power savings, tools suite effectiveness, cost, scheduling and support. All were validated with flying colors, making us eager to proceed to the next phase." Pierre Boulanger, VP Engineering, FLIR Systems

Power, Cost, Market Enabler


"By working with eASIC's Nextreme NEW ASIC, we were able to develop a reference design for RemoteFX hardware solutions that is low in power consumption as well as low in up-front development cost. We see eASIC's Nextreme as a catalyst in enabling our RemoteFX partner ecosystem to ramp up their solutions to market readiness." Dai Vu, Director of Virtualization Solutions Marketing, Microsoft Corp

Time, Cost & Performance


"We were able to develop our eASIC device in a reduced time and cost of a standard cell ASIC and with development costs that were even lower than an FPGA approach and the eASIC device worked right first time on our PBX-A9 validation and development board." John Goodacre, Director, ARM

Common Platform Alliance (GDS Compatible)

Design

Packaging

Multi-Site Transferable Platform (ATP, ATK, ATT, ATC)

Wafer production starts during the design stage

Single Via configuration reduces time and risk

Tested devices delivered

500 MHz Performance Fabric: up to 7.4M ASIC gates, 740K eCells

45nm

500MHz Logic, 500MHz bRAM

Up to 11.5 Mb Embedded Memory: True Dual-Port 36Kb blocks, Register files, ViaROM

32 Fully Balanced Clocks: Vertical & Horizontal Spines

Up to 792 I/Os: Differential, Single-ended, Voltage-referenced, Dynamic Phase Alignment (DPA)

Up to 52 DLLs

Support for:
- DDR3 up to 1067Mbps
- DDR2, DDR, Mobile DDR
- RLDRAM II, QDR II+

Up to 16 PLLs: top, bottom, left & right

Power Optimized Architecture: Patented Power Management, Clock Gating, Triple-oxide Transistors, Very low leakage

[Figure: Nextreme-2 fabric hierarchy: Device, eGroup (eUnits with bRAM and RF), eUnit, eMotif, eCell, eDFF (D, en).]

eUnit: 576 eCells, 384 eDFFs
eMotif: 3 eCells, 2 eDFFs

Block RAM (36Kbits), True Dual Port
Configurations: 32K x 1, 16K x 2, 8K x 4, 4K x 8, 4K x 9, 2K x 16, 2K x 18, 1K x 32, 1K x 36

Register File (512 bits), Simple Dual Port
Configurations: 512 x 1, 256 x 2, 128 x 4, 64 x 8, 32 x 16, 16 x 32

                               N2X260    N2X380    N2X550    N2X580    N2X740
eCells                         258,048   387,072   552,960   580,608   737,280
Equivalent Gates (Million)     2.6       3.9       5.5       5.8       7.4
bRAM (# of blocks)             112       168       240       468       320
bRAM (Kbits)                   4,032     6,048     8,640     16,848    11,520
Register File (# of blocks)    224       336       480                 640
Register File (Kbits)          112       168       240                 320
ViaROM (# of 256Kbit blocks)   4         4         4         4         4
PLL                            16        16        16        16        16
DLL                            28        36        52        52        52
MGIO 6.5Gbps Transceivers      -         -         -         -         -

Packages and MGIO / User I/Os
Packages: BG480 (23x23), FC480 (23x23), BG484 (23x23), BG672 (27x27), FC672 (27x27), FC780 (29x29), BG896 (31x31), FC896 (31x31), FC1152 (35x35), FC1152 (35x35), FC1152 (35x35)
User I/Os: 480*, 334, 338, 305*, 314, 338, 305*, 452, 482, 480*, 600, 620, 620, 792, 792*, 792*, 620, 458, 468, 468

* Note: FPGA Drop-in-Replacement
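The memory rows of the device table can be cross-checked against the per-block sizes quoted on the architecture slide (36 Kbits per bRAM block, 512 bits per register file block, 576 eCells per eUnit). The following is a minimal Python sketch of that arithmetic. The dictionary layout and function names are illustrative assumptions, not part of any eASIC tool, and the eUnit counts it prints are derived here rather than stated in the deck.

```python
# Per-block sizes as quoted in the deck.
BRAM_KBITS_PER_BLOCK = 36   # "Block RAM (36Kbits)"
RF_BITS_PER_BLOCK = 512     # "Register File (512 bits)"
ECELLS_PER_EUNIT = 576      # "576 eCells" per eUnit

# device -> (eCells, bRAM blocks, register file blocks; None where none is listed)
DEVICES = {
    "N2X260": (258_048, 112, 224),
    "N2X380": (387_072, 168, 336),
    "N2X550": (552_960, 240, 480),
    "N2X580": (580_608, 468, None),
    "N2X740": (737_280, 320, 640),
}


def bram_kbits(blocks: int) -> int:
    """Total block RAM in Kbits for a given number of 36 Kbit blocks."""
    return blocks * BRAM_KBITS_PER_BLOCK


def rf_kbits(blocks: int) -> int:
    """Total register file capacity in Kbits (assumes 1 Kbit = 1024 bits)."""
    return blocks * RF_BITS_PER_BLOCK // 1024


def eunit_count(ecells: int) -> int:
    """Number of eUnits implied by an eCell count (derived, not from the deck)."""
    units, remainder = divmod(ecells, ECELLS_PER_EUNIT)
    if remainder:
        raise ValueError("eCell count is not a whole number of eUnits")
    return units


for name, (ecells, bram, rf) in DEVICES.items():
    rf_text = f"{rf_kbits(rf)} Kbit RF" if rf is not None else "no RF listed"
    print(f"{name}: {eunit_count(ecells)} eUnits, {bram_kbits(bram)} Kbit bRAM, {rf_text}")
```

Every bRAM and register file figure in the table reproduces exactly from the block counts, which supports reading the flattened table in this column order.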

DSP functions implemented using eCells or soft DSP cores


Using eCells
- 1 TeraMAC DSP processing using eCells*
- Granular eCells enable bit-specific implementation

Using Soft DSP cores
- Free Tensilica Diamond DSP cores

MAC Size     eCells   eCells GMACs   Stratix-IV** GMACs
10x10 MAC    196      1,880          1,673
16x16 MAC    384      960            941
18x18 MAC    590      624            743
32x32 MAC    710      360            353

16 16 16 32

* Raw processing capability ** Combination of DSP Blocks and Logic-based MACs
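The "raw processing capability" figures appear to follow from dividing the fabric into MAC units and clocking each one at the fabric rate. The sketch below tests that reading in Python. It assumes the 196 / 384 / 590 / 710 column is eCells per MAC, that the device is the N2X740 (737,280 eCells), and that every MAC runs at the 500 MHz fabric clock. The formula and function name are assumptions for illustration, not an eASIC method.

```python
N2X740_ECELLS = 737_280  # eCell count from the device table


def raw_gmacs(ecells_per_mac: int, clock_mhz: float = 500.0,
              total_ecells: int = N2X740_ECELLS) -> float:
    """Raw GMAC/s if the whole fabric is tiled with identical MAC units.

    Assumption: one multiply-accumulate per MAC per clock cycle, with no
    eCells reserved for routing, control or I/O.
    """
    mac_units = total_ecells // ecells_per_mac  # whole MACs that fit
    return mac_units * clock_mhz / 1000.0       # MHz -> GHz gives GMAC/s


for size, cells in [("10x10", 196), ("16x16", 384), ("18x18", 590), ("32x32", 710)]:
    print(f"{size}: {raw_gmacs(cells):.1f} GMACs at 500 MHz")
```

Under these assumptions the 10x10, 16x16 and 18x18 rows come out at about 1,880, 960 and 624 GMACs, matching the slide. The 32x32 row comes out near 519 GMACs at 500 MHz rather than the quoted 360, so that row presumably assumes a lower clock rate.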

eCell-based Multiply Accumulate (MAC)


Combinatorial Multiplier
Full Precision (18 x 18 inputs, 36-bit result shown)
Quantized Output

                  Area Optimized            Performance Optimized
Multiplier Size   eCells   Perf. (MHz)      eCells   Perf. (MHz)
8x8               112      378              157      452
12x12             241      339              316      392
16x16             429      258              551      342
18x18             495      231              629      337
24x24             914      184              1,083    303
32x32             1,539    150              1,735    265

All DSP arithmetic is constructed out of eCells
Multiplier implementation has no pipelining
Device Core Voltage: 1.1V

Power Management Technique

Static Power Reduction

Dynamic Power Reduction

- Low Power Silicon Process: lowest power silicon without performance compromise
- No SRAM cells in LUTs or routing: via optimized look-up tables and routing
- Lower core voltage options: 1.1V, 1.2V
- GreenPowerVia: power down unused eCells and Memory (zero leakage)
- Sleep mode via clock gating: low power mode
- Dynamic power control: column based clock gating to control dynamic power
GreenPowerVia: eASIC's single Via configuration scheme enables customers to design eco-friendly, green solutions. By enabling up to 80% lower power consumption than FPGAs, GreenPowerVia empowers designers to reduce system power or power per channel for their solutions and customers.

Unused Logic, Memories, PLLs, I/O are turned OFF to save Power

[Figure: example floorplan with DUC, CPRI, CFR, DDC, DPD and AGC blocks.]

[Figure: Static Power Consumption vs Device Utilization for N2X380. X-axis: utilization, 50% to 100%. Y-axis: 0.000 to 0.200.]

Most designs are no more than 80% utilized, some as low as 50-60%

50% Utilization: 50% Lower Power

[Figure: static power comparison at Tj 70 degrees for Virtex-6, Virtex-5, Stratix-4, Stratix-3, Cyclone-III, Spartan-6 LX / LXT, Nextreme and Nextreme-2.]

Note: Static Power even lower if <100% resources used

Notes:
a. 1 Altera Logic Element = 1 Xilinx Logic Cell = 1 eASIC eCell
b. Typical Conditions

Sources:
- eASIC: Nextreme-2 Power Estimator
- eASIC: NX1500, NX5000 Measured
- Altera: PowerPlay Early Power Estimators
- Xilinx: XPower Early Power Estimators

Medium Density Design: 100K LC, 67K FF, Tj=70C, TR = 25%

[Figure: power comparison plotted against Frequency (MHz) for Cyclone-III, Stratix-3, Virtex-5, Spartan-6, Stratix-IV, Virtex-6, Nextreme and Nextreme-2.]

Sources:
- eASIC: Nextreme / Nextreme-2 Power Estimator
- Altera: PowerPlay Early Power Estimators
- Xilinx: XPower Early Power Estimators

Market       FPGA              FPGA Total Power (Watts)   eASIC    eASIC Total Power (Watts)   Power Savings
Wireless     Altera EP3SE260   12.2                       N2X380   2.61                        79%
Ultrasound   Altera EP4S530    20.3                       N2X740   5.32                        75%
Wireless     Altera EP3SL150   8.5                        N2X380   3.51                        59%
Ultrasound   Altera EP3SL70    5.26                       N2X260   1.44                        73%
Wired        Xilinx 5VLX150    5.8                        N2X550   1.45                        75%
Wireless     Altera EP4GX230   10.68                      N2X550   3.95                        63%

~50% to 80% Power Savings over FPGA!
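The savings column follows from the two total-power columns as savings = 1 - (eASIC power / FPGA power). The short Python sketch below recomputes it from the wattages quoted above. The tuple layout and function name are illustrative choices, not from the source.

```python
# (market, FPGA, FPGA watts, eASIC device, eASIC watts, savings % quoted on the slide)
ROWS = [
    ("Wireless",   "Altera EP3SE260", 12.2,  "N2X380", 2.61, 79),
    ("Ultrasound", "Altera EP4S530",  20.3,  "N2X740", 5.32, 75),
    ("Wireless",   "Altera EP3SL150", 8.5,   "N2X380", 3.51, 59),
    ("Ultrasound", "Altera EP3SL70",  5.26,  "N2X260", 1.44, 73),
    ("Wired",      "Xilinx 5VLX150",  5.8,   "N2X550", 1.45, 75),
    ("Wireless",   "Altera EP4GX230", 10.68, "N2X550", 3.95, 63),
]


def savings_pct(fpga_watts: float, easic_watts: float) -> float:
    """Percentage reduction in total power relative to the FPGA figure."""
    return (1.0 - easic_watts / fpga_watts) * 100.0


for market, fpga, fpga_w, easic, easic_w, quoted in ROWS:
    computed = savings_pct(fpga_w, easic_w)
    print(f"{fpga} -> {easic}: computed {computed:.1f}%, slide quotes {quoted}%")
```

Five of the six rows reproduce the quoted percentage after rounding. The EP4S530 to N2X740 row computes to about 74% from the rounded wattages shown, against the 75% quoted on the slide.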


Nextreme-2 IO Slice is common to every I/O

- eIO: via configurable single ended, voltage referenced and differential physical layer
- eOLogic: Output DDR logic
- eILogic: Input DDR logic
- eSERDES: ability to configure multiple channels to support 1.2Gbps source synchronous interfaces (LVDS, SERDES, DPA etc.)
- E.g. SPI4.2, HyperTransport, CSIX, UTOPIA, SONET / SDH, SoftCDR

[Figure: Nextreme-2 I/O Slice, showing eOLogic, eIO, eSERDES and eILogic.]

eASIC offers a custom package development service for FPGA drop-in replacement packages, for both Xilinx and Altera. Ask eASIC for technical feasibility. Drop-in devices already implemented include:
Device           Package         FPGA
N2X260, N2X380   BG484           Altera Stratix-III, Stratix-IV
N2X550, N2X740   FC780, FC1152   Altera Stratix-III, Stratix-IV

Benefits
- No PCB redesign needed
- Shorter board re-qual time
- Faster time to production
- No need to pay high ASIC NRE

Extended IO

eFUSE: 64 bit eFUSE, 40 bit for user, one bit disables scan

Up to 16.8Mb Embedded Memory: True Dual-Port 36Kb blocks, ViaROM

45nm

128 Extended IO (Do not support SERDES or DDR)

32 Fully Balanced Clocks Vertical & Horizontal Spines

Up to 630 I/Os Differential Single-ended Voltage-referenced Dynamic Phase Alignment

Up to 50 DLLs

Support for: - DDR3 up to 1067Mbps - DDR2, DDR, Mobile DDR - RLDRAM II, QDR II+

Up to 20 PLLs: top, bottom, left & right

Power Optimized Architecture: 1.0V Core Voltage Support, similar performance to N2X at 1.2V

32, 6.5Gbps MGIO Transceivers 208 Gbps bandwidth
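The 208 Gbps figure is the raw aggregate line rate: 32 lanes at 6.5 Gbps each. The Python sketch below shows the calculation, with an optional encoding-efficiency factor added as an assumption to show how line coding overhead would reduce usable payload bandwidth. The function name and the 8b/10b example are illustrative and not taken from the deck.

```python
def aggregate_gbps(lanes: int, line_rate_gbps: float = 6.5,
                   encoding_efficiency: float = 1.0) -> float:
    """Aggregate one-direction bandwidth across MGIO lanes.

    encoding_efficiency is an assumed knob: 1.0 gives the raw line rate,
    0.8 would model 8b/10b line coding overhead.
    """
    if lanes < 0 or not 0.0 < encoding_efficiency <= 1.0:
        raise ValueError("lanes must be >= 0 and efficiency in (0, 1]")
    return lanes * line_rate_gbps * encoding_efficiency


print(aggregate_gbps(32))        # raw line rate for 32 lanes
print(aggregate_gbps(32, encoding_efficiency=0.8))  # assumed 8b/10b payload rate
```

With 32 lanes the raw figure is 208 Gbps, as quoted. The payload rate available to a design depends on the protocol's line coding, so the second call is only an example of the adjustment.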


[Figure: Nextreme-2T fabric hierarchy: Device, eGroup (eUnits with bRAM), eUnit, eMotif, eCell, eDFF (D, en).]

eUnit: 576 eCells, 384 eDFFs
eMotif: 3 eCells, 2 eDFFs

Block RAM (36Kbits), True Dual Port
Configurations: 32K x 1, 16K x 2, 8K x 4, 4K x 8, 4K x 9, 2K x 16, 2K x 18, 1K x 32, 1K x 36

                               N2XT330   N2XT580
eCells                         331,776   580,608
Equivalent Gates (Million)     3.3       5.8
bRAM (# of blocks)             288       468
bRAM (Kbits)                   10,368    16,848
ViaROM (# of 256Kbit blocks)   4         4
PLL                            20        20
DLL                            50        50
MGIO 6.5Gbps Transceivers      24        32

Packages and MGIO / User I/Os
Packages: FC780 (29x29), BG896 (31x31), FC896 (31x31), FC1152 (35x35), FC1152 (35x35), FC1152 (35x35)
MGIO / User I/Os: 8 / 630, 24 / 556, 8 / 630, 24 / 556, 32 / 480, 16 / 400, 16 / 400

* Note: FPGA Drop-in-Replacement

Multi-Protocol Support with up to 6.5Gbps

[Figure: MGIO quad block diagram. Channels 0 to 3, each with a Tx path (P2S) and an Rx path (S2P), TXPLL and RXPLL blocks in the PMA, a PCS offering PIPE (PCIe), SAPIS (SATA) and General interfaces, and a shared CMU with REF_CLK buffer.]

3 Power States:
- Normal: 42mW (1.25Gbps), 70mW (5.0Gbps)
- Partial: 31mW (1.25Gbps), 53mW (5.0Gbps)
- Slumber: 16mW (1.25Gbps), 20mW (5.0Gbps)

Lanes operate at 1x, 2x or 4x multiples

PCS allows user to select between:
- PCI Express PIPE interface
- SATA SAPIS interface
- More general interface for use with other standards

MGIO arranged as Quads. Each Quad shares a common PLL. Clock division supported.

[Figure: MGIO serial link datapath between the Serial Links and the eCells, with RX-PMA, RX-PCS, TX-PMA and TX-PCS stages.]

Receive path blocks: Termination, RX Input Buffer, RX-EQ/DFE, CDR, S2P, Polarity / Bit Order (reg 0), Symbol Alignment, Clock Compensation FIFO, Lane Deskew FIFO, 20b/16b Decoder, Polarity / Bit Order (reg1), Runlength Detect, OOB Detect, 8b/16b/10b/20b BIST Checker
Receive path signals: RX CLR (local), TX CLK
RX Data / Status: 8/10 bits OR 16/20 bits

Transmit path blocks: 8b/16b/10b/20b BIST Generator (One per Quad), BIST Data, Polarity / Bit Order (treg0), 16b/20b Encoder, Polarity / Bit Order (treg1), P2S, Delay, OOB, Termination, TX Driver
Transmit path clocks: TX CLK (from CMU), TX CLK (Quad), TX CLK (local)
TX Data / Control: 8/10 bits OR 16/20 bits

[Figure: supported serial protocols by market segment, plotted against Line Rate (Gbps), axis from 2 to 8.]

Telecom: CPRI-1228, CPRI-2457, CPRI-3072, CPRI-6144, OBSAI 1536, OBSAI 3072, OBSAI 6144, SRIO v1.3, SRIO v2.1, OC-48
Networking: GEPON, 1GbE, SGMII, SPAUI 3.125, SPAUI 6.25, XAUI, Double/R XAUI, HiGig/HiGig+, SFI-5, Interlaken, CEI-6G
Storage: SAS 3.0, SAS 6.0, SATA 1.5, SATA 3, SATA 6, 1G-FC, 2G-FC, 4G-FC
Video: DisplayPort 1.62, DisplayPort 2.7, HD-SDI, 3G HD-SDI
Computing: PCIe Gen 1, PCIe Gen 2, Infiniband, USB3.0, iSCSI

Device                 Power per channel @ 6.144 Gbps**   Power Savings
eASIC Nextreme-2T      141mW
Altera Stratix-IV GX   232mW*                             39%
Xilinx Virtex-6        253mW*                             44%
Altera Arria-2 GX      230mW*                             39%

~40% Power Savings over FPGA!

* Data sourced from FPGA power estimator spreadsheets.
** Power is per lane (in active quad configuration) including PMA, PCS and CMU (PLL).
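Per-lane power at a fixed line rate can also be expressed as energy per transferred bit, which makes transceivers easier to compare across line rates. Since 1 mW divided by 1 Gbps equals 1 pJ/bit, the conversion is a single division. The Python sketch below applies it to the figures above. The energy-per-bit metric and the names used are additions for illustration and do not appear in the deck.

```python
LINE_RATE_GBPS = 6.144  # line rate at which the per-channel power is quoted

# Per-lane power in mW, as listed in the comparison.
POWER_MW = {
    "eASIC Nextreme-2T": 141.0,
    "Altera Stratix-IV GX": 232.0,
    "Xilinx Virtex-6": 253.0,
    "Altera Arria-2 GX": 230.0,
}


def energy_per_bit_pj(power_mw: float, line_rate_gbps: float = LINE_RATE_GBPS) -> float:
    """Energy per bit in picojoules: mW / Gbps == pJ/bit."""
    if line_rate_gbps <= 0:
        raise ValueError("line rate must be positive")
    return power_mw / line_rate_gbps


for device, mw in POWER_MW.items():
    print(f"{device}: {energy_per_bit_pj(mw):.2f} pJ/bit")
```

On these numbers the Nextreme-2T lane works out to roughly 23 pJ/bit, against roughly 37 to 41 pJ/bit for the three FPGA transceivers, which is the same ~40% gap expressed per bit.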


Spectral Plots from RRH Antenna: 1.4% EVM.

Low Power Nextreme-2T 6.5 Gbps device driving CPRI interface to RRH. CPRI Mask Compliant

Radiocomp CPRI / OBSAI RRH

10M optical fiber


MGIO characterization report is available now. Includes:
- MGIO test board setup
- PLL characterization
- PMA Tx characterization
- PMA Rx characterization
- Power consumption

Familiar IDE Environment:
- Easy FPGA Conversion
- eZ-IP Wizards
- Graphical Layout
- Pin Placement
- Macro Placement
- Floorplanning
- Push Button Flow

New features in 8.2:

Improved QoR for ePlacer/Resynthesis
- ~20 / 30% improvement in Push-Button Results*
- Run time improvement as high as 50% over eTools 8.1
- Automatic pre-buffering and post-buffering netlist resynthesis (worst path optimization)

Ability to use propagated clocks and OCV in ePlacer/Resynthesis

Automatic netlist constant propagation after floorplanning

More Control of Implementation Tools

Synthesis 3rd Party Support
- Support for DC Ultra, derating, multiple CPUs and wire loads for DC Synthesis

Available Now
Evaluate Today!

* QoR improvements are design dependent

Design Conversion

Hand-off - Option 1
(Synthesized Netlist)
(Design Services required)

Back End Implementation

Hand-off - Option 2
(Placed Netlist)


eZ-IP Cores

Digital Signal Processing
- FFT, FIR Compiler, NCO, CIC Filter
- Turbo CODECs, Digital Pre-Distortion

Embedded Processing
- µP cores: Coldfire, Tensilica, Gaisler Research, OpenCores
- DSP cores: Tensilica

Interfaces
- PCIe Gen1/2, PCI, PCI-X
- 10/100/1G/10/40G Ethernet
- CPRI, sRIO, OBSAI
- Interlaken
- DDR3, DDR2, DDR, M-DDR
- USB, SPI, I2C

Encryption/Decryption
- AES, DES, MD5, SHA

Video/Image Processing
- H.264, MPEG-4, VGA

IP levels: Level 1, Level 2, Level 3
Level labels: Silicon proven with Development Board, Flow Verified, Synthesizable RTL

- MGIO operate at all speeds (1.25-6.5 Gbps)
- All physical layer characterization completed
- MGIO characterization report available now

Next Step: Protocol and IP qualification:
- CPRI & OBSAI
- PCI Express Gen 1 & Gen 2
- XAUI, Double XAUI
- Gigabit Ethernet
- SRIO
- Interlaken

34 Inch FR4 @ 6.5 Gbps

2.5 Gbps PRBS 16

eASIC 45nm devices, proven, in volume production
- Low up-front costs
- Fast turnaround
- Simple design flow
- Up to 80% lower power than FPGAs

N2XT330 and N2XT580 available now
- Up to 32 MGIO
- MGIO characterization report available now

Evaluate eTools 8.2 Today and get a Power Reduction Estimate from eASIC
