IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION (VLSI) SYSTEMS, VOL. 22, NO. 12, DECEMBER 2014
A Systematic Design Methodology for Low-Power NoCs
Gursharan Reehal, Member, IEEE, and Mohammed Ismail, Fellow, IEEE
Abstract: Network-on-chip (NoC) communication architectures are emerging as the most scalable and efficient solution to handle on-chip communication challenges in the multicore era. In NoCs, power estimations in the early stages of the design help designers optimize the design for energy consumption and efficiently map applications to achieve low-power solutions. However, in designs at 90 nm or below, parasitics not only influence timing closure but also lead to variability in the power and area budgets among different NoC architectures. There is a growing need for advanced design methodologies to overcome these issues in NoC designs. This paper presents a system-level design methodology based on layout and power models to achieve low-power and high-performance NoC designs. The impact of global interconnects, with and without repeater insertion, on bandwidth and power is considered. The width and spacing of global interconnects and their effect on performance and power dissipation are analyzed. For architectural-level power analysis, different router designs for the Chip-Level Integration of Communicating Heterogeneous Elements (CLICHÉ), Butterfly Fat Tree (BFT), Scalable, Programmable, Integrated Network (SPIN), and Octagon NoC architectures are implemented using ARM's 65-nm standard cell library in the 65-nm Taiwan Semiconductor Manufacturing Corporation (TSMC) process. The router designs are synthesized in the RVT process using a Vdd of 1.0 V and a temperature of 25 °C. The Synopsys PrimeTime PX design tool is used to calculate the average power dissipation of the router designs.
Index Terms: Bandwidth, Butterfly Fat Tree (BFT), Chip-Level Integration of Communicating Heterogeneous Elements (CLICHÉ), delay, interconnects, IP-based, network-on-chip (NoC), Octagon, performance, power models, Scalable Programmable Integrated Network (SPIN).
Manuscript received September 17, 2012; revised April 21, 2013 and August 25, 2013; accepted November 26, 2013. Date of publication March 3, 2014; date of current version November 20, 2014. This work was supported in part by ATIC, Abu Dhabi, and in part by the KSRC, Kustar, UAE.
G. Reehal is with the Department of Electrical and Computer Engineering, The Ohio State University, Columbus, OH 43210 USA (e-mail: reehal.4@osu.edu).
M. Ismail was with the Department of Electrical and Computer Engineering, The Ohio State University, Columbus, OH 43210 USA. He is now with KUSTAR, UAE (e-mail: ismail@kustar.ac.ae).
Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org.
Digital Object Identifier 10.1109/TVLSI.2013.2296742
1063-8210 © 2014 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.

I. INTRODUCTION
As the semiconductor industry moves toward complex system-on-chip (SoC) designs containing hundreds or thousands of heterogeneous IP blocks, network-on-chip (NoC) designs are emerging as one of the most effective and reliable choices of communication fabric. NoCs are packet-switched interconnection networks integrated onto a single chip, and their operation is based on the operating principle of macro networks. An NoC attempts to simplify the global communication problem by providing various component-level architectures with specific interconnection network topologies. Some of the main topologies for NoC architectures are chip-level integration of communicating heterogeneous elements (CLICHÉ), butterfly fat tree (BFT), scalable, programmable, integrated network (SPIN), and octagon [2]–[4]. An NoC is scalable because of its inherent structure and design. An NoC architecture is primarily composed of three main components: 1) switches (or routers); 2) inter-switch links; and 3) repeaters. The NoC architecture (or topology) specifies the physical arrangement of the communication network. It defines how nodes (IPs), switches (routers), and links are connected to each other. The success of an NoC design heavily depends on its power budget. A high-level network model of an NoC architecture is shown in Fig. 1.
Fig. 1. NoC architecture and its main components.
In an NoC design, the wires linking two switches are called interconnects. These interconnects play a crucial role in overall system performance and can have a large impact on total power consumption. In older technologies, when wires were much wider and thicker, it was possible to treat on-chip interconnects as purely capacitive loads of logic gates, i.e., these wires had no intrinsic delays of their own and were modeled as short circuits. However, as wire widths shrank, interconnect capacitance became comparable with gate capacitance and had to be included in wire modeling. Now, with technology scaling into the deep nanometer regime, wires have become much narrower, driving up their resistance and capacitance to the point that, in many paths, the wire RC delay exceeds the gate delay and can severely impact the achievable system bandwidth and thus NoC performance.
One of the physical constraints in the implementation of NoC networks is the available wiring area, as most complex SoC designs are generally wire limited. The silicon area required by these systems is primarily determined by the interconnect area, and the choice of network dimension is therefore influenced by how well the resulting topology makes use of the available wiring area. One such method that can
relate network topology to the wiring constraint is the performance measure bisection bandwidth, inherited from macro computer networks. Bisection bandwidth is defined as the minimum number of wires that must be cut when the network is divided into two equal sets of nodes. Since the primary goal here is to judge the bandwidth and the resulting wiring demand, only data lines are considered in the estimation process. Bisection bandwidth is a static measure that provides rough estimates and can only be used in the early stages of the design process, when little or no information about the physical layout is available. In nanometer technologies, however, the relation between bisection bandwidth and achievable bandwidth is not one to one. Bisection bandwidth ignores the fact that, as technology scales and systems employ larger networks, wire delays begin to dominate and the delay through long wires can be substantial. For these reasons, interconnects have become the center of attention with respect to area, delay/performance, and power consumption in nanometer designs. Interconnects have not scaled exponentially like transistors with technology scaling. To overcome these power and performance issues of interconnects, some advanced mitigation schemes may be necessary. In addition, these enhancement solutions often result in area and power budgets that differ from those pre-estimated in the early stages of design. Hence, many researchers have pointed out that the economical design of present and future nanometer designs is limited by their wiring demands.
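The bisection-bandwidth definition above can be checked by brute force on a small network. The following is a minimal sketch, assuming a k x k mesh whose links are each counted once and carry a fixed number of data wires; the function names are hypothetical, and the exhaustive search is only practical for very small k.

```python
from itertools import combinations


def mesh_links(k):
    """Return the links of a k x k mesh as pairs of (row, col) nodes."""
    links = []
    for r in range(k):
        for c in range(k):
            if c + 1 < k:
                links.append(((r, c), (r, c + 1)))  # horizontal link
            if r + 1 < k:
                links.append(((r, c), (r + 1, c)))  # vertical link
    return links


def bisection_links(k):
    """Minimum number of links cut over all splits into two equal node sets."""
    nodes = [(r, c) for r in range(k) for c in range(k)]
    links = mesh_links(k)
    best = len(links)
    for half in combinations(nodes, len(nodes) // 2):
        side = set(half)
        # A link is cut when its two endpoints fall on different sides.
        cut = sum((a in side) != (b in side) for a, b in links)
        best = min(best, cut)
    return best


def bisection_wires(k, data_wires_per_link):
    """Bisection bandwidth in wires, counting data lines only."""
    return bisection_links(k) * data_wires_per_link


print(bisection_links(4))     # links cut by the best equal split of a 4 x 4 mesh
print(bisection_wires(4, 8))  # same cut, with eight data wires per link
```

Because the measure only counts wires, it says nothing about how long those wires are, which is the limitation the paragraph above points out.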
With regard to power, the situation is similar: the portion of power associated with interconnects keeps increasing with each technology node. This is an important issue because, in the conventional design flow, the analysis and synthesis of very-large-scale integration circuits are based on the assumption that gates are the dominant source of on-chip power consumption. With power being the most critical design constraint in large SoC designs, architectural-level power estimation has become extremely important to verify that power budgets can be met not only by the communication network but by the entire system. Some early design optimization techniques for NoCs are highly dependent on early power budget estimations.
In this paper, a design methodology based on NoC architectural-level layouts and high-level power models is presented to achieve low-power and high-performance NoC designs. The methodology is efficient in selecting an appropriate NoC architecture for low-power results and thus for shorter timing closure. The rest of this paper is organized as follows. In Section II, we study interconnect performance and the effect of scaling on interconnect resistance and capacitance. In Section III, an NoC link optimization technique based on the RC delay model is discussed. Section IV provides insight into the buffer insertion technique for performance enhancement in longer interconnects. Section V provides NoC power dissipation models for various NoC architectures. In Section VI, an IP-based design methodology for low-power results is presented. In Section VII, some simulation results based on the design models are presented. Finally, the conclusion is presented in Section VIII.
Fig. 2. Gate delay versus interconnect delay.
Fig. 3. NoC interconnects.
II. NoC PERFORMANCE ANALYSIS
One of the most critical challenges for an NoC design is to provide the bandwidth that the SoC design requires to meet its performance thresholds. As technology scales in the nanometer domain, however, achieving higher bandwidth can be difficult and may require mitigation schemes. As wires continue to shrink, wiring delay is dominating gate delay, as shown in Fig. 2.
The wire delay doubles with each technology node and increases quadratically as a function of wire length. Even a small number of global interconnects with very high signal delay can have a significant impact on system performance and may also influence timing closure. The key to solving this problem is to know more about the physical design, i.e., the placement of IPs and the estimated interconnects, early in the design cycle for accurate budgeting and shorter design cycles. To achieve higher bandwidth in an NoC, it is possible to design pipelined routers such that they process one flit per cycle, but the duration of the clock cycle determines how fast each flit can be processed in the network. In nanometer NoCs, this cycle time is limited not by the logic between two clocked elements but by the links between two routers. NoC links typically consist of a number of parallel signal wires of fixed width and spacing, as shown in Fig. 3.
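The quadratic growth of wire delay with length can be illustrated with the distributed RC model that this section uses in (2), delay = 0.4 R C L^2. The sketch below is only an illustration: the per-unit-length resistance and capacitance are hypothetical placeholder values, not figures from the paper.

```python
def distributed_rc_delay(r_per_m, c_per_m, length_m):
    """Delay in seconds of a distributed RC line: 0.4 * R * C * L**2."""
    return 0.4 * r_per_m * c_per_m * length_m ** 2


# Hypothetical per-unit-length values, chosen only for illustration.
R_PER_M = 1.0e5    # 0.1 ohm per micrometer
C_PER_M = 2.0e-10  # 0.2 fF per micrometer

for length_mm in (1, 2, 5, 10):
    delay_ps = distributed_rc_delay(R_PER_M, C_PER_M, length_mm * 1e-3) * 1e12
    print(f"{length_mm:>2} mm -> {delay_ps:7.1f} ps")
```

Doubling the length multiplies the delay by four regardless of the R and C values chosen, which is why a few long global links can set the clock period.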
These links can be used directly to express a number of metrics, such as data rate, bandwidth density, or bisection bandwidth. However, data rate is the preferred and most appropriate metric to estimate system performance and can be expressed as follows:
$$\text{Bandwidth} = \frac{N_{wires}}{\text{delay}} \tag{1}$$
where $N_{wires}$ is the total number of signal wires in the link and delay is the delay of a single wire. Thus, to achieve higher bandwidth, it is important that the delay is kept to a minimum. Interconnect delay is a function of wire resistance and capacitance. The delay of a distributed RC line driven by an ideal driver (zero output impedance) at the near end, with an open termination at the far end, can be represented as
$$T_{wire} = 0.4\,R\,C\,L^{2} \tag{2}$$
where $T_{wire}$ is the wiring delay, $L$ is the wire length, $R$ is the wire resistance per unit length, and $C$ is the capacitance per unit length. This is believed to be a good approximation and is reported to be accurate within 5% for a wide range of R and C [30], [31], [35]. In NoC design, the minimum conceivable clock cycle time in a highly pipelined design can be assumed to be equal to 15 FO4, with FO4 defined as the delay of an inverter driving four identical ones [6]. In different technology nodes, FO4 can be estimated as 425 $L_{min}$, where $L_{min}$ is the minimum gate length in the technology node [5]. In long wires, the intrinsic delay can easily exceed this limit of 15 FO4, thereby limiting the clock cycle time, and as a result, system bandwidth may suffer. The 15 FO4 delay for different technology nodes is shown in Fig. 4.
Fig. 4. 15 FO4 delay in different technology nodes.
Depending on the length of the interconnects, different techniques may be necessary to reduce the intrinsic RC delay. Two methods for reducing interconnect delays are discussed in Sections III and IV.

III. NoC LINK OPTIMIZATION USING INTRINSIC RC MODEL
The two important parameters for interconnect delay are wire resistance and wire capacitance. The resistance of a wire can be defined as
$$R = \frac{\rho\,L}{T\,W} \tag{3}$$
where $\rho$ is the resistivity of the wire and is material dependent. In modern technologies, copper is used for its low resistivity (2.2 µΩ·cm) to reduce wiring resistance. Wire capacitance, on the other hand, is more complex, and many of its components are geometry dependent, as shown in the cross-sectional view of global interconnects in Fig. 5.
Fig. 5. Cross-sectional view of global interconnects.
The three major components of wire capacitance are related to the geometry by the following relation:
$$C_T = C_f + C_b\,W + \frac{C_c}{S} \tag{4}$$
where $C_T$ is the total capacitance, $C_f$ is the fringing capacitance, $C_b$ is the parallel-plate capacitance due to the top and bottom layers of metal and is proportional to the interconnect width, and $C_c$ is the coupling capacitance between neighboring interconnects and is inversely proportional to the interconnect spacing $S$. The parallel-plate capacitance is
$$C_b = \epsilon_{ox}\,\frac{W\,L}{H} \tag{5}$$
where $\epsilon_{ox}$ is the SiO2 dielectric constant, $W$ is the width, $L$ is the length of the wire, and $H$ is the dielectric height. Fringing and coupling capacitance are more difficult to compute by hand, but empirical formulas that are computationally efficient and relatively accurate are given by the following equations:
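$$C_f = \epsilon_{ox}\,L\left[\frac{W}{H} + 0.77 + 1.06\left(\frac{W}{H}\right)^{0.25} + 1.06\left(\frac{T}{H}\right)^{0.5}\right] \tag{6}$$
$$C_c = \epsilon_{ox}\,L\left[0.03\,\frac{W}{H} + 0.83\,\frac{T}{H} - 0.07\left(\frac{T}{H}\right)^{0.222}\right]\left(\frac{S}{H}\right)^{-4/3} \tag{7}$$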
Widening a wire proportionally reduces its resistance but increases the capacitance to the top and bottom layers. Because the total capacitance grows less than proportionally, the overall RC delay still improves. Increasing the spacing between wires, on the other hand, reduces the capacitance to adjacent wires and leaves the resistance unchanged. This also reduces the RC delay by significantly reducing the coupling capacitance. While the T and H parameters are fixed for each metal layer in a given process technology, the parameters W and S can be chosen by the link designer to achieve an acceptable delay. If a design is limited by the available wiring space, then varying W and S for optimal delay will have an impact on the number of wires in the link through the following relation:
$$A_{wire} = N_{wires}\,W + (N_{wires} - 1)\,S \tag{8}$$
Increasing the interconnect width and spacing in a limited area reduces the number of wires in the link and thus the overall system bandwidth. As a result, these geometric adjustments to achieve lower delay create an upper bound on the achievable bandwidth.

IV. PERFORMANCE OPTIMIZATION USING BUFFER INSERTION
Both the resistance and the capacitance of a wire increase with wire length, so the RC delay of a wire increases with $L^2$. Therefore, for longer NoC interconnects, wire sizing and spacing alone are not sufficient to limit the quadratic growth of delay with respect to wire length. Buffer insertion (also called repeater insertion) may be necessary, as shown in Fig. 6. The delay of a wire can be made linear with distance by splitting the line into multiple segments and inserting a repeater between each segment to actively drive the wire.
Fig. 6. NoC interconnects with optimal buffer insertion.
Using the methodology in [8], the optimal inter-buffer segment length $k_{opt}$ and the optimal repeater size $h_{opt}$ can be calculated using
$$k_{opt} = \sqrt{\frac{2\,R_s\,(C_s + C_p)}{R\,C}} \tag{9}$$
$$h_{opt} = \sqrt{\frac{R_s\,C}{R\,C_s}} \tag{10}$$
where $R_s$ and $C_s$ are the resistance and capacitance of a minimum-size inverter, $R$ is the resistance of the wire per unit length, and $C$ is the capacitance of the wire per unit length. Similarly, the optimal width $W_{opt}$ and the optimal spacing $S_{opt}$ are given by (11) and (12).
Using (2)–(7), the delay of the longest interconnect (10 mm) in 65-nm technology was calculated to be 4254 ps, whereas the 15 FO4 time in the same technology node is 414.375 ps. The frequency achievable by the system is 2.41 GHz, but the length and associated delay of the longest interconnect limit the achievable frequency to only 0.24 GHz. After buffer insertion, the wire delay improved to 412.3 ps, as shown in Fig. 7.
Fig. 7. Delay of a 10-mm-long unbuffered versus buffered wire in 65 nm.
With optimal repeater insertion, the interconnect delay grows linearly with wire length. However, for large high-performance designs, the number of such repeaters can be prohibitively large, can take up a significant portion of the silicon and routing area, and can additionally consume a significant amount of power. NoC power dissipation is discussed in the following section.
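The timing figures quoted in this section can be reproduced with a few lines of arithmetic. This is a minimal sketch, assuming that the 425 L_min estimate yields picoseconds when L_min is given in micrometers (an inference from the 414.375 ps value quoted for 65 nm); the function names are hypothetical.

```python
def fo4_delay_ps(l_min_um):
    """FO4 delay in ps from the 425 * L_min estimate (L_min in micrometers)."""
    return 425.0 * l_min_um


def max_frequency_ghz(cycle_time_ps):
    """Clock frequency in GHz for a given cycle time in ps."""
    return 1000.0 / cycle_time_ps


cycle_ps = 15 * fo4_delay_ps(0.065)          # 15 FO4 at the 65-nm node
print(round(cycle_ps, 3))                    # cycle-time limit in ps
print(round(max_frequency_ghz(cycle_ps), 2)) # frequency set by the 15 FO4 limit
print(round(max_frequency_ghz(4254.0), 3))   # unbuffered 10-mm wire
print(round(max_frequency_ghz(412.3), 2))    # same wire after buffer insertion
```

The buffered delay of 412.3 ps falls just under the 15 FO4 limit, so in this example the repeated wire no longer restricts the clock.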
V. NoC POWER ANALYSIS
W
=
C S
opt
a opt+ c
opt
C W
c
Cb
opt
C +C W
a b
opt
Power consumption is a major concern for any large chip design, including the design of an NoC. In NoCs, power is dissipated because of the large amount of activity generated by so many transistors. High power consumption can result in excessively high temperatures during operation, which can lead to reliability concerns because of electromigration and other heat-related failure mechanisms. Careful analysis of power consumption at all stages of design is essential for keeping power consumption within acceptable limits. In NoC design, power dissipation is mainly due to the three main components of the network, namely: 1) routers; 2) interconnects (or wires); and 3) repeaters. The closed-form total power equation for an NoC can thus be defined as
$$P_{total} = P_{switches} + P_{line} + P_{rep} \tag{13}$$
where $P_{switches}$ is the total power consumed by the switches in the network, $P_{line}$ is the total power dissipation of the inter-switch links, and $P_{rep}$ is the total power dissipation of the repeaters, which are required for long interconnects. The number of repeaters required depends on the length of the inter-switch link.
$$P_{line} = \alpha\,C\,L\,V_{dd}^{2}\,f \tag{14}$$
Prep=NrephoptCovdd
(15)
leakrep
dd
shortrep dd
If the number of IPs is equal in the x and y directions, then the number of horizontal links equals the number of vertical links, and each can be calculated using $\sqrt{N}(\sqrt{N}-1)$. Depending on the technology node, the optimal length for repeater insertion can be obtained using (9). The total interconnect length and the required number of repeaters for a CLICHÉ topology can thus be calculated using (17) and (18).
Fig. 8. Layout for a 16-IP CLICHÉ network.
Fig. 9. Layout for a 16-IP BFT network.
where $\alpha$ is the activity factor of the inter-switch link, $C$ is the interconnect capacitance, and $f$ is the clock frequency. $N_{rep}$ is the total number of repeaters, $h_{opt}$ is the optimal repeater size, and $C_o$ is the input capacitance of a minimum-size repeater. $V_{dd}$ is the supply voltage.
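The switching-power term behind the link power model can be sketched numerically. The example below is a rough illustration, assuming the usual dynamic-power form alpha * C_total * Vdd^2 * f with C_total taken as capacitance per unit length times link length. The activity factor, the capacitance per millimeter, and the 5-mm link length are hypothetical placeholders; the 1.0 V supply, the 200 MHz clock, and the ten wires per link are values quoted elsewhere in the paper.

```python
def link_dynamic_power_w(alpha, c_per_mm_f, length_mm, vdd_v, freq_hz, n_wires=1):
    """Switching power in watts of a link with n_wires identical wires."""
    c_total = c_per_mm_f * length_mm  # total capacitance of one wire
    return n_wires * alpha * c_total * vdd_v ** 2 * freq_hz


ALPHA = 0.5          # hypothetical activity factor
C_PER_MM = 0.2e-12   # hypothetical wire capacitance, 0.2 pF per mm
LENGTH_MM = 5.0      # hypothetical inter-switch link length

power_w = link_dynamic_power_w(ALPHA, C_PER_MM, LENGTH_MM, 1.0, 200e6, n_wires=10)
print(f"{power_w * 1e3:.3f} mW")
```

Since the term is linear in length, halving the link length halves the wire power, which is the lever that topology selection acts on in the later sections.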
The total power consumption of the CLICHÉ architecture is calculated using the following expression:
Area(
totalcliche= sw switch+
N 1
) NwiresCVDD
Area
wires
1)
kopt N
hopt co VDD
f+Prepleak+Prepshort
(19)
where $N_{sw}$ is the total number of switches in the network and $P_{switch}$ is the power consumed by a single switch.
A. CLICHÉ Architecture
The CLICHÉ topology is proposed in [1]. It is a 2-D mesh consisting of an m × n mesh of switches interconnecting computational resources (IPs). Every switch except those at the edges is connected to four neighboring switches and one IP block. In CLICHÉ, the number of switches is equal to the number of functional IPs. The layout of a CLICHÉ topology consisting of 16 IPs is shown in Fig. 8.
The IPs and switches are connected through communication channels. A channel consists of two uniform bidirectional links consisting of data and control signals. This topology is widely used in NoC designs because of its regular structure and shorter inter-switch interconnects. In this architecture, all inter-switch wire segments are of the same length, which can be determined using (16).
B. BFT Architecture
The BFT topology as an NoC architecture is proposed in [5]. In this architecture, the IPs are placed at the leaves and the switches at the vertices. At the lowest level (level 0), there are N IPs, and the IPs are connected to N/4 switches at the first level. The number of levels in a BFT architecture depends on the total number of IPs and can be calculated using (21). The layout scheme of a BFT architecture consisting of 16 IPs is shown in Fig. 9.
With the layout scheme shown in Fig. 9, the total number of switches needed and the inter-switch wire lengths for a BFT architecture can be calculated using the following equations:
sw
Area
(16)
N
_
L = .
levels
(20)
where
=
2
levels=log2(N)3
(21)
switchesatjthlevel=
N
(22)
2j+1
.
CLICH
1) Nwires
Area
=
2
wires.
1)
Area
rep
cliche
As shown in Fig. 9, there are two different inter-switch interconnect lengths for a 16-IP BFT architecture. The inter-switch wire lengths can be calculated using the following expression [10]:
(18)
kopt N
$$l_{a+1,a} = \frac{\sqrt{Area}}{2^{\,levels-a}} \tag{23}$$
Using the total number of switches needed, the total wire length of the interconnects, and the total number of required repeaters, the total power consumption of the CLICHÉ architecture can be calculated using (19).
where $l_{a+1,a}$ is the length of the interconnect spanning the distance between level a and level a+1 switches, and a can take integer values between zero and (levels − 1).
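The BFT wire-length expression (23) can be evaluated directly for a given die. The sketch below assumes a square die, a number of IPs that is a power of four, and levels = log4(N), which is the usual BFT convention and matches the two link lengths reported for 16 IPs; the function names are hypothetical.

```python
import math


def bft_levels(n_ips):
    """Number of switch levels, assuming levels = log4(N) for N a power of 4."""
    levels = 0
    while 4 ** levels < n_ips:
        levels += 1
    if 4 ** levels != n_ips:
        raise ValueError("this sketch assumes N is a power of 4")
    return levels


def bft_link_lengths(area_mm2, n_ips):
    """Lengths l(a+1, a) in mm for a = 0 .. levels-1, following (23)."""
    side = math.sqrt(area_mm2)
    levels = bft_levels(n_ips)
    return [side / 2 ** (levels - a) for a in range(levels)]


print(bft_link_lengths(400.0, 16))  # 20 mm x 20 mm die, 16 IPs
print(bft_link_lengths(400.0, 64))  # same die, 64 IPs
```

In both cases the last entry, the link into the top level, is half the die side, independent of the number of IPs.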
Thus, the total length of interconnects and the total number of repeaters can be calculated using the following expressions:
$$l_{total} = \frac{\sqrt{Area}}{2^{\,levels}}\left[\,levels \cdot N \cdot N_{wires}\right] \tag{24}$$
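$$N_{repeaters} = N\,N_{wires}\left[\frac{l_{1,0}}{k_{opt}} + \frac{1}{2}\,\frac{l_{2,1}}{k_{opt}} + \frac{1}{4}\,\frac{l_{3,2}}{k_{opt}} + \cdots + \frac{1}{2^{\,\log_4(N)-1}}\,\frac{l_{levels,\,levels-1}}{k_{opt}}\right] \tag{25}$$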
where $k_{opt}$ is the optimal length of the global interconnect between two repeaters.
Fig. 10. Layout for a 16-IP SPIN architecture.
Using the total number of switches, the total wire length, and the total number of required repeaters, the total power dissipation of the BFT architecture can be calculated using the following expression:
totalBFT
levels
Pswitch
= 2
Area
[levelsNNwires]cVdd
f
2levels
1,0
2,1
3,2
+ N Nwires
opt
opt
opt
Fig. 11. Layout for an Octagon architecture.
levels
levels
log
4(N)
opt
equation
:
3N
(26)
totalspin=
Pswitch+0.875
area NN
wires
cV
dd
+ N Nwires
area
area
area
8k
opt
4k
opt
2k
opt
(29)
C. SPIN Architecture
SPIN is proposed in [3]. This network makes use of a fat-tree topology; every node has four sons, and the father is replicated four times at any level of the tree. This topology carries some redundant paths and therefore offers higher throughput at the cost of added area. This topology is scalable and uses a small number of routers for a given number of IPs. In a large SPIN (>16 IPs), the total number of switches is 3N/4 [3]. An efficient floorplan for the SPIN architecture is shown in Fig. 10.
With this floorplan, the inter-switch wire length can be determined using (23). The total wire length and the number of repeaters can be calculated using the following expressions:
$$l_{spin,tot} = 0.875\,\sqrt{Area}\;N\,N_{wires} \tag{27}$$
$$N_{repeaters} = N\,N_{wires}\left(\frac{\sqrt{Area}}{8\,k_{opt}} + \frac{\sqrt{Area}}{4\,k_{opt}} + \frac{\sqrt{Area}}{2\,k_{opt}}\right) \tag{28}$$
The total power consumption of a SPIN architecture, using the total length of the interconnects and the total number of required repeaters, can thus be calculated using (29).
D. Octagon Architecture
The Octagon network topology is proposed in [4] as an on-chip communication architecture for network processors. A basic octagon unit consists of eight nodes and 12 bidirectional links. Each node is associated with one IP and two neighboring switches. Communication between any pair of nodes takes at most two hops. The number of switches required in an octagon unit is equal to the number of IPs. For a system containing more than eight nodes, the octagon unit is expanded to multidimensional space using multiple basic octagon units. An efficient layout scheme for a basic octagon unit is shown in Fig. 11.
Switches are represented by smaller rectangles and IPs by bigger rectangles. Depending on the layout style, such as the one shown in Fig. 11, four different inter-switch wire lengths are needed in the octagon architecture [8]. The first set connects nodes 1–5 and 4–8, the second set connects nodes 2–6 and 3–7, the third connects nodes 1–8 and 4–5, and the fourth connects nodes 1–2, 2–3, 3–4, 5–6, 6–7, and 7–8. The inter-switch wire lengths can be calculated using the following
expressions:
3L
l1=
(30)
l2
=
13 w N
l wires+
(31)
13 w N
(32)
wires
l4=
L
(33)
where
L
isthe
lengthof
four
nodes
and
is
equalto
(4
is
the
summation
of
global
intercon
Area
/
nectwidthandspace.Consideringdifferentinterswitchwire
lengths,thetotallengthofinterconnectandtotalnumberof
requiredrepeaterscouldbecalculatedusingthefollowing
expressions:
ltotal=
L+52wlNwiresNwiresNoct
(34)
= 2
3L/4
13w N
l wires
L4
repeaters
+2
+ /
opt
opt
13w
N
L/4
+2
lwires
+6
NwiresNoct(35)
opt
opt
where $N_{oct}$ is the number of basic octagon units.
Fig. 12. Scaling of IP cores as technology scales.
The total power dissipation for the octagon network can thus be calculated using the following expression:
total
= Pswitches +14
cVdd f
3L/4
13w N
l wires
L/4
+ Nwires Noct 2
+2
opt
opt
13w N
l wires
L/4
+2
+6
opt
opt
(36)

VI. IP-BASED DESIGNS AND EFFECT OF NoC SIZES
IP-based design is now the dominant way to design any large system containing billions of transistors in a reasonable amount of time. IP-based design differs from custom design in that IPs are designed well before they are used. Therefore, in these designs, most of the system requirements, such as bandwidth, area, and power consumption, are known a priori. The life cycle of finely designed IPs may stretch over years, from the time they are first created, through several generations of technology, until their final retirement. For this reason, IP-based NoC design is a dominant design methodology. Many different types of configuration are possible with IP designs. Depending on the number of IPs, different sizes of NoCs may be required. As an example, an NoC with a large number of IPs is a fine-grained network, whereas one with fewer IPs is a coarse-grained network. The granularity of an NoC directly impacts its power consumption. There is a direct relationship between the size of an NoC and the length of its interconnects if the die size is kept constant. In addition, from one technology node to another, the design may experience similar effects, since it is a natural progression that, with every generation of technology scaling, the capacity to integrate similar types of IPs doubles or their area halves. A natural progression for the number of IPs that can fit on the same die due to technology scaling is shown in Fig. 12.
It is important to know the impact of the NoC dimension on the interconnects, and hence on the total power consumption. Recently, a fair amount of research has been dedicated to efficiently mapping IPs in NoC designs. In this paper, functional IP blocks are not discussed, since they depend on the specific application. However, for the purposes of this paper, they may be considered as a set of embedded processors. In this experiment, we show that NoC power is a function of the number of IPs being integrated and the die size. Depending on the number of IPs and the estimated die area they require, a topology for low-power design can be selected using the power models presented in the previous sections. For different NoC architectures, power dissipation varies because of differences in interconnect wire lengths, number of switches, and total number of repeaters required by the topology. The total number of IPs being integrated on a given area can determine whether repeater insertion is required for the interconnects, and may lead to different area and power results. As the number of IPs is increased for a given die area, some topologies scale well, with shorter interconnect lengths, whereas others do not. The lengths of the longest interconnect for different NoC architectures, as the number of IPs is increased on a 20 mm × 20 mm die, are shown in Fig. 13. Interconnects for the CLICHÉ and octagon architectures scale well, i.e., the length of the longest interconnect is reduced as the number of IPs increases. In other topologies, such as the BFT and SPIN architectures, the length of the longest interconnect does not scale because of the physical arrangement of the switches in
the layout. The layout of a BFT architecture for 64 and 256 IPs is shown in Fig. 14. In the layout, the longest interconnects are marked in red, and they remain unchanged as the number of IPs increases.
Fig. 13. Scaling of the longest interconnect versus the number of IPs.
Fig. 14. Layout of BFT architecture for 64 and 256 IPs.
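The scaling behavior described above can be sketched for the 20 mm × 20 mm die. The example rests on two assumptions: the uniform CLICHÉ link length is taken as the die side divided by sqrt(N), which follows from a regular mesh floorplan but is not legible in the extracted equation, and the longest BFT link is taken from (23) with a = levels − 1, i.e., half the die side. The 1.44 mm critical length is the value quoted for 65 nm in Section VII; the function names are hypothetical.

```python
import math

DIE_SIDE_MM = 20.0         # 20 mm x 20 mm die
CRITICAL_LENGTH_MM = 1.44  # critical interconnect length quoted for 65 nm


def cliche_longest_link(side_mm, n_ips):
    """Uniform mesh link length, assuming a regular sqrt(N) x sqrt(N) floorplan."""
    return side_mm / math.sqrt(n_ips)


def bft_longest_link(side_mm):
    """Top-level BFT link from (23) with a = levels - 1."""
    return side_mm / 2.0


def needs_repeaters(length_mm):
    """True when a link is longer than the critical length."""
    return length_mm > CRITICAL_LENGTH_MM


for n in (16, 64, 256):
    mesh = cliche_longest_link(DIE_SIDE_MM, n)
    tree = bft_longest_link(DIE_SIDE_MM)
    print(n, mesh, needs_repeaters(mesh), tree, needs_repeaters(tree))
```

Under these assumptions, the mesh link drops below the critical length at 256 IPs, whereas the longest BFT link stays at half the die side and always needs repeaters.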
For a desired bandwidth requirement, longer interconnects require optimization of width and spacing along with repeater insertion. As the link length increases, the link power consumption grows substantially. This shows that wire power consumption must be considered during the initial design phases. With the power models presented in the previous section, a systematic design approach to select an optimal topology for a low-power design solution is shown in Fig. 15.
The proposed synthesis approach can be used as a design space exploration tool to evaluate the efficiency of different NoC topologies. The flow is applicable to any NoC topology and is consistent with the flow presented in this paper. Power analysis of different NoC topologies (based on the models developed earlier) is presented in Section VII.
VII. SIMULATION RESULTS
To observe the importance of considering power consumption during the synthesis process, we evaluated the power dissipation of different NoC topologies through various experimental setups. Many different router designs supporting different NoC topologies are implemented using ARM's standard cell library in the 65-nm Taiwan Semiconductor Manufacturing Corporation design process. Synopsys's PrimeTime PX tool is used to calculate the average power dissipation of the router designs. The power dissipation of a router design is directly proportional to the number of ports in the design. For example, a six-port router design consumes 9.62 mW of power at a frequency of 200 MHz. Using the method presented in [8], for the 65-nm technology node, the critical interconnect length is 1.44 mm and an optimal repeater size of 105 is used. The links are assumed to be bidirectional, with eight data lines and two control signal lines per link. For power calculations, an optimal interconnect width of 799 nm and an optimal interconnect spacing of 329 nm [8] are used. Using the power models and design flow presented earlier, the power variation among different NoC topologies is shown in Figs. 16–19. A range of 16–1024 IPs and die sizes of 25–400 mm² are used. The SPIN topology consumes the highest power, whereas BFT is more power efficient. The SPIN topology has the highest wiring demand, and as link lengths start to increase, the link power consumption dominates. This shows that power consumption by interconnects must be included in the initial NoC synthesis phase, as is done in the approach presented here. Considering a die size of 20 mm × 20 mm, the power consumed by different NoC architecture sizes (for 16, 64, and 256 IPs) is presented in Tables I–IV. Power consumption by wires and repeaters is also presented. A system overhead in terms of 100 W [17] of power is evaluated.
Fig. 15. IP-based design methodology for low-power NoC.
Fig. 16. Power for the CLICHÉ architecture.
Fig. 17. Power for the BFT architecture.
Fig. 19. Power for the Octagon architecture.
TABLE I
TOTAL NUMBER OF REPEATERS AND METAL RESOURCES REQUIRED TO IMPLEMENT CLICHÉ ARCHITECTURE
TABLE II
TOTAL NUMBER OF REPEATERS AND METAL RESOURCES REQUIRED TO IMPLEMENT BFT ARCHITECTURE
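The link geometry used in these experiments fixes the routing width of each link through relation (8). The sketch below applies (8) to the quoted values (eight data wires plus two control wires, a width of 799 nm, and a spacing of 329 nm) and also inverts it to show how many wires fit in a given routing width; the function names are hypothetical.

```python
def link_width_nm(n_wires, width_nm, spacing_nm):
    """Routing width of a link per (8): N*W + (N - 1)*S."""
    return n_wires * width_nm + (n_wires - 1) * spacing_nm


def wires_that_fit(available_nm, width_nm, spacing_nm):
    """Largest N with N*W + (N - 1)*S <= available width (inverse of (8))."""
    return (available_nm + spacing_nm) // (width_nm + spacing_nm)


W_NM, S_NM = 799, 329
N_WIRES = 8 + 2  # eight data lines and two control lines per link

total_nm = link_width_nm(N_WIRES, W_NM, S_NM)
print(total_nm)                                # routing width in nm
print(wires_that_fit(total_nm, W_NM, S_NM))    # recovers the wire count
print(wires_that_fit(total_nm, 2 * W_NM, 2 * S_NM))  # doubled width and spacing
```

Doubling both W and S in the same routing width halves the number of wires, which is the bandwidth ceiling discussed in Section III.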
Fig. 18. Power for the SPIN architecture.
A detailed analysis of power consumption helps designers save more power through the different approaches that may be applicable. In the SPIN topology with 256 IPs, repeaters alone can consume as much as 1209.6 mW of power; this is quite significant, considering the size and high-end power budget of the chip. The total power consumed by the BFT architecture is the lowest in all three cases; however, it is important to observe how different components contribute to the total power
2594
IEEETRANSACTIONSONVERYLARGESCALEINTEGRATION(VLSI)SYSTEMS,VOL.22,NO.12,DECEMBER2014
TABLEIII
IMPLEMENTOCTAGONARCHITECTURE
TOTALNUMBEROFREPEATERSANDMETALRESOURCESREQUIREDTO
VIII.CONCLUSION
IMPLEMENTSPINARCHITECTURE
Inthispaper,anefficientdesignmethodologytoestimateNoC
poweratthearchitecturelevelispresented.Theanalysisis
basedonlayoutandpowermodels.Toachievealowpower
NoCarchitecturedesign,anaccurateestimationofpowerand
areabudgetsisimportantintheearlyphasesofdesign.
Differenceintheareaandpowerbudgetsofalogicdesign
versusphysicaldesigncaneitheroffsetadesigncompletelyor
mayresultinendlessiterations.Inaconventionaldigital
designflow,severaliterationsoflogicsynthesisandphysical
designarerequiredbeforeconvergencetodesignspecification
isachieved.Inthispaper,asystematicapproachisshownto
tackletheissuethroughpowermodelingandperformance
analysiswithareaandlayoutawareness.AsystemlevelSoC
designercanapplythesemodelstoacceleratetheNoCdesign
processforlowpowersolutionandfastertimingclosure.The
impactofinterconnectwidthandspacingontheareaand
powerdissipationisanalyzed.Thetradeoffbetweendelayand
bandwidthisusedasafigureofmeritforinterconnects
performance.3Dgraphsofpowerasafunctionofdiearea
andnumberofIPsarepresented.Innanometerdesigns,
interconnectspowerconsumptionissignificant,andthus
needstobeincludedattheearlystagesofdesigncycles.
TABLEIV
TOTALNUMBEROFREPEATERSANDMETALRESOURCESREQUIREDTO
REFERENCES
K.SundaresanandN.Mahapatra,Anaccurateenergyandthermalmodelfor
globalsignalbuses,inProc.18thInt.Conf.VLSIDesign,Jan.2005,pp.
685690.
K.SundaresanandN.Mahapatra,Accurateenergydissipationandthermal
modelingfornanometerscalebuses,inProc.11thInt.Symp.HPCA,Feb.
2005,pp.5160.
X.C.Li,J.F.Mao,H.F.Huang,andY.Liu,Globalintercon
nectwidthandspacingoptimizationforlatency,
Fig.20.ContributiontototalpowerbydifferentNoC(router,wires,and
repeaters)components.
bandwidth
and
consumptions.Amoredetailedparameterizedcontributionby
differentNoCtopologiesinthecaseof64IPsisshowninFig.
20.Itisinterestingtonotethat,inCLICHarchitecture,the
biggestsourceofpowerconsumptionisswitches,althoughit
issecondtoBFTintotalpowerconsumption.CLICH,
consumeslesspowerininterconnectsandrepeaters,incom
parisonwiththeotherarchitectures.Thus,theexplorationof
powerconsumptionbyindividualcomponentsishelpfuland
efficientinprovidingmeaningfulinsight.
S.PasrichaandN.Dutt,OnChipCommunicationArchitecures:Sytemon
ChipInterconnect.SanMateo,CA,USA:MorganKaufmann,2008.
S.Kumar,etal.,Anetworkonchiparchitectureanddesignmethodology,in
Proc.IEEEComput.Soc.Annu.Symp.VLSI,2002,pp.117124.
P.GuerrierandA.Greiner,Agenericarchitectureforonchippacket
switchedinterconnections,inProc.DesignAutom.TestEur.Conf.Exhibit.,
Mar.2000,pp.250256.
F.Karim,A.Nguyen,andS.Dey,Aninterconnectarchitecturefor
networkingsystemsonchips,IEEEMicro,vol.22,no.5,pp.3645,
Sep./Oct.2002.
P.Pande,C.Grecu,A.Ivanov,andR.Saleh,Designofaswitchfornetwork
onchipapplications,inProc.IEEEInt.Symp.CircuitsSyst.,vol.5.May
2003,pp.217220.
powerdissipation,IEEETrans.ElectronDevices,
vol.52,no.
10,
22722279,Oct.2005.
G.ReehalandM.Ismail,Layoutawarehighperformanceinterconnectsfor
networkonchipdesignindeepnanometertechnologies,inProc.IEEE6thInt.
DesignTestWorkshop,Dec.2011,pp.5861.
C.Grecu,P.Pande,A.Ivanov,andR.Saleh,Timinganalysisofnetworkon
chiparchitecturesforMPSoCplatforms,Microelectron.J.,vol.36,no.9,
pp.833845,Sep.2005.
P.P.Pande,C.Grecu,M.Jones,A.Lvanov,andR.Saleh,Performance
evaluationanddesigntradeoffsfornetworkonchipinterconnect
architectures,IEEETrans.Comput.,vol.54,no.8,pp.10251040,Aug.
2005.
L. Benini and G. de Micheli, "Networks on chips: A new SoC paradigm," IEEE Comput., vol. 35, no. 1, pp. 70-78, Jan. 2002.
A. Balakrishnan and A. Naeemi, "Optimal global interconnects for network-on-chip in many-core architectures," IEEE Electron Device Lett., vol. 31, no. 4, pp. 290-292, Apr. 2010.
Y. Hoskote, S. Vangal, A. Singh, N. Borkar, and S. Borkar, "A 5-GHz mesh interconnect for a teraflop processor," IEEE Micro, vol. 27, no. 5, pp. 51-61, Sep./Oct. 2007.
L. Ost, G. Guindani, L. Indrusiak, and S. Maatta, "Exploring NoC-based MPSoC design space with power estimation models," IEEE J. Des. Test Comput., vol. 28, no. 2, pp. 16-29, Mar./Apr. 2011.
G. Reehal, "Designing low power and high performance network-on-chip communication architectures for nanometer SoCs," Ph.D. thesis, Dept. Electr. Comput. Eng., Ohio State Univ., Columbus, OH, USA, 2012.
L. Xue, W. Ji, Q. Zuo, and Y. Zhang, "Floorplanning exploration and performance evaluation of a new network-on-chip," in Proc. IEEE Design, Autom. Test Conf. Eur. (DATE), Mar. 2011, pp. 1-6.
[16] M. B. Taylor, W. Lee, J. Miller, D. Wentzlaff, I. Bratt, Greenwald, et al., "Evaluation of raw microprocessor: An exposed-wire-delay architecture for ILP and streams," IEEE ISCA, vol. 32, no. 2, pp. 2-13, Mar. 2004.
K. Latif, A. Rahmani, T. Seceleanu, and H. Tenhunen, "Power and performance aware IP mapping for NoC-based MPSoC platform," in Proc. 17th IEEE ICECS, Dec. 2010, pp. 758-761.
G. Reehal, M. A. Abd Elghany, and M. Ismail, "Octagon architecture for low power and high performance NoC design," in Proc. IEEE NAECON, Jul. 2012, pp. 63-67.
L. P. Carloni, A. B. Kahng, S. Muddu, A. Pinto, K. Samadi, and Sharma, "Interconnect modeling for improved system-level design optimization," in Proc. ASPDAC, Mar. 2008, pp. 258-264.
H. Elmiligi, A. Morgan, M. El-Kharashi, and F. Gebali, "Power optimization for application-specific network-on-chips: A topology-based approach," J. Microprocess. Microsyst., vol. 33, nos. 5-6, pp. 343-355, Aug. 2009.
J. Postman, T. Krishna, C. Edmonds, L. Peh, and P. Chiang, "SWIFT: A low-power network-on-chip implementing the token flow control router architecture with swing-reduced interconnects," IEEE Trans. Very Large Scale Integr. (VLSI) Syst., vol. 21, no. 8, pp. 1432-1446, Aug. 2013.
[27]
S. Murali, D. Atienza, P. Meloni, S. Carta, L. Benini, G. Micheli, et al., "Synthesis of predictable networks-on-chip-based interconnect architectures for chip multiprocessors," IEEE Trans. Very Large Scale Integr. (VLSI) Syst., vol. 15, no. 8, pp. 869-880, Aug. 2007.
S. Vangal, A. Singh, J. Howard, S. Dighe, N. Borkar, and A. Alvandpour, "A 5.1 GHz 0.34 mm² router for network-on-chip applications," in IEEE Symp. VLSI Circuits Dig. Tech., Jun. 2007, pp. 42-43.
G. Khan and A. Tino, "Synthesis of NoC interconnects for multi-core architectures," in Proc. IEEE 6th Int. Conf. CISIS, Jul. 2012, pp. 432-437.
K. Bhardwaj and R. Jena, "Energy and bandwidth aware mapping of IPs onto regular NoC architectures using multi-objective genetic algorithm," in Proc. IEEE Int. Symp. Syst. Chip, Oct. 2009, pp. 27-31.
X. Wang, M. Yang, Y. Jiang, and P. Liu, "Power-aware mapping for network-on-chip architectures under bandwidth and latency constraints," in Proc. IEEE 4th Int. Conf. Embedded Multimedia Comput., Dec. 2009, pp. 1-6.
International Technology Roadmap for Semiconductors. Denver, CO, USA. (2007). International Technology Roadmap for Semiconductors System Drivers [Online]. Available: http://www.itrs.net/
J. Liu, L. R. Zheng, D. Pamunuwa, and H. Tenhunen, "A global wire planning scheme for network-on-chip," in Proc. ISCAS, vol. 4. May 2003, pp. 892-895.
W. Dally and J. Poulton, Digital Systems Engineering. Cambridge, U.K.: Cambridge Univ. Press, 2008.
M. Kim, D. Kim, and G. Sobelman, "Network-on-chip link analysis under power and performance constraints," in Proc. Int. Symp. Circuits Syst., 2003, pp. 4163-4166.
D. Pandini, C. Forzan, and L. Baldi, "Design methodologies and architecture solutions for high-performance interconnects," in Proc. IEEE ICCD, Oct. 2004, pp. 152-159.
Y. Shin and H. Kim, "Analysis of power consumption in VLSI global interconnects," in Proc. IEEE Int. Symp. Circuits Syst., May 2005, pp. 4713-4716.
C. Grecu, P. Pande, A. Ivanov, and R. Saleh, "A scalable communication-centric SoC interconnect architecture," in Proc. IEEE ISQED, 2004, pp. 343-348.
Gursharan Reehal (M'09) received the B.S. (Hons.) degree in electrical engineering and the M.S. and Ph.D. degrees in electrical and computer engineering from The Ohio State University, Columbus, OH, USA, in 1996, 1998, and 2012, respectively.
After completing the M.S. degree, she joined Lucent Technologies, Columbus, OH, USA, as a Hardware Test Engineer. She is currently a Senior Lecturer with the Department of Electrical and Computer Engineering, The Ohio State University. Her current research interests include low power digital VLSI design, network-on-chip (NoC) communication architectures, high-performance I/O interfaces, embedded systems for medical applications, and reconfigurable computing.
Dr. Reehal is a member of the Engineering Honor Society, Tau Beta Pi, and the Electrical Engineering Honor Society, Eta Kappa Nu. She received the Best Paper Award from the IEEE System-on-Chip Conference in 2010 for her work in the area of low power and high performance NoCs and the prestigious Shining Star award from the Wireless Network Group, Lucent Technologies.
Mohammed Ismail (F'09) is a prolific author and entrepreneur in the field of chip design and test in academia and industry in the U.S. and Europe. He is the Founder of The Ohio State University (OSU) Analog VLSI Laboratory, one of the foremost research entities in the field of analog, mixed-signal, and RF integrated circuits. He served on the Faculty of the ElectroScience Laboratory, OSU. He held a Research Chair position at the Swedish Royal Institute of Technology, Stockholm, Sweden, where he founded the Radio and Mixed Signal Integrated Systems (RaMSIS) Research Group. He was with Aalto University, Espoo, Finland; NTH and the University of Oslo, Norway; Twente University, Enschede, The Netherlands; and the Tokyo Institute of Technology, Tokyo, Japan. He joined Khalifa University of Science, Technology and Research (KUSTAR), Abu Dhabi, UAE, in 2011, where he holds the ATIC Professor Chair and is the Head of the Electrical and Computer Engineering Department on both KUSTAR's campuses in Sharjah and Abu Dhabi. He is serving as Co-Director of the ATIC-SRC Center of Excellence on Energy Efficient Electronic Systems targeting self-powered chip sets for wireless sensing and monitoring, biochips, and power management solutions. He has advised the work of over 50 Ph.D. students and of over 100 M.S. students. He has authored or coauthored over 12 books and over 150 journal publications and has seven U.S. patents. His current research interests include self-healing design techniques for CMOS RF and mm-wave ICs in deep nanometer nodes. He served as a Corporate Consultant to over 30 companies and is a Co-Founder of Micrys Inc., Columbus, OH, USA, Spirea AB, Stockholm, Firstpass Technologies, Inc., Dublin, OH, USA, and ANACAD-Egypt (now part of Mentor Graphics).
Dr. Ismail is the Founding Editor of the Springer Journal of Analog Integrated Circuits and Signal Processing and serves as the Journal's Editor-in-Chief. He served the IEEE in many editorial and administrative capacities. He is the Founder of the IEEE International Conference on Electronics, Circuits and Systems, the Flagship Region 8 Conference of the IEEE Circuits and Systems Society. He received the U.S. Presidential Young Investigator Award, the Ohio State Lumley Research Award four times, in 1992, 1997, 2002, and 2007, and the U.S. Semiconductor Research Corporation's Inventor Recognition Award twice.