This document summarizes the usual practices and strategies that can be used in RC optimization. It also
includes some new attributes that were introduced in RC version 11.2.
Contents
1 PURPOSE
2 RC SYNTHESIS FLOW
2.1 What's Needed to Synthesize a Design with RC?
2.2 What Tool Does During Synthesis?
3 Area-based Optimizations
3.1 General Recommendations
3.2 New Attributes
4 Timing-based Optimizations
4.1 General Recommendations
5 Power Optimization
5.1 Reducing Dynamic Power
5.2 Reducing Leakage Power
5.3 Reducing both dynamic power and leakage power
5.4 Reporting power
6 Clock Gating Enhancements
7 Basic DFT Flow & Useful Commands
8 Special Care abouts & Recommendations
9 LEC Flow & Care Abouts
10 Tcl Procedures
11 Some useful Utilities and Applets
1 PURPOSE
This document summarizes the usual practices and strategies that can be used in RC optimization. This
AppNote also includes some new attributes that are introduced in RC 11.1 and 11.2 versions.
2 RC SYNTHESIS FLOW
2.1 What's needed to synthesize a Design with RC?
Note : For PLE synthesis, DEF, LEF and Captable files are needed.
(i) Elaboration.
Before elaboration, the design should be read in along with the other inputs. Make sure that the general
compiler settings are in place.
Elaboration is required on the top-level design, which automatically elaborates all of its references.
The elaborate command does the following:
- Builds data structures and infers registers in the design.
- Performs high-level HDL optimization, such as dead code removal.
- Identifies clock gating and operand isolation candidates.
- Structuring
- Mapping
- Target setting
- Restructuring/Mapping as per target settings
(i) Analyze the timing reports prior to synthesis, i.e., at the pre-synthesis diagnosis stage. If the I2C and
I2O paths are already meeting timing, then area can be recovered from these paths. To do
this, under-constrain these path groups before the global mapping stage.
Ex: The following command relaxes the timing on I2O paths by 200 ps, so the
effective clock period for these paths is [Clock_period + 200 ps].
path_adjust -name UNDER_CON -delay 200 -from [all::all_inps] -to [all::all_outs]
After the global mapping phase, remove the path_adjust so that the timing reports are
accurate with respect to the actual clock period.
(ii) As per the LEC requirement, it is preferred to use medium effort for synthesize -to_generic. In
cases where timing is critical, the effort level can be set to high if and only if LEC is not an
issue. With high effort, RC uses more aggressive CSA optimization algorithms for better timing. This
might result in larger area as well.
(iii) Set the attribute drc_first to false if you have set it to true (the default is false). It must
be false so that RC does not fix DRCs aggressively and bump up the area.
(iv) Review and remove any preserve attributes (dont_touch) that are not needed. This will
help RC to optimize the results better.
(v) Check if there are any datapath elements in the critical paths with preserve attributes.
(vi) Set the attribute dp_postmap_downsize to true on data path elements present in the
design before incremental synthesis. This will perform architecture downsizing after
mapping. This attribute is effective only in incremental optimization.
(vii) If timing is not critical in the design, you do not need to turn on the attribute tns_opto.
(viii) Area multiplier: The area of a cell can be modified by using the attribute area_multiplier.
The default value for this attribute is 1.0. Setting it to a value below the default
favors the cell being picked by the tool. For example, when applied on complex cells, it can
bias the tool toward mapping them on non-critical paths. To do so, set the
area_multiplier value to less than 1.0 for these kinds of cells. However, for accurate area
reporting, change the multiplier back to 1.0 before issuing any cell or area reporting
command.
(ix) Check whether proper complex libcells are selected when needed. If not, bias the tool
to pick them using the area multiplier trick discussed above.
(x) If you have a max_delay constraint set on your design, RTL Compiler interprets the
constraint more restrictively and may produce larger area and instance count. To improve
area, check for such constraints and convert them into set_input_delay and set_output_delay constraints
that reference the appropriate clock.
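The under-constraining step from item (i) can be sketched as a short script. This is only a sketch: the path_adjust call is the one given above, while the synthesize options and the rm/find call used to delete the exception are assumptions that should be checked against the command reference for your RC release.

```tcl
# Sketch only: under-constrain input-to-output paths, map, then restore.
# ASSUMPTIONS (not taken from this note): the "-to_mapped -no_incr" options
# and the "rm [find / -exception ...]" call used to remove the adjustment.

# Relax I2O paths by 200 ps before global mapping.
path_adjust -name UNDER_CON -delay 200 \
    -from [all::all_inps] -to [all::all_outs]

# Global mapping without the incremental step (assumed option names).
synthesize -to_mapped -no_incr

# Remove the adjustment so timing is reported against the real clock period.
rm [find / -exception UNDER_CON]

report timing
```

Removing the adjustment before reporting matters because any report taken while UNDER_CON is active shows slack against the relaxed period, not the real one.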
4 Timing-based Optimizations
4.1 General Recommendations
The following steps, in addition to those mentioned in the flow, improve
performance:
(i) Do a detailed analysis of the timing reports in RC just prior to synthesis, i.e., just after loading the
RTL into RC (pre-synthesis diagnosis). If a MACRO with a huge delay sits in the critical path of the
reg-to-reg path groups, then make a separate path group for all
the paths passing through that MACRO. This greatly improves the optimization of all the
other reg-to-reg paths not passing through that MACRO. The MACRO paths can also be
optimized better by over-constraining them.
(ii) At the beginning of global mapping, based on the libraries, logic structure, and constraints, RC
estimates a target slack for each cost group and tries to meet these numbers. In the logfile,
look for the words "target slack". A cost group with a large negative target slack normally indicates
a problem area. If the constraints are clean, one could set the initial target to 0 or some
positive value using the initial_target attribute. This forces the mapper to do aggressive
structuring and optimization on that cost group to meet the target.
set_attribute initial_target 0 [find / -cost_group name_cg]
(iii) Over-constrain the paths that are not meeting timing.
Ex: The following command over-constrains the reg-to-reg paths by 200 ps, so the
effective clock period for these paths is now [Clock_period - 200 ps]. The tool will therefore try
to meet its target within this effective clock period.
path_adjust -name OVER_CON -delay -200 -from [all::all_seqs] -to [all::all_seqs]
Use this command before the global mapping stage. After global mapping, remove the
over-constraint and run report timing, so that the timing reports are with respect to the
actual clock period.
(iv) If timing degradation is due to high-fanout nets in the design, idealizing those nets improves
timing at the synthesis stage. In the later stages, a place and route tool can build a better
buffer tree for these nets. The script in the Tcl Procedures section can be used to
find the high-fanout nets and idealize them.
(v) Using low-Vth cells significantly improves timing, but it also significantly
increases power because these cells have high leakage.
(vi) Set the attribute tns_opto to true. When set, it forces the tool to consider all the
endpoints for optimization.
Note: This may increase the area.
(vii) Make sure that the attribute drc_first is not set to true. By default it is set to false. If
set to true, the tool gives higher priority to design rule constraints than to timing
constraints.
(viii) Set the attribute iopt_ultra_optimization to true. When set, it enables ultra optimization
in incremental optimization to achieve the best QoR at the cost of higher runtime.
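The effect of a path_adjust delay on the effective clock period, as used in item (iii), is simple arithmetic. The helper below is a plain Tcl illustration (it does not call any RC command); the proc name, argument names, and the 1000 ps clock are hypothetical.

```tcl
# Effective clock period seen by paths that carry a path_adjust.
# clock_period_ps and adjust_ps are illustrative names, both in picoseconds.
proc effective_period {clock_period_ps adjust_ps} {
    return [expr {$clock_period_ps + $adjust_ps}]
}

# Hypothetical 1000 ps clock.
puts [effective_period 1000 -200]   ;# over-constrained by 200 ps
puts [effective_period 1000  200]   ;# under-constrained by 200 ps
```

A negative delay shortens the period the tool optimizes against (over-constraint); a positive delay lengthens it (under-constraint).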
5 Power Optimization
5.1 Reducing Dynamic Power
RC can read either the Toggle Count Format (TCF) or the Switching Activity Interchange Format
(SAIF) for the switching power information.
Set rtl_name_scope_framework to true before reading the RTL so that RTL names can be related to the
internal object association information present in the TCF or SAIF file.
For dynamic power optimization, after reading in the libraries, loading the design, and elaborating,
set the following:
If no TCF/SAIF file is provided, the RC engine uses default values for the toggle rate. From version 11.1
onwards, it is ON with a value of 0.20 (20%).
Enabling Clock Gating Insertion: To enable clock gating insertion during synthesis, set the attribute
lp_insert_clock_gating to true.
5.2 Reducing Leakage Power
To perform leakage power optimization, RC requires multi-Vth libraries. After reading in the
multi-Vth libraries, loading the design, and elaborating, do the following:
specifies the desired value for the leakage power target in nW.
When set to high, RTL Compiler focuses on leakage power optimization and uses mainly high
Vth cells (low leakage power, slow timing performance). The usage of low Vth cells (faster cells, higher
leakage power) is restricted.
Note : There will be an impact on the runtime when these attributes are set.
Controls the weight factors to be used when optimizing leakage power and dynamic power
simultaneously during global mapping, mapping, and incremental optimization. Specify a value
between zero and one. Assuming the attribute is set to w, the RC-LP engine optimizes for total
power. Total power is computed as follows:
The weight factor is taken into account only when both the max_dynamic_power and
max_leakage_power attributes are set. If you do not set this attribute, the tool issues warning POPT-501
and ignores dynamic power optimization.
Multibit coverage: Make sure that the root-level attribute use_multibit_cells is not set to false. By
default it is set to true. The advantages of multibit cells are as follows:
a. Smaller area and delay, due to shared transistors (as in select or set/reset logic) and
optimized transistor-level layout. In the use of single-bit components, the select or
set/reset logic is repeated in each single-bit component.
b. Reduced clock skew in sequential gates, because the clock paths are balanced
internally in the hard macro implementing the multibit component.
c. Lower power consumption by the clock in sequential banked components, due to
reduced capacitance driven by the clock net.
d. Better performance, due to the optimized layout within the multibit component.
RTL Compiler recognizes the following style of components for multibit merging:
a. Flops (non-scan and scan) with one or more of the following shared input pins: flop
clock, async_set, async_reset, sync_set, sync_reset, sync_enable
b. Latches with one or more shared control pins: latch gate/ enable, async_set,
async_reset
c. Three-state cells that share the enable pins.
d. Combinatorial cells (muxes, inverters, nand, nor, xor, xnor) that share all pins, that
are not bundled in the Liberty description.
e. State retention (SRPG) cells that share the same retention control pin(s)
Below is the RTL of a 4-bit latch that can be inferred as a multibit cell by the tool:
module ff(ld,rst,cp,d,q);
  input ld,rst,cp;
  input [3:0] d;
  output [3:0] q;
  reg [3:0] q;
  always @(cp) begin
    if (rst==0) q = 0;
    else if (ld==1) q = d;
  end
endmodule
As the module contains the common control pins ld and rst, the tool maps the latch to a multibit
cell.
lp_clock_gating_register_aware
Root-level boolean attribute. The default is false.
When this is true, clock gating is register-bank aware.
For example, if we have flops like a[0], a[1], a[2], b[0], b[1], b[2] in the same hierarchy, one clock
gating cell is added for the a[0], a[1], a[2] flops and one clock gating cell for the b[0], b[1], b[2]
flops. By default, when this attribute is off, one clock gating cell may be inserted for all the flops.
Note: The clock gating declone command is not register-bank aware.
lp_clock_gating_rc_inserted
This is an instance-level boolean attribute.
If queried on an instance, it returns true if the instance is an RC-inserted clock gating instance.
Otherwise it returns false.
lp_clock_gating_gated_clock_gates
This is an instance-level attribute.
When queried on a clock gating instance, it returns a list of the clock gates immediately gated by
this clock gating instance.
lp_clock_gating_stage
This is an instance-level attribute that returns an integer.
If queried on a clock gating instance, it returns the stage of that clock gating instance.
For example, if CG1 is gating CG2 and CG2 is gating a flop, then the stage of CG2 is 2 and the stage of CG1
is 1.
lp_clock_gating_gated_flops
This is an instance-level attribute. When queried on a clock gating instance, it returns the
flops immediately gated by this clock gating cell.
Note: Using the above attributes, there is a script available in the Tcl Procedures section to get the
RC-inserted and non-RC-inserted CGs and their corresponding flops in a design.
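As a quick illustration of how these attributes combine, the loop below lists the RC-inserted clock gates with their stage and what they gate. It is a minimal sketch, not the script from the Tcl Procedures section; the use of find / -instance * to walk all instances is an assumption, and on a large design the search pattern would normally be narrowed.

```tcl
# Sketch: list RC-inserted clock gates with their stage and gated objects.
# ASSUMPTION: "find / -instance *" returns every instance in the design.
foreach inst [find / -instance *] {
    # True only for clock-gating instances inserted by RC.
    if {![get_attr lp_clock_gating_rc_inserted $inst]} {continue}
    set stage [get_attr lp_clock_gating_stage $inst]
    set flops [get_attr lp_clock_gating_gated_flops $inst]
    set cgs   [get_attr lp_clock_gating_gated_clock_gates $inst]
    puts "[vname $inst] : stage $stage, [llength $flops] flops, [llength $cgs] gated clock gates"
}
```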
set absSyncInst [filter dft_status "Fails DFT rules" \
    [filter user_defined_segment true \
        [find /designs/* -scan_segment *<baseNameOfAbsSegmentsforSyncFlops>*]]]
foreach inst $absSyncInst {
    set_attr dft_dont_scan true $inst
}
(ii) If the core .libs do not have non-scan flops, then to allow the mapper to tie off
the pins of the scan flop (i.e., degenerate the scan flop into a simpler D-flop
implementation) for the non-first registers in a shift register, you must set the
following attribute:
By default, RTL Compiler performs boundary optimization during synthesis for all subdesigns in
the design. The boundary_opto attribute controls boundary optimization on the subdesign and hierarchical pin inversion.
To preserve the input and output pins of a subdesign, you can turn off boundary
optimization. In this case, no hierarchical pin inversion is done for this subdesign either.
Note: To exclude individual pins from boundary optimization, use the preserve attribute.
Boundary optimization can be turned off as below:
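A minimal sketch of such a setting, using the same set_attribute form that this note applies to retimed modules in the LEC section; my_block is a placeholder subdesign name, not a name from any real design.

```tcl
# Turn off boundary optimization for one subdesign.
# my_block is a placeholder; substitute the subdesign to be preserved.
set_attribute boundary_opto false [find / -subdesign my_block]
```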
2. Retiming
Retiming improves the performance of the design by optimizing either the area or the clock
period (timing) of the design. It moves registers across combinational logic to
improve performance without changing the input/output behavior of the circuit. Retiming the
whole design is generally not recommended, for multiple reasons:
a. You lose traceability of the flops, since all the retimed flops become
retime*reg. Verification could be a major issue for both formal and simulation. The
retiming space explodes to the complete design, and flops can get spread out from one
module to one or more others.
b. Runtime increases, since the retiming flow requires an initial mapping
during the prepare phase, the retiming step, and a final mapping on the whole design.
d. On a post-mapped design, you could try incremental retiming, which makes use of
the unbalanced paths. But on typical designs, if the negative slack cannot be improved
by incremental retiming, then success would be limited.
3. Ungrouping
By default, RC ungroups user-created hierarchies during synthesis to improve area and timing.
This can be controlled by setting the attribute auto_ungroup to both or none, which
must be done before synthesis.
none Ungrouping will not be performed.
both Ungrouping will be performed with an emphasis on optimizing both timing and area.
You can also set the ungroup_ok attribute to false to prevent a given subdesign/hierarchy from being
ungrouped. This way, while achieving better QoR, you can also keep the hierarchy of interest intact.
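The two ungrouping controls can be combined as in the sketch below. It assumes auto_ungroup is set on the root (/) like the other root-level attributes in this note, and keep_me is a placeholder subdesign name.

```tcl
# Allow automatic ungrouping for both timing and area (set before synthesis).
# ASSUMPTION: auto_ungroup is applied at the root object "/".
set_attribute auto_ungroup both /

# Keep one hierarchy of interest intact; keep_me is a placeholder name.
set_attribute ungroup_ok false [find / -subdesign keep_me]
```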
PAS works on better structuring of MUXes in the design. The tool has a choice between using binary
and one-hot muxes. Binary muxes are generally good for congestion and worse for timing. PAS tries
to make this choice based on the timing and congestion tradeoff. When PAS is set, timing may
degrade at the synthesis stage, but it helps meet timing during the Iopt and placement stages.
Note: PAS preserves the binary mux structures throughout the flow. Use the following variable to
remove the hard preserve constraint from these inferred binary muxes:
set physical_aware_structuring_structure_preserve 0
This feature helps in reducing Iopt runtimes. Setting this attribute enables the super-threading
feature in the Iopt stage. It is built on the existing super-threading infrastructure. Below is the usage
model:
(i) Unresolved instances during elaboration. The command for checking unresolved
instances is check_design -unresolved.
(ii) Check whether any SDC errors exist and verify the SDC summary report in the log file.
(iii) Warnings pertaining to sequential logic deletion (GLO-32, GLO-12).
(iv) Clock gating fanout statistics.
(v) At the beginning of the global mapping step, RC estimates a target slack for each cost
group. This estimated target is based on the libraries, the logic structure, and the
constraints. RC works toward this target number during the optimization process. In
the logfile, search for the keyword "target slack", as it is printed before and after
the global mapping step for each cost group in the design. A cost group with a large
negative target slack would normally indicate a problem area. Also check whether these
targets are met after mapping.
(vi) On some designs RC might spend a long time in incremental optimization. To debug such cases,
set the debug variable iopt_stats to 1 and the attribute
information_level to 9. When set, RC dumps tables in the logfile as shown
below:
In the table, a huge number in the 'Time' column could indicate a problem
trick. Another thing to watch for is the number of accepts versus attempts.
For example, if a trick gets called many times and only a few attempts are accepted
(5% to 10% of the attempts), that trick can be switched off as shown below:
set trick_name 0
This should be done before the iopt stage in subsequent runs, to reduce runtime.
(vii) Before attempting to run synthesis, the user should check the input data, pay attention
to the warning messages, and correct any obvious issues.
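The accepts-versus-attempts check from item (vi) can be automated once the numbers are extracted from the logfile table. The plain Tcl sketch below shows only the filtering step; the trick names and counts are invented for illustration, and the 10% threshold follows the rule of thumb given above.

```tcl
# Flag incremental-optimization tricks whose acceptance rate is low.
# stats is a flat list of: name attempts accepts (hypothetical data).
proc low_yield_tricks {stats {max_rate 0.10}} {
    set result {}
    foreach {name attempts accepts} $stats {
        # Tricks that were never attempted carry no information.
        if {$attempts == 0} {continue}
        set rate [expr {double($accepts) / $attempts}]
        if {$rate <= $max_rate} {
            lappend result $name
        }
    }
    return $result
}

# Made-up numbers standing in for the logfile table.
set stats {
    trick_a 1200  60
    trick_b  300 150
    trick_c    0   0
    trick_d  500  20
}
puts [low_yield_tricks $stats]
```

Each name returned is a candidate for the set trick_name 0 setting, provided its 'Time' column is also significant.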
(i) set_attr heartbeat 200 / : Prints a short runtime/memory message every 200 seconds.
These timestamps, interspersed in the logfile, show how long the run has been stuck
in a given area.
(ii) set_attr dump_stack_trace <100> / : Prints the timestamp along with the stack trace
at the specified interval.
(iii) set_attr profile_output_filename "file.pf" : Set this attribute at the point that
consumes long runtime to identify which particular routine takes the most time.
During synthesis, RC creates an fv directory and dumps the dofile and intermediate netlist there. The
intermediate netlist is referred to as the G1 netlist and the final synthesized netlist as the G2 netlist.
For RTL-to-netlist verification, it is always better to do a two-step verification as below:
1. If the RTL instantiates a ChipWare/DesignWare component, ensure that LEC uses the
simulation model from the same RC release that synthesized the design. Before generating
the write_do_lec dofile, you must set the following attribute:
With the above attribute, the following lines are inserted at the beginning of
the write_do_lec generated dofile:
tclmode
set env(CDN_SYNTH_ROOT) /path_to_rc_release/rc/rccurrent/tools.lnx86
vpxmode
3. For a design with a parameterized root module, the parameters used during elaboration in RC
should be consistent with those used when reading the design in LEC. In some cases, RC and LEC
produce a different number of modules, especially when the modules perform the same
function but have different parameters.
4. If possible, avoid using combinational loops. Try the following options in LEC if automatic loop
cutting is unbalanced/mismatched:
5. Once the mapping is done, cross-verify the unreachable points dumped by LEC.
6. If the design has DW components, make sure to uncomment the following command in the do
file:
If non-equivalences are present due to a DW component, the following proc might help in solving the
issue:
hidden_proc ::dp_utils::dp_limit_boundary_opto_on_comps {} {
    foreach subdesign [find / -subdesign *] {
        set impl [get_attribute implementation $subdesign]
        if {![string equal $impl ""]} {
            set comp [dirname [dirname $impl]]
            if {[get_attribute carrysave_outputs $comp]} {
                set_attr -quiet boundary_opto false $subdesign
            }
        }
    }
}
This proc essentially switches off boundary optimization on the DW components, and the
resulting netlist can then be made LEC clean.
7. Instance naming in VHDL generate blocks is not covered by the LRM. Currently LEC and RC use
different naming conventions. However, write_do_lec automatically includes a set naming
rule command to make the instance naming consistent.
8. Ungrouping prevents LEC from hierarchically comparing certain sub-modules, which can
cause aborts. Before generating the G1 netlist, do not ungroup any high-complexity sub-modules
(especially those with datapath operators) using:
After generating the G1 netlist, do not change any datapath architecture and do not perform any
retiming or pipelining.
9. In case of mixed RTL usage (Verilog + VHDL), it is better to follow strict LRM rules while writing RTL.
Below is a snippet of the top-level (Verilog) RTL where ctm_top (a VHDL module) is instantiated. LEC will not
elaborate the following RTL scenario:
ctm_top #
(
.CTM_NUMINPT (31),
.CTM_NUMCNTR (8),
.CTM_NUMTIMR (2),
.CTM_TIMEVT_PORTWIDTH(2), //same as CTM_NUMTIMR
.CTM_TIMINTPOLARITY (1),
.CTM_TIMINTWIDTH (2),
.CTM_NUMSTM (0),
.CTM_CCMAVAIL (0),
.CTM_NUMDBGSGL (0),
.CTM_DBGSGL_PORTWIDTH(2),//default
.CTM_ASYNCIDLEREQ (0) ,
.CTM_CNTR_CHAINSHADOW("00000000000000000000000000010100") // <-- bitwidth mismatch seen here
)
u0_sctm
(
// Top Clock
.gl_clk_r (sctm_clk),
// Global Reset
.gl_reset_nr (sctm_rst_n),
// IDLE Related
.gl_fclken_tr (1'b1), // to eve_FClken?
.gl_sidlereq_tr (sctm_SIdleReq),
.ctm_sidleack_tr (sctm_SIdleAck),
);
ENTITY ctm_top IS
GENERIC (
CTM_NUMINPT : INTEGER := N_CTM_NUMINPT ;
CTM_NUMCNTR : INTEGER := N_CTM_NUMCNTR ;
CTM_NUMTIMR : INTEGER := N_CTM_NUMTIMR;
CTM_TIMEVT_PORTWIDTH : INTEGER := N_CTM_NUMTIMR ;
CTM_TIMINTPOLARITY : INTEGER := 1 ;
CTM_TIMINTWIDTH : INTEGER := 1 ;
CTM_NUMSTM : INTEGER := N_CTM_NUMSTM ;
CTM_CCMAVAIL : INTEGER := 1 ;
CTM_NUMDBGSGL : INTEGER := 2 ;
CTM_DBGSGL_PORTWIDTH : INTEGER := 2 ;
CTM_ASYNCIDLEREQ : INTEGER := 0 ;
CTM_CNTR_CHAINSHADOW : STD_LOGIC_VECTOR(N_CTM_NUMCNTR-1 DOWNTO 0) :=
"00000000000000000000000000000000" -- default parameter
);
where N_CTM_NUMCNTR = 32.
To fix the error, instead of using double quotes ("") to pass the parameter to
CTM_CNTR_CHAINSHADOW as below:
.CTM_CNTR_CHAINSHADOW("00000000000000000000000000010100")
pass it as a sized binary literal:
.CTM_CNTR_CHAINSHADOW (32'b00000000000000000000000000010100)
11. write_do_lec does not deposit constant constraints information into dofiles. The corresponding
pin constraints need to be added manually.
12. write_do_lec can handle sequential merges by default with the following command:
In LEC, Z is the default. Using any other value carries verification risks.
Set the following for each retimed module for easier verification:
set_attribute boundary_opto false [find / -subdesign rt_mod]
set_attribute retime_hard_region true [find / -subdesign rt_mod]
(It is set by default.)
The following global setting is required for running LEC on a retimed design. It enables
retiming verification with LEC.
It controls whether to add the Conformal LEC command that excludes the specified retimed
modules from the hierarchical dofile script generation. Specify this attribute before writing out
the intermediate netlist.
One must use hierarchical comparison to verify the retimed netlist. The command
write_do_lec can handle retiming by default with the following commands:
10 Tcl Procedures
1. set_hfn: This procedure looks for high-fanout nets over a specified threshold and then idealizes
them. The user also has an option to add a path_adjust (in ps) through these nets to account for the
missing delay.
puts "INFO: Finding all nets with a fanout greater than $threshold"
if {$noideal} {
    puts "INFO: The nets will not be idealized, just reported"
    if {$preserve_net} {
        puts "ERROR: Cannot use -noideal and -preserve switch together."
        puts " It does not make sense to preserve the net without idealizing it."
        return -code error
    }
}
if {$preserve_net} {puts "INFO: The nets will be preserved."}
set _count 0
# if desired, create a cost_group
if {![string match $costgroup_name ""]} {
    puts "INFO: Defining a cost_group named $costgroup_name"
    define_cost_group -name $costgroup_name
}
# look for high fanout nets in the design
foreach net [find / -net *] {
    set _driver [get_attr driver $net]
    # skip nets that do not have exactly one driver
    if {![string match [llength $_driver] "1"]} {
        #puts "NOTE: found [llength $_driver] drivers on net: [vname $net]"
        continue
    }
    # get the type of net.
    set _type [basename [dirname [dirname [dirname $_driver]]]]
    # if driver is a constant, then continue
    if {[string match [what_is $_driver] "constant"]} {continue}
    # if driver is a subport, then continue
    if {[string match [what_is $_driver] "subport"]} {continue}
    set is_port [string match [what_is $_driver] "port"]
    set is_seq_pin [expr {[string match [what_is $_driver] "pin"] && [string match $_type "instances_seq"]}]
    set is_comb_pin [expr {[string match [what_is $_driver] "pin"] && [string match $_type "instances_comb"]}]
    set is_not_clock [expr {[string match [get_attr propagated_clocks $_driver] ""]}]
    if {[string match [what_is $_driver] "pin"]} {
        set is_libpin [expr {![string match [get_attr libpin $_driver] ""]}]
    } else {set is_libpin 0}
    ######puts "$_driver : $_type : $is_port : $is_seq_pin : $is_comb_pin : $is_libpin : $is_not_clock"
    # if driver is a port, or sequential or combinational pin, or a libpin then check the fanout.
    # but not if it is a clock
    if {[expr {($is_port || $is_seq_pin || $is_comb_pin || $is_libpin) && $is_not_clock}]} {
        set _fanout [llength [fanout -max_pin_depth 1 $_driver]]
        #puts "$_fanout ... $threshold"
        if {[expr {$_fanout > $threshold}]} {
            incr _count
            # create path_group?
            if {![string match $costgroup_name ""]} {
                puts " Creating a path group through [vname $_driver], and putting it in cost_group $costgroup_name"
                path_group -through $_driver -group $costgroup_name
            }
            if {$noideal} {
                puts " [vname $net] : $_fanout"
            } else {
                puts " Setting [vname $net] as ideal (fanout $_fanout)"
                dc::set_ideal_network -no_propagate $net
                # set a path_adjust?
                if {![string match $path_adj_ps ""]} {
                    puts " Applying path_adjust through driver: [vname $_driver]"
                    path_adjust -through $_driver -delay $path_adj_ps
                }
                # preserve?
                if {$preserve_net} {
                    set_attr -quiet preserve true $net
                    # preserve all hierarchical nets in the fanout
                    foreach fo [fanout -max_pin_depth 1 $_driver] {
                        set sub_net [get_attr "net" $fo]
                        dc::set_ideal_network -no_propagate $sub_net
                        set_attr -quiet preserve true $sub_net
                    }
                }
            }
        }
    } else {
        #puts "DEBUG: Driver not accepted: $_driver : $_type : [get_attr libpin $_driver]"
    }
}
# display a count of how many HFNs were found
puts "INFO: Found $_count nets with a fanout more than $threshold"
add_command_help set_hfn "Finds high fanout and declares them as ideal" "Constraint"
2. report_cg_tree: This procedure gives a detailed report for RC-inserted and non-RC-inserted CGs.
proc report_cg_tree {} {
    set debug 0
    if {[llength [find / -design *]] < 1} {
        puts "Error: There is no design loaded. Load the design and map it before running this util"
        return
    }
    if {[llength [find . -design *]] > 1} {
        puts "Error: There are multiple designs loaded."
        puts " : [find / -design *]"
        puts " : 'cd' to target design"
        return
    }
if {[llength [find . -design *]] == 1} {
puts "$rtobp"
set count 1
}
set prev $libcell
}
set rtobp [string repeat " " 70]
set rtobp [string replace $rtobp 1 [expr 1 + [string length $Tech_libname]] $Tech_libname]
set rtobp [string replace $rtobp 40 [expr 40 + [string length $prev]] $prev]
set rtobp [string replace $rtobp 62 67 $count]
puts "$rtobp\n"
}
puts "[string repeat "-" 70]"
puts " Total Count [string repeat " " 48] $totalCount\n\n"
}
proc check_b2b_inv_buf { instList } {
    set foundBuf 0
    set foundInv 0
    set isB2BInv 0
    set isB2BBuf 0
    set inv_pair 0
    #puts "Inst list is $instList"
    foreach Inst $instList {
        if {[get_attr inverter $Inst] && ($foundInv == 1) } then {
            set isB2BInv 1
            set inv_pair [expr $inv_pair + 1]
            set foundInv 0
        } elseif { [get_attr inverter $Inst] } {
            set foundInv 1
        } else {
            set foundInv 0
        }
global ::pathLibCells
set cnt 0
set total 0
set max 0
set min 100000
# Process the data
foreach p [array names ::pathLibCells] {
    incr cnt
    set length [llength $::pathLibCells($p)]
    if {$length > $max} {
        set max $length
        set max_chain $p
    }
    if {$length < $min} {
        set min $length
        set min_chain $p
    }
    set total [expr {$total + $length}]
}
puts "Max length = $max on report number $max_chain"
puts "Min length = $min on path number $min_chain"
puts "Average length = [expr {$total / $cnt}] over $cnt total paths"
}
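For experimenting outside RC, the same min/max/average computation can be run in plain tclsh. The sketch below is a standalone variant, not the original procedure: the proc name is hypothetical, the libcell names in the array are invented, and it returns a floating-point average and guards against an empty array, which the fragment above does not.

```tcl
# Standalone path-length statistics, runnable in plain tclsh.
# The array contents are invented; in the flow each entry would hold the
# libcells found on one reported timing path.
array set pathLibCells {
    1 {INVX1 NAND2X1 BUFX2}
    2 {INVX1}
    3 {NOR2X1 INVX1 NAND2X1 BUFX2 DFFX1}
}

proc path_length_stats {arrName} {
    upvar 1 $arrName cells
    set lengths {}
    foreach p [array names cells] {
        lappend lengths [llength $cells($p)]
    }
    # Nothing to report for an empty array (avoids division by zero).
    if {[llength $lengths] == 0} {
        return {}
    }
    set sorted [lsort -integer $lengths]
    set total 0
    foreach len $lengths {incr total $len}
    return [list min [lindex $sorted 0] max [lindex $sorted end] \
                avg [expr {double($total) / [llength $lengths]}]]
}

puts [path_length_stats pathLibCells]
```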
1. timestat
Usage : timestat <stage_name>
Cmd : timestat GENERIC
2. generate_reports
Usage : generate_reports -outdir <reports_dir> -tag <tag_name>
Cmd : generate_reports -outdir $_REPORTS_PATH -tag generic
Generates the QoS statistics at any stage in the flow. Statistics include timing, area, instance
count, utilization, congestion, and power details. This command is followed by the summary_table
command to generate a summary table for these QoS statistics.
3. summary_table
Usage : summary_table -outdir <reports_dir>
Cmd : summary_table -outdir $_REPORTS_PATH
Reports the summary of the design in the following format, based on the generate_reports tag
name, as below:
=================================================================
Flow Settings:
=================================================================
Total Runtime (m:s): 88:47
Total Memory (MB): 1903.15
Executable Version: 12.10
Dumps QoR reports in HTML format, which can be viewed using htmlview.
measure qor -name generic qoR.html
measure qor -name map qoR.html
measure qor -name incr qoR.html
As a result, the comparative QoR reports for generic, map, and incr appear in HTML format in the
file qoR.html.
COPYRIGHT 2012, CADENCE DESIGN SYSTEMS, INC. ALL RIGHTS RESERVED
RTL Compiler (RC) Cook Book
Without instance option the above applet dumps out the area corresponding to each instance
in the html file.
Let's say we have three independent runs, A, B, and C. We can save the post-mapping gate comparison
for the three runs using compare_gates, with the names map_A, map_B, and map_C, in a common
HTML file, map.html, for easy comparison.
Similarly, the post-incremental-synthesis results (incr1) can be saved as incr1_A, incr1_B, and incr1_C
in a common HTML file, incr1.html.
CostGroup strategy (edi|clock|clock_location|clock_location_single_io|location|none)
[-add]: add to existing cost groups (default is overwrite)
Creates a testcase setup to be used for debug purposes. It creates a tarball of the testcase. Use
the automatically generated collateral in the auto-generated directory.
gtar -zxvf <archive>.tgz
cd <archive>
You may need to modify the automatically generated Tcl script to suit the needs of individual
users. Some known modifications to be made to the automatically created Tcl file are listed below.
Following this, based on the user requirements/debugging purpose, this script can be modified
to run the full synthesis.
Refer to the README file generated with create_tcase for more details.