Abstract: With the ever-increasing complexity of VLSI designs, verification must keep pace. This paper explores and
proposes a set of architectural choices for meeting that requirement. The solution is assessed in terms of measurable
performance, metrics, flexibility, partitioning of functionality, reusability and ease of management. The paper
focuses on arriving at a single solution reasonably suited to all types of verification methods, using the existing
UVM infrastructure with only minor enhancements.
The UVM toolkit does not adequately provide for modeling independent protocol stacks. The paper proposes two
main enhancements to UVM TLM ports, which can later be expanded to multiple domains. The UVM agent
architecture is well suited to System-on-Chip (SoC) designs, and sequence layering/translation to Network-on-Chip
(NoC) designs, as explained later in the paper. However, neither architecture on its own can be used interchangeably
to drive stimulus to both SoC and NoC designs. This paper realizes the concept of multi-stimulus ports (with
enhanced UVM TLM code), which addresses these concerns in the stimulus path and is elaborated using examples.
Keywords: UVM, TLM, multi-stimulus port, translation in TLM
I. INTRODUCTION
Today's verification task is becoming more and more complex as it keeps pace with evolving DUT designs.
Fortunately, UVM delivers on its goal of enabling the creation of modular, reusable verification components,
environments and tests. UVM 1.2 comprises about 357 classes and 1037 unique methods (938 functions and 99
tasks). Out of this large repository, a typical UVM user can be productive with only about 3% of UVM (2% of
classes and 3% of methods), according to one survey. These statistics were gathered from feedback received
directly from fellow UVM users [1]. Possible reasons include the perceived complexity of UVM itself or a lack of
features in existing classes for particular applications. We therefore feel that enriching some of these classes
could widen their usage in the community.
[Figure: Sequencer and Driver]
The stimulus path usually uses a pull mechanism to transfer transactions from the sequencer (provider/producer)
to the driver (requestor/consumer), as this has its own advantages. In the pull mechanism, the driver initiates the
request for a transaction to be driven on the DUT interface. The request ultimately propagates to the sequencer,
and the resulting transaction may pass through zero or more components between sequencer and driver.
Generally, SoC type designs have many interfaces through which data must be driven, requiring the same number
of sequencer-driver pairs. In comparison, NoC type designs employ multiple protocol translations before driving
the DUT interface, requiring a translator component between sequencer and driver in the stimulus path. This paper
proposes an enhancement to a commonly used piece of UVM infrastructure, the TLM ports, that serves multiple
requestors in the stimulus path through a single implementation, reducing the total number of components
required in the verification setup. Both SoC and NoC type verification needs are described below:
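As a reminder of what the pull mechanism looks like in stock UVM, the following is a minimal sketch of a driver requesting items from its sequencer. The class names base_txn and pull_driver, the field widths and the ctrl encoding are assumptions made for this illustration; only uvm_driver, seq_item_port, get_next_item() and item_done() are standard UVM.

```systemverilog
import uvm_pkg::*;
`include "uvm_macros.svh"

// Hypothetical transaction used only for this illustration.
class base_txn extends uvm_sequence_item;
  rand bit [7:0]  addr;  // width assumed
  rand bit [15:0] data;  // width assumed
  rand bit        ctrl;  // read/write flag, encoding assumed
  `uvm_object_utils(base_txn)

  function new(string name = "base_txn");
    super.new(name);
  endfunction
endclass

// The driver is the requestor: it pulls each item from the sequencer.
class pull_driver extends uvm_driver #(base_txn);
  `uvm_component_utils(pull_driver)

  function new(string name, uvm_component parent);
    super.new(name, parent);
  endfunction

  task run_phase(uvm_phase phase);
    forever begin
      // Blocks until the connected sequencer provides an item.
      seq_item_port.get_next_item(req);
      `uvm_info(get_type_name(),
                $sformatf("driving addr=%0h data=%0h ctrl=%0b",
                          req.addr, req.data, req.ctrl), UVM_MEDIUM)
      // Tells the sequencer that the item has been consumed.
      seq_item_port.item_done();
    end
  endtask
endclass
```

In stock UVM each such driver is connected to its own sequencer, which is why the component count grows with the number of interfaces.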
A. SoC type designs
The UVM agent architecture is very well suited to SoC type devices, as an agent is easily ported from a simple
unit level to chip level. Porting is straightforward because the I/O connected to the virtual interface of the agent
is visible in both scopes. However, with multi-interface/multi-protocol SoCs, we need to define as many
components in the env, increasing the component count and compromising simulation performance. The paper
proposes multi-stimulus ports, where the same sequence and sequencer can be reused while driving multiple
protocols, reducing the UVM component count. This helps route the requested transaction to the intended
requestor without using multiple sequencers.
Footnote: Thanks to Mr. Pradeep Dharane, MD APM Pune, for sponsoring our travel for the DVCon presentation and
for encouraging us.
[Figure: agents A, B and C, each with seq, seqr, drv and mon, connected to the DUT; drawn first with one DUT per agent and then with a single DUT]
Example application:
Sometimes you just need a simple protocol translation with the same transaction properties but driven through
distinct interfaces. For example, there are many control protocols such as SPI, I2C, MDIO, AHB, APB and AXI, as
shown in figure 2A. Their structural requirements are the same: data, address and control (read/write). These
protocols may add their own requirements but, as mentioned, their basic properties remain the same. In these
cases, do we really need to add two separate components to the environment at SoC level, as shown in figure 2B?
Why can't we leverage the base properties and diversify functionality only where the verification architecture
requires it? This can definitely be addressed by a layered sequence but, for such multi-stimulus requirements, we
would still need separate driving infrastructure in the environment. We used the same UVM TLM pull port,
enhancing its functionality to address the above-mentioned verification challenges.
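The idea of a shared base property with protocol-specific extensions can be sketched in plain SystemVerilog as follows. The class names and field widths are assumptions for illustration; the extra fields (st and ta for MDIO, slave_sel for SPI) are taken from the simulation output shown later in the paper.

```systemverilog
// Hypothetical base transaction holding the properties common to the control protocols.
class ctrl_base_txn;
  rand bit [7:0]  addr;  // width assumed
  rand bit [15:0] data;  // width assumed
  rand bit        ctrl;  // read/write flag, encoding assumed

  virtual function string to_string();
    return $sformatf("addr=%0h data=%0h ctrl=%0b", addr, data, ctrl);
  endfunction
endclass

// Each protocol adds only the fields the base transaction lacks.
class mdio_txn extends ctrl_base_txn;
  rand bit [1:0] st;  // width assumed
  rand bit [1:0] ta;  // width assumed

  virtual function string to_string();
    return {super.to_string(), $sformatf(" st=%0h ta=%0h", st, ta)};
  endfunction
endclass

class spi_txn extends ctrl_base_txn;
  rand bit slave_sel;

  virtual function string to_string();
    return {super.to_string(), $sformatf(" slave_sel=%0b", slave_sel)};
  endfunction
endclass

module tb;
  ctrl_base_txn items[2];
  mdio_txn      m;
  spi_txn       s;

  initial begin
    m = new();
    s = new();
    if (!m.randomize()) $error("mdio_txn randomize failed");
    if (!s.randomize()) $error("spi_txn randomize failed");
    // Both derived items can travel through code that only knows the base type.
    items[0] = m;
    items[1] = s;
    foreach (items[i]) $display("%s", items[i].to_string());
  end
endmodule
```

Because both derived types can be handled through a base-class handle, a single sequence producing the base transaction is in principle enough to feed every protocol.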
[Figure: agents A and B, each with seqr, drv and mon, driven from a seq and connected to the DUT]
[Figure: agents A and B with seq, seqr, an A_TO_B translator sequence, drv and mon, connected to the DUT]
B. NoC type designs
The second type of architectural requirement arises in the verification of NoC type designs, where protocol
conversion/packet translation is a must. This is generally achieved by transaction type conversion, using either
sequence layering or translator sequence techniques, as shown in figure 3. In both of these techniques, however,
we may need to add an extra component to the existing infrastructure. The paper proposes embedding this
transaction translation directly into multi-stimulus ports to optimize the existing verification architecture.
To summarize, this paper describes the problems we faced in a live project due to a UVM limitation, the
workaround employed to overcome this limitation, and the proposed UVM TLM multi-port implementation, with
examples.
The proposed multi-stimulus port offers the following two features:
1.
2.
II. PROBLEM FACED
Let's walk through the complexities we faced in designing the verification architecture for our project, owing
to the limitation(s) seen in UVM TLM ports, and the workaround(s) deployed.
A. Verification requirement
The functional requirement of the project was to convert a protocol at one level to a protocol at a different level,
to cater to the needs of discrete VIPs, as in the NoC architecture. In the Ethernet PHY example shown in figure 4
below, each rate lane (10G, 40G etc.) must be processed with a different encryption algorithm.
Breaking a higher-rate lane into several lower-rate lanes, or vice versa, was a challenge, as UVM TLM does not
support multiple pull ports connected to a single pull implementation. The high level protocol VIP (such as
MAC/PCS) shown in figure 4 works only with higher level protocol transactions, while the encryption, encoding
and gearboxing functions work on transactions at a different protocol level. Hence, transaction translation is
required between these two protocol levels. Finally, the translated transactions must be driven out through the
pull port mechanism upon request.
[Figure 4: seq and seqr feed the high level protocol VIP (mac/pcs), which produces Xn-high (Xn = transaction); a block marked "Solution??" sits between it and the low level encrypt-A, encrypt-B and encoder VIPs and the gearbox VIP, which consume Xn-encr, Xn-encr, Xn-encode and Xn-Gbx]
Figure 4 shows a request being generated from the pull port shown at the bottom. The request starts the flow of
a transaction from the seqr, through the high level protocol VIP and finally through one of the low level VIPs or
the gearbox VIP.
A single port with multiple implementation ports exists in the UVM codebase for the push mechanism, but our
requirement is specifically for the pull mechanism. The existing multi-implementation ports address a MUX-like
architecture (multiple sequencers to one driver), whereas the need here is for a De-MUX kind of architecture, as
shown in figure 5.
[Figure 5: seq and seqr feed the high level protocol VIP (mac/pcs); its Xn-high output is distributed to the low level encrypt-A, encrypt-B and encoder VIPs and the gearbox VIP as Xn-encr, Xn-encode and Xn-Gbx]
We enhanced our verification architecture with the port de-mux and transaction converter component shown
above to support these multiple pull ports in the stimulus path, without modifying the UVM library. This
component converts transactions and provides them to the requesting VIPs. To feel the gravity of the problem at
hand, consider a project handling an aggregate bandwidth of 500G with a granularity of 25G per lane: this
translates to 20 extra components of the port de-mux and transaction converter type, with the corresponding
number of implementation ports also being added. The problem worsens if the granularity of the driving lane is
10G instead, leading to 50 extra components! Demand for higher-bandwidth chips is on the rise but, due to
physical limitations, lane granularity remains at a lower rate. This requires handling a larger number of low-rate
lanes while keeping the upper layer VIPs intact, and ends up degrading simulation performance to a sizable
extent. Also, adding any new component to an existing verification environment can be error-prone and carries
its own maintenance overhead, which is taxing in the long run. This problem could have been solved very well by
the multi-stimulus port described in the next section.
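The component counts quoted above follow from dividing the aggregate bandwidth by the per-lane granularity, assuming one port de-mux and transaction converter component per lane. The symbols N, B_agg and B_lane are introduced here only for illustration:

```latex
N = \frac{B_{\mathrm{agg}}}{B_{\mathrm{lane}}}, \qquad
N_{25\mathrm{G}} = \frac{500\,\mathrm{G}}{25\,\mathrm{G}} = 20, \qquad
N_{10\mathrm{G}} = \frac{500\,\mathrm{G}}{10\,\mathrm{G}} = 50
```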
III. MULTI-STIMULUS PORT
The problem at hand was critical because it hit simulation performance. An increase in simulation time
affects regressions and delays verification closure. Had we employed multi-stimulus ports instead, we would
expect to have seen better performance due to the simpler architecture. The example shown in figure 4 illustrates
the problems faced by our team: routing (one high level VIP to many low level VIPs) and translation (high level
transactions to low level transactions). Moreover, this example can also showcase the other variant, gearboxing
(many-to-one transaction conversion and vice versa), which was not used in the current project. As our project
code cannot be shared, an equivalent pictorial representation is shown below in figure 6. Another example, with
UVM modifications and some areas of application, is shown with code in section III.B.
[Figure 6: seq and seqr feed the high level protocol VIP (mac/pcs), which produces Xn-high (Xn = transaction); four translate() steps deliver Xn-encr, Xn-encr, Xn-encode and Xn-Gbx to the low level encrypt-A, encrypt-B and encoder VIPs and the gearbox VIP]
In the absence of multi-stimulus port support in UVM, we partially enhanced some of the UVM TLM code and
used it in our architecture. Note that this was not used in a live project; nevertheless, it was implemented and
simulated for the purpose of this paper. Figure 6 depicts the usage of the multi-stimulus port and the elimination
of the workaround shown in figure 5, reducing the number of components used. On the reusability front, as shown
in figure 6, we can increase the number of layer 2 protocols (lower level VIPs) as required without touching the
upper layer. Where the architecture requires the upper layer to know which lower level VIP has requested a
transaction, the upper layer can identify the requester from the lower level VIP's multi-port id, received along
with its request. Details of the multi-stimulus port are discussed in the next section, with examples that use
SoC/NoC structures.
A. What is a multi-stimulus port?
The multi-stimulus port in figure 7 is similar to the well-known uvm_blocking_get_port or uvm_seq_item_pull_port
TLM ports defined in UVM.
[Figure 7: a single pull implementation (uvm_blocking_pull_imp OR uvm_non_blocking_pull_imp OR uvm_seq_item_pull_imp) connected to Pull multi-port 1 to Pull multi-port n, each through its own translation (Translate 1 to Translate n)]
A multi-stimulus port requires a pull-port implementation to support its connection. As discussed earlier, we
have two problems at hand:
Problem 1: Getting transactions of similar base functionality and delivering them to the DUT through
different interfaces, for example the control interface protocols shown in figure 1. Their base
transactions remain the same but are driven to the DUT with proprietary protocols. If we could leverage the
same sequence in the environment, we could reduce the component count and improve performance.
Problem 2: As mentioned in the previous problem, getting the same base transaction and using it in a
slave/driver with a different driving protocol is not a big issue. The problem starts when these slaves/drivers
work on derived transactions instead of the base transaction. We address this issue inside the multi-stimulus
port itself, where a user can translate from the base transaction to the required transaction as needed.
We propose enhancements to the UVM TLM classes to address these issues, shown through an illustration.
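The two problems above, routing from one producer to several requestors and per-requestor translation, can be modeled without UVM. The following is a plain SystemVerilog sketch of the concept only; every class name here (base_item, spi_item, producer, multi_port and its subclasses) is hypothetical and none of them is part of UVM or of the authors' code.

```systemverilog
class base_item;
  bit [7:0]  addr;
  bit [15:0] data;
endclass

class spi_item extends base_item;
  bit slave_sel;
endclass

// Single producer, standing in for the pull implementation.
class producer;
  int unsigned count;
  int          last_id = -1;  // id of the most recent requester

  task get(input int id, output base_item item);
    last_id   = id;
    item      = new();
    item.addr = count[7:0];
    item.data = count[15:0];
    count++;
  endtask
endclass

// Requester-side port: pulls from the shared producer, then translates.
virtual class multi_port;
  protected producer prod;
  protected int      id;

  function new(producer prod, int id);
    this.prod = prod;
    this.id   = id;
  endfunction

  pure virtual function base_item translate(base_item item);

  task get(output base_item item);
    base_item raw;
    prod.get(id, raw);       // routing: the producer learns who asked
    item = translate(raw);   // translation: per-port conversion hook
  endtask
endclass

class passthrough_port extends multi_port;
  function new(producer prod, int id);
    super.new(prod, id);
  endfunction

  virtual function base_item translate(base_item item);
    return item;  // consumer works on the base transaction as-is
  endfunction
endclass

class spi_port extends multi_port;
  function new(producer prod, int id);
    super.new(prod, id);
  endfunction

  virtual function base_item translate(base_item item);
    spi_item s = new();
    s.addr      = item.addr;
    s.data      = item.data;
    s.slave_sel = 1'b1;  // value chosen arbitrarily for the sketch
    return s;
  endfunction
endclass

module tb;
  producer         p;
  passthrough_port port0;
  spi_port         port1;
  base_item        b;
  spi_item         s;

  initial begin
    p     = new();
    port0 = new(p, 0);
    port1 = new(p, 1);
    port0.get(b);
    $display("port0 got addr=%0h, producer saw id %0d", b.addr, p.last_id);
    port1.get(b);
    if ($cast(s, b))
      $display("port1 got SPI item addr=%0h slave_sel=%0b, producer saw id %0d",
               s.addr, s.slave_sel, p.last_id);
  end
endmodule
```

Adding another consumer in this model means adding one more port subclass; the producer is untouched, which mirrors the reusability argument made for the multi-stimulus port.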
B. Illustration
The updated UVM source code is shown in section III.D.
[Figure 8: conventional verification env with a random data gen seq and separate MDIO, SPI and I2C agents (each with seq, seqr, drv and mon), plus a PRBS-to-gbx translation path with its own seqr, drv and mon; all connect to the DUT, including the DUT lane data interface]
As shown in figure 8, we built an example with multiple agents connected to the DUT. Three agents, for the
MDIO, SPI and I2C protocols, are connected to their respective DUT interfaces, similar to an SoC setup. The
gearbox (gbx) interface reflects a NoC type of structure, where a packet of pseudo-random bit sequence (PRBS)
type is translated to the gbx transaction type. With a conventional UVM setup, one needs to connect all agents to
the respective interfaces of the DUT. For a DUT with a multi-million gate count supporting a large number of
protocols, one would need a very large set of components in the verification setup to achieve this, and
performance would again take a hit.
[Figure 9: verification env using the multi-stimulus port: a single producer seqr with a random data gen seq feeds the mdio, spi, i2c and gbx drivers through per-port translations (base_txn to mdio, base_txn to spi, base_txn to i2c, base_txn to gbx); the MDIO, SPI and I2C monitors and the DUT lane data interface connect to the DUT]
Instead of the conventional setup, we use the multi-stimulus port to drive the DUT. Figure 9 shows the same
example, in which we have eliminated four to five components such as the mdio seqr, i2c seqr, spi seqr, gearbox
seqr and gearbox translator. If this small example eliminates that many components, the saving in a complex
verification setup could be considerable. Performance and simplicity follow naturally from such architectures.
We now look at the details of the multi-stimulus port at the UVM source code level.
C. Application Example code
Before we move to the example code, we would like to provide some coding guidelines/recommendations:
1. We used the get() method in consumer VIPs such as SPI and MDIO. It does not require a response to be
given back to the producer. If get_next_item() or try_next_item() is used instead, the consumer must give a
response back to the producer implementation, as shown in the translate() task of consumer_gearbox_driver.
Unless a response is given back, the producer implementation port will not generate a new transaction if it
is using uvm_sequencer.
2. The translate() method is implemented as a task in some places and as a function in others, since
uvm_component does not define a prototype for this method.
3. The transaction received in the multi-port get() method is in terms of the base transaction, which needs a
type cast.
4. Requests generated in parallel from consumer drivers (in the context of the example shown here) are
served by the sequencer in an order that depends on the simulator implementation. If the producer wants to
know which requestor (consumer driver) generated the request, the producer implementation can obtain the
requestor id using the get_id() method, as shown in the producer_0 class of the example code given below.
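Guidelines 1 and 3 can be illustrated with stock UVM alone. The sketch below is not the authors' consumer code: it uses the standard uvm_blocking_get_port in place of the proposed multi-port, and the classes base_txn, mdio_txn and mdio_consumer (with assumed field widths) are hypothetical. It shows a consumer that pulls with get(), which needs no response, and then casts the base handle to the derived type it expects.

```systemverilog
import uvm_pkg::*;
`include "uvm_macros.svh"

// Hypothetical base and derived transactions for this illustration.
class base_txn extends uvm_sequence_item;
  rand bit [7:0]  addr;  // width assumed
  rand bit [15:0] data;  // width assumed
  rand bit        ctrl;
  `uvm_object_utils(base_txn)

  function new(string name = "base_txn");
    super.new(name);
  endfunction
endclass

class mdio_txn extends base_txn;
  rand bit [1:0] st;  // width assumed
  rand bit [1:0] ta;  // width assumed
  `uvm_object_utils(mdio_txn)

  function new(string name = "mdio_txn");
    super.new(name);
  endfunction
endclass

class mdio_consumer extends uvm_component;
  `uvm_component_utils(mdio_consumer)

  // Standard UVM port, typed on the base transaction.
  uvm_blocking_get_port #(base_txn) get_port;

  function new(string name, uvm_component parent);
    super.new(name, parent);
    get_port = new("get_port", this);
  endfunction

  task run_phase(uvm_phase phase);
    base_txn b;
    mdio_txn m;
    forever begin
      get_port.get(b);  // get() completes without a response to the producer
      if (!$cast(m, b)) begin
        // The handle did not point to an mdio_txn object.
        `uvm_error(get_type_name(), "received item is not an mdio_txn")
      end
      else begin
        `uvm_info(get_type_name(),
                  $sformatf("addr=%0h data=%0h st=%0h ta=%0h",
                            m.addr, m.data, m.st, m.ta), UVM_MEDIUM)
      end
    end
  endtask
endclass
```

The cast succeeds only if the object delivered through the port really is of the derived type, which in the paper's scheme is the job of the per-port translate() step.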
// Program Test
env e;
initial begin
  e = new("e");
  run_test();
end
[The remainder of the example code listing is not recoverable from the source; the surviving declaration names are p0, drv_0[], drv_1, drv_2, drv_d and m_seq.]
Output 2: Consumer transaction consumption rate halfway through simulation (about 54 sequence transactions)
# UVM_INFO MSP_example.sv(178) @ 240: e.top.producer0@@common_sequence [uvm_sequence_item] Common Sequence[54] txn=BASE TRANSACTION [ addr = 6d, data = 074e, ctrl = 1 ]
# UVM_INFO MSP_example.sv(320) @ 240: e.top.consumer_mdio_driver [e.top.consumer_mdio_driver] Received mdio_transaction[24] is = MDIO TRANSACTION [ addr = 6d, data = 074e, ctrl = 1, st = 1, ta = 0]
# UVM_INFO MSP_example.sv(178) @ 240: e.top.producer0@@common_sequence [uvm_sequence_item] Common Sequence[55] txn=BASE TRANSACTION [ addr = 1f, data = d307, ctrl = 0 ]
# UVM_INFO MSP_example.sv(371) @ 240: e.top.consumer_spi_driver [e.top.consumer_spi_driver] Received spi_transaction[12] is = SPI TRANSACTION [ addr = ff, data = d307, ctrl = 0, slave_sel = 1 ]
# UVM_INFO MSP_example.sv(178) @ 240: e.top.producer0@@common_sequence [uvm_sequence_item] Common Sequence[56] txn=BASE TRANSACTION [ addr = 64, data = 3547, ctrl = 0 ]
# UVM_INFO MSP_example.sv(269) @ 240: e.top.consumer_i2c_driver_0 [e.top.consumer_i2c_driver_0] Received i2c_transaction[6] is = I2C TRANSACTION [ addr = 64, data = 3547, ctrl = 0, addr_mode = I2C_10BIT_ADDRESS, is_start_byte =1]
# UVM_INFO MSP_example.sv(178) @ 240: e.top.producer0@@common_sequence [uvm_sequence_item] Common Sequence[57] txn=BASE TRANSACTION [ addr = 30, data = 450f, ctrl = 1 ]
# UVM_INFO MSP_example.sv(269) @ 240: e.top.consumer_i2c_driver_1 [e.top.consumer_i2c_driver_1] Received i2c_transaction[6] is = I2C TRANSACTION [ addr = 30, data = 450f, ctrl = 1, addr_mode = I2C_10BIT_ADDRESS, is_start_byte =1]
# UVM_INFO MSP_example.sv(178) @ 240: e.top.producer0@@common_sequence [uvm_sequence_item] Common Sequence[58] txn=BASE TRANSACTION [ addr = ea, data = b948, ctrl = 1 ]
# UVM_INFO MSP_example.sv(178) @ 240: e.top.producer0@@common_sequence [uvm_sequence_item] Common Sequence[59] txn=BASE TRANSACTION [ addr = c6, data = 1ed0, ctrl = 0 ]
# UVM_INFO MSP_example.sv(431) @ 240: e.top.consumer_gearbox_driver [e.top.consumer_gearbox_driver] Received gearbox_transaction [3] is = GEARBOX TRANSACTION [ addr_double = c6ea, data_double= 1ed0b948 ]
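The last three lines of Output 2 suggest that the gearbox consumer combines two consecutive base transactions into one, with the later transaction in the upper half of each field. A minimal plain SystemVerilog sketch of such a two-to-one conversion is given below; the class names, field widths and the pack_pair function are assumptions for illustration and are not the authors' translate() code.

```systemverilog
// Hypothetical input transaction (fields as in the BASE TRANSACTION log lines).
class gbx_in_txn;
  bit [7:0]  addr;
  bit [15:0] data;
endclass

// Hypothetical output transaction (fields as in the GEARBOX TRANSACTION log line).
class gearbox_txn;
  bit [15:0] addr_double;
  bit [31:0] data_double;
endclass

module tb;
  // Packs two base transactions into one, later item in the upper half.
  function automatic gearbox_txn pack_pair(gbx_in_txn first, gbx_in_txn second);
    gearbox_txn out = new();
    out.addr_double = {second.addr, first.addr};
    out.data_double = {second.data, first.data};
    return out;
  endfunction

  gbx_in_txn  a, b;
  gearbox_txn g;

  initial begin
    // Values taken from Common Sequence[58] and Common Sequence[59] in Output 2.
    a = new(); a.addr = 8'hea; a.data = 16'hb948;
    b = new(); b.addr = 8'hc6; b.data = 16'h1ed0;
    g = pack_pair(a, b);
    $display("addr_double = %h, data_double = %h", g.addr_double, g.data_double);
    // prints: addr_double = c6ea, data_double = 1ed0b948
  end
endmodule
```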
Output 3: Summary of packets consumed by each consumer for 108 transactions generated
# UVM_INFO ex_multi_stimulus_seq_pull_port.sv(325) @ 470: e.top.consumer_mdio_driver [e.top.consumer_mdio_driver] Received mdio_transaction[47] is = MDIO TRANSACTION [ addr = 78ad, data = 58bc, ctrl = 0, st = 1, ta = 2]
# UVM_INFO ex_multi_stimulus_seq_pull_port.sv(182) @ 470: e.top.producer0@@common_sequence [uvm_sequence_item] Common Sequence Ends
# UVM_INFO ../../../../src/base/uvm_objection.svh(1271) @ 1470: reporter [TEST_DONE] 'run' phase is ready to proceed to the 'extract' phase
# UVM_INFO ex_multi_stimulus_seq_pull_port.sv(486) @ 1470: e.top [report_phase] Total Transaction Received: MDIO=48, SPI=24, I2C0=12, I2C1=12, GEARBOX=6
# UVM_INFO ../../../../src/base/uvm_report_server.svh(847) @ 1470: reporter [UVM/REPORT/SERVER]
# --- UVM Report Summary ---
#
# ** Report counts by severity
# UVM_INFO : 240
# UVM_WARNING : 0
# UVM_ERROR : 0
# UVM_FATAL : 0
b) src/tlm1/uvm_port.svh
A corresponding uvm_blocking_get_multi_port class implementation needs to be defined in the pull domain.
class uvm_blocking_get_multi_port #(type T=int, type IMP=int) extends uvm_port_base #(uvm_tlm_if_base #(T, T));
  `UVM_MULTI_PORT_COMMON(`UVM_TLM_BLOCKING_GET_MASK,"uvm_blocking_get_port", IMP)
  `UVM_BLOCKING_GET_MULTI_IMP (get_full_name(), this.m_if, T, t)
endclass
c) src/tlm1/uvm_tlm_imps.svh
The required implementation macros to build multi-stimulus port are added here.
`define UVM_BLOCKING_GET_MULTI_IMP(my_p, imp, TYPE, arg) \
  task get (output TYPE arg); \
    imp.set_id(my_p); \
    imp.get(arg); \
    m_imp.translate(arg); \
  endtask
In the following existing code, we added a max_size argument to the constructor of the producer's implementation,
so that it knows how many multi-ports are connected to it.
`define UVM_IMP_COMMON(MASK,TYPE_NAME,IMP) \
  local IMP m_imp; \
  function new (string name, IMP imp, int max_size=1); \
    super.new (name, imp, UVM_IMPLEMENTATION, 1, max_size); \
..
d) src/macros/uvm_tlm_defines.svh
In order to support all the methods of the multi-stimulus port, we need to provide a template for the method
implementations that will be routed to the producer through uvm_tlm_if_base#().
`define UVM_SEQ_ITEM_PULL_MULTI_IMP(my_p, imp, REQ, RSP, req_arg, rsp_arg) \
function void disable_auto_item_recording(); imp.disable_auto_item_recording(); endfunction \
function bit is_auto_item_recording_enabled(); return imp.is_auto_item_recording_enabled(); endfunction \
task get_next_item(output REQ req_arg); imp.set_id(my_p); imp.get_next_item(req_arg); m_imp.translate(req_arg); endtask \
task try_next_item(output REQ req_arg); imp.try_next_item(req_arg); endtask \
function void item_done(input RSP rsp_arg = null); imp.item_done(rsp_arg); endfunction \
task wait_for_sequences(); imp.wait_for_sequences(); endtask \
function bit has_do_available(); return imp.has_do_available(); endfunction \
function void put_response(input RSP rsp_arg); imp.put_response(rsp_arg); endfunction \
task get(output REQ req_arg); imp.set_id(my_p); imp.get(req_arg); m_imp.translate(req_arg); endtask \
task peek(output REQ req_arg); imp.peek(req_arg); endtask \
task put(input RSP rsp_arg); REQ t1 = rsp_arg; imp.put(t1); endtask //Workaround as virtual task has RSP=REQ
e) src/base/uvm_tlm_base.svh
Sometimes a producer is required to route a transaction to a specific requesting pull port. The producer can
obtain this information by calling the get_id() function, so we added the get_id() and set_id() methods for use
when required. The id is set automatically through an internal call to set_id(), so the user is never expected to
call set_id() directly.
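// temp_id is assumed to be an int member of the enclosing class; its declaration is not shown in the source.
function void set_id(string s);
  uvm_port_list temp_list;
  string x;
  int cnt;
  // Collect the ports that this implementation is provided to.
  get_provided_to(temp_list);
  // Walk the list in key order; the position of the entry whose name matches s becomes the id.
  if (temp_list.first(x)) begin
    do begin
      if (x == s) temp_id = cnt;
      else ++cnt;
    end while (temp_list.next(x));
  end
endfunction

// Returns the id recorded by the most recent set_id() call.
function int get_id();
  return temp_id;
endfunction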
IV. CONCLUSION
Two different architectural worlds, SoC and NoC, with varied verification support requirements were
discussed. It was shown that the existing approach to fulfilling these requirements resulted in a complex
verification architecture and poor simulation performance. To overcome these limitations, the paper proposes a
few enhancements to some of the UVM base classes discussed, to achieve a new TLM port type: a multi-stimulus
port. Using this port type in place of the existing approach helped simplify the verification architecture and
improve simulation performance. We would like to humbly request the UVM committee to consider this UVM
library enhancement request, so that the larger UVM user base can benefit from it in the applications
discussed.
ACKNOWLEDGMENT
Special thanks to our team member, Pushkar Naik, for his valuable review inputs. We would also like to
thank our APM team members for helping us understand the verification architectural specifics we had to
deal with, and the online UVM community for posted responses that helped us. Finally, our continuing thanks
go to Accellera for their sustained effort to improve verification methodologies.
REFERENCES
[1] Stuart Sutherland and Tom Fitzpatrick, "UVM Rapid Adoption: A practical subset of UVM," DVCon, March 2015. http://events.dvcon.org/2015/proceedings/papers/12_1.pdf
[2]
[3] David Cornfield, "The Universal Translator: A fundamental UVM component for networking protocol." https://dvconeurope.org/sites/dvcon-europe.org/files/archive/2014/proceedings/T2_3_paper.pdf
[4]
[5]
[6] IEEE Std 1800-2009, "IEEE Standard for System Verilog-Unified Hardware Design, Specification, and Verification Language." http://dx.doi.org/10.1109/IEEESTD.2009.5354441