
Functional Verifications for SoC Software/Hardware Co-Design: From Virtual Platform to Physical Platform

Yi-Li Lin and Alvin W.Y. Su


Computer Science and Information Engineering Dept.
National Cheng Kung University, Tainan, Taiwan
AlvinSu@mail.ncku.edu.tw

ABSTRACT

This paper applies heterogeneous simulation to achieve system- and functional-level co-verification throughout the SoC design flow. It reduces the high verification complexity that results from covering both software and hardware work and from involving various tools. Stubs for data transport and a Verification Router for heterogeneous simulation management are proposed. A functional module is transformed from a highly abstract model to its target design progressively through a series of intermediate models. Those models can be validated as a portion of a complete SoC system model. The proposed heterogeneous verification is demonstrated with a JPEG encoder.

I. INTRODUCTION

Fig. 1 shows a conventional System-on-a-Chip (SoC) design flow. First of all, the specification of an application is built as a system model by using C/C++. Because an SoC usually employs a processor, the system model is ported to the target processor for design space exploration (DSE). If an off-the-shelf processor is used, an evaluation board equipped with the processor can be adopted. On the contrary, if the target processor is unavailable, one can use an instruction set simulator (ISS) or emulator. An ISS is a software tool which executes programs in the machine code format of the target processor. Some ISSs, such as QEMU [1], can even simulate or emulate a full system, including peripherals. QEMU may therefore be more helpful than an evaluation board for DSE. In addition, a guest operating system (OS) can be installed. Thus, the system model is compiled to the target processor's instruction set and analyzed within QEMU to determine software and hardware partitions. Then, the development procedure is divided into hardware and software parts. For the software part, the program can be enhanced to become the final application. The hardware part follows the VLSI design flow as shown in the upper part of Fig. 1. Conventionally, after deciding software and hardware partitions, RTL design starts immediately by using hardware description languages (HDL), such as Verilog and VHDL.

Figure 1: A Conventional SoC Design Flow.

However, hardware design space exploration has been recommended recently. SystemC [3] is applied to assist this procedure. It is a C++ library for electronic system level (ESL) [4] modeling and design. Tools like Synopsys Platform Architect [5], which includes libraries such as ARM [6] models and AMBA models, can help to build a system-level model quickly. Before RTL design, the hardware model is built by using SystemC. The model may reuse code from the original system model. Again, DSE is applied to decide the hardware partition and architecture. Afterwards, RTL design starts and is verified by using stimuli dumped from the previous hardware model. At this stage, an HDL simulator, like ModelSim [2], is involved in functional simulation and verification. Usually, only pieces of stimuli are used. At the end of the hardware design flow, the hardware module is emulated on an FPGA as well. However, for an SoC design, full-system verification should combine both software and hardware

978-1-4577-1617-1/11/$26.00 2011 IEEE 201


portions. With off-the-shelf processors, co-verification is usually achieved by using an SoC platform equipped with the target processor and an FPGA.

Though programming and simulation tools change according to design stages, the functions of a module should stay the same. Functional equivalence can be validated easily if co-simulations among tools are practical. There are works regarding the co-simulation issue. Works [7-10] provide QEMU-SystemC co-simulation. SystemC co-simulation with HDL can be found in [11]. Works [12] and [13] deal with co-simulation between SystemC and an FPGA platform. SystemC and SystemVerilog co-simulation is introduced in [14]. Issues regarding hardware/software co-simulation can be found in [15-17]. Although the above works deal with co-simulation, they do not cover a whole SoC design flow. System-level simulation, including software and hardware parts, may be required in every stage such that the functions of a new intermediate model or module can be validated as a part of a whole system.

This paper concentrates on functional and system-level verification. It is achieved by making an ISS (QEMU is used here) co-simulate with the tools involved in the SoC design flow. A co-verification model is used as a reference. Functional equivalence in every design transition is verified by applying this model and employing co-simulation among tools which range from virtual to physical platforms, such as QEMU and FPGA devices. We insert stubs into models running in different tools to establish data exchange tunnels across tools. In addition, a centralized co-simulation manager, called Verification Router, is proposed to cope with complicated heterogeneous simulation scenarios, such as a simulation combining QEMU, an HDL simulator and an FPGA. Thus, even if the target processor of an SoC design is absent because a new design is adopted, system-level verification can still be achieved.

The rest of this paper is organized as follows. Section II first introduces a co-verification model for the SoC design flow. It then describes how the model is applied to each design stage. A JPEG encoder design example is presented in Section III. The last section gives a conclusion.

II. Functional Verification with Heterogeneous Tools

As can be seen in Fig. 1, the flow involves various tools and simulation environments. To validate the functional equivalence of models at different implementation levels, co-verifications between models or modules are recommended. We propose the co-verification model depicted in Fig. 2. An environment indicates the adopted programming language or simulation tool. The blocks, which stand for functions, in both environments can be software or hardware models built by using C/C++/SystemC or hardware modules designed by using HDL, respectively. In this figure, block B in environment I is a transitional model which represents the golden sample in the subsequent design process. Its corresponding design under test (DUT) is in environment II. The two environments are connected via stubs. When co-simulating, the stimuli are generated from the golden sample and sent to the DUT. The outcome from the DUT is returned to the golden sample in environment I and examined on-the-fly. The co-verification model is applied to the design flow. Thus, the co-simulation pairs can be (system model, hardware model), (hardware model, RTL design), (hardware model, FPGA) and so on. Overall system-level verification can be achieved if system software is included in co-simulation. The co-simulation construction is described as follows.

A. System Model and Hardware Model Co-Simulation

As mentioned in the previous section, DSE is applied to the system model in QEMU for software and hardware partitioning. After deciding the partitions, the hardware model is built for DSE by using SystemC and reusing code from the system model. It helps to determine the hardware module partition as well as its architecture. However, the SystemC model must

Figure 2: The Proposed Co-Verification Model.
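The co-verification model of Fig. 2 can be sketched as a short loop. The following Python sketch is only an illustration under stated assumptions, not the authors' implementation: the `Tunnel` class stands in for the stub-to-stub connection with two in-process FIFOs instead of real IPC, and `golden_model` and `dut_model` are hypothetical functions invented for the example. It shows the golden sample producing each stimulus, the DUT returning its outcome, and the comparison happening on-the-fly.

```python
from collections import deque


class Tunnel:
    """Stand-in for the stub-to-stub data tunnel (two one-way FIFOs)."""

    def __init__(self):
        self.to_dut = deque()      # stimuli: environment I -> environment II
        self.to_golden = deque()   # outcomes: environment II -> environment I


def golden_model(x):
    """Hypothetical golden sample (block B): y = 3x + 1, truncated to 8 bits."""
    return (3 * x + 1) & 0xFF


def dut_model(x):
    """Hypothetical DUT: the same function built from a shift and two adds."""
    return ((x << 1) + x + 1) & 0xFF


def co_verify(stimuli, golden, dut):
    """Send each stimulus to the DUT and check its outcome on-the-fly."""
    tunnel = Tunnel()
    mismatches = []
    for stimulus in stimuli:
        tunnel.to_dut.append(stimulus)            # golden side sends
        outcome = dut(tunnel.to_dut.popleft())    # DUT side computes
        tunnel.to_golden.append(outcome)          # DUT side returns
        observed = tunnel.to_golden.popleft()     # golden side receives
        expected = golden(stimulus)
        if observed != expected:
            # keep the failing stimulus so the faulty module can be retested
            mismatches.append((stimulus, expected, observed))
    return mismatches
```

An empty list returned by `co_verify(range(256), golden_model, dut_model)` means the two models agree on every stimulus. Any recorded tuple identifies a stimulus that can be replayed against the suspect module.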

execute outside QEMU. Hence, co-simulation between QEMU and SystemC is required if the functional equivalence of the hardware model is to be examined with the proposed co-verification model. Works regarding this feature can be seen in [7-10]. In this paper, the stubs shown in Fig. 2 are created.

In QEMU, a hardware device is usually treated as an interrupt-supported memory-mapped I/O device. Here, we create a hardware device, called QEMU_stub, to exchange data with devices outside QEMU. The QEMU_stub is a set of data access functions, such as Qstub_write, in QEMU. QEMU invokes Qstub_write when software sends data to the stub. Stubs employ inter-process communication (IPC) to build a tunnel. IPC is a mechanism in host environments, such as Linux and MS-Windows, to support communication between software processes. There are several approaches to implementing IPC, such as shared memory, memory-mapped files, sockets and pipes. The shared memory mechanism is adopted here. When QEMU starts up, a system memory space of the host OS is registered with an ID by the stubs. The ID is the key for any process to obtain access to the memory space. All data transports are achieved through this memory space.

In SystemC, a stub is like a wrapper. For example, HW_SystemC_stub is placed in the hardware model to exchange data with QEMU_stub. Once software in QEMU writes data to the hardware model, the information, including data and target address, is sent to QEMU_stub. Then, QEMU_stub transmits it to the shared memory space and sends signals to the corresponding HW_SystemC_stub. Next, the information is moved from shared memory into HW_SystemC_stub with the IPC mechanism. Lastly, the data is transmitted to the target module via a SystemC interface. Note that QEMU is not a cycle-accurate emulator. Therefore, the data transport timing recorded in the SystemC simulation cannot represent the real case. In other words, if the HW_SystemC_stub records a transport in the 1000th cycle of simulation time after booting up Linux in QEMU, it does not indicate that the transport happens in the 1000th cycle if a physical processor is employed. The advantage of not involving a cycle-accurate processor model is the simulation speed. It takes less than 30 seconds to boot up a tiny Linux kernel on QEMU, while it requires around 16 minutes on a cycle-accurate QEMU [10]. Here, we focus on functional verification, so timing is not a critical issue.

B. Hardware Model and HDL Co-Simulation

After the hardware model is fully tested, RTL design starts. The RTL design can be verified with test patterns obtained from its previous transitional hardware model, though the volume of patterns is usually small. The proposed co-verification model can verify the RTL design, generating stimuli and examining the outcome from the RTL design on-the-fly. Environment I shown in Fig. 2 is the hardware SystemC model, while the opposite side is its RTL design. Some HDL simulators support SystemC-HDL co-simulation. For example, a SystemC design can be included and compiled together with an RTL design in ModelSim. However, all modules have to be pin- and cycle-accurate. If the original SystemC model adopts an interface like transaction level modeling (TLM) [18], the communication interface has to be altered. Such a modification from TLM to pin- and cycle-accurate may propagate to a large portion of a module. To achieve it with little effort, two wrappers, SystemC_stub and HDLSim_stub, are introduced. The SystemC_stub encapsulates data for transmission in the SystemC model. The interface between a SystemC model and the stub can be either TLM or any SystemC primitive signals. Once the stub receives data, it encapsulates the data and transmits it to the HDLSim_stub via IPC. The HDLSim_stub is a pin- and cycle-accurate SystemC model, compiled in ModelSim for interface transformation. The interface between an HDLSim_stub and an RTL module has to be customized according to the RTL module. It usually implements the protocol of a hardware communication channel, such as AMBA or others. Hence, if a bus writing operation in TLM involves a batch of data in the SystemC model, both data and address information are sent to HDLSim_stub. The stub places them on the bus cycle by cycle.

C. Hardware Model and FPGA Co-Simulation

After the RTL design is completed, it migrates to an FPGA, running at a high clock rate close to that of the target system. The simulation speed is faster than an HDL simulator and a huge volume of stimuli can be

applied. Besides, peripheral devices can engage in test pattern production so that the verification coverage is enlarged.

SC_FPGA_stub is established to achieve data transport and support the co-verification model. It involves FPGA access functionality and is embedded in the hardware SystemC module. Here, we adopt an FPGA platform from SMIMS [19]. The platform connects to a PC via USB and provides application programming interfaces (APIs) such that user software on the host PC can exchange data with circuits in the FPGA. The SC_FPGA_stub calls the SMIMS APIs, such as FIFODataWrite or FIFODataRead, for such data exchange. In the FPGA, a data transport wrapper helps to exchange data between the SMIMS framework and the user's hardware module. The interface between the wrapper and the user's hardware module has to be designed in terms of its specification.

Another role of this wrapper is to collect timing information when SystemC models adopt a TLM interface. In the SystemC TLM protocol, there are four types of transport interfaces: blocking, non-blocking, debugging, and direct memory interface. The blocking and non-blocking transport interfaces support timing estimation. When calling the blocking transport function b_transport(), the period spent on a transaction is returned along with the function call. For the non-blocking transport interface, timing is further divided for more precise measurement. Fig. 3 shows an example of an initiator transmitting data to a target model. When a transaction starts, the initiator calls nb_transport_fw() with a request tlm::BEGIN_REQ. The function call returns immediately. After a certain period of time, the target model accepts the request by calling nb_transport_bw() with a flag tlm::END_REQ. This time period is taken as the Request Latency. The time period between request acceptance and response start is defined as the Response Latency, while the Operation Latency is the duration from the response start to the end. In this example, the response latency, OP_Lat, is transmitted back to the initiator along with the function call nb_transport_bw(). This mechanism is implemented in the data transport wrapper with two additional signals: Req and Ack. Fig. 4 shows the timing diagram. When the wrapper raises the Req signal, it starts a timer that runs until the Ack signal goes high. This interval is marked as the request latency. The other two periods are measured similarly. Such timing information is returned to support the TLM timing mechanism mentioned above.

D. System Model and HDL/FPGA Co-Simulation

The co-simulations described previously cover the hardware design flow only. For system-level verification, the software part should be included. It is also advantageous to simulate as a complete system. The system model is constructed as a golden sample, representing the target SoC. It usually generates a huge volume of test patterns. With co-simulations, outcomes from the hardware portion can be examined by the system model on-the-fly. The stimuli which result in incorrect outcomes can be recorded to test the possibly faulty modules. In Section II-A, we employ QEMU as the target processor running the OS and software. Hardware models can join co-simulation by embedding QEMU_stub. Here, we modified HDLSim_stub and SC_FPGA_stub, used in the HDL simulator and the FPGA respectively. Originally, the HDLSim_stub communicates with SystemC_stub via IPC. Thus, it can also connect with QEMU_stub. Furthermore, in order to make it closer to the target system, the interface between HDLSim_stub and the RTL module is advised to follow the one in the target system. For example, if the RTL module connects to an AMBA bus, it is better to adopt AMBA as the interface between HDLSim_stub and the module. As far as the FPGA is concerned, the solution is

Figure 3: Timing Diagram of a TLM Non-Blocking Transport.

Figure 4: Timing Diagram of FPGA Wrapper Supporting TLM Non-Blocking Transport.
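The Req/Ack timer described for the data transport wrapper can be illustrated with a small behavioral sketch. It is written in Python purely for illustration (the wrapper itself is hardware), and it makes several assumptions not found in the paper: the signals are given as per-cycle 0/1 samples, and the names `handshake_latency` and `split_latencies` are hypothetical. The second function splits one transaction into the three intervals defined in the text from four event times. The paper names only the tlm::BEGIN_REQ and tlm::END_REQ events, so the two response-side times are an assumption.

```python
def handshake_latency(req, ack):
    """Return the number of clock cycles from Req going high to Ack going high.

    req and ack are per-cycle samples (0 or 1) of the two wrapper signals.
    Returns None if Req never rises or Ack never follows it.
    """
    start = None
    for cycle, (r, a) in enumerate(zip(req, ack)):
        if start is None and r:
            start = cycle            # timer starts when Req is first seen high
        if start is not None and a:
            return cycle - start     # timer stops when Ack is seen high
    return None


def split_latencies(t_begin_req, t_end_req, t_begin_resp, t_end_resp):
    """Split one transaction into the three intervals named in the text.

    The four arguments are event times in cycles (assumed inputs).
    """
    return {
        "request": t_end_req - t_begin_req,      # request start to acceptance
        "response": t_begin_resp - t_end_req,    # acceptance to response start
        "operation": t_end_resp - t_begin_resp,  # response start to response end
    }
```

For instance, with Req sampled as `[0, 1, 1, 1, 1, 0]` and Ack as `[0, 0, 0, 0, 1, 0]`, the timer starts in cycle 1 and stops in cycle 4, giving a request latency of 3 cycles.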

similar to QEMU_stub. As mentioned above, we adopted the SMIMS FPGA platform, which provides APIs for data exchange. We wrapped the SMIMS APIs as a hardware device in QEMU. When software in QEMU transmits data to the hardware device, the data is sent to the module in the FPGA so that the software and hardware portions cover the complete functions of the SoC. This enables system-level verification even when the target processor is absent.

E. Verification Router, the Heterogeneous Simulation Manager

In the above work, we adopted stubs to achieve one-to-one co-simulation across two different designs or simulation environments. The QEMU_stub and HW_SystemC_stub support co-simulation between software in QEMU and a hardware model built by using SystemC, while SystemC_stub and HDLSim_stub link the hardware model and the HDL simulator together. In addition, SC_FPGA_stub connects the hardware model to the FPGA. The derived versions support system-level simulation and verification. Since the communication protocol among all stubs can be identical, we defined a general communication protocol and built a centralized heterogeneous simulation manager, called Verification Router, to cope with complicated co-simulation scenarios. The Verification Router is the rendezvous for all stubs, as shown in Fig. 5. It is responsible for simulation organization. When simulating, a session is created, describing the connections among stubs in the Verification Router. Each connection is represented by an ID. Thus, stubs connect to the router with the dedicated ID, and data transport is built between the two stubs using the same ID. With the Verification Router, co-simulation extends to designs spread across multiple tools. For example, a simulation session can combine application software executed in QEMU, a hardware model built by using SystemC, and an off-the-shelf design which is only available on an FPGA. Thus, system-level simulation and verification can be applied more widely.

III. Demonstration and Results

We used a JPEG encoder as a demonstration. Its C code is first tested. Then, we ported it into QEMU, which emulated an ARM system with a Linux kernel installed. The JPEG encoder executed in QEMU is regarded as one system model. Then, the DCT is separated from the C code so that a dedicated hardware circuit can be designed for it. Therefore, the DCT function is refined as a SystemC model. The model adopts a TLM interface to connect with HW_SystemC_stub. A QEMU_stub was inserted into QEMU; a code snippet can be found in [20]. This completes the second system model. Next, the DCT is designed as an RTL module and verified in ModelSim. The interface between HDLSim_stub and the DCT module is a memory-mapped bus, because the processor uses a memory-mapped bus to connect with the DCT module in the target system. Part of the HDLSim_stub code is also in [20]. To verify the RTL model, the DCT SystemC model generates stimuli, which are transformed by the DCT module, and examines the transformed outcome. In the simulation, 300 blocks of 8x8 pixel data were involved and no failure was found. The co-simulation with the FPGA is similar and its description is not included here due to the page limit.

Next, system-level verifications are applied with QEMU and the FPGA. We enhanced the software part into a JPEG encoding application on Android and placed the hardwired DCT into the FPGA. A data transport wrapper connects to the DCT by using a memory-mapped bus. The execution snapshot is shown in Fig. 6. The Android emulator, on the right

Figure 5: The Verification Router as a Heterogeneous Simulation Manager.

Figure 6: The Snapshot of QEMU and FPGA Co-Simulation.
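The session mechanism of the Verification Router in Fig. 5 (each connection has an ID, and data is forwarded between the two stubs that registered with the same ID) can be sketched as follows. This is a minimal in-process illustration in Python under stated assumptions, not the authors' router: the class names `Stub` and `VerificationRouter`, the `connect` and `route` methods, and the list-based inbox are all hypothetical, and the real stubs communicate over IPC rather than direct method calls.

```python
class Stub:
    """Hypothetical stub endpoint; inbox collects payloads routed to it."""

    def __init__(self, name):
        self.name = name
        self.inbox = []
        self.router = None
        self.conn_id = None

    def send(self, payload):
        """Hand the payload to the router for delivery to the peer stub."""
        return self.router.route(self, payload)


class VerificationRouter:
    """Keeps one session: a map from connection ID to the stubs using it."""

    def __init__(self):
        self.session = {}

    def connect(self, conn_id, stub):
        """Register a stub on a connection; a connection joins two stubs."""
        peers = self.session.setdefault(conn_id, [])
        if len(peers) == 2:
            raise ValueError(f"connection {conn_id} already has two stubs")
        peers.append(stub)
        stub.router = self
        stub.conn_id = conn_id

    def route(self, sender, payload):
        """Deliver payload to the other stub on the sender's connection."""
        for peer in self.session.get(sender.conn_id, []):
            if peer is not sender:
                peer.inbox.append(payload)
                return True
        return False  # the peer has not connected yet
```

A session combining three tools could then register, say, a QEMU-side stub and a SystemC-side stub on connection 1 and an HDL-side stub and an FPGA-side stub on connection 2. A payload sent by one stub reaches only the stub sharing its ID.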

side, shows the original figure and the compressed result. Messages in the command console displayed the exchanged data and the verified result. In this simulation, 100 figures were used, and a fault resulting from register overflow was found and removed. Finally, the entire design, including the JPEG application on Android and the RTL design, is migrated to the physical platform shown in Fig. 7. The right-hand side is the FPGA, which hosts the hardwired DCT, and the left-hand side is an ARM Cortex-A8. These two are connected via a memory-mapped bus. The JPEG application as well as Android is executed on the ARM. Due to the lack of an image input device, we used stored files as test data. The result is shown on an LCD panel. Most off-the-shelf platforms are closed systems, which makes the execution status of circuits in the FPGA invisible. The adopted FPGA platform provides a USB interface connecting with the host PC. Therefore, an interface circuit is added beside the DCT module to send intermediate results of the DCT module back to the system model executed in the A8 in real time so that they can be compared with the golden sample. The system works well through the series of co-verifications.

Figure 7: The Physical Platform of Executing Entire SoC System.

IV. Conclusion

In this paper, interfacing stubs and a Verification Router are proposed to support complicated co-simulation scenarios that adopt a co-verification model and employ heterogeneous tools in the SoC design flow. The co-verification model involves a golden sample, such as the system model, to verify the intermediate designs generated when transforming a design from its abstract-level model to RTL. With the proposed mechanism, the boundaries between software models and hardware modules operating in different tools are removed. That is, models or modules working in both virtual and physical platforms can simulate together just as they work in the target SoC design. In this paper, the co-simulation tools include QEMU, SystemC, an HDL simulator, and an FPGA. Finally, with the USB interface of the FPGA platform, a circuit in the FPGA can also be verified on-the-fly with its golden sample running on the host PC.

REFERENCES

1. F. Bellard, "QEMU, a Fast and Portable Dynamic Translator," 2005 USENIX Annual Technical Conference, pp. 41-46, April 2005.
2. ModelSim [Online] available: http://model.com
3. SystemC [Online] available: http://www.systemc.org
4. D. Densmore, R. Passerone, and A. Sangiovanni-Vincentelli, "A platform-based taxonomy for ESL design," IEEE Des. Test Comput., vol. 23, no. 5, pp. 359-374, Sep. 2006.
5. Synopsys Platform Architect [Online] available: http://www.Synopsys.com/Systems/ArchitectureDesign/Pages/PlatformArchitect.aspx
6. ARM [Online] available: http://www.ARM.com
7. M. Monton, A. Portero, M. Moreno, B. Martinez, and J. Carrabina, "Mixed SW/SystemC SoC Emulation Framework," in IEEE Int. Symp. on Industrial Electronics, pp. 2338-2341, 2007.
8. QEMU-SystemC, GreenSocs, [Online] available: http://www.greensocs.com/en/projects/QEMUSystemC
9. S.T. Shen, S.Y. Lee, and C.H. Chen, "Full system simulation with QEMU: An approach to multi-view 3D GPU design," in IEEE Proc. of Int'l Symp. on Circuits and Systems, pp. 3877-3880, May 2010.
10. M.C. Chiang, T.C. Yeh, and G.F. Tseng, "A QEMU and SystemC-Based Cycle-Accurate ISS for Performance Estimation on SoC Development," IEEE Trans. Comput.-Aided Design Integr. Circuits Syst., vol. 30, no. 4, pp. 593-606, April 2011.
11. J. Park, B. Lee, K. Lim, J. Kim, S. Kim, and K.H. Baek, "Co-simulation of SystemC TLM with RTL HDL for surveillance camera system verification," IEEE Int'l Conf. Electronics, Circuits and Systems, pp. 474-477, August 2008.
12. W.M. Young, C.H. Huang, A.P. Su, C.P. Jou, and F.L. Hsueh, "A practice of ESL verification methodology from SystemC to FPGA - using EPC Class-1 Generation-2 RFID tag design as an example," Design Automation Conference (ASP-DAC), pp. 821-824, 18-21 Jan. 2010.
13. C.Y. Huang, Y.F. Yin, C.J. Hsu, T.B. Huang, and T.M. Chang, "SoC HW/SW verification and validation," Design Automation Conference (ASP-DAC), pp. 297-300, 25-28 Jan. 2011.
14. M.K. You and G.Y. Song, "Case study: Co-simulation and co-emulation environments based on SystemC & SystemVerilog," IEEE TENCON 2009 Region 10 Conference, pp. 1-4, 23-26 Jan. 2009.
15. V. Zivojnovic and H. Meyr, "Compiled HW/SW co-simulation," Design Automation Conference, pp. 690-695, 1996.
16. A. Hoffmann, T. Kogel, and H. Meyr, "A framework for fast hardware-software co-simulation," Conf. Design, Automation and Test in Europe, pp. 760-764, 13-16 March 2001.
17. L. Formaggio, F. Fummi, and G. Pravadelli, "A timing-accurate HW/SW co-simulation of an ISS with SystemC," CODES+ISSS '04, ACM, pp. 152-157, 2004.
18. M. Monton, J. Carrabina, and M. Burton, "Mixed simulation kernels for high performance virtual platforms," in Proc. Forum Specification Des. Languages, pp. 1-6, Sep. 2009.
19. SMIMS Corp. [Online] available: http://www.smims.com
20. Code snap. [Online] available: http://screamlab-ncku-2008.blogspot.com/2011/04/code-snaps.html
