
Hardware prototyping of FBMC/OQAM baseband for 5G mobile communication


Jeremy Nadal , Charbel Abdel Nour , Amer Baghdadi , Hao Lin
Institut Mines-Telecom; Telecom Bretagne; Lab-STICC, Technopole Brest-Iroise, 29238 Brest, France
Orange Labs, 4 rue du Clos Courtel, 35512 Cesson Sevigne, France

Email: {jeremy.nadal,charbel.abdelnour,amer.baghdadi}@telecom-bretagne.eu, hao.lin@orange.com

Abstract: Embedded systems in the field of digital communications are becoming increasingly diversified and complex. This trend is being confirmed with the emergence of many new application scenarios for mobile communication systems beyond 2020. In this context, rapid prototyping experiences are of high interest for performance validation and proof-of-concept of the diverse proposed communication techniques. In this paper, we present a new design and prototyping experience of an advanced communication system based on filter-bank multi-carrier (FBMC) modulation. This modulation is currently being studied by recent research projects as a key enabler for the future flexible 5G air interface. The paper illustrates the complete design and prototyping flow from algorithm specification to on-board validation and demonstration. The proposed prototype makes it possible to illustrate and evaluate the performance of this new waveform compared to state-of-the-art OFDM-based systems.

I. INTRODUCTION

Next generation mobile communication systems are foreseen to provide ubiquitous connectivity and seamless service delivery in all circumstances. There are forecasts of a total of 50 billion connected devices by 2020 [1]. This huge expected number of devices and the coexistence of human-centric and machine-type applications will lead to a large diversity of communication scenarios and characteristics [2]. In this context, many advanced communication techniques are under investigation. Each of these techniques is typically suitable for a subset of the foreseen communication scenarios.

However, the proposed new communication techniques are often studied and analysed at the algorithmic level considering mainly the quality of the communication link, i.e. quality of service. Although this remains one of the main Key Performance Indicators (KPI), the related hardware and energy efficiencies are becoming increasingly crucial requirements for future mobile terminals and networks.

Thus, the availability of new rapid design and validation flows, and of related prototyping experiences, is of high interest for performance validation and proof-of-concept of the diverse proposed communication techniques. In this context, this paper presents a new design and prototyping experience of an advanced communication system based on filter-bank multi-carrier (FBMC) modulation. This modulation is currently being studied by recent research projects as a key enabler for the future flexible 5G air interface [3]. It exhibits a better spectrum shape compared to the traditional OFDM (Orthogonal Frequency-Division Multiplexing) and enables better spectrum usage and mobility support.

The paper considers the recent FBMC/OQAM (Offset Quadrature Amplitude Modulation) technical component proposed in the framework of the METIS project (the European flagship 5G project) [4]. It illustrates the complete design and prototyping flow, including: (1) algorithm simplification and optimisation, (2) architecture exploration, (3) hardware implementation, and (4) on-board validation and demonstration. To the best of the authors' knowledge, this constitutes the first published design and prototyping experience related to this new technical component. The proposed contribution serves as a proof-of-concept of the new waveform and allows for rapid architecture exploration, performance evaluation and comparison with state-of-the-art OFDM-based systems.

The rest of the paper is organized as follows. Section II gives a brief technical description of the OFDM and FBMC/OQAM modulation principles. Section III presents in detail the different steps of the proposed design and prototyping flow. This is done for both OFDM and FBMC/OQAM based modulators using similar algorithm/architecture choices for reference and comparison purposes. The implementation results are summarized and discussed in Section IV. Finally, the paper concludes with Section V.

II. TECHNICAL DESCRIPTION OF TARGET COMPONENTS

MultiCarrier Modulation (MCM) schemes enable a better resistance to multipath channels by dividing the available bandwidth suffering from frequency selectivity into multiple sub-bands corresponding to the available sub-carriers. Enjoying a flat channel response, each sub-carrier holds a QAM symbol.

Nowadays, OFDM represents the MCM with the widest reach. In case of perfect synchronization, the OFDM rectangular pulse shape leads to no Inter-Carrier Interference (ICI). In addition, it is practical to implement, since it mainly calls for a Fast Fourier Transform (FFT) computation. Some variants were adopted in different standards. For instance, DMT (Discrete Multi Tone) is a baseband OFDM transmission used in ADSL (Asymmetric Digital Subscriber Line). OFDMA (Orthogonal Frequency-Division Multiple Access) is another variant, currently used in the LTE (Long Term Evolution) standard, where a group of sub-carriers is assigned to an individual user, making it suitable for multiple access.

Thanks to their higher robustness against synchronization errors and their better resistance to the Doppler effect, MCM schemes applying a filter bank were studied in recent years. One appealing variant from this new family of MCM is the FBMC/OQAM modulation [5]. Therefore, a comparison with OFDM currently adopted in 4G systems becomes of great interest.
[Figure: block diagrams of the OFDM chain (QAM mapper, iFFT of length M, CP insertion) and of the FBMC/OQAM chain (OQAM mapper with a delay of M/2 on the imaginary branch, then two branches of pre-processing, iFFT of length M and PPN whose outputs are added), together with the OFDM spectral shape and the FBMC spectrum shape with the TFL1 filter.]
Fig. 1. OFDM and FBMC/OQAM data flow and system description

A. OFDM technical description

The binary information bits to be transmitted are first modulated to generate complex In-phase (I) and Quadrature (Q) components $c_n(m)$. A total of $M$ QAM symbols are modulated, corresponding to the number of active sub-carriers of OFDM. Then, an IFFT of length $M$ is computed and a block of $M$ complex samples is generated in time domain. Unused sub-carriers are padded to zero at the input of the IFFT. The baseband OFDM modulation in discrete time domain can be written as:

$$ s(k) = \sum_{n=-\infty}^{+\infty} \Pi(k - nM) \sum_{m=0}^{M-1} c_n(m)\, e^{j\frac{2\pi m k}{M}}, \tag{1} $$

where $M$ represents the overall number of sub-carriers, $c_n(m)$ the complex valued data from a QAM constellation at sub-carrier index $m$ and block index $n$, $s(k)$ is the complex output of the OFDM modulation, and $\Pi$ denotes the following rectangular function:

$$ \Pi(k) = \begin{cases} 1 & \text{if } 0 \le k \le M-1, \\ 0 & \text{elsewhere.} \end{cases} \tag{2} $$

Every sub-carrier has a fixed cardinal sine waveform in frequency domain. OFDM is a block-oriented processing modulation due to the IFFT computation, and a more condensed expression can be written as:

$$ s_n(k) = \sum_{m=0}^{M-1} c_n(m)\, e^{j\frac{2\pi m k}{M}}, \tag{3} $$

where $s_n(k)$ is the complex output sample of the OFDM modulation, at block index $n$ and sample index $k$.

A cyclic prefix (CP) is inserted at the beginning of a block (OFDM symbol) as shown in Fig. 1 to avoid inter-symbol interference caused by the delay spread of a multipath channel. It is formed by the $L_{CP}$ last samples of the OFDM symbol, copied at the beginning of the symbol. $L_{CP}$ should be chosen to be greater than the delay spread of the channel. However, the insertion of the cyclic prefix may have a noticeable cost in bandwidth and therefore spectral efficiency.

B. FBMC/OQAM technical description

FBMC/OQAM modulation uses the main components of OFDM with three main differences, as shown in Fig. 1. First, it uses QAM mapping followed by the application of an offset of $M/2$ samples in time domain between the I ($a_{2n}(m)$) and the Q ($a_{2n+1}(m)$) components of $c_n(m)$. The resulting OQAM mapper enables an efficient reduction of ICI and InterSymbol Interference (ISI) when a properly designed filter is applied.

Second, it applies a filtering operation through the introduction of a PolyPhase Network (PPN) after the IFFT. This enables a better time and/or frequency localization, depending on the shape and the length of the prototype filter used. A good time localization reduces ISI whereas a good frequency localization reduces ICI. For instance, the prototype filter based on the IOTA (Isotropic Orthogonal Transform Algorithm) function [6] is used for an optimal time and frequency localization. The Raised Cosine Filter [7] is another possible prototype filter whose shape and time-frequency localization can be adjusted through a roll-off factor, but which theoretically requires an infinite filter length. Note that OFDM is a special case of FBMC: the rectangular waveform defined in expression (2) is a prototype filter of length $M$. In our study, we use the TFL1 prototype filter [8] because of its compromise between time and frequency localization. Furthermore, it is a filter of length $M$, which greatly reduces the computational complexity at the transmitter and the receiver side (simpler equalization). Two IFFT and PPN units are required for the FBMC/OQAM modulator. Afterwards, the two corresponding outputs are added together to provide the signal to be transmitted through the channel.

Third, thanks to the filter time and frequency localization and the application of the OQAM modulation, pulse-shape orthogonality can be achieved without the need for a cyclic prefix. Therefore, the spectral efficiency can be slightly improved.

The discrete time baseband FBMC/OQAM modulator output can be written as follows [9]:

$$ s(k) = \sum_{n=-\infty}^{+\infty} p\!\left(k - n\frac{M}{2}\right) \sum_{m=0}^{M-1} a_n(m)\, j^{n+m}\, e^{j\frac{2\pi}{M} m \left(k - \frac{L-1}{2}\right)}, \tag{4} $$

where $M$ is the overall number of sub-carriers and $p$ the prototype filter of length $L$. The term $e^{-j\frac{2\pi}{M} m \frac{L-1}{2}}$ is a phase component introduced by the delay of the filter, and is computed in the pre-processing unit before the IFFTs.

The spectrum shapes of the two considered modulation schemes are presented in Fig. 1, where the TFL1 prototype filter was used for FBMC/OQAM. The cardinal sine of OFDM sub-carriers has secondary lobes with higher amplitude than FBMC/OQAM with the TFL1 filter. Since each sub-carrier contributes to the global spectrum shape, OFDM has a higher out-of-band leakage. In case of non-perfect synchronization, the secondary lobes of adjacent sub-carriers will cause more ICI in the case of OFDM.
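
As an illustration of expression (3) and of the cyclic prefix described in Section II-A, the following minimal Python sketch evaluates one OFDM block directly from the sum in (3) and prepends the last $L_{CP}$ samples. A real modulator would use an IFFT instead of the direct sum. The function name `ofdm_symbol`, the toy block of M = 8 symbols and the value L_CP = 2 are assumptions chosen for illustration only; they are not parameters of the prototype.

```python
import cmath

def ofdm_symbol(c, l_cp):
    """Evaluate expression (3) for one block and prepend a cyclic prefix.

    c    : list of M complex QAM symbols (zeros on unused sub-carriers)
    l_cp : cyclic prefix length in samples (hypothetical parameter name)
    """
    M = len(c)
    # Direct evaluation of s_n(k) = sum_m c_n(m) * exp(j*2*pi*m*k/M)
    s = [sum(c[m] * cmath.exp(2j * cmath.pi * m * k / M) for m in range(M))
         for k in range(M)]
    # The cyclic prefix is a copy of the last l_cp samples of the block
    return s[M - l_cp:] + s

# Toy block: M = 8 sub-carriers, 4-QAM symbols, two unused sub-carriers at zero
c = [1+1j, -1+1j, 0, 1-1j, -1-1j, 0, 1+1j, -1-1j]
tx = ofdm_symbol(c, l_cp=2)
```

The transmitted block has M + L_CP samples, which makes the bandwidth cost of the prefix explicit: here 2 of the 10 samples carry no new information.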
III. PROPOSED DESIGN AND PROTOTYPING FLOW

This section details the complete design and prototyping flow that we propose to implement and validate the recently introduced FBMC-based waveform for 5G mobile communication. The different development phases, from algorithm specification to on-board validation and demonstration, are summarized in Fig. 2.

[Figure: flow chart. Development framework entry: technical component (e.g. FBMC/OQAM modulator) with algorithm/technique description, reference software model and system parameters; related aspects with target/estimated performances and usage scenarios/test cases. Phase 1, Algorithm Simplification and Optimisation: simplified algorithms suitable for HW implementation; efficient numeric representation (quantization issues); algorithm parallelisation techniques; computation, communication, and memory requirements; performance assessment w.r.t. the reference model and the target/estimated performances. Phase 2, Architecture Exploration: architectural choices to fulfill the target performances; combined algorithm/architecture optimization techniques; various architecture models (ASIP, MPSoC, etc.); emphasis on parallelism techniques and architecture efficiency; performance assessment; bit-true SW model. Phase 3, Hardware Implementation: traditional and recent digital hardware design methodologies (system-level design, HLS, ASIP-based design, etc.); FPGA and ASIC target technologies; latest design tools and technology libraries available; performance assessment; validation through behavioral and post-synthesis simulations. Phase 4, On-Board Validation and Demonstration: prototyping of a complete communication system on the target FPGA-based platform; dedicated GUI (control and real time results monitoring); scenarios/test cases; final demonstrator and real time performance assessment.]
Fig. 2. Proposed design and prototyping flow

The development framework entry consists of the description of the technical component to be demonstrated and the related aspects of usage scenarios and estimated performances. In the presented design and prototyping experience, these inputs have been made available by the company Orange Labs in the context of the METIS project [4]. The description of the technical component includes the following: (1) a detailed description of the proposed algorithm/technique, which has been summarized in the previous section, (2) a reference software model, including a reference software testbench, and (3) supported system parameters, including those related to the target channel models. Furthermore, the related aspects in terms of estimated performances and usage scenarios were provided to allow adequate results validation and demonstration setup.

In order to implement and evaluate the performance of the new FBMC-related waveform, several configurations can be proposed. The simplest and most direct one consists of implementing the transmitter chain and comparing the achieved performance with that of the state-of-the-art OFDM system. The power spectral density can be measured at the output of the transmitter and compared with the OFDM case for appropriate system parameters, in addition to the hardware complexity comparisons. In this paper, we consider this configuration and we propose to develop a hardware prototype implementing both state-of-the-art OFDM and new FBMC/OQAM based transmitters. For the purpose of fair comparison, the same architectural choices and optimisation techniques are devised for both transmitters. Furthermore, typical LTE system parameters as defined in 4G mobile systems are considered.

Based on the above considerations, the rest of this section details the four design and prototyping phases illustrated in Fig. 2.

A. Algorithm simplification and optimisation

The first phase in the development framework starts by analysing the technical component from an implementation/demonstration perspective. The objective in this phase is to achieve simplified algorithms which are suitable for hardware implementation. It targets the exploration of inherent properties of the algorithm which can be exploited for efficient architecture design (second phase of the development framework). In this context, quantization issues are studied in order to propose an efficient numeric representation of the processed data. The impact of the proposed optimisations is assessed and compared. In case of a potential high throughput requirement, algorithm parallelisation techniques should be proposed and their efficiency characterised. Furthermore, besides the computation complexity, communication and memory requirements are analysed in this phase and optimisation techniques are proposed.

In the considered FBMC/OQAM modulator and the state-of-the-art OFDM one, the main digital processing requirement lies in the IFFT unit as illustrated in the previous section. Depending on the application, the length, precision, and speed of the FFT may vary. The LTE standard specifies FFT lengths in the range of 128 to 2048. The FFT operation has been proven to be both computationally intensive, in terms of arithmetic operations, and communication intensive, in terms of data swapping/exchanging in the storage. Algorithm simplifications and optimisations related to FFT implementation have been widely investigated in the literature. This is done with close consideration and evaluation of the hardware efficiency, as illustrated in the design loop between phases 1 and 2 of Fig. 2.

Many algorithm/architecture variants have been proposed with different features in terms of parallelism, pipeline structure, operations scheduling, resources usage, and memory requirement. Analysing the existing optimisation techniques in this context, we selected one of the most efficient algorithm/architecture solutions, known as Radix-$2^2$ Single-path Delay Feedback (R$2^2$SDF) [10]. This solution allows for a fully pipelined architecture and minimum memory requirement with efficient utilisation of resources (multipliers, adders, and registers).

In the case of the FBMC/OQAM modulator, the particular structure of the real-valued samples enables the introduction of additional optimisation techniques for the related IFFT operations. Such optimisations, which are out of the scope of this paper, further lead to significant complexity reductions in the final implementation.

Regarding the other units of the FBMC/OQAM modulator, mainly the PPN filter could present different algorithm variants that need analysis and optimisation. For practical implementation purposes, we constrain the length $L$ of the prototype filter to $L = qM$, with $q$ a positive integer. Thus, expression (4) defined in the previous
section becomes:

$$ s(k) = \sum_{n=-\infty}^{+\infty} p\!\left(k - n\frac{M}{2}\right) \sum_{m=0}^{M-1} x_n(m)\, e^{j\frac{2\pi}{M} m k}, \qquad x_n(m) = a_n(m)\, j^{n+m}\, (-1)^{qm}\, e^{j\frac{\pi m}{M}} \tag{5} $$

The other units imply standard low-complexity computations where simple architecture optimisations can be done in phase 2 of the development flow.

In order to validate the above described algorithm choices and to propose an efficient numeric representation, a first floating-point MATLAB reference model has been developed. Quantization issues have been studied in order to propose an efficient fixed-point software model while preserving signal quality. The number of quantization bits, the fixed-point representation, the shift to apply for re-scaling after multiplication, and the type of approximation (floor or round) have been specified for each unit. Round approximation is chosen due to its reduced impact on the spectrum shape as shown in Fig. 3. In order to obtain a power spectrum density without significant quantization error, the following fixed-point representation has been devised enabling less than -70 dB Signal-To-quantization-Noise Ratio (SQNR) for all specified constellation sizes in LTE (4, 16 and 64-QAM):
- All samples at the input and output of each unit have 16-bit quantization. This also applies to each stage of the IFFT.
- All coefficients are quantised and stored with a precision of 12 bits. This includes the fractional part of twiddle factors (related to IFFT computations), PPN filter coefficients, and the fractional part of cosine coefficients in the pre-processing unit.

[Figure: two panels, (a) Floor approximation and (b) Round approximation, each plotting Power Spectral Density (dB) against Frequency (MHz) from -2.5 to 2 MHz for OFDM and FBMC/OQAM.]
Fig. 3. Power spectral density at the output of the modulator with the devised fixed-point representation for OFDM and FBMC/OQAM, with (a) floor approximation and (b) round approximation. Considered parameters are: IFFT length M, 16-QAM constellation, TFL1 prototype filter for FBMC/OQAM, bandwidth of 5 MHz, 200 non-active sub-carriers at the center.

B. Architecture exploration

The second phase in the development flow concerns the digital hardware architecture exploration. The objective is to exploit efficiently the proposed algorithm optimisation techniques by selecting the most suitable architectural choices in order to fulfill target performances. Various architecture models can be considered for this phase: dedicated architecture model, ASIP-based architecture model, multiprocessing, network-on-chip, diverse memory organisations, etc. Combined algorithm/architecture optimisation techniques can be explored and proposed in this context as illustrated in the design loop between phases 1 and 2 of Fig. 2. The outcome of this phase consists of an original architecture fully specified and ready for the hardware implementation phase. This phase also includes the refinement of the reference software model (including the testbench) into a bit-true model for validation and to be used as a reference for the hardware implementation.

1) OFDM modulator architecture: The first unit of the proposed OFDM modulator architecture is the QAM mapper, which is implemented through a simple constellation Look-Up Table (LUT). To support up to 64-QAM, as specified in the LTE standard, a LUT with 64 locations, each of 12 × 2 bits width, is required.

The second unit is the IFFT, which is the core element of the OFDM modulator architecture. As stated in the previous sub-section, the architecture exploration and algorithm optimisation phases lead to the selection of the efficient R$2^2$SDF solution [10]. A low-complexity, fully pipelined architecture with minimum memory requirement is devised. The R$2^2$SDF exploits the fact that an $M$-point IFFT can be recursively decomposed into four IFFTs of length $M/4$, and can be implemented by $\log_4(M)$ stages of elementary IFFTs of length 4 (when $M$ is a power of 4), called radix-4 butterflies. Instead of computing all butterflies iteratively stage by stage, all stages are computed at the same time, in a pipelined way. To do so, a feedback buffer is attached to each of the $\log_4(M)$ radix-4 stages. At each clock cycle, one sample is processed at each stage, then put into the buffer or sent to the next stage. The R$2^2$SDF also optimizes the radix-4 butterfly by decomposing it into two pipelined radix-2 butterflies (hence the name R$2^2$), each having two adders and one buffer. With such a decomposition, the hardware complexity of this architecture amounts to $\log_4(M) - 1$ multipliers, $4\log_4(M)$ adders, and memory proportional to $M - 1$.

The devised architecture for the IFFT uses the Decimation In Frequency (DIF) decomposition, which results in output samples in bit-reversed order. Thus, to avoid additional memory usage and latency overhead, the insertion of the cyclic prefix and the reordering operation are done jointly. IFFT output samples are alternately read and written in a memory unit of depth $M$, in bit-reversed and normal order, with the $L_{CP}$ last samples read first to generate the cyclic prefix between subsequent blocks.

The resulting OFDM modulator architecture is fully pipelined, enabling continuous stream processing of one complex sample in baseband discrete time domain per clock cycle.

2) FBMC/OQAM modulator architecture: The first unit in the proposed FBMC/OQAM modulator architecture is the Offset QAM mapper. A LUT-based architecture, identical to that of the OFDM modulator described above, is used. In addition, a simple First In First Out (FIFO) buffer of size $M/2$ is inserted at the imaginary (Q) output component to introduce the corresponding offset.

The pre-processing unit computes the $x_n(m)$ samples from $a_n(m)$ as expressed in (5). The proposed architecture is depicted in Fig. 4 for the $x_{2n}(m)$ sub-sequence. The exponential term is computed before the phase term $j^m$ to deal with real valued input and reduce the number of multipliers. A LUT is used to store the sine and cosine coefficients, and a counter generates the appropriate addresses. The term $(-1)^{n+mq} j^m$ simplified from (5) does not require an additional multiplier since it can be obtained by swapping the real and imaginary parts using multiplexers, while the sign inversion only requires a two's complement logic unit (Fig. 4).

Regarding the IFFT unit, the same architecture as for the OFDM modulator is devised. Two $M$-point IFFT units are used to process the two real-valued offset streams of the OQAM mapped symbols. As stated in the previous sub-section, further optimisations can be applied in this context,
however they are out of the scope of this paper. Furthermore, reorder units are used as in the OFDM modulator, except that cyclic prefix insertion is removed in the FBMC/OQAM modulator.

[Figure: pre-processing datapath with a modulo-2M counter addressing a LUT of cosine and sine ROM coefficients, two multipliers on the real input $a_{2n}(m)$, multiplexers and sign inversion controlled by counter bits, and pipeline registers.]
Fig. 4. Proposed architecture for the pre-processing unit

The last unit of the FBMC/OQAM modulator is the PPN filter. Fig. 5 illustrates the devised architecture for a filter length $L = 3M$. The proposed architecture is derived from [11] with adaptation to the OQAM components structure. Filter coefficients are stored in memories shared between the two PPN units. Table I summarizes the PPN architecture complexity considering a filter of length $L = qM$. To optimize the number of pipeline registers, the cascaded adders should be mapped in a tree structure (applicable when $q > 3$).

[Figure: PolyPhase Network 1 and PolyPhase Network 2, each with FIFOs of depth M, multipliers, cascaded adders and pipeline registers, sharing three coefficient LUTs of depth M addressed by modulo-M counters, followed by an adder and a rescale unit.]
Fig. 5. Proposed PPN architecture for a filter length L = 3M with shared coefficient memories

TABLE I
PPN ARCHITECTURE COMPLEXITY FOR A FILTER OF LENGTH L = qM

Hardware resource                                        Quantity
RAM of depth M/2 for ping-pong FIFO implementation (a)   4(q - 1)
LUT of depth M                                           q
Real multipliers                                         4q
Real adders                                              4q - 2

(a) Note that for a one tap PPN (L = M), the FIFOs are not needed.

C. Hardware implementation

The third phase of the development flow concerns the hardware implementation of the architecture devised in phase 2. Since one of the final goals is to have a flexible implementation that enables further explorations, an FPGA-based (Field-Programmable Gate Array) platform was preferred as the target integration technology. Thus, the above devised architectures have been described directly in VHDL (Very high speed integrated circuit Hardware Description Language) using generic forms for flexibility and scalability purposes. Application-specific VHDL packages have been defined to include all design parameters (IFFT length, QAM size, quantization, etc.) and LUT contents (i.e. various coefficients). To have a direct link with the MATLAB fixed-point software model, all defined VHDL packages and test vectors are directly generated from MATLAB. The validation environment depicted in Fig. 6 is proposed to link the MATLAB reference model and the hardware-level simulations using ModelSim. The validation flow in this phase can be described as follows:
1) First, all modulator parameters have to be configured in the MATLAB software model to represent a specific communication scenario (number of sub-carriers $M$, constellation size (QPSK to 64-QAM), prototype filter, quantization, etc.). Then the software model is executed.
2) When the simulation is finished, the input binary information and the output samples ($s_n(k)$ and $s(k)$) for each modulator are stored in files. They serve as reference vectors for the hardware simulation. In addition, the VHDL packages are generated, using the classic print function in MATLAB.
3) In the third step, the hardware simulation is launched using a simple specific script. It compiles all the VHDL sources (including the generated packages), automatically sets up the testbench parameters, executes the ModelSim simulation, and displays the signal waveforms.
4) Then, the generated output from the hardware simulation is written into a file and directly compared in the testbench by loading the reference output generated from the software model.
5) In the last step, the output samples from the hardware simulation are used to plot figures like power spectral density for comparison and demonstration purposes.

[Figure: software simulation (Software_model.m, run, then plot to spectrum.png) and hardware simulation (script.tcl compiling testbench.vhd, Packages.vhd, OFDM.vhd and FBMC.vhd), linked by generated parameters and ROM coefficients, binary data, reference test vectors (Reference_vectors.txt) and OFDM & FBMC/OQAM output samples (samples.txt).]
Fig. 6. Software and hardware co-simulation during implementation and testing step

D. On-board validation and demonstration

The last phase in the development flow concerns the on-board implementation and the development of the final demonstrator environment. For on-board validation purposes, the ZedBoard evaluation board integrating the Xilinx Zynq-7000 XC7Z020 System-On-Chip (SoC) was used in addition to the Xilinx ChipScope Pro Analyzer as shown in Fig. 7. Using ChipScope, two IP cores have been instantiated in the design.
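
The reference-vector comparison of step 4 in Section III-C, which the on-board error flag of Fig. 7 mirrors, can be sketched in Python as follows. This is only an illustration of the principle: the function name `first_mismatch` and the representation of each sample as an (I, Q) pair of integers are hypothetical, and the actual file format of the test vectors is not specified here.

```python
def first_mismatch(reference, hardware):
    """Return the index of the first differing sample, or None if all match.

    Both arguments are sequences of (I, Q) integer pairs. Because the
    reference model is bit-true, samples are compared for exact equality.
    """
    for idx, (ref, hw) in enumerate(zip(reference, hardware)):
        if ref != hw:
            return idx
    # A missing or extra sample is reported at the end of the shorter stream
    if len(reference) != len(hardware):
        return min(len(reference), len(hardware))
    return None

# Hypothetical 16-bit samples: the third hardware sample differs by one LSB
ref_samples = [(120, -45), (-3021, 877), (15, 15), (0, -1)]
hw_samples = [(120, -45), (-3021, 877), (15, 16), (0, -1)]
error_index = first_mismatch(ref_samples, hw_samples)
```

Exact equality is the natural criterion here because the reference is a fixed-point model; a tolerance would only be needed when comparing against the floating-point model.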
[Figure: Zynq-7000 SoC (on ZedBoard) hosting the OFDM and FBMC/OQAM transmitters, memory blocks for binary data and reference samples (ref_OFDM, ref_FBMC), address counters, a COMPARE unit producing error_flag, and the ChipScope ICON and ILA cores (Xilinx IP cores). The main FSM has the states TRANSMIT, PROCESS & COMPARE, FLUSH PIPELINE and SWITCH MODULATOR, with control signals en_data, en_compare, switch, valid, last_sample and last_compare.]
Fig. 7. On-board validation system setup using Xilinx ChipScope analyzer

The input reference binary samples are stored in internal memory blocks, and the output samples are compared to the bit-true reference samples, alternately for OFDM and FBMC/OQAM. An error flag signal is generated as a result of this comparison to rapidly monitor sample error occurrences. The on-board validation system is controlled through a global 4-state Finite State Machine (FSM) as shown in Fig. 7 and described below:
- In state TRANSMIT the first modulator is enabled, and reference binary data are sent (the first modulator is chosen arbitrarily between FBMC/OQAM and OFDM). Because of the processing latency, the output samples are not ready to be compared in this state. The modulator must set a valid signal when output samples are available.
- In state PROCESS & COMPARE the main part of the processing and results comparison is done. The first output samples are available, thus the counter which generates the memory addresses for reference samples is enabled and comparison of results can start. Input binary data continues to be sent, until the counter reaches the last address (control signal last_sample = 1).
- In state FLUSH PIPELINE the last samples which remain in the pipeline are processed and compared.
- In state SWITCH MODULATOR the second modulator is selected. As shown in Fig. 7, a flip-flop is devised to avoid doubling the number of states to control the FBMC/OQAM and OFDM modulators. This state lasts for one clock cycle, then the system goes back to state TRANSMIT.

The main constraint in this validation system concerns the amount of on-chip memory required. If $N$ is the number of active sub-carriers modulated with a $K$-QAM constellation ($K$ = 4, 16, 64...), and $N_{blocks}$ the number of OFDM blocks to send, then a memory size of $N_{blocks} N \times \log_2(K)$ (depth × width) has to be used to store all the binary input

number and type of signals to observe, besides the time of observation.

For demonstration purposes, the ZedBoard evaluation board is extended, through an FMC (FPGA Mezzanine Card) connector, with the front-end board AD-FMCOMMS1-EBZ from Analog Devices as illustrated in Fig. 8. This board integrates a dual 16-bit DAC, with Low Voltage Differential Signaling (LVDS) for high data rate transmission up to 1200 Mega Samples Per Second (MSPS). A Double Data Rate (DDR) mode allows the real and imaginary parts of the output modulated samples to be transmitted on the rising and falling clock edges. The central carrier frequency can be set from 400 MHz up to 6 GHz, with a flat gain response over the 450 MHz to 3.8 GHz band, and a power amplifier of 20 dB. The front-end board is controlled via an Inter Integrated Circuit (I2C) bus protocol.

In the proposed demonstration setup (Fig. 8), binary data is generated randomly with a Linear Feedback Shift Register (LFSR) to avoid storing pre-defined samples, which would require high memory resources. A dedicated unit is designed to adapt the modulator output signals, including clocks, to the DAC interface (LVDS and DDR). A dual-core ARM processor is used to configure the parameters of the AD-FMCOMMS1-EBZ board through the I2C interface. It also configures the target modulator through slave registers, which enable the LFSR and select the modulator type, the QAM constellation and the number of active sub-carriers. A Graphical User Interface (GUI) is developed in MATLAB to allow the user to select the modulator parameters and to program the FPGA (download bitstream and processor software). A UART interface is present to allow communication between the GUI and the processor. Thereby, the modulators and the front end can be dynamically configured with MATLAB. A spectrum analyzer is used at the output of the front-end board to visualize and evaluate the power spectral density.

IV. IMPLEMENTATION RESULTS

The previous section has already presented the conducted implementation flow, including on-chip validation and demonstration setup. Table II summarizes the synthesis results for both OFDM and FBMC/OQAM modulators when targeting the SoC XC7Z020-2 Xilinx Zynq-7000 device of the ZedBoard. These results correspond to an IFFT size of $M = 512$, a one tap ($q = 1$) TFL1 prototype filter, and a 16-QAM constellation. The results show that the required amounts of memory, registers, and LUTs used as logic are almost doubled for the FBMC/OQAM modulator compared to OFDM. This corresponds to the devised algorithm/architecture choices, where most of the baseband processing units are doubled in the FBMC/OQAM modulator (Fig. 1). Furthermore, the 40 real multipliers in FBMC/OQAM come from the doubled units for IFFT (16 × 2), PPN (2q × 2) and pre-processing (2 × 2).
data. In addition, two memory blocks are needed to store It is also to note that in this table the number of LUTs used
FBMC/OQAM and OFDM reference output samples. If qout as logic includes the memories used to store twiddle factor
is the output samples quantization (16 bits in our case), then coefficients, PPN filter coefficients, and the sine/cosine tables
a memory of size (Nblocks M + M 2 ) 2qout has to be used
in the pre-processing unit.
for FBMC/OQAM, and of size Nblocks (M + LCP ) 2qout Regarding the latency, it is almost identical for both modu-
for OFDM. To fit all the reference samples into the FPGA, lators. It corresponds to the latency of the pipeline registers (40
internal block RAMs must be used. Furthermore, the internal clock cycles in OFDM or 52 clock cycles in FBMC/OQAM)
logic analyzer requires additional memories, depending on the and the latency introduced by the IFFT (M clock cycles) and
MATLAB GUI Zynq-7000 All Programmable SoC AD-FMCOMMS1 RF front-end SPECTRUM ANALYZER

OFDM TX FFT size: 512 Sample rate: 7.68 MHz

DAC LVDS
Interface
Active subcarriers: 300 Bandwidth: 5 MHz
to DAC
LFSR Constellation: 16-QAM Prototype filter: TFL1
Ctrl. FBMC/OQAM
AD9122
enable TX
Slv_Regs
interface
select QAM, subcarriers select
AXI Interconnect OFDM
AXI Interconnect
20 dB

Ethernet
FBMC

Int. Ctrl
DDRX
UART
Dual-core ARM Cortex-A9 processing system

IIC
I2C config.

Host computer

Fig. 8. Proposed demonstration setup with front-end interface

TABLE II tors are included in the proposed hardware prototype, using


S YNTHESIS RESULTS FOR OFDM AND FBMC/OQAM FOR M = 512 AND similar algorithm/architecture choices and LTE typical sys-
TFL1 PROTOTYPE FILTER ( ONE TAP )
tem parameters for reference and comparison purpose. Fully
Performance OFDM FBMC/OQAM a pipelined architecture is proposed enabling continuous stream
Flip-Flops 3006 5687 processing of one complex sample in baseband discrete time
Hardware LUTs (as logic) 3599 7385 domain per clock cycle. Current work is targeting reducing
complexity LUTs (as RAM) 912 1632
real multipliers 16 40 the FBMC/OQAM complexity both at transmitter and receiver
Latency (clock cycle) 1064 1076 sides. Initial results allow to improve latency and to obtain
Throughput (Sample per clock cycle) 1 1 equivalent hardware complexity compared to the OFDM-based
Maximum clock cycle (MHz) 210 210 modulator.
a New algorithm and architecture optimization techniques (out of the
scope of this paper) have been devised allowing to improve latency ACKNOWLEDGMENT
and to reach equivalent hardware complexity compared to OFDM
based modulator. This work has been performed in the framework of the FP7
project ICT-317669 METIS, which is partly funded by the
reordering units (M clock cycles). In fact, the M European Union.
2 offset in
FBMC/OQAM is not counted as latency as the sub-sequence R EFERENCES
from real sub-carriers are directly available for processing.
[1] Ericsson, More than 50 Billion Connected Devices, White Paper,
However, the related latency is considered at the receiver side http://www.ericsson.com/res/docs/whitepapers/wp-50-billions.pdf, Feb.
since the real and imaginary sub-sequences are required in 2011.
order to recover the original complex QAM symbols. Latency [2] A. Osseiran, F. Boccardi, V. Braun, K. Kusume, P. Marsch, M. Maternia,
O. Queseth, M. Schellmann, H. Schotten, H. Taoka, H. Tullberg,
results are shown in Table II. M. Uusitalo, B. Timus, and M. Fallgren, Scenarios for 5G mobile
The critical path is related to the real multiplier for both and wireless communications: the vision of the METIS project, IEEE
modulators, so there is no difference in terms of reachable Communications Magazine, vol. 52, no. 5, pp. 2635, May 2014.
[3] M. Schellmann, Z. Zhao, H. Lin, P. Siohan, N. Rajatheva, V. Luecken,
clock frequency which is 210 MHz on the target Zynq-7000 and A. Ishaque, FBMC-based air interface for 5G Mobile: Challenges
device. Regarding the throughput, the devised architecture is and proposed solutions, in Proc. of the International Conference
fully pipelined allowing continuous stream processing of one on Cognitive Radio Oriented Wireless Networks (Crowncom), Oulu,
Finland, June 2014.
complex sample in baseband discrete time domain per clock [4] METIS, Mobile and Wireless Communications Enablers for the Twenty-
cycle. It is worth to note that the baseband processing can be Twenty Information Society, EU 7th Framework Programme project,
done at a higher rate than the sampling frequency (maximum http://www.metis2020.com.
[5] P. Siohan, M. Gharba, and R. Legouable, An alternative multiple access
30.72 MHz for LTE) if a high latency constraint is imposed. scheme for the uplink 3GPP/LTE based on OFDM/OQAM, in Proc. of
In this case, a buffer stage should be added before the DAC the IEEE Int. Symp. Wireless Commun. Syst. (ISWCS), Barcelona, Sep.
to adapt the different clock domains. 2010, pp. 941945.
[6] B. Le Floch, M. Alard, and C. Berrou, Coded orthogonal frequency
division multiplex, Proceedings of the IEEE, vol. 83, pp. 982996,
V. C ONCLUSION Jun. 1995.
[7] A. Sahin, I. Guvenc, and H. Arslan, A Survey on Multicarrier Com-
In this paper, a new design and prototyping experience of munications: Prototype Filters, Lattice Structures, and Implementation
an advanced communication system based on FBMC/OQAM Aspects, IEEE Communications Surveys Tutorials, vol. PP, no. 99, pp.
modulation is presented. The hardware prototyped new wave- 127, Dec. 2013.
[8] C. Lele, P. Siohan, and R. Legouable, 2 dB better than CP-OFDM with
form is considered as a key enabler for the future flexible OFDM/OQAM for preamble-based channel estimation, in Proc. of the
5G air interface. To the best of the authors knowledge, this IEEE International Conference on Communications (ICC), Beijing, May
constitutes the first published design and prototyping expe- 2008, pp. 13021306.
[9] P. Siohan, Siclet, C., and N. Lacaille, Analysis and design of
rience related to this new technical component. The paper OFDM/OQAM systems based on filterbank theory, IEEE Trans. Signal
has illustrated the complete design and prototyping flow, Process., vol. 50, pp. 11701183, May 2002.
including: (1) algorithm simplification and optimisation, (2) [10] H. Shousheng and M. Torkelson, A new approach to pipeline FFT
processor, in Proc. of the International Parallel Processing Symposium
architecture exploration, (3) hardware implementation, and (IPPS), Apr. 1996, pp. 766770.
(4) on-board validation and demonstration. The proposed [11] D. Dasalukunte, V. Owall, and S. Mehmood, Complexity analysis of
contribution serves as a proof-of-concept of the new waveform IOTA filter architectures in faster-than-Nyquist multicarrier systems, in
Proc. of the NORCHIP Conference, Nov. 2011, pp. 1415.
and allows for rapid architecture exploration and performance
evaluation and comparison with state-of-the-art OFDM-based
systems. Both OFDM and FBMC/OQAM based modula-
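As an illustration of the 4-state validation FSM described in the on-board validation discussion, the following Python sketch models the state transitions in software. It is only a behavioural sketch under stated assumptions, not the authors' HDL: the function names `step` and `run`, the input `pipeline_empty` (the condition used here to leave FLUSH PIPELINE) and the encoding of the modulator-select flip-flop as 0/1 are hypothetical and do not come from the paper.

```python
from enum import Enum, auto


class State(Enum):
    TRANSMIT = auto()
    PROCESS_COMPARE = auto()
    FLUSH_PIPELINE = auto()
    SWITCH_MODULATOR = auto()


def step(state, select, valid, last_sample, pipeline_empty):
    """One clock edge: return (next_state, next_select).

    select         -- modelled flip-flop choosing the modulator under test (0 or 1)
    valid          -- the modulator signals that output samples are available
    last_sample    -- the reference address counter reached the last address
    pipeline_empty -- assumed flag: all remaining samples have been compared
    """
    if state is State.TRANSMIT:
        # Wait for the processing latency to elapse before comparing.
        return (State.PROCESS_COMPARE if valid else state), select
    if state is State.PROCESS_COMPARE:
        # Keep sending and comparing until the last reference address.
        return (State.FLUSH_PIPELINE if last_sample else state), select
    if state is State.FLUSH_PIPELINE:
        # Drain the samples that remain in the pipeline.
        return (State.SWITCH_MODULATOR if pipeline_empty else state), select
    # SWITCH_MODULATOR lasts one cycle and toggles the select flip-flop.
    return State.TRANSMIT, 1 - select


def run(inputs):
    """Drive the FSM with (valid, last_sample, pipeline_empty) tuples, one per clock."""
    state, select = State.TRANSMIT, 0
    trace = []
    for valid, last_sample, pipeline_empty in inputs:
        state, select = step(state, select, valid, last_sample, pipeline_empty)
        trace.append((state.name, select))
    return trace
```

Keeping the modulator selection in a separate flip-flop, rather than in the state encoding, is what lets the same four states serve both modulators, as the text notes.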
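The memory-size expressions given for the validation system can be evaluated with a short Python sketch. The function names are hypothetical, and the example values in the usage note for the number of blocks and the cyclic prefix length are assumptions chosen only for illustration; M = 512, 300 active sub-carriers, 16-QAM and 16-bit quantization are taken from the paper.

```python
def bits_per_symbol(k):
    """Return log2(K) for a K-QAM constellation, K a power of two."""
    if k < 2 or k & (k - 1):
        raise ValueError("K must be a power of two")
    return k.bit_length() - 1


def input_memory(n_blocks, n_active, k):
    """Input data memory as (depth, width): N_blocks*N words of log2(K) bits."""
    return n_blocks * n_active, bits_per_symbol(k)


def fbmc_reference_bits(n_blocks, m, q_out=16):
    """FBMC/OQAM reference memory: (N_blocks*M + M/2) complex samples of 2*q_out bits."""
    return (n_blocks * m + m // 2) * 2 * q_out


def ofdm_reference_bits(n_blocks, m, l_cp, q_out=16):
    """OFDM reference memory: N_blocks*(M + L_CP) complex samples of 2*q_out bits."""
    return n_blocks * (m + l_cp) * 2 * q_out
```

For example, with the assumed values N_blocks = 4 and L_CP = 36, `input_memory(4, 300, 16)` gives a depth of 1200 words of 4 bits, `fbmc_reference_bits(4, 512)` gives 73728 bits and `ofdm_reference_bits(4, 512, 36)` gives 70144 bits.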
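The multiplier and latency figures of Table II follow from the counts stated in Section IV, which the following Python sketch reproduces as a consistency check. The function names are hypothetical; the per-unit counts (16 multipliers per IFFT, 2q per PPN, 2 per pre-processing unit, and 40 or 52 pipeline register stages) are those quoted in the text.

```python
def fbmc_real_multipliers(q=1, ifft_mults=16, preproc_mults=2):
    """Real multipliers in the FBMC/OQAM modulator, where each unit is doubled."""
    return 2 * ifft_mults + 2 * (2 * q) + 2 * preproc_mults


def modulator_latency(m, pipeline_registers):
    """Latency in clock cycles: pipeline registers + IFFT (M) + reordering (M)."""
    return pipeline_registers + m + m


def latency_seconds(cycles, clock_hz=210e6):
    """Convert a latency in clock cycles to seconds at the given clock frequency."""
    return cycles / clock_hz
```

With q = 1 and M = 512, `fbmc_real_multipliers()` returns 40, `modulator_latency(512, 40)` returns 1064 for OFDM and `modulator_latency(512, 52)` returns 1076 for FBMC/OQAM, matching Table II.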
