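\[ \frac{\sum_{(s',s),\,u_k=+1} P(s',s,y)}{\sum_{(s',s),\,u_k=-1} P(s',s,y)} \tag{4.17} \]
where
\[ \begin{aligned}
P(s',s,y) &= P(s', y_{j<k}) \cdot P(s, y_k \mid s') \cdot P(y_{j>k} \mid s) \\
&= P(s', y_{j<k}) \cdot P(s \mid s') \cdot P(y_k \mid s', s) \cdot P(y_{j>k} \mid s) \\
&= \alpha_{k-1}(s') \cdot \gamma_k(s', s) \cdot \beta_k(s)
\end{aligned} \tag{4.18} \]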
Here $y_{j<k}$ denotes the sequence of received symbols $y_j$ from the beginning of the trellis up to time $k-1$, and $y_{j>k}$ is the corresponding sequence from time $k+1$ up to the end of the trellis. The forward recursion and backward recursion of the MAP algorithm yield
\[ \alpha_k(s) = \sum_{s'} \gamma_k(s', s)\, \alpha_{k-1}(s') \tag{4.19} \]
\[ \beta_{k-1}(s') = \sum_{s} \gamma_k(s', s)\, \beta_k(s) \tag{4.20} \]
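\[ L(\hat{u}_k) = L(u_k \mid y) = \ln \frac{P(u_k = +1 \mid y)}{P(u_k = -1 \mid y)} = \ln \frac{\sum_{(s',s),\,u_k=+1} \alpha_{k-1}(s')\, \gamma_k(s', s)\, \beta_k(s)}{\sum_{(s',s),\,u_k=-1} \alpha_{k-1}(s')\, \gamma_k(s', s)\, \beta_k(s)} \tag{4.21} \]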
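Whenever there is a transition from $s'$ to $s$, $P(s \mid s') = P(u_k)$, where $u_k$ is the information bit corresponding to the transition from $s'$ to $s$. Hence
\[ \begin{aligned}
\gamma_k(s', s) &= P(s \mid s') \cdot p(y_k \mid s', s) \\
&= P(y_k \mid u_k) \cdot P(u_k)
\end{aligned} \tag{4.22} \]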
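For the index pair $(s', s)$,
\[ \begin{aligned}
P(y_k \mid u_k) &= P(y_{k,1} \mid u_k) \cdot \Big( \prod_{v=2}^{n} P(y_{k,v} \mid u_k, s', s) \Big) \\
&= P(y_{k,1} \mid u_k) \cdot \Big( \prod_{v=2}^{n} P(y_{k,v} \mid u_{k,v}) \Big)
\end{aligned} \tag{4.23} \]
is the joint probability of the independent received symbols, and
\[ P(u_k) = A_k\, e^{u_k L(u_k)/2} \tag{4.24} \]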
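From Equation (4.11), we have
\[ \begin{aligned}
P(y_k \mid u_k) &= P(y_{k,1} \mid u_k) \cdot \Big( \prod_{v=2}^{n} P(y_{k,v} \mid u_{k,v}) \Big) \\
&= B_k \exp\Big( \tfrac{1}{2} L_c\, y_{k,1}\, u_k \Big) \cdot \Big( \prod_{v=2}^{n} \exp\Big( \tfrac{1}{2} L_c\, y_{k,v}\, u_{k,v} \Big) \Big) \\
&= B_k \exp\Big( \tfrac{1}{2} L_c\, y_{k,1}\, u_k + \sum_{v=2}^{n} \tfrac{1}{2} L_c\, y_{k,v}\, u_{k,v} \Big)
\end{aligned} \tag{4.25} \]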
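Hence,
\[ \begin{aligned}
\gamma_k(s', s) &= P(y_k \mid u_k) \cdot P(u_k) \\
&= A_k B_k \exp\Big( \tfrac{1}{2} L_c\, y_{k,1}\, u_k + \sum_{v=2}^{n} \tfrac{1}{2} L_c\, y_{k,v}\, u_{k,v} + \tfrac{1}{2} u_k L(u_k) \Big)
\end{aligned} \tag{4.26} \]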
The terms $A_k$ and $B_k$ in Equation (4.26) are equal for all transitions from level $k-1$ to level $k$ and hence cancel out in the ratio of Equation (4.21). Thus we use
\[ \gamma_k(s', s) = \exp\Big( \tfrac{1}{2} L_c\, y_{k,1}\, u_k + \sum_{v=2}^{n} \tfrac{1}{2} L_c\, y_{k,v}\, u_{k,v} + \tfrac{1}{2} u_k L(u_k) \Big) \tag{4.27} \]
The extrinsic information can be calculated as
\[ L_e(\hat{u}_k) = L(\hat{u}_k) - \big[ L_c\, y_k + L(u_k) \big] \tag{4.28} \]
4.4.2 Log-MAP Algorithm.
The Log-MAP algorithm is a transformation of MAP that has equivalent performance without the practical implementation problems of MAP. It works in the logarithmic domain, where multiplication is converted to addition. The branch transition probabilities and the forward/backward recursion formulas are calculated as follows:
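\[ \begin{aligned}
\gamma_k^{LM}(s', s) &= \ln \gamma_k(s', s) \\
&= \tfrac{1}{2} L_c\, y_{k,1}\, u_k + \tfrac{1}{2} \sum_{v=2}^{n} L_c\, y_{k,v}\, x_{k,v} + \tfrac{1}{2} u_k L(u_k)
\end{aligned} \tag{4.29} \]
\[ \begin{aligned}
\alpha_k^{LM}(s) &= \ln \alpha_k(s) \\
&= \ln\Big( \sum_{s'} e^{\gamma_k^{LM}(s', s)}\, e^{\alpha_{k-1}^{LM}(s')} \Big) \\
&= \ln\Big( \sum_{s'} e^{\gamma_k^{LM}(s', s) + \alpha_{k-1}^{LM}(s')} \Big)
\end{aligned} \tag{4.30} \]
\[ \begin{aligned}
\beta_{k-1}^{LM}(s') &= \ln \beta_{k-1}(s') \\
&= \ln\Big( \sum_{s} e^{\gamma_k^{LM}(s', s)}\, e^{\beta_k^{LM}(s)} \Big) \\
&= \ln\Big( \sum_{s} e^{\gamma_k^{LM}(s', s) + \beta_k^{LM}(s)} \Big)
\end{aligned} \tag{4.31} \]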
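Therefore, the log-likelihood ratio is given by
\[ \begin{aligned}
L(\hat{u}_k) &= \ln \frac{\sum_{(s',s),\,u_k=+1} e^{\gamma_k^{LM}(s', s)}\, e^{\alpha_{k-1}^{LM}(s')}\, e^{\beta_k^{LM}(s)}}{\sum_{(s',s),\,u_k=-1} e^{\gamma_k^{LM}(s', s)}\, e^{\alpha_{k-1}^{LM}(s')}\, e^{\beta_k^{LM}(s)}} \\
&= \ln\Big( \sum_{(s',s),\,u_k=+1} e^{\gamma_k^{LM}(s', s)}\, e^{\alpha_{k-1}^{LM}(s')}\, e^{\beta_k^{LM}(s)} \Big) - \ln\Big( \sum_{(s',s),\,u_k=-1} e^{\gamma_k^{LM}(s', s)}\, e^{\alpha_{k-1}^{LM}(s')}\, e^{\beta_k^{LM}(s)} \Big)
\end{aligned} \tag{4.32} \]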
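Max Function. Define
\[ E(x, y) = \ln(e^x + e^y) \tag{4.33} \]
\[ \begin{aligned}
\ln(e^x + e^y) &= \ln e^x + \ln(e^x + e^y) - \ln e^x \\
&= x + \ln \frac{e^x + e^y}{e^x} \\
&= x + \ln(1 + e^{y-x})
\end{aligned} \tag{4.34} \]
In a similar way,
\[ \begin{aligned}
\ln(e^x + e^y) &= \ln e^y + \ln(e^x + e^y) - \ln e^y \\
&= y + \ln(1 + e^{x-y})
\end{aligned} \tag{4.35} \]
Hence
\[ \begin{aligned}
E(x, y) &= \ln(e^x + e^y) \\
&= \max(x, y) + \ln(1 + e^{-|x-y|})
\end{aligned} \tag{4.36} \]
and we take
\[ E(x, y) = \ln(e^x + e^y) \approx \max(x, y) \tag{4.37} \]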
We can easily prove that in general
\[ \begin{aligned}
E(x_1, x_2, \ldots, x_k) &= \ln \sum_{i=1}^{k} e^{x_i} = \max(x_i) + \ln \sum_{i=1}^{k} e^{x_i - \max(x_i)} \\
&= \max(x_i) + \delta(x_1, x_2, \ldots, x_k) \\
&= \max{}^{*}(x_i)
\end{aligned} \tag{4.38} \]
where $\delta(x_1, x_2, \ldots, x_k)$ is called the correction term and can be computed using a look-up table. Using equation (4.38), the calculations of the MAP algorithm are done without its complexity.
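The identity in equations (4.36) and (4.38) can be checked numerically. The following is a minimal Python sketch; the function names `max_star` and `max_star_n` are illustrative and not taken from the original design, and the correction term is computed exactly here rather than read from a look-up table.

```python
import math

def max_star(x, y):
    """Jacobian logarithm: ln(e^x + e^y) = max(x, y) + ln(1 + e^-|x - y|)."""
    return max(x, y) + math.log1p(math.exp(-abs(x - y)))

def max_star_n(values):
    """Fold max* pairwise over a sequence; equals ln(sum(e^v for v in values))."""
    result = values[0]
    for v in values[1:]:
        result = max_star(result, v)
    return result
```

Dropping the `log1p` term gives the Max-Log-MAP approximation of equation (4.37).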
4.4.3 Max-Log-Map Algorithm
With the max function, the Log-MAP algorithm becomes the Max-Log-MAP algorithm, resulting in some degradation in performance but a drastic reduction in computational complexity. The correction term in equation (4.38) is neglected:
\[ E(x_1, x_2, \ldots, x_k) \approx \max(x_i) \tag{4.39} \]
\[ A_k(s) = \alpha_k^{MLM}(s) = \max_{s'}\big( \gamma_k^{LM}(s', s) + \alpha_{k-1}^{LM}(s') \big) \tag{4.40} \]
\[ B_{k-1}(s') = \beta_{k-1}^{MLM}(s') = \max_{s}\big( \gamma_k^{LM}(s', s) + \beta_k^{LM}(s) \big) \tag{4.41} \]
\[ L(\hat{u}_k) = \max_{(s',s),\,u_k=+1} \big[ \gamma_k^{LM}(s', s) + \alpha_{k-1}^{LM}(s') + \beta_k^{LM}(s) \big] - \max_{(s',s),\,u_k=-1} \big[ \gamma_k^{LM}(s', s) + \alpha_{k-1}^{LM}(s') + \beta_k^{LM}(s) \big] \tag{4.42} \]
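Equations (4.40) to (4.42) can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not the implemented decoder: the branch metrics are assumed to be given as one dictionary per trellis step, keyed by the hypothetical tuple `(s_prev, s_next, u)`, and the trellis is assumed to start and end in state 0.

```python
NEG_INF = float("-inf")

def max_log_map(gamma, num_states):
    """Return the LLR of every step; gamma[k] maps (s_prev, s_next, u) to a metric."""
    n = len(gamma)
    # alpha[k] holds the forward metrics before step k, beta[k + 1] those after it.
    alpha = [[NEG_INF] * num_states for _ in range(n + 1)]
    beta = [[NEG_INF] * num_states for _ in range(n + 1)]
    alpha[0][0] = 0.0  # assumption: known start state 0
    beta[n][0] = 0.0   # assumption: terminated trellis, final state 0
    for k in range(n):  # forward recursion, equation (4.40)
        for (sp, s, _u), g in gamma[k].items():
            alpha[k + 1][s] = max(alpha[k + 1][s], alpha[k][sp] + g)
    for k in range(n - 1, -1, -1):  # backward recursion, equation (4.41)
        for (sp, s, _u), g in gamma[k].items():
            beta[k][sp] = max(beta[k][sp], beta[k + 1][s] + g)
    llr = []
    for k in range(n):  # soft output, equation (4.42)
        best = {+1: NEG_INF, -1: NEG_INF}
        for (sp, s, u), g in gamma[k].items():
            best[u] = max(best[u], alpha[k][sp] + g + beta[k + 1][s])
        llr.append(best[+1] - best[-1])
    return llr
```

Only additions and comparisons appear in the loops, which is the property that makes the algorithm attractive for hardware.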
4.5 Improvements In Turbo Decoding
4.5.1 Extrinsic Information Scaling
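Extrinsic information is calculated as shown in equation (4.15):
\[ L_{e2}(\hat{u}) = L_2(\hat{u}) - \big[ L_c\, y + L_{e1}(\hat{u}) \big] \tag{4.43} \]
We add a scaling factor $s$ as shown:
\[ L_{e2}(\hat{u}) = \Big\{ L_2(\hat{u}) - \big[ L_c\, y + L_{e1}(\hat{u}) \big] \Big\} \cdot s \tag{4.44} \]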
Figure 4.9 shows the performance of the best evaluated scaling factor compared to the standard algorithm ($s = 1$) for block length 5114 and AWGN. For a bit error rate of $10^{-6}$, the improvement of the MLMAP is 0.3 dB and the difference between MLMAP and MAP is now only 0.1 dB. It is assumed that the scaling factor reduces the correlation between extrinsic and systematic symbols that comes from the approximation of equation (4.37).
Figure 4.9: Turbo code with different scaling factors and block length 5114 bit, 8 iterations, AWGN
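As a small illustration of equation (4.44), the scaled extrinsic values can be computed per bit as below. The function name and the list-based interface are assumptions made for this sketch.

```python
def scaled_extrinsic(llr, lc_y, l_apriori, s=0.75):
    """Equation (4.44): subtract channel and a priori terms, then scale by s."""
    return [(l - (c + a)) * s for l, c, a in zip(llr, lc_y, l_apriori)]
```

With `s = 1` the function reduces to the unscaled form of equation (4.43).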
4.5.2 The Sliding Window Soft Input Soft Output Decoder
The SISO algorithm requires that the whole sequence has been received before starting the smoothing process. The reason is the backward recursion, which starts from the (supposedly known) final trellis state. As a consequence, its practical application is limited to the case where the duration of the transmission is short (n small).
A more flexible decoding strategy is offered by modifying the algorithm in such a way that the SISO module operates on a fixed memory span and outputs the smoothed probability distributions after a given delay, D.
We propose three versions of the Sliding Window SISO that differ in the way they overcome the problem of initializing the backward recursion without waiting for the entire sequence.
Use $\alpha_k^{MLM}$. We compute the forward recursion using equation (4.40). At time $k > D$ we initialize $\beta_k^{MLM}$ as follows:
\[ \beta_k^{MLM} = \alpha_k^{MLM} \tag{4.45} \]
Use equiprobable $\beta_k^{MLM}$. We compute the forward recursion using equation (4.40). At time $k > D$ we initialize $\beta_k^{MLM}$ as follows:
\[ \beta_k^{MLM} = \frac{1}{N} \tag{4.46} \]
where $N$ is the number of states.
Use 2 Backward Recursion Units. This solution is based on three recursion units (RUs): two used for the backward recursion ($RU_{B1}$ and $RU_{B2}$) and one forward unit ($RU_A$). Each RU contains operators working in parallel so that one recursion can be performed in one clock cycle.
The horizontal axis in figure 4.10 represents time, in units of a symbol period. The vertical axis represents the received symbol. Thus, the curve ($x = y$) shows that, at time $t = k$, the symbol $y_k$ becomes available. Let us describe how the $L$ symbols $y_k$, $L \le k < 2L$, are decoded (segment I of Fig. 4.10). From $t = 3L$ to $4L - 1$, $RU_{B1}$ performs $L$ recursions, starting from $y_{3L-1}$ down to $y_{2L}$ (segment II of Fig. 4.10). This process is initialized with the all-zero state vector, but after $L$ iterations convergence is reached and $B_{2L}$ is obtained. During those same $L$ cycles, $RU_A$ generates the vectors $A_k$, $L \le k < 2L$ (segment III of Fig. 4.10). The $A_k$ vectors are stored in the state vector memory (SVM) until they are needed for the LLR computation (grey area of Fig. 4.10). Then, between $t = 4L$ and $5L - 1$, $RU_{B1}$ starts from state $B_{2L}$ to compute $B_{2L-1}$ down to $B_L$ (segment IV of Fig. 4.10). At each cycle, the vector $A_k$ corresponding to the computed $B_k$ is extracted from the memory in order to compute $L(\hat{u}_k)$. Finally, between $t = 5L$ and $6L - 1$, the data are reordered (segment V of Fig. 4.10) using a memory for reversing the LLR (light grey area of Fig. 4.10). The same process is then reiterated every $L$ cycles, as shown in Fig. 4.10.
Figure 4.10: Graphical representation of a real-time MAP architecture
4.5.3 Stopping Criteria for Turbo Decoding
Iterative decoding is a key feature of turbo codes. Each decoding iteration results in additional computations and decoding delay. As the decoding approaches the performance limit of a given turbo code, any further iteration results in very little improvement. Often, a fixed number M is chosen and each frame is decoded for M iterations. Usually M is set with the worst corrupted frames in mind, while most frames need fewer iterations to converge. Therefore, it is important to devise an efficient criterion to stop the iteration process and prevent unnecessary computations and decoding delay.
HDA. Although iterative decoding improves the LLR value for each information bit through iterations, the hard decision on the information bit is ultimately made based on the sign of its LLR value. The hard decisions on the information sequence at the end of each iteration provide information on the convergence of the iterative decoding process.
At iteration $(i - 1)$, we store the hard decisions on the information bits based on $L_2^{(i-1)}(\hat{u})$ and check them against the hard decisions based on $L_2^{(i)}(\hat{u})$ at iteration $i$. If they agree with each other for the entire block, we simply terminate the iterative process at iteration $i$. This stopping criterion is called the hard-decision-aided (HDA) criterion.
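The HDA check described above can be sketched in a few lines of Python. The function names are illustrative, and mapping an LLR of exactly zero to +1 is an arbitrary tie-breaking choice made for this sketch.

```python
def hard_decisions(llrs):
    """Sign of each LLR as a +1/-1 decision (zero is mapped to +1 here)."""
    return [1 if l >= 0 else -1 for l in llrs]

def hda_stop(prev_llrs, curr_llrs):
    """True when the decisions of iterations i-1 and i agree for the whole block."""
    return hard_decisions(prev_llrs) == hard_decisions(curr_llrs)
```

Because the test compares two consecutive iterations, it cannot fire before the second iteration.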
IHDA. Although iterative decoding improves the LLR value $L(\hat{u}_k)$ for each information bit through iterations, the hard decision on the information bit is ultimately made based on the sign of its LLR value. From repeated simulations, it was observed that, as the number of iterations increases, the magnitudes of the LLRs of a good (easy to decode) frame gradually become larger. Since the term $L_c y$ is fixed for every iteration, the increase in the magnitudes of the LLRs is due to increases in the magnitudes of the extrinsic information. Since the extrinsic information keeps increasing as the iteration number $i$ increases, it is conceivable that, as the decoding converges to the final stage, the hard decision based on $L_c y + L_{e1}^{(i)}(\hat{u})$ from the first component decoder should agree with the hard decision based on the LLR at the output of the second component decoder, according to the following equation:
\[ L_2(\hat{u}) = L_c\, y + L_{e1}(\hat{u}) + L_{e2}(\hat{u}) \tag{4.47} \]
At iteration $i$, compare the hard decisions on the information bits based on $L_c y + L_{e1}^{(i)}(\hat{u})$ with the hard decisions based on $L_2^{(i)}(\hat{u})$. If they agree with each other for the entire block, terminate the iterative process at iteration $i$.
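A minimal Python sketch of the IHDA comparison follows. The argument names are assumptions for illustration: `lc_y` holds the channel values, `le1` the extrinsic output of the first component decoder and `l2` the LLR output of the second component decoder, all for the same iteration.

```python
def ihda_stop(lc_y, le1, l2):
    """True when sign(Lc*y + Le1) matches sign(L2) for every bit in the block."""
    for c, e, l in zip(lc_y, le1, l2):
        if (c + e >= 0) != (l >= 0):
            return False
    return True
```

Unlike HDA, this check uses only values from the current iteration, so no decisions need to be stored between iterations.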
4.5.4 Modulo Normalization
In a SISO decoder, both $A_k(s)$ and $B_k(s)$ grow in magnitude as the recursions proceed. Without normalization, overflow may occur when the data width is finite. To avoid overflow, $A_k(s)$ may be normalized by subtracting a constant from all the metrics at a given time, and the same is true for $B_k(s)$. This is possible because the soft output depends only on the differences between path metrics, not on their magnitudes. Usually, such subtractive normalization is done according to
\[ \bar{A}_k(s) = A_k(s) - \max_{s'}\big( A_k(s') \big), \quad \forall s \tag{4.48} \]
\[ \bar{B}_k(s) = B_k(s) - \max_{s'}\big( B_k(s') \big), \quad \forall s \tag{4.49} \]
where $\bar{A}_k$ and $\bar{B}_k$ are path metrics normalized by subtraction. This technique requires extra computations to find the maxima and perform the subtractions, and it increases latency.
Figure 4.11: Average number of iterations for various stopping schemes
Modulo normalization can be implemented inherently by employing two's complement arithmetic. There are two conditions for using it: 1) the difference between path metrics is bounded; 2) path selection depends only on path metric differences. Both conditions are shown to hold in [10].
The idea behind modulo normalisation is for a metric $m_i$ to be replaced by a normalised metric $\tilde{m}_i$:
\[ \tilde{m}_i = (m_i + C/2) \bmod C - C/2 \tag{4.50} \]
This normalisation can be represented graphically as wrapping the metric $m_i$ around a circle whose circumference equals $C$, starting from the 0 angle point and moving in the counter-clockwise direction. Also, it can be seen that the range of the normalised metric is now $-C/2 \le \tilde{m}_i < C/2$. Using this method, the comparison between two metrics is equivalent to comparing the angle between them (moving in the CCW direction) to $\pi$. An example of this is shown in Fig. 4.12, where $m_1 < m_2$ if and only if that angle is less than $\pi$. In order for this method to work correctly, the difference between the two metrics being compared has to be smaller than $C/2$, i.e. $|m_1 - m_2| < C/2$.
It is possible to show that the comparison of two normalised metrics $c(\tilde{m}_1, \tilde{m}_2)$ is equivalent to:
\[ c(\tilde{m}_1, \tilde{m}_2) = m_1^{w-1} \oplus m_2^{w-1} \oplus c_u(\hat{m}_1, \hat{m}_2) \tag{4.51} \]
where $c_u(\hat{m}_1, \hat{m}_2)$ represents an unsigned comparison of the metrics $\hat{m}_1$ and $\hat{m}_2$, where
\[ \hat{m}_i = \tilde{m}_i \bmod C/2 \tag{4.52} \]
(the magnitude of $\tilde{m}_i$), as shown in figure 4.13.
Figure 4.12: Graphical example of modulo normalisation.
Figure 4.13: Hardware realisation of modulo normalisation.
4.6 LTE Standard
4.6.1 Turbo Encoder
The coding rate of the turbo encoder is 1/3. The structure of the turbo encoder is illustrated in figure 4.14. The transfer function of the 8-state constituent code is:
\[ G(D) = \left[ 1, \; \frac{g_1(D)}{g_0(D)} \right] \tag{4.53} \]
Figure 4.14: Structure of rate 1/3 turbo encoder (dotted lines apply for trellis termination only)
where
\[ g_0(D) = 1 + D^2 + D^3 \tag{4.54} \]
\[ g_1(D) = 1 + D + D^3 \tag{4.55} \]
The output from the turbo encoder is
\[ d_k^{(0)} = x_k \tag{4.56} \]
\[ d_k^{(1)} = z_k \tag{4.57} \]
\[ d_k^{(2)} = z'_k \tag{4.58} \]
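One constituent encoder defined by equations (4.53) to (4.55) can be sketched in Python as below. The function name and the register variables `s1`, `s2`, `s3` (the contents of the D, D^2 and D^3 delay elements) are illustrative; trellis termination is not included in this sketch.

```python
def rsc_parity(bits):
    """Parity stream of the recursive systematic code with g1(D)/g0(D)."""
    s1 = s2 = s3 = 0  # shift register starts in the all-zero state
    parity = []
    for u in bits:
        a = u ^ s2 ^ s3             # feedback taps: g0(D) = 1 + D^2 + D^3
        parity.append(a ^ s1 ^ s3)  # feedforward taps: g1(D) = 1 + D + D^3
        s1, s2, s3 = a, s1, s2      # shift the register by one position
    return parity
```

The systematic output is simply the input itself, so a rate 1/3 turbo encoder would run this routine once on the input bits and once on the interleaved bits.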
4.6.2 Trellis termination for turbo encoder
Trellis termination is performed by taking the tail bits from the shift register feedback after all information bits are encoded. Tail bits are padded after the encoding of information bits. The first three tail bits shall be used to terminate the first constituent encoder (upper switch of figure 4.14 in lower position) while the second constituent encoder is disabled. The last three tail bits shall be used to terminate the second constituent encoder (lower switch of figure 4.14 in lower position) while the first constituent encoder is disabled. The transmitted bits for trellis termination shall then be:
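\[ d_k^{(0)} = x_k, \quad d_{k+1}^{(0)} = z_{k+1}, \quad d_{k+2}^{(0)} = x'_k, \quad d_{k+3}^{(0)} = z'_{k+1} \tag{4.59} \]
\[ d_k^{(1)} = z_k, \quad d_{k+1}^{(1)} = x_{k+2}, \quad d_{k+2}^{(1)} = z'_k, \quad d_{k+3}^{(1)} = x'_{k+2} \tag{4.60} \]
\[ d_k^{(2)} = x_{k+1}, \quad d_{k+1}^{(2)} = z_{k+2}, \quad d_{k+2}^{(2)} = x'_{k+1}, \quad d_{k+3}^{(2)} = z'_{k+2} \tag{4.61} \]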
4.6.3 Interleaver
The bits input to the turbo code internal interleaver are denoted by $c_0, c_1, \ldots, c_{k-1}$, where $k$ is the number of input bits. The bits output from the turbo code internal interleaver are denoted by $c'_0, c'_1, \ldots, c'_{k-1}$. The relationship between the input and output bits is as follows:
\[ c'_i = c_{\Pi(i)}, \quad i = 0, 1, \ldots, (k - 1) \tag{4.62} \]
where the relationship between the output index $i$ and the input index $\Pi(i)$ satisfies the following quadratic form:
\[ \Pi(i) = (f_1 \cdot i + f_2 \cdot i^2) \bmod k \tag{4.63} \]
The parameters $f_1$ and $f_2$ depend on the block size $k$ and are summarized in [1].
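Equations (4.62) and (4.63) translate directly into code. The sketch below is illustrative; the function names are not from the original design, and the parameter pair f1 = 3, f2 = 10 used in the usage note is the value assumed here for a block size of 40 and should be checked against the table in [1].

```python
def qpp_permutation(k, f1, f2):
    """Index map of equation (4.63): (f1*i + f2*i^2) mod k for i = 0..k-1."""
    return [(f1 * i + f2 * i * i) % k for i in range(k)]

def qpp_interleave(bits, f1, f2):
    """Equation (4.62): output bit i is input bit number perm[i]."""
    perm = qpp_permutation(len(bits), f1, f2)
    return [bits[p] for p in perm]
```

For example, `qpp_permutation(40, 3, 10)` starts with the indices 0, 13, 6, 19 and visits every index from 0 to 39 exactly once.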
4.7 Implementation of Turbo Encoder
4.7.1 Encoder
The function of the Encoder
It is used to produce the encoded bits with rate 1/3.
Turbo Encoder block diagram
The input ports of the ENCODER
1. c: the 40 input data bits (code block length).
2. clk: the system clock, used to synchronize the system.
3. reset: used to reset the whole system and the block.
The output ports of the ENCODER
1. $d_k^{(0)}$: the systematic output of the Turbo Encoder.
2. $d_k^{(1)}$: the parity one output of the Turbo Encoder.
3. $d_k^{(2)}$: the parity two output of the Turbo Encoder.
4. enable: indicates that the output is ready at the output ports.
4.7.2 The Turbo Encoder main blocks
Turbo Encoder blocks diagram
We note that the Turbo Encoder contains seven blocks built from five main block types:
1. PISO (parallel input, serial output).
2. The Interleaver.
3. The Convolutional code (the core of the Turbo Encoder).
4. SIPO (serial input, parallel output).
5. Trellis.
4.7.3 PISO
The function of the PISO
It is used to convert the parallel bits to serial bits.
PISO block diagram
The input ports of the PISO
1. d: the 40 input data bits (code block length).
2. clk: the system clock, used to synchronize the system.
3. reset: used to reset the whole system and the block.
4. f: the feedback data that comes from the convolutional block during the switching period.
The output ports of the PISO
1. q: the serial output bits from the PISO block.
2. $x_k$: the 43 bits containing the systematic bits and 3 bits from the convolutional code feedback.
3. load: a signal indicating that the output bits are available at the output port.
4. rc: one output pulse lasting one clock cycle only.
4.7.4 Interleaver
The function of the Interleaver
It is used to reorder the input data into a pseudo-random sequence.
Interleaver block diagram
The input ports of the Interleaver
1. D: the 40 input data bits (code block length).
2. clk: the system clock, used to synchronize the system.
3. reset: used to reset the whole system and the block.
4. f: the feedback data that comes from the convolutional code feedback during the switching period.
The output ports of the Interleaver
1. Q: the serial output bits from the Interleaver block.
2. $x_{dk}$: the 43-bit block containing the interleaved bits and 3 bits from the convolutional code.
3. load: a signal indicating that the output bits are available at the output port.
4. rc: one output pulse lasting one clock cycle only.
4.7.5 Convolutional code
The function of the Convolutional code
It is the core of the Turbo Encoder.
Convolutional block diagram
The input ports of the Convolutional
1. d: the input port for data bits.
2. clk: the system clock, used to synchronize the system.
3. reset: used to reset the block and fill the three registers with zeros.
4. en: used to enable the block.
The output ports of the Convolutional
1. q: the output encoded bits.
2. sw: the feedback signal to the PISO and Interleaver blocks.
3. rd: a signal indicating that the output bits are available at the output port.
4.7.6 SIPO
The function of the SIPO
It accepts serial bits and outputs a block of parallel bits.
SIPO block diagram
The input ports of the SIPO
1. d: the input serial bits, which come from the Convolutional block.
2. clk: the system clock, used to synchronize the system.
3. reset: used to reset the block.
The output ports of the SIPO
1. q: one output block containing 43 bits.
2. Load: a signal indicating that the output bits are available at the output port.
4.7.7 TRELLIS
The function of the TRELLIS
The function of the trellis block is to form the trellis termination.
TRELLIS block diagram
The input ports of the TRELLIS
1. $x_k$: the stream of 43 bits that comes from the PISO.
2. $x_{dk}$: the stream of 43 bits that comes from the Interleaver.
3. $z_k$: the stream of 43 bits that comes from the SIPO and represents the encoded systematic bits.
4. $z_{dk}$: the stream of 43 bits that comes from the SIPO and represents the encoded interleaved bits.
5. clk: the system clock, used to synchronize the system.
6. reset: used to reset the block.
The output ports of the TRELLIS
1. $d_k^{(0)}$: the systematic output of the Turbo Encoder.
2. $d_k^{(1)}$: the parity one output of the Turbo Encoder.
3. $d_k^{(2)}$: the parity two output of the Turbo Encoder.
4. load: a signal indicating that the output bits are available at the output port.
4.8 Simulations of Turbo Encoder
4.8.1 By using Modelsim and Matlab
We run the simulation in Modelsim and check the results using Matlab.
Let the 40 input bits of the Turbo Encoder be c = 0011000111011000101010111110001010100010.
Output by Matlab
$d_k^{(0)}$ = 00110001110110001010101111100010101000100010
$d_k^{(1)}$ = 00100011100110100010011000100011000001100010
$d_k^{(2)}$ = 00001011101001100100011110100011110011000000
Output by Modelsim (waveform from 0 ps to 600000 ps; each output is undefined, U, until the result is ready)
/encodertest/c      0011000111011000101010111110001010100010
/encodertest/dk0    00110001110110001010101111100010101000100010
/encodertest/dk1    00100011100110100010011000100011000001100010
/encodertest/dk2    00001011101001100100011110100011110011000000
Other signals shown: /encodertest/clk, /encodertest/reset, /encodertest/enable
Output simulation of the Turbo Encoder by using Modelsim
We note that the outputs from Modelsim and Matlab are identical.
4.9 Workflow for Turbo Decoder
The workflow used consists of two main steps: design and implementation. See figure 4.15.
4.9.1 Design
The LTE standard has very demanding technical requirements when it comes to frequency and round trip time. The turbo decoder is by nature a computationally intensive unit. A lot of research has been published on optimizing the turbo decoder by reducing complexity, power consumption and latency. The aim of this phase is to design a turbo decoder that is simple and efficient. It has to be suitable for implementation on FPGA.
The design process starts with exploring the published research to find techniques to optimize the decoder. These techniques are simulated and compared using Matlab. The final decision is made based on the results obtained. See figure 4.16.
Figure 4.15: The workflow used (Design, then Implementation)
Floating point arithmetic is complex and not suitable for FPGA implementation. Integer arithmetic would cause a huge performance degradation. Thus, fixed point arithmetic is the most suitable. The floating point design previously developed is quantized to obtain a fixed point design. This design is later used as a reference for the VHDL implementation. See figure 4.17.
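The quantization step can be illustrated with a short Python sketch. It is only an assumption of how a floating point value might be mapped to a signed fixed point word: the function name, the default word length of 11 bits and the 2 fractional bits are hypothetical choices for illustration, not parameters confirmed by the design.

```python
def quantize(x, word_length=11, frac_bits=2):
    """Scale x by 2^frac_bits, round to an integer, and saturate to the word."""
    scaled = round(x * (1 << frac_bits))
    lowest = -(1 << (word_length - 1))       # most negative two's complement value
    highest = (1 << (word_length - 1)) - 1   # most positive two's complement value
    return max(lowest, min(highest, scaled))
```

Saturation rather than wrap-around is used here so that very large LLR values keep their sign after quantization.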
4.9.2 Implementation
The bottom-up design method was used for implementing the decoder. The smaller blocks were developed first, then grouped and wired to form the top level design. The fixed point design was used as a reference. Each block was tested individually and the whole system was verified. The workflow is shown in figure 4.18.
Figure 4.16: Steps of floating point design (Research, Simulate, Decide)
Figure 4.17: Fixed point design is obtained by quantizing the floating point design (Floating Point Design, Quantization, Fixed Point Design)
Figure 4.18: Steps of implementation (Fixed Point Design, RTL Design, RTL Verification, Synthesis and optimization, RTL vs Netlist verification, FPGA implementation and testing)
4.10 Design Phase
4.10.1 Algorithm
Two algorithms for turbo decoding were tested: MAP and Max-Log-MAP.
Figure 4.19 shows the performance of the MAP algorithm for different numbers of iterations. Figure 4.20 shows a comparison between the MAP and Max-Log-MAP algorithms. The MAP algorithm uses logarithmic functions and multiplications, so it is not suitable for FPGA. The Max-Log-MAP algorithm, on the other hand, uses only addition and the max function. So we use the Max-Log-MAP algorithm.
Figure 4.19: BER curves for turbo codes using MAP at different iterations (x-axis: Es/No (dB); y-axis: bit error rate; curves: uncoded bits, iter 1, iter 2, iter 3, iter 6, iter 18)
4.10.2 Extrinsic Information Scaling
The extrinsic information scaling was tested for factors of 1, 0.75 and 0.7. The results are shown in figure 4.21. The 0.7 scale shows slightly better performance than 0.75, but 0.75 is simpler to implement on FPGA. So we choose 0.75.
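One common reason a factor of 0.75 is cheap in hardware is that it can be formed from two shifts and one addition, since 0.75 = 1/2 + 1/4. The sketch below illustrates this on Python integers; it is an assumption about how such a scaling might be realised, not a description of the implemented circuit.

```python
def scale_three_quarters(x):
    """Approximate 0.75 * x with shifts only: x/2 + x/4 (each shift floors)."""
    return (x >> 1) + (x >> 2)
```

The result is exact when x is a multiple of 4; otherwise each shift rounds towards minus infinity, so the result may be slightly below 0.75 * x.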
4.10.3 Sliding window
Three methods for the sliding window were investigated: reusing A, assuming equiprobable states, and using 2 B units. See figure 4.22. The method using 2 B units shows no performance degradation compared to the normal decoder, as shown in figure 4.23. So it is our choice for the sliding window.
4.10.4 Stopping Criteria
As seen in figure 4.24, the HDA exhibits the best performance. So it is chosen, even though it requires a minimum of 2 iterations.
Figure 4.20: Comparison between Max-Log-MAP and MAP BER curves (interleaver size = 1088, number of iterations = 3; curves: Mine Max, Mine Map)
Figure 4.21: Comparison between different scaling factors (interleaver size = 1088, number of iterations = 3; curves: scale=1, scale=0.75, scale=0.7)
4.10.5 Internal word length
Figure 4.25 shows the effect of changing the word length for the internal calculations of the interleaver on the BER. As seen in the figure, the BER stops decreasing from a word length of 11 upwards. So we choose a word length of 11. Compared to floating point in figure 4.26, there is approximately no increase in BER.
Figure 4.22: Comparison between different sliding window techniques (interleaver size = 1088, number of iterations = 3; x-axis: Eb/No; y-axis: BER; curves: a reuse, Equiprobable, Dummy b)
Figure 4.23: Comparison between two B units and no sliding window (interleaver size = 1088, number of iterations = 3; x-axis: Eb/No; y-axis: BER; curves: normal, SW dummy B)
Figure 4.24: Comparison between different early stopping criteria (x-axis: Eb/No (dB); y-axis: number of iterations; curves: HDA, IHDA, GENIE)
Figure 4.25: Relation between BER and internal size of turbo decoder at SNR -9.16 dB and 2 iterations (x-axis: word length; y-axis: BER)
Figure 4.26: Comparison between floating point and fixed point turbo decoder with internal width of 11 (interleaver size = 1088, number of iterations = 2; x-axis: Es/No (dB); y-axis: bit error rate; curves: Fixed, Floating)
4.11 Implementation of Map Decoder
4.11.1 Architecture
Figure 4.27 shows the top level architecture of the MAP decoder.
Figure 4.27: High-level VLSI architecture of the implemented Max-Log-MAP decoder (thin boxes indicate registers). Blocks shown: BMU_column, gamma Ram, ACS_elem a_column, aRam, aExt, ACS_elem b_column, bExt, calcLe, ysRam, leRam; signals: yp, ys, LeIn, LeOut, decision.
4.11.2 Timing
First, gamma is calculated. After the first value of gamma is calculated, the corresponding alpha is calculated. At the last value of gamma, the beta calculation starts, followed directly by the extrinsic value calculations. The timing diagram for the MAP decoder is shown in figure 4.29.
4.12 Implementation of Turbo Decoder
4.12.1 Architecture
Figure 4.28 shows the top level architecture of the turbo decoder.
Figure 4.28: High-level VLSI architecture of the implemented turbo decoder. Blocks shown: Interface, ysRam, y1pR, y2pR, LeR, mapDec, Inter, Deinter, interYs, decisionDeint; signals: din, op, Le.
4.12.2 Timing
First, the inputs are read and stored in ysRam, y1pRam and y2pRam. The trellis termination bits are read into ttRam. In the following cycles, the values stored in ttRam are written into the proper RAM after reordering. During this time, the Le input is equal to zero. ysRam is interfaced to enable reading y1s and y2s. During the initial write, data are read into the MAP decoder unit, and the clock is disabled until trellis termination is finished; the MAP operation then continues until it is finished. Extrinsic values output from mapDec are written to LeRam and are read interleaved for the second stage. The timing diagram for the turbo decoder is shown in figure 4.30.
80
I
D
T
a
s
k
N
a
m
e
1
r
e
a
d
a
n
d
w
r
i
t
e
i
n
p
u
t
s
(
y
s
,
y
p
,
L
e
)
2
b
r
a
n
c
h
m
e
t
r
i
c
s
c
a
l
c
u
a
l
o
n
(
g
a
m
m
a
)
3
f
o
r
w
a
r
d
m
e
t
r
i
c
s
c
a
l
c
u
l
a
o
n
(
a
l
p
h
a
)
4
r
e
a
d
b
r
a
n
c
h
m
e
t
r
i
c
s
5
r
e
a
d
f
o
r
w
a
r
d
m
e
t
r
i
c
s
6
c
a
l
c
u
l
a
t
e
B
a
c
k
w
a
r
d
m
e
t
r
i
c
s
7
c
a
l
c
u
l
a
t
e
E
x
t
r
i
n
s
i
c
V
a
l
u
e
s
-
6
1
7
1
3
1
9
2
5
3
1
3
7
4
3
4
9
5
5
6
1
6
7
7
3
7
9
8
5
F
i
g
u
r
e
4
.
2
9
:
T
h
e
t
i
m
i
n
g
d
i
a
g
r
a
m
o
f
t
h
e
i
m
p
l
e
m
e
n
t
e
d
m
a
p
d
e
c
o
d
e
r
I
D
T
a
s
k
N
a
m
e
1
r
e
a
d
a
n
d
w
r
i
t
e
i
n
p
u
t
s
(
y
s
,
y
1
p
,
L
e
)
2
s
t
a
r
t
m
a
p
d
e
c
o
d
e
r
s
t
a
g
e
1
3
w
r
i
t
e
d
a
t
a
i
n
t
o
t
r
e
l
l
i
s
t
e
r
m
i
n
a
o
n
r
a
m
4
w
r
i
t
e
d
a
t
a
i
n
t
o
p
r
o
p
e
r
r
a
m
a
n
d
l
o
c
a
o
n
5
n
i
s
h
m
a
p
s
t
a
g
e
1
6
w
r
i
t
e
L
e
7
r
e
a
d
L
e
a
n
d
y
s
i
n
t
e
r
l
e
a
v
e
d
a
n
d
y
2
p
8
s
t
a
r
t
m
a
p
d
e
c
o
d
e
r
s
t
a
g
e
2
-
6
7
1
9
3
1
4
3
5
5
6
7
7
9
9
1
1
0
3
1
1
5
1
2
7
1
3
9
1
5
1
1
6
3
1
7
5
1
8
7
F
i
g
u
r
e
4
.
3
0
:
T
h
e
t
i
m
i
n
g
d
i
a
g
r
a
m
o
f
t
h
e
i
m
p
l
e
m
e
n
t
e
d
m
a
p
d
e
c
o
d
e
r
81
4.12.3 Power
The detailed power estimation is shown in table 4.1 and the summary in table 4.2. As seen from the tables, the leakage power constitutes the majority of the estimated power consumption.
On-Chip     Power (W)
Clocks      0.092
Logic       0
Signals     0.001
BRAMs       0.031
IOs         0
Leakage     1.191
Total       1.315
Table 4.1: Detailed power consumption
Type        Power (W)
Quiescent   1.191
Dynamic     0.124
Total       1.315
Table 4.2: Summary of power consumption
4.12.4 Resource utilization
Table 4.3 shows the Virtex 5 resources consumed by the design. Notice that these resources are just a small fraction of the resources available. Figure 4.31 shows the design after place and route.
4.12.5 Throughput
Table 4.4 shows the throughput of the implemented decoder.
4.12.6 BER
Figure 4.32 shows the BER performance of the decoder. Unfortunately, only one iteration has been implemented.
Resource            Usage
LUT/FF Pairs        2,447
Slice LUTs          2,171
Slice Registers     1,178
Block RAMs (36k)    2
Block RAMs (18k)    8
Max Clock Freq      201.295 MHz
Table 4.3: Resources utilization
Number of Cycles    210
Throughput          38.09 MHz
Table 4.4: Throughput of the implemented design
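The source does not state how the throughput figure was derived. One reading that reproduces the number, offered purely as an assumption, is a 40-bit code block decoded in 210 cycles at a 200 MHz clock, which would give a rate in Mbit/s. The helper below is hypothetical and only shows that arithmetic.

```python
def throughput_mbps(block_bits, cycles, clock_mhz):
    """Bits decoded per second, in Mbit/s, for one block every `cycles` cycles."""
    return block_bits * clock_mhz / cycles
```

Under these assumed inputs, `throughput_mbps(40, 210, 200.0)` evaluates to roughly 38.1, close to the value in table 4.4.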
Figure 4.31: The placed and routed design on FPGA
Figure 4.32: BER curves for the implemented decoder (x-axis: Eb/No (dB); y-axis: bit error rate; curves: iter 1, iter 2, iter 3, iter 6, iter 18)
Bibliography
[1] 3GPP. Evolved Universal Terrestrial Radio Access (E-UTRA); Multiplexing and channel coding. TS 36.212, 3rd Generation Partnership Project (3GPP), January 2010.
[2] Christoph Studer, Schekeb Fateh, Christian Benkeser, and Qiuting Huang. Implementation trade-offs of soft-input soft-output MAP decoders for convolutional codes. 2007.
[3] Jelena Dragas. Design trade-offs in the VLSI implementation of high-speed Viterbi decoders and their application to MLSE in ISI cancellation. Master's thesis, Institut für Integrierte Systeme (Integrated Systems Laboratory), March 2011.
[4] Emmanuel Boutillon, Warren J. Gross, and P. Glenn Gulak. VLSI architectures for the MAP algorithm. IEEE Transactions on Communications, 51(2), 2003.
[5] M. R. Soleymani, Yingzi Gao, and U. Vilaipornsawai. Turbo Coding for Satellite and Wireless Communications. The Kluwer International Series in Engineering and Computer Science. Kluwer Academic Publishers, 2002.
[6] T. M. N. Ngatched and F. Takawira. Simple stopping criterion for turbo decoding. Electronics Letters, 37(22), 2001.
[7] Rose Y. Shao, Shu Lin, and Marc P. C. Fossorier. Two simple stopping criteria for turbo decoding. IEEE Transactions on Communications, 47(8), 1999.
[8] S. Benedetto, D. Divsalar, G. Montorsi, and F. Pollara. A soft-input soft-output maximum a posteriori (MAP) module to decode parallel and serial concatenated codes. Technical report, TDA Progress Report, 1996.
[9] J. Vogt and A. Finger. Improving the max-log-MAP turbo decoder. Electronics Letters, 36(23), November 2000.
[10] Yufei Wu, Brian D. Woerner, and T. Keith Blankenship. Data width requirements in SISO decoding with modulo normalization. IEEE Transactions on Communications, 49(11), November 2001.
Chapter 5
RATE MATCHING
The Rate-Matching (RM) algorithm selects bits for transmission from the rate 1/3 turbo coder output via puncturing and/or repetition. Since the number of bits for transmission is determined by the available physical resources, the RM should be capable of generating puncturing patterns for arbitrary rates. Furthermore, the RM should send as many new bits as possible in retransmissions to maximize the Incremental Redundancy (IR) HARQ gains. The main contenders for LTE RM were to use the same (or a similar) algorithm as HSPA, or to use Circular Buffer (CB) RM as in CDMA2000 1xEV and WiMAX, as shown in figure 5.1.
Figure 5.1: Circular-buffer rate matching for turbo
5.1 Subblock interleaving
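The bits input to the block interleaver are denoted by $d^{(i)}_0, d^{(i)}_1, \ldots, d^{(i)}_{D-1}$, where $D = K + 4$ is the number of bits in each of the systematic, parity 1 and parity 2 streams ($i = 0, 1, 2$). Note that $K$ is the number of bits within a codeblock with bits $x_k$, $k = 0, 1, 2, \ldots, K-1$, and trellis termination adds four bits to each of the systematic, parity 1 and parity 2 streams.
The sub-block interleaving is achieved by writing row-wise into a rectangular matrix, applying a permutation to the matrix columns and finally reading from the matrix column-wise. The number of columns in the matrix is fixed to 32, that is
\[ C_{subblock} = 32 \]
The number of rows of the matrix, $R_{subblock}$, is the smallest integer such that
\[ D \le R_{subblock} \times C_{subblock} \]
Then the matrix holds $K_\Pi = R_{subblock} \times C_{subblock}$ bits. When the number of bits $D$ does not completely fill the rectangular matrix, dummy bits are padded to fully fill the matrix as below:
\[ N_D = R_{subblock} \times C_{subblock} - D \]
Note that the maximum number of dummy bits is limited to $C_{subblock} - 1 = 31$, and these bits are added to the beginning of the stream. Also, note that when $D$ is an integer multiple of $C_{subblock}$, no dummy bits need to be added, as the $D$ bits fully fill the matrix in this case. The input bit sequence $y_k$ (the $N_D$ dummy bits followed by the $D$ input bits) is then written into the $R_{subblock} \times C_{subblock}$ rectangular matrix row by row, starting with bit $y_0$ in column 0 of row 0, as below (with $R = R_{subblock}$ and $C = C_{subblock}$):
\[ \begin{bmatrix} y_0 & y_1 & \cdots & y_{C-1} \\ y_C & y_{C+1} & \cdots & y_{2C-1} \\ \vdots & \vdots & \ddots & \vdots \\ y_{(R-1)C} & y_{(R-1)C+1} & \cdots & y_{RC-1} \end{bmatrix} \]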
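5.2 Permutation
The turbo code tail bits are uniformly distributed over the three streams, so that all streams have the same size. Each sub-block interleaver is based on the traditional row-column interleaver with 32 columns (for all block sizes) and a simple length-32 column permutation.
The length-32 column permutation is applied and the bits are read out column by column to form the output of the sub-block interleaver. For the systematic and parity 1 streams the column permutation pattern is
[0, 16, 8, 24, 4, 20, 12, 28, 2, 18, 10, 26, 6, 22, 14, 30, 1, 17, 9, 25, 5, 21, 13, 29, 3, 19, 11, 27, 7, 23, 15, 31]
For the parity 2 stream, the output of the sub-block interleaver is $v^{(2)}_k = y_{\pi(k)}$, with the permutation given by
\[ \pi(k) = \left( P\left( \lfloor k / R_{subblock} \rfloor \right) + C_{subblock} \cdot (k \bmod R_{subblock}) + 1 \right) \bmod K_\Pi \]
where $P(\cdot)$ is the column permutation pattern above and $K_\Pi = R_{subblock} \times C_{subblock}$.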
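The two sub-block interleavers can be sketched in a few lines of Python. This is a minimal illustration and not the VHDL implementation of this project: the function names are hypothetical, dummy bits are marked with the character 9 as in the Matlab results of Section 5.7, and the parity 2 read address is assumed to follow TS 36.212 [1], that is, the column pattern P plus an offset of one position modulo the matrix size.

```python
# Column permutation pattern (32 columns) of the sub-block interleaver.
P = [0, 16, 8, 24, 4, 20, 12, 28, 2, 18, 10, 26, 6, 22, 14, 30,
     1, 17, 9, 25, 5, 21, 13, 29, 3, 19, 11, 27, 7, 23, 15, 31]
C = 32        # number of matrix columns
NULL = "9"    # dummy-bit marker (assumption: same convention as the Matlab results)


def pad(bits):
    """Prepend dummy bits so the stream fills a rows x 32 matrix; return (y, rows)."""
    rows = -(-len(bits) // C)             # ceil(D / 32)
    n_dummy = rows * C - len(bits)
    return [NULL] * n_dummy + list(bits), rows


def interleave_sys_par1(bits):
    """Systematic / parity 1: write row-wise, permute columns, read column-wise."""
    y, rows = pad(bits)
    return "".join(y[r * C + P[j]] for j in range(C) for r in range(rows))


def interleave_par2(bits):
    """Parity 2: same matrix, but every read address is shifted by one position."""
    y, rows = pad(bits)
    k_pi = rows * C
    return "".join(y[(P[k // rows] + C * (k % rows) + 1) % k_pi]
                   for k in range(k_pi))


d0 = "00110001110110001010101111100010101000100010"   # systematic stream of Section 5.7.1
print(interleave_sys_par1(d0))
```

For the 44-bit streams used in the simulations of Section 5.7 the matrix has 2 rows and 20 dummy bits, and the sketch reproduces the Matlab vectors listed there for the systematic and parity 2 streams.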
This leads to the foremost advantage of the LTE CB approach: it enables efficient HARQ operation, because the CB operation can be performed without requiring an intermediate step of forming any actual physical buffer. In other words, for any combination of the 188 stream sizes and 4 RV values, the desired codeword bits can be obtained directly from the output of the turbo encoder using simple addressing based on the sub-block permutation. Therefore the term Virtual Circular Buffer (VCB) is more appropriate in LTE. The LTE VCB operation also allows Systematic Bit Puncturing (SBP) by defining RV = 0 to skip the first $2 R_{subblock}$ bits, leading to approximately six percent of the systematic bits being punctured (with no wrap-around).
5.3 Subblock interlacing
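The circular buffer length is $K_w = 3K_\Pi$, where $K_\Pi = R_{subblock} \times C_{subblock}$ is the length of each sub-block interleaver output. The buffer is filled as
\[ w_k = v^{(0)}_k, \quad k = 0, 1, 2, \ldots, K_\Pi - 1 \]
\[ w_{K_\Pi + 2k} = v^{(1)}_k, \quad k = 0, 1, 2, \ldots, K_\Pi - 1 \]
\[ w_{K_\Pi + 2k + 1} = v^{(2)}_k, \quad k = 0, 1, 2, \ldots, K_\Pi - 1 \]
where $v^{(0)}_k$, $v^{(1)}_k$ and $v^{(2)}_k$ are the outputs of the sub-block interleavers for the systematic, parity 1 and parity 2 streams.
It should be noted that the subblock interlacing is only performed between parity 1 and parity 2 bits, as shown in the figure. The systematic bits are not interlaced. The reason is that systematic bits are generally part of the first hybrid ARQ transmission. In response to a hybrid ARQ NACK, for example, subblock interlacing guarantees that an equal amount of parity 1 and parity 2 bits is transmitted.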
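The bit collection step can be illustrated as follows. This is only a sketch with a hypothetical function name; the three interleaver outputs are passed as strings in which 9 marks a dummy bit, as in the Matlab results of Section 5.7.

```python
def bit_collection(v0, v1, v2):
    """Build the circular buffer: the systematic part first, then parity 1
    and parity 2 interlaced symbol by symbol."""
    if not (len(v0) == len(v1) == len(v2)):
        raise ValueError("the three sub-block outputs must have equal length")
    w = list(v0)
    for p1, p2 in zip(v1, v2):
        w.extend((p1, p2))        # w[K + 2k] = v1[k], w[K + 2k + 1] = v2[k]
    return "".join(w)


# Interleaver outputs of Section 5.7.3 (9 marks a dummy bit).
v0 = "9190910091019110909191019111910190909000900091109090911090109010"
v1 = "9190900090009010919191119110910190909101900090009091901090009010"
v2 = "9190910191019000909191109000900091919110900091109001911090119009"
w = bit_collection(v0, v1, v2)
print(len(w))          # circular buffer length K_w = 3 * 64
print(w[64:80])        # first interlaced parity symbols
```

Because the parity streams alternate, any contiguous window read from the parity region contains parity 1 and parity 2 bits in (almost) equal numbers, which is the property described above.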
5.4 Hybrid ARQ soft buffer limitation
The soft buffer size for the $r$th code block, $N_{cb}$, is given as:
\[ N_{cb} = \begin{cases} \min\left( \left\lfloor \dfrac{N_{IR}}{C} \right\rfloor, K_w \right) & \text{downlink} \\ K_w & \text{uplink} \end{cases} \]
where $C$ is the number of codeblocks within the transport block and $K_w$ is the circular buffer size for the $r$th codeblock. $N_{IR}$ is the soft buffer size per codeword per hybrid ARQ process available at the UE and is given as:
\[ N_{IR} = \left\lfloor \frac{N_{soft}}{K_{MIMO} \cdot \min(M_{DL\_HARQ}, M_{limit})} \right\rfloor \]
where $N_{soft}$ is the total soft buffer size, which is set by higher layers. $K_{MIMO} = 1, 2$ for the case of single-codeword and dual-codeword MIMO spatial multiplexing respectively. $M_{DL\_HARQ} = 8$ is the maximum number of hybrid ARQ processes and $M_{limit} = 9$.
We note that the soft buffer limitation only applies to the downlink, due to soft buffering concerns for the UE receiver. In the uplink, there is no soft buffer limitation for the eNB and hence incremental redundancy can always be used. The soft buffer size is directly proportional to the supported data rate and is inversely proportional to the turbo coding rate. The idea with soft buffer limitation is that if a UE has a certain buffer size dimensioned for a given data rate and a given coding rate, then it can support either higher data rates with an increased coding rate (weaker code) or lower data rates with a stronger code.
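The two soft buffer formulas reduce to integer arithmetic, as the following sketch shows. The function names are hypothetical, the default values of the two HARQ constants are simply those quoted above, and the numbers in the usage lines are illustrative assumptions rather than results from this project.

```python
def soft_buffer_per_codeword(n_soft, k_mimo, m_dl_harq=8, m_limit=9):
    """N_IR: soft buffer available per codeword and per hybrid ARQ process."""
    return n_soft // (k_mimo * min(m_dl_harq, m_limit))


def soft_buffer_per_code_block(n_ir, num_code_blocks, k_w, downlink=True):
    """N_cb: limited by N_IR / C in the downlink, the full buffer in the uplink."""
    if not downlink:
        return k_w
    return min(n_ir // num_code_blocks, k_w)


# Illustrative values only: total soft buffer, dual-codeword MIMO, 13 code blocks,
# and a circular buffer of K_w = 3 * 6176 positions per code block.
n_ir = soft_buffer_per_codeword(n_soft=1237248, k_mimo=2)
n_cb = soft_buffer_per_code_block(n_ir, num_code_blocks=13, k_w=18528)
print(n_ir, n_cb)
```

In this example the downlink buffer per code block is smaller than the circular buffer, so only part of the circular buffer can be addressed by the redundancy versions.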
5.5 RV starting points
The transmission of bits from two codeblocks of the same transport block within a single resource element is avoided by first defining $G'$ as:
\[ G' = \frac{G}{N_L \cdot Q_m} \]
where $G$ is the total number of bits available for the transmission of one transport block and $Q_m = 2, 4, 6$ for QPSK, 16-QAM and 64-QAM respectively. $N_L = 1$ for transport blocks mapped onto one MIMO transmission layer and $N_L = 2$ for transport blocks mapped onto two or four MIMO transmission layers.
Let us now set:
\[ \gamma = G' \bmod C \]
The rate-matching output sequence length $E$ for the $r$th coded block is then given as:
\[ E = \begin{cases} N_L \cdot Q_m \cdot \lfloor G'/C \rfloor, & r \le C - \gamma - 1 \\ N_L \cdot Q_m \cdot \lceil G'/C \rceil, & \text{otherwise} \end{cases} \]
We note that some codeblocks may need to use one fewer resource element and some others one more resource element, to avoid mixing bits from two codeblocks of the same transport block in the same resource element. It should also be noted that the rate-matching output sequence length $E$ is determined independently of the codeblock size. We also note that the codeblocks with lower index $r \le C - \gamma - 1$ may use one fewer resource element than the codeblocks with higher index $r > C - \gamma - 1$.
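As a small worked example of the length computation, the sketch below returns the output length E of every code block. The function name and the numbers in the usage lines are assumptions chosen for illustration.

```python
def rate_matching_lengths(g, n_layers, q_m, num_code_blocks):
    """Return the rate-matching output length E of each code block r = 0 .. C-1."""
    g_prime = g // (n_layers * q_m)              # G' = G / (N_L * Q_m)
    gamma = g_prime % num_code_blocks            # number of blocks that get one extra symbol
    lengths = []
    for r in range(num_code_blocks):
        if r <= num_code_blocks - gamma - 1:
            symbols = g_prime // num_code_blocks         # floor(G' / C)
        else:
            symbols = -(-g_prime // num_code_blocks)     # ceil(G' / C)
        lengths.append(n_layers * q_m * symbols)
    return lengths


# 1000 bits on one layer with 16-QAM (Q_m = 4), split over 3 code blocks.
print(rate_matching_lengths(g=1000, n_layers=1, q_m=4, num_code_blocks=3))
# -> [332, 332, 336]
```

Every length is a multiple of $N_L \cdot Q_m$, so no resource element carries bits of two code blocks, and the lengths add up to $G$.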
The rate-matching output bit sequence is:
\[ e_k = w_{(k_0 + j) \bmod N_{cb}}, \quad k = 0, 1, 2, 3, \ldots, (E - 1), \quad j = 0, 1, 2, 3, \ldots, (K_w - 1) \]
Here $j$ advances through every position of the circular buffer, while $k$ advances only when a bit is actually output. Note that the bit positions with $w_{(k_0 + j) \bmod N_{cb}} = \text{NULL}$, which denote dummy bits in the circular buffer, a total of $3N_D = K_w - 3D$ positions, are ignored and not included in the transmission. The Redundancy Version (RV) starting point $k_0$ is given as:
\[ k_0 = R_{subblock} \cdot \left( 2 \cdot \left\lceil \frac{N_{cb}}{8 R_{subblock}} \right\rceil \cdot rv_{idx} + 2 \right) \]
where $rv_{idx} = 0, 1, 2, 3$. The operation $(k_0 + j) \bmod N_{cb}$ in the previous equation makes sure that the bit index is reset to the first bit in the buffer when the index reaches the maximum index $N_{cb}$, which is the idea of a circular buffer.
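The bit selection can be sketched as a loop over the circular buffer that starts at $k_0$ and skips dummy positions. The function names are hypothetical, 9 marks a dummy bit, and the buffer in the usage lines is built from the interleaver outputs listed in Section 5.7.3.

```python
NULL = "9"   # dummy-bit marker, as in the Matlab results of this chapter


def rv_start(n_cb, rows, rv_idx):
    """Starting point k0 of redundancy version rv_idx."""
    return rows * (2 * -(-n_cb // (8 * rows)) * rv_idx + 2)


def bit_selection(w, num_out, rows, rv_idx, n_cb=None):
    """Read num_out non-dummy bits from the circular buffer w, starting at k0.

    Assumes the buffer holds at least one non-dummy bit; otherwise the loop
    would never terminate.
    """
    n_cb = len(w) if n_cb is None else n_cb
    k0 = rv_start(n_cb, rows, rv_idx)
    out, j = [], 0
    while len(out) < num_out:
        symbol = w[(k0 + j) % n_cb]      # wrap around at the end of the buffer
        if symbol != NULL:               # dummy bits are never transmitted
            out.append(symbol)
        j += 1
    return "".join(out)


v0 = "9190910091019110909191019111910190909000900091109090911090109010"
v1 = "9190900090009010919191119110910190909101900090009091901090009010"
v2 = "9190910191019000909191109000900091919110900091109001911090119009"
w = v0 + "".join(a + b for a, b in zip(v1, v2))     # bit collection
for rv in range(4):
    print(rv, rv_start(len(w), 2, rv), bit_selection(w, 44, 2, rv))
```

With $R_{subblock} = 2$ and $N_{cb} = 192$ the starting points are 4, 52, 100 and 148, and the 44 selected bits for each redundancy version agree with the Matlab outputs listed in Section 5.7.4; the last redundancy version wraps around to the start of the buffer.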
5.6 Implementation of Rate Matching Transmitter
5.6.1 The rate matching transmitter main blocks
Implementation of rate matching transmitter
The main blocks of the transmitter are:
1. Three sub-block interleavers.
2. Bit collection.
3. Bit selection.
5.6.2 Sub-block interleaver
We have two types of sub-block interleaver.
5.6.3 The function of the sub-block interleaver
It is used to randomize the bits.
Sub-block interleaver block diagram
The input ports of the sub-block interleaver:
1. d: the 44 input data bits (encoded bits).
2. clk: the system clock, used to synchronize the system.
3. reset: used to reset the whole system and the block.
4. load: used to enable the block to receive bits.
The output ports of the sub-block interleaver:
1. Q1: the first output bits from the sub-block interleaver block.
2. Q2: the second output bits from the sub-block interleaver block.
3. en: a signal indicating that the interleaved bits are available at the output ports.
5.6.4 Bit collection
The function of the bit collection
It collects the interleaved bits from the sub-block interleavers and interlaces them.
Bit collection block diagram
The input ports of the bit collection:
1. w10, w20: the input ports from the first sub-block interleaver.
2. w11, w21: the input ports from the second sub-block interleaver.
3. w12, w22: the input ports from the third sub-block interleaver.
4. clk: the system clock, used to synchronize the system.
5. load1, load2, load3: used to enable the block.
The output ports of the bit collection:
1. wk1, wk2: the interlaced output bits from the bit collection block.
2. load: a signal indicating that the output bits are available at the output ports.
5.7 Simulation of Transmitter
The simulations are run in Modelsim and the results are checked using Matlab.
5.7.1 The first sub-block interleaver
We use the results from the previous simulation of the turbo encoder.
The input is:
$d^{(0)}_k$ = 00110001110110001010101111100010101000100010
Using Matlab:
$v^{(0)}_k$ = 9190910091019110909191019111910190909000900091109090911090109010
Note that the dummy bits are represented by 9.
Using Modelsim (0 ps to 400000 ps):
/subblock1test/d = 00110001110110001010101111100010101000100010
/subblock1test/Q1 = 0100010001010110000101010111010100000000000001100000011000100010
/subblock1test/Q2 = 1110110011011110101111011111110110101000100011101010111010101010
The first sub-block interleaver simulation in Modelsim (signals d, reset, load, clk, Q1, Q2, en)
Note that the representation of the dummy bits in Matlab differs from the VHDL representation.
5.7.2 The third sub-block interleaver
We use the results from the previous simulation of the turbo encoder.
The input is:
$d^{(2)}_k$ = 00001011101001100100011110100011110011000000
Using Matlab:
$v^{(2)}_k$ = 9190910191019000909191109000900091919110900091109001911090119009
Note that the dummy bits are represented by 9.
Using Modelsim (0 ps to 400000 ps):
/subblock3test/d = 00001011101001100100011110100011110011000000
/subblock3test/Q1 = 0100010101010000000101100000000001010110000001100001011000110000
/subblock3test/Q2 = 1110110111011000101111101000100011111110100011101001111010111001
The third sub-block interleaver simulation in Modelsim (signals d, reset, load, clk, Q1, Q2, en)
5.7.3 The bit collection block
Using Matlab
The input is:
$v^{(0)}_k$ = 9190910091019110909191019111910190909000900091109090911090109010
$v^{(1)}_k$ = 9190900090009010919191119110910190909101900090009091901090009010
$v^{(2)}_k$ = 9190910191019000909191109000900091919110900091109001911090119009
The output is:
$w_k$ = 9190910091019110909191019111910190909000900091109090911090109010
9911990099010001990100019900100099109911991111109910100099100010
9901990199110110990000009901010099009011990111009900010199001009
Using Modelsim (0 ps to 600000 ps):
/collectiontest/vk10 = 0100010001010110000101010111010100000000000001100000011000100010
/collectiontest/vk11 = 0100000000000010010101110110010100000101000000000001001000000010
/collectiontest/vk12 = 0100010101010000000101100000000001010110000001100001011000110000
/collectiontest/vk20 = 1110110011011110101111011111110110101000100011101010111010101010
/collectiontest/vk21 = 1110100010001010111111111110110110101101100010001011101010001010
/collectiontest/vk22 = 1110110111011000101111101000100011111110100011101001111010111001
/collectiontest/wk1 = ...010001000101011000010101011101010000000000000110000001100010001000110000000100010001000100001000...
/collectiontest/wk2 = ...111011001101111010111101111111011010100010001110101011101010101011111100110100011101000111001000...
The interlacing Modelsim simulation (signals vk10, vk11, vk12, vk20, vk21, vk22, load1, load2, load3, clk, wk1, wk2, en)
5.7.4 The bit selection block
Using Matlab
The input is:
$w_k$ = 9190910091019110909191019111910190909000900091109090911090109010
9911990099010001990100019900100099109911991111109910100099100010
9901990199110110990000009901010099009011990111009900010199001009
The output is:
at rv = 0: $e_k$ = 10010111001101111101000000001100011001001011
at rv = 1: $e_k$ = 11001001011000100010100010010001011111110101
at rv = 2: $e_k$ = 11111110101000100010010111011000000001010000
at rv = 3: $e_k$ = 00000101000001101110000010100100101001011100
Using Modelsim
The bit selection Modelsim simulation for rv = 0.
The bit selection Modelsim simulation for rv = 1.
The bit selection Modelsim simulation for rv = 2.
The bit selection Modelsim simulation for rv = 3.
5.8 Simulation of receiver
5.8.1 Matlab
There are four cases:
1. RV = 0: only the first part of the circular buffer is sent, and the turbo decoder can detect and correct the data.
2. RV = 1: the first and second parts of the circular buffer are sent, and the turbo decoder can detect and correct the data.
3. RV = 2: the first, second and third parts of the circular buffer are sent, and the turbo decoder can detect and correct the data.
4. RV = 3: the first, second, third and last parts of the circular buffer are sent, and the turbo decoder can detect and correct the data.
In each case the turbo decoder checks the data and decides whether it needs another copy of this data or not.
Example:
First case: RV = 0 and the first part of the data, data[1:48] = 1.
Output after de-puncturing:
wk=[0000111111111111111111111111111111111111111111111111000000000000
0000000000000000000000000000000000000000000000000000000000000000
0000000000000000000000000000000000000000000000000000000000000000]
Second case: RV = 1 and the first part of the data equals the second part, data[1:48] = 1.
Output after de-puncturing:
wk=[0000111111111111111111111111111111111111111111111111111111111111
1111111111111111111111111111111111110000000000000000000000000000
0000000000000000000000000000000000000000000000000000000000000000]
Third case: RV = 2 and all previous parts are equal, data[1:144] = 1.
Output after de-puncturing:
wk=[0000111111111111111111111111111111111111111111111111111111111111
1111111111111111111111111111111111111111111111111111111111111111
1111111111111111111100000000000000000000000000000000000000000000]
Fourth case: RV = 3, with the first part of the data = 1:48, the second part = 49:96, the third part = 97:144 and the fourth part = 145:192.
Output after de-puncturing
Output after de-interlacing for the fourth case
Note that parity 0 takes the odd positions and parity 1 takes the even positions [2].
Output after de-permutation
5.8.2 VHDL
There are four cases:
First case:
Example 1: if the data received at the circular buffer is ek0[0:48] = 111111111... at RV = 0, the output wk is 192 bits long. ek0 is placed starting from wk(5), because of k0 as in the previous section, and the remaining bits are filled with 0s.
Second case:
Example 2: if the data received at the circular buffer is ek1[0:48] = [ones(0:23) zeros(0:23)] at RV = 1, the receiver stores ek0 and uses it together with ek1 to confirm wk. The output wk is 192 bits long, with the remaining bits filled with 0s.
Third case:
Example 3: if the data received at the circular buffer is ek2[0:48] = [ones(0:23) zeros(0:23)] at RV = 2, the receiver stores ek0 and ek1 and uses them together with ek2 to confirm wk. The output wk is 192 bits long, with the remaining bits filled with 0s.
Fourth case:
Example 4: if the data received at the circular buffer is ek3[0:48] = 11111111... at RV = 3, the receiver stores ek0, ek1 and ek2 and uses them together with ek3 to confirm wk. The output wk is 192 bits long.
We note that wk has four more ones than f4, which means that the last input (RV = 3) wraps around to complete the least significant nibble.
The last step is de-permutation.
Example 5: if the input to bit selection is wk = 1010101010... up to 192 bits, the outputs are the systematic, parity 0 and parity 1 streams.
Bibliography
[1] 3GPP. Evolved Universal Terrestrial Radio Access (E-UTRA); Multiplexing and channel coding. TS 36.212, 3rd Generation Partnership Project (3GPP), January 2010.
[2] Farooq Khan. LTE for 4G Mobile Broadband. Cambridge University Press, 2009.
Chapter 6
Scrambling
6.1 PN-sequences
Noise-like wideband spread-spectrum signals are generated using PN sequences.
In DS/SS (direct-sequence spread spectrum), a PN spreading waveform is a time function of a PN sequence.
In FH/SS (frequency-hopping spread spectrum), frequency-hopping patterns can be generated from a PN code.
PN sequences are deterministically generated; however, they appear almost like random sequences to an observer.
The time waveforms generated from PN sequences also look like random noise.
6.1.1 m-sequences
M-sequences have been studied extensively as the nearest approximation to random sequences. M-sequences have found numerous applications in modern communication systems, including spread-spectrum Code Division Multiple Access (CDMA). These applications require large sets of codes with highly peaked autocorrelation and minimum cross-correlation.
An m-sequence (binary maximal-length shift-register sequence) is generated using a linear feedback shift register and exclusive-OR gate circuits.
Linear generator polynomial $g(x)$ of degree $m > 0$:
\[ g(x) = g_m x^m + g_{m-1} x^{m-1} + \cdots + g_1 x + g_0 \]
Recurrence equation ($g_m = g_0 = 1$):
\[ x^m = g_{m-1} x^{m-1} + g_{m-2} x^{m-2} + \cdots + g_1 x + g_0 \]
If $g_i = 1$, the corresponding circuit switch is closed; otherwise ($g_i \ne 1$) it is open.
The output of the shift-register circuit is transformed to $+1$ if it is 0, and to $-1$ if it is 1.
The maximum number of non-zero states is $2^m - 1$, which is the maximum period of the output sequence $c = (c_0, c_1, c_2, \ldots)$.
The state of the shift register at clock pulse $i$ is the finite-length vector
\[ s_i = (s_i(m-1), s_i(m-2), \ldots, s_i(0)) \]
and the output at clock pulse $i$ is $c_i = s_i(0)$.
Output sequence recurrence condition according to $g(x)$:
\[ c_{i+m} = g_{m-1} c_{i+m-1} + g_{m-2} c_{i+m-2} + \cdots + g_1 c_{i+1} + c_i \pmod 2 \]
Example of a shift-register sequence: for any nonzero starting state ($s_0 \ne (0, 0, 0, 0, 0)$), the state of the shift register varies according to the recurrence condition.
Other $g(x)$ may yield a sequence of shorter period than $2^m - 1$.
For a different initial loading, the output sequence becomes a shift of the sequence $c$, denoted $T^j c$ (shift $c$ to the left (right) by $j$ units).
A linear combination of $T^4 c, T^3 c, T^2 c, T^1 c, c$ yields all the other shifts of $c$.
Example: shift-register sequence with $x^5 + x^4 + x^2 + x + 1$.
Primitive polynomial: the generator polynomial of an m-sequence is a primitive polynomial. $g(x)$ is a primitive polynomial of degree $m$ if the smallest integer $n$ for which $g(x)$ divides $x^n + 1$ is $n = 2^m - 1$.
$g(x) = x^5 + x^4 + x^2 + x + 1$ is primitive; on the other hand, $g(x) = x^5 + x^4 + x^3 + x^2 + x + 1$ is not primitive, since $x^6 + 1 = (x + 1)(x^5 + x^4 + x^3 + x^2 + x + 1)$, so the smallest $n$ is 6.
The number of primitive polynomials of degree $m$ is equal to $\frac{1}{m}\phi(2^m - 1)$, where
\[ \phi(n) = n \prod_{p \mid n} \left( 1 - \frac{1}{p} \right) \]
Here $p \mid n$ runs over all distinct prime divisors of $n$, and $\phi(n)$ is the number of positive integers less than $n$ that are relatively prime to $n$.
Properties of m-sequences
Property I (The Shift Property): A cyclic shift (left-cyclic or right-cyclic) of an m-sequence is also an m-sequence.
Property II (The Recurrence Property): Any m-sequence in $S_m$ satisfies the recurrence condition
\[ c_{i+m} = g_{m-1} c_{i+m-1} + g_{m-2} c_{i+m-2} + \cdots + g_1 c_{i+1} + c_i \pmod 2 \]
where $i = 0, 1, 2, \ldots$
Property III (The Window Property): If a window of width $m$ is slid along an m-sequence in $S_m$, each of the $2^m - 1$ nonzero binary $m$-tuples is seen exactly once.
Property IV (One more 1 than 0s): Any m-sequence in $S_m$ contains $2^{m-1}$ 1s and $2^{m-1} - 1$ 0s.
Property V (The Addition Property): The sum of two m-sequences in $S_m$ (mod 2, term by term) is another sequence in $S_m$.
Property VI (The Shift-and-Add Property): The sum of an m-sequence and a cyclic shift of itself (mod 2, term by term) is another m-sequence.
Property VII (Thumb-Tack Autocorrelation): The normalized periodic autocorrelation function of an m-sequence, defined as
\[ \rho(i) = \frac{1}{N} \sum_{j=0}^{N-1} (-1)^{c_j + c_{j+i}} \]
is equal to 1 for $i \equiv 0 \pmod N$ and $-1/N$ for $i \not\equiv 0 \pmod N$. This is proved easily by the shift-and-add property.
Property VIII (Runs): A run is a string of consecutive 1s or a string of consecutive 0s. In any m-sequence, one-half of the runs have length 1, one-quarter have length 2, one-eighth have length 3, and so on. In particular, there is one run of length $m$ of 1s and one run of length $m - 1$ of 0s.
Property IX (Decimation): The decimation by $n > 0$ of an m-sequence $c$, denoted as $c[n]$, has a period equal to $N/\gcd(N, n)$ if it is not the all-zero sequence; its generator polynomial $\hat{g}(x)$ has roots that are the $n$th powers of the roots of $g(x)$.
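Several of these properties can be checked numerically for the degree-5 example polynomial $x^5 + x^4 + x^2 + x + 1$. The sketch below is illustrative only: the function names are hypothetical, the initial loading 00001 is an arbitrary nonzero choice, and the recurrence is taken as $c_{i+5} = c_{i+4} + c_{i+2} + c_{i+1} + c_i \pmod 2$.

```python
def lfsr_sequence(taps, degree, length, seed=None):
    """Generate c_0 .. c_{length-1} with c_{i+m} = sum of c_{i+t}, t in taps (mod 2)."""
    c = list(seed) if seed is not None else [0] * (degree - 1) + [1]
    while len(c) < length:
        i = len(c) - degree
        c.append(sum(c[i + t] for t in taps) % 2)
    return c[:length]


def periodic_autocorrelation(c, shift):
    """Unnormalized sum of (-1)^(c_j + c_{j+shift}) over one period."""
    n = len(c)
    return sum(1 if c[j] == c[(j + shift) % n] else -1 for j in range(n))


M = 5
TAPS = (4, 2, 1, 0)            # g(x) = x^5 + x^4 + x^2 + x + 1
N = 2 ** M - 1                 # expected period

two_periods = lfsr_sequence(TAPS, M, 2 * N + M)
c = two_periods[:N]

repeats = two_periods[:N] == two_periods[N:2 * N]
windows = {tuple(two_periods[i:i + M]) for i in range(N)}
off_peak = {periodic_autocorrelation(c, s) for s in range(1, N)}

print("repeats after", N, "symbols:", repeats)
print("ones / zeros:", sum(c), N - sum(c))                 # Property IV
print("distinct windows of width 5:", len(windows))        # Property III
print("off-peak autocorrelation values:", off_peak)        # Property VII (times N)
```

Dividing the autocorrelation sums by $N = 31$ gives the normalized values 1 at zero shift and $-1/31$ elsewhere, which is the thumb-tack shape of Property VII.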
6.1.2 Preferred Pair
1. Any pair of m-sequences having the same period $N$ can be related by $y = x[q]$, for some $q$.
2. Definition: $x$ and $y$ form a preferred pair if
$m \not\equiv 0 \pmod 4$, that is, $m$ is odd or $m \equiv 2 \pmod 4$;
$y = x[q]$, where $q$ is odd and either $q = 2^k + 1$ or $q = 2^{2k} - 2^k + 1$;
$\gcd(m, k) = 1$ for $m$ odd and $\gcd(m, k) = 2$ for $m \equiv 2 \pmod 4$, where gcd denotes the greatest common divisor.
3. It is known that preferred pairs of m-sequences do not exist for $m = 4, 8, 12, 16$, and it was conjectured that no solutions exist for all $m \equiv 0 \pmod 4$.
6.1.3 Gold Codes
Gold sequences of length $N$ can be constructed from a preferred pair of m-sequences.
A preferred pair of m-sequences, say $x$ and $y$, has a three-valued correlation function:
\[ \theta_{x,y}(n) = -1, \; -t(m), \; \text{or} \; t(m) - 2 \quad \text{for all } n, \quad \text{where } t(m) = 1 + 2^{\lfloor (m+2)/2 \rfloor} \]
The set of Gold sequences includes the preferred pair of m-sequences $x$ and $y$, and the mod-2 sums of $x$ and cyclic shifts of $y$.
The maximum correlation magnitude for any two Gold sequences in the same set is equal to the constant $t(m)$.
Example of Gold sequences for $m = 3$:
Number of m-sequences: $\frac{1}{3}\phi(7) = 2$
Length of m-sequences: $N = 2^3 - 1 = 7$
Primitive polynomials of degree $m = 3$ (initial loading: 001):
$x^3 + x + 1$: $x$ = 1001011
$x^3 + x^2 + 1$: $y$ = 1001110
The corresponding set of 9 Gold sequences of period 7 is given by:
1001011 1001110 0000101
1010110 1110001 0111111
0100010 0011000 1101100
Autocorrelation function for both m-sequences: thumb-tack shaped.
$t(m) = 1 + 2^{\lfloor (m+2)/2 \rfloor} = 5$
The cross-correlation function is three-valued, taking the values $-1$, $-5$ or $3$:
\[ \theta_{x,y}(n) = -1, \; -t(m) = -5, \; \text{or} \; t(m) - 2 = 3 \]
$t(m)/N \approx 2^{-m/2}$ goes to 0 exponentially as $m$ goes to infinity.
This suggests that longer Gold sequences will perform better as SSMA sequences.
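The $m = 3$ example can be reproduced with a few lines of Python. This is a sketch with hypothetical function names; sequences are handled as strings of 0s and 1s, a cyclic shift is taken to the left, and the correlation is computed as agreements minus disagreements over one period.

```python
def cyclic_shift(seq, k):
    """Shift the sequence to the left by k positions (cyclically)."""
    return seq[k:] + seq[:k]


def xor_bits(a, b):
    """Term-by-term mod-2 sum of two equally long bit strings."""
    return "".join(str(int(p) ^ int(q)) for p, q in zip(a, b))


def gold_set(x, y):
    """The preferred pair plus the mod-2 sums of x with every cyclic shift of y."""
    return [x, y] + [xor_bits(x, cyclic_shift(y, k)) for k in range(len(y))]


def cross_correlation(a, b, shift):
    """Periodic cross-correlation: agreements minus disagreements."""
    return sum(1 if p == q else -1 for p, q in zip(a, cyclic_shift(b, shift)))


x, y = "1001011", "1001110"
print(gold_set(x, y))
print(sorted({cross_correlation(x, y, n) for n in range(7)}))   # -> [-5, -1, 3]
```

The nine printed sequences are those of the table above, and the cross-correlation of the preferred pair takes only the three values $-1$, $-5$ and $3$.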
6.2 Scrambler
LTE downlink scrambling implies that the block of code bits delivered by the hybrid-ARQ functionality is multiplied (exclusive-or operation) by a bit-level scrambling sequence (usually a Gold code).
In general, scrambling of the coded data helps to ensure that the receiver-side decoding can fully utilize the processing gain provided by the channel code.
The codewords are bit-wise multiplied with an orthogonal sequence and a cell-specific scrambling sequence to create a sequence of symbols for each codeword $q$.
The scrambling sequence is pseudo-random, created using a length-31 Gold sequence generator and initialized at the start of each subframe using the radio network temporary identifier associated with the PDSCH transmission, $n_{RNTI}$, the cell ID, $N^{cell}_{ID}$, the slot number within the radio frame, $n_s$, and the codeword index $q \in \{0, 1\}$:
\[ c_{init} = n_{RNTI} \cdot 2^{14} + q \cdot 2^{13} + \lfloor n_s / 2 \rfloor \cdot 2^9 + N^{cell}_{ID} \]
Scrambling with a cell-specific sequence serves the purpose of inter-cell interference rejection. When a UE descrambles a received bitstream with a known cell-specific scrambling sequence, interference from other cells will be descrambled incorrectly and therefore only appear as uncorrelated noise.
Pseudo-random sequences are defined by a length-31 Gold sequence. The output sequence $c(n)$ of length $M_{PN}$, where $n = 0, 1, \ldots, M_{PN} - 1$, is defined by
\[ c(n) = (x_1(n + N_C) + x_2(n + N_C)) \bmod 2 \]
\[ x_1(n + 31) = (x_1(n + 3) + x_1(n)) \bmod 2 \]
\[ x_2(n + 31) = (x_2(n + 3) + x_2(n + 2) + x_2(n + 1) + x_2(n)) \bmod 2 \]
where $N_C = 1600$ and the first m-sequence shall be initialized with $x_1(0) = 1$, $x_1(n) = 0$, $n = 1, 2, \ldots, 30$. The initialization of the second m-sequence is denoted by $c_{init} = \sum_{i=0}^{30} x_2(i) \cdot 2^i$, with the value depending on the application of the sequence.
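A direct transcription of these recurrences into Python is sketched below. The function names are hypothetical, the initialization value is the PDSCH expression $c_{init} = n_{RNTI} \cdot 2^{14} + q \cdot 2^{13} + \lfloor n_s/2 \rfloor \cdot 2^9 + N^{cell}_{ID}$, and the identifiers in the usage lines are arbitrary example values.

```python
N_C = 1600   # number of initial samples that are skipped


def pdsch_c_init(n_rnti, q, n_s, cell_id):
    """Initial value of the second m-sequence for PDSCH scrambling."""
    return n_rnti * 2 ** 14 + q * 2 ** 13 + (n_s // 2) * 2 ** 9 + cell_id


def gold_sequence(c_init, length):
    """Return c(0) .. c(length-1) of the length-31 Gold sequence."""
    total = N_C + length
    x1 = [1] + [0] * 30                               # x1(0) = 1, x1(1..30) = 0
    x2 = [(c_init >> i) & 1 for i in range(31)]       # c_init = sum x2(i) * 2^i
    for n in range(total - 31):
        x1.append((x1[n + 3] + x1[n]) % 2)
        x2.append((x2[n + 3] + x2[n + 2] + x2[n + 1] + x2[n]) % 2)
    return [(x1[n + N_C] + x2[n + N_C]) % 2 for n in range(length)]


def scramble(bits, c_init):
    """Bit-level scrambling; applying it twice restores the input."""
    return [b ^ s for b, s in zip(bits, gold_sequence(c_init, len(bits)))]


c_init = pdsch_c_init(n_rnti=61, q=0, n_s=4, cell_id=17)    # example identifiers
data = [1, 0, 1, 1, 0, 0, 1, 0]
tx = scramble(data, c_init)
print(scramble(tx, c_init) == data)    # descrambling with the same c_init
```

Because the operation is a bit-wise exclusive-or, the receiver descrambles with exactly the same routine, provided it uses the same initialization value.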
6.3 Why scrambling
6.3.1 Data randomization
The scrambling process ensures that no long stream of zeros is transmitted, since zeros mean that no power will be transmitted, which would lead to loss of synchronization at the receiver end, assuming that all data is received. Also, randomization of the bits reduces the redundancy in the data stream, which leads to better error-correction performance.
6.3.2 PAPR (peak-to-average power ratio) reduction
The PAPR of a waveform may be described as
\[ PAPR = \frac{\max |x(t)|^2}{P_{avg}} \]
where $P_{avg}$ is the average power of the waveform. In practical OFDM systems, the PAPR may be reduced using one or a combination of several techniques. The techniques may be divided into three major categories. The first category employs various methods of nonlinear signal distortion such as hard clipping, soft clipping, companding, or pre-distortion. Generally speaking, the nonlinear distortion techniques are simple to implement. However, many do not work well in cases where the OFDM sub-carriers are modulated with higher-order modulation schemes. In such scenarios, the Euclidean distance between the symbols is relatively small and the additional noise introduced by the PAPR reduction causes significant performance degradation. The second category for PAPR reduction employs various coding methods. The coding techniques have the advantage of being distortionless, and the PAPR reduction is most commonly achieved by eliminating symbols having large PAPR. However, to obtain an appreciable level of PAPR reduction, high-redundancy codes need to be used and, as a result, the overall efficiency of transmission is reduced. Finally, the third category is based on OFDM symbol scrambling and selection of the sequence that produces minimum PAPR. The pre-scrambling techniques achieve good PAPR reduction but they require multiple FFT transforms and somewhat higher processing power. The method presented here belongs to the third category of PAPR reduction techniques. It uses conveniently chosen Pseudorandom Noise (PN) sequences applied to the input data bit stream. The method is very easy to realize in a software or hardware environment, which is very important if the PAPR reduction needs to be implemented in Application Specific Integrated Circuits (ASIC). In such a scenario, the PN-Scrambler may be implemented by the addition of external FPGA and DSP hardware to the Commercial Off-The-Shelf (COTS) ASICs. As a result, one obtains cost-efficient and reliable hardware solutions.
Block diagrams of the transmitter and receiver implementing the proposed PN-Scrambler are presented in Figs. 1 and 2, respectively.
As seen in Fig. 1, two additional elements are added to a typical OFDM transmitter. The first element is the PAPR scrambler, and the second one is the PAPR threshold compare block. The PN-Scrambler utilizes a Maximal-Length Linear Feedback Shift Register (ML-LFSR) with $l = \log_2(k + 1)$ taps in order to produce $k = 2^l - 1$ uncorrelated unique sets of data from the same input sequence. The $k$ unique sets of data are used to generate $k$ independent identically distributed OFDM symbols. A block of $N_b$ bits comprising one OFDM symbol is scrambled and passed along for Forward Error Correction (FEC) coding, interleaving, modulation, symbol mapping and IFFT. In any given OFDM system, $N_b$ is a function of the number of subcarriers, the modulation scheme applied to each subcarrier, and the coding rate. By examining each individual sample coming out of the IFFT, the PAPR threshold comparator determines if the scrambler has achieved a desired PAPR on a symbol-by-symbol basis. If the PAPR of the symbol is below a desired threshold, then the data is passed along towards the RF stage of the transmitter. However, if the PAPR is still high, the data is scrambled with a different phase of the ML-LFSR's PN sequence. Since this technique operates on the input bit stream, it is essentially independent of the OFDM modulation and may be adapted to any particular scenario. The receiver presented in Fig. 2 is a typical OFDM receiver that needs to perform the tasks of down conversion, channel estimation, and decoding. The only additional task required by the PN-Scrambler PAPR reduction technique is descrambling of the data at the receiver output. To perform descrambling, the receiver has to know the phase of the ML-LFSR used on the transmission side. This phase is embedded in the data stream. For example, the first $l$ bits of the OFDM symbol may carry the information on the ML-LFSR phase.
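The selection loop described above can be sketched as follows. This is a simplified illustration under several assumptions: the bits are mapped directly to BPSK sub-carriers (no FEC or interleaving), a plain inverse DFT stands in for the IFFT, the PN phases are passed in as ready-made bit lists, and when no phase meets the threshold the phase with the lowest PAPR seen is returned. All function names are hypothetical.

```python
import cmath


def idft(freq):
    """Inverse discrete Fourier transform (plain O(N^2) version)."""
    n = len(freq)
    return [sum(freq[k] * cmath.exp(2j * cmath.pi * k * t / n) for k in range(n)) / n
            for t in range(n)]


def papr(samples):
    """Peak power divided by average power (linear ratio, not dB)."""
    powers = [abs(s) ** 2 for s in samples]
    return max(powers) / (sum(powers) / len(powers))


def bpsk_symbol(bits):
    """Map bits to +1/-1 sub-carriers and return the time-domain OFDM symbol."""
    return idft([1 - 2 * b for b in bits])


def scramble_until_below(bits, pn_phases, threshold):
    """Try the PN phases in order; stop at the first one whose PAPR is at or
    below the threshold. Returns (phase index, PAPR) of the accepted phase, or
    of the best phase seen if none passes."""
    best = None
    for phase, pn in enumerate(pn_phases):
        scrambled = [b ^ p for b, p in zip(bits, pn)]
        value = papr(bpsk_symbol(scrambled))
        if best is None or value < best[1]:
            best = (phase, value)
        if value <= threshold:
            break
    return best


bits = [0] * 16                                   # worst case: all sub-carriers equal
phases = [[0] * 16,                               # phase 0: no scrambling
          [0, 1, 1, 0, 1, 0, 0, 1, 1, 1, 0, 0, 1, 0, 1, 0]]   # hypothetical PN phase
print(papr(bpsk_symbol(bits)))                    # an impulse in time: PAPR equals N
print(scramble_until_below(bits, phases, threshold=4.0))
```

The index returned by the loop corresponds to the ML-LFSR phase that would have to be signalled to the receiver for descrambling.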
A practical implementation of the PN-Scrambler PAPR reduction technique requires selection of several parameters. These parameters are defined as follows:
1. Number of scrambling sequences ($k$): defined as the number of PN sequences produced by the ML-LFSR. Each sequence is $N_b$ bits long.
2. PAPR threshold ($L$): defined as the maximum PAPR for the OFDM symbol. This value is used by the PAPR threshold comparator block in order to discard OFDM symbols with PAPR greater than $L$.
3. IFFT size / number of sub-carriers ($N$): defined as the number of non-zero orthogonal subcarriers per OFDM symbol.
4. Average latency ($\bar{k}$): defined as the average number of scrambling attempts per OFDM symbol in order to pass the threshold level $L$.
5. Probability of clipping ($p$): the probability that the PAPR exceeds the threshold level $L$ after $k$ scrambling attempts.
6. PN scrambler overhead ($v$): defined as the ratio of the number of bits required to represent the phase of the ML-LFSR to the number of bits per OFDM symbol $N_b$.
In any actual design, the above parameters allow different tradeoffs. The subsequent section highlights some of these design trades.
6.4 Matlab code
For the Matlab code, these equations are used to implement the feedback of the shift registers:
\[ c(n) = (x_1(n + N_C) + x_2(n + N_C)) \bmod 2 \]
\[ x_1(n + 31) = (x_1(n + 3) + x_1(n)) \bmod 2 \]
\[ x_2(n + 31) = (x_2(n + 3) + x_2(n + 2) + x_2(n + 1) + x_2(n)) \bmod 2 \]
For the initial phase of the two LFSRs:
The first register will have: [zeros(1,30) 1]
The second shift register will have: dec2bin($c_{init}$), where
\[ c_{init} = n_{RNTI} \cdot 2^{14} + q \cdot 2^{13} + \lfloor n_s / 2 \rfloor \cdot 2^9 + N^{cell}_{ID} \]
dec2bin converts the value of the previous equation from decimal to binary, so that it can be placed in the LFSR. The constants of the equation are assigned arbitrary integer values. The previous equation is only used in the case of the PDSCH channel.
For all downlink transport channels except the MCH, as well as for the L1/L2 control signaling, scrambling sequences should be different for neighbor cells (cell-specific scrambling) to ensure interference randomization between the cells. This is achieved by having the scrambling depend on the physical-layer cell identity. In contrast, in the case of MBSFN-based transmission using the MCH transport channel, the same scrambling should be applied to all cells taking part in the MBSFN transmission (cell-common scrambling). This is achieved by having the scrambling depend on the so-called MBSFN area identity.
for count=1:31
if xPD(1,count)==1 xpd : the initial phase of the shift register eected by the chan-
nel equation
xpd(1,count)=1;
else
xpd(1,count)=0;
end
end
The previous code is to convert the from data type char (output of the dec2bin) to dou-
ble,so it can be processed easily.
The next step is descarding the rst 1600 samples . for count=1:1600
feed1=xor(x1(1,end),x1(1,end-3));
feed2=xor(xor(xpd(1,end),xpd(1,end-1)),xor(xpd(1,end-3),xpd(1,end-2)));
x1=[feed1 x1(1,1:end-1)];
xpd=[feed2 xpd(1,1:end-1)]; end
Applying the previous feed back equations for the shift registers , as for the the rst regis-
ter feed1 is calculated and placed and placed at the begining of the sequence to be shifted ,
and the last sample is discarded , the same goes for the seconde register , useing feed2 , the
operation continuous till 1600 samples are discarded .
Now we start shifting using the past equations but this time ,the last bit of the two shift
registers will be xord ( generating the golden code) then ,xord with the data bit (bit level
scrambling)
for count=1:length(data)
    % same shift operation as before
    feed1=xor(x1(1,end),x1(1,end-3));
    feed2=xor(xor(xpd(1,end),xpd(1,end-1)),xor(xpd(1,end-3),xpd(1,end-2)));
    x1=[feed1 x1(1,1:end-1)];
    xpd=[feed2 xpd(1,1:end-1)];
    % XOR the last bits of the two shift registers (Gold code bit)
    gold=xor(x1(1,end),xpd(1,end));
    % XOR the Gold code bit with the data bit
    scrambled(1,count)=xor(gold,data(1,count));
end
The receiver performs exactly the same operation, since an XOR operation is reversed by another XOR operation with the same sequence.
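The scrambling steps above can be sketched in Python as a cross-check of the MATLAB listing. This is a minimal sketch, not the authors' implementation: the function name `gold_scramble`, the parameter `c_init`, and the start state of the first register (all zeros except the oldest bit) are assumptions, since the initialization of x1 is not shown in this section.

```python
def gold_scramble(bits, c_init, discard=1600):
    """Scramble (or descramble) a list of 0/1 bits with a length-31 Gold sequence."""
    # Registers keep the newest bit first and the oldest bit last,
    # mirroring the x1 / xpd vectors of the MATLAB listing.
    x1 = [0] * 30 + [1]                                  # assumed start state of register 1
    x2 = [(c_init >> (30 - i)) & 1 for i in range(31)]   # MSB first, like dec2bin(c_init, 31)

    def shift():
        feed1 = x1[-1] ^ x1[-4]                          # xor(x1(end), x1(end-3))
        feed2 = x2[-1] ^ x2[-2] ^ x2[-3] ^ x2[-4]        # four-tap feedback of register 2
        x1.insert(0, feed1)
        x1.pop()
        x2.insert(0, feed2)
        x2.pop()

    for _ in range(discard):                             # discard the first samples
        shift()

    out = []
    for b in bits:
        shift()
        gold = x1[-1] ^ x2[-1]                           # Gold code bit
        out.append(b ^ gold)                             # bit-level scrambling
    return out


data = [1, 0, 1, 1, 0, 0, 1, 0, 1, 1, 1, 0]
scrambled = gold_scramble(data, c_init=12345)
recovered = gold_scramble(scrambled, c_init=12345)       # the receiver repeats the same XOR
```

Running the same function twice with the same `c_init` returns the original data, which is the property the receiver relies on.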
Bibliography
[1] ALTERA. Gold code generator reference design. 2003.
[2] Ivica Kostanic and Christopher Moffatt. Practical implementation of PN scrambler for PAPR reduction in OFDM systems for range extension and lower power consumption. 2008.
Chapter 7
Digital Modulation Technique
7.1 INTRODUCTION
In baseband pulse transmission, a data stream represented in the form of a discrete pulse-amplitude modulated (PAM) signal is transmitted directly over a low-pass channel. In digital pass band transmission, on the other hand, the incoming data stream is modulated onto a carrier (usually sinusoidal) with fixed frequency limits imposed by a band-pass channel of interest. Pass band data transmission is studied in this chapter. The communication channel used for pass band data transmission may be a microwave radio link, a satellite channel, or the like. Other applications of pass band data transmission are in the design of pass band line codes for use on digital subscriber loops and orthogonal frequency-division multiplexing techniques for broadcasting. In any event, the modulation process making the transmission possible involves switching (keying) the amplitude, frequency, or phase of a sinusoidal carrier in some fashion in accordance with the incoming data. Thus there are three basic signaling schemes, and they are known as:
FIGURE 7.1: Waveforms for the three basic forms of signaling binary information: amplitude-shift keying (OOK), frequency-shift keying (FSK), and phase-shift keying (PRK).
Amplitude-shift keying (ASK), frequency-shift keying (FSK), and phase-shift keying (PSK). They may be viewed as special cases of amplitude modulation, frequency modulation, and phase modulation, respectively. Figure 7.1 illustrates these three methods of modulation for the case of a source supplying binary data. The following points are noteworthy from Figure 7.1. Although in continuous-wave modulation it is usually difficult to distinguish between phase-modulated and frequency-modulated signals by merely looking at their waveforms, this is not true for PSK and FSK signals. Unlike ASK signals, both PSK and FSK signals have a constant envelope. This latter property makes PSK and FSK signals impervious to amplitude nonlinearities, commonly encountered in microwave radio and satellite channels. It is for this reason that, in practice, PSK and FSK signals are preferred to ASK signals for pass band data transmission over nonlinear channels.
7.2 HIERARCHY OF DIGITAL MODULATION TECHNIQUES
Digital modulation techniques may be classified into coherent and noncoherent techniques, depending on whether the receiver is equipped with a phase-recovery circuit or not. The phase-recovery circuit ensures that the oscillator supplying the locally generated carrier wave in the receiver is synchronized (in both frequency and phase) to the oscillator supplying the carrier wave used to originally modulate the incoming data stream in the transmitter.
In an M-ary signaling scheme, we may send any one of M possible signals $s_1(t), s_2(t), \ldots, s_M(t)$ during each signaling interval of duration T. For almost all applications, the number of possible signals is $M = 2^n$, where n is an integer, and the symbol duration is $T = nT_b$, where $T_b$ is the bit duration. In pass band data transmission these signals are generated by changing the amplitude, phase, or frequency of a sinusoidal carrier in M discrete steps. Thus we have M-ary ASK, M-ary PSK, and M-ary FSK digital modulation schemes. Another way of generating M-ary signals is to combine different methods of modulation into a hybrid form. For example, a special form of this hybrid modulation is M-ary quadrature amplitude modulation (QAM), which has some attractive properties. M-ary ASK is a special case of M-ary QAM.
M-ary signaling schemes are preferred over binary signaling schemes for transmitting digital information over band-pass channels when the requirement is to conserve bandwidth at the expense of increased power. Thus, when the bandwidth of the channel is less than the required value, we may use M-ary signaling schemes for maximum efficiency.
M-ary PSK, M-ary QAM, and M-ary FSK are commonly used in coherent systems. Amplitude-shift keying and frequency-shift keying lend themselves naturally to use in noncoherent systems whenever it is impractical to maintain carrier phase synchronization. In the case of phase-shift keying, however, we cannot have noncoherent PSK, because the term noncoherent means doing without carrier phase information. Instead, we employ a pseudo-PSK technique known as differential phase-shift keying (DPSK), which (in a loose sense) may be viewed as the noncoherent form of PSK. In practice, M-ary FSK and M-ary DPSK are the commonly used forms of digital modulation in noncoherent systems.
7.3 Pass band Transmission Model
In a functional sense, we may model a pass band data transmission system as shown in Figure 7.2. First, there is assumed to exist a message source that emits one symbol every T seconds, with the symbols belonging to an alphabet of M symbols, which we denote by $m_1, m_2, \ldots, m_M$. The a priori probabilities $P(m_1), P(m_2), \ldots, P(m_M)$ specify the message source output. When the M symbols of the alphabet are equally likely, we write
$$p_i = P(m_i) = \frac{1}{M} \quad \text{for all } i \tag{7.1}$$
The M-ary output of the message source is presented to a signal transmission encoder, producing a corresponding vector $\mathbf{s}_i$ made up of N real elements, one such set for each of the M symbols of the source alphabet; the dimension N is less than or equal to M. With the vector $\mathbf{s}_i$ as input, the modulator then constructs a distinct signal $s_i(t)$ of duration T seconds as the representation of the symbol $m_i$ generated by the message source. The signal $s_i(t)$ is necessarily an energy signal, as shown by
$$E_i = \int_0^T s_i^2(t)\,dt, \quad i = 1, 2, \ldots, M \tag{7.2}$$
Note that $s_i(t)$ is real valued. One such signal is transmitted every T seconds. The particular signal chosen for transmission depends in some fashion on the incoming message and possibly on the signals transmitted in preceding time slots. With a sinusoidal carrier, the feature that is used by the modulator to distinguish one signal from another is a step change in the amplitude, frequency, or phase of the carrier. (Sometimes, a hybrid form of modulation that combines changes in both amplitude and phase or amplitude and frequency is used.)
FIGURE 7.2: Functional model of pass band data transmission system.
Returning to the functional model, the band-pass communication channel, coupling the transmitter to the receiver, is assumed to have two characteristics:
1. The channel is linear, with a bandwidth that is wide enough to accommodate the transmission
of the modulated signal s(t) with negligible or no distortion.
2. The channel noise w(t) is the sample function of a white Gaussian noise process of zero mean
and power spectral density N0/2.
7.4 COHERENT PHASE-SHIFT KEYING
7.4.1 Binary Phase-Shift Keying
In a coherent binary PSK system, the pair of signals $s_1(t)$ and $s_2(t)$ used to represent binary symbols 1 and 0, respectively, is defined by:
$$s_1(t) = \sqrt{\frac{2E_b}{T_b}}\cos(2\pi f_c t) \tag{7.3}$$
$$s_2(t) = \sqrt{\frac{2E_b}{T_b}}\cos(2\pi f_c t + \pi) = -\sqrt{\frac{2E_b}{T_b}}\cos(2\pi f_c t) \tag{7.4}$$
where $0 \le t \le T_b$, and $E_b$ is the transmitted signal energy per bit. To ensure that each transmitted bit contains an integral number of cycles of the carrier wave, the carrier frequency $f_c$ is chosen equal to $n_c/T_b$ for some fixed integer $n_c$. A pair of sinusoidal waves that differ only in a relative phase shift of 180 degrees, as defined in Equations (7.3) and (7.4), are referred to as antipodal signals. From this pair of equations it is clear that, in the case of binary PSK, there is only one basis function of unit energy, namely,
$$\phi_1(t) = \sqrt{\frac{2}{T_b}}\cos(2\pi f_c t), \quad 0 \le t < T_b \tag{7.5}$$
Then we may express the transmitted signals $s_1(t)$ and $s_2(t)$ in terms of $\phi_1(t)$ as follows:
$$s_1(t) = \sqrt{E_b}\,\phi_1(t), \qquad s_2(t) = -\sqrt{E_b}\,\phi_1(t), \qquad 0 \le t < T_b \tag{7.6}$$
FIGURE 7.3: Signal-space diagram for coherent binary PSK system. The waveforms depicting the transmitted signals $s_1(t)$ and $s_2(t)$, displayed in the inserts, assume $n_c = 2$.
A coherent binary PSK system is therefore characterized by having a signal space that is one-dimensional (i.e., N = 1), with a signal constellation consisting of two message points (i.e., M = 2). The coordinates of the message points are:
$$s_{11} = \int_0^{T_b} s_1(t)\,\phi_1(t)\,dt = +\sqrt{E_b}, \qquad s_{21} = \int_0^{T_b} s_2(t)\,\phi_1(t)\,dt = -\sqrt{E_b} \tag{7.7}$$
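The message point corresponding to $s_1(t)$ is located at $s_{11} = +\sqrt{E_b}$, and the message point corresponding to $s_2(t)$ is located at $s_{21} = -\sqrt{E_b}$. To decide between symbols 1 and 0, the signal space is partitioned into two regions: the set of points closest to message point 1 at $+\sqrt{E_b}$, and the set of points closest to message point 2 at $-\sqrt{E_b}$. This is accomplished by constructing the midpoint of the line joining these two message points, and then marking off the appropriate decision regions. In Figure 7.3 these decision regions are marked $Z_1$ and $Z_2$, according to the message point around which they are constructed.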
The decision rule is now simply to decide that signal $s_1(t)$ (i.e., binary symbol 1) was transmitted if the received signal point falls in region $Z_1$, and to decide that signal $s_2(t)$ (i.e., binary symbol 0) was transmitted if the received signal point falls in region $Z_2$. Two kinds of erroneous decisions may, however, be made. Signal $s_2(t)$ is transmitted, but the noise is such that the received signal point falls inside region $Z_1$, and so the receiver decides in favor of signal $s_1(t)$. Alternatively, signal $s_1(t)$ is transmitted, but the noise is such that the received signal point falls inside region $Z_2$, and so the receiver decides in favor of signal $s_2(t)$.
To calculate the probability of making an error of the first kind, we note from Figure 7.3 that the decision region associated with symbol 1 or signal $s_1(t)$ is described by
$$Z_1: \; 0 < x_1 < \infty \tag{7.8}$$
where the observable element $x_1$ is related to the received signal $x(t)$ by:
$$x_1 = \int_0^{T_b} x(t)\,\phi_1(t)\,dt \tag{7.9}$$
The conditional probability density function of random variable $X_1$, given that symbol 0 [i.e., signal $s_2(t)$] was transmitted, is defined by:
$$f_{X_1}(x_1 \mid 0) = \frac{1}{\sqrt{\pi N_0}}\exp\left[-\frac{1}{N_0}(x_1 - s_{21})^2\right] = \frac{1}{\sqrt{\pi N_0}}\exp\left[-\frac{1}{N_0}\left(x_1 + \sqrt{E_b}\right)^2\right] \tag{7.10}$$
The conditional probability of the receiver deciding in favor of symbol 1, given that symbol 0 was transmitted, is therefore
$$P_{10} = \int_0^{\infty} f_{X_1}(x_1 \mid 0)\,dx_1 = \frac{1}{\sqrt{\pi N_0}}\int_0^{\infty}\exp\left[-\frac{1}{N_0}\left(x_1 + \sqrt{E_b}\right)^2\right]dx_1 \tag{7.11}$$
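Putting
$$z = \frac{1}{\sqrt{N_0}}\left(x_1 + \sqrt{E_b}\right) \tag{7.12}$$
and changing the variable of integration from $x_1$ to z, we may rewrite Equation (7.11) in the compact form
$$P_{10} = \frac{1}{\sqrt{\pi}}\int_{\sqrt{E_b/N_0}}^{\infty}\exp(-z^2)\,dz = \frac{1}{2}\operatorname{erfc}\left(\sqrt{\frac{E_b}{N_0}}\right) \tag{7.13}$$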
Consider next an error of the second kind. We note that the signal space of Figure 7.3 is symmetric with respect to the origin. It follows therefore that $P_{01}$, the conditional probability of the receiver deciding in favor of symbol 0, given that symbol 1 was transmitted, has the same value as $P_{10}$ in Equation (7.13). Thus, averaging the conditional error probabilities $P_{10}$ and $P_{01}$, we find that the average probability of symbol error or, equivalently, the bit error rate for coherent binary PSK is (assuming equiprobable symbols)
$$P_e = \frac{1}{2}\operatorname{erfc}\left(\sqrt{\frac{E_b}{N_0}}\right) \tag{7.14}$$
As we increase the transmitted signal energy per bit, $E_b$, for a specified noise spectral density $N_0$, the message points corresponding to symbols 1 and 0 move further apart, and the average probability of error $P_e$ is correspondingly reduced.
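The bit error rate of Equation (7.14) can be checked numerically. The following is a minimal sketch, assuming unit bit energy and a correlator output whose noise component is Gaussian with variance $N_0/2$; the function names are illustrative and not part of the original text.

```python
import math
import random


def bpsk_ber_theory(ebn0_db):
    """Bit error rate of coherent binary PSK: 0.5 * erfc(sqrt(Eb/N0))."""
    ebn0 = 10 ** (ebn0_db / 10)
    return 0.5 * math.erfc(math.sqrt(ebn0))


def bpsk_ber_sim(ebn0_db, n_bits=100_000, seed=1):
    """Monte Carlo estimate using the one-dimensional signal-space model."""
    rng = random.Random(seed)
    ebn0 = 10 ** (ebn0_db / 10)
    eb = 1.0                                   # assumed unit energy per bit
    sigma = math.sqrt(eb / (2 * ebn0))         # noise standard deviation, variance N0/2
    errors = 0
    for _ in range(n_bits):
        bit = rng.getrandbits(1)
        s = math.sqrt(eb) if bit else -math.sqrt(eb)   # message point +sqrt(Eb) or -sqrt(Eb)
        x1 = s + rng.gauss(0.0, sigma)                 # observable element
        decided = 1 if x1 > 0 else 0                   # threshold at zero
        errors += decided != bit
    return errors / n_bits


theory_4db = bpsk_ber_theory(4.0)
sim_4db = bpsk_ber_sim(4.0)
```

The simulated value approaches the theoretical one as the number of bits grows, and both fall as $E_b/N_0$ increases.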
7.4.1.2 Generation and Detection of Coherent Binary PSK Signals
To generate a binary PSK signal, we have to represent the input binary sequence in polar form, with symbols 1 and 0 represented by constant amplitude levels of $+\sqrt{E_b}$ and $-\sqrt{E_b}$, respectively. This signal transmission encoding is performed by a polar non-return-to-zero (NRZ) level encoder. The resulting binary wave and a sinusoidal carrier $\phi_1(t)$, whose frequency is $f_c = n_c/T_b$ for some fixed integer $n_c$, are applied to a product modulator, as in Figure 7.4a. The carrier and the timing pulses used to generate the binary wave are usually extracted from a common master clock. The desired PSK wave is obtained at the modulator output.
To detect the original binary sequence of 1s and 0s, we apply the noisy PSK signal $x(t)$ (at the channel output) to a correlator, which is also supplied with a locally generated coherent reference signal $\phi_1(t)$, as in Figure 7.4b. The correlator output, $x_1$, is compared with a threshold of zero volts. If $x_1 > 0$, the receiver decides in favor of symbol 1. On the other hand, if $x_1 < 0$, it decides in favor of symbol 0. If $x_1$ is exactly zero, the receiver makes a random guess in favor of 0 or 1.
FIGURE 7.4: Block diagrams for (a) binary PSK transmitter and (b) coherent binary PSK
receiver.
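The transmitter and receiver just described can be sketched on sampled waveforms. This is only an illustration under stated assumptions: the integral of the correlator is approximated by a midpoint sum, the channel is noiseless, the function names are hypothetical, and a correlator output of exactly zero is mapped to symbol 0 instead of a random guess.

```python
import math


def phi1(t, fc, tb):
    """Unit-energy basis function sqrt(2/Tb) * cos(2*pi*fc*t)."""
    return math.sqrt(2.0 / tb) * math.cos(2.0 * math.pi * fc * t)


def bpsk_waveform(bit, eb, fc, tb, n_samples=100):
    """Product modulator: polar NRZ level (+/- sqrt(Eb)) times the carrier phi1(t)."""
    level = math.sqrt(eb) if bit else -math.sqrt(eb)
    dt = tb / n_samples
    return [level * phi1((k + 0.5) * dt, fc, tb) for k in range(n_samples)]


def correlate(x, fc, tb):
    """Correlator: integrate x(t) * phi1(t) over one bit interval (midpoint rule)."""
    dt = tb / len(x)
    return sum(xk * phi1((k + 0.5) * dt, fc, tb) for k, xk in enumerate(x)) * dt


def decide(x1):
    """Threshold device: compare the correlator output with zero."""
    return 1 if x1 > 0 else 0


# Carrier frequency fc = nc / Tb with nc = 2, so each bit holds two carrier cycles.
x1_for_one = correlate(bpsk_waveform(1, eb=4.0, fc=2.0, tb=1.0), fc=2.0, tb=1.0)
x1_for_zero = correlate(bpsk_waveform(0, eb=4.0, fc=2.0, tb=1.0), fc=2.0, tb=1.0)
```

With $E_b = 4$ the correlator outputs land on the two message points, close to +2 and -2.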
7.4.2 QUADRIPHASE-SHIFT KEYING
The provision of reliable performance, exemplified by a very low probability of error, is one important goal in the design of a digital communication system. Another important goal is the efficient utilization of channel bandwidth. In this section, we study a bandwidth-conserving modulation scheme known as coherent quadriphase-shift keying, which is an example of quadrature-carrier multiplexing. In quadriphase-shift keying (QPSK), as with binary PSK, information carried by the transmitted signal is contained in the phase. In particular, the phase of the carrier takes on one of four equally spaced values, such as $\pi/4$, $3\pi/4$, $5\pi/4$, and $7\pi/4$. For this set of values we may define the transmitted signal as
$$s_i(t) = \begin{cases} \sqrt{\dfrac{2E}{T}}\cos\left[2\pi f_c t + (2i-1)\dfrac{\pi}{4}\right], & 0 \le t \le T \\ 0, & \text{elsewhere} \end{cases} \tag{7.15}$$
where i = 1, 2, 3, 4; E is the transmitted signal energy per symbol, and T is the symbol duration. The carrier frequency $f_c$ equals $n_c/T$ for some fixed integer $n_c$. Each possible value of the phase corresponds to a unique dibit. Thus, for example, we may choose the foregoing set of phase values to represent the Gray-encoded set of dibits: 10, 00, 01, and 11, where only a single bit is changed from one dibit to the next.
7.4.2.1 Signal-Space Diagram of QPSK
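Using a well-known trigonometric identity, we may use Equation (7.15) to redefine the transmitted signal $s_i(t)$ for the interval $0 \le t \le T$ in the equivalent form:
$$s_i(t) = \sqrt{\frac{2E}{T}}\cos\left[(2i-1)\frac{\pi}{4}\right]\cos(2\pi f_c t) - \sqrt{\frac{2E}{T}}\sin\left[(2i-1)\frac{\pi}{4}\right]\sin(2\pi f_c t) \tag{7.16}$$
where i = 1, 2, 3, 4. Based on this representation, we can make the following observations:
There are two orthonormal basis functions, $\phi_1(t)$ and $\phi_2(t)$, contained in the expansion of $s_i(t)$. Specifically, $\phi_1(t)$ and $\phi_2(t)$ are defined by a pair of quadrature carriers:
$$\phi_1(t) = \sqrt{\frac{2}{T}}\cos(2\pi f_c t), \qquad \phi_2(t) = \sqrt{\frac{2}{T}}\sin(2\pi f_c t), \qquad 0 \le t \le T \tag{7.17}$$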
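There are four message points, and the associated signal vectors are defined by:
$$\mathbf{s}_i = \begin{bmatrix} \sqrt{E}\cos\left((2i-1)\dfrac{\pi}{4}\right) \\ -\sqrt{E}\sin\left((2i-1)\dfrac{\pi}{4}\right) \end{bmatrix}, \quad i = 1, 2, 3, 4 \tag{7.18}$$
TABLE 7.1: Signal-space characterization of QPSK.
FIGURE 7.5: Signal-space diagram of coherent QPSK system.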
The elements of the signal vectors, namely $s_{i1}$ and $s_{i2}$, have their values summarized in Table 7.1. The first two columns of this table give the associated dibit and phase of the QPSK signal. Accordingly, a QPSK signal has a two-dimensional signal constellation (i.e., N = 2) and four message points (i.e., M = 4) whose phase angles increase in a counterclockwise direction, as illustrated in Figure 7.5. As with binary PSK, the QPSK signal has minimum average energy.
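The coordinates of the four message points can be tabulated with a few lines of Python. This sketch assumes the standard signal-vector form $s_{i1} = \sqrt{E}\cos((2i-1)\pi/4)$, $s_{i2} = -\sqrt{E}\sin((2i-1)\pi/4)$ and the dibit assignment 10, 00, 01, 11 quoted earlier for the phases $\pi/4$, $3\pi/4$, $5\pi/4$, $7\pi/4$; the names `DIBITS` and `qpsk_point` are illustrative.

```python
import math

# Gray-encoded dibits for i = 1, 2, 3, 4 (phases pi/4, 3pi/4, 5pi/4, 7pi/4).
DIBITS = ["10", "00", "01", "11"]


def qpsk_point(i, energy):
    """Return the message point (s_i1, s_i2) for i in 1..4 and symbol energy E."""
    theta = (2 * i - 1) * math.pi / 4
    return (math.sqrt(energy) * math.cos(theta),
            -math.sqrt(energy) * math.sin(theta))


# With E = 2 every coordinate equals +1 or -1, i.e. +/- sqrt(E/2).
table = [(DIBITS[i - 1], (2 * i - 1) * math.pi / 4, qpsk_point(i, 2.0))
         for i in range(1, 5)]
for dibit, phase, (s1, s2) in table:
    print(f"{dibit}  phase={phase:.4f} rad  s_i1={s1:+.3f}  s_i2={s2:+.3f}")
```

Under these assumptions the first bit of each dibit fixes the sign of $s_{i1}$ and the second bit fixes the sign of $s_{i2}$, with bit 1 mapped to the positive level.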
7.4.2.2 EXAMPLE 7.1
Figure 7.6 illustrates the sequences and waveforms involved in the generation of a QPSK signal. The input binary sequence 01101000 is shown in Figure 7.6a.
FIGURE 7.6: (a) Input binary sequence. (b) Odd-numbered bits of input sequence and associated binary PSK wave. (c) Even-numbered bits of input sequence and associated binary PSK wave. (d) QPSK waveform defined as $s(t) = s_{i1}\phi_1(t) + s_{i2}\phi_2(t)$.
7.4.2.3 Error Probability of QPSK
(7.19)
7.4.2.4 Generation and Detection of Coherent QPSK Signals
Consider next the generation and detection of QPSK signals. Figure 7.7a shows a block diagram of a typical QPSK transmitter. The incoming binary data sequence is first transformed into polar form by a non-return-to-zero level encoder. Thus, symbols 1 and 0 are represented by $+\sqrt{E_b}$ and $-\sqrt{E_b}$, respectively. This binary wave is next divided by means of a demultiplexer into two separate binary waves consisting of the odd- and even-numbered input bits. These two binary waves are denoted by $a_1(t)$ and $a_2(t)$. We note that in any signaling interval, the amplitudes of $a_1(t)$ and $a_2(t)$ equal $s_{i1}$ and $s_{i2}$, respectively, depending on the particular dibit that is being transmitted. The two binary waves $a_1(t)$ and $a_2(t)$ are used to modulate a pair of quadrature carriers or orthonormal basis functions:
$$\phi_1(t) = \sqrt{\frac{2}{T}}\cos(2\pi f_c t), \qquad \phi_2(t) = \sqrt{\frac{2}{T}}\sin(2\pi f_c t).$$
The result is a pair of binary PSK signals, which may be detected independently due to the orthogonality of $\phi_1(t)$ and $\phi_2(t)$. Finally, the two binary PSK signals are added to produce the desired QPSK signal.
FIGURE 7.7: Block diagrams of (a) QPSK transmitter and (b) coherent QPSK receiver.
The QPSK receiver consists of a pair of correlators with a common input, supplied with a locally generated pair of coherent reference signals $\phi_1(t)$ and $\phi_2(t)$, as in Figure 7.7b. The correlator outputs $x_1$ and $x_2$, produced in response to the received signal $x(t)$, are each compared with a threshold of zero. Finally, the binary sequences at the in-phase and quadrature channel outputs are combined in a multiplexer to reproduce the original binary sequence at the transmitter input with the minimum probability of symbol error in an AWGN channel.
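The demultiplex, threshold, and multiplex chain described above can be sketched at the signal-space level, without generating carrier waveforms. This is a simplified model under stated assumptions: the correlator outputs are taken to be the message-point coordinates plus Gaussian noise of variance $N_0/2$, the input has an even number of bits, and all function names are illustrative.

```python
import math
import random


def qpsk_transmit(bits, energy=2.0):
    """Demultiplex into odd/even-numbered bits and map each to +/- sqrt(E/2)."""
    a = math.sqrt(energy / 2.0)
    level = lambda b: a if b else -a
    # The 1st, 3rd, ... bits drive the in-phase channel; the 2nd, 4th, ... the quadrature one.
    return [(level(bits[k]), level(bits[k + 1])) for k in range(0, len(bits) - 1, 2)]


def awgn(points, n0, seed=0):
    """Add independent Gaussian noise of variance N0/2 to each coordinate."""
    rng = random.Random(seed)
    sigma = math.sqrt(n0 / 2.0)
    return [(x1 + rng.gauss(0.0, sigma), x2 + rng.gauss(0.0, sigma)) for x1, x2 in points]


def qpsk_receive(points):
    """Compare each correlator output with zero and multiplex the decisions."""
    bits = []
    for x1, x2 in points:
        bits.append(1 if x1 > 0 else 0)
        bits.append(1 if x2 > 0 else 0)
    return bits


tx_bits = [0, 1, 1, 0, 1, 0, 0, 0]            # the example sequence 01101000
tx_points = qpsk_transmit(tx_bits)            # E = 2, so each coordinate is +1 or -1
rx_bits = qpsk_receive(awgn(tx_points, n0=0.01))
```

Because the two channels are orthogonal, each coordinate is decided independently, exactly as two binary PSK receivers working in parallel.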
7.4.3 M-ARY PSK
QPSK is a special case of M-ary PSK, where the phase of the carrier takes on one of M possible values, namely $\theta_i = 2(i-1)\pi/M$, where i = 1, 2, ..., M. Accordingly, during each signaling interval of duration T, one of the M possible signals
$$s_i(t) = \sqrt{\frac{2E}{T}}\cos\left(2\pi f_c t + \frac{2\pi}{M}(i-1)\right), \quad i = 1, 2, \ldots, M \tag{7.20}$$
is sent, where E is the signal energy per symbol. The carrier frequency is $f_c = n_c/T$ for some fixed integer $n_c$.
Each $s_i(t)$ may be expanded in terms of the same two basis functions $\phi_1(t)$ and $\phi_2(t)$. The signal constellation of M-ary PSK is therefore two-dimensional. The M message points are equally spaced on a circle of radius $\sqrt{E}$ centered at the origin.
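$$P(y \mid b_0 = 0) = \frac{1}{\sqrt{2\pi\sigma^2}}\,e^{-\frac{(y_{re}+3)^2}{2\sigma^2}} + \frac{1}{\sqrt{2\pi\sigma^2}}\,e^{-\frac{(y_{re}+1)^2}{2\sigma^2}}.$$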
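When bit b0 is 1, the real part of the QAM constellation takes the values +1 or +3. The conditional probability given that b0 is one is
$$P(y \mid b_0 = 1) = \frac{1}{\sqrt{2\pi\sigma^2}}\,e^{-\frac{(y_{re}-1)^2}{2\sigma^2}} + \frac{1}{\sqrt{2\pi\sigma^2}}\,e^{-\frac{(y_{re}-3)^2}{2\sigma^2}}$$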
Soft bit for b1
The bit mapping for bit b1 with 16QAM Gray-coded mapping is shown below. We can see that when b1 toggles from 0 to 1, only the real part of the constellation is affected.
When b1 is zero, the real part of the QAM constellation takes the values -3 or +3. The conditional probability given that b1 is zero is
$$P(y \mid b_1 = 0) = \frac{1}{\sqrt{2\pi\sigma^2}}\,e^{-\frac{(y_{re}+3)^2}{2\sigma^2}} + \frac{1}{\sqrt{2\pi\sigma^2}}\,e^{-\frac{(y_{re}-3)^2}{2\sigma^2}}$$
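When b1 is 1, the real part of the QAM constellation takes the values -1 or +1. The conditional probability given that b1 is one is
$$P(y \mid b_1 = 1) = \frac{1}{\sqrt{2\pi\sigma^2}}\,e^{-\frac{(y_{re}+1)^2}{2\sigma^2}} + \frac{1}{\sqrt{2\pi\sigma^2}}\,e^{-\frac{(y_{re}-1)^2}{2\sigma^2}}$$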
Summary
The softbit for bit b0 is:
$$Sb(b_0) = \begin{cases} 2(y_{re}+1), & y_{re} < -2 \\ y_{re}, & -2 \le y_{re} < 2 \\ 2(y_{re}-1), & y_{re} \ge 2 \end{cases}$$
The softbit for bit b1 is:
$$Sb(b_1) = \begin{cases} y_{re}+2, & y_{re} \le 0 \\ -y_{re}+2, & y_{re} > 0 \end{cases}$$
The softbit for bit b1 can be simplified to:
$$Sb(b_1) = -|y_{re}| + 2 \quad \text{for all } y_{re}$$
It is easy to observe that the softbits for bits $b_2$ and $b_3$ are identical to the softbits for $b_0$ and $b_1$, respectively, except that the decisions are based on the imaginary component of the received vector, $y_{im}$.
The softbit for bit b2 is:
$$Sb(b_2) = \begin{cases} 2(y_{im}+1), & y_{im} < -2 \\ y_{im}, & -2 \le y_{im} < 2 \\ 2(y_{im}-1), & y_{im} \ge 2 \end{cases}$$
The softbit for bit b3 is:
$$Sb(b_3) = -|y_{im}| + 2 \quad \text{for all } y_{im}$$
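A further simplification avoids the need for a threshold check in the receiver for softbits b0 and b2, respectively. The outer segments are approximated by the inner one:
$$2(y_{re}+1) \approx y_{re} \quad \text{and} \quad 2(y_{im}+1) \approx y_{im},$$
and likewise for $2(y_{re}-1)$ and $2(y_{im}-1)$, so that $Sb(b_0) \approx y_{re}$ and $Sb(b_2) \approx y_{im}$ for all values. This simplification is described in [1].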
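The piecewise softbit expressions of the summary can be sketched directly in Python. This is a minimal illustration, assuming constellation levels of -3, -1, +1, +3 on each axis and a positive softbit meaning that the bit is more likely 1; the function names are hypothetical.

```python
def softbit_sign(y):
    """Softbit for b0 (real part) or b2 (imaginary part): three linear segments."""
    if y < -2:
        return 2 * (y + 1)
    if y < 2:
        return y
    return 2 * (y - 1)


def softbit_magnitude(y):
    """Softbit for b1 (real part) or b3 (imaginary part): -|y| + 2."""
    return 2 - abs(y)


def demap_16qam(y):
    """Return the softbits (b0, b1, b2, b3) for one received complex sample."""
    return (softbit_sign(y.real), softbit_magnitude(y.real),
            softbit_sign(y.imag), softbit_magnitude(y.imag))


def hard_bits(softbits):
    """Hard decision: a positive softbit is read as bit 1."""
    return tuple(1 if s > 0 else 0 for s in softbits)


soft = demap_16qam(complex(3.0, -1.0))   # noiseless point with real part +3, imaginary part -1
hard = hard_bits(soft)
```

For the noiseless point 3 - 1j the softbits are (4, -1, -1, 1): the large magnitude of the first value reflects the high reliability of a sample lying far from the decision boundary.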
Bibliography
[1] Paola Bisaglia and Filippo Tosato. Simplified soft-output demapper for binary interleaved COFDM with application to HIPERLAN/2. October 2001.
[2] Simon Haykin. Communication Systems. John Wiley and Sons, Inc., 2001.
[3] Jia Yin and Lang Tianyi. Application of soft demodulation in LTE physical layer downlink. 2011.
Chapter 8
MIMO
8.1 MIMO concepts and capacity
8.1.1 Introduction
Wireless system designers are faced with numerous challenges, including limited availability of radio frequency spectrum and transmission problems caused by such factors as fading and multipath distortion. Meanwhile, there is increasing demand for higher data rates, better quality of service, fewer dropped calls, and higher network capacity. Meeting these needs requires new techniques that improve spectral efficiency and the operational reliability of network links. Multiple-input multiple-output (MIMO) technology promises a cost-effective way to provide these capabilities. MIMO uses antenna arrays at both the transmitter and receiver. Algorithms in a radio chipset send information out over the antennas. The radio signals reflect off objects, creating multiple paths that in conventional radios cause interference and fading. MIMO, however, sends data over these multiple paths, thereby increasing the amount of information the system carries. The data is received by multiple antennas and recombined properly by other MIMO algorithms. This technology promises to let engineers scale up wireless bandwidth or increase transmission ranges. MIMO is an underlying technique for carrying data. It operates at the physical layer, below the protocols used to carry the data, so its channels can work with virtually any wireless transmission protocol. For example, MIMO can be used with the popular IEEE 802.11 (Wi-Fi) technology, and in the upcoming mobile generations and broadband solutions such as IEEE 802.16 (WiMAX) and Long Term Evolution (LTE).
Figure 8.1: CHANNEL IMPAIRMENTS
For these reasons, MIMO is expected to become the standard for carrying almost all wireless traffic and a core technology in wireless systems. It is regarded as one of the few economical ways to increase bandwidth and range. MIMO still must prove itself in large-scale, real-world implementations, and it must overcome several obstacles to its success, including energy consumption, cost, and competition from similar technologies.
8.1.2 WIRELESS CHANNEL IMPAIRMENTS:
a) Multipath fading (destructive interference): scattering due to different obstacles (Figure 8.1).
b) Shadowing: communication blocked by obstacles (Figure 8.2).
c) Interference (Figure 8.3).
8.1.3 What is MIMO
MIMO is an acronym that stands for Multiple Input Multiple Output. It is an antenna technology that is used in both transmitter and receiver equipment for wireless radio communication, to improve communication performance. It is one of several forms of smart antenna technology.
Figure 8.2: Shadowing
Figure 8.3: Interference
Why is MIMO a key feature in modern wireless communication systems?
There are many reasons why MIMO is expected to become a core technology in wireless systems. Some of them are listed here. The MIMO technique is able to:
• Exploit multipath by taking advantage of random fading. The main impairment to the performance of wireless communication systems is fading due to multipath and interference.
• Achieve very high spectral efficiency, which makes it a good solution to limited bandwidth availability.
• Reduce system power consumption, as it increases system capacity and reliability without consuming excessive power.
• Increase the system capacity so that it can support a large number of users.
• Increase the system throughput, as it can support high data rates.
• Increase both the quality of service and the revenues significantly.
Given these reasons, the aim of this section is to provide a complete and concise overview of this promising technique.
8.1.4 MIMO vs. Channel Capacity
Channel capacity: the maximum possible transmission rate such that the probability of error is small. Multipath propagation has long been regarded as an impairment because it causes signal fading. To mitigate this problem, diversity techniques were developed; antenna diversity is a widespread form of diversity. Recent research has shown that multipath propagation can in fact contribute to capacity.
There are a number of different MIMO configurations or formats that can be used. These are termed SISO, SIMO, MISO and MIMO. These different MIMO formats offer different advantages and disadvantages, which can be balanced to provide the optimum solution for any given application.
8.1.5 SISO, SIMO, MISO and MIMO terminology
The different forms of antenna technology refer to single or multiple inputs and outputs. These are related to the radio link. The input is the transmitter, as it transmits into the link or signal path, and the output is the receiver, which is at the output of the wireless link. Therefore the different forms of single / multiple antenna links are defined as below:
SISO - Single Input Single Output.
SIMO - Single Input Multiple Output.
MISO - Multiple Input Single Output.
MIMO - Multiple Input Multiple Output.
The term MU-MIMO is also used for a multiple-user version of MIMO, as described below:
SISO The simplest form of radio link can be defined in MIMO terms as SISO - Single Input Single Output. This is effectively a standard radio channel: the transmitter operates with one antenna, as does the receiver. There is no diversity and no additional processing is required (Figure 8.4).
The advantage of a SISO system is its simplicity. SISO requires no processing in terms of the various forms of diversity that may be used. However, the SISO channel is limited in its performance. Interference and fading will impact the system more than a MIMO system using some form of diversity, and the throughput is limited by Shannon's law, being dependent upon the channel bandwidth and the signal-to-noise ratio. The channel capacity of this form can be calculated by the Shannon formula:
$$C = B\log_2\left(1 + \frac{S}{N}\right) \text{ bit/s}$$
Figure 8.4: SISO
Figure 8.5: SIMO
SIMO (receive diversity) The SIMO or Single Input Multiple Output version of MIMO occurs where the transmitter has a single antenna and the receiver has multiple antennas. This is also known as receive diversity. It is often used to enable a receiver system that receives signals from a number of independent sources to combat the effects of fading. It has been used for many years with short wave listening / receiving stations to combat the effects of ionospheric fading and interference (Figure 8.5).
SIMO has the advantage that it is relatively easy to implement, although it does have the disadvantage that processing is required in the receiver. The use of SIMO may be quite acceptable in many applications, but where the receiver is located in a mobile device such as a cell phone handset, the level of processing may be limited by size, cost and battery drain.
In this case the transmitter has a single antenna. This form increases the channel capacity without changing the bandwidth. With n receive antennas, the capacity is:
$$C = B\log_2\left(1 + n\,\frac{S}{N}\right) \text{ bit/s}$$
For example, if n = 2 (two receive antennas), B = 5 MHz and S/N = 100, then in a SISO system C = 33.3 Mb/s, while in a SIMO system C = 38.3 Mb/s. The capacity is only slightly larger, but SIMO has other benefits, such as reduced fading (diversity gain).
MISO (transmit diversity) MISO is also termed transmit diversity. In this case, the same data is transmitted redundantly from the two transmitter antennas. The receiver is then able to receive the optimum signal, which it can then use to extract the required data (Figure 8.6).
Figure 8.6: MISO
Figure 8.7: MIMO
MIMO Where there is more than one antenna at either end of the radio link, this is termed MIMO - Multiple Input Multiple Output. MIMO can be used to provide improvements in both channel robustness and channel throughput (Figure 8.7).
$$C = B\log_2\left(1 + n_T\,n_R\,\frac{S}{N}\right) \text{ bit/s}$$
$n_T$: number of transmitter antennas
$n_R$: number of receiver antennas
For the above example, with $n_T = n_R = 2$, C is approximately 43.2 Mb/s. However, when the signal is coded using techniques called space-time coding, the capacity becomes
$$C = \min(n_T, n_R)\,B\log_2\left(1 + \frac{S}{N}\right) \text{ bit/s}$$
where $\min(n_T, n_R)$ is the minimum of $n_T$ and $n_R$. This gives C = 66.6 Mb/s, which is much better. With 3x3 or 4x4 antennas, C increases further. MIMO is divided into single-user and multi-user MIMO:
MIMO single-user (MIMO-SU): shown in Figure 8.8.
MIMO multi-user (MIMO-MU): the main difference from the single-user MIMO system is that there are many receivers, each of which has one antenna (Figure 8.9).
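The capacity figures quoted in this section can be reproduced with a short Python sketch. It simply evaluates the four formulas given above for B = 5 MHz and S/N = 100 (a linear ratio, not dB); the function names are illustrative, and bandwidth is entered in MHz so that the result is in Mb/s.

```python
import math


def capacity_siso(b_mhz, snr):
    """C = B log2(1 + S/N)."""
    return b_mhz * math.log2(1 + snr)


def capacity_simo(b_mhz, snr, n):
    """C = B log2(1 + n S/N), with n receive antennas."""
    return b_mhz * math.log2(1 + n * snr)


def capacity_mimo(b_mhz, snr, n_t, n_r):
    """C = B log2(1 + nT nR S/N)."""
    return b_mhz * math.log2(1 + n_t * n_r * snr)


def capacity_mimo_coded(b_mhz, snr, n_t, n_r):
    """C = min(nT, nR) B log2(1 + S/N)."""
    return min(n_t, n_r) * b_mhz * math.log2(1 + snr)


B, SNR = 5.0, 100.0
c_siso = capacity_siso(B, SNR)                  # about 33.3 Mb/s
c_simo = capacity_simo(B, SNR, 2)               # about 38.3 Mb/s
c_mimo = capacity_mimo(B, SNR, 2, 2)            # about 43.2 Mb/s
c_coded = capacity_mimo_coded(B, SNR, 2, 2)     # about 66.6 Mb/s
```

The comparison shows why the last form is the attractive one: multiplying the signal-to-noise ratio inside the logarithm adds only a few Mb/s, whereas the factor $\min(n_T, n_R)$ outside the logarithm scales the capacity linearly.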
Figure 8.8: MIMO single-user
Figure 8.9: MIMO multi-user
Figure 8.10: table 1
8.2 Diversity
Diversity means sending the same data over independent fading paths. These independent paths are combined in some way such that the fading of the resultant signal is reduced, so the receiver has several copies of the signal. Because the copies are sent over different independent paths, the probability that two paths undergo deep fading at the same time is very small; how small depends on how strongly the two paths depend on each other.
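The benefit of independent paths can be illustrated with a small calculation. The sketch below assumes Rayleigh fading (received power exponentially distributed) and fully independent branches, neither of which is stated in the text; under those assumptions the probability that every branch is in a deep fade at once is the single-branch probability raised to the number of branches. The function names are illustrative.

```python
import math


def deep_fade_probability(threshold_db):
    """P(branch power < threshold), threshold in dB relative to the mean power.

    Assumes Rayleigh fading, i.e. exponentially distributed received power.
    """
    threshold = 10 ** (threshold_db / 10)
    return 1 - math.exp(-threshold)


def all_branches_fade(threshold_db, branches):
    """Probability that all independent branches are below the threshold together."""
    return deep_fade_probability(threshold_db) ** branches


p_one = all_branches_fade(-10.0, 1)    # single path, 10 dB below the mean
p_two = all_branches_fade(-10.0, 2)    # two independent paths
p_four = all_branches_fade(-10.0, 4)   # four independent paths
```

For a fade 10 dB below the mean power, one branch fades with a probability of roughly 9.5 percent, while two independent branches fade together with a probability below 1 percent. Correlated branches would give a smaller improvement.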
8.2.1 Types of diversity:
1. Time diversity: Time diversity is achieved by transmitting the same signal at different times, where the time difference is greater than the channel coherence time (the inverse of the channel Doppler spread). Time diversity does not require increased transmit power, but it does decrease the data rate, since data is repeated in the diversity time slots rather than sending new data in these time slots. Time diversity can also be achieved through coding and interleaving.
2. Frequency diversity: The separation between carriers should be at least the coherence bandwidth $(\Delta f)_c$, which guarantees that the fading statistics for different frequencies are essentially uncorrelated (different copies undergo independent fading). The coherence bandwidth is different for different propagation environments. Like time diversity, frequency diversity induces a loss in bandwidth efficiency due to the redundancy introduced in the frequency domain (Figures 8.11 and 8.12).
Figure 8.11: Frequency diversity vs. time at one slot
Figure 8.12: Frequency diversity vs. time at two slots
3. Polarization diversity: It uses either two transmit antennas or two receive antennas with different polarization (e.g. vertically and horizontally polarized waves). Polarization diversity has two disadvantages. First, there can be at most two diversity branches, corresponding to the two types of polarization. Second, polarization diversity effectively loses half the power (3 dB), since the transmit or receive power is divided between the two differently polarized antennas.
4. Delay diversity: A radio channel subject to time dispersion, with the transmitted signal propagating to the receiver via multiple, independently fading paths with different delays, provides the possibility for multipath diversity or, equivalently, frequency diversity. Thus multipath propagation is actually beneficial in terms of radio-link performance, assuming that the amount of multipath propagation is not too extensive and that the transmission scheme includes tools to counteract signal corruption due to the radio-channel frequency selectivity, for example, by means of OFDM transmission or the use of advanced receiver-side equalization. If the channel in itself is not time dispersive, the availability of multiple transmit antennas can be used to create artificial time dispersion or, equivalently, artificial frequency selectivity by transmitting identical signals with different relative delays from the different antennas. In this way, the antenna diversity, i.e. the fact that the fading experienced by the different antennas has low mutual correlation, can be transformed into frequency diversity. This kind of delay diversity is illustrated in Figure 8.13 for the special case of two transmit antennas. The relative delay should be selected to ensure a suitable amount of frequency selectivity over the bandwidth of the signal to be transmitted. It should be noted that, although Figure 8.13 assumes two transmit antennas, delay diversity can straightforwardly be extended to more than two transmit antennas with different relative delays for each antenna. Delay diversity is in essence invisible to the mobile terminal, which will simply see a single radio channel subject to additional time dispersion. Delay diversity can thus straightforwardly be introduced in an existing mobile-communication system without requiring any specific support in a corresponding radio-interface standard. Delay diversity is also applicable to basically any kind of transmission scheme that is designed to handle and benefit from frequency-selective fading, including, for example, WCDMA and CDMA2000.
5. Cyclic-delay diversity: Cyclic-Delay Diversity (CDD) is similar to
delay diversity, with the main difference that cyclic-delay diversity
operates block-wise and applies cyclic shifts, rather than linear delays,
to the different antennas (see Figure 8.14). Thus cyclic-delay diversity
is applicable to block-based transmission schemes such as OFDM and
DFTS-OFDM. In the case of OFDM transmission, a cyclic shift of the
time-domain signal corresponds to a frequency-dependent phase shift before
OFDM modulation, as illustrated in Figure 8.14b. Similar to delay
diversity, this will create artificial frequency selectivity as seen by
the receiver. Also similar to delay diversity, CDD can straightforwardly
be extended to more than two transmit antennas with different cyclic
shifts for each antenna.
Figure 8.13: Two-Antenna Delay Diversity
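The equivalence between a cyclic shift in time and a frequency-dependent
phase shift can be checked numerically. The following is a minimal sketch
using only the Python standard library; the block contents, the shift of
three samples, and the function names are illustrative assumptions, not
part of any standard.

```python
import cmath

def dft(x):
    """Naive O(N^2) discrete Fourier transform of a list of samples."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]

def cyclic_shift(x, d):
    """Delay the block x by d samples, wrapping around the block edge."""
    n = len(x)
    return [x[(t - d) % n] for t in range(n)]

def phase_ramp(spectrum, d):
    """Multiply subcarrier k by exp(-j*2*pi*k*d/N), a frequency-dependent phase shift."""
    n = len(spectrum)
    return [spectrum[k] * cmath.exp(-2j * cmath.pi * k * d / n) for k in range(n)]

# Hypothetical time-domain block of 8 samples and a cyclic delay of 3 samples.
block = [1 + 1j, -1 + 1j, 1 - 1j, -1 - 1j, 1 + 1j, 1 - 1j, -1 + 1j, -1 - 1j]
shift = 3

shifted_then_transformed = dft(cyclic_shift(block, shift))
transformed_then_rotated = phase_ramp(dft(block), shift)

# The two spectra agree up to floating-point rounding.
max_err = max(abs(a - b)
              for a, b in zip(shifted_then_transformed, transformed_then_rotated))
print(max_err < 1e-9)
```

Because the two spectra coincide, a transmitter can apply CDD on the
second antenna either as a cyclic shift after the IFFT or as a
per-subcarrier phase rotation before it.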
6. Space Diversity: The signal is transferred over several different
propagation paths. In the case of wired transmission, this can be achieved
by transmitting via multiple wires. In the case of wireless transmission,
it can be achieved by antenna diversity using multiple transmitter
antennas (transmit diversity) and/or multiple receiving antennas
(reception diversity). With reception diversity, a diversity combining
technique is applied before further signal processing takes place. The
multiple antennas are separated physically by a proper distance so that
the individual signals are uncorrelated. The separation requirements vary
with antenna height, propagation environment and frequency. Typically a
separation of a few wavelengths is enough to obtain uncorrelated signals.
In space diversity, the replicas of the transmitted signals are usually
provided to the receiver in the form of redundancy in the space domain.
Unlike time and frequency diversity, space diversity does not induce any
loss in bandwidth efficiency. This property is very attractive for future
high data rate wireless communications. If the antennas are far apart, for
example at different cellular base station sites or WLAN access points,
this is called macrodiversity. If the antennas are at a distance in the
order of one wavelength, this is called microdiversity. A special case is
phased antenna arrays, which can also be used for beamforming, MIMO
channels and space-time coding (STC).
Figure 8.14: Two-Antenna Cyclic Delay Diversity
8.2.2 Receive Diversity:
It is also called SIMO (single input, multiple output), as multiple
antennas are used at the receiver, as shown in Figure 8.15.
Receive diversity is most often used in the uplink. Here, the base station
uses two antennas to pick up two copies of the received signal. The
signals reach the receive antennas with different phase shifts, but these
can be removed by antenna-specific channel estimation (Figure 8.16). The
base station can then add the signals together in phase, without any risk
of destructive interference between them. The signals are both made up
from several smaller rays, so they are both subject to fading. If the two
individual signals undergo fades at the same time, then the power of the
combined signal will be low. But if the antennas are far enough apart (a
few wavelengths of the carrier frequency), then the two sets of fading
geometries will be very different, so the signals will be far more likely
to undergo fades at completely different times. We have therefore reduced
the amount of fading in the combined signal, which in turn reduces the
error rate. Base stations usually have more than one receive antenna. In
LTE, the mobile's test specifications assume that the mobile is using two
receive antennas, so LTE systems are expected to use receive diversity on
the downlink as well as the uplink. A mobile's antennas are closer
together than a base station's, which reduces the benefit of receive
diversity, but the situation can often be improved using antennas that
measure two independent polarizations of the incoming signal.
Figure 8.15: Receive Diversity
Figure 8.16: Main Idea of Receive Diversity
How, then, does the receiver recover the signal from the multiple copies
that reach it? The answer is to use one of the diversity combining
techniques, of which there are several types:
1. Selective Combining (SC): Here the receiver has several diversity
branches and takes the information only from the branch with the largest
signal-to-noise ratio (SNR). This technique is impractical for continuous
transmission systems, since all the diversity branches must be monitored
in order to select the one with the largest SNR. Moreover, since only one
branch output is used, co-phasing of multiple branches is not required, so
this technique can be used with either coherent or differential modulation
(Figures 8.17 and 8.18).
Figure 8.17: Selective Combining
Figure 8.18: Branch Selective Diversity
2. Threshold Combining: A simpler type of combining, called threshold
combining, avoids the need for a dedicated receiver on each branch by
scanning each of the branches in sequential order and outputting the first
signal with SNR above a given threshold. Once a branch is chosen, as long
as the SNR on that branch remains above the desired threshold, the
combiner outputs that signal. If the SNR on the selected branch falls
below the threshold, the combiner switches to another branch.
As in SC, since only one branch output is used at a time, co-phasing is
not required. Thus, this technique can be used with either coherent or
differential modulation. There are several criteria the combiner can use
to decide which branch to switch to; the simplest is to switch randomly to
another branch (Figure 8.19).
Figure 8.19: Threshold Combining
3. Equal Gain Combining: Maximal ratio combining (MRC, item 5 in this
list) requires knowledge of the time-varying SNR on each branch, which can
be very difficult to measure. A simpler technique is equal-gain combining,
which co-phases the signals on each branch and then combines them with
equal weighting. This technique needs channel estimation of the phase
only, not of the envelope. The combiner's output can be written as:
4. Switched Diversity Combining (SDC): When the signal quality of the
branch in use is good, there is no need to look for or use other branches;
other branches are needed only when the signal quality decreases. Two
strategies are used:
Switch-and-examine strategy: The receiver stays with the current signal
branch until the envelope drops below a predefined threshold (Figure 8.20).
Switch-and-stay strategy: The receiver switches to the strongest of the
M-1 other signals only if its level exceeds the threshold. This results in
fewer signal discontinuities (Figure 8.21).
Figure 8.20: Switch-and-examine strategy
Figure 8.21: Switch-and-stay strategy
5. Maximal Ratio Combining: The idea of MRC is that branches with better
signal energy should be enhanced, whereas branches with lower SNRs should
be given lower weights. In maximal ratio combining (MRC) the output is a
weighted sum of all branches, with each branch weighted according to its
SNR. It is the optimal technique because it maximizes the output SNR. The
combiner's output can be written as:
The combiner chooses the weights to be the conjugates of the channel
gains, so in this technique the channel must first be estimated
(Figure 8.22).
Figure 8.22: Maximal Ratio Combining
At a given time, a signal s_0 is sent from the transmitter. The channel,
including the effects of the transmit chain, the air link, and the receive
chain, may be modeled by a complex multiplicative distortion composed of a
magnitude response and a phase response. The channel between the transmit
antenna and receive antenna zero is denoted by h_0, and the channel
between the transmit antenna and receive antenna one is denoted by h_1,
where
    h_0 = \alpha_0 e^{j\theta_0},  h_1 = \alpha_1 e^{j\theta_1}    (1)
Noise and interference are added at the two receivers. The resulting
received baseband signals are
    r_0 = h_0 s_0 + n_0,  r_1 = h_1 s_0 + n_1    (2)
where n_0 and n_1 represent complex noise and interference. Assuming n_0
and n_1 are Gaussian distributed, the maximum likelihood decision rule at
the receiver for these received signals is to choose signal s_i if and
only if
    d^2(r_0, h_0 s_i) + d^2(r_1, h_1 s_i) \le d^2(r_0, h_0 s_k) + d^2(r_1, h_1 s_k),  \forall i \ne k    (3)
where d^2(x, y) is the squared Euclidean distance between signals x and y,
calculated by the following expression:
    d^2(x, y) = (x - y)(x^* - y^*)    (4)
We now combine the two incoming signals r_0 and r_1 in order to benefit
from the multipath. Here we use maximal-ratio receiver combining (MRRC),
as stated before. The receiver combining scheme for two-branch MRRC is as
follows:
    \tilde{s}_0 = h_0^* r_0 + h_1^* r_1 = (\alpha_0^2 + \alpha_1^2) s_0 + h_0^* n_0 + h_1^* n_1    (5)
Expanding (3) and using (4) and (5), we get: choose s_i as the detected
symbol if and only if
    (\alpha_0^2 + \alpha_1^2 - 1)|s_i|^2 + d^2(\tilde{s}_0, s_i) \le (\alpha_0^2 + \alpha_1^2 - 1)|s_k|^2 + d^2(\tilde{s}_0, s_k),  \forall i \ne k    (6)
But if we are using QPSK or another PSK modulation, all constellation
points have the same magnitude, so the energies |s_i|^2 are equal:
    |s_i|^2 = |s_k|^2 = E_s,  \forall i, k    (7)
where E_s is the energy of the signal. Therefore, for PSK signals, the
decision rule in (6) may be simplified to: choose s_i if and only if
    d^2(\tilde{s}_0, s_i) \le d^2(\tilde{s}_0, s_k),  \forall i \ne k    (8)
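The combining techniques described above can be compared on a single
received symbol. The following is a minimal sketch in standard-library
Python, assuming perfect channel knowledge, equal noise power on both
branches (so that channel magnitude can stand in for branch SNR), and
hypothetical channel and noise values; the function names are
illustrative only.

```python
import cmath

def mrc_combine(received, channels):
    """Maximal ratio combining: weight each branch by the conjugate channel gain."""
    return sum(h.conjugate() * r for h, r in zip(channels, received))

def egc_combine(received, channels):
    """Equal gain combining: remove each branch's phase, then add with unit weights."""
    return sum(cmath.exp(-1j * cmath.phase(h)) * r
               for h, r in zip(channels, received))

def selection_combine(received, channels):
    """Selection combining: use only the strongest branch.

    Assumes equal noise power per branch, so the largest |h| means the largest SNR.
    The chosen branch is co-phased here only because detection below is coherent.
    """
    best = max(range(len(channels)), key=lambda i: abs(channels[i]))
    return cmath.exp(-1j * cmath.phase(channels[best])) * received[best]

def detect_psk(statistic, constellation):
    """Minimum-distance decision; for PSK a positive scaling of the statistic is harmless."""
    return min(constellation, key=lambda s: abs(statistic - s) ** 2)

# Hypothetical two-branch example with QPSK.
QPSK = [cmath.exp(1j * (cmath.pi / 4 + k * cmath.pi / 2)) for k in range(4)]
s0 = QPSK[0]
channels = [0.8 * cmath.exp(0.7j), 0.3 * cmath.exp(-2.1j)]
noise = [0.05 - 0.02j, -0.03 + 0.04j]
received = [h * s0 + n for h, n in zip(channels, noise)]

results = {
    "MRC": detect_psk(mrc_combine(received, channels), QPSK),
    "EGC": detect_psk(egc_combine(received, channels), QPSK),
    "SC": detect_psk(selection_combine(received, channels), QPSK),
}
print(all(decision == s0 for decision in results.values()))
```

In the noise-free case the MRC output equals the transmitted symbol
scaled by the sum of the squared channel magnitudes, which is why the
weak branch still contributes usefully instead of being discarded.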
Detection
After combining the received signals at the receiver, it is time to detect
the symbols that were sent from the transmitter, whether single or
multiple antennas are used at the transmitter. There are two main types of
detectors:
1. Maximum A Posteriori (MAP): This is the optimum detector. It evaluates
all the possibilities for the incoming data and chooses the one with the
highest probability.
Example: If we are using BPSK
where S_i is the transmitted signal (1 or -1) at time instant i, Y_i is
the received signal at the receiver, and S is the estimated output from
the MAP estimator. If the probability that the transmitted symbol is 1
given the received signal is greater than the probability that the
transmitted symbol is -1 given the received signal, then the estimated
output is 1, and vice versa. From the chain rule,
P(S|Y) P(Y) = P(Y|S) P(S), where P(S_i) is the prior, i.e. the probability
of the transmitted symbol, e.g. P(S_i = 0), which is difficult for the
receiver to obtain.
2. Maximum Likelihood Detector (MLD): It is based on the same idea as MAP;
the only difference is that it neglects the priors, as they are difficult
to obtain and take a long time to estimate.
In case of AWGN
Until now we have not considered the effect of the channel. After adding
the effect of the channel, the detection equation changes slightly: the
detector now compares Y_i with S_i h_i rather than with S_i, so channel
estimation must be performed first.
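The difference between the two detectors can be illustrated with BPSK. The
following is a minimal sketch, assuming complex Gaussian noise of known
variance and a known channel gain h; the priors, the noise variance, and
the function names are hypothetical values chosen for illustration.

```python
import math

def ml_detect(y, h, symbols):
    """Maximum likelihood: pick the symbol s minimizing |y - h*s|^2 (priors ignored)."""
    return min(symbols, key=lambda s: abs(y - h * s) ** 2)

def map_detect(y, h, priors, noise_var):
    """Maximum a posteriori: maximize log P(s) - |y - h*s|^2 / noise_var.

    priors maps each candidate symbol to its (nonzero) prior probability;
    noise_var is the assumed total variance of the complex Gaussian noise.
    """
    return max(priors,
               key=lambda s: math.log(priors[s]) - abs(y - h * s) ** 2 / noise_var)

# Hypothetical received sample slightly closer to -1 than to +1.
y, h = -0.1, 1.0

ml_decision = ml_detect(y, h, (1, -1))
map_equal = map_detect(y, h, {1: 0.5, -1: 0.5}, noise_var=1.0)
map_skewed = map_detect(y, h, {1: 0.9, -1: 0.1}, noise_var=1.0)

print(ml_decision, map_equal, map_skewed)
```

With equal priors the MAP decision coincides with the ML decision. With a
strong prior in favour of +1, the MAP detector overrides the small
distance advantage of -1, which is the behaviour the ML detector gives up
by neglecting the priors.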
Transmit Diversity
Introduction: Here, we present space-time block codes and evaluate their
performance on MIMO fading channels. We first introduce the Alamouti code,
which is a simple two-branch transmit diversity scheme. The key feature of
the scheme is that it achieves a full diversity gain with a simple
maximum-likelihood decoding algorithm. We also present space-time block
codes with a large number of transmit antennas based on orthogonal
designs. The decoding algorithms for space-time block codes with both real
and complex signal constellations are discussed (Figure 8.23). The
performance of the schemes on MIMO fading channels under various channel
conditions is evaluated by simulations.
Figure 8.23: Transmit Diversity
Space-Time Codes: Space-time codes (STCs) provide a new paradigm for
transmission over Rayleigh fading channels using multiple transmit
antennas. They are a method employed to improve the reliability of data
transmission in wireless communication systems using multiple transmit
antennas. STCs rely on transmitting multiple, redundant copies of a data
stream to the receiver in the hope that at least some of them may survive
the physical path between transmission and reception in a good enough
state to allow reliable decoding. In other words, they turn multipath
propagation into a benefit for the user. There are two types of STCs:
1. Space-Time Trellis Coding: Space-time trellis codes (STTCs) have been
proposed, which combine signal processing at the receiver with coding
techniques appropriate to multiple transmit antennas and provide both
coding gain and diversity gain. Specific space-time trellis codes designed
for two to four transmit antennas perform extremely well in slow fading
environments (typical of indoor transmission) and come within 2-3 dB of
the outage capacity. The bandwidth efficiency is about three to four times
that of current systems.
2. Space-Time Block Codes: Space-time coding is a general term used to
indicate multi-antenna transmission schemes where modulation symbols are
mapped in the time and spatial (transmit-antenna) domain to capture the
diversity offered by the multiple transmit antennas. Two-antenna
space-time block coding (STBC), more specifically a scheme referred to as
Space-Time Transmit Diversity (STTD), has been part of the 3G WCDMA
standard since its first release (Figure 8.24).
Figure 8.24: Space-Time Block
STTD operates on pairs of modulation symbols. The modulation symbols are
directly transmitted on the first antenna. However, on the second antenna
the order of the modulation symbols within a pair is reversed.
Furthermore, the modulation symbols are sign-reversed and
complex-conjugated. In vector notation, STTD transmission can be expressed
as:
The two-antenna space-time coding can be said to be of rate one, implying
that the input symbol rate is the same as the symbol rate at each antenna,
corresponding to a bandwidth utilization of 1. Space-time coding can also
be extended to more than two antennas. However, in the case of
complex-valued modulation, such as QPSK or 16/64QAM, space-time codes of
rate one without any inter-symbol interference (orthogonal space-time
codes) only exist for two antennas. If inter-symbol interference is to be
avoided in the case of more than two antennas, space-time codes with rate
less than one must be used, corresponding to reduced bandwidth
utilization. Space-Time Block Codes (STBCs) act on a block of data at once
(similarly to linear block codes) and provide only diversity gain, but are
much less complex in implementation terms than STTCs. The space-time codes
provide the best possible trade-off between constellation size, data rate,
diversity advantage, and trellis complexity. We will focus on this type in
our study.
Space-Frequency Block Codes: Space-frequency block coding (SFBC) is
similar to space-time block coding, with the difference that the encoding
is carried out in the antenna/frequency domains rather than in the
antenna/time domains. Thus, space-frequency coding is applicable to OFDM
and other frequency-domain transmission schemes. The space-frequency
equivalent of STTD (which could also be referred to as Space-Frequency
Transmit Diversity, SFTD) is illustrated in Figure 8.25.
As can be seen, the block of (frequency-domain) modulation symbols
a0, a1, a2, a3, ... is directly mapped to OFDM carriers of the first
antenna, while the block of symbols -a1*, a0*, -a3*, a2* is mapped to the
corresponding subcarriers of the second antenna.
Similar to space-time coding, the drawback of space-frequency coding is
that there is no straightforward extension to more than two antennas
unless a rate reduction is acceptable.
The difference between SFBC and two-antenna cyclic-delay diversity in
essence lies in how the block of frequency-domain modulation symbols is
mapped to the second antenna. The benefit of SFBC compared to CDD is that
SFBC provides diversity at modulation-symbol level while CDD, in the case
of OFDM, must rely on channel coding in combination with frequency-domain
interleaving to provide diversity (Figure 8.26).
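The subcarrier mapping described above can be written out directly. The
following is a minimal sketch, assuming an even number of modulation
symbols per block; the function name sfbc_map and the example symbols are
illustrative, not taken from any specification.

```python
def sfbc_map(symbols):
    """Map a block of frequency-domain symbols onto two antennas.

    Antenna 1 carries a0, a1, a2, a3, ... unchanged.
    Antenna 2 carries -a1*, a0*, -a3*, a2*, ... on the same subcarriers.
    """
    if len(symbols) % 2 != 0:
        raise ValueError("SFBC operates on pairs of symbols")
    antenna1 = list(symbols)
    antenna2 = []
    for i in range(0, len(symbols), 2):
        a_even, a_odd = symbols[i], symbols[i + 1]
        antenna2.extend([-a_odd.conjugate(), a_even.conjugate()])
    return antenna1, antenna2

# Hypothetical block of four QPSK-like symbols.
block = [1 + 1j, 1 - 1j, -1 + 1j, -1 - 1j]
ant1, ant2 = sfbc_map(block)

# Within each subcarrier pair the two antenna sequences are orthogonal.
pair_products = [ant1[i] * ant2[i].conjugate() + ant1[i + 1] * ant2[i + 1].conjugate()
                 for i in range(0, len(block), 2)]
print(ant2)
print(pair_products)
```

The zero inner product within each pair is what lets the receiver
separate the two symbols with linear processing, provided the channel is
roughly the same on the two adjacent subcarriers.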
System Block Diagram: For real constellations, STBCs provide the maximum
possible transmission rate allowed. For complex constellations, space-time
block codes can be constructed for any number of transmit antennas, and
again these codes have remarkably simple decoding algorithms based only on
linear processing at the receiver. They provide full spatial diversity and
half of the maximum possible transmission rate allowed by the theory of
space-time coding. Alamouti discovered a remarkable scheme for
transmission using two transmit antennas (Figure 8.27). Space-time block
coding generalizes the transmission scheme discovered by Alamouti to an
arbitrary number of transmit antennas and is able to achieve the full
diversity promised by the transmit and receive antennas.
Figure 8.25: Space-Frequency Block
Figure 8.26: Transmit Diversity Principle
Figure 8.27
Figure 8.28
Alamouti method (delay diversity method):
(a) Closed Loop Transmit Diversity: Here, the transmitter sends two copies
of the signal in the expected way, but it also applies a phase shift to
one or both signals before transmission. By doing this, it can ensure that
the two signals reach the receiver in phase, without any risk of
destructive interference. The phase shift is determined by a precoding
matrix indicator (PMI), which is calculated by the receiver and fed back
to the transmitter. A simple PMI might indicate two options: either
transmit both signals without any phase shifts, or transmit the second
with a phase shift of 180°. If the first option leads to destructive
interference, then the second will automatically work. Once again, the
amplitude of the combined signal is only low in the unlikely event that
the two received signals undergo fades at the same time.
The phase shifts introduced by the radio channel depend on the wavelength
of the carrier signal and hence on its frequency. This implies that the
best choice of PMI is a function of frequency as well. However, this is
easily handled in an OFDMA system, as the receiver can feed back different
PMI values for different sets of subcarriers. The best choice of PMI also
depends on the position of the mobile, so a fast-moving mobile will have a
PMI that changes frequently. Unfortunately, the feedback loop introduces
time delays into the system, so in the case of fast-moving mobiles, the
PMI may be out of date by the time it is used (Figure 8.28). For this
reason, closed loop transmit diversity is only suitable for mobiles that
are moving sufficiently slowly. For fast-moving mobiles, it is better to
use the open loop technique described below.
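The two-option PMI choice can be sketched as follows. This is a minimal
illustration, assuming the receiver knows the two channel gains and that
the PMI is simply an index into a hypothetical list of phase factors
(+1 for no shift, -1 for a 180° shift); it is not the codebook of any
particular standard.

```python
def choose_pmi(h1, h2, phase_options=(1, -1)):
    """Return (index, combined amplitude) of the phase option that maximizes |h1 + w*h2|.

    h1 and h2 are the complex channel gains from the two transmit antennas.
    The returned index plays the role of the PMI fed back to the transmitter.
    """
    gains = [abs(h1 + w * h2) for w in phase_options]
    best = max(range(len(gains)), key=lambda i: gains[i])
    return best, gains[best]

# Hypothetical channel where the two paths arrive almost in anti-phase.
h1, h2 = 1.0 + 0.0j, -0.9 + 0.1j
pmi, amplitude = choose_pmi(h1, h2)
print(pmi, round(amplitude, 3))
```

Without precoding the two signals nearly cancel; applying the 180° shift
to the second antenna makes them add almost coherently, which is why one
of the two options always avoids destructive interference.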
(b) Open Loop Transmit Diversity: Open loop transmit diversity is known as
Alamouti's technique.
The Alamouti scheme is historically the first space-time block code to
provide full transmit diversity for systems with two transmit antennas. In
this section, we present Alamouti's transmit diversity technique,
including encoding and decoding algorithms and its performance.
A) Two-Branch Transmit Diversity with one receiver: The figure below shows
the baseband representation of the Alamouti scheme with one receiver. In
the Alamouti scheme we transmit an encoded sequence, unlike MRRC, where
the message is sent directly. The encoding is done in space and time
(space-time coding). The encoding, however, may also be done in space and
frequency (Figure 8.29). The scheme uses two transmit antennas and one
receive antenna and may be defined by the following three functions:
The encoding and transmission sequence of information symbols at the
transmitter.
The combining scheme at the receiver.
The decision rule for maximum likelihood detection.
Figure 8.29: Two-Branch Transmit Diversity
Let us assume that an M-ary modulation scheme is used. In the Alamouti
space-time encoder, each group of m information bits is first modulated,
where m = log2(M). Then, the encoder takes a block of two modulated
symbols S0 and S1 in each encoding operation and maps them to the transmit
antennas according to a code matrix given by
Here, the transmitter uses two antennas to send two symbols, denoted s1
and s2, in two successive time steps. In the first step, the transmitter
sends s1 from the first antenna and s2 from the second, while in the
second step, it sends -s2* from the first antenna and s1* from the second.
(The symbol * indicates that the transmitter should change the sign of the
quadrature component, in a process known as complex conjugation.) It is
clear that the encoding is done in both the space and time domains. Let us
denote the transmit sequences from antennas one and two by S1 and S2,
respectively.
The key feature of the Alamouti scheme is that the transmit sequences from
the two transmit antennas are orthogonal, since the inner product of the
sequences S1 and S2 is zero, i.e.
    S1 · S2 = s1 s2* - s2* s1 = 0
Now we transmit the encoded symbols. The fading channel coefficients from
the first and second transmit antennas to the receive antenna at time t
are denoted by h0(t) and h1(t), respectively (Figure 8.30). Assuming that
the fading coefficients are constant across two consecutive symbol
transmission periods, they can be expressed as follows:
    h0(t) = h0(t+T) = h0
    h1(t) = h1(t+T) = h1
The receiver can now make two successive measurements of the received
signal, which correspond to two different combinations of s1 and s2. It
can then solve the resulting equations, so as to recover the two
transmitted symbols. There are only two requirements: the fading patterns
must stay roughly the same between the first time step and the second, and
the two signals must not undergo fades at the same time. Both requirements
are usually met.
Figure 8.30
At the receive antenna, the received signals over two consecutive symbol
periods, denoted by r0 and r1 for times t and t+T, respectively, can be
expressed as
    r0 = h0 s1 + h1 s2 + n0
    r1 = -h0 s2* + h1 s1* + n1
where n0 and n1 are independent complex variables with zero mean and power
spectral density N0/2 per dimension, representing additive white Gaussian
noise samples at times t and t+T, respectively.
Note that s1 and s2 cannot be read off directly from the two received
signals, since each one contains both symbols. They can, however, be
separated by simple linear processing.
Substituting the two equations, the maximum likelihood decoding can be
represented as
Thus, the maximum likelihood decoding rule (7) can be separated into two
independent decoding rules for S0 and S1, given by
Therefore, the decision rules in (10) can be further simplified to:
Figure 8.31: Two-Branch Transmit Diversity
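The three functions of the two-transmit, one-receive scheme (encoding,
combining, decision) can be sketched end to end. The following is a
minimal illustration in standard-library Python, assuming perfect channel
knowledge at the receiver, a channel that is constant over the two time
steps, QPSK symbols, and hypothetical channel and noise values; the
function names are illustrative only.

```python
import cmath

def alamouti_encode(s1, s2):
    """Return the (antenna 1, antenna 2) symbols for time steps t and t+T."""
    return (s1, s2), (-s2.conjugate(), s1.conjugate())

def receive(slots, h0, h1, noise=(0j, 0j)):
    """Single receive antenna; h0 and h1 are assumed constant over both steps."""
    (a_t, b_t), (a_next, b_next) = slots
    r0 = h0 * a_t + h1 * b_t + noise[0]
    r1 = h0 * a_next + h1 * b_next + noise[1]
    return r0, r1

def alamouti_combine(r0, r1, h0, h1):
    """Linear combining that separates the two symbols.

    Without noise, each output equals (|h0|^2 + |h1|^2) times one symbol.
    """
    s1_tilde = h0.conjugate() * r0 + h1 * r1.conjugate()
    s2_tilde = h1.conjugate() * r0 - h0 * r1.conjugate()
    return s1_tilde, s2_tilde

def nearest(statistic, constellation):
    """Minimum-distance decision, valid for equal-energy (PSK) constellations."""
    return min(constellation, key=lambda s: abs(statistic - s))

QPSK = [cmath.exp(1j * (cmath.pi / 4 + k * cmath.pi / 2)) for k in range(4)]
s1, s2 = QPSK[0], QPSK[3]
h0, h1 = 0.9 * cmath.exp(0.4j), 0.5 * cmath.exp(-1.3j)

r0, r1 = receive(alamouti_encode(s1, s2), h0, h1,
                 noise=(0.04 - 0.03j, -0.02 + 0.05j))
s1_tilde, s2_tilde = alamouti_combine(r0, r1, h0, h1)
detected = (nearest(s1_tilde, QPSK), nearest(s2_tilde, QPSK))
print(detected == (s1, s2))
```

The cross terms in s2 cancel out of s1_tilde (and vice versa) because of
the orthogonality of the two transmit sequences, so each symbol can be
decided independently of the other.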
B) Two-Branch transmit diversity with M receivers: There may be
applications where a higher order of diversity is needed and multiple
receive antennas at the remote units are feasible. In such cases, it is
possible to provide a diversity order of 2M with two transmit antennas and
M receive antennas (Figure 8.31).
The received signals at the two receive antennas:
There is no equivalent to Alamouti's technique for systems with more than
two antennas. Despite this, some extra diversity gain can still be
achieved in four-antenna systems, by swapping back and forth between the
two constituent antenna pairs. This technique is used for four-antenna
open loop diversity in LTE. We can combine open and closed loop transmit
diversity with the receive diversity techniques from earlier, giving a
system that carries out diversity processing using multiple antennas at
both the transmitter and the receiver. The technique is different from the
spatial multiplexing techniques that we will describe next, although, as
we will see, a spatial multiplexing system can fall back to diversity
transmission and reception if the conditions require.
Summary of Alamouti's scheme:
(a) Assumptions:
We have perfect channel knowledge at Rx.
Uncorrelated data streams (flat fading).
(b) Advantages
The transmissions are orthogonal. This implies that the RX antenna.
Simple maximum likelihood decoding algorithm based on linear processing of
received signals.
Open-loop transmit diversity scheme (no feedback from RX to TX, i.e. no
need for channel information).
No bandwidth expansion (as redundancy is applied in space across multiple
antennas, not in time or frequency).
Low-complexity decoders.
Identical to MRC if we doubled the total radiated power from that used in
MRC.
(c) Disadvantages
No coding gain, unlike space-time trellis codes.
Complexity of maximum likelihood detectors rises exponentially with the
number of transmit antennas.
Spatial interference.
8.3 Spatial multiplexing
8.3.1 Principles of Operation
Spatial multiplexing has a different purpose from diversity processing. If
the transmitter and receiver both have multiple antennas, then we can set
up multiple parallel data streams between them, to increase the data rate.
In a system with NT transmit and NR receive antennas, often known as an
NT × NR spatial multiplexing system, the peak data rate is proportional to
min(NT, NR). Figure 8.32 shows a basic spatial multiplexing system, in
which the transmitter and receiver both have two antennas. In the
transmitter, the antenna mapper takes symbols from the modulator two at a
time, and sends one symbol to each antenna. The antennas transmit the two
symbols simultaneously, so as to double the transmitted data rate. The
symbols travel to the receive antennas by way of four separate radio
paths, so the received signals can be written as follows:
y1 = H11 x1 + H12 x2 + n1
y2 = H21 x1 + H22 x2 + n2
Figure 8.32
Here, x1 and x2 are the signals sent from the two transmit antennas, y1
and y2 are the signals that arrive at the two receive antennas, and n1 and
n2 represent the received noise and interference. Hij expresses the way in
which the transmitted symbols are attenuated and phase-shifted, as they
travel to receive antenna i from transmit antenna j. (The subscripts i and
j may look the wrong way round, but this is for consistency with the usual
mathematical notation for matrices.) In general, all the terms in the
equation above are complex. In the transmitted and received symbols xj and
yi and the noise terms ni, the real and imaginary parts are the amplitudes
of the in-phase and quadrature components. Similarly, in each of the
channel elements Hij, the magnitude represents the attenuation of the
radio signal, while the phase represents the phase shift.
8.3.2 V-BLAST
Recent information theory research has shown that the rich-scattering
wireless channel is capable of enormous theoretical capacities if the
multipath is properly exploited.
Introduction
The diagonally-layered space-time architecture proposed by Foschini, now
known as diagonal BLAST (Bell Laboratories Layered Space-Time) or D-BLAST,
is one such approach. D-BLAST utilizes multi-element antenna arrays at
both transmitter and receiver and an elegant diagonally layered coding
structure in which code blocks are dispersed across diagonals in
space-time. In an independent Rayleigh scattering environment, this
processing structure leads to theoretical rates which grow linearly with
the number of antennas (assuming equal numbers of transmit and receive
antennas), with these rates approaching 90% of Shannon capacity. However,
the diagonal approach suffers from certain implementation complexities
which make it inappropriate for initial implementation.
System overview:
Operation
A single data stream is demultiplexed into M substreams. Each substream is
then encoded into symbols and fed to its respective transmitter. The
transmitters operate co-channel, and the symbols are synchronized. All use
the same QAM constellation. The transmitted substreams are independent, so
V-BLAST is not transmit diversity. Transmissions are organized into bursts
of L symbols. Receivers 1 to N are individually conventional QAM
receivers. These receivers also operate co-channel, each receiving the
signals radiated from all M transmit antennas.
Basic Idea: Treat each substream in turn as the desired signal and the
rest as interferers, and then use techniques similar to those of adaptive
antenna arrays (AAA) to detect each. Nulling is performed by linearly
weighting the received signals so as to satisfy some performance-related
criterion, such as minimum mean-squared error (MMSE) or zero-forcing (ZF).
Zero forcing:
Figure 8.33: Demodulation/decoding of spatially multiplexed signals based
on successive interference cancellation
Successive interference cancellation: A superior technique is to use
successive interference cancellation together with zero-forcing nulling.
Here, interference from already-detected components of a (the vector of
transmitted symbols) is subtracted from the received signal vector,
resulting in a modified received vector in which effectively fewer
interferers are present (Figure 8.33).
Note: when symbol cancellation is used, the system performance is affected
by the order in which the components of a are detected, whereas the order
does not matter when pure nulling is used.
Detection algorithm:
Simulation:
We used BPSK modulation.
Flat fading (Rayleigh multipath channel).
Figure 8.34: 2 × 2 MIMO channel
In a 2 × 2 MIMO channel (Figure 8.34), a probable usage of the two
available transmit antennas is as follows:
1. Consider that we have a transmission sequence, for example x1, x2, ...
2. In normal transmission, we send x1 in the first time slot, x2 in the
second time slot, and so on.
3. However, as we now have two transmit antennas, we may group the symbols
into groups of two. In the first time slot, send x1 and x2 from the first
and second antenna. In the second time slot, send x3 and x4 from the first
and second antenna, send x5 and x6 in the third time slot, and so on.
4. Notice that as we are grouping two symbols and sending them in one time
slot, we need only half as many time slots to complete the transmission;
the data rate is doubled.
System Model: The received signal on the first receive antenna is
    y1 = h1,1 x1 + h1,2 x2 + n1
The received signal on the second receive antenna is
    y2 = h2,1 x1 + h2,2 x2 + n2
where:
y1, y2 are the received symbols on the first and second antenna
respectively.
h1,1 is the channel from the 1st transmit antenna to the 1st receive
antenna.
h1,2 is the channel from the 2nd transmit antenna to the 1st receive
antenna.
h2,1 is the channel from the 1st transmit antenna to the 2nd receive
antenna.
h2,2 is the channel from the 2nd transmit antenna to the 2nd receive
antenna.
x1, x2 are the transmitted symbols and n1, n2 are the noise on the receive
antennas.
For convenience, the above equations can be represented in matrix notation
as follows:
    [y1; y2] = [h1,1 h1,2; h2,1 h2,2] [x1; x2] + [n1; n2]
Equivalently
    y = Hx + n
To solve for x, we need a matrix W that satisfies WH = I. The Zero Forcing
(ZF) linear detector meeting this constraint is given by:
    W = (H^H H)^{-1} H^H
To do the Successive Interference Cancellation (SIC), the receiver needs
to perform the following:
Using successive interference cancellation: In classical Successive
Interference Cancellation, the receiver arbitrarily takes one of the
estimated symbols and subtracts its effect from the received symbols y1
and y2. However, we can be more intelligent in choosing whether to
subtract the effect of x1 first or x2 first. To make that decision, let us
find out which transmit symbol (after multiplication with the channel)
arrived at higher power at the receiver. The received power at both
antennas corresponding to the transmitted symbol x1 is
    Px1 = |h1,1|^2 + |h2,1|^2
The received power at both antennas corresponding to the transmitted
symbol x2 is
    Px2 = |h1,2|^2 + |h2,2|^2
r = hx1 + n
The equalized symbol is
    x1_hat = (h^H r) / (h^H h)
BER curve of ZF-SIC and ZF:
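The zero-forcing and ordered SIC steps above can be sketched for the 2 × 2
BPSK case. This is a minimal illustration in standard-library Python,
assuming a known, invertible channel matrix and hypothetical channel and
noise values; for a square invertible H the ZF detector reduces to the
matrix inverse, which is what the sketch uses.

```python
def zf_detect(H, y):
    """Zero forcing for a 2x2 channel: x_hat = H^-1 y (H assumed invertible)."""
    (a, b), (c, d) = H
    det = a * d - b * c
    return ((d * y[0] - b * y[1]) / det, (-c * y[0] + a * y[1]) / det)

def slice_bpsk(value):
    """Hard decision for BPSK symbols +1 / -1."""
    return 1.0 if value.real >= 0 else -1.0

def zf_sic_detect(H, y):
    """ZF with ordered successive interference cancellation for 2x2 BPSK."""
    x_zf = zf_detect(H, y)
    # Received power of each transmitted symbol, summed over both receive antennas.
    power = [abs(H[0][j]) ** 2 + abs(H[1][j]) ** 2 for j in range(2)]
    first = 0 if power[0] >= power[1] else 1
    other = 1 - first
    x_first = slice_bpsk(x_zf[first])
    # Subtract the detected symbol's contribution from both received signals.
    r = [y[i] - H[i][first] * x_first for i in range(2)]
    # What remains is r = h*x_other + n; equalize with h^H r / (h^H h).
    h = [H[i][other] for i in range(2)]
    num = sum(hi.conjugate() * ri for hi, ri in zip(h, r))
    den = sum(abs(hi) ** 2 for hi in h)
    decisions = [0.0, 0.0]
    decisions[first] = x_first
    decisions[other] = slice_bpsk(num / den)
    return tuple(decisions)

# Hypothetical channel, symbols and noise.
H = ((0.9 + 0.2j, 0.3 - 0.4j), (-0.5 + 0.1j, 1.1 + 0.6j))
x = (1.0, -1.0)
n = (0.05 + 0.0j, -0.03j)
y = tuple(H[i][0] * x[0] + H[i][1] * x[1] + n[i] for i in range(2))

zf_decisions = tuple(slice_bpsk(v) for v in zf_detect(H, y))
sic_decisions = zf_sic_detect(H, y)
print(zf_decisions, sic_decisions)
```

After the stronger symbol is cancelled, the remaining symbol is received
over two antennas with no interferer, so its equalization is a two-branch
maximal ratio combiner rather than a noise-enhancing matrix inversion.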
8.3.3 Spatial Multiplexing Types:
1. Closed loop spatial multiplexing: In the closed-loop spatial
multiplexing mode, the eNodeB applies spatial-domain precoding to the
transmitted signal, taking into account the precoding matrix indicator
(PMI) reported by the UE, so that the transmitted signal matches the
spatial channel experienced by the UE. To support closed-loop spatial
multiplexing in the downlink, the UE needs to feed back the rank indicator
(RI), the PMI, and the channel quality indicator (CQI) in the uplink.
2. Open loop spatial multiplexing: This mode is used when reliable PMI
feedback is not available at the eNodeB. The feedback consists of the RI
and the CQI in open-loop spatial multiplexing.
A transmission diversity scheme is used for rank-1 open loop
transmissions. However, for rank greater than one, the open-loop
transmission scheme uses large-delay CDD along with a fixed precoder
matrix for the two-antenna-ports (P = 2) case, while precoder cycling is
used for the four-antenna-ports (P = 4) case. The fixed precoder used for
the case of two antenna ports is the identity matrix. Therefore, the
precoder for data resource element index i, denoted by W(i), is simply
given as:
8.4 Downlink MIMO modes in LTE
Different downlink MIMO modes are envisaged in LTE, which can be adjusted according to channel conditions, traffic requirements, and UE capability. The following transmission modes are possible in LTE:
Single-antenna transmission, no MIMO.
Transmit diversity.
Open-loop spatial multiplexing, no UE feedback required.
Closed-loop spatial multiplexing, UE feedback required.
Multi-user MIMO (more than one UE is assigned to the same resource block).
Closed-loop precoding for rank = 1 (i.e. no spatial multiplexing, but precoding is used).
Beamforming.
Figure 8.35: Downlink MIMO transmission chain
four-Tx transmission diversity respectively. We note that the term layer, which generally refers to a stream in MIMO spatial multiplexing, can be confusing when used in the context of transmission diversity. In transmission diversity, a single codeword is transmitted, which is effectively a single-rank transmission. After layer mapping, transmission diversity precoding is applied; this is effectively an SFBC block code for 2-Tx antennas and a balanced SFBC-FSTD code for 4-Tx antennas. The signals after transmission diversity precoding are mapped to time-frequency resources on two or four antennas for the SFBC and balanced SFBC-FSTD cases, and OFDM signal generation by means of an IFFT then takes place, as shown in Figure 8.35. In the following sections, we discuss only the layer mapping and precoding parts that are relevant to transmit diversity.
Codeword to layer mapping
In the case of transmit diversity, a single codeword is transmitted from two or four antenna ports. The number of layers in the case of transmit diversity is equal to the number of antenna ports. The number of modulation symbols per layer $M^{\text{layer}}_{\text{symb}}$ for 2 and 4 layers is given by:
where $M^{(0)}_{\text{symb}}$ represents the total number of modulation symbols within the codeword. In the case of two antenna ports, the modulation symbols from a single codeword are mapped to 2 ($\upsilon = 2$) layers as below:
In the case of four antenna ports, the modulation symbols from a single codeword are mapped to 4 ($\upsilon = 4$) layers as below:
The codeword to layer mapping for two- and four-antenna-port transmit diversity (TxD) transmissions in the downlink is shown in Figure 8.35. In the case of two antenna ports (two layers), the even-numbered ($d^{(0)}(0), d^{(0)}(2), \ldots$) and odd-numbered ($d^{(0)}(1), d^{(0)}(3), \ldots$) codeword modulation symbols are mapped to layers 0 and 1 respectively. In the case of four antenna ports, 1/4 of the codeword modulation symbols are mapped to a given layer, as given by the previous equation.
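The mapping described above (even-numbered symbols to layer 0 and odd-numbered symbols to layer 1) can be sketched as a round-robin split of the codeword. This is only an illustration: the function name is hypothetical, the round-robin order for four layers is an assumption that generalizes the two-layer description, and the codeword length is assumed to be a multiple of the number of layers.

```python
def map_codeword_to_layers(codeword_symbols, num_layers):
    """Split one codeword round-robin over num_layers layers (2 or 4)."""
    if num_layers not in (2, 4):
        raise ValueError("transmit diversity uses 2 or 4 layers")
    if len(codeword_symbols) % num_layers != 0:
        raise ValueError("codeword length must be a multiple of num_layers")
    # Layer v receives symbols v, v + num_layers, v + 2 * num_layers, ...
    return [codeword_symbols[layer::num_layers] for layer in range(num_layers)]

# Indices stand in for the modulation symbols d(0), d(1), ...
two_layers = map_codeword_to_layers(list(range(8)), 2)
four_layers = map_codeword_to_layers(list(range(8)), 4)
```

Each layer ends up with the codeword length divided by the number of layers, which matches the statement that half (or a quarter) of the symbols go to a given layer.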
Transmit diversity precoding
The block of vectors at the output of the layer mapper, $x(i) = [x^{(0)}(i), \ldots, x^{(\upsilon-1)}(i)]^T$, is provided as input to the precoding stage. The precoding stage then generates another block of vectors $y(i) = [y^{(0)}(i), \ldots, y^{(P-1)}(i)]^T$, as shown in Figure 8.37.
Figure 8.36
Figure 8.37
This block of vectors is then mapped onto resources on each of the antenna ports. The symbols at the output of precoding for antenna port p, $y^{(p)}(i)$, are given as:
For the case of two-antenna-port transmit diversity, the output of the precoding operation is written as:
where $x^{(0)}_I(i)$ and $x^{(0)}_Q(i)$ are the real and imaginary parts of the modulation symbol on layer 0, and $x^{(1)}_I(i)$ and $x^{(1)}_Q(i)$ are the real and imaginary parts of the modulation symbol on layer 1.
We note that the number of modulation symbols for mapping to resource elements is two times the number of modulation symbols per layer, that is, $M^{\text{map}}_{\text{symb}} = 2 M^{\text{layer}}_{\text{symb}}$.
The transmit diversity precoding and RE mapping for two antenna ports is shown in Figure 8.38. We note that the precoding and RE mapping operations result in a space frequency block coding (SFBC) scheme.
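A short sketch may help make the SFBC structure concrete. The precoding equation itself is not reproduced in the text above, so the mapping below is an assumption: it follows the two-port transmit diversity mapping of 3GPP TS 36.211, in which each pair of layer symbols is sent on two adjacent resource elements, with conjugated and sign-flipped copies on the second port. The function name is hypothetical.

```python
import math

def sfbc_precode_two_ports(layer0, layer1):
    """Return the symbol streams for antenna ports 0 and 1.

    layer0, layer1: equal-length lists of complex modulation symbols.
    Each layer symbol pair produces two output symbols per port.
    """
    scale = 1 / math.sqrt(2)
    port0, port1 = [], []
    for a, b in zip(layer0, layer1):
        port0 += [scale * a, scale * b]
        port1 += [-scale * b.conjugate(), scale * a.conjugate()]
    return port0, port1

# Hypothetical QPSK symbols on the two layers.
layer0 = [1 + 1j, -1 + 1j]
layer1 = [1 - 1j, -1 - 1j]
port0, port1 = sfbc_precode_two_ports(layer0, layer1)
```

The output is twice as long as each layer, in line with the factor of two noted above, and the two resource elements of each pair form an orthogonal (Alamouti-type) block across the two ports.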
So
Figure 8.38: Transmit diversity precoding and RE mapping for two antenna ports
We note that the number of modulation symbols for mapping to resource elements is four times the number of modulation symbols per layer, that is, $M^{\text{map}}_{\text{symb}} = 4 M^{\text{layer}}_{\text{symb}}$. The transmit diversity precoding and RE mapping for four antenna ports is shown in Figure 8.39.
Figure 8.39
We note that the four-antenna-port precoding and RE mapping operations result in a balanced SFBC-FSTD scheme, as is also illustrated by an alternative representation below:
In spatial multiplexing, the LTE system supports transmission of a maximum of two codewords in the downlink. Each codeword is separately coded using turbo coding, and the coded bits from each codeword are scrambled separately. The complex-valued modulation symbols for each of the codewords to be transmitted are mapped onto one or multiple layers. The complex-valued modulation symbols $d^{(q)}(0), \ldots, d^{(q)}(M^{(q)}_{\text{symb}} - 1)$ for codeword q are mapped onto the layers. A rank-1 transmission can happen for the case of one, two or four antenna ports, while for rank-2 transmission the number of antenna ports needs to be at least 2. In the case of rank-1 transmission, the complex-valued modulation symbols $d^{(q)}(0), \ldots, d^{(q)}(M^{(q)}_{\text{symb}} - 1)$ from a single codeword (q = 0) are mapped to a single layer ($\upsilon = 0$). Also, the number of modulation symbols per layer $M^{\text{layer}}_{\text{symb}}$ is equal to the number of modulation symbols per codeword $M^{(0)}_{\text{symb}}$. It can be noted that for rank-1 transmission, the layer mapping operation is transparent, with codeword modulation symbols simply mapped to a single layer. In the case of rank-2 transmissions, which can happen for both two and four antenna ports, the modulation symbols from the two codewords (q = 0, 1) are mapped to 2 layers ($\upsilon = 0, 1$) as below:
We note that for rank-2 transmission, the codeword to layer mapping is an MCW scheme with two codewords mapped to two layers separately, as in the above figure.
MIMO precoding
It is well known that the performance of a MIMO system can be improved with channel knowledge at the transmitter. Channel knowledge at the transmitter does not increase the degrees of freedom, but a power or beamforming gain is possible. In a TDD system, the channel knowledge can be obtained at the eNB from uplink transmissions thanks to channel reciprocity. However, sounding signals need to be transmitted on the uplink, which represents an additional overhead. In an FDD system, the channel state information needs to be fed back from the UE to the eNB. Complete channel state feedback can lead to excessive feedback overhead. For example, in a 4 × 4 MIMO channel, a total of 16 complex channel gains, one from each transmit antenna to each receive antenna, need to be signaled. An approach to reduce the channel state information feedback overhead is to use a codebook, as illustrated in Figure 8.40. In a closed-loop MIMO precoding system, for each transmit antenna configuration, we can construct a set of precoding matrices and let this set be known at both the eNB and the UE.
Figure 8.40: Illustration of feedback-based MIMO precoding
8.4.1 Precoding for two antenna ports
A Fourier matrix is an N × N square matrix with entries given by:
A 2 × 2 (N = 2) Fourier matrix can be expressed as:
We can, for example, define a set of four 2 × 2 Fourier matrices by taking G = 4. These four 2 × 2 matrices with g = 0, 1, 2, 3 are given as below:
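Since the matrix expressions are not reproduced above, the following sketch shows one common way to generate such a family. It assumes the entry formula exp(j2π(m/N)(n + g/G)) for row m and column n, which is the usual shifted-Fourier-matrix definition; this formula and the function name are assumptions, not taken from the text.

```python
import cmath

def fourier_matrix(size, shift_index=0, num_shifts=1):
    """Return a size x size shifted Fourier matrix as nested lists.

    Entry (m, n) is exp(j * 2 * pi * (m / size) * (n + shift_index / num_shifts)).
    """
    return [
        [cmath.exp(2j * cmath.pi * m / size * (n + shift_index / num_shifts))
         for n in range(size)]
        for m in range(size)
    ]

# The set of four 2 x 2 matrices obtained with G = 4 and g = 0, 1, 2, 3.
matrices = [fourier_matrix(2, g, 4) for g in range(4)]
```

With g = 0 this reduces to the plain 2 × 2 Fourier matrix with rows (1, 1) and (1, -1); under the assumed formula, g = 2 gives a second row of (j, -j).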
The LTE codebook for two antenna ports consists of four precoders for rank-1 and three precoders for rank-2, as given in the next table:
Precoding operation, where W(i) is the precoding matrix of size $P \times \upsilon$, P is the number of ports and $\upsilon$ ($\le P$) is the number of layers transmitted. An example of rank-2 precoding for two- and four-antenna-port transmissions is shown in . We assumed the precoders The symbols at the output of precoding are given as:
where $x^{(0)}(i)$ and $x^{(1)}(i)$ represent modulation symbols from codewords 1 and 2 respectively.
8.4.2 CDD-based precoding
The LTE system also supports composite precoding by introducing a cyclic delay diversity (CDD) precoder on top of the precoders described before. There are two types of CDD precoding:
1. Small-delay CDD.
2. Large-delay CDD.
The goal of small-delay CDD precoding is to introduce artificial frequency selectivity for opportunistic scheduling gains with low feedback overhead, while large-delay CDD achieves diversity by making sure that each MIMO codeword is transmitted on all the available MIMO layers. Both the small-delay and large-delay CDD schemes were incorporated in the LTE standard. However, small-delay CDD was removed from the specification at a later stage because the scheduling gains promised were small, particularly when feedback-based precoding can be employed for closed-loop MIMO operation.
Small-delay CDD precoding:
The goal of small-delay CDD precoding is to provide gains by exploiting, through multi-user scheduling, the frequency selectivity that it introduces. For small-delay cyclic delay diversity (CDD), the precoding is a composite of the CDD-based precoding defined by matrix D(i) and the precoding matrix W(i), as given by the relationship below:
where W(i) is the precoding matrix of size $P \times \upsilon$, P is the number of ports, $\upsilon$ ($\le P$) is the number of layers transmitted and D(i) is a diagonal matrix for support of cyclic delay diversity. In the case of two antenna ports, the CDD diagonal matrix D(i) is given as:
Large-delay CDD precoding:
For large-delay cyclic delay diversity (CDD), the precoding is a composite of the CDD-based precoding defined by matrix D(i) and the precoding matrix W(i), as given by the relationship below:
where W(i) is the precoding matrix of size $P \times \upsilon$, P is the number of ports, $\upsilon$ ($\le P$) is the number of layers transmitted, D(i) is a diagonal matrix over the transmitted layers, and i represents the modulation symbol index within each of the layers, with
In the case of two layers, the large-delay CDD diagonal matrix D(i) and the fixed DFT matrix U are given as:
The CDD diagonal matrix D(i) for odd and even i is written as:
Chapter 9
Orthogonal Frequency Division Multiplexing (OFDM)
9.1 Introduction
In general, multicarrier schemes subdivide the used channel bandwidth into a number of parallel subchannels, as shown in Figure 9.1(a). Ideally, the bandwidth of each subchannel is such that each is non-frequency-selective (i.e. has a spectrally flat gain); this has the advantage that the receiver can easily compensate for the subchannel gains individually in the frequency domain.
Orthogonal Frequency Division Multiplexing (OFDM) is a special case of multicarrier transmission where the non-frequency-selective narrowband subchannels, into which the frequency-selective wideband channel is divided, are overlapping but orthogonal, as shown in Figure 9.1(b). This avoids the need to separate the carriers by means of guard bands, and therefore makes OFDM highly spectrally efficient. The spacing between the subchannels in OFDM is such that they can be perfectly separated at the receiver. This allows for a low-complexity receiver implementation, which makes OFDM attractive for high-rate mobile data transmission such as the LTE downlink.
It is worth noting that the advantage of separating the transmission into
multiple narrowband subchannels cannot itself translate into robustness
against time-variant channels if no channel coding is employed. The LTE
downlink combines OFDM with channel coding and Hybrid Automatic Re-
peat reQuest (HARQ) to overcome the deep fading which may be encoun-
tered on the individual subchannels.
Figure 9.1: Spectral efficiency of OFDM compared to classical multicarrier modulation: (a) classical multicarrier system spectrum; (b) OFDM system spectrum.
9.2 OFDM
9.2.1 Why OFDM
Transmission by means of OFDM can be seen as a kind of multi-carrier transmission. The basic characteristics of OFDM transmission, which distinguish it from a straightforward multi-carrier extension of a more narrowband transmission scheme as outlined in Figure 9.2, are:
Figure 9.2: Extension to wider transmission bandwidth by means of multi-carrier transmission.
The use of a relatively large number of narrowband subcarriers. In contrast, a straightforward multi-carrier extension as outlined in Figure 9.2 would typically consist of only a few subcarriers, each with a relatively wide bandwidth. As an example, a WCDMA multi-carrier evolution to a 20 MHz overall transmission bandwidth could consist of four (sub)carriers, each with a bandwidth in the order of 5 MHz. In comparison, OFDM transmission may imply that several hundred subcarriers are transmitted over the same radio link to the same receiver.
Simple rectangular pulse shaping as illustrated in Figure 9.3a. This corresponds to a sinc-square-shaped per-subcarrier spectrum, as illustrated in Figure 9.3b.
Tight frequency-domain packing of the subcarriers with a subcarrier spacing $\Delta f = 1/T_u$, where $T_u$ is the per-subcarrier modulation-symbol time (see Figure 9.4). The subcarrier spacing is thus equal to the per-subcarrier modulation rate $1/T_u$.
An illustrative description of a basic OFDM modulator is provided in Figure 9.4. It consists of a bank of $N_c$ complex modulators, where each modulator corresponds to one OFDM subcarrier.
Figure 9.3: Per-subcarrier pulse shape and spectrum for basic OFDM transmission.
Figure 9.4: OFDM subcarrier spacing.
In complex baseband notation, a basic OFDM signal x(t) during the time interval $mT_u \le t < (m+1)T_u$ can thus be expressed as
$$x(t) = \sum_{k=0}^{N_c-1} x_k(t) = \sum_{k=0}^{N_c-1} a_k^{(m)} e^{j 2\pi k \Delta f t} \tag{9.1}$$
where $x_k(t)$ is the kth modulated subcarrier with frequency $f_k = k \Delta f$ and $a_k^{(m)}$ is the, in general complex, modulation symbol applied to the kth subcarrier during the mth OFDM symbol interval, i.e. during the time interval $mT_u \le t < (m+1)T_u$. OFDM transmission is thus block based, implying that, during each OFDM symbol interval, $N_c$ modulation symbols are transmitted in parallel. The modulation symbols can be from any modulation alphabet, such as QPSK, 16QAM, or 64QAM.
The number of OFDM subcarriers can range from less than one hundred to several thousand, with the subcarrier spacing ranging from several hundred kHz down to a few kHz. What subcarrier spacing to use depends on what types of environments the system is to operate in, including such aspects as the maximum expected radio-channel frequency selectivity (maximum expected time dispersion) and the maximum expected rate of channel variations (maximum expected Doppler spread). Once the subcarrier spacing has been selected, the number of subcarriers can be decided based on the assumed overall transmission bandwidth, taking into account acceptable out-of-band emission, etc.
As an example, for 3GPP LTE the basic subcarrier spacing equals 15 kHz. On the other hand, the number of subcarriers depends on the transmission bandwidth, with in the order of 600 subcarriers in case of operation in a 10 MHz spectrum allocation and correspondingly fewer/more subcarriers in case of smaller/larger overall transmission bandwidths.
9.2.2 Orthogonal Multiplexing Principle
Signals are orthogonal if they are mutually independent of each other. Orthogonality is a property that allows multiple information signals to be transmitted over a common channel and detected without interference. Mathematically, two functions are orthogonal if their product (with one of them conjugated), integrated over a certain interval, gives zero. We note that although the subcarriers overlap, we can separate them due to their orthogonality:
$$\int_{mT_u}^{(m+1)T_u} x_{k_1}(t)\, x_{k_2}^{*}(t)\, dt = \int_{mT_u}^{(m+1)T_u} a_{k_1} a_{k_2}^{*}\, e^{j 2\pi k_1 \Delta f t}\, e^{-j 2\pi k_2 \Delta f t}\, dt = 0 \quad \text{for } k_1 \ne k_2 \tag{9.2}$$
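The orthogonality of subcarriers spaced by 1/T_u can be checked numerically. The sketch below approximates the integral of one complex exponential times the conjugate of another over a single symbol interval with a midpoint rule. The 15 kHz spacing is the LTE value quoted earlier; the function name, the number of integration steps and the unit-amplitude symbols are assumptions made for the example.

```python
import cmath

def subcarrier_inner_product(k1, k2, delta_f=15e3, num_steps=2048):
    """Approximate the integral over T_u = 1/delta_f of
    exp(j*2*pi*k1*delta_f*t) * conj(exp(j*2*pi*k2*delta_f*t))."""
    t_u = 1.0 / delta_f
    dt = t_u / num_steps
    total = 0j
    for n in range(num_steps):
        t = (n + 0.5) * dt  # midpoint of the n-th sub-interval
        total += (cmath.exp(2j * cmath.pi * k1 * delta_f * t)
                  * cmath.exp(-2j * cmath.pi * k2 * delta_f * t) * dt)
    return total

same = subcarrier_inner_product(4, 4)       # equals T_u up to rounding
different = subcarrier_inner_product(3, 5)  # vanishes up to rounding
```

The integral equals T_u when the two indices coincide and vanishes otherwise, which is why the receiver can separate overlapping subcarriers by correlating over exactly one interval of length T_u.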
A high-rate data stream typically faces the problem of having a symbol period $T_s$ much smaller than the channel delay spread $T_d$ if it is transmitted serially. This generates Inter-Symbol Interference (ISI), which can only be undone by means of a complex equalization procedure. In general, the equalization complexity grows with the square of the channel impulse response length. In OFDM, the high-rate stream of data symbols is first Serial-to-Parallel (S/P) converted for modulation onto M parallel subcarriers, as shown in Figure 9.5. This increases the symbol duration on each subcarrier by a factor of approximately M, such that it becomes significantly longer than the channel delay spread.
Figure 9.5: Serial-to-Parallel (S/P) conversion operation for OFDM.
This operation has the important advantage of requiring a much less complex equalization procedure in the receiver, under the assumption that the time-varying channel impulse response remains substantially constant during the transmission of each modulated OFDM symbol. Figure 9.6 shows how the resulting long symbol duration is virtually unaffected by ISI compared to the short symbol duration, which is highly corrupted. Figure 9.7 shows the typical block diagram of an OFDM system. The signal to be transmitted is defined in the frequency domain.
An S/P converter collects serial data symbols into a data block $S[k] = [S_0[k], S_1[k], \ldots, S_{M-1}[k]]^T$ of dimension M, where k is the index of an OFDM symbol (spanning the M subcarriers). The M parallel data streams are first independently modulated, resulting in the complex vector $X[k] = [X_0[k], X_1[k], \ldots, X_{M-1}[k]]^T$. Note that in principle it is possible to use different modulations (e.g. QPSK or 16QAM) on each subcarrier; due to channel frequency selectivity, the channel gain may differ between subcarriers, and thus some subcarriers can carry higher data rates than others. The vector X[k] is then used as input to an N-point Inverse FFT (IFFT), resulting in a set of N complex time-domain samples $x[k] = [x_0[k], \ldots, x_{N-1}[k]]^T$. In a practical OFDM system, the number of processed subcarriers is greater than the number of modulated subcarriers (i.e. $N \ge M$), with the unmodulated subcarriers being padded with zeros.
Figure 9.6: Effect of channel on signals with short and long symbol duration.
The next key operation in the generation of an OFDM signal is the creation of a guard period at the beginning of each OFDM symbol x[k] by adding a Cyclic Prefix (CP), to eliminate the remaining impact of ISI caused by multipath propagation. The CP is generated by duplicating the last G samples of the IFFT output and appending them at the beginning of x[k]. This yields the time-domain OFDM symbol $[x_{N-G}[k], \ldots, x_{N-1}[k], x_0[k], \ldots, x_{N-1}[k]]^T$, as shown in Figure 9.8.
To avoid ISI completely, the CP length G must be chosen to be longer than the longest channel impulse response to be supported. The CP converts the linear (i.e. aperiodic) convolution of the channel into a circular (i.e. periodic) one, which is suitable for DFT processing. The insertion of the CP into the OFDM symbol and its implications are explained more formally later in this section.
Figure 9.7: OFDM system model: (a) transmitter; (b) receiver.
Figure 9.8: OFDM Cyclic Prefix (CP) insertion.
The output of the IFFT is then Parallel-to-Serial (P/S) converted for transmission through the frequency-selective channel. At the receiver, the reverse operations are performed to demodulate the OFDM signal. Assuming that time and frequency synchronization is achieved, a number of samples corresponding to the length of the CP are removed, such that only an ISI-free block of samples is passed to the DFT. If the number of subcarriers N is designed to be a power of 2, a highly efficient FFT implementation may be used to transform the signal back to the frequency domain. Among the N parallel streams output from the FFT, the modulated subset of M subcarriers is selected and further processed by the receiver. Let x(t) be the symbol transmitted at time instant t. The received signal in a multipath environment is then given by
$$r(t) = x(t) * h(t) + z(t) \tag{9.3}$$
where h(t) is the continuous-time impulse response of the channel, $*$ represents the convolution operation and z(t) is the additive noise. Assuming that x(t) is band-limited to $[-\frac{1}{2T_s}, \frac{1}{2T_s}]$, the continuous-time signal x(t) can be sampled with sampling period $T_s$ (i.e. at rate $1/T_s$) such that the Nyquist criterion is satisfied. As a result of the multipath propagation, several replicas of the transmitted signal arrive at the receiver at different delays.
9.2.3 OFDM advantages and disadvantages
OFDM advantages
OFDM is an efficient way to deal with multipath effects.
Bandwidth efficiency is high since it uses overlapping orthogonal subcarriers.
It is possible to enhance capacity significantly by adapting the data rate per subcarrier according to the SNR of that particular subcarrier.
OFDM disadvantages
Intercarrier interference (ICI) due to phase noise and carrier frequency offset, which destroy the orthogonality.
Intersymbol interference (ISI) due to channel delays and dispersion.
High Peak-to-Average Power Ratio (PAPR).
9.2.4 Peak-to-Average Power Ratio and Sensitivity to Non-Linearity
While the previous section shows the advantages of OFDM, this section highlights its major drawback: the Peak-to-Average Power Ratio (PAPR).
In the general case, the OFDM transmitter can be seen as a linear transform performed over a large block of independent identically distributed (i.i.d.) QAM-modulated complex symbols (in the frequency domain). From the central limit theorem, the time-domain OFDM symbol may be approximated as a Gaussian waveform. The amplitude variations of the OFDM modulated signal can therefore be very high. However, practical Power Amplifiers (PAs) of RF transmitters are linear only within a limited dynamic range. Thus, the OFDM signal is likely to suffer from non-linear distortion caused by clipping. This gives rise to out-of-band spurious emissions and in-band corruption of the signal. To avoid such distortion, the PAs have to operate with large power back-offs, leading to inefficient amplification or expensive transmitters.
The PAPR is one measure of the high dynamic range of the input amplitude, and hence a measure of the expected degradation. To analyse the PAPR mathematically, let $x_n$ be the signal after the IFFT, as given by
$$x_n[k] = \frac{1}{\sqrt{N}} \sum_{m=0}^{N-1} X_m[k] \exp\left(2\pi j m \frac{n}{N}\right) \tag{9.4}$$
where the time index k can be dropped without loss of generality. The PAPR of an OFDM symbol is defined as the square of the peak amplitude divided by the mean power, i.e.
$$\mathrm{PAPR} = \frac{\max_n |x_n|^2}{E\{|x_n|^2\}} \tag{9.5}$$
Under the hypothesis that the Gaussian approximation is valid, the amplitude of $x_n$ has a Rayleigh distribution, while its power has a central chi-square distribution with two degrees of freedom. The Cumulative Distribution Function (CDF) $F_x(\gamma)$ of the normalized power is given by
$$F_x(\gamma) = \Pr\left(\frac{|x_n|^2}{E\{|x_n|^2\}} < \gamma\right) = 1 - e^{-\gamma} \tag{9.6}$$
If there is no oversampling, the time-domain samples are mutually uncorrelated and the probability that the PAPR is above a certain threshold $\mathrm{PAPR}_0$ is given by
$$\Pr(\mathrm{PAPR} > \mathrm{PAPR}_0) = 1 - F_x(\mathrm{PAPR}_0)^N = 1 - \left(1 - e^{-\mathrm{PAPR}_0}\right)^N \tag{9.7}$$
Figure 9.9 plots the distribution of the PAPR given by Equation (9.7) for different values of the number of subcarriers N. The figure shows that a high PAPR does not occur very often. However, when it does occur, degradation due to PA non-linearities may be expected.
Figure 9.9: PAPR distribution for different numbers of OFDM subcarriers.
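Equation (9.7) is simple to evaluate directly, which gives a feel for how the curves in Figure 9.9 behave. The sketch below is a minimal illustration; the function name is hypothetical, and the threshold is passed in dB and converted to a linear power ratio before the formula is applied.

```python
import math

def papr_exceedance_probability(papr0_db, num_subcarriers):
    """Evaluate Pr(PAPR > PAPR_0) = 1 - (1 - exp(-PAPR_0))^N from Equation (9.7)."""
    papr0_linear = 10 ** (papr0_db / 10)
    return 1 - (1 - math.exp(-papr0_linear)) ** num_subcarriers

# Probability of exceeding a 10 dB threshold for several subcarrier counts.
probabilities = {n: papr_exceedance_probability(10.0, n) for n in (256, 1024, 2048)}
```

For a fixed threshold the probability grows with the number of subcarriers, and for a fixed number of subcarriers it falls quickly as the threshold is raised, consistent with the observation that a high PAPR is rare.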
9.2.5 PAPR Reduction Techniques
Many techniques have been studied for reducing the PAPR of a transmitted OFDM signal. Although no such techniques are specified for the LTE downlink signal generation, an overview of the possibilities is provided below. In general in LTE, the cost and complexity of generating the OFDM signal with acceptable Error Vector Magnitude (EVM) is left to the eNodeB implementation. As OFDM is not used for the LTE uplink, such considerations do not directly apply to the transmitter in the UE.
Techniques for PAPR reduction of OFDM signals can be broadly categorized into three main concepts:
1. Clipping and filtering:
The time-domain signal is clipped to a predefined level. This causes spectral leakage into adjacent channels, resulting in reduced spectral efficiency as well as in-band noise degrading the bit error rate performance. Out-of-band radiation caused by the clipping process can, however, be reduced by filtering.
If discrete signals are clipped directly, the resulting clipping noise will all fall in band and thus cannot be reduced by filtering. To avoid this problem, one solution consists of oversampling the original signal by padding the input signal with zeros and processing it using a longer IFFT. The oversampled signal is clipped and then filtered to reduce the out-of-band radiation.
2. Selected mapping:
Multiple transmit signals which represent the same OFDM data symbol are generated by multiplying the OFDM symbol by different phase vectors. The representation with the lowest PAPR is selected. To recover the phase information, it is of course necessary to use separate control signalling to indicate to the receiver which phase vector was used.
3. Coding techniques:
These techniques consist of finding the codewords with the lowest PAPR from a set of codewords to map the input data. A look-up table may be used if N is small. Complementary codes have been shown to have good properties for combining low PAPR with forward error correction.
The latter two concepts are not applicable in the context of LTE; selected mapping would require additional signalling, while techniques based on codeword selection are not compatible with the data scrambling used in the LTE downlink.
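The clipping step of the first technique can be sketched as follows. This is a minimal illustration only: it clips the magnitude of each sample while keeping its phase, and it omits the oversampling and filtering stages described above. The function names, the sample values and the clipping level are hypothetical.

```python
def papr(samples):
    """Peak power divided by mean power of a block of complex samples."""
    powers = [abs(s) ** 2 for s in samples]
    return max(powers) / (sum(powers) / len(powers))

def clip_magnitude(samples, level):
    """Limit each sample's magnitude to `level`, preserving its phase."""
    return [s if abs(s) <= level else s * (level / abs(s)) for s in samples]

# Hypothetical time-domain samples with one large peak.
samples = [0.1 + 0.2j, 2.0 - 1.5j, -0.3 + 0.1j, 0.5 + 0.5j]
clipped = clip_magnitude(samples, 1.0)

papr_before = papr(samples)
papr_after = papr(clipped)
```

In this example the single large peak is pulled down to the clipping level, so the PAPR of the block drops; the price is the distortion of that sample, which is the in-band noise mentioned in the text.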
9.2.6 Cyclic Prefix Insertion
As described in Section 9.2.2, an uncorrupted OFDM signal can be demodulated without any interference between subcarriers. One way to understand this subcarrier orthogonality is to recognize that a modulated subcarrier $x_k(t)$ in (9.1) consists of an integer number of periods of complex exponentials during the demodulator integration interval $T_u = 1/\Delta f$.
However, in case of a time-dispersive channel the orthogonality between the subcarriers will, at least partly, be lost. The reason for this loss of subcarrier orthogonality in case of a time-dispersive channel is that, in this case, the demodulator correlation interval for one path will overlap with the symbol boundary of a different path, as illustrated in Figure 9.10. Thus, the integration interval will not necessarily correspond to an integer number of periods of complex exponentials of that path, as the modulation symbols $a_k$ may differ between consecutive symbol intervals. As a consequence, in case of a time-dispersive channel there will not only be inter-symbol interference within a subcarrier but also interference between subcarriers.
Figure 9.10: Time dispersion and corresponding received-signal timing.
Another way to explain the interference between subcarriers in case of a time-dispersive channel is to have in mind that time dispersion on the radio channel is equivalent to a frequency-selective channel frequency response. Orthogonality between OFDM subcarriers is not simply due to frequency-domain separation but due to the specific frequency-domain structure of each subcarrier. Even if the frequency-domain channel is constant over a bandwidth corresponding to the main lobe of an OFDM subcarrier and only the subcarrier side lobes are corrupted due to the radio-channel frequency selectivity, the orthogonality between subcarriers will be lost, with inter-subcarrier interference as a consequence. Due to the relatively large side lobes of each OFDM subcarrier, already a relatively limited amount of time dispersion or, equivalently, a relatively modest radio-channel frequency selectivity may cause non-negligible interference between subcarriers.
To deal with this problem and to make an OFDM signal truly insensitive to time dispersion on the radio channel, so-called cyclic-prefix insertion is typically used in case of OFDM transmission. As illustrated in Figure 9.11, cyclic-prefix insertion implies that the last part of the OFDM symbol is copied and inserted at the beginning of the OFDM symbol. Cyclic-prefix insertion thus increases the length of the OFDM symbol from $T_u$ to $T_u + T_{CP}$, where $T_{CP}$ is the length of the cyclic prefix, with a corresponding reduction in the OFDM symbol rate as a consequence. As illustrated in the lower part of Figure 9.11, if the correlation at the receiver side is still only carried out over a time interval $T_u = 1/\Delta f$, subcarrier orthogonality will then be preserved also in case of a time-dispersive channel, as long as the span of the time dispersion is shorter than the cyclic-prefix length.
Figure 9.11: Cyclic-prefix insertion.
In practice, cyclic-prefix insertion is carried out on the time-discrete output of the transmitter IFFT. Cyclic-prefix insertion then implies that the last $N_{CP}$ samples of the IFFT output block of length N are copied and inserted at the beginning of the block, increasing the block length from N to $N + N_{CP}$. At the receiver side, the corresponding samples are discarded before OFDM demodulation by means of, for example, DFT/FFT processing.
Cyclic-prefix insertion is beneficial in the sense that it makes an OFDM signal insensitive to time dispersion as long as the span of the time dispersion does not exceed the length of the cyclic prefix. The drawback of cyclic-prefix insertion is that only a fraction $T_u/(T_u + T_{CP})$ of the received signal power is actually utilized by the OFDM demodulator, implying a corresponding power loss in the demodulation. In addition to this power loss, cyclic-prefix insertion also implies a corresponding loss in terms of bandwidth, as the OFDM symbol rate is reduced without a corresponding reduction in the overall signal bandwidth.
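A quick calculation shows the size of this overhead. The sketch below uses the 15 kHz LTE subcarrier spacing quoted earlier, so T_u is about 66.7 µs; the cyclic-prefix length of 4.7 µs is an illustrative value chosen for the example, not a figure taken from the text, and the function names are hypothetical.

```python
import math

def utilized_power_fraction(t_u, t_cp):
    """Fraction T_u / (T_u + T_CP) of received power used by the demodulator."""
    return t_u / (t_u + t_cp)

def cp_power_loss_db(t_u, t_cp):
    """Power loss in dB caused by discarding the cyclic prefix."""
    return -10 * math.log10(utilized_power_fraction(t_u, t_cp))

delta_f = 15e3        # subcarrier spacing in Hz
t_u = 1 / delta_f     # useful symbol time in seconds, about 66.7 microseconds
t_cp = 4.7e-6         # assumed cyclic-prefix length in seconds (illustrative)

fraction = utilized_power_fraction(t_u, t_cp)
loss_db = cp_power_loss_db(t_u, t_cp)
```

With these assumed numbers a little over 93% of the received power is used, a loss of roughly 0.3 dB; doubling the prefix length while keeping T_u fixed roughly doubles that loss.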
One way to reduce the relative overhead due to cyclic-prefix insertion is to reduce the subcarrier spacing $\Delta f$, with a corresponding increase in the symbol time $T_u$ as a consequence. However, this will increase the sensitivity of the OFDM transmission to fast channel variations, that is, high Doppler spread, as well as to different types of frequency errors.
It is also important to understand that the cyclic prefix does not necessarily have to cover the entire length of the channel time dispersion. In general, there is a trade-off between the power loss due to the cyclic prefix and the signal corruption (inter-symbol and inter-subcarrier interference) due to residual time dispersion not covered by the cyclic prefix. At a certain point, the further reduction of the signal corruption obtained by lengthening the cyclic prefix will not justify the corresponding additional power loss. This also means that, although the amount of time dispersion typically increases with the cell size, beyond a certain cell size there is often no reason to increase the cyclic prefix further, as the corresponding power loss would have a larger negative impact than the signal corruption due to the residual time dispersion not covered by the cyclic prefix.
Circular convolution
When an input data stream x[n] is sent through a linear time-invariant
FIR channel h[n], the output is the linear convolution: $y[n] = x[n] * h[n]$.
If the convolution is instead a circular convolution,
$y[n] = x[n] \circledast h[n]$, it is possible to take the DFT of the
channel output y[n] to get:
$\mathrm{DFT}\{y[n]\} = \mathrm{DFT}\{x[n] \circledast h[n]\}$,
or, in the frequency domain: $Y[m] = X[m] H[m]$.
This formula describes an ISI-free channel in the frequency domain,
where each input symbol X[m] is simply scaled by a complex value H[m].
For the convolution to be circular we need to add a cyclic prefix.
If the maximum channel delay spread has a duration of N + 1 samples,
then by adding a guard interval of at least N samples between OFDM
symbols, each OFDM symbol is made independent of those coming before
and after it, and so the ISI between OFDM symbols is avoided.
The channel output y is decomposed into a simple multiplication of the
channel frequency response $H = \mathrm{DFT}\{h\}$ and the channel
frequency-domain input, $X = \mathrm{DFT}\{x\}$.
The cyclic prefix is not entirely free. It comes with both a bandwidth
and a power penalty.
Since N redundant symbols are sent for every L data symbols, the required
bandwidth for OFDM increases from B to $((L + N)/L)\,B$.
An additional N symbols must be counted against the transmit power
budget. The use of a cyclic prefix entails data-rate and power losses that
are both: $\text{RateLoss} = \text{PowerLoss} = L/(L + N)$.
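The effect of the cyclic prefix on the convolution can be checked numerically. The sketch below is a minimal illustration using a direct (slow) DFT and an arbitrarily chosen 3-tap channel, so that N = 2 prefix samples suffice. All names and values are assumptions made for the example.

```python
import cmath


def dft(x):
    """Direct O(n^2) discrete Fourier transform."""
    n = len(x)
    return [sum(x[k] * cmath.exp(-2j * cmath.pi * k * m / n) for k in range(n))
            for m in range(n)]


def linear_convolve(x, h):
    """Linear convolution of two finite sequences."""
    y = [0j] * (len(x) + len(h) - 1)
    for i, xi in enumerate(x):
        for j, hj in enumerate(h):
            y[i + j] += xi * hj
    return y


def received_block_with_cp(x, h):
    """Prepend a cyclic prefix, pass the block through the FIR channel,
    then discard the prefix samples at the receiver."""
    n_cp = len(h) - 1                        # N samples for an (N + 1)-tap channel
    tx = x[-n_cp:] + x if n_cp > 0 else list(x)
    rx = linear_convolve(tx, h)
    return rx[n_cp:n_cp + len(x)]


x = [1, -1, 1, 1, -1, 1, -1, -1]             # one block of time-domain samples
h = [1.0, 0.5, 0.25]                         # assumed channel impulse response

Y = dft(received_block_with_cp(x, h))
X = dft(x)
H = dft(h + [0.0] * (len(x) - len(h)))       # zero-padded channel response

# With the prefix in place, Y[m] equals X[m] * H[m] up to rounding error.
max_err = max(abs(Y[m] - X[m] * H[m]) for m in range(len(x)))
print(max_err)
```

Without the prefix (or with a prefix shorter than N samples) the same comparison shows a non-negligible error, because the retained block is then no longer a circular convolution of x and h.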
9.2.7 Frequency-domain model of OFDM transmission
Assuming a sufficiently large cyclic prefix, the linear convolution of a
time-dispersive radio channel will appear as a circular convolution during
the demodulator integration interval $T_u$. The combination of OFDM
modulation (IFFT processing), a time-dispersive radio channel, and OFDM
demodulation (FFT processing) can then be seen as a frequency-domain channel
as illustrated in Figure 9.12, where the frequency-domain channel taps
$H_0, \ldots, H_{N_c-1}$ can be directly derived from the channel impulse
response.
The demodulator output $b_k$ in Figure 9.12 is the transmitted modulation
symbol $a_k$ scaled and phase rotated by the complex frequency-domain
channel tap $H_k$ and impaired by noise $n_k$. To properly recover the
transmitted symbol for further processing, for example data demodulation
and channel decoding, the receiver should multiply $b_k$ with the complex
conjugate of $H_k$, as illustrated in Figure 9.13. This is often expressed
as a one-tap equalizer being applied to each received subcarrier.
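The one-tap equalizer described above can be sketched as follows. Multiplying by the complex conjugate of $H_k$ removes the phase rotation. The optional division by $|H_k|^2$, which also restores the amplitude, is an assumption added for the example and is not stated in the text. The channel taps and symbols are made-up values, and noise is omitted.

```python
def one_tap_equalize(b, h, normalize=True):
    """Apply a one-tap equalizer to each received subcarrier.

    b: demodulator outputs b_k
    h: frequency-domain channel taps H_k (assumed known here)
    normalize: if True, also divide by |H_k|^2 (assumed extension)
    """
    equalized = []
    for b_k, h_k in zip(b, h):
        z = b_k * h_k.conjugate()        # undo the phase rotation of H_k
        if normalize:
            z /= abs(h_k) ** 2           # undo the amplitude scaling of H_k
        equalized.append(z)
    return equalized


a = [1 + 1j, -1 + 1j, -1 - 1j, 1 - 1j]                 # transmitted QPSK symbols
H = [0.8 + 0.3j, -0.2 + 0.9j, 0.5 - 0.5j, 1.1 + 0j]    # assumed channel taps
b = [h_k * a_k for h_k, a_k in zip(H, a)]              # noise-free demodulator output

a_hat = one_tap_equalize(b, H)
print(a_hat)
```

For constant-amplitude constellations such as QPSK the conjugate multiplication alone is enough for hard decisions, since only the phase carries information.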
Figure 9.12: Frequency-domain model of OFDM transmission/reception.
Figure 9.13: Frequency-domain model of OFDM transmission/reception with one-tap equalization at the receiver.
9.2.8 Channel estimation and reference symbols
As described above, to demodulate the transmitted modulation symbol $a_k$
and allow for proper decoding of the transmitted information at the receiver
side, scaling with the complex conjugate of the frequency-domain channel
tap $H_k$ should be applied after OFDM demodulation (FFT processing) (see
Figure 9.13). To be able to do this, the receiver needs an estimate of the
frequency-domain channel taps $H_0, \ldots, H_{N_c-1}$. The frequency-domain
channel taps can be estimated indirectly by first estimating the channel
impulse response and, from that, calculating an estimate of $H_k$. However,
a more straightforward approach is to estimate the frequency-domain channel
taps directly. This can be done by inserting known reference symbols,
sometimes also referred to as pilot symbols, at regular intervals within the
OFDM time-frequency grid, as illustrated in Figure 9.14. Using knowledge
about the reference symbols, the receiver can estimate the frequency-domain
channel around the location of the reference symbol. The reference symbols
should have a sufficiently high density in both the time and the frequency
domain to be able to provide estimates for the entire time/frequency grid,
also in the case of radio channels subject to high frequency and/or time
selectivity.
Different, more or less advanced, algorithms can be used for the channel
estimation, ranging from simple averaging in combination with linear
interpolation to Minimum-Mean-Square-Error (MMSE) estimation relying on more
detailed knowledge of the channel time/frequency-domain characteristics.
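The simplest of the approaches mentioned above can be sketched as follows: a per-pilot estimate $b_k / a_k$ followed by linear interpolation across frequency. This is a minimal sketch for one OFDM symbol only. The pilot positions, the pilot values and the test channel are assumptions, and interpolation in the time direction is left out.

```python
def estimate_channel(rx, pilots, n_sub):
    """Estimate H_k on all subcarriers of one OFDM symbol.

    rx:     received frequency-domain samples, one per subcarrier
    pilots: dict mapping pilot subcarrier index -> known reference symbol
    n_sub:  number of subcarriers
    """
    idx = sorted(pilots)
    h_pilot = {k: rx[k] / pilots[k] for k in idx}    # estimate at each pilot
    h_est = []
    for k in range(n_sub):
        if k <= idx[0]:
            h_est.append(h_pilot[idx[0]])            # hold value at lower band edge
        elif k >= idx[-1]:
            h_est.append(h_pilot[idx[-1]])           # hold value at upper band edge
        else:
            hi = next(i for i in idx if i >= k)      # nearest pilot at or above k
            lo = idx[idx.index(hi) - 1]              # nearest pilot below k
            w = (k - lo) / (hi - lo)
            h_est.append((1 - w) * h_pilot[lo] + w * h_pilot[hi])
    return h_est


n_sub = 9
# Assumed test channel that varies linearly over frequency (noise-free).
true_h = [complex(1.0 + 0.05 * k, 0.1 * k) for k in range(n_sub)]
pilots = {0: 1 + 0j, 4: -1 + 0j, 8: 1 + 0j}
tx = [pilots.get(k, 1j) for k in range(n_sub)]       # data symbols set to 1j
rx = [h * s for h, s in zip(true_h, tx)]

h_est = estimate_channel(rx, pilots, n_sub)
max_err = max(abs(e - t) for e, t in zip(h_est, true_h))
print(max_err)
```

Because the assumed channel is exactly linear between pilots, the interpolation error here is at rounding level. A more frequency-selective channel would need denser pilots or a better interpolator.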
Figure 9.14: Time-frequency grid with known reference symbols.
9.3 OFDM as a user-multiplexing and multiple-access scheme
The discussion has, until now, implicitly assumed that all OFDM subcarri-
ers are transmitted from the same transmitter to a certain receiver, i.e.:
Downlink transmission of all subcarriers to a single mobile terminal.
Uplink transmission of all subcarriers from a single mobile terminal.
However, OFDM can also be used as a user-multiplexing or multiple-access
scheme, allowing for simultaneous frequency-separated transmissions to/from
multiple mobile terminals (see Figure 9.15).
Figure 9.15: OFDM as a user-multiplexing/multiple-access scheme: (a) downlink and (b) uplink
In the downlink direction, OFDM as a user-multiplexing scheme implies
that, in each OFDM symbol interval, different subsets of the overall set of
available subcarriers are used for transmission to different mobile terminals
(see Figure 9.15 a).
Similarly, in the uplink direction, OFDM as a user-multiplexing or
multiple-access scheme implies that, in each OFDM symbol interval, different
subsets of the overall set of subcarriers are used for data transmission from
different mobile terminals (see Figure 9.15 b).
Figure 9.15 assumes that consecutive subcarriers are used for transmission
to/from the same mobile terminal. However, distributing the subcarriers
to/from a mobile terminal in the frequency domain is also possible, as
illustrated in Figure 9.16. The benefit of such distributed user multiplexing
or distributed multiple access is a possibility for additional frequency
diversity, as each transmission is spread over a wider bandwidth.
Figure 9.16: Distributed user multiplexing
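The difference between the two allocation styles can be made concrete with a small sketch. The equal-share, interleaved mapping below is only one possible way to distribute subcarriers and is an assumption made for illustration. The function names are hypothetical.

```python
def localized_allocation(n_subcarriers, n_users):
    """Give each user one block of consecutive subcarriers."""
    per_user = n_subcarriers // n_users
    return {u: list(range(u * per_user, (u + 1) * per_user))
            for u in range(n_users)}


def distributed_allocation(n_subcarriers, n_users):
    """Interleave users across the band (every n_users-th subcarrier)."""
    return {u: list(range(u, n_subcarriers, n_users))
            for u in range(n_users)}


localized = localized_allocation(12, 3)
distributed = distributed_allocation(12, 3)
print(localized[1])     # consecutive subcarriers
print(distributed[1])   # subcarriers spread over the whole band
```

In the distributed case each user's subcarriers span almost the entire band, which is where the additional frequency diversity comes from.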
In the case when OFDMA is used as an uplink multiple-access scheme, i.e.
in case of frequency multiplexing of OFDM signals from multiple mobile
terminals, it is critical that the transmissions from the different mobile
terminals arrive approximately time aligned at the base station. More
specifically, the transmissions from the different mobile terminals should
arrive at the base station with a timing misalignment less than the length
of the cyclic prefix, to preserve orthogonality between subcarriers received
from different mobile terminals and thus avoid inter-user interference.
Figure 9.17: Uplink transmission-timing control
Due to the differences in distance to the base station for different mobile
terminals and the corresponding differences in the propagation time (which
may far exceed the length of the cyclic prefix), it is therefore necessary to
control the uplink transmission timing of each mobile terminal (see Figure
9.17). Such transmit-timing control should adjust the transmit timing of
each mobile terminal to ensure that uplink transmissions arrive
approximately time aligned at the base station. As the propagation time
changes as the mobile terminal moves within the cell, the transmit-timing
control should be an active process, continuously adjusting the exact
transmit timing of each mobile terminal.
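A back-of-the-envelope sketch shows why the timing control is needed. It assumes that each terminal derives its transmit timing from the received downlink signal, so that without correction its uplink signal arrives 2d/c late at the base station. The distances and the 4.7 µs cyclic-prefix length are assumed example values, and the function names are hypothetical.

```python
SPEED_OF_LIGHT = 299_792_458.0  # m/s


def uplink_arrival_offset(distance_m):
    """Uncorrected arrival offset at the base station (round trip, 2d/c)."""
    return 2.0 * distance_m / SPEED_OF_LIGHT


def timing_advance(distances_m):
    """Per-terminal transmit advance that cancels the round-trip delay."""
    return {name: uplink_arrival_offset(d) for name, d in distances_m.items()}


terminals = {"near": 300.0, "far": 3000.0}   # assumed distances in metres
T_CP = 4.7e-6                                # assumed cyclic-prefix length in seconds

offsets = timing_advance(terminals)
misalignment = max(offsets.values()) - min(offsets.values())
needs_control = misalignment > T_CP

# After each terminal advances its transmission, the residual offset is zero.
residual = {name: uplink_arrival_offset(d) - offsets[name]
            for name, d in terminals.items()}
print(f"misalignment without control: {misalignment * 1e6:.1f} us")
```

In a real system the advance would be quantized and updated as the terminal moves, so the residual is small rather than exactly zero.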
Furthermore, even in case of perfect transmit-timing control, there will
always be some interference between subcarriers, e.g. due to frequency
errors. Typically this interference is relatively low in case of reasonable
frequency errors, Doppler spread, etc. However, this assumes that the
different subcarriers are received with at least approximately the same
power. In the uplink, the propagation distance and thus the path loss of the
different mobile-terminal transmissions may differ significantly. If two
terminals are transmitting with the same power, the received-signal
strengths may thus differ significantly, implying a potentially significant
interference from the stronger signal to the weaker signal unless the
subcarrier orthogonality is perfectly retained. To avoid this, at least some
degree of uplink transmit-power control may need to be applied in case of
uplink OFDMA, reducing the transmit power of user terminals close to the
base station and ensuring that all received signals will be of approximately
the same power.
9.4 The downlink physical resource
LTE downlink transmission is based on OFDM. The basic LTE downlink
physical resource can thus be seen as a time-frequency resource grid (Fig-
ure 9.18), where each resource element corresponds to one OFDM subcar-
rier during one OFDM symbol interval.
Figure 9.18: The LTE downlink physical resource
For LTE, the OFDM subcarrier spacing has been chosen to be $\Delta f = 15$ kHz.
Assuming an FFT-based transmitter/receiver implementation, this corresponds
to a sampling rate $f_s = 15\,000 \cdot N_{FFT}$, where $N_{FFT}$ is the FFT
size. The basic time unit $T_s$ defined in the previous section can thus be
seen as the sampling time of an FFT-based transmitter/receiver
implementation with an FFT size equal to 2048.
It is important to understand, though, that the time unit $T_s$ is
introduced in the LTE radio-access specifications purely as a tool to define
different time intervals and does not impose any specific transmitter and/or
receiver implementation constraints (e.g. a certain sampling rate).
In practice, an FFT-based transmitter/receiver implementation with an
FFT size equal to 2048 and a corresponding sampling rate of 30.72 MHz
is suitable for the wider LTE transmission bandwidths, such as bandwidths
in the order of 15 MHz and above. However, for smaller transmission band-
widths, a smaller FFT size and a correspondingly lower sampling rate can
very well be used. As an example, for transmission bandwidths in the order
of 5 MHz, an FFT size equal to 512 and a corresponding sampling rate of
7.68 MHz may be sufficient.
Assuming a power-of-two FFT size and a subcarrier spacing of 15 kHz, the
sampling rate $\Delta f \cdot N_{FFT}$ will be a multiple or submultiple of
the WCDMA/HSPA chip rate (3.84 Mcps). This relation can be utilized when
implementing multimode terminals supporting both WCDMA/HSPA and LTE.
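The relation between FFT size, sampling rate and the WCDMA/HSPA chip rate can be verified with a few lines of arithmetic. The sketch below only restates the numbers given in the text. The helper names are hypothetical.

```python
from fractions import Fraction

SUBCARRIER_SPACING_HZ = 15_000
WCDMA_CHIP_RATE_HZ = 3_840_000


def sampling_rate_hz(n_fft):
    """Sampling rate of an FFT-based implementation: 15 kHz * N_FFT."""
    return SUBCARRIER_SPACING_HZ * n_fft


def chip_rate_ratio(n_fft):
    """Exact ratio of the sampling rate to the WCDMA/HSPA chip rate."""
    return Fraction(sampling_rate_hz(n_fft), WCDMA_CHIP_RATE_HZ)


for n_fft in (128, 256, 512, 1024, 2048):
    rate_mhz = sampling_rate_hz(n_fft) / 1e6
    print(f"N_FFT = {n_fft:4d}: {rate_mhz:5.2f} MHz = {chip_rate_ratio(n_fft)} x 3.84 MHz")
```

Every power-of-two FFT size gives a ratio that is itself a power of two (for example 1/2, 1, 2, 4, 8), which is why a common clock can serve both systems.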
In addition to the 15 kHz subcarrier spacing, a reduced subcarrier spacing
$\Delta f_{low} = 7.5$ kHz, with twice as long OFDM symbol time, is also
defined for LTE. The reduced subcarrier spacing specifically targets
MBSFN-based multicast/broadcast transmissions.
As illustrated in Figure 9.19, in the frequency domain the downlink sub-
carriers are grouped into resource blocks, where each resource block con-
sists of 12 consecutive sub-carriers. In addition, there is an unused DC-subcarrier
in the center of the downlink band.
The reason why the DC-subcarrier is not used for downlink transmission is
that it may be subject to disproportionately high interference, for example
due to local-oscillator leakage.
The LTE physical-layer specification allows for a downlink carrier to
consist of any number of resource blocks, ranging from a minimum of 6
resource blocks up to a maximum of 110 resource blocks. This corresponds to
an overall downlink transmission bandwidth ranging from roughly 1 MHz up
to in the order of 20 MHz with very fine granularity and thus allows for
a very high degree of LTE bandwidth flexibility, at least from a
physical-layer-specification point of view. However, LTE radio-frequency
requirements are, at least initially, only specified for a limited set of
transmission bandwidths, corresponding to a limited set of possible values
for the number of resource blocks within a carrier.
Figure 9.19: Frequency-domain structure for LTE downlink
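The quoted bandwidth range follows directly from the resource-block size. The sketch below computes the bandwidth occupied by the data subcarriers only. It deliberately ignores the unused DC-subcarrier and any guard bands, and the function name is hypothetical.

```python
SUBCARRIERS_PER_RESOURCE_BLOCK = 12
SUBCARRIER_SPACING_HZ = 15_000
MIN_RESOURCE_BLOCKS = 6
MAX_RESOURCE_BLOCKS = 110


def occupied_bandwidth_hz(n_resource_blocks):
    """Bandwidth spanned by the subcarriers of n resource blocks."""
    if not MIN_RESOURCE_BLOCKS <= n_resource_blocks <= MAX_RESOURCE_BLOCKS:
        raise ValueError("number of resource blocks must be between 6 and 110")
    return (n_resource_blocks
            * SUBCARRIERS_PER_RESOURCE_BLOCK
            * SUBCARRIER_SPACING_HZ)


smallest = occupied_bandwidth_hz(MIN_RESOURCE_BLOCKS)   # 1.08 MHz
largest = occupied_bandwidth_hz(MAX_RESOURCE_BLOCKS)    # 19.8 MHz
print(smallest / 1e6, largest / 1e6)
```

Each additional resource block adds 180 kHz, which is the "very fine granularity" referred to above.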
Figure 9.20 outlines the more detailed time-domain structure for LTE
downlink transmission. Each 1 ms subframe consists of two equally sized
slots of length $T_{slot} = 0.5$ ms ($15\,360 \cdot T_s$). Each slot then
consists of a number of OFDM symbols including cyclic prefix. A subcarrier
spacing of 15 kHz corresponds to a useful symbol time of approximately
66.7 µs. The overall OFDM symbol time is then the sum of the useful symbol
time and the cyclic-prefix length.
As illustrated in Figure 9.20, LTE defines two cyclic-prefix lengths, the
normal cyclic prefix and an extended cyclic prefix, corresponding to seven
and six OFDM symbols per slot, respectively.
The exact cyclic-prefix lengths, expressed in the basic time unit $T_s$, are
given in Figure 9.21. It can be noted that, in case of the normal cyclic
prefix, the cyclic-prefix length for the first OFDM symbol of a slot is
somewhat larger than for the remaining OFDM symbols. The reason for this is
simply to fill the entire 0.5 ms slot, as the number of basic time units
$T_s$ per slot (15 360) is not divisible by seven.
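The slot bookkeeping can be checked with simple arithmetic. The cyclic-prefix lengths used below (160 and 144 $T_s$ for the normal prefix, 512 $T_s$ for the extended prefix) are not stated in the running text. They are the values from the LTE physical-layer specification and are included here as an assumption about what Figure 9.21 shows.

```python
N_FFT = 2048                          # useful symbol length in units of T_s
TS_PER_SLOT = 15_360                  # 0.5 ms slot in units of T_s
TS_SECONDS = 1.0 / (15_000 * N_FFT)   # basic time unit T_s

# Cyclic-prefix lengths per OFDM symbol in units of T_s (assumed, see lead-in).
NORMAL_CP = [160] + [144] * 6         # first symbol of the slot is longer
EXTENDED_CP = [512] * 6


def slot_length_ts(cp_lengths):
    """Total slot length: each symbol is its cyclic prefix plus N_FFT samples."""
    return sum(cp + N_FFT for cp in cp_lengths)


useful_symbol_time = N_FFT * TS_SECONDS
print(slot_length_ts(NORMAL_CP), slot_length_ts(EXTENDED_CP))
print(f"useful symbol time: {useful_symbol_time * 1e6:.2f} us")
print("remainder of 15360 / 7:", TS_PER_SLOT % 7)
```

The non-zero remainder is why seven identical symbols cannot fill the slot exactly, and why the first cyclic prefix absorbs the extra samples.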
Figure 9.20: Detailed time-domain structure for LTE downlink transmission
Figure 9.21
The reasons for defining two cyclic-prefix lengths for LTE are twofold:
1. A longer cyclic prefix, although less efficient from a
cyclic-prefix-overhead point of view, may be beneficial in specific
environments with very extensive delay spread, for example in very large
cells. It is important to have in mind, though, that a longer cyclic
prefix is not necessarily beneficial in case of large cells, even if the
delay spread is very extensive in such cases. If, in large cells, link
performance is limited by noise rather than by signal corruption due to
residual time dispersion not covered by the cyclic prefix, the additional
robustness to radio-channel time dispersion, due to the use of a longer
cyclic prefix, may not justify the corresponding loss in terms of reduced
received signal energy.
2. In case of MBSFN-based multicast/broadcast transmission, the cyclic
prefix should not only cover the main part of the actual channel time
dispersion but also the timing difference between the transmissions
received from the cells involved in the MBSFN transmission. In case of
MBSFN operation, the extended cyclic prefix is therefore often needed.
Thus, the main use of the extended cyclic prefix can be expected to be
MBSFN-based transmission. It should be noted that different cyclic-prefix
lengths may be used for different subframes within a frame. As an example,
MBSFN-based multicast/broadcast transmission is typically confined to
certain subframes, in which case the use of the extended cyclic prefix, with
its associated additional cyclic-prefix overhead, may only be applied to
these subframes.
Taking into account also the downlink time-domain structure, the resource
blocks mentioned above consist of 12 subcarriers during a 0.5 ms slot, as
illustrated in Figure 9.22. Each resource block thus consists of 84 resource
elements in case of normal cyclic prefix and 72 resource elements in case of
extended cyclic prefix.
Figure 9.22: Downlink resource block assuming normal cyclic prefix (i.e. 7 OFDM symbols per
slot). With extended cyclic prefix there are six OFDM symbols per slot.
Although resource blocks are defined over one slot, the basic time-domain
unit for dynamic scheduling in LTE is one subframe, consisting of two
consecutive slots. The reason to define the resource blocks over one slot is
that distributed downlink transmission is defined on a slot basis.
The minimum scheduling unit, consisting of two resource blocks within one
subframe (one resource block per slot), is sometimes referred to as a
resource-block pair.
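The resource-element counts quoted above, and the size of a resource-block pair, follow from multiplying subcarriers by OFDM symbols. The short sketch below restates that arithmetic. The names are hypothetical.

```python
SUBCARRIERS_PER_RESOURCE_BLOCK = 12
SYMBOLS_PER_SLOT = {"normal": 7, "extended": 6}
SLOTS_PER_SUBFRAME = 2


def resource_elements_per_block(cyclic_prefix):
    """Resource elements in one resource block (12 subcarriers, one slot)."""
    return SUBCARRIERS_PER_RESOURCE_BLOCK * SYMBOLS_PER_SLOT[cyclic_prefix]


def resource_elements_per_pair(cyclic_prefix):
    """Resource elements in a resource-block pair (one subframe)."""
    return SLOTS_PER_SUBFRAME * resource_elements_per_block(cyclic_prefix)


for cp in ("normal", "extended"):
    print(cp, resource_elements_per_block(cp), resource_elements_per_pair(cp))
```

Not all of these resource elements carry user data, since some are occupied by reference symbols and control signalling.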
Bibliography
[1] Erik Dahlman, Stefan Parkvall, Johan Sköld, and Per Beming. 3G Evolution:
HSPA and LTE for Mobile Broadband. First edition. Elsevier, 2007.
[2] Stefania Sesia, Issam Toufik, and Matthew Baker. The UMTS Long Term
Evolution. John Wiley and Sons, Ltd., 2011.
Appendix A
Matlab
A.1 Communications System Toolbox
comm.BPSKModulator: Modulate using BPSK method
comm.BPSKDemodulator: Demodulate using BPSK method
comm.OSTBCEncoder: The OSTBCEncoder object encodes an input sym-
bol sequence using orthogonal space-time block code (OSTBC). The block
maps the input symbols block-wise and concatenates the output codeword
matrices in the time domain.
comm.OSTBCCombiner: The OSTBCCombiner object combines the input
signal (from all of the receive antennas) and the channel estimate signal to
extract the soft information of the symbols encoded by an OSTBC. The
input channel estimate does not need to be constant and can vary at each
call to the step method. The combining algorithm uses only the estimate
for the first symbol period per codeword block. A symbol demodulator or
decoder would follow the Combiner object in a MIMO communications system.
comm.AWGNChannel: The AWGNChannel object adds white Gaussian noise to a real
or complex input signal. When the input uses a real-valued signal, this
object adds real Gaussian noise and produces a real output signal. When the
input uses a complex signal, this object adds complex Gaussian noise and
produces a complex output signal.
berfading: Bit error rate (BER) for Rayleigh and Rician fading channels.
For all syntaxes, the first input argument, EbNo, is the ratio of bit
energy to noise power spectral density, in dB. If EbNo is a vector, the
output ber is a vector of the same size, whose elements correspond to the
different Eb/N0 levels.
Most syntaxes also have an M input that specifies the alphabet size for the
modulation. M must have the form $2^k$ for some positive integer k.
berfading uses expressions that assume Gray coding. If you use binary
coding, the results may differ.
For cases where diversity is used, the Eb/N0 on each diversity branch is
EbNo/divorder, where divorder is the diversity order (the number of
diversity branches) and is a positive integer.
comm.TurboEncoder: The Turbo Encoder System object encodes a binary
input signal using a parallel concatenated coding scheme. This coding scheme
uses two identical convolutional encoders and appends the termination bits
at the end of the encoded data bits.
comm.TurboDecoder: The Turbo Decoder System object decodes the input
signal using a parallel concatenated decoding scheme that employs the a-
posteriori probability (APP) decoder as the constituent decoder. Both con-
stituent decoders use the same trellis structure and algorithm.
comm.ErrorRate: The ErrorRate object compares input data from a trans-
mitter with input data from a receiver and calculates the error rate as a
running statistic. To obtain the error rate, the object divides the total num-
ber of unequal pairs of data elements by the total number of input data el-
ements from one source.
A.2 Fixed Point Toolbox
fi: Construct fixed-point numeric object
bin: Binary representation of stored integer of fi object
hex: Hexadecimal representation of stored integer of fi object
buildInstrumentedMex: Generate MEX function with logging instrumentation
showInstrumentationResults: Results logged by instrumented MEX function
fiaccel: Accelerate fixed-point code
A.3 Matlab
svd: compute singular value decomposition of symbolic matrix
pinv: Moore-Penrose pseudoinverse of matrix
A.4 HDL Verifier
The HDL Verifier software provides a means for verifying HDL modules
using the HDL Cosimulation System object. You can use the System object
as a test bench or you can use it to represent a component still under
design. You can use the Cosimulation Wizard to create an HDL Cosimulation
System object from existing HDL code, or you can create and populate the
System object manually.
A.4.1 Workflow for Using the Cosimulation Wizard to Create a MATLAB System Object
The workflow for creating a System object using existing HDL code for
cosimulation with MATLAB is as follows:
1. Start Cosimulation Wizard.
2. Select HDL Cosimulation type as MATLAB System Object.
3. Select HDL files to use in creating block or function.
4. Specify commands for HDL compilation.
5. Select HDL module for cosimulation.
6. Configure input and output ports.
7. Provide output port details.
8. Provide clock and reset details.
9. Confirm or change start-time alignment.
10. Generate System object.
11. Create System object test bench.
For a step-by-step example see
http://www.mathworks.com/products/hdl-verifier/examples.html?file=/products/demos/shipping/edalink/Tutorial_MATLAB_SysObj_IN.html
Appendix B
Xilinx ISE Overview
The Xilinx ISE system is an integrated design environment that consists of
a set of programs to create (capture), simulate and implement digital
designs in an FPGA or CPLD target device. All the tools use a graphical
user interface (GUI) that allows all programs to be executed from toolbars,
menus or icons. Online help is available from most windows. This write-up
is intended to get you started with the ISE tools. It gives a quick overview
of how to create a design, simulate it and download it into an FPGA. For
more detailed information please consult the online Xilinx documentation
and tutorials. The ISE User Guide is available online.
B.1 Design Flow Overview
The following steps are involved in the realization of a digital system using
Xilinx FPGAs, as illustrated by figure (A.1).
B.1.1 Design Entry
The first step is to enter your design. This can be done by creating source
files. Source files can be created in different formats, such as a
schematic or a Hardware Description Language (HDL) such as VHDL, Verilog or
ABEL. A project design will consist of a top-level source file and various
lower-level source files. Any of these files can be either a schematic or
an HDL file.
B.1.2 Design Synthesis
The synthesis step creates netlist files from the various source files. The
netlist files can serve as input to the implementation module.
B.1.3 Design Verification (simulation)
This is an important step that should be done at various stages of the
design. The simulator is used to verify the functionality of a design
(functional simulation) and the behavior and timing (timing simulation) of
your circuit. Timing simulation is run after implementing your circuit in
the FPGA, since it needs to know the actual placement and routing to find
out the exact speed and timing of the circuit.
B.1.4 Design Implementation
After generating the netlist file (synthesis step), the implementation will
convert the logic design into a physical file that can be downloaded to the
target device (e.g. Virtex FPGA). This step involves three sub-steps:
translating the netlist, mapping, and place and route.
B.1.5 Device Configuration
This refers to the actual programming of the target FPGA by downloading
the programming file to the Xilinx FPGA.
B.2 Starting the ISE Software
To start ISE, double-click the desktop icon, or start ISE from the Start menu
by selecting: Start > All Programs > Xilinx ISE 12.2 > Project Navigator.
B.2.1 Create a New Project
To create a new project:
1. Select File > New Project... The New Project Wizard appears.
2. Type tutorial in the Project Name field.
3. Enter or browse to a location (directory path) for the new project.
4. A tutorial subdirectory is created automatically.
5. Verify that HDL is selected from the Top-Level Source Type list.
6. Click Next to move to the device properties page.
7. Fill in the properties in the table as shown below:
Product Category: All
Family: Spartan3
Device: XC3S200
Package: FT256
Speed Grade: -4
Top-Level Source Type: HDL
Synthesis Tool: XST (VHDL/Verilog)
Simulator: ISE Simulator (VHDL/Verilog)
Preferred Language: Verilog (or VHDL)
Verify that Enable Enhanced Design Summary is selected. Leave
the default values in the remaining fields.
When the table is complete, your project properties will look like
those shown in figure (A.2).
8. Click Next to proceed to the Create New Source window in the New
Project Wizard.
B.2.2 Create an HDL Source
In this section, you will create the top-level HDL file for your design.
Determine the language that you wish to use. We will start with the Creating
a VHDL Source section below, and then Creating a Verilog Source.
Creating a VHDL Source
Create a VHDL source file for the project as follows:
1. Click the New Source button in the New Project Wizard.
2. Select VHDL Module as the source type.
3. Type in the file name counter.
4. Verify that the Add to project checkbox is selected.
5. Click Next.
6. Declare the ports for the counter design by filling in the port
information as shown in figure (A.3).
7. Click Next, and then Finish in the New Source Wizard - Summary dialog
box to complete the new source file template.
8. Click Next, then Next, then Finish.
The source file containing the entity/architecture pair displays in the
Workspace, and the counter displays in the Source tab, as shown in figure (A.4).
B.2.3 Checking the Syntax of the New Counter Module
When the source files are complete, check the syntax of the design to find
errors and typos.
1. Verify that Implementation is selected from the drop-down list in the
Sources window.
2. Select the counter design source in the Sources window to display the
related processes in the Processes window.
3. Click the + next to the Synthesize-XST process to expand the process
group.
4. Double-click the Check Syntax process.
5. Close the HDL file.
Note: You must correct any errors found in your source files. You
can check for errors in the Console tab of the Transcript window.
B.2.4 Implement Design and Verify Constraints
Implement the design and verify that it meets the timing constraints
specified in the previous section.
Implementing the Design
1. Select the counter source file in the Sources window.
2. Open the Design Summary by double-clicking the View Design Summary
process in the Processes tab.
3. Double-click the Implement Design process in the Processes tab.
4. Notice that after Implementation is complete, the Implementation
processes have a green check mark next to them, indicating that they
completed successfully without Errors or Warnings.
5. Locate the Performance Summary table near the bottom of the Design
Summary.
6. Click the All Constraints Met link in the Timing Constraints field to
view the Timing constraints report. Verify that the design meets the
specified timing requirements.
7. Close the Design Summary.