
DIGITAL COMMUNICATIONS

Block 3

Channel Coding

Francisco J. Escribano, 2013-14


Fundamentals of error control
Error control:
  error detection (ARQ schemes)
  error correction (FEC schemes)

[Block diagram: information bits bn → channel encoder → cn → channel → hard outputs rn or soft outputs → channel error corrector (FEC), or channel error detector (ARQ) asking for retransmission (RTx)]

Channel model:
  discrete inputs,
  discrete (hard, rn) or continuous (soft) outputs,
  memoryless.
Fundamentals of error control
Enabling detection/correction:
Adding redundancy to the information: for every k bits,
transmit n, n>k.

Shannon's theorem (1948):

1) If R < C = sup_{pX} I(X;Y), then for any ε > 0 there is an n (with R = k/n constant) such that Pe < ε.

2) If a given Pb is acceptable, rates R < R(Pb) = C/(1 − H2(Pb)) are achievable.

3) For any Pb, rates greater than R(Pb) are not achievable.

Problem: Shannon's theorem is not constructive.
Fundamentals of error control
Added redundancy is structured redundancy.
This relies on a sound algebraic & geometrical basis.
Our approach:
Algebra over the Galois Field of order 2, GF(2)={0,1}.
GF(2) is a proper field; GF(2)^m is a vector space of dimension m.
Dot product (·): logical AND. Sum + (same as −): logical XOR.
Scalar product: for b, d ∈ GF(2)^m,
  b·d^T = b1·d1 + ... + bm·dm
Product by scalars: for a ∈ GF(2), b ∈ GF(2)^m,
  a·b = (a·b1 ... a·bm)
It is also possible to define a matrix algebra over GF(2).
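A minimal sketch of these GF(2) operations in plain Python (illustrative; the function names are ours):

    def gf2_add(b, d):
        """Component-wise sum over GF(2): logical XOR."""
        return [bi ^ di for bi, di in zip(b, d)]

    def gf2_dot(b, d):
        """Scalar product b · d^T over GF(2): AND, then XOR-accumulate."""
        acc = 0
        for bi, di in zip(b, d):
            acc ^= bi & di
        return acc

    print(gf2_add([1, 0, 1, 1], [0, 1, 1, 0]))  # [1, 1, 0, 1]
    print(gf2_dot([1, 0, 1, 1], [0, 1, 1, 0]))  # 1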
Fundamentals of error control

Given a vector b ∈ GF(2)^m, its binary weight w(b) is the number of 1's in b.
It is possible to define a distance over the vector space GF(2)^m, called the Hamming distance:
  dH(b,d) = w(b+d), for b, d ∈ GF(2)^m
The Hamming distance is a proper distance and accounts for the number of differing positions between two vectors.
Geometrical view:
[Figure: the vectors (1011), (0110), (1010), (1110) seen as points, with their mutual Hamming distances]
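A quick numerical check of weight and distance for the vectors in the figure above (plain Python, toy example):

    def weight(b):
        """w(b): number of 1's in b."""
        return sum(b)

    def d_hamming(b, d):
        """dH(b, d) = w(b + d): number of positions where b and d differ."""
        return sum(bi ^ di for bi, di in zip(b, d))

    print(weight([1, 0, 1, 1]))                   # 3
    print(d_hamming([1, 0, 1, 1], [1, 1, 1, 0]))  # 2
    print(d_hamming([1, 0, 1, 0], [1, 1, 1, 0]))  # 1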
Fundamentals of error control
A given encoder produces n output bits for every k input bits:
  R = k/n < 1 is the rate of the code.
The information rate decreases by a factor R through the use of a code:
  R'b = R·Rb < Rb (bit/s)
Moreover, if used jointly with a modulation of spectral efficiency η = Rb/B (bit/s/Hz), the efficiency decreases by the same factor:
  η' = R·η < η (bit/s/Hz)
In terms of Pb, the achievable Eb/N0 region in AWGN is lower bounded by:
  Eb/N0 (dB) ≥ 10·log10( (2^η' − 1) / η' )
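A quick numeric sketch of this Shannon limit (assuming ideal coding in AWGN; the values are only illustrative):

    import math

    def ebn0_limit_db(eta):
        """Minimum Eb/N0 (dB) for reliable transmission at spectral efficiency eta (bit/s/Hz)."""
        return 10 * math.log10((2 ** eta - 1) / eta)

    print(round(ebn0_limit_db(1.0), 2))   # 0.0  dB at eta' = 1 bit/s/Hz
    print(round(ebn0_limit_db(0.5), 2))   # -0.82 dB at eta' = 0.5 bit/s/Hz (e.g. a rate-1/2 code)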
Fundamentals of error control
Achievable rates, capacity and limits.

Source: http://www.comtechefdata.com/technologies/fec/ldpc
Fundamentals of error control
How a channel code can improve Pb (BER in statistical terms).

Cost: loss in resources (spectral efficiency, power).


Linear block codes
An (n,k) linear block code (LBC) is a subspace C(n,k) ⊆ GF(2)^n with dim(C(n,k)) = k.
C(n,k) contains 2^k vectors c = (c1 ... cn).
R = k/n is the rate of the LBC.
n−k is the redundancy of the LBC:
  we would only need vectors with k components to specify the same amount of information.
Linear block codes
Recall vector space theory:
  A basis for C(n,k) has k vectors over GF(2)^n.
  C(n,k) is orthogonal to an (n−k)-dimensional subspace of GF(2)^n (its null subspace).
Any c ∈ C(n,k) can be specified both as:
  c = b1·g1 + ... + bk·gk, where {gj}, j=1,...,k, is the basis and (b1...bk) are its coordinates over it;
  the c such that the scalar products c·hi^T are null, where {hi}, i=1,...,n−k, is a basis of the null subspace.
Linear block codes
Arranging things in matrix form, an LBC C(n,k) can be specified by
  G = {gij}, i=1,...,k, j=1,...,n, with c = b·G, b ∈ GF(2)^k.
  H = {hij}, i=1,...,n−k, j=1,...,n, with c·H^T = 0.
G is a k×n generator matrix of the LBC C(n,k).
H is an (n−k)×n parity-check matrix of the LBC C(n,k).
In another approach, it can be shown that the rows of H stand for linearly independent parity-check equations.
The row rank of H for an LBC should be n−k.
Note that each row gj of G belongs to C(n,k), and so G·H^T = 0.
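A minimal sketch of the encoding c = b·G over GF(2), using numpy; the generator matrix below is a common systematic form of the Hamming(7,4) code, chosen only for illustration:

    import numpy as np

    G = np.array([[1, 0, 0, 0, 1, 1, 0],
                  [0, 1, 0, 0, 1, 0, 1],
                  [0, 0, 1, 0, 0, 1, 1],
                  [0, 0, 0, 1, 1, 1, 1]])

    def encode(b, G):
        """Codeword c = b·G, with all arithmetic taken modulo 2."""
        return np.mod(b @ G, 2)

    print(encode(np.array([1, 0, 1, 1]), G))  # [1 0 1 1 0 1 0]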
Linear block codes
The encoder is given by G.
Note that a number of different G's generate the same LBC.
[Block diagram: b = (b1...bk) → LBC encoder (G) → c = (c1...cn) = b·G]
For any input information block of length k, it yields a codeword of length n.
An encoder is systematic if b is contained in c = (b1...bk | ck+1...cn), so that ck+1...cn are the n−k parity bits.
Systematicity is a property of the encoder, not of the LBC C(n,k) itself.
GS = [Ik | P] is a systematic generator matrix.
Linear block codes
How to obtain G from H, or H from G:
  G rows are k linearly independent vectors over GF(2)^n.
  H rows are n−k linearly independent vectors over GF(2)^n.
  They are related through G·H^T = 0 (a).
(a) does not yield a sufficient set of equations, given H or G.
  A number of vector sets comply with it (basis sets are not unique).
Given G, put it in systematic form by combining rows (the code will be the same, but the encoding does change).
  If GS = [Ik | P], then HS = [P^T | In−k] complies with (a).
Conversely, given H, put it in systematic form by combining rows.
  If HS = [In−k | P], then GS = [P^T | Ik] complies with (a).
The parity check submatrix P can be on the left or on the right side (but on opposite sides of H and G simultaneously for a given LBC).
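A sketch of the G → H conversion just described, for the systematic Hamming(7,4) G used above (numpy; the helper name is ours):

    import numpy as np

    def parity_check_from_systematic(G):
        """Given G = [I_k | P] over GF(2), return H = [P^T | I_{n-k}]."""
        k, n = G.shape
        P = G[:, k:]                                      # k x (n-k) parity submatrix
        return np.hstack([P.T, np.eye(n - k, dtype=int)]) % 2

    G = np.array([[1, 0, 0, 0, 1, 1, 0],
                  [0, 1, 0, 0, 1, 0, 1],
                  [0, 0, 1, 0, 0, 1, 1],
                  [0, 0, 0, 1, 1, 1, 1]])
    H = parity_check_from_systematic(G)
    print(np.mod(G @ H.T, 2).any())  # False: G·H^T = 0, as required by (a)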
Linear block codes
Note that, by taking only 2^k vectors out of the 2^n available, we are pulling the binary words apart.
The minimum Hamming distance between input words is
  dmin( GF(2)^k ) = min{ dH(bi, bj) | bi, bj ∈ GF(2)^k, bi ≠ bj } = 1
Recall that we have added n−k redundancy bits, so that
  dmin( C(n,k) ) = min{ dH(ci, cj) | ci, cj ∈ C(n,k), ci ≠ cj } > 1
  dmin( C(n,k) ) ≤ n−k+1
Linear block codes
The channel model corresponds to a BSC (binary symmetric channel):
[Block diagram: c = (c1...cn) → modulator → AWGN channel → hard demodulator → r = (r1...rn); the cascade behaves as a BSC(p)]
[BSC(p) transition diagram: 0→0 and 1→1 with probability 1−p; 0→1 and 1→0 with probability p]
p = P(ci ≠ ri) is the bit error probability of the modulation in AWGN.
Linear block codes
The received word is r = c + e, where P(ei=1) = p.
  e is the error vector introduced by the noisy channel.
  w(e) is the number of errors in r wrt the original word c.
  P(e) = p^w(e)·(1−p)^(n−w(e)) for a given error pattern e, because the channel is memoryless.
At the receiver side, we can compute the syndrome s = (s1...sn−k) as
  s = r·H^T = (c+e)·H^T = c·H^T + e·H^T = e·H^T
[Block diagram: r = (r1...rn) → channel decoder (H) → s = (s1...sn−k) = r·H^T]
r ∈ C(n,k) ⟺ s = 0.
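A minimal syndrome check s = r·H^T (numpy), reusing the Hamming(7,4) H in the form [P^T | I_3] from the previous sketch:

    import numpy as np

    H = np.array([[1, 1, 0, 1, 1, 0, 0],
                  [1, 0, 1, 1, 0, 1, 0],
                  [0, 1, 1, 1, 0, 0, 1]])

    def syndrome(r, H):
        """s = r·H^T mod 2; an all-zero syndrome means r belongs to the code."""
        return np.mod(r @ H.T, 2)

    c = np.array([1, 0, 1, 1, 0, 1, 0])   # a valid codeword
    e = np.array([0, 0, 1, 0, 0, 0, 0])   # single-bit error pattern
    print(syndrome(c, H))                 # [0 0 0]
    print(syndrome((c + e) % 2, H))       # [0 1 1]: nonzero, equal to e·H^T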
Linear block codes
Two possibilities at the receiver side:
a) Error detection (ARQ schemes):
  If s ≠ 0, there are errors, so ask for a retransmission.
b) Error correction (FEC schemes):
  Decode an estimated ĉ ∈ C(n,k), so that dH(ĉ,r) is the minimum over all codewords in C(n,k) (closest-neighbor decoding).
  ĉ is the most probable word under the assumption that p is small (otherwise, the decoding fails).
[Figure: codewords as points in GF(2)^n; r1 = c + e1 is still closest to c and is decoded correctly (OK), while r2 = c + e2 falls closer to another codeword and is decoded erroneously]
Linear block codes
Detection and correction capabilities (worst case) of an LBC with dmin(C(n,k)):
a) It can detect error events e with binary weight up to
  w(e)|max,det = d = dmin(C(n,k)) − 1
b) It can correct error events e with binary weight up to
  w(e)|max,corr = t = ⌊( dmin(C(n,k)) − 1 )/2⌋
It is possible to implement a joint strategy:
  A dmin(C(n,k)) = 4 code can simultaneously correct all error patterns with w(e) = 1 and detect all error patterns with w(e) = 2.
Linear block codes
The minimum distance dmin(C(n,k)) is a property of the set of codewords in C(n,k), independent of the encoding (G).
As the code is linear, dH(ci,cj) = dH(ci+cj, cj+cj) = dH(ci+cj, 0), with ci, cj, ci+cj, 0 ∈ C(n,k).
  dmin(C(n,k)) = min{ w(c) | c ∈ C(n,k), c ≠ 0 }
i.e., it corresponds to the minimum word weight over all codewords different from the null one.
dmin(C(n,k)) can be calculated from H:
  It is the minimum number of different columns of H adding to 0.
  Equivalently, dmin − 1 is the largest number such that any dmin − 1 columns of H are linearly independent.
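A brute-force sketch of dmin as the minimum nonzero codeword weight (numpy); feasible only for small k, but enough to check toy codes like the Hamming(7,4) example used before:

    import itertools
    import numpy as np

    def dmin(G):
        k, n = G.shape
        best = n
        for bits in itertools.product([0, 1], repeat=k):
            if any(bits):
                c = np.mod(np.array(bits) @ G, 2)
                best = min(best, int(c.sum()))
        return best

    G = np.array([[1, 0, 0, 0, 1, 1, 0],
                  [0, 1, 0, 0, 1, 0, 1],
                  [0, 0, 1, 0, 0, 1, 1],
                  [0, 0, 0, 1, 1, 1, 1]])
    print(dmin(G))  # 3 -> detects up to 2 errors, corrects 1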
Linear block codes
Detection limits: what is the probability of undetected errors?
Note that an LBC contains 2^k codewords, and the received word corresponds to any of the 2^n possibilities in GF(2)^n.
An LBC detects up to 2^n − 2^k error patterns.
An undetected error occurs if r = c + e with e ≠ 0, e ∈ C(n,k).
In this case, r·H^T = 0.
  Pu(E) = Σ_{i=dmin}^{n} Ai · p^i · (1−p)^(n−i)
Ai is the number of codewords in C(n,k) with weight i: {Ai} is called the weight spectrum of the LBC.
Linear block codes
For correction, an LBC considers the syndrome s = r·H^T.
Assume correction capabilities up to w(e) = t, and let EC be the set of correctable error patterns.
A syndrome table associates a unique si, over the 2^(n−k) possibilities, to a unique error pattern ei ∈ EC with w(ei) ≤ t.
If si = r·H^T, decode ĉ = r + ei.
Given an encoder G, estimate the information vector b̂ such that b̂·G = ĉ.
If the number of correctable errors #(EC) < 2^(n−k), there are 2^(n−k) − #(EC) syndromes usable in detection, but not in correction.
At most, an LBC can correct 2^(n−k) error patterns.
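A sketch of syndrome-table decoding for a single-error-correcting code (numpy), again with the Hamming(7,4) H from before:

    import numpy as np

    H = np.array([[1, 1, 0, 1, 1, 0, 0],
                  [1, 0, 1, 1, 0, 1, 0],
                  [0, 1, 1, 1, 0, 0, 1]])
    n = H.shape[1]

    # Syndrome table: syndrome of each single-bit error pattern -> that pattern.
    table = {}
    for i in range(n):
        e = np.zeros(n, dtype=int); e[i] = 1
        table[tuple(np.mod(e @ H.T, 2))] = e

    def correct(r):
        """Return c_hat = r + e_hat (r itself if the syndrome is zero or not in the table)."""
        s = tuple(np.mod(r @ H.T, 2))
        e_hat = table.get(s, np.zeros(n, dtype=int))
        return np.mod(r + e_hat, 2)

    r = np.array([1, 0, 0, 1, 0, 1, 0])   # codeword [1 0 1 1 0 1 0] with one bit flipped
    print(correct(r))                     # [1 0 1 1 0 1 0]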
Linear block codes
A w(e) ≤ t error-correcting LBC has a probability of erroneous correction bounded by
  P(E) ≤ Σ_{i=t+1}^{n} (n choose i) · p^i · (1−p)^(n−i)
This is an upper bound, since not all the codewords are separated by just the minimum distance of the code.
Calculating the resulting P'b of an LBC is not an easy task, and it depends heavily on how the encoding is made through G.
LBCs are mainly used in detection tasks (ARQ).
Linear block codes
Observe that both coding & decoding can be performed with
low complexity hardware (combinational logic: gates).
Examples of LBC
Repetition codes
Single parity check codes
Hamming codes
Cyclic redundancy codes
Reed-Muller codes
Golay codes
Product codes
Interleaved codes
Some of them will be examined in the lab.
Linear block codes
An example of performance: RHam=4/7, RGol=1/2.

Convolutional codes
A binary convolutional code (CC) is another class of linear channel codes.
The encoding can be described in terms of a finite state machine (FSM).
A CC can eventually produce sequences of infinite length.
A CC encoder has memory. General structure:
[Block diagram: k input streams → backward logic (feedback, not mandatory) → MEMORY (ml bits for the l-th input) → forward logic (coded bits) → n output streams; a systematic output is not mandatory]
Convolutional codes
The memory is organized as a shift register.
  Number of positions for input l: memory ml.
  νl = ml is the constraint length of the l-th input/register.
The register effects step-by-step delays on the input: recall discrete LTI systems theory.
A CC encoder produces sequences, not just blocks of data.
  Sequence-based properties vs. block-based properties.
[Diagram: the l-th input stream, di^(l) at instant i, feeds a shift register with positions 1,...,ml holding d_{i−1}^(l), d_{i−2}^(l), ..., d_{i−ml}^(l); all positions feed the forward logic and, when present, the backward logic]
Convolutional codes
Both forward and backward logic are boolean logic.
Very easy: each operation adds up (XOR) a number of memory positions, from each of the k inputs.
The j-th output at instant i gathers inputs from all the k registers (the backward logic has the same structure):
  c_i^(j) = Σ_{l=1}^{k} Σ_{q=0}^{ml} g_{l,q}^(j) · d_{i−q}^(l)
g_{l,p}^(j), p = 0,...,ml, is 1 when the p-th register position for the l-th input is added to get the j-th output.
Convolutional codes
Parameters of a CC so far:
  k input streams
  n output streams
  k shift registers with length ml each, l = 1,...,k
  νl = ml is the constraint length of the l-th register
  m = maxl{νl} is the memory order of the code
  ν = ν1 + ... + νk is the overall constraint length of the code
A CC is denoted as (n,k,ν).
Its rate is R = k/n, where k and n usually take small values.
Convolutional codes
The backward / forward logic may be specified in the form of generator sequences.
These sequences are the impulse responses of each output j wrt each input l:
  g_l^(j) = ( g_{l,0}^(j), ..., g_{l,ml}^(j) )
Observe that:
  g_l^(j) = (1,0,...,0) connects the l-th input directly to the j-th output.
  g_l^(j) = (0,...,1 (q-th position),...,0) just delays the l-th input to the j-th output by q time steps.
Convolutional codes
Given the presence of the shift register, the generator sequences are better denoted as generator polynomials:
  g_l^(j) = ( g_{l,0}^(j), ..., g_{l,ml}^(j) )  ⇒  g_l^(j)(D) = Σ_{q=0}^{ml} g_{l,q}^(j) · D^q
We can then write
  g_l^(j) = (1,0,...,0)             ⇒ g_l^(j)(D) = 1
  g_l^(j) = (0,...,1 (q-th),...,0)  ⇒ g_l^(j)(D) = D^q
  g_l^(j) = (1,1,0,...,0)           ⇒ g_l^(j)(D) = 1 + D
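A sketch of a rate-1/2 feedforward encoder built from its generator sequences; the generators g^(1) = (1,1,1) and g^(2) = (1,0,1) (i.e. 1+D+D^2 and 1+D^2) are a common textbook example, used here only for illustration:

    def conv_encode(bits, g1=(1, 1, 1), g2=(1, 0, 1)):
        m = len(g1) - 1
        reg = [0] * m                       # register contents d_{i-1}, ..., d_{i-m}
        out = []
        for d in bits:
            window = [d] + reg              # d_i, d_{i-1}, ..., d_{i-m}
            out.append(sum(g * x for g, x in zip(g1, window)) % 2)
            out.append(sum(g * x for g, x in zip(g2, window)) % 2)
            reg = [d] + reg[:-1]            # shift the register
        return out

    print(conv_encode([1, 0, 1, 1]))  # [1, 1, 1, 0, 0, 0, 0, 1]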
Convolutional codes
As all the operations involved are linear, a binary CC is linear, and the sequences produced constitute the CC codewords.
A feedforward CC (without backward logic - feedback) can be denoted in matrix form as
  G(D) = [ g_1^(1)(D)  g_1^(2)(D)  ...  g_1^(n)(D)
           g_2^(1)(D)  g_2^(2)(D)  ...  g_2^(n)(D)
           ...
           g_k^(1)(D)  g_k^(2)(D)  ...  g_k^(n)(D) ]
Convolutional codes
If each input has a feedback logic given as
  g_l^(0)(D) = Σ_{q=0}^{ml} g_{l,q}^(0) · D^q
the code is denoted as
  G(D) = [ g_1^(1)(D)/g_1^(0)(D)  g_1^(2)(D)/g_1^(0)(D)  ...  g_1^(n)(D)/g_1^(0)(D)
           g_2^(1)(D)/g_2^(0)(D)  g_2^(2)(D)/g_2^(0)(D)  ...  g_2^(n)(D)/g_2^(0)(D)
           ...
           g_k^(1)(D)/g_k^(0)(D)  g_k^(2)(D)/g_k^(0)(D)  ...  g_k^(n)(D)/g_k^(0)(D) ]
Convolutional codes
We can generalize the concept of parity-check matrix to H(D).
An (n,k,ν) CC is fully specified by G(D) or H(D).
Based on the matrix description, there is a good deal of linear tools for the design, analysis and evaluation of a given CC.
A regular CC can be described as a (canonical) all-feedforward CC or through an equivalent feedback (recursive) CC.
  Note that a recursive CC is related to an IIR filter.
Even though k and n may be very small, a CC has a very rich algebraic structure.
  This is closely related to the constraint length of the CC.
  Each output bit is related to the present and past inputs via powerful algebraic methods.
Convolutional codes
Given G(D), a CC can be classified as:
  Systematic and feedforward.
  Systematic and recursive (RSC).
  Non-systematic and feedforward (NSC).
  Non-systematic and recursive.
RSC is a popular class of CC, because it provides an infinite output for a finite-weight input (IIR behavior).
Each NSC can be converted straightforwardly into an RSC with similar error correcting properties.
CC encoders are easy to implement with standard hardware: shift registers + combinational logic.
Convolutional codes
We do not need to look into the algebraic details of G(D) and H(D) to study:
  Coding
  Decoding
  Error correcting capabilities
A CC encoder is an FSM!
  The memory positions store a content (among 2^ν possible ones) at instant i−1: the coder is said to be at state s(i−1).
  k input bits determine the shifting of the registers.
  The memory positions store a new content at instant i: the coder is said to be at state s(i).
  And we get n related output bits.
Convolutional codes
The finite-state behavior of the CC can be captured by the concept of trellis.
For any starting state, we have 2^k possible edges leading to a corresponding set of ending states.
[Diagram: one trellis section connecting the starting states ss = s(i−1), s = 1,...,2^ν, to the ending states se = s(i), e = 1,...,2^ν; each edge is labeled with its input bi = (bi,1...bi,k) and its output ci = (ci,1...ci,n)]
Convolutional codes
The trellis illustrates the encoding process on 2 axes:
  X-axis: time / Y-axis: states
[Trellis diagram for a (2,1,3) CC: 8 states s1,...,s8 over instants i−1, i, i+1; one edge type per input bit (0 or 1), each edge labeled with its 2 output bits. Memory: the same input can produce different outputs, depending on the state.]
For a finite-size input data sequence, a CC can be forced to finish at a known state (often 0) by adding terminating (dummy) bits.
Note that one section (e.g. i−1 → i) fully specifies the CC.
Convolutional codes
The trellis description allows us
  To build the encoder
  To build the decoder
  To get the properties of the code
The encoder:
[Diagram: the k input bits bi = (bi,1...bi,k) enter the clocked registers (state ss = s(i−1) → se = s(i)); the combinational logic, given by G(D), produces the n output bits ci = (ci,1...ci,n)]
Convolutional codes
The decoder is far more complicated:
  Long sequences.
  Memory: dependence on past states.
In fact, CC were already well known before there existed a practical, good method to decode them: the Viterbi algorithm.
It is a Maximum Likelihood Sequence Estimation (MLSE) algorithm with many applications.
Problem: for a length N >> n sequence at the receiver side,
  There are 2^ν · 2^(Nk/n) paths through the trellis to match against the received data.
  Even if the coder starting state is known (often 0), there are still 2^(Nk/n) paths to walk through in a brute force approach.
Convolutional codes
Viterbi algorithm setup. Key facts:
  The encoding corresponds to a Markov chain model: P(s(i)) = P(s(i)|s(i−1))·P(s(i−1)).
  The total likelihood P(r|b) can be factorized as a product of probabilities.
  Given the transition s(i−1) → s(i), P(ri|s(i),s(i−1)) depends only on the channel kind (AWGN, BSC...).
  The transition from s(i−1) to s(i) (linked in the trellis) depends on the probability of bi: P(s(i)|s(i−1)) = 2^−k if the source is iid.
  P(s(i)|s(i−1)) = 0 if the states are not linked in the trellis (finite state machine: deterministic).
[Trellis diagram: each edge at instant i carries input bi and output ci(s(i−1),bi), to be matched against the received data ri]
Convolutional codes
The total likelihood can be recursively calculated as:
  P(r|b) = P(s(0)) · Π_{i=1}^{N/n} P(ri | s(i), s(i−1)) · P(s(i) | s(i−1))
In the BSC(p), the observation metric would be:
  P(ri | s(i), s(i−1)) = P(ri | ci) = p^w(ri+ci) · (1−p)^(n−w(ri+ci))
Maximum likelihood (ML) criterion:
  b̂ = argmax_b { P(r|b) }
Convolutional codes
We know that the brute force approach to the ML criterion is at least O(2^(Nk/n)).
The Viterbi algorithm works recursively from 1 to N/n, on the basis that
  Many paths can be pruned out (transition probability = 0).
  During the forward recursion, we only keep the paths with highest probability: a path probability drops quickly towards 0 as soon as one of its metric terms (a transition probability) is very small.
  When the recursion reaches i = N/n, the surviving path guarantees the ML criterion (optimal for ML sequence estimation!).
The Viterbi algorithm complexity goes down to the order of N·2^ν: linear in the sequence length, instead of exponential.
Convolutional codes
The algorithm's recursive rule is
  V_j^(0) = P(s(0) = sj);
  V_j^(i) = P(ri | s(i) = sj, s(i−1) = smax) · max_{sl} { P(s(i) = sj | s(i−1) = sl) · V_l^(i−1) }
where smax is the state sl achieving the maximum, and V_l^(i−1) is the probability of the most probable state sequence accounting for the i−1 previous observations.
{V_j^(i)}, together with the stored maximizing transitions, keeps track of the most probable state sequence wrt the observation r.
[Trellis diagram: at each step, every state keeps only the incoming branch with maximum accumulated metric (MAX)]
Note that we may better work with logs: products become additions, and the criterion remains the same.
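A compact hard-decision Viterbi sketch for the 4-state, rate-1/2 code used in the earlier encoder example (Hamming distance as branch metric, known all-zero starting state; this is an illustration, not an optimized decoder):

    def viterbi_decode(rx, g1=(1, 1, 1), g2=(1, 0, 1)):
        m = len(g1) - 1
        n_states = 2 ** m
        def step(state, bit):                            # one trellis edge: (next_state, outputs)
            reg = [(state >> j) & 1 for j in range(m)]   # d_{i-1}, ..., d_{i-m}
            w = [bit] + reg
            c1 = sum(g * x for g, x in zip(g1, w)) % 2
            c2 = sum(g * x for g, x in zip(g2, w)) % 2
            return ((state << 1) | bit) & (n_states - 1), (c1, c2)
        INF = float("inf")
        metric = [0] + [INF] * (n_states - 1)            # start in the all-zero state
        paths = [[] for _ in range(n_states)]
        for i in range(0, len(rx), 2):
            r = rx[i:i + 2]
            new_metric = [INF] * n_states
            new_paths = [None] * n_states
            for s in range(n_states):
                if metric[s] == INF:
                    continue
                for bit in (0, 1):
                    nxt, c = step(s, bit)
                    d = metric[s] + (c[0] != r[0]) + (c[1] != r[1])
                    if d < new_metric[nxt]:              # keep only the best path into each state
                        new_metric[nxt] = d
                        new_paths[nxt] = paths[s] + [bit]
            metric, paths = new_metric, new_paths
        best = min(range(n_states), key=lambda s: metric[s])
        return paths[best]

    # conv_encode([1, 0, 1, 1]) = [1,1,1,0,0,0,0,1]; here one bit has been flipped by the channel:
    print(viterbi_decode([1, 1, 1, 1, 0, 0, 0, 1]))  # [1, 0, 1, 1]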
Convolutional codes
Note that we have considered the algorithm when the
demodulator yields hard outputs
ri is a vector of n estimated bits (BSC(p) equivalent channel).
In AWGN, we can do better to decode a CC
We can provide soft (probabilistic) estimations for the
observation metric.
For an iid source, we can easily get an observation transition
metric based on the probability of each bi,l=0,1, l=1,...,k,
associated to a possible transition.
There is a gain of around 2 dB in Eb/N0.
LBC decoders can also accept soft inputs (non syndrome-based
decoders).
We will examine an example of soft decoding of CC in the lab.
Convolutional codes
We are now familiar with the encoder and the decoder
Encoder: FSM (registers, combinational logic).
Decoder: Viterbi algorithm (for practical reasons,
suboptimal adaptations are usually employed).

But what about performance?

First...
CC are mainly intended for FEC, not for ARQ schemes.
In a long sequence (=CC codeword), the probability of
having at least one error is very high...
And... are we going to retransmit the whole sequence?
Convolutional codes
Given that we truncate the sequence to N bits and the CC is linear,
  We may analyze the system as an equivalent (N, N·k/n) LBC.
  But... the equivalent matrices G and H would not be practical.
Remember the FSM: we can locate error loops in the trellis.
[Trellis diagram: the path for b and the path for b+e diverge at instant i and merge again at instant i+3: an error loop]
Convolutional codes
The same error loop may occur irrespective of s(i−1) and b.
[Trellis diagram: the same loop appears between two b+e paths starting from a different state, again between instants i and i+3]
Convolutional codes
Examining the minimal-length loops, and taking into account this uniform error property, we can get the dmin of a CC.
For a CC forced to end at the 0 state for a finite input data sequence, dmin is called dfree.
We can extract a lot of information by building an encoder state diagram: error loops, codeword weight spectrum...
[Figure: state diagram of a (2,1,3) CC, from Lin & Costello (2004)]
Convolutional codes
With a fair amount of algebra, related to the FSM, modified encoder state diagrams and so on, it is possible to get an upper bound for optimal MLSE decoding (BPSK in AWGN):
  Pb ≤ Σ_d Bd · erfc( √( d·R·Eb/N0 ) )
Bd is the total number of nonzero information bits associated with CC codewords of weight d, divided by the number of information bits k per unit time... A lot of algebra behind...
There are easier, suboptimal ways to decode a CC, and performance will vary accordingly.
A CC may be punctured to reach rates higher than R = k/n: a performance-rate trade-off.
Convolutional codes
Performance examples with BPSK using ML bounds.

Turbo codes
Canonically, turbo codes (TC) are parallel concatenated convolutional codes (PCCC).
[Block diagram: the k input streams b feed CC1 directly and CC2 through a block marked "?" (we will see this is a key element...); the n = n1+n2 output streams form c = c1c2, with rate R = k/(n1+n2)]
Coding concatenation has been known and employed for decades, but TC added a joint, efficient decoding.
An example of concatenated coding with independent decoding is the use of hybrid ARQ + FEC strategies (CRC + CC).
Turbo codes
We have seen that standard CC decoding with the Viterbi algorithm relies on the MLSE criterion.
  This is optimal when the binary data at the CC input is iid.
For CC, we also have decoders that provide probabilistic (soft) outputs.
  They convert a priori soft values + channel output soft estimations into updated a posteriori soft values.
  They are optimal from the maximum a posteriori (MAP) criterion point of view.
  They are called soft input-soft output (SISO) decoders.
Turbo codes
What's in a SISO?
[Block diagram: a SISO (for a CC) takes the soft demodulated values r from the channel and a priori probabilities (APR), e.g. P(bi=b) = 1/2, and produces a posteriori probabilities (APP) P(bi=b|r), updated with the channel information; the figure shows the probability density function of bi before and after]
Note that the SISO works on a bit-by-bit basis, but produces a sequence of APP's.
Turbo codes
The algorithm inside the SISO is some suboptimal version of the MAP BCJR algorithm.
BCJR computes the APP values through a forward-backward dynamics: it works over finite-length data blocks, not over (potentially) infinite-length sequences (like pure CC's).
BCJR works on a trellis: recall transition metrics, transition probabilities and so on.
Assume the block length is N: the trellis starts at s(0) and ends at s(N).
  αi(j) = P( s(i) = sj, r1,...,ri )                 FORWARD term
  βi(j) = P( ri+1,...,rN | s(i) = sj )              BACKWARD term
  γi(j,k) = P( ri, s(i) = sj | s(i−1) = sk )        TRANSITION term
(Remember: ri has n components for an (n,k,ν) CC.)
Turbo codes
BCJR algorithm in action:
Forward step, i = 1,...,N:
  α0(j) = P( s(0) = sj );   αi(j) = Σ_{k=1}^{2^ν} αi−1(k) · γi(j,k)
Backward step, i = N−1,...,0:
  βN(j) = P( s(N) = sj );   βi(j) = Σ_{k=1}^{2^ν} βi+1(k) · γi+1(k,j)
Compute the joint probability sequence, i = 1,...,N:
  P( s(i−1) = sk, s(i) = sj, r ) = αi−1(k) · γi(j,k) · βi(j)
Turbo codes
Finally, the APP's can be calculated as:
  P( bi = b | r ) = (1/p(r)) · Σ_{(s(i−1), s(i)) : bi = b} P( s(i−1) = sj, s(i) = sk, r )
Decision criterion based on these APP's:
  log[ P(bi=1|r) / P(bi=0|r) ] = log[ Σ_{(s(i−1),s(i)) : bi=1} P(s(i−1)=sj, s(i)=sk, r) / Σ_{(s(i−1),s(i)) : bi=0} P(s(i−1)=sj, s(i)=sk, r) ]
  > 0 ⇒ b̂i = 1;   < 0 ⇒ b̂i = 0
Its modulus is the reliability of the decision.
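A compact BCJR sketch for the same 4-state, rate-1/2 code used before, over a BSC(p); we assume the trellis starts at state 0 and leave the ending state free (uniform β at i = N). It returns the log-APP ratios of the information bits:

    import math

    def bcjr_llr(rx, p=0.1, g1=(1, 1, 1), g2=(1, 0, 1)):
        m = len(g1) - 1
        S = 2 ** m
        def step(state, bit):                            # (next_state, (c1, c2)) of a trellis edge
            reg = [(state >> j) & 1 for j in range(m)]
            w = [bit] + reg
            c = (sum(a * b for a, b in zip(g1, w)) % 2, sum(a * b for a, b in zip(g2, w)) % 2)
            return ((state << 1) | bit) & (S - 1), c
        N = len(rx) // 2
        # Transition metrics gamma_i = P(r_i | c) * P(b) for every valid edge (k -> j, input bit)
        gamma = []
        for i in range(N):
            r = rx[2 * i:2 * i + 2]
            g = {}
            for k in range(S):
                for bit in (0, 1):
                    j, c = step(k, bit)
                    d = (c[0] != r[0]) + (c[1] != r[1])
                    g[(k, j, bit)] = (p ** d) * ((1 - p) ** (2 - d)) * 0.5
            gamma.append(g)
        # Forward (alpha) and backward (beta) recursions
        alpha = [[0.0] * S for _ in range(N + 1)]
        alpha[0][0] = 1.0
        for i in range(1, N + 1):
            for (k, j, bit), gv in gamma[i - 1].items():
                alpha[i][j] += alpha[i - 1][k] * gv
        beta = [[1.0 / S] * S if i == N else [0.0] * S for i in range(N + 1)]
        for i in range(N - 1, -1, -1):
            for (k, j, bit), gv in gamma[i].items():
                beta[i][k] += gv * beta[i + 1][j]
        # APP of each information bit, reported as log( P(b_i=1|r) / P(b_i=0|r) )
        llrs = []
        for i in range(1, N + 1):
            num = {0: 0.0, 1: 0.0}
            for (k, j, bit), gv in gamma[i - 1].items():
                num[bit] += alpha[i - 1][k] * gv * beta[i][j]
            llrs.append(math.log(num[1] / num[0]))
        return llrs

    rx = [1, 1, 1, 1, 0, 0, 0, 1]                     # encoded [1,0,1,1] with one bit flipped
    print([1 if L > 0 else 0 for L in bcjr_llr(rx)])  # [1, 0, 1, 1]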
Turbo codes
How do we get γi(j,k)? This probability takes into account
  The restrictions of the trellis (CC).
  The estimations from the channel.
  γi(j,k) = P( ri, s(i) = sj | s(i−1) = sk )
          = p( ri | s(i) = sj, s(i−1) = sk ) · P( s(i) = sj | s(i−1) = sk )
The second factor is 0 if the transition is not possible, and 1/2 if the transition is possible (binary trellis).
In AWGN, the first factor is
  p( ri | s(i) = sj, s(i−1) = sk ) = (2πσ²)^(−n/2) · exp( −Σ_{l=1}^{n} (ri,l − ci,l)² / (2σ²) )
for unmodulated ci,l.
Turbo codes
Idea: what about feeding the APP values as APR values to another decoder whose coder had the same inputs?
[Block diagram: the APP's P(bi=b|r1) produced by the SISO for CC1 enter the SISO for CC2 as a priori values, together with the channel observations r2, yielding updated APP's P(bi=b|r2)]
This will happen (i.e., the new APP's will improve) under some conditions...
Turbo codes
The APP's from the first SISO, used as APR's for the second SISO, increase the reliability of the updated APP's iff
  the APR's are uncorrelated wrt the channel estimations for the second decoder.
This is achieved by permuting the input data for each encoder.
[Block diagram: the k input streams b feed CC1 directly, and CC2 through an INTERLEAVER (permutor) producing d; the n = n1+n2 output streams form c = c1c2, with rate R = k/(n1+n2)]
Turbo codes
The interleaver π preserves the data (b), but changes its position within the second stream (d).
Note that this compels the TC to work with blocks of N = size(π) bits.
The decoder has to know the specific interleaver used at the encoder.
[Figure: the block b1 b2 b3 b4 ... bN is written into the second stream in permuted order, so that d_π(i) = b_i]
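A minimal interleaver sketch: a fixed pseudo-random permutation π of size N, with d[π(i)] = b[i] at the encoder and the inverse mapping at the decoder (plain Python; names are ours):

    import random

    def make_interleaver(N, seed=1):
        pi = list(range(N))
        random.Random(seed).shuffle(pi)     # fixed, known to both encoder and decoder
        return pi

    def interleave(b, pi):
        d = [0] * len(pi)
        for i, target in enumerate(pi):
            d[target] = b[i]                # d_pi(i) = b_i
        return d

    def deinterleave(d, pi):
        return [d[target] for target in pi]

    pi = make_interleaver(8)
    b = [1, 0, 1, 1, 0, 0, 1, 0]
    print(deinterleave(interleave(b, pi), pi) == b)  # True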
Turbo codes
The mentioned process is applied iteratively (l = 1,...).
Iterative decoder ⇒ this may be a drawback, since it adds latency (delay).
[Block diagram: r1 and r2 come from the channel; SISO 1 produces APP1(l), which goes through the interleaver π to become APR2(l) for SISO 2; SISO 2 produces APP2(l), which goes back through the deinterleaver π^−1 to become APR1(l+1); the initial APR1(l=0) is taken with P(bi=b) = 1/2]
Note the feedback connection: it is the same principle as in turbo engines (that's why they are called turbo!).
Turbo codes
When the interleaver is adequately chosen and the CC's employed are RSC, the typical BER behavior is:
[Figure: BER vs Eb/N0 curve of a TC]
Note the two distinct zones: waterfall region / error floor.
Turbo codes
The location of the waterfall region can be analyzed by the so-called density evolution method
  Based on the exchange of mutual information between SISO's.
The error floor can be lower bounded by means of the minimum Hamming distance of the TC
  Contrary to CC's, a TC relies on reducing multiplicities rather than just trying to increase the minimum distance.
  Pb|floor > ( wmin·Mmin / N ) · erfc( √( dmin·R·Eb/N0 ) )
where wmin is the Hamming weight of the information error associated with the minimum distance, Mmin is the error multiplicity (a low value!!), and the 1/N factor is the interleaver gain (only if the CC's are recursive!!).
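A quick numeric sketch of this floor bound; all parameter values below are hypothetical, chosen only to show the orders of magnitude involved:

    import math

    def error_floor(w_min, M_min, N, d_min, R, ebn0_db):
        ebn0 = 10 ** (ebn0_db / 10)
        return (w_min * M_min / N) * math.erfc(math.sqrt(d_min * R * ebn0))

    # e.g. a hypothetical TC with d_min = 10, w_min = 2, M_min = 1, N = 1024, R = 1/3, at 2 dB:
    print(error_floor(2, 1, 1024, 10, 1 / 3, 2.0))  # about 2e-6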
Turbo codes
Examples of 3G TC. Note that TC's are intended for FEC...

Low Density Parity Check Codes
LDPC codes are just another kind of channel codes derived from less complex ones.
  While TC's were initially an extension of CC systems, LDPC codes are an extension of the concept of binary LBC, but they are not exactly the LBC we already know.
Formally, an LDPC code is an LBC whose parity check matrix is large and sparse.
  Almost all matrix elements are 0!
  Very often, the LDPC parity check matrices are randomly generated, subject to some constraints on sparsity...
  Recall that classical LBC relied on extremely powerful algebra, related to carefully and well chosen matrix structures.
Low Density Parity Check Codes
Formally, a (ρ,γ)-regular LDPC code is defined as the null space of a J×n parity check matrix H that meets these constraints:
  a) Each row contains ρ 1's.
  b) Each column contains γ 1's.
  c) λ, the number of 1's in common between any two columns, is 0 or 1.
  d) ρ and γ are small compared with n and J.
These properties give the name to this class of codes: their matrices have a low density of 1's.
The density r of H is defined as r = ρ/n = γ/J.
Low Density Parity Check Codes
Example of a (4,3)-regular LDPC parity check matrix (J = 15, n = 20):

      [ 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 ]
      [ 0 0 0 0 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 ]
      [ 0 0 0 0 0 0 0 0 1 1 1 1 0 0 0 0 0 0 0 0 ]
      [ 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 0 0 0 0 ]
      [ 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 ]
      [ 1 0 0 0 1 0 0 0 1 0 0 0 1 0 0 0 0 0 0 0 ]
      [ 0 1 0 0 0 1 0 0 0 1 0 0 0 0 0 0 1 0 0 0 ]
  H = [ 0 0 1 0 0 0 1 0 0 0 0 0 0 1 0 0 0 1 0 0 ]
      [ 0 0 0 1 0 0 0 0 0 0 1 0 0 0 1 0 0 0 1 0 ]
      [ 0 0 0 0 0 0 0 1 0 0 0 1 0 0 0 1 0 0 0 1 ]
      [ 1 0 0 0 0 1 0 0 0 0 0 1 0 0 0 0 0 1 0 0 ]
      [ 0 1 0 0 0 0 1 0 0 0 1 0 0 0 0 1 0 0 0 0 ]
      [ 0 0 1 0 0 0 0 1 0 0 0 0 1 0 0 0 0 0 1 0 ]
      [ 0 0 0 1 0 0 0 0 1 0 0 0 0 1 0 0 1 0 0 0 ]
      [ 0 0 0 0 1 0 0 0 0 1 0 0 0 0 1 0 0 0 0 1 ]

This 15×20 H defines a (20,7) LBC!!!
Sparse: density r = 4/20 = 3/15 = 0.2, and λ = 0 or 1.
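A sketch that rebuilds this H from the positions of its 1's and checks the regularity, the density and the GF(2) row rank (numpy):

    import numpy as np

    rows = [(1, 2, 3, 4), (5, 6, 7, 8), (9, 10, 11, 12), (13, 14, 15, 16), (17, 18, 19, 20),
            (1, 5, 9, 13), (2, 6, 10, 17), (3, 7, 14, 18), (4, 11, 15, 19), (8, 12, 16, 20),
            (1, 6, 12, 18), (2, 7, 11, 16), (3, 8, 13, 19), (4, 9, 14, 17), (5, 10, 15, 20)]
    J, n = len(rows), 20
    H = np.zeros((J, n), dtype=int)
    for i, ones in enumerate(rows):
        H[i, [c - 1 for c in ones]] = 1

    print(set(H.sum(axis=1)), set(H.sum(axis=0)))  # {4} {3}: rho = 4 per row, gamma = 3 per column
    print(H.sum() / H.size)                        # density r = 0.2

    def gf2_rank(M):
        """Row rank over GF(2) by Gaussian elimination."""
        M = M.copy() % 2
        rank = 0
        for col in range(M.shape[1]):
            pivot = next((r for r in range(rank, M.shape[0]) if M[r, col]), None)
            if pivot is None:
                continue
            M[[rank, pivot]] = M[[pivot, rank]]
            for r in range(M.shape[0]):
                if r != rank and M[r, col]:
                    M[r] ^= M[rank]
            rank += 1
        return rank

    print(gf2_rank(H))  # 13 -> n - k = 13, i.e. this H defines a (20,7) LBC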
Low Density Parity Check Codes
Note that the J rows of H are not necessarily linearly independent over GF(2).
To determine the dimension k of the code, it is mandatory to find the row rank of H, which equals n−k (here smaller than J).
That's the reason why, in the previous example, H defined a (20,7) LBC instead of the (20,5) LBC that could be expected!
The construction of large H's for LDPC codes with high rates and good properties is a complex subject.
  Some methods rely on smaller Hi used as building blocks, plus random permutations or combinatorial manipulations; resulting matrices with bad properties are discarded.
  Other methods rely on finite geometries and a lot of algebra.
Low Density Parity Check Codes
LDPC codes yield performances equal to or even better than TC's, but without the problem of their relatively high error floor.
  Both LDPC codes and TC's are capacity-approaching codes.
As in the case of TC, their interest is in part related to the fact that
  The encoding can be easily done (even when H or G are large, the low density of 1's reduces the complexity of the encoder).
  At the decoder side, there are powerful algorithms that can take full advantage of the properties of the LDPC code.
Low Density Parity Check Codes
There are several algorithms to decode LDPC codes.
Hard decoding.
Soft decoding.
Mixed approaches.

We are going to examine two important instances thereof:


Majority-logic (MLG) decoding; hard decoding, the simplest
one (lowest complexity).
Sum-product algorithm (SPA); soft decoding, best error
performance (but high complexity!).

Key concepts: Tanner graphs & belief propagation.


Low Density Parity Check Codes
MLG decoding: hard decoding; r = c + e is the received word.
The simplest instance of MLG decoding is the decoding of a repetition code by the rule "choose 0 if 0's are dominant, 1 otherwise".
Given a (ρ,γ)-regular LDPC code, for every bit position i = 1,...,n, there is a set of γ rows
  Ai = { h1^(i), ..., hγ^(i) }
that have a 1 in position i, and do not have any other common 1 position among them...
Low Density Parity Check Codes
We can form the set of syndrome equations
  Si = { sj^(i) = r·hj^(i)T = e·hj^(i)T, hj^(i) ∈ Ai }, j = 1,...,γ
Si gives a set of γ checksums orthogonal on ei.
ei is decoded as 1 if the majority of the checksums give 1; 0 in the opposite case.
Repeating this for all i, we estimate ê, and ĉ = r + ê.
Correct decoding of ei is guaranteed if there are fewer than γ/2 errors in e.
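A one-step majority-logic decoding sketch under these rules (numpy), assuming H is available as a dense 0/1 array, e.g. the 15×20 matrix built in the previous sketch:

    import numpy as np

    def mlg_decode(r, H):
        """Flip bit i when the majority of its gamma orthogonal checksums equal 1."""
        e_hat = np.zeros_like(r)
        for i in range(H.shape[1]):
            checks = H[H[:, i] == 1]          # the rows in A_i (all of them check position i)
            sums = np.mod(checks @ r, 2)      # checksum values s_j^(i)
            if sums.sum() > len(sums) / 2:    # majority of the checksums are 1 -> e_i = 1
                e_hat[i] = 1
        return np.mod(r + e_hat, 2)

    # Usage (with H and a codeword c of that code): mlg_decode(np.mod(c + e, 2), H)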
Low Density Parity Check Codes
Tanner graphs. Example for a (7,3) LBC.
[Bipartite graph: variable nodes (code-bit vertices) c1,...,c7 on top, check nodes (check-sum vertices) s1,...,s7 at the bottom, with edges joining each code bit to the check sums it participates in]
It is a bipartite graph with interesting properties for decoding.
A variable node is connected to a check node iff the corresponding code bit is checked by the corresponding parity-sum equation.
The absence of short loops is necessary for iterative decoding.
Low Density Parity Check Codes
Based on the Tanner graph of an LDPC code, it is possible to make iterative soft decoding (SPA).
SPA is performed by belief propagation (which is an instance of a message passing algorithm).
[Tanner graph: messages (soft values) are passed to and from related variable and check nodes; N(ci) denotes the check-node neighbors of a variable node ci, and N(sj) the variable-node neighbors of a check node sj]
This process, applied iteratively and under some rules, yields the soft values P(ci | r).
Low Density Parity Check Codes
If we get P(ci | r), we have an estimation of the codeword sent, ĉ.
The decoding aims at calculating this through the marginalization
  P(ci | r) = Σ_{c' : c'i = ci} P(c' | r)
The brute-force approach is impractical for LDPC codes, hence the iterative solution through SPA. Messages interchanged at step l:
From variable node to check node:
  m^(l)_{ci→sj}(ci = c) = α^(l)_{i,j} · P(ci = c) · Π_{sk ∈ N(ci), sk ≠ sj} m^(l−1)_{sk→ci}(ci = c)
From check node to variable node:
  m^(l)_{sj→ci}(ci = c) = Σ_{c'} P(sj = 0 | ci = c, c') · Π_{ck ∈ N(sj), ck ≠ ci} m^(l)_{ck→sj}(c'k = ck)
where the sum runs over the configurations c' of the other variable nodes in N(sj).
Low Density Parity Check Codes
Note that:
  α^(l)_{i,j} is a normalization constant.
  P(ci = c) plugs into the SPA the values coming from the channel: it is the APR info.
  P^(l)(ci = c) = α^(l)_i · P(ci = c) · Π_{sj ∈ N(ci)} m^(l)_{sj→ci}(ci = c) is the APP value (α^(l)_i is again a normalization).
Based on the final values of P(ci | r), a candidate ĉ is chosen and ĉ·H^T is tested. If it is 0, the information word is decoded.
Low Density Parity Check Codes
LDPC BER performance examples (DVB-S2 standard).
[Figures: BER curves for the short frames (n = 16200) and the long frames (n = 64800)]
Coded modulations
We have considered up to this point channel coding and decoding isolated from the modulation process.
  Codewords feed any kind of modulator.
  Symbols go through a channel (medium).
  The info recovered from the received modulated symbols is fed to the suitable channel decoder
    As hard decisions.
    As soft values (probabilistic estimations).
  The abstractions of the BSC(p) (hard demodulation) or of soft values from AWGN ( ∝ exp[−|ri−sj|²/(2σ²)] ) -and the like for other cases- are enough for such an approach.
Note that there are other important channel kinds not considered so far.
Coded modulations
Coded modulations are systems where channel coding
and modulation are treated as a whole.
Joint coding/modulation.
Joint decoding/demodulation.

This offers potential advantages (recall the improvements


made when the demodulator outputs more elaborated
information -soft values vs. hard decisions).
We combine gains in BER with spectral efficiency!

As a drawback, the systems become more complex.


More difficult to design and analyze.
Coded modulations
TCM (trellis coded modulation).
Ideally, it combines a CC encoder and the modulation symbol mapper.
[Trellis diagram: each branch is labeled directly with a modulation symbol (output mj, mk, ...) instead of a group of coded bits]
Coded modulations
If the modulation symbol mapper is well matched to the CC trellis, and the decoder is accordingly designed to take advantage of it,
  TCM provides high spectral efficiency.
  TCM can be robust in AWGN channels, and against fading and multipath effects.
In the 80's, TCM became the standard for telephone line data modems.
  No other system could provide better performance over the twisted pair cable before the introduction of DMT and ADSL.
However, the flexibility of providing separate channel coding and modulation subsystems is still preferred nowadays.
  Under the concept of Adaptive Coding & Modulation (ACM).
Coded modulations
Another possibility of coded modulation, evolved from TCM and from the concatenated coding & iterative decoding framework, is Bit-Interleaved Coded Modulation (BICM).
What about providing an interleaver between the channel coder (normally a CC) and the modulation symbol mapper?
[Block diagram: CC → interleaver π → symbol mapper / modulator]
A soft demodulator can also accept APR values and update its soft outputs as APP's in an iterative process!
[Block diagram: the soft demapper takes the channel-corrupted outputs plus APR values (interleaved, from the CC SISO), and produces APP values (to the interleaver and the CC SISO)]
Coded modulations
Like TCM, BICM shows specially good behavior (even better!)
  In channels where spectral efficiency is required.
  In dispersive channels (multipath, fading).
Iterative decoding yields a steep waterfall region.
Being a serially concatenated system, the error floor is very low (contrary to parallel concatenated systems).
BICM has already found applications in standards such as DVB-T2.
The drawback is the higher latency and complexity of the decoding.
Coded modulations
Examples of BICM.

References
S. Lin, D. Costello, ERROR CONTROL CODING, Prentice
Hall, 2004.
S. B. Wicker, ERROR CONTROL SYSTEMS FOR
DIGITAL COMMUNICATION AND STORAGE, Prentice
Hall, 1995.
