6.897 Algorithmic Introduction to Coding Theory                    September 26, 2001


Lecture 6
Lecturer: Madhu Sudan                    Scribe: Nicole Immorlica

Today we will talk about:

• Wozencraft construction continued
• Building codes from other codes
  – Parity check bit
  – Puncturing
  – Restriction
  – Direct Product
  – Concatenation
• Forney codes
• Justesen codes

1 Wozencraft construction (continued)

The Wozencraft construction gives a $2^{O(n)}$ time algorithm for constructing $[n, k, d]_2$ codes. We pick up where we left off in the last lecture. Recall our goal is to construct a family of sets $S_1, S_2, \ldots, S_t \subseteq \{0,1\}^n - \{0\}$ such that
1. The sets are pairwise disjoint.
2. $\forall i$, $S_i \cup \{0\}$ is a linear subspace of $\{0,1\}^n$.
3. $t \geq \mathrm{Vol}(d, n)$.
4. $\forall i$: $|S_i| = 2^k - 1$.
We saw last lecture that if we can construct such a family of sets, one of these sets will yield an $[n, k, d]_2$ code. Today we will see Wozencraft's construction of such a family of sets. We will show the construction only for $n = 2k$. It is fairly simple to generalize it to a construction for $n = ck$ for any integer $c$.
We will use the correspondence between fields and vector spaces that preserves addition (see Lecture Notes on Algebra, Section 6). In particular we will view $\mathbb{F}_2^k$ as $\mathbb{F}_{2^k}$ and $\mathbb{F}_2^n$ as $\mathbb{F}_{2^k}^2$. The sets we will construct will be indexed by $\alpha \in \mathbb{F}_{2^k}$, with $S_\alpha$ defined as follows: $S_\alpha = \{(x, \alpha x) \mid x \in \mathbb{F}_{2^k} - \{0\}\}$. We now verify that the $S_\alpha$'s satisfy the above conditions for $t = 2^k$ and $d$ such that $\mathrm{Vol}(d, n) \leq t$.
1. The $S_\alpha$'s are pairwise disjoint: for every $(x, y) \in \mathbb{F}_{2^k}^2$ with $x \neq 0$, there is exactly one $\alpha$ such that $(x, y) \in S_\alpha$, namely $\alpha = y x^{-1}$ (in particular $\alpha = 0$ if $y = 0$). (If $x = 0$ then $(x, y) \notin S_\alpha$ for any $\alpha$.)
2. $S_\alpha \cup \{0\}$ is linear: Clearly each $S_\alpha \cup \{0\}$ is a linear subspace of $\mathbb{F}_{2^k}^2$, generated by the matrix $[1 \;\; \alpha]$. Since the correspondence between $\mathbb{F}_2^k$ and $\mathbb{F}_{2^k}$ respects addition, it follows that the sets $S_\alpha \cup \{0\}$ are linear over $\mathbb{F}_2$ as well.
3. There are clearly $t = 2^k$ of the $S_\alpha$'s. The condition $t \geq \mathrm{Vol}(d, n)$ follows from the definition of $d$.
4. It is also obvious that $|S_\alpha| = 2^k - 1$.
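The four conditions can be checked by brute force on a toy instance. The following is a minimal sketch, assuming $k = 3$ and the representation $\mathbb{F}_8 = \mathbb{F}_2[z]/(z^3 + z + 1)$; the names `gf_mul`, `wozencraft_set` and `ensemble`, and the choice of irreducible polynomial, are ours for illustration and are not part of the lecture.

```python
# Toy instance of the ensemble: k = 3, n = 2k = 6, F_8 = F_2[z]/(z^3 + z + 1).
K = 3
MODULUS = 0b1011  # z^3 + z + 1, irreducible over F_2 (assumed choice for this sketch)


def gf_mul(a, b, k=K, modulus=MODULUS):
    """Multiply two elements of F_{2^k}, each stored as a k-bit integer."""
    result = 0
    while b:
        if b & 1:
            result ^= a          # add the current shift of a
        b >>= 1
        a <<= 1                  # multiply a by z
        if a >> k:               # degree reached k: reduce modulo the modulus
            a ^= modulus
    return result


def wozencraft_set(alpha, k=K):
    """S_alpha = {(x, alpha * x) : x nonzero in F_{2^k}}."""
    return {(x, gf_mul(alpha, x)) for x in range(1, 1 << k)}


def hamming_weight(pair):
    """Weight of (x, y) viewed as a binary vector of length 2k."""
    x, y = pair
    return bin(x).count("1") + bin(y).count("1")


# One set per alpha in F_8, so t = 2^k = 8 sets in total.
ensemble = {alpha: wozencraft_set(alpha) for alpha in range(1 << K)}

# Minimum weight of each S_alpha, i.e. the distance of the code S_alpha + {0}.
min_weight = {alpha: min(map(hamming_weight, s)) for alpha, s in ensemble.items()}

for alpha in sorted(ensemble):
    print(f"alpha={alpha:03b}  size={len(ensemble[alpha])}  min weight={min_weight[alpha]}")
```

Since the sets are disjoint and each vector of weight less than $d$ can spoil at most one of them, printing the minimum weights shows directly which members of the ensemble are good codes.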
Taking the ratios $k/n$ and $d/n$ we note that the codes $S_\alpha \cup \{0\}$ always have a rate of $\frac{1}{2}$. Further, if we fix any $\epsilon > 0$ and set $d = (H^{-1}(\frac{1}{2}) - \epsilon)n$, then for all sufficiently large $n$ we have $\mathrm{Vol}(d, n) \leq 2^{n/2}$, and thus the family above gives a code of rate $\frac{1}{2}$ and relative distance approaching $H^{-1}(\frac{1}{2})$.
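To get a feel for the numbers, the sketch below computes $H^{-1}(\frac{1}{2})$ by bisection and checks the volume condition for one block length. The function names and the sample values $n = 200$, $\epsilon = 0.01$ are illustrative assumptions; $\mathrm{Vol}(d, n)$ is taken to be the number of binary vectors of length $n$ and weight at most $d$.

```python
from math import comb, log2


def binary_entropy(p):
    """H(p) = -p log2 p - (1 - p) log2 (1 - p), with H(0) = H(1) = 0."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * log2(p) - (1 - p) * log2(1 - p)


def entropy_inverse(y, tol=1e-12):
    """The unique p in [0, 1/2] with H(p) = y, found by bisection."""
    lo, hi = 0.0, 0.5
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if binary_entropy(mid) < y:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def ball_volume(d, n):
    """Vol(d, n): number of binary vectors of length n and weight at most d."""
    return sum(comb(n, i) for i in range(d + 1))


n, eps = 200, 0.01                      # sample parameters (assumed)
delta = entropy_inverse(0.5)            # roughly 0.11
d = int((delta - eps) * n)
fits = ball_volume(d, n) <= 2 ** (n // 2)
print(f"H^-1(1/2) ~ {delta:.5f}, d = {d}, Vol(d, n) <= 2^(n/2): {fits}")
```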
By a slightly more careful argument we can actually verify that most codes in the family achieve the Gilbert-Varshamov bound. Specifically, we can prove:
Theorem 1 For every $\epsilon > 0$ and for all sufficiently large even numbers $n$, Wozencraft's construction with parameter $n$ gives a family of $2^{n/2}$ codes, all but an $\epsilon$ fraction of which are $[n, \frac{1}{2}n, (H^{-1}(\frac{1}{2}) - \epsilon)n]_2$-codes.

Remarks:
1. Furthermore, for all such $n$, given an index $i$ of a code from the family with parameter $n$, any specific entry of the generator matrix of the $i$th code can be computed in time polynomial in $n$.
2. If $n$ is of the form $4 \cdot 3^t$, then the computation can be carried out in $O(\log n)$ space. This part follows from the fact that the irreducible polynomial for such $\mathbb{F}_{2^k}$, where $k = n/2$, is known explicitly and this polynomial is sparse. (Thanks to Dieter van Melkebeek (dieter@ias.edu) for pointing out this use of sparsity.)
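Remark 1 can be made concrete as follows. This is a minimal sketch, assuming the generator matrix of the code indexed by $\alpha$ is written as the $k \times 2k$ matrix whose $j$th row is $(z^j, \alpha z^j)$, and assuming the toy field $\mathbb{F}_8 = \mathbb{F}_2[z]/(z^3 + z + 1)$; `generator_entry` is a hypothetical name. A single entry is computed from one field multiplication, without building the whole matrix.

```python
K = 3
MODULUS = 0b1011  # z^3 + z + 1; an assumed choice of irreducible polynomial


def gf_mul(a, b, k=K, modulus=MODULUS):
    """Multiply two elements of F_{2^k}, each stored as a k-bit integer."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        b >>= 1
        a <<= 1
        if a >> k:
            a ^= modulus
    return result


def generator_entry(alpha, j, l, k=K):
    """Entry (j, l) of the k x 2k generator matrix of the code indexed by alpha.

    Row j is the pair (z^j, alpha * z^j) written out as 2k bits, where bit
    position p of each half holds the coefficient of z^p.
    """
    if l < k:
        return 1 if l == j else 0                    # left half is the identity
    return (gf_mul(alpha, 1 << j) >> (l - k)) & 1    # right half: bits of alpha * z^j


# Example: the full matrix for alpha = z, assembled entry by entry.
matrix = [[generator_entry(0b010, j, l) for l in range(2 * K)] for j in range(K)]
for row in matrix:
    print(row)
```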
Exercise: Extend the argument above to construct, for every integer $c$, every $\epsilon > 0$, and all sufficiently large $k$, an ensemble of $2^{(c-1)k}$ codes such that all but an $\epsilon$-fraction of the ensemble are $[ck, k, (H^{-1}(1 - \frac{1}{c}) - \epsilon)(ck)]_2$-codes. Your construction should take time $2^{O(ck)}$.

References: The Wozencraft ensemble of codes does not appear in any paper by Wozencraft. It is alluded to in a monograph by Massey [3, Section 2.5]. The actual family as described above is from Justesen's paper [2]. The extension asked for in the exercise is from the paper of Weldon [4].

2 Building codes from other codes

In the previous section we saw that asymptotically good codes exist. However, we had no explicit construction for them. The second holy grail of coding theory is to construct in polynomial time binary codes that meet the GV-bound. No one knows how to do this yet. One approach to this problem is to create new codes from existing ones. We look at five ways of getting new codes from old codes. Four of them don't improve the asymptotics of the code. The fifth leads to constructions of families of asymptotically good codes. (However, they do not meet the GV-bound.)

2.1 Parity check bit

We recall a construction of Hamming (see notes for Lecture 3). Given a code $C = [n, k, d]_2$, create a new code $C' = [n + 1, k, d']_2$ as follows. First encode the message using $C$ to get a codeword $c$ of length $n$. Then add an extra bit which is the parity of the bits of $c$. This new codeword, $c'$, has length $n + 1$. Furthermore, as argued in Lecture 3, if $d$ is odd, the new distance is $d' = d + 1$. Otherwise the distance may remain $d$.
The parity check bit operation does improve relative distance for codes of odd minimum distance but not for codes of even minimum distance. Furthermore, the rate suffers. So we cannot repeat this method to obtain really great codes.
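As a quick sanity check, the sketch below appends a parity bit to the $[7, 4, 3]_2$ Hamming code and confirms that the distance rises to 4, and that a second application gains nothing. The generator matrix is one standard systematic choice, and the helper names `span`, `min_distance` and `add_parity_bit` are ours.

```python
from itertools import combinations, product

# Systematic generator matrix [I | P] of the [7, 4, 3] Hamming code (one standard choice).
HAMMING_G = [
    [1, 0, 0, 0, 1, 1, 0],
    [0, 1, 0, 0, 1, 0, 1],
    [0, 0, 1, 0, 0, 1, 1],
    [0, 0, 0, 1, 1, 1, 1],
]


def span(generator):
    """All codewords of the binary linear code spanned by the rows of generator."""
    return [
        tuple(sum(m * g for m, g in zip(message, column)) % 2 for column in zip(*generator))
        for message in product([0, 1], repeat=len(generator))
    ]


def min_distance(words):
    """Minimum Hamming distance over all pairs of codewords."""
    return min(sum(a != b for a, b in zip(u, v)) for u, v in combinations(words, 2))


def add_parity_bit(words):
    """Append to each codeword the parity (XOR) of its bits."""
    return [w + (sum(w) % 2,) for w in words]


hamming = span(HAMMING_G)
extended = add_parity_bit(hamming)          # an [8, 4, 4] code
twice = add_parity_bit(extended)            # a [9, 4, 4] code: no further gain
print(min_distance(hamming), min_distance(extended), min_distance(twice))
```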

2.2 Puncturing

Given a code $C = [n, k, d]_q$, create a new code $C' = [n - t, k, d']_q$ by simply deleting $t$ coordinates (the dimension stays $k$ as long as $t < d$). The new distance $d'$ satisfies $d - t \leq d' \leq d$. For $t = 1$ we can think of the puncturing operation as achieving the effect of the inverse of the parity check bit operation (in a very loose sense).
This operation has the benefit of decreasing the encoding length, thereby improving the rate. But at the same time it may sacrifice minimum distance, and thus may decrease the relative distance.
While this operation does not yield a generic construction method for good codes, it turns out to be very useful in special cases. Often the best known code for a specific choice of, say, $n$ and $k$ might be a code obtained from puncturing a well-known code of longer block length. In such cases, special features of the code are often used to show that the distance is larger than the proven bound. Note further that all linear codes are punctured Hadamard codes! So obviously puncturing can lead to good codes. The question remains: when does it work, and what part of the code should be punctured?
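The bound $d - t \leq d' \leq d$ is easy to observe on a small code. The sketch below punctures one coordinate of the $[7, 4, 3]_2$ Hamming code; the generator matrix and the helper names are illustrative choices, not part of the lecture.

```python
from itertools import combinations, product

# Systematic generator matrix of the [7, 4, 3] Hamming code (one standard choice).
HAMMING_G = [
    [1, 0, 0, 0, 1, 1, 0],
    [0, 1, 0, 0, 1, 0, 1],
    [0, 0, 1, 0, 0, 1, 1],
    [0, 0, 0, 1, 1, 1, 1],
]


def span(generator):
    """All codewords of the binary linear code spanned by the rows of generator."""
    return [
        tuple(sum(m * g for m, g in zip(message, column)) % 2 for column in zip(*generator))
        for message in product([0, 1], repeat=len(generator))
    ]


def min_distance(words):
    """Minimum Hamming distance over all pairs of codewords (0 if two words coincide)."""
    return min(sum(a != b for a, b in zip(u, v)) for u, v in combinations(words, 2))


def puncture(words, positions):
    """Delete the given coordinate positions from every codeword."""
    drop = set(positions)
    return [tuple(b for i, b in enumerate(w) if i not in drop) for w in words]


hamming = span(HAMMING_G)
t = 1
punctured = puncture(hamming, [6])          # drop the last coordinate
d, d_punctured = min_distance(hamming), min_distance(punctured)
print(f"[7,4,{d}] -> [6,4,{d_punctured}]")
```

Here the distance does drop: no binary $[6, 4, 3]$ code exists (16 disjoint balls of radius 1 would need $16 \cdot 7 = 112 > 64$ points), so the punctured code has distance exactly 2.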

2.3 Restriction

Given a code $C = (n, k, d)_q$ over an alphabet $\Sigma$, create a new code $C' = (n - 1, k', d)_q$ by choosing $\alpha \in \Sigma$ and $i \in [n]$ and retaining only those codewords in which the $i$th coordinate of the codeword is $\alpha$. The code $C'$ is then obtained by deleting the $i$th coordinate from all remaining codewords.
The resulting code has block length $n - 1$. If we pick $\alpha$ so that it is the most common letter in the $i$th coordinate (among codewords of $C$), then at least $q^k / q$ codewords will remain in $C'$, so $k' \geq k - 1$. Since codewords differed in at least $d$ positions to start with, and the only codewords that remain agreed in the deleted coordinate, the new codewords are still at Hamming distance at least $d$.
Restriction does improve the relative distance, but not necessarily the rate.
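The restriction step can be sketched directly from this description. The example below restricts the $[7, 4, 3]_2$ Hamming code on its first coordinate; the generator matrix and the helper names `span`, `min_distance` and `restrict` are assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations, product

# Systematic generator matrix of the [7, 4, 3] Hamming code (one standard choice).
HAMMING_G = [
    [1, 0, 0, 0, 1, 1, 0],
    [0, 1, 0, 0, 1, 0, 1],
    [0, 0, 1, 0, 0, 1, 1],
    [0, 0, 0, 1, 1, 1, 1],
]


def span(generator):
    """All codewords of the binary linear code spanned by the rows of generator."""
    return [
        tuple(sum(m * g for m, g in zip(message, column)) % 2 for column in zip(*generator))
        for message in product([0, 1], repeat=len(generator))
    ]


def min_distance(words):
    """Minimum Hamming distance over all pairs of codewords."""
    return min(sum(a != b for a, b in zip(u, v)) for u, v in combinations(words, 2))


def restrict(words, i):
    """Keep codewords whose i-th symbol is the most common letter, then delete coordinate i."""
    letter, _ = Counter(w[i] for w in words).most_common(1)[0]
    return [w[:i] + w[i + 1:] for w in words if w[i] == letter]


hamming = span(HAMMING_G)
restricted = restrict(hamming, 0)
print(len(restricted), len(restricted[0]), min_distance(restricted))
```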

2.4 Direct Product

Given codes $C_1 = [n_1, k_1, d_1]_q$ and $C_2 = [n_2, k_2, d_2]_q$, the direct product of $C_1$ and $C_2$, denoted $C_1 \otimes C_2$, is an $[n_1 n_2, k_1 k_2, d_1 d_2]_q$ code constructed as follows. View a message of $C_1 \otimes C_2$ as a $k_2$ by $k_1$ matrix $M$. Encode each row of $M$ by the code $C_1$ to obtain a $k_2$ by $n_1$ intermediary matrix. Encode each column of this intermediary matrix with the $C_2$ code to get an $n_2$ by $n_1$ matrix representing the codeword encoding $M$. This process works generally, for linear as well as non-linear codes $C_1$ and $C_2$. We first show that the resulting code has distance at least $d_1 d_2$ in either case. Then we show that if $C_1$ and $C_2$ are linear, then the resulting code is also linear, and furthermore is the same as the code that would be obtained by encoding the columns with $C_2$ first and then encoding the rows with $C_1$.
We prove this new code has distance at least $d_1 d_2$. Consider two distinct message matrices $M_1$ and $M_2$. Let $N_1$ and $N_2$ be the intermediate matrices obtained after the first step of the encoding process. Let $C_1$ and $C_2$ (overloading the names of the codes for the rest of this paragraph) be the final codewords obtained from these matrices. Suppose $M_1$ and $M_2$ differ on the $i$th row. Then $N_1$ and $N_2$ must differ on at least $d_1$ coordinates of the $i$th row. In particular they differ on at least $d_1$ columns. Say $j_1, \ldots, j_{d_1}$ are indices of $d_1$ such columns where $N_1$ and $N_2$ differ. Then the column-by-column encoding results in codewords $C_1$ and $C_2$ which differ on at least $d_2$ coordinates in each of these $d_1$ columns. Thus $C_1$ and $C_2$ differ on at least $d_1 d_2$ entries.
Next we show that $C_1 \otimes C_2$ is linear if $C_1$ and $C_2$ are linear and the encoding functions used are linear functions.
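Claim 2 Let $R_1 \in \mathbb{F}_q^{k_1 \times n_1}$ generate the code $C_1$ and let $R_2 \in \mathbb{F}_q^{k_2 \times n_2}$ generate the code $C_2$. Then the direct product code $C_1 \otimes C_2$ is a linear code that has as its codewords $\{R_2^T M R_1 \mid M \in \mathbb{F}_q^{k_2 \times k_1}\}$.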
Remark: As a consequence, we note that it does not matter if we encode the rows first and then the columns as above or vice versa.
Proof The proof follows easily from the fact that the intermediate matrix equals $M R_1$ and thus the final matrix equals $R_2^T (M R_1)$. The interchangeability follows from associativity of matrix multiplication. The linearity follows from the fact that the matrix $R_2^T M_1 R_1 + R_2^T M_2 R_1$ is just the encoding of $M_1 + M_2$ and the matrix $\alpha R_2^T M_1 R_1$ is the encoding of $\alpha M_1$, where $\alpha \in \mathbb{F}_q$.
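The claim can be checked mechanically on a small case. The sketch below takes both factors to be the $[3, 2, 2]_2$ single-parity-check code (a toy choice of ours), encodes every $2 \times 2$ message in both orders, and computes the minimum weight of the resulting $[9, 4, 4]_2$ product code. The helper names are illustrative.

```python
from itertools import product


def mat_mul(a, b):
    """Matrix product over F_2."""
    return [[sum(x * y for x, y in zip(row, col)) % 2 for col in zip(*b)] for row in a]


def transpose(m):
    return [list(col) for col in zip(*m)]


# Both factors are the [3, 2, 2] single-parity-check code (toy choice).
R1 = [[1, 0, 1], [0, 1, 1]]   # k1 x n1 generator of C1
R2 = [[1, 0, 1], [0, 1, 1]]   # k2 x n2 generator of C2


def encode_rows_then_columns(message):
    """message is k2 x k1: encode rows with C1, then columns with C2."""
    intermediate = mat_mul(message, R1)              # k2 x n1
    return mat_mul(transpose(R2), intermediate)      # n2 x n1


def encode_columns_then_rows(message):
    """Same message, but columns are encoded with C2 first."""
    intermediate = mat_mul(transpose(R2), message)   # n2 x k1
    return mat_mul(intermediate, R1)                 # n2 x n1


messages = [[list(bits[:2]), list(bits[2:])] for bits in product([0, 1], repeat=4)]
weights = [sum(map(sum, encode_rows_then_columns(m))) for m in messages]
min_weight = min(w for w in weights if w > 0)
print("minimum weight of the product code:", min_weight)
```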

Exercise: In general the direct product of two codes depends on the choice of the encoding function. Prove that this is not the case for linear codes. Specifically, prove that if $R_1$ and $R_1'$ generate $C_1$ and $R_2$ and $R_2'$ generate $C_2$, then $\{R_2^T M R_1 \mid M\} = \{R_2'^T M R_1' \mid M\}$.
Again, the direct product does not help in the construction of asymptotically good codes. E.g., if we started with codes $C_1$ and $C_2$ of rate and relative distance $\frac{1}{10}$, then the resulting code is weaker and has rate and relative distance of only $\frac{1}{100}$.
So far all the operations on codes have been ineffective in getting to asymptotically good codes. In retrospect one may say that this is because all these operations fixed the alphabet and tried to play around with the other three parameters. A simple but brilliant idea, due to Forney [1], showed how to extend the game to include the alphabet size in the parameters altered/exploited by the operations on codes. This operation is that of "concatenating codes". This method turns out to have profound impact on our ability to construct asymptotically good binary codes. We describe this method and its consequences in the next section.

3 Concatenation of codes

To motivate the notion of concatenation, let us recall the example using Reed-Solomon codes on CD players. Reed-Solomon codes were defined on large alphabets, while CD players work with the binary alphabet. However, given an $[n, k, d]_{2^r}$ Reed-Solomon code, we interpreted this code as an $[nr, kr, d]_2$ binary code by naively representing the alphabet of the RS code, elements of $\mathbb{F}_{2^r}$, as binary strings of length $r$. The main idea of concatenation is to focus on this "naive interpretation" step and to generalize it so that elements of $\mathbb{F}_{2^r}$ can be represented by binary strings of length larger than $r$. Note that the main loss in performance is due to the fact that in going from strings of length $n$ (over $\mathbb{F}_{2^r}$) to binary strings of length $nr$, we did not increase the minimum distance of the code, and so lost in terms of the relative distance. A careful choice of the encoding in the second step ought to be able to moderate this loss, and this is exactly what the method of concatenation addresses.
As in the case of direct product codes, it is best to explain concatenation of codes in terms of the encoding functions. First we define the $l$-fold concatenation of a single encoding function.
Definition 3 For a positive integer $l$, a linearity preserving bijective map $\varphi : \mathbb{F}_{q^k} \to \mathbb{F}_q^k$ and an encoding function $E : \mathbb{F}_q^k \to \mathbb{F}_q^n$, the $l$-fold concatenation of $E$ is the function $\diamond^l E : \mathbb{F}_{q^k}^l \to \mathbb{F}_q^{nl}$ given by $\langle x_1, \ldots, x_l \rangle \mapsto \langle E(\varphi(x_1)), \ldots, E(\varphi(x_l)) \rangle$, where $x_i \in \mathbb{F}_{q^k}$ for $i \in [l]$.
Typically the exact map $\varphi : \mathbb{F}_{q^k} \to \mathbb{F}_q^k$ is irrelevant so we will simply ignore it. Further, if $l$ is clear from context, we will ignore it and simply refer to the map $\diamond E$. We now define the concatenation of two codes.

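Definition 4 For encoding functions $E_1 : \mathbb{F}_{q^{k_2}}^{k_1} \to \mathbb{F}_{q^{k_2}}^{n_1}$ and $E_2 : \mathbb{F}_q^{k_2} \to \mathbb{F}_q^{n_2}$ (and some implicit bijection $\varphi : \mathbb{F}_{q^{k_2}} \to \mathbb{F}_q^{k_2}$), the concatenation of $E_1$ and $E_2$ is the function $E_1 \diamond E_2 : \mathbb{F}_q^{k_1 k_2} \to \mathbb{F}_q^{n_1 n_2}$ given by

$$\mathbb{F}_q^{k_1 k_2} \xrightarrow{\ \varphi^{-1}\ } \mathbb{F}_{q^{k_2}}^{k_1} \xrightarrow{\ E_1\ } \mathbb{F}_{q^{k_2}}^{n_1} \xrightarrow{\ \diamond^{n_1} E_2\ } \mathbb{F}_q^{n_1 n_2}.$$

That is, the message $\langle x_1, \ldots, x_{k_1} \rangle$, with each $x_i \in \mathbb{F}_q^{k_2}$, is mapped to the vector $\diamond^{n_1} E_2(E_1(\langle \varphi^{-1}(x_1), \ldots, \varphi^{-1}(x_{k_1}) \rangle))$.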

If the encoding functions $E_1, E_2$ are linear maps giving linear codes $C_1$ and $C_2$ respectively, then $E_1 \diamond E_2$ is a linear map whose image is denoted by $C_1 \diamond C_2$. It may be verified that $C_1 \diamond C_2$ is a function of $C_1$ and $C_2$ alone and not dependent on $E_1$, $E_2$ or $\varphi$. It is customary to call the code $C_1$ the outer code and the code $C_2$ the inner code, and $C_1 \diamond C_2$ is the concatenated code.
The next proposition verifies the distance properties of concatenated codes.
Proposition 5 If $C_1$ is an $[n_1, k_1, d_1]_{q^{k_2}}$-code and $C_2$ is an $[n_2, k_2, d_2]_q$-code then $C_1 \diamond C_2$ is an $[n_1 n_2, k_1 k_2, d_1 d_2]_q$-code.

Proof The only part that needs to be verified is the distance. To do so consider the encoding of a non-zero message. The encoding by $E_1$ leads to an intermediate word from $\mathbb{F}_{q^{k_2}}^{n_1}$ that is non-zero in at least $d_1$ coordinates. The $n_1$-fold concatenation of $E_2$ applied to the resulting codeword produces at least $d_2$ non-zero symbols in every block where the outer encoding produced a non-zero symbol. Thus we end up with at least $d_1 d_2$ non-zero symbols in the concatenated encoding.
If we ignore the non-trivial behavior with respect to the alphabet size, then the concatenation operator has essentially the same parameters as the direct product operator. However the concatenation operator allows the outer code to be over a larger alphabet, and we have seen that it is easier to construct good codes over large alphabets. Thus the concatenation operator is strictly better than direct product. Below we show an example of non-trivial results it yields.
Example - RS $\diamond$ Hadamard: Suppose we concatenate an outer code that is an $[n, k, n - k]_n$-Reed-Solomon code with an $[n, \log n, \frac{n}{2}]_2$-Hadamard code. (Assume for this example that $n$ is a power of 2.) Then the concatenated code is an $[n^2, k \log n, \frac{n}{2}(n - k)]_2$-code. Depending on our choice of rate $k/n$ of the outer code, we get a family of binary codes of constant relative distance and an inverse polynomial rate $R = \frac{k \log n}{n^2}$. This is a new range of parameters that we have not seen in the codes so far.
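The smallest non-trivial instance of this example can be run by hand or by machine. The sketch below assumes $n = 4$, so the outer code lives over $\mathbb{F}_4 = \mathbb{F}_2[z]/(z^2 + z + 1)$, and takes outer dimension $k = 2$; all function names are ours. Note that a Reed-Solomon code evaluated at all $n$ field points has distance exactly $n - k + 1$, slightly better than the $n - k$ used above, so here we expect distance $3 \cdot 2 = 6$.

```python
from itertools import product

Q_BITS = 2          # outer alphabet F_4, so log n = 2 and n = 4
MODULUS = 0b111     # z^2 + z + 1, irreducible over F_2
K_OUTER = 2         # outer dimension k (toy choice)


def gf_mul(a, b, k=Q_BITS, modulus=MODULUS):
    """Multiply two elements of F_{2^k}, each stored as a k-bit integer."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        b >>= 1
        a <<= 1
        if a >> k:
            a ^= modulus
    return result


def rs_encode(message):
    """Evaluate the polynomial with coefficients `message` at every point of F_4."""
    codeword = []
    for point in range(1 << Q_BITS):
        value, power = 0, 1
        for coeff in message:
            value ^= gf_mul(coeff, power)
            power = gf_mul(power, point)
        codeword.append(value)
    return codeword


def hadamard_encode(symbol):
    """Map a 2-bit symbol s to the inner products <s, y> for all y in F_2^2."""
    return [bin(symbol & y).count("1") % 2 for y in range(1 << Q_BITS)]


def concat_encode(message):
    """Outer RS encoding followed by inner Hadamard encoding of each symbol."""
    return [bit for symbol in rs_encode(message) for bit in hadamard_encode(symbol)]


code = [concat_encode(m) for m in product(range(1 << Q_BITS), repeat=K_OUTER)]
min_weight = min(sum(w) for w in code if any(w))
print(f"[{len(code[0])}, {K_OUTER * Q_BITS}, {min_weight}] binary code")
```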
While it is possible to employ multiple levels of concatenation to improve the dependence of the block length $n$ on the message length $k$, making $n$ closer and closer to being linear in $k$, we can never get an asymptotically good code this way. Informally, to get an asymptotically good family, we need both the inner code and outer code to be asymptotically good. In what follows, we will describe two approaches to constructing asymptotically good codes using concatenation.

3.1 Forney codes/Zyablov bound

The first family of codes we describe is due to Forney [1], who described the basic idea of the codes but did not stress the choice of parameters that would optimize the tradeoff between rate and relative distance. (Forney was after bigger fish, specifically an algorithmic version of Shannon's theorem. We will get to this when we get to algorithms.) The actual bounds were worked out by Zyablov [5] and are usually referred to as the Zyablov bounds.
The idea to get a polynomial time constructible family of asymptotically good codes is a simple one. As an outer code we will use a Reed-Solomon code over an $n$-ary alphabet, say an $[n, k, n - k]_n$-code. For the inner code, we will search for the best linear code in, say, Wozencraft's ensemble of codes. This takes exponential time in the block length of the inner code, but the block length of the inner code only needs to be linear in the message length, and the message length of the inner code is only $\log n$. Thus the time it takes to find the best code in Wozencraft's ensemble is only polynomial in $n$.
Getting a little more specific, to construct a code of relative distance $\delta$, we pick $\delta_1$ and $\delta_2$ so that $\delta_1 \delta_2 = \delta$. For the outer code we pick an $[n, (1 - \delta_1)n, \delta_1 n]_n$-RS-code. For the inner code we search Wozencraft's ensemble to obtain an $[n', (1 - H(\delta_2))n', \delta_2 n']_2$-code with $(1 - H(\delta_2))n' = \log n$. The resulting code has block length $nn' = O(n \log n)$, relative distance $\delta$ and rate $(1 - \delta_1)(1 - H(\delta_2))$. Thus we obtain the following theorem:
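Theorem 6 For every $\delta \in (0, \frac{1}{2})$, there exists an infinite family of polynomial time constructible codes $C$ with rate $R$ and relative distance $\delta$ satisfying

$$R \geq \max_{\delta \leq \delta_2 < \frac{1}{2}} \left(1 - H(\delta_2)\right) \cdot \left(1 - \frac{\delta}{\delta_2}\right). \qquad (1)$$

The bound (1) above is the Zyablov bound.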
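The maximization in (1) has no closed form, but it is easy to evaluate numerically. The following sketch approximates the Zyablov bound by a grid search over $\delta_2$ and prints it next to the Gilbert-Varshamov rate $1 - H(\delta)$ for comparison; the function names and the grid size are illustrative assumptions.

```python
from math import log2


def binary_entropy(p):
    """H(p) = -p log2 p - (1 - p) log2 (1 - p), with H(0) = H(1) = 0."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * log2(p) - (1 - p) * log2(1 - p)


def zyablov_rate(delta, steps=10_000):
    """Grid-search approximation of max over delta <= d2 < 1/2 of (1 - H(d2)) (1 - delta/d2)."""
    best = 0.0
    for i in range(1, steps):
        d2 = delta + (0.5 - delta) * i / steps       # inner relative distance
        d1 = delta / d2                              # outer relative distance
        best = max(best, (1 - binary_entropy(d2)) * (1 - d1))
    return best


for delta in (0.05, 0.1, 0.2):
    gv = 1 - binary_entropy(delta)
    print(f"delta={delta}: Zyablov ~ {zyablov_rate(delta):.4f}, GV = {gv:.4f}")
```

Because the grid search only samples feasible values of $\delta_2$, the value it returns never exceeds the true maximum in (1).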

3.2 Explicit constructions

We take a brief digression to discuss what it means to construct a code explicitly. It is clear that this ought to be a complexity-theoretic definition, since a code is a finite set and one can obviously enumerate all finite sets to see if one of them gives, say, an $(n, k, d)$-code. The constructions of Gilbert took exponential time, while Varshamov's is a randomized polynomial time construction that possibly returns an erroneous solution (to the task of finding an $[n, k, d]$ code). We asserted that Forney's construction is somehow explicit, and yet this is not satisfactory to many mathematicians. Here we enumerate some criteria for explicit constructions for the case of codes (though similar criteria apply to constructions of all combinatorial objects).
Let $\{C_{R,\delta}\}_{(R,\delta)}$ be a collection of families of codes, where the family $C_{R,\delta}$ has rate $R$ and relative distance $\delta$. The following are possible notions of $C$ being explicitly constructible:

Polytime For every $0 < R < 1$ and $0 < \delta < 1$, there exists a polynomial $p$ such that the generator matrix of the $i$th element of the family $C_{R,\delta}$, with block length $n_i$, is constructible in time $p(n_i)$, if such a family exists.
Uniform polytime There exists a polynomial $p$ such that for every $0 < R < 1$ and $0 < \delta < 1$, the generator matrix of the $i$th element of the family $C_{R,\delta}$, with block length $n_i$, is constructible in time $p(n_i)$, if such a family exists.
The difference between polytime constructibility and uniform polytime constructibility is relatively small. This distinction can be made in the remaining definitions too, but we will skip the extra quantifiers and simply focus on what makes a code $C$ constructible (leaving it to the reader to find a preference within uniform and nonuniform time bounds).
Logspace The generator matrix of the $i$th member of $C$ is constructible in logarithmic space. (This implies that $C$ is polynomial time constructible.)
Locally Polytime Constructible [1] Here we will require that a specific entry, say the $(j, l)$th entry, of the generator matrix of the $i$th member of the code $C$ be computable in time polynomial in the size of the binary representation of $i, j, l$. (Note this representation has size logarithmic in $n$ and so this notion is much more explicit than earlier notions.)
Locally Logspace Constructible The $(j, l)$th entry of the generator matrix of the $i$th code is logspace constructible in the length of the binary representations of $i$, $j$ and $l$.

[Footnote 1: Actually, this notion does not have a name and I had to generate one on the fly. Thanks to Anna Lysyanskaya for suggesting this name.]

As noted, the requirements get more stringent as we go down the list above. The notion of Locally Logspace Constructible is about as strong a requirement as we can pose without getting involved with machine-dependent problems. (What operations are allowed? Why? etc.)
Forney's codes, as described above, are polytime constructible, but not uniform polytime or logspace constructible. The next family of codes we will describe are locally logspace constructible, making them as explicit as we could desire (define?).

3.3 Justesen Codes

The principal barrier we seem to face in producing codes explicitly is that we know how to construct smaller and smaller ensembles of good codes, but we don't know how to get our hands on any particular good one. In fact, in the ensembles we construct almost all codes are good. Is there any way to use this fact? Justesen's idea [2] is a brilliant one, and one that "derandomizers" should take note of: On the one hand we can produce a small sample space of mostly good codes. On the other hand we need one good code that we wish to use repeatedly, $n_1$ times in the concatenation. Do we really need to use the same code $n_1$ times? Do they all have to be good? The answer, to both questions, is NO! And so, surprisingly enough, the ensemble of codes is exactly what suffices for the construction. Specifically, we take an $[n_1, k_1, d_1]_{q^{k_2}}$-outer code with encoding function $E_1$ and an ensemble consisting of $n_1$ inner codes with the $i$th member denoted $E_2^{(i)}$. We encode a message $m$ by first applying the outer encoding function to get $E_1(m)$ and then applying the $i$th inner encoding function to the $i$th coordinate of $E_1(m)$, getting the vector $\langle E_2^{(1)}((E_1(m))_1), \ldots, E_2^{(n_1)}((E_1(m))_{n_1}) \rangle$.
The above definition can be formalized to get a notion of concatenating an $[n_1, k_1, \cdot]_{q^{k_2}}$-outer code with an ensemble containing $n_1$ $[n_2, k_2, \cdot]_q$-inner codes ($\cdot$ representing the fact that the distances are unknown, or possibly not all the same). Denoting the outer code by $C_1$ and the inner ensemble by $\mathcal{C}_2$, we extend the notation for concatenation and use $C_1 \diamond \mathcal{C}_2$ to denote such concatenations. The following proposition shows how the parameters of the concatenated codes relate to those of the outer code and inner ensemble.
Proposition 7 Let $C_1$ be an $[n_1, k_1, d_1]_{q^{k_2}}$ code. Let $\mathcal{C}_2$ be an ensemble of $n_1$ $[n_2, k_2, \cdot]_q$-codes of which all but an $\epsilon$-fraction have minimum distance at least $d_2$. Then the concatenated code $C_1 \diamond \mathcal{C}_2$ is an $[n_1 n_2, k_1 k_2, (d_1 - \epsilon n_1) d_2]_q$ code.

Proof The proof follows from the fact that the first level encoding of a non-zero message leaves at least $d_1$ coordinates that are non-zero. At most $\epsilon n_1$ of the inner codes do not have minimum distance $d_2$. Thus at least $d_1 - \epsilon n_1$ coordinates, when encoded by $\mathcal{C}_2$, result in at least $d_2$ non-zero symbols each. The distance follows.
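The proposition can be tested exhaustively on a toy instance. The sketch below assumes an outer Reed-Solomon code over $\mathbb{F}_8 = \mathbb{F}_2[z]/(z^3 + z + 1)$ of dimension 2, evaluated at all 8 field elements, and uses as inner ensemble the 8 Wozencraft codes $x \mapsto (x, \alpha x)$, one per evaluation point $\alpha$. These sizes and all names are our assumptions for illustration. The code then compares the true minimum weight with the best bound $(d_1 - \epsilon n_1) d_2$ over all choices of $d_2$.

```python
from itertools import product

K1, K2 = 2, 3               # outer dimension and inner message length (toy sizes)
MODULUS = 0b1011            # z^3 + z + 1, an assumed irreducible polynomial for F_8
FIELD = range(1 << K2)      # elements of F_8; also the index set of the ensemble


def gf_mul(a, b):
    """Multiply two elements of F_8 stored as 3-bit integers."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        b >>= 1
        a <<= 1
        if a >> K2:
            a ^= MODULUS
    return result


def weight(x):
    return bin(x).count("1")


def rs_encode(message):
    """Outer Reed-Solomon code: evaluate the message polynomial at every point of F_8."""
    codeword = []
    for point in FIELD:
        value, power = 0, 1
        for coeff in message:
            value ^= gf_mul(coeff, power)
            power = gf_mul(power, point)
        codeword.append(value)
    return codeword


def justesen_encode(message):
    """Encode the outer symbol at position alpha with the inner code x -> (x, alpha x)."""
    return [(c, gf_mul(alpha, c)) for alpha, c in zip(FIELD, rs_encode(message))]


def codeword_weight(word):
    return sum(weight(x) + weight(y) for x, y in word)


# Minimum distance of each inner code in the ensemble.
inner_distance = {
    alpha: min(weight(x) + weight(gf_mul(alpha, x)) for x in FIELD if x) for alpha in FIELD
}
# Minimum weight of a nonzero codeword of the concatenated (F_2-linear) code.
min_weight = min(
    codeword_weight(justesen_encode(m)) for m in product(FIELD, repeat=K1) if any(m)
)

d1 = len(FIELD) - K1 + 1    # distance of the outer [8, 2, 7] Reed-Solomon code


def proposition_bound(d2):
    """(d1 - eps * n1) * d2, where eps * n1 counts inner codes with distance below d2."""
    bad = sum(1 for dist in inner_distance.values() if dist < d2)
    return (d1 - bad) * d2


best_bound = max(proposition_bound(d2) for d2 in range(1, 2 * K2 + 1))
print("true minimum weight:", min_weight, " best bound from the proposition:", best_bound)
```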

Note that it is not entirely trivial to find an ensemble with just the right parameters: To use every element of the ensemble at least once, we need the inner ensemble size to be no larger than the outer block length. To use an RS code at the outer level, we need the outer block length to be no larger than the outer alphabet size. To use concatenation, we need the outer alphabet size to be no larger than the number of inner codewords. Putting it all together, we need an ensemble with no more members than codewords per member of the ensemble. Fortunately enough, this is exactly what is achieved by Wozencraft's ensemble, so we can use it. Consequently we get one fully explicit (locally logspace constructible) family of error-correcting codes on the Zyablov bound. In particular the code is asymptotically good.
Theorem 8 For every $0 < \delta < H^{-1}(\frac{1}{2})$, there exists a locally logspace constructible infinite family of codes $C$ that has relative distance $\delta$ and rate $\frac{1}{2}\left(1 - \frac{\delta}{H^{-1}(\frac{1}{2})}\right)$.

The code above is obtained by concatenating a Reed-Solomon code of appropriate rate with the Wozencraft ensemble. We note that to get local logspace constructibility, we need the inner code length to be $4 \cdot 3^l$ for some integer $l$ so that we can use the explicit construction of fields of size $2^{2 \cdot 3^l}$.

References

[1] G. David Forney. Generalized Minimum Distance decoding. IEEE Transactions on Information Theory, 12:125–131, 1966.
[2] Jørn Justesen. A class of constructive asymptotically good algebraic codes. IEEE Transactions on Information Theory, 18:652–656, 1972.
[3] James L. Massey. Threshold decoding. MIT Press, Cambridge, Massachusetts, USA, 1963.
[4] Edward J. Weldon, Jr. Justesen's construction - the low-rate case. IEEE Transactions on Information Theory, 19:711–713, 1973.
[5] Victor V. Zyablov. An estimate on the complexity of constructing binary linear cascade codes. Problems of Information Transmission, 7(1):3–10, 1971.
