
Information & Coding Theory: Course Outline / Introduction

Dr. M. Arif Wahla, EE Dept, arif@mcs.edu.pk, Military College of Signals, National University of Sciences & Technology (NUST), Pakistan

Class webpage: http://learn.mcs.edu.pk/course/view.php?id=544

Information Theory
Founded by Claude E. Shannon (1916-2001), "A Mathematical Theory of Communication", 1948.
Studies the fundamental limits in communication: transmission, storage, etc.


Two Key Concepts

Information is uncertainty: modeled as random variables.
Information is digital: transmission should be in 0s and 1s (bits), with no reference to what they represent.


Two Fundamental Theorems

Source coding theorem: the fundamental limit on data compression (zip, MP3, JPEG, MPEG).
Channel coding theorem: the fundamental limit on reliable communication through a noisy channel (telephone, cell phone, modem, data storage, etc.).


Information & Coding Theory

The fundamental problem of communication is that of reproducing at one point either exactly or approximately a message selected at another point.

(Claude Shannon, 1948)


Course Objective (3+0)

This course will provide students with an introduction to classical information theory and coding theory. The main objective is to introduce students to well-known information-theoretic tools that can be used to solve engineering problems.

The course will begin by describing basic communication-system problems where information theory may be applied. An explanation of information measurement and characterization will be given. Fundamentals of noiseless source coding and noisy channel coding will be taught next. Finally, some key information theory principles applied to communication security systems will be covered.


Course Outline - I

Information theory is concerned with the fundamental limits of communication.

What is the ultimate limit to data compression? e.g., how many bits are required to represent a music source?

What is the ultimate limit of reliable communication over a noisy channel? e.g., how many bits can be sent in one second over a telephone line?


Course Outline - II

Coding theory is concerned with practical techniques to realize the limits specified by information theory.
Source coding converts source output to bits. The source output can be voice, video, text, or sensor output.
Channel coding adds extra bits to the data transmitted over the channel. This redundancy helps combat the errors introduced in the transmitted bits by channel noise (a toy repetition-code sketch follows below).
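As a rough illustration of how such redundancy works, here is a minimal Python sketch of a 3-fold repetition code; the function names and the single bit flip used as "channel noise" are illustrative assumptions, not the course's channel codes.

# Toy sketch: a 3-fold repetition code. Each data bit is sent three times;
# the decoder takes a majority vote, so any single bit error within a
# 3-bit block is corrected. This is the redundancy idea in miniature.

def encode(bits):
    return [b for b in bits for _ in range(3)]

def decode(received):
    out = []
    for i in range(0, len(received), 3):
        block = received[i:i + 3]
        out.append(1 if sum(block) >= 2 else 0)
    return out

data = [1, 0, 1, 1]
tx = encode(data)            # [1,1,1, 0,0,0, 1,1,1, 1,1,1]
tx[4] = 1                    # channel noise flips one transmitted bit
assert decode(tx) == data    # majority voting corrects the single error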


Main Topics to be Covered

Introduction: Communications Model, Information Sources, Source Coding, Channel Coding

Information Measurement: Definition and Properties of Entropy, Uniqueness of the Entropy Measure, Joint and Conditional Entropy, Mutual Information and Conditional Mutual Information, Information Divergence Measures


Main Topics to be Covered

Noiseless Source Coding: Optimum Codes, Shannon's Source Coding Theorem, Huffman Codes, Lempel-Ziv Codes, Arithmetic Coding

Channel Capacity: Capacity of Memoryless Symmetric Channels, Capacity of Erasure Channels, Shannon's Channel Coding Theorem

Channel Codes (Error Correcting Codes): Block Codes, Cyclic Codes, Convolutional Codes, Turbo Codes, Space-Time Codes


Main Topics to be Covered

Secrecy Systems: Mathematical Structure, Pure and Mixed Ciphers, Similar Systems, Perfect Secrecy, Equivocation Characteristic, Ideal Secrecy


Recommended Text Books & Study Material

Applied Coding and Information Theory for Engineers, Richard B. Wells, Prentice Hall, 1999.

A Mathematical Theory of Communication, Claude E. Shannon, Bell System Technical Journal, 1948 (available for free online).


Class Webpage / Learning Management System

Class webpage: https://lms.nust.edu.pk/portal/course/view.php?id=9944

Lecture notes, reading material and assignments will be posted here.
1. Sessional marks and exam results will be uploaded.
2. Students are encouraged to maintain a discussion blog and discuss the assignments and course topics.


Schedule

Class Meetings: Wednesday (5pm-8pm), 3L
Consultancy Hours: Wednesday (4pm-5pm), (8pm-8:30pm); other times by appointment (phone or email)


Teaching Methodology

Organized material will be presented in PPT slides. Concepts, mathematical expressions and examples will be worked on the board or on transparencies using an overhead projector.

Reading assignments for the next lecture.
Assignments: frequency 8, every second week.
Quizzes: frequency 4, preferably at the end of a logical segment of topics (not necessarily unannounced).
Resources: lectures will be posted on the class webpage.


Grading Scheme & Policy Matters

Assignments [10%]: assignments will be due one week after the issue date.
Quizzes [10%]: quizzes may be conducted in class during the first 5-10 minutes. NUST policy does not permit quizzes to be retaken under any circumstances.
Students found assisting in or committing plagiarism in any assignment or quiz will have their assignment and quiz marks cancelled, or the marks will be shared by the group having similar solutions.
OHT 1 & 2 [30%]: exams during the 7th and 11th weeks.
Research Paper Project [10%]
Final Exam [40%]: exam during the 18th week (9-15 January 2014). NUST policy requires at least 80% attendance in order to be allowed to sit the Final Exam.



Introduction to Information Theory


What is Information Theory (IT)?

IT is about asking what is the most efficient path from one point to another, in terms of some way of measuring things.


What is Information Theory (IT)?

Politics: "Ask not what your country can do for you, but what you can do for your country" - John F. Kennedy

What makes this political statement powerful (or at least famous)?
Its force is its efficiency of expression: there is an interpolation of many feelings, attitudes and perceptions; an efficient encoding of emotional and mental information.


Information Theory

Two important questions in engineering:
- What to do if information gets corrupted by errors?
- How much memory does it require to store data?

Both questions were asked, and to a large degree answered, by Shannon in his 1948 article: use error correction and data compression.

Claude Elwood Shannon (1916-2001), American electrical engineer and mathematician, has been called the father of information theory, and was the founder of practical digital circuit design theory.


Problems in Communications

Speed: minimise the length of transmitted data
Accuracy: minimise and eliminate noise
Security: ensure data is not changed or intercepted whilst in transit


Solutions

Speed: minimise the length of transmitted data - use Data Compression
Accuracy: minimise and eliminate noise - use Error Detection / Correction Codes
Security: ensure data is not changed or intercepted whilst in transit - use Data Encryption / Authentication

Communications Model

[Block diagram: source data -> transmitter -> transmitted signal; noise is added on the channel; received signal -> receiver -> destination data. An eavesdropper may tap the signal in transit.]


Data Compression

This is the study of encoding information so that it may be stored or transmitted efficiently.

Examples: WinZip, GSM
Algorithms: Run-Length Encoding (RLE), Huffman, LZ77, LZW, Deflate (WinZip); a small RLE sketch follows below.
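As a quick illustration of the simplest of these, run-length encoding, here is a minimal Python sketch; the (symbol, count) representation is an assumption made for the example, not a specification of any of the tools listed above.

# Toy sketch: run-length encoding replaces each run of repeated symbols
# with a (symbol, count) pair, so long runs compress well.

def rle_encode(text):
    runs = []
    for ch in text:
        if runs and runs[-1][0] == ch:
            runs[-1][1] += 1
        else:
            runs.append([ch, 1])
    return runs

def rle_decode(runs):
    return "".join(ch * count for ch, count in runs)

msg = "AAAABBBCCD"
runs = rle_encode(msg)          # [['A', 4], ['B', 3], ['C', 2], ['D', 1]]
assert rle_decode(runs) == msg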


Error Detecting/Correcting Codes

Error detection is the ability to detect errors caused by noise or other impairments during transmission from the transmitter to the receiver.
Error correction has the additional feature of locating the errors and correcting them.

Examples: Compact Disc, DVD, GSM
Algorithms: Check Digit, Parity Bit, CRC, Hamming Code, Reed-Solomon Code, Convolutional Codes, Turbo Codes and LDPC Codes (a parity-bit sketch follows below)
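A single parity bit is the simplest error-detecting code in the list above. The minimal Python sketch below assumes an even-parity convention; it shows that a parity bit detects, but cannot locate or correct, a single bit error.

# Toy sketch: even-parity check bit. The appended bit makes the number of 1s even,
# so any single bit flip makes the parity check fail (detection, not correction).

def add_parity(bits):
    return bits + [sum(bits) % 2]

def parity_ok(codeword):
    return sum(codeword) % 2 == 0

word = add_parity([1, 0, 1, 1])     # -> [1, 0, 1, 1, 1]
assert parity_ok(word)

word[2] ^= 1                        # a single bit error in transit
assert not parity_ok(word)          # error detected, but its position is unknown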

Coding in the Communications Model

[Block diagram: the same source-to-destination model with noise and an eavesdropper, now annotated with the coding stages: source coding (compression, encryption) and channel coding (error correction coding, modulation).]


What is information?

information: [m-w.org]
1: the communication or reception of knowledge or intelligence


What is an information source?

An information source produces a message or a sequence of messages to be communicated to a destination or receiver.
On a finer granularity, an information source produces symbols to be communicated to the destination.
In this lecture, we will focus on discrete sources, i.e., sources that produce discrete symbols from a predefined alphabet.
However, most of these concepts can be extended to continuous sources as well.


What is an information source?

Intuitively, an information source having more symbols should have more information.
For instance, consider a source, say S1, that wants to communicate its direction to a destination using the following symbols:
North (N), South (S), East (E), West (W)
Another source, say S2, can communicate its direction using:
North (N), South (S), East (E), West (W), Northwest (NW), Northeast (NE), Southwest (SW), Southeast (SE)
Intuitively, with all symbols equally likely, S2 has more information than S1.


Minimum number of bits for a source

Before we formally define information, let us try to answer the following question:

What is the minimum number of bits/symbol required to communicate an information source having n symbols?


Minimum number of bits for a source

What is the minimum number of bits/symbol required to communicate an information source having n symbols?
A simple answer is that log2(n) bits are required to represent n symbols (see the sketch below):
2 symbols: 0, 1
4 symbols: 00, 01, 10, 11
8 symbols: 000, 001, 010, 011, 100, 101, 110, 111
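The pattern can be checked with a short Python sketch; the helper name is an illustrative assumption. A fixed-length binary code needs ceil(log2(n)) bits per symbol to give n symbols distinct patterns.

import math

# Toy sketch: bits per symbol for a fixed-length binary code over n symbols.

def fixed_length_bits(n):
    return math.ceil(math.log2(n))

for n in (2, 4, 8, 26):
    print(n, "symbols ->", fixed_length_bits(n), "bits/symbol")
# 2 -> 1, 4 -> 2, 8 -> 3, 26 -> 5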


Minimum number of bits for a source

Let there be a source X that wants to communicate information about its direction to a destination,
i.e., n = 4 symbols: North (N), South (S), East (E), West (W).
According to our previous definition, log2(4) = 2 bits are required to represent each symbol:
N: 00, S: 01, E: 10, W: 11
If 1000 symbols are generated by X, how many bits are required to transmit these 1000 symbols?


Minimum number of bits for a source

Let there be a source X that wants to communicate information about its direction to a destination,
i.e., n = 4 symbols: North (N), South (S), East (E), West (W).
According to our previous definition, log2(4) = 2 bits are required to represent each symbol:
N: 00, S: 01, E: 10, W: 11
If 1000 symbols are generated by X, how many bits are required to transmit these 1000 symbols?

2000 bits are required to communicate 1000 symbols, i.e., 2 bits/symbol.



Minimum number of bits for a source

Thus we need 2 bits/symbol to communicate the information of a source X with 4 symbols.

Let's reiterate our original question:
What is the minimum number of bits/symbol required to communicate an information source having n symbols? (n = 4 in the present example)

In fact, let's rephrase the question as:
Are 2 bits/symbol the minimum number of bits/symbol required to communicate an information source having n = 4 symbols?


Minimum number of bits for a source

Are 2 bits/symbol the minimum number of bits/symbol required to communicate an information source having n = 4 symbols?
The correct answer is NO! Let's see an example to emphasize this point.


Minimum number of bits for a source

So far in this example, we implicitly assumed that all symbols are equally likely to occur.
Let's now assume that symbols are generated according to a probability mass function pX:

pX: N = 0.6, S = 0.3, E = 0.05, W = 0.05


Minimum number of bits for a source

pX: N = 0.6, S = 0.3, E = 0.05, W = 0.05

Let us map the symbols to the following bit sequences:
N: 0, S: 01, E: 011, W: 0111


Minimum number of bits for a source

pX: N = 0.6, S = 0.3, E = 0.05, W = 0.05
Code: N: 0, S: 01, E: 011, W: 0111

Now if 1000 symbols are generated by X, how many bits are required to transmit these 1000 symbols?


Minimum number of bits for a source

pX: N = 0.6, S = 0.3, E = 0.05, W = 0.05
Code: N: 0, S: 01, E: 011, W: 0111

Now if 1000 symbols are generated by X, how many bits are required to transmit these 1000 symbols?
On average, the 1000 symbols will contain: 600 Ns, 300 Ss, 50 Es and 50 Ws.


Minimum number of bits for a source

pX: N = 0.6, S = 0.3, E = 0.05, W = 0.05
Code: N: 0, S: 01, E: 011, W: 0111

Now if 1000 symbols are generated by X, how many bits are required to transmit these 1000 symbols?
600 Ns, 300 Ss, 50 Es and 50 Ws
Total bits = 600×1 + 300×2 + 50×3 + 50×4 = 1550

1550 bits are required to communicate 1000 symbols, i.e., 1.55 bits/symbol.
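The same 1.55 bits/symbol figure can be reproduced as an expected codeword length; the short Python sketch below uses the probabilities and codewords from the slide (variable names are illustrative).

# Toy sketch: expected code length = sum over symbols of p(symbol) * len(codeword).

p    = {"N": 0.6, "S": 0.3, "E": 0.05, "W": 0.05}
code = {"N": "0", "S": "01", "E": "011", "W": "0111"}

avg_len = sum(p[s] * len(code[s]) for s in p)
print(avg_len)          # ~1.55 bits/symbol on average
print(1000 * avg_len)   # ~1550 bits expected for 1000 symbols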



Minimum number of bits for a source

pX: N = 0.6, S = 0.3, E = 0.05, W = 0.05
Code: N: 0, S: 01, E: 011, W: 0111

1550 bits are required to communicate 1000 symbols, i.e., 1.55 bits/symbol.

The bit mapping defined in this example is generally called a code, and the process of defining this code is called source coding or source compression. The mapped symbols (0, 01, 011 and 0111) are called codewords.

Minimum number of bits for a source

Coming back to our original question:
Are 1.55 bits/symbol the minimum number of bits/symbol required to communicate an information source having n = 4 symbols?


Minimum number of bits for a source

Are 1.55 bits/symbol the minimum number of bits/symbol required to communicate an information source having n = 4 symbols?
The correct answer is: I don't know!
To answer this question, we first need to know the minimum number of bits/symbol for a source with 4 symbols.


Information content of a source

The minimum number of bits/symbol required to communicate the symbols of a source is the information content of the source.
How to find a code that achieves this minimum is a different question.
However, we can quantify the information content of a source without knowing the code(s) that can achieve this minimum.
In this lecture, we will refer to the minimum number of bits/symbol of a source as the information content of the source.


Information content of a source

We start the quantification of a source's information content with a simple question.

Recall from our earlier example that we assigned fewer bits to symbols with higher probabilities (pX: N = 0.6, S = 0.3, E = 0.05, W = 0.05).

Will the number of bits required to represent the source increase or decrease if we instead assign longer codewords to more probable symbols?


Minimum number of bits for a source

pX: N = 0.6, S = 0.3, E = 0.05, W = 0.05
Reversed code: N: 0111, S: 011, E: 01, W: 0

Now if 1000 symbols are generated by X, how many bits are required to transmit these 1000 symbols?
Total bits = 600×4 + 300×3 + 50×2 + 50×1 = 3450, i.e., 3.45 bits/symbol.

These are more bits than we would need if we assumed all symbols to be equally likely (2 bits/symbol).

Information content of a source

So in the worst case, we can simply ignore the probability of each symbol and assign an equal-length codeword to each symbol,
i.e., we are inherently assuming that all symbols are equally likely.

The number of bits required per symbol in this case will be log2(n), where n is the total number of symbols.

Using this coding, we will always be able to communicate all the symbols of the source.


Information content of a source

If we assume equally-likely symbols, we will always be able to communicate all the symbols of the source using log2(n) bits/symbol.
In other words, this is the maximum number of bits/symbol required to communicate any discrete source.

But if a source's symbols are in fact equally likely, what is the minimum number of bits required to communicate this source?


Information content of uniform sources

If the source's symbols are in fact equally likely, what is the minimum number of bits required to communicate this source?

The minimum number of bits required to represent a source with equally-likely symbols is log2(n) bits/symbol.

Such sources are sometimes called uniform sources (pX = 1/n for every symbol).


Information content of uniform sources

The minimum number of bits required to represent a discrete uniform source is log2(n) bits/symbol.

For any discrete source whose symbols are not all equally likely (i.e., a non-uniform source), log2(n) represents the maximum number of bits/symbol.
Among all discrete sources producing a given number n of symbols, a uniform source has the highest information content.

Information content of uniform sources

The minimum number of bits required to represent a discrete source with equally-likely symbols is log2(n) bits/symbol.

Now consider two uniform sources S1 and S2.
Let n1 and n2 respectively represent the total number of symbols of the two sources, with n1 > n2.

Which uniform source has higher information content?


Information content of uniform sources

Two uniform sources S1 and S2; n1 and n2 respectively represent the total number of symbols of the two sources, with n1 > n2.

Which source has higher information content?


Information content of uniform sources

Two uniform sources S1 and S2; n1 and n2 respectively represent the total number of symbols of the two sources, with n1 > n2.

Which source has higher information content?
S1 has more information than S2.

For example, compare the (North, South, East, West) source with a source having the symbols (North, South, East, West, Northwest, Northeast, Southeast, Southwest).

Information content of uniform sources

Thus, if there are multiple sources with equally-likely symbols, the source with the maximum number of symbols has the maximum information content.
In other words, for equally-likely sources, a function H(.) that quantifies the information content of a source should be an increasing function of the number of symbols.
Let's call this function H(n).

Any ideas what H(n) should be?



Information content of uniform sources


You should convince yourself that for a uniform source:

H(n) = log2(n)


Information content of non-uniform sources

Generally, information sources do not have equally-likely symbols; i.e., they are non-uniform.

An example is the frequency distribution of letters in the English language.


Information content of a non-uniform source

[Figure: normalized frequencies of letters in the English language. Image courtesy of Wikipedia: http://en.wikipedia.org/wiki/Letter_frequencies]


Information content of a non-uniform source

Since in general symbols are not equally likely, source compression can be achieved by designing a code that:

- assigns more bits to less likely symbols, and
- assigns fewer bits to more likely symbols.

As more likely symbols occur more often than less likely ones, the average number of bits/symbol required by a code with the above properties will be less than log2(n).


Information content of a non-uniform source

A function to quantify the information content of a non-uniform source X should be a function of the probability distribution pX of X, say H(pX).

Since more bits are assigned to less likely symbols, H(pX) should increase as pX decreases.


Information content of a non-uniform source

Since more bits are assigned to less likely symbols, H(pX) should increase as pX decreases.
The following function has been proven to provide the right quantification for a given symbol i:

H(pX=i) = log2(1/pX=i)

Since pX=i ≤ 1, H(pX=i) is always non-negative.
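To get a feel for this quantity, the Python sketch below (probabilities taken from the earlier direction-source example; variable names are illustrative) evaluates log2(1/p) for each symbol; rarer symbols carry more information.

import math

# Toy sketch: self-information log2(1/p) of each symbol of the example source.

p = {"N": 0.6, "S": 0.3, "E": 0.05, "W": 0.05}
for sym, prob in p.items():
    print(sym, round(math.log2(1 / prob), 3), "bits")
# N 0.737, S 1.737, E 4.322, W 4.322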


Information content of a non-uniform source

For a given symbol i, the information content of that symbol is given by:
H(pX=i) = log2(1/pX=i)

So what is the expected or average value of the information content over all the symbols of pX?


Information content of a non-uniform source

What is the expected or average value of the information content of all the symbols of a source with probability distribution pX?
This expected value is the weighted average of the information content of all the symbols:

H(pX) = Σi pX=i · log2(1/pX=i)


Information content of a non-uniform source

The information content of a discrete source with symbol distribution pX is:

H(pX) = Σi pX=i · log2(1/pX=i)

This is called the entropy of the source, and it represents the minimum expected number of bits/symbol required to communicate this source.
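As a numerical sketch (using the probabilities from the earlier example), the entropy works out to just under 1.4 bits/symbol, so the 1.55 bits/symbol code found earlier is good but not optimal.

import math

# Toy sketch: entropy H(pX) = sum_i p_i * log2(1/p_i) for the example source.

p = [0.6, 0.3, 0.05, 0.05]
H = sum(pi * math.log2(1 / pi) for pi in p)
print(round(H, 4))   # about 1.3955 bits/symbol

# The earlier code (0, 01, 011, 0111) averaged 1.55 bits/symbol,
# so it is within about 0.15 bit/symbol of this lower bound.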

Information content of a non-uniform source

Before finishing our discussion of information sources, let us apply the entropy formula to a uniform source (pX = 1/n for each of the n symbols).


Information content of a non-uniform source

Before finishing our discussion of information sources, let us apply the entropy formula to a uniform source:

H(pX) = Σi (1/n) · log2(n) = log2(n) bits/symbol

Note that this is the same function that we deduced earlier for uniform sources.
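A quick numerical check of this identity (the symbol counts are chosen arbitrarily for the example):

import math

# Toy sketch: for a uniform source, H = sum over n symbols of (1/n) * log2(n) = log2(n).

def uniform_entropy(n):
    return sum((1 / n) * math.log2(n) for _ in range(n))

for n in (4, 8, 26):
    print(n, uniform_entropy(n), math.log2(n))   # the two values coincide (up to rounding)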

