
EE 708 Information Theory, IIT Bombay, 02 January 2012

EE 708: Information Theory and Coding


Sibi Raj B Pillai
Department of Electrical Engineering


Mindmap

[Mind map: Information Theory at the centre, connected to Information
Measures, Lossless Compression, Data Compression (Rate Distortion
Theory, Distributed Compression, Universal Compression), Channel
Coding (Communication Capacity, Error Correction Codes, Error
Exponents, Coding with Side Info), Network Information Theory
(Multiple Access, Broadcast, Relay/Interference, Multiple Descriptions,
Joint Source Channel Code), and Vistas Beyond (Signal Processing,
Mathematical Finance, Telecom, Evolution, Ergodic Theory, Random
Matrix Theory, Non-Shannon Measures).]

Sending Packets
ARQ

[Block diagram: Transmitter -> Channel -> Receiver]

If each packet takes T seconds, how many packets can we send in
nT seconds (n >> 1)?

Sending Packets
ARQ

[Block diagram: Transmitter -> Channel -> Receiver; each packet is
lost with probability p]

If each packet takes T seconds, how many packets can we send in
nT seconds (n >> 1)?

Average Packet Time

I Let \hat{T} be the time spent for correctly sending a packet.
I \hat{T} is a random variable due to errors in the channel.
I The average time required for correctly sending a packet is

      E[\hat{T}] = (1-p)T + p(1-p)(2T) + p^2(1-p)(3T) + \cdots
                 = (1-p)T \sum_{i \geq 1} i p^{i-1}
                 = (1-p)T \cdot \frac{1}{(1-p)^2}
                 = \frac{T}{1-p}.                                  (1)

I From this, the number of correct packets N_n \approx n(1-p).^1

^1 It is important to notice that this is an approximation; the actual
computation is on the next page.
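As a quick sanity check (not part of the original slides), the following
Python sketch simulates the ARQ retransmission process, assuming each
attempt is lost independently with probability p, and compares the
empirical average delivery time with T/(1-p):

import random

def time_to_deliver(T, p, rng):
    """Time spent to correctly deliver one packet: retransmit until success."""
    attempts = 1
    while rng.random() < p:      # this attempt is lost with probability p
        attempts += 1
    return attempts * T

rng = random.Random(0)
T, p, trials = 1.0, 0.3, 100_000
avg = sum(time_to_deliver(T, p, rng) for _ in range(trials)) / trials
print(avg, T / (1 - p))          # the two numbers should be close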

Successful Packets
The probability that k packets were lost is

      \binom{n}{k} p^k (1-p)^{n-k}.

Thus the average number of lost packets N_f is

      N_f = \sum_{k=0}^{n} k \binom{n}{k} p^k (1-p)^{n-k}
          = (1-p)^n \sum_{k=0}^{n} k \binom{n}{k} \alpha^k, \quad \text{where } \alpha = \frac{p}{1-p}
          = (1-p)^n \, \alpha \frac{d}{d\alpha} \sum_{k=0}^{n} \binom{n}{k} \alpha^k
          = (1-p)^n \, \alpha \frac{d}{d\alpha} (1+\alpha)^n
          = (1-p)^n \, \alpha \, n (1+\alpha)^{n-1}
          = n(1-p)\alpha = np.
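A tiny numerical check of this closed form (not in the original slides;
n and p below are arbitrary illustrative values):

from math import comb

n, p = 50, 0.3
avg_lost = sum(k * comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1))
print(avg_lost, n * p)   # both evaluate to (essentially) 15.0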

Randomness in ARQ

I Let N_n be the number of packets correctly conveyed in nT secs.
I N_n is a random variable (we saw its distribution in the last slide).
I Every time we independently repeat the experiment, N_n takes a
  new (possibly different) realization.
I What is the sample space of N_n?
I The Weak Law of Large Numbers says

      \frac{N_n^{(1)} + N_n^{(2)} + N_n^{(3)} + \cdots + N_n^{(m)}}{m}
          \xrightarrow{\;\text{i.p.}\;} n(1-p)

I What does this mean?

:- see later slides
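A minimal simulation of this statement (added here as an illustration,
not from the slides): generate m independent realizations of N_n,
assuming each of the n packets survives independently with probability
1-p, and watch their average settle near n(1-p):

import random

rng = random.Random(1)
n, p = 100, 0.3

def one_realization():
    """One realization of N_n: packets surviving out of n independent slots."""
    return sum(rng.random() >= p for _ in range(n))

for m in (10, 100, 10_000):
    avg = sum(one_realization() for _ in range(m)) / m
    print(m, avg)          # approaches n*(1-p) = 70 as m grows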

Scratching the CDs

Have you noticed the scratches on a CD?

Scratches most often cause loss of data.

How does the music still get played from a scratched CD?

ans: Reed-Solomon Codes/Discrete Fourier Transform

Defective Memory

I Suppose you develop a technique to make a dirt-cheap high
  density memory disk, on which you wish to mass-distribute some
  content.
I Unfortunately, on every disk there are a number of faulty (stuck)
  memory cells whose locations are known to you on testing.
I How efficiently can you pack data so that no information is lost at
  the intended receivers?

ans: Coding with Transmitter Side Information

Distributed Source Coding

I Two sensors at different locations observe versions of the same
  phenomenon, say the temperature in a plant. They both wish to
  communicate their data to a central station.
I Note that though the observed information is related, each
  sensor has no idea (other than statistical properties) of the
  observations at the other one.
I Do they have to individually send all the information they collect?
I Even if you let those sensors sit together, collect and process
  the data jointly, the sum of the rates sent individually is the same
  as in the case when they are not talking to each other.

Slepian-Wolf Coding

CDMA vs OFDMA

I The move from 2G to 3G marked the shift from TDMA to CDMA, largely
  due to the efforts of Qualcomm and others.
I From 3G to 4G, CDMA ran out of steam.
I In particular, technologies like OFDMA (as the name says, a form of
  FDM) took over.
I How do we know whether a new physical layer technology has a future,
  say in wireless communication?

ans: Capacity region of fading MAC/BC

:- not fully covered in EE708

Taking it too Far

I A huge hadron collider is looking for subatomic particles. The
  sensor produces a binary output at regular time intervals. In each
  interval, the probability of observing the particle is p, independent
  of all other observations. The data is to be stored inside the
  collider for as long as the experiment lasts, say 10^{10} seconds.
  Since the onboard memory is limited, you wish to minimize the
  average memory required per observation; take p = 0.0001.
I Design a scheme which will get you there.
I Many codes can do this; in particular, Arithmetic coding is very
  popular.
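As a rough benchmark (added here; not in the original slides), the
binary entropy function gives the smallest achievable average storage
per binary observation, and the sketch below evaluates it in Python
for p = 0.0001:

from math import log2

def binary_entropy(p):
    """H(p) in bits: lower bound on average storage per binary observation."""
    return -p * log2(p) - (1 - p) * log2(1 - p)

print(binary_entropy(0.0001))   # about 0.0015 bits per observation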

Probability Essentials
Introductory Treatments and Reference
I William Feller, An Introduction to Probability, Wiley Student
Edition (Indian), 1970.
I P. Billingsley, Probability and Measure, Wiley 1995.
I Fristedt and Gray, A Modern Approach to Probability Theory,
Birkhauser 1998.
Details Relevant to Shannon Theory
I Robert Gray, Probability, Random Processes and Ergodic
Properties, available online at
http://ee.stanford.edu/gray/arp.html

Note: We will use the minimal required set

Bayes Rule
In its simplest form,
P(A, B) = P(A) P(B|A).
But what are P, A and B? Try this example.
Question) Consider a blood-testing equipment which detects the
presence of a disease in 99% of the cases, if presented with infected
samples. Thus 1% of the infected escape undetected. On the other
hand, the test gives false results for 2% of the healthy patients too.
Suppose, on average, 1 out of 1000 people is infected. If the
machine gives a positive test, what is the chance of the blood sample
being actually infected?
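A small numerical sketch of the answer (not part of the original
slides), applying Bayes' rule directly to the numbers in the question:

# P(infected | positive test) via Bayes' rule
p_infected = 1 / 1000            # prior: 1 out of 1000 people infected
p_pos_given_infected = 0.99      # detection probability
p_pos_given_healthy = 0.02       # false positive rate on healthy samples

p_positive = (p_pos_given_infected * p_infected
              + p_pos_given_healthy * (1 - p_infected))
posterior = p_pos_given_infected * p_infected / p_positive
print(posterior)                 # roughly 0.047, i.e. under 5%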

Let us now make these precise.

Probability Definition
Probability Space (Ω, F, P)

I We assume a set of possible experimental outcomes, say Ω.
I To obtain a level playing ground, we generate a field F
  containing subsets (events) of Ω.
I In order to be stable (meaningful) with respect to operations like
  limits, we insist that our field F is closed with respect to
  countable unions, and we call it a σ-field.
I Our final party is the measure (loosely, a non-negative function) P:
  I Unit Norm:
        P(\Omega) = 1                                            (2)
  I Countable Additivity: for disjoint sets A_i \in F, i = 1, 2, \ldots,
        P\left( \bigcup_{i=1}^{\infty} A_i \right) = \sum_{i=1}^{\infty} P(A_i).    (3)
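For illustration (this example is not from the slides), the simplest
non-trivial instance is a fair six-sided die: take \Omega = \{1, \ldots, 6\},
\mathcal{F} = 2^{\Omega} (the power set), and P(A) = |A|/6 for every
A \in \mathcal{F}. Then P(\Omega) = 6/6 = 1, and for disjoint A, B we have
P(A \cup B) = (|A| + |B|)/6 = P(A) + P(B), so both (2) and (3) hold.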

Random Variables
A random variable X is a measurable mapping from a probability
space
      (\Omega, F, P) \to (\Lambda, B),
where Λ is a set of observables, and B a σ-field of observed events.
I The RHS imbibes a probability measure from the LHS, and there is no
  need to explicitly mention it.
I If X is discrete and finite (the case for a large part of our
  lectures), then the associated σ-fields can be taken as the
  powerset (set of all subsets).
I More generally, there are other σ-fields as well, but we stick to
  the sensible ones with respect to which X is measurable. The
  ones we use are mostly clear from the context.
I From now on, we assume that X takes values in R.
I We have to be more careful when X is continuous valued.

Expectation

I For a discrete random variable X, the quantity

      E[X] = \sum_{x} x \, P(X = x),

  is known as the expectation of X.
I Notice that E(\cdot) is a linear operation.
I Similarly, when X is continuous, admitting a density f_X(x),

      E[X] = \int x \, f_X(x) \, dx.
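A toy computation of the discrete case (added for illustration; the pmf
below is made up for this sketch):

# Expectation of a discrete random variable as a probability-weighted sum
pmf = {0: 0.2, 1: 0.5, 2: 0.3}                       # P(X = x) for x = 0, 1, 2
expectation = sum(x * p for x, p in pmf.items())
print(expectation)                                   # 0*0.2 + 1*0.5 + 2*0.3 = 1.1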

Weak Law of Large Numbers (WLLN)

I Weak law is about weak convergence of random variables.
I Let {X_i}, i \in N, be a sequence of iid R-valued random variables.
  Define
      S_n \triangleq \frac{1}{n} \sum_{i=1}^{n} X_i.
  Then S_n converges in probability to E[X].
I In other words, \forall \epsilon > 0,
      P(|S_n - E[X]| > \epsilon) \to 0.
I The Strong LLN says that if E[|X_i|] < \infty, then \forall \epsilon > 0,
      P\left( \lim_{n \to \infty} |S_n - E[X]| > \epsilon \right) = 0.
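For the finite-variance case (an assumption not stated on the slide),
the standard one-line argument behind the weak law is Chebyshev's
inequality:
      P(|S_n - E[X]| > \epsilon) \le \frac{\mathrm{Var}(S_n)}{\epsilon^2}
          = \frac{\mathrm{Var}(X)}{n \epsilon^2} \to 0 \quad \text{as } n \to \infty.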
