
ECNG 6703 - Principles of Communications

Introduction to Information Theory - Source Coding

Sean Rocke

September 16th, 2013


Outline

Digital Communication Preliminaries
Models for Information Sources
Measures of Information
Source Coding
Conclusion


Digital Communication Preliminaries

Communications Signal Examples


Let's get some intuition before starting source coding. Consider what happens just before transmitting or just after receiving any of the following:
- Taking & posting a narcissistic picture of you ziplining in Chaguaramas on your Facebook profile
- A GSM phone conversation
- Sending instrumentation data to a control system for a manufacturing plant
- Downloading a legal copy of an ebook from amazon.com
- Live transmission of a Machel Montano concert over the Internet

How do we communicate these signals using a digital communications system? What characteristics are of importance?

Digital Communication Preliminaries

Elements of a Digital Communications System


[Block diagram] Transmitter chain: Information source and input transducer → Source encoder → Channel encoder → Digital modulator → Channel. Receiver chain: Channel → Digital demodulator → Channel decoder → Source decoder → Output transducer.

Elements not specifically included in the illustration:
- Carrier and symbol synchronization
- A/D interface
- Channel interfaces (e.g., RF front end (RFFE), fiber optic front end (FOFE), BAN front end (BANFE), . . . )


Digital Communication Preliminaries

Signal Classification Matrix

[Signal classification matrix: rows for Value (Discrete / Continuous), columns for Time (Discrete / Continuous)]

Models for Information Sources

Source Coding Defined

- Source coding: the process of efficiently converting the output of either an analog or digital source into a bit sequence.
- Source coding challenge: how can the source output (digital or analog) be represented in as few bits as possible?
- Key performance metrics: coding efficiency, redundancy, rate distortion, implementation complexity.
- To answer the above, it is essential to model the signal sources. . .


Models for Information Sources

Analog Source Models

- An information source produces a random output.
- The analog output at any time $t$ is a random variable $X(t)$ with CDF $F_X(x_1; t_1) = P[X(t_1) \le x_1]$.
- The joint CDF is defined as $F_X(x_1, \ldots, x_n; t_1, \ldots, t_n) = P[X(t_1) \le x_1, \ldots, X(t_n) \le x_n]$.
- We consider statistically stationary outputs, for which $F_X(x_1, \ldots, x_n; t_1, \ldots, t_n) = F_X(x_1, \ldots, x_n; t_1 + \tau, \ldots, t_n + \tau)$ for any shift $\tau$.


Models for Information Sources

Analog-to-Digital Conversion
[Figure: original, sampled, and quantized versions of a signal, together with the quantization error; Amplitude vs. Time, t]

- For band-limited $X(t)$ with bandwidth $W$, we can take samples $\{X\left(\frac{n}{2W}\right)\}$ to obtain a discrete-time output.
- Precision is lost when the discrete-time analog samples are quantized (unavoidable!) — see the sketch below.
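A minimal MATLAB sketch of this sampling-and-quantization idea, along the lines of the figure above; the sinusoid, sampling rate, and number of quantizer levels are illustrative assumptions, not values from the slides:

```matlab
% Minimal sketch (illustrative parameters): sample a 1 Hz sinusoid and
% quantize the samples with a uniform quantizer, then look at the error.
fs   = 8;                        % sampling rate in samples/s (assumed > 2W)
t    = 0:0.001:6;                % dense time axis standing in for continuous time
ts   = 0:1/fs:6;                 % sampling instants
x    = sin(2*pi*1*t);            % "analog" signal
xs   = sin(2*pi*1*ts);           % discrete-time samples
L    = 8;                        % number of quantizer levels (3 bits/sample)
step = 2/L;                      % step size for amplitudes in [-1, 1]
xq   = step*round(xs/step);      % uniform (mid-tread) quantization
eq   = xs - xq;                  % quantization error: the unavoidable precision loss
fprintf('Max quantization error = %.3f\n', max(abs(eq)));

plot(t, x, ts, xs, 'o', ts, xq, 's'); grid on
xlabel('Time, t'); ylabel('Amplitude');
legend('Original signal', 'Sampled signal', 'Quantized signal');
```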

Models for Information Sources

Discrete Source Models


- An information source produces a random output, either directly (i.e., if discrete-time & discrete-valued) or through A/D conversion (i.e., if continuous-time/continuous-valued).
- An information source with an alphabet $\mathcal{L} := \{x_1, \ldots, x_L\}$ emits a random sequence of letters.
- Assume each letter occurs with probability $p_k = P_X(x_k)$, $1 \le k \le L$, where $\sum_{k=1}^{L} p_k = 1$.
- If letter occurrences are independent, the source is a discrete memoryless source (DMS).
- For the dependent case, we consider statistically stationary outputs, where $p_X(x_1, \ldots, x_n; t_1, \ldots, t_n) = p_X(x_1, \ldots, x_n; t_{1+m}, \ldots, t_{n+m})$.


Models for Information Sources

Information Source Models


[Diagram to annotate: "Information Sources", with branches labelled Continuous, Discrete, Memory, and Memoryless]

Questions:
1. Annotate this diagram based upon the previous information.



Measures of Information

Measures of Information
So given a particular source model, how do we quantify the information content?
- Consider two discrete RVs, $X$ and $Y$, and assume that an outcome $Y = y$ is observed.
- Can we determine quantitatively the amount of information the occurrence of $Y = y$ provides about the event $X = x$?
- Mutual information between outcomes $x$ and $y$ — a measure of the information provided by the occurrence of $Y = y$ about $X = x$:
  $$I(x; y) = \log \frac{P_{X|Y}(x|y)}{P_X(x)}$$
- Mutual information between RVs $X$ and $Y$ — the average of $I(x; y)$:
  $$I(X; Y) = E[I(x; y)] = \sum_{x \in \mathcal{X}} \sum_{y \in \mathcal{Y}} P_{XY}(x, y)\, I(x; y)$$
- Units: bits (if $\log_2$ is used) or nats (if $\ln$ is used)
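A minimal MATLAB sketch of this averaging, using an assumed 2×2 joint PMF (illustrative values, not from the slides):

```matlab
% Minimal sketch: average mutual information I(X;Y) for an assumed joint PMF.
Pxy = [0.30 0.20;                % rows: x1, x2; columns: y1, y2 (assumed values)
       0.10 0.40];
Px  = sum(Pxy, 2);               % marginal PMF of X
Py  = sum(Pxy, 1);               % marginal PMF of Y

I = 0;
for i = 1:size(Pxy, 1)
    for j = 1:size(Pxy, 2)
        if Pxy(i, j) > 0
            % I(x;y) = log2( P(x|y)/P(x) ) = log2( P(x,y)/(P(x)P(y)) )
            I = I + Pxy(i, j) * log2(Pxy(i, j) / (Px(i)*Py(j)));
        end
    end
end
fprintf('I(X;Y) = %.4f bits\n', I);
```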



Measures of Information

Measures of Information
Properties of mutual information:
- $I(X; Y) = I(Y; X)$
- $I(X; Y) \ge 0$
- $I(X; Y) = 0$ if and only if $X$ and $Y$ are independent
- $I(X; Y) \le \log \min\{|\mathcal{X}|, |\mathcal{Y}|\}$

Questions: consider $I(x; y)$ for the following cases:
1. $X$ and $Y$ are statistically independent
2. $X$ and $Y$ are fully dependent
3. $X$ and $Y$ are partially dependent


Measures of Information

Measures of Information: Entropy


How can we measure uncertainty in the source?
- Entropy of $X$:
  $$H(X) = -E[\log P_X(x)] = -\sum_{x \in \mathcal{X}} P_X(x) \log P_X(x)$$
- Note: we define $0 \log 0 = 0$
- Units: bits (if $\log_2$ is used) or nats (if $\ln$ is used)

Questions:
1. What is the entropy of a deterministic information source?
2. When is the entropy of a DMS with alphabet size $|\mathcal{X}|$ maximized? What is $H(X)$ in this case?
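As a small numerical companion to the second question, the MATLAB sketch below computes $H(X)$ for an assumed 4-letter DMS and compares it with the uniform-source bound $\log_2 L$ (the PMF is an illustrative example):

```matlab
% Entropy of an assumed 4-letter DMS, compared with the uniform bound log2(L).
p = [0.5 0.25 0.125 0.125];              % assumed letter probabilities
H = -sum(p(p > 0) .* log2(p(p > 0)));    % convention: 0*log(0) = 0
fprintf('H(X) = %.3f bits/letter\n', H);                              % 1.750
fprintf('log2(L) = %.3f bits/letter (uniform case)\n', log2(numel(p)));  % 2.000
```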


Measures of Information

Measures of Information: Entropy


Properties:
- $I(X; X) = H(X)$
- $0 \le H(X) \le \log |\mathcal{X}|$
- $I(X; Y) \le \min\{H(X), H(Y)\}$
- If $Y = g(X)$, then $H(Y) \le H(X)$

Questions:
1. Calculate the entropy of a binary source with probability $p$ that a 1 occurs. Plot the entropy function (i.e., $H(X)$ vs. $p$) using MATLAB — see the sketch below.
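A minimal MATLAB sketch for the plotting part of the question (the grid resolution is an arbitrary choice):

```matlab
% Binary entropy function H(p) = -p*log2(p) - (1-p)*log2(1-p).
p  = linspace(0, 1, 1001);
Hb = -p.*log2(p) - (1-p).*log2(1-p);
Hb(p == 0 | p == 1) = 0;             % apply the convention 0*log(0) = 0 at the endpoints
plot(p, Hb); grid on
xlabel('p (probability of a 1)'); ylabel('H(X) [bits]');
title('Binary entropy function');
```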


Measures of Information

Measures of Information: Joint & Conditional Entropy


Multivariable extension of entropy. Consider RVs $X$ and $Y$:
- Joint entropy:
  $$H(X, Y) = -E[\log P_{XY}(x, y)] = -\sum_{(x,y) \in \mathcal{X} \times \mathcal{Y}} P_{XY}(x, y) \log P_{XY}(x, y)$$
- If $X = x$ is known, the entropy of $Y$ given $X = x$ is:
  $$H(Y|X = x) = -\sum_{y \in \mathcal{Y}} P_{Y|X}(y|x) \log P_{Y|X}(y|x)$$
- Averaging this quantity over all values of $X$ gives the conditional entropy:
  $$H(Y|X) = -E[\log P_{Y|X}(y|x)] = -\sum_{(x,y) \in \mathcal{X} \times \mathcal{Y}} P_{XY}(x, y) \log P_{Y|X}(y|x)$$

Question: show that $H(X, Y) = H(X) + H(Y|X)$ (a numerical check is sketched below).
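A minimal MATLAB check of this chain-rule identity, using an assumed 2×2 joint PMF (illustrative values, chosen with no zero entries):

```matlab
% Numerical check of H(X,Y) = H(X) + H(Y|X) for an assumed joint PMF.
Pxy = [0.30 0.20;
       0.10 0.40];                              % illustrative joint PMF
Px  = sum(Pxy, 2);                              % marginal PMF of X
Hxy = -sum(Pxy(:) .* log2(Pxy(:)));             % joint entropy H(X,Y)
Hx  = -sum(Px .* log2(Px));                     % H(X)

HYgX = 0;                                       % conditional entropy H(Y|X)
for i = 1:size(Pxy, 1)
    PygX = Pxy(i, :) / Px(i);                   % P(Y | X = x_i)
    HYgX = HYgX - Px(i) * sum(PygX .* log2(PygX));
end
fprintf('H(X,Y) = %.4f, H(X) + H(Y|X) = %.4f\n', Hxy, Hx + HYgX);
```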

Measures of Information

Measures of Information: Joint & Conditional Entropy

Properties of joint and conditional entropy:
- $0 \le H(X|Y) \le H(X)$
- $H(X|Y) = H(X)$ if and only if $X$ and $Y$ are independent
- $H(X, Y) = H(X) + H(Y)$ if and only if $X$ and $Y$ are independent
- $H(X, Y) = H(X) + H(Y|X) = H(Y) + H(X|Y) \le H(X) + H(Y)$
- $I(X; Y) = H(X) - H(X|Y) = H(Y) - H(Y|X) = H(X) + H(Y) - H(X, Y)$


Measures of Information

Measures of Information: Continuous RVs


Consider continuous RVs $X$ and $Y$ with joint and marginal PDFs $f_{XY}(x, y)$, $f_X(x)$ and $f_Y(y)$, respectively:
- Average mutual information:
  $$I(X; Y) = E\left[\log \frac{f_{Y|X}(y|x) f_X(x)}{f_X(x) f_Y(y)}\right] = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} f_{XY}(x, y) \log \frac{f_{Y|X}(y|x) f_X(x)}{f_X(x) f_Y(y)}\, dx\, dy$$
- Average conditional entropy:
  $$H(X|Y) = -E[\log f_{X|Y}(x|y)] = -\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} f_{XY}(x, y) \log f_{X|Y}(x|y)\, dx\, dy$$
- Differential entropy (not the same as entropy!):
  $$H(X) = -E[\log f_X(x)] = -\int_{-\infty}^{\infty} f_X(x) \log f_X(x)\, dx$$
- Note: $I(X; Y) = H(X) - H(X|Y) = H(Y) - H(Y|X)$.
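A small MATLAB sketch of differential entropy for an assumed zero-mean Gaussian source: the numerically integrated value is compared against the known closed form $\tfrac{1}{2}\log_2(2\pi e \sigma^2)$ bits (the finite integration limits are a truncation chosen wide enough for the tails to be negligible):

```matlab
% Differential entropy of an assumed Gaussian source, X ~ N(0, sigma^2).
sigma = 2;                                                   % assumed standard deviation
fx    = @(x) exp(-x.^2/(2*sigma^2)) / (sigma*sqrt(2*pi));    % Gaussian PDF
h_num = -integral(@(x) fx(x).*log2(fx(x)), -10*sigma, 10*sigma);  % numerical value
h_cf  = 0.5*log2(2*pi*exp(1)*sigma^2);                       % closed form for a Gaussian
fprintf('Numerical: %.4f bits, closed form: %.4f bits\n', h_num, h_cf);
```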

Source Coding

Lossless Source Coding


- How can the source output (digital) be represented in as few bits as possible, such that perfect reconstruction of the source is possible from the compressed data?
- What is the theoretical lowest number of bits required to represent the source output?
- Shannon's First Theorem (Lossless Source Coding Theorem): Let $X$ denote a DMS with entropy $H(X)$. A lossless source code exists at any rate $R$ if $R > H(X)$; no such code exists for $R < H(X)$.
- Is it possible to achieve this bound? How?
  - Variable-length coding algorithms (e.g., the Huffman algorithm)
  - Fixed-length coding algorithms (e.g., the Lempel-Ziv algorithm)

You are required to know how to use both the Huffman and Lempel-Ziv algorithms to determine source codes! (A toolbox-based Huffman sketch follows.)
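A minimal MATLAB sketch of Huffman coding, assuming the Communications Toolbox functions huffmandict, huffmanenco and huffmandeco are available (otherwise the code tree can be built by hand); the numeric symbols 1–3 stand in for letters x1–x3, and the probabilities are those of the three-letter example that follows:

```matlab
% Huffman coding sketch (assumes the Communications Toolbox is installed).
symbols = [1 2 3];                       % stand-ins for letters x1, x2, x3
prob    = [0.45 0.35 0.20];              % letter probabilities
[dict, avglen] = huffmandict(symbols, prob);   % code dictionary & average length
fprintf('Average codeword length: %.3f bits/letter\n', avglen);

seq = [1 3 2 1 1 2];                     % a short test letter sequence
enc = huffmanenco(seq, dict);            % encode into a bit stream
dec = huffmandeco(enc, dict);            % decode
fprintf('Lossless reconstruction? %d\n', isequal(dec(:), seq(:)));
```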

Source Coding

Evaluating Code Performance: Efficiency


Letter   Probability   Self-information   Code
x1       0.45          1.156              1
x2       0.35          1.520              00
x3       0.20          2.330              01

H(X) = 1.513 bits/letter
R1 = 1.55 bits/letter
Efficiency = H(X)/R1 = 97.6%
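A short MATLAB check of the figures in this table (the probabilities and codeword lengths are copied from it):

```matlab
% Entropy, average code rate, and efficiency for the single-letter code above.
p   = [0.45 0.35 0.20];                  % letter probabilities
len = [1 2 2];                           % lengths of codewords 1, 00, 01
H   = -sum(p .* log2(p));                % ~1.513 bits/letter
R1  = sum(p .* len);                     % 1.55 bits/letter
fprintf('H(X) = %.3f, R1 = %.2f, efficiency = %.1f%%\n', H, R1, 100*H/R1);
```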

Questions:
1. Can we improve on this code?



Source Coding

Evaluating Code Performance: Efficiency


Letter pair   Probability   Self-information   Code
x1x1          0.2025        2.312              10
x1x2          0.1575        2.676              001
x2x1          0.1575        2.676              010
x2x2          0.1225        3.039              011
x1x3          0.09          3.486              111
x3x1          0.09          3.486              0000
x2x3          0.07          3.850              0001
x3x2          0.07          3.850              1100
x3x3          0.04          4.660              1101

2H(X) = 3.026 bits/letter pair
R2 = 3.0675 bits/letter pair
Efficiency = 2H(X)/R2 = 98.6%

Questions:
1. What is the trade-off compared to the first code for single letters?

Source Coding

Evaluating Code Performance


Letter   Probability   Code 1   Code 2   Code 3   Code 4
x1       0.5           1        0        0        00
x2       0.25          00       10       01       01
x3       0.125         01       110      011      10
x4       0.125         10       111      111      11

For each code: variable length? fixed length? uniquely decodable? instantaneous? efficiency?

Questions:
1. Classify the codes above (a Kraft-inequality check is sketched below).
2. Which is the most efficient?
3. Which would you choose?
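As a starting point for the classification, the MATLAB sketch below checks the Kraft inequality, $\sum_i 2^{-l_i} \le 1$ (a necessary condition for unique decodability), for the codeword lengths of each code in the table:

```matlab
% Kraft inequality check for the four candidate codes above.
codes = { {'1','00','01','10'}, ...      % Code 1
          {'0','10','110','111'}, ...    % Code 2
          {'0','01','011','111'}, ...    % Code 3
          {'00','01','10','11'} };       % Code 4
for c = 1:numel(codes)
    len   = cellfun(@length, codes{c});  % codeword lengths
    kraft = sum(2.^(-len));              % a sum > 1 rules out unique decodability
    fprintf('Code %d: Kraft sum = %.3f\n', c, kraft);
end
```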

Source Coding

Evaluating Code Performance: Rate Distortion


- Sampling and quantization of an analog source generally result in:
  - waveform distortion
  - loss of signal fidelity
- Distortion between the actual source samples $\{x_k\}$ and the quantized values $\{\tilde{x}_k\}$:
  - Single-sample measure (squared error): $d(x_k, \tilde{x}_k) = (x_k - \tilde{x}_k)^2$
  - $n$-sample sequence measure: $d(\mathbf{x}_n, \tilde{\mathbf{x}}_n) = \frac{1}{n} \sum_{k=1}^{n} d(x_k, \tilde{x}_k)$
- Distortion, $D$:
  $$D = E\left[d(\mathbf{X}_n, \tilde{\mathbf{X}}_n)\right] = \frac{1}{n} \sum_{k=1}^{n} E\left[d(X_k, \tilde{X}_k)\right]$$

Source Coding

Evaluating Code Performance: Rate Distortion

Rate-distortion function: the minimum number of bits per source output symbol required to represent the source output $X$ with distortion less than or equal to $D$:
$$R(D) = \min_{f_{\tilde{X}|X}(\tilde{x}|x):\; E[d(X, \tilde{X})] \le D} I(X; \tilde{X})$$
Note: evaluating code performance using rate distortion applies to lossy coding, where the data is compressed subject to a maximum tolerable distortion (i.e., some of the information is lost during coding and hence cannot be regained via reconstruction).
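As one concrete illustration (an assumed example, not taken from the slide), the MATLAB sketch below plots the well-known rate-distortion function of a memoryless Gaussian source with variance $\sigma^2$ under squared-error distortion, $R(D) = \max\{0, \tfrac{1}{2}\log_2(\sigma^2/D)\}$:

```matlab
% Rate-distortion function of an assumed Gaussian source (squared-error distortion).
sigma2 = 1;                              % assumed source variance
D      = linspace(0.01, 1.5, 300);       % distortion axis
R      = max(0, 0.5*log2(sigma2 ./ D));  % bits per source symbol
plot(D, R); grid on
xlabel('Distortion, D'); ylabel('R(D) [bits/symbol]');
title('R(D) for a Gaussian source, squared-error distortion');
```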


Conclusion

Conclusion
We covered:
- Elements of a digital communications system
- Mathematical models for information sources
- Measures of information
- Lossless & lossy source coding
- Source code evaluation
- More MATLAB

Your goals for next class:
- Continue ramping up your MATLAB skills
- Make sure you can apply the Huffman and Lempel-Ziv algorithms!
- Complete HW 2
- Review the notes on Channel Coding in preparation for next class

Q&A

Thank You

Questions????

