Module 1
Basic Probability
[Block diagram: message → Transmitter → Channel → Receiver → estimated message]
In order to study this basic scenario, we will break up the problem into modules.
[Block diagram: Txmt msg → Encoder (Tx3) → Waveform mapping (Tx2) → Up-converter (Tx1) → Channel → Down-converter (Rx1) → Baseband front-end (Rx2) → Decoder (Rx3) → Estimated msg]
Modules
Encoding to handle errors: error-correcting codes and decoding (Tx3 and Rx3 refined).
Basic Probability
Formal probability theory starts with the concept of a probability space $(\Omega, \mathcal{F}, P)$ consisting of
(i) An abstract space $\Omega$, the sample space, containing all distinguishable elementary outcomes of an experiment.
(ii) An event space $\mathcal{F}$, consisting of a collection of subsets of $\Omega$ which we consider to be the possible events to which we want to assign probabilities. We require an algebraic structure that we will specify soon.
(iii) A probability measure $P$, which assigns a number between 0 and 1 to every event in $\mathcal{F}$. It has to satisfy some axioms that we will explain later.
A natural question to ask is whether we can make $\mathcal{F}$ the set of all subsets of $\Omega$. This is in fact a valid choice when $\Omega$ is a finite space, i.e. $|\Omega| < \infty$. However, for infinite spaces like $[0, 1]$ this is much more tricky due to the uncountably infinite number of outcomes in $\Omega$ in this case.
In the beginning we will introduce notions for a discrete probability space, where $|\Omega| < \infty$ or $|\Omega|$ is countably infinite.
Probability Measure
A probability measure $P$ satisfies the following axioms:
1. $P(F) \geq 0$, $\forall F \in \mathcal{F}$.
2. $P(\Omega) = 1$.
3. If $F_1, \ldots, F_n$ are disjoint events in $\mathcal{F}$, then
$$P\left(\bigcup_{i=1}^{n} F_i\right) = \sum_{i=1}^{n} P(F_i).$$
4. If $F_i$, $i = 1, 2, \ldots$ are disjoint events in $\mathcal{F}$, then
$$P\left(\bigcup_{i=1}^{\infty} F_i\right) = \sum_{i=1}^{\infty} P(F_i).$$
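As a quick sanity check (a minimal sketch added for illustration, not from the original slides), the following Python snippet builds a measure on a small finite $\Omega$ from point masses and verifies the axioms numerically; all names are illustrative.

    omega = {1, 2, 3, 4, 5, 6}          # a fair die
    mass = {w: 1/6 for w in omega}      # point masses on elementary outcomes

    def P(event):
        # Probability measure induced by the point masses.
        return sum(mass[w] for w in event)

    # Axiom 1 (non-negativity) and axiom 2 (P(Omega) = 1).
    assert all(P({w}) >= 0 for w in omega)
    assert abs(P(omega) - 1) < 1e-12

    # Axioms 3/4 (additivity over disjoint events).
    F1, F2 = {1, 2}, {5}
    assert abs(P(F1 | F2) - (P(F1) + P(F2))) < 1e-12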
Notes
$P$ is a measure in the same sense as mass, length, area and volume, each of which satisfies axioms 1, 3, 4.
But $P$ is special since it is bounded, due to axiom 2.
Probability also differs through aspects like conditioning and independence, which do not occur in this analogy.
Sometimes (read as: mostly) it is convenient to write the probability of an elementary outcome $s \in \Omega$ as $P(s)$ instead of $P(\{s\})$. This abuse of notation is (very) common.
Notes
For discrete spaces $\mathcal{F}$ can be taken to be the set of all subsets of $\Omega$, which is also called the power set of $\Omega$.
$\mathcal{F}$ need not necessarily be the entire power set; it only needs to contain the events to which we wish to assign $P(\cdot)$.
Examples:
A fair die: $\Omega = \{1, \ldots, 6\}$ with $P(\{i\}) = \frac{1}{6}$; an event with three outcomes, e.g. the odd faces, has probability $3 \cdot \frac{1}{6} = \frac{1}{2}$.
The number of packets arriving in a time interval $T$:
$$P(\{k\}) = \frac{(\lambda T)^k e^{-\lambda T}}{k!} \quad \text{for } k = 0, 1, 2, \ldots \text{ and } \lambda > 0.$$
This is the Poisson probability distribution (we will define it in a while), and $\lambda$ is the average number of packets per unit time.
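As a quick numerical check (an illustrative sketch, not from the slides; $\lambda$ and $T$ below are arbitrary example values): the Poisson masses are non-negative and sum to 1, so they define a valid probability measure on the countable space $\Omega = \{0, 1, 2, \ldots\}$.

    from math import exp, factorial

    lam, T = 2.5, 1.0   # example rate (packets per unit time) and interval

    def poisson_pmf(k):
        return (lam * T) ** k * exp(-lam * T) / factorial(k)

    # Truncated sum; the tail beyond k = 200 is negligible here.
    total = sum(poisson_pmf(k) for k in range(200))
    print(total)   # ~1.0, consistent with axiom 2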
Some immediate properties:
$P(A^c) = 1 - P(A)$, since $A \cup A^c = \Omega$, $A \cap A^c = \emptyset$, and so $P(A) + P(A^c) = P(\Omega) = 1$.
$P(A \cup B) = P(A) + P(B \setminus A) \leq P(A) + P(B)$, and $P(A \cup B) = P(A) + P(B) - P(A \cap B)$.
Union bound: $P\left(\bigcup_{i=1}^{n} A_i\right) \leq \sum_{i=1}^{n} P(A_i)$.
If $A_1, \ldots, A_n$ partition $\Omega$, then $P(B) = \sum_{i=1}^{n} P(B \cap A_i)$.
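These identities are easy to sanity-check on a finite space; the sketch below (added for illustration) reuses the uniform measure on a fair die:

    omega = {1, 2, 3, 4, 5, 6}
    P = lambda event: len(event) / len(omega)   # uniform measure

    A, B = {1, 2, 3}, {3, 4}
    assert abs(P(A | B) - (P(A) + P(B) - P(A & B))) < 1e-12   # inclusion-exclusion
    assert P(A | B) <= P(A) + P(B)                            # union bound
    assert abs(P(omega - A) - (1 - P(A))) < 1e-12             # complement rule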
Independence
$\{F_i\}_{i=1}^{k}$ are independent (or mutually independent) if for any distinct subcollection $\{F_{l_i}\}_{i=1}^{m}$ we have
$$P\left(\bigcap_{i=1}^{m} F_{l_i}\right) = \prod_{i=1}^{m} P(F_{l_i}).$$
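Mutual independence requires the product rule for every subcollection, not just pairs. The classic two-coin example below (added for illustration, not from the slides) is pairwise independent but fails the triple product rule:

    from itertools import product

    # Two fair coin flips; uniform measure on the 4 outcomes.
    omega = set(product('HT', repeat=2))
    P = lambda ev: len(ev) / len(omega)

    F1 = {w for w in omega if w[0] == 'H'}     # first flip is H
    F2 = {w for w in omega if w[1] == 'H'}     # second flip is H
    F3 = {w for w in omega if w[0] == w[1]}    # both flips equal

    # Pairwise independent:
    for A, B in [(F1, F2), (F1, F3), (F2, F3)]:
        assert abs(P(A & B) - P(A) * P(B)) < 1e-12

    # ...but not mutually independent:
    print(P(F1 & F2 & F3), P(F1) * P(F2) * P(F3))   # 0.25 vs 0.125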
Conditional Probability
Suppose we have a probability space $(\Omega, \mathcal{F}, P)$ and an observer has told us that an event $G$ has occurred. Thus the observer knows the outcome of the experiment, but we do not know which element of $G$ has occurred. Now we need to calculate the probability that another event $F$ has occurred. We denote this conditional probability as $P(F|G)$. For a fixed $G$, we want to be able to compute $P(F|G)$, $\forall F \in \mathcal{F}$. Therefore, we define a new probability measure $P_G(F) = P(F|G)$ on $(\Omega, \mathcal{F})$. We derive this measure from first principles.
Since we are told that $G$ has occurred, the outcome $\omega \in G$, and so $P_G$ must assign zero measure to $G^c$, i.e. we should have
$$P(G^c|G) = 0, \qquad P(G|G) = 1.$$
Therefore
$$P(F|G) = P(F \cap G|G) + P(F \cap G^c|G) = P(F \cap G|G) \quad \text{[total probability law]}.$$
Next, intuitively we should expect that relative probabilities should not change given the knowledge that $G$ has indeed occurred. For example, if $F \subseteq G$ and $H \subseteq G$ and if $P(F) = 2P(H)$, then we expect $P(F|G) = 2P(H|G)$. Hence if $P(F \cap G) = 2P(H \cap G)$ we expect $P(F|G) = 2P(H|G)$.
Therefore we would want
$$\frac{P(F \cap G|G)}{P(H \cap G|G)} = \frac{P(F|G)}{P(H|G)} = \frac{P(F \cap G)}{P(H \cap G)}, \quad \forall F, H, G \in \mathcal{F}.$$
Hence if we take $H = G$, we get
$$\frac{P(F|G)}{P(G|G)} = \frac{P(F \cap G)}{P(G)}, \quad \text{or} \quad P(F|G) = \frac{P(F \cap G)}{P(G)},$$
which is the definition of conditional probability. Note that we need $P(G) > 0$ for this to work, i.e., the conditioning event should have a non-zero probability measure.
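As a small worked check (added for illustration): on a fair die, conditioning on "the outcome is even" rescales probabilities within $G$ exactly as the formula prescribes:

    omega = {1, 2, 3, 4, 5, 6}
    P = lambda ev: len(ev) / len(omega)   # fair die

    G = {2, 4, 6}            # observer reports: the outcome is even
    F = {4, 5, 6}            # event of interest: the outcome is at least 4

    cond = P(F & G) / P(G)   # P(F | G) by the definition above
    print(cond)              # (2/6) / (3/6) = 2/3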
Note that if two events are independent, i.e. $P(F \cap G) = P(F)P(G)$, then we have
$$P(F|G) = \frac{P(F \cap G)}{P(G)} = P(F),$$
i.e. we have the intuitive result that the probability of $F$ is unaffected by the knowledge that $G$ has occurred. This is what we expect from the notion of independence.
Note that it is not useful to define independence as $P(F) = P(F|G)$, since it requires $P(G) > 0$ and is therefore slightly less general.
If $P(F) > 0$ and $P(G) > 0$, then
$$P(G|F) = \frac{P(F \cap G)}{P(F)} = \frac{P(F|G)P(G)}{P(F)}.$$
Bayes Rule
Let $A_1, A_2, \ldots, A_n$ be events with non-zero probability measure which partition the space $\Omega$, i.e.,
$$\bigcup_{i=1}^{n} A_i = \Omega, \qquad A_i \cap A_j = \emptyset \text{ for } i \neq j.$$
Moreover, for any event $B$ with $P(B) > 0$,
$$P(A_j|B) = \frac{P(B|A_j)P(A_j)}{\sum_{i=1}^{n} P(B \cap A_i)} = \frac{P(B|A_j)P(A_j)}{\sum_{i=1}^{n} P(B|A_i)P(A_i)}, \quad j = 1, 2, \ldots, n.$$
Example: a binary channel.
[Channel transition diagram: inputs $0, 1$ with $P(1) = 0.8$ (so $P(0) = 0.2$); transition probabilities $P(0|0) = 0.9$, $P(1|0) = 0.1$, $P(0|1) = 0.025$, $P(1|1) = 0.975$,]
where the pair $(i, j)$ denotes (tx bit, rx bit). Since $\Omega$ is finite, $\mathcal{F} = 2^{\Omega}$, the power set. Moreover, we just need to assign the elementary outcomes a probability measure:
$$P(\{(i, j)\}) = P(\text{input } i) \cdot P(\text{output } j \,|\, \text{input } i), \quad i, j \in \{0, 1\}.$$
For example, by Bayes rule,
$$P(\text{input } 0 \,|\, \text{output } 0) = \frac{0.2 \times 0.9}{0.2 \times 0.9 + 0.8 \times 0.025} = \frac{0.18}{0.20} = 0.9.$$
Now suppose the same channel is used twice to send independent bits of information. Let $E_i$ be the event that the $i$-th transmission is received erroneously, $i = 1, 2$; we are interested in the event that both transmissions are received erroneously, i.e., the event $E_1 \cap E_2$. Since independent channel uses cause independent errors as well, we would expect
$$P(E_1 \cap E_2) = P(E_1)P(E_2),$$
where $P(E_1) = P(0)P(1|0) + P(1)P(0|1) = 0.2 \times 0.1 + 0.8 \times 0.025 = 0.04$.
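A sketch (added for illustration) verifying the posterior and the two-use error probability numerically:

    p_in = {0: 0.2, 1: 0.8}                # P(input i)
    p_ch = {(0, 0): 0.9, (0, 1): 0.1,      # p_ch[(i, j)] = P(output j | input i)
            (1, 0): 0.025, (1, 1): 0.975}

    # Posterior by Bayes rule: P(input 0 | output 0).
    p_out0 = sum(p_in[i] * p_ch[(i, 0)] for i in (0, 1))   # total probability
    print(p_in[0] * p_ch[(0, 0)] / p_out0)                 # 0.18 / 0.20 = 0.9

    # Single-use error probability and the two-use joint error.
    p_err = p_in[0] * p_ch[(0, 1)] + p_in[1] * p_ch[(1, 0)]
    print(p_err, p_err ** 2)               # P(E1) = 0.04, P(E1 ∩ E2) = 0.0016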
Random Variables
Random variables are just functions mapping elementary outcomes in the abstract space to a real number. Here is where the power of the probability space formalism becomes more apparent, since we can define
$$X : \Omega \to \mathbb{R} \quad \text{as a random variable } X(\omega),$$
and similarly define a random vector as
$$\mathbf{X} : \Omega \to \mathbb{R}^d, \quad \text{i.e., } \mathbf{X}(\omega) \in \mathbb{R}^d \text{ is a random vector},$$
and a discrete-time random process as a function as well:
$$X : \Omega \times \mathbb{Z} \to \mathbb{R}, \quad \text{i.e., } X_t(\omega) \in \mathbb{R},\ t \in \mathbb{Z}.$$
Notes:
The cumulative distribution function, $F_X(x) = P\{\omega : X(\omega) \leq x\}$, always exists for a random variable. From it we can define a probability density function (which may or may not be well defined): since
$$F_X(x + \Delta x) - F_X(x) = P\{x < X(\omega) \leq x + \Delta x\},$$
we set
$$f_X(a) = \left.\frac{dF_X(x)}{dx}\right|_{x = a}, \quad \text{if definable}.$$
Example:
Suppose we toss a coin 4 times, each toss being independent. If the coin is unbiased, then each toss can be H or T, each with probability $\frac{1}{2}$. Let $X$ = the number of H's occurring in the trial. Clearly $X \in \{0, 1, 2, 3, 4\}$.
$$\Pr\{X = 0\} = \frac{1}{16}$$
$$\Pr\{X = 1\} = \binom{4}{1} \frac{1}{16} = \frac{1}{4}$$
$$\Pr\{X = 2\} = \binom{4}{2} \frac{1}{16} = \frac{6}{16} = \frac{3}{8}$$
$$\Pr\{X = 3\} = \binom{4}{3} \frac{1}{16} = \frac{1}{4}$$
$$\Pr\{X = 4\} = \frac{1}{16}$$
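These are just the Binomial(4, 1/2) probabilities; a quick check (added for illustration):

    from math import comb

    n, p = 4, 0.5
    pmf = {k: comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1)}
    print(pmf)   # {0: 0.0625, 1: 0.25, 2: 0.375, 3: 0.25, 4: 0.0625}
    assert abs(sum(pmf.values()) - 1) < 1e-12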
Joint PMF: for discrete random variables $X$ and $Y$,
$$P_{XY}(x, y) = \Pr\{X = x, Y = y\} = P(\{\omega : X(\omega) = x, Y(\omega) = y\}), \quad \forall x \in \mathcal{X}, y \in \mathcal{Y},$$
where as a shorthand we define $\mathcal{X} = X(\Omega)$, $\mathcal{Y} = Y(\Omega)$. Note that
$$\sum_{x \in \mathcal{X}} \sum_{y \in \mathcal{Y}} P_{XY}(x, y) = 1.$$
Marginal PMF:
$$P_X(x) = \sum_{y \in \mathcal{Y}} P_{XY}(x, y), \quad \forall x \in \mathcal{X}.$$
Conditional PMF:
$$P_{X|Y}(x|y) = \frac{P_{XY}(x, y)}{P_Y(y)}, \quad \forall x \in \mathcal{X},\ y : P_Y(y) > 0.$$
Chain rule: $P_{XY}(x, y) = P_X(x) \, P_{Y|X}(y|x)$.
Independence: $P_{XY}(x, y) = P_X(x) P_Y(y)$, $\forall x \in \mathcal{X}, y \in \mathcal{Y}$; equivalently, $P_{Y|X}(y|x) = P_Y(y)$, $\forall y \in \mathcal{Y}$, $x : P_X(x) > 0$.
Bayes rule for PMFs: given $P_{Y|X}$ and $P_X$, we can find
$$P_{X|Y}(x|y) = \frac{P_{Y|X}(y|x) \, P_X(x)}{P_Y(y)} = \frac{P_{Y|X}(y|x) \, P_X(x)}{\sum_{x' \in \mathcal{X}} P_{Y|X}(y|x') \, P_X(x')}.$$
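A compact sketch (added for illustration; the joint table is an arbitrary example) of marginalization, conditioning, the chain rule, and Bayes rule for PMFs:

    # Joint pmf P_XY as a dict keyed by (x, y).
    P_XY = {(0, 0): 0.1, (0, 1): 0.3, (1, 0): 0.2, (1, 1): 0.4}
    xs = {x for x, _ in P_XY}
    ys = {y for _, y in P_XY}

    P_X = {x: sum(P_XY[(x, y)] for y in ys) for x in xs}   # marginals
    P_Y = {y: sum(P_XY[(x, y)] for x in xs) for y in ys}
    P_YgX = {(y, x): P_XY[(x, y)] / P_X[x] for x in xs for y in ys}

    # Chain rule: P_XY(x, y) = P_X(x) * P_{Y|X}(y|x).
    assert all(abs(P_XY[(x, y)] - P_X[x] * P_YgX[(y, x)]) < 1e-12
               for x in xs for y in ys)

    # Bayes rule for PMFs: P_{X|Y}(x|y) from P_{Y|X} and P_X.
    x0, y0 = 1, 0
    bayes = P_YgX[(y0, x0)] * P_X[x0] / sum(P_YgX[(y0, x)] * P_X[x] for x in xs)
    assert abs(bayes - P_XY[(x0, y0)] / P_Y[y0]) < 1e-12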
For continuous random variables, the joint CDF is
$$F_{XY}(x, y) = \Pr\{X \leq x, Y \leq y\} = P(\{\omega : X(\omega) \leq x, Y(\omega) \leq y\}),$$
and the joint pdf (when it exists) is
$$f_{XY}(x, y) = \frac{\partial^2 F_{XY}(x, y)}{\partial x \, \partial y} = \lim_{\Delta x, \Delta y \to 0} \frac{\Pr\{x < X \leq x + \Delta x,\ y < Y \leq y + \Delta y\}}{\Delta x \, \Delta y}.$$
Conditional pdf:
$$f_{Y|X}(y|x) = \frac{f_{XY}(x, y)}{f_X(x)} \quad \text{if } f_X(x) > 0.$$
Bayes rule for pdfs:
$$f_{X|Y}(x|y) = \frac{f_{Y|X}(y|x) \, f_X(x)}{\int_{-\infty}^{\infty} f_{XY}(u, y) \, du}.$$
Expectation or Mean
The expectation of a random variable $X$ is defined as
$$E[X] = \begin{cases} \sum_{x} x \, P_X(x) & \text{discrete random variables} \\ \int_{x} x \, f_X(x) \, dx & \text{continuous random variables} \end{cases}$$
and, more generally,
$$E[g(X)] = \begin{cases} \sum_{x} g(x) \, P_X(x) & \text{discrete} \\ \int_{x} g(x) \, f_X(x) \, dx & \text{continuous} \end{cases}$$
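For instance (added for illustration), a discrete sum and a crude Riemann sum both follow the definition directly:

    from math import comb

    # Discrete: E[X] for the coin example above, X ~ Binomial(4, 1/2).
    pmf = {k: comb(4, k) / 16 for k in range(5)}
    print(sum(x * p for x, p in pmf.items()))   # 2.0

    # Continuous: E[X] for X ~ Uniform[0, 1], i.e. f_X(x) = 1 on [0, 1].
    N = 100_000
    dx = 1.0 / N
    print(sum((i + 0.5) * dx * 1.0 * dx for i in range(N)))   # ~0.5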
Variance
The second moment is
$$E[X^2] = \begin{cases} \sum_{x} x^2 \, P_X(x) & \text{discrete} \\ \int_{x} x^2 \, f_X(x) \, dx & \text{continuous} \end{cases}$$
Variance of $X$:
$$\mathrm{Var}(X) = E\left[(X - E[X])^2\right] = E\left[X^2 + (E[X])^2 - 2X E[X]\right] = E[X^2] + (E[X])^2 - 2(E[X])^2,$$
i.e., $\mathrm{Var}(X) = E[X^2] - (E[X])^2$.
The standard deviation is $\sigma_X = \sqrt{\mathrm{Var}(X)}$.
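The identity $\mathrm{Var}(X) = E[X^2] - (E[X])^2$ is easy to check numerically (added for illustration), again on the Binomial(4, 1/2) pmf:

    from math import comb

    pmf = {k: comb(4, k) / 16 for k in range(5)}
    EX = sum(x * p for x, p in pmf.items())
    EX2 = sum(x**2 * p for x, p in pmf.items())

    var_def = sum((x - EX)**2 * p for x, p in pmf.items())   # E[(X - E[X])^2]
    var_id = EX2 - EX**2                                     # E[X^2] - (E[X])^2
    print(var_def, var_id)   # both 1.0 (= n p (1 - p))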
For two random variables we can similarly define $E[g(X, Y)]$. If $X$ and $Y$ are independent, then
$$E[XY] = \int x \, f_X(x) \, dx \int y \, f_Y(y) \, dy, \quad \text{or} \quad E[XY] = E[X] \, E[Y],$$
so that
$$\mathrm{Cov}(X, Y) = E[XY] - E[X] \, E[Y] = 0.$$
Hence independence $\Rightarrow$ uncorrelated. But uncorrelated $\not\Rightarrow$ independence!
Example
Let $X, Y \in \{-2, -1, 1, 2\}$ with
$$P_{XY}(x, y) = \begin{cases} \frac{2}{5} & (x, y) = (1, -1), (-1, 1) \\[2pt] \frac{1}{10} & (x, y) = (-2, -2), (2, 2) \\[2pt] 0 & \text{otherwise} \end{cases}$$
Then
$$E[X] = \sum_{x \in \mathcal{X}} \sum_{y \in \mathcal{Y}} x \, P_{XY}(x, y) = 1 \cdot \frac{2}{5} + (-1) \cdot \frac{2}{5} + 2 \cdot \frac{1}{10} + (-2) \cdot \frac{1}{10} = 0,$$
and similarly $E[Y] = 0$. Moreover,
$$E[XY] = (-1) \cdot \frac{2}{5} + (-1) \cdot \frac{2}{5} + 4 \cdot \frac{1}{10} + 4 \cdot \frac{1}{10} = 0$$
$\Rightarrow$ uncorrelated! Yet $X$ and $Y$ are not independent: e.g. $P_{XY}(1, 1) = 0$ while $P_X(1) P_Y(1) = \frac{2}{5} \cdot \frac{2}{5} > 0$.
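A direct check of this example (added for illustration, assuming the joint pmf as reconstructed above):

    P_XY = {(1, -1): 2/5, (-1, 1): 2/5, (-2, -2): 1/10, (2, 2): 1/10}

    EX  = sum(x * p for (x, y), p in P_XY.items())
    EY  = sum(y * p for (x, y), p in P_XY.items())
    EXY = sum(x * y * p for (x, y), p in P_XY.items())
    print(EX, EY, EXY)   # all 0, so Cov(X, Y) = 0: uncorrelated

    # ...but not independent: P_XY(1, 1) = 0 while P_X(1) * P_Y(1) > 0.
    PX1 = sum(p for (x, y), p in P_XY.items() if x == 1)
    PY1 = sum(p for (x, y), p in P_XY.items() if y == 1)
    print(P_XY.get((1, 1), 0), PX1 * PY1)   # 0 vs 0.16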
Vectors, which are almost everywhere column (tall) vectors by default, will be denoted in bold face, e.g. $\mathbf{X}$ (a random vector) or $\mathbf{v}$ (a vector).
Matrices are also bold face, e.g. the covariance matrix $\mathbf{K}_X$; their dimensions, specified when they are defined, will determine their usage.