ECE517: Reinforcement Learning in Artificial Intelligence
Lecture 3: Review of basic probability theory
Dr. Itamar Arel
Electrical Engineering and Computer Science Department
College of Engineering
The University of Tennessee
Fall 2011, August 25, 2011
Outline
Probability theory fundamentals
Random variables
Basic definitions
The collection or set of "all possible" distinct outcomes of an experiment is called the sample space of the experiment/trial.
Flipping a coin: {H, T}
Rolling a die: {1, 2, 3, 4, 5, 6}
Outcomes - elements of the sample space
Event - a set of possible outcomes of an experiment/trial
More definitions
Independence - two experiments are independent if the outcome of either one does not depend on the outcome of the other
Deterministic - the outcome of a trial is predictable (100%)
Randomness - the absence of any pattern
A sample space is called discrete if it is a finite or countably infinite set; otherwise it is called continuous
Probability can be viewed as the likelihood of an event occurring
Fundamentals
Let S denote the sample space and A_i the set of all possible outcomes with probabilities P(A_i), respectively
P(A_i) >= 0 for all i
Sum_i P(A_i) = 1
For example, a probabilistic model might represent the length of a packet sent over a network
Two events A and B are called mutually exclusive, or disjoint, if they have no common outcomes
In general, P(A + B) = P(A) + P(B) - P(AB)
For disjoint events, P(A + B) = P(A) + P(B)
[Figure: Venn diagram of events A and B with intersection AB]
Conditional Probability
Conditional probability: P(A|B) = P(A∩B)/P(B)
A and B are defined as independent events if and only if
P(A∩B) = P(A)P(B)
equivalently, P(A|B) = P(A)
Bayes Rule
Consider two events, A and B, where P(A∩B) = P(A|B)P(B) and P(B∩A) = P(B|A)P(A)
But P(A∩B) = P(B∩A), so P(A|B)P(B) = P(B|A)P(A), and therefore

P(A|B) = P(B|A)P(A) / P(B)
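A small numeric sketch of Bayes' rule, using a hypothetical diagnostic-test setting; all numbers below are illustrative assumptions, not values from the lecture.

```python
# Hypothetical diagnostic test: A = "has condition", B = "test is positive".
p_disease = 0.01              # prior P(A) (assumed)
p_pos_given_disease = 0.95    # P(B|A) (assumed)
p_pos_given_healthy = 0.10    # P(B|not A) (assumed)

# Total probability: P(B) = P(B|A)P(A) + P(B|not A)P(not A)
p_pos = (p_pos_given_disease * p_disease
         + p_pos_given_healthy * (1 - p_disease))

# Bayes' rule: P(A|B) = P(B|A) P(A) / P(B)
p_disease_given_pos = p_pos_given_disease * p_disease / p_pos
print(round(p_disease_given_pos, 4))
```

Even with a fairly accurate test, the posterior stays small because the prior P(A) is small, which is exactly the kind of effect Bayes' rule makes explicit.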
Outline
Probability theory fundamentals
Random variables
Discrete Random Variables
A random variable (r.v.) is a function that assigns a real number to each outcome in the sample space of a random experiment
For a discrete r.v. X, the probability mass function (PMF) gives the probability that X will take on a particular value in its range. We denote this by P_X, i.e.
P_X(x) = P(X = x)
The expected value of a discrete r.v. X is defined by
E[X] = Sum_x x P_X(x)
The variance of X is defined as
Var(X) = E[(X - E[X])^2] = E[X^2] - (E[X])^2
Question: in what scenario will the variance be zero?
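A minimal sketch of these definitions, using a fair six-sided die as an assumed example PMF:

```python
# PMF of a fair six-sided die (illustrative choice, not from the slide).
pmf = {x: 1 / 6 for x in range(1, 7)}

# E[X] = sum_x x * P_X(x)
mean = sum(x * p for x, p in pmf.items())

# Var(X) = E[X^2] - (E[X])^2
second_moment = sum(x**2 * p for x, p in pmf.items())
variance = second_moment - mean**2
print(mean, round(variance, 4))
```

As a sanity check on the slide's question: a r.v. with all of its probability mass on a single value has E[X^2] = (E[X])^2, so its variance is zero.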
Bernoulli and Geometric Random Variables with parameter p
X is a Bernoulli r.v. with parameter p if it can take on values 1 (success) and 0 (failure) with
P(X = 1) = p
P(X = 0) = 1 - p
Example: packet arrivals may be modeled as either correct (1) or erroneous (0)
Given a sequence of independent Bernoulli r.v.s, let T be the number of trials observed up to and including the first success. Then T has a geometric distribution; its PMF is given by
P(T = n) = (1 - p)^(n-1) p
E[T] = 1/p
Memoryless property - the fact that n time steps have passed without a success has no influence on future events
The memoryless property makes the geometric distribution very useful in various analysis tasks
[Figure: geometric distribution PMF with N = 16, p = 0.3]
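A quick simulation sketch of the geometric distribution, checking E[T] = 1/p empirically (the sample size and seed are arbitrary assumptions):

```python
import random

random.seed(0)
p = 0.3  # matches the p used in the slide's figure

def geometric_sample(p):
    """Count Bernoulli(p) trials up to and including the first success."""
    n = 1
    while random.random() >= p:  # failure occurs with probability 1 - p
        n += 1
    return n

samples = [geometric_sample(p) for _ in range(100_000)]
mean_t = sum(samples) / len(samples)  # should be close to 1/p ≈ 3.33
```

The empirical mean converges to 1/p as the number of samples grows, in line with E[T] = 1/p from the slide.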
Binomial Random Variable with parameters p and n
Let S denote the number of successes out of n independent Bernoulli r.v.s. The PMF is given by
P(S = k) = (n choose k) p^k (1 - p)^(n-k), for k = 0, 1, ..., n
The expected number of successes is given by
E[S] = np
Example: if packets arrive correctly at a node in a network with probability p (independently), then the number of correct arrivals out of n is a Binomial r.v.
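The binomial PMF above can be sketched directly; n = 10 and p = 0.5 are illustrative choices, not values from the slide:

```python
from math import comb

def binomial_pmf(k, n, p):
    """P(S = k) = C(n, k) * p^k * (1 - p)^(n - k)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

n, p = 10, 0.5  # assumed example parameters
pmf = [binomial_pmf(k, n, p) for k in range(n + 1)]

# The PMF sums to 1, and the mean matches E[S] = n*p.
total = sum(pmf)
mean = sum(k * pk for k, pk in enumerate(pmf))
```

Summing k * P(S = k) over all k reproduces E[S] = np, which the next slide derives analytically.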
Mean of a Binomial r.v.
Note that:
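The derivation on this slide did not survive extraction; the standard argument for E[S] = np, using the identity k C(n, k) = n C(n-1, k-1), is:

```latex
E[S] = \sum_{k=0}^{n} k \binom{n}{k} p^k (1-p)^{n-k}
     = np \sum_{k=1}^{n} \binom{n-1}{k-1} p^{k-1} (1-p)^{n-k}
     = np
```

The remaining sum is the binomial PMF with parameters (n-1, p) summed over its whole range, so it equals 1.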
Examples (from ECE453)
Consider the following network. Packets transmitted from Router A to Router B have a packet error rate (PER) of p_AB, while packets transmitted from Router B to Router C have a PER of p_BC. The packet error rates are assumed to be independent.
[Figure: A --(p_AB)--> B --(p_BC)--> C]
If all traffic from Router A to Router C traverses Router B, what is the probability that all N packets transmitted from Router A to Router C are received correctly?
Given that N packets were transmitted from Router A to Router B, write an expression for the probability that at least m of those N packets are received correctly.
Assuming Router A has transmitted N packets to Router C (via Router B), what is the probability that exactly m packets (where m < N) are received correctly at Router C?
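One possible way to evaluate the three questions numerically; the values of p_AB, p_BC, N, and m below are illustrative assumptions, not part of the exercise:

```python
from math import comb

p_ab, p_bc, N, m = 0.1, 0.05, 20, 18  # assumed example values

# A packet survives the A -> C path only if both independent hops succeed.
p_good = (1 - p_ab) * (1 - p_bc)

# (1) All N packets from A to C are received correctly.
p_all = p_good ** N

# (2) At least m of N packets survive the single A -> B hop.
p_at_least_m = sum(comb(N, k) * (1 - p_ab)**k * p_ab**(N - k)
                   for k in range(m, N + 1))

# (3) Exactly m of N packets arrive correctly at C:
#     a binomial with per-packet success probability p_good.
p_exactly_m = comb(N, m) * p_good**m * (1 - p_good)**(N - m)
```

Each question reduces to a Bernoulli/binomial model once the per-packet success probability for the relevant path is identified.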
CDF and PDF
The Cumulative Distribution Function (CDF) of a r.v. X, F_X(x), is defined as the probability of the event {X <= x}:
F_X(x) = P[X <= x], for -inf < x < inf
Related axioms are:
0 <= F_X(x) <= 1
lim_{x -> inf} F_X(x) = 1
lim_{x -> -inf} F_X(x) = 0
if b >= a, then F_X(b) >= F_X(a)
The probability density function (pdf) of a r.v. X, f_X(x), is defined as the derivative of the CDF:
f_X(x) = dF_X(x)/dx
For a discrete r.v., P_X(k) = Pr{X = k}
Exponential Distribution
Continuous random variable
Continuous-time analog of the geometric distribution (memoryless property holds)
Models lifetimes, interarrival times, etc.
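The density shown on the original slide was lost in extraction; for rate parameter λ, the standard exponential form is:

```latex
f_X(x) = \lambda e^{-\lambda x}, \quad x \ge 0,
\qquad
F_X(x) = 1 - e^{-\lambda x},
\qquad
E[X] = \frac{1}{\lambda}
```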
Minimum of Independent Exponential r.v.s
Assume X_1, X_2, ..., X_n are independent exponential r.v.s
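The result on this slide did not survive extraction; the standard statement, assuming rates λ_1, ..., λ_n, is that the minimum is again exponential:

```latex
P\Bigl(\min_i X_i > x\Bigr)
= \prod_{i=1}^{n} P(X_i > x)
= \prod_{i=1}^{n} e^{-\lambda_i x}
= e^{-(\lambda_1 + \cdots + \lambda_n)\, x}
```

so min_i X_i is exponential with rate λ_1 + ... + λ_n.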
Memoryless Property
True for the geometric and exponential distributions:
The coin does not remember that it came up tails l times
Root cause of the Markov property (discussed later)
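The formal statement (likely the equation lost from this slide) can be written, for a geometric r.v. T and an exponential r.v. X, as:

```latex
P(T > n + k \mid T > n) = P(T > k),
\qquad
P(X > s + t \mid X > s) = P(X > t)
```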
Useful Results
The following are some results that are useful for manipulating many of the equations that may arise when dealing with discrete-time probabilistic models.
For x != 1,
Sum_{k=0}^{n} x^k = (1 - x^(n+1)) / (1 - x)
When |x| < 1,
Sum_{k=0}^{inf} x^k = 1 / (1 - x)
Differentiating both sides of the previous equation (and multiplying by x) yields another useful expression:
Sum_{k=0}^{inf} k x^k = x / (1 - x)^2
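The three identities above can be sketched numerically; x = 0.5 and the truncation point K are arbitrary assumptions for the check:

```python
x, n = 0.5, 10  # any |x| < 1 works for the infinite sums

# Finite geometric sum: sum_{k=0}^{n} x^k = (1 - x^(n+1)) / (1 - x)
finite = sum(x**k for k in range(n + 1))
closed_finite = (1 - x**(n + 1)) / (1 - x)

# Infinite sums, truncated at a large K as a numerical stand-in for infinity.
K = 200
infinite = sum(x**k for k in range(K))       # -> 1 / (1 - x)
weighted = sum(k * x**k for k in range(K))   # -> x / (1 - x)^2
```

For x = 0.5 both infinite sums happen to converge to 2, since 1/(1 - 0.5) = 0.5/(1 - 0.5)^2 = 2.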