Probability Review
Ulisses Braga-Neto
ECE Department
Texas A&M University
The event E that the first coin lands tails is E = {(T, H), (T, T)}.
Example 2: If the experiment consists of measuring the lifetime of a
lightbulb, then
S = {t ∈ R | t ≥ 0}
The event that the lightbulb will fail at or earlier than 2 time units is
the real interval E = [0, 2].
Recall that $\limsup_{n\to\infty} E_n = \bigcap_{n=1}^{\infty} \bigcup_{i=n}^{\infty} E_i$. We can see that $\limsup_{n\to\infty} E_n$ occurs iff $E_n$ occurs for an infinite number of $n$, that is, $E_n$ occurs infinitely often.
If E1 , E2 , . . . is any sequence of events, then
$$\liminf_{n \to \infty} E_n = \bigcup_{n=1}^{\infty} \bigcap_{i=n}^{\infty} E_i$$
We can see that $\liminf_{n\to\infty} E_n$ occurs iff $E_n$ occurs for all but a finite number of $n$, that is, $E_n$ occurs for all sufficiently large $n$.
(1st Borel-Cantelli Lemma): If $\sum_{n=1}^{\infty} P(E_n) < \infty$, then $P(\limsup_{n\to\infty} E_n) = 0$, that is, with probability 1 only finitely many of the $E_n$ occur.
Proof.
$$P\left(\bigcap_{n=1}^{\infty}\bigcup_{i=n}^{\infty} E_i\right) = P\left(\lim_{n\to\infty}\bigcup_{i=n}^{\infty} E_i\right) = \lim_{n\to\infty} P\left(\bigcup_{i=n}^{\infty} E_i\right) \leq \lim_{n\to\infty} \sum_{i=n}^{\infty} P(E_i) = 0$$
The second equality uses continuity of probability along the decreasing sequence of events $\bigcup_{i=n}^{\infty} E_i$, the inequality is countable subadditivity, and the final limit is zero because it is the tail of the convergent series $\sum_n P(E_n)$.
Example: Let $X_1, X_2, \ldots$ be binary random variables, with
$$P(\{X_n = 0\}) = \frac{1}{2^n}, \quad n = 1, 2, \ldots$$
Since
$$\sum_{n=1}^{\infty} P(\{X_n = 0\}) = \sum_{n=1}^{\infty} \frac{1}{2^n} = 1 < \infty,$$
the 1st Borel-Cantelli lemma implies that $P([\{X_n = 0\}\ \text{i.o.}]) = 0$, that is, for $n$ sufficiently large, $X_n = 1$ with probability one. In other words, $X_n \to 1$ almost surely.
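As a quick illustration (not part of the original slides), the example can be simulated; the sketch below samples the $X_n$ independently, an assumption the lemma itself does not need, and counts how many zeros ever appear.

```python
import numpy as np

rng = np.random.default_rng(0)

# Sample X_1, ..., X_N with P(X_n = 0) = 1/2^n (independently, for illustration).
N = 60  # truncation is harmless: beyond n ~ 60, P(X_n = 0) = 2^-n is negligible
p_zero = 0.5 ** np.arange(1, N + 1)

n_runs = 10_000
zero_counts = (rng.random((n_runs, N)) < p_zero).sum(axis=1)

# The number of indices n with X_n = 0 is finite (and typically 0 or 1).
print("average number of zeros per run:", zero_counts.mean())   # ~ sum 1/2^n = 1
print("max number of zeros over all runs:", zero_counts.max())  # a small integer
```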
(2nd Borel-Cantelli Lemma): If $E_1, E_2, \ldots$ are independent events and $\sum_{n=1}^{\infty} P(E_n) = \infty$, then $P(\limsup_{n\to\infty} E_n) = 1$.
Proof. Exercise.
This is the converse of the 1st lemma; here the stronger assumption of independence is needed.
An immediate consequence of the 2nd Borel-Cantelli lemma is that any event with positive probability, no matter how small, will occur infinitely often in independent repeated trials.
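A sketch of this consequence in simulation, with a hypothetical per-trial probability p = 0.001: the number of occurrences keeps growing with the number of independent trials.

```python
import numpy as np

rng = np.random.default_rng(1)
p = 0.001  # a small but fixed probability of the event per trial

# Count occurrences over longer and longer runs of independent trials.
for n_trials in (10_000, 100_000, 1_000_000):
    hits = (rng.random(n_trials) < p).sum()
    print(f"{n_trials:>9} trials: {hits} occurrences")  # grows roughly like n*p
```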
$$P(E \mid F_1, F_2, \ldots, F_n) = \frac{P(E \cap F_1 \cap F_2 \cap \cdots \cap F_n)}{P(F_1 \cap F_2 \cap \cdots \cap F_n)}$$
$$P(E_1, E_2, \ldots, E_n) = P(E_n \mid E_1, \ldots, E_{n-1})\, P(E_{n-1} \mid E_1, \ldots, E_{n-2}) \cdots P(E_2 \mid E_1)\, P(E_1)$$
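As a worked example of this multiplication rule (an illustration, not from the slides): the probability of drawing three aces in a row from a standard 52-card deck, without replacement.

```python
from fractions import Fraction

# P(A1, A2, A3) = P(A1) * P(A2 | A1) * P(A3 | A1, A2) for three ace draws
p = Fraction(4, 52) * Fraction(3, 51) * Fraction(2, 50)
print(p, float(p))  # 1/5525 ≈ 0.000181
```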
(Law of Total Probability): $P(E) = \sum_{i=1}^{n} P(E, F_i) = \sum_{i=1}^{n} P(E \mid F_i) P(F_i)$, whenever the $F_i$ are mutually exclusive and $\bigcup_{i=1}^{n} F_i \supseteq E$.
(Bayes Theorem): $P(F_j \mid E) = \dfrac{P(E \mid F_j) P(F_j)}{\sum_{i=1}^{n} P(E \mid F_i) P(F_i)}$, for the same events $F_1, \ldots, F_n$.
The events $E$ and $F$ are independent if $P(E, F) = P(E) P(F)$.
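A numeric sketch of the total-probability and Bayes computations, using a made-up two-event partition (a hypothetical diagnostic test):

```python
# Hypothetical diagnostic test: F1 = "disease", F2 = "no disease" partition S.
p_F = [0.01, 0.99]            # P(F1), P(F2)
p_E_given_F = [0.95, 0.05]    # P(E | F1), P(E | F2), with E = "test positive"

# Law of total probability: P(E) = sum_i P(E | Fi) P(Fi)
p_E = sum(pe * pf for pe, pf in zip(p_E_given_F, p_F))

# Bayes theorem: P(F1 | E) = P(E | F1) P(F1) / P(E)
p_F1_given_E = p_E_given_F[0] * p_F[0] / p_E
print(f"P(E) = {p_E:.4f}, P(F1 | E) = {p_F1_given_E:.4f}")  # ≈ 0.0590, 0.1610
```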
For a random variable $X$ and $A \subseteq \mathbf{R}$, the set $\{X \in A\} = X^{-1}(A) \subseteq S$ is an event. The cumulative distribution function (CDF) of $X$ is $F_X(a) = P(\{X \in (-\infty, a]\})$, for $a \in \mathbf{R}$.
Properties of a CDF:
1. $F_X$ is non-decreasing: $a \leq b \Rightarrow F_X(a) \leq F_X(b)$.
2. $\lim_{a\to-\infty} F_X(a) = 0$ and $\lim_{a\to+\infty} F_X(a) = 1$.
3. $F_X$ is right-continuous: $\lim_{b\to a^+} F_X(b) = F_X(a)$.
The joint CDF of two r.v.'s $X$ and $Y$ is the joint probability of the events $\{X \leq a\}$ and $\{Y \leq b\}$, for $a, b \in \mathbf{R}$. Formally, we define a function $F_{XY}: \mathbf{R} \times \mathbf{R} \to [0, 1]$ given by
$$F_{XY}(a, b) = P(\{X \leq a\} \cap \{Y \leq b\}), \quad a, b \in \mathbf{R}$$
This is the probability of the "lower-left quadrant" with corner at $(a, b)$.
Note that $F_{XY}(a, \infty) = F_X(a)$ and $F_{XY}(\infty, b) = F_Y(b)$. These are called the marginal CDFs.
The r.v.'s $X$ and $Y$ are independent r.v.'s if $F_{X|Y}(a \mid b) = F_X(a)$ and $F_{Y|X}(b \mid a) = F_Y(b)$, that is, if $F_{XY}(a, b) = F_X(a) F_Y(b)$, for $a, b \in \mathbf{R}$.
The probability density function (pdf) of a continuous r.v. $X$ is
$$f_X(a) = \frac{dF_X}{dx}(a), \quad a \in \mathbf{R}.$$
Probability statements about X can then be made in terms of
integration of fX . For example,
$$F_X(a) = \int_{-\infty}^{a} f_X(x)\,dx, \quad a \in \mathbf{R}$$
$$P(\{a \leq X \leq b\}) = \int_{a}^{b} f_X(x)\,dx, \quad a, b \in \mathbf{R}$$
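These identities can be checked numerically; the sketch below uses a standard normal density and scipy (an assumed dependency):

```python
import numpy as np
from scipy import integrate, stats

f = stats.norm.pdf  # standard normal density f_X

# F_X(a) = integral of f_X from -inf to a
a, b = -0.5, 1.5
Fa, _ = integrate.quad(f, -np.inf, a)
print(Fa, stats.norm.cdf(a))  # both ≈ 0.3085

# P(a <= X <= b) = integral of f_X from a to b
p_ab, _ = integrate.quad(f, a, b)
print(p_ab, stats.norm.cdf(b) - stats.norm.cdf(a))  # both ≈ 0.6247
```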
Binomial(n, p):
$$p_X(i) = P(\{X = i\}) = \binom{n}{i} p^i (1-p)^{n-i}, \quad i = 0, 1, \ldots, n$$
Poisson(λ):
$$p_X(i) = P(\{X = i\}) = e^{-\lambda} \frac{\lambda^i}{i!}, \quad i = 0, 1, \ldots$$
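A sketch computing both pmfs directly from the formulas above; it also illustrates the classical fact that Binomial(n, p) is well approximated by Poisson(λ = np) when n is large and p is small:

```python
from math import comb, exp, factorial

n, p = 100, 0.03
lam = n * p  # λ = np

for i in range(6):
    binom = comb(n, i) * p**i * (1 - p)**(n - i)
    poisson = exp(-lam) * lam**i / factorial(i)
    print(f"i={i}: Binomial {binom:.4f}  Poisson {poisson:.4f}")  # close agreement
```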
Markov Inequality: if $X \geq 0$, then for any $a > 0$,
$$P(\{X \geq a\}) \leq \frac{E[X]}{a}$$
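A simulation sketch of Markov's inequality for an exponential r.v. (the distribution choice is arbitrary, for illustration):

```python
import numpy as np

rng = np.random.default_rng(2)
X = rng.exponential(scale=1.0, size=1_000_000)  # E[X] = 1, X >= 0

for a in (1.0, 2.0, 5.0):
    empirical = (X >= a).mean()
    bound = X.mean() / a  # Markov bound E[X]/a
    print(f"a={a}: P(X >= a) ≈ {empirical:.4f} <= {bound:.4f}")
```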
Cauchy-Schwarz Inequality: $E[XY] \leq \sqrt{E[X^2]\,E[Y^2]}$
If $X, Y$ are jointly discrete and $p_Y(y_j) > 0$, this can be written as:
$$E[X \mid Y = y_j] = \sum_{i:\, p_X(x_i) > 0} x_i\, p_{X|Y}(x_i \mid y_j) = \sum_{i:\, p_X(x_i) > 0} x_i\, \frac{p_{XY}(x_i, y_j)}{p_Y(y_j)}$$
E[E[X|Y ]] = E[X]
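A small discrete sketch of this tower property, with a made-up joint pmf:

```python
import numpy as np

# Hypothetical joint pmf p_XY over x in {0,1,2} (rows) and y in {0,1} (cols).
p_xy = np.array([[0.10, 0.15],
                 [0.20, 0.25],
                 [0.05, 0.25]])
x_vals = np.array([0.0, 1.0, 2.0])

p_y = p_xy.sum(axis=0)                # marginal p_Y(y_j)
E_X_given_Y = x_vals @ p_xy / p_y     # E[X | Y = y_j] for each y_j
print(E_X_given_Y @ p_y)              # E[E[X | Y]] = 1.05
print(x_vals @ p_xy.sum(axis=1))      # E[X] directly; same value
```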
Chebyshev Inequality: for any $\epsilon > 0$,
$$P(\{|X - E[X]| \geq \epsilon\}) \leq \frac{\mathrm{Var}(X)}{\epsilon^2}$$
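A matching simulation sketch for Chebyshev's inequality, here with a uniform r.v. (again an arbitrary choice):

```python
import numpy as np

rng = np.random.default_rng(3)
X = rng.uniform(0.0, 1.0, size=1_000_000)  # E[X] = 0.5, Var(X) = 1/12

mu, var = X.mean(), X.var()
for eps in (0.25, 0.4, 0.49):
    empirical = (np.abs(X - mu) >= eps).mean()
    print(f"eps={eps}: P(|X - E[X]| >= eps) ≈ {empirical:.4f} <= {var / eps**2:.4f}")
```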
The correlation coefficient of $X$ and $Y$ is
$$\rho(X, Y) = \frac{\mathrm{Cov}(X, Y)}{\sqrt{\mathrm{Var}(X)\,\mathrm{Var}(Y)}}$$
Properties:
1. −1 ≤ ρ(X, Y ) ≤ 1
2. X and Y are uncorrelated iff ρ(X, Y ) = 0.
3. (Perfect linear correlation): $\rho(X, Y) = \pm 1 \Leftrightarrow Y = a \pm bX$, where $b = \frac{\sigma_Y}{\sigma_X}$.
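A sketch of these properties with numpy's corrcoef, covering the perfectly linear, partially correlated, and independent cases (all distributions made up for illustration):

```python
import numpy as np

rng = np.random.default_rng(4)
X = rng.normal(size=100_000)
noise = rng.normal(size=100_000)

Y_linear = 2.0 - 3.0 * X            # exact linear function of X
Y_noisy = 0.5 * X + noise           # partially correlated with X
Y_indep = rng.normal(size=100_000)  # independent of X

for name, Y in [("linear", Y_linear), ("noisy", Y_noisy), ("indep", Y_indep)]:
    print(name, np.corrcoef(X, Y)[0, 1])  # -1.0, ≈ 0.447, ≈ 0.0
```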
If $\Sigma = U D U^T$ is the eigendecomposition of the covariance matrix of $X$, then
$$Y = \Sigma^{-1/2}(X - \mu) = D^{-1/2} U^T (X - \mu)$$
has zero mean and covariance matrix $I_d$ (so that all components of $Y$ are zero-mean, unit-variance, and uncorrelated).
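A sketch of this whitening transform in numpy, using numpy.linalg.eigh for the eigendecomposition (the mean and covariance below are made up):

```python
import numpy as np

rng = np.random.default_rng(5)
mu = np.array([1.0, -2.0])
Sigma = np.array([[2.0, 0.8],
                  [0.8, 1.0]])

# Draw samples of X ~ N(mu, Sigma) and form Y = D^{-1/2} U^T (X - mu).
X = rng.multivariate_normal(mu, Sigma, size=200_000)
d, U = np.linalg.eigh(Sigma)             # Sigma = U diag(d) U^T
Y = (X - mu) @ U @ np.diag(d ** -0.5)    # each row is D^{-1/2} U^T (x - mu)

print(np.round(Y.mean(axis=0), 3))       # ≈ [0, 0]
print(np.round(np.cov(Y.T), 3))          # ≈ identity matrix
```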
Sample Estimates
Given n independent and identically-distributed (i.i.d.) sample
observations X1 , . . . , Xn of the random vector X, then the sample
mean estimator is given by:
$$\hat{\mu} = \frac{1}{n} \sum_{i=1}^{n} X_i$$
It can be shown that this estimator is unbiased (that is, E[µ̂] = µ) and
consistent (that is, µ̂ converges in probability to µ as n → ∞).
The sample covariance estimator is given by:
$$S = \frac{1}{n-1} \sum_{i=1}^{n} (X_i - \hat{\mu})(X_i - \hat{\mu})^T$$
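A sketch implementing both estimators directly from the formulas above (the true mean and covariance are made up); the result matches numpy's np.cov, which uses the same 1/(n−1) normalization:

```python
import numpy as np

rng = np.random.default_rng(6)
X = rng.multivariate_normal([1.0, -2.0], [[2.0, 0.8], [0.8, 1.0]], size=5_000)

n = len(X)
mu_hat = X.sum(axis=0) / n                      # sample mean
centered = X - mu_hat
S = centered.T @ centered / (n - 1)             # sum of outer products / (n-1)

print(np.round(mu_hat, 2))                      # ≈ [1, -2]
print(np.allclose(S, np.cov(X, rowvar=False)))  # True: matches np.cov
print(np.round(S, 2))                           # ≈ the true covariance matrix
```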
The axes of the ellipsoids are given by the eigenvectors of $\Sigma$, and the lengths of the axes are proportional to the square roots of the corresponding eigenvalues.