Probability Cheat Sheet

Distributions
Uniform Distribution
notation $U[a, b]$
cdf $\frac{x - a}{b - a}$ for $x \in [a, b]$
pdf $\frac{1}{b - a}$ for $x \in [a, b]$
expectation $\frac{1}{2}(a + b)$
variance $\frac{1}{12}(b - a)^2$
mgf $\frac{e^{tb} - e^{ta}}{t(b - a)}$
story: all intervals of the same length on the
distribution's support are equally probable.
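Gamma Distribution
notation $\mathrm{Gamma}(k, \theta)$
pdf $\frac{x^{k-1} e^{-x/\theta}}{\Gamma(k)\,\theta^k} I_{x>0}$
$\Gamma(k) = \int_0^\infty x^{k-1} e^{-x}\,dx$
expectation $k\theta$
variance $k\theta^2$
mgf $(1 - \theta t)^{-k}$ for $t < \frac{1}{\theta}$
ind. sum $\sum_{i=1}^n X_i \sim \mathrm{Gamma}\left(\sum_{i=1}^n k_i, \theta\right)$
story: the sum of $k$ independent
exponentially distributed random variables,
each of which has a mean of $\theta$ (which is
equivalent to a rate parameter of $\theta^{-1}$).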
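As a sanity check on the Gamma formulas, the sketch below integrates the density numerically, assuming the scale parameterization in which each exponential summand has mean $\theta$. The helper names `gamma_pdf` and `integrate` and the parameter values are illustrative choices, not part of the cheat sheet.

```python
import math

def gamma_pdf(x, k, theta):
    """Gamma density with shape k and scale theta (so the mean is k * theta)."""
    if x <= 0:
        return 0.0
    return x ** (k - 1) * math.exp(-x / theta) / (math.gamma(k) * theta ** k)

def integrate(f, lo, hi, steps=100_000):
    """Midpoint-rule approximation of the integral of f over [lo, hi]."""
    h = (hi - lo) / steps
    return h * sum(f(lo + (i + 0.5) * h) for i in range(steps))

k, theta = 3.0, 2.0
upper = 200.0  # the tail beyond this point is negligible for these parameters

total = integrate(lambda x: gamma_pdf(x, k, theta), 0.0, upper)
mean = integrate(lambda x: x * gamma_pdf(x, k, theta), 0.0, upper)
second_moment = integrate(lambda x: x * x * gamma_pdf(x, k, theta), 0.0, upper)
variance = second_moment - mean ** 2

print(total, mean, variance)  # compare with 1, k * theta, k * theta ** 2
```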
Geometric Distribution
notation $G(p)$
cdf $1 - (1 - p)^k$ for $k \in \mathbb{N}$
pmf $(1 - p)^{k-1} p$ for $k \in \mathbb{N}$
expectation $\frac{1}{p}$
variance $\frac{1 - p}{p^2}$
mgf $\frac{p e^t}{1 - (1 - p) e^t}$
story: the number $X$ of Bernoulli trials
needed to get one success. Memoryless.
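The memoryless property can be checked directly from the cdf: $P(X > m + n \mid X > m)$ should equal $P(X > n)$. The following is a small sketch with arbitrarily chosen values of `p`, `m` and `n`; the function names are illustrative.

```python
def geom_cdf(k, p):
    """P(X <= k), where X counts trials up to and including the first success."""
    return 1.0 - (1.0 - p) ** k

def geom_pmf(k, p):
    """P(X = k) for k = 1, 2, ..."""
    return (1.0 - p) ** (k - 1) * p

def tail(k, p):
    """P(X > k)."""
    return 1.0 - geom_cdf(k, p)

p, m, n = 0.3, 4, 6
conditional = tail(m + n, p) / tail(m, p)   # P(X > m + n | X > m)
unconditional = tail(n, p)                  # P(X > n)

# Truncated series for the mean; terms beyond k = 2000 are negligible here.
mean = sum(k * geom_pmf(k, p) for k in range(1, 2000))

print(conditional, unconditional, mean)
```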
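Poisson Distribution
notation $\mathrm{Poisson}(\lambda)$
cdf $e^{-\lambda} \sum_{i=0}^{k} \frac{\lambda^i}{i!}$
pmf $\frac{\lambda^k}{k!} e^{-\lambda}$ for $k \in \mathbb{N} \cup \{0\}$
expectation $\lambda$
variance $\lambda$
mgf $\exp\left(\lambda\left(e^t - 1\right)\right)$
ind. sum $\sum_{i=1}^n X_i \sim \mathrm{Poisson}\left(\sum_{i=1}^n \lambda_i\right)$
story: the probability of a number of events
occurring in a fixed period of time if these
events occur with a known average rate and
independently of the time since the last event.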
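The rule for sums of independent Poisson variables can be verified at a single point by convolving two pmfs. This is a minimal sketch; the rates `lam1`, `lam2` and the evaluation point `k` are arbitrary illustrative values.

```python
import math

def poisson_pmf(k, lam):
    """P(X = k) for X ~ Poisson(lam), k = 0, 1, 2, ..."""
    return lam ** k / math.factorial(k) * math.exp(-lam)

lam1, lam2 = 1.5, 2.5
k = 4

# P(X1 + X2 = k): sum over every way of splitting k between X1 and X2.
conv = sum(poisson_pmf(i, lam1) * poisson_pmf(k - i, lam2) for i in range(k + 1))
direct = poisson_pmf(k, lam1 + lam2)

# Truncated series for the mean of Poisson(lam1).
mean = sum(j * poisson_pmf(j, lam1) for j in range(100))

print(conv, direct, mean)
```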
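Normal Distribution
notation $N\left(\mu, \sigma^2\right)$
pdf $\frac{1}{\sqrt{2\pi\sigma^2}} e^{-(x - \mu)^2 / (2\sigma^2)}$
expectation $\mu$
variance $\sigma^2$
mgf $\exp\left(\mu t + \frac{1}{2}\sigma^2 t^2\right)$
ind. sum $\sum_{i=1}^n X_i \sim N\left(\sum_{i=1}^n \mu_i, \sum_{i=1}^n \sigma_i^2\right)$
story: describes data that cluster around the
mean.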
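Standard Normal Distribution
notation $N(0, 1)$
cdf $\Phi(x) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{x} e^{-t^2/2}\,dt$
pdf $\frac{1}{\sqrt{2\pi}} e^{-x^2/2}$
expectation $0$
variance $1$
mgf $\exp\left(\frac{t^2}{2}\right)$
story: normal distribution with $\mu = 0$ and
$\sigma = 1$.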
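$\Phi$ has no closed form, but it can be written with the error function as $\Phi(x) = \frac{1}{2}\left(1 + \mathrm{erf}(x/\sqrt{2})\right)$. The sketch below compares that expression with a direct numerical integral of the pdf; the function names and the lower cutoff of $-10$ are choices made for this illustration.

```python
import math

def phi_pdf(x):
    """Standard normal density."""
    return math.exp(-x * x / 2.0) / math.sqrt(2.0 * math.pi)

def Phi(x):
    """Standard normal cdf expressed through the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def Phi_numeric(x, lo=-10.0, steps=100_000):
    """Midpoint-rule integral of the pdf from lo to x (mass below lo is negligible)."""
    h = (x - lo) / steps
    return h * sum(phi_pdf(lo + (i + 0.5) * h) for i in range(steps))

print(Phi(0.0), Phi(1.96), Phi_numeric(1.96))
```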
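Exponential Distribution
notation $\exp(\lambda)$
cdf $1 - e^{-\lambda x}$ for $x \ge 0$
pdf $\lambda e^{-\lambda x}$ for $x \ge 0$
expectation $\frac{1}{\lambda}$
variance $\frac{1}{\lambda^2}$
mgf $\frac{\lambda}{\lambda - t}$ for $t < \lambda$
ind. sum $\sum_{i=1}^k X_i \sim \mathrm{Gamma}\left(k, \frac{1}{\lambda}\right)$
minimum $\sim \exp\left(\sum_{i=1}^k \lambda_i\right)$
story: the amount of time until some specific
event occurs, starting from now, being
memoryless.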
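The minimum rule can be illustrated by simulation: the minimum of independent exponentials with rates $\lambda_i$ should have mean $1 / \sum_i \lambda_i$. This is a rough Monte Carlo sketch with a fixed seed and arbitrary rates; it uses `random.expovariate`, which takes the rate as its argument.

```python
import random

random.seed(0)  # fixed seed so the run is reproducible

rates = [1.0, 2.0, 3.0]   # illustrative rates lambda_1, lambda_2, lambda_3
n_samples = 100_000

# Each sample is the minimum of one draw from every exponential.
mins = [min(random.expovariate(r) for r in rates) for _ in range(n_samples)]

sample_mean = sum(mins) / n_samples
predicted_mean = 1.0 / sum(rates)   # mean of exp(sum of rates)

print(sample_mean, predicted_mean)
```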
Binomial Distribution
notation $\mathrm{Bin}(n, p)$
cdf $\sum_{i=0}^{k} \binom{n}{i} p^i (1 - p)^{n-i}$
pmf $\binom{n}{i} p^i (1 - p)^{n-i}$
expectation $np$
variance $np(1 - p)$
mgf $\left(1 - p + p e^t\right)^n$
story: the discrete probability distribution of
the number of successes in a sequence of $n$
independent yes/no experiments, each of
which yields success with probability $p$.
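The binomial pmf is easy to tabulate exactly, which gives a quick check of the expectation and variance formulas. A minimal sketch, with `n` and `p` picked arbitrarily:

```python
import math

def binom_pmf(i, n, p):
    """P(X = i) for X ~ Bin(n, p)."""
    return math.comb(n, i) * p ** i * (1 - p) ** (n - i)

n, p = 10, 0.3
pmf = [binom_pmf(i, n, p) for i in range(n + 1)]

total = sum(pmf)
mean = sum(i * q for i, q in enumerate(pmf))
variance = sum(i * i * q for i, q in enumerate(pmf)) - mean ** 2

print(total, mean, variance)  # compare with 1, n * p, n * p * (1 - p)
```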
Basics
Cumulative Distribution Function
$F_X(x) = P(X \le x)$
Probability Density Function
$F_X(x) = \int_{-\infty}^{x} f_X(t)\,dt$
$\int_{-\infty}^{\infty} f_X(t)\,dt = 1$
$f_X(x) = \frac{d}{dx} F_X(x)$
Quantile Function
The function $X^* : [0, 1] \to \mathbb{R}$ for which for any
$p \in [0, 1]$, $F_X\left(X^*(p)^-\right) \le p \le F_X\left(X^*(p)\right)$
$F_{X^*} = F_X$
$E(X^*) = E(X)$
Expectation
$E(X) = \int_0^1 X^*(p)\,dp$
$E(X) = -\int_{-\infty}^{0} F_X(t)\,dt + \int_0^{\infty} (1 - F_X(t))\,dt$
$E(X) = \int_{-\infty}^{\infty} x f_X(x)\,dx$
$E(g(X)) = \int_{-\infty}^{\infty} g(x) f_X(x)\,dx$
$E(aX + b) = aE(X) + b$
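The quantile and tail formulas for $E(X)$ can be compared numerically on a distribution where both are explicit. The sketch below assumes $X \sim \exp(\lambda)$ as the test case, for which $X^*(p) = -\ln(1 - p)/\lambda$ and $F_X(t) = 0$ for $t < 0$; the helper `midpoint` and the integration cutoff are illustrative choices.

```python
import math

lam = 2.0  # rate of the exponential test distribution, so E(X) = 1 / lam

def quantile(p):
    """X*(p) for exp(lam): the solution x of 1 - exp(-lam * x) = p."""
    return -math.log(1.0 - p) / lam

def survival(t):
    """1 - F_X(t) for t >= 0."""
    return math.exp(-lam * t)

def midpoint(f, lo, hi, steps=100_000):
    """Midpoint-rule approximation of the integral of f over [lo, hi]."""
    h = (hi - lo) / steps
    return h * sum(f(lo + (i + 0.5) * h) for i in range(steps))

# The midpoint rule never evaluates p = 1, where the quantile diverges.
mean_from_quantile = midpoint(quantile, 0.0, 1.0)
# The integral of F_X over the negative half-line vanishes for this X.
mean_from_tail = midpoint(survival, 0.0, 50.0)

print(mean_from_quantile, mean_from_tail)
```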
Variance
$\mathrm{Var}(X) = E\left(X^2\right) - (E(X))^2$
$\mathrm{Var}(X) = E\left((X - E(X))^2\right)$
$\mathrm{Var}(aX + b) = a^2\,\mathrm{Var}(X)$
Standard Deviation
$\sigma(X) = \sqrt{\mathrm{Var}(X)}$
Covariance
$\mathrm{Cov}(X, Y) = E(XY) - E(X)E(Y)$
$\mathrm{Cov}(X, Y) = E((X - E(X))(Y - E(Y)))$
$\mathrm{Var}(X + Y) = \mathrm{Var}(X) + \mathrm{Var}(Y) + 2\,\mathrm{Cov}(X, Y)$
Correlation Coefficient
$\rho_{X,Y} = \frac{\mathrm{Cov}(X, Y)}{\sigma_X \sigma_Y}$
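The covariance identities can be checked on a small discrete example. The joint pmf below is hypothetical and chosen only for illustration, and the helper `E` is an assumption of this sketch.

```python
import math

# Hypothetical joint pmf P(X = x, Y = y); the probabilities sum to 1.
joint = {(0, 0): 0.2, (0, 1): 0.1, (1, 0): 0.1, (1, 1): 0.3, (2, 1): 0.3}

def E(g):
    """Expectation of g(X, Y) under the joint pmf."""
    return sum(g(x, y) * p for (x, y), p in joint.items())

EX, EY = E(lambda x, y: x), E(lambda x, y: y)
var_x = E(lambda x, y: (x - EX) ** 2)
var_y = E(lambda x, y: (y - EY) ** 2)

cov_a = E(lambda x, y: x * y) - EX * EY          # E(XY) - E(X)E(Y)
cov_b = E(lambda x, y: (x - EX) * (y - EY))      # E((X - EX)(Y - EY))
var_sum = E(lambda x, y: (x + y - EX - EY) ** 2) # Var(X + Y) from the definition
rho = cov_a / math.sqrt(var_x * var_y)

print(cov_a, cov_b, var_sum, rho)
```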
Moment Generating Function
$M_X(t) = E\left(e^{tX}\right)$
$E(X^n) = M_X^{(n)}(0)$
$M_{aX+b}(t) = e^{tb} M_{aX}(t)$
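The moment formula $E(X^n) = M_X^{(n)}(0)$ can be tried out with finite differences. The sketch assumes $X \sim \exp(\lambda)$, whose mgf is $\lambda / (\lambda - t)$ for $t < \lambda$; the step size `h` is an arbitrary small value.

```python
lam = 2.0

def mgf(t):
    """MGF of exp(lam), valid for t < lam."""
    return lam / (lam - t)

h = 1e-3
# Central differences approximate the first two derivatives at t = 0.
first = (mgf(h) - mgf(-h)) / (2 * h)                  # ~ M'(0)  = E(X)
second = (mgf(h) - 2 * mgf(0.0) + mgf(-h)) / h ** 2   # ~ M''(0) = E(X^2)
variance = second - first ** 2

print(first, second, variance)  # compare with 1/lam, 2/lam**2, 1/lam**2
```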
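Joint Distribution
$P_{X,Y}(B) = P((X, Y) \in B)$
$F_{X,Y}(x, y) = P(X \le x, Y \le y)$
Joint Density
$P_{X,Y}(B) = \iint_B f_{X,Y}(s, t)\,ds\,dt$
$F_{X,Y}(x, y) = \int_{-\infty}^{x} \int_{-\infty}^{y} f_{X,Y}(s, t)\,dt\,ds$
$\int_{-\infty}^{\infty} \int_{-\infty}^{\infty} f_{X,Y}(s, t)\,ds\,dt = 1$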
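Marginal Distributions
$P_X(B) = P_{X,Y}(B \times \mathbb{R})$
$P_Y(B) = P_{X,Y}(\mathbb{R} \times B)$
$F_X(a) = \int_{-\infty}^{a} \int_{-\infty}^{\infty} f_{X,Y}(s, t)\,dt\,ds$
$F_Y(b) = \int_{-\infty}^{b} \int_{-\infty}^{\infty} f_{X,Y}(s, t)\,ds\,dt$
Marginal Densities
$f_X(s) = \int_{-\infty}^{\infty} f_{X,Y}(s, t)\,dt$
$f_Y(t) = \int_{-\infty}^{\infty} f_{X,Y}(s, t)\,ds$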
Joint Expectation
$E(\varphi(X, Y)) = \iint_{\mathbb{R}^2} \varphi(x, y) f_{X,Y}(x, y)\,dx\,dy$
Independent r.v.
$P(X \le x, Y \le y) = P(X \le x) P(Y \le y)$
$F_{X,Y}(x, y) = F_X(x) F_Y(y)$
$f_{X,Y}(s, t) = f_X(s) f_Y(t)$
$E(XY) = E(X) E(Y)$
$\mathrm{Var}(X + Y) = \mathrm{Var}(X) + \mathrm{Var}(Y)$
Independent events:
$P(A \cap B) = P(A) P(B)$
Conditional Probability
$P(A \mid B) = \frac{P(A \cap B)}{P(B)}$
bayes $P(A \mid B) = \frac{P(B \mid A) P(A)}{P(B)}$
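A short worked example of Bayes' rule, using made-up numbers for a diagnostic-test setting. All three input probabilities are hypothetical, and $P(B)$ is obtained from the law of total probability, which the sheet does not list.

```python
# Hypothetical inputs, chosen only for illustration.
p_a = 0.01              # P(A): prior probability of the condition
p_b_given_a = 0.95      # P(B | A): positive test when the condition holds
p_b_given_not_a = 0.05  # P(B | not A): false-positive rate

# Law of total probability: P(B) = P(B | A) P(A) + P(B | not A) P(not A).
p_b = p_b_given_a * p_a + p_b_given_not_a * (1 - p_a)

# Bayes' rule.
p_a_given_b = p_b_given_a * p_a / p_b

print(p_b, p_a_given_b)
```

With these inputs the posterior stays far below `p_b_given_a` because the prior `p_a` is small.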
Conditional Density
$f_{X \mid Y=y}(x) = \frac{f_{X,Y}(x, y)}{f_Y(y)}$
$f_{X \mid Y=n}(x) = \frac{f_X(x) P(Y = n \mid X = x)}{P(Y = n)}$
$F_{X \mid Y=y}(x) = \int_{-\infty}^{x} f_{X \mid Y=y}(t)\,dt$
Conditional Expectation
$E(X \mid Y = y) = \int_{-\infty}^{\infty} x f_{X \mid Y=y}(x)\,dx$
$E(E(X \mid Y)) = E(X)$
$P(Y = n) = E(I_{Y=n}) = E(E(I_{Y=n} \mid X))$
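The tower property $E(E(X \mid Y)) = E(X)$ can be checked on a discrete analogue of the density formulas, with sums replacing integrals. The joint pmf here is hypothetical and the function name `cond_exp_x` is illustrative.

```python
# Hypothetical joint pmf P(X = x, Y = y); the probabilities sum to 1.
joint = {(1, 0): 0.1, (2, 0): 0.3, (1, 1): 0.4, (3, 1): 0.2}

# Marginal pmf of Y.
p_y = {}
for (x, y), p in joint.items():
    p_y[y] = p_y.get(y, 0.0) + p

def cond_exp_x(y):
    """E(X | Y = y) = sum over x of x * P(X = x, Y = y) / P(Y = y)."""
    return sum(x * p for (x, yy), p in joint.items() if yy == y) / p_y[y]

tower = sum(cond_exp_x(y) * q for y, q in p_y.items())   # E(E(X | Y))
direct = sum(x * p for (x, _), p in joint.items())       # E(X)

print(tower, direct)
```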
Sequences and Limits
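$\limsup A_n = \{A_n \text{ i.o.}\} = \bigcap_{m=1}^{\infty} \bigcup_{n=m}^{\infty} A_n$
$\liminf A_n = \{A_n \text{ eventually}\} = \bigcup_{m=1}^{\infty} \bigcap_{n=m}^{\infty} A_n$
$\liminf A_n \subseteq \limsup A_n$
$(\limsup A_n)^c = \liminf A_n^c$
$(\liminf A_n)^c = \limsup A_n^c$
$P(\limsup A_n) = \lim_{m \to \infty} P\left(\bigcup_{n=m}^{\infty} A_n\right)$
$P(\liminf A_n) = \lim_{m \to \infty} P\left(\bigcap_{n=m}^{\infty} A_n\right)$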
Borel-Cantelli Lemma
$\sum_{n=1}^{\infty} P(A_n) < \infty \Rightarrow P(\limsup A_n) = 0$
And if $A_n$ are independent:
$\sum_{n=1}^{\infty} P(A_n) = \infty \Rightarrow P(\limsup A_n) = 1$
Convergence
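Convergence in Probability
notation $X_n \xrightarrow{p} X$
meaning $\lim_{n \to \infty} P(|X_n - X| > \varepsilon) = 0$ for every $\varepsilon > 0$
Convergence in Distribution
notation $X_n \xrightarrow{D} X$
meaning $\lim_{n \to \infty} F_n(x) = F(x)$ at every continuity point $x$ of $F$
Almost Sure Convergence
notation $X_n \xrightarrow{a.s.} X$
meaning $P\left(\lim_{n \to \infty} X_n = X\right) = 1$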
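Criteria for a.s. Convergence
$\forall \varepsilon\ \exists N : P(\forall n > N : |X_n - X| < \varepsilon) > 1 - \varepsilon$
$\forall \varepsilon\ P(\limsup (|X_n - X| > \varepsilon)) = 0$
$\forall \varepsilon\ \sum_{n=1}^{\infty} P(|X_n - X| > \varepsilon) < \infty$ (by B.C.)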
Convergence in $L_p$
notation $X_n \xrightarrow{L_p} X$
meaning $\lim_{n \to \infty} E(|X_n - X|^p) = 0$
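Relationships
$L_q \Rightarrow L_p$ for $q > p \ge 1$
$L_p \Rightarrow p \Rightarrow D$
a.s. $\Rightarrow p \Rightarrow D$
If $X_n \xrightarrow{D} c$ then $X_n \xrightarrow{p} c$
If $X_n \xrightarrow{p} X$ then there exists a subsequence
$n_k$ s.t. $X_{n_k} \xrightarrow{a.s.} X$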
Laws of Large Numbers
If $X_i$ are i.i.d. r.v.,
weak law $\bar{X}_n \xrightarrow{p} E(X_1)$
strong law $\bar{X}_n \xrightarrow{a.s.} E(X_1)$
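A quick simulation gives a feel for the law of large numbers: the sample mean of fair die rolls should settle near $E(X_1) = 3.5$. This is a rough sketch with a fixed seed; the sample sizes are arbitrary, and a single run only illustrates the statement rather than proving it.

```python
import random

random.seed(1)  # fixed seed so the run is reproducible

n = 200_000
rolls = [random.randint(1, 6) for _ in range(n)]   # i.i.d. fair die rolls

# Sample means over growing prefixes of the same sequence.
checkpoints = [100, 10_000, n]
running_means = [sum(rolls[:m]) / m for m in checkpoints]
sample_mean = running_means[-1]

print(running_means)  # the last entry should be close to 3.5
```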
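Central Limit Theorem
For i.i.d. $X_i$ with mean $\mu$ and variance $\sigma^2$, and $S_n = \sum_{i=1}^n X_i$:
$\frac{S_n - n\mu}{\sigma\sqrt{n}} \xrightarrow{D} N(0, 1)$
If $t_n \to t$, then
$P\left(\frac{S_n - n\mu}{\sigma\sqrt{n}} \le t_n\right) \to \Phi(t)$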
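For Bernoulli summands the standardized sum has an exact binomial cdf, so the central limit theorem can be inspected without simulation. The sketch below compares that exact probability with $\Phi(t)$ for a few values of `n`; the choice $p = 0.5$, $t = 1$ and the function names are illustrative.

```python
import math

def Phi(t):
    """Standard normal cdf via the error function."""
    return 0.5 * (1.0 + math.erf(t / math.sqrt(2.0)))

def standardized_cdf(n, p, t):
    """Exact P((S_n - n*mu) / (sigma * sqrt(n)) <= t) for S_n ~ Bin(n, p)."""
    mu, sigma = p, math.sqrt(p * (1 - p))   # mean and sd of one Bernoulli(p)
    cutoff = math.floor(n * mu + t * sigma * math.sqrt(n))
    return sum(math.comb(n, i) * p ** i * (1 - p) ** (n - i)
               for i in range(0, min(cutoff, n) + 1))

t = 1.0
errors = [abs(standardized_cdf(n, 0.5, t) - Phi(t)) for n in (10, 100, 1000)]

print(errors)
```

Because the binomial is discrete, the gap to $\Phi(t)$ need not shrink monotonically in `n`, but it is small once `n` is large.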
Inequalities
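Markov's inequality
$P(|X| \ge t) \le \frac{E(|X|)}{t}$
Chebyshev's inequality
$P(|X - E(X)| \ge \varepsilon) \le \frac{\mathrm{Var}(X)}{\varepsilon^2}$
Chernoff's inequality
Let $X \sim \mathrm{Bin}(n, p)$; then:
$P(X - E(X) > t\,\sigma(X)) < e^{-t^2/2}$
Simpler result; for every $X$ and every $t > 0$:
$P(X \ge a) \le M_X(t)\,e^{-ta}$
Jensen's inequality
for $\varphi$ a convex function, $\varphi(E(X)) \le E(\varphi(X))$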
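The Markov, Chebyshev and mgf-based bounds can be compared against an exact tail probability. The sketch uses $X \sim \mathrm{Bin}(20, 0.5)$ and the threshold $a = 15$ as arbitrary test values; the value `t = 0.5` in the mgf bound is just one admissible choice, not an optimized one.

```python
import math

n, p = 20, 0.5
pmf = [math.comb(n, i) * p ** i * (1 - p) ** (n - i) for i in range(n + 1)]
mean = n * p
var = n * p * (1 - p)

a = 15
tail = sum(pmf[a:])   # exact P(X >= a)

# Markov: P(X >= a) <= E(X) / a for the nonnegative X used here.
markov = mean / a

# Chebyshev bounds the two-sided deviation P(|X - E(X)| >= a - mean).
two_sided = sum(q for i, q in enumerate(pmf) if abs(i - mean) >= a - mean)
chebyshev = var / (a - mean) ** 2

# MGF bound: P(X >= a) <= M_X(t) * exp(-t * a) for any t > 0.
t = 0.5
mgf_bound = (1 - p + p * math.exp(t)) ** n * math.exp(-t * a)

print(tail, markov, two_sided, chebyshev, mgf_bound)
```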
Miscellaneous
$E(Y) < \infty \iff \sum_{n=0}^{\infty} P(Y > n) < \infty$ ($Y \ge 0$)
$E(X) = \sum_{n=0}^{\infty} P(X > n)$ ($X \in \mathbb{N}$)
$X \sim U(0, 1) \iff -\ln X \sim \exp(1)$
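The relation between $U(0, 1)$ and $\exp(1)$ is the basis of inverse-transform sampling. A rough Monte Carlo sketch with a fixed seed: transform uniform draws by $-\ln$ and compare the empirical cdf at one point with $1 - e^{-1}$. The evaluation point and sample size are arbitrary.

```python
import math
import random

random.seed(2)  # fixed seed so the run is reproducible

n = 100_000
# random.random() lies in [0, 1), so 1 - random.random() lies in (0, 1]
# and the logarithm is always defined.
samples = [-math.log(1.0 - random.random()) for _ in range(n)]

empirical_cdf_at_1 = sum(s <= 1.0 for s in samples) / n
exact_cdf_at_1 = 1.0 - math.exp(-1.0)   # cdf of exp(1) at x = 1
sample_mean = sum(samples) / n          # should be near 1

print(empirical_cdf_at_1, exact_cdf_at_1, sample_mean)
```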
Convolution
For ind. $X$, $Y$, $Z = X + Y$:
$f_Z(z) = \int_{-\infty}^{\infty} f_X(s) f_Y(z - s)\,ds$
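As an illustration of the convolution formula, take $X, Y \sim U[0, 1]$ independent; their sum then has the triangular density $f_Z(z) = z$ on $[0, 1]$ and $2 - z$ on $[1, 2]$. The sketch evaluates the convolution integral numerically; the function names and step count are illustrative.

```python
def f_uniform(x):
    """Density of U[0, 1]."""
    return 1.0 if 0.0 <= x <= 1.0 else 0.0

def f_sum(z, steps=10_000):
    """f_Z(z) as the integral of f_X(s) * f_Y(z - s) over s in [0, 1] (midpoint rule).

    Outside [0, 1] the factor f_X(s) is zero, so the integral can be restricted.
    """
    h = 1.0 / steps
    return h * sum(f_uniform((i + 0.5) * h) * f_uniform(z - (i + 0.5) * h)
                   for i in range(steps))

values = [f_sum(z) for z in (0.5, 1.0, 1.5)]
print(values)  # compare with the triangular density at the same points
```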
Kolmogorov's 0-1 Law
If $A$ is in the tail $\sigma$-algebra $\mathcal{F}^t$, then $P(A) = 0$
or $P(A) = 1$
Ugly Stuff
cdf of Gamma distribution (integer $k$):
$\int_0^t \frac{x^{k-1} e^{-x/\theta}}{\theta^k (k - 1)!}\,dx$
This cheatsheet was made by Peleg Michaeli in
January 2010, using LaTeX.
version: 1.01