
Discrete Distribution

Kaustav Banerjee

7 July, 2017

Kaustav Banerjee Discrete Distribution 7 July, 2017 1 / 26


Bernoulli trial
A machine produces 3% defectives while working properly. What is the probability of having no defectives in a sample of size 1? In a sample of size 10?
A random variable X has a Bernoulli distribution if its PMF is

P(X = x) = θ,      if x = 1,
         = 1 − θ,  if x = 0,
         = 0,      otherwise,

where θ ∈ (0, 1) is a parameter.


If a random experiment has only two outcomes (say 0 and 1) and is repeated independently and identically, then the resulting sequence of trials is called Bernoulli trials.
1. E(X) = θ, V(X) = θ(1 − θ)
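The opening question can be worked out directly; a minimal sketch (not from the slides), assuming defectives occur independently with θ = 0.03:

```python
theta = 0.03  # defective rate, from the slide's example

# Sample of size 1: a single Bernoulli trial, P(X = 0) = 1 - theta.
p_none_1 = 1 - theta             # 0.97

# Sample of size 10: ten independent trials, none defective.
p_none_10 = (1 - theta) ** 10    # about 0.7374

# Mean and variance of one Bernoulli(theta) trial.
mean, var = theta, theta * (1 - theta)
print(round(p_none_1, 4), round(p_none_10, 4), mean, round(var, 4))
```

The size-10 case already uses the binomial distribution of the next slide, evaluated at x = 0.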



Binomial distribution
A random variable X has a binomial distribution if its PMF is

P(X = x) = (n choose x) θ^x (1 − θ)^(n−x),  if x = 0, 1, 2, ..., n,
         = 0,  otherwise,

where n and θ ∈ (0, 1) are two parameters.


A binomial random variable can be viewed as the number of successes (or failures) in n independent and identical Bernoulli trials. Thus, if Y1, Y2, ..., Yn are IID Bernoulli(θ) random variables, then Y1 + Y2 + ... + Yn ∼ B(n, θ).
1. E(X) = nθ, V(X) = nθ(1 − θ)
2. If X1 ∼ B(n1 , θ) is independent of X2 ∼ B(n2 , θ),
X1 + X2 ∼ B(n1 + n2 , θ)
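The PMF above can be transcribed directly using the standard library; a short sketch (illustrative, not from the slides) that also checks E(X) = nθ numerically:

```python
from math import comb

def binom_pmf(x, n, theta):
    """P(X = x) for X ~ B(n, theta): a direct transcription of the PMF."""
    if 0 <= x <= n:
        return comb(n, x) * theta**x * (1 - theta)**(n - x)
    return 0.0

# The PMF sums to 1, and the mean matches n*theta.
n, theta = 15, 0.2
probs = [binom_pmf(x, n, theta) for x in range(n + 1)]
mean = sum(x * p for x, p in zip(range(n + 1), probs))
print(round(sum(probs), 6), round(mean, 6))  # 1.0 and 3.0
```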



Binomial as sequence of Bernoulli trials
1. Suppose we flip a coin n times under identical conditions. Each flip is then a Bernoulli trial with success probability θ, performed n times independently of each other. Denoting the outcome of the i-th flip by Yi, we have Yi ∼ Bernoulli(θ), i = 1, 2, ..., n. Such Yi's are independent and identically distributed (IID) Bernoulli(θ) random variables. The number of successes in n such trials, say Y, is then Y = Y1 + Y2 + ... + Yn.

E(Y) = E(Y1) + E(Y2) + ... + E(Yn) = θ + θ + ... + θ = nθ
V(Y) = V(Y1) + V(Y2) + ... + V(Yn) = θ(1 − θ) + ... + θ(1 − θ) = nθ(1 − θ)

2. Suppose we flip the coin n2 more times, consecutively after flipping it n1 times. The flips are carried out identically and independently with success probability θ. Then we have (n1 + n2) Bernoulli trials, and the probability of getting x successes is

P(X = x) = (n1 + n2 choose x) θ^x (1 − θ)^(n1 + n2 − x),  x = 0, 1, 2, ..., (n1 + n2)



n = 15, p = 0.2

[Figure: binomial PMF plot; x-axis: Outcome (0 to 15), y-axis: Probability]



n = 15, p = 0.5

[Figure: binomial PMF plot; x-axis: Outcome (0 to 15), y-axis: Probability]



n = 15, p = 0.8

[Figure: binomial PMF plot; x-axis: Outcome (0 to 15), y-axis: Probability]



Applications of binomial

1. A survey finds that 30% of workers take public transportation daily. In a sample of 10 workers, what is the probability that exactly three take public transportation daily? What is the probability that at least three do?
2. A company accepts a lot from a particular supplier if the defective components in the lot do not exceed 1%. Suppose a random sample of five items from a recent shipment is tested. Assuming that 1% of the shipment is defective, what is the probability that no items in the sample are defective? What is the probability that exactly one item is defective? What is the expected number of defective items in the sample, and how precise is that expected number?
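Both exercises reduce to evaluating the binomial PMF; a worked sketch (my numbers, not from the slides), measuring "precision" by the standard deviation:

```python
from math import comb, sqrt

def binom_pmf(x, n, theta):
    return comb(n, x) * theta**x * (1 - theta)**(n - x)

# 1. Public transportation: n = 10, theta = 0.3.
p_exactly_3 = binom_pmf(3, 10, 0.3)
p_at_least_3 = 1 - sum(binom_pmf(x, 10, 0.3) for x in range(3))

# 2. Acceptance sampling: n = 5, theta = 0.01.
p_zero = binom_pmf(0, 5, 0.01)
p_one = binom_pmf(1, 5, 0.01)
expected = 5 * 0.01                 # E(X) = n*theta
sd = sqrt(5 * 0.01 * 0.99)          # precision via sqrt of V(X) = n*theta*(1-theta)

print(round(p_exactly_3, 4), round(p_at_least_3, 4),
      round(p_zero, 4), round(p_one, 4), expected, round(sd, 4))
```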



Poisson distribution
What is the probability of getting 50 heads in 500 flips, with success probability 0.03? → (500 choose 50) (0.03)^50 (1 − 0.03)^(500−50)

If θ = λ/n, where λ is a positive constant, 0 < θ < 1 and n is a positive integer, then lim_{n→∞} (n choose x) θ^x (1 − θ)^(n−x) = e^(−λ) λ^x / x!

A random variable X has a Poisson distribution if its PMF is

P(X = x) = e^(−λ) λ^x / x!,  if x = 0, 1, 2, ...,
         = 0,  otherwise,

where λ > 0 is a parameter.


1. E(X) = λ, V(X) = λ
2. If X1 ∼ P(λ1) is independent of X2 ∼ P(λ2),
X1 + X2 ∼ P(λ1 + λ2)
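The motivating question can be checked numerically; a sketch (illustrative, not from the slides) comparing the exact binomial probabilities for n = 500, θ = 0.03 with the Poisson approximation at λ = nθ = 15:

```python
from math import exp, factorial, comb

def pois_pmf(x, lam):
    """P(X = x) for X ~ Poisson(lam)."""
    return exp(-lam) * lam**x / factorial(x)

def binom_pmf(x, n, theta):
    return comb(n, x) * theta**x * (1 - theta)**(n - x)

# Exact binomial vs Poisson approximation at a few outcomes,
# including the x = 50 case from the slide (both are tiny there).
for x in (10, 15, 50):
    print(x, binom_pmf(x, 500, 0.03), pois_pmf(x, 15))

# The mean of the Poisson PMF matches lam numerically.
lam = 15
mean = sum(x * pois_pmf(x, lam) for x in range(100))
print(round(mean, 6))
```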



Poisson as approximation to binomial

1. Poisson probabilities are approximations to binomial probabilities under the condition: n → ∞ and θ = λ/n → 0, where λ > 0 is a constant. Thus, if we repeat Bernoulli(θ) trials independently and identically (IID) an indefinite number of times (n → ∞), with a very low success probability (θ → 0) and nθ = λ, the probability of observing x successes of such a rare event is

P(X = x) = (n choose x) θ^x (1 − θ)^(n−x) → e^(−λ) λ^x / x!
E(X) = nθ = λ,  V(X) = nθ(1 − θ) = λ(1 − θ) → λ

2. If X1 ∼ B(n1, θ) is independent of X2 ∼ B(n2, θ), then X1 + X2 ∼ B(n1 + n2, θ). So as n1 → ∞, θ = λ1/n1 → 0, and similarly as n2 → ∞, θ = λ2/n2 → 0. Under these conditions, X1 ∼ P(λ1) and X2 ∼ P(λ2), and the result follows from the additive property of the binomial distribution.



λ = 5

[Figure: Poisson PMF plot; x-axis: Outcome (0 to 50), y-axis: Probability]



λ = 10

[Figure: Poisson PMF plot; x-axis: Outcome (0 to 50), y-axis: Probability]



λ = 30

[Figure: Poisson PMF plot; x-axis: Outcome (0 to 50), y-axis: Probability]



Applications of Poisson
1. Phone calls arrive at the rate of 48/hour at the reservation desk of Airways.
(a) Compute the probability of receiving three calls in a 5-minute interval of time.
(b) Compute the probability of receiving exactly 10 calls in 15 minutes.
(c) Suppose no calls are currently on hold. If the agent takes 5 minutes to
complete the current call, how many callers do you expect to be waiting by that
time? What is the probability that none will be waiting?
(d) If no calls are currently being processed, what is the probability that the
agent can take 3 minutes for personal time without being interrupted by a call?
2. A new automated production process averages 1.5 breakdowns per day.
Because of the cost associated with a breakdown, management is concerned
about the possibility of having three or more breakdowns during a day. Assume
that breakdowns occur randomly, that the probability of a breakdown is the same
for any two time intervals of equal length, and that breakdowns in one period are
independent of breakdowns in other periods. What is the probability of having
three or more breakdowns during a day?
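Both exercises hinge on rescaling the rate to the interval in question; a worked sketch (my numbers, not from the slides):

```python
from math import exp, factorial

def pois_pmf(x, lam):
    return exp(-lam) * lam**x / factorial(x)

# 1. Calls arrive at 48/hour, i.e. lam = 4 per 5-minute interval.
p_a = pois_pmf(3, 4)                  # (a) exactly three calls in 5 minutes
p_b = pois_pmf(10, 12)                # (b) exactly ten calls in 15 minutes (lam = 12)
expected_waiting = 4                  # (c) E(X) over the agent's 5-minute call
p_none_waiting = pois_pmf(0, 4)       #     probability none are waiting
p_d = pois_pmf(0, 48 * 3 / 60)        # (d) no calls in 3 minutes (lam = 2.4)

# 2. Breakdowns: lam = 1.5 per day, P(X >= 3) by complement.
p_three_or_more = 1 - sum(pois_pmf(x, 1.5) for x in range(3))

print(round(p_a, 4), round(p_b, 4), round(p_none_waiting, 4),
      round(p_d, 4), round(p_three_or_more, 4))
```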



Geometric distribution
What is the probability that the first head will occur on the 5th flip of a coin?
X: the number of failures before the first success. X has a geometric distribution if its PMF is

P(X = x) = (1 − θ)^x θ,  if x = 0, 1, 2, ...,
         = 0,  otherwise,

where θ ∈ (0, 1) is a parameter. Alternatively, if Y is the number of trials needed to get the first success, notice that Y = X + 1 and thus

P(Y = y) = (1 − θ)^(y−1) θ,  if y = 1, 2, ...,
         = 0,  otherwise.

1. E(X) = (1 − θ)/θ, V(X) = (1 − θ)/θ^2


2. For integers s > t, P(Y > s|Y > t) = P(Y > s − t)
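A short sketch (not from the slides) answering the opening question for a fair coin and checking the memoryless property numerically, using the failures-before-first-success parameterization above:

```python
from math import isclose

def geom_pmf_failures(x, theta):
    """P(X = x): x failures before the first success."""
    return (1 - theta) ** x * theta

# First head on the 5th flip of a fair coin = 4 failures before the first success.
p_first_head_5th = geom_pmf_failures(4, 0.5)   # 0.5**5 = 0.03125

# Memorylessness in terms of Y = X + 1: P(Y > s) = (1 - theta)**s,
# so P(Y > s | Y > t) = P(Y > s - t) for integers s > t.
theta, s, t = 0.2, 7, 3
lhs = (1 - theta) ** s / (1 - theta) ** t
rhs = (1 - theta) ** (s - t)
print(p_first_head_5th, isclose(lhs, rhs))
```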
θ = 0.2

[Figure: geometric PMF plot; x-axis: Outcome (0 to 20), y-axis: Probability]



θ = 0.5

[Figure: geometric PMF plot; x-axis: Outcome (0 to 20), y-axis: Probability]



θ = 0.8

[Figure: geometric PMF plot; x-axis: Outcome (0 to 20), y-axis: Probability]



Negative binomial distribution
What is the probability that the second head will occur on the 5th flip of a coin?
X: the number of failures before the s-th success. X has a negative binomial distribution if its PMF is

P(X = x) = (s + x − 1 choose s − 1) θ^s (1 − θ)^x,  if x = 0, 1, 2, ...,
         = 0,  otherwise,

where s and θ ∈ (0, 1) are parameters.


1. E(X) = s(1 − θ)/θ, V(X) = s(1 − θ)/θ2
2. If the proportion of units possessing a certain characteristic is θ, and we sample until we see s such units, this conforms to inverse (or negative binomial) sampling.
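A short sketch (not from the slides) answering the opening question for a fair coin, and numerically checking E(X) = s(1 − θ)/θ against the PMF:

```python
from math import comb

def nb_pmf(x, s, theta):
    """P(X = x): x failures before the s-th success."""
    return comb(s + x - 1, s - 1) * theta**s * (1 - theta)**x

# Second head on the 5th flip of a fair coin = 3 failures before the 2nd success.
p = nb_pmf(3, 2, 0.5)
print(p)  # 0.125

# Mean check against s(1 - theta)/theta, summing far into the tail.
s, theta = 5, 0.4
mean = sum(x * nb_pmf(x, s, theta) for x in range(500))
print(round(mean, 6))  # s(1 - theta)/theta = 7.5
```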



Negative binomial as sum of Geometric
Think of a sequence of Bernoulli trials...

X: number of failures before s successes
 = (failures before the 1st success)
 + (failures between the 1st and 2nd successes)
 + (failures between the 2nd and 3rd successes) + ...
 + (failures between the (s − 1)-th and s-th successes)
 = X1 + X2 + X3 + ... + Xs

Notice that the Xi are IID, Xi ∼ Geometric(θ), i = 1(1)s ⇒ X1 + X2 + ... + Xs ∼ NB(s, θ).

Thus, E(X) = E(X1) + E(X2) + ... + E(Xs) = s(1 − θ)/θ
V(X) = V(X1) + V(X2) + ... + V(Xs) = s(1 − θ)/θ^2
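The decomposition above can be demonstrated by simulation; a sketch (illustrative, not from the slides) that sums s independent geometric counts and compares the sample mean with s(1 − θ)/θ:

```python
import random

random.seed(0)  # fixed seed so the run is reproducible
s, theta, trials = 5, 0.5, 20000

def geometric_failures(theta):
    """Simulate the number of failures before the first success."""
    count = 0
    while random.random() >= theta:
        count += 1
    return count

# Each sample is a sum of s IID geometric counts, i.e. an NB(s, theta) draw.
samples = [sum(geometric_failures(theta) for _ in range(s)) for _ in range(trials)]
mean = sum(samples) / trials
print(round(mean, 2))  # should be near s(1 - theta)/theta = 5.0
```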
θ = 0.2, s = 5

[Figure: negative binomial PMF plot; x-axis: Outcome (0 to 50), y-axis: Probability]



θ = 0.5, s = 5

[Figure: negative binomial PMF plot; x-axis: Outcome (0 to 50), y-axis: Probability]



θ = 0.8, s = 5

[Figure: negative binomial PMF plot; x-axis: Outcome (0 to 50), y-axis: Probability]



Waiting time distributions
1. The probability that a light bulb fails on any given day is 0.001.
(a) What is the probability that it will last at least 10 days?
(b) What is the probability that it will last at least 20 days?
(c) What is the probability that it will last at least 30 days, having already lasted 10 days?
2. A couple decides to have children until they have a girl. What is the expected number of children they would have, given that the sex ratio is 940 females to 1000 males?
3. A survey reveals that 25% of people invest in mutual funds. In order to understand these investors through a follow-up study, the decision is to continue the survey until 30 such investors are met. How many people is the survey expected to meet? Provide an interval of likely values for the number of people to be surveyed.
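A worked sketch of the three problems (my numbers and the two-standard-deviation interval are assumptions, not from the slides), using the geometric survival function and negative binomial moments:

```python
from math import sqrt

# 1. Bulb fails on any given day with theta = 0.001;
#    P(lasting more than s days) = P(Y > s) = (1 - theta)**s.
theta = 0.001
p_10 = (1 - theta) ** 10
p_20 = (1 - theta) ** 20
p_30_given_10 = p_20          # memorylessness: P(Y > 30 | Y > 10) = P(Y > 20)

# 2. Children until the first girl: theta = 940/1940, E(Y) = 1/theta.
exp_children = 1940 / 940

# 3. Inverse sampling: s = 30 investors, theta = 0.25.
#    Number of trials is X + s, with V(X) = s(1 - theta)/theta**2.
s, th = 30, 0.25
exp_trials = s / th                      # E(trials) = s/theta = 120
sd_trials = sqrt(s * (1 - th)) / th      # SD of the trial count
interval = (exp_trials - 2 * sd_trials, exp_trials + 2 * sd_trials)

print(round(p_10, 4), round(p_30_given_10, 4), round(exp_children, 2),
      exp_trials, round(sd_trials, 1), interval)
```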



Appendix I: Poisson (feel free to skip)
If θ = λ/n, where λ is a positive constant, 0 < θ < 1 and n is a positive integer, then
lim_{n→∞} (n choose x) θ^x (1 − θ)^(n−x) = e^(−λ) λ^x / x!

(n choose x) θ^x (1 − θ)^(n−x) = [n(n − 1)(n − 2)...(n − x + 1)/x!] (λ/n)^x (1 − λ/n)^(n−x)
= (λ^x / x!) × [n(n − 1)(n − 2)...(n − x + 1)/n^x] × (1 − λ/n)^n × (1 − λ/n)^(−x)

As n → ∞:
1. n(n − 1)(n − 2)...(n − x + 1)/n^x = (n/n) × ((n − 1)/n) × ... × ((n − x + 1)/n) → 1
2. (1 − λ/n)^n → e^(−λ)
3. (1 − λ/n)^(−x) → 1
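The limit can be observed numerically; a sketch (illustrative, not from the slides) fixing λ = 3 and x = 2 while n grows with θ = λ/n:

```python
from math import comb, exp, factorial

def binom_pmf(x, n, theta):
    return comb(n, x) * theta**x * (1 - theta)**(n - x)

def pois_pmf(x, lam):
    return exp(-lam) * lam**x / factorial(x)

# The binomial probability approaches the Poisson limit as n grows.
lam, x = 3, 2
for n in (10, 100, 10000):
    print(n, binom_pmf(x, n, lam / n))
print("limit:", pois_pmf(x, lam))
```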



Appendix II: Geometric (feel free to skip)
1. P(X = x) = (1 − θ)^x θ, x = 0, 1, 2, ...

E(X) = Σ_{x=0}^{∞} x (1 − θ)^x θ = θ(1 − θ) Σ_{x=0}^{∞} x (1 − θ)^(x−1)
     = θ(1 − θ) [d/dθ Σ_{x=0}^{∞} (1 − θ)^x] (−1) = θ(1 − θ) [d/dθ (1/θ)] (−1) = (1 − θ)/θ

Similarly, V(X) = (1 − θ)/θ^2.

2. P(Y > s) = P(no success in s trials) = (1 − θ)^s

P(Y > s | Y > t) = P(Y > s ∩ Y > t)/P(Y > t) = P(Y > s)/P(Y > t) = (1 − θ)^s/(1 − θ)^t
                = (1 − θ)^(s−t) = P(Y > s − t), since s > t

This reveals that the geometric distribution can be used to model 'lifetime' data for components having the 'lack of aging'/'memoryless' property.

