
Brief Lecture Notes

Probability Distributions: Binomial, Poisson and Normal Distribution

Binomial Distribution
Our approach begins by first developing the Bernoulli model, which is a
building block for the Binomial. We consider a random experiment that can
give rise to just two possible mutually exclusive and collectively exhaustive
outcomes, which for convenience we will label “success” and “failure”.
Let p denote the probability of success, so that the probability of failure is (1 –
p). Now define the random variable X so that X takes the value 1 if the outcome
of the experiment is success and 0 otherwise. The probability function of this
random variable is then
P(X=0) = (1 – p) and P(X=1) = p
This distribution is known as the Bernoulli distribution.

Mean and variance of a Bernoulli Random Variable


The mean is
μ = E(X) = Σ x P(x) = (0)(1 − p) + (1)(p) = p

and the variance is


σ² = E[(X − μ)²] = Σ (x − μ)² P(x)
   = (0 − p)²(1 − p) + (1 − p)² p = p(1 − p)
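As a quick numerical check (not part of the original notes), the following Python sketch simulates a Bernoulli random variable and compares the sample mean and variance with p and p(1 − p); the function name, seed, and the choice p = 0.4 are illustrative.

```python
import random

def bernoulli_moments(p, n_trials=100_000, seed=42):
    """Simulate n_trials Bernoulli(p) draws and return the sample mean and variance."""
    rng = random.Random(seed)
    draws = [1 if rng.random() < p else 0 for _ in range(n_trials)]
    mean = sum(draws) / n_trials
    variance = sum((x - mean) ** 2 for x in draws) / n_trials
    return mean, variance

mean, variance = bernoulli_moments(0.4)
print(mean, variance)   # close to p = 0.4 and p(1 - p) = 0.24
```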

The Binomial Distribution


Suppose that a random experiment can result in two possible mutually exclusive
and collectively exhaustive outcomes, “success” and “failure”, and that P is the
probability of a success in a single trial. If n independent trials are carried out,
the distribution of the number of resulting successes, x, is called the binomial
distribution. Its probability distribution function for the binomial random
variable X = x is
P(x successes in n independent trials) = P(x) = [n!/(x!(n − x)!)] P^x (1 − P)^(n−x),  for x = 0, 1, …, n

Mean and Variance


Let X be the number of successes in n independent trials, each with probability
of success P. Then X follows a binomial distribution with mean
μ = E(X) = nP

and variance
σ² = E[(X − μ)²] = nP(1 − P)
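A minimal Python sketch of these formulas (assuming Python 3.8+ for math.comb; the helper name binom_pmf is ours, not from the notes): it evaluates the binomial probability function directly and confirms numerically that the distribution has mean nP and variance nP(1 − P).

```python
from math import comb

def binom_pmf(x, n, p):
    """P(x successes in n trials); comb(n, x) is n!/(x!(n - x)!)."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

n, p = 5, 0.4
probs = [binom_pmf(x, n, p) for x in range(n + 1)]
mean = sum(x * q for x, q in enumerate(probs))                    # equals n*p = 2.0
variance = sum((x - mean) ** 2 * q for x, q in enumerate(probs))  # equals n*p*(1-p) = 1.2
print(mean, variance)
```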

Example: An insurance broker believes that for a particular contract the
probability of making a sale is 0.4. Suppose that the broker has five contracts.
a. Find the probability that she makes at most one sale.
b. Find the probability that she makes between two and four sales (inclusive).
c. Graph the probability distribution function.

Solution:
a. P(At most 1 sale) = P(X ≤ 1) = P(X = 0) + P(X = 1)
P(X = 0) = [5!/(0!5!)] (0.4)^0 (0.6)^5 = 0.078
P(X = 1) = [5!/(1!4!)] (0.4)^1 (0.6)^4 = 5(0.4)(0.6)^4 = 0.259
Hence P(X ≤ 1) = 0.078 + 0.259 = 0.337
b. P(2 ≤ X ≤ 4) = P(X = 2) + P(X = 3) + P(X = 4)
P(X = 2) = [5!/(2!3!)] (0.4)^2 (0.6)^3 = 10(0.4)^2(0.6)^3 = 0.346
P(X = 3) = [5!/(3!2!)] (0.4)^3 (0.6)^2 = 10(0.4)^3(0.6)^2 = 0.230
P(X = 4) = [5!/(4!1!)] (0.4)^4 (0.6)^1 = 5(0.4)^4(0.6) = 0.077
Hence P(2 ≤ X ≤ 4) = 0.346 + 0.230 + 0.077 = 0.653
c. The probability distribution function is shown in Figure 1.

x     P(X = x)
0     0.078
1     0.259
2     0.346
3     0.230
4     0.077
5     0.010

[Figure 1: bar chart of P(X = x) against x = 0, 1, …, 5.]
Comments:
• This shape is typical for binomial probabilities when P is neither very large
nor very small.
• At the extremes (0 or 5 sales), the probabilities are quite small.
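If SciPy is available, the broker example can be checked directly with scipy.stats.binom; this is a sketch for verification, not part of the original solution.

```python
from scipy.stats import binom

n, p = 5, 0.4
print(binom.cdf(1, n, p))                       # a. P(X <= 1)       ~ 0.337
print(binom.cdf(4, n, p) - binom.cdf(1, n, p))  # b. P(2 <= X <= 4)  ~ 0.653
print([round(binom.pmf(x, n, p), 3) for x in range(n + 1)])  # c. full distribution
```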

Example: Early in August an undergraduate college discovers that it can
accommodate a few extra students. Enrolling those additional students would
provide a substantial increase in revenue without increasing the operating costs
of the college; that is, no new classes would have to be added. From past
experience the college knows that 40% of those students admitted will actually
enroll.
a. What is the probability that at most 6 students will enroll if the college offers
admission to 10 more students?
b. What is the probability that more than 12 will actually enroll if admission is
offered to 20 students?
c. If 70% of those students admitted actually enroll, what is the probability that
at least 12 out of 15 students will actually enroll?

Solution:
a. This probability can be obtained using the cumulative binomial probability
distribution from Table 3 in the Appendix. The probability of at most 6
students enrolling if n = 10 and P = 0.40 is
P(X ≤ 6 | n = 10, P = 0.40) = 0.945

b. P(X > 12 | n = 20, P = 0.40) = 1 − P(X ≤ 12) = 1 − 0.979 = 0.021


c. The probability that at least 12 out of 15 students enroll is the same as the
probability that at most 3 out of 15 students do not enroll (the probability of a
student not enrolling is 1 – 0.70 = 0.30).
P(X ≥ 12 | n = 15, P = 0.70) = P(X ≤ 3 | n = 15, P = 0.30) = 0.297
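The same cumulative probabilities read from Table 3 can be reproduced with scipy.stats.binom.cdf; a sketch for verification (the table in the Appendix remains the source used in these notes).

```python
from scipy.stats import binom

print(binom.cdf(6, 10, 0.40))        # a. P(X <= 6 | n = 10, P = 0.40)  ~ 0.945
print(1 - binom.cdf(12, 20, 0.40))   # b. P(X > 12 | n = 20, P = 0.40)  ~ 0.021
print(1 - binom.cdf(11, 15, 0.70))   # c. P(X >= 12 | n = 15, P = 0.70) ~ 0.297
print(binom.cdf(3, 15, 0.30))        #    same value via "at most 3 do not enroll"
```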

The Poisson Probability Distribution
The Poisson probability distribution is an important discrete probability
distribution for a number of applications, including:
1. The number of failures in a large computer system during a given day.
2. The number of replacement orders for a part received by a firm in a given
month.
3. The number of dents, scratches, or other defects in a large roll of sheet
metal used to manufacture filters.

Assumptions of the Poisson Probability Distribution


Assume that an interval is divided into a very large number of subintervals so
that the probability of the occurrence of an event in any subinterval is very
small. The assumptions of a Poisson probability distribution are as follows:
1. The probability of the occurrence of an event is constant for all subintervals.
2. There can be no more than one occurrence in each subinterval.
3. Occurrences are independent; that is, the occurrences in non-overlapping
intervals are independent of one another.

We can derive the equation for computing Poisson probabilities directly from
the binomial probability distribution by taking the mathematical limits as
P → 0 and n → ∞. With these limits the parameter λ = nP is a constant that
specifies the average number of occurrences (successes) for a particular time
and/or space.

The Poisson Probability Distribution Function, Mean and Variance


The random variable X is said to follow the Poisson probability distribution if it
has the probability function
P(x) = e^(−λ) λ^x / x!   for x = 0, 1, 2, …

where
P(x) = the probability of x successes over a given time or space, given λ
λ = the expected number of successes per time or space unit, λ > 0
e = 2.71828… (the base of natural logarithms)
The mean and variance of the Poisson probability distribution are
μ = E(X) = λ   and   σ² = E[(X − μ)²] = λ
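A small Python sketch of this probability function (the helper name poisson_pmf and the truncation at 50 terms are illustrative choices): it also checks numerically that the mean and the variance both come out close to λ.

```python
from math import exp, factorial

def poisson_pmf(x, lam):
    """P(x occurrences) for a Poisson random variable with mean lam."""
    return exp(-lam) * lam**x / factorial(x)

lam = 2.0
probs = [poisson_pmf(x, lam) for x in range(50)]   # 50 terms is plenty for lam = 2
mean = sum(x * q for x, q in enumerate(probs))
variance = sum((x - mean) ** 2 * q for x, q in enumerate(probs))
print(mean, variance)   # both ~ lam
```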

Example: A computer center manager reports that his computer system
experienced three component failures during the past 100 days.
a. What is the probability of no failures in a given day?
b. What is the probability of one or more component failures in a given day?
c. What is the probability of at least two failures in a 3-day period?

Solution: A modern computer system has a very large number of components,
each of which could fail and thus result in a computer system failure. To
compute the probability of failures using the Poisson distribution, assume that
each of the millions of components has the same very small probability of
failure. Also assume that the first failure does not affect the probability of a
second failure (in some cases these assumptions may not hold, and more
complex distributions would be used).
From past experience the expected number of failures per day is 3/100, or
λ = 0.03.
a. P(No failures in a given day) = P(X = 0 | λ = 0.03)
   = e^(−0.03) (0.03)^0 / 0! = 0.970446
b. The probability of at least one failure is the complement of the
probability of no failures:
P(X ≥ 1) = 1 − P(X = 0) = 1 − [e^(−0.03) (0.03)^0 / 0!] = 1 − e^(−0.03) = 0.029554

c. P(At least two failures in a 3-day period) = P(X ≥ 2 | λ = 0.09), where the
average over a 3-day period is λ = 3(0.03) = 0.09:
P(X ≥ 2 | λ = 0.09) = 1 − P(X ≤ 1) = 1 − [P(X = 0) + P(X = 1)]
   = 1 − [0.913931 + 0.082254] = 1 − 0.996185 = 0.003815
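If SciPy is available, all three answers can be checked with scipy.stats.poisson; again a verification sketch, not part of the original solution.

```python
from scipy.stats import poisson

print(poisson.pmf(0, 0.03))        # a. P(X = 0 | lambda = 0.03)   ~ 0.970446
print(1 - poisson.pmf(0, 0.03))    # b. P(X >= 1 | lambda = 0.03)  ~ 0.029554
print(1 - poisson.cdf(1, 0.09))    # c. P(X >= 2 | lambda = 0.09)  ~ 0.003815
```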

Example: Customers arrive at a photocopying machine at an average rate of
two every 5 minutes. Assume that these arrivals are independent, with a
constant arrival rate, and that this problem follows a Poisson model, with X
denoting the number of arriving customers in a 5-minute period and mean λ = 2.
Find the probability that more than two customers arrive in a 5-minute period.

Solution: Since the mean number of arrivals in 5 minutes is two, λ = 2. To
find the probability that more than two customers arrive, first compute the
probability of at most two arrivals in a 5-minute period, and then use the
complement rule. So,
P(X = 0) = e^(−2) 2^0 / 0! = e^(−2) = 0.1353
P(X = 1) = e^(−2) 2^1 / 1! = 2e^(−2) = 0.2707
P(X = 2) = e^(−2) 2^2 / 2! = 2e^(−2) = 0.2707
Thus P(X > 2) = 1 − P(X ≤ 2) = 1 − [0.1353 + 0.2707 + 0.2707] = 0.3233
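The same complement-rule calculation can be done in one line with scipy.stats.poisson; poisson.sf gives 1 − F(x) directly (a sketch, not part of the notes).

```python
from scipy.stats import poisson

lam = 2
print(1 - poisson.cdf(2, lam))   # P(X > 2) via the complement rule  ~ 0.3233
print(poisson.sf(2, lam))        # same value from the survival function
```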

Example: A professor receives, on average, 4.2 telephone calls from students
the day before a final examination. If the distribution of calls is Poisson, what is
the probability of receiving at least three of these calls on such a day?

Solution: The distribution of X, number of telephone calls, is Poisson with
λ = 4.2. Thus the probability distribution function is
P(X = x) = e^(−λ) λ^x / x! = e^(−4.2) (4.2)^x / x!
The probability of receiving at least three calls is
P(X ≥ 3) = 1 − P(X ≤ 2) = 1 − [P(X = 0) + P(X = 1) + P(X = 2)]
Now P(X = 0) = e^(−4.2) (4.2)^0 / 0! = 0.015
P(X = 1) = e^(−4.2) (4.2)^1 / 1! = 0.063
P(X = 2) = e^(−4.2) (4.2)^2 / 2! = 0.132
P(X ≥ 3) = 1 − [0.015 + 0.063 + 0.132] = 1 − 0.21 = 0.79

The Normal Distribution
Probability Density Function
Let X be a continuous random variable, and let x be any number lying in the
range of values this random variable can take. The probability density function,
f (x), of the random variable is a function with the following properties:
1. f(x) > 0 for all values of x.
2. The area under the probability density function, f(x), over all values of the
random variable, X, is equal to 1.0.
3. Suppose that this density function is graphed. Let a and b be two possible
values of random variable X, with a < b. Then the probability that X lies
between a and b is the area under the density function between these points.
4. The cumulative distribution function, F(x0), is the area under the probability
density function, f(x), up to x0:
F(x0) = ∫_{xm}^{x0} f(x) dx
where xm is the minimum value of the random variable x.

Areas Under Continuous Probability Density Functions


Let X be a continuous random variable with probability density function f(x)
and cumulative distribution function F(x). Then the following properties hold:
1. The total area under the curve f(x) is 1.
2. The area under the curve f(x) to the left of x0 is F(x0), where x0 is any value
that the random variable can take.

Probability Density Function


The probability density function for a normally distributed random variable X is
f(x) = [1/√(2πσ²)] e^(−(x − μ)²/(2σ²))   for −∞ < x < ∞

where μ and σ² are any numbers such that −∞ < μ < ∞ and 0 < σ² < ∞, and e and
π are mathematical constants, e = 2.71828… and π ≈ 3.14159….
The normal probability distribution represents a large family of distributions,
each with a unique specification for the parameters μ and σ². These parameters
have a very convenient interpretation.
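A minimal Python sketch of this density (the helper name and the Riemann-sum check over μ ± 6 are illustrative choices): it evaluates f(x) from the formula and confirms numerically that the total area under the curve is about 1.

```python
from math import exp, pi, sqrt

def normal_pdf(x, mu, sigma2):
    """Normal density f(x) with mean mu and variance sigma2, from the formula above."""
    return exp(-(x - mu) ** 2 / (2 * sigma2)) / sqrt(2 * pi * sigma2)

mu, sigma2 = 5.0, 1.0
dx = 0.001
# Riemann sum of f(x) over [mu - 6, mu + 6]; should be very close to 1.
area = sum(normal_pdf(mu - 6 + i * dx, mu, sigma2) * dx for i in range(int(12 / dx)))
print(normal_pdf(mu, mu, sigma2), area)
```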

Properties of the Normal Distribution


Suppose that the random variable X follows a normal distribution with
parameters μ and σ². Then the following properties hold:
1. The mean of the random variable is μ:
E(X) = μ
2. The variance of the random variable is σ²:
Var(X) = E[(X − μ)²] = σ²
3. The shape of the probability density function is a symmetrical bell-shaped
curve centered on the mean, μ, as shown in the following figure.
4. If we know the mean and variance, we can define the normal distribution by
using the notation
X ~ N(μ, σ²)

Figure: Probability density function for a Normal Distribution.

For our applied statistical analyses the normal distribution has a number of
important characteristics. It is symmetric. Different central tendencies are
indicated by differences in μ. In contrast, differences in σ² result in density
functions of different widths. By selecting values for μ and σ² we can define a
large family of normal probability density functions. Differences in the mean
result in shifts of entire distributions. In contrast, differences in the variance
result in distributions with different widths.

Figure: Effects of μ and σ² on the probability density function of a Normal
random variable.
(a) Two normal distributions with different means (Mean = 5 and Mean = 6).
(b) Two normal distributions with different variances (Variance = 0.056 and Variance = 1) and mean = 5.

Cumulative Distribution Function of the Normal Distribution


Suppose that X is a normal random variable with mean μ and variance σ²; that
is, X ~ N(μ, σ²). Then the cumulative distribution function is
F(x0) = P(X ≤ x0)
This is the area under the normal probability density function to the left of x0.
As for any proper density function, the total area under the curve is 1; that is,
F(∞) = 1

Figure: The shaded area is the probability that X does not exceed x0 for a
normal random variable.

Range Probabilities for Normal Random Variables


Let X be a normal random variable with cumulative distribution function F(x),
and let a and b be two possible values of X, with a < b. Then
P(a < X < b) = F(b) − F(a)

The probability is the area under the corresponding probability density function
between a and b, as shown in the following Figure.

Figure: Normal density function with the shaded area indicating the probability
that X is between a and b.
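In Python, F(b) − F(a) can be computed with scipy.stats.norm.cdf; the mean, standard deviation, and endpoints below are illustrative values, not taken from the notes.

```python
from scipy.stats import norm

mu, sigma = 5.0, 1.0        # illustrative N(5, 1) random variable
a, b = 4.0, 6.5
prob = norm.cdf(b, loc=mu, scale=sigma) - norm.cdf(a, loc=mu, scale=sigma)
print(prob)   # P(4 < X < 6.5) = F(6.5) - F(4) ~ 0.7745
```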

The Standard Normal Distribution
Let Z be a normal random variable with mean 0 and variance 1; that is,
Z~N(0, 1)
We say that Z follows the standard normal distribution.
Denote the cumulative distribution function as F(z) and a and b as two numbers
with a < b; then
P(a < Z < b) = F(b) − F(a)
We can obtain probabilities for any normally distributed random variable by
first converting the random variable to the standard normally distributed random
variable, Z. There is always a direct relationship between any normally
distributed random variable and Z. That relationship uses the transformation
Z = (X − μ)/σ
where X is a normally distributed random variable,
X ~ N(μ, σ²)

This important result allows us to use the standard normal table to compute
probabilities associated with any normally distributed random variable. Now let
us see how probabilities can be computed for the standard normal Z.
The cumulative distribution function of the standard normal distribution is
tabulated in Table 1 in the Appendix. This table gives values of
F(z) = P(Z ≤ z)
for non-negative values of z. For example, the cumulative probability for a Z
value of 1.25 from Table 1 is
F(1.25) = 0.8944
This is the area, designated in the following Figure, for Z less than 1.25.
Because of the symmetry of the normal distribution, the probability that Z > -
1.25 is also equal to 0.8944. In general, values of the cumulative distribution
function for negative values of Z can be inferred using the symmetry of the
probability density function.
To find the cumulative probability for a negative Z (for example, Z = −1.0),
defined as
F(−z0) = P(Z ≤ −z0),
we use the complement of the probability for Z = +z0; for instance,
F(−1.0) = 1 − F(1.0) = 1 − 0.8413 = 0.1587, as shown in the following Figure.
Figure: Standard Normal Distribution for Negative Z equal to –1.

[Figure annotations: F(−1) = 0.1587; 1 − F(−1) = 1 − 0.1587 = 0.8413.]
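In place of Appendix Table 1, the cumulative standard normal probabilities quoted above can be computed with scipy.stats.norm; the N(100, 15²) variable in the last lines is a hypothetical example of the standardizing transformation, not from the notes.

```python
from scipy.stats import norm

print(norm.cdf(1.25))       # F(1.25)  ~ 0.8944
print(norm.cdf(-1.25))      # F(-1.25) = 1 - F(1.25) ~ 0.1056, by symmetry
print(norm.cdf(-1.0))       # F(-1)    = 1 - F(1)    ~ 0.1587

# Standardizing a hypothetical X ~ N(100, 15^2):
mu, sigma, x = 100.0, 15.0, 118.75
z = (x - mu) / sigma        # = 1.25
print(norm.cdf(z))          # P(X <= 118.75) = F(1.25) ~ 0.8944
```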

Examples:
