
Lecture 4

Discrete Probability Distributions


Random variables
A random variable is a numerical description of the outcome of an experiment. A random variable is also known as a chance variable or stochastic variable.

Examples 4.1:

Experiment                            Random variable (x)                               Possible values for the random variable
1. Make 100 sales calls               Total number of sales                             0, 1, 2, ..., 100
2. Inspect a shipment of 70 radios    Number of defective radios                        0, 1, 2, ..., 70
3. Operate a restaurant               Number of customers entering in one day           0, 1, 2, ...
4. Build a new library                Percentage of project completed after 6 months    0 ≤ x ≤ 100

Discrete random variables


If a variable X can assume a discrete set of values $X_1, X_2, X_3, \dots, X_K$ with respective probabilities $p_1, p_2, p_3, \dots, p_K$, where $p_1 + p_2 + p_3 + \dots + p_K = 1$, we say that a discrete probability distribution for X has been defined. The function $f(X)$, which has the respective values $p_1, p_2, p_3, \dots, p_K$ for $X = X_1, X_2, X_3, \dots, X_K$, is called the probability function, or frequency function, of X. Because X can assume certain values with given probabilities, it is often called a discrete random variable.

Example 4.2. Let a pair of fair dice be tossed and let X denote the sum of the points obtained. Then the probability distribution is as shown in Table 4.1. For example, the probability of getting sum 5 is $\frac{4}{36} = \frac{1}{9}$. Thus in 900 tosses of the dice we would expect 100 tosses to give the sum 5.

Table 4.1
X       2      3      4      5      6      7      8      9      10     11     12
f(X)    1/36   2/36   3/36   4/36   5/36   6/36   5/36   4/36   3/36   2/36   1/36

If X denotes a discrete random variable that can assume the values $X_1, X_2, X_3, \dots, X_K$ with respective probabilities $p_1, p_2, p_3, \dots, p_K$, where $p_1 + p_2 + p_3 + \dots + p_K = 1$, the expected value of X (or the mathematical expectation of X), denoted by $E(X)$ (or $\mu$), is defined as

$$E(X) = \mu = X_1 p_1 + X_2 p_2 + \dots + X_K p_K = \sum_{j=1}^{K} X_j p_j = \sum X f(X)$$

For a discrete random variable X with possible values $X_1, X_2, X_3, \dots, X_K$ and respective probabilities $p_1, p_2, p_3, \dots, p_K$ (with $p_1 + p_2 + p_3 + \dots + p_K = 1$), the measure of the dispersion, or variability, of the possible values of X is called the variance and is calculated as follows:

$$\mathrm{Var}(X) = \sigma^2 = (X_1 - \mu)^2 p_1 + (X_2 - \mu)^2 p_2 + \dots + (X_K - \mu)^2 p_K = \sum_{j=1}^{K} (X_j - \mu)^2 p_j = \sum (X - \mu)^2 f(X)$$

Standard deviation: $\sigma = \sqrt{\mathrm{Var}(X)}$
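As a check on these definitions, the following minimal Python sketch builds the distribution of the sum of two fair dice by enumerating the 36 equally likely outcomes, then evaluates the expected value and variance formulas with exact fractions. The function names `dice_sum_distribution`, `expected_value`, and `variance` are illustrative choices and do not come from the lecture.

```python
from fractions import Fraction
from itertools import product
from math import sqrt


def dice_sum_distribution():
    """Return {sum: probability} for two fair six-sided dice."""
    dist = {}
    for a, b in product(range(1, 7), repeat=2):
        dist[a + b] = dist.get(a + b, Fraction(0)) + Fraction(1, 36)
    return dist


def expected_value(dist):
    """E(X) = sum of X_j * p_j over all possible values."""
    return sum(x * p for x, p in dist.items())


def variance(dist):
    """Var(X) = sum of (X_j - mu)^2 * p_j over all possible values."""
    mu = expected_value(dist)
    return sum((x - mu) ** 2 * p for x, p in dist.items())


dist = dice_sum_distribution()
mu = expected_value(dist)
var = variance(dist)
sigma = sqrt(var)  # standard deviation as a float
print(dist[5], mu, var, round(sigma, 3))
```

Using `Fraction` keeps every probability exact, so the result for the sum 5 can be compared directly with the value 1/9 quoted in Example 4.2.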

Binomial probability distribution


A binomial experiment is an experiment that satisfies the following conditions:
1. The experiment consists of a sequence of n identical trials.
2. Two outcomes are possible on each trial. We refer to one outcome as a success and the other as a failure.
3. The probabilities of the two outcomes do not change from one trial to the next.
4. The trials are independent (i.e., the outcome of one trial does not affect the outcome of any other trial).

An important discrete random variable associated with the binomial experiment is x = the number of successes in the n trials. x can have a value of 0, 1, 2, 3, ..., n, depending on the number of successes observed in the n trials. The probability distribution associated with this random variable is called the binomial probability distribution.

The mathematical formula for computing the probability of any value of the binomial random variable is

$$f(x = k) = C_n^k \, p^k (1 - p)^{n-k}, \qquad k = 0, 1, \dots, n \qquad (4.1)$$

where

$$C_n^k = \frac{n!}{k!\,(n-k)!},$$

n = number of trials, p = probability of success on one trial, k = number of successes in n trials, and f(x = k) = probability of k successes in the n trials.

The term n! in the preceding expression is referred to as n factorial and is defined as

$$n! = n(n-1)(n-2)\cdots(2)(1)$$

and, by definition,

$$0! = 1$$
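Formula (4.1) translates almost directly into code. The sketch below is one possible Python rendering, assuming Python 3.8 or later so that `math.comb` is available for the coefficient n!/(k!(n-k)!). The function name `binomial_pmf` and the sample values n = 5, p = 0.4 are illustrative assumptions, not taken from the lecture.

```python
from math import comb


def binomial_pmf(k, n, p):
    """f(x = k) = C(n, k) * p^k * (1 - p)^(n - k), as in formula (4.1)."""
    if not 0 <= k <= n:
        return 0.0  # values outside 0..n have probability zero
    return comb(n, k) * p**k * (1 - p) ** (n - k)


# Illustrative values only: n = 5 trials, success probability p = 0.4.
probs = [binomial_pmf(k, 5, 0.4) for k in range(6)]
print([round(q, 4) for q in probs])
print(round(sum(probs), 10))  # the probabilities over k = 0..n should sum to 1
```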

There are tables (for example, the table in the book of McClave and Benson) that give cumulative binomial probabilities: $P(X \le k) = P(X = 0) + P(X = 1) + \dots + P(X = k)$. If you know the cumulative probabilities, you can find $P(X = k) = P(X \le k) - P(X \le k - 1)$ and $P(X > k) = 1 - P(X \le k)$.

Example 4.3. The probability of getting exactly 2 heads in 6 tosses of a fair coin is

$$C_6^2 \left(\frac{1}{2}\right)^2 \left(\frac{1}{2}\right)^{6-2} = \frac{6!}{2!\,4!} \left(\frac{1}{2}\right)^6 = \frac{15}{64}$$

using formula (4.1) with n = 6, k = 2, and $p = 1 - p = \frac{1}{2}$.

Example 4.4. The probability of getting at least 4 heads in 6 tosses of a fair coin is

$$C_6^4 \left(\frac{1}{2}\right)^4 \left(\frac{1}{2}\right)^{6-4} + C_6^5 \left(\frac{1}{2}\right)^5 \left(\frac{1}{2}\right)^{6-5} + C_6^6 \left(\frac{1}{2}\right)^6 \left(\frac{1}{2}\right)^{6-6} = \frac{15}{64} + \frac{6}{64} + \frac{1}{64} = \frac{11}{32}$$
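The two coin-tossing results can also be recovered from cumulative probabilities, which is how a printed cumulative table would be used. The following Python sketch is a minimal illustration with exact fractions. The helper names `pmf` and `cdf` are hypothetical, and the code computes the cumulative values itself rather than reading them from any published table.

```python
from fractions import Fraction
from math import comb


def pmf(k, n, p):
    """Binomial probability of exactly k successes in n trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def cdf(k, n, p):
    """Cumulative probability P(X <= k) = P(X = 0) + ... + P(X = k)."""
    return sum(pmf(j, n, p) for j in range(k + 1))


half = Fraction(1, 2)
exactly_two = cdf(2, 6, half) - cdf(1, 6, half)  # P(X = 2) = P(X <= 2) - P(X <= 1)
at_least_four = 1 - cdf(3, 6, half)              # P(X >= 4) = P(X > 3) = 1 - P(X <= 3)
print(exactly_two, at_least_four)
```

Note that "at least 4" is the complement of "at most 3", so the cumulative value needed is the one for k = 3, not k = 4.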

The discrete probability distribution (4.1) is called the binomial distribution since for k = 0, 1, 2, ..., n it corresponds to successive terms of the binomial formula, or binomial expansion,

$$(a + b)^n = \sum_{k=0}^{n} C_n^k a^k b^{n-k} = C_n^0 a^0 b^{n-0} + C_n^1 a^1 b^{n-1} + C_n^2 a^2 b^{n-2} + \dots + C_n^n a^n b^{n-n}$$
$$= b^n + n a b^{n-1} + \frac{n(n-1)}{2} a^2 b^{n-2} + \dots + a^n \qquad (4.2)$$

where $C_n^k$, k = 0, 1, 2, ..., n, are called the binomial coefficients.

Example 4.5. From (4.2) we get:

$$n = 2: \quad (a + b)^2 = C_2^0 a^0 b^{2-0} + C_2^1 a b + C_2^2 a^2 = b^2 + 2ab + a^2$$
$$n = 3: \quad (a + b)^3 = C_3^0 a^0 b^{3-0} + C_3^1 a^1 b^{3-1} + C_3^2 a^2 b^{3-2} + C_3^3 a^3 b^{3-3} = b^3 + 3ab^2 + 3a^2 b + a^3$$

The number $C_n^k$ is the number of ways of selecting k objects out of n. For example, from four objects A, B, C, and D, there are six ways of selecting two: AB, AC, AD, BC, BD, CD. The numbers have the following properties:

1. $C_n^{n-k} = C_n^k$

2. $C_{n+1}^k = C_n^{k-1} + C_n^k$. This property can be seen displayed in Pascal's triangle:

                  1
                1   1
              1   2   1
            1   3   3   1
          1   4   6   4   1
        1   5  10  10   5   1
      1   6  15  20  15   6   1

3. $C_n^0 + C_n^1 + C_n^2 + \dots + C_n^n = 2^n$
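The three properties can be checked numerically. The sketch below builds Pascal's triangle row by row from property 2 alone and then confirms that each row agrees with the factorial formula, is symmetric (property 1), and sums to a power of two (property 3). The function name `pascal_rows` is an illustrative choice, and `math.comb` (Python 3.8+) is used only as an independent reference for the factorial formula.

```python
from math import comb


def pascal_rows(n_max):
    """Build rows 0..n_max using C(n+1, k) = C(n, k-1) + C(n, k)."""
    rows = [[1]]
    for _ in range(n_max):
        prev = rows[-1]
        # Each interior entry is the sum of the two entries above it.
        middle = [prev[k - 1] + prev[k] for k in range(1, len(prev))]
        rows.append([1] + middle + [1])
    return rows


rows = pascal_rows(6)
for row in rows:
    print(row)

for n, row in enumerate(rows):
    assert row == [comb(n, k) for k in range(n + 1)]  # agrees with n!/(k!(n-k)!)
    assert row == row[::-1]                            # property 1: symmetry
    assert sum(row) == 2**n                            # property 3: row sum is 2^n
```

The entry 6 in row 4 corresponds to the six ways of choosing two objects out of A, B, C, and D listed above.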

Distribution (4.1) is also called the Bernoulli distribution after James Bernoulli, who discovered it at the end of the seventeenth century. Some properties of the binomial distribution are listed in Table 4.2.

Table 4.2
Mean                  $\mu = np$
Variance              $\sigma^2 = np(1 - p)$
Standard deviation    $\sigma = \sqrt{np(1 - p)}$

Example 4.6. In 100 tosses of a fair coin the mean number of heads is $\mu = np = (100)\left(\frac{1}{2}\right) = 50$; this is the expected number of heads in 100 tosses of the coin. The standard deviation is

$$\sigma = \sqrt{np(1 - p)} = \sqrt{(100)\left(\frac{1}{2}\right)\left(\frac{1}{2}\right)} = 5.$$
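The shortcut formulas for the mean and standard deviation agree with the general definitions of E(X) and Var(X) for a discrete random variable. The Python sketch below illustrates this for the 100-toss coin example by summing over all 101 possible values. The function name `binomial_mean_sd` is hypothetical, and the result is subject to ordinary floating-point rounding.

```python
from math import comb, sqrt


def binomial_mean_sd(n, p):
    """Mean and standard deviation computed from the definitions of E(X) and Var(X)."""
    pmf = [comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1)]
    mean = sum(k * f for k, f in enumerate(pmf))
    var = sum((k - mean) ** 2 * f for k, f in enumerate(pmf))
    return mean, sqrt(var)


mean, sd = binomial_mean_sd(100, 0.5)
print(round(mean, 6), round(sd, 6))
```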

Example 4.7. To illustrate the binomial probability distribution, let us consider the experiment of customers entering the Nastke Clothing Store. To keep the problem relatively small, we restrict the experiment to the next three customers. If, based on experience, the store manager estimates that the probability of a customer making a purchase is 0.3, what is the probability that exactly two of the next three customers make a purchase?

We first want to demonstrate that three customers entering the clothing store and deciding whether to make a purchase can be viewed as a binomial experiment. Checking the four requirements for a binomial experiment, we note the following:
1. The experiment can be described as a sequence of three identical trials, one trial for each of the three customers who will enter the store.
2. Two outcomes are possible for each trial: the customer makes a purchase (success) or the customer does not make a purchase (failure).
3. The probabilities of the purchase (0.30) and no purchase (0.70) outcomes are assumed to be the same for all customers.
4. The purchase decision of each customer is independent of the purchase decision of the other customers.

Thus, if we define the random variable x as the number of customers making a purchase (i.e., the number of successes in the three trials), we satisfy the requirements of the binomial probability distribution. With n = 3 trials and the probability of a purchase p = 0.30 for each customer, we use equation (4.1) to compute the probability of two customers making a purchase. This probability, denoted f(2), is

$$f(2) = \frac{3!}{2!\,1!} (0.30)^2 (0.70)^1 = \frac{3 \cdot 2 \cdot 1}{2 \cdot 1 \cdot 1} (0.30)^2 (0.70)^1 = 0.189$$

Similarly, the probability of no customers making a purchase, denoted f(0), is

$$f(0) = \frac{3!}{0!\,3!} (0.30)^0 (0.70)^3 = \frac{3 \cdot 2 \cdot 1}{1 \cdot 3 \cdot 2 \cdot 1} (0.70)^3 = 0.343$$

Equation (4.1) can likewise be used to show that the probabilities of one and three purchases are f(1) = 0.441 and f(3) = 0.027. The table below summarizes the binomial probability distribution for the Nastke Clothing Store problem.

k        f(x = k)
0        0.343
1        0.441
2        0.189
3        0.027
Total    1.000
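The whole distribution for the Nastke Clothing Store problem can be generated in a few lines. This is a minimal Python sketch that applies formula (4.1) with n = 3 and p = 0.30. The variable name `table` is an illustrative choice.

```python
from math import comb

n, p = 3, 0.30  # three customers, purchase probability 0.30 (Example 4.7)

# Probability of exactly k purchases for k = 0, 1, 2, 3.
table = {k: comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1)}

for k, prob in table.items():
    print(k, f"{prob:.3f}")
print("Total", f"{sum(table.values()):.3f}")
```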

The Poisson Distribution


The discrete probability distribution

$$p(x = j) = \frac{\lambda^j e^{-\lambda}}{j!}, \qquad j = 0, 1, 2, \dots \qquad (5.5)$$

where $e \approx 2.71828$ and $\lambda$ is a given constant, is called the Poisson distribution after Siméon-Denis Poisson, who discovered it in the early part of the nineteenth century.

Typical examples of random variables for which the Poisson probability distribution provides a good model are as follows:
1. The number of industrial accidents per month at a manufacturing plant
2. The number of noticeable surface defects (scratches, dents, etc.) found by quality inspectors on a new automobile
3. The parts per million of some toxin found in the water or air emission from a manufacturing plant
4. The number of customer arrivals per unit of time at a supermarket checkout counter
5. The number of death claims received per day by an insurance company
6. The number of errors per 100 invoices in the accounting records of a company
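The Poisson probability formula can be evaluated directly for any count j. The following Python sketch is one possible rendering. The function name `poisson_pmf`, the parameter name `lam` for the constant of the distribution, and the sample value of 2 events per unit are all illustrative assumptions rather than values from the lecture.

```python
from math import exp, factorial


def poisson_pmf(j, lam):
    """p(x = j) = lam^j * e^(-lam) / j!, for j = 0, 1, 2, ..."""
    if j < 0:
        return 0.0  # negative counts are impossible
    return lam**j * exp(-lam) / factorial(j)


# Illustrative value only: an average of 2 events per unit of time.
lam = 2.0
probs = [poisson_pmf(j, lam) for j in range(10)]
print([round(q, 4) for q in probs])
print(round(sum(probs), 4))  # slightly below 1; the remaining mass lies at j >= 10
```

Unlike the binomial random variable, a Poisson random variable has no upper limit, so any finite list of probabilities sums to slightly less than 1.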

Characteristics of a Poisson Random Variable
1. The experiment consists of counting the number of times an event occurs during a given unit of time (or in a given area, or volume, or any other unit of measurement).
2. The probability that the event occurs in a given unit of time (or area, or volume, ...) is the same for all the units.
3. The number of events that occur in a given unit of time (or area, or volume, ...) is independent of the number of events that occur in other units.

Some properties of the Poisson distribution are listed in Table 4.3.

Table 4.3
Mean (expected value)    $\mu = E(x) = \lambda$
Variance                 $\sigma^2 = \lambda$
Standard deviation       $\sigma = \sqrt{\lambda}$

Finding Poisson Probabilities

Example 4.8. Suppose the number, x, of a company's employees who are absent on Mondays has (approximately) a Poisson probability distribution. Furthermore, assume that the average number of Monday absentees is 2.6.
a. Find the mean and standard deviation of x, the number of employees absent on Monday.
b. Find the probability that fewer than two employees are absent on a given Monday.
c. Find the probability that more than five employees are absent on a given Monday.
d. Find the probability that exactly five employees are absent on a given Monday.

Solution
a. The mean and variance of a Poisson random variable are both equal to the parameter $\lambda$ of the distribution. Thus, for this example, $\mu = \lambda = 2.6$, $\sigma^2 = \lambda = 2.6$, and $\sigma = \sqrt{2.6} = 1.61$.
b. To find the probability that fewer than two employees are absent on a given Monday, we use Table 4.4. The rows of Table 4.4 correspond to different values of $\lambda$, and the columns correspond to different values (k) of the Poisson random variable x. The entries in the table give the cumulative probability

$$P(x \le k) = \sum_{j=0}^{k} P(x = j)$$

Table 4.4 (rows: mean of the distribution; columns: k)

Mean    k=0     k=1     k=2     k=3     k=4     k=5     k=6     k=7     k=8     k=9
2.2     .111    .355    .623    .819    .928    .975    .993    .998    1.000   1.000
2.4     .091    .308    .570    .779    .904    .964    .988    .997    .999    1.000
2.6     .074    .267    .518    .736    .877    .951    .983    .995    .999    1.000
2.8     .061    .231    .469    .692    .848    .935    .976    .992    .998    .999
3.0     .050    .199    .423    .647    .815    .916    .966    .988    .996    .999
3.2     .041    .171    .380    .603    .781    .895    .955    .983    .994    .998
3.4     .033    .147    .340    .558    .744    .871    .942    .977    .992    .997
3.6     .027    .126    .303    .515    .706    .844    .927    .969    .988    .996
3.8     .022    .107    .269    .473    .668    .816    .909    .960    .984    .994
4.0     .018    .092    .238    .433    .629    .785    .889    .949    .979    .992
4.2     .015    .078    .210    .395    .590    .753    .867    .936    .972    .989
4.4     .012    .066    .185    .359    .551    .720    .844    .921    .964    .985
4.6     .010    .056    .163    .326    .513    .686    .818    .905    .955    .980
4.8     .008    .048    .143    .294    .476    .651    .791    .887    .944    .975
5.0     .007    .040    .125    .265    .440    .616    .762    .867    .932    .968
5.2     .006    .034    .109    .238    .406    .581    .732    .845    .918    .960
5.4     .005    .029    .095    .213    .373    .546    .702    .822    .903    .951
5.6     .004    .024    .082    .191    .342    .512    .670    .797    .886    .941
5.8     .003    .021    .072    .170    .313    .478    .638    .771    .867    .929
6.0     .002    .017    .062    .151    .285    .446    .606    .744    .847    .916
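The entries of Table 4.4 can be cross-checked by summing the Poisson probabilities directly. The Python sketch below recomputes the row for a mean of 2.6 and then performs the three kinds of table look-up that the parts of this example call for. The function name `poisson_cdf` and the parameter name `lam` are illustrative, and the values are computed from the formula rather than read from the printed table, so they may differ from it in the last rounded digit.

```python
from math import exp, factorial


def poisson_cdf(k, lam):
    """P(x <= k) = sum over j = 0..k of lam^j * e^(-lam) / j!."""
    return sum(lam**j * exp(-lam) / factorial(j) for j in range(k + 1))


lam = 2.6  # mean number of Monday absentees in Example 4.8
row = [round(poisson_cdf(k, lam), 3) for k in range(10)]
print(row)

fewer_than_two = poisson_cdf(1, lam)                      # P(x < 2) = P(x <= 1)
more_than_five = 1 - poisson_cdf(5, lam)                  # P(x > 5) = 1 - P(x <= 5)
exactly_five = poisson_cdf(5, lam) - poisson_cdf(4, lam)  # P(x = 5)
print(round(fewer_than_two, 3), round(more_than_five, 3), round(exactly_five, 3))
```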

The probability $P(x < 2) = P(x \le 1)$ is a cumulative probability and is therefore the entry in Table 4.4 in the row corresponding to $\lambda = 2.6$ and the column corresponding to $k = 1$, that is, 0.267.
c. To find the probability that more than five employees are absent on a given Monday, we consider the complementary event: $P(x > 5) = 1 - P(x \le 5) = 1 - 0.951 = 0.049$, where 0.951 is the entry in Table 4.4 corresponding to $\lambda = 2.6$ and $k = 5$.
d. To use Table 4.4 to find the probability that exactly five employees are absent on a Monday, we must write the probability as the difference between two cumulative probabilities: $P(x = 5) = P(x \le 5) - P(x \le 4) = 0.951 - 0.877 = 0.074$.

Home Work:

1. A Harris Interactive survey for InterContinental Hotels & Resorts asked respondents, "When traveling internationally, do you generally venture out on your own to experience culture, or stick with your tour group and itineraries?" The survey found that 23% of the respondents stick with their tour group (USA Today, January 21, 2004).
a. In a sample of six international travelers, what is the probability that two will stick with their tour group?
b. In a sample of six international travelers, what is the probability that at least two will stick with their tour group?
c. In a sample of 10 international travelers, what is the probability that none will stick with the tour group?

2. When a new machine is functioning properly, only 3% of the items produced are defective. Assume that we will randomly select two parts produced on the machine and that we are interested in the number of defective parts found.
a. Describe the conditions under which this situation would be a binomial experiment.
b. How many experimental outcomes yield 1 defect?
c. Compute the probabilities associated with finding no defects, 1 defect, and 2 defects.

3. Military radar and missile detection systems are designed to warn a country of enemy attacks. A reliability question deals with the ability of a detection system to identify an attack and issue the warning. Assume that a particular detection system has a 0.90 probability of detecting a missile attack. Answer the following questions using the binomial probability distribution.
a. What is the probability that 1 detection system will detect an attack?
b. If 2 detection systems are installed in the same area and operate independently, what is the probability that at least 1 of the systems will detect the attack?
c. If 3 systems are installed, what is the probability that at least 1 of the systems will detect the attack?
d. Would you recommend that multiple detection systems be operated? Explain.

4. The Federal Deposit Insurance Corporation (FDIC) insures deposits of up to $100,000 in banks that are members of the Federal Reserve System against losses due to bank failure or theft. Over the last 5 years, the average number of bank failures per year among insured banks was 5.8 (FDIC Stats at a Glance, Mar. 2003). Assume that x, the number of bank failures per year among insured banks, can be adequately characterized by a Poisson probability distribution with mean 6.
a. Find the expected value and standard deviation of x.
b. Find the probabilities $P(x \le 9)$, $P(x > 6)$, and $P(x = 7)$.
