3.0 OVERVIEW

In this Unit, the student is introduced to the concept of a random variable. Probability mass functions are then presented, followed by some of the more important mass functions together with their applications.

Unit Structure
3.0 Overview
3.1 Learning Objectives
3.2 Introduction
3.3 Random Variables
3.4 Probability Distribution for Discrete Random Variables
3.4.1 Probability Mass Function
3.4.2 The Cumulative Distribution Function
3.4.3 Expected Values of Discrete Random Variables
3.4.4 Rules of Expected Value
3.4.5 Variance of Discrete Random Variables
3.4.6 Rules of Variance
3.5.1 The Bernoulli Distribution
3.5.2 The Mean & Variance
3.5.3 The Binomial Distribution
3.6 The Hypergeometric Distribution
3.6.1 Assumptions & Definition
3.6.2 The Mean and Variance
3.7 Negative Binomial and Geometric Distributions
3.7.1 The Mean and Variance
3.8 Poisson Distribution
3.9 Activities
3.10 Answers to Activities
3.11 Summary

3.1 LEARNING OBJECTIVES

By the end of this Unit, the student should be able to do the following:
1. Define a random variable and understand the difference between a discrete and a continuous random variable.
2. Understand the concept of a probability distribution for a discrete random variable.
3. Construct binomial, hypergeometric, negative binomial and Poisson discrete probability distributions.
4. Determine the mean and variance of these discrete probability distributions.
5. Apply each of the probability distributions to real-life problems.
3.3 RANDOM VARIABLES

Events of major interest to the scientist, engineer, or businessperson are those identified by numbers, called numerical events. The research physician is interested in the event that ten of ten treated patients survive an illness; the businessperson is interested in the event that sales next year will reach Rs 5 million. Because the value of a numerical event varies from one repetition of an experiment to another, it is a random variable: a real-valued function defined on a sample space S. Thus, if Y denotes a r.v.,

Y : S → R, i.e. Y(w) = y, w ∈ S, y ∈ R.

For example, when a student attempts to log on to a computer time-sharing system, either all ports could be busy (F), in which case the student will fail to obtain access, or else there will be at least one port free (S), in which case the student will be successful in accessing the system. With S = {S, F}, define a random variable X by X(S) = 1 and X(F) = 0. The r.v. X indicates whether (1) or not (0) the student can log on.

Definition 3.2 A r.v. Y is said to be discrete if it can assume only a finite number of values, or if its values can be listed so that there is a first element, a second element, a third element and so on, i.e. if its set of values is countably infinite.

3.4 PROBABILITY DISTRIBUTION FOR DISCRETE RANDOM VARIABLES

3.4.1 Probability Mass Function

The probability distribution or probability mass function of X says how the total probability 1 is distributed among the various possible X values.

Definition 3.3 The probability distribution or probability mass function (p.m.f) of a discrete random variable is defined for every number x by

p(x) = P(X = x) = P(all s ∈ S : X(s) = x).

Note that since p(x) is a probability then, by the axioms of probability,

p(x) ≥ 0 for all values of x, and Σ p(x) = 1, the sum being over all possible x.     (3.1)

In fact, the above two conditions are necessary and sufficient for a function p to be the p.m.f of some discrete random variable.

Example 3.1
Suppose we go to a large tire store during a particular week and observe whether the next customer to purchase tires purchases a radial or a bias-ply tire. Let

X = 1 if the customer purchases a radial tire; X = 0 if the customer purchases a bias-ply tire.
If 60% of all customers purchase radial tires, an equivalent description of the distribution of X is

p(x) = 0.4 if x = 0; 0.6 if x = 1; 0 if x ≠ 0 or 1.

3.4.2 The Cumulative Distribution Function

For some fixed value x, we often wish to compute the probability that the observed value of X will be at most x,

F_X(x) = P(X ≤ x),

i.e. the probability of non-exceedance of x. For a discrete r.v. X,

F(x) = P(X ≤ x) = Σ p(y), the sum being over all y with y ≤ x.

For any number x, F(x) is the probability that the observed value of X will be at most x.

Example 3.2
A discrete r.v. Y takes the values 1, 2, 3, 4 with p(1) = 0.4, p(2) = 0.3, p(3) = 0.2 and p(4) = 0.1. The c.d.f is thus given as

F(y) = 0 if y < 1; 0.4 if 1 ≤ y < 2; 0.7 if 2 ≤ y < 3; 0.9 if 3 ≤ y < 4; 1 if 4 ≤ y.

Figure 3.1: Graph of F(y), a step function with jumps of size p(y) at y = 1, 2, 3, 4.

By using the simple equation

P(X ≤ a) + P(a < X ≤ b) = P(X ≤ b),

we have the following basic theorem for cumulative distribution functions: for any numbers a < b,

P(a < X ≤ b) = F(b) − F(a).
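Both the p.m.f conditions in (3.1) and the c.d.f of Example 3.2 can be checked with a few lines of Python. The sketch below uses only the standard library; the dictionary simply encodes the p.m.f of Example 3.2.

```python
# p.m.f of Example 3.2: p(1) = .4, p(2) = .3, p(3) = .2, p(4) = .1
pmf = {1: 0.4, 2: 0.3, 3: 0.2, 4: 0.1}

# Conditions (3.1): p(x) >= 0 for all x, and the probabilities sum to 1.
assert all(p >= 0 for p in pmf.values())
assert abs(sum(pmf.values()) - 1.0) < 1e-12

def F(x):
    """c.d.f: F(x) = P(Y <= x) = sum of p(y) over support points y <= x."""
    return sum((p for y, p in pmf.items() if y <= x), 0.0)

print([round(F(x), 1) for x in (0, 1, 2, 3, 4)])  # [0.0, 0.4, 0.7, 0.9, 1.0]
```

The printed values match the jumps of F(y) shown above.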
3.4.3 Expected Values of Discrete Random Variables

For any random variable X, the formula for its distribution (the PMF if X is discrete) completely describes the behaviour of the random variable. However, associated with any random variable are constants, or parameters, that are descriptive. Knowledge of the numerical values of these parameters gives the researcher quick insight into the nature of the variable. We consider three such parameters: the mean μ, the variance σ², and the standard deviation σ. If the exact form of the distribution is known, these parameters can be computed exactly.

To understand the reasoning behind most statistical methods, the student must become familiar with one fundamental concept, namely the idea of mathematical expectation, or expected value. This concept is used in defining many statistical parameters and provides the logical basis for most of the procedures that follow.

Definition 3.4 Let X be a discrete random variable with p.m.f p(x). Then the expected value (or mean) of X, denoted E(X) or μ_X, is

E(X) = Σ x p(x), the sum being over all x.

Example 3.3
A drug is used to maintain a steady heart rate in patients who have suffered a mild heart attack. Let X denote the number of heartbeats per minute obtained per patient. Consider the hypothetical distribution given in the table below. What is the mean heart rate obtained by all patients receiving this drug?

x      40    60    68    70    72    80    100
p(x)  .01   .04   .05   .80   .05   .04   .01

Solution
By definition, we have

E(X) = Σ x p(x) = 40(.01) + 60(.04) + 68(.05) + ... + 100(.01) = 70.

3.4.4 Rules of Expected Value

Let X and Y be random variables and let c be any constant. Then:
1. E(c) = c;
2. E(cX) = cE(X);
3. E(X + Y) = E(X) + E(Y).

Example 3.4
Let X and Y be random variables with E(X) = 7 and E(Y) = −5. Compute E(4X − 2Y + 6).

Solution
E(4X − 2Y + 6) = 4E(X) − 2E(Y) + 6 = 4(7) − 2(−5) + 6 = 44.
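Definition 3.4 translates directly into code. A quick check of the mean in Example 3.3, using only the standard library:

```python
# Check E(X) = 70 for the heart-rate distribution of Example 3.3,
# using the definition E(X) = sum over all x of x * p(x).
xs = [40, 60, 68, 70, 72, 80, 100]
ps = [0.01, 0.04, 0.05, 0.80, 0.05, 0.04, 0.01]

mean = sum(x * p for x, p in zip(xs, ps))
print(round(mean, 6))  # 70.0
```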
3.4.5 Variance of Discrete Random Variables

The variance of X, denoted var(X) or σ²_X, is var(X) = E[(X − μ)²]; it is conveniently computed from the shortcut formula

var(X) = E(X²) − [E(X)]².

Example 3.5
Find the variance of the heart rate in Example 3.3.

Solution
var(X) = E(X²) − [E(X)]² = (40)²(.01) + (60)²(.04) + ... + (100)²(.01) − (70)² = 26.4.

3.5.1 The Bernoulli Distribution

We now consider one of the simplest discrete random variables: one that may assume only two different values, which we associate with the experimental outcomes success/failure, non-defective/defective, etc. This is numerically indicated by the value 1 representing success (non-defective) and the value 0 representing failure (defective). Such 0-1 random variables are called Bernoulli variables. A simple experiment that results in a success/failure outcome is called a Bernoulli trial.
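The shortcut formula in Example 3.5 is easy to verify numerically for the heart-rate distribution:

```python
# Variance of the heart-rate r.v. of Example 3.3 via the shortcut
# var(X) = E(X^2) - [E(X)]^2.
xs = [40, 60, 68, 70, 72, 80, 100]
ps = [0.01, 0.04, 0.05, 0.80, 0.05, 0.04, 0.01]

mean = sum(x * p for x, p in zip(xs, ps))       # E(X) = 70
ex2 = sum(x * x * p for x, p in zip(xs, ps))    # E(X^2)
var = ex2 - mean ** 2
print(round(var, 1))  # 26.4
```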
Definition 3.6 A random variable X is said to have a Bernoulli distribution with parameter p if X has a distribution given by the p.m.f

p(x) = p^x q^(1−x),  x = 0, 1;  q = 1 − p.

If X has the above p.m.f, we write X ~ Bernoulli(p).

3.5.2 The Mean and Variance

If X ~ Bernoulli(p), then

E(X) = p,  var(X) = pq,

where q = 1 − p.

3.5.3 The Binomial Distribution

Suppose, for example, that the number of trials is n = 3. Then there are eight possible outcomes for the experiment:

SSS, SSF, SFS, SFF, FSS, FSF, FFS, FFF.

If X denotes the number of S's among the n trials, then X(SSF) = 2, X(SFF) = 1, and so on. Possible values for X in an n-trial experiment are x = 0, 1, 2, ..., n. We often write X ~ Bin(n, p) to indicate that X is a binomial random variable based on n trials with success probability p. We therefore arrive at this definition.

Definition 3.7 A r.v. X is said to have a binomial distribution with parameters n and p if its p.m.f is given by

p(x) = b(x; n, p) = C(n, x) p^x q^(n−x),  x = 0, 1, 2, ..., n,

where q = 1 − p and C(n, x) = n!/(x!(n − x)!), and we write X ~ Bin(n, p).
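The binomial p.m.f b(x; n, p) = C(n, x) p^x q^(n−x) is easy to evaluate with math.comb. The sketch below checks, for the eight-outcome experiment with n = 3 above, that the probabilities sum to 1 and that the mean is np; the value p = 0.5 is an illustrative assumption, not taken from the text.

```python
from math import comb

# Binomial p.m.f: b(x; n, p) = C(n, x) * p^x * (1 - p)^(n - x)
def b(x, n, p):
    return comb(n, x) * p**x * (1 - p)**(n - x)

n, p = 3, 0.5  # n = 3 as in the text; p = 0.5 chosen only for illustration

# The p.m.f conditions: probabilities sum to 1, and the mean equals np.
assert abs(sum(b(x, n, p) for x in range(n + 1)) - 1) < 1e-12
assert abs(sum(x * b(x, n, p) for x in range(n + 1)) - n * p) < 1e-12
```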
Since X denotes the number of successes in a fixed number of trials, X is binomial; with n = 10 trials, for instance, its PMF is

p(x) = b(x; 10, p) = C(10, x) p^x q^(10−x),  x = 0, 1, ..., 10.

Even for a relatively small value of n, the computation of binomial probabilities can be tedious. The Binomial Tables tabulate the cumulative distribution function F(x) = P(X ≤ x) for n = 5, 10, 15, 20, 25 in combination with selected values of p.

Note: For X ~ Bin(n, p), the cumulative distribution function will be denoted by

P(X ≤ x) = B(x; n, p) = Σ_{y=0}^{x} b(y; n, p),  x = 0, 1, ..., n.

As an illustration, suppose that each of 15 components fails independently with probability 0.2, and let X denote the number that fail, so that X ~ Bin(15, 0.2).
a. The probability that at most eight fail is, from the Binomial Tables, B(8; 15, 0.2) = 0.999.
b. The probability that exactly eight fail is
P(X = 8) = P(X ≤ 8) − P(X ≤ 7) = B(8; 15, 0.2) − B(7; 15, 0.2) = 0.999 − 0.996 = 0.003.
c. The probability that at least eight fail is
P(X ≥ 8) = 1 − P(X ≤ 7) = 1 − B(7; 15, 0.2) = 1 − 0.996 = 0.004.

The next theorem summarises other theoretical properties of the binomial distribution: if X ~ Bin(n, p), then

E(X) = np,  var(X) = npq,

where q = 1 − p.

Example 3.8
Find the expectation and variance of the r.v. X in Example 3.6.

3.6 THE HYPERGEOMETRIC DISTRIBUTION

The hypergeometric and negative binomial distributions are both closely related to the binomial distribution. Whereas the binomial distribution is the approximate probability model for sampling without replacement from a finite dichotomous (S-F) population, the hypergeometric distribution is the exact probability model for the number of S's in the sample. The binomial random variable X is the number of S's when the number n of trials is fixed, whereas the negative binomial distribution arises from fixing the number of S's and letting the number of trials be random.
13 14
3.6.1 Assumptions & Definition Solution
For this example N = 20, n = 10, r = 5, i.e. the selecting any of the five best engineers is
The assumptions leading to the hypergeometric distribution are:
a success. We seek the probability {X = 5} , where X is the number of best engineers
1. The population or set to be sampled consists of N individuals, objects, or
among the ten applicants selected. X thus has a hypergeometric distribution and
elements (a finite population).
2. Each individual can be characterised as a success (S) or a failure (F), and there are 5 15
15! 10!10!
P( X = 5) = =
5 5 21
M successes in the population. = = .0192.
20 5!10! 20! 1292
3. A sample of n individuals is drawn in such a way that each subset of size n is
10
equally likely to be chosen.
The random variable of interest is X = the number of Ss in the sample. The probability 3.6.2 The Mean and Variance
distribution of X depends on the parameters n, M, N, so we wish to obtain P(X=x) = h(x:n,
M, N). If X has a hypergeometric distribution with parameters N, n and r, then
M M M N n
X = E ( X ) = n , X 2 = var ( X ) = n 1 .
N N N N 1
Definition 3.8 A r.v. X has a hypergeometric distribution with parameters N, n,
and r if its PMF is given by
M N M
nx x = 0,1, 2,..., n subject to the restrictions Note that if we write p = r / N , then we get
p ( x ) = P( X = x) = h(x : n, M, N) =
x
,
N x r, n x N r Nn
E( X) = np and Var ( X) = np(1 p)
n N 1
Indeed, any hypergeometric distribution tends to a binomial as N becomes large and
M / N , the proportion of successes in the population is held constant (at p).
Mathematically,
Example 3.9 M N M
n x n x
lim
x
= p (1 p ) .
nx
An important problem encountered by personnel directors and others faced with the N N
x
r
=p
selection the best in a finite set of applicants is illustrated by the following situation. N
n
From a group of twenty PhD engineers, ten are selected at random for employment. What (3.2)
is the probability that the ten selected include all the five best engineers in the group of
twenty?
15 16
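The hypergeometric p.m.f of Definition 3.8, and the answer to Example 3.9, can be verified numerically. A minimal sketch:

```python
from math import comb

# Hypergeometric p.m.f h(x; n, M, N) = C(M, x) C(N - M, n - x) / C(N, n),
# applied to Example 3.9: N = 20 engineers, M = 5 "best", n = 10 selected.
def h(x, n, M, N):
    return comb(M, x) * comb(N - M, n - x) / comb(N, n)

p5 = h(5, 10, 5, 20)
print(round(p5, 4))  # 0.0163  (= 21/1292)
```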
Example 3.10
Standard water pump components ordered for a water supply system all have the same specifications, but 2% are found to be defective. A consignment of 100 items has been received. For this consignment to be accepted, no more than one item in a lot of 10 items selected at random can be defective. Calculate the probability that the consignment will be accepted, using both the hypergeometric and binomial distributions.

Solution
Let a success here be a defective item. If X denotes the number of defective items in the sample of 10, then X has a hypergeometric distribution with parameters N = 100, n = 10 and M = .02(100) = 2. For the consignment to be accepted, we need

P(X ≤ 1) = [C(2, 0) C(98, 10) + C(2, 1) C(98, 9)] / C(100, 10) = 981/990 = .991.

If we now approximate X by a binomial distribution with parameters n = 10 and p = .02, we have

P(X ≤ 1) ≈ C(10, 0)(.02)^0(.98)^10 + C(10, 1)(.02)^1(.98)^9 = .984.

Note that the two probabilities are quite close.

3.7 NEGATIVE BINOMIAL AND GEOMETRIC DISTRIBUTIONS

The negative binomial random variable and distribution are based on an experiment satisfying the following conditions:
1. The experiment consists of a sequence of independent trials.
2. Each trial can result in either a success (S) or a failure (F).
3. The probability of success, p, is the same from trial to trial.
4. The experiment continues (trials are performed) until a total of r successes have been observed, where r is a specified positive integer.

The random variable of interest is Y = the number of the trial on which the r th success occurs; Y is called a negative binomial random variable because, in contrast to the binomial random variable, the number of successes is fixed and the number of trials is random.

Definition 3.9 A r.v. Y has a negative binomial distribution with parameters r and p if its p.m.f is given by

p(y) = P(Y = y) = C(y − 1, r − 1) p^r q^(y−r),  r = 1, 2, 3, ...;  y = r, r + 1, r + 2, ...

Example 3.11
Cotton linters used in the production of rocket propellant are subjected to a nitration process that enables the cotton fibres to go into solution. The process is 90% effective, in that the material produced can be shaped as desired in a later processing stage with probability .9. What is the probability that the third defective lot occurs on the 20th lot produced?

Solution
Here we view a defective lot as a success. Let Y denote the number of the trial on which the third success occurs. Then Y has a negative binomial distribution with parameters p = .1 and r = 3, and

P(Y = 20) = C(19, 2)(.1)^3(.9)^17 = .0285.
3.7.1 The Mean and Variance

If Y has a negative binomial distribution with parameters r and p, then

μ_Y = E(Y) = r/p,  σ²_Y = var(Y) = rq/p².

When r = 1, the resulting negative binomial distribution is called the geometric distribution. In a series of Bernoulli trials, the geometric r.v. is the number of the trial on which the first success occurs, i.e. it is the waiting time for the first success. The geometric r.v. is thus frequently used to model distributions of waiting times. For example, suppose that a commercial aircraft engine is serviced periodically so that its various parts are replaced at various points in time and hence are of varying ages. Then it might be reasonable to assume that the probability of engine malfunction, p, is the same during any 1-hour interval, so that the number of 1-hour intervals, Y, until the first malfunction has a geometric distribution.

Definition 3.10 A r.v. Y is said to have a geometric distribution with parameter p if its PMF is given by

p(y) = P(Y = y) = q^(y−1) p,  y = 1, 2, 3, ...

If Y has the above p.m.f, we write Y ~ Geom(p).

Note that if X ~ Geom(p), then μ_X = 1/p, σ²_X = q/p², and F_X(x) = 1 − q^x, x = 1, 2, 3, ...

Example 3.12
Suppose that the probability of engine malfunction during any 1-hour period is p = .02. Find the probability that a given engine will survive 2 hours.

Solution
Here we view an engine malfunction as a success. Let Y be the number of 1-hour intervals until the first malfunction. Then

Y ~ Geom(.02),  p(y) = (.98)^(y−1)(.02),  y = 1, 2, 3, ...,

and we need P(Y > 2). We have

P(Y > 2) = 1 − Σ_{y=1}^{2} p(y) = 1 − (.02) − (.98)(.02) = .9604.

3.8 POISSON DISTRIBUTION

The binomial, hypergeometric, and negative binomial distributions were all derived by starting with an experiment consisting of trials or draws and applying the laws of probability to various outcomes of the experiment. There is no simple experiment on which the Poisson distribution is based, though we shall describe shortly how it can be obtained.

Definition 3.11 A r.v. X is said to have a Poisson distribution with parameter λ if its p.m.f is given by

p(x; λ) = e^(−λ) λ^x / x!,  x = 0, 1, 2, ...,

for some λ > 0. If X has the above p.m.f, we write X ~ Po(λ). The value of λ is frequently a rate per unit time or per unit area.

If X ~ Po(λ), then

μ_X = E(X) = λ,  σ²_X = var(X) = λ.

Note also that if, in an interval of length t, X has a Poisson distribution with parameter λ, then in an interval kt, where k is a constant, X will be Poisson distributed with parameter kλ.

Example 3.13
Let X denote the number of flaws on the surface of a randomly selected boiler of a certain type. Suppose X has a Poisson distribution with λ = 5. Find the probability
(i) that a randomly selected boiler has exactly two flaws;
(ii) that a boiler contains at most two flaws.

Solution
(i) P(X = 2) = e^(−5)(5)²/2! = 0.084.
(ii) P(X ≤ 2) = Σ_{x=0}^{2} e^(−5) 5^x/x! = e^(−5)(1 + 5 + 25/2) = 0.125.

3.9 ACTIVITIES

1. A manufacturer claims that at most 10% of his product is defective. To test this claim, 18 units are inspected and his claim is accepted if, among these 18 units, at most 2 are defective. Find the probability that the manufacturer's claim will be accepted if the actual probability that a unit is defective is
(a) 0.05 (b) 0.2

2. It is possible for a computer to pick up an erroneous signal that does not show up as an error on the screen. The error is called a silent error. A particular terminal is defective, and when using the system word processor it introduces a silent paging error with probability .1. The word processor is used 20 times during a given week.
(a) Find the probability that no silent paging errors occur.
(b) Find the probability that at least one such error occurs.
(c) Would it be unusual for more than four such errors to occur? Explain, based on the probability involved.

3. An automobile service facility specialising in engine tune-ups knows that 25% of all tune-ups are done on four-cylinder automobiles, 40% on six-cylinder automobiles, and 35% on eight-cylinder automobiles. Let X = the number of cylinders on the next car to be tuned. What is the probability mass function of X?

4. Suppose that 20% of all drivers come to a complete stop at an intersection when no other cars are visible. What is the probability that, of 20 randomly chosen drivers coming to an intersection under these conditions,
(a) at most 5 will come to a complete stop?
(b) exactly 5 will come to a complete stop?
(c) at least 5 will come to a complete stop?
(d) How many of the next 20 drivers do you expect to come to a complete stop?

5. Twenty micro-processor chips are in stock. Three have etching errors that cannot be detected by the naked eye. Five chips are selected and installed in field equipment.
(a) Find the p.m.f of X, the number of chips that have etching errors.
(b) Find E(X) and Var(X).
(c) Find the probability that no chips with etching errors will be selected.
(d) Find the probability that at least one chip with an etching error will be selected.
6. The number of typing errors made by a particular typist has a Poisson distribution with an average of four errors per page. If more than four errors show on a given page, the typist must retype the whole page. What is the probability that a certain page does not have to be retyped?

7. The number of bacteria colonies of a certain type in samples of polluted water has a Poisson distribution with a mean of 2 per cubic centimetre.
(a) If four 1-cubic-centimetre samples are independently selected from this water, find the probability that at least one sample will contain one or more bacteria colonies.
(b) How many 1-cubic-centimetre samples should be selected in order to have a probability of approximately .95 of seeing at least one bacteria colony?

3.10 ANSWERS TO ACTIVITIES

1. 0.941
2. (a) 0.1216 (b) 0.8784 (c) It would be unusual, because P(Y > 4) = 0.0432, which is quite small.
3. p(4) = 0.25, p(6) = 0.40, p(8) = 0.35, p(x) = 0 for x ≠ 4, 6, 8.
4. (a) 0.804 (b) 0.174 (c) 0.370 (d) 4
5. (b) 0.75, 0.5033 (c) 0.3991 (d) 0.6009
6. 0.6288
7. (a) 0.9997 (b) 2
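Several of the numerical answers above can be spot-checked with the distributions derived in this Unit. A short sketch, using only the standard library:

```python
from math import comb, exp, factorial

# Activity 5(c): hypergeometric with N = 20 chips, M = 3 with etching
# errors, n = 5 selected; probability that none of the 3 is selected.
p_no_error = comb(3, 0) * comb(17, 5) / comb(20, 5)
print(round(p_no_error, 4))  # 0.3991

# Activity 6: errors per page ~ Po(4); no retype needed if at most 4 errors.
p_no_retype = sum(exp(-4) * 4**x / factorial(x) for x in range(5))
print(round(p_no_retype, 4))  # 0.6288

# Activity 7(a): colonies per cc ~ Po(2); four independent 1-cc samples,
# so "no colonies in any sample" has probability e^(-2*4).
p_at_least_one = 1 - exp(-8)
print(round(p_at_least_one, 4))  # 0.9997
```

All three match the answer key.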
3.11 SUMMARY

In this Unit, you have been introduced to the concept of a random variable and have studied several important discrete distributions. You should now be able to solve problems involving the discrete distributions covered in this Unit.