5. Continuous Random Variables


5.1

Continuous Random Variables


When a random variable is continuous, we use a probability density function f(x) to describe the
distribution. If f(x) is the probability density function of X, then the probability that X will be between x1
and x2 is given by the formula
$$P(x_1 \le X \le x_2) = \int_{x_1}^{x_2} f(x)\,dx$$

which, pictorially, is the area below the curve f(x) between x1 and x2. Figure 5.1.1 shows this area.

Area = P(x1 ≤ X ≤ x2)

Figure 5.1.1. Probability Density Function f(x)


Some restrictions on f(x) are natural. Because a probability cannot be negative, f(x) cannot be negative over
any interval; and because the total probability should be 1, the area under f(x) should be 1.
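The area interpretation can be checked numerically. The sketch below is only an illustration: the density `f` (a triangular density on [0, 2]) and the helper `area_under` are hypothetical choices that do not come from the text, and the integral is approximated with a simple midpoint rule.

```python
def f(x):
    """Hypothetical density (an assumption): triangular on [0, 2], peak at x = 1."""
    if 0.0 <= x <= 1.0:
        return x
    if 1.0 < x <= 2.0:
        return 2.0 - x
    return 0.0


def area_under(pdf, x1, x2, steps=100_000):
    """Approximate the integral of pdf from x1 to x2 with the midpoint rule."""
    width = (x2 - x1) / steps
    return sum(pdf(x1 + (i + 0.5) * width) for i in range(steps)) * width


total = area_under(f, 0.0, 2.0)   # total area under f, should be close to 1
prob = area_under(f, 0.5, 1.5)    # P(0.5 <= X <= 1.5), should be close to 0.75
print(total, prob)
```

Both restrictions can be seen here: `f` never returns a negative value, and the total area comes out at essentially 1.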
5.2

Uniform Distribution
The uniform distribution is the simplest of continuous distributions. The probability density
function is
$$f(x) = \begin{cases} \dfrac{1}{b - a} & \text{for } a \le x \le b \\ 0 & \text{for all other } x \end{cases}$$
where a is the minimum possible value and b is the maximum possible value of X. The graph of f(x) is
shown in Figure 5.2.1.
Figure 5.2.1. Uniform Distribution (f(x) is flat at height 1/(b − a) between a and b, and 0 elsewhere)

Because the curve of f(x) is a flat line, the area under it between any two points x1 and x2 (with a ≤ x1 < x2 ≤ b) will be a rectangle
with height 1/(b − a) and width (x2 − x1). Thus P(x1 ≤ X ≤ x2) = (x2 − x1)/(b − a). If X is uniformly
distributed between a and b, we shall write X ~ U(a, b).
The mean of the distribution is the midpoint between a and b, which is (a + b)/2. Using
integration, it can be shown that the variance is (b − a)²/12. The skewness is zero (why?). The distribution
is flat, and has an absolute kurtosis of 1.8. The formulas for the uniform distribution are summarized in Box
5.2.1. Because the probability calculation is simple, there is no special spreadsheet function for the uniform
distribution.
A common instance of uniform distribution is waiting time for a facility that goes in cycles. A
shuttle bus or an elevator are good examples. They move, roughly, in cycles with some cycle time. If the
user comes to a stop at a random time and waits till the facility arrives, the waiting time will be uniformly
distributed between a minimum of zero and a maximum equal to the cycle time. In other words, if a shuttle
bus has a cycle time of 20 minutes, the waiting time would be uniformly distributed between 0 and 20
minutes.
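The shuttle example can be explored with a small simulation. This is a sketch under stated assumptions: the function name `simulate_waits`, the number of riders, and the random seed are all hypothetical, and the shuttle is idealized as departing exactly once per cycle.

```python
import random


def simulate_waits(cycle_time, n_riders, seed=0):
    """Simulate waiting times for riders who arrive at random moments in a cycle."""
    rng = random.Random(seed)
    waits = []
    for _ in range(n_riders):
        # Time elapsed since the last departure when the rider shows up.
        elapsed = rng.uniform(0.0, cycle_time)
        # The rider waits for the remainder of the cycle.
        waits.append(cycle_time - elapsed)
    return waits


waits = simulate_waits(cycle_time=20.0, n_riders=100_000, seed=1)
mean_wait = sum(waits) / len(waits)
var_wait = sum((w - mean_wait) ** 2 for w in waits) / len(waits)
print(mean_wait, var_wait)
```

With a 20-minute cycle the simulated mean should land near 10 minutes and the variance near 20²/12 ≈ 33.3, in line with a U(0, 20) waiting time.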

Occasionally, customer service times, such as the time taken by a barber to give a haircut, are
found to be uniformly distributed.
Box 5.2.1. Uniform Distribution Formulas
If X ~ U(a, b), then
f(x) = 1/(b − a)    for a ≤ x ≤ b
     = 0            for all other x
P(x1 ≤ X ≤ x2) = (x2 − x1)/(b − a)    for a ≤ x1 < x2 ≤ b
E[X] = (a + b)/2
V(X) = (b − a)²/12
Example: If a = 10 and b = 20, then
P(12 ≤ X ≤ 18) = (18 − 12)/(20 − 10) = 0.6
E[X] = (10 + 20)/2 = 15
V(X) = (20 − 10)²/12 = 8.3333
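For readers working outside a spreadsheet, the uniform distribution formulas can be written as a few Python functions. The function names are illustrative assumptions; the numbers reproduce the example with a = 10 and b = 20.

```python
def uniform_pdf(x, a, b):
    """Density of U(a, b): 1/(b - a) inside [a, b], 0 elsewhere."""
    return 1.0 / (b - a) if a <= x <= b else 0.0


def uniform_prob(x1, x2, a, b):
    """P(x1 <= X <= x2) for X ~ U(a, b); the interval is clipped to [a, b]."""
    lo, hi = max(x1, a), min(x2, b)
    return max(hi - lo, 0.0) / (b - a)


def uniform_mean(a, b):
    return (a + b) / 2


def uniform_variance(a, b):
    return (b - a) ** 2 / 12


a, b = 10, 20
print(uniform_prob(12, 18, a, b))          # 0.6
print(uniform_mean(a, b))                  # 15.0
print(round(uniform_variance(a, b), 4))    # 8.3333
```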

To solve problems, it is convenient to see many probabilities at the same time, for which a
spreadsheet template works well. Figure 5.2.2 shows the template to be used for the uniform distribution. If X
is uniformly distributed between 10 and 20, what is P(12 ≤ X ≤ 18)? In the template, make sure the Min
and Max are set to 10 and 20 in cells B4 and C4. Enter 12 and 18 in cells E7 and F7. The answer of 0.6
appears in cell G7. If the probability of X being at least or at most x is needed, columns A through C may be used.
Inverse calculations are possible in columns I through M, where for a given probability of at least or at
most x, the corresponding x value is calculated. As usual, one may also use facilities such as the Goal
Seek command or the Solver tool in conjunction with this template.
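The inverse calculations mentioned above have a closed form for the uniform distribution, so no search is needed. The following is a minimal sketch; the function names are hypothetical and are not taken from the template.

```python
def x_for_at_most(p, a, b):
    """Return x such that P(X <= x) = p for X ~ U(a, b)."""
    return a + p * (b - a)


def x_for_at_least(p, a, b):
    """Return x such that P(X >= x) = p for X ~ U(a, b)."""
    return b - p * (b - a)


a, b = 10, 20
print(x_for_at_most(0.9, a, b))    # about 19: 90% of the area lies to the left
print(x_for_at_least(0.9, a, b))   # about 11: 90% of the area lies to the right
```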

Figure 5.2.2. Uniform Distribution Template


[Workbook: Continuous Distributions; Sheet: Uniform]

26
5.3

Normal Distribution

5.3.1

Introduction
The normal distribution is the most common continuous distribution. If the value of a random
variable is affected by many independent causes, and the effect of each cause is small, then the random
variable will approximately follow a normal distribution. The length of a pin made by an automatic machine, the time
taken by an assembly worker to complete an assembly, the weight of a baseball, the tensile strength of a
bolt, and the volume of cola in a can are all examples of normally distributed random variables. Much of the material in later
chapters is based on the normal distribution.
For a normal distribution, the probability density function is

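$$f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^{2}}$$

where μ is the mean and σ is the standard deviation.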

A plot of the normal distribution with mean 100 and standard deviation 2 is shown in Figure 5.3.1.
Figure 5.3.1. Normal Distribution (mean 100, standard deviation 2)


As seen in the figure, the normal distribution is symmetric with a (relative) kurtosis of 0, which means it is
neither too peaked nor too flat.
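The shape of the plotted curve can be reproduced by evaluating the standard normal density formula directly. This is a minimal sketch; the function name `normal_pdf` is an illustrative choice, while the mean of 100 and standard deviation of 2 are the values used in the plot.

```python
import math


def normal_pdf(x, mu, sigma):
    """Normal density: exp(-z^2 / 2) / (sigma * sqrt(2*pi)), with z = (x - mu) / sigma."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


mu, sigma = 100.0, 2.0
peak = normal_pdf(mu, mu, sigma)
print(round(peak, 4))   # 0.1995, the height of the curve at the mean

# Symmetry: points equally far from the mean have the same density.
print(normal_pdf(98.0, mu, sigma) == normal_pdf(102.0, mu, sigma))   # True
```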
The normal distribution template is shown in Figure 5.3.2. As usual, it can be used in conjunction
with the Goal Seek command and the Solver tool to solve complicated problems. To use the template,
make sure the right values are entered for Mean and Standard Deviation in cells A4 and B4. Cells A8 and
B8 have been filled with x1 and x2 values so that cell C8 gives the probability P(x1 ≤ X ≤ x2). Cell E8
contains 102 and therefore we get in cell F8 the area to the left of 102, and in cell G8 the area to the right of
102. Because cell I8 contains 0.9, we get in cell J8 the value of x to the left of which there will be 90% of the
area. Because cell L8 contains 0.9, we get in cell M8 the value of x to the right of which there will be 90% of the
area.
In later chapters, confidence intervals will be needed. A 2-tailed confidence interval [x1, x2] is
symmetric about the mean and contains an area equal to the confidence level denoted by (1 − α). In the
range E3:G5, we see 99%, 95% and 90% 2-tailed confidence intervals.
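The same kinds of calculations can be sketched in Python with the standard library class `statistics.NormalDist`. The mean of 100 and standard deviation of 2 are borrowed from Figure 5.3.1, and the interval endpoints 98 and 102 are assumptions, since the actual template entries are not stated in the text. The helper `two_tailed_interval` is a hypothetical name.

```python
from statistics import NormalDist

dist = NormalDist(mu=100, sigma=2)        # assumed parameters, see lead-in

p_between = dist.cdf(102) - dist.cdf(98)  # P(98 <= X <= 102), assumed x1 and x2
area_left = dist.cdf(102)                 # area to the left of 102
area_right = 1 - area_left                # area to the right of 102
x_left_90 = dist.inv_cdf(0.9)             # x with 90% of the area to its left
x_right_90 = dist.inv_cdf(0.1)            # x with 90% of the area to its right


def two_tailed_interval(d, confidence):
    """Interval symmetric about the mean that contains the given area."""
    alpha = 1 - confidence
    return d.inv_cdf(alpha / 2), d.inv_cdf(1 - alpha / 2)


for level in (0.99, 0.95, 0.90):
    low, high = two_tailed_interval(dist, level)
    print(level, round(low, 3), round(high, 3))
```

Each interval is centered on the mean, and a higher confidence level gives a wider interval.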


Figure 5.3.2. Normal Distribution Template


[Workbook: Continuous Distributions; Sheet: Normal]
5.4

Normal Approximation of Binomial Distribution


When the number of trials n in a binomial distribution is large (say, > 1000), the calculation of
binomial probabilities becomes difficult. Fortunately, the distribution approaches the normal distribution as
n increases and therefore we can approximate it by a normal distribution. The template to be used for this
case is shown in Figure 5.4.1.
The n and p parameters of the binomial distribution are input in cells A4 and B4 and they have
been named n and p. The mean and the standard deviation of the corresponding normal distribution are
calculated in cells E4 and F4.

Figure 5.4.1. Normal Approximation of Binomial Distribution


[Workbook: Continuous Distributions; Sheet: Normal approx. of Binomial]
Whenever a binomial distribution is approximated by a normal distribution, a continuity correction
is required because the binomial is discrete and the normal is continuous. Thus, a column in the histogram of a
binomial distribution for, say, X = 10 covers, in the continuous sense, the interval [9.5, 10.5]. Therefore,
when we calculate the binomial probability of an interval, say, P(195 ≤ X ≤ 255), we should subtract 0.5 on
the left and add 0.5 on the right to get the corresponding normal probability, namely, P(194.5 < X < 255.5).
Adding and subtracting 0.5 in this manner is known as continuity correction. In Figure 5.4.1, this
correction has been applied as seen in cells A7 and B7. Cell C7 has the binomial probability of P(195 ≤ X ≤
255).
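The effect of the continuity correction can be checked by comparing the approximation with the exact binomial sum. In the sketch below, n = 1000 and p = 0.2 are assumed values chosen only for illustration (the template's actual inputs are not given in the text), and the function names are hypothetical. The normal distribution uses mean np and standard deviation sqrt(np(1 − p)).

```python
import math
from statistics import NormalDist


def binomial_pmf(k, n, p):
    """Exact binomial probability P(X = k), computed via logs to avoid overflow."""
    log_pmf = (math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
               + k * math.log(p) + (n - k) * math.log(1 - p))
    return math.exp(log_pmf)


def exact_binomial(lo, hi, n, p):
    """Exact P(lo <= X <= hi) for X ~ Binomial(n, p)."""
    return sum(binomial_pmf(k, n, p) for k in range(lo, hi + 1))


def normal_approx(lo, hi, n, p):
    """Normal approximation of P(lo <= X <= hi) with continuity correction."""
    dist = NormalDist(mu=n * p, sigma=math.sqrt(n * p * (1 - p)))
    return dist.cdf(hi + 0.5) - dist.cdf(lo - 0.5)


n, p = 1000, 0.2   # assumed values, not from the text
exact = exact_binomial(195, 255, n, p)
approx = normal_approx(195, 255, n, p)
print(exact, approx)
```

For these assumed inputs the two results should agree to within about 0.01.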


5.5

The Exponential Distribution

Suppose an event occurs with an average frequency of λ occurrences per hour and this average
frequency is constant in that the probability that the event will occur during any tiny duration Δt is λΔt.
Suppose further that we arrive at the scene at any given time and wait till the event occurs. The waiting time
will then follow an exponential distribution. This distribution is the continuous limit of the geometric
distribution. Suppose our waiting time was x. For the event (or success) to occur at time x, every tiny
duration Δt from time 0 to time x should be a failure and the interval x to x + Δt must be a success. This is
nothing but a geometric distribution. To get the continuous version, we take the limit of this process as Δt
approaches zero.
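The limiting argument can be illustrated numerically. Writing the event frequency as `lam` and the length of each tiny duration as `dt`, the probability of seeing only failures up to time x is (1 − lam·dt) raised to the number of slices, and this should approach exp(−lam·x) as `dt` shrinks. The values lam = 1.2 and x = 2 below are arbitrary assumptions for the demonstration.

```python
import math


def prob_no_event_discrete(lam, x, dt):
    """P(no event in [0, x]) when each slice of length dt fails with prob. 1 - lam*dt."""
    n_slices = round(x / dt)
    return (1.0 - lam * dt) ** n_slices


lam, x = 1.2, 2.0   # assumed values for illustration
for dt in (0.1, 0.01, 0.001, 0.0001):
    print(dt, prob_no_event_discrete(lam, x, dt))
print("limit", math.exp(-lam * x))
```

The printed values move steadily toward the exponential limit as `dt` gets smaller.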
The exponential distribution is fairly common in practice. The time between two successive
breakdowns of a machine will be exponentially distributed, and is relevant to maintenance engineers. The mean
in this case is known as the Mean Time Between Failures or MTBF. The life of a product that fails by
accident rather than by wear and tear also follows an exponential distribution and is relevant to warranty
policies. When X is exponentially distributed with frequency λ, we shall write X ~ E(λ).
The probability density function f(x) of the exponential distribution has the form

$$f(x) = \lambda e^{-\lambda x}, \quad x \ge 0$$

where λ is the frequency with which the event occurs. The frequency is expressed as so many times per
unit time, such as 1.2 times per month. The mean of the distribution is 1/λ and the variance is (1/λ)². Just
like the geometric distribution, the exponential distribution is positively skewed. The template for this
distribution is shown in Figure 5.5.1. Box 5.5.1 summarizes the formulas and provides example
calculations.
Box 5.5.1. Exponential Distribution Formulas
f(x) = λe^(−λx)    for x ≥ 0
=EXPONDIST(x,λ,FALSE)
P(X ≤ x) = 1 − e^(−λx)    for x ≥ 0
=EXPONDIST(x,λ,TRUE)
P(X ≥ x) = e^(−λx)    for x ≥ 0
P(x1 ≤ X ≤ x2) = e^(−λx1) − e^(−λx2)    for 0 ≤ x1 < x2
E[X] = 1/λ
V(X) = 1/λ²
Example: If λ = 1.2, then
P(X ≤ 0.5) = 1 − e^(−0.5×1.2) = 0.4512
=EXPONDIST(0.5,1.2,TRUE)
P(X
P(1 ≤ X ≤ 2) = e^(−1×1.2) − e^(−2×1.2) = 0.2105
E[X] = 1/1.2 = 0.8333
V(X) = 1/1.2² = 0.6944
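The exponential distribution formulas translate directly into Python. This is a minimal sketch with illustrative function names; `expon_cdf(x, lam)` plays the role of the spreadsheet call `EXPONDIST(x, lam, TRUE)`, and the frequency of 1.2 is the one used in the example.

```python
import math


def expon_pdf(x, lam):
    """Density lam * exp(-lam * x) for x >= 0, and 0 for negative x."""
    return lam * math.exp(-lam * x) if x >= 0 else 0.0


def expon_cdf(x, lam):
    """P(X <= x) = 1 - exp(-lam * x) for x >= 0."""
    return 1.0 - math.exp(-lam * x) if x >= 0 else 0.0


def expon_between(x1, x2, lam):
    """P(x1 <= X <= x2) = exp(-lam * x1) - exp(-lam * x2) for 0 <= x1 < x2."""
    return math.exp(-lam * x1) - math.exp(-lam * x2)


lam = 1.2
print(round(expon_cdf(0.5, lam), 4))         # 0.4512
print(round(expon_between(1, 2, lam), 4))    # 0.2105
print(round(1 / lam, 4))                     # mean, 0.8333
print(round(1 / lam ** 2, 4))                # variance, 0.6944
```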

In order to use the exponential distribution template shown in Figure 5.5.1, the value of λ must be
entered in cell A4. At times, the mean rather than λ may be known, in which case the reciprocal of the mean is what
should be entered in cell A4. The green cells are input cells and the rest are protected. As usual, the Goal
Seek command and the Solver tool can be used in conjunction with this template to solve problems.


Figure 5.5.1. Exponential Distribution Template


[Workbook: Continuous Distributions; Sheet: Exponential]

5.6

Exercises

1. A student takes the campus shuttle bus to reach the classroom building. The shuttle bus arrives at his
stop every 15 minutes but the actual arrival time at the stop is random. The student allows 10 minutes
waiting time for the shuttle in his plan to make it in time to the class.
a. What is the expected waiting time? What is the variance?
b. What is the probability that the wait will be between 4 and 6 minutes?
c. What is the probability that the student will be in time for the class?
d. If he wants to be 95% confident of making it in time for the class, how much time should he
allow for waiting for the shuttle?
2. A hydraulic press breaks down at the rate of 0.1742 times per day.
a. What is the mean time between failures?
b. On a given day, what is the probability that it will break down?
c. If four days have passed without a breakdown, what is the probability that it will break down
on the fifth day?
d. What is the probability that five consecutive days will pass without any breakdown?
3. Laptop computers produced by a company have average life of 38.36 months. Assume that the life of a
computer is exponentially distributed (which is a good assumption).
a. What is the probability that a computer will fail within 12 months?
b. If the company gives a warranty period of 12 months, what proportion of computers will fail
during the warranty period?
c. Based on the answer to part b., would you say the company can afford to give a warranty
period of 12 months?
d. If the company wants not more than 5% of the computers to fail during the warranty period,
what should be the warranty period?
e. If the company wants to give a warranty period of 3 months and still wants not more than 5%
of the computers to fail during the warranty period, what should be the minimum average life
of the computers?

4. The GMAT scores of students who are potential applicants to a university are normally distributed with a
mean of 487 and a standard deviation of 98.
i. What percentage of students will have scores exceeding 500?
ii. What percentage of students will have scores between 600 and 700?
iii. If the university wants only the top 75% of the students to be eligible to apply, what should be
the minimum GMAT score specified for eligibility?
iv. Find the narrowest interval that will contain 75% of the students' scores.
v. Find x such that the interval [x, 2x] will contain 75% of the students' scores.
5. The profit (or loss) from an investment is normally distributed with a mean of $11,200 and a standard
deviation of $8,250.
i. What is the probability that there will be a loss rather than a profit?
ii. What is the probability that the profit will be between $10,000 and $20,000?
iii. Find x such that the probability that the profit will exceed x is 25%.
iv. If the loss exceeds $10,000 the company will be ruined. What is the probability of ruin?
v. Comment on the risk in the investment.

5.7

Projects

1. An automatic lathe produces pins whose lengths are normally distributed with a mean of 1.012 and a
standard deviation of 0.018. The customer will buy only those pins with lengths in the interval [0.98,
1.02].
i. What percentage of the pins will be acceptable to the consumer?
ii. If the lathe can be adjusted to set the mean of the lengths to any desired value, what
should it be adjusted to?
iii. Suppose the mean cannot be adjusted, but the standard deviation can be reduced, and it
costs more and more to reduce the standard deviation further and further. What
maximum value of the standard deviation would make 90% of the pins acceptable to the
consumer? (Assume the mean to be 1.012.)
iv. Repeat question iii. with 95% and 99% of the pins acceptable.
v. In practice, which one do you think is easier to adjust, the mean or the standard
deviation? Why?
vi. Assume it costs $150x² to decrease the standard deviation by (x/1000). Find the cost of
reducing the standard deviation to the values found in questions iii. and iv.
vii. Now assume that the mean has been adjusted to the best value found in question ii. at a
cost of $80. Calculate the reduction in standard deviation necessary to have 90%, 95%
and 99% of the parts acceptable. Calculate the respective costs, per question vi.
viii. Based on your answers to questions vi. and vii., what are your recommended mean and
standard deviation?
