
Full-Time MBA: Business Statistics (BST510)

Continuous Probability
Distributions

Section 4b

Paul Bottomley (Room F03)


Reading: Silver pp.150-154, 163-180.

Discrete vs. Continuous Probability Distributions
A series of coin-flipping experiments focuses on the number of heads.
[Figure: three bar charts of probability against number of heads, for 5, 10 and 20 coins.]
As the number of coins (trials) increases, the charts become less like a staircase and more like a smooth curve.
Ultimately, we have a probability density function for a continuous random variable.
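The staircase-to-curve effect can be checked numerically. The following is a minimal Python sketch, assuming fair coins (p = 0.5); the helper name `binomial_pmf` is an illustrative choice and not part of the course material.

```python
from math import comb

def binomial_pmf(n, p=0.5):
    """Return [P(X = 0), ..., P(X = n)] for n independent trials."""
    return [comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1)]

# More coins means more, lower bars: the chart gets closer to a smooth curve.
for n in (5, 10, 20):
    pmf = binomial_pmf(n)
    print(f"n = {n:2d}: {len(pmf)} bars, tallest bar = {max(pmf):.3f}")
```

The tallest bar shrinks as n grows because the same total probability of 1 is spread over more possible values.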
The Normal Distribution
[Figure: bell-shaped Normal curve N(μ, σ) plotted against X.]
Sometimes called the Gaussian distribution or Normal curve.
It describes many real-life situations:
Height, weight, intelligence of people.
Lifetime of car tyres, speed of machine operators, and production output.
There are many Normal distributions, each with a different mean (μ) and a different standard deviation (σ).
Notation: N(μ, σ); for instance, N(10, 5).
Properties of the Normal Distribution
[Figure: Normal curve N(μ, σ) plotted against X.]
$$ f(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left[ -\frac{1}{2} \left( \frac{x - \mu}{\sigma} \right)^{2} \right] $$
It is a continuous probability distribution (density function).
On the horizontal axis, any value in the interval of X can occur. In contrast, the Binomial is a discrete distribution with only integer values.
The mean (μ) determines the position of the distribution.
The larger the standard deviation (σ), the wider / flatter the curve.
With information on the mean and SD, we have everything we need to know about a specific Normally distributed variable.
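As a rough check that the mean and SD really are all we need, the density can be evaluated directly. This sketch uses the N(10, 5) example from the notation slide; the function name `normal_pdf` is hypothetical, and the comparison with Python's standard-library `statistics.NormalDist` is only there as a sanity check.

```python
from math import exp, pi, sqrt
from statistics import NormalDist

def normal_pdf(x, mu, sigma):
    """Height of the Normal curve N(mu, sigma) at x (a density, not a probability)."""
    return exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * sqrt(2 * pi))

mu, sigma = 10, 5
height_formula = normal_pdf(10, mu, sigma)       # height at the mean
height_library = NormalDist(mu, sigma).pdf(10)   # same height from the library
print(round(height_formula, 4), round(height_library, 4))
```

Doubling sigma halves the height at the mean, which is the "wider / flatter" behaviour described above.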
Applying the Normal Distribution
In practice, knowledge of the Normal curve is used for:
1. Direct probability calculations.
2. Approximating discrete probability distributions, such as the Binomial.
3. As a description of random error (used in statistical inference; see hypothesis testing and confidence intervals later).

Standard Normal Distribution (Z)
The Standard Normal has a mean of 0 and a standard deviation of 1.
N(0,1) is denoted as Z.
To determine a probability, we work out the area under the curve f(x) corresponding to a given interval.
We can't take the height of the curve as the probability because X is a continuous variable with infinitely many possible values.
The probability of any single exact value is zero: it is an infinitely thin vertical slice of the graph.
Example: P(X = 100); why not 99.99 or 100.001?
But the area (probability) in each half of the Normal distribution is 0.50, because 50% of values lie above the mean and 50% lie below it.
Standard Normal Tables:
A Simple Introduction
For more information on the Standard Normal, we use tables.
From the table, we can read off the probability (area) and so
determine how unusual a particular value of Z is.
Z scores can be interpreted as how many standard deviations
a value is above or below the mean.

Question: How unusual is a value of Z:


1. 2.0 standard deviations or more above the mean?
2. Within +/- 0.5 standard deviations of the mean?
3. Between 0.5 and 1.5 standard deviations from the mean?

Reading the Standard Normal Table
[Figure: Standard Normal curve N(0,1) with Z = 2 marked.]
Our tables report the area in the distribution's tail, that is, the area to the right of the Z value. This area is denoted by alpha (α).
Part 1: Z = 2 corresponds to an area of 0.0228.
There is a 2.28% chance of getting a Z value equal to or greater than 2, a relatively rare event.
Recall the Empirical Rule? (Check Z = +/- 1.96.)
Reading the Standard Normal Tables (Z)
[Figure: two Standard Normal curves N(0,1), one marking Z = -0.5 to Z = +0.5, the other marking Z = +0.5 to Z = +1.5.]
Part 2: Z = 0.5 corresponds to an area of 0.3085.
Subtracting this from 0.5 gives the area between Z = 0 and Z = +0.5 (0.5 - 0.3085 = 0.1915); by symmetry, multiply by 2 to get 0.3830.
So 38% of values lie within half a standard deviation of the mean.
Part 3: Z between +0.5 and +1.5.
Subtract: 0.3085 - 0.0668 = 0.2417, the area from Z = 0.5 to Z = 1.5.
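The three table look-ups (Parts 1 to 3) can be reproduced in software. This is a sketch using Python's standard-library `statistics.NormalDist`; the helper `upper_tail` is a hypothetical name chosen to mimic the right-tail layout of the printed table.

```python
from statistics import NormalDist

Z = NormalDist(mu=0, sigma=1)  # the Standard Normal

def upper_tail(z):
    """Area to the right of z, as reported in the printed table."""
    return 1 - Z.cdf(z)

part1 = upper_tail(2.0)                     # 2 or more SDs above the mean
part2 = 2 * (0.5 - upper_tail(0.5))         # within +/- 0.5 SDs of the mean
part3 = upper_tail(0.5) - upper_tail(1.5)   # between +0.5 and +1.5 SDs

print(f"{part1:.4f} {part2:.4f} {part3:.4f}")
```

Small differences from the table values in the fourth decimal place come from the table's rounding.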
Mapping Onto the Standard Normal Distribution
Why all this fuss about the Standard Normal? When will we find a problem that can be described by N(0,1)?
In practice, there are many Normal distributions, each with a different mean and a different standard deviation.
So, won't each problem need a different table of values and associated probabilities? Yes.
But we can convert values of X into values of Z, find out how many standard deviations each value is from its mean, and then find the probability from the Standard Normal Tables.
Mapping Onto the Standard Normal Distribution
Example: Let us assume that we have a random variable (X) that is Normally distributed with a mean of 20 and a standard deviation of 5. We can convert values of X into values of Z.
Questions:
1. Whereabouts in the distribution is a value of 10 located?
2. How many standard deviations is 10 away from the mean?
3. What is the Z value corresponding to a value of 22?
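Mapping Any Normal Distribution Onto the Standard Normal Distribution (Z)
$$ Z = \frac{X - \mu}{\sigma} = \frac{10 - 20}{5} = -2 $$
[Figure: N(20,5) curve with X = 10 and the mean 20 marked, mapped onto N(0,1) with Z = ? and 0 marked.]
Parameters: N(0,1) has μ = 0 and σ = 1; N(20,5) has μ = 20 and σ = 5.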
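Mapping Any Normal Distribution Onto the Standard Normal Distribution (Z)
$$ Z = \frac{X - \mu}{\sigma} = \frac{22 - 20}{5} = 0.4 $$
[Figure: N(20,5) curve with the mean 20 and X = 22 marked, mapped onto N(0,1) with 0 and Z = ? marked.]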
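The conversion from X to Z is a one-line calculation. The sketch below applies it to the two values from the N(20, 5) example; the function name `z_score` is an illustrative choice.

```python
def z_score(x, mu, sigma):
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    return (x - mu) / sigma

z_for_10 = z_score(10, mu=20, sigma=5)   # 2 SDs below the mean
z_for_22 = z_score(22, mu=20, sigma=5)   # 0.4 SDs above the mean
print(z_for_10, z_for_22)
```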

Motor Vehicle Recovery Problem
A motor vehicle breakdown company's customer survey reveals that the average time taken to reach a vehicle is 50 minutes, with a standard deviation of 15 minutes. What proportion of callouts will take 70 minutes or longer to reach the vehicle?
We must assume times are (1) Normally distributed and (2) continuously measured. Next, convert the X value into its equivalent Z value.
$$ Z = \frac{70 - 50}{15} = 1.33 $$
P(X ≥ 70) = P(Z ≥ 1.33) = 0.0918
[Figure: N(50,15) curve with X = 50 and X = 70 marked.]
Is this quality of service adequate?
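For comparison with the table-based answer, here is a sketch of the same calculation using Python's `statistics.NormalDist`. The variable names are illustrative. Rounding Z to two decimals, as the table requires, shifts the answer slightly from the unrounded one.

```python
from statistics import NormalDist

callout_time = NormalDist(mu=50, sigma=15)   # minutes, assumed Normal

# Unrounded answer: area to the right of 70 minutes.
p_exact = 1 - callout_time.cdf(70)

# Table-style answer: round Z to two decimals first, then look it up.
z_rounded = round((70 - 50) / 15, 2)         # 1.33
p_table = 1 - NormalDist().cdf(z_rounded)

print(f"table-style: {p_table:.4f}, unrounded: {p_exact:.4f}")
```

Either way, roughly 9% of callouts take 70 minutes or longer.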


Mobile Phone Manufacturing
A firm makes plastic casings for mobile phones. In a large sample, casing depths are found to be Normally distributed with a mean of 18 mm and a standard deviation of 0.8 mm.
1. If casings with a depth of 17 mm or less are too shallow to hold the phone's components, what % of casings are currently rejected?
2. The firm wishes to reduce wastage to 5% by adjusting the mean depth of the manufacturing process while maintaining its standard deviation (0.8 mm). What should the new mean depth be set to?
3. If the extra plastic needed to press the casings costs 15.00 per thousand, and the cost of wasting each casing is 0.75, what is the financial implication of this change to the firm?
Mobile Phone Manufacturing Pt.1
Q1: What proportion of casings are currently rejected?
[Figure: N(18,0.8) curve with the wasted casings (10.6%) at or below X = 17, mapped onto N(0,1) with Z = -1.25 marked.]
$$ Z = \frac{17 - 18}{0.8} = -1.25 $$
P(X ≤ 17) = P(Z ≤ -1.25) = 0.1057
This is 1.25 standard deviations below the mean. The table gives only areas to the right, but because of symmetry we can use Z = +1.25.
Currently 10.6% of casings are too shallow to hold the mobile phone components: 106 per 1000 produced are wasted.
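The rejection rate can be checked in software. This is a minimal sketch using Python's `statistics.NormalDist`; the variable names are illustrative and the Normality of casing depth is assumed, as in the slide.

```python
from statistics import NormalDist

depth = NormalDist(mu=18, sigma=0.8)   # casing depth in mm

z = (17 - 18) / 0.8                    # -1.25 SDs from the mean
p_reject = depth.cdf(17)               # area at or below 17 mm
wasted_per_1000 = round(1000 * p_reject)

print(f"Z = {z}, rejected = {p_reject:.4f}, per 1000 = {wasted_per_1000}")
```

Because the library works with the left-hand area directly, no symmetry argument is needed here.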
Mobile Phone Manufacturing Pt. 2
To reduce wastage from 10.6% to 5%, we must increase the depth of the typical casing, i.e. shift the Normal curve to the right.
Work backwards: specify the probability, then find the mean.
From the Table, 5% of the area lies 1.64 standard deviations from the mean (Z = -1.64 approx.).
[Figure: N(?,0.8) curve with the wasted casings at or below X = 17 and the unknown mean μ = ? marked.]
Rearrange and solve for μ:
$$ \frac{17 - \mu}{0.8} = -1.64 \quad\Rightarrow\quad \mu = 17 + (1.64 \times 0.8) = 18.31 $$
Solution: increase the mean depth of the manufacturing process from 18.00 mm to 18.31 mm to reduce wastage from 10.6% to 5%.
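Working backwards from a probability to a Z value is the inverse of a table look-up. The sketch below assumes Python's `statistics.NormalDist`, whose `inv_cdf` method returns the Z value for a given left-hand area; the variable names are illustrative.

```python
from statistics import NormalDist

limit, sigma, target_waste = 17, 0.8, 0.05

# Table-style answer, using Z = -1.64 as read from the printed table.
mean_table = limit + 1.64 * sigma

# Unrounded answer: let the library find the Z value with 5% to its left.
z_5pct = NormalDist().inv_cdf(target_waste)   # about -1.645
mean_exact = limit - z_5pct * sigma

print(f"table-style: {mean_table:.3f} mm, unrounded: {mean_exact:.3f} mm")
```

Both answers round to about 18.3 mm; the small gap comes from using 1.64 rather than the more precise 1.645.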
Mobile Phone Manufacturing Pt.3
If the extra plastic required to press the casings will cost the firm 15.00 per thousand, and the cost of wasting each casing is 0.75, what is the financial implication of this change?
Extra plastic: costs rise by 15.00 per thousand.
Wastage falls from 10.6% to 5%.
We throw away 50 instead of 106 casings for every 1000 made.
If the cost of wastage is 0.75 per casing, we save (56 x 0.75) = 42.00 per thousand.
Net gain = 42.00 - 15.00 = 27.00 per thousand casings.
The saving from reduced wastage more than offsets the cost of the extra plastic.
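The cost comparison is simple arithmetic, laid out here as a short sketch per 1000 casings. All variable names are illustrative, and the currency unit is left unspecified, as in the figures above.

```python
# Figures per 1000 casings, taken from the slides.
wasted_before = 106          # casings wasted at a mean depth of 18.00 mm
wasted_after = 50            # casings wasted at a mean depth of 18.31 mm
cost_per_wasted = 0.75       # cost of wasting one casing
extra_plastic_cost = 15.00   # extra plastic per 1000 casings

saving = (wasted_before - wasted_after) * cost_per_wasted
net_gain = saving - extra_plastic_cost

print(f"saving = {saving:.2f}, net gain = {net_gain:.2f} per 1000 casings")
```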

The Normal Approximation to the Binomial Distribution
Useful for determining the chance that X takes on a range of values when n is large and the probability of success (p) is near 0.5.
Imagine the Binomial calculations when flipping 50 coins.
To calculate P(20 heads) requires large factorials:
$$ P(X = 20) = \frac{50!}{(50 - 20)!\,20!}\,(0.5)^{20}\,(0.5)^{30} = \,? $$
To calculate cumulative probabilities is even more difficult:
P(fewer than 5 heads) = P(0 heads) + P(1 head) + ... + P(4 heads).
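By hand these factorials are tedious, but software handles them directly. The sketch below evaluates both Binomial quantities for 50 fair coins with Python's `math.comb`; the helper name `binom_prob` is hypothetical.

```python
from math import comb

n, p = 50, 0.5

def binom_prob(k):
    """Exact Binomial probability of k heads in n flips."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

p_20_heads = binom_prob(20)
# Cumulative probability: add up the terms for 0, 1, 2, 3 and 4 heads.
p_fewer_than_5 = sum(binom_prob(k) for k in range(5))

print(f"P(20 heads) = {p_20_heads:.4f}")
print(f"P(fewer than 5 heads) = {p_fewer_than_5:.3e}")
```

The Normal approximation remains useful for quick table-based work and for building intuition about where the probability mass sits.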

Some Bell-Shaped Binomial Distributions
[Figure: two Binomial bar charts, one for n = 5, p = 0.3 and one for n = 25, p = 0.3.]
When n is large and p is near 0.5, the Binomial distribution looks like the Normal distribution (smooth / symmetrical).
The Binomial looks like the Normal even when we move away from p = 0.5, as long as the number of trials (n) is large.
To formally test whether this is a good approximation, check that:
(i) np > 5 and (ii) n(1 - p) > 5
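The two-part rule of thumb can be written as a small check. This sketch applies it to the two charted cases (n = 5 and n = 25, both with p = 0.3); the function name `normal_approx_ok` is an illustrative choice.

```python
def normal_approx_ok(n, p):
    """Rule of thumb: both np and n(1 - p) must exceed 5."""
    return n * p > 5 and n * (1 - p) > 5

small_sample = normal_approx_ok(5, 0.3)    # np = 1.5, fails condition (i)
large_sample = normal_approx_ok(25, 0.3)   # np = 7.5 and n(1 - p) = 17.5
print(small_sample, large_sample)
```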
Normal Approx. to Binomial Distribution
To describe any Normal distribution, we require information on:
Mean (μ) = np
Standard deviation (σ) = $\sqrt{np(1 - p)}$
Then, we can map any Normal distribution onto the Standard Normal (Z) and determine the probability of the event occurring.
Example: Airline Catering
Imagine a plane with 200 seats. If the probability of selecting the vegetarian (chicken) meal is 0.3 (0.7), what is the chance that 50 or fewer passengers' first choice is the vegetarian meal (full flight)?
Mean = np = 200 x 0.30 = 60
Standard deviation = $\sqrt{np(1 - p)} = \sqrt{200 \times 0.3 \times 0.7} = 6.48$

Continuity Correction Factor
To use a continuous probability distribution to calculate discrete probabilities, we must apply a continuity correction factor.
The Binomial only takes on integer values {0, 1, 2, etc.}, but the Normal can take on fractions.
We want the total area of the 50, 49, 48, etc. bars of the histogram to find the correct probability.
If we used 50 for X we would miss half of the 50 bar, so we use X = 50.5.
The triangles between the Normal curve and the bars will cancel each other out.
Think of this as rounding up / down.
[Figure: histogram bars at 48, 49, 50 and 51 under the Normal curve, with the cut-off at X = 50.5 marked.]

Airline Catering Cont.
So the probability that 50 or fewer passengers' first choice is the vegetarian meal becomes P(X < 50.5).
Next, we transform X to its equivalent Z value:
$$ Z = \frac{50.5 - 60}{6.48} = -1.47 $$
[Figure: N(60,6.48) curve with X = 50.5 and 60 marked, mapped onto N(0,1) with Z = -1.47 and 0 marked.]
This shows 50.5 is 1.47 standard deviations below the mean.
Using the Tables and symmetry, we find P(Z < -1.47) = P(Z > 1.47) = 0.071.
This would be an unusual but not impossible event. If 50 or fewer people want vegetarian, it implies 150+ want chicken!
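To see how well the approximation performs here, the continuity-corrected Normal answer can be set beside the exact Binomial sum. This sketch uses only Python's standard library; the variable names are illustrative.

```python
from math import comb, sqrt
from statistics import NormalDist

n, p = 200, 0.3
mu = n * p                       # 60
sigma = sqrt(n * p * (1 - p))    # about 6.48

# Normal approximation with the continuity correction: P(X < 50.5).
p_approx = NormalDist(mu, sigma).cdf(50.5)   # about 0.071

# Exact Binomial: add up P(X = 0) through P(X = 50).
p_exact = sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(51))

print(f"Normal approx = {p_approx:.4f}, exact Binomial = {p_exact:.4f}")
```

Repeating the Normal calculation with 50 instead of 50.5 gives a visibly smaller probability, which is the half bar that the correction recovers.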
Exactly 50 Vegetarian Meals
Part 2: Probability of exactly 50 vegetarian meals, P(X = 50).
With the continuity correction, this is P(49.5 < X < 50.5).
We know P(X < 50.5) from before.
For P(X < 49.5): Z = (49.5 - 60) / 6.48 = -1.62
So 49.5 is 1.62 SDs below the mean.
From tables and symmetry, we find P(Z > 1.62) = 0.0526.
P(X < 50.5) - P(X < 49.5) = P(Z < -1.47) - P(Z < -1.62)
P(exactly 50) = 0.071 - 0.053 = 0.018.
There is a 1.8% chance of exactly 50 veg meals.
[Figure: histogram bars at 48, 49, 50 and 51, with the 50 bar spanning 49.5 to 50.5.]
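The "exactly 50" calculation can be checked the same way: the area of one bar under the Normal curve against the exact Binomial probability. This is a sketch using Python's standard library, with illustrative variable names.

```python
from math import comb, sqrt
from statistics import NormalDist

n, p = 200, 0.3
meals = NormalDist(mu=n * p, sigma=sqrt(n * p * (1 - p)))

# Normal approximation: area between 49.5 and 50.5 (the "50" bar).
p_approx = meals.cdf(50.5) - meals.cdf(49.5)

# Exact Binomial probability of exactly 50 vegetarian meals.
p_exact = comb(n, 50) * p**50 * (1 - p) ** 150

print(f"Normal approx = {p_approx:.4f}, exact Binomial = {p_exact:.4f}")
```

Working with unrounded Z values gives a figure slightly above the table-based 0.018, since the table answer subtracts two rounded areas.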

Summary: Probability Distributions
Binomial and Poisson distributions are discrete distributions which only take on whole (integer) values.
The Normal and exponential are continuous distributions that can take on infinitely many values, including fractions.
The Normal and Binomial distributions require information on two parameters (B: n and p; N: μ and σ), while the Poisson requires only one (mu, μ).
But sometimes we might approximate a discrete distribution with the Normal curve, e.g. the Binomial when n is large and p is near 0.5.