Chapter 3

The Normal Distributions

BPS - 5th Ed.

Exploring A Distribution
1. Always plot your data: make a graph
(histogram, stemplot, normal probability plot,
boxplot, dotplot)
2. Look for overall patterns (shape, center,
spread) and for striking deviations such as
outliers.
3. Calculate a numerical summary to briefly
describe center and spread.
4. (What we will be studying in this chapter)
Sometimes the overall pattern of a large
number of observations is so regular that we
can describe it by a smooth curve.

Density Curves
Example: here is a
histogram of vocabulary
scores of 947 seventh
graders.
The smooth curve
drawn over the
histogram is a
mathematical model for
the distribution.

Density Curves
Density curves are defined by probability
density functions. A probability density function
is a formula used to specify the curve and
compute areas under it. These areas give us
probabilities/proportions for the random variable.
The function must have two properties:
1. The total area under the graph of the
function is equal to 1 (i.e. the total
probability is 1)
2. The function is always greater than or
equal to zero.
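
The two properties can be checked numerically. The sketch below is only an illustration: the triangular density f(x) = x/2 on [0, 2] and the helper names `density` and `area` are assumptions made up for this example, not something from the slides.

```python
def density(x: float) -> float:
    """Hypothetical triangular density: f(x) = x / 2 on [0, 2], 0 elsewhere."""
    return x / 2 if 0.0 <= x <= 2.0 else 0.0


def area(f, a: float, b: float, n: int = 10_000) -> float:
    """Approximate the area under f between a and b with the trapezoidal rule."""
    h = (b - a) / n
    total = 0.5 * (f(a) + f(b))
    for i in range(1, n):
        total += f(a + i * h)
    return total * h


# Property 1: the total area under the curve is 1.
total_area = area(density, 0.0, 2.0)

# Property 2: the function is never negative (checked on a grid from -1 to 3).
nonnegative = all(density(-1 + 0.01 * i) >= 0 for i in range(401))

# The probability of falling between two numbers is the area between them.
prob_between = area(density, 0.5, 1.5)

print(total_area, nonnegative, prob_between)
```

For this particular density the area between 0.5 and 1.5 works out to one half of the total.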

Density Curves
Let's look at those two properties and what they
give us.
1. The total area under the graph of the function
is equal to 1 (i.e. the total probability is 1)
That will let us determine the probability
(proportion) a continuous random variable
takes on a value between two numbers.
The probability the variable takes on values
between the two numbers of interest will be
the area under the curve.
2. The function is always greater than or equal to
zero. This ensures that we have something called
a probability distribution (to be studied later).

Density Curves
Example: the areas of
the shaded bars in this
histogram represent the
proportion/probability of
scores in the observed
data that are less than
or equal to 6.0. This
probability is equal to
0.303.

Density Curves
Example: now the area
under the smooth curve
to the left of 6.0 is
shaded. If the scale is
adjusted so the total
area under the curve is
exactly 1, then this
curve is called a density
curve. The
probability/proportion of
scores to the left of 6.0
is now equal to 0.293.

A Density Curve (Review)


Always on or above the horizontal axis
Has an area of exactly 1 underneath the curve
Area under the curve and above any range
of values is the proportion/probability of all
observations that fall in that range
The density curve describes the overall
pattern of a distribution

NOTE: No set of real data is exactly described by a density
curve. The curve is an idealized description that is easy
to use and accurate enough for practical use.

Likelihood Interpretation
[Figure: a probability density function with the area between 4 and 8 shaded to show the probability of being between 4 and 8. Regions where the curve is high are labeled "more likely values"; regions where it is low are labeled "less likely values".]

An interpretation of the probability
density function is:
The random variable is more likely
to be in those regions where the
function is higher.
The random variable is less likely
to be in those regions where the
function is lower.
The random variable is never in
those regions where the function
is zero.

Density Curves
The median of a density curve is the
equal-areas point, the point that divides
the area under the curve in half
The mean of a density curve is the
balance point, at which the curve would
balance if made of solid material
The mean and the median are the same
for a symmetric density curve. They both
lie at the center. The mean of a skewed
curve is pulled away from the median in
the direction of the long tail.

Mean and Median of a Density Curve


If the mean and the median are different, this
difference often gives us clues about the shape of
the distribution. (symmetric?, skewed left?, skewed
right?)
Symmetric: the mean will usually be close to the
median.
Skewed left: the mean will usually be smaller than the
median.
Skewed right: the mean will usually be larger than the
median.
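
A small check of this pattern, as a sketch only. The three data sets below are hypothetical values invented for illustration; they are not from the course data.

```python
from statistics import mean, median

# Hypothetical data sets, made up for this example.
symmetric = [2, 3, 4, 5, 6, 7, 8]
skewed_right = [1, 2, 2, 3, 3, 4, 20]      # long tail toward large values
skewed_left = [1, 17, 18, 18, 19, 19, 20]  # long tail toward small values

for name, data in [("symmetric", symmetric),
                   ("skewed right", skewed_right),
                   ("skewed left", skewed_left)]:
    # Compare the balance point (mean) with the equal-areas point (median).
    print(f"{name}: mean = {mean(data):.2f}, median = {median(data)}")
```

In each skewed set the single extreme value pulls the mean toward the long tail, while the median barely moves.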

Mean and Standard Deviation of Density Curves
Remember: Density curves are an idealized
description of a distribution of data, so we need
to distinguish between the mean and standard
deviation of the density curve and the mean and
standard deviation computed from the actual
observations.
The mean and standard deviation of the idealized
distribution of the data represented by the density
curve are denoted by μ (mu) and σ (sigma),
respectively.
The mean and standard deviation computed from
actual observations (data) are denoted by x̄ and s,
respectively.

Mean and Standard Deviation of Density Curves
We can roughly locate the mean, μ, of any
density curve by eye (the balance point).
There is no easy way to locate the standard
deviation, σ, by eye for density curves IN
GENERAL.
However, we will be studying a very special
density curve called the Normal curve. The
Normal curve is a density curve where we
can locate σ by eye.

The Normal Distribution


The distributions that Normal curves describe
are Normal distributions.
The fundamental distribution underlying
most of inferential statistics is the normal
distribution, which is a continuous
distribution.
The normal curve has a very specific bell-shape

Properties of the Normal Distribution


In drawing the normal curve, the mean
and the standard deviation have specific
roles:
The mean is the center of the curve.
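The values (μ - σ) and (μ + σ) are the
inflection points of the curve.
[Figure: Normal curve with the inflection points marked; the curvature is different on either side of each point.]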

Empirical Rule: The 68-95-99.7 Rule


The Empirical Rule can be used to determine the proportion
(or percentage) of the variable values within a specified
number of standard deviations of the mean, provided the
variable's distribution is approximately bell-shaped.
Empirical Rule: If the distribution (i.e., histogram) is
roughly bell-shaped, then
Approximately 68% of the data will lie within
1 standard deviation of the mean.
Approximately 95% of the data will lie within
2 standard deviations of the mean.
Approximately 99.7% of the data (i.e., almost all) will lie
within 3 standard deviations of the mean.
Applies to both populations and samples.

Empirical Rule (cont.)

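The standard deviation is very useful for estimating the
percentage of the observations with values within
certain intervals about the mean.
Beyond 3 standard deviations, in each tail: (100 - 99.7)/2 = 0.15%
Beyond 2 standard deviations, in each tail: (100 - 95)/2 = 2.5%
Beyond 1 standard deviation, in each tail: (100 - 68)/2 = 16%
Between 2 and 3 standard deviations, on each side: 2.5 - 0.15 = 2.35%
Between 1 and 2 standard deviations, on each side: 16 - 2.5 = 13.5%
Between the mean and 1 standard deviation, on each side: 68/2 = 34%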

Question
Data sets consisting of physical measurements
(heights, weights, lengths of bones, and so on) for
adults of the same species and sex tend to follow
a similar pattern. The pattern is that most
individuals are clumped around the average, with
numbers decreasing the farther values are from
the average in either direction. Describe what
shape a histogram (or density curve) of such
measurements would have.

The Normal Distribution


Knowing the mean (μ) and standard
deviation (σ) allows us to make various
conclusions about Normal distributions.
Notation: N(μ,σ).

Graph of a Normal Distribution (cont.)


Different values of μ shift the curve left and right.
[Figure: two normal curves with different means, but the same standard deviation.]

Graph of a Normal Distribution (cont.)


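Different values of σ change the spread of the curve: a larger σ gives a lower, wider curve, and a smaller σ gives a taller, narrower one.
[Figure: two normal curves with different standard deviations, but the same mean.]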
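
The effect of each parameter can be seen by evaluating the density at its center. This is a sketch using Python's standard-library `statistics.NormalDist`; the parameter values are arbitrary choices for illustration.

```python
from statistics import NormalDist

narrow = NormalDist(mu=0, sigma=1)
wide = NormalDist(mu=0, sigma=2)      # same mean, larger standard deviation
shifted = NormalDist(mu=3, sigma=1)   # same standard deviation, different mean

# Height of each curve at its own center (the mean).
peak_narrow = narrow.pdf(0)
peak_wide = wide.pdf(0)
peak_shifted = shifted.pdf(3)

print(f"sigma = 1, peak height: {peak_narrow:.4f}")
print(f"sigma = 2, peak height: {peak_wide:.4f}")
print(f"mu = 3, sigma = 1, peak height: {peak_shifted:.4f}")
```

Changing the mean moves the peak without changing its height; changing the standard deviation lowers or raises the peak because the total area must stay equal to 1.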

Properties of the Normal Density Curve


1. The curve is symmetric about the mean.
2. The mean = median = mode. So, the highest point of
the curve is at x = μ.
3. The curve has inflection points at (μ - σ) and (μ + σ).
4. The total area under the curve is equal to 1.
5. The area under the curve to the left of the mean is
equal to the area under the curve to the right of the
mean. (So, by symmetry, the area to the left of the
mean equals 0.5; and the area to the right of the mean
equals 0.5.)
6. As x gets larger and larger (in either the positive or
negative directions), the graph approaches but never
reaches the horizontal axis.
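
Several of these properties can be spot-checked numerically. The sketch below assumes a Normal distribution with mean 70 and standard deviation 2.8 purely as an example; `second_difference` is a hypothetical helper whose sign matches the sign of the curvature.

```python
from statistics import NormalDist

dist = NormalDist(mu=70, sigma=2.8)  # example parameters, chosen for illustration
mu, sigma, h = dist.mean, dist.stdev, 0.01


def second_difference(x: float) -> float:
    """Finite-difference stand-in for the second derivative of the density at x."""
    return dist.pdf(x - h) - 2 * dist.pdf(x) + dist.pdf(x + h)


# Property 1: symmetry about the mean.
symmetric = abs(dist.pdf(mu - 1.5) - dist.pdf(mu + 1.5)) < 1e-12

# Property 5: half of the area lies to the left of the mean.
area_left_of_mean = dist.cdf(mu)

# Property 3: the curvature changes sign at mu + sigma (an inflection point).
curvature_inside = second_difference(mu + 0.5 * sigma)   # between the inflection points
curvature_outside = second_difference(mu + 1.5 * sigma)  # beyond an inflection point

print(symmetric, area_left_of_mean, curvature_inside < 0, curvature_outside > 0)
```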

Empirical Rule
The Empirical Rule is true for the Normal Distribution:
Approximately 68%
(more precisely 68.27%)
of the values lie between
(μ - σ) and (μ + σ).
Approximately 95%
(more precisely 95.45%)
of the values lie between
(μ - 2σ) and (μ + 2σ).
Approximately 99.7%
(more precisely 99.73%)
of the values lie between
(μ - 3σ) and (μ + 3σ).
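
The percentages can be computed directly from the standard Normal cumulative proportions. A minimal sketch with the standard-library `statistics.NormalDist`; the variable names are assumptions for this example.

```python
from statistics import NormalDist

z = NormalDist()  # standard Normal: mean 0, standard deviation 1

# Area within k standard deviations of the mean, for k = 1, 2, 3.
within = {k: z.cdf(k) - z.cdf(-k) for k in (1, 2, 3)}

for k, p in within.items():
    print(f"within {k} standard deviation(s): {100 * p:.2f}%")
```

Values read from a four-decimal table can differ from these in the last digit because each table entry is rounded before subtracting.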

Approximation to Histogram
When we collect data on a continuous
variable, we can draw a histogram to
summarize its distribution.
However, using histograms has several
drawbacks:
Histograms are based on classes, i.e.,
grouped values of the variable, so there are
always grouping "errors".
It is difficult to make detailed calculations.
Instead of using a histogram, we can use a
probability density function that is an
approximation of the histogram.

Approximation to Histogram (cont.)


Frequently, histograms of continuous variables are bell-shaped.
We can approximate bell-shaped histograms with normal curves.
[Figure: a bell-shaped histogram with its normal approximation drawn over it.]
In this case, the normal curve is close to the histogram, so the
approximation should be accurate.

Approximation to Histogram (cont.)
When we model a relative frequency
distribution with a normal probability
distribution, we use the area under the
normal curve to:
Approximate the areas of the bars in the
histogram being modeled.
Approximate proportions that are too detailed to
be computed from just the histogram.
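
As a sketch of the idea, the code below compares the proportion of observations in one histogram class with the area under a fitted normal curve over the same interval. The data are simulated (hypothetical), and the class limits 68 to 70 are an arbitrary choice.

```python
import random
from statistics import NormalDist

random.seed(0)
# Hypothetical data: 5000 simulated measurements, invented for illustration.
data = [random.gauss(70, 2.8) for _ in range(5000)]

# Fit a normal curve using the sample mean and sample standard deviation.
model = NormalDist.from_samples(data)

low, high = 68.0, 70.0
# Relative frequency of the class [68, 70): the area of that histogram bar.
bar_proportion = sum(low <= x < high for x in data) / len(data)
# Area under the fitted normal curve over the same interval.
curve_area = model.cdf(high) - model.cdf(low)

print(f"histogram bar: {bar_proportion:.3f}, normal curve: {curve_area:.3f}")
```

The same fitted curve can also give proportions for intervals that do not line up with the class limits, which the histogram alone cannot do.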

Health and Nutrition Examination Study of 1976-1980
Heights of adult men, aged 18-24
mean: 70.0 inches
standard deviation: 2.8 inches
heights follow a normal distribution, so
we have that heights of men are N(70, 2.8),
where 70 is the mean and 2.8 is the standard deviation.

Health and Nutrition Examination Study of 1976-1980
68-95-99.7 Rule for men's heights, N(70, 2.8)
68% are between 67.2 and 72.8 inches
[ μ - σ = 70.0 - 2.8 = 67.2 and μ + σ = 70.0 + 2.8 = 72.8 ]
95% are between 64.4 and 75.6 inches
[ μ ± 2σ = 70.0 ± 2(2.8) = 70.0 ± 5.6 ]
99.7% are between 61.6 and 78.4 inches
[ μ ± 3σ = 70.0 ± 3(2.8) = 70.0 ± 8.4 ]
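
The same interval arithmetic, written as a short sketch. The variable names are assumptions for this example; the mean and standard deviation are the ones given for men's heights.

```python
mu, sigma = 70.0, 2.8  # mean and standard deviation of men's heights, in inches

# Interval within k standard deviations of the mean for each part of the rule.
intervals = {
    label: (round(mu - k * sigma, 1), round(mu + k * sigma, 1))
    for k, label in [(1, "68%"), (2, "95%"), (3, "99.7%")]
}

for label, (low, high) in intervals.items():
    print(f"{label} of heights are between {low} and {high} inches")
```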

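Health and Nutrition Examination Study of 1976-1980
What proportion/probability of men
are less than 72.8 inches tall?
By the 68-95-99.7 Rule, 68% of heights lie within one standard deviation of the mean (between 67.2 and 72.8 inches), so 16% lie below 67.2 inches and 16% lie above 72.8 inches.
The proportion below 72.8 inches (one standard deviation above the mean of 70) is therefore 68% + 16% = 84%.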

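Health and Nutrition Examination Study of 1976-1980
What proportion/probability of men are
less than 68 inches tall?
[Figure: Normal curve of height values with the unknown area to the left of 68 shaded and the mean 70 marked.]
How many standard deviations is 68 from 70?
It is less than 1 standard deviation (2.8 inches) from 70, so the
68-95-99.7 Rule does not give the answer. What can
we do? That is what we will learn next.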

The Area under a Normal Curve


Suppose a random variable X is normally
distributed with mean μ and standard
deviation σ. The area under the normal curve
for any interval of values represents either
The proportion of the population with the
characteristics described by the interval of values
or
The probability that a randomly selected individual
from the population will have the characteristic
described by the interval of numbers.

So, the area under a normal curve is a
proportion or a probability.

The Area under a Normal Curve


Since there is no area under the normal
curve associated with a single value, the
probability of observing a specific value for
a normal random variable is 0.
We only get proportions/probabilities with
a range of values.

Graph of a Normal Distribution


There are normal curves for each
combination of the mean μ and standard
deviation σ.
The equation of the normal curve with mean μ and
standard deviation σ is:
$$ f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}} $$
This is a complicated formula, but we will never need
to calculate it (thankfully).
The curves for different values of the mean and
standard deviation look different, but are related.
Remember: Normal probability distributions are
specified by their means and standard deviations.
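
Although the formula is never calculated by hand in this chapter, it is easy to evaluate in software. The sketch below writes the Normal density as a small function (`normal_pdf` is a name chosen for this example) and compares it with the standard library's `statistics.NormalDist.pdf`; the point x = 68 with mean 70 and standard deviation 2.8 is just an example.

```python
import math
from statistics import NormalDist


def normal_pdf(x: float, mu: float, sigma: float) -> float:
    """Height of the Normal curve with mean mu and standard deviation sigma at x."""
    exponent = -((x - mu) ** 2) / (2 * sigma ** 2)
    return math.exp(exponent) / (sigma * math.sqrt(2 * math.pi))


value = normal_pdf(68, 70, 2.8)
reference = NormalDist(mu=70, sigma=2.8).pdf(68)

print(value, reference)
```

Note that this is the height of the curve, not a probability; probabilities come from areas under it.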

Finding Normal Proportions (probabilities)

The cumulative proportions (probabilities)
for what is called the standard Normal
distribution are found in a table. This table
is in your book (Table A) and also on
Blackboard.

So we need to define the standard Normal
distribution and what is meant by
cumulative proportions (probabilities).

Standard Normal Distribution


The standard Normal distribution is the
Normal distribution with mean of 0 and
standard deviation of 1: N(0,1).
What do we mean by cumulative
proportions?
The cumulative proportion
for a value x in a distribution
is the proportion of observations
in the distribution that are less
than or equal to x. For the
standard Normal distribution we
denote the values with z.

Finding Normal Proportions (probabilities)

The Standard Normal Distribution


Find the area (proportions or probabilities) under
the standard normal curve.
Find Z-scores for a given area.
Interpret the area under the standard normal curve
as a probability.

Finding Normal Proportions (probabilities)

There are several ways to calculate the area under the
standard normal curve.
What does not work: some kind of simple formula.
We can use a table (such as Table A or the Normal Table on
Blackboard).
We can use technology (a calculator or software).

Three different area calculations:
Find the area to the left of a boundary.
Find the area to the right of a boundary.
Find the area between two boundaries.

"Area to the left of" Using a Table


I will write the proportion as P(Z < z).
Calculate the area to the left of Z = 1.68, that is, P(Z < 1.68):
Break up 1.68 as 1.6 + 0.08
Find the row 1.6
Find the column 0.08
[Figure: part of Table A. Enter the table at row 1.6 and at column 0.08, then read the entry where they meet.]
The probability is 0.9535 = P(Z ≤ 1.68) = P(Z < 1.68)

"Area to the right of" Using a Table


Calculate the area to the right of Z = 1.68:
The area to the left of Z = 1.68 is 0.9535
(enter Table A at row 1.6 and column 0.08, then read the entry).
Area to the right of is the remaining amount.
The two add up to 1, so:
Area to the right of = 1 - Area to the left of
Area to the right of Z = 1.68 = 1 - 0.9535 = 0.0465
The proportion/probability is 0.0465 = P(Z > 1.68) = P(Z ≥ 1.68)
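
With technology instead of the table, the same two areas can be obtained from the cumulative proportion. A minimal sketch using the standard-library `statistics.NormalDist`; the variable names are assumptions for this example.

```python
from statistics import NormalDist

Z = NormalDist()  # standard Normal distribution N(0, 1)

# Area to the left of 1.68 is the cumulative proportion at 1.68.
area_left = Z.cdf(1.68)

# Area to the right is what remains, because the total area is 1.
area_right = 1 - area_left

print(f"P(Z < 1.68) is about {area_left:.4f}")
print(f"P(Z > 1.68) is about {area_right:.4f}")
```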

Area in between Using a Table


Calculate the area between Z = -0.51 and Z =
1.87.
This is not a one step calculation.
P(-0.51 ≤ Z ≤ 1.87)
= P(-0.51 ≤ Z < 1.87)
= P(-0.51 < Z ≤ 1.87)
= P(-0.51 < Z < 1.87)
These are different ways to write the
same proportion/probability.
Since P(Z = z) = 0, it
doesn't matter if we have <
or ≤. Similarly, it doesn't
matter if we have > or ≥.

Area in between (cont.)


We want: Area between Z =
-0.51 and Z = 1.87.

What we know how to calculate:
Area to the left of Z = 1.87.
Can start out with this area, but it's
too much.
It is too much by the area to the left
of -0.51, which we also know how
to calculate.
So, we "correct" by subtracting the
excess area.

Area in between (cont.)


To calculate the area between Z = -0.51
and Z = 1.87: P(-0.51 ≤ Z ≤ 1.87) = P(Z ≤ 1.87) - P(Z ≤ -0.51)
Find the area to the left of 1.87, which is 0.9693.
Find the area to the left of -0.51, which is 0.3050.
Take the difference of these two areas:
0.9693 - 0.3050 = 0.6643
The proportion/probability is 0.6643 = P(-0.51 ≤ Z ≤ 1.87).
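
The subtraction can be sketched in software as follows, using the standard-library `statistics.NormalDist`; the variable names are assumptions for this example.

```python
from statistics import NormalDist

Z = NormalDist()  # standard Normal distribution N(0, 1)

left_of_upper = Z.cdf(1.87)    # area to the left of 1.87 (too much)
left_of_lower = Z.cdf(-0.51)   # excess area to the left of -0.51
between = left_of_upper - left_of_lower

print(f"{left_of_upper:.4f} - {left_of_lower:.4f} = {between:.4f}")
```

The software result can differ from the table answer in the fourth decimal place, because the table rounds each area before the subtraction.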

Caution!
When the Z-score is off the standard normal table:
State the area under the standard normal curve to the left of Z =
-3.50 (or the right of Z = 3.50) as < 0.0002 (not 0).
P(Z ≤ z) < 0.0002 for any z ≤ -3.50
P(Z ≥ z) < 0.0002 for any z ≥ 3.50
State the area under the standard normal curve to the left of Z =
3.50 (or the right of Z = -3.50) as > 0.9998 (not 1).
P(Z ≤ z) > 0.9998 for any z ≥ 3.50
P(Z ≥ z) > 0.9998 for any z ≤ -3.50

When using MINITAB:
State the area as < 0.0001 (not 0), when result from MINITAB is 0.
State the area as > 0.9999 (not 1), when result from MINITAB is 1.
Why? Because the normal curve extends indefinitely in both
directions on the Z-axis, there is no Z-value for which the area to
the left of it is exactly 0 or exactly 1.

Review: Standard Normal Distribution


The area under a normal curve can be interpreted as a
proportion or probability.
The standard normal curve can be interpreted as a
probability density function.
We use Z to represent a standard normal random variable,
so it has probabilities such as:
P(a ≤ Z ≤ b) = P(a ≤ Z < b) = P(a < Z ≤ b) = P(a < Z < b)
P(Z ≤ a) = P(Z < a)
P(Z > a) = P(Z ≥ a)
Areas under the curve of general normal probability
distributions can be related to areas under the curve of the
standard normal probability distribution, via a
Z-score.

Standard Normal Distribution


The standard Normal distribution is the Normal
distribution with mean 0 and standard deviation 1:
N(0,1).
If a variable x has any Normal distribution with mean μ
and standard deviation σ [ x ~ N(μ,σ) ], then we can
standardize the variable using the following calculation:
$$ z = \frac{x - \mu}{\sigma} $$
And z is now a standardized variable (standardized score)
and has the standard Normal distribution: z ~ N(0,1).

The Standardized Score (Z-Score)


The standardized score tells how
many standard deviations a
particular data value (x) lies from the
mean of the random variable (X) distribution,
and in which direction: positive above the
mean, negative below it.

Standardized Scores
Let's look back at our Health and Nutrition
Examination Study of 1976-1980 where the
mean was 70 and the standard deviation of
this normally distributed variable was 2.8.
How many standard deviations is 68 from
70?
standardized score =
(observed value minus mean) / (std dev)
[ z = (68 - 70) / 2.8 = -0.71 ]
The value 68 is 0.71 standard deviations
below the mean 70 (the z score is -0.71).
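
The standardization step as a short sketch. The function name `z_score` is an assumption made for this example.

```python
def z_score(x: float, mu: float, sigma: float) -> float:
    """Standardized score: how many standard deviations x lies from the mean."""
    return (x - mu) / sigma


# Height of 68 inches in the N(70, 2.8) distribution of men's heights.
z = z_score(68, 70, 2.8)

print(f"z = {z:.2f}")  # negative because 68 is below the mean
```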

Health and Nutrition Examination Study of 1976-1980
What proportion of men are less than 68
inches tall?
[Figure: Normal curve with the unknown area to the left of 68 shaded. Height values 68 and 70 correspond to standardized values -0.71 and 0.]

Table A:
Standard Normal Probabilities
See pages 690-691 in text for Table A.
(the Standard Normal Table)
Look up the closest standardized score (z)
in the table.
Find the probability (area) to the left of
the standardized score.

Table A:
Standard Normal Probabilities
z       .00     .01     .02
-0.8    .2119   .2090   .2061
-0.7    .2420   .2389   .2358
-0.6    .2743   .2709   .2676

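Health and Nutrition Examination Study of 1976-1980
What proportion of men are less than 68
inches tall?
The area to the left of z = -0.71 is .2389, so about 23.89% of men are less than 68 inches tall.
[Figure: Normal curve with the area .2389 shaded to the left of 68 (standardized value -0.71); the mean 70 corresponds to standardized value 0.]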

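Health and Nutrition Examination Study of 1976-1980
What proportion of men are greater than
68 inches tall?
The area to the left of 68 (z = -0.71) is .2389, so the area to the right is
1 - .2389 = .7611
[Figure: Normal curve with the area .7611 shaded to the right of 68 (standardized value -0.71); the mean 70 corresponds to standardized value 0.]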
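
These two proportions can also be computed with software. The sketch below uses the standard-library `statistics.NormalDist` and shows both the table-style calculation (z rounded to -0.71) and the calculation directly on the height scale; the variable names are assumptions for this example.

```python
from statistics import NormalDist

heights = NormalDist(mu=70, sigma=2.8)  # men's heights, N(70, 2.8)

# Table-style: round z to two decimals first, as when using Table A.
below_table = NormalDist().cdf(-0.71)

# Directly on the height scale, with no rounding of z.
below_exact = heights.cdf(68)
above_exact = 1 - below_exact

print(f"less than 68 in. (z rounded): {below_table:.4f}")
print(f"less than 68 in. (unrounded): {below_exact:.4f}")
print(f"greater than 68 in. (unrounded): {above_exact:.4f}")
```

The small gap between the two "less than" values comes only from rounding z to two decimal places before the table lookup.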

Health and Nutrition Examination Study of 1976-1980
How tall must a man be to place in the
lower 10% for men aged 18 to 24?
This is a different problem. We must now
look in the body of Table A for a
proportion/probability close to 0.1000 and
find the z score. Then we must take the z
score and calculate the x value (height)
that corresponds to z standard deviations
from the mean in the X distribution.
[Figure: Normal curve with area .10 shaded in the lower tail; the height value at its boundary is unknown (?) and the mean is 70.]

Table A:
Standard Normal Probabilities
z       .07     .08     .09
-1.3    .0853   .0838   .0823
-1.2    .1020   .1003   .0985
-1.1    .1210   .1190   .1170

Health and Nutrition Examination Study of 1976-1980
How tall must a man be to place in the
lower 10% for men aged 18 to 24? The
proportion/probability in the table closest
to 0.1000 is 0.1003. The corresponding z
number is -1.28.
[Figure: Normal curve with area .10 shaded in the lower tail. The unknown height value (?) corresponds to standardized value -1.28; the mean 70 corresponds to standardized value 0.]

Observed Value for a Standardized Score
Need to unstandardize the z-score to
find the observed value (x):
$$ z = \frac{x - \mu}{\sigma} \quad\Longrightarrow\quad x = \mu + z\sigma $$
observed value =
mean plus [(standardized score) × (std dev)]

Observed Value for a Standardized Score
observed value =
mean plus [(standardized score) × (std dev)]
x = 70 + [(-1.28)(2.8)]
x = 70 + (-3.58) = 66.42

A man would have to be approximately
66.42 inches tall or less to place in the
lower 10% of all men in the population.
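
Software can do the reverse lookup directly. This sketch uses `inv_cdf` from the standard-library `statistics.NormalDist`; the variable names are assumptions for this example.

```python
from statistics import NormalDist

mu, sigma = 70.0, 2.8

# z score with cumulative proportion 0.10 (the "body of the table" lookup).
z_10 = NormalDist().inv_cdf(0.10)

# Unstandardize: x = mu + z * sigma.
x_from_z = mu + z_10 * sigma

# The same height, found directly on the N(70, 2.8) scale.
x_10 = NormalDist(mu, sigma).inv_cdf(0.10)

print(f"z is about {z_10:.2f}; height is about {x_10:.2f} inches")
```

The result differs slightly from the table-based 66.42 because the table lookup rounds z to -1.28.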
