Professional Documents
Culture Documents
74 87 60 77 67 68 69 73 87 81 ...
The full data set has the following relative frequency table:
A common error that researchers make is to assume a sample arises from a normal
distribution when in fact it does not. Because the sample in the last example is
1
These lecture notes are intended to be used with the open source textbook “Introductory
Statistics” by Barbara Illowsky and Susan Dean (OpenStax College, 2013).
1
Chapter 6 Notes The Normal Distribution D. Skipper, p 2
fairly large, its histogram has a very clear bell-shape. For smaller data sets, the
bell-shape may not be as clearly apparent. There are methods to help us decide
if it is “safe” to assume that data arise from a normally distributed population,
such as a normal quantile plot, but ultimately this is a gray area and relies on the
discretion of the researcher.
Suppose X ∼ N (µ, σ). If we need to find the z score associated with the data
value x, we use the formula
x−µ
z= .
σ
If we need to find the data value that has a particular z score, we use the formula
x = µ + zσ.
Example 2. Comparing data values using z scores. SAT scores and ACT
scores are both normally distributed. SAT scores have a mean of 1026 and a
standard deviation of 209. ACT scores have a mean of 20.8 and a standard deviation
of 4.8. A student takes both tests and scores 1130 on the SAT and 25 on the ACT.
(1) Suppose X = score on the SAT. Then X ∼ ( , ).
(2) Suppose Y = score on the ACT. Then Y ∼ ( , ).
(3) Compare the test scores using z scores.
algebra.com
If X is a random variable and has a normal distribution with mean and standard
deviation , then the Empirical Rule says the following:
• About 68% of the x values lie between −1σ and +1σ of the mean µ (within
one standard deviation of the mean).
• About 95% of the x values lie between −2σ and +2σ of the mean µ (within
two standard deviations of the mean).
• About 99.7% of the x values lie between −3σ and +3σ of the mean µ (within
three standard deviations of the mean).
The empirical rule is also known as the 68-95-99.7 rule.
Section 6.1.
Chapter 6 Notes The Normal Distribution D. Skipper, p 3
Example 3. The 68-95-99.7 Rule. From 1984 to 1985, the mean height of 15
to 18-year-old males from Chile was 172.36 cm, and the standard deviation was
6.34 cm. Let Y = the height of 15 to 18-year-old males from 1984 to 1985. Then
Y ∼ N (172.36, 6.34).
(1) About 68% of the y values lie between what two values?
(2) About 95% of the y values lie between what two values?
(3) About 99.7% of the y values lie between what two values?
Example 6.6.
STRATEGY: For each question, take a minute to figure out what is given and
what is unknown. Then sketch a normal diagram, shade the relevant area, and
mark what is given and what is unknown.
• Finding a probabilty: cutoff data value(s) x given; area unknown.
• Finding a percentile/quartile: area given; cutoff data value x unknown.
• Given a percentage: area given; cutoff data value(s) unknown.