Section 03 02 Ess Stats2e

ESSENTIAL STATISTICS 2E
William Navidi and Barry Monk
McGraw-Hill Education. All rights reserved. Authorized only for instructor use in the classroom. No reproduction or further distribution permitted without the prior written consent of McGraw-Hill Education.
Measures of Spread
Section 3.2
Objectives
1. Compute the range of a data set
2. Compute the variance of a population and a sample
3. Compute the standard deviation of a population and a sample
4. Approximate the standard deviation with grouped data
5. Use the Empirical Rule to summarize data that are unimodal and approximately symmetric
6. Use Chebyshev's Inequality to describe a data set
7. Compute the coefficient of variation
Objective 1
Compute the range of a data set
The Range
The range of a data set is the difference between the largest value
and the smallest value.
The average monthly temperatures, in degrees Fahrenheit, for San Francisco are listed.
The largest value is 63 and the smallest is 51, so the range of temperatures is 63 − 51 = 12.
Although the range is easy to compute, it is not often used in practice. The reason is that the range involves only two values from the data set: the largest and smallest.
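To see why relying on only two values can be limiting, here is a minimal Python sketch. The two data sets are hypothetical (they are not from the text); both share the minimum of 51 and maximum of 63 from the San Francisco example, so they have the same range even though their values are arranged very differently.

```python
def data_range(values):
    """Return the range: the largest value minus the smallest value."""
    return max(values) - min(values)

# Hypothetical data sets, made up for illustration only.
clustered = [51, 57, 57, 57, 57, 63]   # most values sit near the middle
polarized = [51, 51, 51, 63, 63, 63]   # every value sits at an extreme

print(data_range(clustered))  # 12
print(data_range(polarized))  # 12
```

The range cannot tell these two data sets apart, which is the motivation for measures that use every value.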
Objective 2
Compute the variance of a population and a sample
Variance
When a data set has a small amount of spread, like the San
Francisco temperatures, most of the values will be close to
the mean. When a data set has a larger amount of spread,
more of the data values will be far from the mean.
The variance is a measure of how far the values in a data set are from the mean, on the average.
The variance is computed slightly differently for populations and samples. The population variance is presented first.
Population Variance
Let $x_1, x_2, x_3, \ldots, x_N$ denote the values in a population of size $N$. Let $\mu$ denote the population mean.
Population Variance
$\sigma^2 = \dfrac{\sum (x_i - \mu)^2}{N}$
Example: Population Variance
Compute the population variance for the San Francisco temperatures.
Step 1: Compute the population mean $\mu$.
$\mu = \dfrac{\sum x_i}{N} = \dfrac{690}{12} = 57.5$
Step 2: For each population value $x_i$, compute the deviation $x_i - \mu$. These values are shown in the second row.
Example: Population Variance (Continued)
Step 3: Square the deviations to obtain the quantity $(x_i - \mu)^2$. These values are shown in the third row.
Step 4: Sum the squared deviations to obtain the quantity $\sum (x_i - \mu)^2$.
$\sum (x_i - \mu)^2 = 42.25 + 12.25 + 6.25 + 2.25 + 0.25 + 6.25 + 6.25 + 12.25 + 30.25 + 20.25 + 0.25 + 30.25 = 169$
Step 5: Divide the sum obtained in Step 4 by the population size $N$ to obtain the population variance $\sigma^2$.
$\sigma^2 = \dfrac{\sum (x_i - \mu)^2}{N} = \dfrac{169}{12} = 14.083$
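The five steps can be sketched in Python. The temperature list below is an assumption: the original table is not reproduced in this text, so the values were reconstructed from the sum (690), the largest and smallest values (63 and 51), and the squared deviations listed in Step 4, with the signs of the deviations chosen to fit a plausible seasonal pattern. The variance itself does not depend on those signs.

```python
# Assumed monthly temperatures (reconstructed from the worked example,
# not copied from the original table).
temps = [51, 54, 55, 56, 58, 60, 60, 61, 63, 62, 58, 52]

N = len(temps)                          # population size
mu = sum(temps) / N                     # Step 1: population mean
deviations = [x - mu for x in temps]    # Step 2: deviations from the mean
squared = [d ** 2 for d in deviations]  # Step 3: squared deviations
total = sum(squared)                    # Step 4: sum of squared deviations
sigma_sq = total / N                    # Step 5: population variance

print(mu, total, round(sigma_sq, 3))    # 57.5 169.0 14.083
```

Python's standard library also provides `statistics.pvariance`, which carries out the same calculation in one call.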
Sample Variance
When the data values come from a sample rather than a
population, the variance is called the sample variance.
The procedure for computing the sample variance is a bit
different from the one used to compute a population
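variance. In the formula, the population mean $\mu$ is replaced by the
sample mean $\bar{x}$ and the denominator is $n - 1$ instead of
$N$, where $n$ is the sample size. The sample variance is denoted by $s^2$.
Sample Variance
$s^2 = \dfrac{\sum (x_i - \bar{x})^2}{n - 1}$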
Why Divide by $n - 1$?
When computing the sample variance, we use the sample mean to
compute the deviations. For the population variance, we use the
population mean for the deviations.
It turns out that the deviations using the sample mean tend to be a
bit smaller than the deviations using the population mean. If we
were to divide by $n$ when computing a sample variance, the value
would tend to be a bit smaller than the population variance.
It can be shown mathematically that the appropriate correction is to divide the sum of the squared deviations by $n - 1$ rather than $n$.
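This claim can be checked exactly on a tiny example. The sketch below assumes a hypothetical population (1, 2, 3, 4, not from the text) and lists every possible ordered sample of size 2 drawn with replacement. Averaging over all samples, dividing by $n$ underestimates the population variance, while dividing by $n - 1$ matches it. Exact fractions are used so that no rounding is involved.

```python
from fractions import Fraction
from itertools import product

def sum_sq_dev(values, center):
    """Sum of squared deviations of the values from a given center."""
    return sum((Fraction(x) - center) ** 2 for x in values)

population = [1, 2, 3, 4]                  # hypothetical population
N = len(population)
mu = Fraction(sum(population), N)          # population mean
sigma_sq = sum_sq_dev(population, mu) / N  # population variance

n = 2                                      # sample size
samples = list(product(population, repeat=n))  # all 16 ordered samples

divide_by_n = []
divide_by_n_minus_1 = []
for sample in samples:
    x_bar = Fraction(sum(sample), n)       # sample mean
    ss = sum_sq_dev(sample, x_bar)         # squared deviations from x_bar
    divide_by_n.append(ss / n)
    divide_by_n_minus_1.append(ss / (n - 1))

avg_n = sum(divide_by_n) / len(samples)
avg_n_minus_1 = sum(divide_by_n_minus_1) / len(samples)

print(sigma_sq, avg_n, avg_n_minus_1)      # 5/4 5/8 5/4
```

This is an illustration for one small population under sampling with replacement, not a proof of the general result.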
Example: Sample Variance
A company that manufactures batteries is testing a new type of
battery designed for laptop computers. They measure the lifetimes, in
hours, of six batteries, and the results are 3, 4, 6, 5, 4, 2. Find the
sample variance of the lifetimes.
Solution:
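The sample mean is $\bar{x} = \dfrac{3 + 4 + 6 + 5 + 4 + 2}{6} = 4$.
The sample variance is given by
$s^2 = \dfrac{\sum (x_i - \bar{x})^2}{n - 1} = \dfrac{(3-4)^2 + (4-4)^2 + (6-4)^2 + (5-4)^2 + (4-4)^2 + (2-4)^2}{6 - 1} = \dfrac{10}{5} = 2$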
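The same calculation can be sketched in Python using the six battery lifetimes from the example. The variable names are illustrative; the final line cross-checks the hand computation against `statistics.variance` from the standard library, which also divides by $n - 1$.

```python
import statistics

lifetimes = [3, 4, 6, 5, 4, 2]   # battery lifetimes in hours, from the example

n = len(lifetimes)
x_bar = sum(lifetimes) / n                      # sample mean
ss = sum((x - x_bar) ** 2 for x in lifetimes)   # sum of squared deviations
s_sq = ss / (n - 1)                             # sample variance

print(x_bar, ss, s_sq)                          # 4.0 10.0 2.0
print(statistics.variance(lifetimes) == s_sq)   # True
```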
Objective 3
Compute the standard deviation of a population and a sample
Standard Deviation
Because the variance is computed using squared deviations, the units
of the variance are the squared units of the data. For example, in the
Battery Lifetime example, the units of the data are hours, and the
units of variance are squared hours. In most situations, it is better to
use a measure of spread that has the same units as the data.
We do this simply by taking the square root of the variance. This
quantity is called the standard deviation. The standard deviation of a
sample is denoted by $s$, and the standard deviation of a population is
denoted by $\sigma$.
Sample Standard Deviation: $s = \sqrt{s^2}$
Population Standard Deviation: $\sigma = \sqrt{\sigma^2}$
Example: Standard Deviation
Example:
The population variance of temperatures in San Francisco is $\sigma^2 = 14.083$. Find the population standard deviation.
Solution:
The population standard deviation is $\sigma = \sqrt{\sigma^2} = \sqrt{14.083} = 3.753$.
Example:
The variance of the lifetimes for a sample of six batteries is $s^2 = 2$. Find the sample standard deviation.
Solution:
The sample standard deviation is $s = \sqrt{s^2} = \sqrt{2} = 1.414$.
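Both square roots can be checked with a short Python sketch. It starts from the unrounded population variance 169/12 rather than the rounded value 14.083; the two give the same result to three decimal places.

```python
import math

sigma_sq = 169 / 12   # population variance of the temperatures
s_sq = 2              # sample variance of the battery lifetimes

sigma = math.sqrt(sigma_sq)   # population standard deviation
s = math.sqrt(s_sq)           # sample standard deviation

print(round(sigma, 3), round(s, 3))   # 3.753 1.414
```

When the raw data are available, `statistics.pstdev` and `statistics.stdev` in the standard library compute the population and sample standard deviations directly.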
Standard Deviation on the TI-84 PLUS
The following steps will compute the standard deviation for both
sample data and population data on the TI-84 PLUS Calculator.
Enter the data into L1 in the data editor.
Run the 1-Var Stats command (the same command used for
means and medians), selecting L1 as the location of the data.
Standard Deviation and Resistance
Recall that a statistic is resistant if its value is not affected
much by extreme values (large or small) in the data set.
The standard deviation is not resistant.
That is, the standard deviation is affected by extreme values.
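A small Python sketch makes the lack of resistance concrete. It reuses the six battery lifetimes from the earlier example and appends one extreme value of 30 hours; that extra value is hypothetical and was chosen only for illustration.

```python
import statistics

lifetimes = [3, 4, 6, 5, 4, 2]     # battery lifetimes from the earlier example
with_outlier = lifetimes + [30]    # one hypothetical extreme value added

s_before = statistics.stdev(lifetimes)
s_after = statistics.stdev(with_outlier)

print(round(s_before, 2), round(s_after, 2))   # 1.41 9.91
```

A single extreme value makes the sample standard deviation roughly seven times larger here.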
Objective 4
Approximate the standard deviation using grouped data
Approximating the Standard Deviation
Sometimes we don't have access to the raw data in a data set, but we
are given a frequency distribution. In these cases we can approximate
the standard deviation using the following steps.
Step 1: Compute the midpoint of each class and approximate the mean of the frequency distribution.
Step 2: For each class, subtract the mean from the class midpoint to obtain (Midpoint − Mean).
Step 3: For each class, square the difference obtained in Step 2 to obtain (Midpoint − Mean)², and multiply by the frequency to obtain (Midpoint − Mean)² × (Frequency).
Approximating the Standard Deviation (Continued)
Step 4: Add the products (Midpoint − Mean)² × (Frequency) over all classes.
Step 5: To compute the population variance, divide the sum obtained in Step 4 by $N$. To compute the sample variance, divide the sum obtained in Step 4 by $n - 1$.
Step 6: Take the square root of the variance obtained in Step 5. The
result is the standard deviation.
Example: Standard Deviation Grouped Data
The following table presents the number of text messages sent via
cell phone by a sample of 50 high school students. Approximate
the standard deviation of the number of messages sent.
Number of Messages Sent    Frequency
0–49                       10
50–99                      5
100–149                    13
150–199                    11
200–249                    7
250–299                    4
Solution: Step 1
Compute the midpoint of each class.
Number of Messages Sent    Class Midpoint
0–49                       25
50–99                      75
100–149                    125
150–199                    175
200–249                    225
250–299                    275
Solution: Step 2
For each class, subtract the mean from the class midpoint to obtain (Midpoint − Mean). Recall that the mean was calculated earlier to be 137.
Number of Messages Sent    Class Midpoint    (Midpoint − Mean)
0–49                       25                −112
50–99                      75                −62
100–149                    125               −12
150–199                    175               38
200–249                    225               88
250–299                    275               138
Solution: Step 3
For each class, square the differences obtained in Step 2 to obtain (Midpoint − Mean)², and multiply by the frequency to obtain (Midpoint − Mean)² × (Frequency).
Number of Messages Sent    Frequency    (Midpoint − Mean)    (Midpoint − Mean)² × Frequency
0–49                       10           −112                 125440
50–99                      5            −62                  19220
100–149                    13           −12                  1872
150–199                    11           38                   15884
200–249                    7            88                   54208
250–299                    4            138                  76176
Solution: Step 4
Add the products (Midpoint − Mean)² × (Frequency) over all classes.
Frequency    (Midpoint − Mean)² × Frequency
10           125440
5            19220
13           1872
11           15884
7            54208
4            76176
$\sum (\text{Midpoint} - \text{Mean})^2 \times \text{Frequency} = 125{,}440 + 19{,}220 + 1{,}872 + 15{,}884 + 54{,}208 + 76{,}176 = 292{,}800$
Solution: Step 5
Step 5: Since we are computing the sample variance, we divide the sum obtained in Step 4 by $n - 1$.
$s^2 = \dfrac{\sum (\text{Midpoint} - \text{Mean})^2 \times \text{Frequency}}{n - 1} = \dfrac{292{,}800}{50 - 1} = 5975.51020$
Step 6: Take the square root of the variance to obtain the standard deviation.
$s = \sqrt{s^2} = \sqrt{5975.51020} = 77.30142$
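The six steps can be sketched in Python for the text message data. One assumption is worth flagging: the midpoints used in the example (25, 75, and so on) equal the average of each class's lower limit and the next class's lower limit, so the sketch stores each class as a pair of consecutive lower limits. The variable names are illustrative.

```python
import math

# (lower limit of the class, lower limit of the next class, frequency)
classes = [(0, 50, 10), (50, 100, 5), (100, 150, 13),
           (150, 200, 11), (200, 250, 7), (250, 300, 4)]

midpoints = [(lo + hi) / 2 for lo, hi, _ in classes]   # Step 1: midpoints
freqs = [f for _, _, f in classes]
n = sum(freqs)                                         # sample size, 50
mean = sum(m * f for m, f in zip(midpoints, freqs)) / n  # approximate mean

# Steps 2-4: squared deviations, weighted by frequency, then summed.
total = sum((m - mean) ** 2 * f for m, f in zip(midpoints, freqs))

s_sq = total / (n - 1)   # Step 5: sample variance
s = math.sqrt(s_sq)      # Step 6: sample standard deviation

print(mean, total)       # 137.0 292800.0
print(f"{s:.5f}")        # 77.30142
```

For a population rather than a sample, the only change would be dividing `total` by `n` instead of `n - 1`.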
Grouped Data on the TI-84 PLUS
The following procedure is used to compute the mean and standard
deviation for grouped data in a frequency distribution.
Enter the midpoint for each class into L1 and the corresponding
frequencies into L2. Next, select the 1-Var Stats command, followed by L1,
comma, L2.
Note: If your calculator supports Stat Wizards, enter L1 in the List field and L2 in the FreqList field.
Example: Grouped Data on the TI-84 PLUS
The output for the last example on the TI-84 PLUS Calculator is presented below. The value of s represents the approximate sample standard deviation. In this example s = 77.30142. Therefore the approximate standard deviation is 77.30142.
Class Midpoint    Frequency
25                10
75                5
125               13
175               11
225               7
275               4
Objective 5
Use the Empirical Rule to summarize data that are unimodal and approximately symmetric
Bell-Shaped Histogram
Many histograms have a single mode near the center of the
data, and are approximately symmetric. Such histograms are
often referred to as bell-shaped.
The Empirical Rule
When a data set has a bell-shaped histogram, it is often possible to
use the standard deviation to provide an approximate description of
the data using a rule known as The Empirical Rule.
Approximately 68% of the data will be within one standard
deviation of the mean.
Approximately 95% of the data will be within two standard
deviations of the mean.
All, or almost all, of the data will be within three standard
deviations of the mean.
Example: The Empirical Rule
The following table presents the U.S. Census Bureau projection for
the percentage of the population aged 65 and over for each state
and the District of Columbia. Use the Empirical Rule to describe the
data.
We first note that the histogram is approximately bell-shaped, and we may use the TI-84 PLUS calculator, or other technology, to compute the population mean and standard deviation. The results are $\mu = 13.249$ and $\sigma = 1.6827$.
Example: The Empirical Rule (Continued)
$\mu - \sigma = 13.249 - 1.6827 = 11.57$
$\mu + \sigma = 13.249 + 1.6827 = 14.93$
Approximately 68% of the data values are between 11.57 and 14.93.
$\mu - 2\sigma = 13.249 - 2(1.6827) = 9.88$
$\mu + 2\sigma = 13.249 + 2(1.6827) = 16.61$
Approximately 95% of the data values are between 9.88 and 16.61.
$\mu - 3\sigma = 13.249 - 3(1.6827) = 8.20$
$\mu + 3\sigma = 13.249 + 3(1.6827) = 18.30$
Almost all of the data values are between 8.20 and 18.30.
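The three intervals can be reproduced with a short Python sketch, using the mean and standard deviation from the example. The helper name `interval` is illustrative and not part of the text.

```python
mu = 13.249      # population mean from the example
sigma = 1.6827   # population standard deviation from the example

def interval(k):
    """Endpoints of the interval within k standard deviations of the mean."""
    return (round(mu - k * sigma, 2), round(mu + k * sigma, 2))

for k, share in [(1, "about 68%"), (2, "about 95%"), (3, "almost all")]:
    low, high = interval(k)
    print(f"{share} of the values lie between {low:.2f} and {high:.2f}")
```

The percentages are approximations that apply only when the histogram is roughly bell-shaped.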
Objective 6
Use Chebyshev's Inequality to describe a data set
Any Data Set
When a distribution is bell-shaped, we use The Empirical Rule to
approximate the proportion of data within one or two standard
deviations. Another rule, called Chebyshev's Inequality, holds for
any data set.
Chebyshev's Inequality
In any data set, the proportion of the data that is within $K$ standard
deviations of the mean is at least $1 - 1/K^2$.
Specifically, by setting $K = 2$ or $K = 3$, we obtain the following results.
At least 3/4, or 75%, of the data are within two standard deviations of the mean.
At least 8/9, or 89%, of the data are within three standard deviations of the mean.
Example: Chebyshev's Inequality
As part of a public health study, systolic blood pressure was measured
for a large group of people. The mean was 120 and the standard
deviation was 10. What information does Chebyshev's Inequality
provide about these data?
Solution:
We compute the following:
Two standard deviations from the mean: 120 − 2(10) = 100 and 120 + 2(10) = 140
Three standard deviations from the mean: 120 − 3(10) = 90 and 120 + 3(10) = 150
We conclude:
At least 3/4 (75%) had systolic blood pressures between 100 and 140.
At least 8/9 (89%) had systolic blood pressures between 90 and 150.
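The bound and the two intervals can be sketched in Python with the mean and standard deviation from the blood pressure example. The function name `chebyshev_bound` is illustrative. The bound is informative only for K greater than 1, since for smaller K it is zero or negative.

```python
def chebyshev_bound(k):
    """Minimum proportion of data within k standard deviations of the mean."""
    return 1 - 1 / k ** 2

mean, sd = 120, 10   # from the blood pressure example

for k in (2, 3):
    low, high = mean - k * sd, mean + k * sd
    print(f"at least {chebyshev_bound(k):.0%} between {low} and {high}")
# at least 75% between 100 and 140
# at least 89% between 90 and 150
```

Unlike the Empirical Rule, these are guaranteed lower bounds, so the true proportions are often much larger.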
Objective 7
Compute the coefficient of variation
Coefficient of Variation
The coefficient of variation (CV for short) tells how large the
standard deviation is relative to the mean. It can be used to
compare the spreads of data sets whose values have different units.
The coefficient of variation is found by dividing the standard deviation by the mean.
$\text{CV} = \dfrac{\sigma}{\mu}$

Example: Coefficient of Variation
National Weather Service records show that over a 30-year period, the
annual precipitation in Atlanta, Georgia, had a mean of 49.8 inches with a
standard deviation of 7.6 inches, and the annual temperature had a mean of
62.2 degrees Fahrenheit with a standard deviation of 1.3 degrees. Compute
the coefficient of variation for precipitation and for temperature. Which has
greater spread relative to its mean?
Solution:
We compute the following:
$\text{CV for precipitation} = \dfrac{\text{standard deviation for precipitation}}{\text{mean precipitation}} = \dfrac{7.6}{49.8} = 0.15$
$\text{CV for temperature} = \dfrac{\text{standard deviation for temperature}}{\text{mean temperature}} = \dfrac{1.3}{62.2} = 0.02$
The CV for precipitation is larger than the CV for temperature. Therefore,
precipitation has a greater spread relative to its mean.
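The comparison can be sketched in Python using the Atlanta figures from the example. The function name `coefficient_of_variation` is illustrative.

```python
def coefficient_of_variation(sd, mean):
    """Standard deviation divided by the mean (a unitless ratio)."""
    return sd / mean

cv_precip = coefficient_of_variation(7.6, 49.8)   # inches / inches
cv_temp = coefficient_of_variation(1.3, 62.2)     # degrees / degrees

print(round(cv_precip, 2), round(cv_temp, 2))     # 0.15 0.02
print(cv_precip > cv_temp)                        # True
```

Because the units cancel in each ratio, the two values can be compared even though one quantity is measured in inches and the other in degrees.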
You Should Know . . .
How to compute the range of a data set
The notation for population variance, population standard deviation, sample variance, and sample standard deviation
How to compute the variance and the standard deviation for populations and samples
How to use the TI-84 PLUS calculator to compute the variance and standard deviation for populations and samples
How to approximate the standard deviation for grouped data
How to use The Empirical Rule to describe a bell-shaped data set
How to use Chebyshev's Inequality to describe any data set
How to compute and interpret the coefficient of variation
