
Self-Correcting Exercises 2A: Measures of Center

1.

Find the mean, median and mode of the following observations.


2, 5, 4, 6, 4, 3, 6, 7, 4, 3, 5

2.

a. Find the mean, median and mode of the following observations and compare their values.
5, 7, 3, 5, 6, 8, 5, 6, 4, 6, 25
b. Eliminate the last observation x = 25 and then find the mean, median and mode. How do these values
compare with those found using the full data set?
c. How do possible outliers (such as 25) affect the values of these three measures of center?

3.

Suppose that as a prospective buyer you wish to compare the prices of homes in several parts of town.
a. If the homes in one area of town consisted of tract homes with about the same square feet of living space,
what measure would you use in finding the typical cost of a home in this area?
b. If the homes in another area of town consisted of homes which were built by individual contractors with
varying square feet of living area including several large custom homes, what measure might you wish to
use to find the typical cost of a home in this area?

4.

The following data are the ages (in months) at which n = 50 children were first enrolled in a preschool.

38   40   30   35   39
47   35   34   43   41
32   34   41   30   46
55   39   33   32   32
42   50   37   39   33
40   48   36   31   36
36   41   43   48   40
35   40   30   46   37
45   42   41   36   50
45   38   46   36   31

a. Find the mean and median for these data.
b. Construct a frequency histogram for these data using 30 as the lower limit of the first class, and a class
width of 5 months. What is the modal class? What is its midpoint?
c. Compare the values of the mean, median and the midpoint of the modal class.
d. From the histogram, comment on the shape of the distribution. Do the values of the mean and the median
support your answer concerning the shape of the distribution?

5.

The average weekly unemployment benefit amounts for the 50 U.S. states are given to the nearest dollar.

159   222   188   190   284
190   238   222   181   209
163   247   217   189   252
210   225   290   227   157
160   182   180   213   186
256   202   247   216   188
258   212   231   204   233
215   293   210   281   264
220   244   236   198   253
212   290   214   233   207

Source: The World Almanac and Book of Facts, 2002, p. 139

a. Find the mean and median for these data.
b. Construct a histogram and find the midpoint of the modal class.
c. Compare these three measures of center. Using only the measures of center, would you expect this
distribution to be fairly symmetric or skewed? Is your conclusion supported by the shape of the histogram
in part b?

Solutions to SCE 2A

1.

Arrange the set of data in order of ascending magnitude.


2, 3, 3, 4, 4, 4, 5, 5, 6, 6, 7
median = 4; $\bar{x} = \sum x_i / n = 49/11 = 4.45$; mode = 4.
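
The three measures of center can be checked with a few lines of Python. This is a minimal sketch that assumes the standard `statistics` module (Python 3.8 or later for `multimode`); the helper name `centers` is illustrative and not part of the exercises. It also runs the data of Exercise 2 with and without the observation x = 25 to show how an outlier moves each measure.

```python
from statistics import mean, median, multimode

def centers(data):
    """Return (mean, median, sorted list of modes) for a list of numbers."""
    return mean(data), median(data), sorted(multimode(data))

exercise_1 = [2, 5, 4, 6, 4, 3, 6, 7, 4, 3, 5]
exercise_2 = [5, 7, 3, 5, 6, 8, 5, 6, 4, 6, 25]
exercise_2_trimmed = exercise_2[:-1]  # drop the last observation, x = 25

for label, data in [
    ("Exercise 1", exercise_1),
    ("Exercise 2, full data", exercise_2),
    ("Exercise 2, without 25", exercise_2_trimmed),
]:
    m, med, modes = centers(data)
    print(f"{label}: mean = {m:.2f}, median = {med}, modes = {modes}")
```

Removing the single large value lowers the mean by almost two units, while the median moves by half a unit and the modes stay the same.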

2.

a. Arrange the set of data in order of ascending magnitude.

3, 4, 5, 5, 5, 6, 6, 6, 7, 8, 25

median = 6; $\bar{x} = \sum x_i / n = 80/11 = 7.27$; modes = 5 and 6.

b. If the value x = 25 is removed, median = $(5 + 6)/2 = 5.5$; $\bar{x} = \sum x_i / n = 55/10 = 5.5$; modes = 5 and 6.
The mean is noticeably smaller (5.5 versus 7.27), the median changes only from 6 to 5.5, and the modes are unchanged.
c. The mean is strongly affected by the outlier, while the median is affected only slightly and the mode not at all.

3.

a. If all homes cost about the same amount, and there are no unusually high or low priced homes in the
tract, use the sample mean.
b. Since the custom homes may be much more expensive than the other homes, the average or mean cost may
not be a good measure of center. You should use the median cost.

4.

a. The data are arranged in order of ascending magnitude below.

30   30   30   31   31   32   32   32   33   33
34   34   35   35   35   36   36   36   36   36
37   37   38   38   39   39   39   40   40   40
40   41   41   41   41   42   42   43   43   45
45   46   46   46   47   48   48   50   50   55

The median for n = 50 observations is the average of the 25th and 26th ordered observations, or
median $= \dfrac{39 + 39}{2} = 39$ and $\bar{x} = \dfrac{\sum x_i}{n} = \dfrac{1954}{50} = 39.08$.

b. The frequency histogram is shown below. The modal class is 35 to < 40 with midpoint $(35 + 40)/2 = 37.5$.
c. Notice that the mean is shifted slightly to the right of both the median and the midpoint of the modal class.
d. The distribution in part b is slightly skewed to the right, which explains why the mean is shifted to the right
of the median and the midpoint of the modal class.

[Figure: Histogram of Ages. Horizontal axis: Ages (30 to 60); vertical axis: Frequency (0 to 16).]

5.

a. The data are arranged in order of ascending magnitude.

157   159   160   163   180   181   182   186   188   188
189   190   190   198   202   204   207   209   210   210
212   212   213   214   215   216   217   220   222   222
225   227   231   233   233   236   238   244   247   247
252   253   256   258   264   281   284   290   290   293

The median for n = 50 observations is the average of the 25th and 26th ordered observations, or
median $= \dfrac{215 + 216}{2} = 215.5$ and $\bar{x} = \dfrac{\sum x_i}{n} = \dfrac{10998}{50} = 219.96$.

b. Using the relative frequency histogram from Exercise 2, SCE 1C, the modal class is 200 to < 215, which has
midpoint 207.5.
c. Since the mean is shifted to the right of the median and the modal class, the distribution would exhibit
skewing to the right. See the histogram in Exercise 2, SCE 1C.
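
The class counts behind the histogram in Exercise 4 can be tallied directly from the raw ages. The following is a sketch only: the names `class_lower_limit`, `LOWER` and `WIDTH` are illustrative assumptions, and classes are taken as half-open intervals (for example 35 to < 40), as in the solution.

```python
from collections import Counter
from statistics import mean, median

# Ages (in months) from Exercise 4, SCE 2A
ages = [
    38, 47, 32, 55, 42, 40, 36, 35, 45, 45,
    40, 35, 34, 39, 50, 48, 41, 40, 42, 38,
    30, 34, 41, 33, 37, 36, 43, 30, 41, 46,
    35, 43, 30, 32, 39, 31, 48, 46, 36, 36,
    39, 41, 46, 32, 33, 36, 40, 37, 50, 31,
]

LOWER, WIDTH = 30, 5  # lower limit of the first class and class width

def class_lower_limit(x):
    """Lower limit of the half-open class [limit, limit + WIDTH) containing x."""
    return LOWER + WIDTH * ((x - LOWER) // WIDTH)

counts = Counter(class_lower_limit(a) for a in ages)
modal_lower, modal_count = max(counts.items(), key=lambda item: item[1])
modal_midpoint = modal_lower + WIDTH / 2

for lower in sorted(counts):
    print(f"{lower} to < {lower + WIDTH}: {counts[lower]}")
print("mean =", mean(ages), "median =", median(ages), "modal midpoint =", modal_midpoint)
```

The tally places 15 observations in the class 35 to < 40, so the ordering mean > median > modal midpoint can be read off the printed values.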

Self-Correcting Exercises 2B: Measures of Variability

1.

Given the n = 5 observations 2, 4, 3, 4, 5, calculate


a. the sample mean $\bar{x}$.
b. the sample variance and standard deviation using the definition formula.
c. the sample variance and standard deviation using the short-cut formula. How do these values compare with
those found in part b?
d. If you have a statistical calculator, use it to find $\bar{x}$ and s. Compare these values with those found in parts b
and c.

2.

Given the n = 9 observations 7, 9, 10, 6, 8, 7, 8, 9, 8, calculate


a. the range.
b. the mean, variance and the standard deviation.
c. the ratio of the range to the standard deviation. The range is approximately how many
standard deviations?

3.

Refer to the data in SCE 2A, Exercise 4.


a. Find the variance and standard deviation for these data.
b. Find the range of these data.
c. Compare the range and the standard deviation by finding the ratio of the range to the standard deviation.
Approximately how many standard deviations is the range?
Solutions to SCE 2B

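1.

$\sum x_i = 18$; $\sum x_i^2 = 70$; $n = 5$

a. $\bar{x} = \dfrac{\sum x_i}{n} = \dfrac{18}{5} = 3.6$

b. $s^2 = \dfrac{\sum (x_i - \bar{x})^2}{n - 1} = \dfrac{(2 - 3.6)^2 + \cdots + (5 - 3.6)^2}{4} = \dfrac{(-1.6)^2 + (.4)^2 + (-.6)^2 + (.4)^2 + 1.4^2}{4} = \dfrac{5.2}{4} = 1.3$
and $s = \sqrt{1.3} = 1.14$

c. $s^2 = \dfrac{\sum x_i^2 - \frac{(\sum x_i)^2}{n}}{n - 1} = \dfrac{70 - \frac{18^2}{5}}{4} = \dfrac{5.2}{4} = 1.3$
and $s = \sqrt{1.3} = 1.14$. Answers are the same.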
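
Parts b and c can be compared numerically. The sketch below implements both formulas as plain Python functions; the function names are illustrative assumptions, not taken from the text.

```python
from math import sqrt, isclose

def variance_definition(data):
    """Sample variance from the definition: sum of squared deviations / (n - 1)."""
    n = len(data)
    xbar = sum(data) / n
    return sum((x - xbar) ** 2 for x in data) / (n - 1)

def variance_shortcut(data):
    """Sample variance from the short-cut formula using sum(x) and sum(x^2)."""
    n = len(data)
    total = sum(data)
    total_sq = sum(x * x for x in data)
    return (total_sq - total ** 2 / n) / (n - 1)

data = [2, 4, 3, 4, 5]
s2_def = variance_definition(data)
s2_short = variance_shortcut(data)

print(round(s2_def, 4), round(s2_short, 4), round(sqrt(s2_def), 2))
print("formulas agree:", isclose(s2_def, s2_short))
```

The two results agree up to floating-point rounding. On a computer the short-cut form can lose precision when the values are large relative to their spread, so the definition form is usually the safer choice in code.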

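2.

a. $R = 10 - 6 = 4$

b. $\bar{x} = \dfrac{\sum x_i}{n} = \dfrac{72}{9} = 8$;
$s^2 = \dfrac{\sum x_i^2 - \frac{(\sum x_i)^2}{n}}{n - 1} = \dfrac{588 - \frac{72^2}{9}}{8} = \dfrac{12}{8} = 1.5$;
$s = \sqrt{1.5} = 1.225$.

c. $R/s = 4/1.225 = 3.27$. The range is between 3 and 4 standard deviations.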

3.

a. $s^2 = \dfrac{\sum x_i^2 - \frac{(\sum x_i)^2}{n}}{n - 1} = \dfrac{78{,}118 - \frac{1954^2}{50}}{49} = \dfrac{1755.68}{49} = 35.8302$;
$s = \sqrt{35.8302} = 5.99$.

b. Using the sorted data in the solution for Exercise 4, SCE 2A, $R = 55 - 30 = 25$.

c. $R/s = 25/5.99 = 4.2$. The range is slightly more than 4 standard deviations.
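
When only the summary totals are available, the same arithmetic can be done without the raw data. A minimal sketch, assuming the totals quoted in the solution above ($n = 50$, $\sum x_i = 1954$, $\sum x_i^2 = 78{,}118$, smallest value 30, largest value 55); the variable names are illustrative.

```python
from math import sqrt

# Summary totals for the ages in Exercise 4, SCE 2A
n = 50
sum_x = 1954
sum_x2 = 78_118
smallest, largest = 30, 55

variance = (sum_x2 - sum_x ** 2 / n) / (n - 1)  # short-cut formula
std_dev = sqrt(variance)
data_range = largest - smallest
ratio = data_range / std_dev  # range expressed in standard deviations

print(round(variance, 4), round(std_dev, 2), data_range, round(ratio, 1))
```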

Self-Correcting Exercises 2C: Tchebysheff's Theorem, the Empirical Rule and the Range Approximation

1.

A set of observations consists of the values:


10, 5, 1, 10, 7, 3, 5, 2, 3, 8.
a. Use the range approximation to estimate the value of s, the sample standard deviation. (Hint: use the
appropriate divisor as given in the table at the end of Section 2.5.)
b. Find the standard deviation, s. How does it compare to the estimate you found in part a?
c. Construct a stem and leaf plot for these n = 10 observations. Is the data mound-shaped?
d. Could you use Tchebysheff's Theorem to describe these data? The Empirical Rule? Why or why not?

2.

Suppose you are told that the mean and standard deviation of a sample of n = 500 observations were
$\bar{x} = 50$ and $s = 10$. You know nothing else about the shape of the distribution for these data.
a. What can be said about the proportion of observations between 40 and 60?
b. What can be said about the proportion of observations between 30 and 70?
c. What can be said about the proportion of observations smaller than 30?

3.

Suppose now you are told that the data in Exercise 2 are mound-shaped.
a. What can be said about the proportion of observations between 40 and 60?
b. What can be said about the proportion of observations between 30 and 70?
c. What can be said about the proportion of observations smaller than 30?

4.

Refer to Exercise 4, SCE-2A.


a. Find the mean and standard deviation of these data.
b. Construct a histogram and describe the shape of these data.
c. Find the actual proportion of observations within the intervals $\bar{x} \pm s$, $\bar{x} \pm 2s$, and $\bar{x} \pm 3s$. How do these
observed proportions compare with those given by Tchebysheff's Theorem and the Empirical Rule?

Solutions to SCE 2C

1.

a. $s \approx R/3 = (10 - 1)/3 = 3$

b. $s^2 = \dfrac{\sum x_i^2 - \frac{(\sum x_i)^2}{n}}{n - 1} = \dfrac{386 - \frac{54^2}{10}}{9} = \dfrac{94.4}{9} = 10.4889$;
$s = \sqrt{10.4889} = 3.239$, which is close to the estimate in part a.

c-d. The data is not mound-shaped as shown by the stem and leaf plot. You could use Tchebysheff's Theorem,
but not the Empirical Rule to describe the data.

 1     1 | 0
 2     2 | 0
 4     3 | 00
 4     4 |
(2)    5 | 00
 4     6 |
 4     7 | 0
 3     8 | 0
 2     9 |
 2    10 | 00

2.

a. Since nothing is known about the shape of the distribution, you must use Tchebysheff's Theorem to
describe the data. The interval 40 to 60 represents $\bar{x} \pm s \Rightarrow 50 \pm 10$. Since $k = 1$, the bound is
$1 - 1/k^2 = 0$, so you can say only that at least none of the measurements are in this interval.
b. The interval 30 to 70 represents $\bar{x} \pm 2s \Rightarrow 50 \pm 20$. Since $k = 2$, you can say that at least 3/4 of the
measurements are in this interval.
c. If at least 3/4 of the measurements are between 30 and 70, at most 1/4 of the measurements are outside this
interval. Since you know nothing about the shape of the distribution, all of these measurements might be
less than 30.
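
The bounds used in Exercise 2 all come from the expression $1 - 1/k^2$. As a small illustration, the sketch below evaluates it for the intervals $\bar{x} \pm ks$ with $\bar{x} = 50$ and $s = 10$; the function name `tchebysheff_lower_bound` is an assumption made for this example, and k = 3 is included only to extend the pattern.

```python
def tchebysheff_lower_bound(k):
    """Minimum fraction of measurements within k standard deviations of the mean."""
    if k <= 0:
        raise ValueError("k must be positive")
    # For k below 1 the expression is negative, so the bound is reported as 0.
    return max(0.0, 1 - 1 / k ** 2)

xbar, s = 50, 10
for k in (1, 2, 3):
    low, high = xbar - k * s, xbar + k * s
    bound = tchebysheff_lower_bound(k)
    print(f"{low} to {high}: at least {bound:.3f} of the measurements")
```

For k = 2 the complement, at most 1 - 3/4 = 1/4, is the figure used in part c.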

3.

a. Using the Empirical Rule, approximately 68% of the measurements will be between 40 and 60.
b. Approximately 95% of the measurements will be in the interval $\bar{x} \pm 2s \Rightarrow 50 \pm 20$ or 30 to 70.
c. From b., approximately 5% of the measurements are outside the interval from 30 to 70. Since a mound-shaped
distribution is symmetric about the mean, approximately $\frac{1}{2}(5\%) = 2.5\%$ will be less than 30.

4.

a. From Exercise 4, SCE 2A, $\bar{x} = 39.08$, and from Exercise 3, SCE 2B, $s^2 = 35.83$ and $s = 5.99$.
b. From Exercise 4, SCE 2A, the histogram is slightly skewed to the right.
c. $\bar{x} \pm s \Rightarrow 39.08 \pm 5.99$ or 33.09 to 45.07 contains 31/50 = .62 or 62% of the measurements.
$\bar{x} \pm 2s \Rightarrow 39.08 \pm 11.98$ or 27.10 to 51.06 contains 49/50 = .98 or 98% of the measurements.
$\bar{x} \pm 3s \Rightarrow 39.08 \pm 17.97$ or 21.11 to 57.05 contains 50/50 = 1.00 or 100% of the measurements.
These proportions satisfy the bounds of Tchebysheff's Theorem (at least 0, 3/4 and 8/9). Since the
distribution is not quite mound-shaped, the proportions are not exactly as described by the Empirical Rule.
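
The observed proportions can be recounted from the raw ages. This is a sketch under stated assumptions: it recomputes $\bar{x}$ and s from the data of Exercise 4, SCE 2A (giving the mean 39.08 found there) with the standard `statistics` module, and the helper `proportion_within` is an illustrative name.

```python
from statistics import mean, stdev

# Ages (in months) from Exercise 4, SCE 2A
ages = [
    38, 47, 32, 55, 42, 40, 36, 35, 45, 45,
    40, 35, 34, 39, 50, 48, 41, 40, 42, 38,
    30, 34, 41, 33, 37, 36, 43, 30, 41, 46,
    35, 43, 30, 32, 39, 31, 48, 46, 36, 36,
    39, 41, 46, 32, 33, 36, 40, 37, 50, 31,
]

xbar = mean(ages)
s = stdev(ages)  # sample standard deviation (divisor n - 1)

def proportion_within(data, center, spread, k):
    """Fraction of observations in the interval center +/- k * spread."""
    low, high = center - k * spread, center + k * spread
    inside = sum(1 for x in data if low <= x <= high)
    return inside / len(data)

proportions = {k: proportion_within(ages, xbar, s, k) for k in (1, 2, 3)}
for k, p in proportions.items():
    print(f"within {k} s of the mean: {p:.2f}")
```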

Self-Correcting Exercises 2D: Measures of Relative Standing

1.

Find the median, the lower and upper quartiles and the interquartile range for the following data: 4, 0, 5, 3, 6, 2,
5, 9, 5, 3.

2.

Refer to the data in Exercise 1.


a. Calculate $\bar{x}$ and s.
b. Calculate the z-score for the smallest and largest observations in the set. Is either of these observations
unusually small or large?

3.

Construct a boxplot for the data in Exercise 1. Are there any suspected outliers? Any extreme outliers?

4.

Refer to the data in SCE-2A, Exercise 5.


a. Find the mean, standard deviation, median and lower and upper quartiles.
b. Construct a boxplot for these data. Comment on the distribution of these data. Is the distribution relatively
symmetric? Are there any outliers?
c. Find the proportion of observations falling into the intervals $\bar{x} \pm s$, $\bar{x} \pm 2s$, and $\bar{x} \pm 3s$. Compare these
proportions with those given by Tchebysheff's Theorem and the Empirical Rule. Do the results here
support your answer in part b?

Solutions to SCE 2D

1.

Arrange the data in order of ascending magnitude:

0, 2, 3, 3, 4, 5, 5, 5, 6, 9

The positions of the median, lower and upper quartiles are
$.5(n + 1) = .5(11) = 5.5$
$.25(n + 1) = .25(11) = 2.75$
$.75(n + 1) = .75(11) = 8.25$

Then $m = (4 + 5)/2 = 4.5$;
$Q_1 = 2 + .75(3 - 2) = 2.75$;
$Q_3 = 5 + .25(6 - 5) = 5.25$
and $IQR = Q_3 - Q_1 = 2.5$

2.

a. $\bar{x} = \dfrac{\sum x_i}{n} = \dfrac{42}{10} = 4.2$;
$s^2 = \dfrac{\sum x_i^2 - \frac{(\sum x_i)^2}{n}}{n - 1} = \dfrac{230 - \frac{42^2}{10}}{9} = 5.9556$;
$s = \sqrt{5.9556} = 2.44$.

b. For x = 0, z-score $= \dfrac{x - \bar{x}}{s} = \dfrac{0 - 4.2}{2.44} = -1.72$.
For x = 9, z-score $= \dfrac{x - \bar{x}}{s} = \dfrac{9 - 4.2}{2.44} = 1.97$.
Neither value is unusually large or small.
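
The position rule used in Exercise 1 (locate position $p(n + 1)$ in the sorted data and interpolate between neighbors) and the z-scores in Exercise 2 can be sketched in Python as follows. The helper `value_at_position` is an illustrative name; it assumes the position lies between 1 and n and does not handle positions outside that range.

```python
from statistics import mean, stdev

def value_at_position(sorted_data, position):
    """Value at a 1-based, possibly fractional, position in sorted data."""
    whole = int(position)          # index of the lower neighbor (1-based)
    fraction = position - whole    # how far to move toward the upper neighbor
    lower = sorted_data[whole - 1]
    if fraction == 0:
        return lower
    upper = sorted_data[whole]
    return lower + fraction * (upper - lower)

data = sorted([4, 0, 5, 3, 6, 2, 5, 9, 5, 3])
n = len(data)

q1 = value_at_position(data, 0.25 * (n + 1))
m = value_at_position(data, 0.50 * (n + 1))
q3 = value_at_position(data, 0.75 * (n + 1))
iqr = q3 - q1

xbar, s = mean(data), stdev(data)
z_min = (data[0] - xbar) / s   # z-score of the smallest observation
z_max = (data[-1] - xbar) / s  # z-score of the largest observation

print(q1, m, q3, iqr)
print(round(xbar, 2), round(s, 2), round(z_min, 2), round(z_max, 2))
```

Other software may use a different quartile convention and give slightly different values for the same data.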

3.

Lower and Upper Fences:
$Q_1 - 1.5(IQR) = 2.75 - 1.5(2.5) = -1$
$Q_3 + 1.5(IQR) = 5.25 + 1.5(2.5) = 9$

The largest and smallest values that are not outliers are 0 and 6.
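
The fence calculation can be sketched in a few lines. One assumption is made explicit here: a value that falls exactly on a fence (x = 9 in this data set) is treated as a suspected outlier, which is the convention implied by the whisker endpoints 0 and 6 quoted above. The variable names are illustrative.

```python
# Sorted data and quartiles from Exercise 1, SCE 2D
data = [0, 2, 3, 3, 4, 5, 5, 5, 6, 9]
q1, q3 = 2.75, 5.25

iqr = q3 - q1
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

# Assumption: values on a fence count as suspected outliers (strict inequalities).
inside = [x for x in data if lower_fence < x < upper_fence]
suspected = [x for x in data if not (lower_fence < x < upper_fence)]
whisker_low, whisker_high = min(inside), max(inside)

print("fences:", lower_fence, upper_fence)
print("whiskers:", whisker_low, whisker_high)
print("suspected outliers:", suspected)
```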

4.

a. From Exercise 5a, SCE 2A, $\bar{x} = 219.96$ and $m = 215.5$. Using the sorted data in the solution to that
exercise, the positions of the lower and upper quartiles are:
$.25(n + 1) = .25(51) = 12.75$
$.75(n + 1) = .75(51) = 38.25$

Then
$Q_1 = 190 + .75(190 - 190) = 190$;
$Q_3 = 244 + .25(247 - 244) = 244.75$
and $IQR = Q_3 - Q_1 = 54.75$
Also,
$s = \sqrt{\dfrac{\sum x_i^2 - \frac{(\sum x_i)^2}{n}}{n - 1}} = \sqrt{\dfrac{2{,}478{,}896 - \frac{10998^2}{50}}{49}} = 34.93$.

b. Lower and Upper Fences:
$Q_1 - 1.5\,IQR = 190 - 1.5(54.75) = 107.875$
$Q_3 + 1.5\,IQR = 244.75 + 82.125 = 326.875$

There are no outliers; the whiskers are attached to the largest and smallest values, x = 157 and x = 293. The
distribution is slightly skewed to the right.

[Figure: boxplot of the benefit amounts. Horizontal axis: Dollars (150 to 300).]

c. $\bar{x} \pm s \Rightarrow 219.96 \pm 34.93$ or 185.03 to 254.89 contains 35/50 = .70 or 70% of the measurements.
$\bar{x} \pm 2s \Rightarrow 219.96 \pm 69.86$ or 150.10 to 289.82 contains 47/50 = .94 or 94% of the measurements.
$\bar{x} \pm 3s \Rightarrow 219.96 \pm 104.79$ or 115.17 to 324.75 contains 50/50 = 1.00 or 100% of the measurements.
Although the distribution is not quite mound-shaped, the proportions match quite well with those described
by the Empirical Rule.
