
Review for Examination I

This problem set contains review questions from chapters 1 and 2.


1.

Identify each of the following data as either qualitative or quantitative. In


the case of qualitative data, classify the data as being measured on a
nominal scale or an ordinal scale. In the case of quantitative data, classify
the data as being discrete or continuous.
a.

The birth weight (in ounces) of a randomly selected infant is


recorded.

b.

The eye color of a randomly selected person is recorded.

c.

The midterm grade (A, B, C, D, F) of a randomly selected person in


a statistics class is recorded.

d.

The marital status of a randomly selected person is recorded.

e.

The time required to run a mile is recorded for a randomly selected


Duquesne University football player.

f.

The number of credit cards owned by a randomly selected


Duquesne University student.

g.

The temperature of a cooling system at a randomly selected point


in time is recorded.

h.

The rating (excellent, good, fair, abysmal) of a randomly selected


professor at Duquesne University.

2.

Refer to exercise 1. State whether you would use a bar/column chart, pie
chart, stemplot and/or histogram to display the data.

3.

Researchers wish to estimate the true mean cholesterol level in


psychiatric patients that exhibit violent behavior. From a study of 500
such patients, the mean cholesterol level was found to be 208 mg/dL.

a.

Describe the population of this study.

b.

Define the variable.

c.

Define the parameter.

d.

What is the numeric value of the statistic?

e.

Explain the relation between the parameter and the statistic.

4.

Classify each of the following as being qualitative or quantitative. In the
case of a qualitative variable, tell whether it is nominal or ordinal. In the
case of a quantitative variable, tell whether it is discrete or continuous.
a.

The species of a randomly selected fish from Lake Arthur.

b.

The weight (in grams) of a randomly selected fish from Lake Arthur.

c.

The number of fish caught in one hour.

5.

Compute the mean, median, range, and standard deviation for the
following temperature data. Temperature is given in °F. Be sure to show
the details of your calculations. Finally, interpret (in practical language)
the standard deviation.
25.1°, 33.8°, 18.5°, 47.2°

6.

The heights (in inches) of a randomly selected group of 30 men are listed
below. Compute the lower inner fence and the upper inner fence.
Identify any outliers.
56 62 63 64 64
66 67 67 68 68
69 69 69 70 70
71 71 71 71 72
72 73 74 74 75
76 77 78 78 80

7.

Refer to the data set in the previous exercise. Describe the distribution in
terms of shape, center, spread, and potential outliers. Give the
five-number summary. Sketch a modified boxplot for this data.

8.

MULTIPLE CHOICE: Kayleigh's examination score is in the 90th percentile of
the distribution of exam scores. What does this mean? Select the best
response from those listed below.
(A)

Kayleigh's score on the examination was 90%.

(B)

Kayleigh completed 90% of the examination.

(C)

Kayleigh scored better than 90% of the other students who took the
exam.

(D)

Kayleigh scored worse than 90% of the other students who took the
exam.

9.

In a certain forest, it is estimated that the distribution of the height of pine


trees is mound shaped with a mean of 34 feet and a standard deviation of
3 feet. If 1000 trees are to be randomly selected, approximate the number
of trees that are expected to be over 40 feet tall.

10.

A public health official wants to know the true average number of cavities
in twelve-year-old boys. From a sample of 200 twelve-year-old boys, the
average number of cavities was found to be 2.3 cavities.
a.

Define the population.

b.

Define the variable.

c.

Define the parameter.

d.

What is the (numeric) value of the statistic?

e.

Briefly explain the relation between the parameter and the statistic.

11.

Compute the mean, median, and standard deviation for the following
lengths (in cm) of a certain species of plant. Interpret the standard
deviation.
23.1 cm, 34.8 cm, 19.5 cm, 48.2 cm, 46.1 cm

12.

A data set consists of the speeds (in miles per hour) of 100 randomly
selected automobiles traveling along an open stretch of I-79. Which of the
following numbers is a reasonable value for the standard deviation?
Explain your answer.
0.5 mph

5.0 mph

50.0 mph

500 mph

13.

The height of American males has a distribution that is (approximately)


mound shaped with a mean of 70 inches and a standard deviation of 3
inches. A door is constructed with a height of 76 inches. Approximately
what percent of the American male population will bonk their head when
passing under the door?

14.

A hospital patient is asked to supply information to an attendant. Classify


each of the following as being qualitative or quantitative. In the case of a
qualitative variable, tell whether it is nominal or ordinal. In the case of a
quantitative variable, tell whether it is discrete or continuous.

a.

The weight of the patient is recorded.

b.

The blood type (A, B, AB, O) of the patient is recorded.

c.

The patient is asked to indicate her level of discomfort on a scale
from 1 (minimal discomfort) to 10 (extreme discomfort).

15.

The principal of Bower Hill Elementary School wishes to estimate the true
average hours of weekly television viewing for all third graders in the
Bower Hill School. From a random sample of 45 third grade students, the
average number was 10.4 hours.
a.

Describe the population in this study.

b.

Describe the sample in this study.

c.

Define the parameter in this study.

d.

Describe the relation between the parameter and the value of 10.4
hours.

16.

In a metropolitan area, the concentrations of cadmium in leaf lettuce were


measured at six representative gardens where sewage sludge was used
for fertilizer. The measurements (in mg/kg) are 21, 38, 12, 16, 14 and 7.

a.

Compute the mean and the median of the sample data.

b.

Compute the standard deviation of the sample data. Interpret the
standard deviation.

17.

The length of human pregnancies from conception to birth varies


according to a mound shaped distribution with a mean of 266 days and a
standard deviation of 16 days. Approximate the percentage of women
whose pregnancy lasts for more than 250 days.

18.

The following data lists the time (in minutes) for the 20 men between the
ages of 40 and 44 who finished a local sprint-triathlon.
76, 78, 79, 80, 80, 82, 82, 82, 83, 84
84, 85, 86, 87, 88, 90, 92, 95, 107, 112
Compute the lower inner fence and the upper inner fence. Identify
any outliers.

19.
Refer to the data set in the previous exercise. Give the five number
summary. Describe the distribution in terms of shape, center, spread, and
potential outliers. Sketch a modified boxplot for this data.
20.

The weight of each man in a sample of 100 men was measured.
The mean weight was 185 pounds with a standard deviation of 10 pounds.
However, it was later discovered that the scale was not correct and
everyone was underweighed by 3 pounds. In other words, 3 pounds must
be added to everyone's weight. How does this change the mean and
standard deviation?

21.

A National Health Survey wishes to determine the true average serum


cholesterol level in men aged 18 to 24.

a.

Identify the population in this study.

b.

Identify the variable in this study.

c.

Define the parameter in this study.

22.

The following data gives the serum cholesterol level for a random sample
of six men aged 18 to 24. The units are mg/100 ml.
270 201 104 145 184 176
a.

Compute the sample mean.

b.

Compute the sample median.

c.

Compute the sample standard deviation. Interpret the standard
deviation.

23.

Consider the serum cholesterol level of a sample (n = 25) of overweight
men. Compute the lower inner fence and the upper inner fence. Identify
any outliers.
152 164 178 182 188
190 192 196 200 210
216 220 230 232 244
248 250 260 262 274
284 290 290 296 298

24.

Refer to the data set in the previous exercise. Give the five number
summary. Describe the distribution in terms of shape, center, spread, and
potential outliers. Sketch a modified boxplot for this data.

25.

In the general population, the serum cholesterol level in men aged 18 to


24 has a mound shaped distribution with a mean of 180 and a standard
deviation of 40. Use the Empirical Rule to approximate the percentage of
men with a serum cholesterol level below 260.

26.

The average height of five-year-old boys is approximately 43 inches with a


standard deviation of 1.5 inches. The average height of ten-year-old boys
is 138 cm with a standard deviation of 8 cm. Five-year-old Barry is 45
inches tall. His ten-year-old brother Gary is 148 cm tall. Which brother is
taller for his age?

27.
Class scores on a really, really hard examination have a mean of 35 and a
standard deviation of 8. The professor decides to scale the exam by doubling
everyone's score. What is the mean and standard deviation of these new
(scaled) scores?
28.

Class scores on a not-as-hard-as-the-last-exam have a mean of 70 and a


standard deviation of 12. The professor decides to scale the exam by
adding 10 points to everyone's score. What is the mean and standard
deviation of these new (scaled) scores?

29.

The President of a large university wants to know the true proportion of


students that use the current athletic facilities. In a sample of 250
students, 96 students use the university's athletic facilities.
a.

Define the population in this study.

b.

Define the response variable.

c.

Define the parameter.

d.
What is the (numeric) value of the statistic? Use the appropriate
notation.
e.

What is the relation between the parameter and the statistic?

30.

Identify whether each of the following is a statistic or whether it could be a


parameter.

In other words, do you think that the figure came from a

census or a sample?
a.

The average height of all Pittsburgh Steelers is 6.25 feet.

b.

The average height of all students at Duquesne University is 5.54


feet.

c.

The proportion of Duquesne University students that commute to


school is 0.45 (45%).

d.

The proportion of registered voters in Allegheny County that


participated in the last election is 0.832 (83.2%).

e.

The proportion of males in Greene County with blood type A is 0.42


(42%).

f.

For runners completing the 2012 New York Marathon, the average
completion time was 4 hours, 32 minutes.

31.

The trees in the south orchard have an average yield of 42 pounds of
apples with a standard deviation of 6 pounds. The trees in the north
orchard have an average yield of 54 pounds with a standard deviation of 8
pounds. Which tree had the worse yield relative to its own orchard (that
is, the yield farthest below the mean in standard deviation units): a tree
in the south orchard with a yield of 30 pounds or a tree in the north
orchard with a yield of 36 pounds?

32.

After taking 50 temperature readings, a scientist calculated the mean
temperature to be 30°C with a standard deviation of 5°C. He later
discovered that he needs these statistics in units of °F. Knowing the
conversion formula $F = \frac{9}{5}C + 32$, help the scientist convert the
mean and standard deviation into units of °F.

33.

What is a resistant statistic?


Of the mean and median, which is a
resistant statistic? Of the interquartile range and standard deviation,
which is a resistant statistic?

34.

In any data set, what percent of the data lies within the interquartile
range (that is, between Q1 and Q3)?

35.

Ponder this exercise. Select four numbers from among the numbers 2, 3,
5, 6. Of course, you are allowed to repeat numbers.

a.

Choose the four numbers so that the standard deviation is a
minimum.

b.

Choose the four numbers so that the standard deviation is a
maximum.

Answers
1a.

Birth weight is an example of a quantitative (continuous) variable.

1b.

Eye color is an example of a categorical (nominal) variable.

1c.

A grade is an example of a categorical (ordinal) variable.

1d.

Marital status is an example of a categorical (nominal) variable.

1e.

Time is an example of a quantitative (continuous) variable.

1f.

The number of credit cards is an example of a quantitative (discrete)


variable.

1g.

Temperature is an example of a quantitative (continuous) variable.

1h.

A rating is an example of a categorical (ordinal) variable.

2a.

Use a bar/column chart or a pie chart to display this data.

2b.

Use a bar/column chart or a pie chart to display this data.

2c.

Use a bar/column chart or a pie chart to display this data.

2d.

Use a bar/column chart or a pie chart to display this data.

2e.

Use a stemplot or a histogram to display this data.

2f.

Use a stemplot or a histogram to display this data.

2g.

Use a stemplot or a histogram to display this data.

2h.

Use a bar/column chart or a pie chart to display this data.

3a.

The population is all psychiatric patients that exhibit violent behavior.
(NOTE: The group of 500 patients is the sample, not the population.)

3b.

The variable is the cholesterol level of a randomly selected patient from
the population.

3c.

The parameter is the true mean cholesterol level in psychiatric patients
that exhibit violent behavior.

3d.

The numeric value of the statistic is 208 mg/dL.

3e.

The statistic estimates the (unknown) value of the parameter.

4a.

This is an example of a qualitative (nominal) variable.

4b.

This is an example of a quantitative (continuous) variable.

4c.

This is an example of a quantitative (discrete) variable.

5.

The mean is x̄ = 31.15 °F.
The median is md = 29.45 °F.
The range is 28.7 °F.
The standard deviation is s = 12.4 °F.
Interpretation:
Typically, the temperature varied by about 12.4° from the mean
temperature of 31.15 °F.
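
The arithmetic in answer 5 can be checked with a short script. This is a minimal sketch using Python's standard `statistics` module; the variable names are illustrative, and `stdev` is used on the assumption that the answer key reports the sample standard deviation (dividing by n - 1).

```python
from statistics import mean, median, stdev

# Temperature readings from exercise 5, in degrees Fahrenheit.
temps = [25.1, 33.8, 18.5, 47.2]

x_bar = mean(temps)                    # arithmetic mean
md = median(temps)                     # average of the two middle values (n is even)
data_range = max(temps) - min(temps)   # largest value minus smallest value
s = stdev(temps)                       # sample standard deviation (divides by n - 1)

print(f"mean   = {x_bar:.2f}")
print(f"median = {md:.2f}")
print(f"range  = {data_range:.1f}")
print(f"s      = {s:.1f}")
```

Swapping in the data from exercises 11, 16, or 22 gives a quick check of those answers as well.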

6.

The lower inner fence is 56.5 inches. The upper inner fence is 84.5
inches. The man with a height of 56 inches is an outlier since his height is
below the lower inner fence.
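
The fence calculation can be sketched in a few lines of Python. The helper name `inner_fences` is hypothetical, and the sketch assumes that Q1 and Q3 are taken as the medians of the lower and upper halves of the sorted data; other quartile conventions can give slightly different fences.

```python
from statistics import median

def inner_fences(data):
    """Return (lower_fence, upper_fence) as Q1 - 1.5*IQR and Q3 + 1.5*IQR."""
    xs = sorted(data)
    half = len(xs) // 2
    # Assumption: quartiles are the medians of the lower and upper halves,
    # leaving out the middle value when n is odd.
    q1 = median(xs[:half])
    q3 = median(xs[-half:])
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr

# Heights (in inches) from exercise 6.
heights = [56, 62, 63, 64, 64, 66, 67, 67, 68, 68,
           69, 69, 69, 70, 70, 71, 71, 71, 71, 72,
           72, 73, 74, 74, 75, 76, 77, 78, 78, 80]

low, high = inner_fences(heights)
outliers = [h for h in heights if h < low or h > high]
print(low, high, outliers)
```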

7.

Five number summary: Min = 56, Q1 = 67, median = 70.5, Q3 = 74,
Max = 80.
SHAPE: The shape of the distribution of heights is skewed left.
CENTER: The mean height is 70.2 inches and the median is 70.5 inches.
SPREAD: The range of the distribution is 24 inches; the standard
deviation is 5.31 inches; the interquartile range is 7 inches.
OUTLIERS: There is one outlier: the height of 56 inches.

8.

The correct answer is C.

9.

The approximate number of trees expected to be over 40 feet tall is 25.
(Tricky problem: 40 feet is 2 standard deviations above the mean, so about
2.5% of the trees are taller than that, and 2.5% of 1000 is 25.)
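
The Empirical Rule reasoning behind this answer can be written out as a small function. This is only a sketch: `fraction_above` and `WITHIN` are hypothetical names, the 68%, 95%, and 99.7% figures are the usual Empirical Rule values, and the function deliberately handles only cutoffs that sit a whole number of standard deviations from the mean.

```python
import math

# Share of a mound-shaped distribution within k standard deviations of the mean.
WITHIN = {1: 0.68, 2: 0.95, 3: 0.997}

def fraction_above(cutoff, mean, sd):
    """Approximate fraction of values above `cutoff` using the Empirical Rule.

    Only cutoffs that are 0, 1, 2, or 3 standard deviations from the mean
    are supported, since the rule gives no other values.
    """
    k = (cutoff - mean) / sd
    k_int = round(k)
    if not math.isclose(k, k_int) or abs(k_int) > 3:
        raise ValueError("cutoff must be 0, 1, 2, or 3 standard deviations from the mean")
    if k_int == 0:
        return 0.5
    tail = (1 - WITHIN[abs(k_int)]) / 2   # one tail beyond |k| standard deviations
    return tail if k_int > 0 else 1 - tail

share = fraction_above(40, mean=34, sd=3)   # exercise 9: trees over 40 feet
print(f"{share:.1%} of trees, about {round(1000 * share)} of 1000")
```

The same function covers the other mound-shaped exercises: `fraction_above(76, 70, 3)` for the door in exercise 13, `fraction_above(250, 266, 16)` for exercise 17, and `1 - fraction_above(260, 180, 40)` for the share below 260 in exercise 25.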

10a.

The population is all twelve-year-old boys.

10b.

The variable is the number of cavities in a randomly selected
twelve-year-old boy.

10c.

The parameter is the true average number of cavities in twelve-year-old
boys.

10d. The value of the statistic is 2.3 cavities.
10e. The statistic serves as an estimate of the unknown value of the
parameter.
11.

Mean: x̄ = 34.34 cm
Median: md = 34.8 cm
Standard Deviation: s = 13.01 cm
Typically, a plant's length differs from the mean length of 34.34 cm by
about 13.01 cm.
12.

A reasonable value for the standard deviation is 5.0 mph. It is reasonable


that most speeds along I-79 will fall within 2 standard deviations (10
mph) of the mean speed.

13.
Approximately 2.5% of American males have a height in excess of 76
inches.
14a. Weight is an example of a quantitative (continuous) variable.
14b. Blood type is an example of a qualitative (nominal) variable.
14c. Level of discomfort is an example of a qualitative (ordinal) variable.
15a. The population consists of all third graders in the Bower Hill Elementary
School.
15b. The sample consists of the 45 randomly selected third grade students.
15c. The parameter is the true average hours of weekly television viewing for
all third graders in the Bower Hill School.
15d. The value of 10.4 is a statistic and serves as an estimate of the
parameter.
16a.

The mean is x̄ = 18 mg/kg. The median is 15 mg/kg.

16b.

The standard deviation is s = 10.83 mg/kg.
Typically, the data varies by about 10.83 mg/kg from the mean of 18 mg/kg.

17.
Approximately 84% of women have pregnancies that last for more than
250 days.
18.

The lower inner fence is 69 min. The upper inner fence is 101 min. The
times of 107 min and 112 min are outliers since they are above the upper
inner fence.

19.

Five-number summary: min = 76, Q1 = 81, md = 84, Q3 = 89, max = 112.
SHAPE: The distribution of times is skewed to the right.
CENTER: The median time is 84 minutes; the mean time is 86.6 minutes.
SPREAD: The times span a range of 36 minutes with a standard
deviation of 9.17 minutes. The interquartile range is 8 minutes.
OUTLIERS: There are two outliers: 107 minutes and 112 minutes.
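
A five-number summary like the one above can be reproduced with a short helper. The name `five_number_summary` is hypothetical, and the sketch assumes the quartiles are the medians of the lower and upper halves of the sorted data. Comparing the mean with the median gives a rough numeric hint about skew, though it is no substitute for looking at the plot.

```python
from statistics import mean, median

def five_number_summary(data):
    """Return (min, Q1, median, Q3, max) for a list of numbers."""
    xs = sorted(data)
    half = len(xs) // 2
    # Assumption: Q1 and Q3 are the medians of the lower and upper halves,
    # leaving out the middle value when n is odd.
    return xs[0], median(xs[:half]), median(xs), median(xs[-half:]), xs[-1]

# Finishing times (in minutes) from exercise 18.
times = [76, 78, 79, 80, 80, 82, 82, 82, 83, 84,
         84, 85, 86, 87, 88, 90, 92, 95, 107, 112]

lo, q1, md, q3, hi = five_number_summary(times)
print("five-number summary:", lo, q1, md, q3, hi)
print("range =", hi - lo, "IQR =", q3 - q1)
# A mean noticeably above the median is consistent with a right skew.
print("mean =", round(mean(times), 1), "median =", md)
```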

20.

The revised mean is 188 pounds. The standard deviation remains 10


pounds. Adding 3 pounds to the data increases the mean by 3 pounds
but does not change the standard deviation.

21a.

The population is all men aged 18 to 24.

21b.

The variable is the serum cholesterol level for a randomly selected man
aged 18 to 24.

21c.

The parameter is the true average serum cholesterol level in men aged 18
to 24.

22a.

The mean is x̄ = 180 mg/100 ml.

22b.

The median is md = 180 mg/100 ml.

22c.

The standard deviation is s = 55.81 mg/100 ml. Typically, the data varies
by about 55.81 mg/100 ml from the sample mean of 180 mg/100 ml.

23.

With Q1 = 191, Q3 = 268 and an interquartile range of 77, the lower inner
fence is 191 - 1.5(77) = 75.5 and the upper inner fence is
268 + 1.5(77) = 383.5. There are no outliers in this data set.

24.
Five-number summary: min = 152, Q1 = 191, md = 230, Q3 = 268,
max = 298.
SHAPE: The distribution of cholesterol level is approximately normal.
CENTER: The mean level is 229.84 mg/100 ml; the median level is
230 mg/100 ml.
SPREAD: The levels span a range of 146 mg/100 ml with a standard deviation
of 44.2 mg/100 ml. The interquartile range is 77 mg/100 ml.
OUTLIERS: There are no outliers.


25.

Approximately 97.5% of the men have a serum cholesterol level below


260.

26.

The z-score for Barry is (45 - 43)/1.5 = 1.33. The z-score for Gary is
(148 - 138)/8 = 1.25. Barry is taller for his age than Gary.
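
Because the two boys are measured in different units and against different age groups, the comparison has to be made in standard deviation units. A minimal sketch of that calculation, with `z_score` as an illustrative helper name:

```python
def z_score(x, mean, sd):
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    return (x - mean) / sd

# Each boy is compared with his own age group, in that group's units.
z_barry = z_score(45, mean=43, sd=1.5)    # five-year-olds, inches
z_gary = z_score(148, mean=138, sd=8)     # ten-year-olds, centimetres

print(f"Barry: z = {z_barry:.2f}")
print(f"Gary:  z = {z_gary:.2f}")
print("Taller for his age:", "Barry" if z_barry > z_gary else "Gary")
```

The same helper applies to the orchard comparison in exercise 31, for example `z_score(30, 42, 6)` and `z_score(36, 54, 8)`.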

27.

Multiplying each data value by a constant changes both the mean and
standard deviation. The mean is now 70 and the standard deviation is
now 16.

28.

Adding a constant to a data set changes the mean of the data, but does
not change the spread. The mean is now 80 but the standard deviation
remains 12.
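
Both scaling answers follow one rule: if every value x is replaced by y = a*x + b, the mean becomes a*(mean) + b and the standard deviation becomes |a|*(sd). The sketch below encodes that rule in a hypothetical helper named `rescale` and then checks it against a small data set that is made up purely for illustration.

```python
import math
from statistics import mean, stdev

def rescale(mean_x, sd_x, a, b):
    """Mean and standard deviation of y = a*x + b, given those of x."""
    return a * mean_x + b, abs(a) * sd_x

print(rescale(35, 8, a=2, b=0))     # doubling every score
print(rescale(70, 12, a=1, b=10))   # adding 10 points to every score

# Check the rule on raw data. These five scores are invented for the example.
scores = [20, 30, 35, 40, 50]
doubled = [2 * x for x in scores]
shifted = [x + 10 for x in scores]
assert math.isclose(mean(doubled), 2 * mean(scores))
assert math.isclose(stdev(doubled), 2 * stdev(scores))
assert math.isclose(mean(shifted), mean(scores) + 10)
assert math.isclose(stdev(shifted), stdev(scores))
```

The weight correction in exercise 20 is the case a = 1, b = 3, and the Celsius to Fahrenheit conversion in exercise 32 is the case a = 9/5, b = 32.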

29a. The population consists of all students at the University.
29b. The response variable is a qualitative variable indicating whether or not a
student uses the current athletic facilities (yes/no).
29c. The parameter is the true proportion of students that use the current
athletic facilities.
29d. The statistic is p̂ = 96/250 = 0.384.
29e. The statistic serves as an estimate of the parameter.
30a.

This number is most likely based on a census of the population and is


therefore a parameter.
30b. This number is most likely based on a sample from the population and is
therefore a statistic.
30c. This number is most likely based on a census of the population and is
therefore a parameter.
30d. This number is most likely based on a census of the population and is
therefore a parameter.
30e. This number is most likely based on a sample from the population and is
therefore a statistic.
30f. This number is most likely based on a census of the population and is
therefore a parameter.
31.

The tree in the south orchard has a z-score of z = -2. The tree in the
north orchard has a z-score of z = -2.25. The yield of the tree in the north
orchard was further below the mean than the yield of the tree in the south
orchard. The tree in the north orchard had a worse yield than that of the
tree in the south orchard.

32.

The mean temperature is 86°F with a standard deviation of 9°F.

33.

A resistant statistic is a statistic that is not greatly affected by outliers. Of
the mean and median, the median is a resistant statistic. The mean is
not! Of the interquartile range and standard deviation, the interquartile
range is a resistant statistic. It is not greatly affected by outliers.
Since the calculation of the standard deviation involves the mean and
since the mean is not a resistant statistic, the standard deviation is also
NOT a resistant statistic.

34.

In any data set, 50% of the data lies within the interquartile range. (25%
of the data is less than Q1 and 25% of the data is greater than Q3.)

35.

I'm not telling!

The modified boxplots were created using the applet


http://www.alcula.com/calculators/statistics/box-plot/
