Study Guide For ECO 3411

Study Guide for Business Statistics and Applications--Soskin and Braun
ECO 3411 Quantitative Business Tools II
Copyright 2012
University of Central Florida
College of Business Administration
Authors:
Drs. Mark Soskin and Bradley Braun
Associate Professors of Economics
Not for resale or any other commercial use.

ALL INTELLECTUAL MATERIALS IN THIS TEXT ARE THE SOLE PROPERTY OF THE AUTHORS.

A. REVIEW QUESTIONS FOR CHAPTER 1
Answers for Chapter 1
B. REVIEW QUESTIONS FOR CHAPTER 2
Answers for Chapter 2
C. REVIEW QUESTIONS FOR CHAPTER 3
Answers for Chapter 3
D. REVIEW QUESTIONS FOR CHAPTER 4
E.
F.
G.
H.
Answers for Chapter 8
I.
Answers for Chapter 9
J. REVIEW QUESTIONS FOR CHAPTER 10
Answers for Chapter 10
K. REVIEW QUESTIONS FOR CHAPTER 11
Answers for Chapter 11
L. REVIEW QUESTIONS FOR CHAPTER 12
Answers for Chapter 12
A. Review Questions for Chapter 1

1.1 _______ is defined as the total collection of items under consideration.
a. Population
b. Variable
c. Data
d. Statistics
1.2 Statistics is an important tool because:
a. Statistics helps us to tell the data's story.
b. People rarely deal with raw data.
c. It can be used to unlock new sources of economic value.
d. All the above are true.
1.3 The key steps for statistical decision making include identifying the population
and variables, locating the data, analyzing the data, and generating a report.
Usually the most difficult step is
a. Identifying the relevant population and variables.
b. Locating the data
c. Analyzing the data.
d. Generating a report.
1.4 Which is an ethical violation by a statistician or manager?
a. Withholding information or distorting results to support a preferred
conclusion
b. Selling personal information to the highest bidder
c. Having strong security to prevent hackers from stealing data
d. Making profits from other people's personal information
e. all the above
Answers for Chapter 1

1.1: a, 1.2: d, 1.3: b, 1.4: a.
B. Review Questions for Chapter 2

2.12 Which of the following methods of displaying the data would be best for a
data set with 450 observations?
a. a sorted listing
b. a listing of descriptive summary statistics
c. a histogram
d. all of the above are equally informative
e. it depends on the analytical problem at hand
2.13 A histogram presents quantitative data as
a. a table of frequencies
b. a pie chart of relative frequencies
c. a bar graph of class frequencies
d. a sorted listing
e. a listing of descriptive statistics
2.14 The modal class is
a. the class with the largest observations
b. the class with the most observations
c. the class with the widest interval width
d. the most frequently occurring observation
e. the class containing the mode
2.44 Bivariate data are data gathered on
a. one variable for each observation
b. two variables for each observation
c. at least one variable for each observation
d. at least two variables for each observation
e. none of the above
2.45 A scatterplot is a
a. two dimensional plot of data
b. frequency distribution
c. histogram
d. trend line
e. random array of points
2.46 An index
a. summarizes a group of related variables
b. is the average of several variables
c. is often used to represent overall movements in stock prices
d. is often used to represent overall movements in consumer prices
e. all of the above
2.48 Which of the following is a guideline for constructing time series graphs?
a. place time on the vertical axis
b. always draw a trend line rather than connect the plotted points
c. beware of graphs that omit the most recently available data
d. compare each time series observation with its cross section counterpart
e. graph two (or more) time series variables on the same graph to compare
them

Answers for Chapter 2

2.12: c, 2.13: c, 2.14: b, 2.44: b, 2.45: a, 2.46: e, 2.48: c.
C. Review Questions for Chapter 3

3.1 The center of the distribution for salaries (measured in thousands of dollars /
year) 10, 20, 20, 20, 30, 100, 500 is
a. 100
b. 20
c. the mean.
d. the median.
e. both b and d.
3.2 For 120 private colleges, the mean number of accounting majors is 84, the
median is 59, the mode is 64. In the histogram of accounting majors at the
120 colleges, the modal class
a. has a class midpoint of 59 accounting majors
b. has a class midpoint of 64 accounting majors
c. has a class midpoint of 84 accounting majors
d. will not occur because this is not quantitative data
e. cannot be found from information given in the problem
3.3 A housing tract builder needs to decide on the number of bedrooms to put
into its most commonly-built homes. Which measure of the average should
the builder determine?
a. the mean number of bedrooms for new home sales
b. the median number of bedrooms for new home sales
c. the mode for bedrooms of new home sales
d. the mean family size of new home buyers
3.4 In calculating the standard deviation which operation is not involved?
a. summation over the number of observations
b. squaring differences from the mean
c. averaging over the number of observations
d. taking the square root
e. all the above operations are involved
3.5 The median is

a. the arithmetic average
b. the most frequently occurring value in the data set
c. the 50th percentile
d. the middle value if there is an even number of observations
e. all of the above
3.6 For which of the following are the mean, median, and mode identical?
a. 1 3 3
b. 1 2 2 4
c. 1 1 4 4 4 8 8
d. 1 10 10 19
e. 1 5 5 5 10
3.7 The median for the salaries (measured in thousands of dollars / year) 10, 20,
20, 20, 30, 100, 500 is
a. 10
b. 20
c. 30
d. 65
e. 100
3.8 The mode for the salaries (measured in thousands of dollars / year) 10, 20,
20, 20, 30, 100, 500 is
a. 10
b. 20
c. 30
d. 65
e. 100
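The three measures of center asked about in questions 3.1, 3.7, and 3.8 can be checked with Python's standard `statistics` module. This is only an illustrative sketch; the variable names are not from the guide.

```python
import statistics

# Salaries in thousands of dollars per year, as listed in questions 3.1, 3.7, and 3.8.
salaries = [10, 20, 20, 20, 30, 100, 500]

mean_salary = statistics.mean(salaries)      # arithmetic average
median_salary = statistics.median(salaries)  # middle value of the sorted data
mode_salary = statistics.mode(salaries)      # most frequently occurring value

print("mean:", mean_salary)
print("median:", median_salary)
print("mode:", mode_salary)
```

The mean is pulled far above the median and mode by the two largest salaries, which is why the median is usually the better description of the center for skewed data like these.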
3.9 If mean and median salaries are very similar at a firm, then you should
conclude that
a. most employees make about the same salary
b. if some salaries are far below the mean, other salaries are considerably
above the mean
c. no salaries are far from the mean
d. some salaries may be far above the mean, but none can be far below it
e. the similarity between mean and median tells us nothing about the
distribution of the data
3.10 Given the following means, for which would you also want to know the
dispersion?
a. choosing among three mutual funds which averaged 15, 12, and 8 percent
return
b. hiring a new engineer from among graduates with 3.3, 3.1, and 3.0 grade
point averages
c. attempting to make vacation reservations at one of three resort areas with
average annual occupancy rates of 60, 70, and 80 percent
d. all of the above
3.11 If the mean weight of 50 parts in a shipment is 2.2 pounds and the median
weight is 0.8 pounds, the total weight of the shipment is
a. 40 pounds
b. 50 pounds
c. 110 pounds
d. 220 pounds
e. cannot be determined from the information provided
3.12 If the mean weight of 50 parts in a shipment is 2.2 pounds and the median
weight is 0.8 pounds, then
a. half the parts weigh more than 0.8 pounds
b. 26 parts weigh less than 0.8 pounds
c. 25 parts weigh between 0.8 and 2.2 pounds
d. all of the above
3.13 For the data set 5 4 3 2 1 0

a. the mean and median are the same
b. the mean has a value that does not occur in the data
c. the median has a value that does not occur in the data
d. all of the above
3.65 Which of the following is not a measure of dispersion?
a. standard deviation
b. range
c. maximum
d. mean absolute difference
e. all of the above are measures of dispersion
3.66 The problem with using the range to determine dispersion is that the range
does not take account of
a. the smallest value
b. the largest value
c. the difference between the largest and smallest value
d. values other than the largest and smallest
3.67 In calculating the standard deviation which operation is not involved?
a. summation over the number of observations
b. squaring differences from the mean
c. averaging over the number of observations
d. obtaining the square root
e. all the above operations are involved
3.80 Random samples

a. must be collected from lottery machines
b. are the results of simulations
c. have the same chance of occurring as any other sample of the same size
d. require a computer to collect
3.81 Which of the following is an example of random sampling?

a. people attending a shopping mall on a weekday afternoon
b. respondents to an evening phone survey
c. a mailed questionnaire sent to people with driver's licenses
d. phone response to a 900 number announced during the network evening
news
e. drawing 10 names from a well-mixed hat containing names of everyone in
the population
3.82 A description of a variable in the population is a(n)
a. estimator
b. estimate
c. parameter
d. sample
3.83 Statistical inference is the process of
a. describing a population from population data
b. describing a sample using sample data
c. making guesses about sample estimates by using population parameters
d. making guesses about population parameters by using sample estimates
e. all of the above
3.84 ____ is the parameter but ____ is an estimator of the mean
a. $\bar{X}$ and $\mu$
b. $\bar{X}$ and M
c. m and $\bar{X}$
d. $\mu$ and $\bar{X}$
3.85 If the mean for a very large number of samples is equal to the parameter
being estimated, the estimator is
a. unbiased
b. a sample
c. not an outlier
d. random
3.86 An outlier is defined as
a. an observation considerably larger than any other in the sample
b. an observation considerably smaller than any other in the sample
c. an observation considerably larger or smaller than any other in the sample
d. observations that are measured incorrectly
e. observations that should be deleted from the sample
3.87 Which of the following conclusions does not involve potentially large
sampling bias?
a. poverty cannot be difficult to overcome. All the people I know that came
from the ghetto are doing well financially now.
b. I can't see why our product isn't selling. Our most trusted customers tell
me we have the best product on the market.
c. I knew I wanted to become a computer programmer. I loved computers
playing around with the one at home for years.
d. I took art courses in school, so I know that I am not artistically inclined.
e. All of the above involve sampling bias.
3.88 To ensure that s has the same units and scale as the variable in the data,
a. we divide by n - 1
b. we square the differences from the mean
c. we sum all the squared differences
d. we take the square root of the final mean sum of squares
3.89 To adjust for the lost degree of freedom in the sample estimator s
a. we divide by n - 1
b. we square the differences from the mean
c. we sum all the squared differences
d. we take the square root of the final mean sum of squares
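The operations named in questions 3.88 and 3.89 can be traced one at a time. The sketch below uses a small hypothetical sample (not data from the guide) and compares the step-by-step result with `statistics.stdev` from the standard library.

```python
import math
import statistics

# Hypothetical sample used only for illustration; it does not come from the guide.
data = [2, 4, 4, 4, 5, 5, 7, 9]

n = len(data)
mean = sum(data) / n

# Step 1: square each difference from the mean.
squared_diffs = [(x - mean) ** 2 for x in data]

# Step 2: sum the squared differences.
sum_sq = sum(squared_diffs)

# Step 3: divide by n - 1 to adjust for the lost degree of freedom.
variance = sum_sq / (n - 1)

# Step 4: take the square root so s has the same units as the data.
s = math.sqrt(variance)

print("s computed by hand:", s)
print("statistics.stdev:  ", statistics.stdev(data))
```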
3.90 For a utility company responding to the 200 service outages last month,
median repair time was 45 minutes, the mean was one and one-half hours,
the minimum was 15 minutes, the maximum was 1 day, and the standard
deviation was 3 hours. The range of service outage times last month was
a. 23.85 hours
b. 23.75 hours
c. 23.55 hours
d. 23.25 hours
e. 22.50 hours
3.91. The list price of new mid- and full-sized car models was surveyed and the
summary statistics reported below:
Variable      N    Mean  Median  TrMean  StDev  SEMean
PRICE        45   22795   20389   22620   6486     967

Variable    Min     Max      Q1      Q3
PRICE     13206   35553   17091   27357
Answer the following four questions regarding the data described above:
1. The median price, $20,389, must be an actual car price in the data set
because:
a. there is an odd number of car models in the data set
b. the average of the minimum and maximum is the median
c. the median and mode always are actual observed values, but the mean
may not be
d. the median is less than the mean
e. the median does not have to be an actual car price in this data set
2. $22,347 is
a. The interquartile range
b. The range
c. The standard deviation
d. The mode
e. None of the above
3. If the distribution of prices is approximately bell-shaped, then we would
expect about 95% of the prices to lie between:
a. $16,309 and $29,281
b. $12,972 and $33,361
c. $9823 and $35,767
d. $17,091 and $27,357
4. If a car dealer were to stock one of each of the 45 car models, the total
value of its inventory would be approximately:
a. $600,000
b. $750,000
c. $900,000
d. $1 million
e. Cannot be calculated from the information given
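Several quantities in the four questions above follow directly from the printed summary statistics. The short sketch below shows the arithmetic, using the values from the PRICE printout; the variable names are illustrative.

```python
# Summary statistics copied from the PRICE printout in question 3.91.
n, mean, stdev = 45, 22795, 6486
minimum, maximum = 13206, 35553
q1, q3 = 17091, 27357

value_range = maximum - minimum         # largest minus smallest price
interquartile_range = q3 - q1           # spread of the middle half of the data
two_sd_low = mean - 2 * stdev           # lower limit of the roughly-95% interval
two_sd_high = mean + 2 * stdev          # upper limit of the roughly-95% interval
total_value = n * mean                  # the mean times n recovers the sum

print("range:", value_range)
print("interquartile range:", interquartile_range)
print("mean +/- 2 sd:", two_sd_low, "to", two_sd_high)
print("total of all 45 prices:", total_value)
```

The two-standard-deviation interval is only a rule of thumb and relies on the bell-shape assumption stated in the question.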
3.92. In the 1993 NBA college draft, the 54 players drafted signed for annual
salaries (in thousands of dollars) given in the following sorted data listing,
summary statistics, and histogram:
SALARY
125 125 125 125 125 145 160 170 175 200 220
245 275 300 305 325 365 380 400 430 475 500
550 575 600 625 660 700 745 775 800 825 865
900 950 1000 1100 1250 1300 1430 1500 1600 1700 1780
1875 1945 2000 2100 2200 2260 2335 2350 2400 2500
Variable      N    Mean  Median  TrMean  StDev  SEMean
SALARY       54     924     680     881    752     102

Variable    Min     Max      Q1      Q3
SALARY      125    2500     294    1525
[Histogram of SALARY: Frequency on the vertical axis (ticks at 10 and 20); SALARY
on the horizontal axis, with ticks at 100, 400, 700, 1000, 1300, 1600, 1900,
2200, and 2500]

Answer the following six questions based on this information:

1. The median annual salary is
a. the average of the 27th and 28th highest salary
b. half-way between $660,000 and $700,000
c. not an actual observation in the data set
d. all of the above are true
e. none of the above are true
2. The mode for this data set is
a. $125,000
b. $250,000
c. $680,000
d. $924,000
3. The modal class interval is
a. $100,000 to $400,000
b. $125,000 to $2,500,000
c. $294,000 to $1,525,000
d. $924,000 to $2,628,000
4. If a players' union representative stated that one-third of the first-year
players drafted earned less than $400,000 per year, he would be reporting
information about the
a. mean
b. median
c. mode
d. modal class
e. minimum
5. The fiftieth percentile for annual salary of NBA rookies was
a. more than a million dollars
b. just over $900,000
c. about half-a-million dollars
d. under $700,000
e. cannot be determined from the information given
6. By only examining the histogram for the salary data, we can immediately
conclude that
a. The mean will be substantially larger than the median
b. There are outliers in the data
c. Salaries do not have a bell-shaped distribution
d. The standard deviation will be relatively large
e. All of the above
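The class frequencies behind the salary histogram can be rebuilt from the sorted listing. The sketch below assumes class boundaries running from 100 to 2500 in steps of 300 (taken from the axis ticks) and assumes each class includes its lower boundary; both are assumptions about how the original histogram was drawn.

```python
import statistics

# The 54 salaries (in thousands of dollars) from the sorted listing in question 3.92.
salaries = [
    125, 125, 125, 125, 125, 145, 160, 170, 175, 200, 220,
    245, 275, 300, 305, 325, 365, 380, 400, 430, 475, 500,
    550, 575, 600, 625, 660, 700, 745, 775, 800, 825, 865,
    900, 950, 1000, 1100, 1250, 1300, 1430, 1500, 1600, 1700, 1780,
    1875, 1945, 2000, 2100, 2200, 2260, 2335, 2350, 2400, 2500,
]

# Assumed class boundaries: 100, 400, 700, ..., 2500 (eight classes of width 300).
edges = list(range(100, 2501, 300))
counts = [0] * (len(edges) - 1)
for x in salaries:
    # Each class includes its lower boundary; the top class also includes 2500.
    index = min((x - edges[0]) // 300, len(counts) - 1)
    counts[index] += 1

modal_index = counts.index(max(counts))
modal_class = (edges[modal_index], edges[modal_index + 1])

median_salary = statistics.median(salaries)  # average of the 27th and 28th values
mode_salary = statistics.mode(salaries)      # most frequent single value
share_below_400 = sum(1 for x in salaries if x < 400) / len(salaries)

print("class counts:", counts)
print("modal class:", modal_class)
print("median:", median_salary, "mode:", mode_salary)
print("share below 400:", share_below_400)
```

Note that the mode (a single value) and the modal class (an interval) answer different questions, which is the distinction several of the items above turn on.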
3.93. A regional supervisor for Coors has the price per 6-pack of its beer surveyed
from the 25 supermarket chains in Los Angeles. Describe in a couple of
sentences the average and variability of beer prices from the following
computer output of descriptive statistics:
Variable     N    MEAN  MEDIAN  TRMEAN   STDEV  SEMEAN
coors       25  3.6620  3.6500  3.6548  0.2446  0.0487

Variable      MIN     MAX      Q1      Q3
coors      3.2900  4.2900  3.4900  3.8200
3.94 New television shows are notoriously risky ventures. Nielsen ratings are the
main variable used to assess the success of a show and the rates that can be
charged advertisers. A network programmer would like you to summarize
the Nielsen ratings for the 1993-94 season crop of new shows. You obtain
the following computer printout:
Data Display
NIELSEN
 4.8   5.0   5.3   5.5   5.8   6.1   7.0   7.2   7.4   7.6   7.6
 7.8   7.9   8.0   8.3   8.3   8.5   9.0   9.1   9.1   9.2   9.4
 9.5  10.2  10.3  10.4  10.4  10.5  10.6  10.7  10.8  10.9  11.3
11.3  11.4  11.4  11.9  12.0  12.0  12.0  12.1  12.2  12.2  12.2
12.7  12.7  13.4  13.6  14.0  14.1  14.3  14.5  14.8  15.1  15.2
15.3  15.7  15.9  17.4  17.6  17.8  19.3  20.5

Descriptive Statistics
Variable     N    MEAN  MEDIAN  TRMEAN  STDEV  SEMEAN
NIELSEN     63  11.176  10.900  11.077  3.586   0.452

Variable     MIN     MAX     Q1      Q3
NIELSEN    4.800  20.500  8.300  13.600
a) Based on the DESCRIBE information, locate and mark the mean and median
Nielsen rating on the sorted data listing above. [No explanation please]
b) Is the median an actual value in the data? Why or why not? In two sentences,
explain carefully how the median was calculated from this data set.
c) By examining the Nielsen data printed above, explain why the mean is not very
different from the median.
d) What is the RANGE for the Nielsen data (single number answer)?
e) Calculate (from the DESCRIBE output above) an interval two standard deviations
on either side of the mean. Are about 95 percent of the Nielsen ratings
within this interval? Which ratings (if any) are above and below these limits?
f) Write a one-sentence report summarizing to the network programmer your
findings about averages and dispersion of new program Nielsen ratings.
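Parts (b), (d), and (e) can be checked directly from the sorted listing. The sketch below enters the 63 ratings as read from the printout and uses the standard `statistics` module; `statistics.stdev` divides by n - 1, which is assumed to match the STDEV column of the printout.

```python
import statistics

# The 63 Nielsen ratings from the sorted data listing in question 3.94.
ratings = [
    4.8, 5.0, 5.3, 5.5, 5.8, 6.1, 7.0, 7.2, 7.4, 7.6, 7.6,
    7.8, 7.9, 8.0, 8.3, 8.3, 8.5, 9.0, 9.1, 9.1, 9.2, 9.4,
    9.5, 10.2, 10.3, 10.4, 10.4, 10.5, 10.6, 10.7, 10.8, 10.9, 11.3,
    11.3, 11.4, 11.4, 11.9, 12.0, 12.0, 12.0, 12.1, 12.2, 12.2, 12.2,
    12.7, 12.7, 13.4, 13.6, 14.0, 14.1, 14.3, 14.5, 14.8, 15.1, 15.2,
    15.3, 15.7, 15.9, 17.4, 17.6, 17.8, 19.3, 20.5,
]

n = len(ratings)
mean = statistics.mean(ratings)
median = statistics.median(ratings)   # with n odd, the (n + 1) / 2-th sorted value
s = statistics.stdev(ratings)         # sample standard deviation (divides by n - 1)
value_range = max(ratings) - min(ratings)

# Interval two standard deviations on either side of the mean.
low, high = mean - 2 * s, mean + 2 * s
inside = [r for r in ratings if low <= r <= high]
outside = [r for r in ratings if r < low or r > high]

print("n:", n, "mean:", round(mean, 3), "median:", median, "s:", round(s, 3))
print("range:", round(value_range, 1))
print("interval:", round(low, 2), "to", round(high, 2))
print("inside:", len(inside), "outside:", outside)
```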

Answers for Chapter 3

3.1: e, 3.2: e, 3.3: c, 3.4: e, 3.5: c, 3.6: d, 3.7: b, 3.8: b, 3.9: b, 3.10: d, 3.11: c, 3.12: e, 3.13: d,
3.65: c, 3.66: d, 3.67: e, 3.80: c, 3.81: e, 3.82: c, 3.83: d, 3.84: d, 3.85: a, 3.86: c, 3.87: e, 3.88: d,
3.89: a, 3.90: b, 3.91.1: a, 3.91.2: b, 3.91.3: c, 3.91.4: d, 3.92.1: d, 3.92.2: a, 3.92.3: a, 3.92.4: d,
3.92.5: d, 3.92.6: c.
3.93: Average beer price in L.A. supermarkets is about $3.65 a six-pack, but
prices range from as low as $3.29 to as high as $4.29.
3.94: (a) Circle 10.9 for the median and a point between 10.9 and 11.3 on the
data listing as the mean; (b) Yes, because n is odd. The (n + 1) / 2-th largest
(or smallest) observation from the sorted listing is the median. Since
(63 + 1) / 2 is 32, the median must be the 32nd largest rating, or 10.9; (c) No
extremely large or small ratings occur, nor is there an overabundance of larger
ratings not offset by about the same number of lower ratings; (d) Range = Max -
Min = 20.5 - 4.8 = 15.7; (e) Interval two standard deviations on either side of
the mean = ($\bar{X} - 2s$, $\bar{X} + 2s$) = (11.176 - 2(3.586), 11.176 + 2(3.586))
= (4.00, 18.35). Of the n = 63 ratings observed, 61 are within this interval,
about 97% and close to 95%. None of the show ratings were below the interval,
but two were above (19.3 and 20.5); (f) Nielsen ratings for 1993-94 new shows
average approximately 11, but ratings ranged from slightly less than 5 to one
show that exceeded 20.
D. Review Questions for Chapter 4

4.1 By knowing only the trend line, you know all except which one of the
following things about time series data?
a. fit around the trend
b. steepness of the trend
c. location of the trend
d. the average rate of increase of the time series data for each time period
e. all of the above can be determined from the trend line
4.2 The trend line for inventories, fitted from 20 consecutive quarters
(1 through 20), is given by the equation
predicted inventory = 50 + 150 Quarter
Answer the following three questions based on this trend line:
(1) The prediction for quarter 21 would be
a. forecasting
b. extrapolation
c. 3200
d. all of the above
(2) You can predict inventory levels for quarter number 5 to be
a. 700
b. 800
c. 900
d. 1000
e. cannot be predicted from the information given
(3) Inventory levels on average
a. rise 600 per year
b. rise 200 per quarter
c. rise 150 per month
d. fall 200 per quarter

4.6 The problem with trend line regressions is that
a. eventually all trends must end
b. there may not be any trend upward or downward
c. they can only apply to time series data
d. they do not explain why a trend occurs
e. all of the above
4.7 Based on the trend line for sales of a fast growing company given by the
equation
predicted sales = 2500 + 500 Year
the fitted value of sales for Year 7 is
a. 2500
b. 3500
c. 5000
d. 6000
4.8 Based on the trend line for sales given by the equation
predicted sales = 3000 + 600 Year
sales on average
a. rise 50 per month
b. rise 150 per quarter
c. rise 600 per year
d. all of the above
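The trend-line questions above all rest on the same arithmetic: a fitted value is the intercept plus the slope times the time period, and the slope can be restated for other time units. A minimal sketch, with a helper name (`trend_prediction`) that is illustrative rather than from the guide:

```python
def trend_prediction(intercept, slope, period):
    """Fitted or forecast value from a linear trend line."""
    return intercept + slope * period

# Inventory trend line: predicted inventory = 50 + 150 Quarter
inventory_q5 = trend_prediction(50, 150, 5)     # within the fitted quarters
inventory_q21 = trend_prediction(50, 150, 21)   # one quarter beyond the data (extrapolation)
inventory_rise_per_year = 150 * 4               # quarterly slope restated per year

# Sales trend line from question 4.7: predicted sales = 2500 + 500 Year
sales_year7 = trend_prediction(2500, 500, 7)

# Sales trend line from question 4.8: slope of 600 per year, restated.
rise_per_quarter = 600 / 4
rise_per_month = 600 / 12

print(inventory_q5, inventory_q21, inventory_rise_per_year)
print(sales_year7, rise_per_quarter, rise_per_month)
```

Converting the slope between months, quarters, and years only works this simply because the trend is linear.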
4.9 A simple regression equation can always be used to predict a dependent
variable if
a. the explanatory variable is positive
b. the explanatory variable lies within the range of data values
c. the explanatory variable has a realistic value
d. the predicted value of the dependent variable is realistic
4.10 The intercept in a simple regression equation may always be interpreted as
a. the change in the dependent variable for zero change in the explanatory
variable
b. the value of the dependent variable when the explanatory variable is zero
c. the value we add to b1 X to predict in-range values of the dependent
variable
d. the mean of the dependent variable
e. all of the above
4.29 If SSE = 200 and SST = 500, then $R^2$ is equal to
a. 60 percent
b. 40 percent
c. 30 percent
d. 20 percent
4.30 The least-squares line has which of the following properties?
a. minimizes the sum of errors
b. passes through the most points possible
c. passes through the origin
d. minimizes the SSE
e. all of the above
4.31 To determine the standard error of the estimate for simple regression, we
divide by n - 2 in the denominator prior to taking the square root because
a. we lose two pieces of information to first determine the error
b. we lose two degrees of freedom
c. we must first estimate the slope and intercept of the fitted equation
d. all of the above
4.32 If SSE = 35 and SST = 140, then $R^2$ is equal to
a. 75 percent
b. 60 percent
c. 35 percent
d. 25 percent
4.33 If $R^2$ is zero for a regression, then
a. the intercept for the regression equation must be zero
b. the slope for the regression equation must be zero
c. all the fitted values of the dependent variable must be zero
d. all of the above
4.34 Calculate $R^2$ and SEE from the following information:
(a) SSE = 120, SST = 1000, n = 25
(b) SSE = 8, SST = 24, n = 33
(c) SSE = 60, SST = 64, n = 65
(d) SSE = 0.4, SST = 1, n = 41
(e) SSE = 44,100, SST = 250,000, n = 22
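Both measures of fit come from the sums of squares: the coefficient of determination is one minus SSE over SST, and the standard error of the estimate divides SSE by the error degrees of freedom before taking the square root. The sketch below is one way to set this up; the function names are illustrative, and `k` (the number of explanatory variables) defaults to 1 for simple regression.

```python
import math

def r_squared(sse, sst):
    """Share of total variation explained: 1 - SSE / SST."""
    return 1 - sse / sst

def standard_error_of_estimate(sse, n, k=1):
    """Square root of SSE divided by its degrees of freedom, n - k - 1."""
    return math.sqrt(sse / (n - k - 1))

# Worked example using part (a): SSE = 120, SST = 1000, n = 25.
r2_a = r_squared(120, 1000)
see_a = standard_error_of_estimate(120, 25)

print("R-squared:", round(r2_a, 3))
print("SEE:", round(see_a, 3))

# The same function applied to questions 4.29 and 4.32.
print(round(r_squared(200, 500), 2), round(r_squared(35, 140), 2))
```

The remaining parts of 4.34 follow by swapping in their own SSE, SST, and n.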
4.35 Monthly data on short term interest rates for commercial paper from
November 1989 through August 1993 were examined by a financial analyst,
using the following variables:
USA Int   U.S. short term interest rates (in percent) in month t
Month     n = 46 months: month = 1, 2, ..., 46
The regression equation is
USA Int = 8.95 - 0.146 Month
[other output not relevant is omitted here]
s = 0.4899    R-sq = 94.2%    R-sq(adj) = 94.1%
Analysis of Variance
SOURCE        DF      SS      MS       F      p
Regression     1  171.69  171.69  715.35  0.000
Error         44   10.56    0.24
Total         45  182.25
(1) The monthly trend rate in short term interest may be described by: Interest
rates (decreased / increased) at an average rate of ______ percentage
points per month.
(2) Assuming past trends continued, forecast interest rates for December 1993
(Month = 50): ______
(3) What percent of variation in interest rates is explained by the regression?
______ %. This percentage may be verified by subtracting from 1 the following
ratio: ______.
(4) The standard error of the estimate, 0.4899, is the square root of which
number? ______
(5) To capture the actual interest rate outcome about 95% of the time, predictions
based on the fitted equation may report a margin of error of roughly plus or
minus ______ (please round).
(6) The correlation is negative because of the negative sign of the ______.
The correlation coefficient between interest rate and Month is - 0.______.
[show work]
Answers?
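The relationships among the numbers on the interest-rate printout can be verified with a few lines of arithmetic. The sketch below uses only values printed in the output above; treating two standard errors as the rough 95% margin is the rule of thumb the question appears to have in mind, stated here as an assumption.

```python
import math

# Values copied from the regression printout in question 4.35.
intercept, slope = 8.95, -0.146
sse, sst, n = 10.56, 182.25, 46

r2 = 1 - sse / sst                        # should agree with R-sq = 94.2%
mse = sse / (n - 2)                       # error mean square, the MS entry for Error
s = math.sqrt(mse)                        # standard error of the estimate
forecast_month_50 = intercept + slope * 50
margin = 2 * s                            # rough 95% margin (rule-of-thumb assumption)
r = math.copysign(math.sqrt(r2), slope)   # correlation takes the sign of the slope

print("R-squared:", round(r2, 3))
print("MSE:", round(mse, 2), "s:", round(s, 4))
print("forecast for Month 50:", round(forecast_month_50, 2))
print("margin of error:", round(margin, 1))
print("correlation:", round(r, 2))
```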
4.43 The mean cross-product deviation for bivariate data is the
a. correlation
b. coefficient of determination
c. standard error of the estimate
d. covariance
4.44 Which of the following is a unitless measure ranging from -1 to +1?
a. correlation
b. coefficient of determination
c. standard error of the estimate
d. covariance
e. all of the above
4.45 Which of the following is not a difference between regression and
correlation?
a. correlation requires us to first plot the data
b. regression requires that we designate a dependent variable
c. regression allows us to make predictions
d. correlation is merely a measure of association
e. correlation is more appropriate for exploring a new area of inquiry
4.46 If $r_{X,Z} = 0$ then we say that X and Z are
a. uncorrelated
b. perfectly correlated
c. negatively correlated
d. both a and b
e. both a and c
4.47 If the correlation between two variables is 0.2, then $R^2$ for the simple
regression is
a. 0.40
b. 0.20
c. 0.04
d. 0.02
e. cannot be calculated with the information given
4.48 If a simple regression has an $R^2$ of 49 percent, then the correlation between
the two variables is
a. 0.2401
b. 0.49
c. 0.07
d. 0.70
e. cannot be calculated from the information given
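For simple regression (one explanatory variable), the coefficient of determination is the square of the correlation coefficient, so either can be recovered from the other once the sign of the slope is known. A brief sketch with illustrative function names:

```python
import math

def r2_from_correlation(r):
    """Coefficient of determination for a simple regression, given r."""
    return r ** 2

def correlation_from_r2(r2, slope_sign):
    """Correlation from R-squared; the sign must be supplied from the slope."""
    return math.copysign(math.sqrt(r2), slope_sign)

print(round(r2_from_correlation(0.2), 4))        # correlation of 0.2
print(round(correlation_from_r2(0.49, +1), 4))   # R-squared of 49 percent, positive slope
print(round(correlation_from_r2(0.49, -1), 4))   # same fit, negative slope
```

The square root alone cannot tell a positive relationship from a negative one, which is why the slope sign is passed in separately. This shortcut does not carry over to multiple regression.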
Answer the next four questions based on the following statistical output:
The primary business of a newspaper is to sell readers to advertisers. Data are
collected for a Florida newspaper on the following two variables:
Advert   advertising space (in thousands of inches) purchased during each month
Month    a time trend variable for each of n = 50 months: month = 1, 2, ..., 50
The regression equation is:
advert = 102 - 0.501 month
S = 7.276    R-sq = 50.7%    R-sq(adj) = 49.7%
SOURCE        DF      SS      MS      F      p
Regression     1  2612.4  2612.4  49.34  0.000
Error         48  2541.4    52.9
Total         49  5153.8
4.49. According to the regression output, advertising space in this newspaper

a. Increased at about 102,500 inches per month
b. Increased at about 500 inches per month
c. Decreased at about 102,500 inches per month
d. Decreased at about 500 inches per month
e. Cannot be determined from the information provided
4.50. The least-squares equation reduces the total variation in predicting advert
a. From 5154 to 2541
b. From 2612 to 52.9
c. From 2541 to 52.9
d. From 50.7 to 49.7
e. From 49 to 48
4.51. We may predict advertising space for month 20 to be approximately
a. 92,000 inches with a margin of error of 15,000 inches
b. 82,000 inches with a margin of error of 15,000 inches
c. 92,000 inches with a margin of error of 7,300 inches
d. 82,000 inches with a margin of error of 7,300 inches
e. 82,000 inches with a margin of error of 7,300 inches
4.52. The correlation coefficient between advertising space and month was
a. +0.71
b. - 0.71
c. +0.26
d. - 0.26
e. Insufficient information to determine answer
4.59 Given that $r_{XY} = -0.80$, determine the following:
(a) $R^2$ for the regression of Y on X and the sign of the slope coefficient b1
(b) $R^2$ for the regression of X on Y and the sign of the slope coefficient b1
4.60 The following multiple regression determines insurance charged to short haul
trucking firms: premium = b0 + b1 fleetsz + b2 popden
where the variables in the regression equation are defined as:
premium = insurance premiums charged each firm (in dollars per truck)
fleetsz = number of trucks owned by each firm
popden  = county population density (people / square mile) where
          company is located
From a random sample of n = 32 trucking firms surveyed in 1993,
the regression equation is
premium = 779 - 5.49 fleetsz + 0.885 popden
S = 332.5    R-sq = 36.6%    R-sq(adj) = 32.2%
SOURCE        DF       SS      MS     F      p
Regression     2  1850413  925206  8.37  0.001
Error         29  3206046  110553
Total         31  5056459
1. This data set consists of time series / cross section (circle one) data.
2. This is multivariate regression because there is more than one variable.
3. The ______ statistic tells us we have explained over one-third the variation in
premiums.
4. The degrees of freedom for the error sum of squares = ______, found by
______.
5. Use the fitted equation to predict premiums for a firm with 10 trucks in a county
with 50 people / square mile. ______
[round answer to the nearest dollar]
6. For the following, show computations assuming no other explanatory variable
changes:
(a) Companies increasing fleet size 20 trucks are charged $110 lower premiums on
average.
b1 Δfleetsz = ______
(b) Companies moving to counties of 200 fewer people / square mile are charged
$177 less on average.
b2 Δpopden = ______
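The computations this exercise asks for can be laid out as follows. The sketch uses the fitted coefficients and sums of squares from the printout; the function name `predict_premium` is illustrative.

```python
def predict_premium(fleetsz, popden):
    """Fitted equation from the printout: premium = 779 - 5.49 fleetsz + 0.885 popden."""
    return 779 - 5.49 * fleetsz + 0.885 * popden

n, k = 32, 2                       # sample size and number of explanatory variables
error_df = n - k - 1               # degrees of freedom for the error sum of squares
r2 = 1850413 / 5056459             # regression SS divided by total SS

predicted = predict_premium(10, 50)

# Effect of changing one explanatory variable while the other is held fixed.
change_from_fleet = -5.49 * 20         # 20 more trucks
change_from_density = 0.885 * (-200)   # 200 fewer people per square mile

print("error df:", error_df)
print("R-squared:", round(r2, 3))
print("predicted premium:", round(predicted))
print("change for 20 more trucks:", round(change_from_fleet, 1))
print("change for 200 fewer people / sq. mile:", round(change_from_density, 1))
```

Each coefficient is multiplied only by the change in its own variable, which is what "assuming no other explanatory variable changes" means in practice.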
Answers?
4.72 Multiple regression means that

a. we run more than one regression equation
b. we run the same regression equation more than once
c. we have more than one dependent variable
d. we have more than one explanatory variable
4.73 When one regression equation contains an additional explanatory variable
not present in a second, the first equation will nearly always have a
a. higher adjusted $R^2$
b. lower adjusted $R^2$
c. higher $R^2$
d. lower $R^2$
e. both $R^2$ and adjusted $R^2$ may be higher or lower depending on the data
For each of the following four questions, assume that the regression equation for
the apartment rent example is
Rent = - 200 + 1.0 Size - 10 Walk
4.74 The estimate of monthly rent for a 700 square foot apartment that is a 20
minute walk from campus is
a. $200
b. $300
c. $400
d. $500
e. cannot be calculated from the information given
4.75 How much should you expect rents to change on average if you moved to the
same-sized apartment, but 7.5 minutes further walking distance from
campus?
a. $15 less
b. $15 more
c. $75 less
d. $75 more
e. exactly the same amount
4.76 The monthly rent for a 1200 square foot apartment that is a 2 minute walk
from campus may be estimated to be
a. $1180
b. $980
c. $1220
d. $1420
e. $760
4.77 How much should you expect rents to change on average if you moved to an
equal-sized apartment 10 minutes further walking distance from campus?
a. $100 less
b. $100 more
c. $20 more
d. $200 more
e. exactly the same amount
4.78 We should not expect $R^2$ from two regressions to be directly comparable if
a. the samples for the two regressions were gathered from very different time
periods
b. the sample for one of the regressions was from a more broadly-defined
population
c. the dependent variables used in each regression were defined differently
d. all of the above
4.79 Suppose a particular regression equation is utterly worthless in accounting
for variation in the dependent variable over the entire population. Then we
should expect that the least-squares equation from a random sample of that
population will yield an $R^2$ = 0
a. only very rarely
b. less than 50 percent of the time
c. most of the time
d. nearly always
e. always
4.80 Which is not true about the difference between $R^2$ and the adjusted $R^2$?
a. the difference is greater for smaller sample sizes
b. the difference is greater if $R^2$ is large
c. the difference is greater if there are more explanatory variables in the
equation
d. adjusted $R^2$ cannot be greater than $R^2$
e. adjusted $R^2$ cannot be greater than 100 percent
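The statements in question 4.80 can be explored numerically with the usual degrees-of-freedom adjustment, adjusted R-squared = 1 - (1 - R-squared)(n - 1)/(n - k - 1). The guide does not print this formula in this section, so treat it as the standard textbook definition; it does reproduce the R-sq(adj) values on the two printouts earlier in the chapter. The function name is illustrative.

```python
def adjusted_r2(r2, n, k):
    """Adjusted R-squared for n observations and k explanatory variables."""
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)

# Check against the printouts: trucking premiums (n = 32, k = 2, R-sq = 36.6%)
# and newspaper advertising (n = 50, k = 1, R-sq = 50.7%).
print(round(adjusted_r2(0.366, 32, 2) * 100, 1))
print(round(adjusted_r2(0.507, 50, 1) * 100, 1))

# The gap between R-squared and adjusted R-squared for different settings.
def gap(r2, n, k):
    return r2 - adjusted_r2(r2, n, k)

print(round(gap(0.5, 20, 2), 4), round(gap(0.5, 200, 2), 4))  # smaller vs larger sample
print(round(gap(0.5, 20, 2), 4), round(gap(0.5, 20, 6), 4))   # fewer vs more variables
print(round(gap(0.2, 20, 2), 4), round(gap(0.9, 20, 2), 4))   # low vs high R-squared
```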
4.81 Multiple regression is so often used in business and economics today
because
a. most variables we seek to explain are affected by a complex set of factors.
b. an acceptable fit often cannot be obtained with only a single explanatory
variable.
c. controlled experiments are often too difficult to conduct.
d. the abundance of data and speed of modern computers make multiple
regression a practical option.
e. all of the above.
4.82 In comparing regression fits on cross section and time series data
a. $R^2$ is usually lower for cross section data because it is easier to explain why
different items are different
b. $R^2$ is usually higher for cross section data because it is easier to explain
why different items are different
c. $R^2$ is usually lower for cross section data because it is more difficult to
explain why different items are different
d. $R^2$ is usually higher for cross section data because it is more difficult to
explain why different items are different
e. there is no systematic difference between fits for either type of data
Answers? 4.72-4.82
Answer the next five questions based on the following case description and
statistical output:
We next examine a regression equation where advertising space is a function of
newspaper sales and the state of the economy:
advert = advertising space (in thousands of inches) purchased during each month
circ = the monthly level of circulation (measured in millions)
jobless = the monthly unemployment rate (in percent)
The regression equation is:
advert = 125 + 1.95 circ - 6.13 jobless
4.83 Forecast advertising space next month when circulation is 4 million and
unemployment rate is 6 percent.
a. 22,000 inches
b. 29,000 inches
c. 96,000 inches
d. 132,000 inches
4.84 If the circulation numbers were instead reported in thousands (rather than
millions) of newspapers, what would the coefficient of circ now have to be
for the fitted equation to tell exactly the same story?
a. 1950
b. 1.95
c. 0.00195
d. 0.00000195
e. Cannot be determined from the information provided.
4.85 Assuming unemployment is the same, circulation decline of one million will
result in a
a. 1950 inch increase in advertising space
b. 127,000 inch increase in advertising space
c. 1950 inch decrease in advertising space
d. 127,000 inch decrease in advertising space
4.86 For a one percentage point increase in the unemployment rate, other things
equal, advertising space will
a. increase by about six thousand inches
b. increase by about 119 thousand inches
c. decrease by about six thousand inches
d. decrease by about 119 thousand inches
4.87 Which of the following describes the data set and regression equation:
a. data set is time series data and the regression is multivariate regression
b. data set is cross section data and the regression is multivariate regression
c. data set is time series data and the regression is simple regression
d. data set is cross section data and the regression is simple regression
e. not enough information provided to determine the data set and regression
type.
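The arithmetic behind questions 4.83 through 4.86 can be checked with a short sketch like the one below. It only restates the fitted equation from the case; the function and variable names are illustrative and not part of the original output.

```python
# Fitted equation from the case: advert = 125 + 1.95 circ - 6.13 jobless
# advert is in thousands of inches, circ in millions, jobless in percent.
B0, B_CIRC, B_JOBLESS = 125.0, 1.95, -6.13

def predict_advert(circ, jobless):
    """Predicted advertising space, in thousands of inches."""
    return B0 + B_CIRC * circ + B_JOBLESS * jobless

# 4.83: forecast with circulation of 4 million and 6 percent unemployment
forecast = predict_advert(circ=4, jobless=6)

# 4.84: measuring circ in thousands makes each value 1000 times larger,
# so the coefficient must shrink by a factor of 1000
coef_if_thousands = B_CIRC / 1000

# 4.85 and 4.86: marginal effects, in thousands of inches
change_circ = B_CIRC * (-1)       # circulation falls by one million
change_jobless = B_JOBLESS * 1    # unemployment rises by one percentage point

print(round(forecast, 2), coef_if_thousands, change_circ, change_jobless)
```

Multiplying the results by 1000 converts them from thousands of inches to inches, which is how the answer choices are stated.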
4.88 Determine ΔY for each of the following cases:
(a) b1 = 15, ΔX = 10
(b) b1 = 15, ΔX = 100
(c) b1 = 1.5, ΔX = 100
(d) b1 = -5, ΔX = 10
(e) b1 = 15, ΔX = -10
4.88 Answer the next four questions based on the following case and statistical
output on the regression: Predicted P GAS = b0 + b1 month
where the variables are:
P GAS monthly average price at pump for regular gasoline (in cents / gallon)
month numbered from 1 to 53 from April 1986 to August 1990.
The Minitab regression equation is: P GAS = 83.8 + 0.414 month
S = 4.532
R-sq = 66.9%
R-sq(adj) = 66.3%
a. Rounded to the nearest cent, the annual trend rate (i.e., every 12 months) in
gas price increases is about
a. 1 cent
b. 5 cents
c. 84 cents
d. 89 cents
e. None of the above
b. The forecast of gas prices in March 1991 (i.e., month = 60), assuming past
trends continue is approximately
a. 25 cents
b. 35 cents
c. 84 cents
d. 109 cents
c. The variation in gas prices explained by this trend equation is approximately
what fraction?
a. One third
b. One half
c. Two thirds
d. Three quarters
e. Cannot be determined from the output above
d. The correlation between gas price and month is
a. 0.44
b. 0.66
c. 0.67
d. 0.82
e. Insufficient information provided to determine the answer
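A minimal sketch for checking the four gas price questions follows. It uses only the coefficients and R-sq reported in the Minitab output; the variable names are illustrative. For a simple regression, the correlation is the square root of R-sq with the sign of the slope.

```python
import math

# From the output: P GAS = 83.8 + 0.414 month, R-sq = 66.9%
B0, B1 = 83.8, 0.414     # cents per gallon, cents per gallon per month
R_SQ = 0.669

annual_trend = 12 * B1                   # trend over 12 months, in cents
forecast_month_60 = B0 + B1 * 60         # March 1991, in cents per gallon
# In simple regression, r = sqrt(R-sq) carrying the sign of the slope b1
r = math.copysign(math.sqrt(R_SQ), B1)

print(round(annual_trend), round(forecast_month_60), R_SQ, round(r, 2))
```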
4.89 Answer the next four questions based on the following case and statistical
output: Data on the following four variables were collected from a random
sample of n=49 persons taking the business law section of the CPA (Certified
Public Accountant) exam:
LAWSCORE = each person's score on the business law section of the CPA exam
HOURS = number of hours that person studied per week to prepare for the exam
GPA = undergraduate grade point average of the person taking the exam
WORKEXP = number of years work experience of the person taking the exam
The regression equation is in the following form:
Predicted LAWSCORE = b0 + b1 HOURS + b2 GPA + b3 WORKEXP
When the regression was run, the following output results:
LAWSCORE = 53.8 + 0.400 HOURS + 3.45 GPA + 0.272 WORKEXP
s = 9.050     R-sq = 25.4%     R-sq(adj) = 20.5%
SOURCE        DF        SS         MS        F        p
Regression     3     1256.46     418.82     5.11    0.004
Error         45     3685.78      81.91
Total         48     4942.25
a) The degrees of freedom for the error sum of squares are calculated as follows:
a. 48 - 1 - 1 = 46 degrees of freedom
b. 50 - 2 = 48 degrees of freedom
c. 45 - 4 = 41 degrees of freedom
d. 49 - 3 - 1 = 45 degrees of freedom
b) If a second regression equation reports an R2 = 28.6% and an adjusted R2
=19.2%, then this second equation has
a. A better fit than the regression equation above.
b. A worse fit than the regression equation above.
c. The same fit as the regression equation above.
d. Not enough information provided to answer this question.
c) Assuming the other explanatory variables don't change, ten more hours study
weekly
a. increases exam scores an average of 0.4 points
b. increases exam scores an average of 4 points
c. increases exam scores an average of 40 points
d. increases exam scores an average of 58 points
d) The exam score for a person studying 30 hours per week who earned a 3.0
grade point average in college and has ten years work experience is
predicted to be approximately
a. 72
b. 79
c. 85
d. 91
e. None of these
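The computations for the CPA exam questions can be sketched as below. The coefficients and sums of squares come from the output above; the adjusted R-sq line uses the standard formula 1 - (SSE / (n - k - 1)) / (SST / (n - 1)), and all names are illustrative.

```python
# Fitted equation: LAWSCORE = 53.8 + 0.400 HOURS + 3.45 GPA + 0.272 WORKEXP
B0, B_HOURS, B_GPA, B_WORKEXP = 53.8, 0.400, 3.45, 0.272
N, K = 49, 3                      # sample size and number of explanatory variables
SSE, SST = 3685.78, 4942.25       # error and total sums of squares

error_df = N - K - 1                                   # question a
adj_r_sq = 1 - (SSE / error_df) / (SST / (N - 1))      # relevant to question b
effect_ten_hours = B_HOURS * 10                        # question c
predicted = B0 + B_HOURS * 30 + B_GPA * 3.0 + B_WORKEXP * 10   # question d

print(error_df, round(adj_r_sq, 3), effect_ten_hours, round(predicted, 2))
```

Because adjusted R-sq penalizes extra explanatory variables, it is the more suitable figure when comparing the fit of equations that may differ in the number of variables.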
Answers?
4.90 At the beginning of 1991, a national realtors association wants you to analyze
trends in the home construction industry over the preceding four years and
forecast new construction activity this year.
Predicted H starts = b0 + b1 Month
The variables in the regression equation are for month t
H starts = number of housing starts (in thousands) during month t
Month = month number (1 to 48, from January 1987 to December 1990)
The regression is fit yielding the following Minitab output:
The regression equation is: H starts = 1727 - 12.2 month
s = 97.54     R-sq = 75.7%     R-sq(adj) = 75.2%
SOURCE        DF        SS         MS         F         p
Regression     1     1366204    1366204    143.60    0.000
Error         46      437648       9514
Total         47     1803852
Now answer the following 10 questions:
a) What was the monthly trend rate in housing starts over the period examined?
Housing starts (increased / decreased) at an average rate of ____ thousand per
month.
decreased, 12.2
b) What percent of variation in housing starts was explained by this trend equation?
75.7%
c) Calculate the correlation between Month and H starts.
r = -sqrt(R2) = -sqrt(0.757) = -.87 (negative because b1 < 0)
d) Verify with a calculator that the standard error of the estimate has the proper
relationship to the mean square error MSE.
SEE = sqrt(MSE) = sqrt(9514) = 97.54
e) Verify with a calculator that R2 has the expected relationship to the error and
total sum of squares, SSE and SST.
R2 = 1 - SSE / SST = 1 - 437,648 / 1,803,852 = .757 or 75.7%
f) Remember that month = 48 was December 1990. Use a calculator to forecast
housing starts for January 1991 and again for February 1991 from the fitted
regression equation.
predicted H starts = 1727 - 12.2(49) = 1129.2 thousand for January 1991
predicted H starts = 1727 - 12.2(50) = 1117.0 thousand for February 1991
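For readers who prefer to check parts (c) through (f) with software rather than a calculator, here is a small sketch. The numbers are copied from the Minitab output above; the function and variable names are illustrative.

```python
import math

# Values copied from the output for H starts = 1727 - 12.2 month
B0, B1 = 1727.0, -12.2
SSE, SST, MSE = 437648.0, 1803852.0, 9514.0

see = math.sqrt(MSE)                           # part d: standard error of the estimate
r_squared = 1 - SSE / SST                      # part e
r = math.copysign(math.sqrt(r_squared), B1)    # part c: sign follows the slope

def forecast(month):
    """Predicted housing starts (in thousands) for a given month number."""
    return B0 + B1 * month

jan_1991 = forecast(49)    # part f
feb_1991 = forecast(50)

print(round(see, 2), round(r_squared, 3), round(r, 2),
      round(jan_1991, 1), round(feb_1991, 1))
```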
g) The actual values for housing starts were 844 thousand in January 1991 and 1008
thousand in February. Compare your answers in question (f) above with
these actual outcomes, and discuss how close each of your forecasts was.
Both forecasts were too high. However, the second forecast (for February 1991)
was much closer, being only 109 thousand off rather than 285 thousand away
from the actual outcome.
h) Below is the time series plot of housing starts from 1987 through the end of
1991 (i.e., for an additional 12 months beyond the data for which the
regression equation was fitted).
[Time series plot of H starts (vertical axis, tick marks at 1050, 1400, and 1750) against month (horizontal axis, 0 to 60)]
i) Based on your examination of this plot, discuss how you can tell that the
regression equation we estimated would yield very bad forecasts for mid- and late
1991 (months 54 to 60).
The downward trend over the first four years (48 months) was used to fit the
regression equation. However, the trend may have reversed and turned into
an upward trend thereafter. Thus, forecasts from the fitted line will be
increasingly wrong and too low.
j) If a different regression equation were to be used instead of a time trend
equation, suggest one or more explanatory variables that would explain the
variation in housing starts in the U.S.
Acceptable answers: interest rates, construction costs, unemployment, and population.
Answers?
4.91 The director of a hospital pharmacy wants to determine how staffing level
affects the rate at which prescriptions are processed. She decides to use the
following regression equation: prescrip = b0 + b1 staff + b2 in-pat
where the variables are defined as follows:
staff = average number of staff on duty on day t
prescrip = prescriptions processed per hour during day t
in-pat = number of in-patients at the hospital on day t
Forty-seven days are sampled and a regression yields the following equation:
prescrip = - 67.1 + 21.0 staff + 0.395 in-pat.
S = 14.22     R-sq = 69.6%     R-sq(adj) = 68.3%
SOURCE        DF       SS        MS        F        p
Regression     2     20392     10196     50.45    0.000
Error         44      8892       202
Total         46     29284
a) Verify that the error degrees of freedom should be 44, given this regression and
sample size.
DF = n - k - 1 = 47 - 2 - 1 = 44
Using the marginal effects "delta" (Δ) formula, answer b and c: Based on the fitted
equation, determine the expected change in the prescription processing rate,
assuming the other explanatory variable in the regression equation does not
change: [In each case, show delta formula computations and report answers in
correct direction of change, numerical magnitude, and correct units]
b) The staff level at the pharmacy is increased by two persons.
Δprescrip = b1 Δstaff = (+21)(+2) = +42, an increase of 42 prescriptions / hour
c) The number of patients at the hospital decreases by 100.
Δprescrip = b2 Δin-pat = (+.395)(-100) = -39.5, a decrease of about 40
prescriptions / hour
Make a prediction for questions d and e:
d) Predict the prescription processing rate when there are 200 patients in the
hospital and 10 persons maintained on staff at the pharmacy.
predicted prescrip = -67.1 + 21.0(10) +.395(200) = -67.1 + 210 + 79
= 222 prescriptions / hour
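The delta formula and the prediction in parts b through d can be reproduced with the sketch below. It relies only on the fitted coefficients; the helper names are illustrative assumptions.

```python
# Fitted equation: prescrip = -67.1 + 21.0 staff + 0.395 in-pat
B0, B_STAFF, B_INPAT = -67.1, 21.0, 0.395

def delta_prescrip(d_staff=0.0, d_inpat=0.0):
    """Expected change in prescriptions per hour; the intercept drops out."""
    return B_STAFF * d_staff + B_INPAT * d_inpat

def predict_prescrip(staff, inpat):
    """Predicted prescriptions processed per hour."""
    return B0 + B_STAFF * staff + B_INPAT * inpat

change_b = delta_prescrip(d_staff=2)        # part b: two more staff
change_c = delta_prescrip(d_inpat=-100)     # part c: 100 fewer in-patients
pred_d = predict_prescrip(staff=10, inpat=200)   # part d

print(change_b, change_c, round(pred_d))
```

The delta formula ignores the intercept because the intercept is the same before and after the change, so it cancels when the two predictions are subtracted.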
e) Why shouldn't you worry about the negatively signed intercept term, -67.1, in
the fitted equation? [Hint: use information from the descriptive statistics
below]
Even if both variables are at their minimum values, predicted prescrip would still
have positive value.
             N      MEAN    MEDIAN    TRMEAN     STDEV    SEMEAN
staff       47     9.512     9.563     9.509     1.037     0.151
prescrip    47    170.55    174.93    170.68     25.23      3.68
in-pat.     47     95.98     97.00     95.72     15.40      2.25

             MIN       MAX        Q1        Q3
staff       7.375    12.125     8.625    10.187
prescrip   110.00    227.86    153.57    188.14
in-pat.     72.00    129.00     81.00    108.00
f) What is the danger of using the fitted equation to predict "prescrip" if only four
staffers are on duty at the pharmacy?
The staff data used to fit the regression ranged from 7.375 to 12.125, so a
prediction at four staff is an extrapolation and cannot be trusted.
Answers?
4.1: a; 4.2: d; 4.3: b; 4.4: a; 4.5: d; 4.6: e; 4.7: d; 4.8: d; 4.9: b; 4.10: c; 4.29: a; 4.30: d; 4.31: d;
4.32: a; 4.33: b; 4.34 (rounded): a) 88%, b) 67%, c) 6%, d) 60%, e) 82%; 4.35; 4.43: d; 4.44: a;
4.45: a; 4.46: a; 4.47: c; 4.48: e; 4.49: d; 4.50: a; 4.51: a; 4.52: b; 4.59: a and b are the same
answer (64% and negative sign); 4.60; 4.72-4.82; 4.88: a) +150, b) +1500, c) +150, d) -50, e) -150; 4.89; 4.90; 4.91;
E. Review Questions for Chapter 5

5.1 Which of the following variables would best measure the quality of
education offered among a population of colleges?
a. SAT exam scores of entering students at each college
b. salaries of professors at each college
c. average grades of all graduates at each college
d. satisfaction surveys for graduates of each college
e. the tuition costs at each college
5.2 When making decisions based on time series data, one rule of advice is
a. base your decisions only on the most recent data available
b. don't be too quick to proclaim success or failure of a new policy
c. we must learn the lessons of history so that we never risk repeating past
mistakes
d. the behavior of people in each age group today can be used to predict the
behavior
of those same age groups in the future
e. time series data are superior to cross section data for decision making
5.3 What is wrong with deciding whether a business has grown over the past
twenty years by comparing the sales revenues?
a. the variable is contaminated by price changes from inflation
b. this involves circular reasoning
c. sales revenue usually cannot be measured
d. sales revenue is not a variable
e. there is nothing wrong
5.4 Which of the following is not an example of faulty reasoning involving time?
a. A firm suffering a temporary cash flow crisis fires a large portion of its
workforce.
b. Things were better in the good old days, so we should do things the way they
used to be done.
c. Luxury car sales increased in the 1980s because of ads using 1960s rock music,
so the same music is approved for ad campaigns in the 1990s.
d. all of the above

5.5 A business school accreditation organization circulates class evaluation forms
to students attending the next to the final week of the term. All except one
of the following would likely cause bias in estimating educational
effectiveness of the courses:
(a) students place too much faith on instructors who say the courses are
effective.
(b) students underestimate the usefulness of courses until they obtain full
time jobs in business.
(c) the evaluation forms contain questions primarily focused on whether
courses meet minimum standards.
(d) students who dropped or withdrew from the course before the final week
are not sampled.
(e) all of the above are likely sources of bias.
5.6 A state in fiscal crisis is considering various tax reform proposals. To get a
feeling for what voters would support, the governor decides to conduct a survey
before deciding which reforms to propose to the state legislature. Which of the
following survey designs would you recommend?
(a) set up a series of 900 numbers for people to call to express their
preference.
(b) mail a survey form to all registered voters, and determine the preferences
from those who respond.
(c) conduct extensive "man-on-the-street" interviews at all major shopping
malls.
(d) survey a random sample of political science professors and other experts
as to which reform plan they believe the voters will support.

5.1: d, 5.2: b, 5.3: a, 5.4: d, 5.5: e, 5.6: e.
F. Review Questions for Chapter 6

6.1 A personnel manager uses insight gathered from experience and training to
quantify the probability that a new salesperson will be caught stealing. The
approach used to arrive at this probability is best described as the
a. frequency approach
b. subjective approach
c. classical approach
d. probability approach
6.2 If the set S is defined by S = {Toyota Corolla, Nissan Sentra, Honda Civic,
Mazda MPV Minivan, Ford trucks}, which of the following dealerships could
have its new vehicle inventory described by S?
a. a Japanese car dealer
b. a foreign car dealer
c. a car dealer
d. a car, truck, and van dealer
e. each of the above
6.3 An insurance agent calculates that one in twenty unsolicited letters to
prospective clients results in a new policy sold. What is the probability that
any given letter will yield a sale?
a. 0.95
b. 0.50
c. 0.20
d. 0.05
e. insufficient information to answer
6.4 Subjective probability
a. is derived from human judgments.
b. expresses our personal degree of belief.
c. is small when there is great doubt that an event will occur.
d. is most appropriate in new, complex, and difficult to quantify situations.
e. all of the above.
6.16 If the probability of an event A is P(A) = .25, then the probability of its
complement C, P(C) must be
a. 0
b. 0.25
c. 0.5
d. 0.75
e. 1.0
6.17 If this year, P(Recession) = 0.25, P(Mideast War) = 0.10, and
P(Recession and Mideast War) = 0.05, then P(Recession or Mideast War) is
a. 0.025
b. 0.20
c. 0.30
d. 0.35
e. 0.40
6.18 Upon entering an intersection where two roads cross, cars have a 0.2
probability of turning left. Then the event Right Turn
a. has a probability of 0.8
b. is the complementary event of Left Turn
c. is mutually exclusive of Left Turn
d. is independent of Left Turn
e. all of the above
6.19 Suppose that of all newly-trained employees at a fast-food franchise, fifty
percent are still working there one year later but another thirty percent do not last
more than the first two months. Therefore, there is a twenty percent chance that
a new employee will
a. be around after two months
b. not last a year
c. be fired immediately
d. work there more than two months but no more than a year
e. insufficient information to answer
6.20 New products of the type your company is considering have achieved
market success in 200 out of 1000 cases. Which of the following correctly
describes this record?
a. the probability of success is 0.20
b. the probability of not achieving success is 80 percent
c. the odds are four-to-one against success
d. the chance of success is one in five
e. all of the above
6.21 If there are even odds that you will get a job offer from Acme, Inc., then the
probability P(Acme job offer) equals
a. 100 percent
b. 75 percent
c. 50 percent
d. 33 percent
e. not enough information provided
6.22 Which of the following does not indicate statistical independence among
events?
a. Students in the honors program have the same chance of passing this
course as any other type of student.
b. The chance of a small firm failing is identical to that for any size firm.
c. The odds of the boss' son getting a promotion are no better or worse than
for any other employee of the firm
d. People who saw their new commercial were no more likely to shop at K-Mart
than those who didn't watch it.
e. All of the above indicate independent events.
6.23 For a product to be delivered on time, all of the following must occur: the
order is processed within one work day, the product is shipped the following
day, and the shipment is routed through the proper regional distribution
center. If these events are independent and their probabilities are
0.8, 0.8, and 0.5, the probability of on-time delivery is
a. 0.50
b. 0.48
c. 0.40
d. 0.32
e. 0.24
6.24 If S is the set containing events describing the time between servicing of a
copying machine under warranty, which of the following would be a possible
event in S?
a. five months
b. customer complained
c. asked for money back
d. cartridge needs replacing
e. all of the above
6.25 If S = {A, B, C} is an exhaustive set of events, then
a. P(A) + P(B) + P(C) = 1
b. P(S) = 1
c. A, B, and C must be mutually exclusive events
d. all of the above
6.26 If S = {A, B, C} and P(A) + P(B) + P(C) = 1.0, then
a. A, B, and C are mutually exclusive and exhaustive events.
b. A, B, and C are mutually exclusive but not exhaustive events.
c. A, B, and C are exhaustive but not mutually exclusive events.
d. A, B, and C are complements.
e. A, B, and C are independent events.
6.27 If the sample space consists of a listing of each product sold at a
supermarket, then frozen foods and fresh produce are
a. outcomes
b. events
c. sample spaces
d. experiments
6.28 If the sample space consists of a listing of each product sold at a
supermarket, then frozen foods and fresh produce are
a. independent
b. mutually exclusive
c. exhaustive
d. complementary
e. all of the above
6.29 Getting promoted to a senior vice-president position has a probability of 0.50
if you are head of marketing but only 0.20 if you are head of research.
These are examples of
a. marginal probabilities
b. joint probabilities
c. conditional probabilities
d. independent probabilities
6.30 Which of the following sets consists of mutually exclusive events?
a. Marital Status = {single, married, divorced, widowed}
b. Education = {high school degree, college degree}
c. Region = {South, West, East, North}
d. Occupation = {office worker, clerk, receptionist}
6.31 If events A and B are statistically independent, and P(A) = 0.5 and P(B) = 0.2,
then the joint probability P(A and B) must be
a. 0.1
b. 0.2
c. 0.5
d. 0.9
e. cannot calculate without knowing the conditional probabilities
U.S. manufacturers are surveyed about whether they have fewer than 50 workers
and if they do any exporting of their product. Answer the following questions based
on the Venn Diagram:
[Venn diagram with regions labeled: Less than 50 Workers; Export; U.S. Manufacturing Companies]
6.32 Which of the following is described in the Venn diagram?

a. Exporting has a greater probability than Less Than 50 Workers
b. Less Than 50 Workers has a greater probability than At Least 50 Workers.
c. At Least 50 Workers has a greater probability than Not Exporting.
d. All of the above are portrayed in the diagram.
e. Cannot be answered by examining the diagram.
6.33 Which of the following are described in the Venn diagram?
a. Exporting and Less Than 50 Workers are exhaustive events.
b. Exporting and Less Than 50 Workers are mutually exclusive events.
c. Exporting and Less Than 50 Workers are complementary events.
d. All of the above
6.34 Which of the following conditional probabilities is most closely approximated
in the Venn diagram above.
a. P(Exports | Less Than 50 Workers) is 0.50
b. P(Less Than 50 Workers | Exports) is 0.50
c. P(Exports | Less Than 50 Workers) is 0.85
d. P(Less Than 50 Workers | Exports) is 0.85
e. Cannot be answered with information presented in the diagram.
6.35 A new comedy series has a 20 percent chance of being renewed for a second
season. On average, 60 percent of second-season comedies are renewed.
What is the probability a new comedy will still be on the air for its third
season?
a. 0.48
b. 0.32
c. 0.12
d. 0.08
6.36 If events A and B are statistically independent, then
a. P(A) = P(B)
b. P(A) + P(B) = 1
c. P(A|B) = P(A)
d. P(A|B) = P(B)
e. all of the above
6.37 Which of the following pairs of events are most likely to be independent?
a. a rusting fender and a car less than two years old
b. a person who has attended college and an athlete over 7 feet tall
c. a company president earning more than $500,000 a year and a Fortune
500 company
d. an A on the first exam and final course grade of D
e. having blue eyes and majoring in accounting
6.38 If the probability of an event A is P(A) = .25, then the probability of its
complement C, P(C) must be
a. 0
b. 0.25
c. 0.5
d. 0.75
e. 1.0
6.39 People usually express reluctance to offer a needy relative one of their two
kidneys for transplant surgery. They are often convinced to become donors when
they learn that the same diseases that cause one kidney to fail will also damage the
other kidney. This argument relies on explaining that the failures of the two kidneys are
not
a. exhaustive events
b. mutually exclusive events
c. independent events
d. complementary events
e. all of the above
6.40 If events A and B are statistically independent, and P(A) = 0.5 and P(B) = 0.4,
then the joint probability P(A and B) must be
a. 0.1
b. 0.2
c. 0.5
d. 0.9
e. cannot calculate without knowing the conditional probabilities
6.41 For a product to be delivered on time, all of the following must occur: the
order is processed within one work day, the product is shipped the following
day, and the shipment is routed through the proper regional distribution
center. If these events are independent and their probabilities are
0.8, 0.6, and 0.5, the probability of on-time delivery is
a. 0.50
b. 0.48
c. 0.40
d. 0.30
e. 0.24
6.64 If X is a random variable with sample space {2, 6} and P(X=2) = 0.5, P(X=6) =
0.5, then 4 equals
a. the mean
b. the variance
c. the standard deviation
d. both a and b
e. both a and c
6.65 In Tampa, the probability distribution for the random variable X measuring
the price charged for weekday video tape rental is found to be:
Price Charged     P(X)
$ 1.50            0.10
  2.00            0.60
  2.50            0.10
  3.00            0.20
For a randomly selected video store, the probability P($1.00<x<$2.25) is
a. 0.10
b. 0.20
c. 0.60
d. 0.70
e. 0.80
6.69 A manager may not select the investment alternative with the highest
expected profit if another investment
a. has a lower standard deviation and the manager is averse to risk
b. has a higher standard deviation and the manager is averse to risk
c. has a lower standard deviation and the manager is risk loving
d. both a and c are possible explanations of the manager's behavior
e. both b and c are possible explanations of the manager's behavior
The questions that follow are related to the following decision making problem: A
manufacturer has to decide whether to replace (R), fix (F), or ignore (I) its aging
factory equipment this year. If R is chosen, there is a 0.8 chance of no production
stoppages (NO) and a 0.2 chance of minor stoppages (MIN). If F is selected, on the
other hand, P(NO) falls to 0.5 and P(MIN) increases to 0.5. If I is chosen by the
manufacturer, P(MIN) = 0.6 and there is now a 0.4 chance for major stoppages (MAJ).
Because of the higher costs associated with replacing equipment, profits from choice
R will only be 15 if NO occurs and 5 if MIN results. For choice F, a NO outcome yields
profits of 18 and MIN results in profits of 8. In the case of choice I, MIN produces
profits of 20 but MAJ causes profits of 0.
6.70 The number of decision forks faced by this manufacturer is
a. 0
b. 1
c. 2
d. 3
e. 4
6.71 The number of chance forks faced by this manufacturer is
a. 0
b. 1
c. 2
d. 3
e. 4
6.72 The greatest expected profits for this manufacturer are derived by choosing
a. R
b. F
c. I
d. either a or b
e. either b or c
6.73 The expected profits from R are ____ while expected profits from I are ____
a. 7 and 8
b. 7 and 12
c. 13 and 8
d. 13 and 12
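The expected profit comparison in questions 6.72 and 6.73 can be sketched as follows. The probabilities and profits are taken from the case description; the dictionary layout and names are illustrative.

```python
# Each choice maps to (probability, profit) pairs taken from the case.
choices = {
    "R": [(0.8, 15), (0.2, 5)],    # replace: NO or MIN stoppages
    "F": [(0.5, 18), (0.5, 8)],    # fix: NO or MIN stoppages
    "I": [(0.6, 20), (0.4, 0)],    # ignore: MIN or MAJ stoppages
}

def expected_profit(outcomes):
    """Probability-weighted average profit at one chance fork."""
    return sum(p * profit for p, profit in outcomes)

# Round to avoid tiny floating-point differences when comparing choices
expected = {name: round(expected_profit(o), 2) for name, o in choices.items()}
best = max(expected.values())
best_choices = sorted(name for name, value in expected.items() if value == best)

print(expected, best_choices)
```

Each of the three choices leads to its own chance fork, which is why the single decision fork is followed by three chance forks.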
6.74 In problems using Bayes' theorem, we
a. always seek to determine a conditional probability as our answer
b. assume that outcomes are each statistically independent
c. look for keywords such as "who", "what", and "how" to determine
whether we are dealing with marginal probabilities
d. all of the above
6.75 In problems using Bayes' theorem, we
a. always seek to determine a conditional probability as our answer
b. assume that events are not statistically independent
c. look for the presence of keywords such as "given that", "if", and "when" to
identify conditional probabilities
d. all of the above
Use the information from the following situation to answer questions below:
A study of past shuttle launch attempts reveals that the probability of a launch (L)
taking place was 40 percent if a heavy cloud cover (HC) was forecast, 60 percent if
a light cloud cover (LC) was forecast, and 80 percent if no clouds (NC) were
forecast. A survey of weather forecasts for the Cape informs us that clear days are
forecast three-quarters of the time and light clouds are forecast 20 percent of the
time. Assume that there are only three kinds of forecasts: HC, LC, and NC.
6.76 The probability P(HC) is
a. 0 percent
b. 5 percent
c. 30 percent
d. 37.5 percent
e. 55 percent
6.77 The 40, 60, and 80 percent probabilities given in the problem are
a. marginal probabilities
b. conditional probabilities
c. joint probabilities
d. Bayesian probabilities
e. all of the above
6.78 Using Bayes' theorem, we may use the shuttle launch and weather
information to solve for
a. P(HC | L)
b. P(LC | L)
c. P(NC | L)
d. all of the above
6.79 The denominator 0.74 in Bayes' formula from the launch and weather
probabilities is calculated from the following sum:
a. 0.01 + 0.24 + 0.49
b. 0.02 + 0.12 + 0.60
c. 0.04 + 0.08 + 0.62
d. 0.12 + 0.24 + 0.38
e. 0.20 + 0.30 + 0.24
6.80 The probability that heavy clouds were forecast if a launch is known to have
occurred that day is approximately
a. 22 percent
b. 12 percent
c. 7 percent
d. 3 percent
e. 1 percent
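A minimal sketch of the Bayes' theorem calculation for the shuttle launch questions is given below. The forecast and launch probabilities come from the problem statement; the dictionary names are illustrative.

```python
# Marginal probabilities of each forecast; HC is what remains after NC and LC
prior = {"NC": 0.75, "LC": 0.20}
prior["HC"] = 1 - prior["NC"] - prior["LC"]

# Conditional probabilities of a launch given each forecast
launch_given = {"HC": 0.40, "LC": 0.60, "NC": 0.80}

# Joint probabilities P(forecast and L), then the denominator P(L)
joint = {f: prior[f] * launch_given[f] for f in prior}
p_launch = sum(joint.values())

# Bayes' theorem: P(forecast | L) = P(forecast and L) / P(L)
posterior = {f: joint[f] / p_launch for f in joint}

print(round(prior["HC"], 2), round(p_launch, 2), round(posterior["HC"], 3))
```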

6.1: b; 6.2:d; 6.3:d; 6.4: e; 6.16: d; 6.17: c; 6.18: c; 6.19: d; 6.20:e; 6.21:c; 6.22:e; 6.23: d; 6.24:
a; 6.25:b ; 6.26:a; 6.27:b; 6.28:b; 6.29: c; 6.30: e; 6.31:a; 6.32: b; 6.33: e; 6.34:b; 6.35: c; 6.36:c;
6.37:e; 6.38:d; 6.39:c; 6.40:b; 6.41:e; 6.64: d; 6.65:d; 6.69:a; 6.70: b; 6.71:d; 6.72:d; 6.73: d;
6.74:a; 6.75:d; 6.76: b; 6.77: b; 6.78:d; 6.79:b; 6.80: d
G. Review Questions for Chapter 7

7.1 Which of the following is not a method to determine the appropriate
distribution for a decision making problem?
a. make reasonable assumptions about the distribution in the population
b. apply well-understood distributions as approximations for large-sample
situations
c. apply well-understood distributions by using ordinal measures
d. discern the population distribution from the sample's distribution in large
sample situations
e. all of the above are valid methods
7.2 In many situations, we may be able to assign a distribution for our analysis by
a. examining distributions of earlier studies
b. asking experts about the process by which the data are generated
c. using our subjective feeling
d. all of the above
7.3 Most statistical analysis uses only a few distributions for all but one of the
following reasons:
a. we may use large sample properties
b. business variables are distributed in only a few different manners
c. we can represent many distributions by a handful of distribution families
d. business statistics relies primarily on discrete distributions
e. all of the above are explanations
7.4 In distributions that are skewed to the right,
a. the median will lie to the left of the mean
b. the mean will lie to the left of the median
c. the median and mean will be identical
d. the relationship between the median and mean will depend on the specific
distribution
e. skewed distributions do not have a median
7.5 The probability that exactly two computers in the lab will break down this
month is approximately
a. 0.1
b. 0.2
c. 0.3
d. 0.4
e. 0.5
7.6 The probability that no more than two computers will break down this
month is approximately
a. 0.3
b. 0.4
c. 0.5
d. 0.6
e. 0.7
7.7 If the cost of repairing a computer is $300, the approximate probability that
service to the lab will cost the maintenance firm at least $1800 is
a. 0.001
b. 0.006
c. 0.01
d. 0.06
e. 0.1
7.8 Which of the following is not an example of a Bernoulli trial:
a. the merger goes through or it does not go through
b. the manager hired is a male or a female
c. the sale takes place before or after lunch
d. the stock market goes up today or it does not go up
e. all of these are examples of a Bernoulli trial
7.9 A company either adopts TQM methods or it does not adopt them. If the
probability of adoption is 0.4, then, in the notation of Bernoulli trials,
a. P(S) = 0.4
b. P(F) = 0.6
c. p = 0.6
d. q = 0.4
e. all of the above
7.10 If the probability of any particular car buyer choosing the dealer's bank
financing is 0.7, then the probability that the next four buyers will each
choose the dealer's bank financing is approximately
a. 0.53
b. 0.49
c. 0.34
d. 0.24
e. 0.17
7.11 For the preceding example, out of n = 5 buyers the expected number of
buyers who accept the dealer's bank financing is
a. 0.7
b. 1.05
c. 2.5
d. 3.2
e. 3.5
7.12 The standard deviation for the expected number of buyers in the preceding
example is
a. 3.50
b. 1.87
c. 1.22
d. 1.05
e. 1.02
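The three car financing questions use standard binomial results: independent trials multiply, the mean is np, and the standard deviation is the square root of np(1 - p). A short sketch, with illustrative variable names:

```python
import math

p = 0.7    # probability a buyer chooses the dealer's bank financing
n = 5      # number of buyers in questions 7.11 and 7.12

p_next_four = p ** 4                   # 7.10: four independent buyers all accept
mean = n * p                           # 7.11: expected number out of n = 5
std_dev = math.sqrt(n * p * (1 - p))   # 7.12: binomial standard deviation

print(round(p_next_four, 2), round(mean, 1), round(std_dev, 2))
```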
7.13 A probability density function applies only to
a. continuous random variables
b. discrete random variables
c. distributions having both tails infinitely long
d. distributions having at least one infinitely-long tail
e. any random variables
7.14 By examining the algebraic function for the normal pdf, it is easy to see that
the density
a. is determined from only two parameters, μ and σ
b. is the same for x = 1.5 as it is for x = 1.5
c. always has a positive value
d. is a maximum at x = μ
e. all of the above
7.15 Which of the following is not a characteristic of the normal pdf?

a. it is unimodal
b. it is symmetrical
c. the mean and mode are identical
d. the mean and median are identical
e. all of the above are characteristics of the normal pdf
7.16 Which of the following would be most likely to be approximately normal?
a. the distribution of annual income of U.S. households
b. distribution of city and town populations in Texas
c. age distribution of workers at Kodak
d. number of McDonalds franchises in each of the fifty states
e. sales of manufacturing corporations in the United States
7.17 Which of the following is true about the standard normal distribution:
a. the mean is 0
b. the standard deviation is 1
c. the variance is 1
d. all of the above
e. a and b only
7.18 Ninety-five percent of the area under the normal pdf is
a. within 1.96 standard deviations of the mean
b. within 1.96 of the mean
c. within 1.96 of zero
d. beyond 1.96 of zero
7.19 If class size at a college is normally distributed, approximately 68 percent of
the class sizes at the college are
a. within one standard deviation of zero
b. within one student of the mean
c. within one percent of the mean
d. within one standard deviation of the mean
e. within one student of the standard deviation
7.20 If class size at a college is normally distributed, then approximately 95
percent of the class sizes at the college are
a. within two standard deviations of zero
b. within two students of the mean
c. within two percent of the mean
d. within two standard deviations of the mean
e. within two students of the standard deviation
7.21 A hardware store has weekly sales that are normally distributed with mean
of $18,000 and standard deviation of $4,500. What can you conclude about
the median and modal class midpoint, and approximate percentage of weeks
with sales between $9,000 and $27,000?
7.22 A quality control chart records the lead impurity levels for 60 motor oil
samples collected today. From years of operation, the production manager
knows that impurity levels for properly functioning machinery are normally
distributed with mean of 80 parts per million (ppm) lead impurities and
standard deviation of 12 ppm.
a) What can you conclude about the median and mode for this distribution?
b) What is an interval of lead impurity levels wide enough to include approximately
57 (or 95 percent) of the 60 oil samples?
c) What is an interval of impurity levels wide enough to include virtually all 60
(about 99.5 percent) of the samples?
d) How many standard deviations from the mean would an impurity level of 140
ppm be?
e) How many standard deviations from the mean would an impurity level of 62
ppm be?
f) To standardize the impurity data to a standard normal variable Z, you would
subtract the number _______ from each value and then divide by the number
_______. By doing so, the new data Z would have a mean equal to _______ and a
standard deviation equal to _______.
Answers for Chapter 7:

7.1: e; 7.2: d; 7.3: d; 7.4: a; 7.5: c; 7.6: e; 7.7: b; 7.8: e; 7.9: e; 7.10: d; 7.11: e; 7.12: e; 7.13: a;
7.14: e; 7.15: e; 7.16: c; 7.17: d; 7.18: a; 7.19: d; 7.20: d; 7.21: Average sale, whether measured
by the mean, median, or mode, should be about $18,000 per week because the distribution is
normal. Approximately 95 percent of the weeks should experience sales between $9,000 and
$27,000 because 95% of the population for a normally distributed variable should have values
within two standard deviations (2 times $4,500) of the mean, $18,000; 7.22: a) median and
mode will be the same as the mean, 80 ppm, b) μ plus or minus 2σ is 80 ± 24, or an interval of
(56 ppm, 104 ppm), c) μ plus or minus 3σ is 80 ± 36, or an interval of (44 ppm, 116 ppm), d) (140
- 80)/12 = 5 standard deviations above the mean, e) (62 - 80)/12 = -18/12 = -1.5, or 1.5 standard
deviations below the mean, f) 80, 12, zero, 1.
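The standardizing steps used in 7.21 and 7.22 can be sketched in Python as follows. The helper names z_score and interval are hypothetical; the numbers are the means and standard deviations stated in those two questions.

```python
def z_score(x, mean, sd):
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    return (x - mean) / sd


def interval(mean, sd, k):
    """Interval extending k standard deviations on either side of the mean."""
    return (mean - k * sd, mean + k * sd)


# 7.22: lead impurity levels, mean 80 ppm and standard deviation 12 ppm
mean_ppm, sd_ppm = 80, 12
z_140 = z_score(140, mean_ppm, sd_ppm)    # part d
z_62 = z_score(62, mean_ppm, sd_ppm)      # part e (negative means below the mean)
two_sd = interval(mean_ppm, sd_ppm, 2)    # part b, roughly 95 percent of samples
three_sd = interval(mean_ppm, sd_ppm, 3)  # part c, virtually all samples

# 7.21: weekly sales, mean $18,000 and standard deviation $4,500
sales_two_sd = interval(18_000, 4_500, 2)

print(z_140, z_62, two_sd, three_sd, sales_two_sd)
```

A negative z-score is usually reported as a distance below the mean, which is why part e reads as 1.5 standard deviations below 80 ppm.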
H. Review Questions for Chapter 8

8.1
Statistical inference favors the use of larger samples because large sample
sizes do each of the following except:
a. reduce the thickness of the t distribution tails.
b. increase the √n quotient in the standard error calculations.
c. tend to make central limit theorem approximations appropriate.
d. permit the use of the normal distribution for our sampling distribution.
e. all of the above
8.2
If X is a normally distributed random variable with mean μ and standard
deviation σ, then X̄ will be a random variable with
a. a normal distribution
b. a mean of μ
c. a standard deviation of σ/√n
d. all of the above
8.3
If X is a random variable, the sampling distribution of X̄
a. is normally distributed if X is normal with a known value of σ
b. has a t distribution if X is normal and σ unknown
c. is approximately normal for large samples and known value of σ
d. has a t distribution approximately for large samples and unknown σ
e. all of the above
8.4
Which of the following are not common characteristics of both the normal
distribution and t distributions:
a. Their shape depends on the number of degrees of freedom.
b. They are symmetrical.
c. Each has two infinitely-long tails.
d. They each have a single mode.
e. All of the above are characteristics of both distributions.
8.5
If mean auto sales for a random sample of n = 9 salespersons last year was
$300,000 and the sample standard deviation was $90,000, then we should
use as the standard error of the estimate a value of
a. $60,000
b. $30,000
c. $15,000
d. $7500
e. answer depends on the choice of sampling distribution.
8.6
If σ = 8 for a random variable X, then the standard error σ_X̄ for a sample size of 16 is
a. 0.5
b. 2
c. 4
d. 8
e. 32
8.7
If s = 24 for a random sample, then the standard error s_X̄ for a sample size of 36 is
a. 0.67
b. 3
c. 4
d. 8
e. 144
8.8
In comparing the t and standard normal Z distributions, which of the
following is not true?
a. the t distribution is not symmetrical for small samples
b. the t distribution converges to the normal distribution as the sample size
increases
c. the t distribution has thicker tails than the normal
d. only the t distribution varies with the number of degrees of freedom
e. all of the above are true
8.9 The central limit theorem is applied to inference situations where

a. the sample size is large
b. the sample size is small
c. the population is normally distributed
d. the sample is not a random sample
e. the population standard deviation is unknown
8.31 The difference between a parameter and a sample point estimate of that
parameter is called the
a. sampling error
b. bias
c. variance
d. distribution
e. degrees of freedom
8.32 Which of the following does not suggest an interval estimate for the
population mean:
a. As a rough guess, I'd say we rework an average of 15 assemblies each
month.
b. Our new boss reprimands a dozen or so employees a week.
c. Flights average 10 minutes late give or take five minutes.
d. GM's market share averaged about 40 % in each of the past three decades.
e. All of the above suggest interval estimates.
8.33 Which of the following does not involve statistical inference?
a. descriptive statistics
b. point estimation
c. interval estimation
d. hypothesis testing
e. forecasting
8.34 Which information is generally found in a confidence interval?
a. a point estimate
b. an interval width
c. a confidence level
d. all of the above
e. a and c only
8.35 The probability distribution of an estimator or statistic is a
a. random sample
b. confidence interval
c. null hypothesis
d. summary statistic
e. sampling distribution
8.36 Analysts for a tire manufacturer estimate from a random sample that mean
tread life for its new radial belt tire is between 20,000 and 50,000 miles.
Which of the following strategies might be used to obtain a narrower
confidence interval?
a. Use a point estimate.
b. Collect a larger sample.
c. Use a smaller value for .
d. Use a larger confidence level.
e. All of the above.
8.41 A statement capable of being subjected to empirical evidence is
a. a hypothesis
b. a tautology
c. a theory
d. a parameter
e. an assumption
8.42 One difference between business statistics and statistics in the sciences is
a. business statistics cannot use the scientific method
b. sciences do not need to formulate hypotheses
c. sciences do not test hypotheses
d. human behavior is not as predictable as planets and viruses
e. all of the above are differences
8.43 Each of the following is a major source of measurement error in business and
economic data except
a. many societies fail to record business and economic data
b. governments suppress data to protect confidentiality
c. people are reticent to disclose financial information
d. businesses are reluctant to make information available to rivals
e. all of the above are sources of measurement error
8.44 In traditional hypothesis testing, we
a. try to reject the null hypothesis
b. set up the null hypothesis as a "straw man"
c. make the null and alternative hypotheses exhaustive
d. make the null and alternative hypotheses mutually exclusive
e. all of the above
8.45 What is wrong with the following hypothesis test:

H0: X̄ = 12
HA: X̄ > 12
a. should be stated in terms of 0, not 12
b. should involve population parameters
c. does not tell us which is significant
d. does not involve a confidence interval
e. all of the above
8.46 A type II error is
a. rejecting H0 when we shouldn't have
b. rejecting HA when we shouldn't have
c. not rejecting H0 when we should have
d. not rejecting HA when we should have
e. the answer depends on the particulars of the problem
8.47 Alpha, α, is
a. the level of significance
b. the probability of a type I error
c. the chance we take that our significance judgments are wrong
d. the chance we take that we are incorrect when rejecting H0
e. all of the above
8.48 In a two-sided test, each rejection region corresponds to a tail containing
probability area equal to
a. 1 -
b. 1 -
c. / 2
d. α / 2
e.
8.49 A formula that may be calculated from sample information and hypothesized
values in H0 is called
a. a rejection region
b. a decision rule
c. a test statistic
d. a significance level
e. an hypothesis test
8.50 Which of the following is not a limitation of statistical significance?
a. in large samples, significance can be found from minor patterns
b. in small samples, even substantial effect size may not result in significance
c. although we can select α, we usually don't know β
d. rejecting H0 does not necessarily support any particular HA
e. all of the above are limitations
8.51 Which of the following is not an unethical practice in hypothesis testing?
a. choosing an H0 that is easy to reject
b. always conducting one-sided tests
c. conducting hypothesis tests prior to estimating confidence intervals
d. choosing significance levels just large enough to obtain significance
e. all of the above are unethical
8.52 A one-sided test that yields significance at the α = .05 level is equivalent to
significance for the corresponding two-sided test at the
a. α = .10 level
b. α = .05 level
c. α = .025 level
d. α = .01 level
8.53 One difference between hypothesis testing and interval estimation is
a. hypothesis tests may have no meaningful estimation counterpart
b. hypothesis testing involves inferential statistics
c. hypothesis tests examine the distribution centered around
d. hypothesis testing loses much of its usefulness for small samples
e. hypothesis testing is more important in forecasting problems
Answers for Chapter 8:

8.1: d; 8.2: d; 8.3: e; 8.4: a; 8.5: b; 8.6: b; 8.7: c; 8.8: a; 8.9: a; 8.31: a; 8.32: e; 8.33: a;
8.34: d; 8.35: e; 8.36: b; 8.41: a; 8.42: d; 8.43: a; 8.44: e; 8.45: b; 8.46: c; 8.47: e; 8.48: d;
8.49: c; 8.50: e; 8.51: c; 8.52: a; 8.53: a.
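The standard error calculations in 8.5 through 8.7 all use the same formula: the standard deviation divided by the square root of the sample size. A minimal Python sketch follows, using the sample sizes of 9, 16, and 36 from those questions; the function name standard_error is illustrative.

```python
import math


def standard_error(sd, n):
    """Standard error of the sample mean: standard deviation over root n."""
    return sd / math.sqrt(n)


se_8_5 = standard_error(90_000, 9)  # 8.5: s = $90,000 and n = 9
se_8_6 = standard_error(8, 16)      # 8.6: sigma = 8 and n = 16
se_8_7 = standard_error(24, 36)     # 8.7: s = 24 and n = 36

print(se_8_5, se_8_6, se_8_7)
```

Because n enters through its square root, quadrupling the sample size only halves the standard error.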
I. Review Questions for Chapter 9
9.1
The finite population correction factor (fpc) should be used if

a. the sample size is a substantial fraction of the population size
b. the sample size is larger than 30.
c. the population mean is small.
d. populations are not normally distributed.
e. any of the above indicate that the fpc should be used.
9.2
A confidence interval based on the Student t sampling distribution will

a. tend to be wider than an interval based on the normal distribution.
b. always be wider than an interval based on the normal distribution.
c. tend to be narrower than an interval based on the normal distribution.
d. always be narrower than an interval based on the normal distribution.
e. tend to have the same width as an interval based on the normal
distribution.
9.3
Suppose that analysis of sample data results in an estimated mean for

insurance adjustors of 12.5 claims processed per day with a margin of error
of 1.6 and a 95 percent level of confidence. Then the confidence interval
reported is
a. (11.7, 13.3)
b. (10.9, 14.1)
c. (12.5, 14.1)
d. (12.5, 15.7)
e. unable to determine from the information provided
9.4
Based on the preceding problem, 95 percent represents

a. the likelihood that interval contains the population mean
b. the likelihood that interval does not contain the population mean
c. the likelihood that interval contains the sample mean
d. the likelihood that interval does not contain the sample mean
9.5
The log transformation may be used when working with variables having a
a. bimodal distribution
b. symmetrical distributions
c. uniform distributions
d. distribution with two infinitely-long tails
e. highly skewed distribution
9.6
Which of the following variables would be a likely candidate for log
transformation?
a. the wealth of students' parents at your college
b. the height of students at your college
c. the grade point average of students at your college
d. the length of textbooks used at your college
e. all of the above
Answer the next three questions based on the following case and statistical output:
A manufacturer collects a sample of 24 monthly sales (in units of millions of dollars)
to estimate mean monthly sales for the population:
Confidence Intervals
Variable     N     Mean    StDev  SE Mean        90.0 % C.I.
sales       24   10.796    4.273    0.872   ( 9.301, 12.291)
9.7
For this case, the standard error of the mean is a little more than one-fifth
the size of the sample standard deviation because:
a. The square-root of the sample size is slightly less than five
b. The standard deviation is a bit less than five
c. Half the mean is slightly greater than five
d. Twice the width of the confidence interval is slightly greater than five
e. None of the above facts are relevant here
9.8
Which of the following can we conclude?

a. the population mean is $10.8 million
b. we are confident that sales are between $9.3 and $12.3 million in 90% of
all months in the population
c. population mean is $10.8 million and we are 90% confident that the
sample mean lies between $9.3 and $12.3 million
d. The sample mean has a 80% chance of lying within the interval from $9.3
million and $10.8 million
e. The sample mean is $10.8 million and we are 90% confident that the
population mean is between $9.3 and $12.3 million

9.9
The confidence interval reported is substantially narrower than
four-standard-errors-of-the-mean wide because
a. 90% confidence intervals are narrower than 95% intervals
b. confidence intervals using the t-distribution are narrower than those using
the z-distribution
c. the standard deviation is fairly small in this sample
d. we must first divide by the square root of the sample size
e. all of the above
T-Test of the Mean

Test of mu = 110.00 vs mu not = 110.00
Variable     N     Mean    StDev  SE Mean       T   P-Value
DEMAND      60   117.70    14.42     1.86    4.14    0.0001
9.10 A test is conducted at the α = 0.01 level to determine if average monthly
demand now is significantly different from 110 million gallons. We may
conclude that
a. p > α and therefore reject H0
b. p < α and therefore reject H0
c. p < α and therefore cannot reject H0
d. p > α and therefore cannot reject H0
e. Insufficient information provided to reach a conclusion
9.11 For this large sample, we can be 95 percent confident that monthly water
demand now averages between approximately
a. 114.0 and 121.4 million gallons
b. 103.3 and 132.1 million gallons
c. 114.6 and 121.8 million gallons
d. 115.8 and 119.6 million gallons
A premium is the price charged to maintain insurance coverage. A random sample
of n = 40 life insurance policyholders is surveyed. An insurance company tests
whether premiums charged its nonsmoker policyholders are significantly below $25
for $1000 coverage. Answer the next four questions based on the following case
and statistical output:
T-Test of the Mean

Test of mu = 25.00 vs mu < 25.00
Variable     N     Mean    StDev  SE Mean       T   P-Value
NonSmoke    40    18.94    16.39     2.59   -2.34     0.012
9.12 The null and alternative hypotheses for this test on mean nonsmoker
premiums are:
a. H0: μ = $25   HA: μ > $25
b. H0: μ = $25   HA: μ ≠ $25
c. H0: μ = $25   HA: μ < $25
d. H0: μ > $25   HA: μ < $25
e. H0: μ < $25   HA: μ > $25
9.13 According to the printout, the mean premium for this sample of nonsmokers
a. is 16.94 standard errors below $25
b. is 16.39 standard errors below $25
c. is 2.59 standard errors below $25
d. is 2.34 standard errors below $25
e. is 0.012 standard errors below $25
9.14 The mean premium tests significantly less than $25 (per $1000 coverage) at any
of the following significance levels except:
a. α = .20 significance level
b. α = .10 significance level
c. α = .05 significance level
d. α = .01 significance level
e. tests significant at any of the levels selected above
9.15 If a two-tailed test had been conducted instead, the computer output would
have been exactly the same except
a. the t-ratio would have been twice as large: t = 4.68
b. the t-ratio would have been half as large: t = 1.17
c. the t-ratio would have been positive: t = +2.34
d. the p-value would have been twice as large: p = 0.024
e. the p-value would have been half as large: p = 0.006
Answers for Chapter 9:
9.1: a; 9.2: a; 9.3: b; 9.4: a; 9.5: e; 9.6: a; 9.7: a; 9.8: e; 9.9: a; 9.10: b; 9.11: a; 9.12: c; 9.13:
d; 9.14: d; 9.15: d
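Several of the Chapter 9 answers follow directly from the printed output, and the arithmetic can be sketched in Python. The sketch assumes the rough "two standard errors" rule for a large-sample 95 percent interval in 9.11 rather than an exact critical value; all names are illustrative.

```python
import math


def t_ratio(sample_mean, hypothesized_mean, sd, n):
    """Test statistic: distance from the hypothesized mean in standard errors."""
    return (sample_mean - hypothesized_mean) / (sd / math.sqrt(n))


# 9.10 and 9.13: t-ratios recomputed from the Mean, StDev, and N columns
t_demand = t_ratio(117.70, 110.00, 14.42, 60)
t_nonsmoke = t_ratio(18.94, 25.00, 16.39, 40)

# 9.11: approximate 95% interval as mean plus or minus two printed standard errors
ci_demand = (117.70 - 2 * 1.86, 117.70 + 2 * 1.86)

# 9.9: half-width of the printed 90% interval, measured in standard errors
half_width_in_se = (12.291 - 9.301) / 2 / 0.872

# 9.15: a two-tailed p-value is double the one-tailed p-value
p_two_sided = 2 * 0.012

print(round(t_demand, 2), round(t_nonsmoke, 2), ci_demand,
      round(half_width_in_se, 2), p_two_sided)
```

The half-width in 9.9 comes out below two standard errors, which is consistent with a 90 percent interval being narrower than a 95 percent one.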
J. Review Questions for Chapter 10
10.1 Data on the outcome of an experiment is found in the

a. experimental unit
b. response variable
c. factor or factors
d. treatments
e. experimental design
10.2 Analysis of variance often uses the following method to control for factors
that may "confound" the test results:
a. replication
b. balanced design
c. randomized design
d. all of the above
10.3 Which of the following is not a problem associated with trying to control
confounding factors in an experimental design:
a. can be time consuming and costly
b. may overlook some factors that need to be controlled for
c. need to fix factors at realistic values for the results to be applicable
d. may prevent achieving a balanced design
e. all of the above are common problems encountered in controlling for
other variables
10.4 Which of the following is not a problem associated with constructing a
completely randomized design:
a. may not be able to compel participation in an experiment
b. it may be unethical to force participants to serve as guinea pigs
c. if only volunteers participate, self selection bias may result
d. participation may require costly compensation
e. all of the above are common problems encountered in constructing
completely randomized designs
10.5 A property assessor wants to determine whether property values vary among
houses on different-sized lots. If the assessor measures property value for a
random sample of houses in the Atlanta area in October 1993, the relation
between property value and lot size may be confounded if
a. property values differ between different regions of the country
b. property values vary from one year to the next
c. property values vary among houses different distances from downtown
Atlanta jobs and shopping
d. property values differ among houses, apartments, and commercial
e. none of the above are capable of producing confounding effects for this
experimental design
10.15 What distinguishes one-way analysis of variance from other types of ANOVA
is
a. there is only one response variable
b. there is only one factor
c. there is only one treatment
d. both b and c
e. none of the above are unique to one-way ANOVA
10.20 Which of the following is not an assumption of analysis of variance models?
a. zero overall population mean
b. constant standard deviation among the treatment populations
c. normally distributed treatment populations
d. independent, random samples
e. all of the above are assumptions of ANOVA models
10.21 Which of the following is an assumption about the random disturbance term
in analysis of variance models?
a. constant standard deviation of the random disturbance
b. normal distribution of the random disturbance
c. both a and b
d. none of the above
10.22 If treatment sample sizes are approximately equal, which of the following
can be said about the sensitivity of the assumptions for one-way analysis of
variance:
a. moderate departures from normality do not invalidate ANOVA test results
b. moderate differences among treatment standard deviations do not
invalidate ANOVA test results
c. use of non-independent or nonrandom samples do not invalidate ANOVA
test results
d. a and b are true
e. only balanced designs may be subjected to analysis of variance tests
10.23 If we wish to test whether construction worker wages are significantly
different among four different states, the number of treatments necessary
for analysis of variance must be
a. 1
b. 2
c. 3
d. 4
e. insufficient information provided to answer this question
10.24 In an ANOVA test of whether mean construction worker wages are
significantly different among four different states, not rejecting the null
hypothesis implies that
a. mean construction worker wages are the same as mean wages in other
occupations
b. mean construction worker wages are the same from one year to the next
c. mean construction workers wages are the same for each state
d. wages are the same among all construction workers
e. all of the above
10.25 In order to reject the null hypothesis that mean MPG (miles per gallon) is the
same among subcompact, compact, and mid-sized cars, we must conclude
that
a. sample mean MPGs are all different
b. at least one sample mean is different from the sample mean MPG for the
other two car sizes
c. population mean MPGs are all different
d. at least one population mean is different from the population mean MPG
for the other two car sizes
10.33 The F-ratio for one-way analysis of variance is the ratio of
a. the treatment mean to the overall mean
b. the treatment mean square to the mean square error
c. the treatment sum of squares to the error sum of squares
d. the alpha value to the p-value
e. the p-value to the alpha value
Answer the following questions about the analysis of variance table below:
SOURCE   DF   SS   MS   F
FACTOR    3   18
ERROR    20
TOTAL         48
10.34 The number of treatments for the factor is
a. 1
b. 2
c. 3
d. 4
10.35 The sample size for the experimental design is
a. 22
b. 23
c. 24
d. 25
e. 26
10.36 The error sum of squares, SSE, and mean square error, MSE, are
a. 30 and 10
b. 30 and 1.5
c. 66 and 33
d. 66 and 3.3
e. not enough information to determine
10.37 The F ratio is
a. 1
b. 3
c. 4
d. 5
e. 8
10.38 Determine MSTR and MSE from the following information:
(a) SSTR = 30, SSE = 480, n = 30, and k = 6
(b) SSTR = 300, SSE = 48, n = 30, and k = 6
(c) SSTR = 30, SSE = 480, n = 60, and k = 12
(d) SSTR = 30, SSE = 480, n = 12, and k = 3
(e) SSTR = 30, SSE = 48, n = 12, and k = 4
10.39 Determine the F-ratio from the following information:
(a) MSTR = 400, MSE = 120
(b) MSTR = 0.50, MSE = 0.020
(c) SSTR = 30, SSE = 480, n = 12, and k = 3
(d) SSTR = 300, SSE = 48, n = 30, and k = 6
10.42 Tests to compare individual treatment means may be conducted only if
a. the null hypothesis that all treatment means are equal was to be rejected
b. at least one of the sample treatment means is different from the rest
c. the sample size is large (at least 30)
d. there are at least four treatments being compared
e. not all the assumptions of the ANOVA model are valid
10.43 If we can reject H0: μ1 = μ2 = μ3, then
a. μ1 is different from μ2
b. μ1 is different from either μ2 or from μ3 or from both
c. μ1 is different from μ2 and μ2 is different from μ3
d. one of the three μ's is different from the other two, which are equal
e. either c or d must be true
10.44 For each of the following, complete the omitted values for DF, SS, MS, or F.
(a) SOURCE   DF   SS   MS    F
    FACTOR    4   20
    ERROR    30   60
    TOTAL    34   80
(b) SOURCE   DF   SS   MS    F
    FACTOR    3   20
    ERROR    20   60
    TOTAL
(c) SOURCE   DF   SS   MS    F
    FACTOR    5   20
    ERROR
    TOTAL    29   80
(d) SOURCE   DF   SS   MS    F
    FACTOR             8.5
    ERROR    21   63
    TOTAL    23   80
(e) SOURCE   DF   SS   MS    F
    FACTOR    4              4.5
    ERROR         15
    TOTAL    14
10.78 Which of the following is not common to both regression and analysis of
variance?
a. involve linear models
b. use an F-test to test for the significance of the model
c. use quantitative explanatory variables
d. use a dependent variable that is measured quantitatively
e. all of the above are common to regression and analysis of variance
10.79 Regression models are similar to analysis of variance models in all except
which of the following ways:
a. the dependent variable in regression is like the response variable in
ANOVA
b. the explanatory variables in regression are like the factors and treatments
in ANOVA
c. both use F tests for significance of the model
d. both report analysis of variance tables
e. all of the above are similarities between regression and analysis of
variance
10.80 Which of the following advice should you give someone deciding between
the use of regression and analysis of variance models?
a. use analysis of variance if you have a balanced design
b. use analysis of variance if you need to estimate a slope coefficient for an
explanatory variable
c. use regression if you have a controlled experiment with most confounding
factors held fixed
d. use regression if you have only one or two explanatory variables, each of
which is categorical
e. all of the above are good advice to give
10.81 A two-way model (without interaction) is run to test whether brand of

computer and capacity of hard disk have a significant effect on the price of a
386 computer. The 24 data observations are stored in three columns with
response variable and two factors defined as follows:
pr 386 response variable: dollar price of 386 computer
brand factor 1: coded 1 = off-brand, 2 = name brand, 3 = premium brand
harddisk factor 2: coded 1 = small capacity hard disk (less than 40
megabytes), 2 = larger capacity hard disk (60 or more megabytes)
ROWS: harddisk   COLUMNS: brand
          1        2        3      ALL
  1    1963.0   1911.0   2685.0   2186.3
  2    2861.0   1931.3   3589.0   2793.8
ALL    2412.0   1921.1   3137.0   2490.0
CELL CONTENTS -- pr 386:MEAN
Analysis of Variance for pr 386
Source     DF         SS        MS      F      P
brand       2    5986494   2993247   7.41  0.004
harddisk    1    2213730   2213730   5.48  0.030
Error      20    8075865    403793
Total      23   16276089
Use the p-values to conduct the F tests at the α = .05 level for each factor, interpret
your test results verbally to a person shopping for a 386 computer, and examine
the table of means to show that a brand name computer with large capacity hard
disk appears to be an excellent buy.
There is a significant difference in price based on both brand and hard disk capacity.
However, a person buying a name-brand computer will pay just slightly more (about $20
on average) for larger disk space, whereas the larger disk space in the off-brand or
premium brand will cost approximately $900 more.
10.82 A one-way analysis of variance test is conducted to see if the amount a

customer spends in a jewelry store varies with the time a salesperson spends with
that customer. The columns contain random samples of customer purchases at
three treatments for salesperson time spent in minutes: 5, 10, and 15-30.
5 min purchase when salesperson spent about 5 minutes with customer
10 min purchase when salesperson spent about 10 minutes with customer
15-30min purchase when salesperson spent 15 to 30 minutes with customer
The table shows the sorted sample data on customer dollar purchase amounts:
ROW   5 min   10 min   15-30 min
  1       0        0           0
  2      10        0           0
  3      10        0           0
  4      18        0           0
  5      50       25          10
  6      50       30          35
  7      78       50          70
  8               60         100
  9              100         200
 10              100         200
 11              150         300
 12              419         400
 13              575         400
 14                          560
 15                          575
 16                          599
ANALYSIS OF VARIANCE
SOURCE   DF        SS      MS      F      p
FACTOR    2    181681   90841   2.60  0.089
ERROR    33   1151870   34905
TOTAL    35   1333551
INDIVIDUAL 95 PCT CI'S FOR MEAN BASED ON POOLED STDEV
LEVEL       N    MEAN   STDEV  ----------+---------+---------+-----
5 min       7    30.9    28.7  (-----------*-----------)
10 min     13   116.1   178.2            (--------*-------)
15-30min   16   215.6   225.9                      (-------*-------)
                               ----------+---------+---------+-----
POOLED STDEV = 186.8                     0       120       240
(1) State the alternative hypothesis given that the null hypothesis is
H0: μ5 min = μ10 min = μ15-30min
HA: at least one μj different from the other means
(2) Show results of p-value decision rules for this test at the α = .05 significance level
and state your test findings in one sentence.
p = .089 > α = .05. No significant spending differences found regardless of amount of
time salesperson devoted to customer.
(3) If the manager only examines the means of the three treatment samples, he
might conclude that 10 minutes with the customer nearly quadrupled purchase
amounts ($116.1 versus $30.9), and 15 to 30 minutes doubled that figure again
($215.6 versus $116.1). Use the data listing and confidence interval diagram to
respond to this mistaken conclusion.
The overlapping confidence intervals and huge range of data within the last two
categories indicate that within-category variation dominates the between-category
variation.
(4) Is the design balanced? Explain.
The design is unbalanced because the three treatments have unequal sample sizes
(7, 13, and 16 observations), with the fewest in the "5 min" category.
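The ANOVA printout for 10.82 can be reproduced from the data listing with a short Python sketch. The three lists below assume the listing splits into 7, 13, and 16 observations, matching the N column of the printout; the function name one_way_anova is illustrative, and the sums of squares agree with the printed table only up to rounding.

```python
def one_way_anova(groups):
    """Return (SSTR, SSE, F) for a list of samples, one list per treatment."""
    all_values = [x for g in groups for x in g]
    n, k = len(all_values), len(groups)
    grand_mean = sum(all_values) / n
    # Between-treatment variation: treatment means around the grand mean
    sstr = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)
    # Within-treatment variation: observations around their own treatment mean
    sse = sum((x - sum(g) / len(g)) ** 2 for g in groups for x in g)
    f_ratio = (sstr / (k - 1)) / (sse / (n - k))
    return sstr, sse, f_ratio


five_min = [0, 10, 10, 18, 50, 50, 78]
ten_min = [0, 0, 0, 0, 25, 30, 50, 60, 100, 100, 150, 419, 575]
fifteen_30_min = [0, 0, 0, 0, 10, 35, 70, 100, 200, 200,
                  300, 400, 400, 560, 575, 599]

sstr, sse, f_ratio = one_way_anova([five_min, ten_min, fifteen_30_min])
print(round(sstr), round(sse), round(f_ratio, 2))
```

The error sum of squares is several times the treatment sum of squares, which is the numerical counterpart of the observation that within-category variation dominates.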
10.83. Many companies have United Way drives among their employees, and set
goals to surpass the previous year. Test whether marital status and the size
of one's salary has a significant effect on the percentage of salary donated to
United Way. A factorial design with replications from random sampling
results in 30 employee observations stored in three columns (i.e. stacked)
with response variable and two factors defined as follows:
donate%   response variable: percentage of salary donated to charity
marstat   factor 1: coded 1 = single, 2 = married
paycode   factor 2: coded 1 = salary under $20,000, 2 = salary $20,000 to
          $30,000, and 3 = salary at least $30,000
A table of means is first generated:
ROWS: marstat   COLUMNS: paycode
          1        2        3      ALL
  1    0.9284   1.4323   2.2490   1.5366
  2    0.9508   1.4080   1.7954   1.3848
ALL    0.9396   1.4202   2.0222   1.4607
CELL CONTENTS -- donate%:MEAN
A two-way model (with interactions) is run, with the following results:
Analysis of Variance for donate%
Source        DF        SS       MS       F      P
marstat        1    0.1729   0.1729    0.85  0.367
paycode        2    5.8841   2.9420   14.39  0.000
Interaction    2    0.3442   0.1721    0.84  0.443
Error         24    4.9053   0.2044
Total         29   11.3064
(1) Use the p-values to conduct the F tests at the α = .05 level for each factor,
interpret your test results verbally.
Only salary matters: paycode is significant (p = 0.000 < .05), while marital status
is not (p = 0.367 > .05).
(2) Use the ALL column or ALL row of the table of means to quantify any significant
patterns found in question #1.
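The significant factor, salary, was associated with donation rates that rise with pay:
about 0.94% of salary for employees earning under $20,000, 1.42% for those earning
$20,000 to $30,000, and 2.02% for those earning at least $30,000, so the highest-paid
group donates at roughly twice the rate of the lowest-paid group.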
(3) Conduct a test at the .05 level for possible interactions between marital status
and salary. What does the test result allow you to conclude about whether
an additivity assumption would have been valid?
The interaction fails the p-value decision rule (p = 0.443 > α = .05), so it is not
significant and an additivity assumption would have been valid.
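The F ratios and p-value decisions for 10.83 can be recomputed from the DF and SS columns of the printout. This Python sketch is only a check on the table arithmetic; the dictionary layout and names are illustrative, and the p-values are copied from the printout rather than computed.

```python
alpha = 0.05

# (DF, SS, printed p-value) for each source in the two-way ANOVA table
sources = {
    "marstat": (1, 0.1729, 0.367),
    "paycode": (2, 5.8841, 0.000),
    "Interaction": (2, 0.3442, 0.443),
}
error_df, error_ss = 24, 4.9053
mse = error_ss / error_df  # mean square error

results = {}
for source, (df, ss, p_value) in sources.items():
    f_ratio = (ss / df) / mse  # mean square for the source over MSE
    results[source] = (round(f_ratio, 2), p_value < alpha)

for source, (f_ratio, significant) in results.items():
    print(source, f_ratio, "significant" if significant else "not significant")
```

Each source is judged on its own p-value, so one factor can test significant while the other factor and the interaction do not.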

Answers for Chapter 10
10.1: b, 10.2: d, 10.3: d, 10.4: e, 10.5: c, 10.15: b, 10.20: a, 10.21: c, 10.22: d, 10.23:
d, 10.24: c, 10.25: d, 10.33: b, 10.34: d, 10.35: c, 10.36: b, 10.37: c, 10.38: a) 5,
20.87 b) 50, 2.09 c) 2.5, 10.21 d) 10, 60 e) 7.5, 6.86, 10.39: a) 3.33 b) 25 c) .167 d)
23.96, 10.42: a, 10.43: e, 10.44: a) MSR=5, MSE=2, F=2.5, b) DFT=23, SST=80,
MSR=6.67, MSE=3, F=2.22, c) DFE=24, SSE=60, MSR=4, MSE=2.5, F=1.6, d) DFR=2,
SSR=17, MSE=3, F=2.83, e) DFE=10, SSR=27, SST=42, MSR=6.75, MSE=1.5, 10.78: c,
10.79: e, 10.80: a
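Filling in an incomplete ANOVA table, as in 10.34 to 10.37 and 10.44, uses a few identities: degrees of freedom and sums of squares each add up to their totals, each mean square is SS divided by DF, and F is the factor mean square over the error mean square. A Python sketch follows; it assumes the table for 10.34 to 10.37 gives factor DF = 3, factor SS = 18, error DF = 20, and total SS = 48, and the function name is hypothetical.

```python
def complete_anova(df_factor, ss_factor, df_error, ss_total):
    """Fill in a one-way ANOVA table from four known entries."""
    ss_error = ss_total - ss_factor   # SS adds up: total = factor + error
    ms_factor = ss_factor / df_factor
    ms_error = ss_error / df_error
    return {
        "treatments": df_factor + 1,               # factor DF is treatments minus 1
        "sample_size": df_factor + df_error + 1,   # total DF is n minus 1
        "SSE": ss_error,
        "MSTR": ms_factor,
        "MSE": ms_error,
        "F": ms_factor / ms_error,
    }


table_10_34 = complete_anova(df_factor=3, ss_factor=18, df_error=20, ss_total=48)
table_10_44a = complete_anova(df_factor=4, ss_factor=20, df_error=30, ss_total=80)

print(table_10_34)
print(table_10_44a)
```

The same identities can be rearranged when a different set of entries is missing, for example recovering the factor SS from a given mean square and DF.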
K. Review Questions for Chapter 11

11.1 Which of the following is in proper form for a regression?
a. predicted Sales = b0 + b1 Invest + b2 Employ
b. Sales = β0 + β1 Invest + β2 Employ + ε
c. predicted Sales = β0 + β1 Invest + β2 Employ
d. Sales = b0 + b1 Invest + b2 Employ + ε
e. all of the above
11.2 Which of the following is a source of random disturbance due to modeling
error?
a. using a proxy variable instead of the ideal variable needed for your model
b. using a sample to estimate population characteristics
c. including only the most important variables for your model
d. using less than precise instruments to make your data readings
11.3 A proficiency exam score used in a regression model to "proxy for" worker
efficiency is an example of which source of random disturbance?
a. sampling error
b. measurement error
c. modeling error
d. all of the above
e. b and c only
11.4 The random disturbance term is required in population regressions because
of
a. sampling error
b. measurement error
c. modeling error
d. all of the above
e. b and c only
11.5 If you have data for the entire population, which of the following will no
longer be a factor?
a. sampling error
b. measurement error
c. modeling error
d. errors in judgment
e. all of the above
11.6 If all regression assumptions are valid, least-squares estimators
a. are unbiased
b. have minimum standard deviation among all unbiased estimators
c. are efficient
d. all of the above
11.7 Which of the following is not true about autocorrelation?
a. it results in inefficient estimation
b. it is a problem only for time series data
c. it means that εi and εj are correlated for i ≠ j
d. it results in biased estimation
e. all of the above are true
11.8 Omitting an intercept, or constant, term from a regression equation
a. is recommended whenever we suspect that E(ε) is not zero
b. means the equation cannot be estimated with least-squares analysis
c. usually improves the regression fit
d. should be avoided even if the intercept is meaningless in the equation
e. all of the above
11.9 Which of the following is not a regression assumption?
a. the parameters are constant
b. E(ε) is zero
c. ε is uncorrelated with each of the explanatory variables
d. each ε is uncorrelated with every other ε
e. all of the above are regression assumptions
11.10 A changing slope in a simple regression equation means that which
assumption is violated?
a. the parameters are constant
b. E(ε) is zero
c. ε is uncorrelated with each of the explanatory variables
d. each ε is uncorrelated with every other ε
e. the random disturbance is normal
11.11 If all regression assumptions are valid except there is nonconstant ,
estimators are
a. unbiased but not efficient
b. efficient but not unbiased
c. unbiased and efficient
d. neither unbiased nor efficient
11.12 According to the Gauss-Markov theorem
a. the assumption of normally distributed ε is approximately true for large
samples
b. least-squares estimators are efficient under the regression assumptions
c. the least-squares equation minimizes the sum of squared errors
d. all of the above
11.13 In many regression situations, according to the central limit theorem,
a. the assumption of normally distributed ε is approximately true for large
samples
b. least-squares estimators are efficient under the regression assumptions
c. the least-squares equation minimizes the sum of squared errors
d. all of the above
Case Study: Answer questions 11.14-11.18

An expressway planner in Orlando, Florida, is preparing her analysis about which
factors affect the traffic through the Beach Line toll plaza near the airport. Time
series data are collected for 72 months and the following model is used:
TRAFFIC = β0 + β1 METROPOP + β2 SALETAX + β3 UNEMPLOY + β4 AUTOTOUR + β5 AIRARRIV + ε
with variables defined as:
TRAFFIC    monthly traffic volume measured at expressway toll plaza
METROPOP   monthly population estimates for the metropolitan area (in thousands)
SALETAX    monthly state sales tax collected (in millions of dollars)
UNEMPLOY   monthly unemployment rate (in percentage points)
AUTOTOUR   monthly volume of automobile visitors to the state (in thousands)
AIRARRIV   monthly passengers arriving at metropolitan airport (in thousands)
The regression equation is:
TRAFFIC = 8559 + 115 METROPOP - 90.3 SALETAX - 1639 UNEMPLOY + 88.8 AUTOTOUR + 228 AIRARRIV
Predictor      Coef   Std Err   t-ratio       p
Constant       8559     85392      0.10   0.920
METROPOP      115.2     138.0      0.83   0.407
SALETAX      -90.33     70.69     -1.28   0.206
UNEMPLOY      -1639      4574     -0.36   0.721
AUTOTOUR      88.76     19.29      4.60   0.000
AIRARRIV     227.97     20.57     11.08   0.000
11.14. Complete the alternative hypothesis for a two-tailed significance test on the
SALETAX variable.
a) H0: β2 = 0 and HA: β2 ______ (complete the alternative hypothesis)
b) Use the p-value decision rule to conduct this test at the α = .05 level.
11.15. A one-tailed test is conducted on whether airport arrivals are directly related
to traffic volume; then
a) H0: β5 = 0 and HA: β5 ______ (complete the alternative hypothesis)
b) Use the p-value decision rule to conduct this test at the α = .05 level.
11.16. Why is a one-tailed test for the AIRARRIV variable justified in this model?
11.17. Use the p-value decision rule to conduct a one-tailed test for METROPOP at
the α = .05 level.
11.18. Based on a 95% confidence level, each additional one thousand auto
tourists adds about 89 cars/month to the traffic, plus or minus ______.
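The p-value decision rules used in questions 11.14 through 11.18 can be sketched in a few lines of Python. This is an illustrative sketch only: the function name and arguments are assumptions, the p-values, coefficients, and standard error are read from the printout above, and the margin of error uses the rough "plus or minus two standard errors" shortcut rather than an exact t-value.

```python
def rejects_null(p_two_sided, alpha, coef=None, expected_sign=None):
    """p-value decision rule.

    Two-tailed test (expected_sign is None): reject H0 when p < alpha.
    One-tailed test: reject H0 when the coefficient has the expected
    sign (+1 or -1) and p/2 < alpha.
    """
    if expected_sign is None:
        return p_two_sided < alpha
    return coef * expected_sign > 0 and p_two_sided / 2 < alpha


ALPHA = 0.05
saletax = rejects_null(0.206, ALPHA)                                   # 11.14
airarriv = rejects_null(0.000, ALPHA, coef=227.97, expected_sign=+1)  # 11.15
metropop = rejects_null(0.407, ALPHA, coef=115.2, expected_sign=+1)   # 11.17
margin_autotour = 2 * 19.29                                            # 11.18

print(saletax, airarriv, metropop, round(margin_autotour, 1))
```

The printed p-value is always two-sided, which is why the one-tailed rule compares p/2 with α after checking the sign of the coefficient.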
11.25 A valid decision rule for a two-sided test of a regression coefficient is
a. t-ratio > tα/2(n − k − 1)
b. t-ratio < tα/2(n − k − 1)
c. t-ratio > tα(n − k − 1)
d. t-ratio < tα(n − k − 1)
11.26 A valid decision rule for a two-sided test of a regression coefficient is
a. p > α
b. p < α
c. p/2 > α
d. p/2 < α
11.27 Which of the following steps does not belong in the inference process
for explanatory variables?
a. use the t-ratio or p-value decision rules to determine test results
b. test alternative models at several different significance levels
c. translate test results on each coefficient into significance about the
corresponding explanatory variable
d. interpret estimates of regression coefficients as slopes
e. all of these steps belong
11.28 Which of the following steps is out of sequence in the inference
process for explanatory variables?
a. state the regression model
b. collect the sample data and estimate the regression equation
c. decide which variables are to be tested
d. determine which variables are eligible for one-sided tests
e. assign a level of significance
11.29 Which of the following does not belong with the rest in interpreting
the findings of a two-sided test of an explanatory variable X1?
a. X1 is directly related to the dependent variable in the model
b. β1 is significantly different from zero
c. we reject the null hypothesis
d. X1 is statistically significant
e. all of the above are equivalent
11.30 For the one-tailed test, a valid decision rule for an explanatory variable
to test significant is that the regression coefficient has the anticipated sign
and
a. p > α/2
b. p < α/2
c. p/2 > α
d. p/2 < α
An insurance company investigates the determinants of life insurance rates. A
survey of n = 58 policyholders collects information on the three variables in the
following regression model: premium = β0 + β1 Age + β2 mortrate + ε
premium    annual premium for each $1000 in life insurance coverage (in dollars)
Age        age of policyholder (in years)
mortrate   mortality rate (number of deaths per 1000) based on gender, health,
           and other policyholder characteristics.
Regression Analysis
premium = - 0.77 + 0.364 Age + 1.40 mortrate
Predictor     Coef   Std Err   t-ratio       p
Constant    -0.765     3.463     -0.22   0.826
Age         0.3636    0.1158      3.14   0.003
mortrate    1.4038    0.2283      6.15   0.000
[other output not relevant to this case is omitted]
11.31 Which are the null-alternative hypotheses for a two-tailed significance
test on the Age variable?
a. H0: β1 = 0   HA: β1 > 0
b. H0: β1 = 0   HA: β1 ≠ 0
c. H0: β1 = 0   HA: β1 < 0
d. H0: β1 > 0   HA: β1 < 0
e. H0: β1 < 0   HA: β1 > 0
11.32 According to the regression printout above,
a. Age is more than three standard errors greater than mean age
b. Age is about 0.36 standard errors greater than 0
c. The sample regression coefficient of Age is about 0.36 standard errors greater than 0
d. The sample regression coefficient of Age is more than three standard errors greater
than 0
e. The population regression coefficient of Age is about 0.36 standard errors greater
than 0
11.33 If Age is tested at the α = .05 significance level, then we may conclude
each of the following except:
a. p < α
b. Reject H0
c. The coefficient of Age is significantly different from zero
d. Age has a significant and direct effect on life insurance premiums
e. All of the above are valid
11.34 Each additional year of age adds an average of ______ to insurance
premiums (other things equal)?
a. 36 cents with a margin of error of about 3.5 cents
b. 36 cents with a margin of error of about 23 cents
c. 11 cents with a margin of error of about 3 cents
d. $3.14 with a margin of error of 12 cents
e. Cannot be answered because Age is not statistically significant
11.35 Construct the null-alternative hypotheses for a one-tailed significance
test that mortrate is directly related to premium.
a. H0: β2 = 0   HA: β2 > 0
b. H0: β2 = 0   HA: β2 ≠ 0
c. H0: β2 = 0   HA: β2 < 0
d. H0: β2 > 0   HA: β2 < 0
e. H0: β2 < 0   HA: β2 > 0
11.36 Mortality rates would have a significant, direct relationship with
premiums at any of the following significance levels except:
a. α = .20 significance level
b. α = .10 significance level
c. α = .05 significance level
d. α = .01 significance level
e. tests significant at any of the levels selected above
11.37 Determine the t-ratio under each of the following conditions:

(a) b = 10, sb = 2
(b) b = 8, sb = 3.5
(c) b = 1, sb = 2
(d) b = .01, sb = .001
(e) b = 3500, sb = 2100
11.38 Determine Std Err (sb) for a simple regression given the following:
(a) s = 12, sX = 45, n = 36
(b) s = 1.2, sX = 4.5, n = 36
(c) s = 6, sX = 45, n = 9
(d) s = 12, sX = 22.5, n = 144
Explain the relationship in your answers based upon the trade-offs between s, sX,
and n in the formula for sb.
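The trade-offs in question 11.38 can be checked numerically. The sketch below assumes the standard simple-regression formula sb = s / (sX √(n − 1)), which follows from Σ(X − X̄)² = (n − 1)sX²; the function name and the `cases` dictionary are illustrative only.

```python
import math


def slope_std_err(s, s_x, n):
    """Standard error of the slope in simple regression:
    s_b = s / (s_X * sqrt(n - 1))."""
    return s / (s_x * math.sqrt(n - 1))


# (s, s_X, n) for parts (a) through (d) of question 11.38
cases = {
    "a": (12, 45, 36),
    "b": (1.2, 4.5, 36),
    "c": (6, 45, 9),
    "d": (12, 22.5, 144),
}
results = {part: slope_std_err(*args) for part, args in cases.items()}

for part, value in results.items():
    print(f"({part}) s_b = {value:.4f}")
```

Parts (a) and (b) agree because s and sX shrink by the same factor of ten, while halving sX in part (d) is roughly offset by quadrupling n.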
An expressway planner is preparing her analysis about which factors affect the
traffic through a major toll plaza. Time series data are collected for 72 months and
the following model is used:
TRAFFIC = β0 + β1 POP + β2 TAX + β3 UNEMP + β4 AUTO + β5 AIR + ε
with variables defined as:
TRAFFIC   monthly traffic volume at toll plaza (in thousands)
POP       monthly population estimates for the metropolitan area (in thousands)
TAX       monthly state sales tax collected (in millions of dollars)
UNEMP     monthly unemployment rate (in percentage points)
AUTO      monthly volume of automobile visitors to the state (in thousands)
AIR       monthly passenger volume arriving at metropolitan airport (in
          thousands)
and the regression output is:
TRAFFIC = 8.6 + 0.115 POP - 0.0903 TAX - 1.64 UNEMP + 0.0888 AUTO + 0.228 AIR
s = 26.51   R-sq = 83.1%   R-sq(adj) = 81.8%
SOURCE       DF       SS      MS       F       p
Regression    #   227866   %#$@*   &%#@!   0.000
Error        66    46388     703
Total        71   274254
11.39 Complete the null and alternative hypotheses lines for the F-test of this
model:
H0: β1 = ______ = 0
HA: βj ≠ 0 for at least one j, where j = 1 through ______.
11.40. Using the p-value decision rule, p is (less / greater) than α, so we (reject /
cannot reject) the null hypothesis, and we therefore conclude that this
expressway traffic model (tests / does not test) significant at the α = .01 level.
11.41. This test is also equivalent to the following test for the population R-square,
ρ²:
H0: ρ² = 0 HA: ρ² ______ (complete the alternative hypothesis)
Test findings from question 11.40 allow us to conclude that the model fit (is / is
not) significant.
11.42. A garbled fax transmission made some items in the output unreadable.
From your knowledge of the analysis of variance table, the degrees of
freedom for the regression must equal ______, the mean square regression is
therefore ______, and the F-ratio must then be equal to ______.
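The unreadable entries in question 11.42 follow from the additive structure of the ANOVA table. A minimal Python sketch, with an illustrative function name, using only the legible numbers from the printout:

```python
def regression_anova(ss_total, ss_error, df_total, df_error):
    """Recover the regression row of an ANOVA table from the error and total rows."""
    df_reg = df_total - df_error   # DFR = DFT - DFE (equals k)
    ss_reg = ss_total - ss_error   # SSR = SST - SSE
    msr = ss_reg / df_reg          # mean square regression
    mse = ss_error / df_error      # mean square error
    return df_reg, msr, mse, msr / mse


df_reg, msr, mse, f = regression_anova(274254, 46388, 71, 66)
print(df_reg, round(msr, 1), round(mse), round(f, 1))
```

Dividing by the unrounded MSE gives an F-ratio of about 64.8; using the printed, rounded MSE of 703 shifts the second decimal place slightly.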
11.45 If for a particular explanatory variable, the sample regression
coefficient is 10.5 and its standard deviation is 3.5, then the t-ratio is
a. 7.0
b. 14
c. 0.33
d. 3.0
e. insufficient information provided to determine the t-ratio
11.46 Which of the following can be determined from t-tests on regression
results?
a. whether the model is significant
b. whether specific explanatory variables are significant
c. whether the regression fit is significant
d. all of the above
11.47 In simple regression models, the F-test and the t-test
a. test statistically equivalent hypotheses
b. always yield the identical p-values
c. result in an F-ratio that is the square of the t-ratio
d. all of the above
11.48 The alternative hypothesis for the F-test on the multiple regression
model Y = β0 + β1X1 + β2X2 + ε can be stated as follows:
a. HA: either β1 ≠ 0 or β2 ≠ 0 but not both
b. HA: either β1 ≠ 0 or β2 ≠ 0 or both
c. HA: both β1 ≠ 0 and β2 ≠ 0
d. HA: β1 = β2
e. HA: β1 ≠ β2
11.49 If SSE = 10, SSR = 10, n = 36, and k = 5, then the F-ratio is
a. 6
b. 2
c. 1
d. 0.33
11.50 If R² = 80%, SST = 120, n = 15, and k = 2, then the F-ratio is
a. 24
b. 12
c. 6
d. 4
e. 2
11.51 If MSR = 600 and s = 20, then the F-ratio is
a. 30
b. 15
c. 10
d. 3
e. 1.5
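Questions 11.49 through 11.51 reach the F-ratio by three different routes. The following Python sketch puts the three side by side; the function names are illustrative, and the formulas are the standard identities F = MSR/MSE, MSE = s², and F = (R²/k) / ((1 − R²)/(n − k − 1)).

```python
def f_from_ss(ssr, sse, n, k):
    """F = MSR / MSE with MSR = SSR/k and MSE = SSE/(n - k - 1)."""
    return (ssr / k) / (sse / (n - k - 1))


def f_from_r2(r2, n, k):
    """Same ratio written in terms of R-square; SST cancels out."""
    return (r2 / k) / ((1 - r2) / (n - k - 1))


def f_from_msr_s(msr, s):
    """The standard error of the estimate s is the square root of MSE."""
    return msr / s ** 2


f_49 = f_from_ss(ssr=10, sse=10, n=36, k=5)
f_50 = f_from_r2(r2=0.80, n=15, k=2)
f_51 = f_from_msr_s(msr=600, s=20)
print(round(f_49, 2), round(f_50, 2), round(f_51, 2))
```

Note that the SST = 120 given in question 11.50 is not needed once R² is known.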
Answer questions 11.52-11.54 based on a regression analysis of variance table;
unfortunately, a defect in the printer causes it to provide only the degrees of
freedom (DF) for the regression and error, the SSR, the SST, and the p-value.
SOURCE       DF        SS       MS       F       p
Regression    4   136.400   @#!@$#   &%$#@   0.000
Error        22   #$@&#@$   %$#@##
Total        $@   189.200
11.52 The sample used to generate this ANOVA table must contain
a. n = 26 observations
b. n = 27 observations
c. n = 18 observations
d. n = 19 observations
11.53 MSR = ______ and SSE = ______, where the blanks are equal to
a. 27.28 and 52.8
b. 34.1 and 325.6
c. 34.1 and 52.8
d. 27.28 and 325.6
11.54 The F-ratio for the table is approximately
a. 14.2
b. 11.4
c. 10.8
d. 9.57
11.55 The F distribution has each of the following traits except:
a. is defined only for nonnegative values of F
b. contains an infinitely-long tail
c. is skewed
d. is unimodal
e. all of the above are traits of the F distribution
11.56 Which of the following has an F distribution?
a. the ratio of sums of squares
b. mean squares
c. sums of squares
d. the ratio of mean squares
e. all of the above
11.57 A regression model is more likely to test significant if
a. the sample size n is large
b. there are many explanatory variables in the model
c. R² is small
d. the α used for the test is small
e. all of the above
11.58 If SSE = 600, SSR = 300, n = 29, and k = 4, then the F-ratio is
a. 6
b. 4
c. 3
d. 1/2
Answer questions 11.59-11.61 based on a regression analysis of variance table;
unfortunately, a defect in the printer causes it to provide only the degrees of
freedom (DF) for the regression and total, the SSE, the SST, and the p-value.
SOURCE       DF       SS       MS       F       p
Regression    3   *$#@!#   @#!@$#   &%$#@   0.000
Error        &#    84.80   %$#@##
Total        19   204.05
11.59 The sample used to generate this ANOVA table must contain
a. n = 20 observations
b. n = 19 observations
c. n = 17 observations
d. n = 16 observations
11.60 SSR is equal to
a. 288.85
b. 119.25
c. 39.75
d. 20
11.61 The F-ratio for the table is approximately
a. 8.91
b. 7.50
c. 1.406
d. 0.469
11.62 Solve for the F-ratio given the following information:
(a) n = 30, k = 4, SSR = 200, SSE = 1000
(b) MSR = 60, s = 20
(c) n = 36, k = 2, SST = 120, R² = 0.60
11.63 Complete the following ANOVA table by calculating the MS column and F
ratio:
SOURCE       DF    SS   MS   F
Regression    4   136
Error        21   630
Total        25   769
11.64 Complete the omitted information from the following ANOVA table if the
model contains 3 explanatory variables and the sample size is 32.
SOURCE       DF   SS   MS   F
Regression        60
Error
Total             90
11.65 Complete the following ANOVA table, determine the R², and test whether
R² is significantly greater than 0 at the .01 level.
SOURCE       DF    SS   MS   F
Regression    6   240
Error             120
Total        18
11.66 Complete the following ANOVA table, determine the R², s, and test whether
the model is significant at the .01 level.
SOURCE       DF   SS     MS   F
Regression    2        0.30
Error        50        0.04
Total
11.67 Determine four things that are wrong in the following ANOVA table:
SOURCE       DF    SS   MS     F      p
Regression    4   180   40   2.0   .001
Error        25   500   20
Total        30   640
Case Study: United Way predicts donations of two firms using information on
wages, employment, and last year's giving. They collect a random sample of 55
participating companies and fit the model:
Giving = β0 + β1 Wages + β2 EMP + β3 GiveLast + ε
where the variables in the model are defined as:
Giving     Total amount raised for United Way at each company (in dollars)
Wages      Average employee annual wage (in thousands of dollars)
EMP        Number of employees at the company
GiveLast   Total amount raised last year at the same company (in dollars)
Use the descriptive statistics and regression to answer questions 11.68-11.71:
Variable     N    Mean   Median   TrMean   StDev   SEMean     Min     Max
Wages       55   27.42    28.00    26.98   10.41     1.40   12.00   60.00
EMP         55   203.0    150.0    180.9   143.0     19.3   100.0   800.0
GiveLast    55    6202     4460     5425    6238      841   100.0   28208
Giving = - 449 + 43.8 Wages + 2.93 EMP + 0.661 GiveLast
s = 2210   R-sq = 81.1%   R-sq(adj) = 80.0%
   Fit   Stdev.Fit       95.0% C.I.       95.0% P.I.
     ?         298     (4841, 6038)      (962, 9917)
Values for FIRM 1: Wages = 27.5, EMP = 200, and GiveLast = 6200
   Fit   Stdev.Fit       95.0% C.I.       95.0% P.I.
 26104       12251    (1503, 50704)    (1106, 51101) XX
Values for FIRM 2: Wages = 45, EMP = 5000, and GiveLast = 15000
X denotes a row with X values away from the center
XX denotes a row with very extreme X values
11.68 The predicted value for Firm 1's giving is $________
11.69 We are 95% confident that Firm 1 will donate between approximately $____
and $____ this year.
11.70. The margin of error for the prediction of Firm 1's giving at the 95% level of
confidence is about ______ times the _______ because the sample size is
_________ and Wages, EMP, and GiveLast are near their _______ values.
11.71. The XX extrapolation warning attached to Firm 2's prediction is a result
of trying to predict corporate giving based on a level of _________ outside
the sample data range used to fit the model.
11.72 Which of the following measures the interval for the average value of
the dependent variable given particular values of the explanatory variables?
a. the prediction interval for Y
b. the confidence interval for the conditional mean of Y
c. the forecast interval for Y
d. the univariate confidence interval for Y
11.73 For which of the following situations would I use a prediction interval?
a. estimating the number of defects in a car rolling off the assembly line at 4
P.M.
b. estimating the mean time spent by a sales clerk with customers of a
particular age
c. estimating the average number of years that CEOs retain their jobs if they
have been with that same corporation 10 years
d. all of the above
11.74 The standard error of the conditional mean is related to the
standard error of the estimate s in that the former is equal to
a. s
b. s/√n
c. s only when all explanatory variables are at their mean
d. s/√n only when all explanatory variables are at their mean
e. always larger than s
11.75 Which of the following is true about the standard error of the
conditional mean s?
a. s is larger the further the explanatory variables are from their means
b. s is larger the closer the explanatory variables are to their means
c. s is unaffected by the explanatory variables, only is affected
d. s is larger the closer the explanatory variables are to one another
11.76 In comparing the confidence interval for the conditional mean with the
prediction interval, which of the following is true?
a. prediction intervals are larger and more affected by extreme values for
explanatory variables
b. prediction intervals are larger but less affected by extreme values for
explanatory variables
c. prediction intervals are smaller but more affected by extreme values for
explanatory variables
d. prediction intervals are smaller and less affected by extreme values for
explanatory variables
e. comparisons depend on the specific model and sample being analyzed
11.77 In regression models, forecast intervals are a type of
a. confidence interval for the conditional mean
b. univariate confidence interval for the variable being forecast
c. prediction interval
d. confidence interval for the regression coefficient
11.78 A consortium of Florida cities hires a budget analyst to model the factors
affecting police force size and predict the police force for three newly incorporated
cities. The model is fit for n = 56 middle-sized cities (populations 25,000 to 100,000)
using the following model: FORCE = β0 + β1 VIOL + β2 PRTX + β3 POP + β4 OLD% + ε
where the variables in the model are defined as:
FORCE   police force size (number of officers on the city police force)
VIOL    violent crime rate (as a percent of total crimes)
PRTX    property tax (city tax in dollars per capita)
POP     city population (in thousands)
OLD%    elderly share of population (percent at least 65)
After generating descriptive univariate statistics on the independent variables, the
regression model is fit and three predictions made:
Variable    N     Mean   Median   TrMean   StDev   SEMean     Min     Max
VIOL       56    7.661    6.750    6.952   5.468    0.731   0.500   33.00
PRTX       56    164.8    103.0    142.1   173.0     23.1    18.0   977.0
POP        56    52.38    47.55    51.57   20.44     2.73   25.70   95.90
OLD%       56   10.900   10.550   10.616   4.995    0.668   2.100   32.60
Predict for VIOL = 7.5, PRTX = 165, POP = 53, OLD% = 11; CITY1
Predict for VIOL = 20, PRTX = 300, POP = 60, OLD% = 25; CITY2
Predict for VIOL = 3, PRTX = 200, POP = 5, OLD% = 4; CITY3
FORCE = - 52.5 + 3.24 VIOL + 0.0641 PRTX + 1.59 POP + 2.10 OLD%
[output omitted to save space]
s = 20.36   R-sq = 85.0%   R-sq(adj) = 83.8%
SOURCE       DF       SS    MS   F       p
Regression    4   119858             0.000
Error        51    21144   415
Total        55   141002
   Fit   Stdev.Fit         95.0% C.I.         95.0% P.I.
 89.71        2.73     (84.24, 95.19)    (48.46, 130.96)      PREDICTION 1
179.34        9.33   (160.60, 198.08)   (134.36, 224.31)      PREDICTION 2
 66.19       18.90    (28.24, 104.13)    (10.41, 121.97) XX   PREDICTION 3
XX denotes a row with very extreme X values
Answer the following questions based on the preceding output and model:
A) A printing error caused the MSR to be missing from the Analysis of Variance
table. The MSR is equal to
a. 290
b. 29,965
c. 98,714
d. 479,432
B)
The same printing error caused the F-ratio also to be omitted from the
Analysis of Variance table. The F-ratio is approximately
a. 12
b. 17
c. 72
d. 98
C) The null hypothesis associated with the F test is
a. H0: β0 = β1 = β2 = β3 = β4
b. H0: β0 = β1 = β2 = β3 = β4 = 0
c. H0: β1 = β2 = β3 = β4
d. H0: β1 = β2 = β3 = β4 = 0
D) After conducting the F-test at the α = .01 significance level using the p-value
decision rule, what should we do?
a. reject H0 and conclude that the model is significant
b. reject H0 and conclude that the model is not significant
c. cannot reject H0 and conclude that the model is significant
d. cannot reject H0 and conclude that the model is not significant
e. insufficient information provided to conduct the test
E)
We are 95% confident that the mean police force for hundreds of cities with
the same characteristics as the first city is between approximately which two
values?
a. 90 to 95
b. 84 to 90
c. 84 to 95
d. 48 to 131
e. 69 to 110
F)
Which of the following is not true about the prediction for the first city?
a. the point prediction is based on explanatory variables near their sample
means
b. the prediction interval is approximately four times the standard error of
the estimate
c. the prediction interval is narrower than for virtually any other possible
prediction
d. all of the above are true
G)
The prediction interval is wider for the second city than for the first city
because
a. the second prediction involves extrapolation
b. the second prediction involves explanatory variables that are not near the
sample means
c. the second prediction involves a larger Sized police force
d. the second prediction is made after the first
H)
In this case, the prediction for the third city causes an "XX" warning to be
issued on the printout because
a. one of the explanatory variables lies outside the range of the sample data
b. two of the explanatory variables lie outside the range of the sample data
c. three of the explanatory variables lie outside the range of the sample data
d. all four of the explanatory variables lie outside the range of the sample
data
e. the warning is not relevant to this case because no time series forecasting
is involved
11.79 To perform rate studies and issue municipal bonds, counties need to model and
estimate electrical power usage needs. Counties also use the resulting information to
determine whether it is more profitable to purchase on the spot market, enter into
long term supply contracts with utilities, or build their own generating facilities.
Monroe County in the Key West resort area of Florida gathers quarterly time series
data on T = 28 quarters from 1982-1988 on the following variables:
power_t    = residential power usage (in millions of kilowatt hours) during the tth quarter
DD cool_t  = cooling degree days (a measure of temperature)
customer_t = number of billed residences (in thousands) during the tth quarter
retail_t   = Florida taxable retail sales (in billions of dollars) during the tth quarter
A consultant for the county's utility board obtains the following regression output
on this data:
Predict for ddcool = 400, customer = 15, retail = 6;
Predict for ddcool = 1197, customer = 16.36, retail = 6.77.
power = - 48.8 + 0.0104 DD cool + 4.38 customer - 0.21 retail
Predictor        Coef      StErr   t-ratio       p
Constant       -48.83      11.12     -4.39   0.000
DD cool      0.010438   0.001517      6.88   0.000
customer        4.379      1.084      4.04   0.000
retail         -0.210      1.452     -0.14   0.886
s = 3.325   R-sq = 80.4%   R-sq(adj) = 78.0%
SOURCE       DF        SS      MS      F       p
Regression    3   1043.06   347.7   32.8   0.000
Error        24    254.28    10.6
Total        27   1297.34
   Fit   Stdev.Fit           95% C.I.           95% P.I.
19.771       1.642   (16.382, 23.160)   (12.116, 27.425)   PREDICTION 1
33.884       0.628   (32.587, 35.181)   (26.899, 40.868)   PREDICTION 2
MEAN 'DD cool' = 1197.0; MEAN 'customer' = 16.359; MEAN 'retail' = 6.7732
Answer the following questions based on the preceding output and model:
A) Modeling and F-Test of the Model: Formally present the model being
estimated.
Hint: Use variable names (DD cool, customer, and retail) and the beta (β) parameters as
variable coefficients:
B) State the hypotheses associated with the F test on the model in terms of the
betas (β's) of the model:
C) Conduct the F-test at the α = .01 significance level using the p-value decision
rule and then state your conclusion in one sentence.
D) Check that the 19.771 Fit from PREDICTION 1 is correct:
Hint: Use the fitted equation to predict power usage when there are 400 cooling degree
days during the quarter, 15 thousand customers and retail sales are $6 billion.
E) Based on PREDICTION 1, which 95% interval would you report if you
were forecasting power usage for a particular quarter with those explanatory
variable values?
F) Based on PREDICTION 1, which 95% interval would you report if
you wanted to capture average power usage for many quarters with those
explanatory variable values?
G) Examine PREDICTION 2, which is based on 1197 cooling degree days, 16,360
customers, and $6.77 billion in retail sales. Explain why the standard deviation of
the fit for PREDICTION 2 is so much smaller (only 0.63) than the "Stdev.Fit"
for PREDICTION 1 (1.64).
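The check requested in part D can be carried out by plugging the explanatory variable values into the fitted equation. A minimal Python sketch, assuming the coefficients from the Predictor table above; the dictionary and function names are illustrative.

```python
# Coefficients copied from the Coef column of the regression printout.
coef = {"Constant": -48.83, "DD cool": 0.010438, "customer": 4.379, "retail": -0.210}


def predict_power(dd_cool, customer, retail):
    """Point prediction of quarterly power usage from the fitted equation."""
    return (coef["Constant"]
            + coef["DD cool"] * dd_cool
            + coef["customer"] * customer
            + coef["retail"] * retail)


fit1 = predict_power(400, 15, 6)           # PREDICTION 1
fit2 = predict_power(1197, 16.36, 6.77)    # PREDICTION 2 (variables at their means)
print(round(fit1, 2), round(fit2, 2))
```

Both results match the printed Fit values to within rounding of the coefficients. The second set of inputs sits at the sample means listed under the predictions, which is the situation part G asks about.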
11.80 A bank analyst collects monthly data on the economy for the period just prior to
the 1990-92 recession. The sample consists of monthly time series data on the U.S.
economy from 1988 until the middle of 1990, 30 consecutive months. The variables
in the model are defined as:
unemp    monthly unemployment rate (in percentage points)
conf     monthly consumer confidence index (in 1967 the index was 100)
starts   monthly housing starts (in millions)
invent   monthly manufacturing inventories (in billions of dollars)
and the regression model produces the following regression results:

The regression equation is: unemp = 9.51 - 0.00445 conf - 0.287 starts - 0.00948 invent
Predictor        Coef      Stdev   t-ratio       p
Constant        9.512      1.104      8.61   0.000
conf        -0.004453   0.008559     -0.52   0.607
starts        -0.2865     0.1971     -1.45   0.158
invent      -0.009478   0.001652     -5.74   0.000
A) Formally present in equation form the model being estimated. Use the variable
names (unemp, conf, starts, and invent) instead of Y, X1, X2, and X3, and don't
forget to use the beta (β) parameters as variable coefficients.
B) Complete the null-alternative hypotheses for a two-tailed significance test of the
'invent' variable. [use the proper βj]: H0: ___ and HA: ___
C) Next, construct the one-tailed hypothesis to test whether there is an inverse
relationship of housing starts with the dependent variable.
[use the proper βj]: H0: ___ and HA: ___
D) Explain in one sentence why we are justified in conducting the one-tailed test in
question (C) above.
E) Using the p-value decision rule, conduct each of the two tests from questions B
and C at the α = .05 level; in each case, show which two numbers you compared
and determine whether each null hypothesis can or cannot be rejected.
F) In one sentence entirely without symbols or the words "hypothesis", "test", or
"coefficient" (but you may use the words "significant" or "not significant"), verbally
communicate your results of the test on the inventory variable to an audience of
business persons.
G) Using the p-value decision rule again, would your test result for the housing starts
variable have been reversed if a two-sided test instead of a one-tailed test had been
conducted? How about if α = .10 (instead of α = .05) had been chosen for the
one-tailed test?
H) Using the "delta formula" for multiple regression (i.e. Thing 2), a rise of $100 billion
in manufacturing inventories will yield an average change in the unemployment rate of
how many percentage points, other things equal?
I) Calculate the 95% confidence interval for a $100 billion increase in manufacturing
inventories.
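Parts H and I scale both the coefficient and its standard error by the size of the change in the explanatory variable. The sketch below is illustrative: the function name is an assumption, and it uses the rough multiplier of 2 standard errors for a 95% interval rather than the exact t-value for 26 degrees of freedom.

```python
def change_interval(coef, std_err, delta_x, multiplier=2.0):
    """Average change in Y for a change delta_x in X (other things equal),
    with an approximate confidence interval of +/- multiplier standard errors."""
    change = coef * delta_x
    margin = multiplier * std_err * abs(delta_x)
    return change, change - margin, change + margin


# invent coefficient and Stdev from the printout; delta_x = 100 ($ billions)
change, low, high = change_interval(-0.009478, 0.001652, 100)
print(round(change, 4), round(low, 4), round(high, 4))
```

Because the whole interval lies below zero, it tells the same story as the two-tailed test on the inventory variable.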

11.1: b; 11.2: c; 11.3: b; 11.4: e; 11.5: a; 11.6: d; 11.7: e; 11.8: d; 11.9: e; 11.10: a; 11.11: d; 11.12:
b; 11.13: a; 11.14: H A: 2 0; p is greater than so we cannot reject the null hypothesis, and we
therefore conclude that sales tax revenue is not significantly related to traffic volume; 11.15: HA:
5>0; p/2 is less than so we reject the null hypothesis, and we therefore conclude that airport
arrivals are significantly related to traffic volume; 11.16: More people flying into Orlando means
more expressway traffic because there are more cars rented and more friends and relatives being
picked up at the airport; 11.17: We find that p/2 is greater than , so we cannot reject the null
hypothesis, and we therefore conclude that the metropolitan area population does not have a
significant effect on expressway traffic; 11.18: 1000* (2 * Std Err); 11.25: a; 11.26: b; 11.27: b;
11.28: b; 11.29: a; 11.30: d; 11.31: b; 11.32: d; 11.33: e; 11.34: b; 11.35: a; 11.31: b; 11.32: d;
11.33: e; 11.34: a; 11.35: a; 11.36: e; 11.37: a) 5, b) 2.29, c) .5, d) 10, e) 1.67. 11.38: a) .0451, b)
.0451, c) .0471, d) .0446. sb is directly proportional to SEE and inversely proportional to sx and n;
11.39: 2 = 3 = 4 = 5, k=5; 11.40: Less, reject, tests; 11.41: 2> 0, is; 11.42: 4. 5, 45573.2, 64.83;
11.45: d; 11.46: b; 11.47: d; 11.48: b; 11.49: a; 11.50: a; 11.51: e; 11.52: b; 11.53: c; 11.54: a;
11.55: e; 11.56: d; 11.57: a; 11.58: c; 11.59: a; 11.60: b; 11.61: b; 11.62: a) 1.25, b) .15, c) 57.75;
11.63: MSR=34, MSE=30, F=1.13; 11.64: DFR=3, DFE=28, DFT=31, SSE=30, MSR=20, MSE=1.07,
F=18.69; 11.65: DFE=12, SST=360, MSR=40, MSE=10, F=4, R-sq=.67, p-value=.0171 so R-sq is not
significantly greater than 0; 11.66: DFT=52, SSR=.6, SSE=2, SST=2.6, F=7.5, R-sq=.2308, s=.5477, pvalue=.0011 so the model is significant; 11.67: DFT should equal DFR=DFE, SST should equal
SSR+SSE, MSR should equal SSR / DFR, the p-value should equal .1542; 11.68: 5440; 11.69: 100010000; 11.70: 2, SEE, greater than 30, mean; 11.71: EMP; 11.72: b; 11.73: a; 11.74: d; 11.75: a;
11.76: a; 11.77: c;
11.78: A) b, B) c, C) d, D) a, E) c, F) b, G) b, H) a;
11.79: A) power = β0 + β1 DD cool + β2 customer + β3 retail + ε, B) H0: β1 = β2 = β3 = 0; HA: at least
one βj different from zero, C) p = .000 < α = .01, reject H0; therefore the model tests significant at
the .01 level, D) -48.8 + .0104(400) + 4.38(15) - 0.21(6) = 19.8, E) 95% P.I.: (12.1, 27.4), F) 95% C.I.:
(16.4, 23.2), G) The standard deviation of the fit is smallest when the explanatory variables are
near or at their mean; the first prediction is based on cooling degree days well below the mean
for that variable;
11.80: A) Unemp = beta0 + beta1(conf) + beta2(starts) + beta3(invent) + error term, B) H0: beta3 = 0
and HA: beta3 ≠ 0, C) H0: beta2 = 0 and HA: beta2 < 0, D) The sample regression coefficient for
starts is negative, E) p=0<.05 reject null and p=.079>.05 do not reject null, F) "Our statistical
analysis concludes that manufacturing inventory levels are significant in predicting
unemployment," G) No and Yes, H) Change in unemp = 100(-.009478) = -.9478 so unemployment
decreases by .9478%, I) -.9478 - 100*(2*.001652) and -.9478 + 100*(2*.001652),
so the C.I. is (-.9478 - 100*(2*.001652), -.9478 + 100*(2*.001652)) = (-1.2782, -.6174)
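The arithmetic behind answers 11.79 D and 11.80 H and I can be checked with a short script. This is only an illustrative sketch: the function and variable names are made up here, the coefficients and the standard error are the ones quoted in the answers, and the interval uses the rough "estimate plus or minus 2 standard errors" rule that the answer key applies rather than an exact t critical value.

```python
# Check the point prediction in 11.79 D and the rough 95% interval in 11.80 I.
# All names are illustrative; the numbers are copied from the answer key.

def predict(intercept, coefs, values):
    """Fitted value: intercept plus the sum of coefficient * value."""
    return intercept + sum(b * x for b, x in zip(coefs, values))

def rough_interval(coef, std_err, change=1.0):
    """Approximate 95% interval for the effect of a `change`-unit move in x,
    using change * coef plus or minus |change| * 2 * std_err."""
    center = change * coef
    margin = abs(change) * 2 * std_err
    return center - margin, center + margin

# 11.79 D: -48.8 + .0104(400) + 4.38(15) - 0.21(6)
power_hat = predict(-48.8, [0.0104, 4.38, -0.21], [400, 15, 6])
print(round(power_hat, 1))

# 11.80 H and I: 100-unit change, coefficient -.009478, standard error .001652
low, high = rough_interval(-0.009478, 0.001652, change=100)
print(round(low, 4), round(high, 4))
```

With more explanatory variables the same `predict` helper applies unchanged; only the coefficient and value lists grow.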
L. Review Questions for Chapter 12
12.1 Which of the following is a characteristic of nonparametric statistics?

a. they are used for testing rather than for estimation
b. they rely on ordinal measures
c. they make no assumptions about the population distribution
d. they result in more powerful test conclusions than parametric tests
e. all of the above are characteristics of nonparametric statistics
12.2 Which of the following is an advantage of nonparametric statistics over their
parametric cousins?
a. easier to calculate
b. relies on fewer assumptions
c. less sensitive to sample outliers
d. all of the above
12.3 The problem with using a parametric method when it is inappropriate is that
a. you may not get the proper results
b. you may not be able to trust your results
c. decision makers may act on the basis of improper findings
d. others may justifiably criticize your findings
e. all of the above
12.4 Using a parametric method when only a nonparametric method is justified
yields statistical inference conclusions that will always be
a. wrong
b. substantially different from what would have occurred had a
nonparametric method been used
c. modestly different from those of the corresponding nonparametric
method
d. statistically indefensible
12.5 Which of the following is not an example of a nonparametric method:
a. t test
b. sign test
c. Wilcoxon test
d. all of the above are examples of nonparametric methods
12.6 One difference between the sign test and the t-test is that the sign test
a. does not assume a normal distribution of the sampling statistic
b. tests hypotheses related to the mean
c. involves only the sum of ranks
d. all of the above
12.7 One difference between the sign test and the Wilcoxon test is that the
Wilcoxon test
a. does not assume a normal distribution of the sampling statistic
b. tests hypotheses related to the mean
c. involves only the sum of ranks
d. all of the above
12.8 Given a null hypothesis of M = 10, for which of the following samples would
the sign test yield different results:
a. 5, 5, 5, 5, 5, 15
b. 1, 1, 1, 1, 9, 15
c. 5, 5, 5, 5, 5, 1000
d. 5, 5, 5, 5, 15, 15
e. all of the above would yield the same sign test results
12.9 Given a null hypothesis of M = 10, for which of the following samples would
the Wilcoxon test yield different results
a. 1, 2, 3, 4, 15
b. 6, 7, 8, 9, 15
c. 1, 2, 3, 4, 1000
d. 9, 9, 9, 9, 12
e. all of the above would yield the identical Wilcoxon test results
12.10 For nonparametric tests on the two samples 5, 6, 7, 8, 9, 20 and 4, 5, 6, 7, 8,
11 using a null hypothesis of M = 10, we would be
a. less likely to reject H0 with the first sample if we used the Wilcoxon test
b. less likely to reject H0 with the first sample if we used the sign test
c. less likely to reject H0 with the first sample if we used either test
d. equally likely to reject H0 for the first and second sample if we used the
sign test
e. both a and d
12.11 One possible justification for using the t-test is

a. the population is known to be symmetrical
b. the sample is small
c. the histogram for the sample strongly suggests a normal population
d. the data are ordinal rather than quantitative
e. all of the above
12.12 If the population is assumed to be symmetrical
a. we must use nonparametric methods
b. nonparametric tests apply only to the mean
c. nonparametric tests apply only to the median
d. nonparametric tests apply to both the mean and median
e. we should not use nonparametric tests
12.13 A sample of quarterly time series data on the cost of fuel, in cents per gallon,
during the 1980s was collected and analyzed. To test the null hypothesis that fuel
prices averaged 90 cents/gallon, parametric and nonparametric tests were
conducted.
TEST OF MU = 90.000 VS MU N.E. 90.000

             N     MEAN    STDEV  SE MEAN      T  P VALUE
fuelcost    32   82.425   15.591    2.756  -2.75   0.0099

SIGN TEST OF MEDIAN = 90.00 VERSUS N.E. 90.00

             N  BELOW  EQUAL  ABOVE  P-VALUE  MEDIAN
fuelcost    32     20      0     12   0.2153   86.65

TEST OF MEDIAN = 90.00 VERSUS MEDIAN N.E. 90.00

                 N FOR   WILCOXON            ESTIMATED
             N    TEST  STATISTIC  P-VALUE      MEDIAN
fuelcost    32      32      155.5    0.043       84.72
A) Conduct each test at the .05 significance level. Are test results consistent with
one another?
B) Use the histogram to decide which test (or tests) are justified by the
distributional assumptions. Carefully explain your reasoning.
Histogram of fuelcost   N = 32

Midpoint  Count
    50      1  *
    55      4  ****
    60      1  *
    65      1  *
    70      1  *
    75      0
    80      3  ***
    85      6  ******
    90      6  ******
    95      4  ****
   100      4  ****
   105      1  *

Answers for Chapter 12
12.1: b; 12.2: d; 12.3: e; 12.4: d; 12.5: a; 12.6: a; 12.7: c; 12.8: d; 12.9: a; 12.10: e; 12.11: c;
12.12: d;
12.13: A) The t-test and Wilcoxon test are consistent with each other; each test finds fuel costs
significantly less than 90 cents because both p-values, .0099 and .043, are less than α =
.05. However, no significant difference from 90 cents is found using the sign test
(p = .2153 > .05).
B) The histogram of sample data does not indicate that fuel costs are normally or even
symmetrically distributed. If the sample size of n = 32 observations is considered a large
enough sample to invoke the central limit theorem, then the t-test is justified.
Alternatively, if the sample is not large enough, we should use the sign test results (but not
the Wilcoxon test, which assumes a symmetric distribution).
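The reasoning behind answers 12.8 through 12.10 and the first two tests in 12.13 can be reproduced with a few lines of standard-library Python. The sketch below is an illustration only, with hypothetical helper names: it counts signs around the hypothesized median, sums the ranks of the positive differences for the Wilcoxon signed-rank test, and recomputes the t statistic and the exact binomial sign-test p-value from the summary figures in 12.13. It does not reproduce the Wilcoxon statistic of 155.5, which would require the raw fuel cost data.

```python
from math import comb, sqrt

def sign_counts(sample, m0):
    """Counts of observations (below, equal to, above) the hypothesized median."""
    below = sum(x < m0 for x in sample)
    equal = sum(x == m0 for x in sample)
    above = sum(x > m0 for x in sample)
    return below, equal, above

def sign_test_p(below, above):
    """Exact two-sided sign-test p-value from Binomial(n, 0.5); ties are dropped."""
    n = below + above
    k = min(below, above)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)

def wilcoxon_w_plus(sample, m0):
    """Sum of the ranks of |x - m0| for observations above m0 (average ranks for ties)."""
    diffs = [x - m0 for x in sample if x != m0]
    ordered = sorted(abs(d) for d in diffs)

    def avg_rank(a):
        first = ordered.index(a) + 1        # 1-based rank of the first tied value
        return first + (ordered.count(a) - 1) / 2

    return sum(avg_rank(abs(d)) for d in diffs if d > 0)

# 12.8: only sample d has a different split of signs around M = 10.
for label, s in [("a", [5, 5, 5, 5, 5, 15]), ("b", [1, 1, 1, 1, 9, 15]),
                 ("c", [5, 5, 5, 5, 5, 1000]), ("d", [5, 5, 5, 5, 15, 15])]:
    print("12.8", label, sign_counts(s, 10))

# 12.9: only sample a gives the positive difference the smallest rank.
for label, s in [("a", [1, 2, 3, 4, 15]), ("b", [6, 7, 8, 9, 15]),
                 ("c", [1, 2, 3, 4, 1000]), ("d", [9, 9, 9, 9, 12])]:
    print("12.9", label, wilcoxon_w_plus(s, 10))

# 12.10: identical sign counts, but different positive rank sums.
first, second = [5, 6, 7, 8, 9, 20], [4, 5, 6, 7, 8, 11]
print(sign_counts(first, 10), sign_counts(second, 10))
print(wilcoxon_w_plus(first, 10), wilcoxon_w_plus(second, 10))

# 12.13: t statistic and sign-test p-value from the printed summary figures.
t_stat = (82.425 - 90.0) / (15.591 / sqrt(32))
p_sign = sign_test_p(below=20, above=12)
print(round(t_stat, 2), round(p_sign, 4))
```

The sign test uses only the direction of each difference, so replacing 15 with 1000 changes nothing; the Wilcoxon test also uses the rank of each difference, which is why the two tests can disagree on the same sample.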

