PRACTICE QUESTIONS FOR FINAL EXAM REVISION

The purpose of these practice questions is

(i) to guide and focus your study for the final exam;
(ii) to give some familiarity with the kinds of questions that can
be asked on a D&D exam.

There will be five questions on the exam. I expect that you will
have a better chance of finishing the final exam paper than you did
the midterm, which was pretty demanding for the 90 minutes
allotted. For the final exam you have twice as long.

I will not provide any assistance with the * questions. These are for
students who are aiming at a higher grade.


Q1: Operators at a call centre have their calls tape-recorded for possible
analysis. The purpose of this is to analyse interactions with customers with a
view to maintaining and monitoring the level of service.

One important purpose of the monitoring system is that it forms the basis of
an induction process for new operators. New operators are initially trained
for a week, and then begin working in the call centre. During the second
week of an individual operator's employment, 20 call recordings are randomly
sampled, analysed by an expert and given a rating out of 10. These 20
scores are averaged and the operator obtains a single score for their
performance. The operator is then counseled by being taken through these
20 calls and having any problems identified.

During the fourth week of employment a second sample of 20 calls is taken
and the same analysis undertaken, followed by counseling where problem
areas are identified for the operator. During week 6 a third process of the
same type is undertaken. The data for the three call samples and for all 63
operators is in the worksheet Q1.xls.
(a) Is the first counseling effective in improving performance?
(b) Give a 95% confidence interval for the probability that the first counseling
process will improve the performance of an operator by more than 1
point.
(c) Is the 2nd counseling effective in improving performance?
(d) *Is there any clear evidence of diminishing returns in the counseling
processes?
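
Because the same operators are scored before and after counseling, one natural way to approach parts (a) and (b) is a paired analysis of each operator's change in score. The sketch below is illustrative only: the data are simulated (the contents of Q1.xls are not reproduced here), the names week2 and week4 are hypothetical, and the interval in the second half uses the usual normal approximation for a proportion.

```python
import math
import random
import statistics

# Simulated stand-in for Q1.xls: week-2 and week-4 average scores for
# 63 operators. These numbers are invented purely for illustration.
random.seed(0)
week2 = [random.gauss(6.0, 1.0) for _ in range(63)]
week4 = [w + random.gauss(0.5, 1.0) for w in week2]

# Paired analysis: work with each operator's own change in score.
diffs = [after - before for before, after in zip(week2, week4)]
n = len(diffs)
mean_diff = statistics.mean(diffs)
se_diff = statistics.stdev(diffs) / math.sqrt(n)
t_stat = mean_diff / se_diff  # compare with a t distribution on n - 1 df

# Proportion of operators improving by more than 1 point, with a
# normal-approximation 95% confidence interval.
p_hat = sum(d > 1 for d in diffs) / n
se_p = math.sqrt(p_hat * (1 - p_hat) / n)
ci = (p_hat - 1.96 * se_p, p_hat + 1.96 * se_p)

print(f"t = {t_stat:.2f} on {n - 1} df")
print(f"p_hat = {p_hat:.3f}, 95% CI = ({ci[0]:.3f}, {ci[1]:.3f})")
```

With the real worksheet, the two simulated lists would be replaced by the week 2 and week 4 columns; the same steps applied to weeks 4 and 6 would address part (c).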

Q2: Prior to a marketing campaign, you have performed a survey on two
important business variables, brand recognition and attitudes. Recognition is
measured by asking customers to name all the brands they know and
recording whether the client brand is among them. Those who mention
the client brand are asked to give a rating (out of 10) to the two major
brands and the client brand. Only the rating of the client brand has been
recorded.

After the campaign you conduct a new survey asking exactly the same
questions. The worksheet Q2.xls contains two columns saying whether the
respondent spontaneously mentions (i.e. recognizes) the brand and, if so,
their rating on the 10 point scale, 10 being best. The next two columns
contain the same data for the second survey.
(a) Did the marketing campaign increase brand recognition?
(b) Did the marketing campaign improve customer rating of the brand?
(c) Why is it necessary to do a statistical analysis? Why not just take the
surveys at face value?

Q3.

In a certain D&D course I had 155 students. There were two assessment
tasks: a midterm exam worth 40% and a final exam worth 60%. The results
are given below:

         Midterm   Final
Mean     69        75
Stdev    6.7       9.9

Since the same students are involved in both tasks we would expect some
kind of correlation between the two sets of results. In fact the correlation
turns out to be +0.3. The standard deviation of the difference in the two
scores across the 155 students is 10.15.

(a) What would you say about a person who scored 77% on the
midterm? How would this change if the same score was attained in
the final?
(b) Is there really any difference in the underlying mean level of
performance for the two tasks? How sure are you?

(c) * If the marks are combined in the stated 40-60 ratio, then what
will be the mean and standard deviation of the final marks?
(d) * If you were not given the figure 10.15 for part (c), how could
you obtain it from the given information?
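
The quoted figure of 10.15 can be checked against the other summary statistics using the standard formula for the variance of a linear combination of two correlated variables. The sketch below is one way to do the arithmetic; the function name sd_of_combination is hypothetical, and the same formula is applied to the 40-60 weighting purely as an illustration.

```python
import math

# Summary statistics as given in the question.
mean_mid, mean_final = 69.0, 75.0
sd_mid, sd_final = 6.7, 9.9
corr = 0.3

def sd_of_combination(a, b):
    """SD of a*Midterm + b*Final from the two SDs and their correlation."""
    var = ((a * sd_mid) ** 2 + (b * sd_final) ** 2
           + 2 * a * b * corr * sd_mid * sd_final)
    return math.sqrt(var)

# Difference Final - Midterm corresponds to weights (-1, +1).
sd_diff = sd_of_combination(-1, 1)

# Course mark with the stated 40-60 weighting.
mean_course = 0.4 * mean_mid + 0.6 * mean_final
sd_course = sd_of_combination(0.4, 0.6)

print(round(sd_diff, 2), round(mean_course, 1), round(sd_course, 2))
```

The first printed value agrees with the 10.15 stated in the question, which is a useful sanity check that the correlation of +0.3 has been used with the correct sign.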

Q4. Describe, in broad terms, the appropriate statistical method to use in
each of these situations.*
(a) A supermarket surveys 100 customers on their satisfaction. After a new
training program for staff, it conducts another survey of 200 customers.
There is interest in measuring the effectiveness of the training program.
(b) One hundred real estate properties are valued at January 1998 and then
revalued at January 1999. There is interest in measuring the growth in
real estate value.
(c) During a long period of buoyant and rising business confidence, 100
business leaders are surveyed on their confidence in the future of the
economy on September 1, October 1 and November 1. There is interest
in testing whether the growth in business confidence is starting to
decline.
(d) You are trying to monitor brand recognition of your product and in
particular whether it has declined from the previous level of 68%
following your last advertising campaign. You survey 1000 people and
record whether or not they were able to mention your brand without
prompting.

* This question is not an exam format question. The purpose is to review concepts and to
provide practice in recognizing which theory applies in which situation.

Q5. (previous exam question)

Repairing VCRs on guarantee is an expensive business, both in terms of
direct repair costs and customer satisfaction. It is even more
damaging to customer perceptions if repairs are not done properly and the
VCR has to be returned for further repair during the warranty period.
Overseas experience suggests that, on a three year warranty, 16.5% is an
achievable benchmark for the proportion of VCRs returned on warranty and that
12.2% is an achievable benchmark for the proportion of units repaired under
warranty that have to be returned for further repair.

Below is some data for one manufacturer who produced 11567 units over
the period of data collection, as well as summary statistics for three
contracted repairers.

             Sent for repair   Requiring further repair
Repairer 1   563               75
Repairer 2   1126              157
Repairer 3   188               28
Total        1877              260

(a) For this manufacturer, make a statistical comparison of the number
of VCRs returned for repair on warranty with the international
benchmark.
(b) Are the repairers meeting the benchmark? How much evidence is
there? You may consider the repairers separately or collectively.
(c) Are there any apparent differences between the repairers that you
would be statistically confident about?
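
Comparing an observed proportion with a fixed benchmark is usually done with a one-sample z-statistic. The sketch below applies that calculation to the totals in the table; it is one reasonable approach rather than a model answer, and the function name z_vs_benchmark is hypothetical.

```python
import math

def z_vs_benchmark(count, n, p0):
    """Observed proportion and z-statistic against a benchmark proportion p0."""
    p_hat = count / n
    se = math.sqrt(p0 * (1 - p0) / n)  # standard error if the benchmark held
    return p_hat, (p_hat - p0) / se

# Units returned on warranty, against the 16.5% benchmark.
p_ret, z_ret = z_vs_benchmark(1877, 11567, 0.165)

# Repaired units needing further repair (all repairers), against 12.2%.
p_rep, z_rep = z_vs_benchmark(260, 1877, 0.122)

print(f"returns: p = {p_ret:.4f}, z = {z_ret:.2f}")
print(f"re-repairs: p = {p_rep:.4f}, z = {z_rep:.2f}")
```

The same function can be called with each repairer's own row to look at them separately; comparing repairers with one another is a different question and would need a test of several proportions.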

Q6

You have data on delivery times of delivery trucks for 42 trips during a
month. You want to use these to set benchmark times for different kinds of
delivery runs and also to assess the different drivers currently on the
payroll. There are 8 drivers, given IDs 1-8. There are three types of
delivery vehicle. There are up to 4 delivery drop-offs on a trip. The time
of each delivery trip has been converted into an average speed by dividing
the total distance of the trip by the recorded time. These are in column F.
It would not be sensible to compare drivers by simply calculating average
speeds for each, because different drivers are given different types of
runs. You use a multiple regression analysis to try to separate out the
effects of (1) truck type, (2) driver ID, (3) number of deliveries. Some
output is on the pages following.

Regression Output for part (d).

Multiple regression for Speed_kph on Significant Vars

Summary measures
  Multiple R      0.831
  R-Square        0.690
  Adj R-Square    0.626
  StErr of Est    5.735

Regression coefficients
                  Coefficient   Std Err   t-value   p-value
  Constant        34.830        3.012     11.57     0.000
  Deliveries      -2.387        1.250     -1.91     0.065
  Truck_type_2    5.970         2.354     2.54      0.016
  Truck_type_3    5.384         2.565     2.10      0.043
  Driver_ID_3     21.680        2.896     7.49      0.000
  Driver_ID_4     9.658         3.339     2.89      0.007
  Driver_ID_5     7.547         2.615     2.89      0.007
  Driver_ID_8     9.830         2.514     3.91      0.000

(a) Fit a model which says that delivery speed is a linear function of the
number of deliveries. What does this model say about the effect of an
extra delivery stop on the average speed of the run?
(b) Fit a model which allows delivery speed to be different for each
different truck type and for each different driver and also that each
delivery stop reduces delivery speed by a fixed amount. What does
this model say about the effect of extra delivery stops on the average
speed of the run?
(c) Using the model in (b), give a benchmark average speed for a run with
4 deliveries in a truck of type 2, with the best driver driving. How much
variability could you reasonably allow around this benchmark?
(d) * Looking at the model in Regression Output above, what does this
model assume that the model in part (b) does not (in simple English)?
It appears that drivers 3, 4, 5 and 8 are better than the others. How
sure are you that driver 5 is in this group of better drivers?
(e) If you used this model to make predictions of average delivery speeds
then how accurate would you expect these predictions to be?
(f) Use a backwards elimination procedure to arrive at a best model using
P-to-leave=0.2. In your field of input variables include all dummies for
truck-type and driver, deliveries as numeric and also distance.
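
A benchmark speed is obtained from a fitted regression by adding the constant to the relevant coefficients. The sketch below shows the mechanics using the coefficients printed in the Regression Output for part (d); this is an assumption made for illustration, since the output of the model in part (b) is not reproduced here. It also assumes that truck type 1 and the drivers without a dummy form the baseline, and it uses plus or minus two standard errors of estimate as a rough allowance that ignores uncertainty in the coefficients. The function name predict_speed is hypothetical.

```python
# Coefficients copied from the printed regression output.
coef = {
    "Constant": 34.830,
    "Deliveries": -2.387,
    "Truck_type_2": 5.970,
    "Truck_type_3": 5.384,
    "Driver_ID_3": 21.680,
    "Driver_ID_4": 9.658,
    "Driver_ID_5": 7.547,
    "Driver_ID_8": 9.830,
}
st_err_est = 5.735

def predict_speed(deliveries, truck_type, driver_id):
    """Predicted average speed (kph); categories without a dummy add 0."""
    speed = coef["Constant"] + coef["Deliveries"] * deliveries
    speed += coef.get(f"Truck_type_{truck_type}", 0.0)
    speed += coef.get(f"Driver_ID_{driver_id}", 0.0)
    return speed

# Driver 3 has the largest driver coefficient in this output.
benchmark = predict_speed(deliveries=4, truck_type=2, driver_id=3)
rough_band = (benchmark - 2 * st_err_est, benchmark + 2 * st_err_est)
print(round(benchmark, 3), [round(x, 2) for x in rough_band])
```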

Q7. (Slightly harder than an exam question*)

There are three stocks you are considering investing in, whose mean and
standard deviation of return are listed below.

      Mean   Stdev
S1    5.5    2.1
S2    7.9    3.6
S3    11.6   12.4

Correlations
      S1     S2     S3
S1    1      0.55   0.2
S2    0.55   1      -0.3
S3    0.2    -0.3   1

Also listed (in blue) are correlations between the three stocks. Calculate the
mean and standard deviation of return for the following three investments:

(a) If you were looking at putting your money into just two of these
three investments, which two might you expect to deliver the greatest
benefits of diversification?
(b) Calculate the mean and standard deviation of return for a portfolio
with 95% in S1 and 5% in S2.
(c) Calculate the mean and standard deviation of return for a portfolio
with 80% in S1 and 20% in S3.
(d) Calculate the mean and standard deviation of return for a portfolio
with 85% in S2 and 15% in S3.
(e) On the basis of these calculations, which investments could you
definitely recommend against?
(f) If the risk free rate is 4.5%, then which of the investments looks
best? Does this confirm your considerations in part (a)?

* You are only required to know about portfolios of two stocks. This example gives you an opportunity to look at
various combinations of two of the three stocks. But there will only be two stocks on the exam question.
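
For a portfolio of two stocks, the mean return is the weighted average of the two means, and the variance combines the two weighted variances with a covariance term. The sketch below wraps that standard formula in a small function (the name two_asset_portfolio is hypothetical) and applies it to the 95% S1 and 5% S2 mix as an example of the arithmetic.

```python
import math

def two_asset_portfolio(w1, mean1, sd1, mean2, sd2, corr):
    """Mean and SD of return with weight w1 in asset 1 and 1 - w1 in asset 2."""
    w2 = 1.0 - w1
    mean = w1 * mean1 + w2 * mean2
    var = (w1 * sd1) ** 2 + (w2 * sd2) ** 2 + 2 * w1 * w2 * corr * sd1 * sd2
    return mean, math.sqrt(var)

# 95% in S1 (mean 5.5, SD 2.1) and 5% in S2 (mean 7.9, SD 3.6), correlation 0.55.
mean_b, sd_b = two_asset_portfolio(0.95, 5.5, 2.1, 7.9, 3.6, 0.55)
print(round(mean_b, 2), round(sd_b, 3))
```

The other two-stock mixes can be checked by changing the weight and substituting the relevant means, standard deviations and correlation from the tables.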

Q8. The data in Q9.xls refers to the weekly sales (in units of $1000) across
a chain of supermarkets. The price each week is measured by the price of
a large basket of goods that represent the average behaviour of your
customers. It is obvious from looking at the time series charts below that
drops in prices (i.e. weeks where there are more specials) are associated
with rises in demand. Price specials are advertised in flyers in areas local to
the supermarkets. The question is exactly how general levels of pricing
drive demand.

[Figure: Time series chart of Price and Demand. Price is on the left axis ($225 to $255) and Demand on the right axis (7800 to 9800); the horizontal axis runs from 1 to 96.]

(a) Fit a simple linear regression of sales on price. Evaluate the model. In
particular, by looking at the residuals, identify any observations that do not fit
the trend of the data.
(b) In worksheet data (2), I have created three new data columns. Price_1
contains the price last week, Price_2 the price two weeks ago and Price_3
the price 3 weeks ago. Just click on the cell to complete the column. Fit a
regression model that allows the sales this week to be driven by the price
over the past four weeks (i.e. this week back to 3 weeks ago).
(c) What does the model say in simple terms?
(d) How is it that the model estimates that underlying sales are decreasing
when a plot of sales against time (the red plot above) indicates that sales
are increasing?
(e) The price in the last week of the data was $234.14. Supposing the price
is fixed at $230 for the next three weeks, what sales can we expect in 3
weeks?

Q9. Below I have reproduced the correlation matrix and the partial
correlation matrix for the Meatloaf example. The variable of interest is profit
and how this is impacted by time (measured in weeks) as well as by advertising
this week and in the previous three weeks.

Correlations

           Profit   ADV      ADV_Lag1   ADV_Lag2   ADV_Lag3   Week
Profit     1        0.394    0.440      0.413      0.038      0.092
ADV        0.394    1        0.551      0.270      -0.006     -0.151
ADV_Lag1   0.440    0.551    1          0.560      0.111      -0.145
ADV_Lag2   0.413    0.270    0.560      1          0.499      -0.139
ADV_Lag3   0.038    -0.006   0.111      0.499      1          -0.133
Week       0.092    -0.151   -0.145     -0.139     -0.133     1

Partial Correlations

           Profit   ADV      ADV_Lag1   ADV_Lag2   ADV_Lag3   Week
Profit     48.8%    0.38     0.35       0.21       -0.10      0.24
ADV        0.38     38.5%    0.31       -0.23      0.05       -0.18
ADV_Lag1   0.35     0.31     52.6%      0.40       -0.12      -0.10
ADV_Lag2   0.21     -0.23    0.40       48.3%      0.51       -0.08
ADV_Lag3   -0.10    0.05     -0.12      0.51       28.7%      -0.07
Week       0.24     -0.18    -0.10      -0.08      -0.07      9.2%

(a) How well would you expect to be able to explain profits using the five
explanatory variables?
(b) Which of the explanatory variables appear to be least useful and most
useful in explaining profit?
(c) Why do you think it is that the partial correlation between profits and
week is larger than the ordinary correlation?
(d) What is the main difference in your conclusions about the effects of
advertising using the partial as opposed to ordinary correlations?
(e) * Calculate the T-statistics for each explanatory variable that you
would obtain from a regression of profits on the five explanatory
variables.
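
A partial correlation and the t-statistic of the corresponding regression coefficient are linked by a standard identity, t = r * sqrt(df / (1 - r^2)), where df is the residual degrees of freedom (number of observations minus number of explanatory variables minus 1). The sketch below codes that identity under two assumptions: that the off-diagonal entries in the Profit row are partial correlations controlling for all the other explanatory variables, and that the number of weeks is known. The sample size is not stated in the question, so n_weeks_assumed is a purely hypothetical placeholder, as is the function name.

```python
import math

def t_from_partial_corr(r, n_obs, n_explanatory):
    """t-statistic implied by a partial correlation r in a multiple regression."""
    df = n_obs - n_explanatory - 1  # residual degrees of freedom
    return r * math.sqrt(df / (1.0 - r ** 2))

# Profit row of the partial correlation table.
partials = {"ADV": 0.38, "ADV_Lag1": 0.35, "ADV_Lag2": 0.21,
            "ADV_Lag3": -0.10, "Week": 0.24}

# HYPOTHETICAL sample size: replace with the actual number of weeks of data.
n_weeks_assumed = 50

for name, r in partials.items():
    t = t_from_partial_corr(r, n_weeks_assumed, len(partials))
    print(f"{name}: t = {t:.2f}")
```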
