Instructions to Candidates:
Attempt ALL questions.
Each question is of equal mark value.
Start your solution to each question on a new page.
To ensure full marks, show all the steps in working out your
solution. Marks may be deducted for failure to show appropriate
calculations or formulae.
Unless otherwise stated, use a significance level of 5%.
Selected statistical tables are attached to the back of the
examination paper.
Page 1 of 12
(vi) A researcher asks a group of university students to select their favourite method
of assessment: open book examination, closed book examination, assignment,
group work or other. In summarising these data, which would be the most
appropriate graph to use?
a. Bar chart
b. Boxplot
c. Histogram
d. Scatterplot
(vii) In investigating the relationship between years of formal education and income, a
researcher finds that the covariance is +101.34. Which of the following
statements is true, based only on this figure?
a. There is a very strong positive relationship between the two variables
b. The correlation will be very close to 1.
c. Both years of formal education and income have very large variances.
d. A line of best fit for the data would have a positive slope.
(viii) A study is being performed based on in-home access to broadband internet
within the ACT. The ACT has been divided into 32 regions, and a random
selection of homes in each region is chosen for participation in the study. This
sampling plan is best described as
a. Simple random sampling
b. Systematic sampling
c. Stratified random sampling
d. Cluster sampling
[Figure: side-by-side boxplots of Data (vertical axis, 0 to 6000) for the 5 Day and 4 Day weeks]
(ix) When the data are entered into Minitab, how many rows will be required?
a. 12
b. 3
c. 11
d. 10
(x) Based only on the boxplot, we can say that
a. The IQR is larger for the 5 day week than for the 4 day week
b. The Range is larger for the 4 day week than for the 5 day week
c. The mean of the 5 day week is higher.
d. There are more observations for the 5 day week than the 4 day week.
(xi) How many variables are present in the data set?
a. 2
b. 3
c. 4
d. 12
 N   Mean   SE Mean   StDev   Sum      Sum of Squares
11   4955   *oil1*    3161    *oil2*   369966249
11   3973   *oil3*    2171    *oil4*   220755581
11    982   *oil5*    1140    *oil6*    23593724
(xiii) The value which should be present at *oil1* is given by (to the nearest whole
number)
a. 287
b. 953
c. 908356
d. 3161
(xiv) The value which should be present at *oil4* is given by (to the nearest whole
number)
a. 43703
b. 23881
c. 14858
d. 20068689
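The two obscured columns can be recovered from the printed ones with the usual identities SE Mean = StDev / sqrt(N) and Sum = N x Mean. The following is a minimal sketch under that assumption; the function names are illustrative, and because the printed means are rounded, the sums are only approximate.

```python
import math

# Summary rows in the order printed in the table: (N, Mean, StDev).
rows = [(11, 4955, 3161), (11, 3973, 2171), (11, 982, 1140)]

def se_mean(n, stdev):
    """Standard error of the mean: s / sqrt(n)."""
    return stdev / math.sqrt(n)

def total(n, mean):
    """Sum of the observations implied by the mean: n * mean."""
    return n * mean

for n, mean, stdev in rows:
    # Print the rounded standard error and the implied sum for each row.
    print(round(se_mean(n, stdev)), total(n, mean))
```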
Based on the above descriptive statistics, a 90% confidence interval for the average
mileage driven using a 5 day week is calculated to be
\[ \left( 4955 - c\,\frac{3161}{\sqrt{n}},\; 4955 + c\,\frac{3161}{\sqrt{n}} \right). \]
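As a rough illustration of how such an interval is evaluated, the sketch below plugs in n = 11 and takes c to be the t table value with 10 degrees of freedom and 5% in each tail (about 1.812). Reading c as a t critical value is an assumption based on the small sample and the estimated standard deviation.

```python
import math

n = 11         # sample size from the descriptive statistics
mean = 4955    # sample mean for the 5 day week
stdev = 3161   # sample standard deviation for the 5 day week
c = 1.812      # assumed: t table value, 10 df, 5% in each tail (90% interval)

# Half-width of the interval: c * s / sqrt(n).
half_width = c * stdev / math.sqrt(n)
interval = (mean - half_width, mean + half_width)
print(interval)
```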
(xviii) In testing if there is a significant difference between mileage under the two work
day schemes, the test statistic should be compared to which tables?
a. T tables with 10 degrees of freedom
b. Standard normal tables
c. T tables with 20 degrees of freedom
d. F tables with 11, 11 degrees of freedom.
(xix) The p-value for the test discussed in parts (xvii) and (xviii) (using a two-sided
alternative hypothesis) is 0.017. This means that
a. There is a 1.7% chance that the null hypothesis is true.
b. 1.7% of the time, the null hypothesis is true.
c. We would reject the null hypothesis.
d. We would reject the alternative hypothesis.
(xx) If the test were carried out against a one-sided alternative, which of the following
statements about the new p-value would be true?
a. The p-value will be equal to 0.017.
b. The p-value will be equal to 0.0085.
c. The p-value will be equal to 0.034.
d. None of the above.
(a) The table below shows the results of a survey of voters including who they
voted for in the most recent federal election (in the House of Representatives)
and their positions on the death penalty for convicted murderers.
                    For     Against
Liberal/National    0.26    0.04
ALP                 0.12    0.24
Other               0.24    0.10
i. Find the marginal probability distribution of voting in the most recent
federal election.
ii. What is the probability that a randomly chosen Australian voter supports
the death penalty for convicted murderers?
iii. What is the probability that an Australian voted for the Liberal/National
candidate in the House of Representatives at the last election if it is
known that they are against the death penalty for convicted murderers?
iv. Are voting choice and position on the death penalty independent events?
Explain your answer.
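The calculations in parts i to iv all follow from the joint table: marginals are row or column sums, a conditional probability is a joint probability divided by a marginal, and independence requires every cell to equal the product of its marginals. A minimal sketch of these steps, with illustrative variable names:

```python
# Joint probabilities from the survey table: joint[party][position].
joint = {
    "Liberal/National": {"For": 0.26, "Against": 0.04},
    "ALP": {"For": 0.12, "Against": 0.24},
    "Other": {"For": 0.24, "Against": 0.10},
}

# Marginal distribution of vote: sum across each row.
p_vote = {party: sum(row.values()) for party, row in joint.items()}

# Marginal distribution of position: sum down each column.
p_position = {
    pos: sum(row[pos] for row in joint.values()) for pos in ("For", "Against")
}

# Conditional probability P(Liberal/National | Against).
p_ln_given_against = joint["Liberal/National"]["Against"] / p_position["Against"]

# Independence holds only if P(A and B) = P(A) * P(B) in every cell.
independent = all(
    abs(joint[party][pos] - p_vote[party] * p_position[pos]) < 1e-9
    for party in joint
    for pos in p_position
)

print(p_vote, p_position, p_ln_given_against, independent)
```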
(b) A commuter must pass through five traffic lights on her way to work, and will
have to stop at each one that is red. She estimates the probability model for the
number of red lights she hits as shown below.
# red lights    0      1      2      3      4      5
Probability     0.06   0.25   0.34   0.15   0.16   0.04
i. Find the expected number of red lights at which the commuter will
have to stop on her way to work.
ii. Find the standard deviation of the number of red lights.
iii. Find the expected number of red lights the commuter will face on
her way to work over a 5 day working week. What is the standard
deviation of the number of red lights faced over a 5 day working
week?
iv. The local council installs a new set of lights on the commuter's
route. The commuter wants to take a sample in order to estimate
the new mean number of red lights she can expect to be stopped at.
To estimate this mean to within half a red light (0.5), how many
journeys should she sample, assuming the new standard deviation
is equal to 2.5 red lights? That is, calculate the number of
observations she should make, clearly stating any assumptions you
make.
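The sketch below works through the standard formulas for a discrete probability model: E(X) as the probability-weighted sum, Var(X) as the weighted squared deviation, and n = (z * sigma / margin)^2 rounded up for the sample size. Two assumptions are made here and are not stated in the question: the five days are independent, and the sample size uses a 95% confidence level.

```python
import math
from statistics import NormalDist

# Probability model for the number of red lights.
pmf = {0: 0.06, 1: 0.25, 2: 0.34, 3: 0.15, 4: 0.16, 5: 0.04}

# Expected value and variance of a discrete random variable.
mean = sum(x * p for x, p in pmf.items())
variance = sum((x - mean) ** 2 * p for x, p in pmf.items())
sd = math.sqrt(variance)

# Totals over a 5 day week; variances add only if days are independent (assumed).
days = 5
week_mean = days * mean
week_sd = math.sqrt(days * variance)

def sample_size(sigma, margin, confidence=0.95):
    """Smallest n with z * sigma / sqrt(n) <= margin (confidence level assumed)."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return math.ceil((z * sigma / margin) ** 2)

n_needed = sample_size(sigma=2.5, margin=0.5)
print(mean, sd, week_mean, week_sd, n_needed)
```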
(a) Your pocket copy of Kyrgystan on a Budget claims that you can expect to spend
about 4237 soms (the local unit of currency) each day you spend in this country,
with a standard deviation of 360 soms. Assume that expenditure follows a normal
distribution.
i. Your budget allows you to spend 90,000 soms during your stay (not
including transport into and out of the country). What is the maximum
number of whole days you can spend in Kyrgystan on average, without
breaking your budget?
ii. What is the standard deviation of your total expenses for a stay of that
duration?
iii. How much money should you budget for each day in order to cover all
but the most expensive 10% of days?
iv. After a stay of 10 days, you find that you have spent 41,414 soms.
What percentage of travellers with the same length of stay will have
spent less than you (assuming the figures in Kyrgystan on a Budget
are accurate)?
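One way to check the normal-distribution calculations in parts i to iv is with Python's `statistics.NormalDist`. The sketch assumes that daily expenditures are independent, so that the total over k days is normal with mean 4237k and standard deviation 360 * sqrt(k); the question does not state this explicitly.

```python
import math
from statistics import NormalDist

daily = NormalDist(mu=4237, sigma=360)   # daily spend in soms
budget = 90_000

# i. Largest whole number of days whose expected total fits the budget.
max_days = math.floor(budget / daily.mean)

# ii. Standard deviation of the total (independent days assumed).
total_sd = daily.stdev * math.sqrt(max_days)

# iii. Daily budget exceeded on only the most expensive 10% of days.
daily_90th = daily.inv_cdf(0.90)

# iv. Share of 10 day stays costing less than 41,414 soms.
stay = 10
total = NormalDist(mu=stay * daily.mean, sigma=daily.stdev * math.sqrt(stay))
share_below = total.cdf(41_414)

print(max_days, total_sd, daily_90th, share_below)
```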
(b) Having completed your stay in Kyrgystan, you return home to Canberra, and
decide to put your budgeting skills to the test. Your part-time job pays well, at $24
an hour, but the number of hours per week is a random variable best represented by
a uniform distribution with possible values from 4 to 18. Assume each week is
independent.
i. Draw a graph representing the distribution of the number of hours of
work per week. Clearly label all axes and points of interest.
ii. Find the expected value of the hours of work per week.
iii. Find the variance of the hours of work per week.
iv. What is the probability you get no more than 6 hours work next week?
v. Your budget requires your job to bring in a minimum of $112 per week
to meet minimum expenses, or you will be forced to ask your parents
for money. What is the probability that you ask your parents for money
next week?
vi. What is the probability that you get more than 600 hours work over the
coming year (52 weeks)?
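The sketch below treats weekly hours as a continuous uniform variable on [4, 18], which is an assumption about the intended model, and uses the central limit theorem to approximate the 52 week total by a normal distribution. Under those assumptions the mean is (a + b) / 2 and the variance is (b - a)^2 / 12.

```python
import math
from statistics import NormalDist

a, b = 4, 18   # weekly hours, assumed continuous uniform on [a, b]
wage = 24      # dollars per hour

mean_hours = (a + b) / 2
var_hours = (b - a) ** 2 / 12

def uniform_cdf(x):
    """P(X <= x) for the uniform distribution on [a, b]."""
    return min(max((x - a) / (b - a), 0.0), 1.0)

# iv. Probability of no more than 6 hours of work.
p_at_most_6 = uniform_cdf(6)

# v. Earnings below $112 means fewer than 112 / 24 hours.
p_ask_parents = uniform_cdf(112 / wage)

# vi. Normal approximation to the total over 52 independent weeks.
weeks = 52
yearly = NormalDist(mu=weeks * mean_hours, sigma=math.sqrt(weeks * var_hours))
p_over_600 = 1 - yearly.cdf(600)

print(mean_hours, var_hours, p_at_most_6, p_ask_parents, p_over_600)
```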
(a) The Yummy biscuit company claims that every 500g package of their
chocolate chip cookies contains an average of at least 1000 chocolate chips.
Being a dedicated student of statistics, you determine to test their claim, and
taking a random sample of 16 packages, you find an average of 1238.2
chocolate chips with a standard deviation of 94.3.
i. Perform a hypothesis test at the 5% level to determine if the
company's claim is supported by your data.
ii. Comment on any assumptions you have made in performing
the inference in part (i).
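A one-sample t statistic for this setting can be sketched as follows. The hypotheses are an assumed reading of the question (H0: mean of at least 1000, H1: mean below 1000), and the critical value 1.753 is the t table value for 15 degrees of freedom with 5% in one tail. The test also relies on chip counts per package being approximately normal.

```python
import math

n = 16
sample_mean = 1238.2
sample_sd = 94.3
claimed_mean = 1000

# t statistic: (x_bar - mu_0) / (s / sqrt(n)), with n - 1 degrees of freedom.
t_stat = (sample_mean - claimed_mean) / (sample_sd / math.sqrt(n))
df = n - 1

t_crit = 1.753   # assumed: t table value, 15 df, 5% in one tail

# Lower-tailed test: the claim "at least 1000" is rejected only if t < -t_crit.
reject_claim = t_stat < -t_crit

print(t_stat, df, reject_claim)
```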
(b) The Scrumptious biscuit company claims that their 500g packages of
chocolate chip cookies are tastier, and contain a different number of chocolate
chips on average than those produced by the Yummy company. You take a
random sample of 16 Scrumptious packages and find a sample average of
1382.2 with a standard deviation of 123.1.
i. Test at the 10% level whether the data support the assumption
of equal population variances.
ii. Perform an appropriate test of equality of means, using a
significance level of 10% and clearly stating any assumptions
you make.
iii. Given the results of your test in (ii), answer the following
question without performing any further calculations: Would
the value 0 be found within a 90% confidence interval for the
true mean difference in number of chocolate chips between
Yummy and Scrumptious packages? Explain your answer.
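The two-sample comparison can be sketched as an F ratio of sample variances followed by a pooled-variance t statistic. The critical values below are table values entered by hand (F with 15 and 15 degrees of freedom at 5% in the upper tail, and t with 30 degrees of freedom at 5% in each tail). Using the pooled test is itself conditional on the F test not rejecting equal variances.

```python
import math

n1, mean1, sd1 = 16, 1238.2, 94.3     # Yummy sample
n2, mean2, sd2 = 16, 1382.2, 123.1    # Scrumptious sample

# F statistic with the larger sample variance in the numerator.
f_stat = max(sd1, sd2) ** 2 / min(sd1, sd2) ** 2
f_crit = 2.40   # assumed: F table value, (15, 15) df, 5% upper tail
equal_var_rejected = f_stat > f_crit

# Pooled variance and the equal-variance two-sample t statistic.
pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / (n1 + n2 - 2)
se_diff = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
t_stat = (mean1 - mean2) / se_diff
t_crit = 1.697  # assumed: t table value, 30 df, 5% in each tail
means_differ = abs(t_stat) > t_crit

print(f_stat, equal_var_rejected, t_stat, means_differ)
```

If the two-sided test at the 10% level rejects equal means, the matching 90% confidence interval for the difference excludes 0, since the two procedures use the same standard error and critical value.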
Variable       N     Mean     StDev   Minimum   Maximum
Weeks          100   10.500   5.795    1.000    20.000
Weight lost    100    7.290   4.242   -1.136    15.414
[Figure: scatterplot of Weight lost (vertical axis, 0.0 to 12.5) against Weeks (horizontal axis, 0 to 20)]
A regression is performed in Minitab on the data, and an excerpt of the output is given
below. However, some of the output has been obscured by sweat stains from the
exercise program.
Regression Analysis: Weight lost versus Weeks

Predictor   Coef      SE Coef   T       P
Constant    -0.0742   0.2539    -0.29   0.771
Weeks       0.70133   0.02119   33.09   0.000

R-Sq = 91.8%   R-Sq(adj) = 91.7%

Analysis of Variance
Source           DF   SS       MS       F         P
Regression        1   1635.5   1635.5   1095.03   0.000
Residual Error   98    146.4      1.5
Total            99   1781.8

Unusual Observations
Obs   Weeks   Weight lost   Fit      SE Fit   Residual   St Resid
42     2.0     4.586         1.328   0.218     3.258      2.71R
47     7.0     7.787         4.835   0.143     2.952      2.43R
79    19.0    10.640        13.251   0.218    -2.611     -2.17R
97    17.0    14.282        11.848   0.184     2.433      2.01R
(a) Describe the scatterplot. Would you expect the covariance between weeks and
weight loss to be positive or negative? Give a reason.
(b) Give the equation of the fitted model (SWEAT STAIN 1).
(c) Find the standard error of the estimate (SWEAT STAIN 2). Give an interpretation
of what this value means.
(d) The friends wish to test the value of the intercept. In particular, they wish to know
if the average weight loss at 0 weeks is 0 kg. Use the output (without performing
any calculations) to comment on this.
(e) It is often claimed that weight loss of over 0.5kg per week is unsustainable. Test if
the average weight loss per week by this group of friends is likely to be
unsustainable based on this criterion.
(f) Comment on the unusual observations flagged in the Minitab output. Are they a
cause for concern about the model?
(g) The friends wish to find a 95% confidence interval for the average weight loss of
all people using the same combination of diet and exercise in week 15. Use the
output above and calculations you have made in earlier parts of this question, to
find this interval.
Hint: You may find the following formula useful:
(1 - \alpha)% Confidence Interval for y given that x = x_g:
\[ \hat{y} \pm t_{\alpha/2,\,n-2}\; s_{\varepsilon} \sqrt{\frac{1}{n} + \frac{(x_g - \bar{x})^2}{(n-1)\,s_x^2}} \]
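The quantities asked for in parts (c), (e) and (g) can be sketched from the printed output alone. The standard error of the estimate is taken as sqrt(SSE / (n - 2)), the slope is tested against 0.5 using its printed standard error, and the interval uses y_hat plus or minus t * s * sqrt(1/n + (x_g - x_bar)^2 / ((n - 1) * s_x^2)). The critical value 1.984 is an approximate table value for 98 degrees of freedom and is entered by hand.

```python
import math

n = 100
b0, b1 = -0.0742, 0.70133    # fitted intercept and slope
se_b1 = 0.02119              # standard error of the slope
sse, df_error = 146.4, 98    # residual sum of squares and its df
x_bar, s_x = 10.500, 5.795   # mean and standard deviation of Weeks

# (c) Standard error of the estimate.
s_eps = math.sqrt(sse / df_error)

# (e) t statistic for H0: slope = 0.5 against H1: slope > 0.5.
t_slope = (b1 - 0.5) / se_b1

# (g) 95% confidence interval for the mean weight loss at week 15.
x_g = 15
y_hat = b0 + b1 * x_g
t_crit = 1.984   # assumed: approximate t table value, 98 df, 2.5% in each tail
half_width = t_crit * s_eps * math.sqrt(
    1 / n + (x_g - x_bar) ** 2 / ((n - 1) * s_x ** 2)
)
ci = (y_hat - half_width, y_hat + half_width)

print(s_eps, t_slope, ci)
```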
_____________________________________________________________________
END OF EXAMINATION