For questions 1-28, there is only one correct answer from those given. Mark
your answer to each question with a pencil on the sheet provided. Ambiguous
responses will be considered incorrect.
For questions 29-36, please insert your answers in the spaces provided. You
should provide all relevant working. Should you require more space, please
use the reverse side of the sheet, clearly indicating to which question your
response corresponds.
1. Indicate which section you are in:
(a) 201 (9am class)
(b) 202 (2pm class)
(c) Deferred student from a previous term.
2. All human blood can be typed as one of A, O, B, or AB. Suppose that,
in a very large population, 50% of people are type O, 20% are type
B and 5% are type AB. I will choose one person at random from the
population. What is the probability that the person's blood is type A?
(a) 0.05
(b) 0.20
(c) 0.25
(d) 0.35
(e) 0.40
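As a rough check on the reasoning behind question 2, the complement rule can be sketched in a few lines. This assumes the four blood types are mutually exclusive and cover the whole population, as the question states; the variable names are illustrative only.

```python
from fractions import Fraction

# Stated population shares for the three types other than A
p_other = {"O": Fraction(50, 100), "B": Fraction(20, 100), "AB": Fraction(5, 100)}

# The four types are exhaustive, so the probabilities must sum to 1
p_a = 1 - sum(p_other.values())
print(float(p_a))
```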

Use the following for questions 3, 4 and 5. A consumer recorded the
price of a loaf of Di Baggio Italian bread at her local supermarket
each week in 2007. The data are presented in a histogram below:

3. The distribution displayed above would be described as bimodal. True
or false?
(a) True
(b) False
4. Which of the following is the most likely value of the mean of the prices
recorded?
(a) $1.80
(b) $1.90
(c) $2.00
(d) $2.10
(e) $2.20

5. Which of the following is the most likely value of the standard deviation
of the prices recorded?
(a) $0.04
(b) $0.15
(c) $0.50
(d) $1.50
(e) $2.00

Use the four scatterplots below for questions 6-9. The four scatterplots
below are, as labelled, respectively plots of Y1 vs X1, Y2 vs X2, Y3 vs
X3 and Y4 vs X4.

For each scatterplot, choose the value of the associated correlation from
the five listed below. Note that you will not need to use all the five
listed values, and it is possible that some values may be used more than
once.
(a) 0.958
(b) 0.504
(c) 0.032
(d) 0.217
(e) 0.830

6. Y1 vs X1
7. Y2 vs X2
8. Y3 vs X3
9. Y4 vs X4

10. Data were collected within a group of males in an athletic association in
BC. Based on this dataset, a regression model was computed to predict
weight Y (in kg) from height X (in cm). The model fitted was
Y = 0.28X + 27.00.
If intending to predict height from weight using the same dataset, which
of the following statements most precisely describes what you can say
about the appropriate regression line?
(a) The slope of the regression line would be 0.28.
(b) The slope of the regression line would be 27.00
(c) The slope of the regression line would be positive.
(d) The slope of the regression line would be negative.
(e) The slope of the regression line would be 3.57.
11. Consider sampling with replacement from a large population of people.
Within the population, the variance of IQ is denoted σ². The sample
size is n, and the mean of the sample IQs is found. The variance of
this sample mean is
(a) σ² provided n is large.
(b) σ² for any value of n.
(c) σ²/n provided n is large.
(d) σ²/n for any value of n.
(e) σ²/√n for any value of n.
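The relationship between the population variance and the variance of a sample mean in question 11 can be explored by exact enumeration. The sketch below uses a small hypothetical population (the four IQ values are made up for illustration) and lists every equally likely ordered sample drawn with replacement, so no simulation error is involved.

```python
from fractions import Fraction
from itertools import product

# Hypothetical population of IQ scores (illustrative values only)
population = [85, 95, 100, 120]

def pop_variance(values):
    """Population variance (divisor N), computed exactly with fractions."""
    mu = Fraction(sum(values), len(values))
    return sum((v - mu) ** 2 for v in values) / len(values)

def variance_of_sample_mean(values, n):
    """Exact variance of the sample mean over all ordered samples of size n
    drawn with replacement (each sample is equally likely)."""
    means = [Fraction(sum(s), n) for s in product(values, repeat=n)]
    return pop_variance(means)

sigma2 = pop_variance(population)
for n in (1, 2, 3):
    # Print the exact variance of the mean next to sigma^2 / n for comparison
    print(n, variance_of_sample_mean(population, n), sigma2 / n)
```

The two printed columns can be compared for each n, including small n, without appealing to any large-sample result.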

12. In a very large population, the distribution of annual income is skewed,
with a very long right tail. I take a simple random sample of n people
from this population and record the n incomes and calculate their
average. The histogram of the n incomes in the sample
(a) will resemble a Normal distribution provided n is large.
(b) will resemble a Normal distribution for all values of n.
(c) will not resemble a Normal distribution whatever the value of n.
(d) will resemble a Uniform distribution provided n is large.
(e) will resemble a Uniform distribution for all values of n.
Use the following information for questions 13, 14 and 15:
A researcher for Agri-Giant Corporation wants to study how many
pounds of tomatoes are typically picked by a manual labourer in an 8
hour day. The Corporation will consider using picking machines if the
labourers cannot pick at least 240 pounds, on average.
To test the null hypothesis that the expected number of pounds of
tomatoes picked by a farm labourer in 8 hours is 240 pounds, the
researcher randomly sampled 100 farm labourers, found out how many
pounds of tomatoes each picked and calculated the sample mean and
standard deviation, x̄ and s. The researcher found a p-value of 0.04.
13. The researcher would reject the null hypothesis at significance level
0.05. True or false?
(a) True
(b) False

14. The parameter of interest to the researcher is
(a) whether or not a manual labourer can pick at least 240 pounds of
tomatoes in an 8 hour shift
(b) 240 pounds
(c) how long the labourer works
(d) how many pounds of tomatoes each labourer picks in an 8 hour
shift
(e) the average number of pounds of tomatoes typically picked by
labourers in an 8 hour shift
15. Suppose that the Agri-Giant Corporation had chosen to base its inference on a sample of 50 labourers rather than 100. Suppose further that
by chance the values of the sample mean and standard deviation were
the same as found in the case above, where the sample size was 100.
Then which of the following statements best describes the impact on
the p-value?
(a) The p-value would be smaller than 0.04.
(b) The p-value would be larger than 0.04.
(c) The p-value would be equal to 0.04.
(d) The p-value would be meaningless for such a small sample.
(e) It is impossible to deduce anything about the p-value from the
information given.

Use the following information for questions 16, 17 and 18:
Researchers are interested in determining if, during an exam period,
SFU undergraduates tend to sleep more than UBC undergraduates.
Ten SFU undergraduates were chosen at random and, independently,
ten UBC undergraduates were chosen at random. A data file was
constructed consisting of a line for each student containing:
ID: student ID,
SCH: School attended (SFU/UBC),
APR16: number of hours slept during the period on April 16 from
12:01 am to 11:59 pm,
APR17: number of hours slept during the period on April 17 from
12:01 am to 11:59 pm.
16. True or false? A good way to study the primary research question is
to make a scatterplot of UBC students' numbers of hours slept Apr 16
on the x axis and SFU students' numbers of hours slept Apr 16 on the
y axis.
(a) True
(b) False
17. True or false? The researchers should use a one-sided alternative hypothesis.
(a) True
(b) False

18. To test the null hypothesis that SFU undergraduates and UBC undergraduates tend to sleep the same, on average, during exam period, we
would need which one of the following?
(a) Tables for the Chi-squared distribution on 9 degrees of freedom.
(b) A table for the t distribution with 8 degrees of freedom.
(c) A table for the t distribution with 19 degrees of freedom.
(d) A table for an F distribution with degrees of freedom of the denominator equal to 19.
(e) A table for an F distribution with degrees of freedom of the denominator equal to 18.
Use the following information for questions 19, 20 and 21:
The owner of a small clothing store is concerned that her average sales
each day are only $149, not enough to cover rent and salary. She decides
to try out some new window displays, to see if these will increase her
average sales. She buys the new window displays on trial. To decide
if she should keep the new displays, she collects sales data for 20 days
to test the null hypothesis that the daily expected sales are unchanged
(equal to $149) versus the alternative hypothesis that expected daily
sales are greater than $149.
19. Suppose that the displays really do work. If the store owner extends her
trial period from 20 days to 30 days, which statement most precisely
describes what can be said about the power of her test?
(a) The power would increase.
(b) The power would stay the same.
(c) The power would decrease.
(d) The power would remain zero.
(e) The power could be chosen to be 5%.

20. Suppose that, based on the data collected in the trial, the owner calculates a p-value of 0.04. This means
(a) there is a 4% chance that sales increased during the trial period.
(b) there is a 4% chance that sales decreased during the trial period.
(c) during the trial period, sales increased by 4%.
(d) during the trial period, sales decreased by 4%.
(e) during the trial period, her sales figures were pretty high, if indeed
the new displays typically would have no effect.
21. Suppose that, based on the data collected in the trial, the owner of the
store decides to keep the new displays. Then
(a) she is in danger of making a Type I error.
(b) she is in danger of making a Type II error.
(c) she is in danger of making a Type III error.
(d) she will get a bigger .
(e) she will get a smaller .

22. Dependent on birth date, each person is assigned to one astrological
sign. There are twelve such signs, commonly known as the Signs of
the Zodiac. A study investigated a relationship between astrological
sign and heart rate (measured in bpm (beats per minute)). The study
asked each subject for their astrological sign and also had their heart
rates recorded while at rest. To test the null hypothesis that the expected heart rate is the same for people of each astrological sign, an
ANOVA table was computed. The value of the test statistic was 0.61,
to be compared with the F (11, 813) distribution.
We would reject the null hypothesis here. True or false?
(a) True
(b) False
23. Based on the information in the previous question, you could deduce
which of the following?
(a) The mean heart rate was 813 bpm.
(b) The variance of all the heart rates recorded is 0.61 bpm².
(c) There were eleven people in the study born under each of the
twelve astrological signs.
(d) There is 61% of the variation in the heart rates that is accounted
for by astrological sign.
(e) There were 825 subjects in the study.
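For a one-way ANOVA, the two degrees of freedom of the F statistic are tied to the number of groups k and the total number of subjects N by df1 = k - 1 and df2 = N - k. A minimal sketch of that bookkeeping, applied to the F(11, 813) reference distribution quoted in question 22 (variable names are illustrative):

```python
# Degrees of freedom reported for the one-way ANOVA F statistic
df_between, df_within = 11, 813

groups = df_between + 1        # df_between = k - 1
subjects = df_within + groups  # df_within = N - k

print(groups, subjects)
```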

24. Three different labs tested two types of cream, A and B, recording the
percentage of solubility in some liquid. Each lab repeated each experiment, and the data are given below:

Lab   Cream type A   Cream type B
1     6.8, 6.6       5.3, 6.1
2     7.5, 7.4       7.2, 6.5
3     7.8, 9.1       8.8, 9.1

Differences in the measurements may be due to differences in solubility
in the cream types, differences between the labs or both of these possible
sources of variation. To investigate this, you could use
(a) A Binomial distribution.
(b) A matched pairs t test.
(c) A Chi-squared test for association.
(d) A linear regression model.
(e) No method that has been encountered in STAT 200.

For questions 25-28, consider studying if gender and the highest academic qualification obtained (none, high school diploma, bachelor's degree, post-graduate degree) are independent.
25. True or false? To study independence of gender and the highest qualification obtained, it would be useful to compare the four conditional
distributions:
the conditional distribution of gender given no qualification was
obtained,
the conditional distribution of gender given the highest qualification is a high school diploma,
the conditional distribution of gender given the highest qualification is a bachelor's degree,
the conditional distribution of gender given the highest qualification is a post-graduate degree.
(a) True
(b) False

26. True or false? To study independence of gender and the highest academic qualification obtained, it would be useful to construct a scatterplot.
(a) True
(b) False
27. True or false? To study independence of gender and the highest academic qualification obtained, it would be useful to calculate a correlation coefficient.
(a) True
(b) False
28. True or false? To study independence of gender and the highest academic qualification obtained, it would be useful to calculate a chi-square
statistic.
(a) True
(b) False

29. (6 marks) In his twenty seasons playing in the National Hockey League
(NHL), Wayne Gretzky played the following number of games per season:
79, 80, 80, 80, 74, 80, 80, 79, 64, 78,
73, 78, 74, 45, 81, 48, 80, 82, 82, 70
(a) Create a stem-and-leaf plot for these data.

(b) Which of the following best describes the distribution? (Circle
one)

Symmetric
Left skewed
Right skewed
Uniform

(c) Identify any apparent outliers in the data, and provide a plausible
explanation for the value(s).
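One way to organise the games-per-season data of question 29 into stems and leaves is sketched below. It assumes the tens digit is used as the stem and the units digit as the leaf, and it keeps empty stems so that gaps in the data remain visible; this is one common convention, not the only possible layout.

```python
games = [79, 80, 80, 80, 74, 80, 80, 79, 64, 78,
         73, 78, 74, 45, 81, 48, 80, 82, 82, 70]

# Stem = tens digit, leaf = units digit; empty stems are kept on purpose
stems = {s: [] for s in range(min(games) // 10, max(games) // 10 + 1)}
for g in sorted(games):
    stems[g // 10].append(g % 10)

# One row per stem, leaves in increasing order
for stem, leaves in stems.items():
    print(f"{stem} | {' '.join(str(leaf) for leaf in leaves)}")
```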

30. (10 marks) Recall that a Roulette wheel has 38 slots, labelled 0, 1, 2,
..., 36, and 00. I will play Roulette by betting on the slot labelled 00.
For one play of Roulette, I pay $1. If 00 comes up on the wheel, I
get my dollar back, plus $35, for a net gain of $35. If 00 does not
come up on the wheel, I lose my dollar, for a net gain of -$1. Let X
be my net gain in one play of Roulette.
(a) Find the probability distribution of X.

(b) Find E(X), the expected value of X.

(c) Find Var(X), the variance of X.

(d) Suppose that I play Roulette 100 times, each time betting on 00.
Let W be my total winnings. What is E(W)? What is Var(W)?
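The quantities asked for in question 30 follow from the definitions of expectation and variance for a discrete random variable. The sketch below is one way to check the arithmetic exactly. It assumes the 38 slots are equally likely, that a losing play counts as a net gain of -1 dollar, and that the 100 plays are independent; the helper names are illustrative.

```python
from fractions import Fraction

# Net gain X per play: +35 with probability 1/38, -1 with probability 37/38
dist = {35: Fraction(1, 38), -1: Fraction(37, 38)}

def mean(d):
    """E(X) = sum of x * P(X = x)."""
    return sum(x * p for x, p in d.items())

def variance(d):
    """Var(X) = sum of (x - E(X))^2 * P(X = x)."""
    m = mean(d)
    return sum((x - m) ** 2 * p for x, p in d.items())

ex, vx = mean(dist), variance(dist)

# W is the sum of 100 plays: means always add; variances add under independence
n_plays = 100
ew, vw = n_plays * ex, n_plays * vx

print(ex, vx, ew, vw)
```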

31. (5 marks) Every day Lucky Louie plays a die roll game. He rolls a die
five times and counts the number of ones. If he rolls exactly two ones,
then he treats himself and buys a Barstucks Macchiato. That is the
only way he treats himself. Let X be the number of Macchiatos Lucky
Louie buys in the month of June, a month with thirty days. Then
X has a Binomial distribution defined by two parameters, denoted as
usual n and p.
(a) What is the value of n here?

(b) What is the value of p?
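In question 31 each day is one trial, and a "success" is rolling exactly two ones in five rolls, which is itself a Binomial probability. A minimal sketch of that two-level calculation, assuming a fair six-sided die and independent days (the variable names are illustrative):

```python
from fractions import Fraction
from math import comb

n_days = 30              # one trial per day in June
p_one = Fraction(1, 6)   # chance of a one on a single fair roll
rolls, ones = 5, 2

# P(exactly two ones in five rolls) = C(5, 2) * (1/6)^2 * (5/6)^3
p_treat = comb(rolls, ones) * p_one**ones * (1 - p_one) ** (rolls - ones)

print(n_days, p_treat, float(p_treat))
```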

32. (7 marks) Suppose that Math Proficiency scores of 12th graders are
Normally distributed with mean 80 and standard deviation 12.
(a) Approximately what is the Math Proficiency score of a student at
the first quartile (that is, the 25th percentile)?

(b) Approximately what is the interquartile range of the Math Proficiency scores?
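Quartiles of a Normal distribution are usually read from a standard Normal table, but they can also be checked numerically. The sketch below uses `statistics.NormalDist` from the Python standard library (available from Python 3.8) with the mean and standard deviation given in question 32.

```python
from statistics import NormalDist

scores = NormalDist(mu=80, sigma=12)

q1 = scores.inv_cdf(0.25)  # first quartile (25th percentile)
q3 = scores.inv_cdf(0.75)  # third quartile (75th percentile)
iqr = q3 - q1              # interquartile range

print(round(q1, 1), round(iqr, 1))
```

By symmetry the two quartiles sit the same distance either side of the mean, so the interquartile range is twice the distance from the mean to the first quartile.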

33. (5 marks) Some people believe in dowsing, the ability to be able to
detect unseen water with the aid of a forked stick. Dowsers claim that
they experience a movement in the stick when it is passed over water,
even when that water cannot be seen or otherwise detected.
In a study to investigate the possibility of dowsing, researchers obtained
eight subjects who claimed to have the ability to dowse. The researchers
took twelve identical containers, and placed half a litre of water in six
of them. The other six were empty, and the containers were such that it
was impossible to determine their contents by visual inspection alone.
The twelve containers were placed in a random order in a room. The
dowsers entered the room one-by-one, and attempted to determine
which of the twelve containers held water using only their supposed
dowsing powers. The dowsers were not told how many of the twelve
containers actually contained water, and nor were they told whether
their choices had been correct. The researchers recorded the number
of times each dowser was correct.
(a) Why did the researchers only allow the subjects to attempt the
task one at a time?

(b) Briefly explain why this experiment was not double-blind.

(c) One of the eight dowsers successfully determined the presence or
absence of water in all twelve containers. This proves false the
hypothesis that no-one has the genuine ability to dowse for water.
True or false? (Circle one)

True
False

34. (9 marks) One definition of obesity is in terms of body mass index.
In a study of obesity in Vancouver fourth graders, random samples
of fourth graders were taken from each school and body mass indices
recorded. Here is a summary of the body mass index data from two
schools, Laura Secord and Charles Dickens.

School            Number of children measured   Average body mass   SD of the body masses
Laura Secord                                    24.3                3.1
Charles Dickens                                 21.0                2.9

(a) Carry out a hypothesis test to determine if the average body mass
of fourth graders at Laura Secord is equal to the average at Charles
Dickens. Test at the 0.05 significance level. Clearly state your test
statistic, the tables you use and show all calculations. State your
conclusion in the context of this problem.

(b) Find a 95% confidence interval for the average body mass index
of fourth graders at Laura Secord School.

35. (9 marks) What affects how a person chooses at random? Each of
92 randomly sampled university students was given a slip of paper that
said
"Randomly choose one of the letters S or Q."
Of these 92 students, 61 chose S. The remaining 31 students chose Q.
Another 98 randomly sampled university students were given a slip of
paper that said
"Randomly choose one of the letters Q or S."
Of these 98 students, 45 chose S. The remaining 53 students chose Q.
Is there an association between how the students responded and the
ordering of the letters in the question? Carry out the appropriate test
at level 0.05. Clearly show the calculation of your test statistic and
your rejection rule (in particular, clarify which of the tables provided
you have used, if any). State your conclusion, in the context of this
problem.
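The counts in question 35 form a 2 x 2 table (slip wording by letter chosen), so one reasonable approach is a Pearson chi-squared test for association. The sketch below computes the statistic by hand, without a continuity correction, and compares it with 3.841, the standard table value for 1 degree of freedom at level 0.05. The layout of the table and the variable names are choices made for this illustration.

```python
# Observed counts: rows are slip wordings, columns are the letter chosen (S, Q)
observed = [[61, 31],   # slip read "S or Q"
            [45, 53]]   # slip read "Q or S"

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
total = sum(row_totals)

# Pearson statistic: sum over cells of (observed - expected)^2 / expected,
# where expected = row total * column total / grand total
chi2 = 0.0
for i, row in enumerate(observed):
    for j, obs in enumerate(row):
        expected = row_totals[i] * col_totals[j] / total
        chi2 += (obs - expected) ** 2 / expected

df = (len(observed) - 1) * (len(observed[0]) - 1)
critical = 3.841  # chi-squared table value for df = 1 at level 0.05

print(round(chi2, 2), df, chi2 > critical)
```

A two-sample test for equality of proportions would be an equivalent alternative here; for a 2 x 2 table its squared z statistic equals the chi-squared statistic above.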
