ST102 2008mock

2008 Mock examination
ST102 Mock Paper

Elementary Statistical Theory
2007/8 syllabus only - not for resit candidates
Instructions to candidates
Time allowed: 3 hours
Full marks may be obtained for complete answers to FIVE questions.
Answer not more than THREE questions from Section A and not more than THREE
questions from Section B.
You are supplied with:
Graph Paper
Murdoch & Barnes Statistical Tables (4th Edition)
You may also use:
A hand-held calculator, which however must not be pre-programmed or able to
display graphics, text or algebraic equations. The make and type of machine
must be stated clearly on the front cover of the answer book.
© LSE 2008/ST102 Mock Paper
SECTION A
1. a) Let $\{A_i\}$ be a set of events in a sample space $S$.
The three axioms of probability may be stated as follows:
$p(A) \ge 0$ for all events $A$
$p(S) = 1$
$p\left(\bigcup_{i=1}^{\infty} A_i\right) = \sum_{i=1}^{\infty} p(A_i)$ if the events $\{A_i\}$ are mutually exclusive.
i) Prove that $p(A^C) = 1 - p(A)$ for any event $A$, where $A^C$ is the event
complementary to event $A$.
ii) Prove that if $A \subset B$ then $p(A) \le p(B)$ for events $A$ and $B$.
iii) State what it means for events $A$ and $B$ to be independent, and
show that if $A$ and $B$ are independent, then $A^C$ and $B^C$ are
independent.
[For parts i) and ii), no results about probabilities may be assumed
unless derived from the axioms. For part iii), any additional results about
probabilities that you use should be stated clearly but need not be
proved. For each of i), ii) and iii), results from set theory should be
stated clearly but need not be proved.]
[10 marks]
b) Assume that a calculator has a random number key and that when the key is
pressed, a whole number between 0 and 999 inclusive is generated at random,
all numbers being generated independently of one another.
i) What is the probability that the number generated is less than 300?
ii) If two numbers are generated, what is the probability that both are
less than 300?
iii) If two numbers are generated, what is the probability that the first
exceeds the second?
iv) If two numbers are generated, what is the probability that the first
exceeds the second and their sum is exactly 300?
v) If five numbers are generated, what is the probability that at least
one number occurs more than once?
[10 marks]
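Hand-computed answers to part b) can be checked by exact enumeration. The sketch below is only a checking aid, not a model answer. It assumes that each of the 1000 integers from 0 to 999 is equally likely; the function names are illustrative and do not come from the paper.

```python
from fractions import Fraction
from math import prod

N = 1000  # assumption: integers 0..999, each equally likely


def prob_single_below(limit, n=N):
    """Exact probability that one generated number is less than `limit`."""
    return Fraction(sum(1 for a in range(n) if a < limit), n)


def prob_pairs(predicate, n=N):
    """Exact probability that an ordered pair (a, b) of independent
    draws satisfies `predicate`, found by counting all n * n pairs."""
    favourable = sum(1 for a in range(n) for b in range(n) if predicate(a, b))
    return Fraction(favourable, n * n)


def prob_repeat(k, n=N):
    """Exact probability that k independent draws contain at least one
    repeated value, computed as one minus the probability all differ."""
    all_distinct = Fraction(prod(range(n - k + 1, n + 1)), n ** k)
    return 1 - all_distinct
```

For example, `prob_pairs(lambda a, b: a > b)` counts the ordered pairs in which the first number exceeds the second, and `prob_repeat(5)` handles the five-number case. Working with `Fraction` keeps every result exact, so it can be compared directly with a fraction obtained by hand.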
2. a) X and Y are discrete random variables that can assume values 0, 1 and 2 only.
$p(X = x, Y = y) = A(x + y)$ for some constant $A$ and $x, y \in \{0, 1, 2\}$.
i) Draw up a table to describe the joint distribution of X and Y and
find the value of the constant A.
ii) Describe the marginal distributions of X and Y.
iii) Give the conditional distribution of $X \mid Y = 1$ and find $E(X \mid Y = 1)$.
iv) Are X and Y independent? Give reasons for your answer.
[11 marks]
b) The independent random variables $X_1$, $X_2$ and $X_3$ are each normally
distributed with mean 0 and variance 4. Find
i) $p(X_1 > X_2 + X_3)$
ii) $p(X_1^2 > 9.25(X_2^2 + X_3^2))$
iii) $p(X_1 > 5(X_2^2 + X_3^2))$
[9 marks]
3. a) Show that the moment generating function (mgf) of a Poisson distribution with
parameter $\lambda$ is given by
$m(t) = \exp[\lambda(\exp(t) - 1)]$
(writing $\exp(\cdot) \equiv e^{(\cdot)}$).
Hence or otherwise, show that the mean and variance of the distribution are both $\lambda$.
[10 marks]
b) Suppose X and Y are independent random variables both of which possess an
mgf. Prove that the mgf of their sum is the product of their individual mgfs, i.e.
$m_{X+Y}(t) = m_X(t)\, m_Y(t)$.
Hence or otherwise, show that if X and Y are Poisson random variables with
parameters $\lambda_1$ and $\lambda_2$ respectively, then X + Y is a Poisson random variable with
parameter $\lambda_1 + \lambda_2$.
[6 marks]
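The additivity result in part b) can be sanity-checked numerically, although a numerical check is no substitute for the mgf proof the question asks for. The sketch below convolves two Poisson probability functions and compares the result with a single Poisson probability function whose parameter is the sum. The parameter values 2 and 3 and the names `lam1` and `lam2` are illustrative assumptions.

```python
from math import exp, factorial


def poisson_pmf(k, lam):
    """Poisson probability of the value k for parameter lam."""
    return lam ** k * exp(-lam) / factorial(k)


def convolved_pmf(k, lam1, lam2):
    """P(X + Y = k) for independent X ~ Poisson(lam1) and Y ~ Poisson(lam2),
    computed directly by summing over the ways of splitting k."""
    return sum(
        poisson_pmf(j, lam1) * poisson_pmf(k - j, lam2) for j in range(k + 1)
    )


# Illustrative (assumed) parameter values.
lam1, lam2 = 2.0, 3.0

# Largest discrepancy between the convolution and Poisson(lam1 + lam2)
# over the first few values of k; it should be at rounding-error level.
max_gap = max(
    abs(convolved_pmf(k, lam1, lam2) - poisson_pmf(k, lam1 + lam2))
    for k in range(15)
)
```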
c) On a dangerous stretch of road, it has been found that in any three-month period,
the average numbers of deaths and serious injuries to car drivers, pedestrians and
other road users (including cyclists) are 2, 3 and 4 respectively. Assume that the
numbers of deaths and serious injuries for each type of road user can be well
represented by a Poisson distribution. What is the probability of at most one death
or serious injury for any type of road user next month?
[4 marks]
4. A random variable X has probability density function (pdf)
$$f(x) = \begin{cases} \ldots, & 0 \le x \le 1 \\ \ldots, & 1 < x \le 2 \\ 0, & \text{otherwise.} \end{cases}$$
a) Explain why f(x) can serve as a pdf.
[3 marks]
b) Find the mean and median of the distribution.
[4 marks]
c) Find Var(X).
[2 marks]
d) Write down the cumulative distribution function of X.
[3 marks]
e) Find $p(X = 1)$ and $p(X > 1.5 \mid X > 0.5)$.
[4 marks]
f) What is the moment generating function of X?
[4 marks]
5. a) Show that if X and Y are independent random variables, then
var(X + Y) = var(X) + var(Y).
[5 marks]
b) A manufacturer produces vitamin pills whose masses are normally distributed
with mean 0.5g and standard deviation 0.2g. The pills are sorted into jars each
containing 100 pills. Each jar has mass precisely 50g and the pills in each jar are a
random sample from the population.
i) Find the probability that if a jar containing 100 pills is chosen at
random, its mass lies between 97g and 105g.
ii) Fifty jars are chosen at random. What is the probability that the mean
mass of the fifty jars exceeds 100.2g?
iii) Fifty jars are placed in a box. What is the probability that the total
mass of jars in the box, where each jar contains 100 pills, exceeds
5010g?
iv) Inadvertently, some jars contain 101 pills. What is the probability
that a jar selected at random from those containing 100 pills has a
greater mass than a jar selected at random from those containing 101
pills?
[15 marks]
Section B
6. (a) Let $\{X_1, \ldots, X_n\}$ be a sample from a population with probability density
function $f(x, \theta_1, \theta_2)$, where $\theta_1, \theta_2$ are two unknown parameters. Write down the
equations to be solved in order to obtain the method of moments estimators for
$\theta_1$ and $\theta_2$. Define the maximum likelihood estimators for $\theta_1$ and $\theta_2$. Comment
on why the maximum likelihood estimators are in general preferred to the
method of moments estimators.
[6 marks]
(b) Let $\{X_1, \ldots, X_n\}$ be a sample from the uniform distribution on the interval
$[a - \theta, a + \theta]$ ($\theta > 0$).
i. Find the method of moments estimators $\hat{a}$ and $\hat{\theta}$.
[6 marks]
ii. Is $\hat{a}$ an unbiased estimator for $a$?
[2 marks]
iii. Compute the mean square error MSE($\hat{a}$). Is $\hat{a}$ a consistent estimator?
[3 marks]
iv. For the sample {1, 2, 1, 3, 4}, compute $\hat{a}$ and $\hat{\theta}$.
[3 marks]
7. (a) A random sample of size $n = 35$ from $N(\mu, \sigma^2)$ yields the sample mean
$\bar{X} = 14.6$ and the sample variance $S^2 = 20.15$.
i. Compute the maximum likelihood estimate for = /.
[3 marks]
ii. Find a 95% confidence interval for $\mu$.
[4 marks]
iii. Now assuming $\sigma^2 = 20.15$, recalculate a 95% confidence interval for $\mu$.
Compare it with the one obtained above, and comment on the difference.
How many more observations will be required in order to ensure that the
length of the confidence interval does not exceed 2?
[8 marks]
(Hint. For Part i., you may use directly the known results on the MLEs for
$\mu$ and $\sigma^2$.)
(b) Show that for any constants $c$ and $d$, it holds that
$$\sum_{i=1}^{n} (a_i - c)(b_i - d) = \sum_{i=1}^{n} (a_i - \bar{a})(b_i - \bar{b}) + n(\bar{a} - c)(\bar{b} - d),$$
where $\bar{a} = \frac{1}{n}\sum_{i=1}^{n} a_i$ and $\bar{b} = \frac{1}{n}\sum_{i=1}^{n} b_i$.
[5 marks]
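Before attempting the proof in part (b), it can help to see the identity hold on concrete numbers. The sketch below evaluates both sides for one made-up data set and one made-up pair of constants. All the values and function names are illustrative assumptions, and agreement on an example is of course not a proof.

```python
def cross_sum(a, b, c, d):
    """Sum of (a_i - c) * (b_i - d) over the paired values of a and b."""
    return sum((ai - c) * (bi - d) for ai, bi in zip(a, b))


def identity_sides(a, b, c, d):
    """Return the left-hand and right-hand sides of the identity."""
    n = len(a)
    a_bar = sum(a) / n
    b_bar = sum(b) / n
    lhs = cross_sum(a, b, c, d)
    # Centred cross-product sum plus the correction term n(a_bar - c)(b_bar - d).
    rhs = cross_sum(a, b, a_bar, b_bar) + n * (a_bar - c) * (b_bar - d)
    return lhs, rhs


# Made-up values, used only for illustration.
a = [1.0, 4.0, 2.5, 7.0]
b = [3.0, -1.0, 0.5, 2.0]
lhs, rhs = identity_sides(a, b, c=2.0, d=-3.0)
```

The two returned values agree up to floating-point rounding. Setting `c` and `d` equal to the sample means makes the correction term vanish, which hints at how the cross terms drop out in the algebraic proof.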
8. (a) Let $\{X_1, \ldots, X_n\}$ and $\{Y_1, \ldots, Y_m\}$ be two independent random samples from,
respectively, $N(\mu_x, \sigma^2)$ and $N(\mu_y, \sigma^2)$. Let $\bar{X}$, $S_x^2$ denote the sample mean
and the sample variance of $\{X_1, \ldots, X_n\}$, and $\bar{Y}$, $S_y^2$ denote the sample mean
and the sample variance of $\{Y_1, \ldots, Y_m\}$.
i. Write down the distribution of $\bar{X} - \bar{Y}$.
[2 marks]
ii. Write down the distribution of $\{(n - 1)S_x^2 + (m - 1)S_y^2\}/\sigma^2$.
[2 marks]
iii. Show that
$$\sqrt{\frac{n + m - 2}{1/n + 1/m}} \cdot \frac{\bar{X} - \bar{Y} - (\mu_x - \mu_y)}{\sqrt{(n - 1)S_x^2 + (m - 1)S_y^2}} \sim t_{n+m-2}.$$
[4 marks]
iv. Suppose $n = 15$, $m = 11$, $\bar{X} = 2.56$, $\bar{Y} = 2.34$, $S_x^2 = 2.6$ and $S_y^2 = 1.89$.
Test the hypothesis $H_0: \mu_x = \mu_y$ against $H_1: \mu_x > \mu_y$.
[4 marks]
(b) Use the Wilcoxon signed-rank test to test at the 5% significance level if the
population mean of Sample 1 is greater than that of Sample 2.
Sample 1   15   7  22  20  32  18  26  17  23  30
Sample 2    8  27  17  25  20  16  21  18  10   8
[8 marks]
9. (a) The numbers of calls received over 100 one-minute intervals at a telephone
exchange are summarized as follows:
No. of calls   0   1   2   3   4   5   6   7   8   9
Frequency      2   8  12  13  24  22   9   3   5   2
i. An expert claims that the number of calls received per minute at this
exchange follows the Poisson distribution with mean 3.5. Based on the
above data, will you be able to reject the claim at the 5% significance level?
[6 marks]
ii. Test at the 5% significance level if the data follow a Poisson distribution with
an unknown mean.
[6 marks]
(b) Below is the Minitab output of a two-way ANOVA in which some information
is missing.
Two-way ANOVA: C1 versus C2, C3
Source   DF   SS        MS       F   P
C2       ?    487.23    ?        ?   ?
C3       ?    ?         184.09   ?   ?
Error    26   708.00    ?
Total    34   2115.70
i. Find the missing values C3 SS and C3 DF.
[2 marks]
ii. Is the C3-effect significant at the 1% significance level?
[3 marks]
iii. Is the C2-effect significant at the 1% significance level?
[3 marks]
10. (a) It has been claimed that watching TV reduces the amount of physical exercise,
causing weight gains among children. The table below lists the overweight
(in pounds) of 8 randomly selected children together with the time (in hours)
for which they watch TV per week.
TV time (x)      42  34  25  35  37  38  31  33
Overweight (y)   18   6   0  -1  13  14   7   8
i. Plot the data y against x.
[2 marks]
ii. Fit the data with a simple regression model $y = \beta_0 + \beta_1 x + \varepsilon$. Superimpose
the fitted regression line on the plot obtained in i. above.
[5 marks]
iii. Test the hypothesis $H_0: \beta_1 = 0$ against $H_1: \beta_1 > 0$. What can be
concluded about the children's overweight in relation to the TV-watching
time?
[4 marks]
iv. Plot the residuals against x, and comment on the fitted regression model
based on this residual plot.
[2 marks]
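(b) Let $\{(x_i, y_i), 1 \le i \le n\}$ be observations from the linear regression model
$y_i = \beta_0 + \beta_1 x_i + \varepsilon_i$.
Let $\hat{\beta}_0$ and $\hat{\beta}_1$ be the least squares estimators, and $\hat{y}_i = \hat{\beta}_0 + \hat{\beta}_1 x_i$. Show that
$$\sum_{i=1}^{n} (y_i - \bar{y})^2 = \sum_{i=1}^{n} (y_i - \hat{y}_i)^2 + \sum_{i=1}^{n} (\hat{y}_i - \bar{y})^2.$$
[7 marks]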
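The decomposition in part (b) can be observed numerically before it is proved. The sketch below fits a least squares line using the standard closed-form estimators and evaluates the three sums of squares. It uses the eight (x, y) pairs from part (a) as read from the table there; the function names are illustrative assumptions, and a numerical check does not replace the algebraic argument.

```python
def least_squares(x, y):
    """Closed-form least squares intercept and slope for a simple regression."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


def sums_of_squares(x, y):
    """Return (total, residual, regression) sums of squares for the fitted line."""
    intercept, slope = least_squares(x, y)
    y_bar = sum(y) / len(y)
    fitted = [intercept + slope * xi for xi in x]
    total = sum((yi - y_bar) ** 2 for yi in y)
    residual = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    regression = sum((fi - y_bar) ** 2 for fi in fitted)
    return total, residual, regression


# TV time (x) and overweight (y) as read from the table in part (a).
x = [42, 34, 25, 35, 37, 38, 31, 33]
y = [18, 6, 0, -1, 13, 14, 7, 8]
total, residual, regression = sums_of_squares(x, y)
```

Here `total` equals `residual + regression` up to rounding. The equality relies on the fitted values coming from least squares; substituting an arbitrary line for the fitted one generally breaks it, which is the point the proof has to capture.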
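Appendix: Formula Sheet

1. Discrete Distributions
Distribution        Probability function                                                       Mean      Variance
Binomial            $\frac{n!}{k!(n-k)!}\,\pi^k (1-\pi)^{n-k}$, $k = 0, 1, \ldots, n$          $n\pi$    $n\pi(1-\pi)$
Geometric           $(1-\pi)^{k-1}\pi$, $k = 1, 2, \ldots$                                     $1/\pi$   $(1-\pi)/\pi^2$
Negative binomial   $\frac{(k-1)!}{(r-1)!(k-r)!}\,\pi^r (1-\pi)^{k-r}$, $k = r, r+1, \ldots$   $r/\pi$   $r(1-\pi)/\pi^2$
Poisson             $\frac{\lambda^k}{k!}\,e^{-\lambda}$, $k = 0, 1, 2, \ldots$                $\lambda$ $\lambda$
Hypergeometric      $\binom{S}{k}\binom{N-S}{n-k} \Big/ \binom{N}{n}$, $k = 0, 1, \ldots, n$   $\frac{nS}{N}$   $\frac{nS(N-S)(N-n)}{N^2(N-1)}$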
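2. Simple Linear Regression

Model: $y_i = \beta_0 + \beta_1 x_i + \varepsilon_i$.
LSEs: $\hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x}$, $\hat{\beta}_1 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{j=1}^{n} (x_j - \bar{x})^2}$, and
$$\mathrm{Var}(\hat{\beta}_0) = \frac{\sigma^2 \sum_{j=1}^{n} x_j^2}{n \sum_{i=1}^{n} (x_i - \bar{x})^2}, \qquad \mathrm{Var}(\hat{\beta}_1) = \frac{\sigma^2}{\sum_{i=1}^{n} (x_i - \bar{x})^2}, \qquad \mathrm{Cov}(\hat{\beta}_0, \hat{\beta}_1) = -\frac{\sigma^2 \bar{x}}{\sum_{i=1}^{n} (x_i - \bar{x})^2}.$$
Estimator for the variance of $\varepsilon_i$:
$$\hat{\sigma}^2 = \frac{1}{n - 2} \sum_{i=1}^{n} (y_i - \hat{\beta}_0 - \hat{\beta}_1 x_i)^2.$$
Regression ANOVA:
$$\text{Total SS} = \sum_{i=1}^{n} (y_i - \bar{y})^2, \qquad \text{Regression SS} = \hat{\beta}_1^2 \sum_{i=1}^{n} (x_i - \bar{x})^2, \qquad \text{Residual SS} = \sum_{i=1}^{n} (y_i - \hat{\beta}_0 - \hat{\beta}_1 x_i)^2.$$
Squared regression correlation coefficients:
$$R^2 = \frac{\text{Regression SS}}{\text{Total SS}}, \qquad R^2_{adj} = 1 - \frac{(\text{Residual SS})/(n - 2)}{(\text{Total SS})/(n - 1)}.$$
For a given $x$, the expectation of $y$ is $\mu(x) = \beta_0 + \beta_1 x$. A $(1 - \alpha)$ confidence
interval for $\mu(x)$ is
$$\hat{\beta}_0 + \hat{\beta}_1 x \pm t_{\alpha/2,\, n-2}\, \hat{\sigma} \left\{ \frac{\sum_{i=1}^{n} (x_i - x)^2}{n \sum_{j=1}^{n} (x_j - \bar{x})^2} \right\}^{1/2},$$
and a predictive interval which covers $y$ with probability $(1 - \alpha)$ is
$$\hat{\beta}_0 + \hat{\beta}_1 x \pm t_{\alpha/2,\, n-2}\, \hat{\sigma} \left\{ 1 + \frac{\sum_{i=1}^{n} (x_i - x)^2}{n \sum_{j=1}^{n} (x_j - \bar{x})^2} \right\}^{1/2}.$$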
3. One-way ANOVA
Total variation: $\sum_{j=1}^{k} \sum_{i=1}^{n_j} (X_{ij} - \bar{X})^2$
Between-treatments variation: $B = \sum_{j=1}^{k} n_j (\bar{X}_{\cdot j} - \bar{X})^2$
Within-treatments variation: $W = \sum_{j=1}^{k} \sum_{i=1}^{n_j} (X_{ij} - \bar{X}_{\cdot j})^2$
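The three one-way ANOVA quantities can be computed directly from their definitions. The sketch below is a minimal illustration and assumes each treatment's observations are held in a separate Python list. The function name and the data are made up for illustration.

```python
def one_way_anova(groups):
    """Return (between, within, total) variation for a list of treatment
    groups, where each group is a list of observations."""
    all_values = [v for group in groups for v in group]
    grand_mean = sum(all_values) / len(all_values)

    between = 0.0
    within = 0.0
    for group in groups:
        group_mean = sum(group) / len(group)
        # Each treatment contributes n_j times its squared mean deviation.
        between += len(group) * (group_mean - grand_mean) ** 2
        # Squared deviations of observations from their own treatment mean.
        within += sum((v - group_mean) ** 2 for v in group)

    total = sum((v - grand_mean) ** 2 for v in all_values)
    return between, within, total


# Made-up data with unequal group sizes, for illustration only.
groups = [[4.0, 5.0, 6.0], [7.0, 9.0], [1.0, 2.0, 3.0, 2.0]]
between, within, total = one_way_anova(groups)
```

On any data set the returned values satisfy total = between + within up to rounding, which gives a quick arithmetic check when filling in an ANOVA table by hand.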
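4. Two-way ANOVA
Total variation: $\sum_{i=1}^{r} \sum_{j=1}^{c} (X_{ij} - \bar{X})^2$
Between-blocks (rows) variation: $B_{row} = c \sum_{i=1}^{r} (\bar{X}_{i\cdot} - \bar{X})^2$
Between-treatments (columns) variation: $B_{col} = r \sum_{j=1}^{c} (\bar{X}_{\cdot j} - \bar{X})^2$
Residual (Error) variation: $\sum_{i=1}^{r} \sum_{j=1}^{c} (X_{ij} - \bar{X}_{i\cdot} - \bar{X}_{\cdot j} + \bar{X})^2$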
