
2008 Mock examination

ST102 Mock Paper


Elementary Statistical Theory

2007/8 syllabus only - not for resit candidates

Instructions to candidates
Time allowed: 3 hours
Full marks may be obtained for complete answers to FIVE questions.
Answer not more than THREE questions from Section A and not more than THREE
questions from Section B.
You are supplied with:

Graph Paper
Murdoch & Barnes Statistical Tables (4th Edition)

You may also use:

A hand-held calculator, which however must not be pre-programmed or able to display graphics, text or algebraic equations. The make and type of machine must be stated clearly on the front cover of the answer book.

© LSE 2008/ST102 Mock Paper


SECTION A
1. a) Let $\{A_i\}$ be a set of events in a sample space $S$.
The three axioms of probability may be stated as follows:

$p(A) \ge 0$ for all events $A$

$p(S) = 1$

$p\left(\bigcup_{i=1}^{\infty} A_i\right) = \sum_{i=1}^{\infty} p(A_i)$ if events $\{A_i\}$ are mutually exclusive.

i) Prove that $p(A^C) = 1 - p(A)$ for any event $A$, where $A^C$ is the event complementary to event $A$.

ii) Prove that if $A \subseteq B$ then $p(A) \le p(B)$ for events $A$ and $B$.

iii) State what it means for events $A$ and $B$ to be independent, and show that if $A$ and $B$ are independent, then $A^C$ and $B^C$ are independent.

[For parts i) and ii) no results about probabilities may be assumed unless derived from the axioms. For part iii), any additional results about probabilities that you use should be stated clearly but need not be proved. For each of i), ii) and iii), results from set theory should be stated clearly but need not be proved.]
[10 marks]
b) Assume that a calculator has a random number key and that when the key is
pressed, a whole number between 0 and 999 inclusive is generated at random,
all numbers being generated independently of one another.
i) What is the probability that the number generated is less than 300?

ii) If two numbers are generated, what is the probability that both are less than 300?

iii) If two numbers are generated, what is the probability that the first exceeds the second?

iv) If two numbers are generated, what is the probability that the first exceeds the second and their sum is exactly 300?

v) If five numbers are generated, what is the probability that at least one number occurs more than once?
[10 marks]
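The counting arguments behind part b) can be checked by brute force. The following is a minimal sketch, assuming the 1000 outcomes 0 to 999 are equally likely and independent as stated in the question; all variable names are illustrative and not part of the paper.

```python
from fractions import Fraction
from math import prod

N = 1000  # outcomes 0, 1, ..., 999, assumed equally likely

# i) and ii): a single number below 300, then two independent numbers below 300
p_less_300 = Fraction(300, N)
p_both_less_300 = p_less_300 ** 2

# iii) and iv): enumerate all ordered pairs (first, second)
pairs_first_exceeds = 0
pairs_first_exceeds_sum_300 = 0
for first in range(N):
    for second in range(N):
        if first > second:
            pairs_first_exceeds += 1
            if first + second == 300:
                pairs_first_exceeds_sum_300 += 1
p_first_exceeds = Fraction(pairs_first_exceeds, N * N)
p_sum_300 = Fraction(pairs_first_exceeds_sum_300, N * N)

# v): complement of "all five numbers are distinct"
p_all_distinct = Fraction(prod(range(N - 4, N + 1)), N ** 5)
p_repeat = 1 - p_all_distinct

print(p_less_300, p_both_less_300, p_first_exceeds, p_sum_300, float(p_repeat))
```

Exact fractions are used so that the enumeration can be compared directly with a hand calculation.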

2. a) $X$ and $Y$ are discrete random variables that can assume values 0, 1 and 2 only.
$p(X = x, Y = y) = A(x + y)$ for some constant $A$ and $x, y \in \{0, 1, 2\}$.

i) Draw up a table to describe the joint distribution of $X$ and $Y$ and find the value of the constant $A$.

ii) Describe the marginal distributions of $X$ and $Y$.

iii) Give the conditional distribution of $X \mid Y = 1$ and find $E(X \mid Y = 1)$.

iv) Are $X$ and $Y$ independent? Give reasons for your answer.
[11 marks]
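The steps in part a) (normalise, sum out one variable, condition, compare with the product of marginals) can be sketched numerically. This is an illustrative check only, assuming the joint probabilities are proportional to x + y on the nine points of {0, 1, 2} x {0, 1, 2}; the dictionary names are hypothetical.

```python
from fractions import Fraction

values = (0, 1, 2)

# Unnormalised weights x + y; A is the reciprocal of their total
weights = {(x, y): x + y for x in values for y in values}
A = Fraction(1, sum(weights.values()))
joint = {xy: A * w for xy, w in weights.items()}

# Marginal distributions: sum the joint table over the other variable
marg_x = {x: sum(joint[(x, y)] for y in values) for x in values}
marg_y = {y: sum(joint[(x, y)] for x in values) for y in values}

# Conditional distribution of X given Y = 1, and its expectation
cond_x_given_y1 = {x: joint[(x, 1)] / marg_y[1] for x in values}
e_x_given_y1 = sum(x * p for x, p in cond_x_given_y1.items())

# Independence requires joint = product of marginals at every point
independent = all(
    joint[(x, y)] == marg_x[x] * marg_y[y] for x in values for y in values
)

print(A, marg_x, cond_x_given_y1, e_x_given_y1, independent)
```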

b) The independent random variables $X_1$, $X_2$, and $X_3$ are each normally distributed with mean 0 and variance 4. Find

i) $p(X_1 > X_2 + X_3)$

ii) $p(X_1^2 > 9.25(X_2^2 + X_3^2))$

iii) $p(X_1 > 5(X_2^2 + X_3^2))$

[9 marks]

3. a) Show that the moment generating function (mgf) of a Poisson distribution with parameter $\lambda$ is given by
$m(t) = \exp[\lambda(\exp(t) - 1)]$

(writing $\exp(x) \equiv e^{x}$)

Hence or otherwise, show that the mean and variance of the distribution are both $\lambda$.
[10 marks]
b) Suppose $X$ and $Y$ are independent random variables both of which possess an mgf. Prove that the mgf of their sum is the product of their individual mgfs, i.e.
$m_{X+Y}(t) = m_X(t)\, m_Y(t)$
Hence or otherwise, show that if $X$ and $Y$ are Poisson random variables with parameters $\lambda_1$ and $\lambda_2$ respectively, then $X + Y$ is a Poisson random variable with parameter $\lambda_1 + \lambda_2$.
[6 marks]
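The additivity result in part b) can be sanity-checked numerically by convolving two Poisson probability functions and comparing with a single Poisson probability function. The sketch below is not a proof; the parameter values 1.5 and 2.5 are arbitrary illustrations and the function names are hypothetical.

```python
from math import exp, factorial


def poisson_pmf(k, lam):
    """P(X = k) for a Poisson random variable with parameter lam."""
    return exp(-lam) * lam ** k / factorial(k)


def pmf_of_sum(k, lam1, lam2):
    """P(X + Y = k) for independent Poisson X and Y, by direct convolution."""
    return sum(
        poisson_pmf(j, lam1) * poisson_pmf(k - j, lam2) for j in range(k + 1)
    )


lam1, lam2 = 1.5, 2.5  # illustrative values only

# Largest discrepancy between the convolution and Poisson(lam1 + lam2)
max_gap = max(
    abs(pmf_of_sum(k, lam1, lam2) - poisson_pmf(k, lam1 + lam2))
    for k in range(30)
)
print(max_gap)
```

The discrepancy should be at the level of floating-point rounding, which is consistent with the sum being Poisson with parameter equal to the sum of the two parameters.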
c) On a dangerous stretch of road, it has been found that in any three month period,
the average number of deaths and serious injuries to car drivers, pedestrians and
other road users (including cyclists) are 2, 3 and 4 respectively. Assume that the
numbers of deaths and serious injuries for each type of road user can be well
represented by a Poisson distribution. What is the probability of at most one death
or serious injury for any type of road user next month?
[4 marks]

4. A random variable $X$ has probability density function (pdf)

$$f(x) = \begin{cases} [\ldots], & 0 \le x \le 1 \\ [\ldots], & 1 < x \le 2 \\ 0, & \text{otherwise.} \end{cases}$$

a) Explain why $f(x)$ can serve as a pdf.
[3 marks]
b) Find the mean and median of the distribution.
[4 marks]
c) Find $\mathrm{Var}(X)$.
[2 marks]
d) Write down the cumulative distribution function of $X$.
[3 marks]
e) Find $p(X = 1)$ and $p(X > 1.5 \mid X > 0.5)$.
[4 marks]
f) What is the moment generating function of $X$?
[4 marks]

5. a) Show that if $X$ and $Y$ are independent random variables, then
$\mathrm{var}(X + Y) = \mathrm{var}(X) + \mathrm{var}(Y)$.
[5 marks]

b) A manufacturer produces vitamin pills whose masses are normally distributed with mean 0.5g and standard deviation 0.2g. The pills are sorted into jars each containing 100 pills. Each jar has mass precisely 50g and the pills in each jar are a random sample from the population.

i) Find the probability that if a jar containing 100 pills is chosen at random, its mass lies between 97g and 105g.

ii) Fifty jars are chosen at random. What is the probability that the mean mass of the fifty jars exceeds 100.2g?

iii) Fifty jars are placed in a box. What is the probability that the total mass of jars in the box, where each jar contains 100 pills, exceeds 5010g?

iv) Inadvertently, some jars contain 101 pills. What is the probability that a jar selected at random from those containing 100 pills has a greater mass than a jar selected at random from those containing 101 pills?
[15 marks]
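The calculations in part b) all reduce to finding the mean and variance of a sum of independent normal variables and then reading off a normal probability. The sketch below assumes that "mass of a jar" means the 50g empty jar plus its pills, and that pill masses are independent; it uses `statistics.NormalDist` from the Python standard library in place of statistical tables.

```python
from math import sqrt
from statistics import NormalDist

PILL_MEAN, PILL_SD, JAR_MASS = 0.5, 0.2, 50.0  # grams, from the question


def filled_jar(n_pills):
    """Distribution of the mass of a jar holding n_pills independent pills."""
    mean = JAR_MASS + n_pills * PILL_MEAN
    sd = PILL_SD * sqrt(n_pills)  # variances add for independent pills
    return NormalDist(mean, sd)


jar100 = filled_jar(100)
jar101 = filled_jar(101)

# i) one jar between 97g and 105g
p_i = jar100.cdf(105) - jar100.cdf(97)

# ii) mean of 50 jars exceeds 100.2g
mean_of_50 = NormalDist(jar100.mean, jar100.stdev / sqrt(50))
p_ii = 1 - mean_of_50.cdf(100.2)

# iii) total of 50 jars exceeds 5010g
total_of_50 = NormalDist(50 * jar100.mean, jar100.stdev * sqrt(50))
p_iii = 1 - total_of_50.cdf(5010)

# iv) a 100-pill jar is heavier than a 101-pill jar: difference of normals
difference = NormalDist(
    jar100.mean - jar101.mean, sqrt(jar100.variance + jar101.variance)
)
p_iv = 1 - difference.cdf(0)

print(round(p_i, 4), round(p_ii, 4), round(p_iii, 4), round(p_iv, 4))
```

Parts ii) and iii) describe the same event (a mean above 100.2g is a total above 5010g), so the two probabilities agree.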

SECTION B
6. (a) Let $\{X_1, \ldots, X_n\}$ be a sample from a population with probability density function $f(x, \theta_1, \theta_2)$, where $\theta_1, \theta_2$ are two unknown parameters. Write down the equations to be solved in order to obtain the method of moments estimators for $\theta_1$ and $\theta_2$. Define the maximum likelihood estimators for $\theta_1$ and $\theta_2$. Comment on why the maximum likelihood estimators are in general preferred to the method of moments estimators.
[6 marks]
(b) Let $\{X_1, \ldots, X_n\}$ be a sample from the uniform distribution on the interval $[a - \theta, a + \theta]$ ($\theta > 0$).
i. Find the method of moments estimators $\hat{a}$ and $\hat{\theta}$.
[6 marks]
ii. Is $\hat{a}$ an unbiased estimator for $a$?
[2 marks]
iii. Compute the mean square error $\mathrm{MSE}(\hat{a})$. Is $\hat{a}$ a consistent estimator?
[3 marks]
iv. For the sample $\{1, 2, 1, 3, 4\}$, compute $\hat{a}$ and $\hat{\theta}$.
[3 marks]

7. (a) A random sample of size $n = 35$ from $N(\mu, \sigma^2)$ yields the sample mean $\bar{X} = 14.6$ and the sample variance $S^2 = 20.15$.
i. Compute the maximum likelihood estimate for = /.
[3 marks]
ii. Find a 95% confidence interval for $\mu$.
[4 marks]
iii. Now assuming $\sigma^2 = 20.15$, recalculate a 95% confidence interval for $\mu$. Compare it with the one obtained above, and comment on the difference. How many more observations will be required in order to ensure that the length of the confidence interval does not exceed 2?
[8 marks]
(Hint. For Part i., you may use directly the known results on the MLEs for $\mu$ and $\sigma^2$.)
(b) Show that for any constants $c$ and $d$, it holds that
$$\sum_{i=1}^{n} (a_i - c)(b_i - d) = \sum_{i=1}^{n} (a_i - \bar{a})(b_i - \bar{b}) + n(\bar{a} - c)(\bar{b} - d),$$
where $\bar{a} = \frac{1}{n}\sum_{i=1}^{n} a_i$ and $\bar{b} = \frac{1}{n}\sum_{i=1}^{n} b_i$.
[5 marks]


8. (a) Let $\{X_1, \ldots, X_n\}$ and $\{Y_1, \ldots, Y_m\}$ be two independent random samples from, respectively, $N(\mu_x, \sigma^2)$ and $N(\mu_y, \sigma^2)$. Let $\bar{X}$, $S_x^2$ denote the sample mean and the sample variance of $\{X_1, \ldots, X_n\}$, and $\bar{Y}$, $S_y^2$ denote the sample mean and the sample variance of $\{Y_1, \ldots, Y_m\}$.
i. Write down the distribution of $\bar{X} - \bar{Y}$.
[2 marks]
ii. Write down the distribution of $\{(n - 1)S_x^2 + (m - 1)S_y^2\}/\sigma^2$.
[2 marks]
iii. Show that
$$\sqrt{\frac{n + m - 2}{1/n + 1/m}} \cdot \frac{\bar{X} - \bar{Y} - (\mu_x - \mu_y)}{\sqrt{(n - 1)S_x^2 + (m - 1)S_y^2}} \sim t_{n+m-2}.$$
[4 marks]
iv. Suppose $n = 15$, $m = 11$, $\bar{X} = 2.56$, $\bar{Y} = 2.34$, $S_x^2 = 2.6$ and $S_y^2 = 1.89$. Test the hypothesis $H_0: \mu_x = \mu_y$ against $H_1: \mu_x > \mu_y$.
[4 marks]
(b) Use the Wilcoxon signed-rank test to test at the 5% significance level if the population mean of Sample 1 is greater than that of Sample 2.

Sample 1   15   7  22  20  32  18  26  17  23  30
Sample 2    8  27  17  25  20  16  21  18  10   8
[8 marks]

9. (a) The numbers of calls received over 100 one-minute intervals at a telephone exchange are summarized as follows:

No. of calls   0   1   2   3   4   5   6   7   8   9
Frequency      2   8  12  13  24  22   9   3   5   2

i. An expert claims that the number of calls received per minute at this exchange follows the Poisson distribution with mean 3.5. Based on the above data, will you be able to reject the claim at the 5% significance level?
[6 marks]

ii. Test at the 5% significance level if the data follow a Poisson distribution with an unknown mean.
[6 marks]
(b) Below is the Minitab output of a two-way ANOVA in which some information is missing.

Two-way ANOVA: C1 versus C2, C3

Source   DF   SS        MS       F   P
C2       ?    487.23    ?        ?   ?
C3       ?    ?         184.09   ?   ?
Error    26   708.00    ?
Total    34   2115.70

i. Find the missing values C3 SS and C3 DF.
[2 marks]
ii. Is the C3-effect significant at the 1% significance level?
[3 marks]
iii. Is the C2-effect significant at the 1% significance level?
[3 marks]
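The missing ANOVA entries follow from two bookkeeping facts: sums of squares and degrees of freedom each add up to their totals, and every mean square is SS divided by DF. A minimal sketch of that arithmetic, using only the figures printed in the Minitab output (variable names are illustrative):

```python
# Values read directly from the partial ANOVA table
total_df, total_ss = 34, 2115.70
error_df, error_ss = 26, 708.00
c2_ss = 487.23
c3_ms = 184.09

# Sums of squares add up to the total
c3_ss = total_ss - c2_ss - error_ss

# MS = SS / DF, so DF = SS / MS (rounded to the nearest whole number)
c3_df = round(c3_ss / c3_ms)

# Degrees of freedom also add up to the total
c2_df = total_df - error_df - c3_df

# F statistics compare each effect's mean square with the error mean square
error_ms = error_ss / error_df
c2_ms = c2_ss / c2_df
f_c3 = c3_ms / error_ms
f_c2 = c2_ms / error_ms

print(round(c3_ss, 2), c3_df, c2_df, round(f_c3, 2), round(f_c2, 2))
```

Each F statistic would then be compared with the upper 1% point of the F distribution on the corresponding (effect DF, 26) degrees of freedom, taken from the statistical tables.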

10. (a) It has been claimed that watching TV reduces the amount of physical exercise, causing weight gains among children. The table below lists the excess weights (in pounds) of 8 randomly selected children together with the times (in hours) that they spend watching TV per week.

TV time (x)      42  34  25  35  37  38  31  33
Overweight (y)   18   6   0  -1  13  14   7   8

i. Plot the data $y$ against $x$.
[2 marks]
ii. Fit the data with a simple regression model $y = \beta_0 + \beta_1 x + \varepsilon$. Superimpose the fitted regression line in the plot obtained in i. above.
[5 marks]
iii. Test the hypothesis $H_0: \beta_1 = 0$ against $H_1: \beta_1 > 0$. What can be concluded on the children's overweight in relation to the TV-watching time?
[4 marks]
iv. Plot the residuals against $x$, and comment on the fitted regression model based on this residual plot.
[2 marks]
(b) Let $\{(x_i, y_i), 1 \le i \le n\}$ be observations from the linear regression model
$y_i = \beta_0 + \beta_1 x_i + \varepsilon_i$.
Let $\hat{\beta}_0$ and $\hat{\beta}_1$ be the least squares estimators, and $\hat{y}_i = \hat{\beta}_0 + \hat{\beta}_1 x_i$. Show that
$$\sum_{i=1}^{n} (y_i - \bar{y})^2 = \sum_{i=1}^{n} (y_i - \hat{y}_i)^2 + \sum_{i=1}^{n} (\hat{y}_i - \bar{y})^2.$$
[7 marks]
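The least squares fit in part (a) and the sum-of-squares decomposition in part (b) can be illustrated together. The sketch below assumes the eight (x, y) pairs are read in the order the table lists them (TV time paired with overweight column by column); it uses only plain Python, and the variable names are illustrative.

```python
# TV time (hours per week) and overweight (pounds) for the 8 children
x = [42, 34, 25, 35, 37, 38, 31, 33]
y = [18, 6, 0, -1, 13, 14, 7, 8]
n = len(x)

x_bar = sum(x) / n
y_bar = sum(y) / n

# Least squares slope and intercept
sxx = sum((xi - x_bar) ** 2 for xi in x)
sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
b1 = sxy / sxx
b0 = y_bar - b1 * x_bar

# Fitted values and the three sums of squares from part (b)
fitted = [b0 + b1 * xi for xi in x]
total_ss = sum((yi - y_bar) ** 2 for yi in y)
residual_ss = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
regression_ss = sum((fi - y_bar) ** 2 for fi in fitted)

print(round(b0, 3), round(b1, 4))
print(round(total_ss, 3), round(residual_ss + regression_ss, 3))
```

The two printed sums of squares agree up to rounding, which is the identity to be proved in part (b); the numerical check does not replace the algebraic argument.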

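Appendix: Formula Sheet

1. Discrete Distributions

Each entry gives the distribution, its probability function, its mean and its variance.

Binomial: $\frac{n!}{k!(n-k)!}\,\pi^k (1 - \pi)^{n-k}$, $k = 0, 1, \ldots, n$; mean $n\pi$; variance $n\pi(1 - \pi)$.

Geometric: $(1 - \pi)^{k-1}\pi$, $k = 1, 2, \ldots$; mean $1/\pi$; variance $(1 - \pi)/\pi^2$.

Negative binomial: $\frac{(k-1)!}{(r-1)!(k-r)!}\,\pi^r (1 - \pi)^{k-r}$, $k = r, r + 1, \ldots$; mean $r/\pi$; variance $r(1 - \pi)/\pi^2$.

Poisson: $\frac{\lambda^k}{k!} e^{-\lambda}$, $k = 0, 1, 2, \ldots$; mean $\lambda$; variance $\lambda$.

Hypergeometric: $\binom{S}{k}\binom{N-S}{n-k} \Big/ \binom{N}{n}$, $k = 0, 1, \ldots, n$; mean $\frac{nS}{N}$; variance $\frac{nS(N - S)(N - n)}{N^2 (N - 1)}$.
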

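2. Simple Linear Regression

Model: $y_i = \beta_0 + \beta_1 x_i + \varepsilon_i$.

LSEs: $\hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x}$, $\hat{\beta}_1 = \sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y}) \Big/ \sum_{j=1}^{n} (x_j - \bar{x})^2$, and
$$\mathrm{Var}(\hat{\beta}_0) = \frac{\sigma^2 \sum_{j=1}^{n} x_j^2}{n \sum_{i=1}^{n} (x_i - \bar{x})^2}, \qquad \mathrm{Var}(\hat{\beta}_1) = \frac{\sigma^2}{\sum_{i=1}^{n} (x_i - \bar{x})^2}, \qquad \mathrm{Cov}(\hat{\beta}_0, \hat{\beta}_1) = -\frac{\sigma^2 \bar{x}}{\sum_{i=1}^{n} (x_i - \bar{x})^2}.$$

Estimator for the variance of $\varepsilon_i$:
$$\hat{\sigma}^2 = \frac{1}{n - 2} \sum_{i=1}^{n} (y_i - \hat{\beta}_0 - \hat{\beta}_1 x_i)^2.$$

Regression ANOVA:
$$\text{Total SS} = \sum_{i=1}^{n} (y_i - \bar{y})^2, \qquad \text{Regression SS} = \hat{\beta}_1^2 \sum_{i=1}^{n} (x_i - \bar{x})^2, \qquad \text{Residual SS} = \sum_{i=1}^{n} (y_i - \hat{\beta}_0 - \hat{\beta}_1 x_i)^2.$$

Squared regression correlation coefficients:
$$R^2 = \frac{\text{Regression SS}}{\text{Total SS}}, \qquad R^2_{\text{adj}} = 1 - \frac{(\text{Residual SS})/(n - 2)}{(\text{Total SS})/(n - 1)}.$$

For a given $x$, the expectation of $y$ is $\mu(x) = \beta_0 + \beta_1 x$. A $(1 - \alpha)$ confidence interval for $\mu(x)$ is
$$\hat{\beta}_0 + \hat{\beta}_1 x \pm t_{\alpha/2,\, n-2}\, \hat{\sigma} \left\{ \frac{\sum_{i=1}^{n} (x_i - x)^2}{n \sum_{j=1}^{n} (x_j - \bar{x})^2} \right\}^{1/2},$$
and a predictive interval which covers $y$ with probability $(1 - \alpha)$ is
$$\hat{\beta}_0 + \hat{\beta}_1 x \pm t_{\alpha/2,\, n-2}\, \hat{\sigma} \left\{ 1 + \frac{\sum_{i=1}^{n} (x_i - x)^2}{n \sum_{j=1}^{n} (x_j - \bar{x})^2} \right\}^{1/2}.$$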

3. One-way ANOVA

Total variation: $\sum_{j=1}^{k} \sum_{i=1}^{n_j} (X_{ij} - \bar{X})^2$

Between-treatments variation: $B = \sum_{j=1}^{k} n_j (\bar{X}_j - \bar{X})^2$

Within-treatments variation: $W = \sum_{j=1}^{k} \sum_{i=1}^{n_j} (X_{ij} - \bar{X}_j)^2$

4. Two-way ANOVA

Total variation: $\sum_{i=1}^{r} \sum_{j=1}^{c} (X_{ij} - \bar{X})^2$

Between-blocks (rows) variation: $B_{\text{row}} = c \sum_{i=1}^{r} (\bar{X}_{i\cdot} - \bar{X})^2$

Between-treatments (columns) variation: $B_{\text{col}} = r \sum_{j=1}^{c} (\bar{X}_{\cdot j} - \bar{X})^2$

Residual (Error) variation: $\sum_{i=1}^{r} \sum_{j=1}^{c} (X_{ij} - \bar{X}_{i\cdot} - \bar{X}_{\cdot j} + \bar{X})^2$
