
MAST20005 Statistics, Assignment 3

Brendan Hill - Student 699917 (Tutorial Thursday 10am)

November 19, 2016

Question 1
(a)
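The likelihood ratio test is based on the following:

$$\lambda = \frac{L(\theta_0)}{L(\hat{\theta}_{ML})} \le k$$

Since the MLE of $\theta$ in this case is $\bar{X}$, the ratio is determined as follows (where $\prod$ means $\prod_{i=1}^{n}$ and $\sum$ means $\sum_{i=1}^{n}$ throughout):

$$
\begin{aligned}
\lambda &= \frac{L(\theta_0)}{L(\bar{x})} \\
&= \frac{\prod \left[\theta_0^{x_i} (1 - \theta_0)^{1 - x_i}\right]}{\prod \left[\bar{x}^{x_i} (1 - \bar{x})^{1 - x_i}\right]} \\
&= \frac{\theta_0^{\sum x_i} (1 - \theta_0)^{n - \sum x_i}}{\bar{x}^{\sum x_i} (1 - \bar{x})^{n - \sum x_i}} \\
&= \frac{\theta_0^{y} (1 - \theta_0)^{n - y}}{(y/n)^{y} (1 - y/n)^{n - y}} \qquad \text{(where } y = \textstyle\sum x_i \text{)} \\
&= \left(\frac{\theta_0}{y/n}\right)^{y} \left(\frac{1 - \theta_0}{1 - y/n}\right)^{n - y} \\
&= \left(\frac{n\theta_0}{y}\right)^{y} \left(\frac{n - n\theta_0}{n - y}\right)^{n - y}
\end{aligned}
$$

The ratio depends on the sample only through $y$. Therefore the likelihood ratio test is based on the statistic $Y = \sum X_i$.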
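As an informal check on the final expression, the sketch below evaluates the ratio as a function of $y$ alone. The function name `likelihood_ratio` and the example values $n = 100$, $\theta_0 = 0.5$ are illustrative assumptions, and the formula is only evaluated for $0 < y < n$.

```python
def likelihood_ratio(y, n, theta0):
    """lambda(y) = (n*theta0/y)^y * ((n - n*theta0)/(n - y))^(n - y), for 0 < y < n."""
    return (n * theta0 / y) ** y * ((n - n * theta0) / (n - y)) ** (n - y)

# Illustrative values (assumed): n = 100 trials, theta0 = 0.5.
n, theta0 = 100, 0.5
at_null = likelihood_ratio(50, n, theta0)   # y equal to n * theta0
away_40 = likelihood_ratio(40, n, theta0)   # y further from n * theta0
away_45 = likelihood_ratio(45, n, theta0)   # y closer to n * theta0

print(at_null, away_45, away_40)
```

The ratio is largest when $y = n\theta_0$ and shrinks as $y$ moves away from it, which is why small values of $\lambda$ correspond to extreme values of $Y$.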

(b)
When $H_0$ is true, $Y$ is the sum of $n$ i.i.d. Bernoulli trials with $p = \theta_0$. Therefore, when $H_0$ is true, $Y \sim \mathrm{Bi}(n, \theta_0)$.

(c)
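Using the normal approximation, $Y \approx N(\mu = n\theta_0, \sigma^2 = n\theta_0(1 - \theta_0))$.
Hence for $n = 100$, $\theta_0 = \frac{1}{2}$, the following test rejects $H_0$ at the 0.05 significance level:

$$z = \frac{|Y - n\theta_0|}{\sqrt{n\theta_0(1 - \theta_0)}} \ge z_{0.025} \quad\Rightarrow\quad z = \frac{|Y - 50|}{5} \ge 1.96$$

In terms of a critical region for $Y$, this is equivalent to $|Y - 50| \ge 9.8$, which for integer-valued $Y$ gives:

$$c = 40$$

where the critical region is $Y \le 40$ or $Y \ge 60$.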
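The arithmetic behind this critical region can be reproduced with a short sketch. It assumes Python's standard `statistics.NormalDist` for the normal quantile; the variable names are illustrative only.

```python
import math
from statistics import NormalDist

n, theta0, alpha = 100, 0.5, 0.05

mean = n * theta0                              # 50
sd = math.sqrt(n * theta0 * (1 - theta0))      # 5
z_crit = NormalDist().inv_cdf(1 - alpha / 2)   # upper 2.5% point, about 1.96

# Reject when |Y - mean| >= z_crit * sd; Y is an integer count,
# so round the boundaries inward to the nearest attainable values.
lower = math.floor(mean - z_crit * sd)   # largest Y in the lower tail
upper = math.ceil(mean + z_crit * sd)    # smallest Y in the upper tail

print(f"Reject H0 when Y <= {lower} or Y >= {upper}")
```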

Question 2
(a)
Under the null hypothesis, where the median of $X$ is 0, each observation is $> 0$ with probability 0.5. Hence the sign test statistic $S(0) = \sum_{i=1}^{n} I(X_i > 0)$ has a binomial distribution with $n = 25$, $p = 0.5$, i.e. $S(0) \sim \mathrm{Bi}(25, 0.5)$.
So the significance level of a test which rejects $H_0$ when $S(0) \ge 16$ is given by $P(S(0) \ge 16 \mid H_0) = 1 - P(S(0) \le 15 \mid H_0) = 0.11476$.

(b)
Let $H_2$ be the hypothesis that $X \sim N(0.1, 1)$. Then $P(X > 0 \mid H_2) = \Phi(0.1) = 0.53983$, so under $H_2$ the statistic $S(0)$ follows a binomial distribution with parameters $n = 25$, $p = 0.53983$.
Then the power of the test $S(0) \ge 16$ is $1 - P(S(0) \le 15 \mid H_2) = 0.21150$.
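Both binomial tail probabilities in (a) and (b) can be recomputed directly. The following is a minimal sketch using only the Python standard library; the helper name `binom_upper_tail` is a hypothetical one introduced here.

```python
from math import comb
from statistics import NormalDist

def binom_upper_tail(n, p, k):
    """P(S >= k) for S ~ Binomial(n, p), summed term by term."""
    return sum(comb(n, s) * p**s * (1 - p) ** (n - s) for s in range(k, n + 1))

n, cutoff = 25, 16

# (a) Size of the test: success probability 0.5 under H0.
size = binom_upper_tail(n, 0.5, cutoff)

# (b) Power: under X ~ N(0.1, 1), P(X > 0) = 1 - P(X <= 0).
p_alt = 1 - NormalDist(mu=0.1, sigma=1).cdf(0)
power = binom_upper_tail(n, p_alt, cutoff)

print(round(size, 5), round(p_alt, 5), round(power, 5))
```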

(c)
Note that this question relies on the assumption that $X$ is normally distributed (or at least symmetric), so that the median equals the mean. This assumption has been confirmed with the lecturer even though it is not stated explicitly in the assignment.

Note that by the central limit theorem, for $\sigma_X^2 = 1$, we have that $\bar{X} \approx N(\mu_X, 1/n)$.

Let the null hypothesis $H_0$ be that $\mu_X = 0$. Hence under $H_0$, $(\bar{X} - 0)/\sqrt{1/25} = 5\bar{X} \approx N(0, 1)$.

Let $Z = \bar{X}/\sqrt{1/n} = 5\bar{X}$, so $Z \approx N(0, 1)$. Then the following test has significance level 0.11476:

$$Z \ge z_{0.11476} = 1.201596$$

(d)
If $X \sim N(0.1, 1)$ ($H_1$), then the power of the test in (c) under $H_1$ is given by:

$$
\begin{aligned}
P(Z \ge 1.201596) &= P(5\bar{X} \ge 1.201596) \\
&= P(\bar{X} \ge (1/5) \cdot 1.201596) \\
&= P(\bar{X} - 0.1 \ge (1/5) \cdot 1.201596 - 0.1) \\
&= P(5(\bar{X} - 0.1) \ge 1.201596 - 0.5) \\
&= P(5(\bar{X} - 0.1) \ge 0.701596) \\
&= 1 - \Phi(0.701596) \qquad \text{since } 5(\bar{X} - 0.1) \approx N(0, 1) \text{ under } H_1 \\
&= 0.2414656
\end{aligned}
$$
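The critical value in (c) and the power in (d) can be checked with the sketch below. It assumes `statistics.NormalDist` from the Python standard library and takes the significance level 0.11476 from part (a); the variable names are illustrative.

```python
from statistics import NormalDist

std_normal = NormalDist()
n, alpha, mu_alt = 25, 0.11476, 0.1

# (c) Upper-tail critical value with the same size as the sign test.
z_crit = std_normal.inv_cdf(1 - alpha)

# (d) Under H1, Z = 5 * Xbar has mean 5 * 0.1 = 0.5 and variance 1.
shift = mu_alt * n ** 0.5
power = 1 - std_normal.cdf(z_crit - shift)

print(round(z_crit, 4), round(power, 4))
```

At the same significance level, this power (about 0.24) is slightly higher than the 0.21 obtained for the sign test in (b).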

Question 3
(a)
Since $X \sim \mathrm{Exp}(\lambda = 1)$, $F(x) = 1 - e^{-x}$. Hence if $y = F(x)$ then:

$$
\begin{aligned}
y &= 1 - e^{-x} \\
e^{-x} &= 1 - y \\
-x &= \log(1 - y) \\
x &= -\log(1 - y), \qquad y \in (0, 1)
\end{aligned}
$$

In other words, $F^{-1}(y) = -\log(1 - y)$.

(b)
$H_0$ is the hypothesis that $X \sim \mathrm{Exp}(1)$.
Five equally probable class intervals for $X$ are obtained by evaluating $F^{-1}(y)$ at $y \in \{0, 0.2, 0.4, 0.6, 0.8, 1.0\}$, hence:

(0, 0.223144], (0.223144, 0.510826], (0.510826, 0.916291], (0.916291, 1.60944], (1.60944, ∞)
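The interior cut points can be reproduced from the inverse CDF derived in (a). This is a small sketch; the function name `exp1_quantile` is a hypothetical label for $F^{-1}$.

```python
import math

def exp1_quantile(y):
    """Inverse CDF of Exp(1): F^{-1}(y) = -log(1 - y), for 0 <= y < 1."""
    return -math.log(1 - y)

k = 5  # number of equally probable classes
cuts = [exp1_quantile(i / k) for i in range(1, k)]

print([round(c, 6) for c in cuts])
```

The outer endpoints are 0 and infinity, so only the four interior boundaries need to be computed.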

(c)
Using these 5 categories, the following test statistic can be used to test at the $\alpha = 0.05$ significance level:

$$Q = \frac{(6 - 8)^2}{8} + \frac{(16 - 8)^2}{8} + \frac{(7 - 8)^2}{8} + \frac{(6 - 8)^2}{8} + \frac{(5 - 8)^2}{8} = 10.25$$

with a critical region:

$$Q \ge c = \chi^2_{0.05}(4) = 9.487729$$

Since $Q > c$, we reject $H_0$ at the 0.05 significance level.
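The statistic is simple enough to verify by hand, but a short sketch makes the bookkeeping explicit. The observed counts are those appearing in the expression for $Q$; the critical value is copied from the text rather than computed, since the Python standard library has no chi-square quantile function.

```python
observed = [6, 16, 7, 6, 5]                # class counts used in Q above
expected = sum(observed) / len(observed)   # 40 observations / 5 classes = 8

q = sum((o - expected) ** 2 / expected for o in observed)

critical = 9.487729   # chi-square upper 5% point on 4 d.f., as quoted in the text
reject = q >= critical

print(q, reject)
```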

Question 4
(a)
Let $H_0$ be the hypothesis that the median $m = 130$, and $H_1$ be the alternative hypothesis that $m > 130$.

Let $Y$ equal the number of observations such that $(x_i - 130) > 0$. Hence under $H_0$, since 130 is the median, $Y \sim \mathrm{Bi}(8, 0.5)$.

The critical region under $H_0$ at significance 0.05 is given by $Y > 6$ (i.e. $Y \ge 7$), since $P(Y \ge 7 \mid H_0) = 9/256 \approx 0.035 \le 0.05$ whereas $P(Y \ge 6 \mid H_0) = 37/256 \approx 0.145$.

The dataset provided yields $y = 3$, hence we do not reject $H_0$ at this significance level.
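The choice of cutoff can be checked by searching for the smallest value whose upper-tail probability under Bi(8, 0.5) does not exceed 0.05. This is a sketch with a hypothetical helper `upper_tail`; the observed count $y = 3$ is taken from the text.

```python
from math import comb

def upper_tail(n, k):
    """P(Y >= k) for Y ~ Binomial(n, 0.5)."""
    return sum(comb(n, s) for s in range(k, n + 1)) / 2 ** n

n, alpha = 8, 0.05

# Smallest cutoff c such that P(Y >= c) <= alpha under H0.
c = min(k for k in range(n + 2) if upper_tail(n, k) <= alpha)

y_observed = 3   # count reported in the text
reject = y_observed >= c

print(c, upper_tail(n, c), reject)
```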

(b)
Let $X$ represent the sample of unexposed babies and $Y$ the sample of exposed babies. Note that $n_x = 7$ and $n_y = 8$.

Let the null hypothesis $H_0$ be $m_x = m_y$, and the alternative hypothesis $H_1$ be that $m_x < m_y$.
The normal approximation of the distribution of the $W$ statistic has:

$$\mu_W = \frac{n_y (n_x + n_y + 1)}{2} = 64$$

$$\mathrm{Var}(W) = \frac{n_x n_y (n_x + n_y + 1)}{12} = 74.667$$

Hence the critical region for $W$ at the 0.01 significance level is:

$$W \ge 64 + z_{0.01} \sqrt{74.667} = 84.10199$$

The value of the $W$ statistic for the observations provided is calculated as follows:

Dataset   Observations                          Rank
X         8, 11, 12, 14, 20, 43, 111            1, 2, 3, 4, 5, 7, 11
Y         35, 56, 83, 92, 128, 150, 176, 208    6, 8, 9, 10, 12, 13, 14, 15

Now $w = \sum y_{\text{rank}} = 87 > 84.10199$. Hence we reject $H_0$ at the 0.01 significance level.
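The ranking and the normal-approximation cutoff can be reproduced as follows. This is a minimal sketch using the observations listed above; the simple dictionary-based ranking assumes there are no ties, which holds for this dataset.

```python
import math
from statistics import NormalDist

x = [8, 11, 12, 14, 20, 43, 111]             # unexposed sample
y = [35, 56, 83, 92, 128, 150, 176, 208]     # exposed sample

# Rank the pooled observations (valid only because there are no ties here).
pooled = sorted(x + y)
rank = {value: i + 1 for i, value in enumerate(pooled)}
w = sum(rank[v] for v in y)                  # rank sum of the Y sample

nx, ny = len(x), len(y)
mu_w = ny * (nx + ny + 1) / 2
var_w = nx * ny * (nx + ny + 1) / 12
critical = mu_w + NormalDist().inv_cdf(0.99) * math.sqrt(var_w)

print(w, mu_w, round(var_w, 3), round(critical, 3))
```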

Question 5
(a)
Let $\mu_X$ and $\mu_Y$ represent the means of $X$ and $Y$ respectively. Then, let the null hypothesis $H_0$ be that $\mu_X = \mu_Y = \mu$, where $X \sim N(\mu, \sigma^2)$ and $Y \sim N(\mu, \sigma^2)$. The alternative hypothesis $H_1$ is that $\mu_X \ne \mu_Y$.
The following test statistic can be used to test $H_0$ at the 0.05 significance level:

$$T = \frac{|\bar{X} - \bar{Y}|}{S_P \sqrt{1/n + 1/m}} \ge t_{0.025}(n + m - 2)$$

where $n$ and $m$ are the sample sizes of $X$ and $Y$ respectively, and the pooled standard deviation is

$$S_P = \sqrt{\frac{(n - 1) S_X^2 + (m - 1) S_Y^2}{n + m - 2}}$$
Y

For the sample given we have $n = 14$, $m = 5$, $\bar{x} = 12.56$, $s_x^2 = 24.65$, $\bar{y} = 17.32$, $s_y^2 = 11.01$, hence the test statistic is:

$$S_P = \sqrt{\frac{13 \cdot 24.65 + 4 \cdot 11.01}{17}} = 4.6304$$

$$t = \frac{|12.56 - 17.32|}{4.6304 \cdot \sqrt{1/14 + 1/5}} = 1.97315$$

with the critical region given by $t \ge c$, where:

$$c = t_{0.025}(17) = 2.109816$$

Since $1.97315 < 2.109816$, we do not reject $H_0$ at the 0.05 significance level.

(b)
The two-sided p-value is $p = P(|T| \ge 1.97315) = 0.064964$, where $T \sim t(17)$.

(c)
Endpoints of the 95% confidence interval for $\mu_x - \mu_y$ are given by:

$$(\bar{x} - \bar{y}) \pm t_{0.025}(17) \, S_P \sqrt{1/14 + 1/5}$$

Hence we are 95% confident that the true difference lies in the following interval:

$$[-9.84968, 0.329682]$$

Note that this interval includes 0, reflecting the fact that we failed to reject the null hypothesis in (a).
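The pooled statistic from (a) and the interval from (c) can be recomputed from the summary statistics. In this sketch the $t$ quantile is copied from the text (the Python standard library does not provide one), and the variable names are illustrative.

```python
import math

n, m = 14, 5
xbar, s2x = 12.56, 24.65
ybar, s2y = 17.32, 11.01
t_crit = 2.109816   # t_{0.025}(17), as quoted in the text

# Pooled standard deviation and standard error of the difference in means.
sp = math.sqrt(((n - 1) * s2x + (m - 1) * s2y) / (n + m - 2))
se = sp * math.sqrt(1 / n + 1 / m)

t_stat = abs(xbar - ybar) / se
ci = (xbar - ybar - t_crit * se, xbar - ybar + t_crit * se)

print(round(sp, 4), round(t_stat, 5), [round(e, 5) for e in ci])
```

Because the interval and the test use the same standard error and quantile, the interval contains 0 exactly when the test fails to reject.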

(d)
The following $F$ statistic can be used to test the hypothesis $\sigma_X^2 = \sigma_Y^2$ at the 0.05 significance level, rejecting when:

$$F = \frac{s_x^2}{s_y^2} \notin [f_{0.975}(n - 1, m - 1),\ f_{0.025}(n - 1, m - 1)]$$

where $f_\alpha$ denotes the upper $\alpha$ point, matching the $z_\alpha$ and $t_\alpha$ notation used earlier.

For the samples provided, the value of the test statistic is:

$$f = \frac{24.65}{11.01} = 2.238874$$

and the critical region is:

$$(0, 0.2502567) \cup (8.7149963, \infty)$$

However, since $0.2502567 < 2.238874 < 8.7149963$, there is not sufficient evidence to reject the hypothesis that $\sigma_X^2 = \sigma_Y^2$ at the 0.05 significance level.
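The variance-ratio check amounts to one division and two comparisons, sketched below. The two critical values are copied from the text, not computed, and the sample variances are the ones given in part (a).

```python
s2x, s2y = 24.65, 11.01                    # sample variances from part (a)
f_lower, f_upper = 0.2502567, 8.7149963    # critical values quoted in the text

f_stat = s2x / s2y
reject = not (f_lower <= f_stat <= f_upper)

print(round(f_stat, 6), reject)
```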

Question 6
The null hypothesis $H_0$ is that the angle of pull does not affect the separation force required. The alternative hypothesis $H_1$ states that the angle of pull does affect the separation force required.

The relevant test statistic, with its rejection region, is:

$$F = \frac{SS(A)/3}{SS(E)/12} \ge f_{0.05}(3, 12) = 3.490295$$

Using the "anova" function in R to analyze variance by the A factor yields the following results:

$$SS(A) = 58.157, \qquad SS(E) = 91.005$$

Hence

$$f = 2.5562$$

Since $2.5562 < 3.490295$, we do not reject $H_0$ at the 0.05 significance level. So there is not enough evidence to suggest that the angle of pull affects the separation force required.

Note that the p-value for this result is 0.104162.

Note that I have assumed normality and equal variances of the underlying data.
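The $F$ ratio can be recovered from the two sums of squares reported above. This sketch only redoes that arithmetic: it does not recompute the ANOVA table from raw data, and the critical value is copied from the text.

```python
ss_a, df_a = 58.157, 3     # treatment sum of squares and degrees of freedom
ss_e, df_e = 91.005, 12    # error sum of squares and degrees of freedom
f_crit = 3.490295          # f_{0.05}(3, 12), as quoted in the text

f_stat = (ss_a / df_a) / (ss_e / df_e)
reject = f_stat >= f_crit

print(round(f_stat, 4), reject)
```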

Question 7
(a)
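Note that $E(X_{ij}) = E(\mu + \alpha_i + \varepsilon_{ij}) = \mu + \alpha_i$, so:

$$
\begin{aligned}
E(\bar{X}_{\cdot\cdot}) &= E\left(\frac{1}{n} \sum_{i=1}^{m} \sum_{j=1}^{n_i} X_{ij}\right) \\
&= \frac{1}{n} \sum_{i=1}^{m} \sum_{j=1}^{n_i} E(X_{ij}) \\
&= \frac{1}{n} \sum_{i=1}^{m} \sum_{j=1}^{n_i} (\mu + \alpha_i) \\
&= \frac{1}{n} \sum_{i=1}^{m} n_i [\mu + \alpha_i] \\
&= \frac{1}{n}(n\mu) + \frac{1}{n} \sum_{i=1}^{m} n_i \alpha_i \\
&= \mu + \frac{1}{n} \sum_{i=1}^{m} n_i \alpha_i
\end{aligned}
$$

$\bar{X}_{\cdot\cdot}$ will be an unbiased estimator of $\mu$ when $\frac{1}{n} \sum_{i=1}^{m} n_i \alpha_i = 0$. Specifically, when all values of $n_i$ are equal (say $n_0$), this becomes $\frac{1}{n} \sum_{i=1}^{m} n_0 \alpha_i = \frac{1}{n} n_0 \sum_{i=1}^{m} \alpha_i = 0$, since $\sum_{i=1}^{m} \alpha_i = 0$.

So $\bar{X}_{\cdot\cdot}$ is an unbiased estimator of $\mu$ when the sample sizes $n_i$ for all categories are the same.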

(b)

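$$
\begin{aligned}
\sum_{i=1}^{m} n_i (\bar{X}_{i\cdot} - \bar{X}_{\cdot\cdot})^2 &= \sum_{i=1}^{m} n_i (\bar{X}_{i\cdot}^2 - 2\bar{X}_{i\cdot}\bar{X}_{\cdot\cdot} + \bar{X}_{\cdot\cdot}^2) \\
&= \sum_{i=1}^{m} n_i \bar{X}_{i\cdot}^2 - 2\bar{X}_{\cdot\cdot} \sum_{i=1}^{m} n_i \bar{X}_{i\cdot} + \bar{X}_{\cdot\cdot}^2 \sum_{i=1}^{m} n_i \\
&= \sum_{i=1}^{m} n_i \bar{X}_{i\cdot}^2 - 2\bar{X}_{\cdot\cdot} \sum_{i=1}^{m} n_i \left[\frac{1}{n_i} \sum_{j=1}^{n_i} X_{ij}\right] + n\bar{X}_{\cdot\cdot}^2 \\
&= \sum_{i=1}^{m} n_i \bar{X}_{i\cdot}^2 - 2\bar{X}_{\cdot\cdot} \sum_{i=1}^{m} \sum_{j=1}^{n_i} X_{ij} + n\bar{X}_{\cdot\cdot}^2 \\
&= \sum_{i=1}^{m} n_i \bar{X}_{i\cdot}^2 - 2\bar{X}_{\cdot\cdot} [n\bar{X}_{\cdot\cdot}] + n\bar{X}_{\cdot\cdot}^2 \\
&= \sum_{i=1}^{m} n_i \bar{X}_{i\cdot}^2 - 2n\bar{X}_{\cdot\cdot}^2 + n\bar{X}_{\cdot\cdot}^2 \\
&= \sum_{i=1}^{m} n_i \bar{X}_{i\cdot}^2 - n\bar{X}_{\cdot\cdot}^2
\end{aligned}
$$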
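The identity can be sanity-checked numerically on a small unbalanced layout. The data below are made up purely for illustration and are not from the assignment.

```python
# Made-up unbalanced data: three groups of sizes 3, 2 and 4 (illustrative only).
groups = [[4.0, 6.0, 5.0], [10.0, 12.0], [1.0, 2.0, 3.0, 6.0]]

n = sum(len(g) for g in groups)
grand_mean = sum(sum(g) for g in groups) / n
group_means = [sum(g) / len(g) for g in groups]

# Left-hand side: sum of n_i * (group mean - grand mean)^2.
lhs = sum(len(g) * (gm - grand_mean) ** 2 for g, gm in zip(groups, group_means))

# Right-hand side: sum of n_i * (group mean)^2 minus n * (grand mean)^2.
rhs = sum(len(g) * gm ** 2 for g, gm in zip(groups, group_means)) - n * grand_mean ** 2

print(lhs, rhs)
```

The two sides agree up to floating-point rounding, as the algebra predicts for any group sizes.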

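(c)
Required to show: that $\frac{1}{n - m} \sum_{i=1}^{m} \sum_{j=1}^{n_i} (X_{ij} - \bar{X}_{i\cdot})^2$ is an unbiased estimator of $\sigma^2$.

First, note that:

$$
\begin{aligned}
E[X_{ij}] &= \mu + \alpha_i \\
\mathrm{Var}[X_{ij}] &= \sigma^2 \\
E[X_{ij}^2] &= \mathrm{Var}[X_{ij}] + E[X_{ij}]^2 = \sigma^2 + (\mu + \alpha_i)^2
\end{aligned}
$$

Also, since $\bar{X}_{i\cdot} = \frac{1}{n_i} \sum_{j=1}^{n_i} X_{ij}$, we have:

$$
\begin{aligned}
E[\bar{X}_{i\cdot}] &= \mu + \alpha_i \\
\mathrm{Var}[\bar{X}_{i\cdot}] &= \frac{\sigma^2}{n_i} \\
E[\bar{X}_{i\cdot}^2] &= \mathrm{Var}[\bar{X}_{i\cdot}] + E[\bar{X}_{i\cdot}]^2 = \frac{\sigma^2}{n_i} + (\mu + \alpha_i)^2
\end{aligned}
$$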
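Now, the expectation of the expression under consideration is calculated as follows:

$$
\begin{aligned}
E\left[\frac{1}{n - m} \sum_{i=1}^{m} \sum_{j=1}^{n_i} (X_{ij} - \bar{X}_{i\cdot})^2\right]
&= \frac{1}{n - m} E\left[\sum_{i=1}^{m} \sum_{j=1}^{n_i} (X_{ij} - \bar{X}_{i\cdot})^2\right] \\
&= \frac{1}{n - m} E\left[\sum_{i=1}^{m} \sum_{j=1}^{n_i} (X_{ij}^2 - 2X_{ij}\bar{X}_{i\cdot} + \bar{X}_{i\cdot}^2)\right] \\
&= \frac{1}{n - m} E\left[\sum_{i=1}^{m} \left(\Big[\sum_{j=1}^{n_i} X_{ij}^2\Big] - 2\bar{X}_{i\cdot}\Big[\sum_{j=1}^{n_i} X_{ij}\Big] + n_i \bar{X}_{i\cdot}^2\right)\right] \\
&= \frac{1}{n - m} E\left[\sum_{i=1}^{m} \left(\Big[\sum_{j=1}^{n_i} X_{ij}^2\Big] - 2\bar{X}_{i\cdot}[n_i \bar{X}_{i\cdot}] + n_i \bar{X}_{i\cdot}^2\right)\right] \qquad \text{since } n_i \bar{X}_{i\cdot} = \sum_{j=1}^{n_i} X_{ij} \\
&= \frac{1}{n - m} E\left[\sum_{i=1}^{m} \left(\Big[\sum_{j=1}^{n_i} X_{ij}^2\Big] - 2n_i \bar{X}_{i\cdot}^2 + n_i \bar{X}_{i\cdot}^2\right)\right] \\
&= \frac{1}{n - m} E\left[\sum_{i=1}^{m} \left(\Big[\sum_{j=1}^{n_i} X_{ij}^2\Big] - n_i \bar{X}_{i\cdot}^2\right)\right] \\
&= \frac{1}{n - m} E\left[\sum_{i=1}^{m} \sum_{j=1}^{n_i} \left[X_{ij}^2 - \bar{X}_{i\cdot}^2\right]\right] \\
&= \frac{1}{n - m} \sum_{i=1}^{m} \sum_{j=1}^{n_i} E\left[X_{ij}^2 - \bar{X}_{i\cdot}^2\right] \\
&= \frac{1}{n - m} \sum_{i=1}^{m} \sum_{j=1}^{n_i} \left(E[X_{ij}^2] - E[\bar{X}_{i\cdot}^2]\right) \\
&= \frac{1}{n - m} \sum_{i=1}^{m} \sum_{j=1}^{n_i} \left(\big(\sigma^2 + (\mu + \alpha_i)^2\big) - \big(\tfrac{\sigma^2}{n_i} + (\mu + \alpha_i)^2\big)\right) \\
&= \frac{1}{n - m} \sum_{i=1}^{m} \sum_{j=1}^{n_i} \left(\sigma^2 - \frac{\sigma^2}{n_i}\right) \\
&= \frac{1}{n - m} \, \sigma^2 \sum_{i=1}^{m} \sum_{j=1}^{n_i} \left(1 - \frac{1}{n_i}\right) \\
&= \frac{1}{n - m} \, \sigma^2 \sum_{i=1}^{m} (n_i - 1) \\
&= \frac{1}{n - m} \, \sigma^2 (n - m) \\
&= \sigma^2
\end{aligned}
$$

Hence, it is an unbiased estimator of $\sigma^2$.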
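A simulation offers an informal check of this result. The sketch below is illustrative only: the group sizes, effects $\alpha_i$, $\mu$, $\sigma$, the seed and the helper name `pooled_variance` are all assumptions chosen for the example, and a Monte Carlo average only approximates the expectation.

```python
import random

def pooled_variance(groups):
    """Within-group sum of squares divided by (n - m)."""
    n = sum(len(g) for g in groups)
    m = len(groups)
    ss = 0.0
    for g in groups:
        mean = sum(g) / len(g)
        ss += sum((x - mean) ** 2 for x in g)
    return ss / (n - m)

# Assumed illustrative model: X_ij = mu + alpha_i + eps_ij, eps_ij ~ N(0, sigma^2).
rng = random.Random(20005)    # fixed seed so the run is repeatable
mu, sigma = 10.0, 2.0
alphas = [-1.0, 0.25, 0.75]   # effects summing to zero
sizes = [3, 5, 4]             # unequal group sizes
reps = 20000

estimates = []
for _ in range(reps):
    groups = [[rng.gauss(mu + a, sigma) for _ in range(k)]
              for a, k in zip(alphas, sizes)]
    estimates.append(pooled_variance(groups))

average = sum(estimates) / reps
print(round(average, 3), sigma ** 2)
```

With these settings the average of the estimates should land close to $\sigma^2 = 4$, even though the group sizes differ.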