Solutions To Problems: X 0 ZX X y

1
Solutions to Problems
Lecture 1
1. P {max(X, Y, Z) t} = P {X t and Y t and Z t} = P {X t}3 by
independence. Thus the distribution function of the maximum is (t6 )3 = t18 , and the
density is 18t17 , 0 t 1.
2. See Figure S1.1. We have

P {Z z} =
yzx
zx
fXY (x, y) dx dy =
FZ (z) =
x=0
ex (1 ezx ) dx = 1
fZ (z) =
1
,
(z + 1)2
ex ey dy dx
y=0
1
,
1+z
z0
z0
FZ (z) = fZ (z) = 0 for z < 0.

3. P {Y = y} = P {g(X) = y} = P {X g 1 (y)}, which is the number of xi s that map to
y, divided by n. In particular, if g is one-to-one, then pY (g(xi )) = 1/n for i = 1, . . . , n.
4. Since the area under the density function must be 1, we have ab3 /3 = 1. Then (see
Figure S1.2) fY (y) = fX (y 1/3 )/|dy/dx| with y = x3 , dy/dx = 3x2 . In dy/dx we
substitute x = y 1/3 to get
fY (y) =
fX (y 1/3 )
3 y 2/3
1
= 3 2/3 = 3
2/3
b 3y
b
3y
for 0 < y 1/3 < b, i.e., 0 < y < b3 .

5. Let Y = tan X where X is uniformly distributed between /2 and /2. Then (see
Figure S1.3)
fY (y) =
fX (tan1 y)
1/
=
|dy/dx|x=tan1 y
sec2 x
with x = tan1 y, i.e., y = tan x. But sec2 x = 1 + tan2 x = 1 + y 2 , so fY (y) =

1/[(1 + y 2 )], the Cauchy density.
Lecture 2
1. We have y1 = 2x1 , y2 = x2 x1 , so x1 = y1 /2, x2 = (y1 /2) + y2 , and

2 0
(y1 , y2 )
= 2.
=
1 1
(x1 , x2 )
y = zx
x
Figure S1.1
Y = X
y
y 1/3
Figure S1.2
Thus fY1 Y2 (y1 , y2 ) = (1/2)fX1 X2 (x1 , x2 ) = ex1 x2 = exp[(y1 /2) (y1 /2) y2 ] =
ey1 ey2 . As indicated in the comments, the range of the ys is 0 < y1 < 1, 0 < y2 < 1.
Therefore the joint density of Y1 and Y2 is the product of a function of y1 alone and
a function of y2 alone, which forces independence.
2. We have y1 = x1 /x2 , y2 = x2 , so x1 = y1 y2 , x2 = y2 and

(x1 , x2 ) y2 y1
=
= y2 .
0 1
(y1 , y2 )
Thus fY1 Y2 (y1 , y2 ) = fX1 X2 (x1 , x2 )|(x1 , x2 )/(y1 , y2 )| = (8y1 y2 )(y2 )(y2 ) = 2y1 (4y23 ).
Since 0 < x1 < x2 < 1 is equivalent to 0 < y1 < 1, 0 < y2 < 1, it follows just as in
Problem 1 that X1 and X2 are independent.
3. The Jacobian (x1 , x2 , x3 )/(y1 , y2 , y3 ) is given by

y2 y3
y1 y3
y1 y2

y2 y3 y3 y1 y3 y2 y1 y2

0
y3
1 y2
= (y2 y32 y1 y2 y32 )(1 y2 ) + y1 y22 y32 + y3 (y2 y1 y2 )y2 y3 + (1 y2 )y1 y2 y32
which cancels down to y2 y32 . Thus
fY1 Y2 Y3 (y1 , y2 , y3 ) = exp[(x1 + x2 + x3 )]y2 y32 = y2 y32 ey3 .
This can be expressed as (1)(2y2 )(y32 ey3 /2), and since x1 , x2 , x3 > 0 is equivalent to
0 < y1 < 1, 0 < y2 < 1, y3 > 0, it follows as before that Y1 , Y2 , Y3 are independent.
Lecture 3
1. MX2 (t) = MY (t)/MX1 (t) = (1 2t)r/2 /(1 2t)r1 /2 = (1 2t)(rr1 )/2 , which is
2 (r r1 ).
'
ar
ct
an
/2
y
2
'
Figure S1.3
2. The moment-generating function of c1 X1 + c2 X2 is
E[et(c1 X1 +c2 X2 ) ] = E[etc1 X1 ]E[etc2 X2 ] = (1 1 c1 t)1 (1 2 c2 t)2 .
If 1 c1 = 2 c2 , then X1 + X2 is gamma with = 1 + 2 and = i ci .
n
n
n
3. M (t) = E[exp( i=1 ci Xi )] = i=1 E[exp(tci Xi )] = i=1 Mi (ci t).
4. Apply Problem 3 with ci = 1 for all i. Thus
MY (t) =
n

Mi (t) =
i=1
n

i=1
exp[i (e 1)] = exp

t

n

(e 1)
t
i=1
which is Poisson (1 + + n ).
5. Since the coin is unbiased, X2 has the same distribution as the number of heads in the
second experiment. Thus X1 + X2 has the same distribution as the number of heads
in n1 + n2 tosses, namely binomial with n = n1 + n2 and p = 1/2.
Lecture 4
1. Let be the normal (0,1) distribution function, and recall that (x) = 1 (x).
Then
n
X
n
<c
P { c < X < + c} = P {c
<
}
/ n
= (c n/) (c n/) = 2(c n/) 1 .954.
Thus (c n/) 1.954/2 = .977. From tables, c n/ 2, so n 4 2 /c2 .

2. If Z = X Y , we want P {Z > 0}. But Z is normal with mean = 1 2 and
variance 2 = (12 /n1 ) + (22 /n2 ). Thus
P {Z > 0} = P {
>
} = 1 (/) = (/).
4
3. Since nS 2 / 2 is 2 (n 1), we have
P {a < S 2 < b} = P {
na
nb
< 2 (n 1) < 2 }.
2
If F is the 2 (n 1) distribution function, the desired probability is F (nb/ 2 )

F (na/ 2 ), which can be found using chi-square tables.
4. The moment-generating function is

2 2
2
nS t
E[etS ] = E exp
= E[exp(t 2 X/n)]
2 n
where the random variable X is 2 (n 1), and therefore has moment-generating function M (t) = (1 2t)(n1)/2 . Replacing t by t 2 /n we get

MS 2 (t) =
2t 2
n
(n1)/2
so S 2 is gamma with = (n 1)/2 and = 2 2 /n.
Lecture 5
1. By denition of the beta density,
E(X) =
(a + b)
(a)(b)
xa (1 x)b1 dx
and the integral is (a + 1, b) = (a + 1)(b)/(a + b + 1). Thus E(X) = a/(a + b).

Now

(a + b) a+1
2
E(X ) =
x (1 x)b1 dx
(a)(b) 0
and the integral is (a + 2, b) = (a + 2)(b)/(a + b + 2). Thus
E(X 2 ) =
(a + 1)a
.
(a + b + 1)(a + b)
and
Var X = E(X 2 ) [E(X)]2
=
1
ab
[(a + 1)a(a + b) a2 (a + b + 1)] =
.
(a + b)2 (a + b + 1)
(a + b)2 (a + b + 1)
2. P {c T c} = FT (c) FT (c) = FT (c) (1 FT (c)) = 2FT (c) 1 = .95, so

FT (c) = 1.95/2 = .975. From the T table, c = 2.131.
3. W = (X1 /m)/(X2 /n) where X1 = 2 (m) and X2 = 2 (n). Consequently, 1/W =
(X2 /n)/(X1 /m), which is F (n, m).
5
4. Suppose we want P {W c} = .05. Equivalently, P {1/W 1/c} = .05, hence
P {1/W 1/c} = .95. By Problem 3, 1/W is F (n, m), so 1/c can be found from the
F table, and we can then compute c. The analysis is similar for .1, .025 and .01.

5. If N is normal (0,1), then T (n) = N/( 2 (n)/n). Thus T 2 (n) = N 2 /(2 (n)/n). But
N 2 is 2 (1), and the result follows.
6. If Y = 2X then fY (y) = fX (x)|dx/dy| = (1/2)ex = (1/2)ey/2 , y 0, the chi-square
density with two degrees of freedom. If X1 and X2 are independent exponential random
variables, then X1 /X2 is the quotient of two 2 (2) random variables, which is F (2, 2).
Lecture 6
1. Apply the formula for the joint density of Yj and Yk with j = 1, k = 3, n = 3, F (x) =
x, f (x) = 1, 0 < x < 1. The result is fY1 Y3 (x, y) = 6(y x), 0 < x < y < 1. Now let
Z = Y3 Y1 , W = Y3 . The Jacobian of the transformation has absolute value 1, so
fZW (z, w) = fY1 Y3 (y1 , y3 ) = 6(y3 y1 ) = 6z, 0 < z < w < 1. Thus
6z dw = 6z(1 z),
fZ (z) =
0 < z < 1.
w=z
2. The probability that more than one random variable falls in [x, x + dx] need not be
negligible. For example, there can be a positive probability that two observations
coincide with x.
3. The density of Yk is
fYk (x) =
n!
xk1 (1 x)nk ,
(k 1)!(n k)!
0<x<1
which is beta with = k and = n k + 1. (Note that (k) = (k 1)!, (n k + 1) =

(n k)!, (k + n k + 1) = (n + 1) = n!.)
4. We have Yk > p if and only if at most k1 observations are in [0, p]. But the probability
that a particular observation lies in [0, p] is p/1 = p. Thus we have n Bernoulli trials
with probability of success p on a given trial. Explicitly,
P {Yk > p} =
k1

i=0
n i
p (1 p)ni .
i
Lecture 7
1. Let Wn = (Sn E(Sn ))/n; then E(Wn ) = 0 for all n, and
n
Var Sn
1 2
nM
M
Var Wn =
= 2
2 =
0.
n2
n i=1 i
n
n
P
It follows that Wn 0.
6
d
2. All Xi and X have the same distribution (p(1) = p(0) = 1/2), so Xn

0. But if
0 < - < 1 then P {|Xn X| -} = P {Xn = X}, which is 0 for n odd and 1 for n even.
Therefore P {|Xn X| -} oscillates and has no limit as n .
3. By the weak law of large numbers, X n converges in probability to , hence converges
in distribution to . Thus we can take X to have a distribution function F that is
degenerate at , in other words,

0,
F (x) =
1,
x<
x .
4. Let Fn be the distribution function of Xn . For all x, Fn (x) = 0 for suciently large
n. Since the identically zero function cannot be a distribution function, there is no
limiting distribution.
Lecture 8
1. Note that MXn = 1/(1t)n where 1/(1t) is the moment-generating function of an
exponential random variable (which has mean ). By the weak law of large numbers,
P
Xn /n , hence Xn /n
.

n
2. 2 (n) = i=1 Xi2 , where the Xi are iid, each normal (0,1). Thus the central limit
theorem applies.
b
3. We have n Bernoulli trials, with probability of success p = a f (x) dx on a given trial.
Thus Yn is binomial (n, p). If n and p satisfy the sucient condition given in the text,
the normal approximation with E(Yn ) = np and Var Yn = np(1 p) should work well
in practice.
4. We have E(Xi ) = 0 and

Var Xi =
E(Xi2 )
1/2
2
=
1/2
1/2
x2 dx = 1/12.
x dx = 2
0
By the central limit theorem, Yn is approximately normal with E(Yn ) = 0 and Var Yn =
n/12.
5. Let Wn = n(1 F (Yn )). Then
P {Wn w} = P {F (Yn ) 1 (w/n)} = P {max F (Xi ) 1 (w/n)}
hence

w n
P {Wn w} = 1
,
n
0 w n,
which approaches ew as n . Therefore the limiting distribution of Wn is exponential.
Lecture 9
1. (a) We have
f (x1 , . . . , xn ) = x1 ++xn
en
.
x1 ! xn !
With x = x1 + + xn , take logarithms and dierentiate to get
x
(x ln n) = n = 0,
= X.
(b) f (x1 , . . . , xn ) = n (x1 xn )1 , > 0, and

n
n

n
ln xi ) = +
ln xi = 0,
(n ln + ( 1)
i=1
i=1
n
= n
i=1
ln xi
Note that 0 < xi < 1, so ln xi < 0 for all i and > 0.

n
n
(c) f (x1 , . . . , xn ) = (1/n ) exp[( i=1 xi )/]. With x = i=1 xi we have
x
n
x
(n ln ) = + 2 = 0,
= X.
n
n
(d) f (x1 , . . . , xn ) = (1/2)n exp[ i=1 |xi |]. We must minimize i=1 |xi |,
and we must be careful when dierentiating because of the absolute values. If the
order statistics of the xi are yi , i = 1, . . . , n, and yk < < yk+1 , then the sum to be
minimized is
( y1 ) + + ( yk ) + (yk+1 ) + + (yn ).
The derivative of the sum is the number
n of yi s less than minus the number of yi s
greater than . Thus as increases, i=1 |xi | decreases until the number of yi s
less than equals the number of yi s greater than . We conclude that is the median
of the Xi .
n
(e) f (x1 , . . . , xn ) = exp[ i=1 xi ]en if all xi , and 0 elsewhere. Thus
f (x1 , . . . , xn ) = exp[
xi ]en I[ min(x1 , . . . , xn )].
i=1
The indicator I prevents us from dierentiating blindly. As increases, so does en ,

but if > mini xi , the indicator drops to 0. Thus = min(X1 , . . . , Xn ).
2. f (x1 , . . . , xn ) = 1 if (1/2) xi +(1/2) for all i, and 0 elsewhere. If Y1 , . . . , Yn
are the order statistics of the Xi , then f (x1 , . . . , xn ) = I[yn (1/2) y1 +(1/2)],
where y1 = min xi and yn = max xi . Thus any function h(X1 , . . . , Xn ) such that
Yn
1
1
h(X1 , . . . , Xn ) Y1 +
2
2
8
for all X1 , . . . , Xn is an MLE of . Some solutions are h = Y1 + (1/2), h = Yn (1/2),
h = (Y1 + Yn )/2, h = (2Y1 + 4Yn 1)/6 and h = (4Y1 + 2Yn + 1)/6. In all cases, the
inequalities reduce to Yn Y1 1, which is true.
3. (a) Xi is Poisson () so E(Xi ) = . The method of moments sets X = , so the
estimate of is = X, which is consistent by the weak law of large numbers.
1
(b) E(Xi ) = 0 x d = /( + 1) = X, = X + X, so
=
X
/( + 1)
P
=
1 [/( + 1)]
1X
hence is consistent.
(c) E(Xi ) = = X, so = X, consistent by the weak law of large numbers.
(d) By symmetry, E(Xi ) = so = X as in (a) and (c).

(e) E(Xi ) = xe(x) dx = (with y = x ) 0 (y + )ey dy = 1 + = X. Thus
= X 1 which converges in probability to (1 + ) 1 = , proving consistency.

r
r
4. P {X r} = 0 (1/)ex/ dx = ex/ 0 = 1 er/ . The MLE of is = X [see
Problem 1(c)], so the MLE of 1 er/ is 1 er/X .
5. The MLE of is X/n, the relative frequency of success. Since
b

n k
P {a X b} =
(1 )nk ,
k
k=a
the MLE of P {a X b} is found by replacing by X/n in the above summation.
Lecture 10
1. Set 2(b) 1 equal to the desired condence level. This, along with the table of the
normal
(0,1) distribution function, determines b. The length of the condence interval
is 2b/ n.
2. Set 2FT (b) 1 equal to the desired condence level. This, along with the table of the
T (n
1) distribution function, determines b. The length of the condence interval is
2bS/ n 1.
3. In order to compute the expected length of the condence interval, we must compute
E(S), and the key observation is

nS 2
2
S=
=
(n 1).
2
n
n
If f (x) is the chi-square density with r = n 1 degrees of freedom [see (3.8)], then the
expected length is

2b
x1/2 f (x) dx
n1 n 0
and an appropriate change of variable reduces the integral to a gamma function which
can be evaluated explicitly.
9
4. We have E(Xi ) = and Var(Xi ) = 2 . For large n,
X
X
/ n
/ n
is approximately normal (0,1) by the central limit theorem. With c = 1/ n we have

P {b <
X
< b} = (b) (b) = 2(b) 1
c
and if we set this equal to the desired level of condence, then b is determined. The
condence interval is given by (1 bc) < X < (1 + bc), or
X
X
<<
1 + bc
1 bc
where c 0 as n .
5. A condence interval of length L corresponds to |(Yn /n) p| < L/2, an event with
probability

L n/2
2
1.
p(1 p)
Setting this probability equal to the desired condence level gives an inequality of the
form
L n/2

> c.
p(1 p)
As in the text, we can replace p(1p) by its maximum value 1/4. We nd the minimum
value of n by squaring both sides.
In the rst example in (10.1), we have L = .02, L/2 = .01 and c = 1.96. This problem
essentially reproduces the analysis in the text in a more abstract form. Specifying how
close to p we want our estimate to be (at the desired level of condence) is equivalent
to specifying the length of the condence interval.
Lecture 11
1. Proceed as in (11.1):

Z = X Y (1 2 )
divided by
12
2
+ 2
n
m
is normal (0,1), and W = (nS12 /12 )+(mS22 /22 ) is 2 (n+m2). Thus

is T (n + m 2), but the unknown variances cannot be eliminated.
n + m 2Z/ W
10
2. If 12 = c22 , then
1
12
2
1
+ 2 = c22
+
n
m
n cm
and
nS12
mS 2
nS12 + cmS22
+ 22 =
.
2
1
2
c22
Thus 22 can again be eliminated, and condence intervals can be constructed, assuming
c known.
Lecture 12
1. The given test is an LRT and is completely determined by c, independent of > 0 .
2. The likelihood ratio is L(x) = f1 (x)/f0 (x) = (1/4)/(1/6) = 3/2 for x = 1, 2, and
L(x) = (1/8)/(1/6) = 3/4 for x = 3, 4, 5, 6. If 0 < 3/4, we reject for all x, and
= 1, = 0. If 3/4 < < 3/2, we reject for x = 1, 2 and accept for x = 3, 4, 5, 6, with
= 1/3 and = 1/2. If 3/2 < , we accept for all x, with = 0, = 1.
For = .1, set = 3/2, accept when x = 3, 4, 5, 6, reject with probability a when
x = 1, 2. Then = (1/3)a = .1, a = .3 and = (1/2) + (1/2)(1 a) = .85.
3. Since (220-200)/10=2, it follows that when c reaches 2, the null hypothesis is accepted.
The associated type 1 error probability is = 1 (2) = 1 .977 = .023. Thus the
given result is signicant even at the signicance level .023. If we were to take additional
observations, enough to drive the probability of a type 1 error down to .023, we would
still reject H0 . Thus the p-value is a concise way of conveying a lot of information
about the test.
Lecture 13
1. We sum (Xi npi )2 /npi , i = 1, 2, 3, where the Xi are the observed frequencies and the
npi = 50, 30, 20 are the expected frequencies. The chi-square statistic is
(40 50)2
(33 30)2
(27 20)2
+
+
= 2 + .3 + 2.45 = 4.75
50
30
20
Since P {2 (2) > 5.99} = .05 and 4.75 < 5.99, we accept H0 .
2. The expected frequencies are given by
1
2
A
49
51
B
147
153
C
98
102
For example, to nd the entry in the 2C position, we can multiply the row 2 sum by
the column 3 sum and divide by the total number of observations (namely 600) to get
11
(306)(200)/600=102. Alternatively, we can compute P (C) = (114 + 86)/600 = 1/3.
We multiply this by the row 2 sum 306 to get 306/3=102. The chi square statistic is
(33 49)2
(147 147)2
(114 98)2
(67 51)2
(153 153)2
(86 102)2
+
+
+
+
+
49
147
98
51
153
102
which is 5.224+0+2.612+5.020+0+2.510 = 15.366. There are (h1)(k1) = 12 = 2
degrees of freedom, and P {2 (2) > 5.99} = .05. Since 15.366 > 5.94, we reject H0 .
3. The observed frequencies minus the expected frequencies are
a
(a + b)(a + c)
ad bc
=
,
a+b+c+d
a+b+c+d
(a + b)(b + d)
bc ad
=
,
a+b+c+d
a+b+c+d
(a + c)(c + d)
bc ad
=
,
a+b+c+d
a+b+c+d
(c + d)(b + d)
ad bc
=
.
a+b+c+d
a+b+c+d
The chi-square statistic is

(ad bc)2
1
a + b + c + d (a + b)(c + d)(a + c)(b + d)

[(c + d)(b + d) + (a + c)(c + d) + (a + b)(b + d) + (a + b)(a + c)]
and the expression in small brackets simplies to (a + b + c + d)2 , and the result follows.
Lecture 14
1. The joint probability function is
f (x1 , . . . , xn ) =
n

e xi
i=1
xi !
en u(x)
.
x1 ! xn !
Take g(, u(x)) = en u(x) and h(x) = 1/(x1 ! xn !).

2. f (x1 , . . . , xn ) = [A()]n B(x1 ) B(xn ) if 0 < xi < for all i, and 0 elsewhere. This
can be written as
[A()]n
n

i=1
where
I is an indicator.
n
i=1 B(xi ).

B(xi )I max xi <
1in
We take g(, u(x)) = An ()I[max xi < ] and h(x) =
3. f (x1 , . . . , xn ) = n (1 )u(x) , and the factorization theorem applies with h(x) = 1.

n
4. f (x1 , . . . , xn ) = n exp[( i=1 xi )/], and the factorization theorem applies with
h(x) = 1.
12
5. f (x) = ((a + b)/[(a)(b)])xa1 (1 x)b1 on (0,1). In this case, a = and b = 2.
Thus f (x) = ( + 1)x1 (1 x), so
n n
f (x1 , . . . , xn ) = ( + 1)
n

xi
n
1
i=1
(1 xi )
i=1
and the factorization theorem applies with

g(, u(x)) = ( + 1)n n u(x)1
and h(x) =
n
i=1 (1 xi ).
1 x/
6. f (x) = (1/[() ])x

is
, x > 0, with = and arbitrary. The joint density
n

1
1
f (x1 , . . . , xn ) =
u(x)
exp[
xi /]
[()]n n
i=1
and the factorization theorem applies with h(x) = exp[

to the remaining factors.
xi /] and g(, u(x)) equal
7. We have
P {X1 = x1 , . . . , Xn = xn } = P {Y = y}P {X1 = x1 , . . . , Xn = xn |Y = y}
We can drop the subscript since Y is sucient, and we can replace Xi by Xi by
denition of Bs experiment. The result is
P {X1 = x1 , . . . , Xn = xn } = P {X1 = x1 , . . . , Xn = xn }
as desired.
Lecture 17
1. Take u(X) = X.
2. The joint density is

f (x1 , . . . , xn ) = exp

(xi ) I[min xi > ]
i=1
so Y1 is sucient. Now if y > , then

P {Y1 > y} = (P {X1 > y})n =
n
exp[(x )] dx
= exp[n(y )],
so
FY1 (y) = 1 en(y) ,
fY1 (y) = nen(y) ,
y > .
13
The expectation of g(Y1 ) under is

E [g(Y1 )] =
g(y)n exp[n(y )] dy.
If this is 0 for all , divide by en to get

g(y)n exp(ny) dy = 0.
Dierentiating with respect to , we have g()n exp(n) = 0, so g() = 0 for all ,

proving completeness. The expectation of Y1 under is

yn exp[n(y )] dy =
(y )n exp[n(y )] dy +
n exp[n(y )] dy
zn exp(nz) dz + =
0
1
+ .
n
Thus E [Y1 (1/n)] = , so Y1 (1/n) is a UMVUE of .

3.
Since f (x) = exp[( 1) ln x], the density belongs to the exponential

nclass. Thus
n
ln
X
is
a
complete
sucient
statistic,
hence
so
is
exp
(1/n)
=
i
i=1
i=1 ln Xi
u(X1 , . . . , Xn ). The key observation is that if Y is sucient and g is one-to-one,
then g(Y ) is also sucient, since g(Y ) conveys exactly the same information as Y
does; similarly for completeness.
To compute the
likelihood estimate, note that the joint density is f (x1 , . . . , xn ) =
maximum
n
n exp[( 1) i=1 ln xi ]. Take logarithms, dierentiate with respect to , and set the
n
result equal to 0. We get = n/ i=1 ln Xi , which is a function of u(X1 , . . . , Xn ).
4. Each Xi is gamma with = 2, = 1/, so (see Lecture 3) Y is gamma (2n, 1/). Thus

E (1/Y ) =
(1/y)
0
1
y 2n1 ey dy
(2n)(1/)2n
which becomes, under the change of variable z = y,

2n
(2n)

0
z 2n2 z dz
2n (2n 1)
= 2n1
=
.
e
2n2
(2n)
2n 1
Therefore E [(2n 1)/Y ] = , and (2n 1)/Y is the UMVUE of .

5. We have E(Y2 ) = [E(X1 ) + E(X2 )]/2 = , hence E[E(Y2 |Y1 )] = E(Y2 ) = . By
completeness, E(Y2 |Y1 ) must be Y1 /n.
6. Since Xi / is normal (0,1), Y / is 2 (n), which has mean n and variance 2n. Thus
E[(Y /)2 ] = n2 +2n, so E(Y 2 ) = 2 (n2 +2n).Therefore the UMVUE of 2 ) is Y 2 /(n2 +
2n).
14
7. (a) E[E(I|Y )] = E(I) = P {X1 1}, and the result follows by completeness.
(b) We compute
P {X1 = r|X1 + + Xn = s} =
P {X1 = r, X2 + + Xn = s r}
.
P {X1 + + Xn } = s
The numerator is
e r (n1) [(n 1)]sr
e
r!
(s r)!
and the denominator is
en (n)s
s!
so the conditional probability is

sr
r
s (n 1)sr
1
s
n1
=
s
r
r
n
n
n
which is the probability of r successes in s Bernoulli trials, with probability of success
1/n on a given trial. Intuitively, if the sum is s, then each contribution to the sum is
equally likely to come from X1 , . . . , Xn .
(c) By (b), P {X1 = 0|Y } + P {X1 = 1|Y } is given by

1
1
n
Y
+Y
Y 1
Y

1
Y /n
n1
1
1+
=
1
n
n
n
(n 1)/n

=
n1
n
Y
1+

Y
.
n1
This formula also works for Y = 0 because it evaluates to 1.

8. The joint density is

n

(xi 1 )
1
f (x1 , . . . , xn ) = n exp
I min Xi > 1 .
i
2
2
i=1
Since
n

(xi 1 )
i=1
n
1
xi n1 ,
2 i=1
the result follows from the factorization theorem.
15
Lecture 18
1. By (18.4), the numerator of (x) is

1
n x
(1 )nx d
r1 (1 )s1
x
0
and the denominator is
r1
(1 )
s1
n x
(1 )nx d.
x
Thus (x) is
(r + x + 1, n x + s)
(r + x + 1) (r + s + n)
r+x
=
=
.
(r + x, n x + s)
(r + x) (r + s + n + 1)
r+s+n
2. The risk function is

2
r+X
1
E
=
E [(X n + r r s)2 ]
r+s+n
(r + s + n)2
with E (X n) = 0, E [(X n)2 = Var X = n(1 ). Thus
R () =
1
[n(1 ) + (r r s)2 ].
(r + s + n)2
The quantity in brackets is

n n2 + r2 + r2 2 + s2 2 2r2 2rs + 2rs2
which simplies to
((r + s)2 n)2 + (n 2r(r + s)) + r2
and the result follows.
3. If r = s = n/2, then (r + s)2 n = 0 and n 2r(r + s) = 0, so

R () =
r2
.
(r + s + n)2

4. The average loss using is B() = h()R () d. If (x) has a smaller maximum
risk than (x), then since R is constant, we have R () < R () for all . Therefore
B() < B(), contradicting the fact that is a Bayes estimate.
Lecture 20
1.
Var(XY ) = E[(XY )2 ] (EXEY )2 = E(X 2 )E(Y 2 ) (EX)2 (EY )2
2
2 2
2
= (X
+ 2X )(Y2 + 2Y ) 2X 2Y = X
Y + 2X Y2 + 2Y X
.
16
2.
Var(aX + bY ) = Var(aX) + Var(bY ) + 2ab Cov(X, Y )
2
= a2 X
+ b2 Y2 + 2abX Y .
3.
2
Cov(X, X + Y ) = Cov(X, X) + Cov(X, Y ) = Var X + 0 = X
.
4. By Problem 3,
X,X+Y =
2
X
X
= 2
.
X X+Y
X + Y2
5.
Cov(XY, X) = E(X 2 )E(Y ) E(X)2 E(Y )
2
2
= (X
+ 2X )Y 2X Y = X
Y .
6. We can assume without loss of generality that E(X 2 ) > 0 and E(Y 2 ) > 0. We will
have equality i the discriminant b2 4ac = 0, which holds i h() = 0 for some .
Equivalently, X + Y = 0 for some . We conclude that equality holds if and only if
X and Y are linearly dependent.
Lecture 21
n
2
1. Let Yi = Xi E(Xi ); then E[
i=1 ti Yi ] 0 for all t. But this expectation is

E[
ti Yi
tj Yj ] =
ti ij tj = t Kt
i
i,j
where ij = Cov(Xi , Xj ). By denition of covariance, K is symmetric, and K is

always nonnegative denite because t Kt 0 for all t. Thus all eigenvalues i of K
are nonnegative. But K = LDL , so det K = det D = 1 n . If K is nonsingular
then all i > 0 and K is positive denite.
2. We have X = CZ + where C is nonsingular and the Zi are independent normal
random variables with zero mean. Then Y = AX = ACZ + A, which is Gaussian.
3. The moment-generating function of (X1 , . . . , Xm ) is the moment-generating function
of (X1 , . . . , Xn ) with tm+1 = = tn = 0. We recognize the latter moment-generating
function as Gaussian; see (21.1).
n
4. Let Y = i=1 ci Xi ; then
tY
E(e ) = E exp
n

i=1
ci tXi

= MX (c1 t, . . . , cn t)
17
n
n

1 2

= exp t
ci i exp t
ci aij cj
2
i=1
i,j=1
which is the moment-generating function of a normally distributed random variable.

Another method: Let W = c1 X1 + + cn Xn = c X = c (AY + ), where the Yi are
independent normal random variables with zero mean. Thus W = b Y + c where
b = c A. But b Y is a linear combination of independent normal random variables,
hence is normal.
Lecture 22
1. If y is the best estimate of Y given X = x, then
y Y =
Y
(x X )
X
and [see (20.1)] the minimum mean square error is Y2 (1 2 ), which in this case is 28.
We are given that Y /X = 3, so Y = 3 2 = 6 and 2 = 36/Y2 . Therefore
Y2 (1
36
) = Y2 36 = 28,
Y2
Y = 8,
2 =
36
,
64
= .75.
Finally, y = Y + 3x 3X = Y + 3x + 3 = 3x + 7, so Y = 4.
2. The bivariate normal density is of the form
f (x, y) = a()b(x, y) exp[p1 ()x2 + p2 ()y 2 + p3 ()xy + p4 ()x + p5 ()y]
so we are in the exponential class. Thus
2 2
Xi ,
Yi ,
Xi Yi ,
Xi ,
Yi
2
is a complete sucient statistic for = (X
, Y2 , , X , Y ). Note also that any statistic
in one-to-one correspondence with this one is also complete and sucient.
Lecture 23
1. The probability of any event is found by integrating the density on the set dened by
the event. Thus

P {a f (X) b} =
f (x) dx, A = {x : a f (x) b}.
A
2. Bernoulli: f (x) = x (1 )1x , x = 0, 1
x 1x
ln f (x) =
[x ln + (1 x) ln(1 )] =
18
2
x
1x
ln f (x) = 2
2
(1 )2
I() = E
X
1X 1
1
1
+
= +
=
2
(1 )2
1
(1 )
since E (X) = . Now

Var Y
1
(1 )
=
.
nI()
n
But
Var X =
1
n(1 )
(1 )
Var[binomial(n, )] =
=
n2
n2
n
so X is a UMVUE of .
Normal:
f (x) =
1
exp[(x )2 /2 2 ]
2
(x )2 x
=
ln f (x) =
2 2
2
1
2
ln f (x) = 2 ,
2
I() =
1
,
2
Var Y
2
n
But Var X = 2 /n, so X is a UMVUE of .

Poisson: f (x) = e x /x!, x = 0, 1, 2 . . .
ln f (x) =
( + x ln ) = 1 +
2
x
ln f (x) = 2 ,
2
Var Y
so X is a UMVUE of .
I() = E
X
1
= 2 =
2
= Var X
n
19
Lecture 25
1.
K(p) =
c

n
k=0
pk (1 p)nk
with c = 2 and p = 1/2 under H0 . Therefore

12
12
12
79
=
+
+
(1/2)n =
= .019.
0
1
2
4096
2. The deviations, with ranked absolute values in parentheses, are
16.9(14), -1.7(5), -7.9(9), -1.2(4), 12.4(12), 9.8(10), -.3(2), 2.7(6), -3.4(7), 14.5(13),
24.4(16), 5.2(8), -12.2(11), 17.8(15), .1(1), .5(3)
The Wilcoxon statistic is W =1-2+3-4-5+6-7+8-9+10-11+12+13+14+15+16=60
Under H0 , E(W ) = 0 and Var W = n(n + 1)(2n + 1)/6 = 1496,
W = 38.678
Now W/38.678 is approximately normal (0,1) and P {W c} = P {W/38.678

c/38.678} = .05. From a normal table, c/38.678 = 1.645, c = 63.626. Since
60 < 63.626, we accept H0 .
3. The moment-generating function of Vj is
MVj (t) = (1/2)(ejt +ejt ) and the momentn
generating function of W is MW (t) = j=1 MVj (t). When n = 1, W = 1 with
equal probability. When n = 2,
MW (t) =
1 t
1
1
(e + et ) (e2t + e2t ) = (e3t + et + et + e3t )
2
2
4
so W takes on the values 3, 1, 1, 3 with equal probability. When n = 3,

MW (t) =
1 3t
1
+ et + et + e3t ) (e3t + e3t )
(e
4
2
1 6t
+ e4t + e2t + 1 + 1 + e2t + e4t + e6t ).
(e
8
Therefore P {W = k} = 1/8 for k = 6, 4, 2, 2, 4, 6, P {W = 0} = 1/4, and

P {W = k} = 0 for other values of k.

Solutions To Problems: X 0 ZX X y

Uploaded by

Document Information

Original Description:

Original Title

Copyright

Available Formats

Share this document

Share or Embed Document

Sharing Options

Did you find this document useful?

Is this content inappropriate?

Copyright:

Available Formats

Solutions To Problems: X 0 ZX X y

Uploaded by

Copyright:

Available Formats

1

FZ (z) = fZ (z) = 0 for z < 0.

for 0 < y 1/3 < b, i.e., 0 < y < b3 .

with x = tan1 y, i.e., y = tan x. But sec2 x = 1 + tan2 x = 1 + y 2 , so fY (y) =

exp[i (e 1)] = exp

= (c n/) (c n/) = 2(c n/) 1 .954.

Thus (c n/) 1.954/2 = .977. From tables, c n/ 2, so n 4 2 /c2 .

If F is the 2 (n 1) distribution function, the desired probability is F (nb/ 2 )

so S 2 is gamma with = (n 1)/2 and = 2 2 /n.

and the integral is (a + 1, b) = (a + 1)(b)/(a + b + 1). Thus E(X) = a/(a + b).

2. P {c T c} = FT (c) FT (c) = FT (c) (1 FT (c)) = 2FT (c) 1 = .95, so

which is beta with = k and = n k + 1. (Note that (k) = (k 1)!, (n k + 1) =

2. All Xi and X have the same distribution (p(1) = p(0) = 1/2), so Xn

which approaches ew as n . Therefore the limiting distribution of Wn is exponential.

With x = x1 + + xn , take logarithms and dierentiate to get

(b) f (x1 , . . . , xn ) = n (x1 xn )1 , > 0, and

Note that 0 < xi < 1, so ln xi < 0 for all i and > 0.

xi ]en I[ min(x1 , . . . , xn )].

The indicator I prevents us from dierentiating blindly. As increases, so does en ,

the MLE of P {a X b} is found by replacing by X/n in the above summation.

is approximately normal (0,1) by the central limit theorem. With c = 1/ n we have

is normal (0,1), and W = (nS12 /12 )+(mS22 /22 ) is 2 (n+m2). Thus

The chi-square statistic is

a + b + c + d (a + b)(c + d)(a + c)(b + d)

Take g(, u(x)) = en u(x) and h(x) = 1/(x1 ! xn !).

We take g(, u(x)) = An ()I[max xi < ] and h(x) =

3. f (x1 , . . . , xn ) = n (1 )u(x) , and the factorization theorem applies with h(x) = 1.

and the factorization theorem applies with

6. f (x) = (1/[() ])x

, x > 0, with = and arbitrary. The joint density

and the factorization theorem applies with h(x) = exp[

xi /] and g(, u(x)) equal

so Y1 is sucient. Now if y > , then

fY1 (y) = nen(y) ,

g(y)n exp[n(y )] dy.

If this is 0 for all , divide by en to get

Dierentiating with respect to , we have g()n exp(n) = 0, so g() = 0 for all ,

Thus E [Y1 (1/n)] = , so Y1 (1/n) is a UMVUE of .

which becomes, under the change of variable z = y,

Therefore E [(2n 1)/Y ] = , and (2n 1)/Y is the UMVUE of .

This formula also works for Y = 0 because it evaluates to 1.

the result follows from the factorization theorem.

The quantity in brackets is

3. If r = s = n/2, then (r + s)2 n = 0 and n 2r(r + s) = 0, so

where ij = Cov(Xi , Xj ). By denition of covariance, K is symmetric, and K is

which is the moment-generating function of a normally distributed random variable.

2. Bernoulli: f (x) = x (1 )1x , x = 0, 1

since E (X) = . Now

But Var X = 2 /n, so X is a UMVUE of .

with c = 2 and p = 1/2 under H0 . Therefore

Now W/38.678 is approximately normal (0,1) and P {W c} = P {W/38.678

so W takes on the values 3, 1, 1, 3 with equal probability. When n = 3,

Therefore P {W = k} = 1/8 for k = 6, 4, 2, 2, 4, 6, P {W = 0} = 1/4, and

You might also like