Chen, L.-A.
Proof. We consider the case $n = 3$ only.
We have $X_1, X_2, X_3 \overset{\text{iid}}{\sim} f(x)$, $a < x < b$, so the joint p.d.f. of $X_1, X_2, X_3$ is $f(x_1)f(x_2)f(x_3)$.
The six orderings of $(X_1, X_2, X_3)$ give the Jacobians:
$$J_1 = \begin{vmatrix} 1&0&0\\0&1&0\\0&0&1 \end{vmatrix} = 1,\quad J_2 = \begin{vmatrix} 0&1&0\\1&0&0\\0&0&1 \end{vmatrix} = -1,\quad J_3 = \begin{vmatrix} 1&0&0\\0&0&1\\0&1&0 \end{vmatrix} = -1,$$
$$J_4 = \begin{vmatrix} 0&0&1\\1&0&0\\0&1&0 \end{vmatrix} = 1,\quad J_5 = \begin{vmatrix} 0&1&0\\0&0&1\\1&0&0 \end{vmatrix} = 1,\quad J_6 = \begin{vmatrix} 0&0&1\\0&1&0\\1&0&0 \end{vmatrix} = -1.$$
The joint p.d.f. of $Y_1, Y_2, Y_3$ is
$$f_{Y_1,Y_2,Y_3}(y_1,y_2,y_3) = f(y_1)f(y_2)f(y_3)|1| + f(y_2)f(y_1)f(y_3)|-1| + f(y_1)f(y_3)f(y_2)|-1|$$
$$\qquad + f(y_3)f(y_1)f(y_2)|1| + f(y_2)f(y_3)f(y_1)|1| + f(y_3)f(y_2)f(y_1)|-1|$$
$$= 6 f(y_1)f(y_2)f(y_3) = 3!\, f(y_1)f(y_2)f(y_3).$$
In general,
$$f_{Y_1,\dots,Y_n}(y_1,\dots,y_n) = n!\, f(y_1)f(y_2)\cdots f(y_n),\quad a < y_1 < y_2 < \cdots < y_n < b.$$
(d) For $1 \le j < k \le n$, the joint p.d.f. of $Y_j$ and $Y_k$ is
The marginal p.d.f. of $Y_n$ is
$$f_{Y_n}(y_n) = n! \int_a^{y_n}\!\!\cdots\!\int_a^{y_3}\!\!\int_a^{y_2} f(y_1)f(y_2)\cdots f(y_n)\, dy_1\cdots dy_{n-1}$$
$$= n! \int_a^{y_n}\!\!\cdots\!\int_a^{y_3} f(y_2)\cdots f(y_n)\,F(y_2)\, dy_2\cdots dy_{n-1}$$
$$= n! \int_a^{y_n}\!\!\cdots\!\int_a^{y_4} f(y_3)\cdots f(y_n)\,\frac{1}{2!}(F(y_3))^2\, dy_3\cdots dy_{n-1}$$
$$= \cdots = n! \int_a^{y_n} f(y_{n-1})f(y_n)\,\frac{1}{(n-2)!}(F(y_{n-1}))^{n-2}\, dy_{n-1}$$
$$= n!\, f(y_n)\frac{1}{(n-1)!}(F(y_n))^{n-1} = n (F(y_n))^{n-1} f(y_n).$$
The marginal p.d.f. of $Y_j$ is
$$f_{Y_j}(y_j) = n! \int_{y_j}^{b}\!\!\cdots\!\int_{y_{n-1}}^{b}\int_a^{y_j}\!\!\cdots\!\int_a^{y_2} f(y_1)\cdots f(y_n)\, dy_1\cdots dy_{j-1}\, dy_n\cdots dy_{j+1}$$
$$= n!\, f(y_j)\,\frac{1}{(j-1)!}(F(y_j))^{j-1}\,\frac{1}{(n-j)!}(1-F(y_j))^{n-j}$$
$$= \frac{n!}{(j-1)!(n-j)!}(F(y_j))^{j-1}(1-F(y_j))^{n-j} f(y_j),\quad a < y_j < b.$$
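As a quick numerical check (not part of the notes), for a $U(0,1)$ sample the marginal above has $F(y) = y$ and $f(y) = 1$, and reduces to the Beta$(j, n-j+1)$ density. The sketch below, with function names of our choosing, compares the two formulas.

```python
import math

def order_stat_pdf(y, j, n):
    """Marginal p.d.f. of the j-th order statistic of n iid U(0,1) draws:
    n!/((j-1)!(n-j)!) * F(y)^(j-1) * (1-F(y))^(n-j) * f(y), F(y)=y, f(y)=1."""
    c = math.factorial(n) / (math.factorial(j - 1) * math.factorial(n - j))
    return c * y ** (j - 1) * (1 - y) ** (n - j)

def beta_pdf(y, a, b):
    """Beta(a, b) density via the gamma function."""
    c = math.gamma(a + b) / (math.gamma(a) * math.gamma(b))
    return c * y ** (a - 1) * (1 - y) ** (b - 1)

# The uniform order statistic Y_j should be Beta(j, n - j + 1).
n, j = 5, 2
for y in [0.1, 0.3, 0.5, 0.7, 0.9]:
    assert abs(order_stat_pdf(y, j, n) - beta_pdf(y, j, n - j + 1)) < 1e-12
```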
t-distribution:
Let $Z \sim N(0,1)$ and $V \sim \chi^2(r)$ be independent. We say that $T = \dfrac{Z}{\sqrt{V/r}}$ has a t-distribution with d.f. $r$, and denote $T \sim t(r)$.
We want the p.d.f. of $T$.
Proof. The joint p.d.f. of $Z$ and $V$ is
$$f_{Z,V}(z,v) = \frac{1}{\sqrt{2\pi}} e^{-\frac{z^2}{2}}\,\frac{1}{\Gamma(\frac{r}{2})2^{r/2}}\, v^{\frac{r}{2}-1} e^{-\frac{v}{2}},\quad z \in \mathbb{R},\ v > 0.$$
With $T = \frac{Z}{\sqrt{V/r}}$ and $U = V$, i.e. $z = \frac{\sqrt{u}\,t}{\sqrt{r}}$, $v = u$, the Jacobian is
$$J = \begin{vmatrix} \frac{\partial z}{\partial t} & \frac{\partial z}{\partial u} \\ \frac{\partial v}{\partial t} & \frac{\partial v}{\partial u} \end{vmatrix} = \begin{vmatrix} \frac{\sqrt{u}}{\sqrt{r}} & \frac{t}{2\sqrt{ur}} \\ 0 & 1 \end{vmatrix} = \frac{\sqrt{u}}{\sqrt{r}}.$$
The joint p.d.f. of $T$ and $U$ is
$$f_{T,U}(t,u) = f_{Z,V}\Big(\frac{\sqrt{u}\,t}{\sqrt{r}}, u\Big)\Big|\frac{\sqrt{u}}{\sqrt{r}}\Big| = \frac{1}{\sqrt{2\pi}} e^{-\frac{1}{2}\frac{u t^2}{r}}\,\frac{1}{\Gamma(\frac{r}{2})2^{r/2}}\, u^{\frac{r}{2}-1} e^{-\frac{u}{2}}\,\frac{\sqrt{u}}{\sqrt{r}}$$
$$= \frac{1}{\sqrt{2\pi}\,\Gamma(\frac{r}{2})2^{r/2}\sqrt{r}}\, e^{-\frac{1}{2}u(1+\frac{t^2}{r})}\, u^{\frac{r+1}{2}-1},\quad -\infty < t < \infty,\ u > 0.$$
The p.d.f. of $T$ is
$$f_T(t) = \frac{1}{\sqrt{2\pi}\,\Gamma(\frac{r}{2})2^{r/2}\sqrt{r}} \int_0^\infty u^{\frac{r+1}{2}-1} e^{-\frac{1}{2}u(1+\frac{t^2}{r})}\, du$$
(Let $h = u\big(1+\frac{t^2}{r}\big)$, so $u = \frac{h}{1+\frac{t^2}{r}}$ and $\frac{du}{dh} = \frac{1}{1+\frac{t^2}{r}}$.)
$$= \frac{1}{\sqrt{2\pi}\,\Gamma(\frac{r}{2})2^{r/2}\sqrt{r}} \int_0^\infty \frac{h^{\frac{r+1}{2}-1}}{(1+\frac{t^2}{r})^{\frac{r+1}{2}-1}}\, e^{-\frac{h}{2}}\,\frac{1}{1+\frac{t^2}{r}}\, dh$$
$$= \frac{\Gamma(\frac{r+1}{2})2^{\frac{r+1}{2}}}{\sqrt{2\pi}\,\Gamma(\frac{r}{2})2^{r/2}\sqrt{r}}\,\frac{1}{(1+\frac{t^2}{r})^{\frac{r+1}{2}}} \int_0^\infty \frac{1}{\Gamma(\frac{r+1}{2})2^{\frac{r+1}{2}}}\, h^{\frac{r+1}{2}-1} e^{-\frac{h}{2}}\, dh$$
$$= \frac{\Gamma(\frac{r+1}{2})}{\sqrt{\pi r}\,\Gamma(\frac{r}{2})\,(1+\frac{t^2}{r})^{\frac{r+1}{2}}},\quad -\infty < t < \infty.$$
Recall: $X \sim \mathrm{Gamma}(\alpha,\beta)$ if $f(x) = \frac{1}{\Gamma(\alpha)\beta^\alpha}\, x^{\alpha-1} e^{-x/\beta}$, $x > 0$.
If $\beta = 2$ and we let $\alpha = \frac{r}{2}$, then $X \sim \chi^2(r)$.
If $X \sim \chi^2(r)$, then $f(x) = \frac{1}{\Gamma(\frac{r}{2})2^{r/2}}\, x^{\frac{r}{2}-1} e^{-\frac{x}{2}}$, $x > 0$.
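The derived density can be checked at a known special case: $r = 1$ gives the standard Cauchy density $1/(\pi(1+t^2))$. A small sketch (function name is ours):

```python
import math

def t_pdf(t, r):
    """Density of t(r) from the derivation above:
    Gamma((r+1)/2) / (sqrt(pi*r) * Gamma(r/2) * (1 + t^2/r)^((r+1)/2))."""
    c = math.gamma((r + 1) / 2) / (math.sqrt(math.pi * r) * math.gamma(r / 2))
    return c * (1 + t * t / r) ** (-(r + 1) / 2)

# r = 1 is the standard Cauchy, whose density is 1 / (pi * (1 + t^2)).
for t in [0.0, 0.5, 2.0]:
    assert abs(t_pdf(t, 1) - 1 / (math.pi * (1 + t * t))) < 1e-12
```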
(b) Weak Law of Large Numbers (WLLN)
If $X_1,\dots,X_n$ are iid random variables with mean $\mu$ and variance $\sigma^2 < \infty$, then
$$\bar{X} \overset{P}{\longrightarrow} \mu.$$
Thm. If $Y_n \overset{P}{\longrightarrow} a$, then $g(Y_n) \overset{P}{\longrightarrow} g(a)$ for any continuous function $g$.
Thm. (Slutsky's theorem) If $X_n \overset{d}{\longrightarrow} X$ and $Y_n \overset{P}{\longrightarrow} a$, then
$$X_n \pm Y_n \overset{d}{\longrightarrow} X \pm a,\qquad X_n Y_n \overset{d}{\longrightarrow} aX,\qquad \frac{X_n}{Y_n} \overset{d}{\longrightarrow} \frac{X}{a}\ \text{ if } a \neq 0.$$
Concept of C.I.:
(a) Suppose that $P(X \in A) = 0.9$. If we observe $X$ many times, obtaining $x_1,\dots,x_n$ for large $n$, then about $0.9 \times n$ of the observations $x_i$ satisfy $x_i \in A$.
(b) If $(t_1(X_1,\dots,X_n),\, t_2(X_1,\dots,X_n))$ is a 90% C.I. for $\theta$, and we observe the interval $(t_1, t_2)$ many times (say $n$ times), then about $0.9 \times n$ of the observed intervals $(t_1, t_2)$ contain $\theta$.
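Point (b) can be illustrated by simulation. The sketch below (our own illustration, not from the notes) builds many 90% intervals for the mean of $N(\mu, 1)$ with $\sigma$ known, using $z_{0.05} = 1.645$, and counts how often they cover $\mu$; the coverage comes out close to 0.90.

```python
import random
import statistics

random.seed(1)
mu, n, reps = 5.0, 50, 2000
cover = 0
for _ in range(reps):
    xs = [random.gauss(mu, 1) for _ in range(n)]
    xbar = statistics.fmean(xs)
    half = 1.645 / n ** 0.5          # z_{0.05} * sigma / sqrt(n), sigma = 1
    if xbar - half <= mu <= xbar + half:
        cover += 1
print(cover / reps)  # close to 0.90
```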
Note:
(a) The normal approximation by the CLT can be applied to any distribution $f(x,\theta)$, normal or not, when $n$ is large ($n \ge 30$).
Approximate C.I.:
(1) Let $X_1,\dots,X_n$ be a random sample from $f(x)$ with mean $\mu$ and variance $\sigma^2$, where $f$ may or may not be normal. Then
$$\frac{\bar{X}-\mu}{s/\sqrt{n}} = \frac{\bar{X}-\mu}{\sigma/\sqrt{n}}\cdot\frac{\sigma}{s} \overset{d}{\longrightarrow} N(0,1)\cdot 1 = N(0,1)\quad\text{by Slutsky's theorem.}$$
We have
$$1-\alpha = P(-z_{\alpha/2} \le Z \le z_{\alpha/2}) \simeq P\Big(-z_{\alpha/2} \le \frac{\bar{X}-\mu}{s/\sqrt{n}} \le z_{\alpha/2}\Big) = P\Big(\bar{X}-z_{\alpha/2}\frac{s}{\sqrt{n}} \le \mu \le \bar{X}+z_{\alpha/2}\frac{s}{\sqrt{n}}\Big).$$
(2) Let $Y \sim b(n,p)$. If $X_1,\dots,X_n$ are iid Bernoulli($p$), then $Y \overset{d}{=} \sum_{i=1}^n X_i$. Let $\hat{p} = \frac{Y}{n}$. We have
$$\hat{p} = \frac{Y}{n} \overset{d}{=} \frac{1}{n}\sum_{i=1}^n X_i = \bar{X} \overset{P}{\longrightarrow} p\quad\text{by the WLLN,}$$
$$\frac{\hat{p}-p}{\sqrt{\frac{p(1-p)}{n}}} = \frac{\bar{X}-p}{\frac{\sqrt{p(1-p)}}{\sqrt{n}}} \overset{d}{\longrightarrow} N(0,1)\quad\text{by the CLT.}$$
Then
$$\frac{\hat{p}-p}{\sqrt{\frac{\hat{p}(1-\hat{p})}{n}}} = \frac{\hat{p}-p}{\sqrt{\frac{p(1-p)}{n}}}\cdot\sqrt{\frac{p(1-p)}{\hat{p}(1-\hat{p})}} \overset{d}{\longrightarrow} N(0,1)\cdot 1 = N(0,1)\quad\text{by Slutsky's theorem.}$$
We have
$$1-\alpha = P(-z_{\alpha/2}\le Z\le z_{\alpha/2}) \simeq P\Big(-z_{\alpha/2} \le \frac{\hat{p}-p}{\sqrt{\frac{\hat{p}(1-\hat{p})}{n}}} \le z_{\alpha/2}\Big) = P\Big(\hat{p}-z_{\alpha/2}\sqrt{\frac{\hat{p}(1-\hat{p})}{n}} \le p \le \hat{p}+z_{\alpha/2}\sqrt{\frac{\hat{p}(1-\hat{p})}{n}}\Big).$$
Here, $\Big(\hat{p}-z_{\alpha/2}\sqrt{\frac{\hat{p}(1-\hat{p})}{n}},\ \hat{p}+z_{\alpha/2}\sqrt{\frac{\hat{p}(1-\hat{p})}{n}}\Big)$ is an approximate $100(1-\alpha)\%$ C.I. for $p$.
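A minimal sketch of the interval just derived, on hypothetical data (40 successes in 100 trials; $z_{0.025} = 1.96$). The function name is ours.

```python
import math

def approx_ci_p(y, n, z=1.96):
    """Approximate 100(1-alpha)% C.I. for p from the derivation above:
    p_hat +/- z_{alpha/2} * sqrt(p_hat * (1 - p_hat) / n)."""
    p_hat = y / n
    half = z * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - half, p_hat + half

lo, hi = approx_ci_p(40, 100)
print(round(lo, 3), round(hi, 3))  # → 0.304 0.496
```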
C.I. for $\sigma^2$ or $\sigma$:
Let $X_1,\dots,X_n$ be a random sample from $N(\mu,\sigma^2)$ where $\mu$ and $\sigma^2$ are unknown. We have
$$\frac{(n-1)s^2}{\sigma^2} \sim \chi^2(n-1),\quad\text{where } s^2 = \frac{1}{n-1}\sum_{i=1}^n (X_i-\bar{X})^2.$$
Let $\chi^2_{\alpha/2}$ and $\chi^2_{1-\alpha/2}$ satisfy $\frac{\alpha}{2} = P(\chi^2(n-1) \le \chi^2_{\alpha/2}) = P(\chi^2(n-1) \ge \chi^2_{1-\alpha/2})$.
Then
$$1-\alpha = P\big(\chi^2_{\alpha/2} \le \chi^2(n-1) \le \chi^2_{1-\alpha/2}\big) = P\Big(\chi^2_{\alpha/2} \le \frac{(n-1)s^2}{\sigma^2} \le \chi^2_{1-\alpha/2}\Big) = P\Big(\frac{(n-1)s^2}{\chi^2_{1-\alpha/2}} \le \sigma^2 \le \frac{(n-1)s^2}{\chi^2_{\alpha/2}}\Big).$$
Here, $\Big(\frac{(n-1)s^2}{\chi^2_{1-\alpha/2}},\ \frac{(n-1)s^2}{\chi^2_{\alpha/2}}\Big)$ is a $100(1-\alpha)\%$ C.I. for $\sigma^2$.
Since everything is positive,
$$1-\alpha = P\Big(\sqrt{\frac{(n-1)s^2}{\chi^2_{1-\alpha/2}}} \le \sigma \le \sqrt{\frac{(n-1)s^2}{\chi^2_{\alpha/2}}}\Big).$$
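A small sketch of the $\sigma^2$ interval. The quantiles are passed in by hand; the numbers below are chi-square table values for $n-1 = 9$ and $\alpha = 0.05$ ($\chi^2_{0.025}(9) = 2.700$, $\chi^2_{0.975}(9) = 19.023$), and the sample values are hypothetical.

```python
def ci_sigma2(s2, n, chi2_lo, chi2_hi):
    """C.I. for sigma^2: ((n-1)s^2 / chi2_{1-a/2}, (n-1)s^2 / chi2_{a/2}).
    chi2_lo and chi2_hi are looked-up table quantiles (an assumption here)."""
    return (n - 1) * s2 / chi2_hi, (n - 1) * s2 / chi2_lo

# n = 10 observations with s^2 = 4.0
lo, hi = ci_sigma2(4.0, 10, 2.700, 19.023)
print(round(lo, 2), round(hi, 2))  # → 1.89 13.33
```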
C.I. for $\mu$:
(a) $X_1,\dots,X_n \overset{\text{iid}}{\sim} N(\mu,\sigma_0^2)$ where $\sigma_0^2$ is known.
Pivotal quantity: $Z = \dfrac{\bar{X}-\mu}{\sigma_0/\sqrt{n}} \sim N(0,1)$.
(b) $X_1,\dots,X_n \overset{\text{iid}}{\sim} N(\mu,\sigma^2)$.
Pivotal quantity: $T = \dfrac{\bar{X}-\mu}{s/\sqrt{n}} \sim t(n-1)$.
(c) $X_1,\dots,X_n \overset{\text{iid}}{\sim} f(x)$ with mean $\mu$, $n$ large ($n \ge 30$).
Pivotal quantity: $\dfrac{\bar{X}-\mu}{s/\sqrt{n}} \simeq N(0,1)$.
C.I. for $\sigma^2$:
$X_1,\dots,X_n \overset{\text{iid}}{\sim} N(\mu,\sigma^2)$.
Pivotal quantity: $\dfrac{(n-1)s^2}{\sigma^2} \sim \chi^2(n-1)$.
Confidence Interval for Difference of Means:
Case I:
$$X_1,\dots,X_n \overset{\text{iid}}{\sim} N(\mu_x,\sigma_x^2),\qquad Y_1,\dots,Y_m \overset{\text{iid}}{\sim} N(\mu_y,\sigma_y^2),\quad\text{independent, with } \sigma_x^2, \sigma_y^2 \text{ known.}$$
$$\Rightarrow \bar{X} \sim N\Big(\mu_x, \frac{\sigma_x^2}{n}\Big),\quad \bar{Y} \sim N\Big(\mu_y, \frac{\sigma_y^2}{m}\Big),\ \text{independent}$$
$$\Rightarrow \bar{X}-\bar{Y} \sim N\Big(\mu_x-\mu_y,\ \frac{\sigma_x^2}{n}+\frac{\sigma_y^2}{m}\Big)$$
$$\Rightarrow Z = \frac{(\bar{X}-\bar{Y})-(\mu_x-\mu_y)}{\sqrt{\frac{\sigma_x^2}{n}+\frac{\sigma_y^2}{m}}} \sim N(0,1).$$
So,
$$1-\alpha = P(-z_{\alpha/2}\le Z\le z_{\alpha/2}) = P\Big(\bar{X}-\bar{Y}-z_{\alpha/2}\sqrt{\frac{\sigma_x^2}{n}+\frac{\sigma_y^2}{m}} \le \mu_x-\mu_y \le \bar{X}-\bar{Y}+z_{\alpha/2}\sqrt{\frac{\sigma_x^2}{n}+\frac{\sigma_y^2}{m}}\Big).$$
A $100(1-\alpha)\%$ C.I. for $\mu_x-\mu_y$ is
$$\Big(\bar{X}-\bar{Y}-z_{\alpha/2}\sqrt{\frac{\sigma_x^2}{n}+\frac{\sigma_y^2}{m}},\ \bar{X}-\bar{Y}+z_{\alpha/2}\sqrt{\frac{\sigma_x^2}{n}+\frac{\sigma_y^2}{m}}\Big).$$
In general, for $X_1,\dots,X_n \overset{\text{iid}}{\sim} f(x,\theta)$, $Q = h(X_1,\dots,X_n,\theta)$ is a pivotal quantity if it has a distribution free of the parameter $\theta$:
$$\exists\, a, b \text{ s.t. } 1-\alpha = P_\theta(a \le Q \le b),\ \forall\theta.$$
We want a pivotal quantity with
$$1-\alpha = P(a \le Q \le b) = P(t_1(x_1,\dots,x_n) \le \theta \le t_2(x_1,\dots,x_n)).$$
Case II:
Variances are unknown but equal.
$$X_1,\dots,X_n \overset{\text{iid}}{\sim} N(\mu_x,\sigma^2),\qquad Y_1,\dots,Y_m \overset{\text{iid}}{\sim} N(\mu_y,\sigma^2),\quad\text{independent, } \sigma^2 \text{ unknown.}$$
$$\Rightarrow \bar{X}\sim N\Big(\mu_x,\frac{\sigma^2}{n}\Big),\quad \frac{(n-1)s_x^2}{\sigma^2}\sim\chi^2(n-1),\quad \bar{Y}\sim N\Big(\mu_y,\frac{\sigma^2}{m}\Big),\quad \frac{(m-1)s_y^2}{\sigma^2}\sim\chi^2(m-1),\quad\text{all independent,}$$
where $s_x^2 = \frac{1}{n-1}\sum_{i=1}^n (X_i-\bar{X})^2$ and $s_y^2 = \frac{1}{m-1}\sum_{i=1}^m (Y_i-\bar{Y})^2$.
$$\Rightarrow \bar{X}-\bar{Y}\sim N\Big(\mu_x-\mu_y,\ \sigma^2\Big(\frac{1}{n}+\frac{1}{m}\Big)\Big)\ \text{ and }\ \frac{(n-1)s_x^2+(m-1)s_y^2}{\sigma^2}\sim\chi^2(n+m-2),\ \text{independent.}$$
Then
$$T = \frac{\dfrac{(\bar{X}-\bar{Y})-(\mu_x-\mu_y)}{\sigma\sqrt{\frac{1}{n}+\frac{1}{m}}}}{\sqrt{\dfrac{(n-1)s_x^2+(m-1)s_y^2}{\sigma^2(n+m-2)}}} = \frac{(\bar{X}-\bar{Y})-(\mu_x-\mu_y)}{\sqrt{\frac{(n-1)s_x^2+(m-1)s_y^2}{n+m-2}}\sqrt{\frac{1}{n}+\frac{1}{m}}} \sim t(n+m-2).$$
We have
$$1-\alpha = P(-t_{\alpha/2}\le T\le t_{\alpha/2}) = P\Big(\bar{X}-\bar{Y}-t_{\alpha/2}\sqrt{\tfrac{(n-1)s_x^2+(m-1)s_y^2}{n+m-2}}\sqrt{\tfrac{1}{n}+\tfrac{1}{m}} \le \mu_x-\mu_y \le \bar{X}-\bar{Y}+t_{\alpha/2}\sqrt{\tfrac{(n-1)s_x^2+(m-1)s_y^2}{n+m-2}}\sqrt{\tfrac{1}{n}+\tfrac{1}{m}}\Big).$$
A $100(1-\alpha)\%$ C.I. for $\mu_x-\mu_y$ is
$$\Big(\bar{X}-\bar{Y}-t_{\alpha/2}\sqrt{\tfrac{(n-1)s_x^2+(m-1)s_y^2}{n+m-2}}\sqrt{\tfrac{1}{n}+\tfrac{1}{m}},\ \bar{X}-\bar{Y}+t_{\alpha/2}\sqrt{\tfrac{(n-1)s_x^2+(m-1)s_y^2}{n+m-2}}\sqrt{\tfrac{1}{n}+\tfrac{1}{m}}\Big).$$
Example:
$n = 10$, $m = 7$, $\bar{X} = 4.2$, $\bar{Y} = 3.4$, $(n-1)s_x^2 = 490$, $(m-1)s_y^2 = 224$, $t_{0.05}(15) = 1.753$.
A 90% C.I. for $\mu_x-\mu_y$ is
$$\Big(4.2-3.4-1.753\sqrt{\tfrac{490+224}{15}}\sqrt{\tfrac{1}{10}+\tfrac{1}{7}},\ 4.2-3.4+1.753\sqrt{\tfrac{490+224}{15}}\sqrt{\tfrac{1}{10}+\tfrac{1}{7}}\Big) = (-5.16,\ 6.76).$$
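The worked example can be reproduced with a few lines of arithmetic (function name is ours; the critical value $t_{0.05}(15) = 1.753$ is taken from the example).

```python
import math

def pooled_t_ci(xbar, ybar, ssx, ssy, n, m, t_crit):
    """Two-sample pooled-variance C.I. from the formula above.
    ssx = (n-1)*s_x^2 and ssy = (m-1)*s_y^2; t_crit = t_{alpha/2}(n+m-2)."""
    sp2 = (ssx + ssy) / (n + m - 2)                      # pooled variance
    half = t_crit * math.sqrt(sp2) * math.sqrt(1 / n + 1 / m)
    d = xbar - ybar
    return d - half, d + half

lo, hi = pooled_t_ci(4.2, 3.4, 490, 224, 10, 7, 1.753)
print(round(lo, 2), round(hi, 2))  # → -5.16 6.76
```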
Def. The null hypothesis, denoted by $H_0$, is a hypothesis that we reject only if the data strongly indicate that it is not true. The alternative hypothesis, denoted by $H_1$, is the hypothesis alternative to the null hypothesis.
Def. A test is a rule deciding whether or not to reject the null hypothesis. Usually, a test specifies a subset $C$ of the sample space of the random sample $X_1,\dots,X_n$: we reject $H_0$ if the observation $(x_1,\dots,x_n)$ falls in $C$ and do not reject $H_0$ otherwise. This subset $C$ is called the critical region or the rejection region.
For any test, there are two possible errors that may occur:
Type I error: $H_0$ is true but we reject $H_0$.
Type II error: $H_1$ is true but we do not reject $H_0$.
Def. The power function $\pi_C(\theta)$ of a critical region $C$ is the probability of rejecting $H_0$ when $\theta$ is true.
Let $X_1,\dots,X_n$ be a random sample and consider a test with critical region $C$. The power function is
$$\pi_C(\theta) = P(\text{reject } H_0 : \theta) = P\big((X_1,\dots,X_n)' \in C : \theta\big).$$
Example:
$X$ = score of a test $\sim N(\theta, 100)$. Past experience indicates $\theta = 75$. We want to test $H_0: \theta = 75$ vs. $H_1: \theta > 75$.
sol: Let $X_1,\dots,X_{25}$ be a random sample from $N(\theta, 100)$ and consider the critical region
$$C_1 = \{\bar{x} > 75\} = \Big\{(x_1,\dots,x_{25})' : \frac{1}{25}\sum_{i=1}^{25} x_i > 75\Big\}.$$
The power function is
$$\pi_{C_1}(\theta) = P(\bar{X} > 75 : \theta) = P\Big(\frac{\bar{X}-\theta}{2} > \frac{75-\theta}{2}\Big) = P\Big(Z > \frac{75-\theta}{2}\Big) = 1 - P\Big(Z \le \frac{75-\theta}{2}\Big),$$
where $\bar{X} \sim N(\theta, \frac{100}{25}) = N(\theta, 4)$.
$$\pi_{C_1}(75) = 0.5,\qquad \pi_{C_1}(77) = 0.841,\qquad \pi_{C_1}(79) = 0.977.$$
If we choose the critical region $C_2 = \{\bar{x} > 78\}$, the power function is $\pi_{C_2}(\theta) = P(\bar{X} > 78 : \theta) = 1 - P\big(Z \le \frac{78-\theta}{2}\big)$, with $\pi_{C_2}(75) = 0.067$.
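The three power values for $C_1$ can be recomputed from the standard normal c.d.f., which is available through the error function (function names are ours):

```python
import math

def norm_cdf(z):
    """Standard normal c.d.f. via the error function."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

def power_C1(theta):
    """pi_{C1}(theta) = P(Xbar > 75 : theta), with Xbar ~ N(theta, 4)."""
    return 1 - norm_cdf((75 - theta) / 2)

print(round(power_C1(75), 3), round(power_C1(77), 3), round(power_C1(79), 3))
# → 0.5 0.841 0.977
```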
Def. The size of a test with critical region $C$ is the maximum probability of a type I error, i.e. $\text{size} = \max_{\theta\in H_0} \pi_C(\theta)$.
The rule for choosing a critical region is to fix a significance level $\alpha$ and, among the class of tests with size $\le \alpha$, find the test that minimizes the probability of a type II error. (Usually we let $\alpha = 0.01$ or $0.05$.)
Likelihood function:
$$L(\theta, x_1,\dots,x_n) = \prod_{i=1}^n f(x_i,\theta) \ \Rightarrow\ \text{a function of } \theta.$$
A ratio of two likelihoods,
$$\frac{L(\theta_0, x_1,\dots,x_n)}{L(\theta_1, x_1,\dots,x_n)},$$
is called the likelihood ratio. We will derive the MP test through this likelihood ratio.
In the proof, for critical regions $C$ and $A$,
$$\int_C L(\theta_1) - \int_A L(\theta_1) = \int_{C\cap A^c} L(\theta_1) - \int_{A\cap C^c} L(\theta_1).$$
For $(x_1,\dots,x_n)' \in C$: $\dfrac{L(\theta_0)}{L(\theta_1)} \le k \Rightarrow L(\theta_1) \ge \dfrac{1}{k}L(\theta_0)$.
For $(x_1,\dots,x_n)' \in C^c$: $\dfrac{L(\theta_0)}{L(\theta_1)} \ge k \Rightarrow L(\theta_1) \le \dfrac{1}{k}L(\theta_0)$.
$$\int_C L(\theta_1) - \int_A L(\theta_1) \ge \frac{1}{k}\int_{C\cap A^c} L(\theta_0) - \frac{1}{k}\int_{A\cap C^c} L(\theta_0)$$
$$= \frac{1}{k}\Big[\int_{C\cap A^c} L(\theta_0) + \int_{C\cap A} L(\theta_0) - \Big(\int_{A\cap C^c} L(\theta_0) + \int_{A\cap C} L(\theta_0)\Big)\Big]$$
$$= \frac{1}{k}\Big[\int_C L(\theta_0) - \int_A L(\theta_0)\Big] \ge 0.$$
If the likelihood-ratio inequality can be rewritten in terms of a statistic $u(x_1,\dots,x_n)$ and we know the distribution of $u(X_1,\dots,X_n)$, then the test with critical region
$$C = \{(x_1,\dots,x_n)' : u(x_1,\dots,x_n) \le c\}$$
s.t. $\alpha = P(u(X_1,\dots,X_n) \le c : \theta = \theta_0)$ is an MP test with significance level $\alpha$.
Example:
$X_1,\dots,X_n \overset{\text{iid}}{\sim} N(\theta,1)$. Consider the simple hypothesis $H_0: \theta = 0$ vs. $H_1: \theta = 1$. We want the MP test with significance level $\alpha$.
sol: The likelihood function is
$$L(\theta, x_1,\dots,x_n) = \prod_{i=1}^n \frac{1}{\sqrt{2\pi}} e^{-\frac{(x_i-\theta)^2}{2}} = (2\pi)^{-\frac{n}{2}} e^{-\frac{1}{2}\sum_{i=1}^n (x_i-\theta)^2} = (2\pi)^{-\frac{n}{2}} e^{-\frac{1}{2}[\sum x_i^2 - 2\theta\sum x_i + n\theta^2]}.$$
$$\frac{L(\theta_0,x_1,\dots,x_n)}{L(\theta_1,x_1,\dots,x_n)} = \frac{(2\pi)^{-\frac{n}{2}} e^{-\frac{1}{2}[\sum x_i^2 - 2\theta_0\sum x_i + n\theta_0^2]}}{(2\pi)^{-\frac{n}{2}} e^{-\frac{1}{2}[\sum x_i^2 - 2\theta_1\sum x_i + n\theta_1^2]}} = e^{-\sum_{i=1}^n x_i + \frac{n}{2}} \le k\quad\text{(with } \theta_0 = 0,\ \theta_1 = 1\text{)}$$
$$\Leftrightarrow -\sum_{i=1}^n x_i + \frac{n}{2} \le \ln k \Leftrightarrow \sum_{i=1}^n x_i \ge \frac{n}{2} - \ln k \Leftrightarrow \bar{x} \ge \frac{1}{n}\Big(\frac{n}{2}-\ln k\Big) = c.$$
The MP critical region is $C = \{\bar{x} \ge c\}$, with $c$ s.t.
$$\alpha = P(\text{type I error}) = P(\text{reject } H_0 : H_0) = P(\bar{X}\ge c : \theta = 0) = P\Big(\frac{\bar{X}-0}{1/\sqrt{n}} \ge \frac{c-0}{1/\sqrt{n}} : \theta = 0\Big) = P\Big(Z \ge \frac{c}{1/\sqrt{n}}\Big).$$
$$\Rightarrow \frac{c}{1/\sqrt{n}} = z_\alpha \Rightarrow c = \frac{z_\alpha}{\sqrt{n}}.$$
The MP critical region with significance level $\alpha$ is $C = \{\bar{x} \ge \frac{z_\alpha}{\sqrt{n}}\}$.
Example: $X_1,\dots,X_n \overset{\text{iid}}{\sim}$ Poisson($\lambda$), $H_0: \lambda = 10$ vs. $H_1: \lambda = 1$.
sol: The likelihood function is
$$L(\lambda,x_1,\dots,x_n) = \prod_{i=1}^n \frac{\lambda^{x_i}e^{-\lambda}}{x_i!} = \frac{\lambda^{\sum x_i} e^{-n\lambda}}{\prod_{i=1}^n x_i!}.$$
$$\frac{L(\lambda=10,x_1,\dots,x_n)}{L(\lambda=1,x_1,\dots,x_n)} = \frac{10^{\sum x_i}e^{-10n}/\prod_{i=1}^n x_i!}{1^{\sum x_i}e^{-n}/\prod_{i=1}^n x_i!} = 10^{\sum x_i} e^{-9n} \le k$$
$$\Leftrightarrow \Big(\sum x_i\Big)\ln 10 - 9n \le \ln k \Leftrightarrow \sum x_i \le \frac{1}{\ln 10}(\ln k + 9n) = c.$$
The MP critical region with significance level $\alpha$ is $C = \{\sum_{i=1}^n x_i \le c\}$, where $c$ satisfies
$$\alpha = P\Big(Y = \sum_{i=1}^n X_i \le c : \lambda = 10\Big) = \sum_{y=0}^{c} \frac{(10n)^y e^{-10n}}{y!}.$$
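Because $Y$ is discrete, one usually takes the largest integer $c$ whose cumulative probability does not exceed $\alpha$ (so the attained size is at most $\alpha$). A sketch of that search for $n = 1$, $\alpha = 0.05$ (our own illustration):

```python
import math

def poisson_cdf(c, mean):
    """P(Y <= c) for Y ~ Poisson(mean), summed term by term."""
    return sum(mean ** y * math.exp(-mean) / math.factorial(y)
               for y in range(c + 1))

def mp_cutoff(n, alpha=0.05, lam0=10):
    """Largest integer c with P(sum X_i <= c : lambda=10) <= alpha,
    where sum X_i ~ Poisson(10n) under H0."""
    c = 0
    while poisson_cdf(c + 1, lam0 * n) <= alpha:
        c += 1
    return c

print(mp_cutoff(n=1))  # → 4, since P(Y <= 4) ≈ 0.029 and P(Y <= 5) ≈ 0.067
```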
sol: With $f_1(x) = \frac{e^{-1}}{x!}$ (Poisson(1)) and $f_2(x) = (\frac{1}{2})^{x-1}\frac{1}{2}$,
$$\frac{L(f_1, x_1,\dots,x_n)}{L(f_2, x_1,\dots,x_n)} = \frac{\prod_{i=1}^n \frac{e^{-1}}{x_i!}}{\prod_{i=1}^n (\frac{1}{2})^{x_i-1}\frac{1}{2}} = \frac{e^{-n}\, 2^{\sum_{i=1}^n x_i}}{\prod_{i=1}^n x_i!} \le k$$
$$\Leftrightarrow \Big(\sum_{i=1}^n x_i\Big)\ln 2 - \ln\Big(\prod_{i=1}^n x_i!\Big) \le c,$$
where $c$ satisfies
$$\alpha = P\Big(\Big(\sum_{i=1}^n X_i\Big)\ln 2 - \ln\Big(\prod_{i=1}^n X_i!\Big) \le c : f = f_1\Big).$$
(b) $H_0: \theta = \theta_0$ vs. $H_1: \theta < \theta_0$ (one-sided hypothesis).
Usually a UMP test does not exist for the two-sided hypothesis $H_0: \theta = \theta_0$ vs. $H_1: \theta \ne \theta_0$.
Example:
$X_1,\dots,X_n \overset{\text{iid}}{\sim} N(\mu,1)$, $H_0: \mu = 0$ vs. $H_1: \mu > 0$. Let $\mu_1 > 0$. Consider the simple hypothesis $H_0: \mu = 0$ vs. $H_1: \mu = \mu_1$.
sol: By the Neyman–Pearson theorem,
$$\frac{L(\mu=0,x_1,\dots,x_n)}{L(\mu=\mu_1,x_1,\dots,x_n)} = \frac{\prod_{i=1}^n \frac{1}{\sqrt{2\pi}}e^{-\frac{x_i^2}{2}}}{\prod_{i=1}^n \frac{1}{\sqrt{2\pi}}e^{-\frac{(x_i-\mu_1)^2}{2}}} = \frac{e^{-\frac{1}{2}\sum x_i^2}}{e^{-\frac{1}{2}[\sum x_i^2 - 2\mu_1\sum x_i + n\mu_1^2]}} = e^{-\mu_1\sum x_i + \frac{n}{2}\mu_1^2} \le k$$
$$\Leftrightarrow -\mu_1\sum_{i=1}^n x_i + \frac{n}{2}\mu_1^2 \le \ln k \Leftrightarrow -\mu_1\sum_{i=1}^n x_i \le \ln k - \frac{n}{2}\mu_1^2,\quad\text{where } -\mu_1 < 0 \text{ for } \mu_1 > 0,$$
$$\Leftrightarrow \sum_{i=1}^n x_i \ge -\frac{1}{\mu_1}\Big(\ln k - \frac{n}{2}\mu_1^2\Big) \Leftrightarrow \sqrt{n}\,\bar{x} \ge c.$$
Under $H_0$,
$$\alpha = P(\sqrt{n}\,\bar{X} \ge c : H_0) = P(Z \ge c) \Rightarrow c = z_\alpha.$$
⇒ The MP critical region for $H_0: \mu = 0$ vs. $H_1: \mu = \mu_1$ is
$$C = \{\sqrt{n}\,\bar{x} \ge z_\alpha\}.$$
This holds for every $\mu = \mu_1 > 0$. Hence $C = \{\sqrt{n}\,\bar{x} \ge z_\alpha\}$ is the UMP critical region with significance level $\alpha$ for $H_0: \mu = 0$ vs. $H_1: \mu > 0$.
Example:
$X_1,\dots,X_n \overset{\text{iid}}{\sim} N(0,\sigma^2)$, $H_0: \sigma^2 = 1$ vs. $H_1: \sigma^2 < 1$. Let $\theta < 1$. Consider the simple hypothesis $H_0: \sigma^2 = 1$ vs. $H_1: \sigma^2 = \theta$.
sol:
$$\frac{L(\sigma^2=1,x_1,\dots,x_n)}{L(\sigma^2=\theta,x_1,\dots,x_n)} = \frac{\prod_{i=1}^n \frac{1}{\sqrt{2\pi}}e^{-\frac{x_i^2}{2}}}{\prod_{i=1}^n \frac{1}{\sqrt{2\pi\theta}}e^{-\frac{x_i^2}{2\theta}}} = \frac{e^{-\frac{1}{2}\sum x_i^2}}{\theta^{-\frac{n}{2}}e^{-\frac{1}{2\theta}\sum x_i^2}} = \theta^{\frac{n}{2}}\, e^{-\frac{1}{2}(1-\frac{1}{\theta})\sum x_i^2} \le k$$
$$\Leftrightarrow -\frac{1}{2}\Big(1-\frac{1}{\theta}\Big)\sum_{i=1}^n x_i^2 \le \ln(\theta^{-\frac{n}{2}}k),\quad\text{where } -\frac{1}{2}\Big(1-\frac{1}{\theta}\Big) > 0 \text{ for } \theta < 1,$$
$$\Leftrightarrow \sum_{i=1}^n x_i^2 \le \frac{1}{-\frac{1}{2}(1-\frac{1}{\theta})}\ln(\theta^{-\frac{n}{2}}k) = c.$$
Since $\sum_{i=1}^n X_i^2 \overset{H_0}{\sim} \chi^2(n)$, the MP critical region for $H_0: \sigma^2 = 1$ vs. $H_1: \sigma^2 = \theta$ with significance level $\alpha$ is
$$C = \Big\{\sum_{i=1}^n x_i^2 \le \chi^2_{1-\alpha}(n)\Big\}.$$
This holds for every $\sigma^2 = \theta < 1$. Hence $C$ is the UMP critical region for $H_0: \sigma^2 = 1$ vs. $H_1: \sigma^2 < 1$.
Example:
$X_1,\dots,X_n \overset{\text{iid}}{\sim}$ Poisson($\lambda$). Hypothesis $H_0: \lambda = 1$ vs. $H_1: \lambda \ne 1$.
Show that the UMP test with significance level $\alpha$ does not exist. We need to find $\lambda_1$ and $\lambda_2$ such that the MP tests for $H_0: \lambda = 1$ vs. $H_1: \lambda = \lambda_1$ and for $H_0: \lambda = 1$ vs. $H_1: \lambda = \lambda_2$ are different.
sol: Let $\lambda_0 \ne 1$. Consider the hypothesis $H_0: \lambda = 1$ vs. $H_1: \lambda = \lambda_0$. The likelihood function is
$$L(\lambda,x_1,\dots,x_n) = \prod_{i=1}^n \frac{\lambda^{x_i}e^{-\lambda}}{x_i!} = \frac{\lambda^{\sum x_i}e^{-n\lambda}}{\prod_{i=1}^n x_i!}.$$
By the Neyman–Pearson theorem,
$$\frac{L(\lambda=1,x_1,\dots,x_n)}{L(\lambda=\lambda_0,x_1,\dots,x_n)} = \frac{e^{-n}}{\lambda_0^{\sum x_i} e^{-n\lambda_0}} \le k \Leftrightarrow \lambda_0^{-\sum_{i=1}^n x_i} \le k e^{n-n\lambda_0} \Leftrightarrow \sum_{i=1}^n x_i(-\ln\lambda_0) \le \ln(ke^{n-n\lambda_0}).$$
If $\lambda_0 < 1$, then $\ln\lambda_0 < 0$, so dividing by $-\ln\lambda_0 > 0$ gives $\sum x_i \le -\frac{1}{\ln\lambda_0}\ln(ke^{n-n\lambda_0}) = k_2$.
⇒ The MP critical region for $H_0: \lambda = 1$ vs. $H_1: \lambda = \lambda_0$ with $\lambda_0 < 1$ is
$$C_2 = \Big\{\sum_{i=1}^n x_i \le k_2\Big\}\ \text{ s.t. }\ \alpha = P\Big(\sum X_i \le k_2 : \lambda = 1\Big).$$
If instead $\lambda_0 > 1$, then $-\ln\lambda_0 < 0$ and the same steps give a critical region of the form $C_1 = \{\sum_{i=1}^n x_i \ge k_1\}$.
∵ $C_1 \ne C_2$ ∴ the UMP test does not exist.
There are several types of hypotheses that we need to treat in different ways:
$A_3$: $H_0: \theta = \theta_0$ vs. $H_1: \theta \ne \theta_0$
⇒ the UMP test generally does not exist.
$A_4$: Suppose that we have the composite hypothesis
$H_0: \theta \le \theta_0$ vs. $H_1: \theta > \theta_0$, or
$H_0: \theta \ge \theta_0$ vs. $H_1: \theta < \theta_0$ (one-sided test)
⇒ the UMP test exists for some special distributions.
$A_5$: General hypothesis
$H_0: \theta \in \Theta_0$ vs. $H_1: \theta \notin \Theta_0$
⇒ if there is no UMP test, we consider the likelihood ratio test.
Suppose for $\theta' < \theta''$ the likelihood ratio $\frac{L(\theta')}{L(\theta'')} = h(T)$ depends on the data only through a statistic $T$ (monotone likelihood ratio, MLR).
If $h$ is ↗ in $T$: larger $T$ ⇒ $\theta'$ (small) is more plausible; smaller $T$ ⇒ $\theta''$ (large) is more plausible.
If $H_1: \theta < \theta_0$, $C = \{T \ge t_0\}$;
If $H_1: \theta > \theta_0$, $C = \{T \le t_0\}$.
If $h$ is ↘ in $T$: larger $T$ ⇒ $\theta''$ (large) is more plausible; smaller $T$ ⇒ $\theta'$ (small) is more plausible.
If $H_1: \theta < \theta_0$, $C = \{T \le t_0\}$;
If $H_1: \theta > \theta_0$, $C = \{T \ge t_0\}$.
We will say that the test following this rule is the UMP test when a monotone likelihood ratio exists.
Note: When there is a monotone likelihood ratio (MLR), the UMP test has a monotone power function. If $H_0: \theta \le \theta_0$, $\pi_C(\theta)$ is ↗; if $H_0: \theta \ge \theta_0$, $\pi_C(\theta)$ is ↘.
The cutoff $t_0$ is then set so that the size is attained at the boundary $\theta_0$, e.g.
$$\alpha = \sup_{\theta\le\theta_0} P(T \le t_0 : \theta) = P(T \le t_0 : \theta_0)\quad\text{or}\quad \alpha = \sup_{\theta\ge\theta_0} P(T \ge t_0 : \theta) = P(T \ge t_0 : \theta_0).$$
Example:
$X_1,\dots,X_n \overset{\text{iid}}{\sim} U(0,\theta)$, $\theta > 0$. $H_0: \theta \le \theta_0$ vs. $H_1: \theta > \theta_0$.
sol: $f(x,\theta) = \frac{1}{\theta}I(0 < x < \theta)$. The likelihood function is
$$L(\theta,x_1,\dots,x_n) = \prod_{i=1}^n \frac{1}{\theta}I(0<x_i<\theta) = \frac{1}{\theta^n}\prod_{i=1}^n I(0<x_i<\theta) = \frac{1}{\theta^n}I(0<y_n<\theta),$$
where $Y_n = \max\{X_1,\dots,X_n\}$.
Let $\theta' < \theta''$. The LR is
$$\frac{L(\theta',x_1,\dots,x_n)}{L(\theta'',x_1,\dots,x_n)} = \frac{\frac{1}{(\theta')^n}I(0<y_n<\theta')}{\frac{1}{(\theta'')^n}I(0<y_n<\theta'')} = \Big(\frac{\theta''}{\theta'}\Big)^n\frac{I(0<y_n<\theta')}{I(0<y_n<\theta'')},$$
which is ↘ in $y_n$, so $C = \{y_n \ge c\}$. Under $\theta_0$,
$$f_{Y_n}(y) = n(F(y))^{n-1}f(y) = n\Big(\frac{y}{\theta_0}\Big)^{n-1}\frac{1}{\theta_0} = \frac{n y^{n-1}}{\theta_0^n},\quad 0 < y < \theta_0,$$
$$\alpha = P(Y_n \ge c : \theta_0) = \int_c^{\theta_0}\frac{n y^{n-1}}{\theta_0^n}\,dy = 1 - \frac{c^n}{\theta_0^n} \Rightarrow c = \theta_0(1-\alpha)^{\frac{1}{n}}.$$
The UMP critical region with significance level $\alpha$ is $C = \{y_n \ge \theta_0(1-\alpha)^{1/n}\}$.
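The cutoff has a closed form, so it is easy to verify that the resulting size is exactly $\alpha$: $P(Y_n \ge c : \theta_0) = 1 - (c/\theta_0)^n$. A short sketch with hypothetical numbers:

```python
def ump_cutoff(theta0, n, alpha):
    """c = theta0 * (1 - alpha)^(1/n), from the derivation above."""
    return theta0 * (1 - alpha) ** (1 / n)

theta0, n, alpha = 2.0, 8, 0.05
c = ump_cutoff(theta0, n, alpha)
size = 1 - (c / theta0) ** n          # P(Y_n >= c) under theta = theta0
assert abs(size - alpha) < 1e-12
print(round(c, 4))
```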
Example:
$X_1,\dots,X_n \overset{\text{iid}}{\sim} N(\mu,1)$. $H_0: \mu \ge 0$ vs. $H_1: \mu < 0$. We want the UMP test.
sol: Let $\mu_1 < \mu_2$.
$$\frac{L(\mu_1,x_1,\dots,x_n)}{L(\mu_2,x_1,\dots,x_n)} = \frac{(2\pi)^{-\frac{n}{2}}e^{-\frac{1}{2}\sum(x_i-\mu_1)^2}}{(2\pi)^{-\frac{n}{2}}e^{-\frac{1}{2}\sum(x_i-\mu_2)^2}} = \frac{e^{-\frac{1}{2}[\sum x_i^2-2\mu_1\sum x_i+n\mu_1^2]}}{e^{-\frac{1}{2}[\sum x_i^2-2\mu_2\sum x_i+n\mu_2^2]}} = e^{-\frac{1}{2}[2(\mu_2-\mu_1)\sum x_i + n(\mu_1^2-\mu_2^2)]},$$
which is ↘ in $\sum x_i$, so by the MLR rule with $H_1: \mu < \mu_0 = 0$ the UMP critical region has the form $C = \{\sum x_i \le c\}$.
$$\alpha = P\Big(\sum_{i=1}^n X_i \le c : \mu = 0\Big) = P\Big(Z \le \frac{c}{\sqrt{n}}\Big) \Rightarrow \frac{c}{\sqrt{n}} = -z_\alpha \Rightarrow c = -\sqrt{n}\,z_\alpha.$$
So the UMP critical region with significance level $\alpha$ is
$$C = \Big\{\sum_{i=1}^n x_i \le -\sqrt{n}\,z_\alpha\Big\} = \Big\{\bar{x} \le \frac{-z_\alpha}{\sqrt{n}}\Big\}.$$
$X_1,\dots,X_n \overset{\text{iid}}{\sim} N(\mu,\sigma_0^2)$ where $\sigma_0$ is known.
but it often is.
$H_0: \theta = \theta_0$ vs. $H_1: \theta = \theta_1$
⇒ an MP test exists by the Neyman–Pearson theorem.
$H_0: \theta = \theta_0$ vs. $H_1: \theta > \theta_0$, or $H_0: \theta = \theta_0$ vs. $H_1: \theta < \theta_0$
⇒ a UMP test often exists by the Neyman–Pearson theorem.
$H_0: \theta \le \theta_0$ vs. $H_1: \theta > \theta_0$, or $H_0: \theta \ge \theta_0$ vs. $H_1: \theta < \theta_0$
⇒ one-sided hypothesis + MLR ⇒ a UMP test exists.
The critical region of the likelihood ratio test is
$$C = \Big\{\lambda = \lambda(x_1,\dots,x_n) = \frac{\max_{\theta\in\Theta_0} L(\theta,x_1,\dots,x_n)}{\max_{\theta\in\Theta} L(\theta,x_1,\dots,x_n)} \le \lambda_0\Big\}.$$
Example:
$X_1,\dots,X_n$ iid with p.d.f. $f(x,\theta) = \theta e^{-\theta x}$, $x > 0$, $\theta > 0$. $H_0: \theta \le \theta_0$ vs. $H_1: \theta > \theta_0$.
sol:
$$L(\theta,x_1,\dots,x_n) = \prod_{i=1}^n \theta e^{-\theta x_i} = \theta^n e^{-\theta\sum_{i=1}^n x_i},\qquad \ln L(\theta,x_1,\dots,x_n) = n\ln\theta - \theta\sum_{i=1}^n x_i,$$
$$\frac{\partial\ln L(\theta)}{\partial\theta} = \frac{n}{\theta} - \sum_{i=1}^n x_i = 0 \Rightarrow \text{m.l.e. } \hat{\theta} = \frac{n}{\sum_{i=1}^n x_i} = \frac{1}{\bar{x}}.$$
So $\max_{\theta\in\Theta} L(\theta,x_1,\dots,x_n) = L(\hat{\theta},x_1,\dots,x_n) = \big(\frac{1}{\bar{x}}\big)^n e^{-n}$.
Since $L(\theta,x_1,\dots,x_n)$ achieves its maximum at $\theta = \frac{1}{\bar{x}}$,
$$\max_{\theta\le\theta_0} L(\theta,x_1,\dots,x_n) = \max_{\theta\le\theta_0}\theta^n e^{-\theta\sum x_i} = \begin{cases} (\frac{1}{\bar{x}})^n e^{-n} & \text{if } \theta_0 > \frac{1}{\bar{x}} \\ \theta_0^n e^{-\theta_0\sum x_i} & \text{if } \theta_0 < \frac{1}{\bar{x}} \end{cases}$$
So
$$\lambda = \lambda(x_1,\dots,x_n) = \frac{\max_{0<\theta\le\theta_0} L(\theta)}{\max_{\theta>0} L(\theta)} = \begin{cases} 1 & \text{if } \bar{x} > \frac{1}{\theta_0} \\ \dfrac{\theta_0^n e^{-\theta_0\sum x_i}}{(1/\bar{x})^n e^{-n}} = (\theta_0\bar{x})^n e^{-n(\theta_0\bar{x}-1)} & \text{if } \bar{x} \le \frac{1}{\theta_0} \end{cases}$$
which is ↗ in $\bar{x}$, so $\lambda \le \lambda_0 \Leftrightarrow \bar{x} \le c$, and
$$\alpha = P(\bar{X} \le c : \theta_0) = P\Big(\sum_{i=1}^n X_i \le nc : \theta_0\Big) = P\Big(\frac{2\sum X_i}{1/\theta_0} \le \frac{2nc}{1/\theta_0} : \theta_0\Big) = P(\chi^2(2n) \le 2\theta_0 nc)$$
$$\Rightarrow \chi^2_\alpha = 2\theta_0 nc \Rightarrow c = \frac{\chi^2_\alpha}{2n\theta_0}.$$
Example:
$X_1,\dots,X_n \overset{\text{iid}}{\sim} N(\theta,1)$, $H_0: \theta = 0$ vs. $H_1: \theta \ne 0$.
There is no UMP test for this hypothesis. We want the LRT.
sol:
$$L(\theta,x_1,\dots,x_n) = \prod_{i=1}^n \frac{1}{\sqrt{2\pi}}e^{-\frac{(x_i-\theta)^2}{2}} = (2\pi)^{-\frac{n}{2}}e^{-\frac{1}{2}\sum_{i=1}^n(x_i-\theta)^2},$$
$$\frac{\partial\ln L(\theta,x_1,\dots,x_n)}{\partial\theta} = 0 \Rightarrow \text{m.l.e. } \hat{\theta} = \bar{x}.$$
Under $H_0$, $\bar{X} \sim N(0,\frac{1}{n})$, so $\frac{\bar{X}}{1/\sqrt{n}} = \sqrt{n}\,\bar{X} \overset{H_0}{\sim} N(0,1)$.
$$\alpha = P(|\sqrt{n}\,\bar{x}| \ge c : \theta = 0) = P(|Z|\ge c) = 1-P(|Z|\le c) \Rightarrow P(|Z|\le c) = 1-\alpha \Rightarrow c = z_{\alpha/2}.$$
Example:
$X_1,\dots,X_n \overset{\text{iid}}{\sim} N(\theta_1,\theta_2)$, $-\infty<\theta_1<\infty$, $\theta_2>0$.
$H_0: \theta_1 = 0$ vs. $H_1: \theta_1 \ne 0$, i.e. $H_0: \theta_1 = 0, \theta_2 > 0$ vs. $H_1: \theta_1 \ne 0, \theta_2 > 0$.
sol:
$$L(\theta_1,\theta_2,x_1,\dots,x_n) = (2\pi)^{-\frac{n}{2}}\theta_2^{-\frac{n}{2}}e^{-\frac{\sum(x_i-\theta_1)^2}{2\theta_2}}.$$
Restricted maximum:
$$\max_{\theta_1=0,\theta_2>0} L = \max_{\theta_2>0}(2\pi)^{-\frac{n}{2}}\theta_2^{-\frac{n}{2}}e^{-\frac{\sum x_i^2}{2\theta_2}},\qquad \ln L(0,\theta_2,\cdot) = -\frac{n}{2}\ln 2\pi - \frac{n}{2}\ln\theta_2 - \frac{\sum x_i^2}{2\theta_2},$$
$$\frac{\partial\ln L(0,\theta_2,\cdot)}{\partial\theta_2} = -\frac{n}{2\theta_2}+\frac{\sum x_i^2}{2\theta_2^2} = 0 \Rightarrow \hat{\theta}_2 = \frac{1}{n}\sum x_i^2,$$
$$\max_{\theta_1=0,\theta_2>0} L = L\Big(0,\frac{1}{n}\sum x_i^2,\cdot\Big) = (2\pi)^{-\frac{n}{2}}\Big(\frac{1}{n}\sum x_i^2\Big)^{-\frac{n}{2}}e^{-\frac{n}{2}}.$$
Unrestricted maximum:
$$\frac{\partial\ln L}{\partial\theta_1} = \frac{1}{\theta_2}\sum(x_i-\theta_1) = 0 \Rightarrow \hat{\theta}_1 = \bar{x},\qquad \frac{\partial\ln L(\bar{x},\theta_2,\cdot)}{\partial\theta_2} = -\frac{n}{2\theta_2}+\frac{1}{2\theta_2^2}\sum(x_i-\bar{x})^2 = 0 \Rightarrow \hat{\theta}_2 = \frac{1}{n}\sum(x_i-\bar{x})^2,$$
$$\max_{\theta_1\in\mathbb{R},\theta_2>0} L = (2\pi)^{-\frac{n}{2}}\Big(\frac{1}{n}\sum(x_i-\bar{x})^2\Big)^{-\frac{n}{2}}e^{-\frac{n}{2}}.$$
$$\Rightarrow \lambda = \frac{\max_{\theta_1=0,\theta_2>0}L}{\max_{\theta_1\in\mathbb{R},\theta_2>0}L} = \frac{(\frac{1}{n}\sum x_i^2)^{-\frac{n}{2}}e^{-\frac{n}{2}}}{(\frac{1}{n}\sum(x_i-\bar{x})^2)^{-\frac{n}{2}}e^{-\frac{n}{2}}} = \Big(\frac{\sum(x_i-\bar{x})^2}{\sum x_i^2}\Big)^{\frac{n}{2}} \le \lambda_0$$
$$\Leftrightarrow \frac{\sum(x_i-\bar{x})^2}{\sum x_i^2} = \frac{\sum(x_i-\bar{x})^2}{\sum(x_i-\bar{x})^2+n\bar{x}^2} = \frac{1}{1+\frac{n\bar{x}^2}{\sum(x_i-\bar{x})^2}} \le \lambda_0^{\frac{2}{n}},$$
which holds exactly when $\frac{n\bar{x}^2}{\sum(x_i-\bar{x})^2}$ is large, i.e. when $\frac{|\bar{x}|}{s/\sqrt{n}}$ is large (since $\sum(x_i-\bar{x})^2 = (n-1)s^2$).
$$\alpha = P\Big(\frac{|\bar{x}|}{s/\sqrt{n}} > c^* : \theta_1 = 0\Big) = P(|T| > c^*),\quad c^* = t_{\alpha/2},\qquad C = \Big\{\frac{|\bar{x}|}{s/\sqrt{n}} > t_{\alpha/2}\Big\}.$$
F-distribution:
We denote $F \sim f(r_1,r_2)$. Since $\frac{1}{F} \sim f(r_2,r_1)$,
$$\alpha = P(F \le f_\alpha(r_1,r_2)) = P\Big(\frac{1}{F} \ge \frac{1}{f_\alpha(r_1,r_2)}\Big) = 1 - P\Big(\frac{1}{F} \le \frac{1}{f_\alpha(r_1,r_2)}\Big)$$
$$\Rightarrow 1-\alpha = P\Big(\frac{1}{F} \le \frac{1}{f_\alpha(r_1,r_2)}\Big) \Rightarrow \frac{1}{f_\alpha(r_1,r_2)} = f_{1-\alpha}(r_2,r_1) \Rightarrow f_\alpha(r_1,r_2) = \frac{1}{f_{1-\alpha}(r_2,r_1)}.$$
Ratio of Variances $\frac{\sigma_x^2}{\sigma_y^2}$ or $\frac{\sigma_y^2}{\sigma_x^2}$:
$$X_1,\dots,X_n \overset{\text{iid}}{\sim} N(\mu_x,\sigma_x^2),\qquad Y_1,\dots,Y_m \overset{\text{iid}}{\sim} N(\mu_y,\sigma_y^2),\quad\text{independent.}$$
$$\Rightarrow \frac{(n-1)s_x^2}{\sigma_x^2}\sim\chi^2(n-1)\ \text{ and }\ \frac{(m-1)s_y^2}{\sigma_y^2}\sim\chi^2(m-1),\ \text{independent,}$$
where $s_x^2 = \frac{1}{n-1}\sum_{i=1}^n(x_i-\bar{x})^2$ and $s_y^2 = \frac{1}{m-1}\sum_{i=1}^m(y_i-\bar{y})^2$.
$$F = \frac{\frac{(m-1)s_y^2}{\sigma_y^2}/(m-1)}{\frac{(n-1)s_x^2}{\sigma_x^2}/(n-1)} = \frac{\sigma_x^2 s_y^2}{\sigma_y^2 s_x^2} \sim f(m-1,n-1).$$
C.I. for $\frac{\sigma_x^2}{\sigma_y^2}$:
Let $a$, $b$ satisfy
$$\frac{\alpha}{2} = P(f(m-1,n-1)\le a)\quad\text{and}\quad 1-\frac{\alpha}{2} = P(f(m-1,n-1)\le b)$$
$$\Rightarrow a = \frac{1}{f_{1-\alpha/2}(n-1,m-1)},\qquad b = f_{1-\alpha/2}(m-1,n-1).$$
So,
$$1-\alpha = P\big(f_{\alpha/2}(m-1,n-1)\le F\le f_{1-\alpha/2}(m-1,n-1)\big) = P\Big(\frac{1}{f_{1-\alpha/2}(n-1,m-1)}\le\frac{\sigma_x^2 s_y^2}{\sigma_y^2 s_x^2}\le f_{1-\alpha/2}(m-1,n-1)\Big)$$
$$= P\Big(\frac{1}{f_{1-\alpha/2}(n-1,m-1)}\,\frac{s_x^2}{s_y^2} \le \frac{\sigma_x^2}{\sigma_y^2} \le f_{1-\alpha/2}(m-1,n-1)\,\frac{s_x^2}{s_y^2}\Big).$$
Hence, a $100(1-\alpha)\%$ C.I. for $\frac{\sigma_x^2}{\sigma_y^2}$ is
$$\Big(\frac{1}{f_{1-\alpha/2}(n-1,m-1)}\,\frac{s_x^2}{s_y^2},\ f_{1-\alpha/2}(m-1,n-1)\,\frac{s_x^2}{s_y^2}\Big).$$
Since $F = \frac{\sigma_x^2 s_y^2}{\sigma_y^2 s_x^2} \sim f(m-1,n-1)$, we can also test equality of variances. For the LRT of $H_0: \sigma_1^2 = \sigma_2^2$, the log-likelihood of the two samples is
$$\ln L(\mu_1,\mu_2,\sigma_1^2,\sigma_2^2) = -\frac{n+m}{2}\ln(2\pi)-\frac{n}{2}\ln\sigma_1^2-\frac{m}{2}\ln\sigma_2^2-\frac{\sum_{i=1}^n(x_i-\mu_1)^2}{2\sigma_1^2}-\frac{\sum_{i=1}^m(y_i-\mu_2)^2}{2\sigma_2^2}.$$
Under $H_0$ (with $\sigma^2 = \sigma_1^2 = \sigma_2^2$):
$$\ln L(\mu_1,\mu_2,\sigma^2) = -\frac{n+m}{2}\ln(2\pi)-\frac{n+m}{2}\ln\sigma^2-\frac{1}{2\sigma^2}\Big[\sum_{i=1}^n(x_i-\mu_1)^2+\sum_{i=1}^m(y_i-\mu_2)^2\Big].$$
$$\frac{\partial\ln L}{\partial\mu_1} = \frac{1}{\sigma^2}\sum(x_i-\mu_1) = 0 \Rightarrow \hat{\mu}_1 = \bar{x},\qquad \frac{\partial\ln L}{\partial\mu_2} = \frac{1}{\sigma^2}\sum(y_i-\mu_2) = 0 \Rightarrow \hat{\mu}_2 = \bar{y},$$
$$\frac{\partial\ln L(\bar{x},\bar{y},\sigma^2)}{\partial\sigma^2} = -\frac{n+m}{2\sigma^2}+\frac{1}{2\sigma^4}\Big[\sum_{i=1}^n(x_i-\bar{x})^2+\sum_{i=1}^m(y_i-\bar{y})^2\Big] = 0$$
$$\Rightarrow \hat{\sigma}^2 = \frac{1}{n+m}\Big[\sum_{i=1}^n(x_i-\bar{x})^2+\sum_{i=1}^m(y_i-\bar{y})^2\Big],$$
$$\sup_{\Theta_0} L = (2\pi)^{-\frac{n+m}{2}}\Big(\frac{1}{n+m}\Big[\sum_{i=1}^n(x_i-\bar{x})^2+\sum_{i=1}^m(y_i-\bar{y})^2\Big]\Big)^{-\frac{n+m}{2}}e^{-\frac{n+m}{2}}.$$
Unrestricted:
$$\hat{\mu}_1 = \bar{x},\ \hat{\sigma}_1^2 = \frac{1}{n}\sum_{i=1}^n(x_i-\bar{x})^2,\qquad \hat{\mu}_2 = \bar{y},\ \hat{\sigma}_2^2 = \frac{1}{m}\sum_{i=1}^m(y_i-\bar{y})^2,$$
$$\sup_{\Theta} L = (2\pi)^{-\frac{n+m}{2}}\Big(\frac{1}{n}\sum_{i=1}^n(x_i-\bar{x})^2\Big)^{-\frac{n}{2}}\Big(\frac{1}{m}\sum_{i=1}^m(y_i-\bar{y})^2\Big)^{-\frac{m}{2}}e^{-\frac{n+m}{2}}.$$
Likelihood ratio:
$$\lambda = \frac{(n+m)^{\frac{n+m}{2}}\big(\sum(x_i-\bar{x})^2+\sum(y_i-\bar{y})^2\big)^{-\frac{n+m}{2}}}{n^{\frac{n}{2}}m^{\frac{m}{2}}\big(\sum(x_i-\bar{x})^2\big)^{-\frac{n}{2}}\big(\sum(y_i-\bar{y})^2\big)^{-\frac{m}{2}}} \le \lambda_0$$
$$\Leftrightarrow \frac{\big(\sum(x_i-\bar{x})^2\big)^{\frac{n}{2}}\big(\sum(y_i-\bar{y})^2\big)^{\frac{m}{2}}}{\big(\sum(x_i-\bar{x})^2+\sum(y_i-\bar{y})^2\big)^{\frac{n+m}{2}}} \le \lambda_1$$
$$\Leftrightarrow \frac{1}{\Big(1+\frac{\sum(y_i-\bar{y})^2}{\sum(x_i-\bar{x})^2}\Big)^{\frac{n}{2}}}\cdot\frac{1}{\Big(1+\frac{\sum(x_i-\bar{x})^2}{\sum(y_i-\bar{y})^2}\Big)^{\frac{m}{2}}} \le \lambda_1.$$
∵ $\dfrac{1}{(1+x)^{\frac{n}{2}}(1+\frac{1}{x})^{\frac{m}{2}}} \to 0$ as $x \to \infty$ or $x \to 0$,
∴ the region has the form
$$\frac{\sum_{i=1}^m(y_i-\bar{y})^2}{\sum_{i=1}^n(x_i-\bar{x})^2} \le c_1\ \text{ or }\ \ge c_2$$
$$\Rightarrow F = \frac{\sum_{i=1}^m(Y_i-\bar{Y})^2/(m-1)}{\sum_{i=1}^n(X_i-\bar{X})^2/(n-1)} \le f_{\alpha/2}(m-1,n-1)\ \text{ or }\ \ge f_{1-\alpha/2}(m-1,n-1).$$
$H_0: \mu_1 = \mu_2$ vs. $H_1: \mu_1 \ne \mu_2$
(1) Suppose it is known that $\sigma_1 = \sigma_2 = \sigma$. We have
$$X_1,\dots,X_n \overset{\text{iid}}{\sim} N(\mu_1,\sigma^2),\qquad Y_1,\dots,Y_m \overset{\text{iid}}{\sim} N(\mu_2,\sigma^2),\quad\text{independent,}$$
$$\bar{X}\sim N\Big(\mu_1,\frac{\sigma^2}{n}\Big),\quad \frac{(n-1)s_x^2}{\sigma^2}\sim\chi^2(n-1),\quad \bar{Y}\sim N\Big(\mu_2,\frac{\sigma^2}{m}\Big),\quad \frac{(m-1)s_y^2}{\sigma^2}\sim\chi^2(m-1),\quad\text{independent,}$$
where $s_x^2 = \frac{1}{n-1}\sum_{i=1}^n(x_i-\bar{x})^2$ and $s_y^2 = \frac{1}{m-1}\sum_{i=1}^m(y_i-\bar{y})^2$.
$$\bar{X}-\bar{Y}\sim N\Big(\mu_1-\mu_2,\ \sigma^2\Big(\frac{1}{n}+\frac{1}{m}\Big)\Big)\ \text{ and }\ \frac{(n-1)s_x^2+(m-1)s_y^2}{\sigma^2}\sim\chi^2(n+m-2),\ \text{independent,}$$
$$T = \frac{\dfrac{\bar{X}-\bar{Y}-(\mu_1-\mu_2)}{\sigma\sqrt{\frac{1}{n}+\frac{1}{m}}}}{\sqrt{\dfrac{(n-1)s_x^2+(m-1)s_y^2}{\sigma^2(n+m-2)}}} = \frac{\bar{X}-\bar{Y}-(\mu_1-\mu_2)}{\sqrt{\frac{(n-1)s_x^2+(m-1)s_y^2}{n+m-2}}\sqrt{\frac{1}{n}+\frac{1}{m}}} \sim t(n+m-2).$$
And
$$T = \frac{\bar{X}-\bar{Y}}{\sqrt{\frac{(n-1)s_x^2+(m-1)s_y^2}{n+m-2}}\sqrt{\frac{1}{n}+\frac{1}{m}}} \overset{H_0}{\sim} t(n+m-2).$$
Reject $H_0$ if
$$|T| = \frac{|\bar{X}-\bar{Y}|}{\sqrt{\frac{(n-1)s_x^2+(m-1)s_y^2}{n+m-2}}\sqrt{\frac{1}{n}+\frac{1}{m}}} > t_{\alpha/2}(n+m-2).$$
Test for a proportion: with $Y_1,\dots,Y_n$ iid Bernoulli($p$) and $Y = \sum_{i=1}^n Y_i$,
$$\hat{p} = \frac{Y}{n} = \frac{\sum_{i=1}^n Y_i}{n} \overset{P}{\longrightarrow} p\quad\text{by the WLLN,}$$
$$\frac{\hat{p}-p}{\sqrt{\frac{p(1-p)}{n}}} = \frac{\frac{1}{n}\sum_{i=1}^n Y_i - p}{\sqrt{\frac{p(1-p)}{n}}} \overset{d}{\longrightarrow} N(0,1)\quad\text{by the CLT,}$$
$$\Rightarrow \frac{\hat{p}-p}{\sqrt{\frac{\hat{p}(1-\hat{p})}{n}}} = \frac{\hat{p}-p}{\sqrt{\frac{p(1-p)}{n}}}\sqrt{\frac{p(1-p)}{\hat{p}(1-\hat{p})}} \overset{d}{\longrightarrow} 1\cdot N(0,1) = N(0,1)\quad\text{by Slutsky's theorem.}$$
Under $H_0: p = p_0$,
$$\frac{\hat{p}-p_0}{\sqrt{\frac{\hat{p}(1-\hat{p})}{n}}} \overset{d}{\longrightarrow} N(0,1),\qquad \alpha = P(|Z|\ge z_{\alpha/2}) \simeq P\Big(\frac{|\hat{p}-p_0|}{\sqrt{\frac{\hat{p}(1-\hat{p})}{n}}}\ge z_{\alpha/2} : H_0\Big).$$
Then an approximate test with significance level $\alpha$ rejects $H_0$ if $\dfrac{\hat{p}-p_0}{\sqrt{\hat{p}(1-\hat{p})/n}} \ge z_{\alpha/2}$ or $\le -z_{\alpha/2}$.
Table of approximate tests for $p$:

Hypothesis                                      Critical region
$H_0: p = p_0$ vs. $H_1: p \ne p_0$             $\dfrac{\hat{p}-p_0}{\sqrt{\hat{p}(1-\hat{p})/n}} \ge z_{\alpha/2}$ or $\le -z_{\alpha/2}$
$H_0: p = p_0$ vs. $H_1: p > p_0$               $\dfrac{\hat{p}-p_0}{\sqrt{\hat{p}(1-\hat{p})/n}} \ge z_\alpha$
$H_0: p = p_0$ vs. $H_1: p < p_0$               $\dfrac{\hat{p}-p_0}{\sqrt{\hat{p}(1-\hat{p})/n}} \le -z_\alpha$
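The two-sided row of the table can be sketched on hypothetical data (62 successes in 100 trials, $H_0: p = 0.5$, $z_{0.025} = 1.96$):

```python
import math

def prop_z_stat(y, n, p0):
    """Test statistic (p_hat - p0) / sqrt(p_hat*(1-p_hat)/n) from the table."""
    p_hat = y / n
    return (p_hat - p0) / math.sqrt(p_hat * (1 - p_hat) / n)

z = prop_z_stat(62, 100, 0.5)
print(round(z, 2), abs(z) >= 1.96)  # reject H0 at alpha = 0.05 when |z| >= 1.96
```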
For two independent samples with $\hat{p}_1 \overset{d}{\to}$ based on $n$ trials and $\hat{p}_2$ on $m$ trials,
$$\hat{p}_1-\hat{p}_2 \overset{d}{\approx} N\Big(p_1-p_2,\ \frac{p_1(1-p_1)}{n}+\frac{p_2(1-p_2)}{m}\Big),\quad\text{or}\quad \frac{(\hat{p}_1-\hat{p}_2)-(p_1-p_2)}{\sqrt{\frac{p_1(1-p_1)}{n}+\frac{p_2(1-p_2)}{m}}} \approx N(0,1).$$
Since $\hat{p}_1 \overset{P}{\longrightarrow} p_1$ and $\hat{p}_2 \overset{P}{\longrightarrow} p_2$, we have
$$\frac{(\hat{p}_1-\hat{p}_2)-(p_1-p_2)}{\sqrt{\frac{\hat{p}_1(1-\hat{p}_1)}{n}+\frac{\hat{p}_2(1-\hat{p}_2)}{m}}} \approx N(0,1).$$
We further have, under $H_0: p_1 = p_2$,
$$\frac{\hat{p}_1-\hat{p}_2}{\sqrt{\frac{\hat{p}_1(1-\hat{p}_1)}{n}+\frac{\hat{p}_2(1-\hat{p}_2)}{m}}} \overset{H_0}{\approx} N(0,1),$$
so
$$\alpha \approx P\Big(\frac{\hat{p}_1-\hat{p}_2}{\sqrt{\frac{\hat{p}_1(1-\hat{p}_1)}{n}+\frac{\hat{p}_2(1-\hat{p}_2)}{m}}} \ge z_{\alpha/2}\ \text{or}\ \le -z_{\alpha/2} : H_0\Big).$$
Bivariate Normal Distribution:
Note:
(b) For each function $f$ with $f(x) \ge 0$ and $\int_{-\infty}^\infty f(x)\,dx = 1$, there exists an r.v. $X$ such that $f$ is the p.d.f. of $X$. On the other hand, if $f(x,y)$ satisfies $f(x,y) \ge 0$ and $\int_{-\infty}^\infty\int_{-\infty}^\infty f(x,y)\,dx\,dy = 1$, there exist r.v.'s $X$ and $Y$ such that $f(x,y)$ is the joint p.d.f. of $X, Y$.
Consider the function
$$f(x,y) = \frac{1}{2\pi\sigma_1\sigma_2\sqrt{1-\rho^2}}\exp\Big\{-\frac{1}{2}\frac{1}{1-\rho^2}\Big[\frac{(x-\mu_1)^2}{\sigma_1^2}-2\rho\frac{(x-\mu_1)(y-\mu_2)}{\sigma_1\sigma_2}+\frac{(y-\mu_2)^2}{\sigma_2^2}\Big]\Big\},\quad -\infty<x<\infty,\ -\infty<y<\infty.$$
We want to show that $\int_{-\infty}^\infty\int_{-\infty}^\infty f(x,y)\,dx\,dy = \int_{-\infty}^\infty\big(\int_{-\infty}^\infty f(x,y)\,dy\big)\,dx = 1$.
$\mathrm{Var}(Y|x) = E[(Y-E(Y|x))^2|x]$.
The conditional p.d.f. of $Y$ given $X = x$ is
$$f_{Y|x}(y) = \frac{f(x,y)}{f_X(x)} = \frac{\frac{1}{\sqrt{2\pi}\sigma_1}e^{-\frac{(x-\mu_1)^2}{2\sigma_1^2}}\,\frac{1}{\sqrt{2\pi}\sigma_2\sqrt{1-\rho^2}}e^{-\frac{(y-(\mu_2+\rho\frac{\sigma_2}{\sigma_1}(x-\mu_1)))^2}{2\sigma_2^2(1-\rho^2)}}}{\frac{1}{\sqrt{2\pi}\sigma_1}e^{-\frac{(x-\mu_1)^2}{2\sigma_1^2}}} = \frac{1}{\sqrt{2\pi}\sigma_2\sqrt{1-\rho^2}}e^{-\frac{(y-(\mu_2+\rho\frac{\sigma_2}{\sigma_1}(x-\mu_1)))^2}{2\sigma_2^2(1-\rho^2)}},\quad y\in\mathbb{R}$$
$$\Rightarrow Y|x \sim N\Big(\mu_2+\rho\frac{\sigma_2}{\sigma_1}(x-\mu_1),\ \sigma_2^2(1-\rho^2)\Big).$$
And
$$X|y \sim N\Big(\mu_1+\rho\frac{\sigma_1}{\sigma_2}(y-\mu_2),\ \sigma_1^2(1-\rho^2)\Big).$$
So,
$$Y|x = \mu_2+\rho\frac{\sigma_2}{\sigma_1}(x-\mu_1)+\epsilon,\quad \epsilon\sim N(0,\sigma_2^2(1-\rho^2))$$
$$= \mu_2-\rho\frac{\sigma_2}{\sigma_1}\mu_1+\rho\frac{\sigma_2}{\sigma_1}x+\epsilon = \beta_0+\beta_1 x+\epsilon,\quad \epsilon\sim N(0,\sigma^2).$$
This is the linear regression model.
If we have observations $\binom{x_1}{y_1},\dots,\binom{x_n}{y_n}$, then the linear regression model is
$$y_i = \beta_0+\beta_1 x_i+\epsilon_i,\quad i=1,\dots,n,$$
where $\epsilon_1,\dots,\epsilon_n$ are iid $N(0,\sigma^2)$.
The problem in linear regression is that we have a sequence of random vectors $\binom{X_1}{Y_1},\dots,\binom{X_n}{Y_n}$ with observations $\binom{x_1}{y_1},\dots,\binom{x_n}{y_n}$ (scatter plot omitted), and we believe that the observations obey the linear regression model
$$y_i = \beta_0+\beta_1 x_i+\epsilon_i,\quad i=1,\dots,n,$$
where $\epsilon_1,\dots,\epsilon_n$ are iid random variables with mean 0 and variance $\sigma^2$.
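The slope and intercept of this model are usually estimated by least squares, $\hat{\beta}_1 = S_{xy}/S_{xx}$ and $\hat{\beta}_0 = \bar{y}-\hat{\beta}_1\bar{x}$ (standard formulas, not derived in these notes). A minimal sketch:

```python
def fit_line(xs, ys):
    """Least-squares fit of y = b0 + b1*x: b1 = Sxy/Sxx, b0 = ybar - b1*xbar."""
    n = len(xs)
    xbar = sum(xs) / n
    ybar = sum(ys) / n
    sxx = sum((x - xbar) ** 2 for x in xs)
    sxy = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys))
    b1 = sxy / sxx
    b0 = ybar - b1 * xbar
    return b0, b1

# Noise-free points on y = 1 + 2x are recovered exactly.
b0, b1 = fit_line([0, 1, 2, 3], [1, 3, 5, 7])
print(round(b0, 6), round(b1, 6))  # → 1.0 2.0
```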
Chi-square Test (Goodness of Fit Test)
In developing C.I.'s for $\theta$ and hypothesis tests, most methods are derived assuming that the random sample $X_1,\dots,X_n$ is drawn from a normal distribution. An important question is: how do we know that it is really drawn from a normal distribution?
So, we may try to test the following hypothesis:
$$H_0: X_1,\dots,X_n \overset{\text{iid}}{\sim} N(\mu,\sigma^2).$$
This is a goodness-of-fit problem that can be solved by the chi-square test of Karl Pearson (father of Egon Pearson of the Neyman–Pearson theorem).
We first consider the hypothesis
$$H_0: X_1,\dots,X_n \overset{\text{iid}}{\sim} f_0(x).$$
Thm. Let $A_1,\dots,A_k$ be a partition of the sample space, $P_j = P_{f_0}(X \in A_j)$, and $N_j$ the number of $X_1,\dots,X_n$ falling in $A_j$. Let
$$Q_k = \sum_{j=1}^k \frac{(N_j-nP_j)^2}{nP_j} = \sum_{j=1}^k \frac{(\text{practical }\# - \text{theoretical }\#)^2}{\text{theoretical }\#\text{ in } A_j}.$$
We have $Q_k \overset{d}{\longrightarrow} \chi^2(k-1)$ if $H_0$ is true.
Proof. We consider $k = 2$ only.
Under $H_0$, $N_1$ (the \# of $X_1,\dots,X_n$ falling in $A_1$) $\sim b(n,P_1)$
$$\Rightarrow \frac{N_1-nP_1}{\sqrt{nP_1(1-P_1)}} \overset{d}{\longrightarrow} N(0,1)\quad\text{by the CLT.}$$
Example:
Mendelian theory: the shape and color of a pea ought to fall into four groups with probabilities as follows:

Groups                  probabilities           obs
Round and yellow        $P_1 = \frac{9}{16}$    $N_1 = 315$
Round and green         $P_2 = \frac{3}{16}$    $N_2 = 108$
Angular and yellow      $P_3 = \frac{3}{16}$    $N_3 = 101$
Angular and green       $P_4 = \frac{1}{16}$    $N_4 = 32$

With a sample of $n = 556$ peas $(x_1,\dots,x_{556})$, the counts in each group are displayed above. We want to test, at significance level $\alpha = 0.05$,
$$H_0: P_1 = \frac{9}{16},\ P_2 = \frac{3}{16},\ P_3 = \frac{3}{16},\ P_4 = \frac{1}{16}.$$
$$Q_4 = \sum_{j=1}^4 \frac{(N_j-nP_j)^2}{nP_j} = \frac{(315-556\times\frac{9}{16})^2}{556\times\frac{9}{16}}+\frac{(108-556\times\frac{3}{16})^2}{556\times\frac{3}{16}}+\frac{(101-556\times\frac{3}{16})^2}{556\times\frac{3}{16}}+\frac{(32-556\times\frac{1}{16})^2}{556\times\frac{1}{16}} = 0.47.$$
What if the hypothesis is $H_0: X \sim f(x,\theta_1,\dots,\theta_p)$, where $f$ is a known p.d.f. but $\theta_1,\dots,\theta_p$ are unknown?
Let $A_1,\dots,A_k$ be a partition of the space of $X$ and $\hat{\theta}_1,\dots,\hat{\theta}_p$ be m.l.e.'s of $\theta_1,\dots,\theta_p$. Define
$$\hat{P}_j = \int_{A_j} f(x,\hat{\theta}_1,\dots,\hat{\theta}_p)\,dx,\quad j=1,\dots,k.$$
Again, we denote by $\hat{N}_j$ the \# of $X_1,\dots,X_n$ falling in $A_j$. So, $\sum_{j=1}^k \hat{N}_j = n$.
Thm. Let
$$Q_k = \sum_{j=1}^k \frac{(\hat{N}_j-n\hat{P}_j)^2}{n\hat{P}_j}.$$
We have $Q_k \overset{d}{\longrightarrow} \chi^2(k-p-1)$ if $H_0$ is true.
Def. The chi-square test for $H_0: X \sim f(x,\theta_1,\dots,\theta_p)$ rejects $H_0$ if $Q_k \ge \chi^2_\alpha(k-p-1)$.
Example:
$X_1,\dots,X_n \overset{\text{iid}}{\sim} f(x,\theta)$. Consider the hypothesis $H_0: X \sim N(\mu,\sigma^2)$.
sol: The m.l.e.'s are $\hat{\mu} = \bar{x}$, $\hat{\sigma}^2 = \frac{1}{n}\sum_{i=1}^n(x_i-\bar{x})^2$.
Let $A_1 = (-\infty,a_1)$, $A_2 = (a_1,a_2)$, ..., $A_k = (a_{k-1},\infty)$ be a partition of the space of $N(\bar{x},s^2)$. Define
$$\hat{P}_1 = P_{N(\bar{x},s^2)}(X \le a_1) = P_{N(\bar{x},s^2)}\Big(\frac{X-\bar{x}}{s} \le \frac{a_1-\bar{x}}{s}\Big) = P\Big(Z \le \frac{a_1-\bar{x}}{s}\Big),$$
$$\hat{P}_2 = P_{N(\bar{x},s^2)}(a_1 \le X \le a_2) = P\Big(Z \le \frac{a_2-\bar{x}}{s}\Big) - P\Big(Z \le \frac{a_1-\bar{x}}{s}\Big),$$
$$\vdots$$
$$\hat{P}_j = P_{N(\bar{x},s^2)}(a_{j-1} \le X \le a_j) = P\Big(Z \le \frac{a_j-\bar{x}}{s}\Big) - P\Big(Z \le \frac{a_{j-1}-\bar{x}}{s}\Big),$$
$$\vdots$$
$$\hat{P}_k = P_{N(\bar{x},s^2)}(X \ge a_{k-1}) = P\Big(Z \ge \frac{a_{k-1}-\bar{x}}{s}\Big).$$
$$Q_k = \sum_{j=1}^k \frac{(\hat{N}_j-n\hat{P}_j)^2}{n\hat{P}_j} \overset{d}{\longrightarrow} \chi^2(k-3).$$
Reject $H_0$ if $Q_k \ge \chi^2_\alpha(k-3)$.