J&B
Chapter 8 section 5 (one proportion)
Chapter 10 section 7 (two proportions)
Notation
Text & Lecture notes
n = sample size
(number of independent Bernoulli trials)
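$\pi$ = population proportion (probability of success on each trial)
$P = X/n$ (sample proportion, with $x = 0, 1, \ldots, n$)
$p = \dfrac{0}{n}, \dfrac{1}{n}, \dfrac{2}{n}, \ldots, \dfrac{n}{n}$ (possible observed values)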
For $X \sim \mathrm{Bin}(n, \pi)$:
$E(X) = n\pi$
$\mathrm{Var}(X) = n\pi(1-\pi)$
If both $n\pi > 15$ and $n(1-\pi) > 15$, then approximately
$\dfrac{X - n\pi}{\sqrt{n\pi(1-\pi)}} \sim N(0, 1)$
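For $P = \dfrac{X}{n}$:
$E(P) = E\left(\dfrac{X}{n}\right) = \dfrac{E(X)}{n} = \dfrac{n\pi}{n} = \pi$
$\mathrm{Var}(P) = \mathrm{Var}\left(\dfrac{X}{n}\right) = \dfrac{\mathrm{Var}(X)}{n^2} = \dfrac{n\pi(1-\pi)}{n^2} = \dfrac{\pi(1-\pi)}{n}$
If both $n\pi > 15$ and $n(1-\pi) > 15$, then approximately
$P \sim N\left(\pi, \dfrac{\pi(1-\pi)}{n}\right)$
or
$\dfrac{P - \pi}{\sqrt{\pi(1-\pi)/n}} \sim N(0, 1)$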
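The mean and variance of $P$ can be checked numerically by summing over the binomial probabilities. The sketch below is illustrative only: the values n = 400 and pi = 0.15 are assumed for the check, and the helper name `binomial_pmf` is hypothetical.

```python
from math import comb

def binomial_pmf(x: int, n: int, pi: float) -> float:
    """P(X = x) for X ~ Bin(n, pi)."""
    return comb(n, x) * pi**x * (1 - pi) ** (n - x)

# Assumed illustrative values, not prescribed by the notes.
n, pi = 400, 0.15

pmf = [binomial_pmf(x, n, pi) for x in range(n + 1)]

# E(P) and Var(P) computed directly from the distribution of P = X/n.
mean_P = sum((x / n) * w for x, w in enumerate(pmf))
var_P = sum((x / n - mean_P) ** 2 * w for x, w in enumerate(pmf))

print(mean_P, pi)                 # both should agree with pi
print(var_P, pi * (1 - pi) / n)   # both should agree with pi(1-pi)/n
```

The two printed pairs should match up to floating-point rounding.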
Example (cont)
Here, we have observed the sample result
p = 76/400 = 0.19
We want to test: given that $p$ is 0.19, do
we have evidence that $\pi$ is no longer 0.15?
(i.e. has the claim rate changed from 0.15?)
Under the assumption that $\pi$ is 0.15, we
want the probability of getting a sample
proportion at least as far from 0.15 as 0.19
is (that is, $p \le 0.11$ or $p \ge 0.19$).
This is the same as getting a sample
count $X \ge 76$ or $X \le 44$,
since $0.15 \times 400 = 60$,
$76 - 60 = 16$
(we observed 16 more than we expected)
and $60 - 16 = 44$
(the same distance away in the other direction).
$\text{Prob} = \sum_{x=0}^{44} \binom{400}{x} (0.15)^x (0.85)^{400-x} + \sum_{x=76}^{400} \binom{400}{x} (0.15)^x (0.85)^{400-x}$
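The two binomial sums can be evaluated directly. This is a minimal sketch using only the standard library; the function names are hypothetical, and the cut-offs 44 and 76 are the ones derived in the example.

```python
from math import comb

def binomial_pmf(x: int, n: int, pi: float) -> float:
    """P(X = x) for X ~ Bin(n, pi)."""
    return comb(n, x) * pi**x * (1 - pi) ** (n - x)

def two_tail_prob(n: int, pi0: float, lower: int, upper: int) -> float:
    """P(X <= lower) + P(X >= upper) when X ~ Bin(n, pi0)."""
    low_tail = sum(binomial_pmf(x, n, pi0) for x in range(0, lower + 1))
    high_tail = sum(binomial_pmf(x, n, pi0) for x in range(upper, n + 1))
    return low_tail + high_tail

# Insurance example: n = 400, pi0 = 0.15, tails X <= 44 and X >= 76.
prob = two_tail_prob(400, 0.15, 44, 76)
print(round(prob, 4))
```

Statistical packages may define an exact two-sided p-value differently (for example by likelihood rather than by distance from the expected count), so the result need not match software output exactly.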
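Approximately, $\dfrac{P - \pi}{\sqrt{\pi(1-\pi)/n}} \sim N(0, 1)$
$H_0: \pi = 0.15$ versus $H_1: \pi \ne 0.15$
$\alpha = 0.05$
If $P \sim N\left(\pi_0, \dfrac{\pi_0(1-\pi_0)}{n}\right)$,
then $\dfrac{P - \pi_0}{\sqrt{\pi_0(1-\pi_0)/n}} \sim N(0, 1)$
$z_{obs} = \dfrac{p - \pi_0}{\sqrt{\pi_0(1-\pi_0)/n}}$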
Continuity Correction
See J&B
p.254
$z_{obs} = \dfrac{|p - \pi_0| - \frac{1}{2n}}{\sqrt{\pi_0(1-\pi_0)/n}}$
Allows finding the area in both tails including the observed sample $p$.
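If $p > \pi_0$: $z_{obs} = \dfrac{p - \pi_0 - \frac{1}{2n}}{\sqrt{\pi_0(1-\pi_0)/n}}$
If $p < \pi_0$: $z_{obs} = \dfrac{p - \pi_0 + \frac{1}{2n}}{\sqrt{\pi_0(1-\pi_0)/n}}$
p-value $\approx P(|Z| \ge |z_{obs}|)$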
$Z = \dfrac{|P - \pi_0| - \frac{1}{2n}}{\sqrt{\pi_0(1-\pi_0)/n}}$
$z_{obs} = \dfrac{|0.19 - 0.15| - \frac{1}{800}}{\sqrt{0.15(1-0.15)/400}} = \dfrac{0.03875}{0.0178536} \approx 2.17$
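The continuity-corrected statistic and its two-sided p-value can be reproduced with a few lines of Python. This is a sketch under the normal approximation; the function names are hypothetical, and the statistic is returned as a magnitude.

```python
from math import sqrt, erfc

def z_one_proportion(x: int, n: int, pi0: float, continuity: bool = True) -> float:
    """Magnitude of the one-sample z statistic for H0: pi = pi0."""
    p = x / n
    se0 = sqrt(pi0 * (1 - pi0) / n)      # standard error under H0
    diff = abs(p - pi0)
    if continuity:
        diff = max(diff - 1 / (2 * n), 0.0)  # continuity correction 1/(2n)
    return diff / se0

def two_sided_p_value(z: float) -> float:
    """P(|Z| >= z) for standard normal Z, using erfc(z / sqrt(2))."""
    return erfc(z / sqrt(2))

z_obs = z_one_proportion(76, 400, 0.15)
p_value = two_sided_p_value(z_obs)
print(round(z_obs, 2), round(p_value, 3))
```

Setting `continuity=False` gives the uncorrected statistic $(p - \pi_0)/\sqrt{\pi_0(1-\pi_0)/n}$ in magnitude.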
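The standard error of $P$ is $\sqrt{\dfrac{\pi(1-\pi)}{n}}$.
But $\pi$ is unknown.
To solve exactly for $\pi$, we would have to
solve a quadratic, so we estimate the standard error using $p$:
$p \pm z_{\alpha/2}\sqrt{\dfrac{p(1-p)}{n}}$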
Continuity correction:
When doing confidence intervals for $\pi$, we
don't worry about continuity correction. It is
pointless trying to improve accuracy when the
standard error is only approximated.
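CLT check: $np = 76 \ge 15$ and $n(1-p) = 324 \ge 15$
$0.19 \pm 1.96\sqrt{\dfrac{0.19(0.81)}{400}}$
where $\sqrt{\dfrac{p(1-p)}{n}} = \sqrt{\dfrac{0.19 \times 0.81}{400}} = 0.019615$
(compare 0.01785, the standard error under $\pi_0 = 0.15$)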
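The interval arithmetic can be carried out as follows. This is a minimal sketch of the approximate (Wald-type) interval described in the notes; the function name `proportion_ci` is hypothetical, and `statistics.NormalDist` supplies the normal quantile in place of the rounded 1.96.

```python
from math import sqrt
from statistics import NormalDist

def proportion_ci(x: int, n: int, conf: float = 0.95) -> tuple[float, float]:
    """Approximate two-sided CI for pi: p +/- z * sqrt(p(1-p)/n)."""
    p = x / n
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)  # about 1.96 for 95%
    se = sqrt(p * (1 - p) / n)                    # estimated standard error
    return p - z * se, p + z * se

lower, upper = proportion_ci(76, 400)
print(round(lower, 4), round(upper, 4))
```

For the insurance sample this gives roughly 0.19 plus or minus 0.038.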
For a < alternative: $\left(0,\; p + z_{\alpha}\sqrt{\dfrac{p(1-p)}{n}}\right)$
For a > alternative: $\left(p - z_{\alpha}\sqrt{\dfrac{p(1-p)}{n}},\; 1\right)$
Minitab output (normal approximation): X = 76, N = 400, with columns Sample p, 95% CI, Z-Value and P-Value.

Minitab output (exact test):
X     N     Sample p   95% CI                  Exact P-Value
76    400   0.190      (0.152721, 0.231938)    0.036
School 1: $n_1 = 70$, $p_1 = 40/70 \approx 0.57$
School 2: $n_2 = 100$, $p_2 = 45/100 = 0.45$
or
The proportions estimate two different
population proportions $\pi_1$ and $\pi_2$
(and the difference between $p_1$ and $p_2$
is due to this systematic difference)
Sample 1:
observe $X_1$ successes
from $n_1$ observations
$P_1 = X_1 / n_1$
Sample 2:
observe $X_2$ successes
from $n_2$ observations
$P_2 = X_2 / n_2$
at sig level $\alpha$
$P_1 \sim N\left(\pi_1, \dfrac{\pi_1(1-\pi_1)}{n_1}\right)$
$P_2 \sim N\left(\pi_2, \dfrac{\pi_2(1-\pi_2)}{n_2}\right)$
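$P_1 - P_2 \sim N\left(\pi_1 - \pi_2,\; \dfrac{\pi_1(1-\pi_1)}{n_1} + \dfrac{\pi_2(1-\pi_2)}{n_2}\right)$
If $H_0$ is true, i.e. if $\pi_1 = \pi_2 = \pi$:
$P_1 - P_2 \sim N\left(\pi - \pi,\; \dfrac{\pi(1-\pi)}{n_1} + \dfrac{\pi(1-\pi)}{n_2}\right) = N\left(0,\; \pi(1-\pi)\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)\right)$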
So the pooled estimate of $\pi$ is
$p = \dfrac{n_1 p_1 + n_2 p_2}{n_1 + n_2} = \dfrac{x_1 + x_2}{n_1 + n_2}$
The sample proportions are
weighted by the sample sizes
$z_{obs} = \dfrac{p_1 - p_2}{\sqrt{p(1-p)\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}}$
Continuity correction?
There is no need for continuity correction
in two-sample proportion tests, as you
need to add and subtract a correction term
(one for $p_1$ and one for $p_2$) and they will
approximately cancel.
$p = \dfrac{40 + 45}{70 + 100} = \dfrac{85}{170} = \dfrac{1}{2} = 0.50$
CLT check: $n_1 p = 35 \ge 15$ and $n_1(1-p) = 35 \ge 15$,
$n_2 p = 50 \ge 15$ and $n_2(1-p) = 50 \ge 15$
CLT applies
$z_{obs} = \dfrac{\frac{40}{70} - \frac{45}{100}}{\sqrt{0.5(0.5)\left(\frac{1}{70} + \frac{1}{100}\right)}} = \dfrac{17/140}{0.07792} \approx 1.558$
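The pooled two-sample statistic can be checked numerically. This is a sketch assuming the counts 40 out of 70 and 45 out of 100 from the schools example; the function name `pooled_z` is hypothetical.

```python
from math import sqrt

def pooled_z(x1: int, n1: int, x2: int, n2: int) -> float:
    """z statistic for H0: pi1 = pi2, using the pooled proportion."""
    p1, p2 = x1 / n1, x2 / n2
    p = (x1 + x2) / (n1 + n2)                    # pooled estimate of pi
    se = sqrt(p * (1 - p) * (1 / n1 + 1 / n2))   # standard error under H0
    return (p1 - p2) / se

z_obs = pooled_z(40, 70, 45, 100)
print(round(z_obs, 3))
```

No continuity correction is applied here, in line with the note above on two-sample tests.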
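The standard error of $P_1 - P_2$ is $\sqrt{\dfrac{\pi_1(1-\pi_1)}{n_1} + \dfrac{\pi_2(1-\pi_2)}{n_2}}$
use $p_1$ as an estimate of $\pi_1$
and $p_2$ as an estimate of $\pi_2$
So, an approx $100(1-\alpha)\%$ C.I. for $\pi_1 - \pi_2$ is:
$(p_1 - p_2) \pm z_{\alpha/2}\sqrt{\dfrac{p_1(1-p_1)}{n_1} + \dfrac{p_2(1-p_2)}{n_2}}$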
$se(p_1 - p_2) = \sqrt{\dfrac{p_1(1-p_1)}{n_1} + \dfrac{p_2(1-p_2)}{n_2}}$
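$\left(\dfrac{40}{70} - \dfrac{45}{100}\right) \pm 1.96\sqrt{\dfrac{\frac{40}{70} \cdot \frac{30}{70}}{70} + \dfrac{\frac{45}{100} \cdot \frac{55}{100}}{100}}$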
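The interval for the difference can be evaluated as below. This is a minimal sketch of the unpooled interval given in the notes; the function name `diff_ci` is hypothetical, and the multiplier 1.96 is passed explicitly to match the slide.

```python
from math import sqrt

def diff_ci(x1: int, n1: int, x2: int, n2: int, z: float = 1.96) -> tuple[float, float]:
    """Approximate CI for pi1 - pi2 using the unpooled standard error."""
    p1, p2 = x1 / n1, x2 / n2
    se = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    diff = p1 - p2
    return diff - z * se, diff + z * se

lower, upper = diff_ci(40, 70, 45, 100)
print(round(lower, 4), round(upper, 4))
```

The resulting interval contains 0, which agrees with the test statistic of about 1.558 not exceeding 1.96.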
Using Minitab
Under Stat > Basic Stats > 2 Proportions
Sample   N     Sample p
1        70    0.571429
2        100   0.450000
Topic 9. Appendix A
Insurance claims example:
In past years, 15% of the policy holders
have made an insurance claim per year.
This year, of a random sample of 400
policies, 76 have made a claim.
Is there any evidence that the proportion
has increased by a factor of more than 1.2
times?
Sample estimate of $\pi$ is $p = 76/400 = 0.19$
Ratio of proportions is
$\dfrac{\text{sample proportion}}{\text{past proportion}} = \dfrac{0.19}{0.15} \approx 1.267$
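A ratio of more than 1.2 corresponds to $\pi > 1.2 \times 0.15 = 0.18$, so $\pi_0 = 0.18$.
$z_{obs} = \dfrac{p - \pi_0 - \frac{1}{2n}}{\sqrt{\pi_0(1-\pi_0)/n}} = \dfrac{\frac{76}{400} - 0.18 - \frac{1}{800}}{\sqrt{0.18(0.82)/400}} = \dfrac{0.00875}{0.0192094} \approx 0.46$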
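CLT check (sample numbers): $np = 76 \ge 15$ and $n(1-p) = 324 \ge 15$
Lower confidence bound for $\pi$:
$p - z_{\alpha}\sqrt{\dfrac{p(1-p)}{n}} = \dfrac{76}{400} - 1.645\sqrt{\dfrac{\frac{76}{400} \cdot \frac{324}{400}}{400}} = 0.19 - 0.032267 = 0.1577$
Interval for the ratio of the current proportion to the past proportion:
$\left(\dfrac{0.1577}{0.15},\; \dfrac{1}{0.15}\right) = (1.051,\; 6.667)$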
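The one-sided bound and the resulting ratio interval can be reproduced as follows. This is a sketch under the normal approximation; the function name `lower_bound` and the variable `past` are hypothetical, and the normal quantile comes from `statistics.NormalDist` rather than the rounded 1.645.

```python
from math import sqrt
from statistics import NormalDist

def lower_bound(x: int, n: int, conf: float = 0.95) -> float:
    """One-sided lower confidence bound for pi: p - z_alpha * sqrt(p(1-p)/n)."""
    p = x / n
    z = NormalDist().inv_cdf(conf)   # about 1.645 for 95%
    return p - z * sqrt(p * (1 - p) / n)

past = 0.15                          # past claim proportion from the example
lb = lower_bound(76, 400)

# Divide the interval (lb, 1) for pi by the past proportion to get the ratio.
ratio_interval = (lb / past, 1 / past)
print(round(lb, 4), tuple(round(v, 3) for v in ratio_interval))
```

Dividing both ends of the interval for $\pi$ by the fixed past proportion is what turns it into an interval for the ratio.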
$H_0: \pi = \pi_0$ versus $H_1: \pi \ne \pi_0$
Sample: $P = X/n$
where $X \sim \mathrm{Bin}(n, \pi)$
$z_{obs} = \dfrac{|p - \pi_0| - \frac{1}{2n}}{\sqrt{\pi_0(1-\pi_0)/n}}$
Confidence Interval:
CLT check: $np \ge 15$ and $n(1-p) \ge 15$
An approximate $100(1-\alpha)\%$ CI for $\pi$ is
$p \pm z_{\alpha/2}\sqrt{\dfrac{p(1-p)}{n}}$
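$z_{obs} = \dfrac{p_1 - p_2}{\sqrt{p(1-p)\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}}$
where $p$ is the pooled sample proportion
CLT check: $n_1 p \ge 15$ and $n_1(1-p) \ge 15$
and $n_2 p \ge 15$
and $n_2(1-p) \ge 15$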
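Samples: $P_1 = X_1/n_1$ and $P_2 = X_2/n_2$
CLT check: $n_1 p_1 \ge 15$ and $n_1(1-p_1) \ge 15$
and $n_2 p_2 \ge 15$
and $n_2(1-p_2) \ge 15$
$(p_1 - p_2) \pm z_{\alpha/2}\sqrt{\dfrac{p_1(1-p_1)}{n_1} + \dfrac{p_2(1-p_2)}{n_2}}$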