Statistics

Chapter 9

Estimation: Additional Topics


Chapter Goals
After completing this chapter, you should be able to:
• Form confidence intervals for the mean difference from dependent samples
• Form confidence intervals for the difference between two independent population means (standard deviations known or unknown)
• Compute confidence interval limits for the difference between two independent population proportions
• Create confidence intervals for a population variance
• Find chi-square values from the chi-square distribution table
• Determine the required sample size to estimate a mean or proportion within a specified margin of error
Dependent Samples
Tests Means of 2 Related Populations
• Paired or matched samples
• Repeated measures (before/after)
• Use the difference between paired values: $d_i = x_i - y_i$
• Eliminates variation among subjects
• Assumption: both populations are normally distributed
Mean Difference
The ith paired difference is $d_i$, where $d_i = x_i - y_i$.

The point estimate for the population mean paired difference is $\bar{d}$:

$$\bar{d} = \frac{\sum_{i=1}^{n} d_i}{n}$$

The sample standard deviation is:

$$S_d = \sqrt{\frac{\sum_{i=1}^{n} (d_i - \bar{d})^2}{n - 1}}$$

n is the number of matched pairs in the sample.
Confidence Interval for Mean Difference

The confidence interval for the difference between population means, $\mu_d$, is

$$\bar{d} - t_{n-1,\alpha/2}\frac{S_d}{\sqrt{n}} < \mu_d < \bar{d} + t_{n-1,\alpha/2}\frac{S_d}{\sqrt{n}}$$

where n = the sample size (number of matched pairs in the paired sample).
Paired Samples Example

• Six people sign up for a weight loss program. You collect the following data:

Weight:
Person   Before (x)   After (y)   Difference, d_i
1        136          125          11
2        205          195          10
3        157          150           7
4        138          140          -2
5        175          165          10
6        166          160           6
                          Sum:     42

$$\bar{d} = \frac{\sum d_i}{n} = \frac{42}{6} = 7.0$$

$$S_d = \sqrt{\frac{\sum (d_i - \bar{d})^2}{n - 1}} = 4.82$$
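Paired Samples Example
(continued)

• For a 95% confidence level, the appropriate t value is $t_{n-1,\alpha/2} = t_{5,0.025} = 2.571$
• The 95% confidence interval for the difference between means, $\mu_d$, is

$$\bar{d} - t_{n-1,\alpha/2}\frac{S_d}{\sqrt{n}} < \mu_d < \bar{d} + t_{n-1,\alpha/2}\frac{S_d}{\sqrt{n}}$$

$$7 - (2.571)\frac{4.82}{\sqrt{6}} < \mu_d < 7 + (2.571)\frac{4.82}{\sqrt{6}}$$

$$1.94 < \mu_d < 12.06$$

The margin of error is $(2.571)(4.82/\sqrt{6}) = 5.06$, so the lower limit is $7 - 5.06 = 1.94$. Since this interval does not contain zero, we can be 95% confident, given this limited data, that the weight loss program helps people lose weight.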
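The paired-sample calculation can be checked with a short script. This is only a sketch: the function name `paired_mean_diff_ci` is made up for illustration, and the critical value 2.571 is copied from the t table as quoted on the slide rather than computed.

```python
import math
import statistics


def paired_mean_diff_ci(before, after, t_crit):
    """Return (d_bar, s_d, lower, upper) for the mean paired difference.

    t_crit is the table value t_{n-1, alpha/2}; it is supplied by the
    caller (an assumption of this sketch), not computed here.
    """
    diffs = [x - y for x, y in zip(before, after)]   # d_i = x_i - y_i
    n = len(diffs)                                   # number of matched pairs
    d_bar = statistics.mean(diffs)                   # point estimate of mu_d
    s_d = statistics.stdev(diffs)                    # sample std dev (divisor n - 1)
    margin = t_crit * s_d / math.sqrt(n)
    return d_bar, s_d, d_bar - margin, d_bar + margin


# Weight data from the example (before and after the program)
before = [136, 205, 157, 138, 175, 166]
after = [125, 195, 150, 140, 165, 160]

d_bar, s_d, lower, upper = paired_mean_diff_ci(before, after, t_crit=2.571)
print(f"d_bar = {d_bar:.2f}, S_d = {s_d:.2f}")
print(f"95% CI for mu_d: ({lower:.2f}, {upper:.2f})")
```

Passing the t value in as an argument keeps the sketch within the standard library, which has no t distribution; a statistics package could supply the quantile instead.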
Difference Between Two Means

Population means, independent samples

Goal: Form a confidence interval for the difference between two population means, $\mu_x - \mu_y$
• Different data sources
• Unrelated
• Independent
• Sample selected from one population has no effect on the sample selected from the other population
• The point estimate is the difference between the two sample means: $\bar{x} - \bar{y}$
$\sigma_x^2$ and $\sigma_y^2$ Unknown, Assumed Equal

Population means, independent samples. Cases: $\sigma_x^2$ and $\sigma_y^2$ known; $\sigma_x^2$ and $\sigma_y^2$ unknown and assumed equal (* this case); $\sigma_x^2$ and $\sigma_y^2$ unknown and assumed unequal.

Assumptions:
• Samples are randomly and independently drawn
• Populations are normally distributed
• Population variances are unknown but assumed equal
$\sigma_x^2$ and $\sigma_y^2$ Unknown, Assumed Equal
(continued)

• The population variances are assumed equal, so use the two sample standard deviations and pool them to estimate σ

$$s_p^2 = \frac{(n_x - 1)s_x^2 + (n_y - 1)s_y^2}{n_x + n_y - 2}$$

• Use a t value with $(n_x + n_y - 2)$ degrees of freedom
Confidence Interval, $\sigma_x^2$ and $\sigma_y^2$ Unknown, Equal

The confidence interval for $\mu_x - \mu_y$ is:

$$(\bar{x} - \bar{y}) - t_{n_x+n_y-2,\alpha/2}\sqrt{\frac{s_p^2}{n_x} + \frac{s_p^2}{n_y}} < \mu_x - \mu_y < (\bar{x} - \bar{y}) + t_{n_x+n_y-2,\alpha/2}\sqrt{\frac{s_p^2}{n_x} + \frac{s_p^2}{n_y}}$$
Pooled Variance Example

You are testing two computer processors for speed. Form a confidence interval for the difference in CPU speed. You collect the following speed data (in MHz):

                  CPUx    CPUy
Number Tested     17      14
Sample mean       3004    2538
Sample std dev    74      56

Assume both populations are normal with equal variances, and use 95% confidence.
Calculating the Pooled Variance
The pooled variance is:

$$s_p^2 = \frac{(n_x - 1)s_x^2 + (n_y - 1)s_y^2}{(n_x - 1) + (n_y - 1)} = \frac{(17 - 1)74^2 + (14 - 1)56^2}{(17 - 1) + (14 - 1)} = 4427.03$$

The t value for a 95% confidence interval is:

$$t_{n_x+n_y-2,\alpha/2} = t_{29,0.025} = 2.045$$


Calculating the Confidence Limits

• The 95% confidence interval is

$$(\bar{x} - \bar{y}) - t_{n_x+n_y-2,\alpha/2}\sqrt{\frac{s_p^2}{n_x} + \frac{s_p^2}{n_y}} < \mu_x - \mu_y < (\bar{x} - \bar{y}) + t_{n_x+n_y-2,\alpha/2}\sqrt{\frac{s_p^2}{n_x} + \frac{s_p^2}{n_y}}$$

$$(3004 - 2538) - (2.045)\sqrt{\frac{4427.03}{17} + \frac{4427.03}{14}} < \mu_x - \mu_y < (3004 - 2538) + (2.045)\sqrt{\frac{4427.03}{17} + \frac{4427.03}{14}}$$

$$416.89 < \mu_x - \mu_y < 515.11$$

We are 95% confident that the mean difference in CPU speed is between 416.89 and 515.11 MHz.
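The pooled-variance interval can be reproduced with a few lines of Python. This is a sketch under the slide's assumptions (normal populations, equal variances); the function names are illustrative, and the critical value $t_{29,0.025} = 2.045$ is taken from the table value quoted in the example rather than computed.

```python
import math


def pooled_variance(n_x, s_x, n_y, s_y):
    """Pooled estimate of the common variance from two sample std devs."""
    return ((n_x - 1) * s_x**2 + (n_y - 1) * s_y**2) / (n_x + n_y - 2)


def pooled_ci(x_bar, y_bar, n_x, s_x, n_y, s_y, t_crit):
    """Return (s_p^2, lower, upper) for mu_x - mu_y.

    t_crit is the table value t_{n_x + n_y - 2, alpha/2}, supplied by the
    caller (an assumption of this sketch).
    """
    sp2 = pooled_variance(n_x, s_x, n_y, s_y)
    margin = t_crit * math.sqrt(sp2 / n_x + sp2 / n_y)
    diff = x_bar - y_bar                      # point estimate of mu_x - mu_y
    return sp2, diff - margin, diff + margin


# CPU speed data from the example
sp2, lower, upper = pooled_ci(
    x_bar=3004, y_bar=2538, n_x=17, s_x=74, n_y=14, s_y=56, t_crit=2.045
)
print(f"pooled variance = {sp2:.2f}")
print(f"95% CI for mu_x - mu_y: ({lower:.2f}, {upper:.2f})")
```

Keeping `pooled_variance` separate makes it easy to compare the intermediate value against the hand calculation before forming the interval.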
Two Population Proportions

Goal: Form a confidence interval for the difference between two population proportions, $P_x - P_y$

Assumptions:
Both sample sizes are large (generally at least 40 observations in each sample)

The point estimate for the difference is $\hat{p}_x - \hat{p}_y$
Statistics for Business and Economics, 6e © 2007 Pearson Education, Inc.
Two Population Proportions
(continued)

• The random variable

$$Z = \frac{(\hat{p}_x - \hat{p}_y) - (P_x - P_y)}{\sqrt{\dfrac{\hat{p}_x(1 - \hat{p}_x)}{n_x} + \dfrac{\hat{p}_y(1 - \hat{p}_y)}{n_y}}}$$

is approximately normally distributed

Confidence Interval for Two Population Proportions

The confidence limits for $P_x - P_y$ are:

$$(\hat{p}_x - \hat{p}_y) \pm Z_{\alpha/2}\sqrt{\frac{\hat{p}_x(1 - \hat{p}_x)}{n_x} + \frac{\hat{p}_y(1 - \hat{p}_y)}{n_y}}$$

Example:
Two Population Proportions
Form a 90% confidence interval for the difference between the proportion of men and the proportion of women who have college degrees.

• In a random sample, 26 of 50 men and 28 of 40 women had an earned college degree

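Example:
Two Population Proportions
(continued)

Men: $\hat{p}_x = \frac{26}{50} = 0.52$

Women: $\hat{p}_y = \frac{28}{40} = 0.70$

$$\sqrt{\frac{\hat{p}_x(1 - \hat{p}_x)}{n_x} + \frac{\hat{p}_y(1 - \hat{p}_y)}{n_y}} = \sqrt{\frac{0.52(0.48)}{50} + \frac{0.70(0.30)}{40}} = 0.1012$$

For 90% confidence, $Z_{\alpha/2} = 1.645$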

Example:
Two Population Proportions
(continued)

The confidence limits are:

$$(\hat{p}_x - \hat{p}_y) \pm Z_{\alpha/2}\sqrt{\frac{\hat{p}_x(1 - \hat{p}_x)}{n_x} + \frac{\hat{p}_y(1 - \hat{p}_y)}{n_y}} = (0.52 - 0.70) \pm 1.645\,(0.1012)$$

so the confidence interval is

$$-0.3465 < P_x - P_y < -0.0135$$

Since this interval does not contain zero, we are 90% confident that the two proportions are not equal.
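The same limits can be computed directly from the counts. The sketch below assumes the large-sample normal approximation described on the slides; the function name `two_proportion_ci` is hypothetical, and the z value is obtained from the standard library's `statistics.NormalDist` (Python 3.8+) instead of a table.

```python
import math
from statistics import NormalDist


def two_proportion_ci(successes_x, n_x, successes_y, n_y, confidence):
    """Large-sample confidence interval for P_x - P_y."""
    p_x = successes_x / n_x                     # sample proportion, group x
    p_y = successes_y / n_y                     # sample proportion, group y
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)     # z_{alpha/2}
    se = math.sqrt(p_x * (1 - p_x) / n_x + p_y * (1 - p_y) / n_y)
    diff = p_x - p_y                            # point estimate
    return diff - z * se, diff + z * se


# 26 of 50 men and 28 of 40 women hold a college degree
lower, upper = two_proportion_ci(26, 50, 28, 40, confidence=0.90)
print(f"90% CI for P_x - P_y: ({lower:.4f}, {upper:.4f})")
```

Because the z quantile is computed rather than rounded to 1.645, the limits may differ from the hand calculation in the last decimal place.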
Margin of Error
• The required sample size can be found to reach a desired margin of error (ME) with a specified level of confidence $(1 - \alpha)$
• The margin of error is also called sampling error
• the amount of imprecision in the estimate of the population parameter
• the amount added and subtracted to the point estimate to form the confidence interval

Sample Size Determination
Determining Sample Size: For the Mean

$$\bar{x} \pm z_{\alpha/2}\frac{\sigma}{\sqrt{n}}$$

Margin of Error (sampling error):

$$ME = z_{\alpha/2}\frac{\sigma}{\sqrt{n}}$$
Sample Size Determination
(continued)

Determining Sample Size: For the Mean

$$ME = z_{\alpha/2}\frac{\sigma}{\sqrt{n}}$$

Now solve for n to get

$$n = \frac{z_{\alpha/2}^2\,\sigma^2}{ME^2}$$
Sample Size Determination
(continued)

• To determine the required sample size for the mean, you must know:
• The desired level of confidence $(1 - \alpha)$, which determines the $z_{\alpha/2}$ value
• The acceptable margin of error (sampling error), ME
• The standard deviation, σ

Required Sample Size Example
If σ = 45, what sample size is needed to estimate the mean within ± 5 with 90% confidence?

$$n = \frac{z_{\alpha/2}^2\,\sigma^2}{ME^2} = \frac{(1.645)^2(45)^2}{5^2} = 219.19$$

So the required sample size is n = 220

(Always round up)
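As a quick check of the sample-size formula for the mean, the following sketch computes n and rounds up. The function name `sample_size_for_mean` is an illustrative choice, and the z value comes from the standard library's `statistics.NormalDist` rather than a table, so the unrounded value differs slightly from 219.19.

```python
import math
from statistics import NormalDist


def sample_size_for_mean(sigma, margin_of_error, confidence):
    """Smallest n with z_{alpha/2} * sigma / sqrt(n) <= margin_of_error."""
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)          # z_{alpha/2}
    n_exact = (z * sigma / margin_of_error) ** 2     # n = z^2 sigma^2 / ME^2
    return math.ceil(n_exact)                        # always round up


n_mean = sample_size_for_mean(sigma=45, margin_of_error=5, confidence=0.90)
print(n_mean)  # 220
```

Rounding up with `math.ceil` is what guarantees the margin of error is no larger than requested; rounding to the nearest integer could leave it slightly too wide.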
Sample Size Determination
Determining Sample Size: For the Proportion

$$\hat{p} \pm z_{\alpha/2}\sqrt{\frac{\hat{p}(1 - \hat{p})}{n}}$$

Margin of Error (sampling error):

$$ME = z_{\alpha/2}\sqrt{\frac{\hat{p}(1 - \hat{p})}{n}}$$
Sample Size Determination
(continued)

Determining Sample Size: For the Proportion

$$ME = z_{\alpha/2}\sqrt{\frac{\hat{p}(1 - \hat{p})}{n}}$$

$\hat{p}(1 - \hat{p})$ cannot be larger than 0.25, its value when $\hat{p} = 0.5$. Substitute 0.25 for $\hat{p}(1 - \hat{p})$ and solve for n to get

$$n = \frac{0.25\,z_{\alpha/2}^2}{ME^2}$$
Sample Size Determination
(continued)
• The sample and population proportions, $\hat{p}$ and P, are generally not known (since no sample has been taken yet)
• P(1 - P) = 0.25 generates the largest possible margin of error (so guarantees that the resulting sample size will meet the desired level of confidence)
• To determine the required sample size for the proportion, you must know:
• The desired level of confidence $(1 - \alpha)$, which determines the critical $z_{\alpha/2}$ value
• The acceptable sampling error (margin of error), ME
• Estimate P(1 - P) = 0.25
Required Sample Size Example

How large a sample would be necessary to estimate the true proportion defective in a large population within ±3%, with 95% confidence?

Required Sample Size Example
(continued)

Solution:
For 95% confidence, use $z_{0.025} = 1.96$
ME = 0.03
Estimate P(1 - P) = 0.25

$$n = \frac{0.25\,z_{\alpha/2}^2}{ME^2} = \frac{(0.25)(1.96)^2}{(0.03)^2} = 1067.11$$

So use n = 1068
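The conservative sample size for a proportion follows the same pattern. In this sketch the name `sample_size_for_proportion` and its `p_times_q` parameter are illustrative; the default of 0.25 is the worst-case value of P(1 - P) used on the slides, and the z value is computed with `statistics.NormalDist` instead of being read from a table.

```python
import math
from statistics import NormalDist


def sample_size_for_proportion(margin_of_error, confidence, p_times_q=0.25):
    """Smallest n with z_{alpha/2} * sqrt(p(1-p)/n) <= margin_of_error.

    p_times_q defaults to 0.25, the largest possible value of P(1 - P),
    which gives the most conservative (largest) sample size.
    """
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)           # z_{alpha/2}
    n_exact = p_times_q * z**2 / margin_of_error**2   # n = 0.25 z^2 / ME^2
    return math.ceil(n_exact)                         # always round up


n_prop = sample_size_for_proportion(margin_of_error=0.03, confidence=0.95)
print(n_prop)  # 1068
```

If a prior estimate of the proportion is available, passing a smaller `p_times_q` gives a smaller required sample, at the cost of relying on that estimate.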