
Introduction to Probability

and Statistics
Twelfth Edition

Robert J. Beaver Barbara M. Beaver William Mendenhall

Presentation designed and written by:


Barbara M. Beaver
Copyright 2006 Brooks/Cole
A division of Thomson Learning, Inc.
Chapter 9
Large-Sample Tests of
Hypotheses
Some graphic screen captures from Seeing Statistics
Some images 2001-(current year) www.arttoday.com
Introduction
Suppose that a pharmaceutical company
is concerned that the mean potency of an
antibiotic meet the minimum government
potency standards. They need to decide between
two possibilities:
The mean potency does not exceed the
mean allowable potency.
The mean potency exceeds the mean
allowable potency.
This is an example of a test of hypothesis.
Introduction
Similar to a courtroom trial. In trying a person
for a crime, the jury needs to decide between
one of two possibilities:
The person is guilty.
The person is innocent.
To begin with, the person is assumed innocent.
The prosecutor presents evidence, trying to
convince the jury to reject the original
assumption of innocence, and conclude that the
person is guilty.
Parts of a Statistical Test
1. The null hypothesis, H0:
Assumed to be true until we can prove
otherwise.
2. The alternative hypothesis, Ha:
Will be accepted as true if we can
disprove H0
Court trial:
H0: innocent
Ha: guilty
Pharmaceuticals:
H0: μ does not exceed allowed amount
Ha: μ exceeds allowed amount
Parts of a Statistical Test
3. The test statistic and its p-value:
A single statistic calculated from the sample
which will allow us to reject or not reject H0,
and
A probability, calculated from the test statistic
that measures whether the test statistic is likely
or unlikely, assuming H0 is true.
4. The rejection region:
A rule that tells us for which values of the
test statistic, or for which p-values, the null
hypothesis should be rejected.
Parts of a Statistical Test
5. Conclusion:
Either Reject H0 or Do not reject H0,
along with a statement about the reliability
of your conclusion.
How do you decide when to reject H0?
It depends on the significance level, α, the
maximum tolerable risk you want to have
of making a mistake, if you decide to reject
H0.
Usually, the significance level is
α = .01 or α = .05.
Example
The mayor of a small city claims that the average
income in his city is $35,000 with a standard
deviation of $5000. We take a sample of 64
families, and find that their average income is
$30,000. Is his claim correct?

1-2. We want to test the hypothesis:
H0: μ = 35,000 (mayor is correct) versus
Ha: μ ≠ 35,000 (mayor is wrong)
Start by assuming that H0 is true and μ = 35,000.
Example
3. The best estimate of the population mean μ is the sample
mean, $30,000.
From the Central Limit Theorem, the sample mean has an
approximate normal distribution with mean μ = 35,000
and standard error SE = 5000/8 = 625.
The sample mean, $30,000, gives z = (30,000 - 35,000)/625
= -8, that is, 8 standard errors below the mean.
The probability of observing a sample mean this far from
μ = 35,000 (assuming H0 is true) is nearly zero.
Example

4. From the Empirical Rule, values more than three standard
deviations away from the mean are considered extremely
unlikely. Such a value would be extremely unlikely to occur
if indeed H0 is true, and would give reason to reject H0.
5. Since the observed sample mean, $30,000, is so unlikely, we
choose to reject H0: μ = 35,000 and conclude that the
mayor's claim is incorrect.
6. If μ = 35,000, the probability of observing
such a small sample mean just by chance is nearly zero.
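The arithmetic in the mayor example can be checked with a short Python sketch. It uses only the standard library; the variable names are illustrative and not part of the original slides.

```python
from math import sqrt
from statistics import NormalDist

# Figures taken from the mayor example; names are illustrative.
mu0 = 35_000      # mean income claimed by the mayor (H0)
sigma = 5_000     # claimed standard deviation
n = 64            # number of families sampled
x_bar = 30_000    # observed sample mean

se = sigma / sqrt(n)             # standard error of the sample mean
z = (x_bar - mu0) / se           # standardized distance from the claimed mean
p_lower = NormalDist().cdf(z)    # P(Z <= z) when H0 is true

print(f"SE = {se}, z = {z}, lower-tail probability = {p_lower:.2e}")
```

The lower-tail probability is so small that a sample mean this low is practically incompatible with μ = 35,000.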
Large Sample Test of a
Population Mean, μ
Take a random sample of size n ≥ 30 from a
population with mean μ and standard
deviation σ.
We assume that either
1. σ is known or
2. s ≈ σ since n is large
The hypothesis to be tested is
H0: μ = μ0 versus Ha: μ ≠ μ0
Test Statistic
Assume to begin with that H0 is true. The
sample mean x̄ is our best estimate of μ,
and we use it in a standardized form as the
test statistic:
\[ z = \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}} \approx \frac{\bar{x} - \mu_0}{s/\sqrt{n}} \]
since x̄ has an approximate normal distribution
with mean μ0 and standard deviation σ/√n.
Test Statistic
If H0 is true, the value of x̄ should be close
to μ0, and z will be close to 0. If H0 is false, x̄
will be much larger or smaller than μ0, and z
will be much larger or smaller than 0,
indicating that we should reject H0.
Likely or Unlikely?
Once you've calculated the observed value of the test
statistic, calculate its p-value:
p-value: The probability of observing, just by
chance, a test statistic as extreme or even
more extreme than what we've actually
observed, assuming H0 is true. If H0 is rejected,
this is the smallest significance level at which
that rejection is justified, that is, the risk of
wrongly rejecting a true H0 that we must be
willing to accept.
If this probability is very small, less than some
preassigned significance level, α, H0 can be rejected.
Example
The daily yield for a chemical plant
has averaged 880 tons for several years.
The quality control manager wants to know if this
average has changed. She randomly selects 50 days
and records an average yield of 871 tons with a
standard deviation of 21 tons.

H0: μ = 880
Ha: μ ≠ 880
Test statistic:
\[ z \approx \frac{\bar{x} - \mu_0}{s/\sqrt{n}} = \frac{871 - 880}{21/\sqrt{50}} = -3.03 \]
Example
What is the probability that this test
statistic or something even more extreme (far
from what is expected if H0 is true) could have
happened just by chance?
p-value:
\[ P(z > 3.03) + P(z < -3.03) = 2P(z < -3.03) = 2(.0012) = .0024 \]
This is an unlikely
occurrence, which
happens about 2 times in
1000, assuming μ = 880!
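The two-tailed calculation for the chemical plant example can be sketched in Python as follows. The function name `two_tailed_z_test` is hypothetical, and the sketch assumes the sample is large enough for s to stand in for σ.

```python
from math import sqrt
from statistics import NormalDist


def two_tailed_z_test(x_bar, mu0, s, n):
    """Return (z, p_value) for H0: mu = mu0 versus Ha: mu != mu0.

    Assumes a large sample, so s is used in place of sigma.
    """
    z = (x_bar - mu0) / (s / sqrt(n))
    # Both tails count as "extreme", so double the tail area beyond |z|.
    p_value = 2 * NormalDist().cdf(-abs(z))
    return z, p_value


# Daily yield example: 50 days, mean 871 tons, standard deviation 21 tons.
z, p_value = two_tailed_z_test(x_bar=871, mu0=880, s=21, n=50)
print(f"z = {z:.2f}, p-value = {p_value:.4f}")
```

Because the code does not round z before looking up the tail area, its p-value can differ from the table-based value in the last decimal place.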
Example
To make our decision clear, we choose
a significance level, say α = .01.
If the p-value is less than α, H0 is rejected as false. You
report that the results are statistically significant at
level α.
If the p-value is greater than α, H0 is not rejected. You
report that the results are not significant at level α.
Since our p-value = .0024 is less than α = .01, we
reject H0 and conclude that the average yield
has changed.
Using a Rejection Region
If α = .01, what would be the critical
value that marks the dividing line between not
rejecting and rejecting H0?
If p-value < α, H0 is rejected.
If p-value > α, H0 is not rejected.
The dividing line occurs when p-value = α. The value of the
test statistic at this point is called the critical value.
Test statistic > critical value implies p-value < α, H0 is rejected.
Test statistic < critical value implies p-value > α, H0 is not
rejected.
Example
What is the critical value of z that
cuts off exactly α/2 = .01/2 = .005 in the tail
of the z distribution?
For our example, z = -3.03 falls in the
rejection region and H0 is rejected
at the 1% significance level.

Rejection Region: Reject H0 if z > 2.58 or z < -2.58. If the
test statistic falls in the rejection region, its p-value will be
less than α = .01.
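The critical value for this two-tailed rejection region can be found from the inverse of the standard normal CDF. A minimal sketch, with illustrative variable names:

```python
from statistics import NormalDist

alpha = 0.01    # significance level chosen on the slide
z_obs = -3.03   # observed test statistic from the yield example

# The value that leaves alpha/2 in the upper tail; tables round it to 2.58.
z_crit = NormalDist().inv_cdf(1 - alpha / 2)

# Two-tailed rejection region: reject when |z| exceeds the critical value.
reject = abs(z_obs) > z_crit
print(f"critical value = {z_crit:.3f}, reject H0: {reject}")
```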
One Tailed Tests
Sometimes we are interested in detecting a
specific directional difference in the value of μ.
The alternative hypothesis to be tested is one
tailed:
Ha: μ > μ0 or Ha: μ < μ0
Rejection regions and p-values are calculated
using only one tail of the sampling distribution.
Example
A homeowner randomly samples 64 homes
similar to her own and finds that the average
selling price is $252,000 with a standard
deviation of $15,000. Is this sufficient evidence
to conclude that the average selling price is
greater than $250,000? Use α = .01.
H0: μ = 250,000
Ha: μ > 250,000
Test statistic:
\[ z \approx \frac{\bar{x} - \mu_0}{s/\sqrt{n}} = \frac{252{,}000 - 250{,}000}{15{,}000/\sqrt{64}} = 1.07 \]
Critical Value Approach
What is the critical value of z that
cuts off exactly α = .01 in the right-tail of the z
distribution?
For our example, z = 1.07 does not fall in
the rejection region and H0 is not rejected.
There is not enough evidence to indicate
that μ is greater than $250,000.
Rejection Region: Reject H0 if z > 2.33. If the test statistic falls
in the rejection region, its p-value will be less than α = .01.
p-Value Approach
The probability that our sample results or
something even more unlikely would have
occurred just by chance, when μ = 250,000.
p-value:
\[ P(z > 1.07) = 1 - .8577 = .1423 \]
Since the p-value is
greater than α = .01, H0 is
not rejected. There is
insufficient evidence to
indicate that μ is greater
than $250,000.
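Both approaches to the home price example can be sketched in one Python function. The name `upper_tailed_z_test` is hypothetical, and s is again assumed to approximate σ.

```python
from math import sqrt
from statistics import NormalDist


def upper_tailed_z_test(x_bar, mu0, s, n, alpha):
    """Test H0: mu = mu0 versus Ha: mu > mu0 for a large sample.

    Returns (z, p_value, z_crit, reject).
    """
    std_normal = NormalDist()
    z = (x_bar - mu0) / (s / sqrt(n))
    p_value = 1 - std_normal.cdf(z)          # area in the right tail only
    z_crit = std_normal.inv_cdf(1 - alpha)   # cuts off alpha in the right tail
    return z, p_value, z_crit, z > z_crit


z, p_value, z_crit, reject = upper_tailed_z_test(
    x_bar=252_000, mu0=250_000, s=15_000, n=64, alpha=0.01
)
print(f"z = {z:.2f}, p-value = {p_value:.4f}, "
      f"critical value = {z_crit:.2f}, reject H0: {reject}")
```

The p-value differs slightly from .1423 because the code uses the unrounded z rather than 1.07. Either way, both approaches lead to the same decision.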
Statistical Significance
The critical value approach and the p-value
approach produce identical results.
The p-value approach is often preferred because:
Computer printouts usually calculate p-values
You can evaluate the test results at any
significance level you choose.
What should you do if you are the experimenter
and no one gives you a significance level to use?

Statistical Significance
If the p-value is less than .01, reject H0. The
results are highly significant.
If the p-value is between .01 and .05, reject H0.
The results are statistically significant.
If the p-value is between .05 and .10, do not
reject H0. But, the results are tending towards
significance.
If the p-value is greater than .10, do not reject
H0. The results are not statistically
significant.
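These guidelines amount to a simple lookup. A minimal Python sketch follows; the function name is hypothetical, and the slides do not say how to treat a p-value that falls exactly on a boundary, so the handling here is an assumption.

```python
def describe_significance(p_value):
    """Map a p-value to the verbal labels used on the slide.

    Boundary values (exactly .01, .05, .10) are assigned to the
    less significant category; this choice is an assumption.
    """
    if p_value < 0.01:
        return "reject H0: highly significant"
    if p_value < 0.05:
        return "reject H0: statistically significant"
    if p_value < 0.10:
        return "do not reject H0: tending towards significance"
    return "do not reject H0: not statistically significant"


for p in (0.0024, 0.03, 0.07, 0.1423):
    print(p, "->", describe_significance(p))
```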
Two Types of Errors
There are two types of errors which can
occur in a statistical test.
                      Actual Fact
Jury's Decision       Guilty          Innocent
Guilty                Correct         Error
Innocent              Error           Correct

                      Actual Fact
Your Decision         H0 true         H0 false
H0 true (Accept H0)   Correct         Type II Error
H0 false (Reject H0)  Type I Error    Correct

Define:
α = P(Type I error) = P(reject H0 when H0 is true)
β = P(Type II error) = P(accept H0 when H0 is false)
Two Types of Errors
We want to keep the probabilities of
error as small as possible.
The value of α is the significance level, and
is controlled by the experimenter.
The value of β is difficult, if not impossible,
to calculate.
Rather than accepting H0 as true without being
able to provide a measure of goodness, we
choose to not reject H0.
We write: There is insufficient evidence to reject H0.
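One reason β is hard to pin down is that it depends on the true value of μ, which is unknown. The sketch below goes beyond the slides: it computes β for a two-tailed z test under an assumed true mean. The function name and the alternative values 870 and 875 are hypothetical, and σ is treated as known.

```python
from math import sqrt
from statistics import NormalDist


def beta_two_tailed(mu0, mu_true, sigma, n, alpha):
    """P(do not reject H0: mu = mu0) when the true mean is mu_true."""
    se = sigma / sqrt(n)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    # H0 is not rejected when the sample mean lands between these limits.
    lower = mu0 - z_crit * se
    upper = mu0 + z_crit * se
    # Sampling distribution of the sample mean under the assumed true mean.
    sampling = NormalDist(mu_true, se)
    return sampling.cdf(upper) - sampling.cdf(lower)


# Yield example settings with two assumed alternatives.
for mu_true in (875, 870):
    beta = beta_two_tailed(mu0=880, mu_true=mu_true, sigma=21, n=50, alpha=0.01)
    print(f"true mean {mu_true}: beta = {beta:.3f}, power = {1 - beta:.3f}")
```

Each assumed alternative gives a different β, so no single value can be reported without knowing the true mean.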
Other Large Sample Tests
There were three other statistics in Chapter 8
that we used to estimate population
parameters.
These statistics had approximately normal
distributions when the sample size(s) was
large.
These same statistics can be used to test
hypotheses about those parameters, using the
general test statistic:
\[ z = \frac{\text{statistic} - \text{hypothesized value}}{\text{standard error of statistic}} \]
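The general form translates directly into a small helper. This is a sketch only, with a hypothetical function name; the caller supplies whichever standard error suits the statistic being tested.

```python
from math import sqrt


def z_statistic(statistic, hypothesized_value, standard_error):
    """General large-sample test statistic: standardized distance from H0."""
    return (statistic - hypothesized_value) / standard_error


# Reusing the helper with figures from two examples in this chapter.
z_mean = z_statistic(871, 880, 21 / sqrt(50))               # test of a mean
z_prop = z_statistic(0.15, 0.2, sqrt(0.2 * 0.8 / 100))      # test of a proportion
print(round(z_mean, 2), round(z_prop, 2))
```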
Testing the Difference
between Two Means
A random sample of size n1 drawn from
population 1 with mean μ1 and variance σ1².
A random sample of size n2 drawn from
population 2 with mean μ2 and variance σ2².
The hypothesis of interest involves the
difference, μ1 - μ2, in the form:
H0: μ1 - μ2 = D0 versus Ha: one of three alternatives
where D0 is some hypothesized difference,
usually 0.
The Sampling
Distribution of x̄1 - x̄2
1. The mean of x̄1 - x̄2 is μ1 - μ2, the difference in
the population means.
2. The standard deviation of x̄1 - x̄2 is
\[ SE = \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}} \]
3. If the sample sizes are large, the sampling distribution
of x̄1 - x̄2 is approximately normal, and SE can be estimated
as
\[ SE \approx \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}} \]
Testing the Difference
between Two Means
H0: μ1 - μ2 = D0 versus
Ha: one of three alternatives
Test statistic:
\[ z \approx \frac{(\bar{x}_1 - \bar{x}_2) - D_0}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}} \]
with rejection regions and/or p-values
based on the standard normal z distribution.
Example
Avg Daily Intakes Men Women
Sample size 50 50
Sample mean 756 762
Sample Std Dev 35 30

Is there a difference in the average daily intakes of dairy
products for men versus women? Use α = .05.
H0: μ1 - μ2 = 0 (same)
Ha: μ1 - μ2 ≠ 0 (different)
Test statistic:
\[ z \approx \frac{(\bar{x}_1 - \bar{x}_2) - 0}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}} = \frac{756 - 762 - 0}{\sqrt{\dfrac{35^2}{50} + \dfrac{30^2}{50}}} = -.92 \]
p-Value Approach
The probability of observing values of z
that are as far away from z = 0 as we have,
just by chance, if indeed μ1 - μ2 = 0.
p-value:
\[ P(z > .92) + P(z < -.92) = 2(.1788) = .3576 \]
Since the p-value is
greater than α = .05, H0 is
not rejected. There is
insufficient evidence to
indicate that men and
women have different
average daily intakes.
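The dairy intake comparison can be reproduced with a short Python sketch. The function name `two_sample_z_test` and its `d0` parameter are hypothetical; the sketch assumes both samples are large and independent.

```python
from math import sqrt
from statistics import NormalDist


def two_sample_z_test(x_bar1, x_bar2, s1, s2, n1, n2, d0=0.0):
    """Two-tailed test of H0: mu1 - mu2 = d0 for large independent samples."""
    se = sqrt(s1**2 / n1 + s2**2 / n2)       # estimated standard error
    z = (x_bar1 - x_bar2 - d0) / se
    p_value = 2 * NormalDist().cdf(-abs(z))  # both tails
    return z, p_value


# Men: n = 50, mean 756, sd 35. Women: n = 50, mean 762, sd 30.
z, p_value = two_sample_z_test(756, 762, 35, 30, 50, 50)
print(f"z = {z:.2f}, p-value = {p_value:.4f}")
```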
Testing a Binomial
Proportion p
A random sample of size n from a binomial population
to test
H0: p = p0 versus
Ha: one of three alternatives
Test statistic:
\[ z = \frac{\hat{p} - p_0}{\sqrt{\dfrac{p_0 q_0}{n}}} \]
with rejection regions and/or p-values based on
the standard normal z distribution.
Example
Regardless of age, about 20% of American
adults participate in fitness activities at least twice
a week. A random sample of 100 adults over 40
years old found 15 who exercised at least twice a
week. Is this evidence of a decline in participation
after age 40? Use α = .05.
H0: p = .2
Ha: p < .2
Test statistic:
\[ z = \frac{\hat{p} - p_0}{\sqrt{\dfrac{p_0 q_0}{n}}} = \frac{.15 - .2}{\sqrt{\dfrac{.2(.8)}{100}}} = -1.25 \]
Critical Value Approach
What is the critical value of z that
cuts off exactly α = .05 in the left-tail of
the z distribution?
For our example, z = -1.25
does not fall in the
rejection region and H0 is
not rejected. There is not
enough evidence to
indicate that p is less than
.2 for people over 40.
Rejection Region: Reject H0 if z < -1.645. If the test statistic
falls in the rejection region, its p-value will be less than α = .05.
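The fitness example can be checked with a lower-tailed proportion test. A minimal sketch, in which the function name is hypothetical and the standard error uses the hypothesized p0 as in the test statistic above:

```python
from math import sqrt
from statistics import NormalDist


def lower_tailed_proportion_test(x, n, p0, alpha):
    """Test H0: p = p0 versus Ha: p < p0 for a large binomial sample.

    Returns (z, p_value, z_crit, reject).
    """
    std_normal = NormalDist()
    p_hat = x / n
    se = sqrt(p0 * (1 - p0) / n)       # standard error computed under H0
    z = (p_hat - p0) / se
    p_value = std_normal.cdf(z)        # area in the left tail only
    z_crit = std_normal.inv_cdf(alpha) # cuts off alpha in the left tail
    return z, p_value, z_crit, z < z_crit


# 15 of 100 adults over 40 exercised at least twice a week; p0 = .2.
z, p_value, z_crit, reject = lower_tailed_proportion_test(15, 100, 0.2, 0.05)
print(f"z = {z:.2f}, p-value = {p_value:.4f}, "
      f"critical value = {z_crit:.3f}, reject H0: {reject}")
```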
Testing the Difference
between Two Proportions
To compare two binomial proportions,
A random sample of size n1 drawn from
binomial population 1 with parameter p1.
A random sample of size n2 drawn from
binomial population 2 with parameter p2.
The hypothesis of interest involves the
difference, p1 - p2, in the form:
H0: p1 - p2 = D0 versus Ha: one of three alternatives
where D0 is some hypothesized difference,
usually 0.
The Sampling
Distribution of p̂1 - p̂2
1. The mean of p̂1 - p̂2 is p1 - p2, the difference in
the population proportions.
2. The standard deviation of p̂1 - p̂2 is
\[ SE = \sqrt{\frac{p_1 q_1}{n_1} + \frac{p_2 q_2}{n_2}} \]
3. If the sample sizes are large, the sampling distribution
of p̂1 - p̂2 is approximately normal.
4. The standard error is estimated differently, depending on
the hypothesized difference, D0.
Testing the Difference
between Two Proportions
H0: p1 - p2 = 0 versus
Ha: one of three alternatives
Test statistic:
\[ z = \frac{\hat{p}_1 - \hat{p}_2}{\sqrt{\hat{p}\hat{q}\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}} \]
with
\[ \hat{p} = \frac{x_1 + x_2}{n_1 + n_2} \]
to estimate the common value of p
and rejection regions or p-values
based on the standard normal z distribution.
Example
Youth Soccer Male Female
Sample size 80 70
Played soccer 65 39

Compare the proportion of male and female college
students who said that they had played on a soccer team
during their K-12 years using a test of hypothesis.
H0: p1 - p2 = 0 (same)
Ha: p1 - p2 ≠ 0 (different)
Calculate
\[ \hat{p}_1 = 65/80 = .81, \qquad \hat{p}_2 = 39/70 = .56, \qquad \hat{p} = \frac{x_1 + x_2}{n_1 + n_2} = \frac{104}{150} = .69 \]
Example
Test statistic:
\[ z = \frac{\hat{p}_1 - \hat{p}_2 - 0}{\sqrt{\hat{p}\hat{q}\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}} = \frac{.81 - .56}{\sqrt{.69(.31)\left(\dfrac{1}{80} + \dfrac{1}{70}\right)}} = 3.30 \]
p-value:
\[ P(z > 3.30) + P(z < -3.30) = 2(.0005) = .001 \]
Since the p-value is less than α = .01, H0 is rejected. The
results are highly significant. There is evidence to indicate
that the rates of participation are different for boys and girls.
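The pooled two-proportion test can be sketched in Python as follows. The function name is hypothetical. Note that the code works with unrounded proportions, so it gives a z of about 3.38 rather than the 3.30 obtained from the two-decimal values on the slide; the conclusion is unchanged.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z_test(x1, n1, x2, n2):
    """Two-tailed test of H0: p1 - p2 = 0 using the pooled estimate of p."""
    p_hat1 = x1 / n1
    p_hat2 = x2 / n2
    p_pooled = (x1 + x2) / (n1 + n2)     # estimate of the common p under H0
    se = sqrt(p_pooled * (1 - p_pooled) * (1 / n1 + 1 / n2))
    z = (p_hat1 - p_hat2) / se
    p_value = 2 * NormalDist().cdf(-abs(z))
    return z, p_value


# Males: 65 of 80 played soccer. Females: 39 of 70.
z, p_value = two_proportion_z_test(65, 80, 39, 70)
print(f"z = {z:.2f}, p-value = {p_value:.4f}")
```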
Key Concepts
I. Parts of a Statistical Test
1. Null hypothesis: a contradiction of the alternative
hypothesis
2. Alternative hypothesis: the hypothesis the researcher
wants to support.
3. Test statistic and its p-value: sample evidence calculated
from sample data.
4. Rejection region (critical values and significance levels):
values that separate rejection and nonrejection of the null
hypothesis
5. Conclusion: Reject or do not reject the null hypothesis,
stating the practical significance of your conclusion.
Key Concepts
II. Errors and Statistical Significance
1. The significance level α is the probability of rejecting H0
when it is in fact true.
2. The p-value is the probability of observing a test statistic
as extreme as or more extreme than the one observed; also, the
smallest value of α for which H0 can be rejected.
3. When the p-value is less than the significance level α, the
null hypothesis is rejected. This happens when the test
statistic exceeds the critical value.
4. In a Type II error, β is the probability of accepting H0
when it is in fact false. The power of the test is (1 - β),
the probability of rejecting H0 when it is false.
Key Concepts
III. Large-Sample Test Statistics Using the z Distribution
To test one of the four population parameters
when the sample sizes are large, use the
following test statistics:
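The table of test statistics did not survive in this copy of the slide. Assuming it summarized the formulas developed earlier in the chapter, the four statistics would be (the last one for the case D0 = 0):

```latex
\begin{align*}
\mu:\quad & z \approx \frac{\bar{x} - \mu_0}{s/\sqrt{n}} \\
p:\quad & z = \frac{\hat{p} - p_0}{\sqrt{p_0 q_0 / n}} \\
\mu_1 - \mu_2:\quad & z \approx \frac{(\bar{x}_1 - \bar{x}_2) - D_0}{\sqrt{s_1^2/n_1 + s_2^2/n_2}} \\
p_1 - p_2:\quad & z = \frac{\hat{p}_1 - \hat{p}_2}{\sqrt{\hat{p}\hat{q}\,(1/n_1 + 1/n_2)}},
\qquad \hat{p} = \frac{x_1 + x_2}{n_1 + n_2}
\end{align*}
```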
