Statistical Hypothesis Testing Steps

Agresti Ch. 9
Statistical Hypothesis: a conjecture about a population parameter,
tested using sample data and our understanding of sampling error in sampling distributions
Steps in significance tests of hypotheses:
1) Conditions/Assumptions
specify the number and type of variable(s) and what the tested parameter represents
and then make any necessary assumptions pertaining to data collection,
sample size, and shape of sampling or population distribution
2) Hypotheses
specify a null hypothesis (H0), typically a single parameter value indicating no effect,
and an alternative hypothesis (Ha), a set of alternative parameter values
3) Test-statistic
collect your sample data and measure the distance between the sample statistic and the
hypothesized population parameter value, yielding a z or t as the test statistic
or a more complicated computation could yield F or χ²
4) P-value
P-value is the probability of getting a sample statistic (such as the mean) or a more extreme
sample statistic in the direction of the alternative hypothesis when the null hypothesis is true
5) Conclusion
report and interpret the p-value in the context of the study
make a decision about the null hypothesis based on the p-value:
is the probability of the sample result so small that you should reject the null?
discuss the real-world implications of your decision.
Conditions/Assumptions:
Number and type of variable(s)
What the tested parameter represents
Random sampling is almost always required
Sample size considerations are often important for identifying the shape of sampling distribution
Other assumptions may be necessary to identify the population or sampling distribution
Hypotheses:
Examples based on variable type and nature of the claim:
Mean
Claim: Textbooks typically cost about $450/semester.
μ represents the average textbook cost
a single value is claimed, so this is two-sided
H0: μ = 450
Ha: μ ≠ 450

Proportion
Claim: Most Americans think continued fighting in Afghanistan is justified.
p is the proportion thinking invasion justified
a range of possibilities is claimed, so this situation is one-sided
H0: p ≤ 0.50
Ha: p > 0.50

In each case we specify two hypotheses, null and alternative.
Null hypothesis, H0, states that the parameter is a particular value [or in a range of values the opposite
of that which is expected or desired].
Alternative hypothesis, Ha, states that the parameter differs from a particular value or is in a range of
expected or desired values.
Hypotheses can be tested multiple ways, but the procedure will not change the appropriate set of
hypotheses.
Three types of pairs of hypotheses, using means here, are possible: (names focus on the alternative)
Two-tailed
H0: μ = μ0
Ha: μ ≠ μ0
Right-tailed (GT)
H0: μ = μ0 [or μ ≤ μ0]
Ha: μ > μ0
Left-tailed (LT)
H0: μ = μ0 [or μ ≥ μ0]
Ha: μ < μ0
where μ0 is the numerical value of the population mean specified in the claim
The right-tailed and left-tailed are generically known as one-tailed.
You will typically find the null specified as the = case, even when the alternative is one-sided.
Technically, we must allow for a range of values when the alternative is one-sided.
In those cases, we test against the borderline value specified in the null.
1/31/2012
How to decide what hypothesis goes where:
1) If you are doing a two-tailed test, your null must be the equality, not so much because we prefer null
hypotheses to include the equality relationship (although we often do), but because, when choosing
between = and ≠, only the = hypothesis could potentially be rejected. It is impossible to reject the ≠
hypothesis. This may mean that the claim ends up in the null, even if we would prefer it be in the
alternative.
2) If you are doing a one-tailed test, you should put your claim/hope in the alternative, because the
strongest conclusions occur when you reject the null, and then say that the alternative must be true because
you ruled out everything else. If you put your claim in the null, the best you can say is "this could be true"
or "I couldn't rule this out", which are much weaker statements. This is also justified by realizing that the
null is assumed true unless evidence to the contrary is found; you would rather not assume your claim to be
true, so you should prove it by ruling out the null instead.
3) If you have a one-tailed test and there is no clear preferred alternative, you may want to choose the
hypotheses so that the most severe consequence of error is a Type I error.
                                    Actual Situation
Conclusion Based on
Hypothesis Test        H0 True                  H0 False
Reject H0              Type I Error             No Error
                       (Probability α)
Not reject H0          No Error                 Type II Error
                       (Probability 1-α)
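The α entry in the table can be made concrete with a small simulation: when H0 really is true, a test run at level α should wrongly reject in roughly a fraction α of repeated samples. The sketch below is only illustrative. It assumes a normal population with known standard deviation (so a two-tailed z test applies), and the function name and all numeric settings are hypothetical, not taken from these notes.

```python
import random
from math import sqrt
from statistics import NormalDist, fmean


def type_i_error_rate(mu0=50.0, sigma=10.0, n=30, alpha=0.05,
                      trials=20000, seed=1):
    """Estimate how often a two-tailed z test rejects a TRUE null hypothesis."""
    rng = random.Random(seed)                      # fixed seed for repeatability
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)   # two-tailed critical value
    se = sigma / sqrt(n)                           # known-sigma standard error
    rejections = 0
    for _ in range(trials):
        # Draw a sample from a population in which H0 (mu = mu0) holds.
        sample = [rng.gauss(mu0, sigma) for _ in range(n)]
        z = (fmean(sample) - mu0) / se
        if abs(z) > z_crit:
            rejections += 1                        # rejecting a true H0: Type I error
    return rejections / trials


rate = type_i_error_rate()
print(f"Estimated Type I error rate: {rate:.3f} (nominal alpha = 0.05)")
```

The estimated rate should land close to the nominal 0.05; it will not match exactly because it is itself based on a finite number of simulated samples.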
Examples (from Hawkes and Marsh, p. 501):

For the following situations, identify the appropriate H0 and Ha and state what the consequences would be
for Type I and Type II errors.
a. A company which manufactures one-half inch bolts selects a random sample of bolts to determine if the
diameter of the bolts differs significantly from the required one-half inch.
b. A company which manufactures safety flares randomly selects 100 flares to determine if the flares last
at least three hours on average.
Test Statistic
Depends on the distribution of the sample statistic
In turn related to assumptions, sample size, etc.
Proportions: if sample size large enough, p̂ is approximately normal, and you can create Z as test statistic
Means: if sample large enough, x̄ is approximately normal, and you can create t as test statistic
P-value
Determine the likelihood of the test statistic or a more extreme one (or, equivalently, the
underlying sample statistic that generated the test statistic) if the null hypothesis was true
In one-tailed cases, we figure out the probability of getting values beyond the test statistic in the
direction of the alternative. If we have a right-tailed alternative and are using the Z distribution,
we would compute P(Z ≥ Zcalc) as the P-value. If we have a left-tailed alternative and are using the
Z distribution, we would compute P(Z ≤ Zcalc) as the P-value. Zcalc refers to the calculated value of
the test statistic. We can easily dismiss the possibility of rejecting the null when the sign is
wrong, i.e., positive for an LT alternative, meaning p̂ exceeded p0, or negative for a GT
alternative, meaning p̂ was less than p0.
(Agresti and Franklin, p. 411)
For two-tailed cases, we would find the probability in the tail consistent with the result and then
double that value.
(Agresti and Franklin, p. 418)
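The three tail rules above can be sketched in a few lines of Python. This is a minimal illustration using the standard library's NormalDist; the function name z_p_value and the tail labels "GT", "LT", "NE" are hypothetical conveniences borrowed from the abbreviations used in these notes.

```python
from statistics import NormalDist


def z_p_value(z_calc, tail):
    """P-value for a Z test statistic.

    tail: "GT" (right-tailed), "LT" (left-tailed), or "NE" (two-tailed).
    """
    std_normal = NormalDist()                 # mean 0, standard deviation 1
    if tail == "GT":
        return 1 - std_normal.cdf(z_calc)     # P(Z >= Zcalc)
    if tail == "LT":
        return std_normal.cdf(z_calc)         # P(Z <= Zcalc)
    if tail == "NE":
        # Probability in the tail consistent with the result, then doubled.
        return 2 * (1 - std_normal.cdf(abs(z_calc)))
    raise ValueError("tail must be 'GT', 'LT', or 'NE'")


for tail in ("GT", "LT", "NE"):
    print(tail, round(z_p_value(1.96, tail), 4))
```

With Zcalc = 1.96 the right-tailed P-value is about 0.025 and the two-tailed P-value about 0.05, while the left-tailed P-value is about 0.975. That last case is the "wrong sign" situation: a positive statistic gives no support to an LT alternative.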
Conclusion
Based on how small the P-value is, we either reject or fail to reject the null hypothesis.
Some people explicitly use an α cut-off; others report the p-value and leave it to others to decide if
the value is so rare as to justify rejecting the null.
Remember that we never explicitly accept a null hypothesis.
Once the decision is made, we relate it to the original question.
Z test for a Proportion
Assumptions: A single categorical variable, random sampling, np0 ≥ 15 and n(1 - p0) ≥ 15, so normality
applies
Types of hypotheses:
Two-tailed
H0: p = p0
Ha: p ≠ p0
Right-tailed (GT)
H0: p = p0 (or p ≤ p0)
Ha: p > p0
Left-tailed (LT)
H0: p = p0 (or p ≥ p0)
Ha: p < p0
where p0 is the numerical value of the population proportion specified in the claim
Test-statistic: $Z = \dfrac{\hat{p} - p_0}{se_0} = \dfrac{\hat{p} - p_0}{\sqrt{p_0 (1 - p_0) / n}}$
P-value: Right-tail probability for GT alternative; left-tail for LT alternative, two-tail for NE alternative
Conclusion: Smaller P-values give stronger evidence against H0.
If decision needed, compare P-value to α: If P-value < α, reject H0.
Examples
Suppose you are arguing about how many older teens text while they drive, and you claim that it is more
than one in five. A recent Pew Research survey of 800 older teens revealed that 26% of teens admit to
having texted while driving. Is there sufficient evidence at the 0.05 level of significance to conclude that
the proportion of older teens texting while driving differs from one in five?
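A minimal sketch of this example, following the Z test for a proportion as laid out above. The question as worded asks whether the proportion "differs from" one in five, so the two-tailed P-value is the one compared with 0.05; the right-tailed value is printed as well because the stated claim is "more than one in five". Variable names are illustrative only.

```python
from math import sqrt
from statistics import NormalDist

n, p_hat, p0, alpha = 800, 0.26, 0.20, 0.05

# Sample-size condition from the assumptions: both expected counts at least 15.
conditions_met = n * p0 >= 15 and n * (1 - p0) >= 15

se0 = sqrt(p0 * (1 - p0) / n)      # standard error computed under H0
z = (p_hat - p0) / se0             # test statistic

std_normal = NormalDist()
p_right_tailed = 1 - std_normal.cdf(z)             # Ha: p > 0.20
p_two_tailed = 2 * (1 - std_normal.cdf(abs(z)))    # Ha: p != 0.20

print(f"conditions met: {conditions_met}")
print(f"se0 = {se0:.5f}, Z = {z:.3f}")
print(f"right-tailed P-value = {p_right_tailed:.2e}")
print(f"two-tailed P-value   = {p_two_tailed:.2e}")
print("reject H0" if p_two_tailed < alpha else "do not reject H0")
```

Z comes out a little above 4.2, so the P-value is far below 0.05 under either alternative, and the sample gives strong evidence that the proportion is not one in five.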
t test for the mean
Conditions/Assumptions: A single quantitative variable, random sampling, x̄ approximately normal,
usually obtained by assuming a normal population or having a sample larger than 30.
Hypotheses:
Two-tailed
H0: μ = μ0
Ha: μ ≠ μ0
Right-tailed (GT)
H0: μ = μ0 [or μ ≤ μ0]
Ha: μ > μ0
Left-tailed (LT)
H0: μ = μ0 [or μ ≥ μ0]
Ha: μ < μ0
where μ0 is the numerical value of the population mean specified in the claim
Test-statistic: $t = \dfrac{\bar{x} - \mu_0}{se} = \dfrac{\bar{x} - \mu_0}{s / \sqrt{n}}$
P-value: Right-tail probability for GT alternative; left-tail for LT alternative, two-tail for NE alternative
P-values for t-statistics are more difficult than for Z-statistics, where we could use tables for exact
values. The best one can do using tables is to find the interval within which the p-value lies.
Fortunately, p-values are commonly provided by computer software.
You can also use the tdist function in Excel to get exact p-values.
Conclusion: Smaller P-values give stronger evidence against H0.
If decision needed, compare P-value to α: If P-value < α, reject H0.
Example (Bluman, p. 415):

The average production of peanuts in the state of Virginia is 3000 pounds per acre. A new plant food has
been developed and is tested on 60 individual plots of land. The mean yield with the new plant food is
3120 pounds of peanuts per acre with a standard deviation of 578 pounds. At α = 0.05, can one conclude
that the average production has increased?
The well-known normal temperature for humans is 98.6. A recent study decided to test this value using a
sample of 130 adults. If the mean of the sample was 98.25, and the standard deviation was 0.73, is there
sufficient evidence at the α = 0.05 level to conclude that average temperature differs from 98.6? (from
Shoemaker. A. 1996. Journal of Statistics Education v.4, n.2.)
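Both examples can be worked from their summary statistics. The sketch below is one possible way to do it with only the Python standard library: it treats the reported standard deviations as sample standard deviations s, and, in place of tables or statistical software, it approximates the t tail area by numerically integrating the t density with Simpson's rule. The helper names (t_pdf, t_upper_tail, one_sample_t) are hypothetical.

```python
from math import exp, lgamma, pi, sqrt


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = lgamma((df + 1) / 2) - lgamma(df / 2) - 0.5 * (log_df_pi(df))
    return exp(log_c) * (1 + x * x / df) ** (-(df + 1) / 2)


def log_df_pi(df):
    """log(df * pi), split out to keep t_pdf readable."""
    from math import log
    return log(df * pi)


def t_upper_tail(t, df, steps=2000):
    """Approximate P(T >= t) using Simpson's rule on [0, |t|] (steps is even)."""
    a = abs(t)
    h = a / steps
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    upper = 0.5 - total * h / 3          # area beyond |t| in one tail
    return upper if t >= 0 else 1 - upper


def one_sample_t(xbar, mu0, s, n, tail):
    """Return (t, df, P-value); tail is "GT", "LT", or "NE"."""
    t = (xbar - mu0) / (s / sqrt(n))
    df = n - 1
    if tail == "GT":
        p = t_upper_tail(t, df)
    elif tail == "LT":
        p = 1 - t_upper_tail(t, df)
    else:
        p = 2 * t_upper_tail(abs(t), df)
    return t, df, p


# Peanut yield: H0: mu = 3000, Ha: mu > 3000 (right-tailed)
t_peanut, df_peanut, p_peanut = one_sample_t(3120, 3000, 578, 60, "GT")
# Body temperature: H0: mu = 98.6, Ha: mu != 98.6 (two-tailed)
t_temp, df_temp, p_temp = one_sample_t(98.25, 98.6, 0.73, 130, "NE")

alpha = 0.05
for name, t, df, p in [("peanuts", t_peanut, df_peanut, p_peanut),
                       ("temperature", t_temp, df_temp, p_temp)]:
    decision = "reject H0" if p < alpha else "do not reject H0"
    print(f"{name}: t = {t:.3f}, df = {df}, P-value = {p:.3g} -> {decision}")
```

Under these assumptions the peanut statistic is about 1.61 with 59 degrees of freedom, giving a right-tailed P-value between 0.05 and 0.10, so the increase is not significant at α = 0.05. The temperature statistic is about -5.47, and its two-tailed P-value is far below 0.05.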
Consider the following random sample of size eight from a normal population. Based on the sample, test
the claim that the mean of the population is greater than 100 at α = 0.10.
100, 150, 120, 90, 95, 110, 100, 80
XLSTATS:
Numerical Summaries for x
Number        8
Mean          105.625
St Dev        21.61968
Coeff of Var  0.204683
Skew          1.286236
Min           80
Q1            93.75
Median        100
Q3            112.5
Max           150

Sample Data
Sample Size         8
Mean                105.625
Standard Deviation  21.61968
SE Mean             7.643712

Hypothesis Tests
H0: μ = 100
H1: μ > 100 (Alternative: >)
T        0.7359
DF       7
p-value  0.24286

Confidence Intervals for μ
Type (2,U,L)      2
Confidence Level  0.95
ME                18.07451
Lower             87.55049
Upper             123.6995
SPSS:
T-Test
One-Sample Statistics
            N    Mean       Std. Deviation   Std. Error Mean
VAR00001    8    105.6250   21.6197          7.6437

One-Sample Test (Test Value = 100)
                                                             95% Confidence Interval
                                                             of the Difference
            t       df   Sig. (2-tailed)   Mean Difference   Lower       Upper
VAR00001    .736    7    .486              5.6250            -12.4495    23.6995
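The software output above can be cross-checked from the raw data. The following sketch uses only the Python standard library; since that library has no t distribution, the tail area is approximated by Simpson's rule and the critical value by bisection, both of which are stand-ins chosen for this illustration rather than the methods XLSTATS or SPSS use. Helper names are hypothetical.

```python
from math import exp, lgamma, log, pi, sqrt
from statistics import mean, stdev


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = lgamma((df + 1) / 2) - lgamma(df / 2) - 0.5 * log(df * pi)
    return exp(log_c) * (1 + x * x / df) ** (-(df + 1) / 2)


def t_upper_tail(t, df, steps=2000):
    """Approximate P(T >= t) using Simpson's rule on [0, |t|] (steps is even)."""
    a = abs(t)
    h = a / steps
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    upper = 0.5 - total * h / 3
    return upper if t >= 0 else 1 - upper


def t_critical(df, upper_prob):
    """Find c with P(T >= c) = upper_prob by bisection on [0, 20]."""
    lo, hi = 0.0, 20.0
    for _ in range(50):
        mid = (lo + hi) / 2
        if t_upper_tail(mid, df) > upper_prob:
            lo = mid      # tail still too large: move right
        else:
            hi = mid
    return (lo + hi) / 2


data = [100, 150, 120, 90, 95, 110, 100, 80]
mu0 = 100

n = len(data)
xbar = mean(data)
s = stdev(data)                  # sample standard deviation (n - 1 divisor)
se = s / sqrt(n)
df = n - 1

t_stat = (xbar - mu0) / se
p_right = t_upper_tail(t_stat, df)           # Ha: mu > 100
p_two = 2 * t_upper_tail(abs(t_stat), df)    # matches SPSS "Sig. (2-tailed)"

me = t_critical(df, 0.025) * se              # 95% margin of error
lower, upper = xbar - me, xbar + me

print(f"mean = {xbar}, s = {s:.5f}, se = {se:.6f}")
print(f"t = {t_stat:.4f}, df = {df}")
print(f"right-tailed P-value = {p_right:.5f}, two-tailed = {p_two:.3f}")
print(f"95% CI for mu: ({lower:.4f}, {upper:.4f})")
```

The results should agree with the output above to the digits shown: t near 0.736, a right-tailed P-value near 0.243, and a 95% interval of roughly 87.55 to 123.70. Because 0.243 exceeds α = 0.10, the claim that the mean is greater than 100 is not supported.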
Confidence Intervals and Hypothesis Testing
If a value is not included in the confidence interval, a two-tailed hypothesis test using the value will lead to
rejection of the null hypothesis (when the confidence level is 1 - α).
If a value is in the confidence interval, the null will not be rejected.
This does not apply to one-tailed tests, since they put all of α in one tail rather than α/2 in each, leading
to lower critical values than what is used in two-tailed tests and confidence intervals.
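The correspondence can be checked numerically. The sketch below reuses the summary statistics from the body-temperature example (mean 98.25, standard deviation 0.73, n = 130) and, to stay within the Python standard library, substitutes the normal distribution for the t distribution. That substitution is an assumption made here for simplicity, reasonable only because n is large. The candidate values of μ0 are arbitrary.

```python
from math import sqrt
from statistics import NormalDist

xbar, s, n, alpha = 98.25, 0.73, 130, 0.05
se = s / sqrt(n)
std_normal = NormalDist()

# 95% confidence interval (normal approximation).
z_crit = std_normal.inv_cdf(1 - alpha / 2)
lower, upper = xbar - z_crit * se, xbar + z_crit * se


def two_tailed_p(mu0):
    """Two-tailed P-value for H0: mu = mu0 (normal approximation)."""
    z = (xbar - mu0) / se
    return 2 * (1 - std_normal.cdf(abs(z)))


candidates = [98.0, 98.2, 98.3, 98.4, 98.6]
agreement = []
for mu0 in candidates:
    in_ci = lower <= mu0 <= upper
    rejected = two_tailed_p(mu0) < alpha
    agreement.append(in_ci != rejected)   # inside CI <=> not rejected
    print(f"mu0 = {mu0}: in CI = {in_ci}, reject H0 = {rejected}")

print(f"95% CI: ({lower:.4f}, {upper:.4f})")
print("CI and two-tailed test agree for every value:", all(agreement))
```

Every hypothesized value inside the interval is retained by the two-tailed test and every value outside it is rejected, which is the equivalence described above.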
Some statisticians advocate abandoning formal hypothesis tests altogether, with an emphasis on
confidence intervals instead.
confidence intervals tell us all plausible values of the population parameter
one-tailed tests make rejection easier, which can be regarded as loosening standards
Misinterpretations of Results of Significance Tests

"Do not reject H0" does not mean "Accept H0"
Statistical significance does not mean practical significance
The P-value cannot be interpreted as the probability that H0 is true.
It is P(test statistic takes observed value or beyond in tails | H0 true)
Not P(H0 true | observed test statistic value)
It is misleading to report results only if they are statistically significant
Some tests may be statistically significant just by chance