Hypothesis Testing: W&W, Chapter 9

Overview
We will discuss two approaches to
hypothesis testing:
1) Using confidence intervals
2) Using critical t or z values, or p values
Hypothesis Test
A statistical hypothesis is simply a claim
about a population that can be put to
the test by drawing a random sample.
Elements of a Hypothesis Test

1. The null hypothesis, Ho: Specifies
hypothesized values for one or more of
the population parameters
2. The alternative hypothesis, HA: A
statement which says that the population
parameter is something other than the
value specified by the null hypothesis
Elements of a Hypothesis Test

3. The error level, or α, which is just
1 - the confidence level (in terms of
probability)
4. One-tailed versus two-tailed tests
Example #1
Suppose we want to know if there is a difference in
the salaries for male and female professors. We
might take two samples, one of men and one of
women, to determine their respective mean salary
levels. The calculated M1 and M2 are estimates of the
population means, μ1 and μ2.
Ho: μ1 - μ2 = 0, or Ho: μ1 = μ2
HA: μ1 - μ2 ≠ 0, or HA: μ1 ≠ μ2
Example #1 (continued)
This is stated as a two-tailed test. If you
believe that women make less than
men, then the alternative hypothesis
might be something like:
HA: μ1 - μ2 > 0
Example #2
As an employee of the Federal Trade Commission, you are
vigilant in your stand against false or misleading advertising. A
manufacturer of razor blades claims that their new blades give
on average 15 good shaves. You conduct a small test by asking
10 randomly chosen men to each try one of these new razor
blades. The average number of good shaves reported is 13 and
the standard deviation is 3.62. The manufacturer claims that the
true number of shaves (or population value) is 15, or:
Ho: μ = 15
Example #2 (continued)
If we want to challenge the manufacturer's
claim, we might employ a one-tailed test,
where the alternative hypothesis would be:
HA: μ < 15
Or if we were agnostic, we could use a
two-tailed test:
HA: μ ≠ 15
Example #3
A more general test that we will see when we get to regression
is where the null hypothesis is equal to zero, and we want to
know if our parameters have a statistically significant effect, or
are different from zero.
For example, suppose a researcher wants to determine whether
electoral rules are related to voter turnout. Suppose
the impact of electoral rules on voter turnout is called β. A
typical hypothesis test will be something like the following:
Ho: Electoral rules have no impact on voter turnout, or β = 0
HA: Electoral rules affect voter turnout, or β ≠ 0
Testing Hypotheses: Confidence Intervals
Let's start with the first example I gave
where we want to see if there is a
difference in the mean salary level (in
thousands of dollars) of male and
female professors. Suppose that for
men (M1 = 16, n1 = 10, Σ(X1 - M1)² = 106)
and for women (M2 = 11, n2 = 5, Σ(X2 - M2)² = 40).

We calculate the confidence interval as:
(μ1 - μ2) = (M1 - M2) +/- t(α/2) · sp · √(1/n1 + 1/n2)
We need to calculate the value of sp, which is just:
sp² = [Σ(X1 - M1)² + Σ(X2 - M2)²] / [(n1 - 1) + (n2 - 1)]
= (106 + 40)/[(10-1) + (5-1)] = 146/13
= 11.2, thus sp = √11.2 = 3.35

We plug this back into our confidence
interval to obtain:
(μ1 - μ2)
= (16 - 11) +/- 2.16 (3.35) √(1/10 + 1/5)
= 5 +/- 4
Note: 2.16 is the critical value of t, for
95% confidence, 13 df (for a two-tailed
test).
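
The arithmetic above can be checked with a short script. This is only a sketch using the summary statistics from the example; the critical value 2.16 is copied from the t-table rather than computed, and the variable names are illustrative.

```python
import math

# Summary statistics from the salary example (thousands of dollars).
m1, n1, ss1 = 16.0, 10, 106.0   # men: mean, sample size, sum of squared deviations
m2, n2, ss2 = 11.0, 5, 40.0     # women: mean, sample size, sum of squared deviations

# Pooled standard deviation: combined squared deviations over combined df.
df = (n1 - 1) + (n2 - 1)
sp = math.sqrt((ss1 + ss2) / df)

# Critical t for 95% confidence and 13 df, taken from the t-table (assumed, not computed).
t_crit = 2.16

margin = t_crit * sp * math.sqrt(1 / n1 + 1 / n2)
diff = m1 - m2
lower, upper = diff - margin, diff + margin

print(f"df = {df}, sp = {sp:.2f}, margin = {margin:.2f}")
print(f"95% CI for the difference: ({lower:.2f}, {upper:.2f})")
```

The margin comes out just under 4, which the slides round to 5 +/- 4.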

With 95% confidence, the difference between
our means is estimated to be between 1 and
9, thus the claim that there is no difference
cannot be accepted, i.e., we can reject the
null hypothesis. Zero is not contained in the
interval.
In general, any hypothesis that lies outside the
confidence interval may be rejected. Thus
the confidence interval may be regarded as
the set of acceptable hypotheses.
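
As a small illustration of this rule, the sketch below checks a few hypothesized differences against the interval from the salary example. The helper name can_reject and the trial values 0, 5 and 10 are made up for the illustration.

```python
def can_reject(hypothesized_value, lower, upper):
    """Return True when the hypothesized value lies outside the confidence interval."""
    return not (lower <= hypothesized_value <= upper)

# Interval from the salary example, rounded as in the text: 5 +/- 4.
ci_lower, ci_upper = 1.0, 9.0

for h in (0.0, 5.0, 10.0):
    verdict = "reject" if can_reject(h, ci_lower, ci_upper) else "do not reject"
    print(f"Ho: difference = {h:g} -> {verdict}")
```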
Testing Hypotheses: p-values

Let's go back to our one-tailed test by the
FTC employee, who wants to determine
if the razor blade manufacturer's claim
of 15 good shaves is valid.
Ho: μ = 15
HA: μ < 15

A p-value is just the probability that the
sample value would be at least as extreme as
the value actually observed if Ho is true. In
other words, the p-value summarizes
how much agreement there is between
the data and the null hypothesis. In this
case, the null is that the razors give 15
good shaves.

We start by calculating the t or z
statistic associated with our observed
value. We would use t in this problem
because the sample size is small
(N=10), and the population standard
deviation is unknown (when the
sample size is large, t and z are
equivalent).

t = (M - μo) / (s/√N) = (13 - 15) / (3.62/√10) = -1.74
We can think of t as:
t = (estimate - null hypothesis) / standard error
If the null hypothesis is zero, then
t = estimate/standard error.
In this case, the t ratio simply measures the size
of the estimate relative to its standard error.
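
The t ratio for the razor blade example can be reproduced in a few lines. This is a minimal sketch; the function name t_statistic is illustrative.

```python
import math

def t_statistic(estimate, null_value, sample_sd, n):
    """(estimate - null value) divided by the standard error of the mean."""
    standard_error = sample_sd / math.sqrt(n)
    return (estimate - null_value) / standard_error

# Razor blade example: sample mean 13, claimed mean 15, s = 3.62, N = 10.
t = t_statistic(estimate=13, null_value=15, sample_sd=3.62, n=10)
print(f"t = {t:.3f}")
```

Carried to three decimals the ratio is about -1.747, which the slides report as -1.74.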

Now we want to find the area beyond that value of t,
which gives us the p-value. In this problem, t = -1.74.
To find the p-value, we need to take into account our
degrees of freedom, df = n-1 in this problem, which is
9.
We go to the t-table and look to see where our calculated
t falls relative to the cutoff values for various probability
values. Ignoring the sign, our value of t (1.74) is between the
t values of 1.38 (p=.10) and 1.83 (p=.05). Thus we can say that
our p-value is between .05 and .10.
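
Instead of bracketing the p-value with the table, the tail area can be approximated numerically. The sketch below integrates the Student's t density with Simpson's rule using only the standard library; the function names and the step count are arbitrary choices for the illustration, not part of the original material.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)

def lower_tail_p(t_value, df, steps=2000):
    """P(T <= t_value) for t_value <= 0, via Simpson's rule on [0, |t_value|]."""
    b = abs(t_value)
    h = b / steps  # steps must be even for Simpson's rule
    total = t_pdf(0, df) + t_pdf(b, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area_0_to_b = total * h / 3
    return 0.5 - area_0_to_b  # by symmetry, half of the probability lies below zero

p = lower_tail_p(-1.74, df=9)
print(f"one-tailed p-value = {p:.3f}")
```

The result should fall between .05 and .10, in line with the table lookup.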
Comparing our p-value or t/z statistic with a critical p-value or t/z
A classical hypothesis test consists of
setting a critical value, which will give us
the reject and accept regions. For
example, for a one-tailed test, with 95%
confidence (α = .05), we use a value of
z = 1.64 as our critical value, or for a
two-tailed test, we use z = 1.96.
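
These critical z values can be recovered from the standard normal distribution. The sketch below uses NormalDist from Python's statistics module (Python 3.8 or later); the variable names are illustrative.

```python
from statistics import NormalDist

alpha = 0.05
standard_normal = NormalDist()  # mean 0, standard deviation 1

z_one_tailed = standard_normal.inv_cdf(1 - alpha)      # all of alpha in one tail
z_two_tailed = standard_normal.inv_cdf(1 - alpha / 2)  # alpha split across both tails

print(f"one-tailed critical z = {z_one_tailed:.3f}")
print(f"two-tailed critical z = {z_two_tailed:.3f}")
```

To three decimals the one-tailed value is 1.645, which tables commonly list as 1.64 or 1.65.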
Comparison (continued)
We reject the null hypothesis if our calculated
t or z is beyond the critical t or z, or if the
p-value is ≤ α.
In the above example, the critical t value for a
one-tailed test with 95% confidence and 9 df is 1.83.
Since our calculated t (1.74 in absolute value) does
not exceed the critical t (or fall in the reject
region), we must accept (fail to reject) the null
hypothesis, the manufacturer's claim of 15 good shaves.
Also, our p-value is larger than α, which is .05.
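
The decision rule can be written out as a small function. This is a sketch only: the function name reject_null and the tail labels are invented for the illustration, and the critical value 1.83 is the table value quoted in the text.

```python
def reject_null(t_stat, t_crit, tail):
    """Classical decision rule. tail is 'lower', 'upper', or 'two'."""
    if tail == "lower":
        return t_stat < -t_crit
    if tail == "upper":
        return t_stat > t_crit
    if tail == "two":
        return abs(t_stat) > t_crit
    raise ValueError("tail must be 'lower', 'upper', or 'two'")

# Razor blade example: HA says the mean is below 15, so the reject region is the lower tail.
# 1.83 is the table value for a one-tailed test at the .05 level with 9 df.
decision = reject_null(t_stat=-1.74, t_crit=1.83, tail="lower")
print("reject Ho" if decision else "do not reject Ho")
```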
The Critical Region

A way to think about a calculated value
in the critical region is that:
1) Ho is true, but we have been
exceedingly unlucky and got a very
improbable sample.
2) Ho is not true after all. Thus it is no
surprise that our observed value was so
high or low.
The Critical Region

When we calculated the difference
between male and female professors'
salaries, if the difference is really large,
then we would expect to find something
in the tails, very far away from the
center of the distribution where the
difference is zero.
Type I and Type II Errors

Choosing an alpha level is tricky because
it sets the level at which we will reject
the null hypothesis. The higher this value
is, the greater the chance that we will
falsely reject a true Ho.

State of the World   Ho Accepted          Ho Rejected
If Ho is true        Correct decision     Type I error
                     Pr = 1 - α           Pr = α
If Ho is false       Type II error        Correct decision
                     Probability = β      Probability = 1 - β
                                          = power of the test
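
A small simulation can make the two rows of the table concrete. The sketch below makes several assumptions that are not in the original example: the population standard deviation is treated as known (so a z test is used instead of t), the "false Ho" scenario uses a true mean of 13, and the trial count and seed are arbitrary.

```python
import math
import random
from statistics import NormalDist, fmean

def rejection_rate(true_mean, null_mean=15.0, sigma=3.62, n=10,
                   alpha=0.05, trials=20000, seed=0):
    """Share of simulated samples in which a lower-tailed z test rejects Ho."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(alpha)  # negative cutoff for the lower tail
    se = sigma / math.sqrt(n)
    rejections = 0
    for _ in range(trials):
        sample = [rng.gauss(true_mean, sigma) for _ in range(n)]
        z = (fmean(sample) - null_mean) / se
        if z < z_crit:
            rejections += 1
    return rejections / trials

type_1_rate = rejection_rate(true_mean=15.0)  # Ho true: every rejection is a Type I error
power = rejection_rate(true_mean=13.0)        # Ho false: every rejection is a correct decision
print(f"Type I error rate ~ {type_1_rate:.3f}")
print(f"power ~ {power:.3f}, Type II error rate ~ {1 - power:.3f}")
```

When Ho is true the rejection rate should sit close to the chosen alpha of .05; when Ho is false, the share of samples that fail to reject is the Type II error rate.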

To give you an analogy, in a court of law
we assume people are innocent (the
null hypothesis) until proven guilty.
A Type I error would be finding an
innocent man guilty.
A Type II error would be letting a guilty
man go free.
Which is worse?

By decreasing our error or alpha level, we will
increase the chance of a Type II error
(accepting the null when it is really false)
because we make the criteria for rejection
more stringent.
The only way that the α (Type I) error can be reduced without
increasing the probability of a Type II error is
by increasing the sample size.
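
The trade-off can be illustrated with the textbook power formula for a lower-tailed z test. This is a sketch under stated assumptions rather than part of the original example: sigma is treated as known, the true mean is taken to be 13, and the alpha and sample-size combinations are chosen only for illustration.

```python
import math
from statistics import NormalDist

def type_2_error(alpha, n, null_mean=15.0, true_mean=13.0, sigma=3.62):
    """Beta for a lower-tailed z test with known sigma (illustrative values)."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(alpha)            # reject Ho when z < z_crit
    se = sigma / math.sqrt(n)
    shift = (null_mean - true_mean) / se          # distance between Ho and the truth, in standard errors
    power = std_normal.cdf(z_crit + shift)        # probability of rejecting a false Ho
    return 1 - power

for alpha, n in [(0.05, 10), (0.01, 10), (0.01, 40)]:
    print(f"alpha = {alpha}, n = {n}: beta = {type_2_error(alpha, n):.3f}")
```

Lowering alpha from .05 to .01 at n = 10 raises beta, while quadrupling the sample size at alpha = .01 brings beta back down below its original level.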
