Jury

                        Truth
Decision         Innocent      Guilty
"Innocent"       OK            ERROR
"Guilty"         ERROR         OK
                          Truth
Decision              Null hypothesis    Alternative hypothesis
Do not reject null    OK                 TYPE II ERROR
Reject null           TYPE I ERROR       OK
Definitions: Types of Errors
• Type I error: The null hypothesis is rejected when
it is true.
• Type II error: The null hypothesis is not rejected
when it is false.
• There is always a chance of making one of these
errors. We’ll want to minimize the chance of doing
so!
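The definition of a Type I error can be checked by simulation: when the null hypothesis is true, a level-α test should reject about α of the time. A minimal sketch, assuming a one-sided z-test with known σ (the sample size of 30 and α = 0.05 are illustrative choices, not from the slides):

```python
import random
from statistics import NormalDist

random.seed(1)
alpha = 0.05
z_crit = NormalDist().inv_cdf(1 - alpha)   # upper-tail critical value

def rejects(sample, mu0, sigma):
    """One-sided z-test of H0: mu = mu0 vs HA: mu > mu0 (sigma known)."""
    n = len(sample)
    z = (sum(sample) / n - mu0) / (sigma / n ** 0.5)
    return z > z_crit

# H0 is TRUE here: every sample is drawn from a population with mean mu0 = 0.
trials = 10_000
type1 = sum(
    rejects([random.gauss(0, 1) for _ in range(30)], mu0=0, sigma=1)
    for _ in range(trials)
)
print(type1 / trials)   # rejection rate should be close to alpha = 0.05
```

Every rejection here is, by construction, a Type I error, so the observed rejection rate estimates P(Type I error).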
P(Type I Error) in trials
• Criminal trials: “Beyond a reasonable doubt”. 12 of
12 jurors must unanimously vote guilty.
Significance level set at 0.001, say.
• Civil trials: “Preponderance of evidence.” 9 out of
12 jurors must vote guilty. Significance level set
at 0.10, say.
Example:
Serious Type I Error
• New Drug A is supposed to reduce diastolic blood
pressure by more than 15 mm Hg.
• H0: μ = 15 versus HA: μ > 15
• Drug A can have serious side effects, so don’t want
patients on it unless μ > 15.
• Implication of Type I error: Expose patients to
serious side effects without other benefit.
• Set α = P(Type I error) to be small, say 0.01.
Example:
Not so serious Type I Error
• Grade inflation?
• H0: μ = 2.7 vs. HA: μ > 2.7
• Type I error: claim average GPA is more than 2.7
when it really isn’t.
• Implication: Instructors grade harder. Students get
unhappy.
• Set α = P(Type I error) at, say, 0.10.
Type II Error and Power
• Type II Error is made when we fail to reject the null
when the alternative is true.
• Want to minimize P(Type II Error).
• Now, if alternative HA is true:
• P(reject|HA is true) + P(not reject|HA is true) =1
• “Power” + P(Type II error) = 1
• “Power” = 1 - P(Type II error)
Type II Error and Power
• “Power” of a test is the probability of rejecting null
when alternative is true.
• “Power” = 1 - P(Type II error)
• To minimize the P(Type II error), we equivalently
want to maximize power.
• But power depends on the value under the
alternative hypothesis ...
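For a one-sided z-test with known σ, power at a particular alternative can be computed directly from the normal distribution. A sketch using the drug example's null of μ = 15; the values σ = 3, n = 25, and the two candidate true means are illustrative assumptions:

```python
from statistics import NormalDist

def power_one_sided_z(mu0, mu_true, sigma, n, alpha=0.05):
    """Power of the upper-tail z-test of H0: mu = mu0 vs HA: mu > mu0,
    when the true mean is mu_true and sigma is known."""
    nd = NormalDist()
    z_crit = nd.inv_cdf(1 - alpha)
    shift = (mu_true - mu0) / (sigma / n ** 0.5)    # how far the truth sits from the null
    return 1 - nd.cdf(z_crit - shift)

p_near = power_one_sided_z(15, 16, sigma=3, n=25)   # true mean barely above the null
p_far = power_one_sided_z(15, 18, sigma=3, n=25)    # true mean well above the null
print(p_near, p_far)
```

This makes the dependence on the alternative concrete: power is low when the true mean is close to the null value and near 1 when it is far away.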
Power
• Power is probability, so number between 0 and 1.
• 0 is bad!
• 1 is good!
• Need to make power as high as possible.
Maximizing Power …
• The farther apart the actual mean is from the mean
specified in the null, the higher the power.
• The higher the significance level α, the higher the
P(Type I error), and the higher the power.
• The smaller the standard deviation, the higher the
power.
• The larger the sample, the higher the power.
That is, factors affecting power...
• Difference between value under the null and the
actual value
• Significance level α = P(Type I error)
• Standard deviation
• Sample size
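Each of the four factors above can be varied one at a time to see its effect. A sketch, again assuming a one-sided z-test with known σ and illustrative baseline numbers (μ0 = 15, true mean 16, σ = 3, n = 25, α = 0.05):

```python
from statistics import NormalDist

def power(mu0, mu_true, sigma, n, alpha):
    """Power of the upper-tail z-test, sigma known."""
    nd = NormalDist()
    return 1 - nd.cdf(nd.inv_cdf(1 - alpha) - (mu_true - mu0) / (sigma / n ** 0.5))

base = dict(mu0=15, mu_true=16, sigma=3, n=25, alpha=0.05)
baseline = power(**base)
print(baseline)
print(power(**{**base, "mu_true": 17}))   # larger difference from the null -> higher power
print(power(**{**base, "alpha": 0.10}))   # higher significance level     -> higher power
print(power(**{**base, "sigma": 2}))      # smaller standard deviation    -> higher power
print(power(**{**base, "n": 100}))        # larger sample                 -> higher power
```

Each variant prints a power strictly above the baseline, matching the bullet list.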
Strategy for designing a
good hypothesis test
• Use pilot study to estimate std. deviation.
• Specify α. Typically 0.01 to 0.10.
• Decide what a meaningful difference would be
between the mean in the null and the actual mean.
• Decide power. Typically 0.80 to 0.99.
• Use software to determine sample size.
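For the simple one-sided z-test, the "software" step can be done by hand: invert the power formula to get the smallest n that detects a meaningful difference Δ. A sketch under the usual known-σ assumption (Δ = 1 and σ = 3 are illustrative, as are the conventional defaults α = 0.05 and power 0.80):

```python
import math
from statistics import NormalDist

def sample_size(delta, sigma, alpha=0.05, power=0.80):
    """Smallest n so a one-sided z-test detects a mean shift of delta
    with the requested power:  n = ((z_{1-alpha} + z_{power}) * sigma / delta)^2."""
    nd = NormalDist()
    n = ((nd.inv_cdf(1 - alpha) + nd.inv_cdf(power)) * sigma / delta) ** 2
    return math.ceil(n)

print(sample_size(delta=1, sigma=3))            # n for a 1-unit shift
print(sample_size(delta=0.5, sigma=3))          # halving the detectable shift ~quadruples n
```

Real studies with unknown σ would use a t-test calculation (e.g. dedicated power software), but the structure of the answer is the same.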
If sample is too small ...
• … the power can be too low to identify even large
meaningful differences between the null and
alternative values.
• Determine sample size in advance of conducting study.
• Don’t believe the “fail-to-reject” results of a study based
on a small sample.
If sample is really large ...
• … the power can be extremely high for identifying
even meaningless differences between the null and
alternative values.
• In addition to performing hypothesis tests, use a
confidence interval to estimate the actual population
value.
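A numerical illustration of why the confidence interval matters, built on the grade-inflation null H0: μ = 2.7. The sample size, σ, and sample mean below are hypothetical, chosen so the true mean exceeds 2.7 by a trivial amount:

```python
from statistics import NormalDist

nd = NormalDist()
n, sigma, mu0 = 1_000_000, 0.5, 2.7
xbar = 2.702                      # hypothetical sample mean: only 0.002 above the null
se = sigma / n ** 0.5

z = (xbar - mu0) / se
p_value = 1 - nd.cdf(z)
ci = (xbar - 1.96 * se, xbar + 1.96 * se)

print(z, p_value)    # z is about 4, so H0 is rejected at any usual alpha
print(ci)            # but the interval shows the mean GPA barely exceeds 2.7
```

The test says “reject”; the interval answers the follow-up question “how much different?”, and here the answer is: not meaningfully.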
• If a study reports a “reject” result, ask: how much
different?
The moral of the story
as researcher
• Always determine how many measurements you
need to take in order to have high enough power to
achieve your study goals.
• If you don’t know how to determine sample size,
ask a statistical consultant to help you.
The moral of the story
as reviewer
• When interpreting the results of a study, always
take into account the sample size.