Introduction to Biostatistics
Inferential Statistics
Hypothesis Testing
Thomas Songer, PhD
with acknowledgment to several slides provided by
M Rahbar and Moataza Mahmoud Abdel Wahab
Research Process
Research question
Hypothesis
Identify research design
Data collection
Presentation of data
Data analysis
Interpretation of data
Polgar, Thomas
Interpreting Results
When evaluating an association between
disease and exposure, we need guidelines
to help determine whether there is a
true difference in the frequency of disease
between the two exposure groups, or perhaps
just random variation from the study sample.
Hypothesis Testing
The process of deciding statistically
whether the findings of an
investigation reflect chance or real
effects at a given level of probability.
Null Hypothesis
Alternative hypothesis
Identify level of significance
Test statistic
Identify p-value / confidence interval
Conclusion
Hypothesis Testing
H0: There is no association between the
exposure and disease of interest
H1: There is an association between the
exposure and disease of interest
Note: With prudent skepticism, the null hypothesis
is given the benefit of the doubt until the data
convince us otherwise.
Hypothesis Testing
Because of statistical uncertainty regarding
inferences about population parameters based
upon sample data, we cannot prove or disprove
either the null or the alternative hypothesis as
directly representing the population effect.
Thus, we make a decision based on probability
and accept a probability of making an incorrect
decision.
Chernick
Associations
Two types of error can occur when judging the
association between exposure and disease:
Type I error: observing a difference when in
truth there is none.
Type II error: failing to observe a difference
where there is one.
                                     REALITY
YOUR DECISION                        H0 True (No assoc.)     H1 True (Yes assoc.)
Do not reject H0 (not stat. sig.)    Correct decision        Type II (beta error)
Reject H0 (stat. sig.)               Type I (alpha error)    Correct decision
                                     REALITY
YOUR DECISION                        H0 True (No assoc.)     H1 True (Yes assoc.)
Do not reject H0 (not stat. sig.)    Correct decision        Failing to find a
                                                             difference when
                                                             one exists
Reject H0 (stat. sig.)               Finding a               Correct decision
                                     difference when
                                     there is none
Conventional Guidelines:
Set the fixed alpha level (Type I error) to 0.05
This means, if the null hypothesis is true, the
probability of incorrectly rejecting it is 5% or less.
Study Result
DECISION                             H0 True                 H1 True
Do not reject H0 (not stat. sig.)
Reject H0 (stat. sig.)               Type I (alpha error)
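The meaning of a fixed alpha of 0.05 can be illustrated with a small simulation. The sketch below is not from the original slides: it assumes two exposure groups that share the same true disease risk (so H0 is true), a pooled two-proportion z-test as the test of significance, and hypothetical values for the risk, group size, and number of simulated studies. The share of simulated studies that reject H0 should land near 5%.

```python
import math
import random


def two_proportion_p_value(x1, n1, x2, n2):
    """Two-sided p-value from a pooled z-test comparing two proportions."""
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    if se == 0:
        return 1.0  # no variation observed, so no evidence against H0
    z = (x1 / n1 - x2 / n2) / se
    # For a standard normal Z, P(|Z| > |z|) equals erfc(|z| / sqrt(2)).
    return math.erfc(abs(z) / math.sqrt(2))


def type_i_error_rate(true_risk=0.10, n=200, trials=2000, alpha=0.05, seed=1):
    """Share of simulated studies that reject H0 although H0 is true.

    All default values are hypothetical and chosen only for illustration.
    """
    rng = random.Random(seed)
    rejections = 0
    for _ in range(trials):
        # Both groups are drawn with the same true risk: no real association.
        x1 = sum(rng.random() < true_risk for _ in range(n))
        x2 = sum(rng.random() < true_risk for _ in range(n))
        if two_proportion_p_value(x1, n, x2, n) < alpha:
            rejections += 1
    return rejections / trials


rate = type_i_error_rate()
print(f"Share of studies rejecting a true H0: {rate:.3f}")
```

Every rejection counted here is a Type I error, because the simulated groups never differ in their true risk.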
Empirical Rule
For a Normal distribution approximately,
a) 68% of the measurements fall within one
standard deviation around the mean
b) 95% of the measurements fall within two
standard deviations around the mean
c) 99.7% of the measurements fall within three
standard deviations around the mean
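The three percentages in the Empirical Rule can be checked directly from the Normal cumulative distribution function. This is a minimal sketch using Python's standard library; the helper name share_within is introduced here for illustration only.

```python
from statistics import NormalDist

# Standard Normal distribution: mean 0, standard deviation 1.
standard_normal = NormalDist(mu=0.0, sigma=1.0)


def share_within(k):
    """Share of a Normal distribution lying within k standard deviations of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)


for k in (1, 2, 3):
    print(f"within {k} SD of the mean: {share_within(k):.2%}")
```

The exact shares are about 68.27%, 95.45%, and 99.73%, which the rule rounds to 68%, 95%, and 99.7%.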
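Normal Distribution
[Figure: Normal curve. 50% of the area lies on each side of the mean. On each side, 34.13% lies between the mean and one standard deviation, 13.59% between one and two standard deviations, and 2.28% beyond two standard deviations.]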
Sever
Example:
          D+      D-
E+        15      85
E-        10      90
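The slides that worked through this example did not survive, so the following is only a sketch of calculations commonly applied to such a table. It assumes E+/E- denote exposed and unexposed, D+/D- denote diseased and not diseased, and that an uncorrected Pearson chi-square test (1 degree of freedom, no continuity correction) is an acceptable test of association; the original may have used a different test.

```python
import math

# Counts from the example table (assumed layout: rows = exposure, columns = disease).
a, b = 15, 85  # exposed: diseased, not diseased
c, d = 10, 90  # unexposed: diseased, not diseased

risk_exposed = a / (a + b)
risk_unexposed = c / (c + d)
risk_ratio = risk_exposed / risk_unexposed
odds_ratio = (a * d) / (b * c)

# Pearson chi-square statistic for a 2x2 table, without continuity correction.
n = a + b + c + d
chi_square = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))

# With 1 degree of freedom, P(chi-square > x) = erfc(sqrt(x / 2)).
p_value = math.erfc(math.sqrt(chi_square / 2))

print(f"Risk ratio: {risk_ratio:.2f}")
print(f"Odds ratio: {odds_ratio:.2f}")
print(f"Chi-square: {chi_square:.2f}, p-value: {p_value:.2f}")
```

Under these assumptions the risk ratio is 1.5 and the p-value is about 0.29, so at an alpha level of 0.05 the null hypothesis of no association would not be rejected.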
[Figure: a confidence interval, marking the lower limit, the point estimate, and the upper limit. Sever]
Sever
Interpreting Results
Confidence Interval: Range of values for a point
estimate that has a specified probability of
including the true value of the parameter.
Confidence Level: (1.0 - α), usually expressed
as a percentage (e.g. 95%).
Confidence Limits: The upper and lower end
points of the confidence interval.
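As an illustration of these terms, the sketch below computes a point estimate and its confidence limits for a risk ratio, using the counts from the example table in these slides (15 of 100 exposed and 10 of 100 unexposed with disease). It assumes the common large-sample method of building the interval on the log scale; the function name risk_ratio_ci is hypothetical and the slides do not state which method was used.

```python
import math
from statistics import NormalDist


def risk_ratio_ci(cases_exposed, n_exposed, cases_unexposed, n_unexposed,
                  confidence=0.95):
    """Point estimate and confidence limits for a risk ratio (log method)."""
    rr = (cases_exposed / n_exposed) / (cases_unexposed / n_unexposed)
    # Large-sample standard error of ln(RR).
    se_log = math.sqrt(1 / cases_exposed - 1 / n_exposed
                       + 1 / cases_unexposed - 1 / n_unexposed)
    # Critical value for the chosen confidence level (1 - alpha).
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    lower = math.exp(math.log(rr) - z * se_log)
    upper = math.exp(math.log(rr) + z * se_log)
    return rr, lower, upper


rr, lower, upper = risk_ratio_ci(15, 100, 10, 100)
print(f"RR = {rr:.2f}, 95% C.I. {lower:.2f} to {upper:.2f}")
```

With these counts the 95% interval runs from roughly 0.71 to 3.18. It includes 1.0, the null value for a ratio measure, so the data are compatible with no association.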
[Figure: number line running from 0.0 to 2.0, with the null value marked]
[Figure: number line running from 0.0 to 2.0, with the null value marked]
Interpreting Results
Interpretation of C.I. for OR and RR:
The C.I. provides an idea of the likely magnitude of
the effect and the random variability of the point
estimate.
On the other hand, the p-value reveals nothing about
the magnitude of the effect or the random variability
of the point estimate.
In general, smaller sample sizes produce wider C.I.s due
to uncertainty (lack of precision) in the point estimate.
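The link between sample size and interval width can be seen with a simple large-sample (Wald) interval for a single proportion. The observed proportion of 0.15 and the three sample sizes below are hypothetical values chosen for illustration, and the Wald formula is one of several possible interval methods.

```python
import math
from statistics import NormalDist


def proportion_ci_width(p_hat, n, confidence=0.95):
    """Width of the Wald confidence interval for a proportion."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return 2 * z * math.sqrt(p_hat * (1 - p_hat) / n)


# Same observed proportion (hypothetical), increasingly large samples.
widths = {n: proportion_ci_width(0.15, n) for n in (50, 200, 800)}
for n, width in widths.items():
    print(f"n = {n:4d}: interval width = {width:.3f}")
```

Because the width is proportional to 1 / sqrt(n), quadrupling the sample size halves the width of the interval.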
Selection of Tests of Significance
Scale of Data
1. Nominal: Data do not represent an amount or
quantity (e.g., Marital Status, Sex)
2. Ordinal: Data represent an ordered series of
relationships (e.g., level of education)
3. Interval: Data are measured on an interval scale
having equal units but an arbitrary zero point (e.g.,
Temperature in Fahrenheit)
4. Ratio: A variable such as weight, for which we
can meaningfully compare one value with another
(say, 100 kg is twice 50 kg)
Scale of data                               Test of significance
Nominal                                     Chi-square test
Ordinal                                     Mann-Whitney U test
Interval (continuous) - 2 groups            T-test
Interval (continuous) - 3 or more groups    ANOVA