
Bias, Confounding and Fallacies in Epidemiology
M. Tevfik DORAK
http://www.dorak.info/epi

BIAS

Definition
Types
Examples
Remedies

CONFOUNDING
Definition
Examples
Remedies

FALLACIES
Definition

(Effect Modification)

What is Bias?
Bias is one of the three major threats to internal
validity:

Bias
Confounding
Random error / chance

What is Bias?
Any trend in the collection, analysis, interpretation,
publication or review of data that can lead to
conclusions that are systematically different from
the truth (Last, 2001)
A process at any stage of inference tending to
produce results that depart systematically from
the true values (Fletcher et al, 1988)
Systematic error in design or conduct of a study
(Szklo et al, 2000)

Bias is systematic error


Errors can be differential (systematic) or non-differential (random)
Random error: use of an invalid outcome
measure that misclassifies cases and
controls equally
Differential error: use of an invalid measure
that misclassifies cases in one direction and
controls in another
The term 'bias' should be reserved for differential or
systematic error

Random Error
(Figure: per cent against size of induration (mm); source: WHO)

Systematic Error
(Figure: per cent against size of induration (mm); source: WHO)

Chance vs Bias
Chance is caused by random error
Bias is caused by systematic error
Errors from chance will cancel each other out in the
long run (large sample size)
Errors from bias will not cancel each other out
whatever the sample size
Chance leads to imprecise results
Bias leads to inaccurate results
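
The contrast between chance and bias can be illustrated with a small simulation. This is a minimal sketch and not part of the original slides: the true value, the noise level and the size of the systematic offset are all invented for illustration.

```python
import random
import statistics

def simulate_mean(n, true_value=10.0, noise_sd=3.0, bias=0.0, seed=0):
    """Mean of n measurements with random noise plus a constant systematic offset."""
    rng = random.Random(seed)
    readings = [true_value + bias + rng.gauss(0.0, noise_sd) for _ in range(n)]
    return statistics.fmean(readings)

TRUE_VALUE = 10.0  # hypothetical true value being measured
for n in (10, 1_000, 100_000):
    unbiased = simulate_mean(n, TRUE_VALUE, bias=0.0, seed=1)
    biased = simulate_mean(n, TRUE_VALUE, bias=2.0, seed=1)
    print(f"n={n:>6}  error without bias={unbiased - TRUE_VALUE:+.3f}  "
          f"error with bias={biased - TRUE_VALUE:+.3f}")
```

As the sample grows, the error of the unbiased measurement shrinks towards zero, while the error of the biased measurement settles near the assumed offset of 2.0 however large the sample becomes.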

Types of Bias
Selection bias

Unrepresentative nature of sample

Information (misclassification) bias

Errors in measurement of exposure or disease

Confounding bias

Distortion of exposure-disease relation by some other factor
Types of bias are not mutually exclusive
(effect modification is not bias)
This classification was proposed by Miettinen OS in the 1970s
See for example Miettinen & Cook, 1981 (www)

Selection Bias
Selective differences between comparison groups
that affect the relationship between exposure
and outcome
Usually results from comparison groups not
coming from the same study base and not being
representative of the populations they come from

Selection Bias Examples

(www)

Selection Bias Examples

Selective survival (Neyman's) bias


(www)

Selection Bias Examples


Case-control study:
Controls have less potential for exposure than cases
Outcome = brain tumour; exposure = overhead
high voltage power lines
Cases chosen from province wide cancer registry
Controls chosen from rural areas
Systematic differences between cases and controls

Case-Control Studies:
Potential Bias

Schulz & Grimes, 2002 (www) (PDF)

Selection Bias Examples


Cohort study:
Differential loss to follow-up
Especially problematic in cohort studies
Subjects in follow-up study of multiple sclerosis may
differentially drop out due to disease severity
Differential attrition leads to selection bias
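
The effect of differential loss to follow-up on a cohort estimate can be shown with simple arithmetic. The sketch below is illustrative only: the cohort sizes, event counts and dropout fractions are hypothetical and do not come from any study cited in these slides.

```python
def risk_ratio(events_exp, total_exp, events_unexp, total_unexp):
    """Risk in the exposed divided by risk in the unexposed."""
    return (events_exp / total_exp) / (events_unexp / total_unexp)

def remaining(events, non_events, loss_events, loss_non_events):
    """Return (events, total) still under follow-up after the given loss fractions."""
    kept_events = events * (1 - loss_events)
    kept_non_events = non_events * (1 - loss_non_events)
    return kept_events, kept_events + kept_non_events

# Hypothetical full cohort: 1000 exposed (200 events), 1000 unexposed (100 events).
true_rr = risk_ratio(200, 1000, 100, 1000)

# Non-differential loss: 10% of everyone drops out, regardless of outcome.
e1, n1 = remaining(200, 800, 0.10, 0.10)
e0, n0 = remaining(100, 900, 0.10, 0.10)
rr_equal_loss = risk_ratio(e1, n1, e0, n0)

# Differential loss: half of the exposed subjects who develop the outcome drop out.
e1, n1 = remaining(200, 800, 0.50, 0.10)
rr_differential_loss = risk_ratio(e1, n1, e0, n0)

print(f"true RR = {true_rr:.2f}")
print(f"RR with equal loss = {rr_equal_loss:.2f}")
print(f"RR with differential loss = {rr_differential_loss:.2f}")
```

Under these assumptions, equal dropout in all groups leaves the risk ratio unchanged, whereas dropout that depends on both exposure and outcome pulls the observed risk ratio well below the true one.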

Selection Bias Examples


Self-selection bias:
- You want to determine the prevalence of HIV infection
- You ask for volunteers for testing
- You find no HIV
- Is it correct to conclude that there is no HIV in this
location?

Selection Bias Examples


Healthy worker effect:
Another form of self-selection bias
Self-screening process: people who are unhealthy
screen themselves out of the active worker population
Example:
- Course of recovery from low back injuries in 25-45 year
olds
- Data captured on workers compensation records
- But prior to identifying subjects for study, self-selection
has already taken place

Selection Bias Examples


Diagnostic or workup bias:
Also occurs before subjects are identified for study
Diagnoses (case selection) may be influenced by
physician's knowledge of exposure
Example:
- Case-control study: outcome is pulmonary disease,
exposure is smoking
- Radiologist aware of patient's smoking status when
reading x-ray may look more carefully for
abnormalities on x-ray and differentially select cases
Legitimate for clinical decisions, inconvenient for research

Types of Bias
Selection bias

Unrepresentative nature of sample

** Information (misclassification) bias **


Errors in measurement of exposure or disease

Confounding bias

Distortion of exposure-disease relation by some other factor
Types of bias are not mutually exclusive
(effect modification is not bias)

Information / Measurement /
Misclassification Bias
Method of gathering information is inappropriate and
yields systematic errors in measurement of exposures
or outcomes
If misclassification of exposure (or disease) is
unrelated to disease (or exposure) then the
misclassification is non-differential
If misclassification of exposure (or disease) is related
to disease (or exposure) then the misclassification is
differential
Distorts the true strength of association
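
The difference between non-differential and differential misclassification can be made concrete by applying imperfect exposure measurement to a known 2x2 table. This is a sketch under stated assumptions: the true counts and the sensitivity and specificity values are hypothetical, and the calculation uses expected counts rather than a random simulation.

```python
def odds_ratio(a, b, c, d):
    """a, b = exposed and unexposed cases; c, d = exposed and unexposed controls."""
    return (a * d) / (b * c)

def observed_counts(exposed, unexposed, sensitivity, specificity):
    """Expected (exposed, unexposed) counts as recorded by an imperfect exposure measure."""
    seen_exposed = sensitivity * exposed + (1 - specificity) * unexposed
    return seen_exposed, (exposed + unexposed) - seen_exposed

# Hypothetical true table: 100/100 cases and 50/150 controls (exposed/unexposed).
true_or = odds_ratio(100, 100, 50, 150)

# Non-differential: same sensitivity and specificity in cases and controls.
a, b = observed_counts(100, 100, 0.80, 0.90)
c, d = observed_counts(50, 150, 0.80, 0.90)
nondifferential_or = odds_ratio(a, b, c, d)

# Differential (e.g. recall bias): cases report exposure more readily than controls.
a, b = observed_counts(100, 100, 0.95, 0.80)
c, d = observed_counts(50, 150, 0.70, 0.95)
differential_or = odds_ratio(a, b, c, d)

print(f"true OR = {true_or:.2f}")
print(f"OR with non-differential error = {nondifferential_or:.2f}")
print(f"OR with differential error = {differential_or:.2f}")
```

With these invented figures the non-differential error pulls the odds ratio towards 1, while the differential error pushes it away from 1. Differential misclassification can bias in either direction, depending on the error rates in each group.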

Information / Measurement /
Misclassification Bias
Sources of information bias:
Subject variation
Observer variation
Deficiency of tools
Technical errors in measurement

Information / Measurement /
Misclassification Bias
Recall bias:
Those exposed have a greater sensitivity for recalling
exposure (reduced specificity)
- specifically important in case-control studies
- when exposure history is obtained retrospectively
cases may more closely scrutinize their past history
looking for ways to explain their illness
- controls, not feeling a burden of disease, may less
closely examine their past history
Those who develop a cold are more likely to identify
the exposure than those who do not: differential
misclassification
- Case: Yes, I was sneezed on
- Control: No, can't remember any sneezing

Information / Measurement /
Misclassification Bias
Reporting bias:
Individuals with severe disease tend to have more
complete records, and therefore more complete
information about exposures, so a greater association
is found
Individuals who are aware of being participants of a
study behave differently (Hawthorne effect)

Controlling for Information Bias


- Blinding
prevents investigators and interviewers from
knowing case/control or exposed/non-exposed
status of a given participant
- Form of survey
mail may impose less 'white coat' tension than a
phone or face-to-face interview
- Questionnaire
use multiple questions that ask for the same information;
this acts as a built-in double-check
- Accuracy
multiple checks in medical records
gathering diagnosis data from multiple sources

Types of Bias
Selection bias

Unrepresentative nature of sample

Information (misclassification) bias

Errors in measurement of exposure or disease

** Confounding bias **

Distortion of exposure-disease relation by some other factor
Types of bias are not mutually exclusive
(effect modification is not bias)

Cases of Down Syndrome by Birth Order (figure; source: EPIET)

Cases of Down Syndrome by Age Groups (figure; source: EPIET)

Cases of Down Syndrome by Birth Order and Maternal Age (figure; source: EPIET)

Confounding
A third factor which is related to both
exposure and outcome, and which accounts
for some/all of the observed relationship
between the two
A confounder is not a result of the exposure
e.g., association between child's birth rank
(exposure) and Down syndrome (outcome);
mother's age a confounder?
e.g., association between mother's age (exposure)
and Down syndrome (outcome); birth rank a
confounder?

Confounding
To be a confounding factor, two conditions must be met:

Exposure

Outcome

Third variable
Be associated with exposure
- without being the consequence of exposure
Be associated with outcome
- independently of exposure (not an intermediary)

Confounding

Birth Order

Down Syndrome

Maternal Age
Maternal age is correlated with birth
order and a risk factor even if birth order
is low

Confounding ?

Maternal Age

Down Syndrome

Birth Order
Birth order is correlated with maternal age
but not a risk factor in younger mothers

Confounding

Coffee

CHD

Smoking
Smoking is correlated with coffee
drinking and a risk factor even for those
who do not drink coffee

Confounding ?

Smoking

CHD

Coffee
Coffee drinking may be correlated with
smoking but is not a risk factor in nonsmokers

Confounding

Alcohol

Lung Cancer

Smoking
Smoking is correlated with alcohol
consumption and a risk factor even for
those who do not drink alcohol

Confounding ?

Smoking

CHD

Yellow fingers
Not related to the outcome
Not an independent risk factor

Confounding ?

Diet

CHD

Cholesterol
On the causal pathway

Confounding
Imagine you have repeated a positive finding of birth order
association in Down syndrome or association of coffee drinking
with CHD in another sample. Would you be able to replicate it?
If not why?
Imagine you have included only non-smokers in a study and
examined association of alcohol with lung cancer. Would you
find an association?
Imagine you have stratified your dataset for smoking status in
the alcohol - lung cancer association study. Would the odds
ratios differ in the two strata?
Imagine you have tried to adjust your alcohol association for
smoking status (in a statistical model). Would you see an
association?

Confounding
Imagine you have repeated a positive finding of birth order
association in Down syndrome or association of coffee drinking
with CHD in another sample. Would you be able to replicate it?
If not why?
You would not necessarily be able to replicate the
original finding because it was a spurious association
due to confounding.
In another sample where all mothers are below 30 yr,
there would be no association with birth order.
In another sample in which there are few smokers,
the coffee association with CHD would not be
replicated.

Confounding
Imagine you have included only non-smokers in a study and
examined association of alcohol with lung cancer. Would you
find an association?
No, because the first study was confounded. The
association with alcohol was actually due to smoking.
By restricting the study to non-smokers, we have
found the truth. Restriction is one way of preventing
confounding at the time of study design.

Confounding
Imagine you have stratified your dataset for smoking status in
the alcohol - lung cancer association study. Would the odds
ratios differ in the two strata?
The alcohol association would yield a similar odds
ratio in both strata, and it would be close to unity. In
confounding, the stratum-specific odds ratios should
be similar to each other and different from the crude
odds ratio by at least 15%. Stratification is one way of
identifying confounding at the time of analysis.

If the stratum-specific odds ratios are different, then
this is not confounding but effect modification.
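
The comparison of crude and stratum-specific odds ratios described above can be sketched with a pair of 2x2 tables. All counts are hypothetical, chosen so that alcohol has no association with lung cancer within either smoking stratum; they are not data from the slides.

```python
def odds_ratio(a, b, c, d):
    """a, b = exposed and unexposed cases; c, d = exposed and unexposed controls."""
    return (a * d) / (b * c)

# Hypothetical counts:
# (drinker cases, non-drinker cases, drinker controls, non-drinker controls)
smokers = (80, 20, 40, 10)
non_smokers = (5, 15, 30, 90)

# Collapsing the two strata gives the crude (unstratified) table.
crude = tuple(s + n for s, n in zip(smokers, non_smokers))

or_smokers = odds_ratio(*smokers)
or_non_smokers = odds_ratio(*non_smokers)
or_crude = odds_ratio(*crude)

print(f"OR in smokers     = {or_smokers:.2f}")
print(f"OR in non-smokers = {or_non_smokers:.2f}")
print(f"crude OR          = {or_crude:.2f}")
```

In this invented dataset the stratum-specific odds ratios agree with each other and differ markedly from the crude odds ratio, which is the pattern expected under confounding by smoking.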

Confounding
Imagine you have tried to adjust your alcohol association for
smoking status (in a statistical model). Would you see an
association?
If smoking is included in the statistical model, the
alcohol association would lose its statistical
significance. Adjustment by multivariable modelling is
another method to identify confounders at the time of
data analysis.

Confounding
For confounding to occur, the confounders should be
differentially represented in the comparison groups.
Randomisation is an attempt to evenly distribute
potential (unknown) confounders in study groups. It
does not guarantee control of confounding.
Matching is another way of achieving the same. It
ensures equal representation of subjects with known
confounders in study groups. It has to be coupled with
matched analysis.
Restriction for potential confounders in design also
prevents confounding but causes loss of statistical
power (instead stratified analysis may be tried).
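
The point that matching has to be coupled with matched analysis can be illustrated for 1:1 matched case-control pairs. This is a minimal sketch: the pair counts are hypothetical, and the matched odds ratio is taken as the ratio of discordant pairs, with McNemar's chi-square statistic (without continuity correction) as the accompanying test.

```python
# Hypothetical 1:1 matched case-control pairs, classified by exposure of each member.
both_exposed = 30
case_only_exposed = 40      # discordant: case exposed, control not
control_only_exposed = 20   # discordant: control exposed, case not
neither_exposed = 110

# Matched analysis: only discordant pairs carry information.
matched_or = case_only_exposed / control_only_exposed
mcnemar_chi2 = (case_only_exposed - control_only_exposed) ** 2 / (
    case_only_exposed + control_only_exposed
)

# Unmatched analysis of the same subjects, ignoring the pairing.
cases_exposed = both_exposed + case_only_exposed
cases_unexposed = control_only_exposed + neither_exposed
controls_exposed = both_exposed + control_only_exposed
controls_unexposed = case_only_exposed + neither_exposed
unmatched_or = (cases_exposed * controls_unexposed) / (cases_unexposed * controls_exposed)

print(f"matched OR = {matched_or:.2f}, McNemar chi-square = {mcnemar_chi2:.2f}")
print(f"unmatched OR = {unmatched_or:.2f}")
```

With these figures, ignoring the pairing gives an odds ratio closer to 1 than the matched estimate.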

Confounding
Randomisation, matching and restriction can be tried at
the time of designing a study to reduce the risk of
confounding.
At the time of analysis:
Stratification and multivariable (adjusted) analysis can
achieve the same.
It is preferable to try something at the time of designing
the study.

Effect of randomisation on outcome of trials in acute pain (figure; source: Bandolier Bias Guide)

Confounding

Obesity

Mastitis

Age
Older cows are heavier, and older age
increases the risk of mastitis. The age
association may therefore appear as an
obesity association

Confounding

If each case is matched with a same-age control, there will be no
association (OR for old age = 2.6, P = 0.0001)
(www)

No Confounding

(www)

Cases of Down Syndrome by Birth Order and Maternal Age

If each case is matched with a same-age control, there will be no
association. If the analysis is repeated after stratification by age, there
will be no association with birth order.
EPIET (www)

BIAS

Definition
Types
Examples
Remedies

CONFOUNDING
Definition
Examples
Remedies

** (Effect Modification) **

FALLACIES
Definition

Confounding or Effect Modification


Birth Weight

Leukaemia
Sex

Can sex be responsible for the birth weight association in leukaemia?
- Is it correlated with birth weight?
- Is it correlated with leukaemia independently of
birth weight?
- Is it on the causal pathway?
- Can it be associated with leukaemia even if birth
weight is low?
- Is sex distribution uneven in comparison groups?

Confounding or Effect Modification

Does the birth weight association differ in strength according to sex?

Overall: Birth Weight - Leukaemia, OR = 1.5
Boys: Birth Weight - Leukaemia, OR = 1.8
Girls: Birth Weight - Leukaemia, OR = 0.9

Effect Modification
In an association study, if the strength of the
association varies over different categories of a third
variable, this is called effect modification. The third
variable is changing the effect of the exposure.
The effect modifier may be sex, age, an environmental
exposure or a genetic effect.
Effect modification is similar to interaction in statistics.
There is no adjustment for effect modification. Once it
is detected, stratified analysis can be used to obtain
stratum-specific odds ratios.
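
One common way to check whether two stratum-specific odds ratios differ by more than chance is a z-test on the difference of their logarithms, using Woolf standard errors. The sketch below assumes hypothetical counts, invented so that the two strata give odds ratios of 1.8 and 0.9; it is one possible approach and not necessarily the method used for the example in these slides.

```python
import math

def log_or_and_se(a, b, c, d):
    """Log odds ratio and its Woolf standard error for a 2x2 table.
    a, b = exposed and unexposed cases; c, d = exposed and unexposed controls."""
    log_or = math.log((a * d) / (b * c))
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    return log_or, se

# Hypothetical counts chosen to give ORs of 1.8 and 0.9.
boys = (90, 100, 50, 100)
girls = (45, 100, 50, 100)

log_or_boys, se_boys = log_or_and_se(*boys)
log_or_girls, se_girls = log_or_and_se(*girls)

# z-test for the difference between the two log odds ratios.
z = (log_or_boys - log_or_girls) / math.sqrt(se_boys**2 + se_girls**2)
p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal p-value

print(f"OR boys = {math.exp(log_or_boys):.2f}, OR girls = {math.exp(log_or_girls):.2f}")
print(f"z = {z:.2f}, two-sided p = {p_value:.3f}")
```

A small p-value suggests that the stratum-specific odds ratios are heterogeneous, in which case they are reported separately rather than pooled.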

Effect modifier

Belongs to nature
Different effects in different strata
Simple
Useful
Increases knowledge of biological mechanism
Allows targeting of public health action

Confounding factor

Belongs to study
Adjusted OR/RR different from crude OR/RR
Distortion of effect
Creates confusion in data
Prevent (design)
Control (analysis)

BIAS

Definition
Types
Examples
Remedies

CONFOUNDING
Definition
Examples
Remedies

(Effect Modification)

** FALLACIES **
Definition

Fallacies
HISTORICAL FALLACY
ECOLOGICAL FALLACY
(Cross-Level Bias)
BERKSON'S FALLACY
(Selection Bias in Hospital-Based CC Studies)
HAWTHORNE EFFECT
(Participant Bias)
REGRESSION TO THE MEAN (Davis, 1976)
(Information Bias)

HOW TO CONTROL FOR CONFOUNDERS?
IN STUDY DESIGN
RESTRICTION of subjects according to potential
confounders (i.e. simply don't include confounder
in study)
RANDOM ALLOCATION of subjects to study groups
to attempt to even out unknown confounders
MATCHING subjects on potential confounder thus
assuring even distribution among study groups

HOW TO CONTROL FOR CONFOUNDERS?
IN DATA ANALYSIS
STRATIFIED ANALYSIS using the Mantel-Haenszel
method to adjust for confounders
IMPLEMENT A MATCHED-DESIGN after you have
collected data (frequency or group)
RESTRICTION is still possible at the analysis
stage but it means throwing away data
MODEL FITTING using regression techniques
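
The Mantel-Haenszel pooled odds ratio mentioned above can be computed directly from the stratum tables as the sum of a*d/n divided by the sum of b*c/n. The following is a minimal sketch with hypothetical counts. The function name and the stratum labels are illustrative choices, not taken from the slides.

```python
def mantel_haenszel_or(strata):
    """Pooled OR over 2x2 tables given as (a, b, c, d):
    a, b = exposed and unexposed cases; c, d = exposed and unexposed controls."""
    numerator = sum(a * d / (a + b + c + d) for a, b, c, d in strata)
    denominator = sum(b * c / (a + b + c + d) for a, b, c, d in strata)
    return numerator / denominator

strata = [
    (10, 40, 20, 160),  # hypothetical stratum 1 (e.g. non-smokers)
    (60, 30, 30, 30),   # hypothetical stratum 2 (e.g. smokers)
]

# Crude table: add the strata together cell by cell.
a, b, c, d = (sum(col) for col in zip(*strata))
crude_or = (a * d) / (b * c)
adjusted_or = mantel_haenszel_or(strata)

print(f"crude OR = {crude_or:.2f}")
print(f"Mantel-Haenszel adjusted OR = {adjusted_or:.2f}")
```

In this invented example both strata have the same odds ratio, so the pooled estimate equals the stratum value, while the crude estimate is inflated by the confounder.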

Effect of blinding on outcome of trials of acupuncture for chronic back pain (figure; source: Bandolier Bias Guide)

WILL ROGERS' PHENOMENON


Assume that you are tabulating survival for patients with a certain type of
tumour. You separately track survival of patients whose cancer has
metastasized and survival of patients whose cancer remains localized. As you
would expect, average survival is longer for the patients without metastases.
Now a fancier scanner becomes available, making it possible to detect
metastases earlier. What happens to the survival of patients in the two groups?
The group of patients without metastases is now smaller. The patients who are
removed from the group are those with small metastases that could not have
been detected without the new technology. These patients tend to die sooner
than the patients without detectable metastases. By taking away these
patients, the average survival of the patients remaining in the "no metastases"
group will improve.
What about the other group? The group of patients with metastases is now
larger. The additional patients, however, are those with small metastases.
These patients tend to live longer than patients with larger metastases. Thus
the average survival of all patients in the "with-metastases" group will
improve.
Changing the diagnostic method paradoxically increased the average survival
of both groups! This paradox is called the Will Rogers' phenomenon after a
quote from the humorist Will Rogers ("When the Okies left Oklahoma and moved
to California, they raised the average intelligence in both states") (www).
See also Feinstein et al, 1985 (www)
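
The stage migration described above can be reproduced with a handful of numbers. The survival times below are hypothetical and serve only to show that reclassifying patients can raise the average in both groups without changing any individual outcome.

```python
from statistics import mean

# Hypothetical survival times in months.
no_mets_before = [60, 55, 50, 20, 18]
mets_before = [10, 8, 6, 4]

# The new scanner finds small metastases in the two shortest survivors
# of the "no metastases" group, so they are reclassified.
migrated = [20, 18]
no_mets_after = [t for t in no_mets_before if t not in migrated]
mets_after = mets_before + migrated

print(f"no metastases:   {mean(no_mets_before):.1f} -> {mean(no_mets_after):.1f} months")
print(f"with metastases: {mean(mets_before):.1f} -> {mean(mets_after):.1f} months")
print(f"all patients:    {mean(no_mets_before + mets_before):.1f} -> "
      f"{mean(no_mets_after + mets_after):.1f} months")
```

Both group averages rise while the overall average stays the same, because no patient's survival has actually changed.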

Cause-and-Effect Relationship

Grimes & Schulz, 2002 (www)

M. Tevfik DORAK
Paediatric & Lifecourse Epidemiology Research Group
School of Clinical Medical Sciences (Child Health)
Newcastle University
England, U.K.
http://www.dorak.info
