
HOW TO READ A RESEARCH PAPER

Dr. Kamal Preet


AP, Department Of Pathology
Why read a paper at all?

Traditional sources of information: textbooks, journal subscriptions, peer or expert opinion

Online sources of information:


For reference while writing a thesis or research paper
For an evidence-based approach to clinical decision making
For best practice in a given situation (choosing between intervention/drug X or Y)
For evidence on a currently followed intervention/drug, and being able to question it
For current guidelines/classifications
Searching the literature
Raw databases and indexes: MEDLINE/PubMed is the flagship database, indexing millions of research articles.
Examples: MEDLINE/PubMed, Google Scholar

Databases of pre-appraised articles: systematic review synopses with critical appraisal.
Pioneered by Archie Cochrane in the 1980s.
Aim to pool all the literature available on a given topic.
Examples: Cochrane Controlled Clinical Trials Register, NHS Economic Evaluation Database

Databases of synthesized evidence:
A relatively new concept.
Computerized decision support systems that help translate research into practice and support decision making statistically.
Examples: BMJ Point of Care, Clinical Evidence, Cochrane Database of Systematic Reviews (CDSR), Database of Abstracts of Reviews of Effectiveness (DARE)
Hierarchy of literature search
(pyramid, top to bottom)

1. Databases of synthesized evidence
2. Databases of pre-appraised articles
3. Raw databases and indexes


Getting your bearings: what is this paper about?
The science of trashing papers: by some estimates, only about 1% of published medical research is free of methodological flaws.

Three preliminary questions to get your bearings:

Question One: What was the research question and why was the study needed?
Question Two: What was the research design?
Question Three: Was the research design appropriate to the question?
Question One: What was the research question and why was the study needed?

Not all research studies (even good ones) are set up to test a single definitive hypothesis
Qualitative research studies: aim to look at particular issues in a particular area, which generates further hypotheses.

Question Two: What was the research design?

1. Primary studies: report research first-hand (original article)
2. Secondary (integrative) studies: summarize and draw conclusions from primary studies (review article)
Primary studies           Secondary studies

Laboratory experiments    Overviews:
                            a) Non-systematic
                            b) Systematic
                            c) Meta-analysis
Clinical trials           Guidelines
Surveys                   Decision analyses


Types of studies

Randomized controlled trials


Case-control studies
Cohort studies
Cross-sectional survey
Case reports/ series
Traditional hierarchy of research
(pyramid, strongest evidence at the top)

1. Systematic reviews of RCTs
2. RCTs
3. Other controlled clinical trials
4. Observational studies (cohort and case-control)
5. Cross-sectional surveys
6. Case studies, anecdote, bench studies and personal opinion


Term                            Meaning

Parallel group comparison       Each group receives a different treatment, with both
                                groups being entered at the same time.
Paired/matched comparison       Participants receiving different treatments are
                                matched to balance potential confounding variables
                                such as age and sex.
Within-participant comparison   Participants are assessed before and after an
                                intervention, and results are analyzed in terms of
                                within-participant changes.
Term                 Meaning

Single blind         Participants do not know which treatment they are receiving.
Double blind         Neither the participants nor the investigators know which
                     treatment each participant is receiving.
Crossover            Each participant receives both the intervention and control
                     treatments (in random order), often separated by a
                     no-treatment (washout) period.
Placebo controlled   Control participants receive a placebo (inactive pill), which
                     should look and taste the same as the active pill. Placebo
                     (sham) operations may also be used in trials of surgery.
Factorial design     A study that permits investigation of the effects (both
                     separately and combined) of more than one independent
                     variable on a given outcome (e.g. a 2 x 2 factorial design
                     tested the effects of placebo, aspirin alone, streptokinase
                     alone, or aspirin with streptokinase in acute heart attack).
Question Three: Was the research design appropriate to the question?

E.g. was an RCT the best method of addressing this particular problem?
OR
If the study was not an RCT, should it have been?

How to reach this conclusion?


Broad field of research                                    Study of choice

Therapy: drug treatments, surgical procedures, other       RCT
  interventions
Diagnosis: validating a new test                           Cross-sectional survey
Screening: pick-up value of a screening test in the        Cross-sectional survey
  presymptomatic stage
Prognosis: determining the course of an illness            Longitudinal survey
Causation: determining the effect of a putative harmful    Cohort or case-control study,
  agent on the development of illness                      depending on how rare the
                                                           disease is
Attitudes/beliefs/preferences about the nature of an       Psychometric studies,
  illness or its treatment                                 qualitative studies
Ethical considerations:
At times ignored
Any research done on vulnerable and sick patients without full consideration of ethical issues is a criminal offence and possible grounds for being debarred
Doing invasive procedures without consent
Enrolling patients by providing false information

However, at times well-intentioned research may be prevented by ethical/legal issues

E.g. HeLa cells from Henrietta Lacks: at that time there were no laws regarding the ethics of using and profiting from a subject.
Assessing methodological quality:

IMRAD pattern (Introduction, Methods, Results and Discussion)
M (Methods) is the most important section
Trash it/trust it? (flawed/flawless)

Ask the following questions:

Was the study original?
Whom is the study about?
Was the design of the study sensible?
Was systematic bias avoided or minimized?
Was assessment blind?
Was the study adequately followed up?
Were preliminary statistical questions addressed?
1. Was the study relevant and original?
Is the clinical issue addressed of sufficient importance?
In theory, there is no point in testing a hypothesis that someone has already proved
In practice, this is seldom so

Has anyone done a similar study before?

Does this new research add to the literature in any way?

Bigger sample size/conducted for a longer duration/more rigorous methodology/different population
2. Whom is the study about?
Determines the applicability of the results to your patients. The study participants may have:
Been more/less sick than the patients you see
Been of a different ethnicity
Neither smoked nor taken alcohol
Had no other associated conditions
Received different care
Inclusion criteria: a study done on young patients may not be applicable to the old
Exclusion criteria: if the mild form of the disease was excluded, the results cannot be extrapolated to patients with it
Were the patients studied in real-life circumstances? (special attention/different equipment)
3. Was the design of the study sensible?
The language of research methodology can be forbidding
An accurate, to-the-point description should be provided (for clarity and replication)

What the authors said                What they should have said/done

We compared a nicotine patch with    Patch containing 15 mg nicotine applied twice daily;
placebo                              control group received identical-looking patches
4. What outcome was measured and how?
In an incurable disease, the efficacy of a drug should be measured by the number of years the patient lives, rather than by its effect on some obscure enzyme
Some outcomes are difficult to measure: symptomatic (pain), functional (mobility), psychological (anxiety), social (inconvenience)
However, outcomes should be made as objective as possible

5. Was systematic bias avoided or minimized?

Systematic bias is defined as anything which erroneously influences the conclusions about groups and distorts comparisons
It can be minimized by making the comparison groups as identical as possible, except for the condition being studied
Sources of bias to check for in an RCT
(flowchart: baseline state, then allocation into two groups)

Stage        Intervention group        Control group    Bias to check for

Allocation   Intervention group        Control group    Selection bias (incomplete randomization)
Exposure     Exposed to intervention   Not exposed      Performance bias (differences in care)
Follow-up    Follow-up                 Follow-up        Exclusion bias (differences in withdrawal)
Outcomes     Outcomes                  Outcomes         Detection bias (differences in outcome assessment)
6. Was assessment blind?

Bias creeps in if assessors know which group a particular case was assigned to
100% reported objectivity is unlikely to be true
It is rare for competing clinicians to agree completely on a given interpretation
The level of agreement beyond chance is expressed as the kappa score
E.g. 1 = perfect agreement
Mammogram reading by experts: 0.67
JVP measurement by experts: 0.42
Complex observations are unlikely to have a score of 1
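
To make the kappa score concrete, here is a minimal sketch of Cohen's kappa for two observers, assuming the standard definition (observed agreement minus chance agreement, divided by one minus chance agreement). The function name `cohen_kappa` and the 2 x 2 table of readings are hypothetical and invented for illustration; they are not taken from any study.

```python
def cohen_kappa(table):
    """Cohen's kappa for a square agreement table.

    table[i][j] = number of cases placed in category i by observer A
    and in category j by observer B.
    Assumes chance agreement is below 1 (otherwise kappa is undefined).
    """
    k = len(table)
    total = sum(sum(row) for row in table)
    # Proportion of cases on which the two observers actually agree
    observed = sum(table[i][i] for i in range(k)) / total
    row_totals = [sum(row) for row in table]
    col_totals = [sum(table[i][j] for i in range(k)) for j in range(k)]
    # Agreement expected by chance alone, from the marginal totals
    expected = sum(row_totals[i] * col_totals[i] for i in range(k)) / total ** 2
    return (observed - expected) / (1 - expected)


# Hypothetical data: two readers classify 100 films as abnormal/normal
ratings = [[40, 10],
           [5, 45]]
print(round(cohen_kappa(ratings), 2))  # 0.7
```

The readers agree on 85% of films, but half of that agreement would be expected by chance, so kappa is 0.7 rather than 0.85.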
7. Was the study adequately followed up?
Duration of follow-up
Completeness of follow-up:
Studies with a low follow-up rate are considered untrustworthy
Also, if all withdrawals are omitted from the calculation, the results will be biased in favor of the intervention.
All the data on patients originally recruited should be analyzed in the arm to which they were allocated, intervention or placebo (intention-to-treat analysis)
Exception:
Efficacy analysis: where the study aims to examine the effect of the intervention itself, and therefore of the treatment actually received.
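
The effect of omitting withdrawals can be sketched with a small calculation. Everything below is hypothetical: the counts are invented, the dictionary layout and function names are illustrative, and the outcomes of withdrawn participants are assumed to be known, which is often not the case in practice.

```python
# Hypothetical two-arm trial; all counts are invented for illustration.
trial = {
    "intervention": {"completed": 80, "events_completed": 16,
                     "withdrew": 20, "events_withdrew": 10},
    "placebo":      {"completed": 95, "events_completed": 30,
                     "withdrew": 5, "events_withdrew": 2},
}


def intention_to_treat_rate(arm):
    """Event rate among everyone randomized to the arm, withdrawals included."""
    events = arm["events_completed"] + arm["events_withdrew"]
    randomized = arm["completed"] + arm["withdrew"]
    return events / randomized


def completers_only_rate(arm):
    """Event rate among participants who completed the allocated treatment."""
    return arm["events_completed"] / arm["completed"]


for name, arm in trial.items():
    print(name,
          "intention-to-treat:", round(intention_to_treat_rate(arm), 3),
          "completers only:", round(completers_only_rate(arm), 3))
```

With these invented numbers the difference between arms is 6 percentage points on an intention-to-treat basis but almost 12 points when withdrawals are dropped, so the completers-only comparison flatters the intervention.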
8. Were preliminary statistical questions addressed?
Non-statisticians should seek answers to these two questions:
1. The size of the sample
2. Type of data and the type of tests applied

1. The size of the sample

The sample size determines the power of the study
It can be calculated using free-access websites (http://www.macorr.com/ss_calculator.htm)
or a commercial statistical software package (http://www.ncss.com/pass.html)
Power of a study: the likelihood of detecting a true difference between the groups
Usually set between 80% and 90%
Underpowered studies lead to type II error: the erroneous conclusion that an intervention has no effect
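
As a rough illustration of how sample size and power are linked, the sketch below uses one common normal-approximation formula for comparing two proportions. This is an assumption chosen for illustration, not necessarily the method used by the calculators cited above; the function name and the example event rates (30% vs 20%) are hypothetical.

```python
from math import ceil
from statistics import NormalDist


def sample_size_per_group(p_control, p_treated, alpha=0.05, power=0.80):
    """Approximate participants needed per group to detect a difference
    between two event rates with a two-sided test (normal approximation)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided significance level
    z_beta = NormalDist().inv_cdf(power)           # desired power
    variance = p_control * (1 - p_control) + p_treated * (1 - p_treated)
    effect = p_control - p_treated
    return ceil((z_alpha + z_beta) ** 2 * variance / effect ** 2)


# Hypothetical scenario: event rate expected to fall from 30% to 20%
n_80 = sample_size_per_group(0.30, 0.20, power=0.80)
n_90 = sample_size_per_group(0.30, 0.20, power=0.90)
print(n_80, n_90)  # 291 389
```

Raising the power from 80% to 90% raises the required sample size, and a smaller expected difference between groups raises it further.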
2. Type of data and the type of tests applied

Evaluate: have the authors applied any statistical tests at all?
If there are only numbers and no statistical analysis, be suspicious
Be aware of tricks of the trade

Parametric tests: for data from a particular form of distribution (e.g. a normal distribution)
Non-parametric tests: for data with no particular type of distribution.
Parametric test               Example of equivalent            Purpose of test
                              nonparametric (rank order) test

Two-sample (unpaired) t-test  Mann-Whitney U test              Compares two independent samples
                                                               drawn from the same population

One-sample (paired) t-test    Wilcoxon matched-pairs test      Compares two sets of observations on
                                                               a single sample (tests the hypothesis
                                                               that the mean difference between two
                                                               measurements is zero)

One-way analysis of variance  Analysis of variance by ranks    Effectively, a generalization of the
using total sum of squares    (e.g. Kruskal-Wallis test)       paired t or Wilcoxon matched-pairs
(e.g. F-test)                                                  test where three or more sets of
                                                               observations are made on a single
                                                               sample

Two-way analysis of variance  Two-way analysis of variance     As above, but tests the influence
                              by ranks                         (and interaction) of two different
                                                               covariates

Product-moment correlation    Spearman's rank correlation      Assesses the strength of the
coefficient (Pearson's r)     coefficient                      straight-line association between
                                                               two continuous variables

No direct equivalent          Chi-squared (χ²) test            Tests the null hypothesis that the
                                                               proportions of variables estimated
                                                               from two (or more) independent
                                                               samples are the same

No direct equivalent          McNemar's test                   Tests the null hypothesis that the
                                                               proportions estimated from a paired
                                                               sample are the same
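
As one worked instance of a test from the table, here is a minimal sketch of the chi-squared test on a 2 x 2 table, using Pearson's statistic without continuity correction. The function name and the counts are hypothetical. The p value uses the fact that, with one degree of freedom, the chi-squared statistic is the square of a standard normal variable.

```python
from math import erfc, sqrt


def chi_squared_2x2(a, b, c, d):
    """Pearson chi-squared test (no continuity correction) for the table
        [[a, b],
         [c, d]]
    where rows are groups and columns are outcome present/absent.
    Returns (statistic, two-sided p value) for 1 degree of freedom."""
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # For 1 degree of freedom, P(chi-squared > x) = erfc(sqrt(x / 2))
    p_value = erfc(sqrt(chi2 / 2))
    return chi2, p_value


# Hypothetical data: 15/100 events on treatment vs 30/100 on control
chi2, p = chi_squared_2x2(15, 85, 30, 70)
print(round(chi2, 2), round(p, 3))
```

For small expected counts this approximation is unreliable, and an exact test would normally be preferred.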
Probability and confidence interval
P value: the probability that a result at least as extreme as the one observed would have arisen by chance if there were no true effect

p < 0.05: statistically significant
p < 0.01: statistically highly significant
These cut-offs are arbitrary conventions

Confidence interval: a range of values so defined that there is a specified probability that the value of a parameter lies within it.
The specified probability (confidence level) is expressed as a percentage, e.g. 95%
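
A confidence interval can be illustrated with a simple calculation for a single proportion. The sketch below assumes the normal-approximation (Wald) interval, which is only one of several methods and behaves poorly for small samples or extreme proportions; the function name and the counts are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def proportion_confidence_interval(events, n, level=0.95):
    """Normal-approximation (Wald) confidence interval for a proportion."""
    p = events / n
    z = NormalDist().inv_cdf(0.5 + level / 2)  # e.g. about 1.96 for 95%
    margin = z * sqrt(p * (1 - p) / n)
    return p - margin, p + margin


# Hypothetical data: 30 of 100 patients responded to treatment
low, high = proportion_confidence_interval(30, 100)
print(round(low, 3), round(high, 3))  # 0.21 0.39
```

A higher confidence level, such as 99%, gives a wider interval for the same data.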

Were paired tests performed on paired data?
Was a two-tailed test performed whenever applicable?
Were outliers analyzed with both common sense and appropriate statistical adjustments? (Are they real or spurious values?)
Is a statistically significant result also a clinically significant one?

The parameters that help decide this are:

ARR (absolute risk reduction): the difference in event rates between the control group and the treated group (control event rate minus treated event rate)

RRR (relative risk reduction): the ARR expressed as a proportion of the control event rate (ARR divided by control event rate)

NNT (number needed to treat): how many patients need to receive the intervention to prevent one outcome that would have occurred without it (1 divided by ARR)
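
A short worked calculation may help here. The sketch assumes the usual definitions: ARR is the control event rate minus the treated event rate, RRR is ARR divided by the control event rate, and NNT is 1 divided by ARR, rounded up. The function name and trial counts are hypothetical; exact fractions are used so that rounding the NNT is not affected by floating-point error.

```python
from fractions import Fraction
from math import ceil


def risk_measures(control_events, control_total, treated_events, treated_total):
    """Return (ARR, RRR, NNT) from the event counts in each arm.

    Assumes at least one event in the control arm. NNT is None when the
    treatment does not reduce risk (ARR <= 0).
    """
    control_rate = Fraction(control_events, control_total)
    treated_rate = Fraction(treated_events, treated_total)
    arr = control_rate - treated_rate          # absolute risk reduction
    rrr = arr / control_rate                   # relative risk reduction
    nnt = ceil(1 / arr) if arr > 0 else None   # number needed to treat
    return arr, rrr, nnt


# Hypothetical trial: 20/100 events on control, 15/100 on treatment
arr, rrr, nnt = risk_measures(20, 100, 15, 100)
print(float(arr), float(rrr), nnt)  # 0.05 0.25 20
```

The same 25% relative reduction applied to a much rarer outcome would give a far smaller ARR and a far larger NNT, which is why RRR alone can overstate clinical importance.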
Summing up
Research question
Sample size, type
Type of study
Detailed methodology
Duration of follow up if any
Type of statistical tests performed
Discussion
We all read research papers for a variety of reasons
Medical research has direct implications for patients
It is pertinent to differentiate quality papers from trash
A basic knowledge of research methodology and statistics can help decide
This is a vast subject; some areas have been deliberately left out
It's easier to find faults with others' work than to produce a flawless piece of research
Review your own research just as strictly to increase the chances of acceptance

References
1. Greenhalgh T. How to read a paper: the basics of evidence-based medicine. 4th ed. London: BMJ; 1997.
Thank you!
