
How to Appraise a Diagnostic Test?

Dr. Cita Rosita Sigit Prakoeswa, dr, SpKK(K)


Department of Dermato Venereology Dr Soetomo Hospital
Faculty of Medicine, Airlangga University, Surabaya
Tropical Disease Center, Airlangga University, Surabaya

What is diagnosis?
Increase certainty about presence/absence of disease
Disease severity
Monitor clinical course
Assess prognosis: risk/stage within diagnosis
Plan treatment
Screening
Epidemiology




Knottnerus, BMJ 2002
By the end of this session, you should be able to:
describe and illustrate key measures of diagnostic test performance
represent diagnostic test performance in 2 different ways
EBM Process
Patient
Encounter
Formulating the
Clinical Question
Searching the
Evidence
Appraising the
Evidence
Diagnosis
Therapy
Prognosis
Etiology
Patient
Intervention
Comparison
Outcome
Hierarchy of evidence
Pre appraised resources
Drawing conclusions
that impact on practice
DOES
POEM
(Lang, 2000)
What should I do
about this condition
or problem?
What causes
the problem?
Does this person
have the condition
or problem?
Who will get
the condition
or problem?
How common
is the problem?
What are the
types of problem?
INTERVENTION
ETIOLOGY/RISK FACTORS
DIAGNOSIS
PROGNOSIS FACTORS
FREQUENCY & RATE
PHENOMENA / THOUGHTS
CLINICAL
QUESTION
ACQ Diagnosis (PICO)
Patient / Problem / Population: In an otherwise healthy 7-year-old boy with sore throat,
Intervention (Index): how does the clinical exam
Comparison: compare to throat culture
Outcome: in diagnosing GAS infection?

Researcher
Involvement
Longitudinal
Cross-sectional
Research
Goal
Research
Approach
Controlled?
Randomized?
Research
Focus
Clinical Manifestation / Diagnosis / Prognosis / Therapy / Review
Hierarchy of study designs
Basic Principles (1)
Ideal diagnostic tests give the right answers:
(+) results in everyone with the disease and (-) results in everyone else
Usual clinical practice:
The test should be studied in the same way it would be used in the clinical setting
Observational study, consisting of:
Predictor variable (test result)
Outcome variable (presence / absence of the disease)
Basic Principles (2)
Sensitivity, specificity
Prevalence, prior probability, predictive values
Likelihood ratios
Dichotomous scale, cutoff points (continuous
scale)
Positive (true and false), negative (true & false)
ROC (receiver operating characteristic) curve

What is the reason that there are
many parameters in diagnostic test?
Prevalence
Sensitivity (%)
Specificity (%)
LR+
LR-
PPV (%)
NPV (%)
Pre-test Odds
Post-test Odds
Pre-test Probability (%)
Post-test Probability (%)
            Disease (+)      Disease (-)      Total
Test (+)    True pos (a)     False pos (b)    a+b
Test (-)    False neg (c)    True neg (d)     c+d
Total       a+c              b+d              a+b+c+d
METHOD 1:
NATURAL FREQUENCIES TREE
IN EVERY 1,000 PEOPLE, 200 WILL HAVE THE DISEASE
Population 1,000: Disease + 200; Disease - 800
If these 1,000 people are representative of the population at risk, the assessed rate of those with the disease (20%) represents the PREVALENCE of the disease. It can also be considered the PRE-TEST PROBABILITY of having the disease.

Sensitivity

The proportion of people who truly
have a designated disorder who are
so identified by the test.
Sensitive tests have few false
negatives.
When a test with a high Sensitivity is
Negative, it effectively rules out the
diagnosis of disease. SnNout
Population 1,000: Disease + 200 (Test + 190, Test - 10); Disease - 800
In other words, the sensitivity is 190/200 = 95%
Allergy testing with a skin test
Sensitivity of 95% means:
SnNout: with 95% sensitivity, a negative (-) skin test largely rules out allergy (only 5% of allergic patients test negative)

Specificity

The proportion of people who are
truly free of a designated disorder
who are so identified by the test.
Specific tests have few false positives
When a test is highly specific, a
positive result can rule in the
diagnosis. SpPin

Specificity

Population 1,000: Disease + 200 (Test + 190, Test - 10); Disease - 800 (Test + 32, Test - 768)
In other words, the specificity is 768/800 = 96%
Allergy testing with a skin test
Specificity of 96% means:
SpPin: with 96% specificity, a positive (+) skin test largely rules in allergy (only 4% of non-allergic people test positive)
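The two tree calculations above can be reproduced in a few lines. This is a minimal sketch; the function names `sensitivity` and `specificity` are illustrative assumptions, and the counts are the ones shown in the natural frequencies tree.

```python
def sensitivity(true_pos: int, false_neg: int) -> float:
    """Proportion of people with the disease who test positive."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg: int, false_pos: int) -> float:
    """Proportion of people without the disease who test negative."""
    return true_neg / (true_neg + false_pos)


# Counts from the tree: 200 diseased (190 test +, 10 test -),
# 800 disease-free (32 test +, 768 test -).
print(f"Sensitivity: {sensitivity(190, 10):.0%}")
print(f"Specificity: {specificity(768, 32):.0%}")
```

Both functions use the disease status as the denominator, which is why they do not depend on how common the disease is in the sample.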

Specificity

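Sensitivity & Specificity
[Figure: distributions of test results for the DISEASED (cases) and NON-DISEASED (non-cases) groups. x-axis: degree of positivity on test (negative to positive); y-axis: % of group. The test cut-off leaves FALSE NEGATIVES on one side and FALSE POSITIVES on the other.]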
Numeric? (complex)
Sensitivity & Specificity
Sensitivity and Specificity are usually
considered properties of the test rather
than the setting, and are therefore
usually considered to remain constant.

However, sensitivity and specificity are
likely to be influenced by complexity of
differential diagnoses and a multitude of
other factors (cf spectrum bias).

Sensitivity & Specificity
Positive & Negative Predictive Value
For sensitivity and specificity, the
reference variable (denominator) is the
DISEASE
For predictive value, the reference
variable (denominator) is the TEST
Pre Test & Post Test Probability
Pre-test Probability
The probability of the target condition
being present before the results of a
diagnostic test are available. (prevalence)
Post-test Probability
The probability of the target condition
being present after the results of a
diagnostic test are available.
(Positive Predictive Value)
Positive Predictive Value
POSITIVE PREDICTIVE VALUE = 190/222 = 86%
Population 1,000: Disease + 200 (Test + 190, Test - 10); Disease - 800 (Test + 32, Test - 768)
This is also the POST-TEST PROBABILITY of having the disease
Allergy testing with a skin test
PPV of 86% means: if the skin test result is (+), the probability that the patient has an allergy is 86%
Negative Predictive Value
NEGATIVE PREDICTIVE VALUE = 768/778 = 99%
Population 1,000: Disease + 200 (Test + 190, Test - 10); Disease - 800 (Test + 32, Test - 768)
Allergy testing with a skin test
NPV of 99% means: if the skin test result is (-), the probability that the patient does not have an allergy is 99%
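The predictive values quoted above (190/222 and 768/778) use the test result as the denominator. A minimal sketch of that arithmetic follows; the function name `predictive_values` and its argument names are assumptions made for illustration.

```python
def predictive_values(tp: int, fp: int, fn: int, tn: int) -> tuple[float, float]:
    """Return (PPV, NPV) from the four cells of the natural frequencies tree."""
    ppv = tp / (tp + fp)  # of all positive tests, how many are truly diseased
    npv = tn / (tn + fn)  # of all negative tests, how many are truly disease-free
    return ppv, npv


# Counts from the tree with a population of 1,000 and 20% prevalence.
ppv, npv = predictive_values(tp=190, fp=32, fn=10, tn=768)
print(f"PPV: {ppv:.0%}")
print(f"NPV: {npv:.0%}")
```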
Positive & Negative
Predictive Value
The Positive Predictive Value of a test
will vary (according to the prevalence
of the condition in the chosen setting)
Predictive value & changing prevalence
Prevalence reduced by an order of magnitude from 20% to 2%
Population 10,000: Disease + 200 (Test + 190, Test - 10); Disease - 9,800 (Test + 392, Test - 9,408)
Sensitivity and Specificity unchanged
Positive predictive value at low prevalence
POSITIVE PREDICTIVE VALUE = 190/582 = 33%
Previously, PPV was 86%
Negative predictive value at low prevalence
NEGATIVE PREDICTIVE VALUE > 99%
Previously, NPV was 99%
Prediction of low prevalence events
Even highly specific tests, when applied
to low prevalence events, yield a high
number of false positive results
Because of this, under such
circumstances, the Positive Predictive
Value of a test is low
However, this has much less influence
on the Negative Predictive Value
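The effect of prevalence on the predictive values can be checked numerically. The sketch below assumes sensitivity and specificity stay fixed at the 95% and 96% used in the slides; the function name `predictive_values_at` is hypothetical.

```python
def predictive_values_at(prevalence: float, sensitivity: float,
                         specificity: float) -> tuple[float, float]:
    """Return (PPV, NPV) for a given prevalence, as proportions."""
    tp = sensitivity * prevalence              # true positives
    fn = (1 - sensitivity) * prevalence        # false negatives
    fp = (1 - specificity) * (1 - prevalence)  # false positives
    tn = specificity * (1 - prevalence)        # true negatives
    return tp / (tp + fp), tn / (tn + fn)


# Same test applied at 20% and at 2% prevalence.
for prevalence in (0.20, 0.02):
    ppv, npv = predictive_values_at(prevalence, 0.95, 0.96)
    print(f"prevalence {prevalence:.0%}: PPV {ppv:.1%}, NPV {npv:.1%}")
```

The PPV falls sharply as prevalence drops, because false positives from the large disease-free group outnumber the true positives, while the NPV barely changes.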

Likelihood Ratio

Relative likelihood that a given test result would be expected in a patient with (as opposed to one without) a disorder of interest.
LR = (probability of a test result in patients with disease) / (probability of the same test result in patients without disease)
Likelihood
The likelihood that someone with the disease will have a positive test is 190/200 or 95%. This is the same as the sensitivity.
Disease + 200: Test + 190, Test - 10
The likelihood that someone without the disease will have a positive test is 32/800 or 4%. This is the same as (1 - specificity).
Disease - 800: Test + 32, Test - 768
Likelihood
LIKELIHOOD RATIO (LR) = (LIKELIHOOD OF POSITIVE TEST GIVEN THE DISEASE) / (LIKELIHOOD OF POSITIVE TEST IN THE ABSENCE OF THE DISEASE)
= SENSITIVITY / (1 - SPECIFICITY) = 0.95 / 0.04 = 23.8
A Likelihood Ratio (LR) of 1.0 indicates an uninformative test (occurs, for example, when sensitivity and specificity are both 50%)
The higher the Likelihood Ratio, the better the test (other factors being equal)
Allergy testing with a skin test
LR+ = 23.8 means: a positive (+) skin test result is 23.8 times more likely to occur in a patient with an allergy than in someone without an allergy
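A short sketch of the likelihood ratio arithmetic, using the sensitivity (0.95) and specificity (0.96) from the slides. The function name `likelihood_ratios` is an assumption; the LR- formula, (1 - sensitivity) / specificity, is the standard counterpart of LR+ for a negative result.

```python
def likelihood_ratios(sensitivity: float, specificity: float) -> tuple[float, float]:
    """Return (LR+, LR-) for a dichotomous test."""
    lr_pos = sensitivity / (1 - specificity)  # positive result: diseased vs not
    lr_neg = (1 - sensitivity) / specificity  # negative result: diseased vs not
    return lr_pos, lr_neg


lr_pos, lr_neg = likelihood_ratios(0.95, 0.96)
print(f"LR+ = {lr_pos:.2f}")
print(f"LR- = {lr_neg:.3f}")
```

The LR+ comes out at about 23.75, which the slide rounds to 23.8.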

Likelihood Ratio

[Figure: post-test probability (y-axis, 0 to 1) plotted against pre-test probability (x-axis, 0 to 1) for three tests]
A (90%): Sensitivity = Specificity = 0.9, LR+ = 9.0
B (70%): Sensitivity = Specificity = 0.7, LR+ = 2.3
C (50%): Sensitivity = Specificity = 0.5, LR+ = 1.0

Sensitivity & Specificity; Positive
Predictive Value; Prevalence & LR
METHOD 2:
TRADITIONAL 2x2 TABLES
                 DISEASE
              Yes        No         Total
TEST  Yes     a = 3      b = 7      a+b = 10
      No      c = 1      d = 89     c+d = 90
      Total   a+c = 4    b+d = 96   a+b+c+d = 100

SENSITIVITY
The proportion of people with the diagnosis (N=4) who are correctly identified (N=3)
Sensitivity = a/(a+c) = 3/4 = 75%
FALSE NEGATIVES: c = 1

SPECIFICITY
The proportion of people without the diagnosis (N=96) who are correctly identified (N=89)
Specificity = d/(b+d) = 89/96 = 93%
FALSE POSITIVES: b = 7

PRE-TEST ODDS
In the sample as a whole, the odds of having the disease are 4 to 96, or about 0.04 (the PRE-TEST ODDS)
POST-TEST ODDS
In those who score positive on the test, the odds of having the disease are 3 to 7, or about 0.43 (the POST-TEST ODDS)
In the sample as a whole, the odds of having the disease are 4 to 96, or about 0.04 (the PRE-TEST ODDS)
In those who score negative on the test, the odds of having the disease are 1 to 89, or approximately 0.01
BAYES THEOREM
POST-TEST ODDS =
LIKELIHOOD RATIO x PRE-TEST ODDS
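The relationship can be checked against the 2x2 table used in the preceding slides (a = 3, b = 7, c = 1, d = 89). This is a minimal sketch with illustrative variable names; it compares the post-test odds obtained through the likelihood ratio with the odds read directly from the positive-test row.

```python
# Cells of the 2x2 table: true pos, false pos, false neg, true neg.
a, b, c, d = 3, 7, 1, 89

sensitivity = a / (a + c)                 # 3/4
specificity = d / (b + d)                 # 89/96
lr_pos = sensitivity / (1 - specificity)  # likelihood ratio of a positive test

pre_test_odds = (a + c) / (b + d)         # 4 to 96
post_test_odds = lr_pos * pre_test_odds   # odds form of Bayes' theorem

print(f"LR+            = {lr_pos:.2f}")
print(f"pre-test odds  = {pre_test_odds:.3f}")
print(f"post-test odds = {post_test_odds:.3f}")
print(f"direct a/b     = {a / b:.3f}")
```

The last two lines agree: multiplying the pre-test odds by LR+ gives the same 3 to 7 odds seen among those who test positive.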
LIKELIHOOD RATIO AND PRE-
AND POST-TEST PROBABILITIES
For a given test with a given
likelihood ratio, the post-
test probability will depend
on the pre-test probability
(that is, the prevalence of
the condition in the sample
being assessed)
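To see this dependence in numbers, a probability can be converted to odds, multiplied by the likelihood ratio, and converted back. The sketch below assumes the LR+ of 23.75 from the skin test example; the function name `post_test_probability` is hypothetical.

```python
def post_test_probability(pre_test_probability: float,
                          likelihood_ratio: float) -> float:
    """Apply a likelihood ratio to a pre-test probability (both as proportions)."""
    pre_odds = pre_test_probability / (1 - pre_test_probability)
    post_odds = likelihood_ratio * pre_odds
    return post_odds / (1 + post_odds)


# Same test (LR+ = 23.75), three different pre-test probabilities.
for pre in (0.02, 0.20, 0.50):
    post = post_test_probability(pre, 23.75)
    print(f"pre-test {pre:.0%} -> post-test {post:.1%}")
```

At pre-test probabilities of 20% and 2% this gives the same values as the positive predictive values computed earlier from the natural frequencies trees.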
SENSITIVITY ANALYSIS OF A DIAGNOSTIC TEST
                        Value    95% CI
Pre-test probability    35%      26% to 44%
Likelihood ratio        5.0      3.0 to 8.5
Applying the 95% confidence intervals above to the nomogram, the post-test probability is likely to lie in the range 55-85%
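The nomogram reading can be approximated by direct calculation. The sketch below combines the lower bounds and the upper bounds of the two confidence intervals, which is a simplifying assumption rather than a formal interval; the function name is hypothetical.

```python
def post_test_probability(pre_test_probability: float,
                          likelihood_ratio: float) -> float:
    """Convert to odds, apply the likelihood ratio, convert back."""
    pre_odds = pre_test_probability / (1 - pre_test_probability)
    post_odds = likelihood_ratio * pre_odds
    return post_odds / (1 + post_odds)


# Point estimates and 95% CI bounds taken from the slide.
low = post_test_probability(0.26, 3.0)    # both lower bounds
mid = post_test_probability(0.35, 5.0)    # point estimates
high = post_test_probability(0.44, 8.5)   # both upper bounds

print(f"post-test probability: {mid:.0%} (range {low:.0%} to {high:.0%})")
```

The calculated range is a little wider than the 55-85% quoted from the nomogram, which is read off graphically and is therefore approximate.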
RECEIVER OPERATING CHARACTERISTIC CURVE
Overall shape is
predicted by the
reciprocal relationship
between sensitivity and
specificity
The closer the curve gets
to Sensitivity=1 and
Specificity=1, the better
the overall performance
of the test
The diagonal line (where Sensitivity = 1-Specificity, for example Sensitivity=0.5 and Specificity=0.5) represents performance no better than chance
Hence the area under the curve gives a measure of the test's performance
[Figure: ROC curve. x-axis: FALSE POSITIVE RATE (1-Specificity), 0 to 100; y-axis: Sensitivity]
AREA UNDER ROC CURVES
[Figure: ROC plots for a perfect test and a useless test. x-axis: 1-Specificity, 0 to 100; y-axis: Sensitivity]
Sensitivity and specificity both
100% - TEST PERFECT
Sensitivity and specificity both
50% - TEST USELESS
AREA=1.0
AREA=0.5
The area under a ROC
curve will be between
0.5 and 1.0
[Figure: ROC curve with intermediate area. x-axis: 1-Specificity, 0 to 100; y-axis: Sensitivity]
Area = 0.7 (between
0.5 and 1.0)
Consider (hypothetically) two patients drawn
randomly from the DISEASE+ and DISEASE- groups
respectively
If the test is used to guess which patient is from the
DISEASE+ group, it will be right 70% of the time
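This pairwise interpretation can be turned directly into a calculation: compare every diseased patient with every disease-free patient and count how often the diseased one has the higher test score. The sketch below uses invented scores purely for illustration; the function name `auc` and the convention of counting ties as half are assumptions.

```python
from itertools import product


def auc(diseased_scores: list[float], healthy_scores: list[float]) -> float:
    """Fraction of (diseased, healthy) pairs ranked correctly; ties count half."""
    pairs = list(product(diseased_scores, healthy_scores))
    wins = sum(1.0 if d > h else 0.5 if d == h else 0.0 for d, h in pairs)
    return wins / len(pairs)


# Hypothetical test scores, invented for illustration only.
diseased = [7, 5, 8, 4, 6]
healthy = [3, 5, 2, 6, 1]
print(f"AUC = {auc(diseased, healthy):.2f}")
```

A test whose scores separate the two groups completely gives an area of 1.0, and one whose scores carry no information gives about 0.5.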
APPLYING A DIAGNOSTIC TEST
IN DIFFERENT SETTINGS
The Positive Predictive Value of a test will vary
(according to the prevalence of the condition in
the chosen setting)
Sensitivity and Specificity are usually considered
properties of the test rather than the setting, and
are therefore usually considered to remain
constant
However, sensitivity and specificity are likely to
be influenced by complexity of differential
diagnoses and a multitude of other factors (cf
spectrum bias)
RECEIVER OPERATING
CHARACTERISTIC (ROC) CURVE
[Figure: ROC curves for the ACAT and MC samples. x-axis: 1-Specificity (%); y-axis: Sensitivity (%)]
This study compared
the performance of a
dementia screening test
in a community sample
(ACAT) and a memory
clinic sample (MC)
Flicker L, Loguidice D, Carlin JB, Ames D. The predictive value of dementia screening instruments in clinical populations. International Journal of Geriatric Psychiatry 1997; 12: 203-209
Diagnosis test & clinical setting
Interpreting Diagnostic Studies
VIA - RaMMbo
Validity


Participants
Index group (IG) &
Gold standard
Comparison Group (CG)
Outcome


[Diagram: Index group (IG) and Comparison Group (CG) cross-classified by outcome (+ / -) into cells A, B, C and D]
Representative?
Selection?
VALIDITY
Reproducible
Maintain?




Measurements
blind subjective? OR
objective?

QUESTION:


Diagnostic Accuracy Study:
Basic Design
Series of patients
Index test
Reference standard
Blinded cross-classification
Recruitment:
Was the diagnostic test evaluated in a representative spectrum of patients?
Maintenance:
Was the endpoint of the reference standard obtained for all subjects?
Measurement:
Were the assessors kept blind to the results of each test, and/or was the reference standard endpoint objective?
Spectrum Bias: Selected patients -> Index test -> Reference standard -> Blinded cross-classification
Verification Bias: Series of patients -> Index test -> Reference standard -> Blinded cross-classification
Differential Reference Bias: Series of patients -> Index test -> Ref. Std. A / Ref. Std. B -> Blinded cross-classification
Observer Bias: Series of patients -> Index test -> Reference standard -> Unblinded cross-classification
Importance
INTERVENTION
ETIOLOGY/RISK FACTORS
DIAGNOSIS
PROGNOSIS & PREDICTION
FREQUENCY & RATE
PHENOMENA / THOUGHTS
What should I do
about this condition
or problem?
What causes
the problem?
Does this person
have the condition
or problem?
Who will get
the condition
or problem?
How common
is the problem?
What are the
types of problem?
CLINICAL TRIAL: RRR, ARR, NNT; p & CI
PROGNOSIS: Survival curve, RR / OR; p & CI
DIAGNOSTIC: Sn, Sp, LR, PPV, NPV; p & CI
Applicability
PICO & Applicability
Your question
(PICO)
Study
What do the
results mean?
How well was
the study done?
Validity
Importance
Applicability
Diagnostic tests
Are not about finding absolute truth, but about limiting uncertainty
This establishes both the necessity and the logical base for introducing probabilities and pragmatic test-treatment thresholds
Start thinking about
what you're going to do with the results of the diagnostic test, and
whether doing the test will help your patients
CRITICAL
APPRAISAL
DIAGNOSTIC
TEST
Critical appraisal diagnostic test
Use worksheet (VIA; RAMMbo)
STARD
Use supporting software
CAT Maker

Validity (1)
Was the diagnostic test study conducted in a blinded manner with an appropriate gold standard?
Validity (2)
Was the diagnostic test performed on patients with an adequate spectrum of disease or disorder, so that it can be applied in everyday practice?
Validity (3)
Was the gold standard examination performed regardless of the result of the diagnostic test?
Important
What are the Sn, Sp, LR+, LR-, PPV, NPV, pre-test probability, post-test probability, pre-test odds and post-test odds?
Applicable (1)
Is the diagnostic test available, affordable and accurate?
Applicable (2)
Can we estimate the pre-test probability (prevalence) of the disease in our patients?
Applicable (3)
Will the calculated post-test probability change the management of our patient?
Applicable (4)
Overall, is the diagnostic test beneficial for the patient?
Section and topic
Title, abstract, and
keywords
Introduction
Methods
Participants
Test methods
Statistical methods
Results
Participants
Test results
Estimates
Discussions

STARD initiative (25 items)
Standards for Reporting of Diagnostic Accuracy
Bossuyt PM, Reitsma JB, Bruns DE, Gatsonis CA, Glasziou PP, et al. BMJ 2003;326:41-6
1st component of STARD
2nd component of STARD
Guides for deciding whether a screening or early diagnostic maneuver does more good than harm:
Does early diagnosis really lead to improved survival, or quality of life, or both?
Are the early diagnosed patients willing partners in the treatment strategy?
Is the time and energy it will take us to confirm the diagnosis and provide (lifelong) care well spent?
Do the frequency and severity of the target disorder warrant this degree of effort and expenditure?
