Tom Sensky
HOW TO USE THIS POWERPOINT
PRESENTATION
VARIABLES
QUANTITATIVE QUALITATIVE
[Figure: box plots for Female (N=74) and Male (N=27), marking the 2.5th centile, 25th centile, MEDIAN (50th centile), 75th centile, 97.5th centile and the inter-quartile range; vertical axis from -2 to 12]
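The centiles named on this slide can be computed directly. The sketch below is illustrative only: the `values` list is made up, and `statistics.quantiles` applies its default interpolation rule, which is just one of several conventions for estimating centiles.

```python
import statistics

# Hypothetical measurements (not taken from the slide).
values = [1, 2, 3, 4, 5, 6, 7, 8, 9]

# n=4 splits the data into quarters and returns the three cut points:
# the 25th centile, the 50th centile (median) and the 75th centile.
q1, q2, q3 = statistics.quantiles(values, n=4)

# The inter-quartile range is the distance between the 25th and 75th centiles.
iqr = q3 - q1

print("25th centile:", q1)
print("median (50th centile):", q2)
print("75th centile:", q3)
print("inter-quartile range:", iqr)
```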
STANDARD DEVIATION: MEASURE
OF THE SPREAD OF VALUES OF A
SAMPLE AROUND THE MEAN
$$SD = \sqrt{\frac{\sum (\text{Value} - \text{Mean})^2}{\text{Number of values}}}$$
THE SQUARE OF THE SD IS KNOWN AS THE VARIANCE
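As a worked illustration of the definition on this slide, the sketch below divides the summed squared deviations by the number of values, as the slide does. The data are hypothetical. Many statistics packages divide by the number of values minus one when estimating the SD from a sample, so their result can differ slightly.

```python
import math

def standard_deviation(values):
    """SD as defined on the slide: sqrt(sum((value - mean)^2) / number of values)."""
    mean = sum(values) / len(values)
    squared_deviations = [(v - mean) ** 2 for v in values]
    variance = sum(squared_deviations) / len(values)
    return math.sqrt(variance)

# Hypothetical values chosen so the arithmetic is easy to follow:
# mean = 5, summed squared deviations = 32, variance = 32 / 8 = 4.
values = [2, 4, 4, 4, 5, 5, 7, 9]

sd = standard_deviation(values)
variance = sd ** 2  # the square of the SD is the variance

print("SD:", sd)
print("variance:", variance)
```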
SD decreases as a function of:
smaller spread of values
about the mean
larger number of values
IN A NORMAL
DISTRIBUTION, 95%
OF THE VALUES WILL
LIE WITHIN 2 SDs OF
THE MEAN
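The "2 SDs" figure is a rounded rule of thumb. A minimal sketch, using only the standard normal distribution, shows the fraction of values lying within a given number of SDs of the mean; the function name is illustrative.

```python
import math

def fraction_within(k):
    """Fraction of a normal distribution lying within k SDs of the mean."""
    return math.erf(k / math.sqrt(2))

within_2 = fraction_within(2)       # slightly more than 95%
within_196 = fraction_within(1.96)  # very close to 95%

print(f"within 2 SDs:    {within_2:.4f}")
print(f"within 1.96 SDs: {within_196:.4f}")
```

This is why 1.96 rather than 2 appears as the multiplier in many 95% confidence interval formulae.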
STANDARD DEVIATION AND
SAMPLE SIZE
As sample size increases, so SD decreases
[Figure: distributions shown for n=10, n=50 and n=150]
SKEWED DISTRIBUTION
[Figure: skewed distribution with the MEAN and MEDIAN marked]
50% OF VALUES WILL LIE ON EITHER SIDE OF THE MEDIAN
DOES A VARIABLE FOLLOW A
NORMAL DISTRIBUTION?
[Figure: two panels, NORMAL DISTRIBUTION and SKEWED DISTRIBUTION; a further label reads SAMPLE B]
COMPARING TWO SAMPLES
$$SE = \frac{SD}{\sqrt{\text{Sample Size}}}$$
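The relationship between SD, SE and sample size on this slide is the usual one: the standard error of the mean is the SD divided by the square root of the sample size. The sketch below uses hypothetical numbers to show how the SE shrinks as the sample grows.

```python
import math

def standard_error(sd, sample_size):
    """Standard error of the mean: SD divided by sqrt(sample size)."""
    return sd / math.sqrt(sample_size)

# Hypothetical SD of 15 with two different sample sizes.
se_small_sample = standard_error(15, 25)
se_large_sample = standard_error(15, 100)

print("SE with n=25: ", se_small_sample)
print("SE with n=100:", se_large_sample)
```

Quadrupling the sample size halves the SE, which is why larger samples give more precise estimates of the mean.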
COMPARING TWO SAMPLES
We start by assuming that our sample came from the
original population
Our null hypothesis (to be tested) is that IQ=107.5 is
not significantly different from IQ=100
The larger α, the greater the chance that the sample comes from the Red population
[Figure: IQ axis marked at 100 and 110]
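A rough sketch of how the null hypothesis on this slide might be tested. The means (107.5 and 100) come from the slide; the population SD of 15 and the sample size of 16 are assumptions made purely for illustration, and a simple z-test is used because it needs only the standard normal distribution.

```python
import math

population_mean = 100.0  # from the slide
sample_mean = 107.5      # from the slide
population_sd = 15.0     # ASSUMPTION: not given on the slide
sample_size = 16         # ASSUMPTION: not given on the slide

# Standard error of the sample mean.
se = population_sd / math.sqrt(sample_size)

# How many standard errors the sample mean lies from the population mean.
z = (sample_mean - population_mean) / se

# Two-sided p-value from the standard normal distribution.
p_value = math.erfc(abs(z) / math.sqrt(2))

alpha = 0.05
reject_null = p_value < alpha

print(f"SE = {se}, z = {z}, p = {p_value:.4f}, reject null: {reject_null}")
```

With a different assumed SD or sample size the conclusion could change, since both enter the calculation through the SE.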
COMPARING TWO SAMPLES
The α level represents the probability of finding a significant
difference between the two means when none exists
This is known as a
Type I error
This is known as
the β level and is
normally set at
0.20
Type I (α): False positive
Find a significant difference even though one does not exist
Usually set at 0.05 (5%) or 0.01 (1%)
Type II (β): False negative
Fail to find a significant difference even though one exists
Usually set at 0.20 (20%)
Power = 1 − β (ie usually 80%)
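To make the link between alpha, beta and power concrete, here is a minimal sketch for a two-sided z-test. The true difference (7.5) and standard error (3.75) are hypothetical values, and the function name is illustrative rather than taken from any particular package.

```python
from statistics import NormalDist

def power_two_sided_z(effect, se, alpha=0.05):
    """Power of a two-sided z-test to detect a true difference `effect`."""
    normal = NormalDist()
    z_crit = normal.inv_cdf(1 - alpha / 2)  # critical value, about 1.96 for alpha=0.05
    shift = effect / se                     # true difference in units of SE
    # Probability that the test statistic lands in either rejection region.
    return normal.cdf(shift - z_crit) + normal.cdf(-shift - z_crit)

# Hypothetical scenario: true difference 7.5, standard error 3.75.
power = power_two_sided_z(effect=7.5, se=3.75)
beta = 1 - power  # probability of a Type II error

# With no true difference, the rejection rate equals alpha (the Type I error rate).
false_positive_rate = power_two_sided_z(effect=0.0, se=1.0)

print(f"power = {power:.3f}, beta = {beta:.3f}")
print(f"rejection rate under the null = {false_positive_rate:.3f}")
```

In this hypothetical scenario the power is well short of the usual 80% target, so a larger sample (and hence a smaller SE) would be needed.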
ABNORMAL POPULATION
[Figure labels: DISTRIBUTION OF DYSFUNCTIONAL SAMPLE; DISTRIBUTION OF FUNCTIONAL (NORMAL) SAMPLE; cut-off point a]
FIRST POSSIBLE CUT-OFF: OUTSIDE THE RANGE OF THE DYSFUNCTIONAL POPULATION
UNPAIRED OR INDEPENDENT-
SAMPLE t-TEST: PRINCIPLE
The two distributions
are widely separated,
so their means are clearly
different
The distributions
overlap, so it is unclear
whether the samples
come from the same
population
$$SE = \frac{SD}{\sqrt{\text{Sample Size}}}$$
ONE-SAMPLE (INDEPENDENT SAMPLE) t-TEST: Used to compare means of two independent samples
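A minimal sketch of the t statistic for two independent samples, assuming equal variances in the two groups (the pooled-variance form of the test). The two samples are hypothetical, and the function name is illustrative.

```python
import math
import statistics

def independent_t(sample_a, sample_b):
    """Pooled-variance t statistic and degrees of freedom for two independent samples."""
    n_a, n_b = len(sample_a), len(sample_b)
    mean_a, mean_b = statistics.mean(sample_a), statistics.mean(sample_b)
    # statistics.variance uses the n - 1 divisor (sample variance).
    var_a, var_b = statistics.variance(sample_a), statistics.variance(sample_b)
    df = n_a + n_b - 2
    pooled_variance = ((n_a - 1) * var_a + (n_b - 1) * var_b) / df
    # Standard error of the difference between the two means.
    se_difference = math.sqrt(pooled_variance * (1 / n_a + 1 / n_b))
    return (mean_a - mean_b) / se_difference, df

# Hypothetical data: the group means differ by 2 with similar spread.
sample_a = [1, 2, 3, 4, 5]
sample_b = [3, 4, 5, 6, 7]

t, df = independent_t(sample_a, sample_b)
print(f"t = {t:.2f} on {df} degrees of freedom")
```

The statistic is the difference between the means divided by its standard error, so widely separated distributions give a large absolute t and heavily overlapping ones give a t near zero.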
COMPARING PROPORTIONS:
THE CHI-SQUARE TEST

                              A      B
Number of patients            100    50
Actual % Discharged           15     30
Actual number discharged      15     15
Expected number discharged    20     10

$$\chi^2 = \mathrm{Sum}\left[\frac{(\text{Observed} - \text{Expected})^2}{\text{Expected}}\right] = \frac{(15 - 20)^2}{20} + \frac{(15 - 10)^2}{10} = \frac{25}{20} + \frac{25}{10} = 1.25 + 2.5 = 3.75$$
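The arithmetic on this slide can be reproduced with a short sketch. The counts are those on the slide; the variable names are illustrative. The slide sums over the discharged cells only. A conventional chi-square test on the full 2x2 table would also include the patients who were not discharged, so the sketch computes that value as well for comparison.

```python
def chi_square(observed, expected):
    """Sum of (Observed - Expected)^2 / Expected over the supplied cells."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

groups = ("A", "B")
patients = {"A": 100, "B": 50}    # from the slide
discharged = {"A": 15, "B": 15}   # from the slide

total_patients = sum(patients.values())
total_discharged = sum(discharged.values())
total_not_discharged = total_patients - total_discharged

# Expected counts if both groups shared the overall discharge rate (30 of 150).
observed_discharged = [discharged[g] for g in groups]
expected_discharged = [patients[g] * total_discharged / total_patients for g in groups]

# The value calculated on the slide (discharged cells only).
slide_value = chi_square(observed_discharged, expected_discharged)

# Assumption for comparison: add the not-discharged cells of the 2x2 table.
observed_not = [patients[g] - discharged[g] for g in groups]
expected_not = [patients[g] * total_not_discharged / total_patients for g in groups]
full_table_value = slide_value + chi_square(observed_not, expected_not)

print("expected discharged:", expected_discharged)
print("slide calculation:", slide_value)
print("full 2x2 table:", full_table_value)
```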
               Sum of Squares    df    Mean Square    F    Sig.
Total          2709.69           67
Continuous variables                            Independent t-test                  ANOVA
Ordinal variables (not normally distributed)    Mann-Whitney U test, Median test    Kruskal-Wallis ANOVA
KAPPA
(Non-parametric) measure of agreement
                                      TIME 1 (OR OBSERVER 1)
                                      Positive   Negative   Total
TIME 2 (OR OBSERVER 2)   Positive     A          C          A+C
                         Negative     D          B          B+D
                         Total        A+D        B+C        N
Kappa Agreement
<0.20 Poor
0.21-0.40 Slight
0.41-0.60 Moderate
0.61-0.80 Good
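The slide does not give the formula for kappa. A minimal sketch using the standard definition (Cohen's kappa) and the cell labels from the table above, where A and B are the cells in which the two ratings agree, might look like this. The counts are hypothetical.

```python
def cohen_kappa(a, b, c, d):
    """Cohen's kappa for a 2x2 agreement table.

    a, b: counts where both ratings agree (positive/positive, negative/negative)
    c, d: counts where the ratings disagree
    """
    n = a + b + c + d
    observed_agreement = (a + b) / n
    # Agreement expected by chance, from the row and column totals.
    expected_agreement = ((a + c) * (a + d) + (b + d) * (b + c)) / n ** 2
    return (observed_agreement - expected_agreement) / (1 - expected_agreement)

# Hypothetical counts: 80 of 100 ratings agree.
kappa = cohen_kappa(a=50, b=30, c=10, d=10)
print(f"kappa = {kappa:.2f}")
```

Here 80% raw agreement gives a kappa of about 0.58, because roughly half of that agreement would be expected by chance alone.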
$$se = \sqrt{\frac{p_1(1 - p_1)}{n_1} + \frac{p_2(1 - p_2)}{n_2}}$$
$$se(ARR) = \sqrt{\frac{0.13(1 - 0.13)}{23} + \frac{0.52(1 - 0.52)}{23}}$$
NB: This formula is given for convenience. You are not required to commit any of
these formulae to memory; they can be obtained from numerous textbooks
CONFIDENCE INTERVAL OF
ABSOLUTE RISK REDUCTION
ARR = 0.39
se = 0.13
95% CI of ARR = ARR ± 1.96 x se
95% CI = 0.39 ± 1.96 x 0.13
95% CI = 0.39 ± 0.25 = 0.14 to 0.64
The calculated value of ARR is 39%, and the
95% CI indicates that the true ARR could be
as low as 14% or as high as 64%
Key point: the result is statistically significant
because the 95% CI does not include zero
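The calculation on these slides can be checked with a short sketch. The proportions (0.13 and 0.52) and group sizes (23 each) are those used on the slides; the function name is illustrative, and the multiplier z=1.96 is the conventional value for a 95% interval under a normal approximation.

```python
import math

def arr_confidence_interval(p1, n1, p2, n2, z=1.96):
    """Absolute risk reduction, its standard error and the confidence interval."""
    arr = abs(p2 - p1)
    se = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    return arr, se, arr - z * se, arr + z * se

arr, se, lower, upper = arr_confidence_interval(p1=0.13, n1=23, p2=0.52, n2=23)

# The result is treated as statistically significant if the interval excludes zero.
significant = not (lower <= 0 <= upper)

print(f"ARR = {arr:.2f}, se = {se:.2f}")
print(f"95% CI = {lower:.2f} to {upper:.2f}, significant: {significant}")
```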
INTERPRETATION OF CONFIDENCE
INTERVALS
Remember that the mean estimated from a
sample is only an estimate of the population
mean
The actual mean can lie anywhere within the
95% confidence interval estimated from
your data
For an Odds Ratio, if the 95% CI passes
through 1.0, this means that the Odds Ratio
is unlikely to be statistically significant
For an Absolute Risk Reduction or Absolute
Benefit increase, this is unlikely to be
significant if its 95% CI passes through zero
CORRELATION
Here, there are two variables (HADS depression score and SIS) plotted against each other
The question is: do HADS scores correlate with SIS ratings?
[Figure: scatter plot of HADS Depression (axis 0 to 14) against SIS (axis 0 to 30)]
CORRELATION
[Figure: plot against SIS; the only surviving annotation text is "minimised"]
CORRELATION
[Figure: two scatter plots of HADS Depression (axis 0 to 14) against SIS (axis 0 to 30), one with r²=0.34 and one with r²=0.06]
CORRELATION
y = A + Bx
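A minimal sketch of how the line y = A + Bx can be fitted by least squares, together with r². The data points are hypothetical and lie exactly on a line so that the result is easy to check by hand; real data such as the HADS and SIS scores would scatter around the line and give r² below 1.

```python
def fit_line(xs, ys):
    """Least-squares estimates of A and B in y = A + Bx, plus r squared."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of squares and cross-products about the means.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx               # slope
    a = mean_y - b * mean_x     # intercept
    r_squared = sxy ** 2 / (sxx * syy)
    return a, b, r_squared

# Hypothetical points lying exactly on y = 1 + 2x.
xs = [0, 1, 2, 3]
ys = [1, 3, 5, 7]

a, b, r_squared = fit_line(xs, ys)
print(f"A = {a}, B = {b}, r squared = {r_squared}")
```

r² is the proportion of the variation in y that the fitted line accounts for, which is why the tighter of two scatter plots has the larger r².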
REGRESSION
$$y = A + B_1 x_1 + B_2 x_2 + B_3 x_3 + \ldots$$

                             Beta    t        p         R²
Disease Activity (RADAI)     .02     .01      0.91      .00
Sense of Coherence           -.40    -4.40    <0.001    .23
Patients who have not relapsed at the end of the study are described as censored
[Figure: one timeline per case against Year of Study (axis 0 to 5); surviving outcome labels: case 6 W, case 7 W, case 9 X, case 10 X]
SURVIVAL ANALYSIS: ASSUME
ALL CASES RECRUITED AT TIME=0
X=Relapsed, W=Withdrew, C=Censored

Case   Outcome
1      X
2      C
3      W
4      X
5      C
6      W
7      W
8      C
9      X
10     X

[Figure: one timeline per case against Year of Study (axis 0 to 5)]
SURVIVAL ANALYSIS:
EVENTS IN YEAR 1
X=Relapsed, W=Withdrew, C=Censored

Case   Outcome
1      X
2      C
3      W
4      X
5      C

Case 6 withdrew within
SURVIVAL CURVE
[Figure: survival curve plotted against Year]
KAPLAN-MEIER SURVIVAL
ANALYSIS
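A minimal sketch of the Kaplan-Meier calculation for the ten cases. The outcomes (X, W, C) are those listed on the earlier slides, but the year in which each case left the study is an assumption made for illustration, since the timelines did not survive extraction. Withdrawn and censored cases are both treated as leaving the risk set without a relapse.

```python
def kaplan_meier(times, relapsed):
    """Return [(time, survival probability)] at each time with at least one relapse."""
    observations = sorted(zip(times, relapsed))
    at_risk = len(observations)
    survival = 1.0
    curve = []
    index = 0
    while index < len(observations):
        t = observations[index][0]
        at_this_time = [r for (time, r) in observations if time == t]
        events = sum(at_this_time)  # number of relapses at time t
        if events:
            # Multiply by the proportion of those at risk who did not relapse.
            survival *= 1 - events / at_risk
            curve.append((t, survival))
        # Everyone observed at time t (relapsed or not) leaves the risk set.
        at_risk -= len(at_this_time)
        index += len(at_this_time)
    return curve

# Outcomes from the slides: X=Relapsed, W=Withdrew, C=Censored.
outcomes = {1: "X", 2: "C", 3: "W", 4: "X", 5: "C",
            6: "W", 7: "W", 8: "C", 9: "X", 10: "X"}
# ASSUMPTION: hypothetical year in which each case left the study.
years = {1: 2, 2: 5, 3: 3, 4: 4, 5: 5, 6: 1, 7: 2, 8: 5, 9: 3, 10: 1}

cases = sorted(outcomes)
curve = kaplan_meier([years[c] for c in cases],
                     [outcomes[c] == "X" for c in cases])

for year, probability in curve:
    print(f"year {year}: proportion not relapsed = {probability:.3f}")
```

Because cases that withdraw are removed from the number at risk rather than counted as relapses, each step of the curve reflects only those still under observation at that time.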