
Successful completion of NAMA7903:

• Successful completion of ethics theme AND
• At least 2 of 3 contact session tasks
passed (at most one on re-try) AND
• Attendance of at least 2 of the 3 contact
sessions
(Optional: How to write a protocol session 11 April)

Registrar protocol presentations 13 June 4-6pm Metro 4


Registrar result presentations 25 July 4-6pm Metro 4
NAMA7903
Theme 4 2018
Descriptive and analytical statistics
Enteral nutrition for patients in septic shock: a
retrospective cohort study.
Rai SS, O’Connor SN, Lange K, Rivett J,
Chapman MJ. Crit Care Resusc 2010;
12(3):177-181.

43 patients (mean age, 54 [SD, 20] years; mean
APACHE II score, 20 [SD, 8]) were identified, of
whom 33 had shock. The median length of ICU
stay was 13 days (range, 3-55 days), and 32
patients (74%) survived hospital.
Importance of physicians' attire: factors influencing the
impression it makes on patients, a cross-sectional
study
With regard to attire, regardless of a doctor's gender, the
white coat was judged to be the most appropriate style of
dress, followed by surgical scrubs. Only the preference
for scrubs was significantly affected by age, gender and
region (P < 0.05). Using binomial logistic regression
analysis, we evaluated the effects of age on the
appropriateness (Likert score 3-5) versus
inappropriateness (score 1-2) of scrubs. There was a
significant increase in the number of subjects aged 50-64
and >65 years of age who thought scrubs were
inappropriate compared with those aged 20-34 years
(adjusted odds ratios of 4.30 and 12.7 for male doctors,
and 3.66 and 6.91 for female doctors).

Kurihara H, Maeno T, Maeno T. Asia Pac Fam Med 2014; 13(1):2.
Aims of analysis of collected information:
• What were the study results?
• How precise are the study estimates?
Variables (information which is
collected)

• Numerical (quantitative)
  – discrete (e.g. parity)
  – continuous (e.g. height)

• Categorical (qualitative)
  – nominal (e.g. gender)
  – ordinal (e.g. disease severity)
• Data entry into Excel for analysis by
Department of Biostatistics
• Data checking before analysis
– see protocol manual
Number  Age  Gender  Date of      Current  Ever    Age started  Length of
                     interview    smoker   smoked  smoking      hospitalization
--------------------------------------------------------------------------------
1       45   M       25-May-07    Y        Y       15           12 days
2       32   F       Jun 15 2007  Y        N       0            2 days
3       35   F       23-May-07    N        Y       31           5 days
4       46           May 24 2007  N        Y       40           2 weeks
5       25   M       22-May-07    Y        Y       28           1 month
6       20   M       13-Jun-07    N        N                    5 days
7       17   F       12-Jun-07    Y        Y       16           23 days
8       20   M       11-Jun-07    Y        Y                    7 days
9       19   M       21-May-07    Y        Y       18           8 days
10      30   F       3-Jun-07     N        N       20           2 weeks
Number  Age  Gender  Date of      Current  Ever    Age started  Length of
                     interview    smoker   smoked  smoking      hospitalization
--------------------------------------------------------------------------------
1       45   M       25-May-07    Y        Y       15           12 days
2       32   F       Jun 15 2007  Y        N       0            2 days
3       35   F       23-May-07    N        Y       31           5 days
4       46   ?       May 24 2007  N        Y       40           2 weeks
5       25   M       22-May-07    Y        Y       28           1 month
6       20   M       13-Jun-07    N        N                    5 days
7       17   F       12-Jun-07    Y        Y       16           23 days
8       20   M       11-Jun-07    Y        Y       ?            7 days
9       19   M       21-May-07    Y        Y       18           8 days
10      30   F       3-Jun-07     N        N       20           2 weeks
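The kind of data checking meant here can be sketched in a few lines of Python. This is only an illustration, not the procedure from the protocol manual: the rules below (one date format, length of stay in days, consistent smoking answers) and the names `records` and `check_record` are assumptions chosen to fit the ten example rows.

```python
from datetime import datetime

# The ten records from the table above; None marks an empty cell.
records = [
    (1, 45, "M", "25-May-07", "Y", "Y", 15, "12 days"),
    (2, 32, "F", "Jun 15 2007", "Y", "N", 0, "2 days"),
    (3, 35, "F", "23-May-07", "N", "Y", 31, "5 days"),
    (4, 46, None, "May 24 2007", "N", "Y", 40, "2 weeks"),
    (5, 25, "M", "22-May-07", "Y", "Y", 28, "1 month"),
    (6, 20, "M", "13-Jun-07", "N", "N", None, "5 days"),
    (7, 17, "F", "12-Jun-07", "Y", "Y", 16, "23 days"),
    (8, 20, "M", "11-Jun-07", "Y", "Y", None, "7 days"),
    (9, 19, "M", "21-May-07", "Y", "Y", 18, "8 days"),
    (10, 30, "F", "3-Jun-07", "N", "N", 20, "2 weeks"),
]


def check_record(rec):
    """Return a list of problems found in one record (hypothetical rules)."""
    number, age, gender, date, current, ever, age_started, stay = rec
    problems = []
    if gender not in ("M", "F"):
        problems.append("gender missing")
    try:
        datetime.strptime(date, "%d-%b-%y")  # assumed standard: 25-May-07
    except ValueError:
        problems.append("date not in DD-Mon-YY format")
    if current == "Y" and ever == "N":
        problems.append("current smoker but never smoked")
    if ever == "Y" and age_started is None:
        problems.append("age started smoking missing")
    if ever == "N" and age_started is not None:
        problems.append("age started given for a never-smoker")
    if age_started is not None and age_started > age:
        problems.append("age started exceeds current age")
    if not stay.endswith("days"):
        problems.append("length of stay not in days")
    return problems


issues = {rec[0]: check_record(rec) for rec in records}
for number, problems in issues.items():
    if problems:
        print(number, "; ".join(problems))
```

Besides the two missing values, a check like this also picks up inconsistent answers (for example an age at starting smoking that exceeds the current age) and mixed units in the length of hospitalization.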
Summary (descriptive) statistics:
numerical variables
Numerical data:
                    | 14 | 8
                    | 13 |
                    | 12 |
symmetric           | 11 |
                    | 10 | 4
                  4 |  9 | 3
                 71 |  8 | 045            skew
              95411 |  7 | 146
            9954332 |  6 | 001
             975220 |  5 | 22448
                 66 |  4 | 11116668
                  8 |  3 |

Summary (descriptive) statistics:
numerical variables

Central location
• mean (average): sum of all the values
divided by the number of
observations in the group
• median
• (mode: the value which occurs most often)
Median

• odd number of observations: the middle
value of the ordered series

• even number of observations: the mean of
the 2 middle values in the ordered series
                    | 14 | 8
                    | 13 |
                    | 12 |
                    | 11 |
median=63.5         | 10 | 4              median=58
mean=64.25        4 |  9 | 3              mean=64.6
                 71 |  8 | 045
              95411 |  7 | 146
            9954332 |  6 | 001
             975220 |  5 | 22448
                 66 |  4 | 11116668
                  8 |  3 |
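The means and medians on the plot can be checked directly. The sketch below reads the values off the stem-and-leaf plot (stem = tens, leaf = units); the list names `symmetric` and `skew` are just labels for the left and right sides of the plot.

```python
from statistics import mean, median

# Left-hand (symmetric) side of the plot, 24 observations.
symmetric = [38, 46, 46, 50, 52, 52, 55, 57, 59, 62, 63, 63, 64, 65, 69, 69,
             71, 71, 74, 75, 79, 81, 87, 94]

# Right-hand (skew) side of the plot, 25 observations.
skew = [41, 41, 41, 41, 46, 46, 46, 48, 52, 52, 54, 54, 58, 60, 60, 61,
        71, 74, 76, 80, 84, 85, 93, 104, 148]

# Even n: median is the mean of the two middle values (63 and 64).
print(median(symmetric), mean(symmetric))  # 63.5 64.25

# Odd n: median is the middle (13th) value; the long upper tail pulls the mean up.
print(median(skew), round(mean(skew), 1))  # 58 64.6
```

In the symmetric group the mean and median nearly coincide; in the skew group the mean lies well above the median.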

Censored observations

• laboratory values
1,2,3,4,5

1,2,3,4,50
median rather than mean if:
• skew distribution
• small sample with outlier
• censored observations
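A quick check with the two small series shown above (1,2,3,4,5 and 1,2,3,4,50) illustrates the outlier case; the variable names are only for illustration.

```python
from statistics import mean, median

without_outlier = [1, 2, 3, 4, 5]
with_outlier = [1, 2, 3, 4, 50]

# A single extreme value moves the mean from 3 to 12 ...
print(mean(without_outlier), mean(with_outlier))      # 3 12
# ... while the median stays at 3 in both series.
print(median(without_outlier), median(with_outlier))  # 3 3
```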
• Categorise a numerical variable:
Blood pressure, Body mass index
Measures of variability
• range: largest - smallest value
can be influenced by extreme value
Measures of variability
• standard deviation

$$ s = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}} $$

• interquartile range: 75th - 25th percentile
                    | 14 | 8
                    | 13 |
                    | 12 |
                    | 11 |
median=63.5         | 10 | 4              median=58
mean=64.25        4 |  9 | 3              mean=64.6
s=13.6           71 |  8 | 045            s=24.8
              95411 |  7 | 146
            9954332 |  6 | 001
             975220 |  5 | 22448
                 66 |  4 | 11116668
                  8 |  3 |

• Coefficient of variation (CV)

$$ CV = \frac{s}{\bar{x}} \times 100 $$
• 25th percentile: the value below which a quarter
of the sample lies (three quarters lie above it)
• 75th percentile: the value below which three quarters
of the sample lie (a quarter lies above it)
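The measures of variability can be computed for the same two groups of values read off the stem-and-leaf plot. This is a sketch using Python's standard `statistics` module; note that software packages use slightly different conventions for percentiles, so quartiles from other programs may differ a little from what `statistics.quantiles` returns.

```python
from statistics import mean, quantiles, stdev

symmetric = [38, 46, 46, 50, 52, 52, 55, 57, 59, 62, 63, 63, 64, 65, 69, 69,
             71, 71, 74, 75, 79, 81, 87, 94]
skew = [41, 41, 41, 41, 46, 46, 46, 48, 52, 52, 54, 54, 58, 60, 60, 61,
        71, 74, 76, 80, 84, 85, 93, 104, 148]


def variability(values):
    """Range, sample standard deviation, CV (%) and interquartile range."""
    s = stdev(values)                     # divides by n - 1
    q1, _, q3 = quantiles(values, n=4)    # 25th, 50th, 75th percentiles
    return {
        "range": max(values) - min(values),
        "sd": s,
        "cv_percent": s / mean(values) * 100,
        "iqr": q3 - q1,
    }


sym_stats = variability(symmetric)
skew_stats = variability(skew)
print(round(sym_stats["sd"], 1), round(skew_stats["sd"], 1))  # 13.6 24.8
```

The single value of 148 inflates both the range and the standard deviation of the skew group, which is why the interquartile range is usually reported alongside the median for such data.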
Summary (descriptive) statistics:
categorical variables

• Frequencies and percentages


• OR Percentages with overall sample size
out of which percentage was calculated
• Frequencies (numbers) on their own of
little value
Enteral nutrition for patients in septic shock: a
retrospective cohort study.
Rai SS, O’Connor SN, Lange K, Rivett J,
Chapman MJ. Crit Care Resusc 2010;
12(3):177-181.

43 patients (mean age, 54 [SD, 20] years; mean
APACHE II score, 20 [SD, 8]) were identified, of
whom 33 had shock. The median length of ICU
stay was 13 days (range, 3-55 days), and 32
patients (74%) survived hospital.
Summarising the relationship
between variables
Risk assessment:
Association between risk factor/exposure and
an outcome

              Serious side effect
Treatment     present   absent   Total
A             10        40       50
B             10        90       100
Total         20        130      150

2x2 table: 2 rows, 2 columns


              Serious side effect
Treatment     present   absent   Total
A             10        40       50
B             10        90       100
Total         20        130      150

Risk of serious side effect on Treatment A:
10/50=20%
Risk of serious side effect on Treatment B:
10/100=10%
Relative risk (Treatment A to B): 20%/10%=2
(also known as risk ratio)
Risk difference (Treatment A-B):
20%-10%=10%
Relative risk reduction

• 20% to 10%: risk reduced by (20%-10%)/20% = 50%

Number needed to treat to prevent one adverse outcome

• A: 20%, B: 10%: 1/(20%-10%) = 1/0.10 = 10
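The risk-based measures for the Treatment A versus B table can be reproduced with a short function. This is a minimal sketch; the function name `risk_measures` and its argument names are illustrative only.

```python
def risk_measures(events_a, total_a, events_b, total_b):
    """Risk-based summary measures for a 2x2 table (group A compared to B)."""
    risk_a = events_a / total_a
    risk_b = events_b / total_b
    risk_difference = risk_a - risk_b
    return {
        "risk_a": risk_a,
        "risk_b": risk_b,
        "relative_risk": risk_a / risk_b,
        "risk_difference": risk_difference,
        # Reduction in risk when moving from A to B, relative to the risk on A.
        "relative_risk_reduction": risk_difference / risk_a,
        # Patients who must receive B instead of A to prevent one event.
        "number_needed_to_treat": 1 / risk_difference,
    }


measures = risk_measures(events_a=10, total_a=50, events_b=10, total_b=100)
for name, value in measures.items():
    print(f"{name}: {value:.2f}")
```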


              Serious side effect
Treatment     present   absent   Total
A             10        40       50
B             10        90       100
Total         20        130      150

Odds of serious side effect on Treatment A:
10/40=0.25
Odds of serious side effect on Treatment B:
10/90=0.11
Odds ratio (Treatment A to B): (10/40)/(10/90)=2.25
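The same table gives the odds and the odds ratio. In this sketch the function names are illustrative; note that odds divide by the number without the event, whereas risk divides by the row total.

```python
def odds(events, non_events):
    """Odds of the event: events divided by non-events (not by the total)."""
    return events / non_events


def odds_ratio(events_a, non_events_a, events_b, non_events_b):
    """Odds ratio of group A to group B."""
    return odds(events_a, non_events_a) / odds(events_b, non_events_b)


odds_a = odds(10, 40)                         # 0.25
odds_b = odds(10, 90)                         # about 0.11
treatment_or = odds_ratio(10, 40, 10, 90)     # 2.25
print(round(odds_a, 2), round(odds_b, 2), round(treatment_or, 2))
```

Here the odds ratio (2.25) is somewhat larger than the relative risk (2) for the same data; the two are close only when the outcome is rare.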
                  OUTCOME
RISK FACTOR    Poor    Good
Yes            a       b       a+b
No             c       d       c+d
               a+c     b+d     a+b+c+d

Cross-sectional study: select a+b+c+d
Cohort study or Clinical Trial: select a+b and
c+d
Case-control study: select a+c and b+d
                      Lung cancer
Cigarettes per day    Yes    No
50+                   38     12
25-49                 293    154
15-24                 475    431
5-14                  489    570
1-4                   55     129
0                     7      61

6x2 table
                      Lung cancer
Cigarettes per day    Yes    No
50+                   38     12
0                     7      61
Odds ratio 27.6

1-4                   55     129
0                     7      61
Odds ratio 3.7
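The two odds ratios above come from comparing one smoking category at a time with the non-smokers. A sketch that does this for every row of the 6x2 table follows; the dictionary `lung_cancer` and the choice of 0 cigarettes per day as reference group mirror the slides, and the other names are illustrative.

```python
# (lung cancer yes, lung cancer no) per smoking category, from the 6x2 table.
lung_cancer = {
    "50+": (38, 12),
    "25-49": (293, 154),
    "15-24": (475, 431),
    "5-14": (489, 570),
    "1-4": (55, 129),
    "0": (7, 61),
}

ref_yes, ref_no = lung_cancer["0"]  # reference group: non-smokers

odds_ratios = {}
for category, (yes, no) in lung_cancer.items():
    if category == "0":
        continue
    # Cross-product form of the odds ratio: (a * d) / (b * c).
    odds_ratios[category] = (yes * ref_no) / (no * ref_yes)

for category, value in odds_ratios.items():
    print(f"{category:>6} vs 0: OR = {value:.1f}")
```

Each category is compared with the same reference group, so the odds ratios can be read as a dose-response pattern across the table.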
Stratified analysis: adjusting for effect of
confounders

                   Hypertension
Overweight         Yes    No     Total
BMI > 25           151    108    259
BMI 25 or less     84     132    216
Total              235    240    475
Relative risk 1.5
                   Hypertension
                   Yes    No     Total
BMI > 25           151    108    259
BMI 25 or less     84     132    216
Total              235    240    475
Relative risk 1.5

                 Males                     Females
                 Hypertension              Hypertension
                 Yes    No    Total        Yes    No    Total
BMI > 25         20     24    44           131    84    215
BMI ≤ 25         36     73    109          48     59    107
Total            56     97    153          179    143   322

Relative risk    1.38                      1.36
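The crude and stratum-specific relative risks can be recomputed from the counts in the tables. This sketch only reproduces the three relative risks shown; it does not compute a pooled adjusted estimate (such as Mantel-Haenszel), and the function and variable names are illustrative.

```python
def relative_risk(exposed_events, exposed_total, unexposed_events, unexposed_total):
    """Risk in the exposed group divided by risk in the unexposed group."""
    return (exposed_events / exposed_total) / (unexposed_events / unexposed_total)


# Hypertension by BMI group: (events, total) for BMI > 25, then BMI <= 25.
crude = relative_risk(151, 259, 84, 216)
males = relative_risk(20, 44, 36, 109)
females = relative_risk(131, 215, 48, 107)

print(f"crude:   {crude:.2f}")    # 1.50
print(f"males:   {males:.2f}")    # 1.38
print(f"females: {females:.2f}")  # 1.36
```

Both stratum-specific relative risks are lower than the crude value, which suggests that part of the crude association is due to the different sex distribution in the two BMI groups.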


• Logistic regression: binary/dichotomous
outcome, a number of numerical or
categorical predictors
• Correlation (r= -1 to 1): assesses
relationship between two numeric
variables

• Multiple regression: numeric outcome,
various numeric or categorical predictors
How precise are the study
estimates

Target population

Sample

Results
• Statistics based on sample data are
estimates of the corresponding
parameters in the target population
• Different random samples of the same
size from the same target population will
not give exactly the same estimates
(results): there is variation in any sample
• The precision of the estimates depends
on:
- sample size
- variability of the variable being
measured
• The standard error of an estimate combines
these two components

• For example, standard error of mean:

$$ SE(\bar{x}) = \frac{s}{\sqrt{n}} $$

95% confidence interval
(estimation)
The range of values we consider possible
for the target population parameter, based
on the sample estimate

To calculate use:
• sample estimate
• its standard error
• cutoff value of the appropriate probability
distribution
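The three ingredients listed above can be put together for a mean. The sketch below uses the age data quoted earlier from Rai et al. (mean 54, SD 20, n = 43) and assumes the normal-distribution cutoff of 1.96; with a sample of this size a t-distribution cutoff would be slightly larger. The interval is an illustration only and is not reported in the paper.

```python
import math


def mean_confidence_interval(sample_mean, sd, n, cutoff=1.96):
    """Approximate 95% CI for a mean: estimate +/- cutoff * standard error."""
    standard_error = sd / math.sqrt(n)
    lower = sample_mean - cutoff * standard_error
    upper = sample_mean + cutoff * standard_error
    return lower, upper, standard_error


lower, upper, se = mean_confidence_interval(sample_mean=54, sd=20, n=43)
print(f"SE = {se:.2f}, 95% CI = {lower:.1f} to {upper:.1f}")
# SE = 3.05, 95% CI = 48.0 to 60.0
```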
• 4/10 (95% CI: 12.2% to 73.8%)

• 40/100 (95% CI: 30.3% to 50.3%)

• 400/1000 (95% CI: 37.0% to 43.1%)
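The three intervals above show how the confidence interval narrows as the sample grows while the estimate stays at 40%. They appear to be exact binomial (Clopper-Pearson) intervals; that is an assumption, since the slides do not name the method. A sketch using only the Python standard library, with illustrative function names:

```python
import math


def binomial_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p); each term is computed in log space."""
    total = 0.0
    for i in range(k + 1):
        log_term = (math.lgamma(n + 1) - math.lgamma(i + 1) - math.lgamma(n - i + 1)
                    + i * math.log(p) + (n - i) * math.log1p(-p))
        total += math.exp(log_term)
    return min(total, 1.0)


def solve(func, target, increasing, steps=60):
    """Find p in (0, 1) with func(p) = target by bisection (func is monotone)."""
    lo, hi = 0.0, 1.0
    for _ in range(steps):
        mid = (lo + hi) / 2
        if (func(mid) < target) == increasing:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def exact_ci(x, n, level=0.95):
    """Clopper-Pearson interval for x events out of n."""
    alpha = 1 - level
    # Lower limit: the p for which P(X >= x) equals alpha / 2.
    lower = 0.0 if x == 0 else solve(
        lambda p: 1 - binomial_cdf(x - 1, n, p), alpha / 2, increasing=True)
    # Upper limit: the p for which P(X <= x) equals alpha / 2.
    upper = 1.0 if x == n else solve(
        lambda p: binomial_cdf(x, n, p), alpha / 2, increasing=False)
    return lower, upper


for x, n in [(4, 10), (40, 100), (400, 1000)]:
    low, high = exact_ci(x, n)
    print(f"{x}/{n}: {low:.1%} to {high:.1%}")
```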

Sample must be representative of target
population
• RR=2 (95% CI 0.9;4.5)
Hypothesis testing and p-values
Null hypothesis H0: there is no difference
Alternative hypothesis HA: there is a
difference
To test the null hypothesis:
• calculate a test statistic based on the sample
information
• this test statistic follows a certain probability
distribution
• determine the probability (p-value) of observing
the test statistic or a more extreme value, if the
null hypothesis is true
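The three steps can be illustrated with the Treatment A versus B example (10/50 versus 10/100 with a serious side effect). The slides do not say which test was used; the sketch below assumes a two-sided z-test for two proportions, chosen only because it is simple to write out. With counts this small an exact test might be preferred in practice.

```python
import math
from statistics import NormalDist


def two_proportion_z_test(events_a, n_a, events_b, n_b):
    """Two-sided z-test of H0: the two population proportions are equal."""
    p_a = events_a / n_a
    p_b = events_b / n_b
    # Under H0 both groups share one proportion, estimated from the pooled data.
    pooled = (events_a + events_b) / (n_a + n_b)
    standard_error = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / standard_error              # step 1: test statistic
    # Steps 2 and 3: z follows a standard normal distribution under H0;
    # the p-value is the probability of a value at least this extreme.
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


z, p_value = two_proportion_z_test(10, 50, 10, 100)
print(f"z = {z:.2f}, p = {p_value:.3f}")
```

Under these assumptions the p-value is a little under 0.09, so the doubling of risk in this small sample would not be statistically significant at the 0.05 level.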
Null hypothesis H0: there is no difference

p-value: the probability of observing the
study results, or results more extreme, if the
null hypothesis is true

“Only the preference for scrubs was significantly affected
by age, gender and region (P < 0.05)”
If p is small, reject null hypothesis (p<0.05)
If p is large, null hypothesis cannot be
rejected
• Type I error (α): probability that null
hypothesis is rejected when it is in fact true
• Type II error (β): probability that null
hypothesis is not rejected when alternative
hypothesis is true
• Power (1-β): ability of test to detect a
difference when there is in fact a
difference
True prevalence    A: 10%    B: 30%

Sample size per group    Power
71                       80%
59                       70%
40                       50%
20                       21%
10                       8%
• Confidence intervals can be used to
evaluate the clinical significance of a
difference/association

• If a sample is very large, a clinically
insignificant difference can be
statistically significant

• The researcher must decide beforehand
which values are clinically significant
and which not.
Consider clinical trials conducted to test a new
treatment which lowers blood pressure.

The effect of the medication is measured by the
decrease in blood pressure from the start of
the trial to after 6 weeks of treatment.

A decrease of 5 mm Hg or larger is considered
clinically significant.
95% CI interpretation:

Trial   Observed mean decrease   p-value   Significance
        (95% CI)                           Clinical                Statistical
A       -13.5 (-20; -7)          <0.05     yes                     yes
B       -10.0 (-16; -4)          <0.05     possibly                yes
C       -2.5 (-4; -1)            <0.05     no                      yes
D       0 (-4; 4)                >0.05     no                      no
E       0 (-10; 10)              >0.05     conclusion impossible   no
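The reasoning behind the table can be written as a small decision rule. This is one possible reading of the slide, not a rule stated in it: the interval is compared with zero for statistical significance and with the 5 mm Hg threshold for clinical significance. The function name `interpret` and the sign convention (negative values are decreases in blood pressure) are assumptions, and the sketch only considers changes in the direction of a decrease.

```python
def interpret(ci_low, ci_high, threshold=5.0):
    """Classify a 95% CI for mean change in blood pressure (negative = decrease)."""
    # Statistically significant if the interval excludes zero (no change).
    statistical = not (ci_low <= 0 <= ci_high)
    if ci_high <= -threshold:
        clinical = "yes"                    # every plausible decrease is >= 5 mm Hg
    elif ci_low > -threshold:
        clinical = "no"                     # no plausible decrease reaches 5 mm Hg
    elif statistical:
        clinical = "possibly"               # interval straddles the threshold
    else:
        clinical = "conclusion impossible"  # includes both no effect and large effects
    return clinical, "yes" if statistical else "no"


trials = {"A": (-20, -7), "B": (-16, -4), "C": (-4, -1), "D": (-4, 4), "E": (-10, 10)}
results = {name: interpret(low, high) for name, (low, high) in trials.items()}
for name, (clinical, statistical) in results.items():
    print(f"{name}: clinical = {clinical}, statistical = {statistical}")
```

Trials C and E show why the p-value alone is not enough: C is statistically significant but clinically unimportant, while E is too imprecise to support any conclusion.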
gnbsgj@ufs.ac.za
