
Nervous system: Biostatistics
Min-Kyung Jung, Ph.D.
Biostatistician
mjung01@nyit.edu
NYCOM I Room 314E

Objectives
- State the chi-square test, McNemar's test, and Fisher's exact test as the major statistical procedures for comparing proportions
- Choose the right test among them for a given study design
- Interpret odds ratios to measure effect sizes, and their confidence intervals to describe precision

Chi-square test: Statistical Procedure for Comparing Proportions
- Chi-square tests are performed to compare the proportions of two (or more) independent groups
  e.g.) If you want to compare the proportion of subjects having headaches between a group of chronic caffeine users and a group of non-caffeine users, which statistical test would you perform?
- It is important to check the assumptions before employing a chi-square test

Assumptions for chi-square test
1. Independence of the data
   - when violated: use McNemar's test
2. Expected counts greater than 5 for all cells
   - in larger contingency tables it is acceptable to have up to 20% of expected counts below 5
   - loss of power when violated (the test may fail to detect a genuine effect)
   - when violated: use Fisher's exact test
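
The two assumptions above can be read as a simple decision rule. The Python sketch below is one possible encoding of it, not a definitive procedure. The function name `choose_test` and its arguments are hypothetical, and it assumes that an expected count counts as "too small" when it is below 5, that a 2x2 table tolerates no such cell, and that a larger table tolerates up to 20% of them.

```python
def choose_test(paired, expected_counts):
    """Suggest a test for comparing proportions (hypothetical helper).

    paired          -- True if the observations are not independent
                       (e.g. the same subjects measured twice)
    expected_counts -- flat list with the expected count of every cell
    """
    if paired:
        # Assumption 1 (independence of the data) is violated.
        return "McNemar's test"

    n_cells = len(expected_counts)
    n_small = sum(1 for e in expected_counts if e < 5)

    if n_cells <= 4:
        # 2x2 table: assumed here that no expected count may fall below 5.
        too_sparse = n_small > 0
    else:
        # Larger table: up to 20% of expected counts may fall below 5.
        too_sparse = n_small / n_cells > 0.20

    # Assumption 2 (expected counts) is violated when the table is too sparse.
    return "Fisher's exact test" if too_sparse else "chi-square test"


# Independent groups, all expected counts above 5.
print(choose_test(False, [6.6, 35.4, 17.4, 92.6]))  # chi-square test
```

The study design still has to tell you whether the data are paired; the code only applies the rule once that is known.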

Suppl: Procedure of chi-square test

1. Check assumptions
2. Compute the chi-square test statistic
3. Compute the degrees of freedom (df):
   df = (number of rows - 1) * (number of columns - 1)
4. Determine the significance level, alpha (α)
5. Find the look-up value from the chi-square table for the given df and alpha (α)
6. Report whether or not the two groups are significantly different by comparing the test statistic to the look-up value:
   significant if the test statistic > the look-up value

Suppl: Chi-square test statistic formula

$$\chi^2 = \sum \frac{(O - E)^2}{E}$$

where O is the observed count, E is the expected count for each cell, and the sum is taken over all four cells
e.g.)
                        Have caffeine-withdrawal headaches?
Caffeine relieved
headaches?              Yes     No      Total
Yes                     16      8       24
No                      26      102     128
Total                   42      110     152

Suppl: Steps to compute the chi-square statistic

1. Find the observed counts O
2. Compute the expected counts E:
   E = (row sum) * (column sum) / (total)
3. Compute the deviations of the observed counts from the expected counts:
   (deviation) = (observed count) - (expected count) = O - E
4. Determine the contribution to chi-square for each cell:
   (contribution to chi-square) = $\frac{(O - E)^2}{E}$
5. Compute the chi-square statistic:
   (test statistic) = $\chi^2 = \sum \frac{(O - E)^2}{E}$

Suppl: Observed/Expected counts

                        Have headaches?
Caffeine relieved
headaches?              Yes     No      Total
Yes                     16      8       24
No                      26      102     128
Total                   42      110     152

Headache/Relief   Observed counts (O)   Expected counts (E)
Yes/Yes cell:     16                    24 * 42 / 152 = 6.6
Yes/No cell:      26                    128 * 42 / 152 = 35.4
No/Yes cell:      8                     24 * 110 / 152 = 17.4
No/No cell:       102                   128 * 110 / 152 = 92.6
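
The rule E = (row sum) * (column sum) / (total) can be checked with a few lines of Python. This is a minimal sketch: the function name `expected_counts` is hypothetical, and the table is entered with rows "caffeine relieved headaches? Yes/No" and columns "have headaches? Yes/No", without the totals.

```python
def expected_counts(observed):
    """Expected count of each cell: row sum * column sum / total."""
    row_sums = [sum(row) for row in observed]
    col_sums = [sum(col) for col in zip(*observed)]
    total = sum(row_sums)
    return [[r * c / total for c in col_sums] for r in row_sums]


# Rows: relief Yes, relief No. Columns: headache Yes, headache No.
observed = [[16, 8],
            [26, 102]]

expected = expected_counts(observed)
print([[round(e, 1) for e in row] for row in expected])
# [[6.6, 17.4], [35.4, 92.6]]
```

All four expected counts are above 5, so the expected-count assumption of the chi-square test holds for this table.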

Suppl: Contributions to chi-square

                        Have headaches?
Caffeine relieved
headaches?              Yes     No      Total
Yes                     16      8       24
No                      26      102     128
Total                   42      110     152

Headache/Relief   Observed (O)   Expected (E)   Contrib. to chi-square
Yes/Yes cell:     16             6.6            (16 - 6.6)^2 / 6.6 = 13.4
Yes/No cell:      26             35.4           (26 - 35.4)^2 / 35.4 = 2.5
No/Yes cell:      8              17.4           (8 - 17.4)^2 / 17.4 = 5.1
No/No cell:       102            92.6           (102 - 92.6)^2 / 92.6 = 1.0

Suppl: Chi-square statistic

                        Have headaches?
Caffeine relieved
headaches?              Yes     No      Total
Yes                     16      8       24
No                      26      102     128
Total                   42      110     152

Headache/Relief   Contrib. to chi-square
Yes/Yes cell:     (16 - 6.6)^2 / 6.6 = 13.4
Yes/No cell:      (26 - 35.4)^2 / 35.4 = 2.5
No/Yes cell:      (8 - 17.4)^2 / 17.4 = 5.1
No/No cell:       (102 - 92.6)^2 / 92.6 = 1.0

Chi-square test statistic = 13.4 + 2.5 + 5.1 + 1.0 = 22
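
As a cross-check of the hand calculation, the sketch below sums the contributions (O - E)^2 / E over all cells without rounding the expected counts first. The function name `chi_square_statistic` is hypothetical. Because nothing is rounded along the way, the result is about 21.7 rather than exactly 22; the two agree once rounded to a whole number.

```python
def chi_square_statistic(observed):
    """Sum of (O - E)^2 / E over every cell of a contingency table."""
    row_sums = [sum(row) for row in observed]
    col_sums = [sum(col) for col in zip(*observed)]
    total = sum(row_sums)

    statistic = 0.0
    for i, row in enumerate(observed):
        for j, o in enumerate(row):
            e = row_sums[i] * col_sums[j] / total  # expected count
            statistic += (o - e) ** 2 / e          # contribution of this cell
    return statistic


# Rows: relief Yes, relief No. Columns: headache Yes, headache No.
observed = [[16, 8],
            [26, 102]]
print(round(chi_square_statistic(observed), 1))  # 21.7
```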

Suppl: Degrees of freedom

                        Have headaches?
Caffeine relieved
headaches?              Yes     No      Total
Yes                     16      8       24
No                      26      102     128
Total                   42      110     152

Degrees of freedom
= (number of rows - 1) * (number of columns - 1)
= (2 - 1) * (2 - 1) = 1

Suppl: Alpha (α); level of significance

- The level of significance, denoted alpha (α), is one of the key concepts in hypothesis testing: it specifies the probability level at which the observed evidence is considered too unlikely to be explained by chance alone
- A recommended standard, or decision criterion: α = 0.05
- Be cautious about the blind adoption of this level
- It can be adjusted when a multiple comparison correction has to be made


Suppl: Look-up value from chi-square table

The look-up value with df = 1 and α = 0.05 from the chi-square table is 3.84
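
The tabled value can be reproduced without a printed table in the special case df = 1, because a chi-square variable with one degree of freedom is the square of a standard normal variable. The sketch below relies on that fact and on Python's standard library only. The function name `chi2_lookup_df1` is hypothetical, and the shortcut does not apply to other degrees of freedom.

```python
from statistics import NormalDist


def chi2_lookup_df1(alpha=0.05):
    """Chi-square look-up (critical) value for df = 1 only.

    For df = 1 the upper-alpha chi-square value equals the square of the
    two-sided standard normal value that leaves alpha / 2 in each tail.
    """
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return z * z


print(round(chi2_lookup_df1(0.05), 2))  # 3.84
```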

Suppl: Report the result whether or not significant

Report whether or not the two groups are significantly different by comparing the test statistic to the look-up value:
significant if the test statistic > the look-up value

Computed statistic: 22
Look-up value with df = 1 and α = 0.05: 3.84
Since 22 > 3.84, the proportions of the two groups are significantly different.
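
Comparing the statistic with the look-up value is equivalent to comparing a p-value with α. The sketch below shows both views for df = 1. The function names are hypothetical. The p-value is not part of the slides; it is included here as an assumption-labelled addition, using the relationship between the chi-square distribution with one degree of freedom and the standard normal distribution.

```python
import math


def is_significant(statistic, lookup_value):
    """Decision rule from the slides: significant if statistic > look-up value."""
    return statistic > lookup_value


def p_value_df1(statistic):
    """Upper-tail probability of a chi-square statistic, valid for df = 1 only.

    P(chi-square > x) = P(|Z| > sqrt(x)) = erfc(sqrt(x / 2)) for a standard normal Z.
    """
    return math.erfc(math.sqrt(statistic / 2))


print(is_significant(22, 3.84))   # True
print(p_value_df1(22) < 0.05)     # True
```

Both checks lead to the same conclusion for the caffeine example: the difference in proportions is significant at α = 0.05.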

Review: Odds, Odds ratio

             Disease
Factor       Yes     No
Exposed      A       B
Unexposed    C       D

Odds = A/B : the odds in favor of having the disease with the factor
Odds = C/D : the odds in favor of having the disease without the factor

Odds ratio (OR) = $\frac{A/B}{C/D} = \frac{AD}{BC}$

Exercise
A group of researchers conducted a survey study about caffeine-withdrawal headache with 152 subjects. They compared the caffeine-withdrawal headache subgroup to the non-headache subgroup regarding various caffeine self-report items, including whether they agreed that caffeine relieved headaches. The result was presented in a 2x2 table as below:

                        Have caffeine-withdrawal headaches?
Caffeine relieved
headaches?              Yes     No      Total
Yes                     16      8       24
No                      26      102     128
Total                   42      110     152

Calculate the odds ratio associated with having headaches and being relieved by caffeine:
(16 / 8) / (26 / 102) = 7.85

The odds of believing that caffeine relieves headaches are about 8 times higher in the caffeine-withdrawal headache group than in the non-headache group.
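
The exercise can be verified with a short Python sketch. The function name `odds_ratio` and the argument order (a, b for the first row, c, d for the second row of a 2x2 table) are assumptions made for illustration.

```python
def odds_ratio(a, b, c, d):
    """Odds ratio of a 2x2 table with rows (a, b) and (c, d): (a/b) / (c/d)."""
    return (a / b) / (c / d)


# Rows: relief Yes (16, 8), relief No (26, 102).
print(round(odds_ratio(16, 8, 26, 102), 2))   # 7.85

# Reading the table by columns instead (headache Yes vs. headache No)
# gives the same value, because both equal (16 * 102) / (8 * 26).
print(round(odds_ratio(16, 26, 8, 102), 2))   # 7.85
```

The second call is why the result can be worded as odds of relief in the headache group versus the non-headache group, even though the arithmetic on the slide goes row by row.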

Review: CONFIDENCE INTERVAL

- A range of plausible values that tries to quantify the uncertainty
- A narrow interval implies high precision
- A wide interval implies poor precision
- For interpretation, check whether the interval contains the value of no change/effect
- The end points of the confidence interval are referred to as the upper and lower confidence limits.

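Review: 95% CONFIDENCE INTERVAL for a population odds ratio

- The sample odds ratio does not follow an approximately normal distribution
- Transform using Ln (natural log, log with base e)
- Standard error of Ln(OR) = $\sqrt{\frac{1}{A} + \frac{1}{B} + \frac{1}{C} + \frac{1}{D}}$
- 95% CI of Ln(OR) = $\ln(\mathrm{OR}) \pm 1.96 \sqrt{\frac{1}{A} + \frac{1}{B} + \frac{1}{C} + \frac{1}{D}}$
- 95% CI of OR = $\exp\left[\ln(\mathrm{OR}) \pm 1.96 \sqrt{\frac{1}{A} + \frac{1}{B} + \frac{1}{C} + \frac{1}{D}}\right]$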

95% Confidence Interval of OR

                        Have headaches?
Caffeine relieved
headaches?              Yes     No      Total
Yes                     16      8       24
No                      26      102     128
Total                   42      110     152

Odds ratio (OR) = (16 / 8) / (26 / 102) = 7.85

Standard error of Ln(OR) = $\sqrt{\frac{1}{16} + \frac{1}{8} + \frac{1}{26} + \frac{1}{102}}$

95% CI of Ln(OR) = $\ln(7.85) \pm 1.96 \sqrt{\frac{1}{16} + \frac{1}{8} + \frac{1}{26} + \frac{1}{102}}$

95% CI of OR = $\exp\left[\ln(7.85) \pm 1.96 \sqrt{\frac{1}{16} + \frac{1}{8} + \frac{1}{26} + \frac{1}{102}}\right]$

The 95% confidence interval of OR 7.85 is (3.03, 20.33), which does not contain the null value 1.
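
The log-transform calculation can be sketched in Python as follows. The function name `odds_ratio_ci` is hypothetical; the multiplier 1.96 and the standard error formula are the ones used on this slide. Because the code keeps the unrounded odds ratio instead of 7.85, the upper limit may differ from the slide in the last decimal place.

```python
import math


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio of a 2x2 table and its confidence interval via the log scale.

    Returns (odds_ratio, lower_limit, upper_limit).
    """
    odds_ratio = (a / b) / (c / d)
    # Standard error of Ln(OR): square root of the summed reciprocal cell counts.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    log_or = math.log(odds_ratio)
    lower = math.exp(log_or - z * se_log_or)
    upper = math.exp(log_or + z * se_log_or)
    return odds_ratio, lower, upper


or_value, lower, upper = odds_ratio_ci(16, 8, 26, 102)
print(round(or_value, 2), round(lower, 2), round(upper, 2))
print(lower <= 1 <= upper)   # False: the interval excludes the null value 1
```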

How to interpret?
OR = 1 means no increased risk
OR > 1 means an increased risk
OR < 1 means a protective effect
e.g. 1) OR = 0.56 in a study of the effect of low-dose aspirin on cardiovascular disease means a protective effect of taking aspirin
e.g. 2) OR = 3.9 in a study of the effect of helmet wearing on head injury while riding a bike means an increased risk from not wearing a helmet

How to check significance?


Check if the associated 95% confidence interval
contains 1
OR is not significant if its CI contains 1
OR is significant if CI does not contain 1
e.g. 1) OR = 0.48 with 95% confidence interval (0.26, 0.88) means a protective effect that is significant
e.g. 2) OR = 0.71 with 95% confidence interval (0.48, 1.04) means a protective effect that is not significant
e.g. 3) OR = 1.22 with 95% confidence interval (0.86, 1.73) means an increased risk that is not significant
e.g. 4) OR = 2.95 with 95% confidence interval (1.73, 3.51) means an increased risk that is significant
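
The reading rules for an odds ratio and its 95% confidence interval can be written as a small helper. This is a sketch for illustration only; the function name `describe_odds_ratio` and its wording of the result are assumptions, and the inputs are the four examples given on this slide.

```python
def describe_odds_ratio(odds_ratio, ci_lower, ci_upper):
    """Describe direction (OR vs. 1) and significance (does the CI contain 1?)."""
    if odds_ratio > 1:
        direction = "an increased risk"
    elif odds_ratio < 1:
        direction = "a protective effect"
    else:
        direction = "no increased risk"

    contains_null = ci_lower <= 1 <= ci_upper
    verdict = "not significant" if contains_null else "significant"
    return f"{direction} that is {verdict}"


examples = [(0.48, 0.26, 0.88), (0.71, 0.48, 1.04),
            (1.22, 0.86, 1.73), (2.95, 1.73, 3.51)]
for or_value, lower, upper in examples:
    print(or_value, (lower, upper), "->", describe_odds_ratio(or_value, lower, upper))
```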

Reference article 1
Gender and the functional outcome of elderly ischemic stroke patients. Mizrahi et al.
: To compare the two gender groups, they performed a t-test for continuous variables and a chi-square test for categorical variables

Reference article 2
Gender differences in non-motor symptoms (NMS) in early PD: A 2-yr follow-up study on previously untreated patients. Picillo et al.
: To compare NMS frequency between the baseline and the follow-up, they performed McNemar's test

Reference article 3
Depression in PD is related to a genetic polymorphism of the cannabinoid receptor gene (CNR1). Barrero et al.
: To compare genotype frequencies between those with and without depression, they performed Fisher's exact test
