Study Sheet: Chi-Square Hypothesis Tests for Association or Independence
Introduction
By now you should be familiar with how to use the χ² statistic in goodness-of-fit tests. The
idea in a goodness-of-fit test is to determine how well a sample set of outcomes matches an
expected set of outcomes. For example, if you rolled a six-sided die 60 times, you'd expect
to get 10 ones, 10 twos, and so on up to 10 sixes. The χ² statistic allows you to compare
your set of observations with this set of expectations, but this isn't its only use. The χ²
statistic can also be used to determine whether two categorical variables are associated in
some way.
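As a small illustration of the die example, the following Python sketch computes a goodness-of-fit statistic for 60 rolls. The observed counts are hypothetical, made up only for illustration; the expected count of 10 per face comes from the text.

```python
# Hypothetical observed counts for faces 1..6 in 60 rolls (made up for illustration).
observed = [8, 12, 9, 11, 10, 10]

# A fair die gives the same expected count for every face: 60 / 6 = 10.
n_rolls = sum(observed)
expected = [n_rolls / 6] * 6

# Chi-square statistic: sum of (observed - expected)^2 / expected over all faces.
chi_square = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

print(n_rolls, round(chi_square, 4))
```

A statistic near zero means the observed counts sit close to the expected counts; larger values mean a worse fit.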
Association and Independence
Categorical variables may be related or they may be independent. For example, gender
might be related to one's opinions about television programs, but it might not be related to
restaurant preferences. If knowing the values of one variable gives us information about the
values of another variable, we say the variables are related or associated. If knowing the
values of one variable gives us no information about the values of another variable, we say
the variables are not related or are not associated. We can also say they are independent.
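Computing the χ² Statistic
If two variables are independent, then observed values (O) tend to match expected values (E).
The χ² statistic measures how far the observed values fall from the expected values:
$$\chi^2 = \sum \frac{(O - E)^2}{E}$$
To compute χ²: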
Arrange your data into a two-variable table with one category running vertically and
the other horizontally.
Calculate the marginal distributions of each variable (that is, determine row and
column totals).
Calculate the expected values of each cell using the following formula:
$$\text{expected value} = \text{row total} \times \frac{\text{column total}}{\text{sample size}}$$
Calculate
$$\sum \frac{(\text{observed} - \text{expected})^2}{\text{expected}}$$
to get χ².
Calculate the degrees of freedom by multiplying the number of rows minus one by the
number of columns minus one:
df = (number of rows - 1)(number of columns - 1)
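The steps above can be sketched as a short Python function. This is a minimal sketch, not part of the original sheet: the function name `chi_square_table` and the 2 x 2 table used to exercise it are hypothetical.

```python
def chi_square_table(observed):
    """Return (expected, chi_square, df) for a two-way table given as a list of rows."""
    # Marginal distributions: row totals, column totals, and the sample size.
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)

    # Expected value of each cell = row total * (column total / sample size).
    expected = [[r * c / n for c in col_totals] for r in row_totals]

    # Sum (observed - expected)^2 / expected over every cell.
    chi_square = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(observed, expected)
        for o, e in zip(obs_row, exp_row)
    )

    # Degrees of freedom = (rows - 1)(columns - 1).
    df = (len(observed) - 1) * (len(observed[0]) - 1)
    return expected, chi_square, df


# Hypothetical 2 x 2 table, used only to exercise the function.
expected, chi_square, df = chi_square_table([[10, 20], [20, 10]])
print(expected, round(chi_square, 3), df)
```

In this hypothetical table every row and column total is 30, so every expected count is 15.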
_________________________________
Copyright 2011 Apex Learning Inc. (See Terms of Use at www.apexvs.com/TermsOfUse)
AP Statistics
Hypothesis Tests for Independence
Hypothesis tests for independence are done in much the same way as other hypothesis
tests, except that the null hypothesis is usually stated in words, rather than symbolically. To
conduct a hypothesis test for independence, follow these steps:
Calculate the test statistic
$$\chi^2 = \sum \frac{(O - E)^2}{E}.$$
Find the p-value by using a table or by using the χ²cdf function on the
TI-83/TI-84 or Chi square Cdf on the TI-89 calculator. Although you'll probably use your
calculator for your calculations, you should know how to calculate this statistic and find
your p-value without it.
State your conclusion: Be sure to state your conclusion in terms of the hypothesis
being tested. Use your p-value to determine the likelihood that your χ² statistic is the
result of chance alone if the variables are independent.
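For readers without a table or calculator at hand, the p-value step can be sketched in Python using only the standard library. The function name `chi_square_p_value` is hypothetical. It returns the upper-tail area of the chi-square distribution, which is the quantity a table lookup approximates, and it builds that area from the standard recurrence for the regularized upper incomplete gamma function.

```python
import math


def chi_square_p_value(chi_square, df):
    """Upper-tail probability P(X >= chi_square) for df degrees of freedom."""
    if chi_square <= 0:
        return 1.0
    z = chi_square / 2.0
    if df % 2 == 0:
        # Starting point for even df: Q(1, z) = exp(-z), the df = 2 case.
        a, q = 1.0, math.exp(-z)
    else:
        # Starting point for odd df: Q(1/2, z) = erfc(sqrt(z)), the df = 1 case.
        a, q = 0.5, math.erfc(math.sqrt(z))
    # Recurrence: Q(a + 1, z) = Q(a, z) + z^a * exp(-z) / Gamma(a + 1).
    while a < df / 2.0:
        q += z ** a * math.exp(-z) / math.gamma(a + 1.0)
        a += 1.0
    return q


# Familiar table entries as a sanity check: these should come out near 0.05.
print(chi_square_p_value(3.841, 1))
print(chi_square_p_value(5.991, 2))
```

A small p-value means a statistic this large would rarely occur by chance alone if the variables were independent.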
Key Terms
Two-way table: A two-way table compares two categorical variables. The rows of the table
take on the values of the row variable and the columns take on the values of the column
variable. We usually say there are r rows and c columns. Thus a two-way table can be
described as an r × c table.
Chi-Square Test for Independence or Association: We use the χ² statistic to test
hypotheses when we want to determine whether two categorical variables are independent.
The variables are independent if knowledge of the values of one variable gives no
information about the values of the other. If they aren't independent, we say they're
associated or related.
Worked Examples
Performing a Chi-Square Significance Test of Association
You want to see whether there's a relationship between gender and preferred exercise
routine, so you collect the following data:
A.
B.
C.
D.
          Cardio   Weights   Both   Neither
Male      43       65        47     23
Female    72       24        43     18
E.
F.
G.
H.
Answers
A.
H0: The variables are independent; there is no relationship between gender and preferred
exercise routine.
Ha: The variables are not independent; there is a relationship between gender and
preferred exercise routine.
B.
          Cardio   Weights   Both   Neither   TOTAL
Male      43       65        47     23        178
Female    72       24        43     18        157
TOTAL     115      89        90     41        335
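The marginal totals in this table can be checked with a few lines of Python. This is a sketch for verification only; the dictionary layout and variable names are illustrative choices, not part of the original sheet.

```python
# Observed counts from the worked example, keyed by gender and exercise preference.
observed = {
    "Male": {"Cardio": 43, "Weights": 65, "Both": 47, "Neither": 23},
    "Female": {"Cardio": 72, "Weights": 24, "Both": 43, "Neither": 18},
}

# Row totals: one total per gender.
row_totals = {gender: sum(counts.values()) for gender, counts in observed.items()}

# Column totals: one total per exercise preference.
categories = list(observed["Male"])
col_totals = {c: sum(observed[g][c] for g in observed) for c in categories}

# Grand total: the sample size.
grand_total = sum(row_totals.values())

print(row_totals, col_totals, grand_total)
```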
C.
          Cardio                   Weights                  Both                     Neither                  TOTAL
Male      (115/335)(178) = 61.1    (89/335)(178) = 47.29    (90/335)(178) = 47.82    (41/335)(178) = 21.79    178
Female    (115/335)(157) = 53.9    (89/335)(157) = 41.71    (90/335)(157) = 42.18    (41/335)(157) = 19.22    157
TOTAL     115                      89                       90                       41                       335
D.
All the expected values are greater than 5, so a chi-square hypothesis test of independence
will be valid for this situation.
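The expected counts and the "greater than 5" check can be reproduced with the sketch below. The variable names are illustrative. Because the code keeps full precision, its values can differ from the sheet's rounded entries in the last digit.

```python
# Observed counts from the worked example (columns: Cardio, Weights, Both, Neither).
observed = [
    [43, 65, 47, 23],  # Male
    [72, 24, 43, 18],  # Female
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# Expected count = (column total / sample size) * row total, as in the table for part C.
expected = [[col / n * row for col in col_totals] for row in row_totals]

# Condition from part D: every expected count should be greater than 5.
all_above_five = all(e > 5 for row in expected for e in row)

for row in expected:
    print([round(e, 2) for e in row])
print(all_above_five)
```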
E.
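$$\chi^2 = 25.76$$
$$\chi^2 = \sum \frac{(O - E)^2}{E} = \frac{(72 - 53.9)^2}{53.9} + \frac{(43 - 61.1)^2}{61.1} + \frac{(24 - 41.71)^2}{41.71} + \frac{(65 - 47.29)^2}{47.29} + \frac{(43 - 42.18)^2}{42.18} + \frac{(47 - 47.82)^2}{47.82} + \frac{(18 - 19.22)^2}{19.22} + \frac{(23 - 21.79)^2}{21.79}$$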
F.
df = 3
degrees of freedom = (number of rows - 1)(number of columns - 1) = (2 - 1)(4 - 1)
= (1)(3) = 3
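As a check on parts E and F, the statistic and the degrees of freedom can be recomputed from the observed counts. This sketch uses illustrative variable names and unrounded expected counts, so its result (about 25.77) can differ from the sheet's reported value in the last digit.

```python
# Observed counts from the worked example (columns: Cardio, Weights, Both, Neither).
observed = [
    [43, 65, 47, 23],  # Male
    [72, 24, 43, 18],  # Female
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)
expected = [[col / n * row for col in col_totals] for row in row_totals]

# Sum (O - E)^2 / E over all eight cells.
chi_square = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
)

# Degrees of freedom = (rows - 1)(columns - 1) = (2 - 1)(4 - 1).
df = (len(observed) - 1) * (len(observed[0]) - 1)

print(round(chi_square, 2), df)
```

Most of the total comes from the Cardio and Weights columns, where the observed and expected counts differ by about 18 in each cell.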
G.
p < .0005
Find 3 degrees of freedom in the left column of the table and move across until you find the
range containing your χ² statistic. For 3 degrees of freedom, the highest value of χ² on the
table is 17.73 (p = .0005), so your p-value is less than .0005.
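The table only bounds the p-value, but for 3 degrees of freedom the upper-tail area has a closed form that can be evaluated directly. The sketch below uses Python's standard `math` module and the statistic reported in part E; the variable names are illustrative.

```python
import math

chi_square = 25.76  # statistic reported in part E
df = 3

# For df = 3 the upper-tail probability has the closed form
#   P(X >= x) = erfc(sqrt(x / 2)) + sqrt(2 * x / pi) * exp(-x / 2).
p_value = (
    math.erfc(math.sqrt(chi_square / 2))
    + math.sqrt(2 * chi_square / math.pi) * math.exp(-chi_square / 2)
)

# The same formula at the table's largest df = 3 entry should give about .0005.
p_at_table_max = (
    math.erfc(math.sqrt(17.73 / 2))
    + math.sqrt(2 * 17.73 / math.pi) * math.exp(-17.73 / 2)
)

print(p_value, p_at_table_max)
```

The computed p-value is on the order of 10^-5, comfortably below the .0005 bound read from the table.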
H.
We have evidence, since p < α, that allows us to reject the null hypothesis, which stated that
the variables are independent. If the variables were independent, a χ² statistic of 25.76
or larger would occur due to chance alone less than .05% of the time; therefore, we can
reasonably claim that the variables are associated.