AP Statistics

Study Sheet: Chi-Square Hypothesis Tests for Association or Independence
Introduction
By now you should be familiar with how to use the χ² statistic in goodness-of-fit tests. The
idea in a goodness-of-fit test is to determine how well a sample set of outcomes matches an
expected set of outcomes. For example, if you rolled a six-sided die 60 times, you'd expect
to get 10 ones, 10 twos, and so on up to 10 sixes. The χ² statistic allows you to compare
your set of observations with this set of expectations, but this isn't its only use. The χ²
statistic can also be used to determine whether two categorical variables are associated in
some way.
Association and Independence
Categorical variables may be related or they may be independent. For example, gender
might be related to one's opinions about television programs, but it might not be related to
restaurant preferences. If knowing the values of one variable gives us information about the
values of another variable, we say the variables are related or associated. If knowing the
values of one variable gives us no information about the values of another variable, we say
the variables are not related or are not associated. We can also say they are independent.
Computing the X2 Statistic
If two variables are independent, then observed values tend to match expected values. If
they match exactly, then

$$\chi^2 = \sum \frac{(O - E)^2}{E} = 0,$$

where O is an observed value and E is the corresponding expected value. We decide whether
sample values are the result of chance by looking at the extent to which χ² differs from 0.

To compute χ²:
1. Arrange your data into a two-variable table with the categories of one variable running
   vertically and the categories of the other running horizontally.
2. Calculate the marginal distributions of each variable (that is, determine row and
   column totals).
3. Calculate the expected value of each cell using the following formula:

   $$\text{expected value} = \frac{\text{column total}}{\text{sample size}} \times \text{row total}$$

4. Calculate $\frac{(\text{observed} - \text{expected})^2}{\text{expected}}$ for each cell of
   the table. Add these values together to get χ².
5. Calculate the degrees of freedom by multiplying the number of rows minus one by the
   number of columns minus one:
   df = (number of rows - 1)(number of columns - 1)
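The computation described above can be sketched in Python. This is a minimal illustration rather than part of the original study sheet: the function name `chi_square_statistic` and the small 2 × 2 table are hypothetical, and the counts were chosen so that the observed values match the expected values exactly.

```python
def chi_square_statistic(table):
    """Return (expected values, chi-square statistic, degrees of freedom).

    `table` is a list of rows of observed counts (a two-way table).
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)  # sample size

    # expected value = (column total / sample size) * row total,
    # written here as row total * column total / sample size
    expected = [[r * c / n for c in col_totals] for r in row_totals]

    # add (observed - expected)^2 / expected over every cell
    chi2 = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
    )

    # df = (number of rows - 1)(number of columns - 1)
    df = (len(table) - 1) * (len(col_totals) - 1)
    return expected, chi2, df


# Hypothetical counts: both rows have the same proportions across columns.
observed = [[10, 20],
            [30, 60]]
expected, chi2, df = chi_square_statistic(observed)
print(expected, chi2, df)
```

Because every observed count equals its expected value in this made-up table, the statistic is 0 with 1 degree of freedom, which is the "match exactly" case described earlier.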

_________________________________
Copyright 2011 Apex Learning Inc. (See Terms of Use at www.apexvs.com/TermsOfUse)

Hypothesis Tests for Independence
Hypothesis tests for independence are done in much the same way as other hypothesis
tests, except that the null hypothesis is usually stated in words, rather than symbolically. To
conduct a hypothesis test for independence, follow these steps:

1. State your null and alternative hypotheses:
   H0: The variables are independent, or there is no association between the variables.
   Ha: The variables are not independent, or there is an association between the variables.
2. Justify your use of the test: The expected value for every cell of your table must be
   at least five for the chi-square test to be valid.
3. Compute your test statistic and p-value: The chi-square test statistic is computed as

   $$\chi^2 = \sum \frac{(O - E)^2}{E}.$$

   Find the p-value by using a table or by using the χ²cdf function on the
   TI-83/TI-84 or Chi square Cdf on the TI-89 calculator. Although you'll probably use your
   calculator for your calculations, you should know how to calculate this statistic and find
   your p-value without it.
4. State your conclusion: Be sure to state your conclusion in terms of the hypothesis
   being tested. Use your p-value to determine the likelihood that your χ² statistic is the
   result of chance alone if the variables are independent.
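For readers without a calculator at hand, the p-value step can be approximated in Python. The sketch below is an illustration, not the study sheet's method: the function name `chi_square_p_value` is hypothetical, and it relies on the standard closed-form expressions for the upper-tail area of a chi-square distribution with a whole-number df.

```python
import math


def chi_square_p_value(statistic, df):
    """Upper-tail area P(X >= statistic) for a chi-square distribution
    with `df` degrees of freedom (df must be a positive integer)."""
    if statistic <= 0:
        return 1.0
    half = statistic / 2
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum_{k=0}^{df/2 - 1} (x/2)^k / k!
        term, total = 1.0, 1.0
        for k in range(1, df // 2):
            term *= half / k
            total += term
        return math.exp(-half) * total
    # Odd df: erfc(sqrt(x/2))
    #         + exp(-x/2) * sum_{k=1}^{(df-1)/2} (x/2)^(k - 1/2) / Gamma(k + 1/2)
    total = 0.0
    for k in range(1, (df - 1) // 2 + 1):
        total += half ** (k - 0.5) / math.gamma(k + 0.5)
    return math.erfc(math.sqrt(half)) + math.exp(-half) * total


# Familiar 5% critical values should give p-values close to .05.
for stat, df in [(3.841, 1), (5.991, 2), (7.815, 3)]:
    print(df, round(chi_square_p_value(stat, df), 4))
```

A small p-value means a statistic this large would rarely arise by chance alone if the variables were independent.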

Key Terms
Two-way table: A two-way table compares two categorical variables. The rows of the table
take on the values of the row variable and the columns take on the values of the column
variable. We usually say there are r rows and c columns. Thus a two-way table can be
described as an r × c table.
Chi-Square Test for Independence or Association: We use the χ² statistic to test
hypotheses when we want to determine whether two categorical variables are independent.
The variables are independent if knowledge of the values of one variable gives no
information about the values of the other. If they aren't independent, we say they're
associated or related.
Worked Examples
Performing a Chi-Square Significance Test of Association
You want to see whether there's a relationship between gender and preferred exercise
routine, so you collect the following data:

          Cardio    Weights    Both    Neither
Male        43        65        47       23
Female      72        24        43       18

A. Set up your null and alternative hypotheses.
B. Compute the marginal distributions for each variable (row and column totals).
C. Compute the expected values for each cell.
D. Justify your use of a chi-square hypothesis test of independence. Are all necessary
   conditions met?

E. Compute your test statistic (χ²).
F. Determine your degrees of freedom.
G. Find your p-value in a table.
H. What conclusion would you draw if α = .05?

Answers
A.
H0: The variables are independent; there is no relationship between gender and preferred
exercise routine.
Ha: The variables are not independent; there is a relationship between gender and
preferred exercise routine.
B.
          Cardio    Weights    Both    Neither    TOTAL
Male        43        65        47       23        178
Female      72        24        43       18        157
TOTAL      115        89        90       41        335

C.
          Cardio                   Weights                  Both                     Neither                  TOTAL
Male      (115/335)(178) = 61.1    (89/335)(178) = 47.29    (90/335)(178) = 47.82    (41/335)(178) = 21.79    178
Female    (115/335)(157) = 53.9    (89/335)(157) = 41.71    (90/335)(157) = 42.18    (41/335)(157) = 19.22    157
TOTAL     115                      89                       90                       41                       335

D.
All the expected values are greater than 5, so a chi-square hypothesis test of independence
will be valid for this situation.
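The marginal totals, the expected values in part C, and the condition check in part D can be reproduced with a short Python sketch. The variable names are illustrative only. The code keeps full precision before rounding, so a printed value may differ from the hand-rounded table in the last digit.

```python
# Observed counts from the worked example
# (rows: Male, Female; columns: Cardio, Weights, Both, Neither).
observed = [[43, 65, 47, 23],
            [72, 24, 43, 18]]

row_totals = [sum(row) for row in observed]        # row marginal totals
col_totals = [sum(col) for col in zip(*observed)]  # column marginal totals
n = sum(row_totals)                                # sample size

# expected value = (column total / sample size) * row total
expected = [[r * c / n for c in col_totals] for r in row_totals]

for label, row in zip(["Male", "Female"], expected):
    print(label, [round(e, 2) for e in row])

# Condition from part D: every expected value must be at least 5.
condition_met = all(e >= 5 for row in expected for e in row)
print("All expected values >= 5:", condition_met)
```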
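E.
χ² = 25.76

$$
\begin{aligned}
\chi^2 &= \sum \frac{(O - E)^2}{E} \\
&= \frac{(43 - 61.1)^2}{61.1} + \frac{(65 - 47.29)^2}{47.29} + \frac{(47 - 47.82)^2}{47.82} + \frac{(23 - 21.79)^2}{21.79} \\
&\quad + \frac{(72 - 53.9)^2}{53.9} + \frac{(24 - 41.71)^2}{41.71} + \frac{(43 - 42.18)^2}{42.18} + \frac{(18 - 19.22)^2}{19.22} \\
&= 5.36 + 6.63 + 0.014 + 0.0672 + 6.078 + 7.52 + 0.0159 + 0.07744 \\
&= 25.76254 \approx 25.76
\end{aligned}
$$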
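As a cross-check on the hand calculation in part E, the per-cell contributions can be computed in Python. This is a sketch with illustrative variable names. It uses unrounded expected values, so the total comes out near 25.77 rather than the 25.76 obtained from rounded values.

```python
# Observed counts from the worked example
# (rows: Male, Female; columns: Cardio, Weights, Both, Neither).
observed = [[43, 65, 47, 23],
            [72, 24, 43, 18]]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

chi2 = 0.0
for i, r in enumerate(row_totals):
    for j, c in enumerate(col_totals):
        e = r * c / n                                # expected value for this cell
        contribution = (observed[i][j] - e) ** 2 / e  # (O - E)^2 / E
        chi2 += contribution
        print(f"row {i}, column {j}: {contribution:.4f}")

print(f"chi-square = {chi2:.2f}")
```

The two Cardio cells and the two Weights cells account for nearly all of the statistic, which shows where the observed counts depart most from what independence would predict.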

F.
df = 3
degrees of freedom = (number of rows - 1)(number of columns - 1) = (2 - 1)(4 - 1)
= (1)(3) = 3
G.
p < .0005
Find 3 degrees of freedom in the left column of the table and move across until you find the
range containing your χ² statistic. For 3 degrees of freedom, the highest value of χ² on the
table is 17.73 (p = .0005), so your p-value is less than .0005.
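The table only bounds the p-value. A short Python sketch can estimate it directly. It assumes the standard closed-form upper-tail area for a chi-square distribution with exactly 3 degrees of freedom, so it does not apply to other values of df.

```python
import math

chi2 = 25.76  # test statistic from part E (df = 3 from part F)

# For df = 3 the upper-tail area has a closed form:
# P(X >= x) = erfc(sqrt(x / 2)) + sqrt(2x / pi) * exp(-x / 2)
p_value = (math.erfc(math.sqrt(chi2 / 2))
           + math.sqrt(2 * chi2 / math.pi) * math.exp(-chi2 / 2))

print(f"p-value = {p_value:.2e}")
print("p < .0005:", p_value < 0.0005)
```

The result is on the order of 0.00001, consistent with the table-based bound of p < .0005.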
H.
Because p < α (p < .0005 and α = .05), we have evidence that allows us to reject the null
hypothesis, which stated that the variables are independent. If the variables were
independent, a χ² statistic of 25.76 or larger would occur due to chance alone less than
.05% of the time. We can therefore reasonably claim that gender and preferred exercise
routine are associated.
