
Cross-tabulation Example

By Dr. Jim Cox

To decide whether to run a cross-tabulation, look at your objectives and hypotheses. The
number of cross-tabulations for each objective will vary; there is no "right" number to run.
Demographics are common variables to use in cross-tabulations. For example, if we wanted to find
out which students are most likely to buy a certain item, we could cross-tabulate class rank with
purchase of that item.

A cross-tabulation is done to determine if there is a relationship between two variables. For
example, if we want to determine who is most likely to purchase from a business, a
cross-tabulation could be run between those who said they would purchase from the business and
demographic variables. This shows the characteristics of the people most likely to buy from
the business, which the business can use to target certain groups.

In the example below we have two variables, V170: gender and V40: type of atmosphere preferred
in a bar. The values for V170 are male and female. The values for V40 are Yes - prefer sports
theme and No - do not prefer sports theme. Thus we are going to see if there is a relationship
between gender and whether a sports theme is preferred for a bar.

First we have to determine the direction of the relationship. In order to do this we have to
determine which variable is the "dependent" variable and which variable is the "independent"
variable. In other words, what influences what? In this case it makes sense that gender
would influence bar theme preference and not the other way around. This makes V170: gender
the independent variable and V40: sports bar theme preference the dependent variable.

In calculating percentages for a cross-tabulation, we follow the rule: CALCULATE
PERCENTAGES ACROSS THE DEPENDENT VARIABLE. In our example below, the categories of V40 run
down each column rather than across a row. Thus we want to calculate percentages down the
column, i.e., across the dependent variable, so we are looking for column percentages. We
can find these in the cross-tabulation by taking the third number in each cell. The legend for the
numbers is in the top left corner of the cross-tabulation: the top number is the count, the second
is the row percentage, the third is the column percentage, and the fourth is the total
percentage. Reading the column percentages, we find that 50.5 percent of males prefer a sports
theme, compared with 24.8 percent of females. Thus it appears that males are approximately
twice as likely as females to prefer a sports bar theme. Now we know the direction of the
relationship.
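
As a rough illustration of the column-percentage rule, the sketch below computes percentages within each gender group. The counts are hypothetical (the original table's counts are not reproduced here), and the function name `column_percentages` is an assumption made for this example. The last lines only redo the arithmetic on the two percentages quoted in the text.

```python
# Hypothetical counts for illustration only; these are NOT the counts
# from the cross-tabulation discussed in the text.
# Outer keys: independent variable (gender, one column each).
# Inner keys: dependent variable (prefers a sports theme).
counts = {
    "Male": {"Yes": 50, "No": 50},
    "Female": {"Yes": 25, "No": 75},
}


def column_percentages(table):
    """Percentage of each column (gender group) falling in each category."""
    result = {}
    for group, cells in table.items():
        column_total = sum(cells.values())
        result[group] = {
            category: 100.0 * count / column_total
            for category, count in cells.items()
        }
    return result


col_pct = column_percentages(counts)
for group, cells in col_pct.items():
    print(group, {category: round(pct, 1) for category, pct in cells.items()})

# Arithmetic on the percentages quoted in the text: 50.5% vs. 24.8%.
ratio_from_text = 50.5 / 24.8
print(round(ratio_from_text, 2))
```

Each column sums to 100 percent, which is a quick way to confirm that the percentages were taken in the intended direction.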

[Cross-tabulation table: V170 gender (independent variable) by V40 sports theme preference
(dependent variable)]

Now we have to determine whether we can trust these results. Could the results have happened by
chance? To find out, we examine the Pearson Chi-Square results. The reported significance level
(the p-value, labeled alpha here) is .00018. Since one minus the significance level is the
confidence level, we have a confidence level of 1 - .00018 = .99982, or 99.98 percent. Our
confidence level is over 95 percent, which is good. This will happen whenever the significance
level is less than or equal to .05.
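
For readers who want to see where a Pearson Chi-Square significance value comes from, here is a minimal sketch using only the Python standard library. The 2x2 counts are hypothetical, not the counts from the example, and the helper names are assumptions. The p-value shortcut with `math.erfc` is valid only for one degree of freedom, which is the case for a 2x2 table.

```python
import math

# Hypothetical 2x2 counts for illustration only (not the article's data).
# Rows: prefers sports theme (Yes, No). Columns: gender (Male, Female).
observed = [
    [50, 25],
    [50, 75],
]


def pearson_chi_square(table):
    """Pearson chi-square statistic, without continuity correction."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, obs in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (obs - expected) ** 2 / expected
    return chi2


def p_value_df1(chi2):
    """Upper-tail probability of a chi-square value with 1 degree of freedom."""
    return math.erfc(math.sqrt(chi2 / 2.0))


chi2 = pearson_chi_square(observed)
p = p_value_df1(chi2)
print(round(chi2, 4), p, p <= 0.05)

# The arithmetic from the text, using the reported significance of .00018.
reported_significance = 0.00018
confidence = 1 - reported_significance
print(round(confidence * 100, 2))
```

For tables larger than 2x2 the degrees of freedom exceed one, so a statistics package would be needed for the p-value in place of the `erfc` shortcut.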

Next we have to determine how strong the relationship is between the two variables. In order to
check this we need to look at the Cramer's V and the Contingency Coefficient values. The values
in our example are .26664 and .25764 respectively. We have indicated that values above .35
would be considered "strong," values between .25 and .35 would be considered "moderately
strong," and values below .25 would be considered "weak." In our example, both values indicate a
"moderately strong" relationship between gender and preference for a sports bar.
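
The two strength measures can be sketched with their standard formulas: Cramer's V = sqrt(chi2 / (n * min(rows - 1, columns - 1))) and Contingency Coefficient = sqrt(chi2 / (chi2 + n)). The function names below are assumptions for illustration, and treating exactly .25 and .35 as "moderately strong" is also an assumption, since the text does not say which label the boundary values get. For a 2x2 table the two measures are linked by C = V / sqrt(1 + V^2), which the last lines check against the values reported in the text.

```python
import math


def cramers_v(chi2, n, n_rows, n_cols):
    """Cramer's V from a chi-square statistic and the table dimensions."""
    return math.sqrt(chi2 / (n * min(n_rows - 1, n_cols - 1)))


def contingency_coefficient(chi2, n):
    """Pearson's contingency coefficient from a chi-square statistic."""
    return math.sqrt(chi2 / (chi2 + n))


def strength_label(value):
    """Label a strength value using the cutoffs given in the text.

    Assumption: the boundary values .25 and .35 count as "moderately strong".
    """
    if value > 0.35:
        return "strong"
    if value >= 0.25:
        return "moderately strong"
    return "weak"


# Values reported in the text.
v_reported = 0.26664
c_reported = 0.25764
print(strength_label(v_reported), "/", strength_label(c_reported))

# Consistency check for a 2x2 table: C = V / sqrt(1 + V^2).
c_from_v = v_reported / math.sqrt(1 + v_reported ** 2)
print(round(c_from_v, 5))
```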

So in summary we have shown that there is a moderately strong relationship between gender
and preference for a sports bar with at least 95% confidence in the findings.

What would we recommend for a strategy based on these findings? Since males are much more
likely than females to prefer the sports bar theme, the business should direct its marketing
efforts to males and ask them to bring their significant others.

If we had found that the significance level was above .05 or that the Cramer's V (or the
Contingency Coefficient) was less than .25, we would not be able to indicate that there was a
relationship between gender and sports bar preference. If we were to get a result that was not
significant (alpha > .05) but was strong (Cramer's V > .35), we might want to do further
investigation before making a marketing recommendation.
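
The decision logic in this last paragraph can be summarized as a small function. This is only one possible reading of the rules: the function name `interpret`, its return strings, and the choice to check the "not significant but strong" case before falling back to "no relationship" are assumptions, not something the text specifies.

```python
def interpret(significance, cramers_v, alpha=0.05, weak_below=0.25, strong_above=0.35):
    """Combine significance and strength using the cutoffs from the text.

    Hypothetical helper: the labels returned here are illustrative wording.
    """
    significant = significance <= alpha
    if significant and cramers_v >= weak_below:
        return "relationship supported"
    if not significant and cramers_v > strong_above:
        return "investigate further"
    return "no relationship indicated"


# The example from the text: significance .00018, Cramer's V .26664.
print(interpret(0.00018, 0.26664))

# Hypothetical inputs showing the other two outcomes.
print(interpret(0.20, 0.40))
print(interpret(0.20, 0.10))
```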
