a. Examples: Gender (Boy/Girl); Culture (European, African, Asian).
b. What is actually analyzed are the raw frequencies, i.e., the numbers of boys
and girls, or of Europeans, Africans, and Asians.
2. There are three statistical procedures applied to analyze nominal data.
a. Chi-Square Test for Goodness of Fit
b. Chi-Square Test of Independence (Also called Association)
c. Chi-Square Test of Homogeneity
B. Chi-Square Introduction
1. Chi-square has a distribution but is classified as a nonparametric statistical test
by tradition (Daniel, 1990, pp. 178-179). Nonparametric tests are distribution-
free statistics which are used when a population's (or a sample's) distribution
is substantially non-normal and parametric tests are not suitable.
4. The null (or statistical) hypothesis (Ho) states either that there is "no
difference between observed and expected frequencies" (Goodness of Fit),
that "the variables or samples are not related, i.e., are independent" (Test of
Independence), or that the samples are homogeneous (Test of Homogeneity).
a. As normally applied, the χ2 test is not directed at any specific alternative
hypothesis (Reynolds, 1984, p. 22).
6. Assumptions
a. The distribution is not symmetric.
b. Chi-square values can be zero or positive but never negative.
c. The chi-square distribution is different for each degree of freedom. As the
number of degrees of freedom increases, the distribution approaches the
standard normal curve (SNC).
d. Degrees of freedom (df) vary depending on the chi-square test being used.
e. Measurements are independent of each other. Before-and-after frequency
counts of the same subjects cannot be analyzed using χ 2 .
7. Chi-Square Issues
a. Expected frequency cell size
(1) The greater the numerical difference between observed and expected
frequencies within cells, the more likely is a statistically significant χ2.
Cells with expected frequencies less than 5 may yield an inflated χ2.
(2) There is considerable debate concerning small cell expected
frequencies (Daniel, 1990, p. 185). Various authors have proposed
guidelines depending on the specific test.
(3) Welkowitz, Ewen, & Cohen (1991, p. 292) offer general advice:
(a) for df = 1, all expected frequencies should be at least 5;
(b) for df = 2, all expected frequencies should be at least 3; and
(c) for df = 3, all but one expected frequency value should equal 5.
(4) If cells are to be combined, then there should be a logical reason for
any combination effected, otherwise interpretation is adversely
affected. Morehouse and Stull (1975, pp. 320-321) advocate the use of
Yates' Correction for Continuity when expected cell sizes are less than
5. However, Daniel (1990, p. 187) reports that based on research there
is a trend away from applying Yates' Correction. Spatz (2001, p. 293)
advises against using the Yates correction.
b. Percentages
(1) There is some disagreement among authors as to whether
percentages may be included in chi-square computations.
(2) Reynolds (1984, p. 36) argues for 2 X 2 contingency tables that
“percentages permit one to detect patterns of departure from
independence...and [that] percentages are particularly useful in 2 X 2
tables." However, Morehouse and Stull (1975, p. 320) disagree, "the
direct numerical counts or raw data should always be used as a basis
for the calculation of chi-square. Percentages and ratios are not
independent, and consequently, their use will result in errors in
conclusions."
(3) Daniel (1990), Siegel (1956), Welkowitz, Ewen, & Cohen (1991), and
Udinsky, Osterlind, & Lynch (1981) are silent on the subject, but none
of their examples contain percentages.
χ2 = Σ[(O − E)2 / E]
where: χ2 = chi-square value; O = observed frequencies; E = expected
frequencies
d. For the Goodness of Fit test, there are no measures of association, as it is a
one-variable test.
(1) We are concerned only with whether or not the observed data “fit” the
expected.
(2) One reason the χ2 Goodness of Fit test is not used as a measure of
association is that its magnitude depends on sample size (i.e., large
samples yield large computed χ2 values, and the larger the computed χ2,
the greater the chance of rejecting the null hypothesis).
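The goodness-of-fit computation, and its sensitivity to sample size, can be sketched in Python (the observed and expected counts below are hypothetical, not the chapter's data table):

```python
# Chi-square goodness of fit: chi2 = sum of (O - E)^2 / E over all cells.
def chi_square_gof(observed, expected):
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Hypothetical frequency counts (not the chapter's data table).
observed = [30, 14, 34, 45, 57, 20]
expected = [20, 20, 20, 40, 60, 40]
chi2 = chi_square_gof(observed, expected)  # 27.375

# Doubling every count doubles chi-square, which is why a large
# sample alone makes rejecting the null hypothesis more likely.
chi2_doubled = chi_square_gof([2 * o for o in observed],
                              [2 * e for e in expected])
```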
w = √[ Σi=1..m (P1i − P0i)2 / P0i ]
(c) Substitute into Formula 13.1: See Data Table where χ 2 = 13.908
(d) Critical Value: 7.815 at α = .05 & df = 3 (Triola 1998, p. 716)
(11) Effect Size Estimate: In this case an effect size could be computed and
then interpreted using Cohen’s (1988, pp. 224-225) criteria.
χ2 = Σ[(O − E)2 / E]
b. Measure of Association: The phi φ Coefficient
(1) The phi coefficient is a symmetric index with a range between 0 and 1,
where zero equals statistical independence.
(a) Maximum value is attained only under strict perfect association.
(2) Characteristics
(a) Value varies between “0” and “1.”
(b) Where no association is present, C = 0.
(c) When r = c, a perfect correlation is indicated by C = 1.
(d) When r ≠ c, a perfect correlation may not be indicated even
when C = 1.
C = √[ χ2 / (n(t − 1)) ]
(2) CC will have the same value regardless of how the categories are
arranged in the columns or rows.
(3) Formula 13.6 The Contingency Coefficient (Siegel, 1956, p. 196)
CC = √[ χ2 / (N + χ2) ]
(9) Apply Decision Rule: Since χ 2 = 1.174 is < 3.841, retain Ho: ρ1 = ρ2
as p > .05.
χ2 = N(AD − BC)2 / [(A + B)(C + D)(A + C)(B + D)]
χ 2 = 2.34
(d) Critical Value: 3.841 at α = .05 & df = 1 (Triola 1998, p. 716).
(9) Apply Decision Rule: Since χ 2 = 2.34 is < 3.841, retain Ho: ρ1 = ρ2, as
p > .05.
(8) Compute the test statistic and select the relevant critical value(s).
(a) Data Table
χ2 = N(AD − BC)2 / [(A + B)(C + D)(A + C)(B + D)]
χ2 = 100(15 • 22 − 28 • 35)2 / [(43)(57)(50)(50)]
χ2 = 42,250,000 / 6,127,500
χ2 = 6.895
(9) Apply Decision Rule: Since χ2 = 6.895 is > 3.841, reject Ho: ρ1 = ρ2, as
p < .05.
φ = √(χ2/N) = √(6.895/100) = √.06895 = .26
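As a check on the arithmetic, the 2 × 2 shortcut formula and φ can be computed directly (a sketch using the worked example's cell counts A = 15, B = 28, C = 35, D = 22):

```python
import math

# 2x2 shortcut: chi2 = N(AD - BC)^2 / [(A+B)(C+D)(A+C)(B+D)],
# then phi = sqrt(chi2 / N).
def chi_square_2x2(a, b, c, d):
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))

chi2 = chi_square_2x2(15, 28, 35, 22)  # about 6.895
phi = math.sqrt(chi2 / 100)            # about .26
```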
(9) Apply Decision Rule: Since χ2 = 8.0 is > 5.99, reject Ho: ρ1 = ρ2 = ρ3,
as p < .05.
C = √[χ2/(n(t − 1))] = √[8.0/(100(2 − 1))] = √.08 = .283
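Both coefficients can be verified in Python with the chapter's χ2 = 8.0 and n = 100; here t = 2 (the smaller table dimension) is an assumption, and the CC figure is an illustrative computation rather than one from the text:

```python
import math

chi2, n, t = 8.0, 100, 2  # t = 2 assumed (smaller of rows, columns)

c = math.sqrt(chi2 / (n * (t - 1)))  # about .283
cc = math.sqrt(chi2 / (n + chi2))    # Siegel's contingency coefficient
```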
4. Three nonparametric tests will be examined which test instances where the
dependent variable is measured in ranks. These are:
a. For two dependent samples, the Wilcoxon Matched Pairs Test will be
presented.
b. For two independent samples, the Mann-Whitney U Test is profiled.
c. A nonparametric correlation test (Spearman Rank Order Correlation Test)
has been previously presented.
B. Two Dependent Samples: Wilcoxon Matched Pairs Signed Ranks “T” Test
1. The Wilcoxon Matched Pairs Test is the nonparametric equivalent to the
dependent samples t-test and is applied to ordinal data (Spatz, 2001, pp. 309-
313). Recall that there are three dependent designs: natural pairs (e.g. twins),
matched pairs, and repeated (before and after). Scores from the two groups
must be logically paired.
2. The test statistic is “T.” The critical value is drawn from the critical values for
the Wilcoxon matched pairs signed ranks T table. It is the absolute values of
the differences between paired scores that are ranked, not the raw scores. The
rank of “1” always goes to the smallest absolute difference.
(1) When a “D” value equals zero, it is not assigned a rank and N is
reduced by one. If two “D” values equal zero, then one is given a +1.5
ranking and the second is given a -1.5 ranking. If three “D” values
equal zero, then one is dropped (reducing N by one) and the other two
are assigned -1.5 and +1.5 ranking.
(2) When “D” values are tied, the mean of the ranks which would have
been received is given to each of the tied ranks. See how pairs 5 and 6
are treated in Table 13.8.
c. To each value of D attach the sign of its difference, negative (-) or positive
(+); the positive sign is usually left off as it is understood.
d. Sum the positive and negative ranks separately. T is the absolute value of
the smaller of the two sums.
e. If the test statistic “T” is less than (<) the critical value “T”, the null
hypothesis is rejected.
f. Computation and Null Hypothesis Decision-Making
Σ (+ ranks) = 15
Σ (- ranks) = -11
T = 11 (smallest absolute value)
(1) Since the test statistic T = 11 is > the critical value T = 2 for a 2-tail
test where α = .05 (Triola 1998, p. 726), we retain the null hypothesis
that there is no difference between the rankings.
(2) Remember that the dependent samples t-test could have been applied
to these Table 13.8 data but the decision was made to test the rankings
of the scores and not the scores themselves. In cases where there is
reason to believe that the populations are not normally distributed and
do not have equal score variances, one would use this test.
(3) The rationale for the Wilcoxon Matched Pairs Test is that if the
populations are truly equal (i.e., there are no real differences) the
absolute values of the positive and negative sums will be equal and
any differences are due to sampling fluctuations.
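The ranking steps above can be sketched as a small Python helper (a sketch only; the paired scores are hypothetical, and zero differences, which require the special handling described earlier, are assumed absent):

```python
def wilcoxon_t(before, after):
    """Return T, the smaller absolute sum of signed ranks."""
    d = [b - a for b, a in zip(before, after)]
    # Rank the absolute differences; tied values share the mean rank.
    ordered = sorted((abs(x), i) for i, x in enumerate(d))
    ranks = [0.0] * len(d)
    j = 0
    while j < len(ordered):
        k = j
        while k + 1 < len(ordered) and ordered[k + 1][0] == ordered[j][0]:
            k += 1
        mean_rank = (j + k) / 2 + 1  # mean of 1-based positions j..k
        for m in range(j, k + 1):
            ranks[ordered[m][1]] = mean_rank
        j = k + 1
    pos = sum(r for r, x in zip(ranks, d) if x > 0)
    neg = sum(r for r, x in zip(ranks, d) if x < 0)
    return min(pos, neg)

t_stat = wilcoxon_t([10, 12, 9, 14, 8], [12, 15, 8, 13, 11])  # 3.0
```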
g. When the sample size is greater than 50, the T statistic is approximated
using the z-score and the SNC.
(1) Formula 13.8a The Wilcoxon Matched Pairs Test N > 50 (Spatz, 2001,
pp. 312-313)
z = [(T + c) − μT] / σT
(2) Formula 13.8b The Wilcoxon Matched Pairs Test N > 50 (Spatz, 2001,
pp. 312-313)
μT = N(N + 1) / 4
(3) Formula 13.8c The Wilcoxon Matched Pairs Test N > 50 Formula
(Spatz, 2001, pp. 312-313)
σT = √[ N(N + 1)(2N + 1) / 24 ]
Where: T = Smaller sum of the signed ranks
c = 0.5
N = number of pairs
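Formulas 13.8a–13.8c combine into a short function (the T and N values used here are hypothetical):

```python
import math

# Large-sample (N > 50) normal approximation for Wilcoxon's T.
def wilcoxon_z(t, n, c=0.5):
    mu_t = n * (n + 1) / 4                               # Formula 13.8b
    sigma_t = math.sqrt(n * (n + 1) * (2 * n + 1) / 24)  # Formula 13.8c
    return ((t + c) - mu_t) / sigma_t                    # Formula 13.8a

z = wilcoxon_z(t=500, n=60)  # about -3.05; |z| > 1.96, reject Ho at .05
```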
(7) Apply Decision Rule: Since the test statistic of T = 5 is less than the
critical T value of T = 8, the null hypothesis is rejected.
(8) It appears that the workshop does not improve teaching skills.
e. Ties between ranks are handled in the same manner as in the Wilcoxon
Matched Pairs Signed Ranks test. If the ties are in the same group, then the
U value is not affected. If there are several ties across both groups, the
correction suggested by Kirk (1999, p. 594) should be applied.
f. Hint: The smaller the U value, the more different the two groups are.
U = (N1)(N2) + N1(N1 + 1)/2 − ΣR1
Where N1 = Number comprising group one
N2 = Number comprising group two
ΣR1 = sum of ranks for group one
U = (N1)(N2) + N2(N2 + 1)/2 − ΣR2
Where N1 = Number comprising group one
N2 = Number comprising group two
Σ R2 = Sum of ranks for group two
U = (N1)(N2) + N1(N1 + 1)/2 − ΣR1
U = (5)(6) + 5(5 + 1)/2 − 22
U = 30 + 30/2 − 22
U = 45 − 22
U = 23
U = (N1)(N2) + N2(N2 + 1)/2 − ΣR2
U = (5)(6) + 6(6 + 1)/2 − 44
U = 30 + 42/2 − 44
U = 30 + 21 − 44
U = 7
(d) Critical “U” Value: 3 at α = .05 in a 2-tail test at the intersection
of column N1 = 5 and row N2 = 6 in the Mann-Whitney Critical
Value Table (Spatz, 2001, p. 377)
(7) Apply Decision Rule: Since the test statistic of U = 7 is greater than
the critical U value of U = 3, the null hypothesis is retained.
(8) It appears that the workshop does not improve academic competency
as the distributions of the two groups are statistically equal.
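The two U computations can be reproduced in Python from the worked example's rank sums (N1 = 5, N2 = 6, ΣR1 = 22, ΣR2 = 44):

```python
# U from a group's rank sum: U = N1*N2 + n(n + 1)/2 - rank_sum,
# where n is the size of the group whose ranks were summed.
def mann_whitney_u(n1, n2, group_n, rank_sum):
    return n1 * n2 + group_n * (group_n + 1) / 2 - rank_sum

u1 = mann_whitney_u(5, 6, 5, 22)  # 23.0
u2 = mann_whitney_u(5, 6, 6, 44)  # 7.0
u = min(u1, u2)                   # test statistic U = 7
# U = 7 > critical U = 3, so the null hypothesis is retained.
```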
4. When the sample size is greater than 21, in either N1 or N2, the U test statistic
is approximated using the z-score and the SNC.
a. Formula 13.10 The Mann-Whitney U Test for N1 or N2 > 21 (Spatz,
2001, p. 306). In Formula 13.10 “c” is a 0.5 correction factor.
z = [(U + c) − μU] / σU

μU = (N1)(N2) / 2

σU = √[ (N1)(N2)(N1 + N2 + 1) / 12 ]
b. Decision Rules (Spatz, 2001, p. 306)
(1) For a 2-tail test, reject the null hypothesis (Ho) if the computed test
statistic “z” ≥ |1.96| at α = .05.
(2) For a 2-tail test, reject the null hypothesis (Ho) if the computed test
statistic “z” ≥ |2.58| at α = .01.
(3) For a 1-tail test, reject the null hypothesis (Ho) if the computed test
statistic “z” ≥ 1.65 at α = .05.
(4) For a 1-tail test, reject the null hypothesis (Ho) if the computed test
statistic “z” ≥ 2.33 at α = .01.
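Formula 13.10 and the decision rules can be sketched together (the U, N1, and N2 values below are hypothetical):

```python
import math

# Large-sample (N1 or N2 > 21) normal approximation for U,
# with c = 0.5 as the correction factor.
def mann_whitney_z(u, n1, n2, c=0.5):
    mu_u = n1 * n2 / 2
    sigma_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    return ((u + c) - mu_u) / sigma_u

z = mann_whitney_z(u=200, n1=25, n2=25)  # about -2.17
reject = abs(z) >= 1.96                  # 2-tail test at alpha = .05
```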
Review Questions
6. Which one of the following statements about the phi (φ) coefficient is not accurate?
a. The phi coefficient is a symmetric index with a range between 0 and 1, where
zero equals statistical independence.
b. Maximum value is attained only under strict perfect association.
c. Before the phi coefficient is applied, there should be a statistically significant chi-
square value.
d. Apply the phi coefficient only to any chi-square problem.
7. Which one of the following tests is appropriate for the “before and after” design?
a. Chi-square Goodness of Fit
b. Mann-Whitney U Test Big Samples
c. Mann-Whitney U Test Small Samples
d. Wilcoxon Matched Pairs Test
8. When the sample size is > ______, the T statistic is approximated by the z-score and
the normal curve.
a. 40
b. 50
c. 60
d. 70
9. When the sample size is > _______ for both N1 and N2, the U statistic is
approximated by the z-score and the normal curve.
a. 20
b. 30
c. 40
d. 50
10. The statistical test where the null hypothesis is rejected when the test statistic is less
than the critical value is _______.
a. Chi-square Goodness of Fit
b. Mann-Whitney U Test Big Samples
c. Mann-Whitney U Test Small Samples
d. Wilcoxon Matched Pairs Test
Answers: 1. a, 2. d, 3. d, 4. a, 5. d, 6. d, 7. d, 8. b, 9. a, 10. d.
References
Cohen, J. (1988). Statistical power analysis for the behavioral sciences (2nd ed.).
Hillsdale, NJ: Lawrence Erlbaum Associates, Publishers.
Reynolds, H. T. (1984). Analysis of nominal data (2nd ed.). Beverly Hills, CA: Sage
Publications.
Siegel, S. (1956). Nonparametric statistics for the behavioral sciences. New York:
McGraw-Hill.
Spatz, C. (2001). Basic statistics: Tales of distributions (7th ed.). Belmont, CA:
Wadsworth.
Udinsky, B. F., Osterlind, S. J., & Lynch, S. W. (1981). Evaluation resource handbook.
San Diego, CA: Edits Publishers.
Welkowitz, J., Ewen, R. B., & Cohen, J. (1991). Introductory statistics for the behavioral
sciences (4th ed.). New York: Harcourt Brace Jovanovich Publishers.