Subchapter 14a.
The Kruskal-Wallis Test
for 3 or More Independent Samples
As a reminder, the assumptions of the one-way ANOVA for independent samples are
1. that the scale on which the dependent variable is measured has the properties of an equal interval scale;
2. that the k samples are independently and randomly drawn from the source population(s);
3. that the source population(s) can be reasonably supposed to have a normal distribution; and
4. that the k samples have approximately equal variances.
We noted in the main body of Chapter 14 that we need not worry very much about the first,
third, and fourth of these assumptions when the samples are all the same size. For in that
case the analysis of variance is quite robust, by which we mean relatively unperturbed by the
violation of its assumptions. But of course, the other side of the coin is that when the
samples are not all the same size, we do need to worry. In this case, should one or more of
assumptions 1, 3, and 4 fail to be met, an appropriate non-parametric alternative to the
one-way independent-samples ANOVA can be found in the Kruskal-Wallis Test.
I will illustrate the Kruskal-Wallis test with an example based on rating-scale data, since this
is by far the most common situation in which unequal sample sizes would call for the use of a
non-parametric alternative. In this particular case the number of groups is k=3. I think it will
be fairly obvious how the logic and procedure would be extended in cases where k is greater
than 3.
The subjects are divided into three groups, A, B, and C, and each subject tastes and rates a series of three wines after a preliminary interview. As it happens, the three wines are the same for all subjects. The only difference is in the texture of the interview, which is designed to induce a relatively high expectation of quality in the members of group A; a relatively low expectation in the members of group C; and a merely neutral state, tending in neither the one direction nor the other, for the members of group B. At the end of the study, each subject's ratings are averaged across all three wines, and this average is then taken as the raw measure for that particular subject. The following table shows these measures for each subject in each of the three groups.
[Table: the raw measure (average rating) for each subject in groups A (n=8), B (n=7), and C (n=6).]
Mechanics
The preliminaries of the Kruskal-Wallis test are much the same as those of the Mann-Whitney
test described in Subchapter 11a. We begin by assembling the measures from all k samples
into a single set of size N. These assembled measures are rank-ordered from lowest
(rank #1) to highest (rank #N), with tied ranks included where appropriate; and the resulting
ranks are then returned to the sample, A, B, or C, to which they belong and substituted for
the raw measures that gave rise to them. Thus, the raw measures that appear in the
following table on the left are replaced by their respective ranks, as shown in the table on
the right.
[Tables: left, the raw measures for groups A, B, and C; right, the same entries replaced by their ranks.]
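By way of illustration, here is a minimal Python sketch of this pooled-ranking step, using scipy.stats.rankdata (which assigns averaged ranks to ties). The rating values shown are hypothetical stand-ins, since the example's own data table is not reproduced here; only the group sizes, 8, 7, and 6, match the example.

```python
import numpy as np
from scipy.stats import rankdata

# Hypothetical mean ratings; the actual values from the example's
# table are not reproduced here. Group sizes match the example: 8, 7, 6.
a = [6.4, 6.8, 7.2, 8.3, 8.4, 9.1, 9.4, 9.7]
b = [2.5, 3.7, 4.9, 5.4, 5.9, 8.1, 8.2]
c = [1.3, 4.1, 4.9, 5.2, 5.5, 8.2]

pooled = np.concatenate([a, b, c])   # all N = 21 measures in one array
ranks = rankdata(pooled)             # ranks 1..N; ties receive averaged ranks

# Return each rank to the sample (A, B, or C) its raw measure came from
ranks_a, ranks_b, ranks_c = np.split(ranks, [len(a), len(a) + len(b)])
print(ranks_a)
print(ranks_b)
print(ranks_c)
```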
With the Kruskal-Wallis test, however, we take account not only of the sums of the ranks
within each group, but also of the averages. Thus the following items of symbolic notation:
                   A       B       C       All
  counts           n_a=8   n_b=7   n_c=6   N=21
  sums of ranks    T_a     T_b     T_c     T_all
  means of ranks   M_a     M_b     M_c     M_all
In Chapters 13 and 14 you saw that the squared deviate for any particular group mean is
equal to the squared difference between that group mean and the mean of the overall array
of data, multiplied by the number of observations on which the group mean is based. Thus,
for each of our current three groups the squared deviate is $n_g(M_g - M_{all})^2$, where $M_{all}$, the mean of the complete array of ranks, is $(N+1)/2 = 22/2 = 11$. The sum of the three squared deviates comes out as

$$SS_{bg(R)} = 380.3$$
On analogy with the formulaic structures for $SS_{bg}$ developed in Chapters 13 and 14, we can write the conceptual formula for $SS_{bg(R)}$ as

$$SS_{bg(R)} = \sum n_g\,(M_g - M_{all})^2$$

and its computational equivalent as

$$SS_{bg(R)} = \sum\frac{(T_g)^2}{n_g} - \frac{(T_{all})^2}{N}$$
Here, in any event, is how it would work out for the present example:

$$SS_{bg(R)} = 378.7$$

The discrepancy between what we get now and what we got a moment ago (380.3) is due to rounding error in the earlier calculation. As usual, it is the computational formula that is the less susceptible to rounding error, hence the more reliable.
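To make the two routes concrete, here is a minimal Python sketch (the function name ss_bg_ranks is my own, not from the text) that computes $SS_{bg(R)}$ by both formulas:

```python
import numpy as np

def ss_bg_ranks(rank_groups):
    """Between-groups sum of squares computed on ranks, both ways."""
    all_ranks = np.concatenate(rank_groups)
    N = len(all_ranks)
    m_all = all_ranks.mean()                     # always equals (N + 1) / 2
    # Conceptual formula: sum of n_g * (M_g - M_all)^2
    conceptual = sum(len(g) * (np.mean(g) - m_all) ** 2 for g in rank_groups)
    # Computational formula: sum of T_g^2 / n_g, minus T_all^2 / N
    computational = (sum(np.sum(g) ** 2 / len(g) for g in rank_groups)
                     - all_ranks.sum() ** 2 / N)
    return conceptual, computational

print(ss_bg_ranks([[1, 2], [3, 4], [5, 6]]))     # (16.0, 16.0): both agree
```

Run on exact rank arrays, the two return values agree; the 380.3 versus 378.7 discrepancy in the text arises only because the group means were rounded before squaring in the earlier calculation.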
Consider the very simple case where there are 3 groups, each containing 2 observations. By way of analogy, imagine you had six small cards representing the ranks "1," "2," "3," "4," "5," and "6." If you were to sort these cards into every possible combination of two ranks per group, you would find the total number of possible combinations to be

$$\frac{N!}{n_a!\,n_b!\,n_c!} = \frac{6!}{2!\,2!\,2!} = 90$$
And the values of $SS_{bg(R)}$ produced by these 90 combinations would constitute the sampling distribution of $SS_{bg(R)}$ for this particular case. Of these 90 possible combinations, a few (6) would yield values of $SS_{bg(R)}$ equal to exactly zero. All the rest would produce values greater than zero. (It is mathematically impossible to have a sum of squared deviates less than zero.) Accordingly, the mean of this sampling distribution, the value that observed instances of $SS_{bg(R)}$ will tend to approximate if the null hypothesis is true, is not zero, but something greater than zero.
In any particular case of this sort, the mean of the sampling distribution of $SS_{bg(R)}$ is given by the formula

$$(k-1)\times\frac{N(N+1)}{12}$$

which for the simple six-observation case just described comes out as

$$(3-1)\times\frac{6(6+1)}{12} = 7.0$$
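The six-card case is small enough to enumerate outright. The following Python sketch (an illustration, not part of the original text) generates all 90 assignments of the ranks 1 through 6 into three groups of two, and confirms that exactly 6 of them yield $SS_{bg(R)} = 0$ and that the mean of the distribution is 7.0:

```python
from itertools import combinations

ranks = set(range(1, 7))               # the six "cards": ranks 1..6
N, T_all = 6, sum(range(1, 7))         # N = 6, T_all = 21

ss_values = []
for a in combinations(sorted(ranks), 2):                # group A's two ranks
    for b in combinations(sorted(ranks - set(a)), 2):   # group B's two ranks
        c = ranks - set(a) - set(b)                     # group C gets the rest
        ss = sum(sum(g) ** 2 / 2 for g in (a, b, c)) - T_all ** 2 / N
        ss_values.append(ss)

print(len(ss_values))                                   # 90 combinations
print(sum(1 for s in ss_values if abs(s) < 1e-9))       # 6 with SS = 0
print(sum(ss_values) / len(ss_values))                  # mean = 7.0
```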
For our main example, we therefore know that the observed value of $SS_{bg(R)} = 378.7$ belongs to a sampling distribution whose mean is equal to

$$(3-1)\times\frac{21(21+1)}{12} = 77.0$$
All that now remains is to figure out how to turn this fact into a rigorous assessment of probability.
And now for the denouement. The observed value of $SS_{bg(R)}$ is converted into the Kruskal-Wallis test statistic, H, by dividing it by the quantity $N(N+1)/12$:

$$H = \frac{SS_{bg(R)}}{N(N+1)/12}$$

When each of the k samples includes at least 5 observations (that is, when $n_a$, $n_b$, $n_c$, etc., are all equal to or greater than 5), the sampling distribution of H is a very close approximation of the chi-square distribution for df = k-1. It is actually a fairly close approximation even when one or more of the samples includes as few as 3 observations.
For our main example,

$$H = \frac{378.7}{21(21+1)/12} = \frac{378.7}{38.5} = 9.84$$

And then, treating this result as though it were a value of chi-square, we can refer it to the sampling distribution of chi-square with df = 3-1 = 2. The following graph, borrowed from Chapter 8, will remind you of the outlines of this particular chi-square distribution. In brief: by the Kruskal-Wallis test, the observed aggregate difference among the three samples is significant a bit beyond the .01 level.

[Graph: the chi-square distribution for df = 2, from Chapter 8.]
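As a cross-check on the "a bit beyond the .01 level" conclusion, here is a minimal sketch using the chi-square survival function from SciPy; the numbers are the ones just derived in the text:

```python
from scipy.stats import chi2

N, k = 21, 3
ss_bg_r = 378.7                      # observed SS_bg(R) from the example
H = ss_bg_r / (N * (N + 1) / 12)     # 378.7 / 38.5 = 9.84 (to 2 places)
p = chi2.sf(H, df=k - 1)             # upper-tail probability, df = 2

print(round(H, 2))                   # 9.84
print(round(p, 4))                   # about .0073, i.e., beyond .01
```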
For the record, the computational formula for H that you will typically find in textbook accounts of the Kruskal-Wallis test is

$$H = \frac{12}{N(N+1)}\sum\frac{(T_g)^2}{n_g} - 3(N+1)$$

In any event, as you can see, this version yields exactly the same result as the other:

$$H = 9.84$$
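The algebraic equivalence of the two versions is easy to verify mechanically. The following sketch (an illustration, not part of the original text) uses SymPy to show that the difference between them simplifies to zero:

```python
import sympy as sp

N = sp.symbols('N', positive=True)
Ts = sp.symbols('T_a T_b T_c', positive=True)    # rank sums per group
ns = sp.symbols('n_a n_b n_c', positive=True)    # observations per group

sum_term = sum(t ** 2 / n for t, n in zip(Ts, ns))

# SS_bg(R)-based form: T_all = N(N+1)/2, since the pooled ranks are 1..N
ss_bg = sum_term - (N * (N + 1) / 2) ** 2 / N
H_from_ss = ss_bg / (N * (N + 1) / 12)

# Textbook form
H_textbook = 12 / (N * (N + 1)) * sum_term - 3 * (N + 1)

print(sp.simplify(H_from_ss - H_textbook))       # prints 0: identical
```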
The VassarStats web site has a page that will perform all steps of the Kruskal-Wallis test,
including the rank-ordering of the raw measures.