Subchapter 14a.
The Kruskal-Wallis Test
for 3 or More Independent Samples
As a reminder, the assumptions of the one-way ANOVA for independent samples are
1. that the scale on which the dependent variable is measured has the properties of an equal interval scale;
2. that the k samples are independently and randomly drawn from the source population(s);
3. that the source population(s) can be reasonably supposed to have a normal distribution; and
4. that the k samples have approximately equal variances.
We noted in the main body of Chapter 14 that we need not worry very much about the first,
third, and fourth of these assumptions when the samples are all the same size. For in that
case the analysis of variance is quite robust, by which we mean relatively unperturbed by the
violation of its assumptions. But of course, the other side of the coin is that when the
samples are not all the same size, we do need to worry. In this case, should one or more of
assumptions 1, 3, and 4 fail to be met, an appropriate non-parametric alternative to the
one-way independent-samples ANOVA can be found in the Kruskal-Wallis Test.
I will illustrate the Kruskal-Wallis test with an example based on rating-scale data, since this
is by far the most common situation in which unequal sample sizes would call for the use of a
non-parametric alternative. In this particular case the number of groups is k=3. I think it will
be fairly obvious how the logic and procedure would be extended in cases where k is greater
than 3.
The subjects are divided into three groups, A, B, and C, and each subject tastes and rates a series of three wines after a preliminary interview. As it happens, the three wines are the same for all subjects. The only difference is in the texture of the interview, which is designed to induce a relatively high expectation of quality in the members of group A; a relatively low expectation in the members of group C; and a merely neutral state, tending in neither the one direction nor the other, for the members of group B. At the end of the study, each subject's ratings are averaged across all three wines, and this average is then taken as the raw measure for that particular subject. The following table shows these measures for each subject in each of the three groups.
[Table: the raw measure (average rating) for each subject in groups A (n=8), B (n=7), and C (n=6).]
Mechanics
The preliminaries of the Kruskal-Wallis test are much the same as those of the Mann-Whitney
test described in Subchapter 11a. We begin by assembling the measures from all k samples
into a single set of size N. These assembled measures are rank-ordered from lowest
(rank #1) to highest (rank #N), with tied ranks included where appropriate; and the resulting
ranks are then returned to the sample, A, B, or C, to which they belong and substituted for
the raw measures that gave rise to them. Thus, the raw measures that appear in the
following table on the left are replaced by their respective ranks, as shown in the table on
the right.
[Tables: left, the raw measures for groups A, B, and C; right, the same entries replaced by their ranks.]
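By way of illustration, here is a minimal Python sketch of this pooled-ranking step, using scipy.stats.rankdata (which assigns averaged ranks to ties). The rating values shown are hypothetical stand-ins, since the example's own data table is not reproduced here; only the group sizes, 8, 7, and 6, match the example.

```python
import numpy as np
from scipy.stats import rankdata

# Hypothetical mean ratings; the actual values from the example's
# table are not reproduced here. Group sizes match the example: 8, 7, 6.
a = [6.4, 6.8, 7.2, 8.3, 8.4, 9.1, 9.4, 9.7]
b = [2.5, 3.7, 4.9, 5.4, 5.9, 8.1, 8.2]
c = [1.3, 4.1, 4.9, 5.2, 5.5, 8.2]

pooled = np.concatenate([a, b, c])   # all N = 21 measures in one array
ranks = rankdata(pooled)             # ranks 1..N; ties receive averaged ranks

# Return each rank to the sample (A, B, or C) its raw measure came from
ranks_a, ranks_b, ranks_c = np.split(ranks, [len(a), len(a) + len(b)])
print(ranks_a)
print(ranks_b)
print(ranks_c)
```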
With the Kruskal-Wallis test, however, we take account not only of the sums of the ranks
within each group, but also of the averages. Thus the following items of symbolic notation:
                   A       B       C       All
  counts           n_a=8   n_b=7   n_c=6   N=21
  sums of ranks    T_a     T_b     T_c     T_all
  means of ranks   M_a     M_b     M_c     M_all
In Chapters 13 and 14 you saw that the squared deviate for any particular group mean is
equal to the squared difference between that group mean and the mean of the overall array
of data, multiplied by the number of observations on which the group mean is based. Thus,
for each of our current three groups the squared deviate is $n_g(M_g - M_{all})^2$, where $M_{all}$, the mean of the complete array of ranks, is $(N+1)/2 = 22/2 = 11$. The sum of the three squared deviates comes out as

$$SS_{bg(R)} = 380.3$$
On analogy with the formulaic structures for $SS_{bg}$ developed in Chapters 13 and 14, we can write the conceptual formula for $SS_{bg(R)}$ as

$$SS_{bg(R)} = \sum n_g\,(M_g - M_{all})^2$$

and its computational equivalent as

$$SS_{bg(R)} = \sum\frac{(T_g)^2}{n_g} - \frac{(T_{all})^2}{N}$$
Here, in any event, is how it would work out for the present example:

$$SS_{bg(R)} = 378.7$$

The discrepancy between what we get now and what we got a moment ago (380.3) is due to rounding error in the earlier calculation. As usual, it is the computational formula that is the less susceptible to rounding error, hence the more reliable.
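To make the two routes concrete, here is a minimal Python sketch (the function name ss_bg_ranks is my own, not from the text) that computes $SS_{bg(R)}$ by both formulas:

```python
import numpy as np

def ss_bg_ranks(rank_groups):
    """Between-groups sum of squares computed on ranks, both ways."""
    all_ranks = np.concatenate(rank_groups)
    N = len(all_ranks)
    m_all = all_ranks.mean()                     # always equals (N + 1) / 2
    # Conceptual formula: sum of n_g * (M_g - M_all)^2
    conceptual = sum(len(g) * (np.mean(g) - m_all) ** 2 for g in rank_groups)
    # Computational formula: sum of T_g^2 / n_g, minus T_all^2 / N
    computational = (sum(np.sum(g) ** 2 / len(g) for g in rank_groups)
                     - all_ranks.sum() ** 2 / N)
    return conceptual, computational

print(ss_bg_ranks([[1, 2], [3, 4], [5, 6]]))     # (16.0, 16.0): both agree
```

Run on exact rank arrays, the two return values agree; the 380.3 versus 378.7 discrepancy in the text arises only because the group means were rounded before squaring in the earlier calculation.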
Consider the very simple case where there are 3 groups, each containing 2 observations. By way of analogy, imagine you had six small cards representing the ranks "1," "2," "3," "4," "5," and "6." If you were to sort these cards into every possible combination of two ranks per group, you would find the total number of possible combinations to be

$$\frac{N!}{n_a!\,n_b!\,n_c!} = \frac{6!}{2!\,2!\,2!} = 90$$
And the values of $SS_{bg(R)}$ produced by these 90 combinations would constitute the sampling distribution of $SS_{bg(R)}$ for this particular case. Of these 90 possible combinations, a few (6) would yield values of $SS_{bg(R)}$ equal to exactly zero. All the rest would produce values greater than zero. (It is mathematically impossible to have a sum of squared deviates less than zero.) Accordingly, the mean of this sampling distribution, the value that observed instances of $SS_{bg(R)}$ will tend to approximate if the null hypothesis is true, is not zero, but something greater than zero.
In any particular case of this sort, the mean of the sampling distribution of $SS_{bg(R)}$ is given by the formula

$$(k-1)\times\frac{N(N+1)}{12}$$

which for the simple six-observation case just described comes out as

$$(3-1)\times\frac{6(6+1)}{12} = 7.0$$
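The six-card case is small enough to enumerate outright. The following Python sketch (an illustration, not part of the original text) generates all 90 assignments of the ranks 1 through 6 into three groups of two, and confirms that exactly 6 of them yield $SS_{bg(R)} = 0$ and that the mean of the distribution is 7.0:

```python
from itertools import combinations

ranks = set(range(1, 7))               # the six "cards": ranks 1..6
N, T_all = 6, sum(range(1, 7))         # N = 6, T_all = 21

ss_values = []
for a in combinations(sorted(ranks), 2):                # group A's two ranks
    for b in combinations(sorted(ranks - set(a)), 2):   # group B's two ranks
        c = ranks - set(a) - set(b)                     # group C gets the rest
        ss = sum(sum(g) ** 2 / 2 for g in (a, b, c)) - T_all ** 2 / N
        ss_values.append(ss)

print(len(ss_values))                                   # 90 combinations
print(sum(1 for s in ss_values if abs(s) < 1e-9))       # 6 with SS = 0
print(sum(ss_values) / len(ss_values))                  # mean = 7.0
```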
For our main example, we therefore know that the observed value of $SS_{bg(R)} = 378.7$ belongs to a sampling distribution whose mean is equal to

$$(3-1)\times\frac{21(21+1)}{12} = 77.0$$
All that now remains is to figure out how to turn this fact into a rigorous assessment of probability.
And now for the denouement. The observed value of $SS_{bg(R)}$ is converted into the Kruskal-Wallis test statistic, H, by dividing it by the quantity $N(N+1)/12$:

$$H = \frac{SS_{bg(R)}}{N(N+1)/12}$$

When each of the k samples includes at least 5 observations (that is, when $n_a$, $n_b$, $n_c$, etc., are all equal to or greater than 5), the sampling distribution of H is a very close approximation of the chi-square distribution for df = k-1. It is actually a fairly close approximation even when one or more of the samples includes as few as 3 observations.
For our main example,

$$H = \frac{378.7}{21(21+1)/12} = \frac{378.7}{38.5} = 9.84$$

And then, treating this result as though it were a value of chi-square, we can refer it to the sampling distribution of chi-square with df = 3-1 = 2. The following graph, borrowed from Chapter 8, will remind you of the outlines of this particular chi-square distribution. In brief: by the Kruskal-Wallis test, the observed aggregate difference among the three samples is significant a bit beyond the .01 level.

[Graph: the chi-square distribution for df = 2, from Chapter 8.]
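As a cross-check on the "a bit beyond the .01 level" conclusion, here is a minimal sketch using the chi-square survival function from SciPy; the numbers are the ones just derived in the text:

```python
from scipy.stats import chi2

N, k = 21, 3
ss_bg_r = 378.7                      # observed SS_bg(R) from the example
H = ss_bg_r / (N * (N + 1) / 12)     # 378.7 / 38.5 = 9.84 (to 2 places)
p = chi2.sf(H, df=k - 1)             # upper-tail probability, df = 2

print(round(H, 2))                   # 9.84
print(round(p, 4))                   # about .0073, i.e., beyond .01
```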
For the record, the computational formula for H that you will typically find in textbook accounts of the Kruskal-Wallis test is

$$H = \frac{12}{N(N+1)}\sum\frac{(T_g)^2}{n_g} - 3(N+1)$$

In any event, as you can see, this version yields exactly the same result as the other:

$$H = 9.84$$
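The algebraic equivalence of the two versions is easy to verify mechanically. The following sketch (an illustration, not part of the original text) uses SymPy to show that the difference between them simplifies to zero:

```python
import sympy as sp

N = sp.symbols('N', positive=True)
Ts = sp.symbols('T_a T_b T_c', positive=True)    # rank sums per group
ns = sp.symbols('n_a n_b n_c', positive=True)    # observations per group

sum_term = sum(t ** 2 / n for t, n in zip(Ts, ns))

# SS_bg(R)-based form: T_all = N(N+1)/2, since the pooled ranks are 1..N
ss_bg = sum_term - (N * (N + 1) / 2) ** 2 / N
H_from_ss = ss_bg / (N * (N + 1) / 12)

# Textbook form
H_textbook = 12 / (N * (N + 1)) * sum_term - 3 * (N + 1)

print(sp.simplify(H_from_ss - H_textbook))       # prints 0: identical
```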
The VassarStats web site has a page that will perform all steps of the Kruskal-Wallis test,
including the rank-ordering of the raw measures.