Nonparametric Statistics
This chapter covers statistical techniques for
ordinal data.
Recall: when the data are ordinal, the mean is not
an appropriate measure of central location.
Instead, we will test characteristics of populations
without referring to specific parameters, hence
the term nonparametric.
Rather than testing to determine whether the
population means differ, we will test to determine
whether the population locations differ.
Nonparametric Statistics
The tests that we have discussed so far
can be applied only when the data are
normal or approximately normal. If
this condition is not satisfied,
we can use nonparametric statistics,
also known as distribution-free
statistics.
Nonparametric Statistics
The techniques that we are going
to discuss can also be used when the
data are interval and the required
condition of normality is
unsatisfied. In such circumstances
we will treat the interval data as if
they were ordinal.
Population Locations
The location of population 1 is to the left of the location of
population 2.
[Figure: two population distributions, with population 1 located to the left of population 2]
Problem Objectives
When the problem objective is to compare two
populations the null hypothesis will state:
H0: The two population locations are the
same.
The alternative hypothesis can take on any one
of the following three forms:
H1: The location of population 1 is different
from the location of population 2
H1: The location of population 1 is to the
right of the location of population 2
H1: The location of population 1 is to the left
of the location of population 2
NOTE:
All of our hypotheses are phrased in
terms of 1 then 2.
This is for consistency. Rather than
state:
H1: The location of population 2 is to
the left of the location of population 1,
we would want to phrase this as:
H1: The location of population 1 is to
the right of the location of population 2
Graphical Demonstration
Why use the sum of ranks to test
locations?
If the locations of the two populations are about the same (the null hypothesis is true),
we would expect the ranks to be evenly spread between the samples.
In this case the sums of ranks for the two samples will be close to one another.
[Figure: twelve ranked observations split between two samples, shown for three arrangements with rank sums of 41 and 37, 40 and 38, and 33 and 45]
1. Rank the observations of both samples together, from smallest (1) to largest (6):
Sample 1   Rank     Sample 2   Rank
22         3        18         1
23         4        27         6
20         2        26         5
2. Calculate the sum of ranks: 9 for sample 1 and 12 for sample 2.
3. Let T = 9 be the test statistic. (We arbitrarily define the test
statistic as the rank sum of sample 1.)
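The ranking and summing steps above can be sketched in a few lines of Python. This is a minimal illustration, not a library routine: the helper name `average_ranks` is our own, and it assumes ties should receive the average of the ranks they span.

```python
def average_ranks(values):
    """Rank values from smallest (1) to largest, averaging the ranks of ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of observations tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # positions are 0-based, ranks are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

sample1 = [22, 23, 20]
sample2 = [18, 27, 26]

# Rank the pooled observations, then split the ranks back by sample.
ranks = average_ranks(sample1 + sample2)
t1 = sum(ranks[:len(sample1)])   # rank sum of sample 1 (the test statistic T)
t2 = sum(ranks[len(sample1):])   # rank sum of sample 2
print(t1, t2)
```

The two rank sums always add up to n(n+1)/2 for n pooled observations, which is a quick check on the ranking.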
Table 21.2 Sampling Distribution of T with Two Samples of Size 3
[Table: enumerate every combination of ranks that sample 1 can receive and calculate the probability of each value of T; there is a total of 20 combinations]
Example 21.1
INTERPRET
H0 is rejected if T ≤ 6. Since T = 9,
there is insufficient evidence to
conclude that population 1 is
located to the left of population 2,
at the 5% significance level.
T = 9 is not less than or equal to TCritical = 6, so we cannot
reject H0.
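The sampling distribution of T for two samples of size 3 can be enumerated directly. The sketch below assumes no ties, so the pooled ranks are simply 1 through 6, and that under the null hypothesis every set of three ranks is equally likely to belong to sample 1. The variable names are illustrative.

```python
from collections import Counter
from itertools import combinations

n1, n2 = 3, 3
pooled_ranks = range(1, n1 + n2 + 1)

# Every way of assigning n1 of the pooled ranks to sample 1.
rank_sets = list(combinations(pooled_ranks, n1))
total = len(rank_sets)

# Count how many assignments produce each rank sum T.
counts = Counter(sum(s) for s in rank_sets)
distribution = {t: counts[t] / total for t in sorted(counts)}

for t, p in distribution.items():
    print(f"T = {t:2d}  P(T) = {p:.2f}")

# Left-tail probability used for the rejection region at the 5% level.
p_left = sum(p for t, p in distribution.items() if t <= 6)
print(total, p_left)
```

Only the assignment {1, 2, 3} gives T = 6, so the left tail at T = 6 carries probability 1/20 = .05, which is why an observed T of 9 falls outside the rejection region.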
Table 21.3a
TL = 11, TU = 25
For a two tail test: P(T<11) = P(T>25) = .025 if n1 = 4 and n2 = 4.
For a one tail test: P(T<11) = P(T>25) = .05 if n1 = 4 and n2 = 4.
Using the table: for two given samples of sizes n1 and n2, P(T<TL) = P(T>TU) = α.
A similar table exists for α = .05 (one tail test) and α = .10 (two tail test).
Table 21.3b
Therefore,
$z = \frac{T - E(T)}{\sigma_T}$
Example 21.2
A drug company is trialing a new painkiller. 30 people
were selected at random, half were given the new drug,
half given aspirin, and all were told to rate the
effectiveness on a five point scale (hence ordinal data):
5 = The drug was extremely effective.
4 = The drug was quite effective.
3 = The drug was somewhat effective.
2 = The drug was slightly effective.
1 = The drug was not at all effective.
Example 21.2
IDENTIFY
New painkiller: 3, 5, 4, 3, 2, 5, 1, 4, 5, 3, 3, 5, 5, 5, 4
Aspirin: 4, 1, 3, 2, 4, 1, 3, 4, 2, 2, 2, 4, 3, 4, 5
It's important to note here that 5 is a good score, so
if the drug is effective, we'd likely see its location
to the right of the location of aspirin users, hence:
H1: The location of population 1 is to the right of the
location of population 2, and so:
H0: The two population locations are the
same.
Example 21.2
IDENTIFY
New Painkiller   Rank     Aspirin   Rank
3                12       4         19.5
5                27       1         2
4                19.5     3         12
3                12       2         6
2                6        4         19.5
5                27       1         2
1                2        3         12
4                19.5     4         19.5
5                27       2         6
3                12       2         6
3                12       2         6
5                27       4         19.5
5                27       3         12
5                27       4         19.5
4                19.5     5         27
Rank Total T1 = 276.5     Rank Total T2 = 188.5
Example 21.2
COMPUTE
$z = \frac{T - E(T)}{\sigma_T} = \frac{276.5 - 232.5}{24.1} = 1.83$
The p-value of the test is:
p-value = P(Z > 1.83) = .5 - .4664
= .0336
(or z = 1.83 > zCritical = 1.645)
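The standardized statistic above can be checked numerically. This sketch assumes the expressions E(T) = n1(n1 + n2 + 1)/2 and σT = sqrt(n1 n2 (n1 + n2 + 1)/12), which are the ones used in the solution to exercise 21.14 later in these notes, and it takes T = 276.5 and n1 = n2 = 15 from Example 21.2. The p-value uses the exact normal tail from the standard library rather than a printed table, so it differs slightly from .0336.

```python
import math

n1, n2 = 15, 15
t = 276.5  # rank sum of sample 1 (new painkiller)

# Mean and standard deviation of T under the null hypothesis.
expected_t = n1 * (n1 + n2 + 1) / 2
sigma_t = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)

z = (t - expected_t) / sigma_t

# Right-tail probability P(Z > z) for a standard normal variable.
p_value = 0.5 * math.erfc(z / math.sqrt(2))

print(f"E(T) = {expected_t}, sigma_T = {sigma_t:.1f}")
print(f"z = {z:.2f}, p-value = {p_value:.4f}")
```

The table-based figure in the slide rounds z to 1.83 before looking up the tail area, which accounts for the small difference in the last digits.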
Example 21.2
INTERPRET
Business   Rank     Nonbusiness   Rank
60         42       25            28
11         11       60            42
18         20       22            23
19         21       24            26
5          3.5      23            25
25         28       36            36
60         42       39            38
7          5        15            14
8          6.5      35            35
17         18       16            15.5
37         37       28            31.5
4          1.5      9             8.5
8          6.5      60            42
28         31.5     29            33
27         30       16            15.5
11         11       22            23
60         42       60            42
25         28       17            18
5          3.5      60            42
13         13       32            34
22         23
11         11
17         18
9          8.5
4          1.5
T1 = 463            T2 = 572
Rejection region: $z < -z_{\alpha/2} = -z_{.025} = -1.96$ or $z > z_{\alpha/2} = z_{.025} = 1.96$
|z| = 2.56 > 1.96
There is strong evidence to infer that
the duration of employment is different
for business and non-business
graduates. The data cannot tell
us the reason.
Required Conditions
The Wilcoxon rank sum test actually tests to
determine whether the population distributions
are identical. This means that it tests not only
for identical locations, but for identical spreads
(variances) and shapes (distributions) as well.
The rejection of the null hypothesis may be due
instead to a difference in distribution shapes
and/or spreads.
To avoid this problem, we will require that the
two probability distributions be identical
except with respect to location.
Identifying Factors
Factors that identify the Wilcoxon Rank Sum Test
Sign Test
We can think of the sign test in
terms of a binomial experiment:
getting a positive sign is like flipping
heads on a coin. We use this notion
along with previously developed
statistics to come up with our
standardized test statistic (assuming
the null hypothesis is true):
$z = \frac{x - np}{\sqrt{np(1 - p)}}$
where x is the number of positive signs, n is the number of nonzero differences, and p = .5 under the null hypothesis.
Example 21.3
25 people were asked to ride in a European car
(and rate the ride), then ride in a North American car
(and again, rate the ride). The ratings were ordinal,
from 1 = very uncomfortable to 5 = very
comfortable, and it's a matched pairs experiment
since the same rider tried both cars. [Xm21-03.xls]
Can we conclude (at 5% significance) that the
European car is perceived to be more comfortable
than the North American car?
Example 21.3
[Table: Comfort Ratings for respondents 1 to 25, with columns Respondent, E. Car, N.A. Car, Difference]
Summary of the differences: 5 negatives, 18 positives, 2 same rating
Example 21.3
COMPUTE
The data were analyzed as follows.
We had 25 pairs of data
initially; two pairs gave
identical ratings (i.e. the
difference is zero), so these
data points are
dropped, hence n = 23.
We had 5 negative
responses.
We had 18 positive
responses, thus x = 18.
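With x = 18 positive signs out of n = 23 nonzero differences, the standardized sign-test statistic can be computed as below. This is a sketch under the assumption that p = .5 under the null hypothesis and that the alternative is one-tailed to the right (the European car is rated more comfortable); the p-value comes from the standard library's normal tail, not from a printed table.

```python
import math

x = 18    # number of positive differences
n = 23    # number of nonzero differences (25 pairs minus 2 ties)
p = 0.5   # probability of a positive sign if the locations are the same

# Normal approximation to the binomial count of positive signs.
z = (x - n * p) / math.sqrt(n * p * (1 - p))

# Right-tail p-value P(Z > z).
p_value = 0.5 * math.erfc(z / math.sqrt(2))

print(f"z = {z:.2f}, p-value = {p_value:.4f}")
print("reject H0 at 5%" if z > 1.645 else "do not reject H0 at 5%")
```

Because z is well above the one-tail critical value of 1.645, the sketch reports a rejection of the null hypothesis at the 5% level.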
Example 21.3
INTERPRET
SPSS Output
Ranks (european - american)
                  N      Mean Rank   Sum of Ranks
Negative Ranks    5a     10.70       53.50
Positive Ranks    18b    12.36       222.50
Ties              2c
Total             25
Test Statistics(b)
                          european - american
Z                         -2.683a
Asymp. Sig. (2-tailed)    .007
Table 21.4
Example 21.4
IDENTIFY
Example 21.4
Data: Travel time
Worker   Arrival at 8:00 AM   Flextime Program
1        34                   31
2        35                   31
3        43                   44
4        46                   44
5        16                   15
6        26                   28
7        68                   63
8        38                   39
9        61                   63
10       52                   54
11       68                   65
12       13                   12
13       69                   71
14       18                   13
15       53                   55
16       18                   19
...
32       42                   38
Example 21.4
IDENTIFY
The data are interval (i.e. times) and were produced by a matched pairs
experiment (same drivers, same day of the week: Wednesday). Why
aren't we using a t-test for μD?
Example 21.4
COMPUTE
Travel time
Worker   Arrival at 8:00 AM   Flextime Program   Difference   |Difference|   Rank
1        34                   31                 3            3              21.0
2        35                   31                 4            4              27.0
3        43                   44                 -1           1              4.5
4        46                   44                 2            2              13.0
5        16                   15                 1            1              4.5
6        26                   28                 -2           2              13.0
7        68                   63                 5            5              31.0
8        38                   39                 -1           1              4.5
9        61                   63                 -2           2              13.0
10       52                   54                 -2           2              13.0
11       68                   65                 3            3              21.0
12       13                   12                 1            1              4.5
13       69                   71                 -2           2              13.0
14       18                   13                 5            5              31.0
15       53                   55                 -2           2              13.0
16       18                   19                 -1           1              4.5
...
32       42                   38                 4            4              27.0
Example 21.4
COMPUTE
[Figure: the original data, the data sorted ascending by |difference|, and the rank sums]
Example 21.4
INTERPRET
There is not enough evidence to
infer that flextime commutes are
different from the commuting times
under the current schedule.
Ranks (@8_00_ar - flextime)
                  N      Mean Rank   Sum of Ranks
Negative Ranks    12a    13.38       160.50
Positive Ranks    20b    18.38       367.50
Ties              0c
Total             32
Test Statistics(b)
                          @8_00_ar - flextime
Z                         -1.947a
Asymp. Sig. (2-tailed)    .051
Identifying Factors I
Factors that Identify the Sign Test
Identifying Factors II
Factors that Identify the Wilcoxon Signed
Rank Sum Test
Size 6   Rank     Size 14   Rank
2.00     8        2.00      8
3.00     20.5     4.00      34.5
3.00     20.5     3.00      20.5
2.00     8        1.00      2
3.00     20.5     2.00      8
4.00     34.5     3.00      20.5
3.00     20.5     3.00      20.5
3.00     20.5     2.00      8
4.00     34.5     3.00      20.5
4.00     34.5     4.00      34.5
2.00     8        3.00      20.5
3.00     20.5     4.00      34.5
4.00     34.5     3.00      20.5
4.00     34.5     4.00      34.5
1.00     2        4.00      34.5
3.00     20.5     2.00      8
4.00     34.5     3.00      20.5
3.00     20.5     1.00      2
4.00     34.5     2.00      8
2.00     8        3.00      20.5
T6 = 439.5        T14 = 380.5
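21.14
H0: The two population locations are the same.
H1: The location of population 1 (size 6) is to the right of the location of population 2 (size 14).
Rejection region: $z > z_{\alpha} = z_{.05} = 1.645$
$E(T) = \frac{n_1(n_1 + n_2 + 1)}{2} = \frac{20(20 + 20 + 1)}{2} = 410$
$\sigma_T = \sqrt{\frac{n_1 n_2 (n_1 + n_2 + 1)}{12}} = \sqrt{\frac{(20)(20)(20 + 20 + 1)}{12}} = 37.0$
$z = \frac{T - E(T)}{\sigma_T} = \frac{439.5 - 410}{37.0} = .80$
p-value = P(Z > .80) = .5 - .2881 = .2119.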
There is not enough evidence to infer that women perceive another
woman wearing a size 6 dress as more professional than one wearing
a size 14 dress.
# 21.37
In a study to determine whether gender affects
salary offers for graduating MBA students, 25
pairs of students were selected. Each pair
consisted of a male and a female student who
had almost identical grade point averages,
courses taken, ages, and previous work
experience. The highest salary offered to each
student upon graduation was recorded. Is there
sufficient evidence to allow us to conclude the
salary offers differ between men and women?
Female     Male       Difference   |Difference|   Rank
29981.00   29233.00   748          748            19
29689.00   28733.00   956          956            17
30916.00   29541.00   1375         1375           9
30300.00   29058.00   1242         1242           11
31772.00   31149.00   623          623            20
30647.00   29141.00   1506         1506           7
30943.00   29739.00   1204         1204           12
31598.00   33529.00   -1931        1931           2
32811.00   33938.00   -1127        1127           15
32754.00   32239.00   515          515            21
32698.00   32661.00   37           37             25
32223.00   31176.00   1047         1047           16
32404.00   34375.00   -1971        1971           1
32578.00   34454.00   -1876        1876           3
34053.00   32184.00   1869         1869           4
34823.00   34570.00   253          253            23
35044.00   34097.00   947          947            18
34783.00   36458.00   -1675        1675           5
34870.00   33321.00   1549         1549           6
34806.00   34860.00   -54          54             24
35062.00   36207.00   -1145        1145           14
34905.00   33660.00   1245         1245           10
36399.00   36758.00   -359         359            22
36186.00   34800.00   1386         1386           8
36502.00   37701.00   -1199        1199           13
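21.37
H0: The two population locations are the same.
H1: The location of population 1 is different from the location of population 2.
Rejection region: $z < -z_{\alpha/2} = -z_{.025} = -1.96$ or $z > z_{\alpha/2} = z_{.025} = 1.96$
$E(T) = \frac{n(n + 1)}{4} = \frac{25(25 + 1)}{4} = 162.5$
$\sigma_T = \sqrt{\frac{n(n + 1)(2n + 1)}{24}} = \sqrt{\frac{25(25 + 1)(2[25] + 1)}{24}} = 37.2$
$z = \frac{T - E(T)}{\sigma_T} = \frac{190 - 162.5}{37.2} = .74$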
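The signed rank sum calculation for exercise 21.37 can be reproduced from the salary data. The sketch below assumes the usual convention of ranking the absolute differences from smallest (1) to largest and taking T as the rank sum of the positive female-minus-male differences; the helper `average_ranks` is our own name, not a library function.

```python
import math

def average_ranks(values):
    """Rank values from smallest (1) to largest, averaging the ranks of ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks

female = [29981, 29689, 30916, 30300, 31772, 30647, 30943, 31598, 32811,
          32754, 32698, 32223, 32404, 32578, 34053, 34823, 35044, 34783,
          34870, 34806, 35062, 34905, 36399, 36186, 36502]
male = [29233, 28733, 29541, 29058, 31149, 29141, 29739, 33529, 33938,
        32239, 32661, 31176, 34375, 34454, 32184, 34570, 34097, 36458,
        33321, 34860, 36207, 33660, 36758, 34800, 37701]

# Paired differences; zero differences would be dropped (there are none here).
diffs = [f - m for f, m in zip(female, male) if f != m]
ranks = average_ranks([abs(d) for d in diffs])

t_plus = sum(r for d, r in zip(diffs, ranks) if d > 0)
t_minus = sum(r for d, r in zip(diffs, ranks) if d < 0)

n = len(diffs)
expected_t = n * (n + 1) / 4
sigma_t = math.sqrt(n * (n + 1) * (2 * n + 1) / 24)
z = (t_plus - expected_t) / sigma_t

print(t_plus, t_minus, round(z, 2))
```

The two rank sums match the 190 and 135 reported in the SPSS output, and |z| stays below 1.96, so the null hypothesis is not rejected.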
Ranks (male - female)
                  N      Mean Rank   Sum of Ranks
Negative Ranks    16a    11.88       190.00
Positive Ranks    9b     15.00       135.00
Ties              0c
Total             25
Test Statistics(b)
                          male - female
Z                         -.740a
Asymp. Sig. (2-tailed)    .459
Kruskal-Wallis Test
So far we've been comparing locations of two
populations; now we'll look at comparing two or
more populations.
The Kruskal-Wallis test is applied to problems
where we want to compare two or more
populations of ordinal or interval (but
nonnormal) data from independent samples.
Our hypotheses will be:
H0: The locations of all k populations are
the same.
H1: At least two population locations differ.
Test Statistic
In order to calculate the Kruskal-Wallis test
statistic, we need to:
1. Rank all the observations from smallest (1)
to largest (n), and average the ranks in the
case of ties.
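2. We calculate rank sums for each sample:
T1, T2, ..., Tk
3. Lastly, we calculate the test statistic
(denoted H):
$H = \left[ \frac{12}{n(n+1)} \sum_{j=1}^{k} \frac{T_j^2}{n_j} \right] - 3(n+1)$
where n_j is the size of sample j and n is the total number of observations.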
Figure 21.10
Sampling Distribution of H
Example 21.5
IDENTIFY
Example 21.5
10 customers were selected at random from
each shift.
4:00 P.M. to Midnight   Midnight to 8:00 A.M.   8:00 A.M. to 4:00 P.M.
4                       3                       3
4                       4                       1
3                       2                       3
4                       2                       2
3                       3                       1
3                       4                       3
3                       3                       4
3                       3                       2
2                       2                       4
3                       3                       1
Example 21.5
COMPUTE
H = 2.64
Our critical value of chi-squared (5% significance
and k - 1 = 2 degrees of freedom) is 5.99147; hence
there is not enough evidence to reject H0.
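The value of H can be reproduced from the shift ratings. The sketch below assumes the standard Kruskal-Wallis statistic without a tie correction, H = [12/(n(n+1)) Σ Tj²/nj] − 3(n+1), and uses our own helper `average_ranks` to average the ranks of tied ratings.

```python
def average_ranks(values):
    """Rank values from smallest (1) to largest, averaging the ranks of ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks

shifts = {
    "4:00 P.M. to midnight": [4, 4, 3, 4, 3, 3, 3, 3, 2, 3],
    "midnight to 8:00 A.M.": [3, 4, 2, 2, 3, 4, 3, 3, 2, 3],
    "8:00 A.M. to 4:00 P.M.": [3, 1, 3, 2, 1, 3, 4, 2, 4, 1],
}

# Step 1: rank all observations together.
pooled = [x for sample in shifts.values() for x in sample]
ranks = average_ranks(pooled)
n = len(pooled)

# Step 2: rank sum T_j for each sample.
rank_sums = []
start = 0
for sample in shifts.values():
    rank_sums.append(sum(ranks[start:start + len(sample)]))
    start += len(sample)

# Step 3: the H statistic.
weighted = sum(t ** 2 / len(s) for t, s in zip(rank_sums, shifts.values()))
h = 12 / (n * (n + 1)) * weighted - 3 * (n + 1)

print(rank_sums, round(h, 2))
```

H is compared with the chi-squared critical value for k − 1 degrees of freedom; here it is well below 5.99147, in line with the conclusion above.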
Example 21.5
INTERPRET
Example 21.5
COMPUTE
SPSS Output
Test Statistics(a,b)
               mid_8_00   @8_00_4
Chi-Square     1.752      2.226
df             2          2
Asymp. Sig.    .416       .329
Identifying Factors
Factors that Identify the Kruskal-Wallis Test
Friedman Test
The Friedman Test is a technique used to
compare two or more populations of ordinal
or interval (nonnormal) data that are
generated from a matched pairs experiment.
The hypotheses are the same as before:
H0: The locations of all k populations are the
same.
H1: At least two population locations differ.
Figure 21.11
Sampling Distribution of Fr
Example 21.6
IDENTIFY
Example 21.6
COMPUTE
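[Figure: ranking the original scores. Straight ranking and averaged ranking are compared; tied scores each receive the average of the ranks they span, e.g. (1+2)/2 = 1.5. Both rankings give the same checksum.]
checksum = 1 + 2 + 3 + ... + k
In this illustration the checksum is 10.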
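The checksum idea can be illustrated with a short sketch. The four scores below are hypothetical values chosen only so that the two lowest tie, as in the illustration; `average_ranks` is our own helper name. Whatever the ties, the averaged ranks within a group of k scores should sum to 1 + 2 + ... + k.

```python
def average_ranks(values):
    """Rank values from smallest (1) to largest, averaging the ranks of ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks

# Hypothetical scores for one group of k = 4 treatments (not from the example data).
scores = [70, 70, 80, 90]

ranks = average_ranks(scores)
k = len(scores)
checksum = k * (k + 1) // 2  # 1 + 2 + ... + k

print(ranks, sum(ranks), checksum)
```

Comparing the sum of the ranks with the checksum is a cheap way to catch a ranking mistake before the rank sums are used in a test statistic.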
Example 21.6
COMPUTE
Example 21.6
INTERPRET
It appears that the managers'
evaluations of applicants do indeed differ.
SPSS Output
Ranks
            Mean Rank
manager1    2.63
manager2    1.25
manager3    3.06
manager4    3.06
Test Statistics(a)
N              8
Chi-Square     12.864
df             3
Asymp. Sig.    .005
a. Friedman Test
Identifying Factors
Factors that Identify the Friedman Test
$r_s = \frac{S_{ab}}{S_a S_b}$
where a and b are the ranks of x and y respectively.
Example 21.7
The production manager of a firm wants to examine the
relationship between aptitude test scores given prior to
hiring of production line workers and performance
ratings received by the employees 3 months after starting
work. The results of the study would allow the firm to
decide how much weight to give to these aptitude tests
relative to other work-history information obtained,
including references. The aptitude test results range from
0 to 100. The performance ratings are as follows:
1 = Employee has performed well below average
2 = Employee has performed somewhat below average
3 = Employee has performed at the average level
4 = Employee has performed somewhat above average
5 = Employee has performed well above average
Example 21.7
A random sample of 20 production workers yielded
the results listed below. Can the firm's manager infer
at the 5% significance level that aptitude test scores
are correlated with performance rating?
Employee   Aptitude test score   Performance rating
1          59                    3
2          47                    2
3          58                    4
4          66                    3
5          77                    2
6          57                    4
7          62                    3
8          68                    3
9          69                    5
10         36                    1
11         48                    3
12         65                    3
13         51                    2
14         61                    3
15         40                    3
16         67                    4
17         60                    2
18         56                    3
19         76                    3
20         71                    5
Example 21.7
IDENTIFY
H0: $\rho_s = 0$
H1: $\rho_s \neq 0$
Example 21.7
COMPUTE
As before, we rank each of the variables separately and average any ties.
Employee   Aptitude test score   Rank a   Performance rating   Rank b
1          59                    9        3                    10.5
2          47                    3        2                    3.5
3          58                    8        4                    17
4          66                    14       3                    10.5
5          77                    20       2                    3.5
6          57                    7        4                    17
7          62                    12       3                    10.5
8          68                    16       3                    10.5
9          69                    17       5                    19.5
10         36                    1        1                    1
11         48                    4        3                    10.5
12         65                    13       3                    10.5
13         51                    5        2                    3.5
14         61                    11       3                    10.5
15         40                    2        3                    10.5
16         67                    15       4                    17
17         60                    10       2                    3.5
18         56                    6        3                    10.5
19         76                    19       3                    10.5
20         71                    18       5                    19.5
Example 21.7
COMPUTE
$r_s = \frac{S_{ab}}{S_a S_b} = \frac{12.34}{(5.92)(5.50)} = .379$
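The coefficient can be recomputed from the raw data of Example 21.7. The sketch assumes that S_ab is the sample covariance of the two rank columns and that S_a and S_b are their sample standard deviations, all with n − 1 in the denominator; `average_ranks` is our own helper name.

```python
import math

def average_ranks(values):
    """Rank values from smallest (1) to largest, averaging the ranks of ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks

aptitude = [59, 47, 58, 66, 77, 57, 62, 68, 69, 36,
            48, 65, 51, 61, 40, 67, 60, 56, 76, 71]
rating = [3, 2, 4, 3, 2, 4, 3, 3, 5, 1,
          3, 3, 2, 3, 3, 4, 2, 3, 3, 5]

# Rank each variable separately, averaging ties.
a = average_ranks(aptitude)
b = average_ranks(rating)
n = len(a)

mean_a = sum(a) / n
mean_b = sum(b) / n

# Sample covariance and standard deviations of the ranks.
s_ab = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b)) / (n - 1)
s_a = math.sqrt(sum((x - mean_a) ** 2 for x in a) / (n - 1))
s_b = math.sqrt(sum((y - mean_b) ** 2 for y in b) / (n - 1))

r_s = s_ab / (s_a * s_b)
print(round(s_ab, 2), round(s_a, 2), round(s_b, 2), round(r_s, 3))
```

Working from the ranks rather than the raw scores is what makes the coefficient suitable for the ordinal performance ratings.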
Example 21.7
COMPUTE
Identifying Factors
Factors that Identify the Spearman Rank
Correlation Coefficient Test