Statistik Non Parametrik 2

Chapter 21
Nonparametric Statistics
This chapter covers statistical techniques for ordinal data.
Recall: when the data are ordinal, the mean is not an appropriate measure of central location.
Instead, we test characteristics of populations without referring to specific parameters, hence the term nonparametric.
Rather than testing whether the population means differ, we test whether the population locations differ.
The tests discussed so far can be applied only when the data are normal or approximately normal. If that condition is not satisfied, we can use nonparametric statistics, also known as distribution-free statistics.
The techniques in this chapter can also be used when the data are interval and the required condition of normality is unsatisfied. In such circumstances we treat the interval data as if they were ordinal.
Population Locations
[Figure: three panels showing the distributions of two populations: (1) their locations are the same; (2) the location of population 1 is to the left of the location of population 2; (3) the location of population 1 is to the right of the location of population 2.]
Problem Objectives
When the problem objective is to compare two
populations the null hypothesis will state:
H0: The two population locations are the
same.
The alternative hypothesis can take on any one
of the following three forms:
H1: The location of population 1 is different
from the location of population 2
H1: The location of population 1 is to the
right of the location of population 2
H1: The location of population 1 is to the left
of the location of population 2
The Alternative Hypotheses

H1: The location of population 1 is
different from the location of
population 2
Used when we want to know
whether there is sufficient evidence to
infer that there is a difference between
the two populations.

H1: The location of population 1 is to
the right of the location of population 2
Used when we want to know whether we can conclude that the random variable in population 1 is larger in general than the random variable in population 2.

H1: The location of population 1 is
to the left of the location of
population 2
Used when we want to know
whether we can conclude that the
random variable in population 1 is
smaller in general than the random
variable in population 2.
NOTE:
All of our hypotheses are phrased in terms of population 1 relative to population 2.
This is for consistency. Rather than state:
H1: The location of population 2 is to the left of the location of population 1,
we would want to phrase this as:
H1: The location of population 1 is to the right of the location of population 2.
Wilcoxon Rank Sum Test

The problem characteristics of this
test are:
The problem objective is to compare two
populations.
The data are either ordinal or interval (but
not normal).
The samples are independent.

Example
Example 21.1
Based on the two samples shown below,
can we infer at 5% significance level that the
location of population 1 is to the left of the
location of population 2?
Sample 1: 22, 23, 20; Sample 2: 18, 27, 26;
The hypotheses are:
H0: The two population locations are the same.
H1: The location of population 1 is to the left of the
location of population 2.
Graphical Demonstration
Why use the sum of ranks to test locations?
Two hypothetical populations and their corresponding samples are presented, the GREEN population and the PURPLE population. Rank the 12 observations of the two samples together, from 1 to 12.
If the locations of the two populations are about the same (the null hypothesis is true), we would expect the ranks to be evenly spread between the samples. In this case the sums of ranks for the two samples will be close to one another (41 and 37 in the figure).
Now allow the GREEN population to shift to the left of the PURPLE population. The green sample is expected to shift to the left too. As a result, several observations exchange location. What happens to the sums of ranks?
[Figure: the two samples on a common axis with ranks 1 to 12; as the green population shifts left, the rank sums change from 41 and 37, to 40 and 38, to 33 and 45.]
The green sum decreases, and the purple sum increases.
Changing the relative location of two populations affects the sums of ranks of the two samples.
Wilcoxon Rank Sum Test Example

Example 21.1 continued
Test statistic
1. Rank all six observations (1 for the smallest).

Sample 1   Rank      Sample 2   Rank
   22        3          18        1
   23        4          27        6
   20        2          26        5

2. Calculate the sum of ranks for each sample: 9 for sample 1 and 12 for sample 2.
3. Let T = 9 be the test statistic. (We arbitrarily define the test statistic as the rank sum of sample 1.)
Sampling Distribution of the Test Statistic

A small value of T indicates that most of the smaller observations are in sample 1, which was drawn from population 1. But how small is small? Is 9 small enough?
We have our test statistic, T = 9. We need to compare it to some critical value of T to know whether we're in the rejection region for H0.
So what does the sampling distribution of the rank sum look like?

We can build up the sampling distribution of
the test statistic in much the same way we
built histograms for the outcomes of rolls of 2
and 3 dice
1. Enumerate all possible combinations of ranks
2. Calculate ranks sums for the combinations
3. The probability of any rank sum is the number
of occurrences divided by the total number of
combinations
Table 21.2
Sampling Distribution of T with Two Samples of Size 3
[Table: (1) ENUMERATE all combinations of ranks for sample 1, (2) CALCULATE the rank sum of each combination, (3) PROBABILITIES of each rank sum; there is a total of 20 combinations.]
INTERPRET
Example 21.1
Sampling Distribution of T with Two Samples of Size 3
P(T ≤ 6) = 1/20 = .05, thus our critical value of T is 6: H0 is rejected if T ≤ 6.
Since T = 9 > T_critical = 6, we cannot reject H0. There is insufficient evidence to conclude that population 1 is located to the left of population 2, at the 5% significance level.
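The three steps above (enumerate, calculate, probabilities) can be sketched in a few lines of Python. This is an illustrative sketch, not part of the original slides; the function name `rank_sum_distribution` is an assumption, and the code assumes there are no tied observations.

```python
from collections import Counter
from fractions import Fraction
from itertools import combinations

def rank_sum_distribution(n1, n2):
    """Exact null distribution of T, the rank sum of sample 1 (no ties assumed)."""
    ranks = range(1, n1 + n2 + 1)
    # 1. Enumerate every way sample 1 can occupy n1 of the n1 + n2 ranks,
    # 2. calculate the rank sum of each combination,
    counts = Counter(sum(combo) for combo in combinations(ranks, n1))
    total = sum(counts.values())
    # 3. and turn the counts into probabilities.
    return {t: Fraction(c, total) for t, c in sorted(counts.items())}

dist = rank_sum_distribution(3, 3)
p_le_6 = sum(p for t, p in dist.items() if t <= 6)  # critical region T <= 6
p_le_9 = sum(p for t, p in dist.items() if t <= 9)  # observed T = 9
print(p_le_6, p_le_9)  # 1/20 7/20
```

Under these assumptions the observed T = 9 has a lower-tail probability of 7/20 = .35, far above .05, which agrees with the decision not to reject H0.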
Critical Values of the Wilcoxon Rank Sum Test
Table 21.3a
α = .025 for a one-tail test, or α = .05 for a two-tail test
[Table: critical values T_L and T_U for each combination of sample sizes n1 and n2; for n1 = 4 and n2 = 4 the entry is T_L = 11, T_U = 25.]
Using the table: for two given samples of sizes n1 and n2, P(T < T_L) = P(T > T_U) = α, where α is the area in one tail.
Example: if n1 = 4 and n2 = 4, then P(T < 11) = P(T > 25) = .025. A one-tail test at α = .025 uses one of these critical values; a two-tail test at α = .05 uses both.
Table 21.3b
A similar table exists for α = .05 (one-tail test) and α = .10 (two-tail test).
Critical Values: Wilcoxon Rank Sum Test

For sample sizes smaller than 10 observations (in each sample), refer to the critical values in Table 9 (Appendix B).
For sample sizes larger than 10, the test statistic T is approximately normally distributed with the following parameters (n_i = size of sample i, i = 1, 2):
Mean: $E(T) = \frac{n_1(n_1 + n_2 + 1)}{2}$
Standard deviation: $\sigma_T = \sqrt{\frac{n_1 n_2 (n_1 + n_2 + 1)}{12}}$
Hence the standardized test statistic is:
$z = \frac{T - E(T)}{\sigma_T}$
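As a rough sketch of how the normal approximation is applied, the Python below standardizes a rank sum and converts z to an upper-tail p-value using only the standard library. The function names are illustrative assumptions; the inputs (T = 276.5 with n1 = n2 = 15) are the values used in Example 21.2 of this chapter.

```python
import math

def rank_sum_z(t, n1, n2):
    """Standardize the rank sum T of sample 1 using the large-sample formulas."""
    expected = n1 * (n1 + n2 + 1) / 2
    sigma = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    return (t - expected) / sigma, expected, sigma

def upper_tail_p(z):
    """P(Z > z) for a standard normal variable, via the complementary error function."""
    return 0.5 * math.erfc(z / math.sqrt(2))

z, expected, sigma = rank_sum_z(276.5, 15, 15)
p_value = upper_tail_p(z)
print(round(expected, 1), round(sigma, 1), round(z, 2))  # 232.5 24.1 1.83
```

The p-value computed this way is close to, but not identical with, a value read from a normal table, because the table lookup rounds z to two decimals first.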
Example 21.2
A drug company is trialing a new painkiller. 30 people
were selected at random, half were given the new drug,
half given aspirin, and all were told to rate the
effectiveness on a five point scale (hence ordinal data):
5 = The drug was extremely effective.
4 = The drug was quite effective.
3 = The drug was somewhat effective.
2 = The drug was slightly effective.
1 = The drug was not at all effective.
Example 21.2
IDENTIFY
The data were recorded. Can we conclude (at 5% significance) that the new painkiller is perceived to be more effective?
New painkiller: 3, 5, 4, 3, 2, 5, 1, 4, 5, 3, 3, 5, 5, 5, 4
Aspirin: 4, 1, 3, 2, 4, 1, 3, 4, 2, 2, 2, 4, 3, 4, 5
It's important to note here that 5 is a good score, so if the drug is effective, we'd likely see its location greater than the location of aspirin users, hence:
H1: The location of population 1 is to the right of the location of population 2, and so:
H0: The two population locations are the same.
Example 21.2
IDENTIFY
Rank all 30 observations together, averaging the ranks of tied observations:
The three ones would occupy ranks 1, 2, and 3; we average them to (1 + 2 + 3)/3 = 2.
The five twos would occupy ranks 4, 5, 6, 7, and 8; again, we average them to (4 + 5 + 6 + 7 + 8)/5 = 6.
And so on for the threes, fours, and fives.
Example 21.2
IDENTIFY
New Painkiller   Rank      Aspirin   Rank
      3          12           4      19.5
      5          27           1       2
      4          19.5         3      12
      3          12           2       6
      2           6           4      19.5
      5          27           1       2
      1           2           3      12
      4          19.5         4      19.5
      5          27           2       6
      3          12           2       6
      3          12           2       6
      5          27           4      19.5
      5          27           3      12
      5          27           4      19.5
      4          19.5         5      27
Rank Total T1 = 276.5      Rank Total T2 = 188.5
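The tie-averaging rule can be checked mechanically. The sketch below, with an illustrative helper name `average_ranks` that is not from the slides, ranks the 30 pooled ratings of Example 21.2, gives tied values the mean of the ranks they would occupy, and sums the ranks by sample.

```python
def average_ranks(values):
    """Rank values from 1 (smallest), giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the last position holding the same value as position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks

new_painkiller = [3, 5, 4, 3, 2, 5, 1, 4, 5, 3, 3, 5, 5, 5, 4]
aspirin = [4, 1, 3, 2, 4, 1, 3, 4, 2, 2, 2, 4, 3, 4, 5]

ranks = average_ranks(new_painkiller + aspirin)
t1 = sum(ranks[:len(new_painkiller)])
t2 = sum(ranks[len(new_painkiller):])
print(t1, t2)  # 276.5 188.5
```

As a sanity check, the two rank sums must add up to 30(31)/2 = 465, the sum of the ranks 1 through 30.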
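Example 21.2
COMPUTE
The rank sum for the new painkiller is T1 = 276.5, and the rank sum for aspirin is T2 = 188.5.
Set T = T1 = 276.5. With n1 = n2 = 15:
$E(T) = \frac{n_1(n_1 + n_2 + 1)}{2} = \frac{15(31)}{2} = 232.5$
$\sigma_T = \sqrt{\frac{n_1 n_2 (n_1 + n_2 + 1)}{12}} = \sqrt{\frac{(15)(15)(31)}{12}} = 24.1$
$z = \frac{T - E(T)}{\sigma_T} = \frac{276.5 - 232.5}{24.1} = 1.83$
The p-value of the test is:
p-value = P(Z > 1.83) = .5 − .4664 = .0336
(or z = 1.83 > z_critical = 1.645)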
Example 21.2
INTERPRET
Since Z = 1.83 > Zcritical =1.645

There is sufficient evidence to infer
that the new painkiller is perceived
to be more effective than aspirin
Wilcoxon rank sum test for nonnormal interval data, Example

Retaining Workers
The human resource manager of a large
company wanted to compare how long business
and non-business graduates worked for the
company before quitting.
Two samples of 25 business graduates and 20
non-business graduates were randomly selected.
The data representing their time with the
company were recorded.
Duration of Employment (Months)

Business graduates
60 11 18 19 5 25 60 7 8 17 37 4 8
28 27 11 60 25 5 13 22 11 17 9 4
Nonbusiness graduates
25 60 22 24 23 36 39 15 35 16
28 9 60 29 16 22 60 17 60 32

Retaining workers - continued
Can the personnel manager conclude at the 5% significance level that a difference in duration of employment exists between business and non-business graduates?

Solution
The problem objective is to compare two populations of interval data.
The samples are independent.
The non-normality of the two populations is apparent from the sample histograms.
[Figure: histograms of duration of employment for non-business graduates and for business graduates.]

Solution continued
The Wilcoxon rank sum test is the correct procedure to run.
H0: The two population locations are the same
H1: The location of population 1 (business graduates) is different from the location of population 2 (non-business graduates).
Business   Rank      Nonbusiness   Rank
   60      42            25        28
   11      11            60        42
   18      20            22        23
   19      21            24        26
    5       3.5          23        25
   25      28            36        36
   60      42            39        38
    7       5            15        14
    8       6.5          35        35
   17      18            16        15.5
   37      37            28        31.5
    4       1.5           9         8.5
    8       6.5          60        42
   28      31.5          29        33
   27      30            16        15.5
   11      11            22        23
   60      42            60        42
   25      28            17        18
    5       3.5          60        42
   13      13            32        34
   22      23
   11      11
   17      18
    9       8.5
    4       1.5
T1 = 463                 T2 = 572

Solution continued
Solving by hand
The rejection region is $|z| > z_{\alpha/2} = z_{.025} = 1.96$.
After the ranking process is completed, we have:
T = T_business graduates = 463
$E(T) = \frac{n_1(n_1 + n_2 + 1)}{2} = \frac{25(25 + 20 + 1)}{2} = 575$
$\sigma_T = \sqrt{\frac{n_1 n_2 (n_1 + n_2 + 1)}{12}} = \sqrt{\frac{(25)(20)(46)}{12}} = 43.8$
$z = \frac{T - E(T)}{\sigma_T} = \frac{463 - 575}{43.8} = -2.56$
Wilcoxon rank sum test for non-normal interval data, Example
INTERPRET
Since |z| = 2.56 > 1.96, reject the null hypothesis.
There is strong evidence to infer that the duration of employment is different for business and non-business graduates. The data cannot tell us the reason.
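The whole hand calculation for the retaining-workers example can be reproduced from the raw durations. The following is a minimal sketch under the large-sample normal approximation, with no tie correction applied to the standard deviation (matching the hand calculation); `rank_sum_test` is an illustrative name, not a library function.

```python
import math
from bisect import bisect_left, bisect_right

def rank_sum_test(sample1, sample2):
    """Wilcoxon rank sum test via the normal approximation (two-tail p-value)."""
    pooled = sorted(sample1 + sample2)

    def midrank(v):
        # Mean of the first and last 1-based positions of v in the pooled data.
        return (bisect_left(pooled, v) + bisect_right(pooled, v) + 1) / 2

    t = sum(midrank(v) for v in sample1)
    n1, n2 = len(sample1), len(sample2)
    expected = n1 * (n1 + n2 + 1) / 2
    sigma = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (t - expected) / sigma
    p_two_tail = math.erfc(abs(z) / math.sqrt(2))  # equals 2 * P(Z > |z|)
    return t, expected, sigma, z, p_two_tail

business = [60, 11, 18, 19, 5, 25, 60, 7, 8, 17, 37, 4, 8,
            28, 27, 11, 60, 25, 5, 13, 22, 11, 17, 9, 4]
nonbusiness = [25, 60, 22, 24, 23, 36, 39, 15, 35, 16,
               28, 9, 60, 29, 16, 22, 60, 17, 60, 32]

t, expected, sigma, z, p = rank_sum_test(business, nonbusiness)
print(t, expected, round(sigma, 1), round(z, 2))  # 463.0 575.0 43.8 -2.56
```

Statistical packages often adjust the standard deviation for ties, so their z value may differ slightly from this unadjusted figure.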
Required Conditions
The Wilcoxon rank sum test actually tests to
determine whether the population distributions
are identical. This means that it tests not only
for identical locations, but for identical spreads
(variances) and shapes (distributions) as well.
The rejection of the null hypothesis may be due
instead to a difference in distribution shapes
and/or spreads.
To avoid this problem, we will require that the
two probability distributions be identical
except with respect to location.
Identifying Factors
Factors that identify the Wilcoxon Rank
Sum
Sign Test and Wilcoxon Signed Rank Sum Test

(Tests for Matched Pairs Experiments)
We will now look at two nonparametric
techniques (Sign Test and Wilcoxon Signed Rank
Sum Test) that test hypotheses in problems with
the following characteristics:
We want to compare two populations,
The data are either ordinal or interval
(nonnormal),
and the samples are matched pairs.
As before, we'll compute matched pair differences and work from there.
The Sign Test

We can use the Sign Test when we're dealing with two populations of ordinal data in a matched pairs experiment.
For each matched pair, take the difference and count up the number of positive differences and negative differences.
If the population locations are the same, we'd expect the numbers of positives and negatives to be about equal. If we have more positives than negatives (or vice versa), what can we learn?
Again, how many is enough to make a difference?
Sign Test
We can think of the sign test in
terms of a binomial experiment,
getting a positive sign is like flipping
heads on a coin. We use this notion
along with previously developed
statistics to come up with our
standardized test statistic (assuming
the null hypothesis is true):
Test Statistics and Sampling Distribution
When x is binomially distributed and n is sufficiently large, x is approximately normally distributed with mean $\mu = np$ and standard deviation $\sigma = \sqrt{np(1-p)}$. The standardized test statistic is
$z = \frac{x - np}{\sqrt{np(1-p)}}$
The null hypothesis
H0: The two population locations are the same
is equivalent to
H0: p = .5 (i.e. equal proportions of + and − signs)
Therefore the test statistic becomes
$z = \frac{x - np}{\sqrt{np(1-p)}} = \frac{x - .5n}{.5\sqrt{n}}$
The normal approximation of the binomial is valid when $np \ge 5$ and $n(1-p) \ge 5$. When p = .5,
$np = n(.5) \ge 5$ and $n(1-p) = n(1 - .5) = n(.5) \ge 5$
imply that n must be at least 10. This is one of the required conditions for the sign test.
Sign Test Hypotheses

Since our null hypothesis is:
H0: the two population locations are the same (i.e. p = .5)
Our research hypothesis must be:
H1: the two population locations are different
which is the same as:
H1: p ≠ .5
Example 21.3
25 people were asked to ride in a European car
(and rate the ride) then ride in a North American car
(and again, rate the ride). The ratings were ordinal, from 1 = very uncomfortable to 5 = very comfortable, and it's a matched pairs experiment since the same rider tried both cars. [Xm21-03.xls]
Can we conclude (at 5% significance) that the
European car is perceived to be more comfortable
than the North American car?
Example 21.3
[Table: comfort ratings for the 25 respondents, with columns Respondent, European car, North American car, and Difference; the individual ratings are not reproduced here.]
Summary of the differences: 18 positives, 5 negatives, 2 pairs with the same rating.
Example 21.3
COMPUTE
The data were analyzed:
We had 25 pairs of data initially; two pairs gave identical ratings (i.e. difference = zero), so these data points are dropped, hence n = 23.
We had 18 positive responses, thus x = 18.
We had 5 negative responses.
$z = \frac{x - .5n}{.5\sqrt{n}} = \frac{18 - .5(23)}{.5\sqrt{23}} = 2.71$
Example 21.3
INTERPRET
The p-value is P(Z > 2.71) = .5 − .4966 = .0034, hence we reject H0 in favor of H1 and conclude that the two population locations are different.
Or, in the context of this problem: there is relatively strong evidence to indicate that people perceive the European car to provide a more comfortable ride than the North American car.
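A short Python sketch of the sign test calculation for Example 21.3 follows. The normal-approximation part mirrors the slides; the exact binomial tail is an extra cross-check added here as an assumption about what a reader might want, not something the slides compute. Function names are illustrative.

```python
import math

def sign_test_z(x, n):
    """Standardized sign test statistic under H0: p = .5."""
    return (x - 0.5 * n) / (0.5 * math.sqrt(n))

def exact_upper_tail(x, n):
    """Exact P(X >= x) for X ~ Binomial(n, .5); a cross-check on the approximation."""
    return sum(math.comb(n, k) for k in range(x, n + 1)) / 2 ** n

x, n = 18, 23                      # 18 positive differences among 23 nonzero pairs
z = sign_test_z(x, n)
p_normal = 0.5 * math.erfc(z / math.sqrt(2))   # P(Z > z)
p_exact = exact_upper_tail(x, n)
print(round(z, 2))  # 2.71
```

Both p-values fall well below .05, so the conclusion is the same either way; the exact value is somewhat larger because the normal approximation here uses no continuity correction.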
SPSS Output
Ranks (european - american)
                   N     Mean Rank   Sum of Ranks
Negative Ranks     5a      10.70         53.50
Positive Ranks    18b      12.36        222.50
Ties               2c
Total             25
a. european < american
b. european > american
c. european = american
Test Statistics (b)
                          european - american
Z                              -2.683 (a)
Asymp. Sig. (2-tailed)           .007
a. Based on negative ranks.
b. Wilcoxon Signed Ranks Test
Checking the Required Conditions

The sign test requires:
The populations be similar in shape and spread:
The sample size exceeds 10 (n=23).
Wilcoxon Signed Rank Sum Test

We'll use the Wilcoxon Signed Rank Sum Test when we want to compare two populations of interval (but not normally distributed) data in a matched pairs experiment.
Compute the paired differences and discard zeros.
Rank the absolute values of the differences from smallest (1) to largest (n), averaging the ranks of tied observations.
Sum the ranks of the positive differences (T+) and of the negative differences (T−).
Use T = T+ as our test statistic.
Wilcoxon Signed Rank Sum Test

Now we have a test statistic, but what do we compare it against?
For small sample sizes, i.e. n ≤ 30, critical values of T can be read from Table 10 in Appendix B.
For large sample sizes, i.e. n > 30, T is approximately normally distributed, so we have:
$E(T) = \frac{n(n+1)}{4}$, $\sigma_T = \sqrt{\frac{n(n+1)(2n+1)}{24}}$, $z = \frac{T - E(T)}{\sigma_T}$
Table 21.4
Critical Values for the Wilcoxon Signed Rank Sum Test
Example 21.4
IDENTIFY
Do travel times to the office vary between an

8:00 am start and a flextime start? 32
workers recorded their travel times
We want to research this hypothesis:
H1: the two population locations are
different
Thus we require:
H0: the two population locations are the
same.
Example 21.4
Data: Travel time
Worker   Arrival at 8:00 AM   Flextime Program
   1            34                  31
   2            35                  31
   3            43                  44
   4            46                  44
   5            16                  15
   6            26                  28
   7            68                  63
   8            38                  39
   9            61                  63
  10            52                  54
  11            68                  65
  12            13                  12
  13            69                  71
  14            18                  13
  15            53                  55
  16            18                  19
  ...
  32            42                  38
Example 21.4
IDENTIFY
The data are interval (i.e. times) and were produced by a matched pairs experiment (same drivers, same day of the week: Wednesday). Why aren't we using a t-test for μ_D?
A histogram of the paired differences reveals a non-normal distribution, hence we must use a nonparametric technique.
Example 21.4
COMPUTE
Travel time
Worker   Arrival at 8:00 AM   Flextime Program   Difference   |Difference|   Rank
   1            34                  31                3             3         21.0
   2            35                  31                4             4         27.0
   3            43                  44               -1             1          4.5
   4            46                  44                2             2         13.0
   5            16                  15                1             1          4.5
   6            26                  28               -2             2         13.0
   7            68                  63                5             5         31.0
   8            38                  39               -1             1          4.5
   9            61                  63               -2             2         13.0
  10            52                  54               -2             2         13.0
  11            68                  65                3             3         21.0
  12            13                  12                1             1          4.5
  13            69                  71               -2             2         13.0
  14            18                  13                5             5         31.0
  15            53                  55               -2             2         13.0
  16            18                  19               -1             1          4.5
  ...
  32            42                  38                4             4         27.0
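Example 21.4
COMPUTE
[Figure: the original data sorted ascending by |difference|, with the ranks of the positive differences and the ranks of the negative differences in separate columns and their rank sums, T+ = 367.5 and T− = 160.5.]
Example 21.4
COMPUTE
We compute our test statistic as follows, with T = T+ = 367.5 and n = 32:
$E(T) = \frac{n(n+1)}{4} = \frac{32(33)}{4} = 264$
$\sigma_T = \sqrt{\frac{n(n+1)(2n+1)}{24}} = \sqrt{\frac{32(33)(65)}{24}} = 53.48$
$z = \frac{T - E(T)}{\sigma_T} = \frac{367.5 - 264}{53.48} = 1.94$
Our rejection region is $z < -1.96$ or $z > 1.96$.
INTERPRET
Example 21.4
Since z = 1.94 does not fall in the rejection region, there is not enough evidence to infer that flextime commutes are different from the commuting times under the current schedule.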
SPSS Output for Example 21.4

Ranks (@8_00_ar - flextime)
                   N     Mean Rank   Sum of Ranks
Negative Ranks    12a      13.38        160.50
Positive Ranks    20b      18.38        367.50
Ties               0c
Total             32
a. @8_00_ar < flextime
b. @8_00_ar > flextime
c. @8_00_ar = flextime
Test Statistics (b)
                          @8_00_ar - flextime
Z                              -1.947 (a)
Asymp. Sig. (2-tailed)           .051
a. Based on negative ranks.
b. Wilcoxon Signed Ranks Test
Example 21.4
INTERPRET
Compare the p-value with the significance level: .051 > .05, so we do not reject H0.
Identifying Factors I
Factors that Identify the Sign Test
Identifying Factors II
Factors that Identify the Wilcoxon Signed
Rank Sum Test
Problem 21.14 (Xr21-14)

Do the ways that women dress influence the ways
that other women judge them? This question was
addressed by a researcher at Ohio State University.
The experiment consisted of asking women to rate
how professional two women looked. One woman
wore a size 6 dress and the other woman wore size
14. Suppose that the researcher asked 20 women to
rate the women wearing the size 6 dress and another
20 rate the women wearing the size 14 dress. The
ratings were as follows:
4 = Highly professional; 3 = Somewhat professional
2 = Not very professional; 1 = Not at all professional
Do these data provide sufficient evidence to infer that women perceive another woman wearing a size 6 dress as more professional than one wearing a size 14 dress?
Size 6   Rank      Size 14   Rank
  2       8           2        8
  3      20.5         4       34.5
  3      20.5         3       20.5
  2       8           1        2
  3      20.5         2        8
  4      34.5         3       20.5
  3      20.5         3       20.5
  3      20.5         2        8
  4      34.5         3       20.5
  4      34.5         4       34.5
  2       8           3       20.5
  3      20.5         4       34.5
  4      34.5         3       20.5
  4      34.5         4       34.5
  1       2           4       34.5
  3      20.5         2        8
  4      34.5         3       20.5
  3      20.5         1        2
  4      34.5         2        8
  2       8           3       20.5
T6 = 439.5          T14 = 380.5
21.14
H0: The two population locations are the same
H1: The location of population 1 is to the right of the location of population 2
Rejection region: $z > z_{\alpha} = z_{.05} = 1.645$
$E(T) = \frac{n_1(n_1 + n_2 + 1)}{2} = \frac{20(20 + 20 + 1)}{2} = 410$
$\sigma_T = \sqrt{\frac{n_1 n_2 (n_1 + n_2 + 1)}{12}} = \sqrt{\frac{(20)(20)(20 + 20 + 1)}{12}} = 37.0$
$z = \frac{T - E(T)}{\sigma_T} = \frac{439.5 - 410}{37.0} = .80$
p-value = P(Z > .80) = .5 − .2881 = .2119.
There is not enough evidence to infer that women perceive another woman wearing a size 6 dress as more professional than one wearing a size 14 dress.
# 21.37
In a study to determine whether gender affects
salary offers for graduating MBA students, 25
pairs of students were selected. Each pair
consisted of a male and a female student who
had almost identical grade point averages,
courses taken, ages, and previous work
experience. The highest salary offered to each
student upon graduation was recorded. Is there
sufficient evidence to allow us to conclude the
salary offers differ between men and women?
Differences are Female − Male; ranks run from 1 (smallest |difference|) to 25 (largest).
Female      Male        Difference   |Difference|   Rank
29981.00    29233.00       748           748          7
29689.00    28733.00       956           956          9
30916.00    29541.00      1375          1375         17
30300.00    29058.00      1242          1242         15
31772.00    31149.00       623           623          6
30647.00    29141.00      1506          1506         19
30943.00    29739.00      1204          1204         14
31598.00    33529.00     -1931          1931         24
32811.00    33938.00     -1127          1127         11
32754.00    32239.00       515           515          5
32698.00    32661.00        37            37          1
32223.00    31176.00      1047          1047         10
32404.00    34375.00     -1971          1971         25
32578.00    34454.00     -1876          1876         23
34053.00    32184.00      1869          1869         22
34823.00    34570.00       253           253          3
35044.00    34097.00       947           947          8
34783.00    36458.00     -1675          1675         21
34870.00    33321.00      1549          1549         20
34806.00    34860.00       -54            54          2
35062.00    36207.00     -1145          1145         12
34905.00    33660.00      1245          1245         16
36399.00    36758.00      -359           359          4
36186.00    34800.00      1386          1386         18
36502.00    37701.00     -1199          1199         13
T+ = 190 (sum of the ranks of the positive differences); T− = 135.
21.37
H0: The two population locations are the same
H1: The location of population 1 is different from the location of population 2
Rejection region: $z < -z_{\alpha/2} = -z_{.025} = -1.96$ or $z > z_{\alpha/2} = z_{.025} = 1.96$
$E(T) = \frac{n(n+1)}{4} = \frac{25(25+1)}{4} = 162.5$
$\sigma_T = \sqrt{\frac{n(n+1)(2n+1)}{24}} = \sqrt{\frac{25(25+1)(2[25]+1)}{24}} = 37.2$
$z = \frac{T - E(T)}{\sigma_T} = \frac{190 - 162.5}{37.2} = .74$
p-value = 2P(Z > .74) = 2(.5 − .2704) = .4592.

There is not enough evidence of a difference in salary offers between men and women.
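The signed rank procedure (drop zeros, rank the absolute differences, sum the ranks by sign, standardize) can be sketched as below, using the 25 Female − Male salary differences from Problem 21.37. The function name `signed_rank_test` is an illustrative assumption, and the normal approximation is used here without any tie correction.

```python
import math
from bisect import bisect_left, bisect_right

def signed_rank_test(differences):
    """Wilcoxon signed rank sum test via the normal approximation, with T = T+."""
    d = [x for x in differences if x != 0]          # discard zero differences
    abs_sorted = sorted(abs(x) for x in d)

    def midrank(v):
        # Average rank of |difference| v, counting ranks from 1 (smallest).
        return (bisect_left(abs_sorted, v) + bisect_right(abs_sorted, v) + 1) / 2

    t_plus = sum(midrank(abs(x)) for x in d if x > 0)
    t_minus = sum(midrank(abs(x)) for x in d if x < 0)
    n = len(d)
    expected = n * (n + 1) / 4
    sigma = math.sqrt(n * (n + 1) * (2 * n + 1) / 24)
    z = (t_plus - expected) / sigma
    return t_plus, t_minus, z

female_minus_male = [748, 956, 1375, 1242, 623, 1506, 1204, -1931, -1127,
                     515, 37, 1047, -1971, -1876, 1869, 253, 947, -1675,
                     1549, -54, -1145, 1245, -359, 1386, -1199]

t_plus, t_minus, z = signed_rank_test(female_minus_male)
p_two_tail = math.erfc(abs(z) / math.sqrt(2))
print(t_plus, t_minus, round(z, 2))  # 190.0 135.0 0.74
```

The two rank sums add up to 25(26)/4 × 2 = 325, which is a quick check that no rank was lost.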
SPSS Output
Ranks (male - female)
                   N     Mean Rank   Sum of Ranks
Negative Ranks    16a      11.88        190.00
Positive Ranks     9b      15.00        135.00
Ties               0c
Total             25
a. male < female
b. male > female
c. male = female
Test Statistics (b)
                          male - female
Z                            -.740 (a)
Asymp. Sig. (2-tailed)        .459
a. Based on positive ranks.
b. Wilcoxon Signed Ranks Test
Two or More Populations
Kruskal-Wallis Test
So far we've been comparing locations of two populations; now we'll look at comparing two or more populations.
The Kruskal-Wallis test is applied to problems where we want to compare two or more populations of ordinal or interval (but nonnormal) data from independent samples.
Our hypotheses will be:
H0: The locations of all k populations are
the same.
H1: At least two population locations differ.
Test Statistic
In order to calculate the Kruskal-Wallis test statistic, we need to:
1. Rank all the observations from smallest (1) to largest (n), and average the ranks in the case of ties.
2. Calculate rank sums for each sample: T1, T2, ..., Tk
3. Lastly, calculate the test statistic (denoted H):
$H = \left[\frac{12}{n(n+1)} \sum_{j=1}^{k} \frac{T_j^2}{n_j}\right] - 3(n+1)$
where n_j is the size of sample j and n = n_1 + n_2 + ... + n_k.
Sampling Distribution of the Test Statistic:

For sample sizes greater than or equal to 5, the test statistic H is approximately chi-squared distributed with k − 1 degrees of freedom.
Our rejection region is: $H > \chi^2_{\alpha, k-1}$
And our p-value is: $P(\chi^2 > H)$
Figure 21.10
Sampling Distribution of H
Example 21.5
IDENTIFY
Can we compare customer ratings (4 = good, 1 = poor) for speed of service across three shifts in a fast food restaurant? Our hypotheses will be:
H0: The locations of all 3 populations are the same.
(that is, there is no difference in service between shifts), and
H1: At least two population locations differ.
Customer ratings for service were recorded.
Example 21.5
10 customers were selected at random from each shift
4:00 P.M. to Midnight   Midnight to 8:00 A.M.   8:00 A.M. to 4:00 P.M.
         4                       3                       3
         4                       4                       1
         3                       2                       3
         4                       2                       2
         3                       3                       1
         3                       4                       3
         3                       3                       4
         3                       3                       2
         2                       2                       4
         3                       3                       1
Example 21.5
COMPUTE
One way to solve the problem is to take the original data, stack it, sort by customer response, and rank from bottom to top.
Example 21.5
COMPUTE
Once it's in stacked format, put in straight rankings from 1 to 30, average the rankings for the same response, then parse them out by shift to come up with rank sum totals.
Example 21.5
COMPUTE
With rank sums T1 = 186.5 (4:00 P.M. to midnight), T2 = 156 (midnight to 8:00 A.M.), and T3 = 122.5 (8:00 A.M. to 4:00 P.M.):
$H = \left[\frac{12}{30(31)} \cdot \frac{186.5^2 + 156^2 + 122.5^2}{10}\right] - 3(31) = 2.64$
Our critical value of chi-squared (5% significance and k − 1 = 2 degrees of freedom) is 5.99147, hence there is not enough evidence to reject H0.
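The stack, rank, and sum procedure for Example 21.5 can be sketched as follows. This is an illustrative implementation of the H formula without a tie correction; the function names are assumptions, and the closed-form p-value in the last lines is valid only because k − 1 = 2 here (a chi-squared variable with 2 degrees of freedom has upper-tail probability exp(−x/2)).

```python
import math
from bisect import bisect_left, bisect_right

def kruskal_wallis_h(samples):
    """Kruskal-Wallis H (no tie correction) and the rank sum of each sample."""
    pooled = sorted(v for sample in samples for v in sample)
    n = len(pooled)

    def midrank(v):
        return (bisect_left(pooled, v) + bisect_right(pooled, v) + 1) / 2

    rank_sums = [sum(midrank(v) for v in sample) for sample in samples]
    weighted = sum(t * t / len(s) for t, s in zip(rank_sums, samples))
    h = 12 / (n * (n + 1)) * weighted - 3 * (n + 1)
    return h, rank_sums

evening = [4, 4, 3, 4, 3, 3, 3, 3, 2, 3]    # 4:00 P.M. to midnight
night = [3, 4, 2, 2, 3, 4, 3, 3, 2, 3]      # midnight to 8:00 A.M.
day = [3, 1, 3, 2, 1, 3, 4, 2, 4, 1]        # 8:00 A.M. to 4:00 P.M.

h, rank_sums = kruskal_wallis_h([evening, night, day])
p_value = math.exp(-h / 2)   # chi-squared upper tail, valid for 2 d.f. only
print(rank_sums, round(h, 2))  # [186.5, 156.0, 122.5] 2.64
```

Software that adjusts H for the many tied ratings will report a somewhat different statistic than this unadjusted value.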
Example 21.5
INTERPRET
There is not enough evidence to infer that a

difference in speed of service exists between
the three shifts, i.e. all three of the shifts are
equally rated, and any action to improve service
should be applied to all three shifts
SPSS Output
Test Statistics (a, b)
               mid_8_00   @8_00_4
Chi-Square       1.752      2.226
df                 2          2
Asymp. Sig.       .416       .329
a. Kruskal Wallis Test
b. Grouping Variable: @4_00_mi
Identifying Factors
Factors that Identify the Kruskal-Wallis Test
Friedman Test
The Friedman Test is a technique used to compare two or more populations of ordinal or interval (nonnormal) data that are generated from a matched pairs experiment.
The hypotheses are the same as before:
H0: The locations of all k populations are the same.
H1: At least two population locations differ.
Friedman Test: Test Statistic

Since this is a matched pairs experiment, we first rank each observation within each of the b blocks from smallest to largest (i.e. from 1 to k), averaging any ties. We then compute the rank sums T1, T2, ..., Tk. Then we calculate our test statistic:
$F_r = \left[\frac{12}{b\,k(k+1)} \sum_{j=1}^{k} T_j^2\right] - 3b(k+1)$
Sampling Distribution of the Test Statistic

The test statistic is approximately chi-squared distributed with k − 1 degrees of freedom, provided either k or b is greater than or equal to 5. The rejection region is
$F_r > \chi^2_{\alpha, k-1}$
and the p-value is
$P(\chi^2 > F_r)$
[Figure 21.11: Sampling Distribution of Fr, depicting the rejection region and the p-value.]
Example 21.6
IDENTIFY
Four managers evaluate and score job applicants on a scale from 1 (good) to 5 (not so good). There have been complaints that the process isn't fair. Is it the case that all managers score the candidates equally or not? That is:
H0: The locations of all 4 populations are the same.
(i.e. all managers score like candidates alike)
H1: At least two population locations differ.
(i.e. there is some disagreement between managers on scores)
Rejection region: $F_r > \chi^2_{\alpha,k-1} = \chi^2_{.05,3} = 7.81473$
Example 21.6
COMPUTE
The data looks like this:
There are k=4 populations (managers)

and b=8 blocks (applicants) in this setup.
Example 21.6
COMPUTE
Applicant #1 for example, received a top score from

manager and next-to-top scores from the other three.
Applicant #7 received a top score from manager as
well, but the other three scored this candidate very low
Example 21.6
COMPUTE
Rank each observation within its block from smallest to largest (i.e. from 1 to k), averaging any ties. For example, consider the case of candidate #2:
[Table: candidate #2's original scores from the four managers, their straight rankings, and their averaged rankings; the two tied scores that would occupy ranks 1 and 2 each receive (1 + 2)/2 = 1.5.]
checksum = 1 + 2 + 3 + ... + k = 10 (the straight rankings and the averaged rankings must both sum to this value)
Example 21.6
COMPUTE
Compute the rank sums T1, T2, ..., Tk and our test statistic, then compare it with the rejection region:
$F_r > \chi^2_{\alpha,k-1} = \chi^2_{.05,3} = 7.81473$
Example 21.6
INTERPRET
The value of our Friedman test statistic is 10.61, compared to a critical value of chi-squared (at 5% significance and 3 d.f.) of 7.81473.
Thus, there is sufficient evidence to reject H0 in favor of H1.
It appears that the managers' evaluations of applicants do indeed differ.
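A small sketch of the Friedman statistic computed from rank sums is given below. The applicants' raw scores did not survive in these slides, so the rank sums used here (21, 10, 24.5, 24.5) are an assumption recovered by multiplying the mean ranks in the SPSS output for this example (2.63, 1.25, 3.06, 3.06) by b = 8 and rounding to the nearest half. The function name is illustrative.

```python
def friedman_statistic(rank_sums, b):
    """Friedman F_r from the k within-block rank sums and the number of blocks b."""
    k = len(rank_sums)
    # Checksum: each block contributes 1 + 2 + ... + k to the total of all ranks.
    assert abs(sum(rank_sums) - b * k * (k + 1) / 2) < 1e-9, "rank sums do not add up"
    return 12 / (b * k * (k + 1)) * sum(t * t for t in rank_sums) - 3 * b * (k + 1)

# Assumed rank sums for managers 1 to 4 (see the note above), with b = 8 applicants.
rank_sums = [21, 10, 24.5, 24.5]
fr = friedman_statistic(rank_sums, b=8)
critical_value = 7.81473          # chi-squared, alpha = .05, 3 d.f. (from the slides)
print(round(fr, 2), fr > critical_value)  # 10.61 True
```

The built-in checksum mirrors the per-block checksum on the slides: with k = 4 and b = 8 the rank sums must total 8 × 10 = 80.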
SPSS Output
Ranks
            Mean Rank
manager1      2.63
manager2      1.25
manager3      3.06
manager4      3.06
Test Statistics (a)
N                 8
Chi-Square     12.864
df                3
Asymp. Sig.     .005
a. Friedman Test
Identifying Factors
Factors that Identify the Friedman Test
Spearman Rank Correlation Coefficient
Previously we looked at the t-test of the coefficient of correlation (ρ). In many situations, one or both variables may be ordinal; or if both variables are interval, the normality requirement may not be satisfied.
In such cases, we measure and test to determine whether a relationship exists by employing a nonparametric technique, the Spearman rank correlation coefficient.
We are interested in whether a relationship exists between the two variables, hence the hypotheses to be tested are:
H0: ρ_s = 0 (no linear pattern, hence no correlation)
H1: ρ_s ≠ 0 (correlation; we can also do one-tail tests)
Since ρ_s is a population parameter, our sample statistic is r_s, calculated as:
$r_s = \frac{s_{ab}}{s_a s_b}$
where a and b are the ranks of x and y respectively, s_ab is the covariance of a and b, and s_a and s_b are their standard deviations.
[ρ_s is referred to as the Spearman correlation coefficient]
For values of n between 5 and 30, critical values of r_s are available in Table 11 of Appendix B.
When n is greater than 30, r_s is approximately normally distributed with a mean of zero and a standard deviation of $\frac{1}{\sqrt{n-1}}$.
Hence our standardized test statistic is:
$z = r_s\sqrt{n-1}$
Example 21.7
The production manager of a firm wants to examine the
relationship between aptitude test scores given prior to
hiring of production line workers and performance
ratings received by the employees 3 months after starting
work. The results of the study would allow the firm to
decide how much weight to give to these aptitude tests
relative to other work-history information obtained,
including references. The aptitude test results range from
0 to 100. The performance ratings are as follows:
1 = Employee has performed well below average
2 = Employee has performed somewhat below average
3 = Employee has performed at the average level
4 = Employee has performed somewhat above average
5 = Employee has performed well above average
Example 21.7
A random sample of 20 production workers yielded the results listed below. Can the firm's manager infer at the 5% significance level that aptitude test scores are correlated with performance rating?
Employee   Aptitude test score   Performance rating
    1              59                    3
    2              47                    2
    3              58                    4
    4              66                    3
    5              77                    2
    6              57                    4
    7              62                    3
    8              68                    3
    9              69                    5
   10              36                    1
   11              48                    3
   12              65                    3
   13              51                    2
   14              61                    3
   15              40                    3
   16              67                    4
   17              60                    2
   18              56                    3
   19              76                    3
   20              71                    5
Example 21.7
IDENTIFY
We specify our hypotheses as:
H0: ρ_s = 0
H1: ρ_s ≠ 0
At a 5% significance level and n = 20 observations, the rejection region (from Table 11) is:
r_s < −.450 or r_s > .450
Example 21.7
COMPUTE
As before, we rank each of the variables separately and average any ties.
Employee   Aptitude test score   Rank a   Performance rating   Rank b
    1              59               9             3             10.5
    2              47               3             2              3.5
    3              58               8             4             17
    4              66              14             3             10.5
    5              77              20             2              3.5
    6              57               7             4             17
    7              62              12             3             10.5
    8              68              16             3             10.5
    9              69              17             5             19.5
   10              36               1             1              1
   11              48               4             3             10.5
   12              65              13             3             10.5
   13              51               5             2              3.5
   14              61              11             3             10.5
   15              40               2             3             10.5
   16              67              15             4             17
   17              60              10             2              3.5
   18              56               6             3             10.5
   19              76              19             3             10.5
   20              71              18             5             19.5
Example 21.7
COMPUTE
We use the ranks a and b to compute the Pearson coefficient of correlation. We need to compute s_a, s_b, and s_ab. They are
s_a = 5.92
s_b = 5.50
s_ab = 12.34
$r_s = \frac{s_{ab}}{s_a s_b} = \frac{12.34}{(5.92)(5.50)} = .379$
Example 21.7
COMPUTE
Compare .379 to our critical value of r_s = .450.
Since .379 < .450, there is not enough evidence to believe that the aptitude test scores and performance ratings are related.
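The calculation of r_s for Example 21.7 can be reproduced from the raw scores by ranking each variable separately and then applying the Pearson formula to the ranks. The sketch below uses only the standard library; the helper names are illustrative, and the critical value .450 is taken from the slides rather than computed.

```python
import math
from bisect import bisect_left, bisect_right

def midranks(values):
    """Ranks from 1 (smallest), with tied values sharing their average rank."""
    s = sorted(values)
    return [(bisect_left(s, v) + bisect_right(s, v) + 1) / 2 for v in values]

def spearman_rs(x, y):
    """Spearman r_s as the Pearson correlation of the ranks a (of x) and b (of y)."""
    a, b = midranks(x), midranks(y)
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    s_ab = sum((ai - mean_a) * (bi - mean_b) for ai, bi in zip(a, b)) / (n - 1)
    s_a = math.sqrt(sum((ai - mean_a) ** 2 for ai in a) / (n - 1))
    s_b = math.sqrt(sum((bi - mean_b) ** 2 for bi in b) / (n - 1))
    return s_ab / (s_a * s_b), s_a, s_b, s_ab

aptitude = [59, 47, 58, 66, 77, 57, 62, 68, 69, 36,
            48, 65, 51, 61, 40, 67, 60, 56, 76, 71]
performance = [3, 2, 4, 3, 2, 4, 3, 3, 5, 1,
               3, 3, 2, 3, 3, 4, 2, 3, 3, 5]

rs, s_a, s_b, s_ab = spearman_rs(aptitude, performance)
reject = abs(rs) > 0.450          # two-tail critical value quoted in the slides
print(round(s_a, 2), round(s_b, 2), round(s_ab, 2), round(rs, 3), reject)
# 5.92 5.5 12.34 0.379 False
```

Because the ranks of both variables have the same mean, (n + 1)/2 = 10.5, only the spread of the tied performance ranks makes s_b smaller than s_a.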
Identifying Factors
Factors that Identify the Spearman Rank
Correlation Coefficient Test
