11
Nonparametric Tests
Section 11.1
Nonparametric Tests
A nonparametric test is a hypothesis test that does not require any specific conditions about the shape of the populations or the value of any population parameters. Nonparametric tests are often called distribution-free tests. The sign test is a nonparametric test that can be used to test a population median against a hypothesized value, k.
Hypotheses
Left-tailed test: H0: median ≥ k and Ha: median < k
Right-tailed test: H0: median ≤ k and Ha: median > k
Two-tailed test: H0: median = k and Ha: median ≠ k
Sign Test
To use the sign test, first compare each entry in the sample to the hypothesized median, k. If the entry is below the median, assign it a − sign. If the entry is above the median, assign it a + sign. If the entry is equal to the median, assign it a 0. Compare the number of + and − signs. (Ignore 0s.) If the number of + signs and the number of − signs are approximately equal, the null hypothesis is not likely to be rejected. If they are not approximately equal, however, it is likely that the null hypothesis will be rejected.
Test Statistic: When n ≤ 25, the test statistic is the smaller number of + or − signs.
For n > 25, you are testing the binomial probability that p = 0.50.
Application
A meteorologist claims that the daily median temperature for the month of January in San Diego is 57° Fahrenheit. The temperatures (in degrees Fahrenheit) for 18 randomly selected January days are listed below. At α = 0.01, can you support the meteorologist's claim?
58 62 55 55 53 52 52 59 55 55 60 56 57 61 58 63 63 55
1. Write the null and alternative hypotheses. H0: median = 57 and Ha: median ≠ 57
2. State the level of significance. α = 0.01
3. Determine the sampling distribution. Binomial with p = 0.5
Entry  58 62 55 55 53 52 52 59 55 55 60 56 57 61 58 63 63 55
Sign    +  +  −  −  −  −  −  +  −  −  +  −  0  +  +  +  +  −
There are 8 + signs and 9 − signs. So, n = 8 + 9 = 17. Since Ha contains the ≠ symbol, this is a two-tailed test.
The test statistic is the smaller number of + or − signs, so the test statistic is 8.
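The sign counting above can be checked with a short Python sketch. The function name `sign_test` is illustrative, and the exact two-tailed binomial p-value it returns is an added assumption: it is an alternative to looking up a critical value in a table, not a step shown in the slides.

```python
from math import comb

def sign_test(sample, k):
    """Count signs relative to the hypothesized median k.

    Returns (plus, minus, test_statistic, p_value), where p_value is an
    exact two-tailed binomial probability (an added illustration).
    """
    plus = sum(1 for x in sample if x > k)
    minus = sum(1 for x in sample if x < k)
    n = plus + minus              # entries equal to k are ignored
    x = min(plus, minus)          # test statistic when n <= 25
    # Under H0 the number of + signs is Binomial(n, 0.5).
    lower_tail = sum(comb(n, i) for i in range(x + 1)) / 2 ** n
    p_value = min(1.0, 2 * lower_tail)
    return plus, minus, x, p_value

temps = [58, 62, 55, 55, 53, 52, 52, 59, 55,
         55, 60, 56, 57, 61, 58, 63, 63, 55]
plus, minus, x, p_value = sign_test(temps, 57)
print(plus, minus, x, p_value)
```

For this sample the sketch reproduces the 8 + signs, 9 − signs, and test statistic of 8 found above.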
Section 11.2
To find the test statistic, ws, for the Wilcoxon signed-rank test:
1. Find the difference for each pair: Sample 1 value − Sample 2 value.
2. Find the absolute value of each difference.
3. Rank order the nonzero absolute differences from smallest to largest, giving tied values the average of their ranks.
4. Affix a + or − sign to each of the rankings.
5. Find the sum of the positive ranks.
6. Find the sum of the negative ranks.
7. Select the smaller of the absolute values of the sums.
Application
The table shows the daily headache hours suffered by 12 patients before and after receiving a new drug for seven weeks. At α = 0.01, is there enough evidence to conclude that the new drug helped to reduce daily headache hours?
1. Write the null and alternative hypotheses.
H0: The headache hours after using the new drug are at least as long as before using the drug.
Ha: The new drug reduces headache hours. (Claim)
2. State the level of significance. α = 0.01
Before   After   Diff.   Abs   Rank   Signed Rank
1 2 3 4 5 6 7 8
The sum of the positive ranks is 5 + 6 + 3 + 8 + 7 + 4 = 33. The sum of the negative ranks is −1.5 + (−1.5) = −3. The test statistic is the smaller of the absolute values of these sums, ws = 3.
There are 8 + and − signs, so n = 8. The critical value is 2. Because ws = 3 is greater than the critical value, fail to reject the null hypothesis. There is not enough evidence to conclude the new drug reduces headache hours.
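The ranking steps of this section can be sketched in Python as follows. The patient data are not reproduced here, so the `before` and `after` lists are hypothetical values chosen only so that the signed ranks match the pattern of the worked example (two tied negative ranks of 1.5, six positive ranks, four zero differences). The function names are illustrative as well.

```python
def average_ranks(values):
    """Rank values from smallest to largest; ties share the average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1     # average of 1-based positions i+1 .. j+1
        for pos in range(i, j + 1):
            ranks[order[pos]] = avg
        i = j + 1
    return ranks

def wilcoxon_signed_rank(sample1, sample2):
    """Return (n, positive rank sum, negative rank sum, ws)."""
    diffs = [a - b for a, b in zip(sample1, sample2)]
    diffs = [d for d in diffs if d != 0]          # zero differences are dropped
    ranks = average_ranks([abs(d) for d in diffs])
    pos = sum(r for r, d in zip(ranks, diffs) if d > 0)
    neg = -sum(r for r, d in zip(ranks, diffs) if d < 0)
    return len(diffs), pos, neg, min(abs(pos), abs(neg))

# Hypothetical data (NOT the slide's table), built to give the same signed ranks.
before = [2, 3, 5, 6, 7, 8, 9, 10, 4, 4, 4, 4]
after = [3, 4, 2, 2, 2, 2, 2, 2, 4, 4, 4, 4]
n, pos, neg, ws = wilcoxon_signed_rank(before, after)
print(n, pos, neg, ws)
```

With these stand-in values the sketch gives n = 8, rank sums of 33 and −3, and ws = 3, the same figures as the worked example.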
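For the Wilcoxon rank sum test, both sample sizes must be at least 10. Then n1 represents the size of the smaller sample and n2 the size of the larger sample. When the samples are the same size, it does not matter which is n1.
The test statistic is
$$z = \frac{R - \mu_R}{\sigma_R}$$
where R is the sum of the ranks for the smaller sample,
$$\mu_R = \frac{n_1(n_1 + n_2 + 1)}{2} \quad \text{and} \quad \sigma_R = \sqrt{\frac{n_1 n_2 (n_1 + n_2 + 1)}{12}}$$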
Section 11.3
H0: There is no difference in the population distributions. Ha: There is a difference in the population distributions.
Combine the data and rank the values. Then separate the data according to sample and find the sum of the ranks for each sample.
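The test statistic is
$$H = \frac{12}{N(N+1)} \left( \frac{R_1^2}{n_1} + \frac{R_2^2}{n_2} + \cdots + \frac{R_k^2}{n_k} \right) - 3(N+1)$$
where k represents the number of samples, ni is the size of the i th sample, N is the sum of the sample sizes, and Ri is the sum of the ranks of the i th sample.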
The sampling distribution is a chi-square distribution with k − 1 degrees of freedom (where k = the number of samples). Reject the null hypothesis when H is greater than the critical value. (Always use a right-tailed test.)
Application
You want to compare the hourly pay rates of accountants who work in Michigan, New York, and Virginia. To do so, you randomly select 10 accountants in each state and record their hourly pay rates as shown below. At α = 0.01, can you conclude that the distributions of accountants' hourly pay rates in these three states are different?
MI(1)  14.24 14.06 14.85 17.47 14.83 19.01 13.08 15.94 13.48 16.94
NY(2)  21.18 20.94 16.26 21.03 19.95 17.54 14.89 18.88 20.06 21.81
VA(3)  17.02 20.63 17.47 15.54 15.38 14.90 20.48 18.50 12.80 15.57
1. Write the null and alternative hypotheses.
H0: There is no difference in the hourly pay rate in the 3 states.
Ha: There is a difference in the hourly pay rate in the 3 states.
2. State the level of significance. α = 0.01
3. Determine the sampling distribution.
4. Find the critical value.
5. Find the rejection region.
The sampling distribution is chi-square (χ²) with d.f. = 3 − 1 = 2. From Table 6, the critical value is 9.210, so the rejection region is H > 9.210.
Test Statistic
Data    State   Rank
12.80   VA      1
13.08   MI      2
13.48   MI      3
14.06   MI      4
14.24   MI      5
14.83   MI      6
14.85   MI      7
14.89   NY      8
14.90   VA      9
15.38   VA      10
15.54   VA      11
15.57   VA      12
15.94   MI      13
16.26   NY      14
16.94   MI      15
17.02   VA      16
17.47   MI      17.5
17.47   VA      17.5
17.54   NY      19
18.50   VA      20
18.88   NY      21
19.01   MI      22
19.95   NY      23
20.06   NY      24
20.48   VA      25
20.63   VA      26
20.94   NY      27
21.03   NY      28
21.18   NY      29
21.81   NY      30
Michigan salaries are in ranks: 2, 3, 4, 5, 6, 7, 13, 15, 17.5, 22. The sum is 94.5.
New York salaries are in ranks: 8, 14, 19, 21, 23, 24, 27, 28, 29, 30. The sum is 223.
Virginia salaries are in ranks: 1, 9, 10, 11, 12, 16, 17.5, 20, 25, 26. The sum is 147.5.
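With N = 30 and n1 = n2 = n3 = 10, the test statistic is
$$H = \frac{12}{30(31)} \left( \frac{94.5^2}{10} + \frac{223^2}{10} + \frac{147.5^2}{10} \right) - 3(31) \approx 10.76$$
Because H ≈ 10.76 is greater than the critical value 9.210, it falls in the rejection region, so reject the null hypothesis. At α = 0.01, there is enough evidence to conclude that the distributions of accountants' hourly pay rates in the three states are different.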
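As a check on the ranking and the arithmetic, here is a minimal Python sketch that pools the three samples, ranks them with tied values sharing the average rank, and evaluates the H statistic (commonly called the Kruskal-Wallis statistic). The function names are illustrative, and no tie correction is applied, on the assumption that the uncorrected value is the one reported in the example.

```python
def average_ranks(values):
    """Rank values from smallest to largest; ties share the average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1     # average of 1-based positions i+1 .. j+1
        for pos in range(i, j + 1):
            ranks[order[pos]] = avg
        i = j + 1
    return ranks

def h_statistic(samples):
    """Return (rank sums per sample, H) for a list of samples."""
    pooled = [v for s in samples for v in s]
    ranks = average_ranks(pooled)
    total = len(pooled)                       # N, the sum of the sample sizes
    rank_sums, start = [], 0
    for s in samples:
        rank_sums.append(sum(ranks[start:start + len(s)]))
        start += len(s)
    weighted = sum(r * r / len(s) for r, s in zip(rank_sums, samples))
    h = 12 / (total * (total + 1)) * weighted - 3 * (total + 1)
    return rank_sums, h

mi = [14.24, 14.06, 14.85, 17.47, 14.83, 19.01, 13.08, 15.94, 13.48, 16.94]
ny = [21.18, 20.94, 16.26, 21.03, 19.95, 17.54, 14.89, 18.88, 20.06, 21.81]
va = [17.02, 20.63, 17.47, 15.54, 15.38, 14.90, 20.48, 18.50, 12.80, 15.57]

rank_sums, h = h_statistic([mi, ny, va])
print(rank_sums, round(h, 2))
```

The rank sums come out as 94.5, 223, and 147.5, and H rounds to 10.76, which exceeds the critical value of 9.210.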
Section 11.4
Rank Correlation
The Spearman rank correlation coefficient, rs, is a measure of the strength of the relationship between two variables. The Spearman rank correlation coefficient is calculated using the ranks of paired sample data entries. The formula for the Spearman rank correlation coefficient is
$$r_s = 1 - \frac{6 \sum d^2}{n(n^2 - 1)}$$
where n is the number of paired data entries and d is the difference between the ranks of a paired data entry.
The hypotheses:
H0: ρs = 0 (There is no correlation between the variables.)
Ha: ρs ≠ 0 (There is a significant correlation between the variables.)
Rank Correlation
Seven candidates applied for a nursing position. The seven candidates were placed in rank order first by x and then by y. The results of the rankings are listed below. Using a 0.05 level of significance, test the claim that there is a significant correlation between the variables.
Candidate  1  2  3  4  5  6  7
x          2  4  1  5  7  3  6
y          1  4  3  2  6  1  7
Application
Candidate   x   y   d = x − y   d²
1           2   1    1          1
2           4   4    0          0
3           1   3   −2          4
4           5   2    3          9
5           7   6    1          1
6           3   1    2          4
7           6   7   −1          1
Sum of d² = 20
With Σd² = 20 and n = 7, rs = 1 − 6(20)/(7(7² − 1)) = 1 − 120/336 ≈ 0.643. Since the statistic 0.643 does not fall in the rejection region, fail to reject H0. There is not enough evidence to support the claim that there is a significant correlation.
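The statistic can be recomputed from the listed rankings with a few lines of Python. This is a minimal sketch: the function name `spearman_from_ranks` is illustrative, and it assumes the inputs are already ranks, as they are in this example.

```python
def spearman_from_ranks(x, y):
    """Return (sum of d^2, r_s) for two equal-length lists of ranks."""
    n = len(x)
    sum_d2 = sum((a - b) ** 2 for a, b in zip(x, y))
    r_s = 1 - 6 * sum_d2 / (n * (n ** 2 - 1))
    return sum_d2, r_s

# Rankings as listed in the example.
x = [2, 4, 1, 5, 7, 3, 6]
y = [1, 4, 3, 2, 6, 1, 7]

sum_d2, r_s = spearman_from_ranks(x, y)
print(sum_d2, round(r_s, 3))
```

The sketch gives a sum of squared differences of 20 and a coefficient that rounds to 0.643, in agreement with the value used in the conclusion.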