Professional Documents
Culture Documents
Sign Test RANK _SUM -TESTS 1 Mann-Whitney U Test 2 Kruskal-Wallis Test Rank Correlation
Slide 1
Nonparametric Methods
Most of the statistical methods referred to as parametric require the use of interval- or ratio-scaled data. Nonparametric methods are often the only way to analyze nominal or ordinal data and draw statistical conclusions. Nonparametric methods require no assumptions about the population probability distributions. Nonparametric methods are often called distributionfree methods.
Slide 2
Nonparametric Methods
In general, for a statistical method to be classified as nonparametric, it must satisfy at least one of the following conditions. The method can be used with nominal data. The method can be used with ordinal data. The method can be used with interval or ratio data when no assumption can be made about the population probability distribution.
Slide 3
Sign Test
A common application of the sign test involves using a sample of n potential customers to identify a preference for one of two brands of a product. The objective is to determine whether there is a difference in preference between the two items being compared. To record the preference data, we use a plus sign if the individual prefers one brand and a minus sign if the individual prefers the other brand. Because the data are recorded as plus and minus signs, this test is called the sign test.
Slide 4
Slide 5
Hypotheses H0: No preference for one brand over the other exists Ha: A preference for one brand over the other exists Sampling Distribution Sampling distribution of the number of + values if there is no brand preference
2.74
= 15 = .5(30)
Slide 6
When H0 is true p=0.5 q=0.5 = np = 30( 0.5) =15 = npq = 30(0.5)(0.5) = 7.5 = 2.24 x =18 = number preferring Hoppy Peanut Butter In the sample z = (x - )/ follows standard normal distribution
Slide 7
Rejection Rule Using .05 level of significance, Reject H0 if z < -1.96 or z > 1.96 Test Statistic z = (18 - 15)/2.74 = 3/2.74 = 1.095 Conclusion Do not reject H0. There is no difference in preference for the two brands of peanut butter. Fewer than 10 or more than 20 individuals would have to have a preference for a particular brand in order for us to reject H0.
Slide 8
Slide 9
Hypotheses H0: No preference for one brand over the other exists Ha: A preference for one brand over the other exists Sampling Distribution Sampling distribution of the number of + values if there is no brand preference
2.74
= 140 = .5(280)
Slide 10
When H0 is true p=0.5 q=0.5 = np = 280( 0.5) =140 = npq = 280(0.5)(0.5) = 70 = 8.36 x =160 = number preferring Fluoiride Toothpaste In the sample z = (x - )/ follows standard normal distribution
Slide 11
Rejection Rule Using .05 level of significance, Reject H0 if z < -1.96 or z > 1.96 Test Statistic z = (160 - 140)/8.36 = 20/8.36 = 2.39 Conclusion Reject H0. There is sufficient evidence in the sample to conclude that a difference in preference exists for the two brands of Tooth Pastes.
Slide 12
Mann-Whitney-U Test
This test is another nonparametric method for determining whether there is a difference between two populations. This test does not require interval data or the assumption that both populations are normally distributed. The only requirement is that the measurement scale for the data is at least ordinal.
Slide 13
Mann-Whitney-U-Test
Instead of testing for the difference between the means of two populations, this method tests to determine whether the two populations are identical. The hypotheses are:
H0: The two populations are identical Ha: The two populations are not identical
Slide 14
Mann-Whitney-U- Test (Large-Sample Case) Manufacturer labels indicate the annual energy cost associated with operating home appliances such as freezers. The energy costs for a sample of 10 Westin freezers and a sample of 10 Brand-X Freezers are shown on the next slide. Do the data indicate, using a = .05, that a difference exists in the annual energy costs associated with the two brands of freezers?
Slide 15
Mann-Whitney-U- Test (Large-Sample Case) Hypotheses H0: There is no difference between the two populations. So Annual energy costs for Westin freezers and Brand-X freezers are the same. Ha: There is difference between the two populations. In particular they have different means. Annual energy costs differ for the two brands of freezers.
Slide 17
First, rank the combined data from the lowest to the highest values, with tied values being assigned the average of the tied rankings. Then, compute R1, the sum of the ranks for the first sample. Then, compute R2, the sum of the ranks for the second sample. Let n 1 be number of observations in the first sample Let n 2 be number of observations in the second sample
Slide 18
Slide 19
u= n1n1 /2
Standard Deviation
T 1 12 n1n2 (n1 n2 1)
Distribution Form
Approximately normal, provided n1 > 10 and n2 > 10
Slide 20
R 1= 86.5 R 2= 123.5 n1 = 10 n 2 = 10 U ={ (10) ( 10) + 10( 11)/2 ) -86.5 =(100+55) -86.5 =155-86.5 = 68.5
Slide 22
u13.23
u = 50 =(1/2)(10)(10)
Slide 23
Rejection Rule Using .05 level of significance, Reject H0 if z < -1.96 or z > 1.96 Test Statistic z = (U - u )/ u = (68.5 - 50)/13.23 = 1.40 Conclusion Do not reject H0. There is no difference in the annual energy cost associated with the two brands of freezers.
Slide 24
Kruskal-Wallis Test
The Mann-Whitney-U_ test can be used to test whether two populations are identical. The MW-U- test has been extended by Kruskal and Wallis for cases of three or more populations. The Kruskal-Wallis test can be used with ordinal data as well as with interval or ratio data. Also, the Kruskal-Wallis test does not require the assumption of normally distributed populations. The hypotheses are:
H0: All populations are identical Ha: Not all populations are identical
Slide 25
Kruskal-Wallis-Test
Kruskal-Wallis-Test Hypotheses H0: There is no difference between the K populations.. Ha: There is a difference between the K populations. In particular they have different means.
Slide 26
First, rank the combined data from the lowest to the highest values, with tied values being assigned the average of the tied rankings. Then, compute R1, the sum of the ranks for the first sample. Then, compute R2, the sum of the ranks for the second sample. And so on upto p th sample Then , compute Rp , the sum of the ranks for the p th sample. Let n 1 be number of observations in the first sample Let n 2 be number of observations in the second sample Let n p be number of observations in the p th sample
Slide 27
Kruskal-Wallis-Test
Compute Test Statistics Test Statistics is K= {12/n(n+1) } Rj 2 /nj - 3(n+1) K follows a Chi Square Distribution with p-1 d.f when all sample sizes are at least 5 and p is the number of groups
Slide 28
Kruskal-Wallis-Test
A sample of 20 pilot students were divided into 3 groups of 6, 5 and 9.For the purpose of their written tests the first group was trained through use of video cassettes , the second through use of Audio cassettes and the third group through classroom teaching. The scores of the students in the three groups are as follows. Video-74, 88,82,93,55,70 Audoo-78,80,65,57, 89 Classroom-68,83,50,91,84,77,94,81,92 Test the hypothesis that the mean scores of stundent pilots trained by each of these three methods is equal
Slide 29
Video-74(7), 88(15),82(12),93(19),55(2),70(6) Audoo-78(9),80(10),65(4),57(3), 89(16) Classroom68(5),83(13),50(1),91(17),84(14),77(8),94(20),81(11),92(18) R1 = 61 n1 = 6 R2= 42 n2 = 5 R3 = 107 n3 = 9 n = 20 p= 3 K= {12/n(n+1) } Rj 2 /nj - 3(n+1) Test Statistics : K= 1.143 Follows Chi Square distribution with 3-1=2 d.f Table value is 4.605 .Since 1.143 less than 4.065 we accept the hypothesis & conclude that there is no difference in the three teaching methods
Slide 30
Rank Correlation
The Pearson correlation coefficient, r, is a measure of the linear association between two variables for which interval or ratio data are available. The Spearman rank-correlation coefficient, rs , is a measure of association between two variables when only ordinal data are available. Values of rs can range from 1.0 to +1.0, where values near 1.0 indicate a strong positive association between the rankings, and values near -1.0 indicate a strong negative association between the rankings.
Slide 31
Rank Correlation
rs 1
n(n2 1)
6 di2
where: n = number of items being ranked xi = rank of item i with respect to one variable yi = rank of item i with respect to a second variable di = xi - yi
Slide 32
We may want to use sample results to make an inference about the population rank correlation ps. To do so, we must test the hypotheses:
H0: ps = 0 Ha: ps = 0
Slide 33
Rank Correlation
r 0
s
Standard Deviation
1 rs n1
Distribution Form
Approximately normal, provided n > 10
Slide 34
Rank Correlation Connor Investors provides a portfolio management service for its clients. Two of Connors analysts rated ten investments from high (6) to low (1) risk as shown below. Use rank correlation, with a = .10, to comment on the agreement of the two analysts rankings. Investment Analyst #1 Analyst #2 A B 1 4 1 5 C D E 9 8 6 6 2 9 F G H I J 3 5 7 2 10 7 3 10 4 8
Slide 35
Differ. 0 -1 3 6 -3 -4 2 -3 -2 2 Sum =
(Differ.)2 0 1 9 36 9 16 4 9 4 4 92
Slide 36
Hypotheses H0: ps = 0 (No rank correlation exists.) Ha: ps = 0 (Rank correlation exists.) Sampling Distribution Sampling distribution of rs under the assumption of no rank correlation
1 rs .333 10 1
r = 0
rs
Slide 37
Rejection Rule Using .10 level of significance, Reject H0 if z < -1.645 or z > 1.645 Test Statistic
Conclusion Do no reject H0. There is not a significant rank correlation. The two analysts are not showing agreement in their rating of the risk associated with the different investments.
Slide 38
6 di2