
Non-parametric Tests

Learning Objectives
1. Distinguish Parametric and Nonparametric Test Procedures
2. Explain Commonly Used Nonparametric Test Procedures
3. Perform Hypothesis Tests Using Nonparametric Procedures

Introduction
The word parametric comes from parameter, a characteristic of a population. Parametric tests include assumptions about the shape of the population distribution (e.g. that it is normally distributed). Non-parametric techniques do not have such stringent requirements and make no assumptions about the underlying population distribution, which is why they are sometimes referred to as distribution-free tests.

Hypothesis Testing Procedures

Parametric: Z Test, t Test, One-Way ANOVA
Nonparametric: Wilcoxon Rank Sum Test, Kruskal-Wallis H-Test
(Many more tests exist!)

Parametric Test Procedures


1. Involve Population Parameters (e.g. the Mean)
2. Have Stringent Assumptions (e.g. Normality)
3. Examples: Z Test, t Test, F Test

Nonparametric Test Procedures


1. Do Not Involve Population Parameters (Examples: Probability Distributions, Independence)
2. Data Measured on Any Scale (Ratio or Interval, Ordinal or Nominal)
3. Examples: Mann-Whitney, Kruskal-Wallis, Wilcoxon Rank Sum Test, χ² Test

Advantages of Nonparametric Tests


1. Used With All Scales
2. Easier to Compute
3. Make Fewer Assumptions
4. Need Not Involve Population Parameters
5. Results May Be as Exact as Parametric Procedures

Disadvantages of Nonparametric Tests


1. May Waste Information (the parametric model is more efficient if the data permit)
2. Difficult to Compute by Hand for Large Samples
3. Tables Not Widely Available

Popular Nonparametric Tests


1. Mann-Whitney Rank Sum Test
2. Sign Test
3. Wilcoxon Test
4. Kruskal-Wallis Test
5. Friedman Test
6. Spearman's Rank Correlation
7. Kolmogorov-Smirnov Test
8. Chi-square Test for Independence

Nonparametric Tests vs. Parametric Tests

Comparison               Non-parametric Test             Parametric Equivalent
2 Independent groups     Mann-Whitney Rank Sum Test      Independent t-test
2 Matched/Related        Sign Test, Wilcoxon Test        Paired samples t-test
>2 Independent groups    Kruskal-Wallis Test             One-way ANOVA
Two-way                  Friedman Test                   Two-way ANOVA
Correlation              Spearman's Rank Correlation     Pearson's Correlation
Distribution             Kolmogorov-Smirnov Test         None
Independence             Chi-square for Independence     None

Assumptions for non-parametric techniques


Random samples.

Independent observations. Each person or case can be counted only once; they cannot appear in more than one category or group, and the data from one subject cannot influence the data from another. The exception to this is the repeated measures techniques (Wilcoxon Signed Rank Test, Friedman Test), where the same subjects are retested on different occasions or under different conditions.

Some of the techniques discussed in this lecture have additional assumptions that are noted in the relevant sections below.

1. Chi-square
There are two different types of chi-square tests, both involving categorical data:

1. The chi-square test for goodness of fit (also referred to as the one-sample chi-square) explores the proportion of cases that fall into the various categories of a single variable, and compares these with hypothesised values.
2. The chi-square test for independence is used to determine whether two categorical variables are related. It compares the frequency of cases found in the various categories of one variable across the different categories of another variable. For example: is the proportion of smokers to non-smokers the same for males and females?

1. Chi-square test for independence


This test is used when you wish to explore the relationship between two categorical variables. Each of these variables can have two or more categories.

Summary for chi-square


Example of research question: there are a variety of ways questions can be phrased: Are males more likely to be smokers than females? Is the proportion of males that smoke the same as the proportion of females? Is there a relationship between gender and smoking behaviour?

What you need: two categorical variables, with two or more categories in each, for example: Gender (male/female) and Smoker (yes/no).

Cont.
Additional assumptions: the lowest expected frequency in any cell should be 5 or more. Some authors suggest a less stringent criterion: at least 80 per cent of cells should have expected frequencies of 5 or more. If you have a 1 by 2 or a 2 by 2 table, it is recommended that the expected frequency be at least 10. If you have a 2 by 2 table that violates this assumption, you should consider using Fisher's Exact Probability Test instead (also provided as part of the output from chi-square).

Procedure for chi-square


1. From the menu at the top of the screen click on: Analyze, then click on Descriptive Statistics, then on Crosstabs.
2. Click on one of your variables (e.g. sex) to be your row variable, and click on the arrow to move it into the box marked Row(s).
3. Click on the other variable to be your column variable (e.g. smoker), and click on the arrow to move it into the box marked Column(s).
4. Click on the Statistics button. Choose Chi-square. Click on Continue.
5. Click on the Cells button.
6. In the Counts box, click on the Observed and Expected boxes. (A Python sketch follows.)
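For readers working outside SPSS, here is a minimal Python sketch of the same chi-square test for independence using scipy; the DataFrame and its sex/smoker values are hypothetical stand-ins for your own data.

```python
# Minimal sketch of a chi-square test for independence with scipy.
# The DataFrame and its sex/smoker values are hypothetical examples.
import pandas as pd
from scipy.stats import chi2_contingency

df = pd.DataFrame({
    "sex":    ["male", "male", "female", "female", "male", "female"],
    "smoker": ["yes",  "no",   "no",     "yes",    "no",   "no"],
})

observed = pd.crosstab(df["sex"], df["smoker"])      # observed counts (SPSS Crosstabs)
chi2, p, dof, expected = chi2_contingency(observed)  # Yates correction applied for 2x2
print(observed)
print(f"chi-square = {chi2:.3f}, df = {dof}, p = {p:.3f}")
```

Note that chi2_contingency applies the Yates continuity correction by default for 2 by 2 tables, which matches the Continuity Correction row discussed in the interpretation section below.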

The output

Cont.

Interpretation of output from chi-square


Assumptions: the first thing you should check is whether you have violated one of the assumptions of chi-square concerning the minimum expected cell frequency, which should be 5 or greater (or at least 80 per cent of cells should have expected frequencies of 5 or more). This information is given in a footnote below the final table (labelled Chi-Square Tests). Footnote b in the example provided indicates that 0 cells (.0%) have an expected count less than 5. This means that we have not violated the assumption, as all our expected cell sizes are greater than 5 (in our case greater than 35.87).

Cont.
Chi-square tests: the main value that you are interested in from the output is the Pearson chi-square value. If you have a 2 by 2 table (i.e. each variable has only two categories), then you should use the value in the second row (Continuity Correction). In the example presented above the corrected value is .337, with an associated significance level of .56 (Asymp. Sig. (2-sided)). In this case the value of .56 is larger than the alpha value of .05, so we can conclude that our result is not significant. This means that the proportion of males that smoke is not significantly different from the proportion of females that smoke.

Summary information
To find what percentage of each sex smoke you will need to look at the summary information provided in the table labelled SEX*SMOKE Crosstabulation. This table may look a little confusing to start with, with a fair bit of information presented in each cell. To find out what percentage of males are smokers you need to read across the page in the first row, which refers to males. In this case we look at the values next to % within sex. For this example, 17.9 per cent of males were smokers, while 82.1 per cent were non-smokers. For females, 20.6 per cent were smokers and 79.4 per cent non-smokers. If we wanted to know what percentage of the sample as a whole smoked, we would move down to the Total row, which summarises across both sexes, and look at the values reported in that row.

2. The Chi-Square Test for Goodness-of-Fit


The chi-square test for goodness-of-fit uses frequency data from a sample to test hypotheses about the shape or proportions of a population. Each individual in the sample is classified into one category on the scale of measurement. The data, called observed frequencies, simply count how many individuals from the sample are in each category.


The Chi-Square Test for Goodness-of-Fit (cont.)


The null hypothesis specifies the proportion of the population that should be in each category. The proportions from the null hypothesis are used to compute expected frequencies that describe how the sample would appear if it were in perfect agreement with the null hypothesis.


Procedure for the chi-square goodness-of-fit test
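As an illustration outside SPSS, here is a minimal Python sketch of a goodness-of-fit test; the soda-preference counts are hypothetical, and the null hypothesis is that the three categories are equally preferred.

```python
# Minimal goodness-of-fit sketch with scipy; the counts are hypothetical.
from scipy.stats import chisquare

observed = [28, 15, 12]                 # observed frequencies per category
expected = [sum(observed) / 3] * 3      # H0: all three categories equally preferred

stat, p = chisquare(f_obs=observed, f_exp=expected)
print(f"chi-square = {stat:.2f}, p = {p:.3f}")
```

By default chisquare assumes equal expected frequencies, so f_exp could be omitted here; it is spelled out to show where hypothesised proportions would go.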

3. Mann-Whitney U Test
This technique is used to test for differences between two independent groups on a continuous measure. For example, do males and females differ in terms of their self-esteem? This test is the non-parametric alternative to the t-test for independent samples. The Mann-Whitney U Test compares medians. It converts the scores on the continuous variable to ranks across the two groups, then evaluates whether the ranks for the two groups differ significantly. As the scores are converted to ranks, the actual distribution of the scores does not matter.

Summary for Mann-Whitney U Test


Example of research question: do males and females differ in terms of their levels of self-esteem? Do males have higher levels of self-esteem than females?

What you need: two variables: one categorical variable with two groups (e.g. sex), and one continuous variable (e.g. total self-esteem).

Assumptions: the general assumptions for non-parametric techniques presented at the beginning of this presentation.

Parametric alternative: independent-samples t-test.

Procedure for Mann-Whitney U Test


1. From the menu at the top of the screen click on: Analyze, then click on Nonparametric Tests, then on 2 Independent Samples.
2. Click on your continuous (dependent) variable (e.g. total self-esteem) and move it into the Test Variable List box.
3. Click on your categorical (independent) variable (e.g. sex) and move it into the Grouping Variable box.
4. Click on the Define Groups button. Type in the value for Group 1 (e.g. 1) and for Group 2 (e.g. 2). These are the values that were used to code your values for this variable (see your codebook). Click on Continue.
5. Make sure that the Mann-Whitney U box is ticked under the section labelled Test Type. Click on OK. (A Python sketch follows.)
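For comparison, here is a minimal scipy sketch of the same test; the two score arrays are made-up stand-ins for the self-esteem scores of each group.

```python
# Hypothetical Mann-Whitney U test with scipy; the scores are made up.
from scipy.stats import mannwhitneyu

esteem_males   = [32, 28, 35, 30, 27, 33]   # total self-esteem, group 1
esteem_females = [29, 31, 26, 34, 28, 25]   # total self-esteem, group 2

u, p = mannwhitneyu(esteem_males, esteem_females, alternative="two-sided")
print(f"U = {u}, p = {p:.3f}")
```

scipy reports the U statistic rather than the Z approximation that SPSS prints, but the p-value is interpreted in the same way.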

The output

Interpretation of output from Mann-Whitney U Test


The two values that you need to look at in your output are the Z value and the significance level, which is given as Asymp. Sig. (2-tailed). If your sample size is larger than 30, SPSS will give you the value for a Z-approximation test, which includes a correction for ties in the data. In the example given above, the Z value is 1.23 (rounded) with a significance level of p = .22. The probability value (p) is not less than or equal to .05, so the result is not significant. There is no statistically significant difference in the self-esteem scores of males and females.

4. Wilcoxon Signed Rank Test


The Wilcoxon Signed Rank Test (also referred to as the Wilcoxon matched pairs signed ranks test) is designed for use with repeated measures: that is, when your subjects are measured on two occasions, or under two different conditions. It is the non-parametric alternative to the repeated measures t-test, but instead of comparing means the Wilcoxon converts scores to ranks and compares them at Time 1 and at Time 2. The Wilcoxon can also be used in situations involving a matched subject design, where subjects are matched on specific criteria.

Summary for Wilcoxon Signed Rank Test


Example of research question: is there a change in the scores on the Fear of Statistics test from Time 1 to Time 2?

What you need: one group of subjects measured on the same continuous scale or criterion on two different occasions. The variables involved are scores at Time 1 or Condition 1, and scores at Time 2 or Condition 2.

Assumptions: see the general assumptions for non-parametric techniques presented at the beginning of this presentation.

Parametric alternative: paired-samples t-test.

Procedure for Wilcoxon Signed Rank Test


1. From the menu at the top of the screen click on: Analyze, then click on Nonparametric Tests, then on 2 Related Samples.
2. Click on the variables that represent the scores at Time 1 and at Time 2 (e.g. fost1, fost2). Move these into the Test Pairs List box.
3. Make sure that the Wilcoxon box is ticked in the Test Type section. Click on OK. (A Python sketch follows.)
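An equivalent sketch in Python with scipy, assuming paired arrays of Fear of Statistics scores; the fost1/fost2 names echo the slides and the values are hypothetical.

```python
# Hypothetical Wilcoxon signed rank test with scipy.
from scipy.stats import wilcoxon

fost1 = [40, 38, 44, 41, 39, 42]   # Fear of Statistics scores, Time 1 (made up)
fost2 = [36, 35, 40, 38, 37, 39]   # Fear of Statistics scores, Time 2 (made up)

stat, p = wilcoxon(fost1, fost2)
print(f"W = {stat}, p = {p:.3f}")
```

As with the SPSS output, a p-value at or below .05 indicates a statistically significant change between the two occasions.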

The output

Interpretation of output from Wilcoxon Signed Rank Test


The two things to look for in the output are the Z value and the associated significance level, presented as Asymp. Sig. (2-tailed). If the significance level is equal to or less than .05 (e.g. .04, .01, .001) then you can conclude that the difference between the two scores is statistically significant. In this example the Sig. value is .000 (which really means less than .0005). Therefore we can conclude that the two sets of scores are significantly different.

5. Kruskal-Wallis Test
The Kruskal-Wallis Test (sometimes referred to as the Kruskal-Wallis H Test) is the non-parametric alternative to a one-way between-groups analysis of variance. It allows you to compare the scores on some continuous variable for three or more groups. It is similar in nature to the Mann-Whitney test presented earlier, but it allows you to compare more than just two groups. Scores are converted to ranks and the mean rank for each group is compared. This is a between-groups analysis, so different people must be in each of the different groups.

Summary for Kruskal-Wallis Test


Example of research question: is there a difference in optimism levels across three age levels?

What you need: two variables: one categorical independent variable with three or more categories (e.g. agegp3: 18-29, 30-44, 45+), and one continuous dependent variable (e.g. total optimism).

Assumptions: see the general assumptions for non-parametric techniques presented at the beginning of this presentation.

Parametric alternative: one-way between-groups analysis of variance.

Procedure for Kruskal-Wallis Test


1. From the menu at the top of the screen click on: Analyze, then click on Nonparametric Tests, then on K Independent Samples.
2. Click on your continuous (dependent) variable (e.g. total optimism) and move it into the Test Variable List box.
3. Click on your categorical (independent) variable (e.g. agegp3) and move it into the Grouping Variable box.
4. Click on the Define Range button. Type in the first value of your categorical variable (e.g. 1) in the Minimum box. Type the largest value for your categorical variable (e.g. 3) in the Maximum box. Click on Continue. (A Python sketch follows.)
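A minimal scipy version, assuming three hypothetical lists of total optimism scores for the age groups 18-29, 30-44 and 45+:

```python
# Hypothetical Kruskal-Wallis H test with scipy.
from scipy.stats import kruskal

optimism_18_29  = [20, 22, 19, 24, 21]   # made-up scores, 18-29
optimism_30_44  = [23, 25, 21, 26, 22]   # made-up scores, 30-44
optimism_45plus = [27, 24, 28, 26, 25]   # made-up scores, 45+

h, p = kruskal(optimism_18_29, optimism_30_44, optimism_45plus)
print(f"H = {h:.2f}, p = {p:.3f}")
```

kruskal reports the H statistic, which corresponds to the Chi-Square value in the SPSS output; compare the p-value with .05 as described below.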

The output

Interpretation of output from Kruskal-Wallis Test


The main pieces of information you need from this output are the Chi-Square value, the degrees of freedom (df) and the significance level (presented as Asymp. Sig.). If this significance level is less than .05 (e.g. .04, .01, .001), then you can conclude that there is a statistically significant difference in your continuous variable across the three groups. You can then inspect the Mean Rank for the three groups presented in your first output table. This will tell you which of the groups had the highest overall ranking, which corresponds to the highest score on your continuous variable. In the output presented above the significance level was .01 (rounded). This is less than the alpha level of .05, so these results suggest that there is a difference in optimism levels across the different age groups. An inspection of the mean ranks for the groups suggests that the older group (45+) had the highest optimism scores.

6. Friedman Test
The Friedman Test is the non-parametric alternative to the one-way repeated measures analysis of variance. It is used when you take the same sample of subjects or cases and measure them at three or more points in time, or under three different conditions.

Summary for Friedman Test


Example of research question: is there a change in Fear of Statistics scores across three time periods (pre-intervention, post-intervention and at follow-up)?

What you need: one sample of subjects, measured on the same scale at three different time periods, or under three different conditions.

Assumptions: see the general assumptions for non-parametric techniques.

Parametric alternative: one-way repeated measures analysis of variance.

Procedure for Friedman Test


1. From the menu at the top of the screen click on: Analyze, then click on Nonparametric Tests, then on K Related Samples.
2. Click on the variables that represent the three measurements (e.g. fost1, fost2, fost3).
3. In the Test Type section check that the Friedman option is selected. Click on OK. (A Python sketch follows.)
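A scipy sketch with three hypothetical repeated measurements, again reusing the fost variable names from the slides:

```python
# Hypothetical Friedman test with scipy.
from scipy.stats import friedmanchisquare

fost1 = [40, 38, 44, 41, 39]   # Time 1 scores (made up)
fost2 = [36, 35, 40, 38, 37]   # Time 2 scores (made up)
fost3 = [33, 32, 37, 35, 34]   # Time 3 scores (made up)

stat, p = friedmanchisquare(fost1, fost2, fost3)
print(f"chi-square = {stat:.2f}, p = {p:.3f}")
```

The chi-square statistic and p-value returned here correspond to the values in the SPSS Test Statistics table discussed below.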

The output

Interpretation of output from Friedman Test


The results of this test suggest that there are significant differences in the Fear of Statistics scores across the three time periods. This is indicated by a Sig. level of .000 (which really means less than .0005). Comparing the ranks for the three sets of scores, it appears that there was a steady decrease in Fear of Statistics scores over time.

Reporting Statistics in APA Style


The following examples illustrate how to report statistics in the text of a research report. You will note that significance levels in journal articles, especially in tables, are often reported as either "p > .05," "p < .05," "p < .01," or "p < .001." APA style dictates reporting the exact p value within the text of a manuscript (unless the p value is less than .001). Please pay attention to issues of italics and spacing; APA style is very precise about these. Also, with the exception of some p values, most statistics should be rounded to two decimal places.

EXAMPLES
Mean and standard deviation are most clearly presented in parentheses: The sample as a whole was relatively young (M = 19.22, SD = 3.45). The average age of students was 19.22 years (SD = 3.45).

Percentages are also most clearly displayed in parentheses, with no decimal places: Nearly half (49%) of the sample was married.

CONT.
Reporting a significant single sample t-test (μ ≠ μ0): Students taking statistics courses in psychology at the University of Washington reported studying more hours for tests (M = 121, SD = 14.2) than did UW college students in general, t(33) = 2.10, p = .034.

Reporting a significant t-test for dependent groups (μ1 ≠ μ2): Results indicate a significant preference for pecan pie (M = 3.45, SD = 1.11) over cherry pie (M = 3.00, SD = .80), t(15) = 4.00, p = .001.

CONT.
Reporting a significant t-test for independent groups (μ1 ≠ μ2): UW students taking statistics courses in Psychology had higher IQ scores (M = 121, SD = 14.2) than did those taking statistics courses in Statistics (M = 117, SD = 10.3), t(44) = 1.23, p = .09. Over a two-day period, participants drank significantly fewer drinks in the experimental group (M = 0.667, SD = 1.15) than did those in the wait-list control group (M = 8.00, SD = 2.00), t(4) = -5.51, p = .005.

CONT.
Reporting a significant omnibus F test for a one-way ANOVA: An analysis of variance showed that the effect of noise was significant, F(3, 27) = 5.94, p = .007. Post hoc analyses using the Scheffé post hoc criterion for significance indicated that the average number of errors was significantly lower in the white noise condition (M = 12.4, SD = 2.26) than in the other two noise conditions (traffic and industrial) combined (M = 13.62, SD = 5.56), F(3, 27) = 7.77, p = .042.

CONT.
Reporting the results of a chi-square test of independence: A chi-square test of independence was performed to examine the relation between religion and college interest. The relation between these variables was significant, χ²(2, N = 170) = 14.14, p < .01. Catholic teens were less likely to show an interest in attending college than were Protestant teens.

Reporting the results of a chi-square test of goodness of fit: A chi-square test of goodness-of-fit was performed to determine whether the three sodas were equally preferred. Preference for the three sodas was not equally distributed in the population, χ²(2, N = 55) =

Cont.
Regression results are often best presented in a table. APA doesn't say much about how to report regression results in the text, but if you would like to report the regression in the text of your Results section, you should at least present the unstandardized or standardized slope (beta), whichever is more interpretable given the data, along with the t-test and the corresponding significance level. (Degrees of freedom for the t-test is N - k - 1, where k equals the number of predictor variables.) It is also customary to report the percentage of variance explained along with the corresponding F test. Social support significantly predicted depression scores, b = -.34, t(225) = 6.53, p < .001. Social support also explained a significant proportion of variance in depression scores.

Cont.
Correlations are reported with the degrees of freedom (which is N - 2) in parentheses and the significance level: The two variables were strongly correlated, r(55) = .49, p < .01.

Tables are useful if you find that a paragraph has almost as many numbers as words. If you do use a table, do not also report the same information in the text; it's either one or the other.

Based on: American Psychological Association. (2010). Publication manual of the American Psychological Association (6th ed.). Washington, DC: Author.

THE END
