Learning Objectives
1. Distinguish Parametric & Nonparametric Test Procedures
2. Explain commonly used Nonparametric Test Procedures
Introduction
The word parametric comes from parameter, a
characteristic of a population. Parametric tests include assumptions about the shape of the population distribution (e.g. that it is normally distributed). Non-parametric techniques do not have such stringent requirements and do not make assumptions about the underlying population distribution, which is why they are sometimes referred to as distribution-free tests.
Parametric Test Procedures
1. Involve Population Parameters
2. Require Data Measured on an Interval or Ratio Scale
3. Examples: Z Test, t Test, One-Way ANOVA

Nonparametric Test Procedures
1. Do Not Involve Population Parameters
2. Data Measured on Any Scale (Ratio or Interval, Ordinal or Nominal)
3. Examples: Mann-Whitney, Kruskal-Wallis H-Test, Wilcoxon Rank Sum Test, χ² Test

Advantages of Nonparametric Tests
1. Can Be Used With All Scales
2. Easier to Compute
3. Make Fewer Assumptions
4. Need Not Involve Population Parameters
5. Results May Be as Exact as Parametric Procedures
Nonparametric Test            | Parametric Equivalent
Mann-Whitney Rank Sum Test    | Independent t-test
Sign Test                     | Paired samples t-test
Wilcoxon Test                 | Paired samples t-test
Kruskal-Wallis Test           | One-way ANOVA
Friedman Test                 | Two-way ANOVA
Spearman's Rank Correlation   | Pearson's Correlation
Kolmogorov-Smirnov            | None
Chi-square for independence   | None
General assumptions: random sampling, and independence of observations: each subject or case can be counted only once, they cannot appear in more than one category or group, and the data from one subject cannot influence the data from another. The exception to this is the repeated measures techniques (Wilcoxon Signed Rank Test, Friedman Test), where the same subjects are retested on different occasions or under different conditions. Some of the techniques discussed in this lecture have additional assumptions that should be checked before the analysis.
1. Chi-square
There are two different types of chi-square test: the chi-square test for goodness of fit and the chi-square test for independence.
For the test for independence, there are a variety of ways questions can be phrased: Are males more likely to be smokers than females? Is the proportion of males that smoke the same as the proportion of females? Is there a relationship between gender and smoking behaviour? What you need: two categorical variables, with two or more categories in each, for example: Gender (Male/Female); and Smoker (Yes/No).
Cont.
Additional assumptions:
The lowest expected frequency in any cell should be 5 or more. Some authors suggest a less stringent criterion: at least 80 per cent of cells should have expected frequencies of 5 or more. If you have a 1 by 2 or a 2 by 2 table, it is recommended that the expected frequency be at least 10. If you have a 2 by 2 table that violates this assumption you should consider using Fisher's Exact Probability Test instead (also provided as part of the output from chi-square).
The output
Cont.
The first thing you should check is whether you have violated one of the assumptions of chi-square concerning the minimum expected cell frequency, which should be 5 or greater (or at least 80 per cent of cells should have expected frequencies of 5 or more). This information is given in a footnote below the final table (labelled Chi-Square Tests). Footnote b in the example provided indicates that 0 cells (.0%) have expected count less than 5. This means that we have not violated the assumption, as all our expected cell sizes are greater than 5 (in our case, greater than 35.87).
Cont.
Chi-square tests: The main value that you are interested in from the
output is the Pearson chi-square value. If you have a 2 by 2 table (i.e. each variable has only two categories), then you should instead use the value in the second row (Continuity Correction). In the example presented above the corrected value is .337, with an associated significance level of .56 (Asymp. Sig. (2-sided)). In this case the value of .56 is larger than the alpha value of .05, so we can conclude that our result is not significant. This means that the proportion of males that smoke is not significantly different from the proportion of females that smoke.
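The 2 by 2 case described above can be sketched with SciPy (assumed available here), whose `chi2_contingency` applies the same Yates continuity correction reported in the "Continuity Correction" row of the SPSS output. The counts below are hypothetical, not the smoking data from the example:

```python
from scipy.stats import chi2_contingency

# Hypothetical 2x2 crosstabulation: rows = sex (male, female),
# columns = smoker (yes, no). Not the data from the lecture example.
observed = [[33, 151],
            [35, 135]]

# correction=True (the default) applies the Yates continuity correction,
# the value used for 2 by 2 tables.
chi2, p, dof, expected = chi2_contingency(observed, correction=True)

print(f"chi2({dof}) = {chi2:.3f}, p = {p:.3f}")

# Assumption check from the slides: every expected cell frequency
# should be 5 or more.
print("minimum expected count:", expected.min())
```

The `expected` array returned by the function is what the footnote in the SPSS output summarises, so the minimum expected count can be checked directly.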
Summary information
To find what percentage of each sex smoke you will
need to look at the summary information provided in the table labelled SEX*SMOKE Crosstabulation. This table may look a little confusing at first, with a fair bit of information presented in each cell. To find out what percentage of males are smokers you need to read across the page in the first row, which refers to males. In this case we look at the values next to "% within sex". For this example, 17.9 per cent of males were smokers, while 82.1 per cent were non-smokers. For females, 20.6 per cent were smokers and 79.4 per cent non-smokers. If we wanted to know what percentage of the sample as a whole smoked, we would move down to the Total row, which summarises across both sexes. In this case we would look at the percentages given in the Total row.
2. Chi-square Test for Goodness of Fit
The chi-square test for goodness of fit uses frequency data from a sample to test hypotheses about the shape or proportions of a population. Each individual in the sample is classified into one category on the scale of measurement. The data, called observed frequencies, simply count how many individuals from the sample are in each category.
The null hypothesis specifies the proportion (or percentage) of the population that should be in each category. The proportions from the null hypothesis are used to compute expected frequencies that describe how the sample would appear if it were in perfect agreement with the null hypothesis.
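A goodness-of-fit calculation of this kind can be sketched with SciPy (assumed available). The observed counts below are hypothetical; with no expected proportions supplied, the null hypothesis is equal frequencies across the categories:

```python
from scipy.stats import chisquare

# Observed frequencies: how many sampled individuals fell into each
# of three categories (hypothetical counts, total N = 60).
observed = [28, 17, 15]

# With no f_exp argument, chisquare tests against equal expected
# frequencies: each category is expected to hold 60 / 3 = 20 cases.
stat, p = chisquare(observed)

print(f"chi2(2) = {stat:.2f}, p = {p:.3f}")
```

Passing an `f_exp` array instead lets the null hypothesis specify unequal population proportions, exactly as described above.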
3. Mann-Whitney U Test
This technique is used to test for differences
between two independent groups on a continuous measure. For example, do males and females differ in terms of their self-esteem? This test is the non-parametric alternative to the t-test for independent samples. The Mann-Whitney U Test compares medians. It converts the scores on the continuous variable to ranks across the two groups, then evaluates whether the ranks for the two groups differ significantly. As the scores are converted to ranks, the actual distribution of the scores does not matter.
What you need: one categorical variable with two groups (e.g. sex); and one continuous variable (e.g. total self-esteem).
Assumptions: the general assumptions for non-
parametric techniques presented at the beginning of this presentation. Parametric alternative: Independent-samples t-test.
The output
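A minimal sketch of the test with SciPy (assumed available); the self-esteem scores for the two groups below are made up for illustration:

```python
from scipy.stats import mannwhitneyu

# Hypothetical total self-esteem scores for two independent groups.
males   = [32, 35, 28, 40, 37, 31, 29]
females = [30, 27, 25, 33, 26, 28, 24]

# Scores are ranked across both groups together, and the rank sums are
# compared; alternative='two-sided' mirrors the usual SPSS output.
u, p = mannwhitneyu(males, females, alternative="two-sided")

print(f"U = {u}, p = {p:.3f}")
```

A significance level at or below .05 would indicate that the two groups differ on the continuous measure.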
4. Wilcoxon Signed Rank Test
The Wilcoxon Signed Rank Test (also referred to as the Wilcoxon matched pairs signed ranks test) is designed for use with repeated measures: that is, when your subjects are measured on two occasions, or under two different conditions. It is the non-parametric alternative to the repeated measures t-test, but instead of comparing means the Wilcoxon converts scores to ranks and compares them at Time 1 and at Time 2. The Wilcoxon can also be used in situations involving a matched subject design, where subjects are matched on specific criteria.
Is there a change in the scores on the Fear of Statistics test from Time 1 to Time 2? What do you need: One group of subjects measured on the same continuous scale or criterion on two different occasions. The variables involved are scores at Time 1 or Condition 1, and scores at Time 2 or Condition 2. Assumptions: See general assumptions for non-parametric techniques presented at the beginning of this presentation. Parametric alternative: Paired-samples t-test.
The output
The two values to look for in the output are the Z value and the associated significance levels, presented as Asymp. Sig. (2-tailed). If the significance level is equal to or less than .05 (e.g. .04, .01, .001) then you can conclude that the difference between the two scores is statistically significant. In this example the Sig. value is .000 (which really means less than .0005). Therefore we can conclude that the two sets of scores are significantly different.
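The Time 1/Time 2 comparison can be sketched with SciPy (assumed available); the Fear of Statistics scores below are hypothetical:

```python
from scipy.stats import wilcoxon

# Hypothetical Fear of Statistics scores for the SAME eight subjects,
# measured at Time 1 and again at Time 2 (repeated measures).
time1 = [40, 38, 44, 41, 39, 45, 42, 37]
time2 = [39, 36, 41, 37, 34, 39, 35, 29]

# The paired differences are ranked by absolute size, and the test
# statistic is the smaller of the positive and negative rank sums.
stat, p = wilcoxon(time1, time2)

print(f"T = {stat}, p = {p:.4f}")
```

Here every subject's score drops from Time 1 to Time 2, so the negative rank sum is zero and the exact two-sided p value is well below .05.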
5. Kruskal-Wallis Test
The Kruskal-Wallis Test (sometimes referred to as
the Kruskal-Wallis H Test) is the non-parametric alternative to a one-way between-groups analysis of variance. It allows you to compare the scores on some continuous variable for three or more groups. It is similar in nature to the Mann-Whitney U Test presented earlier in this presentation, but it allows you to compare more than just two groups. Scores are converted to ranks and the mean rank for each group is compared. This is a between-groups analysis, so different people must be in each of the different groups.
Is there a difference in optimism levels across three age levels? What you need: two variables: one categorical independent variable with three or more categories (e.g. agegp3: 18–29, 30–44, 45+); and one continuous dependent variable (e.g. total optimism). Assumptions: see general assumptions for non-parametric techniques presented at the beginning of this presentation. Parametric alternative: one-way between-groups analysis of variance.
The output
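A sketch with SciPy (assumed available), using hypothetical optimism scores for three age groups:

```python
from scipy.stats import kruskal

# Hypothetical total optimism scores for three independent age groups.
age_18_29 = [20, 22, 19, 24, 21]
age_30_44 = [23, 25, 22, 27, 26]
age_45_up = [28, 26, 29, 31, 27]

# All scores are ranked together across the three groups, and the
# mean rank per group is compared.
h, p = kruskal(age_18_29, age_30_44, age_45_up)

print(f"H(2) = {h:.2f}, p = {p:.3f}")
```

As with the Mann-Whitney test, any number of additional groups can be passed as further arguments, each group containing different people.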
6. Friedman Test
The Friedman Test is the non-parametric
alternative to the one-way repeated measures analysis of variance. It is used when you take the same sample of subjects or cases and you measure them at three or more points in time, or under three different conditions.
Is there a change in Fear of Statistics scores across three time periods (pre-intervention, post-intervention and at follow-up)? What do you need: one sample of subjects, measured on the same scale on three different occasions, or under three different conditions. Assumptions: see general assumptions for non-parametric techniques. Parametric alternative: one-way repeated measures analysis of variance.
The output
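A sketch with SciPy (assumed available), using hypothetical scores for the same subjects on three occasions:

```python
from scipy.stats import friedmanchisquare

# Hypothetical Fear of Statistics scores for the SAME six subjects at
# three time points: pre-intervention, post-intervention, follow-up.
pre       = [42, 40, 45, 38, 44, 41]
post      = [38, 36, 40, 35, 39, 37]
follow_up = [35, 34, 38, 33, 36, 34]

# Scores are ranked WITHIN each subject across the three occasions,
# then the rank sums per occasion are compared.
stat, p = friedmanchisquare(pre, post, follow_up)

print(f"chi2_F(2) = {stat:.2f}, p = {p:.4f}")
```

In this made-up data every subject's score falls at each successive occasion, so the rank ordering is identical for all subjects and the test is clearly significant.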
The following notes cover how to report statistics in the text of a research report. You will note that significance levels in journal articles, especially in tables, are often reported as either "p > .05," "p < .05," "p < .01," or "p < .001." APA style dictates reporting the exact p value within the text of a manuscript (unless the p value is less than .001). Please pay attention to issues of italics and spacing; APA style is very precise about these. Also, with the exception of some p values, most statistics should be rounded to two decimal places.
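The exact-p convention above can be captured in a small helper function. This is an illustrative, hypothetical utility, not an official APA tool:

```python
def format_p(p: float) -> str:
    """Format a p value per APA style: report the exact value to three
    decimal places with no leading zero, except values below .001,
    which are reported as 'p < .001'."""
    if p < 0.001:
        return "p < .001"
    # "0.034" -> ".034": drop the leading zero, since p cannot exceed 1
    # for a probability reported this way.
    return "p = " + f"{p:.3f}".lstrip("0")

print(format_p(0.034))    # exact value in the text
print(format_p(0.0004))   # very small values use the < .001 form
```
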
EXAMPLES
Mean and Standard Deviation are most clearly
presented in parentheses: The sample as a whole was relatively young (M = 19.22, SD = 3.45). The average age of students was 19.22 years (SD = 3.45). Percentages are also most clearly displayed in parentheses, with no decimal places: Nearly half (49%) of the sample was married.
CONT.
Reporting a significant single sample t-test (μ ≠ μ0): Students taking statistics courses in psychology at the University of Washington reported studying more hours for tests (M = 121, SD = 14.2) than did UW college students in general, t(33) = 2.10, p = .034.
Reporting a significant t-test for dependent groups (μ1 ≠ μ2): Results indicate a significant preference for pecan pie (M = 3.45, SD = 1.11) over cherry pie (M = 3.00, SD = .80), t(15) = 4.00, p = .001.
CONT.
Reporting a significant t-test for independent groups (μ1 ≠ μ2): UW students taking statistics courses in Psychology had higher IQ scores (M = 121, SD = 14.2) than did those taking statistics courses in Statistics (M = 117, SD = 10.3), t(44) = 1.23, p = .09. Over a two-day period, participants drank significantly fewer drinks in the experimental group (M = 0.667, SD = 1.15) than did those in the wait-list control group (M = 8.00, SD = 2.00), t(4) = -5.51, p = .005.
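The computation behind such a report can be sketched with SciPy (assumed available); the scores below are hypothetical, not the UW data from the example:

```python
from scipy.stats import ttest_ind

# Hypothetical scores for two independent groups of six subjects each.
group1 = [121, 118, 130, 125, 119, 127]
group2 = [112, 115, 110, 118, 113, 111]

t, p = ttest_ind(group1, group2)

# Degrees of freedom for independent groups: n1 + n2 - 2.
df = len(group1) + len(group2) - 2

# APA-style fragment: statistic to two decimals, exact p value.
print(f"t({df}) = {t:.2f}, p = {p:.3f}")
```
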
CONT.
Reporting a significant omnibus F test for a one-
way ANOVA: An analysis of variance showed that the effect of noise was significant, F(3, 27) = 5.94, p = .007. Post hoc analyses using the Scheffé post hoc criterion for significance indicated that the average number of errors was significantly lower in the white noise condition (M = 12.4, SD = 2.26) than in the other two noise conditions (traffic and industrial) combined (M = 13.62, SD = 5.56), F(3, 27) = 7.77, p = .042.
CONT.
Reporting the results of a chi-square test of
independence: A chi-square test of independence was performed to examine the relation between religion and college interest. The relation between these variables was significant, χ2(2, N = 170) = 14.14, p < .01. Catholic teens were less likely to show an interest in attending college than were Protestant teens. Reporting the results of a chi-square test of goodness of fit: A chi-square test of goodness-of-fit was performed to determine whether the three sodas were equally preferred. Preference for the three sodas was not equally distributed in the population, χ2(2, N = 55) =
Cont.
Regression results are often best presented in a table.
APA doesn't say much about how to report regression results in the text, but if you would like to report the regression in the text of your Results section, you should at least present the unstandardized or standardized slope (beta), whichever is more interpretable given the data, along with the t-test and the corresponding significance level. (Degrees of freedom for the t-test is N - k - 1, where k equals the number of predictor variables.) It is also customary to report the percentage of variance explained along with the corresponding F test. Social support significantly predicted depression scores, b = -.34, t(225) = 6.53, p < .001. Social support also explained a significant proportion of variance in depression scores.
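For a single predictor, the quantities mentioned (slope, t, p, variance explained) can be sketched with SciPy's `linregress` (SciPy assumed available); the data below are hypothetical:

```python
from scipy.stats import linregress

# Hypothetical predictor (social support) and outcome (depression).
support    = [10, 12, 15, 18, 20, 22, 25, 28, 30, 33]
depression = [30, 29, 26, 25, 22, 21, 18, 17, 15, 12]

res = linregress(support, depression)

# Degrees of freedom: N - k - 1, with k = 1 predictor.
df = len(support) - 1 - 1

# The t statistic for the slope is slope / SE(slope); linregress reports
# the corresponding two-sided p value, and r**2 is the variance explained.
t = res.slope / res.stderr
print(f"b = {res.slope:.2f}, t({df}) = {t:.2f}, "
      f"p = {res.pvalue:.3f}, R2 = {res.rvalue ** 2:.2f}")
```

For multiple predictors a dedicated regression package would be needed, with results usually presented in a table as the slide suggests.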
Cont.
Correlations are reported with the degrees of freedom
(which is N-2) in parentheses and the significance level: The two variables were strongly correlated, r(55) = .49, p < .01.
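A sketch of computing r with its APA degrees of freedom (N - 2), using SciPy (assumed available) on hypothetical data:

```python
from scipy.stats import pearsonr

# Hypothetical paired scores on two continuous variables (N = 8).
x = [2, 4, 5, 7, 8, 10, 11, 13]
y = [10, 14, 13, 18, 20, 21, 24, 27]

r, p = pearsonr(x, y)

# APA reports the degrees of freedom, N - 2, in parentheses with r.
df = len(x) - 2

print(f"r({df}) = {r:.2f}, p = {p:.3f}")
```
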
Tables are useful if you find that a paragraph has
almost as many numbers as words. If you do use a table, do not also report the same information in the text; it's either one or the other. Based on: American Psychological Association. (2010). Publication manual of the American Psychological Association (6th ed.). Washington, DC: Author.
THE END