Contents

Chapter 2 Data entry in SPSS
Chapter 3 Exploring data in SPSS
Chapter 4 Data handling
Chapter 5 Tests of difference for two sample designs
Chapter 6 Tests of correlation
Chapter 7 Tests for nominal data
Chapter 8 Analysis of variance
Chapter 9 Multiple regression
Chapter 10 Analysis of covariance and multivariate analysis of variance
Chapter 11 Discriminant analysis and logistic regression
Chapter 12 Factor analysis, and reliability and dimensionality of scales
Glossary
References
Appendix
Index
282 SPSS for Psychologists
IN THIS CHAPTER
An introduction to analysis of covariance
An example
Imagine, for example, you were interested in the speed with which students could
learn how to use different computerised statistical packages (SPSS is not the only
one available!). You might choose three different packages and give these to three
different groups of first-year students who have no prior knowledge of any
statistical package. However, you are aware that the dependent variable here,
speed of learning, might be influenced by other factors, such as familiarity with
computer software. If you measured familiarity with computer software before
exposing students to one of the statistical packages, you could then control for
and remove its effect on the dependent variable so that you can obtain a clearer
insight into the differences in learning speed for the different packages.
We would also expect to see variability in the performance of students within each group,
and because speed of learning is also related to familiarity with computer
software, some of this variability will be directly attributable to differences in this
familiarity. If we ran a test of correlation, we would probably find that there is a
positive correlation so that as familiarity increases so does speed of learning.
ANCOVA examines the association between familiarity and speed and removes
the variance due to this association.
You might be thinking that had we randomly assigned our participants to the
three groups, then probably there would be no systematic differences in
familiarity across each of the three groups. Therefore, the means on this covariate
would not differ too much. This is the ideal situation in which to use ANCOVA,
because it will get rid of the effects of the covariate, reduce error variance and
hence result in a larger F-value. However, it is also possible to use ANCOVA
when the means on the covariate differ significantly. For example, imagine that in
one group, either due to bad luck or poor experimental control, there were more
individuals with greater familiarity with computer software. ANCOVA is now
useful because it will adjust the means on our dependent variable, speed of
learning, to an estimate of what they would have been if the groups had not
differed in level of familiarity. In other words, these adjusted means are our best
guess as to what the means would have been had our three groups not differed
on the covariate.
So, by performing an ANCOVA, we may either:
1. Reduce error variance, or
2. Reduce error variance and adjust the means on the dependent variable.
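The arithmetic behind adjusted means can be illustrated outside SPSS. The sketch below uses made-up familiarity and learning-speed scores (not data from this chapter): each group's mean on the dependent variable is shifted according to how far that group's covariate mean lies from the grand covariate mean, scaled by the pooled within-group regression slope.

```python
# Sketch of how ANCOVA-adjusted means are computed (hypothetical data,
# not taken from the chapter's examples):
#   adjusted_mean_j = mean_y_j - b_w * (mean_x_j - grand_mean_x)
# where b_w is the pooled within-group regression slope of y on x.

def mean(xs):
    return sum(xs) / len(xs)

def pooled_within_slope(groups):
    """Pooled within-group regression slope of the DV on the covariate."""
    sxy = sxx = 0.0
    for x, y in groups:
        mx, my = mean(x), mean(y)
        sxy += sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
        sxx += sum((xi - mx) ** 2 for xi in x)
    return sxy / sxx

def adjusted_means(groups):
    b_w = pooled_within_slope(groups)
    grand_x = mean([xi for x, _ in groups for xi in x])
    return [mean(y) - b_w * (mean(x) - grand_x) for x, y in groups]

# Hypothetical covariate (familiarity) and DV (learning speed) scores
# for three groups, one per statistics package:
groups = [
    ([2, 3, 4, 5], [10, 12, 13, 15]),   # package A
    ([4, 5, 6, 7], [14, 15, 17, 18]),   # package B
    ([1, 2, 3, 4], [ 9, 10, 12, 13]),   # package C
]
print([round(m, 2) for m in adjusted_means(groups)])
```

Notice that the group with the highest covariate mean (package B) has its mean pulled down, and the group with the lowest covariate mean (package C) has its mean pulled up, which is exactly the adjustment described above.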
In Helen's study, participants watched a video and completed a pretest
recognition questionnaire. They then took part in one of three conditions: a
discussion group in which a confederate introduced misinformation, a discussion
group without the confederate, and a control group where there was no discussion. The
participants were then given a posttest questionnaire. Performance on the pretest
recognition questionnaire was the covariate in her analysis.
You might be wondering why Helen did not simply work out the difference
between the pretest and posttest scores (that is, the extent to which
performance changed) and then perform a one-way ANOVA. Calculating difference
scores, however, does not eliminate the variation present in the pretest scores.
These scores vary because participants differ in their ability to remember the
information from the video, owing to factors such as differences in the attention
they directed to it. This variation will not be removed by calculating difference
scores (the pretest scores will normally be correlated with the difference
scores). Helen is not interested in this variation, and by partialling it out, or
removing it, she can focus on the effect of participating in one of the three groups.
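The claim that pretest scores remain correlated with difference scores is easy to verify with a small numerical sketch (the scores below are invented for illustration, not Helen's data):

```python
# Why difference scores do not remove pretest variation: the pretest is
# itself correlated with the pretest-to-posttest difference.
# (Made-up scores, purely illustrative.)

def pearson_r(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

pretest  = [10, 12, 14, 16, 18]              # hypothetical recognition scores
posttest = [12, 13, 15, 16, 17]
diff = [b - a for a, b in zip(pretest, posttest)]

print(round(pearson_r(pretest, diff), 3))    # a strong correlation remains
```

Even after subtracting, the difference scores here are strongly (negatively) correlated with the pretest, so the pretest variation has not been removed; ANCOVA partials it out instead.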
4. A covariate should be measured reliably, that is, if it were measured several
times across a time period, there would be a high correlation between the
scores.
5. The relationship between a covariate and the dependent variable must be
linear (straight-line). You can check this by looking at the scatterplots for
each group. If there is more than one covariate, the covariates should not be
strongly correlated with each other.
6. There should be homogeneity of regression. The relationship between the
dependent variable and the covariate should be similar for all experimental
groups, so that the regression lines are parallel. So, using our first example,
the relationship between learning speed of statistical package and familiarity
with computer software should be similar for each of the three groups tested.
7. In addition, ANCOVA makes the same assumptions as ANOVA (see
Chapter 8, Section 1).
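The homogeneity-of-regression idea in assumption 6 can be checked by hand: fit a separate regression slope of the dependent variable on the covariate within each group and compare them. The sketch below uses invented data (not this chapter's data sets); roughly equal slopes mean roughly parallel regression lines.

```python
# Hand-rolled check of homogeneity of regression (illustrative data):
# compute the least-squares slope of the DV on the covariate within
# each group and compare the slopes across groups.

def slope(x, y):
    """Least-squares slope of y on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

# covariate (familiarity) and DV (learning speed) per group
groups = {
    "package A": ([1, 2, 3, 4], [5.0, 6.9, 9.1, 11.0]),
    "package B": ([1, 2, 3, 4], [7.1, 9.0, 10.9, 13.0]),
    "package C": ([1, 2, 3, 4], [4.0, 6.1, 7.9, 10.0]),
}
slopes = {g: round(slope(x, y), 2) for g, (x, y) in groups.items()}
print(slopes)   # similar slopes -> roughly parallel regression lines
```

SPSS performs the formal version of this check via the group-by-covariate interaction term, as shown later in this chapter; the sketch above simply makes visible what "parallel slopes" means numerically.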
TIP If you have read Chapter 8 or performed ANOVA on SPSS, then you will have
already seen the dialogue boxes that appear next. This is because SPSS has incorporated
ANCOVA as an option in the ANOVA dialogue box.
Analysis of covariance and multivariate analysis of variance 287
1. Click on
Analyze.
2. Select
General Linear
Model.
3. Select
Univariate. You
will then see the
Univariate
dialogue box.
When you click on the Model button, the following Univariate: Model
dialogue box will appear.
9. Select the factor 'group' and click on the arrow under Build Term(s) so that
it appears in the Model box, as shown here. Then repeat with the covariate
'time1'.

10. Select both 'group' and 'time1' (by clicking on one after the other so
that both become highlighted) and click on the arrow so that the interaction
'group*time1' appears in the right-hand box, as it does here. Click on the
Continue button.
Once you click on the Continue button you will return to the Univariate
dialogue box. Now click on the button. The output is shown on the
following pages.
This is the only row that you are interested in. If this
interaction is statistically significant, then the data violate
the assumption of homogeneity of regression slopes. Here
SPSS reports the interaction to be non-significant so this
assumption has not been violated.
TIP Now that we have checked for homogeneity of regression slopes, we can perform
the ANCOVA test. First, however, we show you how to inspect the relationship between
the covariate and the dependent variable graphically, using scatterplots. This procedure
can be used to check that there is a linear relationship between the covariate and the
dependent variable, and also that there is homogeneity of regression.
6. Double click on the scattergram in the SPSS output to bring up the Chart
Editor dialogue box as shown below on the left.
TIP Double clicking on the symbols will bring up a different Properties dialogue box
that allows you to change the appearance of the symbols and lines.
Inspect your graph to see if there is a linear relationship between the covariate
and the dependent variable – if there is no linear relationship then there is no
point in performing ANCOVA. Here, there appears to be a linear relationship.
Remember that the slopes of the regression lines should be roughly parallel, that
is the relationship between the covariate and the dependent variable should be
similar for all three groups (the assumption of homogeneity of regression). This is
important because ANCOVA assumes that the overall relationship between the
dependent variable and the covariate is true for each of the three groups. We
already know that this assumption has not been violated by our earlier check and
this is confirmed here by the fact that the slopes are almost parallel. The R
squared values indicate how strong the relationship is between the dependent
variable and the covariate – a covariate should be correlated with the dependent
variable.
Now that we have checked that there is a linear relationship between the
covariate and the dependent variable and that there is homogeneity of regression,
we can perform the ANCOVA test.
6. In the Specify
Model options, select
Full factorial.
7. Click on the
Continue button.
You will return to the Univariate dialogue box where you can click on the
Options button. This will bring up the Univariate: Options dialogue box,
shown next.
9. Click here to check the
assumption of homogeneity of
variance.
Click on and SPSS will calculate the ANCOVA. The output is shown
next.
Between-Subjects Factors

                   Value  Label                       N
Discussion group   1.00   discussion with stooge      24
                   2.00   discussion without stooge   24
                   3.00   no discussion - writing     24

To check that the assumption of equality of variance was not violated, we
clicked on Homogeneity tests in the Univariate: Options dialogue box. A
significance level greater than 0.05, as here, shows that the data do not
violate the assumption of equality of error variances.

Levene's Test of Equality of Error Variances(a)

Dependent Variable: time2

  F      df1   df2   Sig.
.584      2    69    .560

Tests the null hypothesis that the error variance of the dependent variable is
equal across groups.
a. Design: Intercept+time1+group
The first line highlighted shows that the
covariate is significantly related to the
dependent variable. The next line shows the main
effect of group.
Tests of Between-Subjects Effects
Discussion group
After adjusting for pretest scores, there was a significant effect of the between-
subjects factor group, F(2,68) = 13.86, p < .0005, partial η² = .29. Adjusted mean
recognition scores suggest that discussion with a confederate, who introduced
misinformation, lowered recognition memory compared with discussion without
a confederate or a writing-only control group.
An example
Imagine you wanted to investigate fear of crime. You might want to conduct a
study to see if being a victim of crime influences fear of crime, and compare
several groups: those who have been a victim several times; those who have been
a victim once; and those who have never been a victim. You may then want to
measure a number of different aspects of fear of crime, including objective
measures such as the number of security measures implemented in the home, the
number of times per week participants go out on their own, as well as using a self
report measure. You could perform one-way ANOVAs on each of the different
dependent variables but, as you may remember from Section 8 in Chapter 8, by
performing multiple tests we run an increased risk of making a Type 1 error, that
is, incorrectly rejecting a null hypothesis. In the same way that we use ANOVA
rather than conducting multiple t-tests, we use MANOVA rather than
conducting multiple ANOVAs.
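The inflation of the Type 1 error rate can be made concrete with a line of arithmetic. Treating the tests as independent (a simplifying assumption), the chance of at least one false rejection across k tests at alpha = .05 is 1 - (1 - .05)^k:

```python
# Familywise Type 1 error rate for k independent tests at alpha = .05.
# (Independence is a simplifying assumption used for illustration.)

alpha = 0.05
for k in (1, 3, 10):
    familywise = 1 - (1 - alpha) ** k
    print(k, round(familywise, 3))
```

With three separate ANOVAs the familywise error rate already exceeds 14%, and with ten it passes 40%, which is why a single MANOVA on the combined dependent variable is preferable.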
MANOVA analyses the dependent variables in combination. It will tell you if the mean differences among
groups on the combined dependent variable are larger than expected by chance.
It might be helpful to compare MANOVA with other tests that also combine
variables. In Chapter 9 on multiple regression, a model containing a combination
of predictor variables sought to predict the scores on a criterion variable. Here,
MANOVA does the opposite by seeking to predict an independent variable from
a combination of dependent variables. In the next chapter you will read about
how variables are combined together to predict category membership in a type of
analysis called Discriminant Analysis.
You may remember that for ANOVA the statistic calculated is the F-ratio,
which is the ratio of the variance due to the manipulation of the IV to the error
variance. Conceptually, MANOVA does something similar, but the calculation is
statistically far more complicated, and SPSS will provide you with a choice of
four different statistics, all of which indicate whether there are statistically
significant differences among the levels of the independent variable on the linear
combination of the dependent variables:
Pillai’s Trace
Hotelling’s Trace
Wilks’ Lambda
Roy’s Largest Root
SPSS will report a value for each of these, along with the F tests for each. If your
factor has only two levels, then the F tests reported will be identical. This is
because when the factor has only two levels, and hence one degree of freedom,
there is only one way of combining the different dependent variables to separate
the levels or the groups. However, when your factor has more than two levels,
then the F tests reported for the four test statistics are usually different and it is
possible that some may be significant and some not. Most researchers report the
values for the Wilks’ Lambda, so we suggest you report these too. However,
Pillai’s is considered to be the most robust (although all four are reasonably
robust), so you might consider reporting the values for Pillai’s when your sample
size is small.
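Although SPSS reports them separately, the four statistics are all functions of the eigenvalues of the same matrix (formed from the between-groups and error sums of squares and cross-products, often written E⁻¹H). The sketch below uses two eigenvalues that we have reconstructed to match the 'group' row of the output shown later in this chapter; SPSS does not print the eigenvalues directly, so treat them as illustrative:

```python
# The four multivariate test statistics computed from the eigenvalues
# of the same matrix. The two eigenvalues below are our reconstruction
# from the 'group' row of the Multivariate Tests output shown later in
# this chapter (they are illustrative, not printed by SPSS).

eigs = [8.690, 0.023]

pillai = sum(e / (1 + e) for e in eigs)      # Pillai's Trace
wilks = 1.0
for e in eigs:                               # Wilks' Lambda
    wilks *= 1 / (1 + e)
hotelling = sum(eigs)                        # Hotelling's Trace
roy = max(eigs)                              # Roy's Largest Root

print(round(pillai, 3), round(wilks, 3), round(hotelling, 3), round(roy, 3))
```

With these two eigenvalues the four statistics come out as .919, .101, 8.713 and 8.690, in line with the 'group' row of the Multivariate Tests table reproduced later in the chapter.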
Check, too, that the correlations between your dependent variables do not exceed 0.90. Tabachnick and Fidell (2001) suggest that the
dependent variables should not be correlated with each other and that each
should measure different aspects of the construct you are interested in. Certainly,
if the dependent variables are correlated, and MANOVA shows a significant
result, then it is difficult to tease apart the contribution of each of the individual
dependent variables to this overall effect. Rather than looking at the univariate
ANOVAs, you would need to explore your data using Discriminant Analysis as
this will allow you to explore the relationship between the dependent variables.
We recommend that you perform tests of correlation to check the strength of the
correlations between your dependent variables to help you decide how best to
follow up any significant MANOVA result.
2. There should be a linear relationship between the dependent variables;
you can check this by generating scatterplots for each level of your factor.
(If you are not sure how to generate scatterplots
on SPSS, then see Chapter 6, Section 2.)
3. You must ensure that the number of cases in each cell is greater than the
number of dependent variables.
4. There should be homogeneity of variance–covariance matrices and this is
similar to the assumption of homogeneity of variance that we’ve mentioned
previously in relation to parametric tests. SPSS can check this assumption for
you and we will show you how to do this below.
5. There should be both univariate and multivariate normality of distributions.
Assessment of multivariate normality is difficult in practice; however, you
should at least check that each dependent variable is normally distributed,
that is, that univariate normality holds, as this is likely to reflect multivariate
normality. Giles (2002) points to two ways in which normality may be
violated. The first is platykurtosis and this is evident when the distribution
curve looks like a low plateau. You can check for this by generating
histograms of each dependent variable. The second is the presence of outliers;
these are data points far outside the area covered by the normal distribution
(see Tabachnick and Fidell, 2001, for advice on screening for outliers).
Generally, if you have equal sample sizes and a reasonable number of
participants in each group, and you’ve checked for outliers before conducting
your analysis, then MANOVA will still be a valid test even with modest violations
of these assumptions.
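Giles's first warning sign, platykurtosis, can be screened for numerically as well as by eye. A common index is excess kurtosis (the fourth central moment over the squared second central moment, minus 3), which is clearly negative for flat, plateau-like distributions. The sketch below uses invented scores:

```python
# Simple screen for platykurtosis (illustrative, made-up scores):
# excess kurtosis = m4 / m2**2 - 3, where m2 and m4 are the second and
# fourth central moments. Values well below 0 suggest a flat,
# plateau-like (platykurtic) distribution.

def excess_kurtosis(xs):
    n = len(xs)
    m = sum(xs) / n
    m2 = sum((x - m) ** 2 for x in xs) / n
    m4 = sum((x - m) ** 4 for x in xs) / n
    return m4 / m2 ** 2 - 3

flat_scores = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]   # near-uniform spread
print(round(excess_kurtosis(flat_scores), 3))    # negative -> platykurtic
```

This is a supplement to, not a substitute for, inspecting the histograms of each dependent variable as suggested above.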
1. Click on
Analyze.
2. Select
General Linear
Model.
3. Select
Multivariate.
You will then see
the Multivariate
dialogue box.
4. Select the dependent variables
‘security’, ‘outings’ and ‘report’
and click here to move them into
the Dependent Variable box, as we
have here.
Click on and SPSS will calculate the MANOVA. The output is shown
next.
Between-Subjects Factors

        Value  Label                  N
group   1.00   victim several times   10
        2.00   victim once            10
        3.00   never been victim      10
Partial eta squared values were obtained by selecting Estimates of effect size.
These provide you with an indication of the proportion of variance in the new
combined dependent variable that can be accounted for by the factor ‘group’. A rule
of thumb is that values larger than .14 (or 14%) indicate a large effect.
Multivariate Tests(c)

                                                                                    Partial Eta
Effect                          Value     F           Hypothesis df  Error df  Sig.  Squared
Intercept  Pillai's Trace       .984      513.563(a)  3.000          25.000    .000  .984
           Wilks' Lambda        .016      513.563(a)  3.000          25.000    .000  .984
           Hotelling's Trace    61.628    513.563(a)  3.000          25.000    .000  .984
           Roy's Largest Root   61.628    513.563(a)  3.000          25.000    .000  .984
group      Pillai's Trace       .919      7.375       6.000          52.000    .000  .460
           Wilks' Lambda        .101      17.906(a)   6.000          50.000    .000  .682
           Hotelling's Trace    8.713     34.852      6.000          48.000    .000  .813
           Roy's Largest Root   8.690     75.312(b)   3.000          26.000    .000  .897

a. Exact statistic
b. The statistic is an upper bound on F that yields a lower bound on the significance level.
c. Design: Intercept+group
In the table above, we are only interested in the results for the variable
‘group’, and we ignore those reported for the Intercept. Here we find the four
MANOVA test statistics that tell us whether the new combined dependent
variable, fear of crime, is different across the three groups of participants.
Here, p is smaller than .05 for each test statistic so all are significant. We
are going to report the values for Wilks’ Lambda.
Levene's Test of Equality of Error Variances(a)

           F     df1   df2   Sig.
security  .678    2    27    .516
outings   .369    2    27    .695
report    .456    2    27    .638

Tests the null hypothesis that the error variance of the dependent variable is
equal across groups.
a. Design: Intercept+group

These statistics were obtained by clicking on Homogeneity tests in the
Multivariate: Options dialogue box. If Levene's p > .05, then there is equality
of variance, and this is important both in terms of the reliability of the results
below and in supporting the robustness of the multivariate statistics.
c. R Squared = .784 (Adjusted R Squared = .768)

These are the three univariate ANOVA test statistics. As there are three
dependent variables, we apply a Bonferroni correction by dividing 0.05 by 3, so
sig. values need to be smaller than 0.017 for results to be significant. This is
the case for two of the three dependent variables: 'outings' and 'report'.
Again, partial eta squared values are reported, showing the amount of variance
in the dependent variable. Approximately 78% of the variance in 'report' is
accounted for by 'group'.

Estimated Marginal Means

group
There was a significant effect of the level of experienced crime (none, one or
multiple experiences) on the combined dependent variable fear of crime, F(6,50)
= 17.91, p < .0005; Wilks' Lambda = .10; partial η² = .68. Analysis of each
individual dependent variable, using a Bonferroni adjusted alpha level of .017,
showed that there was no contribution of the number of security measures
installed, F(2,27) = 0.22, p = .801, partial η² = .02. The three groups differed in
terms of the number of times they went out on their own per week, F(2,27) =
7.86, p = .002, partial η² = .37, and in terms of the self report measure, F(2,27) =
49.00, p < .0005, partial η² = .78.
and ‘2’ into the Number of Levels
box. The button Add becomes
highlighted. We click on this button
and by doing so define our within-
subjects variable, before/after being
a victim.
SUMMARY