
Temporal Basis Functions

&
Correlated Regressors

Gary Price & Patti Adank

fMRI for Dummies 29-03-06


Correlated Regressors
(or: the trouble with multicollinearity)

by (a slightly puzzled) Patti Adank



Sources:
Will Penny
Rik Henson's slides: www.mrc-cbu.cam.ac.uk/Imaging/Common/rikSPM-GLM.ppt
previous years' presenters' slides



Correlations between regressors

in multiple regression analysis:
- problems for behavioural data
- behavioural example (fictional)
- solutions

in the General Linear Model:
- problems for neuroimaging data
- PET example
- solutions?



Multiple Regression Analysis
&
Correlated Regressors



Multiple regression analysis

Multiple regression characterises the relationship between several independent variables (or regressors), X1, X2, X3, etc., and a single dependent variable, Y:

Y = β1X1 + β2X2 + ... + βLXL + ε

The X variables are combined linearly and each has its own regression coefficient (weight).
The βs reflect the independent contribution of each regressor, X, to the value of the dependent variable, Y, i.e. the proportion of the variance in Y accounted for by each regressor after all other regressors are accounted for.
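
As a concrete illustration, here is a minimal sketch (not from the slides; all values are made up) of estimating the βs by ordinary least squares with NumPy:

import numpy as np

rng = np.random.default_rng(0)
n = 100
X = rng.normal(size=(n, 2))                         # two regressors, X1 and X2
beta_true = np.array([0.8, -0.3])
y = X @ beta_true + rng.normal(scale=0.5, size=n)   # Y = β1X1 + β2X2 + ε

# Ordinary least squares: beta_hat = (X'X)^-1 X'y
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
print(beta_hat)   # close to [0.8, -0.3]
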
Multiple regression analysis

Fit a straight line through the points for Y and X.

Some statistics: if the model fits the data well:
- R² is high (it reflects the proportion of variance in Y explained by the regressor X)
- the corresponding p value will be low
Multiple regression analysis: multicollinearity

Multiple regression results are sometimes difficult to interpret:
- the overall p value of a fitted model is very low,
- but the individual p values for the regressors are high
This means that the model fits the data well, even though none of the X variables on its own has a significant impact on predicting Y.
How is this possible?
It happens when two (or more) regressors are highly correlated: a problem known as multicollinearity.
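
A minimal simulation (made-up numbers, not from the slides) shows the mechanism: with nearly collinear regressors, the coefficient standard errors, taken from the diagonal of σ²(X'X)⁻¹, are inflated, so the individual t-values are small even though Y clearly depends on the regressors.

import numpy as np

rng = np.random.default_rng(1)
n = 100
x1 = rng.normal(size=n)
x2 = x1 + rng.normal(scale=0.1, size=n)   # x2 nearly duplicates x1 (r ≈ 0.995)
X = np.column_stack([x1, x2])
y = x1 + x2 + rng.normal(size=n)          # Y really does depend on both

beta, res, *_ = np.linalg.lstsq(X, y, rcond=None)
sigma2 = res[0] / (n - 2)                 # residual variance
se = np.sqrt(sigma2 * np.diag(np.linalg.inv(X.T @ X)))
print(beta / se)   # individual t-values: small, despite a strong overall fit
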



Regression analysis: multicollinearity example

When is multicollinearity between regressors a problem?
- no: when you just want to predict Y from X1 and X2, the values of R² and p will be correct
- yes: when you want to assess how the individual regressors affect the dependent variable:
  - individual p values can be misleading: a p value can be high even though the variable is important
  - the confidence intervals on the regression coefficients are very wide and may include zero: you cannot be confident whether an increase in the X value is associated with an increase, or a decrease, in Y



Regression analysis: multicollinearity example

Measures for multicollinearity:

In general:
- if r > 0.8 between regressors, they can be expected to show multicollinearity
In SPSS:
- Tolerance: the proportion of a regressor's variance not accounted for by the other regressors in the model; low tolerance values are an indicator of multicollinearity
- Variance Inflation Factor (VIF): the reciprocal of the tolerance; large VIF values are an indicator of multicollinearity
Both measures are easy to compute yourself, as in the sketch below.
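
A minimal sketch of both measures (assuming roughly centred regressors, and ignoring the intercept that SPSS would include):

import numpy as np

def tolerance_and_vif(X):
    """X: (n_samples, n_regressors). Regress each column on the others;
    tolerance = 1 - R² of that regression, VIF = 1 / tolerance."""
    k = X.shape[1]
    tol = np.empty(k)
    for j in range(k):
        target = X[:, j] - X[:, j].mean()
        others = np.delete(X, j, axis=1)
        beta, *_ = np.linalg.lstsq(others, target, rcond=None)
        resid = target - others @ beta
        tol[j] = (resid @ resid) / (target @ target)   # = 1 - R²
    return tol, 1.0 / tol

rng = np.random.default_rng(2)
x1 = rng.normal(size=200)
X = np.column_stack([x1, x1 + rng.normal(scale=0.1, size=200)])
print(tolerance_and_vif(X))   # low tolerance, large VIF: multicollinearity
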
Regression analysis: multicollinearity example

Example:
Question: how can the perceived clarity of an auditory stimulus be predicted from the loudness and frequency of that stimulus?
- perception experiment in which subjects had to judge the clarity of an auditory stimulus
- model to be fit:
Y = β1X1 + β2X2 + ε
Y = judged clarity of stimulus
X1 = loudness
X2 = frequency
Regression analysis: multicollinearity example

What happens when X1 (loudness) and X2 (frequency) are collinear, i.e., strongly correlated?

[Scatter plot: loudness (y-axis) against frequency (x-axis)]

Correlation between loudness and frequency: 0.945 (p < 0.001)
High loudness values correspond to high frequency values.
Regression analysis: multicollinearity example

Contribution of individual predictors:

X1 (loudness) entered as sole predictor:
Y = 0.859X1 + 24.41
R² = 0.74 (74% explained variance in Y)
p < 0.001

X2 (frequency) entered as sole predictor:
Y = 0.824X2 + 26.94
R² = 0.68 (68% explained variance in Y)
p < 0.001
Regression analysis: multicollinearity example

Collinear regressors X1 and X2 entered together:

Resulting model:
Y = 0.756X1 + 26.94 (the X2 term was not reported)
R² = 0.74 (74% explained variance in Y: no more than X1 alone)
p < 0.001

Individual regressors within the combined model:
X1 (loudness): R² = ?, p < 0.001
X2 (frequency): R² = 0.555, p = 0.594
The collinear X2 adds nothing once X1 is in the model, and its p value is no longer significant.



Regression analysis: removing multicollinearity

How to deal with collinearity:

1. Increase the sample size (no data like more data)
2. Orthogonalise the correlated regressor variables
- using factor analysis
- this will produce linearly independent regressors and corresponding factor scores
- these factor scores can subsequently be used instead of the original correlated regressor values (see the sketch below)
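
A minimal sketch of the idea, using PCA via the SVD as a simple stand-in for factor analysis (variable names and values are made up):

import numpy as np

rng = np.random.default_rng(3)
loudness = rng.normal(size=50)
frequency = loudness + rng.normal(scale=0.2, size=50)   # a correlated pair
X = np.column_stack([loudness, frequency])

# Centre the regressors, then take principal components via the SVD.
Xc = X - X.mean(axis=0)
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
scores = U * s   # orthogonal component scores, one column per component

# The columns of `scores` are uncorrelated and can replace the original
# correlated regressors in the model.
print(np.corrcoef(scores.T)[0, 1])   # ~0
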



General Linear Model
&
Correlated Regressors



General Linear Model

The General Linear Model can be seen as an extension of multiple regression (or: multiple regression is just a simple form of the General Linear Model):
- multiple regression only looks at ONE Y variable
- the GLM allows you to analyse several Y variables in a linear combination (e.g. the time series in a voxel)
- ANOVA, t-tests, F-tests, etc. are also forms of the GLM



General Linear Model and fMRI

Y = X·β + ε

Y (observed data): the BOLD signal at various time points at a single voxel
X (design matrix): the components which explain the observed data, i.e. the BOLD time series for the voxel:
- timing information: onset vectors, Omj, and duration vectors, Dmj
- the HRF, hm, which describes the shape of the expected BOLD response over time
- other regressors, e.g. realignment parameters
- experimental manipulations
β (parameters): define the contribution of each component of the design matrix to the value of Y
ε (error/residual): the difference between the observed data, Y, and that predicted by the model, X·β
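
For illustration, a minimal sketch of how one design-matrix column might be built (a simple gamma-shaped HRF and made-up timings; SPM's canonical HRF is more elaborate):

import numpy as np

TR = 2.0                      # seconds per scan
n_scans = 120
onsets = [10, 40, 70, 100]    # condition onsets, in seconds
duration = 5                  # stimulus duration, in seconds

# Boxcar for the condition, sampled once per TR.
t = np.arange(n_scans) * TR
boxcar = np.zeros(n_scans)
for onset in onsets:
    boxcar[(t >= onset) & (t < onset + duration)] = 1.0

# Simple gamma-shaped HRF, peaking around 5 s.
ht = np.arange(0, 30, TR)
hrf = ht**5 * np.exp(-ht)
hrf /= hrf.sum()

# One design-matrix column: the expected BOLD response for this condition.
regressor = np.convolve(boxcar, hrf)[:n_scans]
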



fMRI: constructing the design matrix

In analysing fMRI data, the problem of multicollinearity occurs when specifying the regressors in the design matrix:
- if the regressors are linearly dependent (correlated), the results of the GLM are not easy to interpret, because variance attributable to an individual regressor may be confounded with the other regressor(s)
- this may lead to misinterpretations of activations in certain brain areas



fMRI: an example

For example:
- suppose that the response to a stimulus, Sr, is highly correlated with the associated motor response, Mr
- and suppose it is hypothesised that a specific region's activity for Sr is not influenced by Mr
- then this region should be tested only after removing from the regressor for Mr all the variance that can be explained by Sr
- this is dangerous: if the motor response does influence the signal in the region, the test will be overly significant, because that variance is wrongly assigned to Sr
fMRI: PET example

Andrade et al. (1999). Ambiguous Results in Functional Neuroimaging Data Analysis Due to Covariate Correlation. NeuroImage, 10, 483-486.

Andrade et al. show how correlated regressors can lead to misinterpretations:
- they collected PET data from a single subject and generated a covariate (regressor) that correlated strongly with the activation conditions used in the experiment (0 for rest, 1-6 increasing linearly with the activation levels in the experiment)



fMRI: PET example

Two purposes:
1. detect areas where the signal correlated with the generated covariate
2. search for differences in activation versus control periods

This implies fitting two models:
- one with the activation-vs-rest regressor and the covariate entered together (r = 0.845):
M = C1 (activation-rest) + C2 (covariate)
- one with the variance shared with C1 removed from the covariate:
M* = C1 + C2*, where C2* = C2 - 0.845(SSC2/SSC1)C1


fMRI: PET example

For both model M and model M*:
- standard SPM processing
- the parameters for C1 and C2/C2* were tested using t-tests and transformed into z-scores

Results:
- differences between M and M* occurred only for activation related to C1 (the rest/activation regressor)
- e.g., parahippocampal activation was significant in M but not in M*
- left precuneal, superior temporal, and medial frontal activity were significant in M* but not in M
fMRI: PET example

Example voxels:
- (54, -56, 34): activated in M (p = 0.004), not in M* (p = 0.901)
- (6, 28, -28): activated in M* (p = 0.014), not in M (p = 0.337)



fMRI: dealing with multicollinearity

Andrade et al. suggest a technique using the F-statistic to orthogonalise correlated regressors without having to re-estimate the parameters (which can be very time-consuming), using principles from linear model theory (Christensen, 1996).

Another technique used to remove correlations between regressors is Gram-Schmidt orthogonalisation (cf. Rik Henson's slides); see the sketch below.
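
A minimal sketch of Gram-Schmidt orthogonalisation of design-matrix columns; note that the column order matters, since each later column loses the variance it shares with the earlier ones:

import numpy as np

def gram_schmidt(X):
    """Orthogonalise the columns of X from left to right."""
    X = X.astype(float).copy()
    for j in range(X.shape[1]):
        for i in range(j):
            # Remove from column j its projection onto column i.
            X[:, j] -= (X[:, i] @ X[:, j]) / (X[:, i] @ X[:, i]) * X[:, i]
    return X

rng = np.random.default_rng(5)
a = rng.normal(size=100)
X = np.column_stack([a, a + rng.normal(scale=0.1, size=100)])
Q = gram_schmidt(X)
print(Q[:, 0] @ Q[:, 1])   # ~0: the columns are now orthogonal
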

Christensen, R. (1996). Plane Answers to Complex Questions: The Theory of Linear Models. Springer-Verlag, Berlin.



Dealing with multicollinearity in SPM

Use the toolbox "Design Magic - Multicollinearity assessment for fMRI", for SPM99 (SPM5?):
- author: Matthijs Vink
- URL: http://www.matthijs-vink.com/tools.html
- allows you to assess the multicollinearity in your fMRI design by calculating the amount of factor variance that is also accounted for by the other factors in the design (expressed as R²)
- also allows you to reduce correlations between regressors through the use of high-pass filters (see the sketch below)
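
A minimal sketch of the high-pass idea (assuming a discrete-cosine filter of the kind SPM uses, with made-up drift and cutoff): when the variance two regressors share is mostly slow drift, filtering it out lowers their correlation.

import numpy as np

def highpass_dct(x, cutoff_s, TR):
    """Remove discrete-cosine components slower than cutoff_s seconds from x."""
    n = x.size
    t = np.arange(n)
    order = int(np.floor(2 * n * TR / cutoff_s)) + 1   # slow DCT terms to drop
    dct = np.column_stack(
        [np.cos(np.pi * k * (2 * t + 1) / (2 * n)) for k in range(1, order)]
    )
    xc = x - x.mean()
    beta, *_ = np.linalg.lstsq(dct, xc, rcond=None)
    return xc - dct @ beta

rng = np.random.default_rng(6)
TR, n = 2.0, 200
drift = np.sin(2 * np.pi * np.arange(n) / n)   # slow drift shared by both
r1 = drift + rng.normal(scale=0.5, size=n)
r2 = drift + rng.normal(scale=0.5, size=n)
print(np.corrcoef(r1, r2)[0, 1])               # high: the shared drift
f1 = highpass_dct(r1, 128, TR)
f2 = highpass_dct(r2, 128, TR)
print(np.corrcoef(f1, f2)[0, 1])               # much lower after filtering
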



Conclusion

When fitting a model in multiple regression analysis, or when constructing your design matrix, correlations between regressors can lead to misinterpretations of the influence of the independent variables on the dependent variable.

Multicollinearity is a hassle, but it can be dealt with, usually through orthogonalisation procedures involving (groups of) regressors.



Assessing multicollinearity in SPM

The end

