
# Data Modeling

## General Linear Model & Statistical Inference
Thomas Nichols, Ph.D.
Assistant Professor
Department of Biostatistics

http://www.sph.umich.edu/~nichols

## Brain Function and fMRI

ISMRM Educational Course
July 11, 2002
1
Motivations
• Data Modeling
– Characterize Signal
– Characterize Noise
• Statistical Inference
– Detect signal
– Localization (Where’s the blob?)

2
Outline
• Data Modeling
– General Linear Model
– Linear Model Predictors
– Temporal Autocorrelation
– Random Effects Models
• Statistical Inference
– Statistic Images & Hypothesis Testing
– Multiple Testing Problem
3
Basic fMRI Example
• Data at one voxel
– Rest vs. passive word listening
• Is there an effect?
4
A Linear Model
• “Linear” in parameters β1 & β2
[Figure: voxel intensity over time = β1 × x1 + β2 × x2 + error ε]
5
Linear model, in image form…

[Figure: Y = β1 × x1 + β2 × x2 + ε, with each vector shown as an image]
Y = β1 x1 + β2 x2 + ε
6
Linear model, in image form…

Estimated
[Figure: Y = β̂1 × x1 + β̂2 × x2 + ε̂, with each vector shown as an image]
Y = β̂1 x1 + β̂2 x2 + ε̂
7
… in image matrix form…
[Figure: Y = X × (β̂1, β̂2)′ + ε̂, with Y, X and ε̂ shown as images]
Y = X × β̂ + ε̂
8
… in matrix form.
Y = Xβ + ε
Dimensions: Y is N × 1, X is N × p, β is p × 1, ε is N × 1
N: Number of scans, p: Number of regressors
9
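A minimal sketch of fitting Y = Xβ + ε at a single voxel by ordinary least squares, using NumPy. The boxcar design, the "true" parameter values and the noise level are made-up illustration values, not the data shown on the slides.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical design: N scans alternating between 10 rest and 10 task scans.
N = 80
x1 = np.ones(N)                                          # constant (baseline)
x2 = np.tile(np.r_[np.zeros(10), np.ones(10)], N // 20)  # boxcar: 0 = rest, 1 = task
X = np.column_stack([x1, x2])                            # N x p design matrix, p = 2

# Simulated voxel time series with assumed true parameters and unit-variance noise.
beta_true = np.array([100.0, 2.0])
Y = X @ beta_true + rng.normal(scale=1.0, size=N)

# Least-squares estimate of beta, and the residuals (estimated errors).
beta_hat, *_ = np.linalg.lstsq(X, Y, rcond=None)
resid = Y - X @ beta_hat
```

Here `beta_hat[1]` estimates the task-versus-rest difference, and `resid` plays the role of ε̂.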
Linear Model Predictors
• Signal Predictors
– Block designs
– Event-related responses
• Nuisance Predictors
– Drift
– Regression parameters

10
Signal Predictors
• Linear Time-Invariant (LTI) system
• LTI specified solely by
– Stimulus function of experiment (Blocks or Events)
– Hemodynamic Response Function (HRF)
• Response to instantaneous impulse
11
Convolution Examples
[Figure: columns Block Design and Event-Related; rows Experimental Stimulus Function, Hemodynamic Response Function, Predicted Response]

12
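The convolution step can be sketched as follows: the predicted response is the stimulus function convolved with the HRF. The double-gamma shape and its parameters (`peak`, `under`, `ratio`), the TR and the onset times are all assumptions chosen for illustration; they are not claimed to match any package's exact HRF.

```python
import numpy as np
from math import gamma

def double_gamma_hrf(t, peak=6.0, under=16.0, ratio=1 / 6):
    """Illustrative HRF: gamma-shaped peak minus a scaled gamma-shaped undershoot."""
    pos = t ** (peak - 1) * np.exp(-t) / gamma(peak)
    neg = t ** (under - 1) * np.exp(-t) / gamma(under)
    h = pos - ratio * neg
    return h / h.sum()                       # normalise to unit sum

TR = 2.0                                     # assumed repetition time in seconds
n_scans = 60
hrf = double_gamma_hrf(np.arange(0, 32, TR))  # HRF sampled once per scan

# Experimental stimulus functions, one sample per scan.
block = np.tile(np.r_[np.zeros(10), np.ones(10)], 3)   # block design
events = np.zeros(n_scans)
events[[5, 20, 34, 50]] = 1.0                          # event-related design

# Predicted response = stimulus function convolved with the HRF,
# truncated to the length of the scan series.
pred_block = np.convolve(block, hrf)[:n_scans]
pred_event = np.convolve(events, hrf)[:n_scans]
```

Either `pred_block` or `pred_event` would then be used as a column of X.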
HRF Models
• Canonical HRF
– Most sensitive if it is correct
– If it is wrong, bias and/or poor fit
• E.g. True response may be faster/slower
• E.g. True response may have smaller/bigger undershoot
[Figure: SPM’s HRF]
13
HRF Models
• Smooth Basis HRFs
– More flexible
– Less interpretable
• No one parameter explains the response
– Less sensitive relative to canonical (only if canonical is correct)
[Figure: Gamma Basis; Fourier Basis]
14
HRF Models
• Deconvolution
– Most flexible
• Allows any shape
• Even bizarre, non-sensical ones
– Least sensitive relative to canonical (again, if canonical is correct)
[Figure: Deconvolution Basis]

15
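One way to picture a deconvolution basis is as one indicator regressor per post-stimulus lag, so the fitted parameters trace out the response shape directly. The sketch below assumes that construction; the function name `fir_design`, the onsets, the number of lags and the "true" shape are hypothetical.

```python
import numpy as np

def fir_design(onsets, n_scans, n_lags):
    """Design matrix with one column per lag: column k is 1 at scan (onset + k)."""
    X = np.zeros((n_scans, n_lags))
    for onset in onsets:
        for k in range(n_lags):
            if onset + k < n_scans:
                X[onset + k, k] = 1.0
    return X

rng = np.random.default_rng(1)
onsets = [5, 25, 45]                         # hypothetical event onsets (in scans)
X_fir = fir_design(onsets, n_scans=60, n_lags=8)

# Simulate data from an arbitrary response shape, then recover it by least squares.
true_shape = np.array([0.0, 1.0, 3.0, 2.0, 0.5, -0.5, -0.2, 0.0])
Y = X_fir @ true_shape + rng.normal(scale=0.1, size=60)
shape_hat, *_ = np.linalg.lstsq(X_fir, Y, rcond=None)
```

Because every lag gets its own parameter, any shape can be fitted, which is exactly why this basis spends the most degrees of freedom.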
Drift Models
• Drift
– Slowly varying
– Nuisance variability
• Models
– Discrete Cosine Transform

[Figure: Discrete Cosine Transform Basis]
16
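A minimal sketch of a Discrete Cosine Transform drift basis, assuming the DCT-II form with the constant term left out. The helper name `dct_basis` and the choice of 5 basis functions are arbitrary illustration values.

```python
import numpy as np

def dct_basis(n_scans, n_basis):
    """Low-frequency DCT-II regressors (constant excluded), scaled to unit norm."""
    t = np.arange(n_scans)
    k = np.arange(1, n_basis + 1)
    C = np.cos(np.pi * np.outer(2 * t + 1, k) / (2 * n_scans))
    return C * np.sqrt(2.0 / n_scans)

n_scans = 100
D = dct_basis(n_scans, n_basis=5)            # slowly varying nuisance regressors

# Nuisance columns are simply appended to the signal predictors.
signal = np.tile(np.r_[np.zeros(10), np.ones(10)], n_scans // 20)
X = np.column_stack([signal, np.ones(n_scans), D])
```

The columns of `D` are orthonormal and orthogonal to the constant, so they absorb slow drift without changing the baseline term.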
General Linear Model
Recap
• Fits data Y as linear combination of
predictor columns of X
Y = Xβ + ε

• Very “General”
– Correlation, ANOVA, ANCOVA, …
• Only as good as your X matrix
17
Temporal Autocorrelation
• Standard statistical methods assume
independent errors
– Error εi tells you nothing about εj for i ≠ j
• fMRI errors not independent
– Autocorrelation due to
• Physiological effects
• Scanner instability

18
Temporal Autocorrelation
In Brief
• Independence
• Precoloring
• Prewhitening

19
Autocorrelation:
Independence Model
• Ignore autocorrelation
– Under-estimation of variance
– Over-estimation of significance
– Too many false positives

20
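The inflation of false positives can be seen in a small simulation: generate null data with AR(1) errors, analyse them as if the errors were independent, and count how often |T| exceeds the nominal 5% threshold. The values ρ = 0.5, the boxcar design and the approximate critical value 1.98 are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(2)
N, n_sim, rho = 100, 2000, 0.5               # assumed scan count, simulations, AR(1) rho

x = np.tile(np.r_[np.zeros(10), np.ones(10)], N // 20)
X = np.column_stack([np.ones(N), x])
c = np.array([0.0, 1.0])                     # test the boxcar effect

# Null data: pure AR(1) noise, one column per simulated voxel.
eta = rng.normal(size=(N, n_sim))
eps = np.empty_like(eta)
eps[0] = eta[0] / np.sqrt(1 - rho**2)        # stationary starting value
for i in range(1, N):
    eps[i] = rho * eps[i - 1] + eta[i]

# Ordinary least squares, ignoring the autocorrelation.
b = np.linalg.pinv(X) @ eps
resid = eps - X @ b
s2 = (resid**2).sum(axis=0) / (N - X.shape[1])
T = (c @ b) / np.sqrt(s2 * (c @ np.linalg.pinv(X.T @ X) @ c))

# Fraction of null voxels declared significant at a nominal two-sided 5% level.
false_positive_rate = np.mean(np.abs(T) > 1.98)
```

With positively autocorrelated noise and a slowly varying regressor, `false_positive_rate` comes out well above the nominal 0.05.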
Autocorrelation:
Precoloring
• Temporally blur, smooth your data
– This induces more dependence!
– But we exactly know the form of the
dependence induced
– Assume that intrinsic autocorrelation is
negligible relative to smoothing
• Then we know autocorrelation exactly
• Correct GLM inferences based on “known”
autocorrelation
21
[Friston, et al., “To smooth or not to smooth…” NI 12:196-208 2000]
Autocorrelation:
Prewhitening
• Statistically optimal solution
• If know true autocorrelation exactly, can
undo the dependence
– Then proceed as with independent data
• Problem is obtaining accurate estimates of
autocorrelation
– Some sort of regularization is required
• Spatial smoothing of some sort
22
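A minimal sketch of prewhitening when the autocorrelation is known exactly, assuming a stationary AR(1) error with coefficient ρ. The helper `ar1_whitening_matrix`, the value ρ = 0.4 and the simulated design are hypothetical; in practice ρ has to be estimated, which is the hard part noted above.

```python
import numpy as np

def ar1_whitening_matrix(n, rho):
    """W such that W @ eps is white when eps is stationary AR(1) with coefficient rho."""
    W = np.eye(n)
    W[0, 0] = np.sqrt(1.0 - rho**2)
    i = np.arange(1, n)
    W[i, i - 1] = -rho
    return W

n, rho = 50, 0.4                             # rho is assumed known here
lags = np.abs(np.subtract.outer(np.arange(n), np.arange(n)))
V = rho**lags / (1.0 - rho**2)               # AR(1) covariance, unit-variance innovations
W = ar1_whitening_matrix(n, rho)

rng = np.random.default_rng(3)
X = np.column_stack([np.ones(n), np.tile(np.r_[np.zeros(5), np.ones(5)], n // 10)])
Y = X @ np.array([10.0, 1.0]) + np.linalg.cholesky(V) @ rng.normal(size=n)

# Prewhiten both data and design, then proceed as with independent data.
beta_hat, *_ = np.linalg.lstsq(W @ X, W @ Y, rcond=None)
```

The check `W @ V @ W.T` equal to the identity is what "undoing the dependence" means here.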
Autocorrelation Redux

| Method | Advantage | Disadvantage | Software |
|---|---|---|---|
| Indep. | Simple | Inflated significance | All |
| Precoloring | Avoids autocorr. est. | Statistically inefficient | SPM99 |
| Whitening | Statistically optimal | Requires precise autocorr. est. | FSL, SPM2 |
23
Autocorrelation: Models
• Autoregressive
– Error is fraction of previous error plus
“new” error
– AR(1): εi = ρεi-1 + ηi
• Software: fmristat, SPM99
• AR + White Noise or ARMA(1,1)
– AR plus an independent WN series
• Software: SPM2
• Arbitrary autocorrelation function
– ρk = corr( εi, εi-k )
• Software: FSL’s FEAT
24
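The autocorrelation function ρk = corr(εi, εi-k) can be estimated from a residual series as sketched below. The helper `autocorr` and the simulated AR(1) series with ρ = 0.4 are illustration-only assumptions; this is not a description of how any of the named packages estimate it.

```python
import numpy as np

def autocorr(resid, max_lag):
    """Sample autocorrelation rho_k for k = 0..max_lag."""
    r = np.asarray(resid, dtype=float)
    r = r - r.mean()
    denom = r @ r
    return np.array([r[k:] @ r[:r.size - k] for k in range(max_lag + 1)]) / denom

# Simulate an AR(1) error series: eps_i = rho * eps_{i-1} + eta_i.
rng = np.random.default_rng(4)
n, rho = 2000, 0.4
eta = rng.normal(size=n)
eps = np.empty(n)
eps[0] = eta[0]
for i in range(1, n):
    eps[i] = rho * eps[i - 1] + eta[i]

acf = autocorr(eps, max_lag=5)
rho_hat = acf[1]                             # lag-1 autocorrelation estimates rho
```

For an AR(1) series the estimated `acf[k]` should decay roughly like ρ to the power k.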
Statistic Images &
Hypothesis Testing
• For each voxel Y = Xβ + ε
– Fit GLM, estimate betas
• Write b for estimate of β
– But usually not interested in all betas
• Recall β is a length-p vector

25
Building Statistic Images
[Figure: Y = X × β + ε shown as images, with parameter vector (β1, β2, …) and the predictor of interest marked]
Y = X × β + ε
26
Building Statistic Images
• Contrast
– A linear combination of parameters
– c’β
– E.g. c’ = [1 0 0 0 0 0 0 0] selects b1 from b1 b2 b3 b4 b5 ....
• T = (contrast of estimated parameters) / √(variance estimate)
T = c’b / √(s² c’(X’X)⁺c)
27
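A minimal sketch of computing T = c’b / √(s² c’(X’X)⁺c) at one voxel, with (X’X)⁺ taken as the pseudoinverse. The three-column design, the "true" parameters and the noise level are assumed for illustration.

```python
import numpy as np

rng = np.random.default_rng(5)
N = 80
X = np.column_stack([
    np.tile(np.r_[np.zeros(10), np.ones(10)], N // 20),  # predictor of interest
    np.linspace(-1, 1, N),                               # linear drift (nuisance)
    np.ones(N),                                          # constant
])
Y = X @ np.array([2.0, 0.5, 100.0]) + rng.normal(size=N)  # assumed true betas

c = np.array([1.0, 0.0, 0.0])                # contrast selecting the first parameter

XtX_pinv = np.linalg.pinv(X.T @ X)           # (X'X)^+
b = XtX_pinv @ X.T @ Y                       # parameter estimates
resid = Y - X @ b
df = N - np.linalg.matrix_rank(X)            # residual degrees of freedom
s2 = resid @ resid / df                      # residual variance estimate
T = (c @ b) / np.sqrt(s2 * (c @ XtX_pinv @ c))
```

Repeating this at every voxel and storing `T` gives the statistic image.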
Hypothesis Test
• So now have a value T for our statistic
• How big is big?
– Is T=2 big? T=20?

28
Hypothesis Testing
• Assume Null Hypothesis of no signal
• Given that there is no signal, how likely is our measured T?
• P-value measures this
– Probability of obtaining T as large or larger
• α level
– Acceptable false positive rate
[Figure: null distribution of T, with the P-value as the tail area beyond the observed T]
29
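To put numbers on "is T=2 big? T=20?", the sketch below computes an upper-tail P-value under the assumption that the degrees of freedom are large enough for T to be treated as standard normal under the null. That approximation is an assumption made here to stay within the standard library; with few degrees of freedom the exact t distribution gives larger P-values.

```python
from math import erfc, sqrt

def upper_tail_p_normal(t):
    """P(Z >= t) for a standard normal Z (large-df approximation to the t null)."""
    return 0.5 * erfc(t / sqrt(2.0))

alpha = 0.05                                 # acceptable false positive rate
p_t2 = upper_tail_p_normal(2.0)              # roughly 0.023
p_t20 = upper_tail_p_normal(20.0)            # vanishingly small

reject_t2 = p_t2 <= alpha
reject_t20 = p_t20 <= alpha
```

Under this approximation T=2 is just significant at α = 0.05 for a single test, while T=20 is overwhelming.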
Random Effects Models
• GLM has only one source of randomness
Y = Xβ + ε
– Residual error
• But people are another source of error
– Everyone activates somewhat differently…

30
Fixed vs. Random Effects
• Fixed Effects
– Intra-subject variation suggests all these subjects different from zero
• Random Effects
– Intersubject variation suggests population not very different from zero
[Figure: distribution of each subject’s effect, Subj. 1 to Subj. 6, relative to 0]
31
Random Effects for fMRI
• Summary Statistic Approach
– Easy
• Create contrast images for each subject
• Analyze contrast images with one-sample t
– Limited
• Only allows one scan per subject
• Assumes balanced designs and homogeneous meas. error.
• Full Mixed Effects Analysis
– Hard
• Requires iterative fitting
• REML to estimate inter- and intra-subject variance
– SPM2 & FSL implement this, very differently
– Very flexible
32
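The summary statistic approach at a single voxel can be sketched as a one-sample t-test on the subjects' contrast values. The six values below are hypothetical numbers, one per subject.

```python
import numpy as np

# Hypothetical contrast values (c'b) at one voxel, one per subject.
con = np.array([1.2, 0.4, 2.1, -0.3, 1.5, 0.9])

n = con.size
se = con.std(ddof=1) / np.sqrt(n)            # uses between-subject variability only
t_rfx = con.mean() / se                      # one-sample t statistic
df = n - 1                                   # degrees of freedom = subjects - 1
```

Note that the degrees of freedom depend on the number of subjects, not the number of scans, which is what makes this a random effects inference.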
Random Effects for fMRI
Random vs. Fixed
• Fixed isn’t “wrong”, just usually isn’t of interest
• If it is sufficient to say
“I can see this effect in this cohort”
then fixed effects are OK
• If need to say
“If I were to sample a new cohort from the
population I would get the same result”
then random effects are needed

33
Multiple Testing Problem
• Inference on statistic images
– Fit GLM at each voxel
– Create statistic images of effect
• Which of 100,000 voxels are significant?
– α=0.05 ⇒ 5,000 false positives expected, even with no signal anywhere!
[Figure: statistic image thresholded at t > 0.5, 1.5, 2.5, 3.5, 4.5, 5.5 and 6.5]

34
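The arithmetic behind the 5,000 figure can be checked with a quick simulation, assuming all 100,000 voxels are null so that their P-values are uniform on (0, 1). The random seed is arbitrary.

```python
import numpy as np

rng = np.random.default_rng(6)
n_voxels, alpha = 100_000, 0.05

# Under a true null hypothesis, P-values are uniformly distributed.
p_null = rng.uniform(size=n_voxels)

n_false_pos = int(np.sum(p_null < alpha))    # voxels wrongly declared significant
expected = n_voxels * alpha                  # 100,000 x 0.05 = 5,000
```

The simulated count `n_false_pos` lands close to `expected`, although no voxel contains any signal.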
MCP Solutions:
Measuring False Positives
• Familywise Error Rate (FWER)
– Familywise Error
• Existence of one or more false positives
– FWER is probability of familywise error
• False Discovery Rate (FDR)
– R voxels declared active, V falsely so
• Observed false discovery rate: V/R
– FDR = E(V/R)
35
FWER MCP Solutions
• Bonferroni
• Maximum Distribution Methods
– Random Field Theory
– Permutation

36
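As a worked example of the Bonferroni correction, each voxel is tested at α divided by the number of tests. The voxel count and α are the values used earlier in the deck; the independence calculation is an added assumption used only to show that the resulting FWER stays below α.

```python
n_voxels, alpha = 100_000, 0.05

# Bonferroni: test each voxel at alpha / (number of tests).
alpha_bonf = alpha / n_voxels                # 5e-7 per voxel

# If the tests were independent, the familywise error rate would be
# the chance that at least one of n_voxels null tests is rejected.
fwer_indep = 1 - (1 - alpha_bonf) ** n_voxels
```

Bonferroni guarantees FWER ≤ α under any dependence, but with smooth images it tends to be conservative, which motivates the maximum distribution methods.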
FWER MCP Solutions:
Controlling FWER w/ Max
• FWER & distribution of maximum
FWER = P(FWE)
= P(One or more voxels ≥ u | Ho)
= P(Max voxel ≥ u | Ho)
• 100(1-α)%ile of max distn controls FWER
FWER = P(Max voxel ≥ uα | Ho) ≤ α

[Figure: null distribution of the maximum, with upper tail area α beyond uα]
38
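The link between the maximum and FWER can be sketched by simulation: draw many null images, record each image's maximum, and take the 100(1-α) percentile as the threshold uα. Independent standard normal voxels and the image size are simplifying assumptions; real statistic images are spatially correlated.

```python
import numpy as np

rng = np.random.default_rng(7)
n_voxels, n_sim, alpha = 1000, 2000, 0.05    # assumed image size and simulation count

# Maximum statistic of each simulated null image.
max_null = rng.normal(size=(n_sim, n_voxels)).max(axis=1)

# Threshold at the 100(1 - alpha) percentile of the max distribution.
u_alpha = np.quantile(max_null, 1 - alpha)

# FWER at this threshold: fraction of null images with any voxel >= u_alpha.
fwer = np.mean(max_null >= u_alpha)
```

The threshold `u_alpha` is far above the single-test 5% cutoff of about 1.64, and `fwer` sits near α by construction.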

FWER MCP Solutions:
Random Field Theory
• Euler Characteristic χu
– Topological Measure
• #blobs - #holes
– At high thresholds, just counts blobs
– FWER = P(Max voxel ≥ u | Ho)
= P(One or more blobs | Ho)
≈ P(χu ≥ 1 | Ho)
≈ E(χu | Ho)
[Figure: Random Field, Threshold, Suprathreshold Sets]
39
Controlling FWER:
Permutation Test
• Parametric methods
– Assume distribution of max statistic under null hypothesis
• Nonparametric methods
– Use data to find distribution of max statistic under null hypothesis
– Any max statistic!
[Figure: Parametric Null Max Distribution and Nonparametric Null Max Distribution, each with the upper 5% tail marked]
40
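A minimal sketch of a nonparametric max-statistic test for a one-sample group analysis. It uses random sign-flipping of each subject's contrast image, which assumes the errors are symmetric about zero under the null; that choice, the simulated data and the injected effect in the first five voxels are all illustration-only assumptions.

```python
import numpy as np

def one_sample_t(data):
    """Voxelwise one-sample t statistic; rows are subjects, columns are voxels."""
    n = data.shape[0]
    return data.mean(axis=0) / (data.std(axis=0, ddof=1) / np.sqrt(n))

rng = np.random.default_rng(8)
n_subj, n_voxels, n_perm, alpha = 12, 500, 1000, 0.05

data = rng.normal(size=(n_subj, n_voxels))   # hypothetical contrast images
data[:, :5] += 3.0                           # assumed true effect in 5 voxels

t_obs = one_sample_t(data)

# Build the null distribution of the maximum by flipping whole-subject signs.
max_null = np.empty(n_perm)
for i in range(n_perm):
    signs = rng.choice([-1.0, 1.0], size=(n_subj, 1))
    max_null[i] = one_sample_t(signs * data).max()

u_alpha = np.quantile(max_null, 1 - alpha)   # FWER-controlling threshold
significant = np.flatnonzero(t_obs > u_alpha)
```

Swapping `one_sample_t` for another voxelwise statistic changes nothing else in the procedure, which is the sense of "any max statistic".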
Measuring False Positives
FWER vs FDR
[Figure: Noise, Signal, and Signal+Noise images]

42
Control of Per Comparison Rate at 10%
Percentage of Null Pixels that are False Positives: 11.3%, 11.3%, 12.5%, 10.8%, 11.5%, 10.0%, 10.7%, 11.2%, 10.2%, 9.5%

Control of Familywise Error Rate at 10%
Occurrence of Familywise Error (marked “FWE” in the figure)

Control of False Discovery Rate at 10%
Percentage of Activated Pixels that are False Positives: 6.7%, 10.4%, 14.9%, 9.3%, 16.2%, 13.8%, 14.0%, 10.5%, 12.2%, 8.7%
43
Controlling FDR:
Benjamini & Hochberg
• Select desired limit q on E(FDR)
• Order p-values, p(1) ≤ p(2) ≤ ... ≤ p(V)
• Let r be largest i such that p(i) ≤ i/V × q
• Reject all hypotheses corresponding to p(1), ... , p(r).
[Figure: ordered p-values p(i) plotted against i/V, both axes from 0 to 1, with the line i/V × q]
44
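The three steps above can be sketched in a few lines of standard-library Python. The function name `benjamini_hochberg` and the four example P-values are hypothetical, chosen so that one P-value misses its own threshold but is still rejected.

```python
def benjamini_hochberg(pvals, q):
    """Return the indices of hypotheses rejected at FDR level q."""
    V = len(pvals)
    order = sorted(range(V), key=lambda i: pvals[i])   # indices by ascending p
    r = 0
    for rank, idx in enumerate(order, start=1):
        if pvals[idx] <= rank / V * q:                 # p(i) <= i/V * q
            r = rank                                   # keep the largest such i
    return sorted(order[:r])                           # reject p(1), ..., p(r)

pvals = [0.035, 0.5, 0.001, 0.030]
rejected = benjamini_hochberg(pvals, q=0.05)
# Ordered: 0.001, 0.030, 0.035, 0.5 against thresholds 0.0125, 0.025, 0.0375, 0.05.
# 0.030 misses 0.025, but 0.035 <= 0.0375 gives r = 3, so the three smallest
# P-values are all rejected.
```

Taking the largest qualifying i, rather than stopping at the first failure, is what lets a P-value be rejected even when it exceeds its own threshold.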
Conclusions
• Analyzing fMRI Data
– Need linear regression basics
– Lots of disk space, and time
– Watch for MTP (no fishing!)

45
Thanks
• Slide help
– Stefan Kiebel, Rik Henson, JB Poline, Andrew Holmes

46