
Regression

A goal of science is the prediction and explanation of phenomena
To do so we must find events that are related in some way, such that knowledge about one leads to knowledge about the other
In psychology we seek to understand the relationships among variables that serve as indicators of an innumerable amount of information about human nature, in order to better understand ourselves and why we do the things we do

While we could just use our N of 1 personal experience to try to understand human behavior, a scientific (and better) means of understanding the relationship between variables is to assess correlation
Two variables take on different values, but if they are related in some fashion they will covary
They may do so in a way in which their values tend to move in the same direction, or they may tend to move in opposite directions
The underlying statistic assessing this is covariance, which is at the heart of every statistical procedure you are likely to use inferentially

Covariance as a statistical construct is unbounded and thus difficult to interpret in its raw form
Correlation (Pearson's r) is a measure of the direction and degree of a linear association between two variables
Correlation is the standardized covariance between two variables

$$\text{cov}(x,y) = \frac{\sum_{i=1}^{n}(x_i - \bar{x})(y_i - \bar{y})}{n-1}$$

$$r_{xy} = \frac{\text{cov}(x,y)}{s_x s_y}, \qquad -1 \le r \le 1$$

$$r_{xy} = \frac{\sum_{i=1}^{n} Z_{x_i} Z_{y_i}}{n-1}$$
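As a quick check of these formulas, here is a minimal sketch in Python (numpy, with made-up data) computing the covariance, the standardized covariance, and the z-score form of r:

```python
import numpy as np

# Hypothetical example data (any two paired samples would do)
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.0, 1.0, 4.0, 5.0, 4.0])
n = len(x)

# Covariance: sum of cross-products of deviations, divided by n - 1
cov_xy = np.sum((x - x.mean()) * (y - y.mean())) / (n - 1)

# Correlation: covariance divided by the product of the two standard
# deviations (ddof=1 gives the n - 1 version of the SD)
r_xy = cov_xy / (x.std(ddof=1) * y.std(ddof=1))

# Equivalent: the average cross-product of z-scores (again over n - 1)
z_x = (x - x.mean()) / x.std(ddof=1)
z_y = (y - y.mean()) / y.std(ddof=1)
r_from_z = np.sum(z_x * z_y) / (n - 1)

print(cov_xy, r_xy, r_from_z)       # r_xy and r_from_z agree
print(np.corrcoef(x, y)[0, 1])      # matches numpy's built-in result
```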

Regression allows us to use the information about covariance to make predictions
Given a particular value of X, we can predict Y with some level of accuracy
The basic model is that of a straight line (the general linear model)
Only one possible straight line can be drawn once the slope and Y-intercept are specified
The formula for a straight line is:
Y = bX + a

Y = the calculated value for the variable on the vertical axis


a = the intercept
b = the slope of the line
X = a value for the variable on the horizontal axis

Once this line is specified, we can calculate the corresponding value of Y for any value of X entered
In more general terms, Y = Xb + e, where these elements represent vectors and/or matrices (of the outcome, data, coefficients, and error respectively), is the general linear model to which most of the techniques in psychological research adhere
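A minimal sketch of that matrix form, assuming simulated data and using numpy's least-squares solver (the column of ones in X carries the intercept):

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated data: y = 2 + 0.5*x plus random error
x = rng.normal(size=100)
e = rng.normal(scale=0.3, size=100)
y = 2 + 0.5 * x + e

# Design matrix X: a column of ones (intercept) plus the predictor
X = np.column_stack([np.ones_like(x), x])

# Least-squares solution for b in y = Xb + e
b, *_ = np.linalg.lstsq(X, y, rcond=None)
print(b)   # approximately [2.0, 0.5] -> the intercept and the slope
```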

Real data do not conform perfectly to a straight line
The best-fit straight line is the one that minimizes the amount of variation of the data points from the line
The common, but by no means only acceptable, method derives a least-squares regression line, which minimizes the squared deviations from it

The equation for this line can be used to predict or estimate an individual's score on Y on the basis of his or her score on X

$$\hat{Y} = bX + a$$

When the relations between variables are expressed in this manner, we call the relevant equation(s) mathematical models
The intercept and weight values are called the parameters of the model
While typical regression analysis by itself does not determine causal relations, the assumption indicated by such a model is that the variable on the left-hand side of the previous equation is being caused by the variable(s) on the right side
The arrows explicitly go from the predictors to the outcome, not vice versa*

[Path diagram: Variables X, Y, and Z with arrows pointing to the Criterion]

The process of obtaining the correct parameter values (assuming we are working with the right model) is called parameter estimation
Often, theories specify the form of the relationship rather than the specific values of the parameters
The parameters themselves, assuming the basic model is correct, are typically estimated from data
We refer to this estimation process as calibrating the model

A method is required for choosing parameter values that will give us the best representation of the data possible
In estimating the parameters of our model, we are trying to find a set of parameters that minimizes the error variance
With least-squares estimation, we want

$$\frac{\sum (Y - \hat{Y})^2}{N}$$

to be as small as it possibly can be

Estimating the slope (the regression coefficient) requires first estimating the covariance

$$b = \frac{\text{cov}(X,Y)}{\text{var}(X)}$$

Estimating the Y-intercept

$$a = \bar{Y} - b\bar{X}$$

where $\bar{Y}$ and $\bar{X}$ are the means of the Y and X values respectively, and b is the estimated slope
These calculations ensure that the regression line passes through the point on the scatterplot defined by the two means

Alternatively, the slope

$$b = r\frac{s_y}{s_x}$$

so, by substituting, we get

$$\hat{Y} = r\frac{s_y}{s_x}X + \bar{Y} - r\frac{s_y}{s_x}\bar{X}$$
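A short sketch (numpy, hypothetical data) confirming that the two slope formulas agree and that the intercept makes the line pass through the two means:

```python
import numpy as np

# Hypothetical paired data
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.0, 1.0, 4.0, 5.0, 4.0])

# Slope from the covariance over the variance of X
cov_xy = np.cov(x, y, ddof=1)[0, 1]
b = cov_xy / np.var(x, ddof=1)

# Intercept from the two means: the line passes through (x-bar, y-bar)
a = y.mean() - b * x.mean()

# Alternative form of the slope: r times the ratio of standard deviations
r = np.corrcoef(x, y)[0, 1]
b_alt = r * y.std(ddof=1) / x.std(ddof=1)

print(b, b_alt)              # the two slope formulas agree
print(a + b * x.mean(), y.mean())   # predicted value at x-bar equals y-bar
```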

Total variability in the dependent variable (around its observed mean) comes from two sources
Variability predicted by the model, i.e. variability in the dependent variable that is due to the independent variable
How far our predicted values are from the mean of Y

Error or residual variability, i.e. variability not explained by the independent variable
The difference between the predicted values and the observed values

$$s^2_Y = s^2_{\hat{Y}} + s^2_{(Y - \hat{Y})}$$

Total variance = predicted variance + error variance

We can also show this graphically using a Venn diagram, with r² as the proportion of variability shared by the two variables (X and Y)
The larger the area of overlap, the greater the strength of the association between the two variables

The square of the correlation, r², is the fraction of the variation in the values of Y that is explained by the regression of Y on X
r² = variance of the predicted values ŷ divided by the variance of the observed values y

$$s^2_{\hat{y}} = \frac{\sum (\hat{y} - \bar{y})^2}{n-1}$$

$$s^2_{\hat{y}} = r^2 s^2_y \qquad\Longrightarrow\qquad r^2 = \frac{s^2_{\hat{y}}}{s^2_y}$$

How good a fit does our line represent?

The error associated with a prediction (of a Y value from a known X value) is a function of the deviations of Y about the predicted point
The standard error of estimate provides an assessment of the accuracy of prediction
It is the standard deviation of Y predicted from X

In terms of R², we can see that the more variance we account for, the smaller our standard error of estimate will be

$$s_{Y \cdot X} = \sqrt{\frac{\sum (Y - \hat{Y})^2}{df_{\text{residual}}}} = \sqrt{\frac{SS_{\text{residual}}}{df_{\text{residual}}}}$$

$$s_{Y \cdot X} = s_Y \sqrt{(1 - R^2)\frac{N-1}{N-2}}$$
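A brief sketch (numpy, hypothetical data) showing that the residual-based and R²-based forms of the standard error of estimate give the same value:

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.0, 1.0, 4.0, 5.0, 4.0])
n = len(x)

# Fit the simple regression line
b = np.cov(x, y, ddof=1)[0, 1] / np.var(x, ddof=1)
a = y.mean() - b * x.mean()
y_hat = a + b * x

# Standard error of estimate from the residual sum of squares
ss_residual = np.sum((y - y_hat) ** 2)
see_direct = np.sqrt(ss_residual / (n - 2))

# Equivalent form based on R-squared and the SD of Y
r2 = np.corrcoef(x, y)[0, 1] ** 2
see_from_r2 = y.std(ddof=1) * np.sqrt((1 - r2) * (n - 1) / (n - 2))

print(see_direct, see_from_r2)   # the two forms agree
```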

Intercept
Value of Y if X is 0
Often not meaningful, particularly if it's practically impossible to have an X of 0 (e.g. weight)

Slope

Amount of change in Y seen with 1 unit change in X

Standardized regression coefficient

Amount of change in Y seen in standard deviation units

with 1 standard deviation unit change in X


In simple regression it is equivalent to the r for the two
variables

Standard error of estimate

Gives a measure of the accuracy of prediction

R2

Proportion of variance explained by the model
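All of the quantities just listed can be read off a simple regression fit; a minimal sketch using scipy.stats.linregress on hypothetical data (note that linregress's stderr is the standard error of the slope, not the standard error of estimate):

```python
import numpy as np
from scipy import stats

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.0, 1.0, 4.0, 5.0, 4.0])

fit = stats.linregress(x, y)

print(fit.intercept)       # value of Y when X = 0
print(fit.slope)           # change in Y per 1-unit change in X
print(fit.rvalue)          # r; in simple regression this is also the
                           # standardized regression coefficient
print(fit.rvalue ** 2)     # R-squared: proportion of variance explained
print(fit.stderr)          # standard error of the slope (not of estimate)
```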

The General Linear Model with Categorical Predictors

Regression can actually handle different types of predictors, and in the social sciences we are often interested in differences between groups
For now we will concern ourselves with the two-independent-groups case
E.g. gender, Republican vs. Democrat, etc.

There are different ways to code categorical data for regression; in general, to represent a categorical variable you need k − 1* coded variables
k = number of categories/groups

Dummy coding involves using zeros and ones to identify group membership, and since we only have two groups, one group will be coded 0 (the reference group) and the other 1
We will revisit coding with k > 2 after we've discussed multiple regression

Example

The thing to note at this point is that we have a simple bivariate correlation/simple regression setting
The correlation between group and the DV is .76
This is sometimes referred to as the point-biserial correlation (r_pb) because of the categorical variable
However, don't be fooled: it is calculated exactly the same way as before, i.e. you treat that 0/1 grouping variable like any other in calculating the correlation coefficient

Group  DV
0      3
0      5
0      7
0      2
0      3
1      6
1      7
1      7
1      8
1      9

Graphical display

The R-square is .76² = .577
The regression equation is

$$\hat{Y} = 4 + 3.4X$$
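A sketch reproducing these numbers from the group/DV data above, treating the 0/1 dummy code like any other predictor (scipy is assumed to be available):

```python
import numpy as np
from scipy import stats

# The 0/1 group codes and DV scores from the table above
group = np.array([0, 0, 0, 0, 0, 1, 1, 1, 1, 1], dtype=float)
dv    = np.array([3, 5, 7, 2, 3, 6, 7, 7, 8, 9], dtype=float)

# Treat the dummy code like any other predictor
fit = stats.linregress(group, dv)

print(fit.rvalue)        # about .76 -- the point-biserial correlation
print(fit.rvalue ** 2)   # about .577 -- R-squared
print(fit.intercept)     # 4.0 = mean of the reference (0) group
print(fit.slope)         # 3.4 = difference between group means (7.4 - 4.0)
```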

Look closely at the descriptive output compared to the coefficients
What do you see?
Descriptive Statistics
group   dv Mean   dv Std. Deviation
.00     4.0000    2.00000
1.00    7.4000    1.14018

Coefficients (Dependent Variable: dv)
Model 1      B       Std. Error   Beta    t       Sig.    95% CI for B (Lower, Upper)
(Constant)   4.000   .728                 5.494   .001    (2.321, 5.679)
group        3.400   1.030        .760    3.302   .011    (1.026, 5.774)

Note again our regression equation
Recall the definitions of the slope and the constant
First the constant: what does "when X = 0" mean in this setting?
It means we are in the 0 group
What is that value?

Y = 4, which is that group's mean

The constant here is thus the reference group's mean

$$\hat{Y} = 4 + 3.4X$$

[Descriptive Statistics and Coefficients tables repeated from the previous slides]

Now think about the slope
What does a 1-unit change in X mean in this setting?
It means we go from one group to the other
Based on that, what does the slope represent in this case (i.e. can you derive that coefficient from the descriptive stats in some way)?
The coefficient is the difference between the means

$$\hat{Y} = 4 + 3.4X$$

[Descriptive Statistics and Coefficients tables repeated from the previous slides]

The regression line covers the values represented, i.e. 0 and 1 for the two groups
It passes through each of their means
Using least-squares regression, the regression line always passes through the mean of X and the mean of Y

The constant (if we are using dummy coding) is the mean of the zero (reference) group
The coefficient is the difference between the means

Analysis of variance
Recall that in regression we are trying to
account for the variance in the DV
That total variance reflects the sum of the
squared deviations of values from the DV
mean

Sums of squares

That breaks down into:
Variance we account for
Sums of squares predicted (also called model or regression sums of squares)

And variance that we do not account for
Sums of squares error (observed − predicted)

What are our predicted values in this case?
We only have two values of X to plug in
We already know what Y is if X is zero, so we'd predict the group mean of 4 for all zero values: Ŷ = 4 + 3.4(0) = 4

The only other value to plug in is 1 for the rest of the cases
In other words, for those in the 1 group, we're predicting their respective mean: Ŷ = 4 + 3.4(1) = 7.4

So in order to get our model summary and F-statistic, we need:
Total variance
Predicted variance
Predicted value minus the grand mean of the DV, just as it has always been
Note again how our average predicted value is our group average for the DV

Error variance
Essentially each person's score minus their group mean

Predicted SS = 5[(4 − 5.7)² + (7.4 − 5.7)²] = 28.9

Error SS = (3 − 4)² + (5 − 4)² + … + (9 − 7.4)² = 21.2

Total variability to be accounted for = (3 − 5.7)² + (5 − 5.7)² + … + (9 − 5.7)² = 50.1
Or just Predicted SS + Error SS
Calculate R² from these values
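A quick check of these sums of squares (a sketch in numpy using the same ten scores):

```python
import numpy as np

group = np.array([0, 0, 0, 0, 0, 1, 1, 1, 1, 1], dtype=float)
dv    = np.array([3, 5, 7, 2, 3, 6, 7, 7, 8, 9], dtype=float)

grand_mean = dv.mean()                     # 5.7
y_hat = np.where(group == 0, 4.0, 7.4)     # predicted value = each group's mean

ss_predicted = np.sum((y_hat - grand_mean) ** 2)   # 28.9
ss_error     = np.sum((dv - y_hat) ** 2)           # 21.2
ss_total     = np.sum((dv - grand_mean) ** 2)      # 50.1

r_squared = ss_predicted / ss_total                # about .577
print(ss_predicted, ss_error, ss_total, r_squared)
```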

Here is the summary table from our regression
The mean square is derived by dividing our sums of squares by the degrees of freedom
Regression df = k − 1
Total df = N − 1
Error df = N − k

The ratio of the mean squares is the F-statistic

ANOVA (Dependent Variable: dv; Predictors: (Constant), group)
Model 1      Sum of Squares   df   Mean Square   F        Sig.
Regression   28.900           1    28.900        10.906   .011
Residual     21.200           8    2.650
Total        50.100           9

Note the title of the summary table: ANOVA

It is an ANOVA summary table because you have in fact just conducted an analysis of variance, specifically for the two-group situation
ANOVA, the statistical procedure as it is so called, is a special case of regression
Below, the first table is the regression output and the second is the output from the ANOVA procedure.

[ANOVA table repeated from the previous slide]

Tests of Between-Subjects Effects (Dependent Variable: dv)
Source            Type III Sum of Squares   df   Mean Square   F        Sig.   Partial Eta Squared
group             28.900                    1    28.900        10.906   .011   .577
Error             21.200                    8    2.650
Total             375.000                   10
Corrected Total   50.100                    9

Note the partial eta-squared

Eta-squared has the same interpretation as R-squared and, as one can see, it is the R-squared from our regression
SPSS calls it partial because there is often more than one grouping variable, and we are interested in unique effects (i.e. partialling out the effects of other variables)
However, it is actually just eta-squared here, as there is no other variable effect to partial out

[ANOVA and Tests of Between-Subjects Effects tables repeated from the previous slides]

The t-test is a special case of ANOVA

ANOVA can handle more than two groups, while the t-test is just for two
However, F = t² in the two-group setting, and the p-value is exactly the same
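A brief sketch using scipy.stats on the two groups from this example, showing that t² equals F and that the p-values match:

```python
import numpy as np
from scipy import stats

g0 = np.array([3, 5, 7, 2, 3], dtype=float)   # reference (0) group
g1 = np.array([6, 7, 7, 8, 9], dtype=float)   # the 1 group

t_res = stats.ttest_ind(g0, g1)    # independent-samples t-test
f_res = stats.f_oneway(g0, g1)     # one-way ANOVA on the same two groups

print(t_res.statistic ** 2, f_res.statistic)   # t^2 equals F (about 10.9)
print(t_res.pvalue, f_res.pvalue)              # identical p-values (about .011)
```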

Independent Samples Test (t-test for Equality of Means)
dv (equal variances assumed):  t = −3.302,  df = 8,  Sig. (2-tailed) = .011,  Mean Difference = −3.40000,  Std. Error Difference = 1.02956,  95% CI of the Difference [−5.77418, −1.02582]

[Tests of Between-Subjects Effects table repeated from above for comparison: F = 10.906, Sig. = .011]

Compare to the regression output
The t, standard error, confidence interval, and p-value are the same, and again the coefficient is the difference between the means
[Independent Samples Test and Coefficients tables repeated from the previous slides]

Statistics is a language used for communicating research ideas and findings
We have various dialects with which to speak it, and of course we pick freely from the words available
Sometimes we prefer to do regression and talk about the amount of variance accounted for
Sometimes we prefer to talk about mean differences and how large those are

In both cases we are interested in the effect size

Which tool we use reflects how we want to talk about our results

Let's assume that we believe there is a linear relationship between X and Y
Which set of parameter values will bring us closest to representing the data accurately?

$$\hat{Y} = 2 - 2X$$

We begin by picking some values, plugging them into the equation, and seeing how well the implied values correspond to the observed values
We can quantify what we mean by "how well" by examining the difference between the model-implied Ŷ and the actual Y value
This difference between our observed value and the one predicted, y − ŷ, is often called the error in prediction, or the residual

$$\hat{Y} = 2 - 1X$$

Let's try a different value of b and see what happens
Now the implied values of Y are getting closer to the actual values of Y, but we're still off by quite a bit

$$\hat{Y} = 2 + 0X$$

Things are getting better, but certainly things could improve

$$\hat{Y} = 2 + 1X$$

Ah, much better

$$\hat{Y} = 2 + 2X$$

Now that's very nice
There is a perfect correspondence between the predicted values of Y and the actual values of Y
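A minimal sketch of this trial-and-error search, assuming hypothetical data generated to fall exactly on Y = 2 + 2X so that the last candidate slope fits perfectly:

```python
import numpy as np

# Hypothetical data that fall exactly on the line Y = 2 + 2X,
# so one of the candidate slopes below will fit perfectly
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = 2 + 2 * x

a = 2                                  # hold the intercept fixed at 2
for b in (-2, -1, 0, 1, 2):            # candidate slope values
    y_hat = a + b * x                  # model-implied values of Y
    residuals = y - y_hat              # error in prediction
    sse = np.sum(residuals ** 2)       # sum of squared residuals
    print(f"b = {b:+d}  ->  SSE = {sse:.1f}")
# SSE shrinks as b moves from -2 toward 2 and reaches 0 at b = 2
```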
