
Partial & Semi-Partial Correlation and Multiple Regression

Relationships among > 2 variables

Correlation & Regression


Both test simple linear relationships between 2 variables
Correlation: non-directional
Regression: directional

Both can be extended to more than 2 variables


Partial correlation: non-directional
Semi-partial correlation: directional
Multiple regression: directional

Dealing with Data


Imagine the ETS calls you up and says they think there is a relationship between the hours a student spends preparing for the SAT and the score on the SAT. They have asked recent SAT-takers to provide an estimate of the hours spent preparing (including classes). They provide you with these data as well as each student's GPA and the final score on the SAT.

ETS Example
What data do you have?
Hours of prep, GPA, SAT score

What kinds of predictions might you make about the relationship between hours of preparation and SAT score? How can you examine the relationship(s)?

Simple Correlation
Goal: determine the relationship between 2 variables (e.g., y and x1)
r²_yx1 is the shared variance between y and x1

(Venn diagram: circles for Y and X1; the overlapping region is r²_yx1)

ETS Example
Can look at simple correlation between each pair of variables
prep hours & SAT
prep hours & GPA
GPA & SAT


ETS Example
Prep hours x SAT (scatter plot of prep hours against SAT score for the 20 students below)

Prep hours   SAT score   GPA
15           1040        2.8
6            1450        3.75
12           1000        2.6
2            1510        3.8
18           1230        3.2
30           1160        2.75
26           1580        3.15
15           1240        2.4
10           1329        3.3
20           1470        3.5
5            1460        3.4
30           1020        2.4
12           1390        3.6
16           1200        2.87
25           1060        2.9
7            1040        2.65
24           1340        2.67
10           1280        3.5
14           1290        3.23
22           1450        3.0
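As a small illustration (my own sketch, not part of the slides), the three pairwise Pearson correlations can be computed from the table above with numpy, assuming the row pairing shown:

import numpy as np

prep = np.array([15, 6, 12, 2, 18, 30, 26, 15, 10, 20,
                 5, 30, 12, 16, 25, 7, 24, 10, 14, 22])
sat  = np.array([1040, 1450, 1000, 1510, 1230, 1160, 1580, 1240, 1329, 1470,
                 1460, 1020, 1390, 1200, 1060, 1040, 1340, 1280, 1290, 1450])
gpa  = np.array([2.8, 3.75, 2.6, 3.8, 3.2, 2.75, 3.15, 2.4, 3.3, 3.5,
                 3.4, 2.4, 3.6, 2.87, 2.9, 2.65, 2.67, 3.5, 3.23, 3.0])

# Pairwise Pearson correlations; later slides quote roughly -0.22, 0.72, and -0.54
print(np.corrcoef(prep, sat)[0, 1])   # prep hours & SAT
print(np.corrcoef(gpa, sat)[0, 1])    # GPA & SAT
print(np.corrcoef(prep, gpa)[0, 1])   # prep hours & GPA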

ETS Example
GPA x SAT (scatter plot of GPA against SAT score for the same 20 students listed above)

ETS Example
GPA x prep hours (scatter plot of GPA against prep hours for the same 20 students listed above)

ETS Example
GPA & SAT: not surprising
Prep hours & SAT: huh?
GPA & Prep hours: people with lower GPAs prep more (why?)
This could explain the odd Prep hours & SAT relationship

(Path diagrams: Prep Hrs related to SAT directly, vs. GPA related to both Prep Hrs and SAT)

Three (or more) Variables


3 variables = 3 relationships
Each can affect the other two
Partial & semi-partial correlation: remove the contributions of the 3rd variable

(Venn diagram: overlapping circles for Y, X1, and X2)

Partial Correlation
Find the correlation between two variables with the third held constant in BOTH
That is, we remove the effect of x2 from both y and x1
r²_yx1.x2 is the shared variance of y & x1 with x2 removed

(Venn diagram: circles for Y, X1, X2; r²_yx1.x2 is the Y-X1 overlap excluding X2)

Partial Correlation
Correlate y without x2 with x1 without x2 (i.e., the residuals)
We can put this in terms of simple correlation coefficients:

r_yx1.x2 = (r_yx1 - r_yx2 r_x1x2) / sqrt((1 - r²_yx2)(1 - r²_x1x2))

Numerator: the simple correlation between y and x1, minus the product of the correlation of y & x2 and the correlation of x1 & x2
Denominator: the variance that remains once the partialled-out relationships are removed
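A minimal sketch (mine, not from the slides) of the formula above in Python; the function name is arbitrary:

import math

def partial_corr(r_yx1, r_yx2, r_x1x2):
    """Correlation of y and x1 with x2 partialled out of both."""
    numerator = r_yx1 - r_yx2 * r_x1x2
    denominator = math.sqrt((1 - r_yx2**2) * (1 - r_x1x2**2))
    return numerator / denominator

# With the ETS values quoted on later slides (-0.21, 0.71, -0.54) this returns
# roughly 0.29, matching the slide's 0.28 up to rounding.
print(partial_corr(-0.21, 0.71, -0.54))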

Partial Correlation
The significance of r_yx1.x2 can be calculated using t

H0: ρ_yx1.x2 = 0 (no relationship)
H1: ρ_yx1.x2 ≠ 0 (either positive or negative correlation)

t(N-3) = (r_yx1.x2 - ρ_yx1.x2) / sqrt((1 - r²_yx1.x2)/(N-3))

1 - r²_yx1.x2 is the unexplained variance
N-3 = degrees of freedom (three variables)
sqrt((1 - r²_yx1.x2)/(N-3)) = standard error of r_yx1.x2

ETS Example
Correlation between prep hours and SAT score with GPA partialled out:

r_yx1.x2 = (r_yx1 - r_yx2 r_x1x2) / sqrt((1 - r²_yx2)(1 - r²_x1x2))
         = (-0.21 - (-0.54 * 0.71)) / sqrt((1 - (-0.54)²)(1 - 0.71²))
         = 0.28

ETS Example
The partial correlation between prep hours and SAT score with the effect of GPA removed: r_yx1.x2 = 0.28, r²_yx1.x2 = 0.08

t(N-3) = t(17) = r_yx1.x2 / sqrt((1 - r²_yx1.x2)/(N-3))
               = 0.28 / sqrt((1 - 0.08)/17)
               = 1.23

Significant? t_0.05(17) = 2.11, so t(17) = 1.23 is not significant
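A sketch (not from the slides) of the same t test using scipy for the two-tailed p value; with the slide's rounded r = 0.28 the t comes out near 1.2 rather than exactly 1.23:

import math
from scipy import stats

r, N = 0.28, 20                       # partial correlation and sample size
t = r / math.sqrt((1 - r**2) / (N - 3))
p = 2 * stats.t.sf(abs(t), df=N - 3)  # two-tailed p on N-3 = 17 df
print(t, p)                           # t is well below the 2.11 cutoff, so not significant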

Semi-Partial Correlation
Find the correlation between two variables with the third held constant in only ONE of them
That is, we remove the effect of x2 from x1 (but not from y)
r²_y(x1.x2) is the shared variance of y & x1 with x2 removed from x1

(Venn diagram: circles for Y, X1, X2; r²_y(x1.x2) is the Y-X1 overlap after X2 is removed from X1 only)

Semi-Partial Correlation
Why "semi-partial"? Generally used with multiple regression to remove the effect of one predictor from another predictor without removing that variability from the predicted variable
NOT typically reported as the only analysis

Semi-Partial Correlation
Correlate y with x1-without-x2 (the residual of x1)
Put in terms of simple correlation coefficients:

r_y(x1.x2) = (r_yx1 - r_yx2 r_x1x2) / sqrt(1 - r²_x1x2)

Numerator: the simple correlation between y and x1, minus the product of the correlation of y & x2 and the correlation of x1 & x2
Same as the partial correlation except the shared variance of y & x2 is left in (the (1 - r²_yx2) term is dropped from the denominator)

Semi-Partial Correlation
Which will be larger, the partial or the semi-partial correlation?

partial:       r_yx1.x2  = (r_yx1 - r_yx2 r_x1x2) / sqrt((1 - r²_yx2)(1 - r²_x1x2))

semi-partial:  r_y(x1.x2) = (r_yx1 - r_yx2 r_x1x2) / sqrt(1 - r²_x1x2)

(The numerators are identical; the partial's denominator contains the extra factor (1 - r²_yx2) ≤ 1, so the partial correlation can never be smaller in absolute value.)

ETS Example
Going back to the SAT example, suppose we partial GPA out of hours of prep only:

r_y(x1.x2) = (r_yx1 - r_yx2 r_x1x2) / sqrt(1 - r²_x1x2)
           = (-0.21 - (-0.54 * 0.71)) / sqrt(1 - (-0.54)²)
           = 0.20

Significance of Semi-Partial
Same as for the partial correlation, just substitute r_y(x1.x2); df = N-3

t(N-3) = r_y(x1.x2) / sqrt((1 - r²_y(x1.x2))/(N-3))

ETS Example
The semi-partial correlation between prep hours and SAT score with the effect of GPA removed from prep hours: r_y(x1.x2) = 0.20, r²_y(x1.x2) = 0.04

t(N-3) = t(17) = r_y(x1.x2) / sqrt((1 - r²_y(x1.x2))/(N-3))
               = 0.20 / sqrt((1 - 0.04)/17)
               = 0.84

Significant? t_0.05(17) = 2.11, so t(17) = 0.84 is not significant
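A companion sketch (again mine, not the slides') for the semi-partial correlation and its t test with the same quoted correlations; small rounding differences from the slide's 0.20 and 0.84 are expected:

import math
from scipy import stats

def semipartial_corr(r_yx1, r_yx2, r_x1x2):
    """Correlation of y and x1 with x2 removed from x1 only."""
    return (r_yx1 - r_yx2 * r_x1x2) / math.sqrt(1 - r_x1x2**2)

sr = semipartial_corr(-0.21, 0.71, -0.54)    # about 0.21
t = sr / math.sqrt((1 - sr**2) / (20 - 3))   # about 0.87; t_0.05(17) = 2.11, not significant
p = 2 * stats.t.sf(abs(t), df=17)
print(sr, t, p)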

Multiple Regression
Simple regression: y = a + bx
Multiple regression: the General Linear Model
y = a + b1 x1 + b2 x2 (2 predictors)
Therefore, the general formula: y = a + b1 x1 + ... + bk xk (k predictors)
The problem is to solve for k+1 coefficients
k predictors (regressors) + the intercept
We are most concerned with the predictors
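A minimal sketch of fitting such a model by ordinary least squares with numpy; the arrays are placeholder values (the first rows of the ETS table), and lstsq returns the intercept plus the raw b weights:

import numpy as np

x1 = np.array([15.0, 6.0, 12.0, 2.0, 18.0, 30.0])               # placeholder predictor 1
x2 = np.array([2.8, 3.75, 2.6, 3.8, 3.2, 2.75])                 # placeholder predictor 2
y  = np.array([1040.0, 1450.0, 1000.0, 1510.0, 1230.0, 1160.0]) # placeholder outcome

X = np.column_stack([np.ones_like(x1), x1, x2])  # constant column for the intercept
coef, *_ = np.linalg.lstsq(X, y, rcond=None)     # least-squares solution
a, b1, b2 = coef                                 # y' = a + b1*x1 + b2*x2
print(a, b1, b2)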

ETS Example
Prep hours (x1), GPA (x2), & SAT (y)
Use Prep hours and GPA to predict SAT score

Simple regressions
y = -4.79 x1 + 1353
y = 300 x2 + 355

ETS Example
Use both prep hours and GPA to predict SAT score
Now find the equation for the 3-D relationship

Finding Regression Weights


What do we minimize?
Σ(y - y')² (least squares principle)

For multiple regression, it is easier to think in terms of standardized regression coefficients*

Finding Regression Weights


What do we minimize?
Σ(y - y')² (least squares principle)

For multiple regression, it is easier to think in terms of standardized regression coefficients*


z'_y = β1 z_x1 + β2 z_x2
The goal is to find the βs that minimize:

(1/N) Σ(z_y - z'_y)² = (1/N) Σ(z_y - β1 z_x1 - β2 z_x2)²

Finding Regression Weights


Using differential calculus, we find 2 normal equations for 2 regressors:

β1 + r_x1x2 β2 - r_x1y = 0
r_x1x2 β1 + β2 - r_x2y = 0

These can be converted to:

β1 = (r_x1y - r_x2y r_x1x2) / (1 - r²_x1x2)
β2 = (r_x2y - r_x1y r_x1x2) / (1 - r²_x1x2)

Notice that these are like the semi-partial correlation

Finding Regression Weights


In practice, the raw scores are used:

(y' - ȳ)/s_y = β1 (x1 - x̄1)/s_x1 + β2 (x2 - x̄2)/s_x2

(where s_y, s_x1, s_x2 are the estimated standard deviations and ȳ, x̄1, x̄2 the means)

which is equivalent to:

y' = β1 (s_y/s_x1) x1 + β2 (s_y/s_x2) x2 + [ȳ - β1 (s_y/s_x1) x̄1 - β2 (s_y/s_x2) x̄2]

Finding Regression Weights


Look at each segment...

y' = [β1 (s_y/s_x1)] x1 + [β2 (s_y/s_x2)] x2 + [ȳ - β1 (s_y/s_x1) x̄1 - β2 (s_y/s_x2) x̄2]
   =       b1 x1      +       b2 x2       +                       a

We have the regression equation with the RAW regression weights:

b1 = β1 (s_y/s_x1)
b2 = β2 (s_y/s_x2)
a = ȳ - b1 x̄1 - b2 x̄2

ETS Example
Use the r's to get the β's:
r_x1x2 = -0.54, r_x1y = -0.22, r_x2y = 0.72

β1 = (r_x1y - r_x2y r_x1x2) / (1 - r²_x1x2) = 0.24
β2 = (r_x2y - r_x1y r_x1x2) / (1 - r²_x1x2) = 0.84

Use the β's to get the raw coefficients:
b1 = β1 (s_y/s_x1) = 5.16
b2 = β2 (s_y/s_x2) = 353
a = ȳ - b1 x̄1 - b2 x̄2 = 110
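A sketch (my own) that plugs the quoted correlations into the two-regressor formulas above; converting to raw weights would additionally need the sample standard deviations and means, so those lines are only shown symbolically in the comments:

r_x1x2, r_x1y, r_x2y = -0.54, -0.22, 0.72

beta1 = (r_x1y - r_x2y * r_x1x2) / (1 - r_x1x2**2)   # about 0.24
beta2 = (r_x2y - r_x1y * r_x1x2) / (1 - r_x1x2**2)   # about 0.85
print(beta1, beta2)

# Raw weights (would need the sample SDs and means from the data):
# b1 = beta1 * s_y / s_x1;  b2 = beta2 * s_y / s_x2;  a = mean_y - b1*mean_x1 - b2*mean_x2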

Finding Regression Weights


For >2 predictors, the same principles apply
The normal equations minimize Σ(y - y')² (deviation of actual from predicted)
The equations can be expressed in matrix form as: R_ij B_j - R_jy = 0
R_ij = k × k matrix of the correlations among the different independent variables (x's)
B_j = a column vector of the k unknown β values (one for each x)
R_jy = a column vector of the correlation coefficients between each of the k predictors and the dependent variable (y)

Finding Regression Weights


R_ij B_j - R_jy = 0; R_ij and R_jy are known
(each r_xixj and each r_yxi)

Therefore, we can solve for B_j: B_j = R_ij⁻¹ R_jy (in matrix form, this is really easy!)
Don't worry about actually calculating these, but be sure you understand the equation!
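The matrix solution B_j = R_ij⁻¹ R_jy, sketched in numpy under the assumption of the two-predictor ETS correlations (any k works the same way):

import numpy as np

R_ij = np.array([[1.0, -0.54],      # correlations among the predictors
                 [-0.54, 1.0]])
R_jy = np.array([-0.22, 0.72])      # correlations of each predictor with y

B_j = np.linalg.solve(R_ij, R_jy)   # solves R_ij @ B_j = R_jy
print(B_j)                          # standardized weights, roughly [0.24, 0.85]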

Finding Regression Weights


For each independent variable, we can use the relationship of b to β:
b_j = β_j (s_y / s_xj)

The same principle for obtaining the intercept in simple regression applies as well:
a = ȳ - Σ b_j x̄_j

Explained Variance (Fit)


For 2 predictors, the equation defines a plane
y = 5.16 x1 + 353 x2 + 110 (ETS example)

How far are the points in 3-D space from the plane defined by the equation?

Explained Variance
In addition to simple (r_xy), partial (r_yx1.x2), & semi-partial (r_y(x1.x2)) correlation coefficients, we can have a multiple correlation coefficient (R_y.x1x2)
R_y.x1x2 = correlation between the observed value of y and the predicted value of y
It can be expressed in terms of beta weights and simple correlation coefficients:

R_y.x1x2 = sqrt(β1 r_yx1 + β2 r_yx2)    OR    R²_y.x1x2 = β1 r_yx1 + β2 r_yx2

Explained Variance
R²_y.x1x2 = β1 r_yx1 + β2 r_yx2
Any βi represents the contribution of variable xi to predicting y
The more general version of this equation is simply R² = Σ βj r_yxj, or in matrix form R² = B_j R_jy
(Just add up the products of the β's and the r's)
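As a small check of this formula (my sketch; R² is not quoted on the slides, so the printed value is only illustrative):

betas = [0.24, 0.84]     # standardized weights from the ETS example
r_y   = [-0.22, 0.72]    # simple correlations of each predictor with y

R2 = sum(b * r for b, r in zip(betas, r_y))   # R² = Σ βj * r_yxj
print(R2)                                     # roughly 0.55 with these rounded inputs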

How are the βi's and R² related to the simple correlation coefficients?

Explained Variance
R²_y.x1x2 = β1 r_yx1 + β2 r_yx2
If x1 and x2 are uncorrelated:
β1 = r_yx1 and β2 = r_yx2
so R²_y.x1x2 = r_yx1 r_yx1 + r_yx2 r_yx2 = r²_yx1 + r²_yx2

(Venn diagram: X1 and X2 each overlap Y but not each other)

Explained Variance
R²_y.x1x2 = β1 r_yx1 + β2 r_yx2
If x1 and x2 are correlated:
the βi's are corrected so that the overlap is not counted twice

(Venn diagram: X1 and X2 overlap each other as well as Y)

Adjusted R2
R² is a biased estimate of the population R² value
If you want to estimate the population value, use Adjusted R²
Most stats packages calculate both R² and Adjusted R²; if not, the value can be obtained from R²:

Adj R² = R² - [k (1 - R²) / (N - k - 1)]
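A one-line version of the adjustment formula above (the helper name is my own):

def adjusted_r2(R2, N, k):
    """Shrink R² toward the population value given N cases and k predictors."""
    return R2 - (k * (1 - R2)) / (N - k - 1)

print(adjusted_r2(0.55, N=20, k=2))   # illustrative values based on the ETS example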

Significance Tests
In multiple regression, there are 3 different statistical tests that are of interest
Significance of R2
Is the fit of the regression model significant?

Significance for increments to R2


How much does adding a variable improve the fit of the regression model?

Significance of the regression coefficients


βj is the contribution of xj. Is this different from 0?


Partitioning Variance
Σ(y - ȳ)² = Σ(y - y')² + Σ(y' - ȳ)²

Total variance in y (aka SStotal) = Unexplained variance (aka SSres) + Explained variance (aka SSreg)

Same as in simple regression! The only difference is that y' is generated by a linear function of several independent variables (k predictors)
Note: SStotal = SSregression + SSresidual

Significance of R²
Need a ratio of variances (F value):

F = MSreg / MSres = (SSreg / dfreg) / (SSres / dfres)

Where do these values come from?
SSreg = Σ(y' - ȳ)²; dfreg = k (the number of regressors)
SSres = Σ(y - y')²; dfres = N - k - 1 (observations minus regressors minus 1)
The F for the overall model reflects this ratio
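A sketch of the overall F test, assuming you already have the two sums of squares; scipy's F distribution supplies the p value (function name is my own):

from scipy import stats

def overall_F(SS_reg, SS_res, N, k):
    """F = MSreg / MSres with dfreg = k and dfres = N - k - 1."""
    MS_reg = SS_reg / k
    MS_res = SS_res / (N - k - 1)
    F = MS_reg / MS_res
    p = stats.f.sf(F, k, N - k - 1)   # upper-tail p value
    return F, p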

Significant Increments to R²
As variables (predictors) are added to the regression, R² can:
stay the same (the additional variable has NO contribution)
increase (the additional variable has some contribution)

If R2 increases, we want to know if that increase is significant


Significant Increments to R²
Use an F test on the change in R²:

F_ΔR² = [(R²_L - R²_S) / (k_L - k_S)] / [(1 - R²_L) / (N - k_L - 1)]

Making sense of the equation:
L = larger model; S = smaller model
ALL variables in the smaller model (S) must also be in the larger model (L)
Therefore, L is model S plus one or more additional variables
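The same idea as a small helper, with the larger and smaller model R² values as assumed inputs (a sketch, not a library routine):

from scipy import stats

def increment_F(R2_L, R2_S, k_L, k_S, N):
    """F test for the increase in R² from the smaller (S) to the larger (L) model."""
    F = ((R2_L - R2_S) / (k_L - k_S)) / ((1 - R2_L) / (N - k_L - 1))
    p = stats.f.sf(F, k_L - k_S, N - k_L - 1)
    return F, p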

Significance of Coefficients
Think about bj in t terms: bj / est(bj)
bj / est(bj) is distributed as t with N - k - 1 degrees of freedom, where...

est(bj) = sqrt[ (SSres / (N - k - 1)) / (SSj (1 - R²j)) ]

SSj = sum of squares for variable xj
R²j = squared multiple correlation for predicting xj from the remaining k - 1 predictors (treating xj as the predicted variable)

Significance of Coefficients
est(bj) = sqrt[ (SSres / (N - k - 1)) / (SSj (1 - R²j)) ]

As R²j increases, the (1 - R²j) term in the denominator approaches 0; that is, est(bj) becomes larger
As the remaining x's account for more of xj, bj is less likely to reach significance
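A sketch of the coefficient test: the standard error of bj built from the residual sum of squares, the predictor's own sum of squares, and R²j (xj regressed on the other predictors); all arguments are assumed inputs:

import math
from scipy import stats

def coefficient_t(b_j, SS_res, SS_j, R2_j, N, k):
    """t = b_j / est(b_j), with N - k - 1 degrees of freedom."""
    est_bj = math.sqrt((SS_res / (N - k - 1)) / (SS_j * (1 - R2_j)))
    t = b_j / est_bj
    p = 2 * stats.t.sf(abs(t), df=N - k - 1)   # two-tailed
    return t, p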


Importance of IVs (xs)


Uncorrelated IVs: the simple r_yxj's work
Correlated IVs:
simple correlation coefficients include variance shared among the IVs (over-estimated)
regression weights can involve predictor intercorrelations or suppressors (more later)
Best measure: the squared semi-partial correlation sr²j
BUT sr²j comes in different forms for different types of regression

Multiple Regression Types


Several types of regression are available. How do they differ?
Method for entering variables
What variables are in the model; what variables are held constant
Use of different types of R² values
Use of different measures to assess the importance of IVs

Multiple Regression Types


Simultaneous Regression (most common)
Single regression model with all variables
All predictors are entered simultaneously
All variables treated equally

Each predictor is assessed as if it were entered last

Each predictor is evaluated in terms of what it adds to the prediction of the dependent variable, over and above the other variables
Key test: sr²j for each xj with all other x's held constant


Multiple Regression Types


Hierarchical Regression
Multiple models calculated
Start with one predictor
Add predictors
Order specified by the researcher

Each predictor is assessed in terms of what it adds at the time it is entered


Each predictor is evaluated in terms of what it adds to the prediction of the dependent variable, over and above the other variables that have already been entered
Key test: the change in R² at each step

Multiple Regression Types


Hierarchical Regression
Used when the researcher has a priori reasons for entering variables in a certain order
Specific hypotheses about the components of theoretical models
Practical concerns about what it is important to know

Multiple Regression Types


Stepwise & Setwise Regressions
Multiple models calculated (like hierarchical)
Use statistical criteria to determine order
Limit the final model to meaningful regressors

Recommended for exploratory analyses of very large data sets (> 30 predictors)
With lots of predictors, keeping all but one constant may make it difficult to find any significant predictors
These procedures capitalize on chance to find the meaningful variables


Multiple Regression Types


Stepwise Regression: Forward
Step 1: enter the xj with the largest simple r_yxj
Step 2: partial out the first variable and choose the xj with the highest partial r_yxj.x1
Step 3: partial out x1 and x2, and so on
Stop when the resulting model reaches some criterion (e.g., a minimum R²)
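A toy sketch of the forward idea, using the squared correlation with the current residual as the entry criterion; real packages use F-to-enter rules and true partial correlations, so treat this only as an illustration:

import numpy as np

def forward_select(X, y, n_steps):
    """Greedily add the predictor most correlated with the current residual."""
    chosen, residual = [], y - y.mean()
    for _ in range(n_steps):
        remaining = [j for j in range(X.shape[1]) if j not in chosen]
        # pick the remaining column with the largest |r| against the residual
        best = max(remaining, key=lambda j: abs(np.corrcoef(X[:, j], residual)[0, 1]))
        chosen.append(best)
        # refit with the chosen columns (plus intercept) and update the residual
        Z = np.column_stack([np.ones(len(y))] + [X[:, j] for j in chosen])
        coef, *_ = np.linalg.lstsq(Z, y, rcond=None)
        residual = y - Z @ coef
    return chosen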

Multiple Regression Types


Stepwise Regression: Backward
Step 1: start with the complete model (all xj's)
Step 2: remove an xj based on some criterion
smallest R²
smallest F

Stop removing variables when some criterion is reached:
all remaining regressors significant
minimum R²

Multiple Regression Types


Setwise Regression
Test several simultaneous models
Finds the best possible subset of variables
Setwise(#): for a given set size
Setwise Full: for all possible set sizes

For example, with 8 variables:
Look at all possible combinations of, say, 5 variables
Figure out which combo has the largest R²
Can be done for sets of 2, 3, 4, 5, 6, or 7 variables
In each case, find the set with the largest R²


Importance of Regressors
βi's primarily serve to help define the equation for predicting y
The squared semi-partial correlation (sr²) is more appropriate for practical importance
Put in terms of the variance explained by each regressor
Compare how much variance each regressor explains

Importance of IVs (xs)


For simultaneous or setwise regression
sr²j is the amount R² would be reduced if variable xj were not included in the regression equation
In terms of the regression statistics:

sr²j = Fj (1 - R²) / dfres

When the IVs are correlated, the sr²j's for all of the xj's will not sum to the R² for the full model
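The identity above as a tiny helper (a sketch; Fj is the F, or squared t, for predictor j in the full simultaneous model):

def squared_semipartial(F_j, R2_full, df_res):
    """sr²_j = F_j * (1 - R²_full) / df_res for simultaneous regression."""
    return F_j * (1 - R2_full) / df_res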

Importance of IVs (xs)


For hierarchical or stepwise regression
sr²j is the increment to R² added when xj is entered into the equation
Because each variable is added separately, sr²j will reflect that variable's contribution AT A PARTICULAR POINT in the model
The sum of the sr²j values WILL sum to R²
The importance of the different variables may vary depending on the order in which the variables are entered


Potential Problems
Several assumptions
(see Berry & Feldman pp. 10-11 in book)

Random variables, interval scale
No perfect collinear relationships

Also practical concerns
Focus on most relevant/prevalent

Multicollinearity
Perfect collinearity: when one independent variable is perfectly linearly related to one or more of the other regressors
x1 = 2.3 x2 + 4: x1 is perfectly predicted by x2
x1 = 4.1 x3 + 0.45 x4 + 11.32: x1 is perfectly predicted by a linear combination of x3 and x4
Any case where there is an R² value of 1.00 among the regressors (NOT including y)
Why might this be a problem?

Multicollinearity
Perfect collinearity (simplest case)
One variable is a linear function of another
We'd be thrilled (and skeptical) to see this in a simple regression
However...


Multicollinearity
Perfect collinearity (simplest case)
Problem in multiple regression: the y values will line up in a single plane rather than varying about a plane

Multicollinearity
Perfect collinearity (simplest case)
No way to determine the plane that fits the y values best
Many possible planes

Multicollinearity
In practice
perfect collinearity violates the assumptions of regression
less-than-perfect collinearity is more common
not an all-or-nothing situation; can have varying degrees of multicollinearity
dealing with multicollinearity depends on what you want to know


Multicollinearity
Consequences
If the only goal is prediction, not a problem
plugging in known numbers will give you the unknown value
although specific regression weights may vary, the final outcome will not

We usually want to explain the data


can identify the contributions of the regressors that are NOT collinear
cannot identify the contributions of the regressors that are collinear, because regression weights will change from sample to sample

Multicollinearity
Detecting collinearities
Some clues
full model is significant but none of the individual regressors reach significance
instability of weights across multiple samples
look at simple regression coefficients for all pairs
cumbersome way: regress each independent variable on all other independent variables to see if any R² values are close to 1

Multicollinearity
What can you do about it?
Increase the sample size
reduce the error
offset the effects of multicollinearity

If you know the relationship, you can use that information to offset the effect (yeah, right!)
Delete one of the variables causing the problem
which one? If one is predicted by a group of others
logical rationale? presumably, the variables were there for theoretical reasons


Multicollinearity
Detecting collinearities
SPSS: Collinearity diagnostics & follow-up
Tolerance: 1-R2 for the regression of each IV against the remaining regressors.
Collinearity: tolerance close to 0
Use this to locate the collinearity

VIF: variance inflation factor = instability of the regression weights (reciprocal of Tolerance)

To locate the collinearity: run a Forward regression on the variable with the lowest tolerance (predicting it from the remaining IVs)

To resolve: re-run the original regression, removing the variable with the lowest tolerance
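A sketch of computing tolerance and VIF by hand (my own helper, not SPSS output): regress each predictor on the others and take 1 - R²; the predictor matrix X is an assumed input:

import numpy as np

def tolerance_and_vif(X):
    """Return (tolerance, VIF) for each column of the predictor matrix X."""
    n, k = X.shape
    tol = np.empty(k)
    for j in range(k):
        # regress column j on all the other columns (plus an intercept)
        others = np.column_stack([np.ones(n), np.delete(X, j, axis=1)])
        coef, *_ = np.linalg.lstsq(others, X[:, j], rcond=None)
        resid = X[:, j] - others @ coef
        ss_tot = np.sum((X[:, j] - X[:, j].mean())**2)
        r2_j = 1 - np.sum(resid**2) / ss_tot
        tol[j] = 1 - r2_j          # tolerance: near 0 signals collinearity
    return tol, 1 / tol            # VIF is the reciprocal of tolerance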

Suppression
Special case of multicollinearity
Suppressor variables are variables that increase the value of R² by virtue of their correlations with other predictors and NOT with the dependent variable
The best way to explain this is by way of an example...

Suppression Example
Predicting course grade in a multivariate statistics course with GRE verbal and quantitative
The multiple correlation R was 0.62 (reasonable, right?)
However, the β's were 0.58 for GRE-Q and -0.24 for GRE-V
Does this mean that higher GRE-V scores were associated with lower course performance? Not exactly


Suppression Example
Why was the β for GRE-V negative?
The GRE-V alone actually had a small positive correlation with course grade
The GRE-V and GRE-Q are highly correlated with each other
The regression weights indicate that, for a given score on the GRE-Q, the lower a person scores on the GRE-V, the higher the predicted course grade

Suppression Example
Another way to put it...
The GRE-Q is a good predictor of course grade, but part of the performance on GRE-Q is determined by GRE-V, so it favors people of high verbal ability. Suppose we have 2 people who score equally on GRE-Q but differently on GRE-V
Bob scores 500 on GRE-Q and 600 on GRE-V
Jane scores 500 on GRE-Q and 500 on GRE-V
What happens to the predictions about course grade?

Suppression Example
Another way to put it...
Bob: 500 on GRE-Q and 600 on GRE-V
Jane: 500 on GRE-Q and 500 on GRE-V
Based on the verbal scores, we would predict that Bob should have better quantitative skills than Jane, but he does not score better
Thus, Bob must actually have LESS quantitative knowledge than Jane, so we would predict his course grade to be lower
This is equivalent to giving GRE-V a negative regression weight, despite its positive simple correlation with course grade


Suppression
More generally...
If x2 is a better measure of the source of errors in x1 than in y, then giving x2 a negative regression weight will improve our predictions of y
x2 subtracts out / corrects for (i.e., suppresses) sources of error in x1
Suppression seems counterintuitive, but actually improves the model

Suppression
More generally...
Suppressor variables are usually considered bad: they can cause misinterpretation (GRE example)
However, careful exploration can
enlighten understanding of the interplay of variables
improve our prediction of y

Easy to identify:
significant regression weights where b/β (regression) and r (simple correlation) have opposite signs

Practical Issues
Number of cases
Must exceed the number of predictors (N > k)
Acceptable N/k ratio depends on
reliability of the data
the researcher's goals

Larger samples required for:


more specific conclusions (vs. vague conclusions)
post-hoc vs. a priori tests
designs with interactions
collinear predictors

Generally, more is better


Practical Issues
Outliers
Correlation is extremely sensitive to outliers
Easiest to show with simple correlation (e.g., paired scatter plots where a single outlier shifts r_xy between +0.59 and -0.03)
Outliers should be assessed for the DV and all IVs
Ideally, we would identify multivariate outliers, but this is not practical

Practical Issues
Linearity
Multiple regression assumes a linear relationship between the DV and each IV
If relationships are non-linear, multiple regression may not be appropriate
Transformations may rectify non-linearity
logs
reciprocals

Practical Issues
Normality
Residuals (y - y') should be normally distributed across the predicted (y') values
Violation affects statistical conclusions, but not the validity of the model

Homoscedasticity
multivariate version of homogeneity of variance
violation affects statistical conclusions, but not the validity of the model


(Figure: four plots of residuals (y - y') against y' values, illustrating Assumptions Met, Normality Violated, Linearity Violated, and Homoscedasticity Violated)

What do you report?


Correlation analyses
Always state what's happening in the data
Report the r value and its corresponding p value (either the actual p or p < criterion)
Qualify simple correlations with partial correlation coefficients if there are multiple variables
Authors may include r² values, stating that xx% of the variance was accounted for by the relationship

What do you report?


Regression analyses
Report the correlations first
For simple regression: state the equation, the r² value, and the significance of the regression weight (sometimes a table will work)
For multiple regression:
state the equation (not always in manuscripts)
state the practical importance of each regressor (sr²)
state the relative relationship among regressors
state the significance of each regressor

