
Alternatives to Difference Scores:

Polynomial Regression and

Response Surface Methodology

Jeffrey R. Edwards
University of North Carolina

Outline
I. Types of Difference Scores
II. Questions Difference Scores Are Intended To
Address
III. Problems With Difference Scores
IV. An Alternative Procedure
V. Analyzing Quadratic Regression Equations
Using Response Surface Methodology
VI. Moderated Polynomial Regression
VII. Mediated Polynomial Regression
VIII. Difference Scores As Dependent Variables
IX. Answers to Frequently Asked Questions
Types of Difference Scores
Univariate:
Algebraic difference: (X – Y)
Absolute difference: |X – Y|
Squared difference: (X – Y)2
Multivariate:
Sum of algebraic differences: Σ(Xi – Yi) = D1
Sum of absolute differences: Σ|Xi – Yi| = |D|
Sum of squared differences: Σ(Xi – Yi)2 = D2
Euclidean distance: (Σ(Xi – Yi)2)½ = D
Profile correlation: C(Xi,Yi)/S(X)S(Y) = rXi,Yi = Q
Questions Difference Scores are Intended
to Address
• How well do characteristics of the job fit the needs or desires of the employee?
• To what extent do job demands exceed or fall short of the abilities of the person?
• Are prior expectations of the employee met by actual job experiences?
• What is the degree of similarity between perceptions or beliefs of supervisors and subordinates?
• Do the values of the person match the culture of the organization?
• Can novices provide performance evaluations that agree with expert ratings?
Problems with Difference Scores:
Reliability
When component measures are positively
correlated, difference scores are often less
reliable than either component.
The reliability of an algebraic difference is:
 r y r yy  2 r xy x y
2 2
+
(x  y) =
x xx

 y  2 r xy x y
2 2
x+
To illustrate, if X and Y have unit variances,
have reliabilities of .75, and are correlated .50,
the reliability of X – Y equals .50.
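A minimal Python sketch of this formula (the function name and arguments are mine, not from the slides):

```python
import math

def diff_score_reliability(var_x, var_y, rel_x, rel_y, r_xy):
    """Reliability of the algebraic difference (X - Y): the ratio of its
    true-score variance to its total variance, per the formula above."""
    sd_x, sd_y = math.sqrt(var_x), math.sqrt(var_y)
    true_var = var_x * rel_x + var_y * rel_y - 2 * r_xy * sd_x * sd_y
    total_var = var_x + var_y - 2 * r_xy * sd_x * sd_y
    return true_var / total_var

# The example above: unit variances, reliabilities of .75, correlation .50
print(diff_score_reliability(1, 1, .75, .75, .50))  # 0.5
```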
Problems with Difference Scores:
Conceptual Ambiguity
• It might seem that component variables are reflected equally in a difference score, given that the components are implicitly assigned the same weight when the difference score is constructed.
• However, the variance of a difference score depends on the variances and covariances of the component measures, which are sample dependent.
• When one component is a constant, the variance of a difference score is solely due to the other component, i.e., the one that varies. For instance, when P-O fit is assessed in a single organization, the P-O difference solely represents variation in the person scores.
Problems with Difference Scores:
Confounded Effects
Difference scores confound the effects of the
components of the difference.
For example, an equation using an algebraic
difference as a predictor can be written as:
Z = b0 + b1(X – Y) + e
In this equation, b1 can reflect a positive
relationship for X, a negative relationship for
Y, or some combination thereof.
Problems with Difference Scores:
Untested Constraints
Difference scores constrain the coefficients
relating X and Y to Z without testing these
constraints.
The constraints imposed by an algebraic
difference can be seen with the following
equations:
Z = b0 + b1(X – Y) + e
Expansion yields:
Z = b0 + b1X – b1Y + e
Problems with Difference Scores:
Untested Constraints
Now, consider an equation that uses X and Y
as separate predictors:
Z = b0 + b1X + b2Y + e
Using (X – Y) as a predictor constrains the
coefficients on X and Y to be equal in
magnitude but opposite in sign (i.e., b1 = –b2).
This constraint should not be simply imposed
on the data but instead should be treated as a
hypothesis to be tested.
Problems with Difference Scores:
Untested Constraints
The constraints imposed by a squared
difference can be seen with the following
equations:
Z = b0 + b1(X – Y)2 + e
Expansion yields:
Z = b0 + b1X2 – 2b1XY + b1Y2 + e
Thus, a squared difference implicitly treats Z
as a function of X2, XY, and Y2.
Problems with Difference Scores:
Untested Constraints
• Now, consider a quadratic equation using X and Y:
Z = b0 + b1X + b2Y + b3X2 + b4XY + b5Y2 + e
• Comparing this equation to the previous equation shows that (X – Y)2 imposes four constraints:
b1 = 0
b2 = 0
b3 = b5, or b3 – b5 = 0
b3 + b4 + b5 = 0
• Again, these constraints should be treated as hypotheses to be tested empirically, not simply imposed on the data.
Problems with Difference Scores:
Dimensional Reduction
• Difference scores reduce the three-dimensional relationship of X and Y with Z to two dimensions.
The linear algebraic difference function represents a
symmetric plane with equal but opposite slopes with
respect to the X-axis and Y-axis.
The V-shaped absolute difference function
represents a symmetric V-shaped surface with its
minimum (or maximum) running along the X = Y
line.
The U-shaped squared difference function represents
a symmetric U-shaped surface with its minimum (or
maximum) running along the X = Y line.
Two-Dimensional Algebraic Difference
Function
[Figure: two-dimensional plot of Z against (X – Y) for the linear algebraic difference function.]
Three-Dimensional Algebraic Difference
Function
[Figure: three-dimensional surface for the algebraic difference function.]
Two-Dimensional Absolute Difference
Function
[Figure: two-dimensional plot of Z against (X – Y) for the V-shaped absolute difference function.]
Three-Dimensional Absolute Difference
Function
[Figure: three-dimensional surface for the absolute difference function.]
Two-Dimensional Squared Difference
Function
[Figure: two-dimensional plot of Z against (X – Y) for the U-shaped squared difference function.]
Three-Dimensional Squared Difference
Function
[Figure: three-dimensional surface for the squared difference function.]
Problems with Difference Scores:
Dimensional Reduction
These surfaces represent only three of the
many possible surfaces depicting how X and
Y may be related to Z.
This problem is compounded by the use of
profile similarity indices, which collapse a
series of three-dimensional surfaces into a
single two-dimensional function.
An Alternative Procedure
• The relationship of X and Y with Z should be viewed in three dimensions, with X and Y constituting the two horizontal axes and Z constituting the vertical axis.
• Analyses should focus not on two-dimensional functions relating the difference between X and Y to Z, but instead on three-dimensional surfaces depicting the joint relationship of X and Y with Z.
• Constraints should not be simply imposed on the data, but instead should be viewed as hypotheses that, if confirmed, lend support to the conceptual model upon which the difference score is based.
Data Used for Illustration
• Data were collected from 373 MBA students who were engaged in the recruiting process.
• Respondents rated the actual and desired amounts of various job attributes and their anticipated satisfaction with a job for which they had recently interviewed.
• The actual and desired measures had three items each and used 7-point response scales ranging from “none at all” to “a very great amount.” The satisfaction measure had three items and used a 7-point response scale ranging from “strongly disagree” to “strongly agree.”
• The job attributes used for illustration are autonomy, prestige, span of control, and travel.
Confirmatory Approach
• When a difference score represents a hypothesis that is predicted a priori, the alternative procedure should be applied using the confirmatory approach:
1. The R2 for the unconstrained equation should be significant.
2. The coefficients in the unconstrained equation should follow the pattern indicated by the difference score.
3. The constraints implied by the difference score should not be rejected.
4. The set of terms one order higher than those in the unconstrained equation should not be significant.
Confirmatory Approach Applied to the
Algebraic Difference
The unconstrained equation is:
Z = b0 + b1X + b2Y + e
The constrained equation used to evaluate the
third condition is:
Z = b0 + b1 (X – Y) + e
The equation that adds higher-order terms used
to evaluate the fourth condition is:
Z = b0 + b1X + b2Y + b3X2 + b4XY + b5Y2 + e
Example: Confirmatory Test of Algebraic
Difference for Autonomy
Unconstrained equation:
Dep Var: SAT N: 360 Multiple R: 0.356 Squared multiple R: 0.127
Adjusted squared multiple R: 0.122 Standard error of estimate: 1.077
Effect Coefficient Std Error Std Coef Tolerance t P(2 Tail)
CONSTANT 5.835 0.077 0.000 . 75.874 0.000
AUTCA 0.445 0.062 0.413 0.737 7.172 0.000
AUTCD -0.301 0.071 -0.244 0.737 -4.235 0.000
Analysis of Variance
Source Sum-of-Squares df Mean-Square F-ratio P
Regression 60.133 2 30.067 25.930 0.000
Residual 413.953 357 1.160
Example: Confirmatory Test of Algebraic
Difference for Autonomy
Unconstrained surface:
[Figure: three-dimensional plot of the unconstrained surface.]
Example: Confirmatory Test of Algebraic
Difference for Autonomy
The first condition is met, because the R2 from
the unconstrained equation is significant.
The second condition is met, because the
coefficients on X and Y are significant and in
the expected direction.
For the third condition, testing the constraints
imposed by the algebraic difference is the same
as testing the difference in R2 between the
constrained and unconstrained equations.
Example: Confirmatory Test of Algebraic
Difference for Autonomy
Constrained equation:
Dep Var: SAT N: 360 Multiple R: 0.339 Squared multiple R: 0.115
Adjusted squared multiple R: 0.113 Standard error of estimate: 1.082
Effect Coefficient Std Error Std Coef Tolerance t P(2 Tail)
CONSTANT 5.937 0.061 0.0 . 97.007 0.000
AUTALD 0.393 0.058 0.339 1.000 6.825 0.000
Analysis of Variance
Source Sum-of-Squares df Mean-Square F-ratio P
Regression 54.589 1 54.589 46.586 0.000
Residual 419.498 358 1.172
Example: Confirmatory Test of Algebraic
Difference for Autonomy
Constrained surface:
[Figure: three-dimensional plot of the constrained surface.]
Example: Confirmatory Test of Algebraic
Difference for Autonomy
• The general formula for the F test of the difference in R2 between two regression equations is:

$$F = \frac{(R_U^2 - R_C^2)/(df_C - df_U)}{(1 - R_U^2)/df_U}$$

• The test of the constraint imposed by the algebraic difference for autonomy is:

$$F = \frac{(.127 - .115)/(358 - 357)}{(1 - .127)/357} = 4.91, \quad p < .05$$

• The constraint is rejected, so the third condition is not satisfied.
Example: Confirmatory Test of Algebraic
Difference for Autonomy
For the fourth condition, the unconstrained
equation for the algebraic difference is linear, so
the higher-order terms are the three quadratic
terms X2, XY, and Y2.
Testing the three quadratic terms as a set is the
same as testing the difference in R2 between the
linear and quadratic equations.
Example: Confirmatory Test of Algebraic
Difference for Autonomy
Quadratic equation:
Dep Var: SAT N: 360 Multiple R: 0.411 Squared multiple R: 0.169
Adjusted squared multiple R: 0.157 Standard error of estimate: 1.055
Effect Coefficient Std Error Std Coef Tolerance t P(2 Tail)
CONSTANT 5.825 0.083 0.000 . 70.161 0.000
AUTCA 0.197 0.100 0.182 0.273 1.966 0.050
AUTCD -0.293 0.106 -0.238 0.315 -2.754 0.006
AUTCA2 -0.056 0.047 -0.086 0.444 -1.177 0.240
AUTCAD 0.276 0.080 0.396 0.178 3.453 0.001
AUTCD2 -0.035 0.063 -0.054 0.242 -0.553 0.581
Analysis of Variance
Source Sum-of-Squares df Mean-Square F-ratio P
Regression 79.951 5 15.990 14.362 0.000
Residual 394.135 354 1.113
Example: Confirmatory Test of Algebraic
Difference for Autonomy
• The test of the higher-order terms associated with the algebraic difference for autonomy is:

$$F = \frac{(.169 - .127)/(357 - 354)}{(1 - .169)/354} = 5.96, \quad p < .05$$

• The higher-order terms are significant, so the fourth condition is not satisfied.
Confirmatory Approach Applied to the
Absolute Difference
The unconstrained equation is:
Z = b0 + b1X + b2Y + b3W + b4WX + b5WY + e
where W is a dummy variable distinguishing the two sides of the X = Y line (0 when X ≥ Y, 1 when X < Y).
The constrained equation used to evaluate the
third condition is:
Z = b0 + b1 |X – Y| + e
The equation that adds higher-order terms used
to evaluate the fourth condition is:
Z = b0 + b1X + b2Y + b3W + b4WX + b5WY +
b6X2 + b7XY + b8Y2 + b9WX2 + b10WXY + b11WY2 + e
Example: Confirmatory Test of Absolute
Difference for Autonomy
Unconstrained equation:
Dep Var: SAT N: 360 Multiple R: 0.399 Squared multiple R: 0.159
Adjusted squared multiple R: 0.147 Standard error of estimate: 1.061
Effect Coefficient Std Error Std Coef Tolerance t P(2 Tail)
CONSTANT 6.233 0.152 0.000 . 41.136 0.000
AUTCA -0.150 0.184 -0.139 0.082 -0.818 0.414
AUTCD 0.183 0.188 0.148 0.102 0.970 0.333
AUTW -0.349 0.201 -0.148 0.329 -1.737 0.083
AUTCAW 0.752 0.209 0.490 0.129 3.605 0.000
AUTCDW -0.554 0.219 -0.406 0.093 -2.537 0.012
Analysis of Variance
Source Sum-of-Squares df Mean-Square F-ratio P
Regression 75.381 5 15.076 13.386 0.000
Residual 398.705 354 1.126
Example: Confirmatory Test of Absolute
Difference for Autonomy
Unconstrained surface:
[Figure: three-dimensional plot of the unconstrained surface.]
Example: Confirmatory Test of Absolute
Difference for Autonomy
The first condition is met, because the R2 from
the unconstrained equation is significant.
The second condition is not met: the
coefficients on X and Y are in the expected
direction but not significant.
For the third condition, testing the constraints
imposed by the absolute difference is the same
as testing the difference in R2 between the
constrained and unconstrained equations.
Example: Confirmatory Test of Absolute
Difference for Autonomy
Constrained equation:
Dep Var: SAT N: 360 Multiple R: 0.323 Squared multiple R: 0.105
Adjusted squared multiple R: 0.102 Standard error of estimate: 1.089
Effect Coefficient Std Error Std Coef Tolerance t P(2 Tail)
CONSTANT 6.212 0.087 0.000 . 71.122 0.000
AUTABD -0.531 0.082 -0.323 1.000 -6.464 0.000
Analysis of Variance
Source Sum-of-Squares df Mean-Square F-ratio P
Regression 49.555 1 49.555 41.788 0.000
Residual 424.532 358 1.186
Example: Confirmatory Test of Absolute
Difference for Autonomy
Constrained surface:
[Figure: three-dimensional plot of the constrained surface.]
Example: Confirmatory Test of Absolute
Difference for Autonomy
• The test of the constraints imposed by the absolute difference for autonomy is:

$$F = \frac{(.159 - .105)/(358 - 354)}{(1 - .159)/354} = 5.68, \quad p < .05$$

• The constraints are rejected, so the third condition is not satisfied.
Example: Confirmatory Test of Absolute
Difference for Autonomy
For the fourth condition, the unconstrained
equation for the absolute difference is piecewise
linear, so the higher-order terms are the six
quadratic terms X2, XY, Y2, WX2, WXY, and
WY2.
Testing the six quadratic terms as a set is the
same as testing the difference in R2 between
the piecewise linear and piecewise quadratic
equations.
Example: Confirmatory Test of Absolute
Difference for Autonomy
Piecewise quadratic equation:
Dep Var: SAT N: 360 Multiple R: 0.431 Squared multiple R: 0.185
Adjusted squared multiple R: 0.160 Standard error of estimate: 1.053
Effect Coefficient Std Error Std Coef Tolerance t P(2 Tail)
CONSTANT 6.193 0.206 0.000 . 30.124 0.000
AUTCA -0.438 0.548 -0.407 0.009 -0.799 0.425
AUTCD 0.256 0.505 0.207 0.014 0.506 0.613
AUTW -0.534 0.276 -0.225 0.172 -1.931 0.054
AUTCAW 0.672 0.608 0.438 0.015 1.105 0.270
AUTCDW -0.373 0.592 -0.273 0.013 -0.631 0.529
AUTCA2 0.146 0.312 0.225 0.010 0.468 0.640
AUTCAD -0.092 0.618 -0.133 0.003 -0.150 0.881
AUTCD2 0.107 0.350 0.169 0.008 0.307 0.759
AUTCA2W -0.088 0.325 -0.082 0.026 -0.272 0.786
AUTCADW 0.325 0.641 0.368 0.004 0.507 0.613
AUTCD2W -0.219 0.371 -0.342 0.007 -0.589 0.556
Analysis of Variance
Source Sum-of-Squares df Mean-Square F-ratio P
Regression 87.940 11 7.995 7.205 0.000
Residual 386.146 348 1.110
Example: Confirmatory Test of Absolute
Difference for Autonomy
• The test of the higher-order terms associated with the absolute difference for autonomy is:

$$F = \frac{(.185 - .159)/(354 - 348)}{(1 - .185)/348} = 1.85, \quad p > .05$$

• The higher-order terms are not significant, so the fourth condition is satisfied.
Confirmatory Approach Applied to the
Squared Difference
The unconstrained equation is:
Z = b0 + b1X + b2Y + b3X2 + b4XY + b5Y2 + e
The constrained equation used to evaluate the
third condition is:
Z = b0 + b1 (X – Y)2 + e
The equation that adds higher-order terms used
to evaluate the fourth condition is:
Z = b0 + b1X + b2Y + b3X2 + b4XY + b5Y2 +
b6X3 + b7X2Y + b8XY2 + b9Y3 + e
Example: Confirmatory Test of Squared
Difference for Autonomy
Unconstrained equation:
Dep Var: SAT N: 360 Multiple R: 0.411 Squared multiple R: 0.169
Adjusted squared multiple R: 0.157 Standard error of estimate: 1.055
Effect Coefficient Std Error Std Coef Tolerance t P(2 Tail)
CONSTANT 5.825 0.083 0.000 . 70.161 0.000
AUTCA 0.197 0.100 0.182 0.273 1.966 0.050
AUTCD -0.293 0.106 -0.238 0.315 -2.754 0.006
AUTCA2 -0.056 0.047 -0.086 0.444 -1.177 0.240
AUTCAD 0.276 0.080 0.396 0.178 3.453 0.001
AUTCD2 -0.035 0.063 -0.054 0.242 -0.553 0.581
Analysis of Variance
Source Sum-of-Squares df Mean-Square F-ratio P
Regression 79.951 5 15.990 14.362 0.000
Residual 394.135 354 1.113
Example: Confirmatory Test of Squared
Difference for Autonomy
Unconstrained surface:
[Figure: three-dimensional plot of the unconstrained surface.]
Example: Confirmatory Test of Squared
Difference for Autonomy
The first condition is met, because the R2 from
the unconstrained equation is significant.
The second condition is not met: the
coefficients on X and Y are significant (the
squared difference implies they should be zero),
and the coefficients on X2 and Y2 are not
significant (it implies they should be).
For the third condition, testing the constraints
imposed by the squared difference is the same
as testing the difference in R2 between the
constrained and unconstrained equations.
Example: Confirmatory Test of Squared
Difference for Autonomy
Constrained equation:
Dep Var: SAT N: 360 Multiple R: 0.310 Squared multiple R: 0.096
Adjusted squared multiple R: 0.093 Standard error of estimate: 1.094
Effect Coefficient Std Error Std Coef Tolerance t P(2 Tail)
CONSTANT 5.993 0.067 0.000 . 89.830 0.000
AUTSQD -0.183 0.030 -0.310 1.000 -6.162 0.000
Analysis of Variance
Source Sum-of-Squares df Mean-Square F-ratio P
Regression 45.463 1 45.463 37.972 0.000
Residual 428.623 358 1.197
Example: Confirmatory Test of Squared
Difference for Autonomy
Constrained surface:
[Figure: three-dimensional plot of the constrained surface.]
Example: Confirmatory Test of Squared
Difference for Autonomy
• The test of the constraints imposed by the squared difference for autonomy is:

$$F = \frac{(.169 - .096)/(358 - 354)}{(1 - .169)/354} = 7.77, \quad p < .05$$

• The constraints are rejected, so the third condition is not satisfied.
Example: Confirmatory Test of Squared
Difference for Autonomy
For the fourth condition, the unconstrained
equation for the squared difference is quadratic,
so the higher-order terms are the four cubic
terms X3, X2Y, XY2, and Y3.
Testing the four cubic terms as a set is the same
as testing the difference in R2 between the
quadratic and cubic equations.
Example: Confirmatory Test of Squared
Difference for Autonomy
Cubic equation:
Dep Var: SAT N: 360 Multiple R: 0.436 Squared multiple R: 0.190
Adjusted squared multiple R: 0.170 Standard error of estimate: 1.047
Effect Coefficient Std Error Std Coef Tolerance t P(2 Tail)
CONSTANT 5.757 0.109 0.000 . 52.736 0.000
AUTCA 0.364 0.119 0.337 0.190 3.055 0.002
AUTCD -0.312 0.120 -0.253 0.245 -2.609 0.009
AUTCA2 0.043 0.095 0.066 0.109 0.456 0.649
AUTCAD 0.356 0.175 0.511 0.037 2.033 0.043
AUTCD2 -0.075 0.126 -0.117 0.060 -0.594 0.553
AUTCA3 -0.104 0.037 -0.442 0.094 -2.817 0.005
AUTCA2D 0.052 0.066 0.167 0.052 0.794 0.428
AUTCAD2 -0.030 0.089 -0.098 0.028 -0.338 0.736
AUTCD3 0.003 0.053 0.011 0.046 0.047 0.962
Analysis of Variance
Source Sum-of-Squares df Mean-Square F-ratio P
Regression 90.233 9 10.026 9.142 0.000
Residual 383.853 350 1.097
Example: Confirmatory Test of Squared
Difference for Autonomy
• The test of the higher-order terms associated with the squared difference for autonomy is:

$$F = \frac{(.190 - .169)/(354 - 350)}{(1 - .190)/350} = 2.27, \quad p > .05$$

• The higher-order terms are not significant, so the fourth condition is satisfied.
Analyzing Quadratic Regression Equations
Using Response Surface Methodology
Response surface methodology can be used to
analyze features of surfaces corresponding to
quadratic regression equations. These analyses
are useful for three reasons:
Constraints imposed by difference scores are usually
rejected, which makes it necessary to interpret
unconstrained equations.
Many conceptually meaningful hypotheses cannot be
expressed using difference scores.
Response surfaces can themselves serve as the basis
for developing and testing hypotheses.
Key Features of Response Surfaces:
Stationary Point
The stationary point is the point at which the
slope of the surface relating X and Y to Z is
zero in all directions.
For convex (i.e., bowl-shaped) surfaces, the
stationary point is the overall minimum of the
surface with respect to the Z axis.
For concave (i.e., dome-shaped) surfaces, the
stationary point is the overall maximum of the
surface with respect to the Z axis.
For saddle-shaped surfaces, the stationary point is
where the surface is flat with respect to the Z axis.
Key Features of Response Surfaces:
Stationary Point
The coordinates of the stationary point can be computed using the following formulas:

$$X_0 = \frac{b_2 b_4 - 2 b_1 b_5}{4 b_3 b_5 - b_4^2} \qquad\qquad Y_0 = \frac{b_1 b_4 - 2 b_2 b_3}{4 b_3 b_5 - b_4^2}$$

X0 and Y0 are the coordinates of the stationary point in the X,Y plane.
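A direct Python transcription of these formulas (the function name is mine):

```python
def stationary_point(b1, b2, b3, b4, b5):
    """(X0, Y0) for Z = b0 + b1*X + b2*Y + b3*X**2 + b4*X*Y + b5*Y**2."""
    denom = 4 * b3 * b5 - b4 ** 2
    x0 = (b2 * b4 - 2 * b1 * b5) / denom
    y0 = (b1 * b4 - 2 * b2 * b3) / denom
    return x0, y0

# Autonomy example on the next slide: (0.982, -0.315), within rounding
print(stationary_point(0.197, -0.293, -0.056, 0.276, -0.035))
```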
Example: Stationary Point for Autonomy
Applying these formulas to the equation for autonomy yields:

$$X_0 = \frac{(-0.293)(0.276) - 2(0.197)(-0.035)}{4(-0.056)(-0.035) - 0.276^2} = 0.982$$

$$Y_0 = \frac{(0.197)(0.276) - 2(-0.293)(-0.056)}{4(-0.056)(-0.035) - 0.276^2} = -0.315$$
Example: Stationary Point for Autonomy
[Figure: response surface for autonomy with the stationary point marked.]
Key Features of Response Surfaces:
Principal Axes
• The principal axes describe the orientation of the surface with respect to the X,Y plane. The axes are perpendicular and intersect at the stationary point.
For convex surfaces, the upward curvature is greatest along
the first principal axis and least along the second principal
axis.
For concave surfaces, the downward curvature is greatest
along the second principal axis and least along the first
principal axis.
For saddle-shaped surfaces, upward curvature is greatest
along the first principal axis, and the downward curvature is
greatest along the second principal axis.
Key Features of Response Surfaces:
First Principal Axis
• An equation for the first principal axis is:

$$Y = p_{10} + p_{11} X$$

• The formula for the slope of the first principal axis (i.e., p11) is:

$$p_{11} = \frac{b_5 - b_3 + \sqrt{(b_3 - b_5)^2 + b_4^2}}{b_4}$$

• Using X0, Y0, and p11, the intercept of the first principal axis (i.e., p10) can be calculated as follows:

$$p_{10} = Y_0 - p_{11} X_0$$
Example: First Principal Axis for
Autonomy
Applying these formulas to the equation for autonomy yields:

$$p_{11} = \frac{-0.035 - (-0.056) + \sqrt{[-0.056 - (-0.035)]^2 + 0.276^2}}{0.276} = 1.079$$

$$p_{10} = -0.315 - (1.079)(0.982) = -1.375$$
Example: First Principal Axis for
Autonomy
[Figure: response surface for autonomy with the first principal axis marked.]
Key Features of Response Surfaces:
Second Principal Axis
• An equation for the second principal axis is:

$$Y = p_{20} + p_{21} X$$

• The formula for the slope of the second principal axis (i.e., p21) is:

$$p_{21} = \frac{b_5 - b_3 - \sqrt{(b_3 - b_5)^2 + b_4^2}}{b_4}$$

• X0, Y0, and p21 can be used to obtain the intercept of the second principal axis (i.e., p20) as follows:

$$p_{20} = Y_0 - p_{21} X_0$$
Example: Second Principal Axis for
Autonomy
Applying these formulas to the equation for autonomy yields:

$$p_{21} = \frac{-0.035 - (-0.056) - \sqrt{[-0.056 - (-0.035)]^2 + 0.276^2}}{0.276} = -0.927$$

$$p_{20} = -0.315 - (-0.927)(0.982) = 0.594$$
Example: Second Principal Axis for
Autonomy
[Figure: response surface for autonomy with the second principal axis marked.]
Key Features of Response Surfaces:
Shape Along the Y = X Line
• The shape of the surface along a line in the X,Y plane can be estimated by substituting the expression for the line into the quadratic regression equation.
• To estimate the slope along the Y = X line, X is substituted for Y in the quadratic regression equation, which yields:
Z = b0 + b1X + b2X + b3X2 + b4X2 + b5X2 + e
  = b0 + (b1 + b2)X + (b3 + b4 + b5)X2 + e
• The term (b3 + b4 + b5) represents the curvature of the surface along the Y = X line, and (b1 + b2) is the slope of the surface at the point X = 0.
Example: Shape Along Y = X Line for
Autonomy
For autonomy, the shape of the surface along
the Y = X line is:
Z = 5.825 + [0.197 + (–0.293)]X
+ [–0.056 + 0.276 + (–0.035)]X2 + e
Simplifying this expression yields:
Z = 5.825 – 0.096X + 0.185X2 + e
The surface is curved upward along the Y = X
line and is negatively sloped at the point X = 0
(the curvature is significant at p < .05).
Example: Shape Along Y = X Line for
Autonomy
[Figure: surface for autonomy; contours show the shape along the Y = X line.]
Key Features of Response Surfaces:
Shape Along Y = –X Line
To estimate the slope along the Y = –X line, –X is substituted for Y in the quadratic regression equation, which yields:
Z = b0 + b1X – b2X + b3X2 – b4X2 + b5X2 + e
  = b0 + (b1 – b2)X + (b3 – b4 + b5)X2 + e
The term (b3 – b4 + b5) represents the curvature of the surface along the Y = –X line, and (b1 – b2) is the slope of the surface at the point X = 0.
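The slope and curvature along any line through the origin, including Y = X and Y = –X, can be computed with one helper (a sketch; names are mine):

```python
def shape_along_line(b1, b2, b3, b4, b5, slope):
    """Slope at X = 0 and curvature of the surface along Y = slope * X.
    slope=1 gives the Y = X line; slope=-1 gives the Y = -X line."""
    linear = b1 + b2 * slope                       # coefficient on X
    curvature = b3 + b4 * slope + b5 * slope ** 2  # coefficient on X^2
    return linear, curvature

b = (0.197, -0.293, -0.056, 0.276, -0.035)  # autonomy estimates
print(shape_along_line(*b, slope=1))   # (-0.096, 0.185): the Y = X line
print(shape_along_line(*b, slope=-1))  # (0.490, -0.367): the Y = -X line
```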
Example: Shape Along Y = –X Line for
Autonomy
For autonomy, the shape of the surface along
the Y = –X line is:
Z = 5.825 + [0.197 – (–0.293)]X
+ [–0.056 – 0.276 + (–0.035)]X2 + e
Simplifying this expression yields:
Z = 5.825 + 0.490X – 0.367X2 + e
The surface is curved downward along the Y =
–X line and is positively sloped at the point X
= 0 (both are significant at p < .05).
Example: Shape Along Y = –X Line for
Autonomy
[Figure: surface for autonomy; contours show the shape along the Y = –X line.]
Key Features of Response Surfaces:
Shape Along First Principal Axis
• To estimate the slope along the first principal axis, p10 + p11X is substituted for Y:

$$\begin{aligned}
Z &= b_0 + b_1 X + b_2(p_{10} + p_{11}X) + b_3 X^2 + b_4 X(p_{10} + p_{11}X) + b_5(p_{10} + p_{11}X)^2 + e \\
  &= b_0 + b_2 p_{10} + b_5 p_{10}^2 + (b_1 + b_2 p_{11} + b_4 p_{10} + 2 b_5 p_{10} p_{11})X + (b_3 + b_4 p_{11} + b_5 p_{11}^2)X^2 + e
\end{aligned}$$

• The composite terms preceding X2 and X are the curvature of the surface along the first principal axis and the slope of the surface at the point X = 0.
Example: Shape Along First Principal
Axis for Autonomy
For autonomy, the shape of the surface along
the first principal axis is:
Z  5.825  ( 0.293)(1.375)  ( 0.035)(1.3752 )
 [0.197  ( 0.293)(1.079)  (0.276)(1.375)
 2( 0.035)(1.375)(1.079)]X
 [0.056  (0.276)(1.079)  ( 0.035)(1.0792 )]X 2  e
 6.162  0.395X  0.201X 2  e
The surface is curved upward along the first
principal axis and is negatively sloped at the
point X = 0 (both are significant at p < .05).
Example: Shape Along First Principal
Axis for Autonomy
[Figure: surface for autonomy; contours show the shape along the first principal axis.]
Key Features of Response Surfaces:
Shape Along Second Principal Axis
• To estimate the slope along the second principal axis, p20 + p21X is substituted for Y:

$$\begin{aligned}
Z &= b_0 + b_1 X + b_2(p_{20} + p_{21}X) + b_3 X^2 + b_4 X(p_{20} + p_{21}X) + b_5(p_{20} + p_{21}X)^2 + e \\
  &= b_0 + b_2 p_{20} + b_5 p_{20}^2 + (b_1 + b_2 p_{21} + b_4 p_{20} + 2 b_5 p_{20} p_{21})X + (b_3 + b_4 p_{21} + b_5 p_{21}^2)X^2 + e
\end{aligned}$$

• The composite terms preceding X2 and X are the curvature of the surface along the second principal axis and the slope of the surface at the point X = 0.
Example: Shape Along Second Principal
Axis for Autonomy
• For autonomy, the shape of the surface along the second principal axis is:

$$\begin{aligned}
Z &= 5.825 + (-0.293)(0.594) + (-0.035)(0.594^2) \\
  &\quad + [0.197 + (-0.293)(-0.927) + (0.276)(0.594) + 2(-0.035)(0.594)(-0.927)]X \\
  &\quad + [-0.056 + (0.276)(-0.927) + (-0.035)(0.927^2)]X^2 + e \\
  &= 5.639 + 0.671X - 0.342X^2 + e
\end{aligned}$$

• The surface is curved downward along the second principal axis and is positively sloped at the point X = 0 (both are significant at p < .05).
Example: Shape Along Second Principal
Axis for Autonomy
[Figure: surface for autonomy; contours show the shape along the second principal axis.]
Key Features of Response Surfaces:
Tests of Significance
• The formulas for shapes along predetermined lines such as Y = X and Y = –X can be tested using procedures for testing weighted linear combinations of regression coefficients.
• For example, a t-test for b1 + b2 is obtained by dividing b1 + b2 by its standard error, the square root of the variance of b1 + b2:

$$S(b_1 + b_2) = \sqrt{V(b_1) + V(b_2) + 2C(b_1, b_2)}$$

• The variances of b1 and b2 are the squares of their standard errors, and the covariance of b1 and b2 is their correlation times the product of their standard errors.
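A generic version of this test (a sketch; names are mine, and the coefficient covariance matrix could come from, e.g., statsmodels' results.cov_params()):

```python
import numpy as np

def linear_combo_t(weights, coefs, cov):
    """t ratio for a weighted linear combination w'b of regression
    coefficients; cov is the estimated covariance matrix of b."""
    w = np.asarray(weights, dtype=float)
    est = w @ np.asarray(coefs, dtype=float)
    se = np.sqrt(w @ cov @ w)  # generalizes V(b1) + V(b2) + 2C(b1, b2)
    return est, est / se

# E.g., weights = [0, 1, 1, 0, 0, 0] tests b1 + b2 in the quadratic equation.
```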
Key Features of Response Surfaces:
Tests of Significance
Weighted linear combinations of regression
coefficients can also be tested using routines
available in many statistical packages.
Another approach is to test the reduction in R2
produced by the constraint represented by the
weighted linear combination of coefficients.
For instance, to jointly test (b1 + b2) and (b3 +
b4 + b5), we set both quantities equal to zero
and impose the resulting constraints.
Key Features of Response Surfaces:
Tests of Significance
• The expression b1 + b2 = 0 implies b2 = –b1. Likewise, the expression b3 + b4 + b5 = 0 implies b5 = –b3 – b4. Imposing these constraints on the quadratic regression equation yields:
Z = b0 + b1X – b1Y + b3X2 + b4XY + (–b3 – b4)Y2 + e
• The expression simplifies to:
Z = b0 + b1(X – Y) + b3(X2 – Y2) + b4(XY – Y2) + e
• The reduction in R2 from this equation relative to the R2 from the quadratic equation is a joint test of b1 + b2 = 0 and b3 + b4 + b5 = 0.
Key Features of Response Surfaces:
Tests of Significance
• X0, Y0, p10, p11, p20, p21, and the slopes along the principal axes are nonlinear combinations of regression coefficients. For these quantities, significance tests can be conducted using the bootstrap, as follows (a code sketch appears after this list):
A large number (e.g., 10,000) of samples of size N are
randomly drawn with replacement.
Each sample is used to estimate the quadratic regression
equation.
The coefficients from each sample are used to compute X0,
Y0, p10, p11, p20, and p21.
The distributions of X0, Y0, p10, p11, p20, and p21 are used to
construct confidence intervals.
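A percentile-bootstrap sketch along these lines (names are mine; it reuses stationary_point and principal_axes from the earlier sketches and assumes X, Y, Z are NumPy arrays of raw scores):

```python
import numpy as np

def bootstrap_surface_cis(X, Y, Z, n_boot=10000, alpha=.05, seed=0):
    """Percentile bootstrap confidence intervals for X0, Y0, p10, p11,
    p20, and p21: resample cases, refit the quadratic equation by least
    squares, and recompute the surface quantities in each sample."""
    rng = np.random.default_rng(seed)
    n = len(Z)
    draws = []
    for _ in range(n_boot):
        i = rng.integers(0, n, n)  # sample of size N drawn with replacement
        x, y, z = X[i], Y[i], Z[i]
        D = np.column_stack([np.ones(n), x, y, x**2, x*y, y**2])
        _, b1, b2, b3, b4, b5 = np.linalg.lstsq(D, z, rcond=None)[0]
        x0, y0 = stationary_point(b1, b2, b3, b4, b5)
        (p10, p11), (p20, p21) = principal_axes(b1, b2, b3, b4, b5)
        draws.append([x0, y0, p10, p11, p20, p21])
    lo, hi = np.percentile(draws, [100 * alpha / 2, 100 * (1 - alpha / 2)], axis=0)
    return lo, hi  # order: X0, Y0, p10, p11, p20, p21
```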
Example: Testing Response Surface
Features for Autonomy
A joint test of (b1 + b2) and (b3 + b4 + b5), which
represent the slope at the point X = 0 and the
curvature along the Y = X line, is yielded by the
following commands:
MGLH
MOD SAT=CONSTANT+AUTCA+AUTCD+AUTCA2+AUTCAD+AUTCD2
EST
HYP
AMA [0 1 1 0 0 0;,
0 0 0 1 1 1]
TEST
Example: Testing Response Surface
Features for Autonomy
For autonomy, this test yields the following
result:
Hypothesis.

A Matrix
              1      2      3      4      5      6
    1       0.0  1.000  1.000    0.0    0.0    0.0
    2       0.0    0.0    0.0  1.000  1.000  1.000

Test of Hypothesis

Source            SS   df      MS       F      P
Hypothesis    16.878    2   8.439   7.580  0.001
Error        394.135  354   1.113
Example: Testing Response Surface
Features for Autonomy
Separate tests of (b1 + b2) and (b3 + b4 + b5) are
yielded by the following commands:
MGLH
MOD SAT=CONSTANT+AUTCA+AUTCD+AUTCA2+AUTCAD+AUTCD2
EST
HYP
AMA [0 1 1 0 0 0]
TEST
HYP
AMA [0 0 0 1 1 1]
TEST
Example: Testing Response Surface
Features for Autonomy
For autonomy, the results are:
A Matrix
              1      2      3      4      5      6
    1       0.0  1.000  1.000    0.0    0.0    0.0

Test of Hypothesis

Source            SS   df      MS       F      P
Hypothesis     1.068    1   1.068   0.959  0.328
Error        394.135  354   1.113

A Matrix
              1      2      3      4      5      6
    1       0.0    0.0    0.0  1.000  1.000  1.000

Test of Hypothesis

Source            SS   df      MS       F      P
Hypothesis    11.740    1  11.740  10.545  0.001
Error        394.135  354   1.113
Example: Testing Response Surface
Features for Autonomy
Likewise, a joint test of (b1 – b2) and (b3 – b4 +
b5), which represent the slope at the point X = 0
and the curvature along the Y = – X line, is
yielded by the following commands:
MGLH
MOD SAT=CONSTANT+AUTCA+AUTCD+AUTCA2+AUTCAD+AUTCD2
EST
HYP
AMA [0 1 -1 0 0 0;,
0 0 0 1 -1 1]
TEST
Example: Testing Response Surface
Features for Autonomy
For autonomy, this test yields the following
result:
Hypothesis.

A Matrix
              1      2       3      4       5      6
    1       0.0  1.000  -1.000    0.0     0.0    0.0
    2       0.0    0.0     0.0  1.000  -1.000  1.000

Test of Hypothesis

Source            SS   df      MS       F      P
Hypothesis    39.512    2  19.756  17.744  0.000
Error        394.135  354   1.113
Example: Testing Response Surface
Features for Autonomy
Separate tests of (b1 – b2) and (b3 – b4 + b5) are
yielded by the following commands:
MGLH
MOD SAT=CONSTANT+AUTCA+AUTCD+AUTCA2+AUTCAD+AUTCD2
EST
HYP
AMA [0 1 -1 0 0 0]
TEST
HYP
AMA [0 0 0 1 -1 1]
TEST
Example: Testing Response Surface
Features for Autonomy
For autonomy, the results are:
A Matrix
              1      2       3      4       5      6
    1       0.0  1.000  -1.000    0.0     0.0    0.0

Test of Hypothesis

Source            SS   df      MS       F      P
Hypothesis     8.105    1   8.105   7.279  0.007
Error        394.135  354   1.113

A Matrix
              1      2       3      4       5      6
    1       0.0    0.0     0.0  1.000  -1.000  1.000

Test of Hypothesis

Source            SS   df      MS       F      P
Hypothesis     6.588    1   6.588   5.917  0.015
Error        394.135  354   1.113
Example: Testing Response Surface
Features for Autonomy
In SYSTAT, the bootstrap is implemented with
the following commands:
MGLH
MOD SAT=CONSTANT+AUTCA+AUTCD+AUTCA2+AUTCAD+AUTCD2
SAVE AUTBOOT.SYD/COEF
EST/SAMPLE=BOOT(10000)
These commands produce a large output file with the results of all 10,000 regressions and a system file containing 10,000 sets of coefficients. The coefficients are used to construct confidence intervals (Mooney & Duval, 1993; Stine, 1989).
Example: Testing Response Surface
Features for Autonomy
• For autonomy, the 95% confidence intervals for X0, Y0, p10, p11, p20, and p21 are:

          Value      CIL       CIU
X0        0.982    0.199     5.142
Y0       –0.315   –3.480     0.239
p10      –1.375  –11.423    –0.359
p11       1.079    0.688     2.123
p20       0.594   –1.167     1.120
p21      –0.927   –1.449    –0.466
Interpretation of Results for Autonomy
• The surface was saddle-shaped.
• The slope of the first principal axis did not differ from 1, and the intercept of the first principal axis was negative, meaning that the axis ran parallel to the Y = X line but was shifted to the right.
• The slope and intercept of the second principal axis did not differ from –1 and 0, respectively. Thus, the axis did not differ from the Y = –X line.
• The location of the first principal axis combined with the slope along the second principal axis indicates that satisfaction increased as actual autonomy increased toward desired autonomy, continued to increase as actual autonomy exceeded desired autonomy, and began to decrease when actual autonomy exceeded desired autonomy by about one unit.
• Within the range of the data, satisfaction increased at an increasing rate as actual and desired autonomy both increased along the first principal axis.
Moderated Polynomial Regression
• In some cases, the effect represented by a quadratic regression equation is believed to be moderated by another variable.
• Incorporating the moderator variable V into a quadratic regression equation yields:
Z = b0 + b1X + b2Y + b3X2 + b4XY + b5Y2 + b6V +
b7XV + b8YV + b9X2V + b10XYV + b11Y2V + e
• Moderation is tested by assessing the increment in R2 yielded by the terms XV, YV, X2V, XYV, and Y2V.
Moderated Polynomial Regression
• The moderated quadratic regression equation can be rewritten to show simple surfaces at selected levels of the moderator variable, as follows:
Z = (b0 + b6V) + (b1 + b7V)X + (b2 + b8V)Y +
(b3 + b9V)X2 + (b4 + b10V)XY + (b5 + b11V)Y2 + e
• The compound coefficients on the terms X, Y, X2, XY, and Y2 can be tested using procedures for testing weighted linear combinations of regression coefficients.
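A small sketch of this rewriting (names are mine; b holds the twelve estimates b0–b11 from the moderated equation). Evaluating it at an importance value of about 5 appears to reproduce the "medium" row of simple-surface coefficients reported below, within rounding:

```python
def simple_surface(b, v):
    """Coefficients of the simple quadratic surface at moderator value v:
    intercept, X, Y, X^2, XY, Y^2."""
    b0, b1, b2, b3, b4, b5, b6, b7, b8, b9, b10, b11 = b
    return (b0 + b6 * v, b1 + b7 * v, b2 + b8 * v,
            b3 + b9 * v, b4 + b10 * v, b5 + b11 * v)

# Estimates from the moderated equation for autonomy below:
b = (5.514, 0.409, -0.740, 0.181, 0.595, -0.225,
     0.062, -0.050, 0.103, -0.046, -0.047, 0.021)
print(simple_surface(b, 5))  # roughly the "medium importance" row
```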
Example: Moderated Polynomial
Regression for Autonomy
• Quadratic equation with importance as a moderator:
Dep Var: SAT N: 357 Multiple R: 0.431 Squared multiple R: 0.186
Adjusted squared multiple R: 0.160 Standard error of estimate: 1.057
Effect Coefficient Std Error Std Coef Tolerance t P(2 Tail)
CONSTANT 5.514 0.481 0.000 . 11.455 0.000
AUTCA 0.409 0.487 0.379 0.012 0.841 0.401
AUTCD -0.740 0.518 -0.595 0.014 -1.429 0.154
AUTCA2 0.181 0.292 0.278 0.012 0.620 0.536
AUTCAD 0.595 0.489 0.855 0.005 1.218 0.224
AUTCD2 -0.225 0.306 -0.353 0.010 -0.736 0.462
AUTI 0.062 0.101 0.051 0.343 0.614 0.540
AUTCAI -0.050 0.103 -0.242 0.009 -0.479 0.632
AUTCDI 0.103 0.115 0.454 0.009 0.890 0.374
AUTCA2I -0.046 0.054 -0.408 0.011 -0.862 0.389
AUTCADI -0.047 0.088 -0.392 0.004 -0.533 0.594
AUTCD2I 0.021 0.059 0.200 0.008 0.360 0.719
Analysis of Variance
Source Sum-of-Squares df Mean-Square F-ratio P
Regression 88.023 11 8.002 7.158 0.000
Residual 385.670 345 1.118
Example: Moderated Polynomial
Regression for Autonomy
• The test of the increment in R2 yielded by the five moderator terms is:

$$F = \frac{(.186 - .169)/(350 - 345)}{(1 - .186)/345} = 1.44, \quad p > .05$$

• The increment in R2 is not significant, so moderation is not supported.
Example: Moderated Polynomial
Regression for Autonomy
Simple quadratic equations at low, medium,
and high levels of importance:
               X        Y       X2       XY       Y2
Low         0.21  -0.33**    -0.00    0.41*    -0.14
Medium      0.16  -0.23      -0.05    0.36**   -0.12
High        0.11  -0.13      -0.09    0.32**   -0.10
Example: Moderated Polynomial
Regression for Autonomy
Simple surface for low importance:
[Figure: simple surface at low importance.]
Example: Moderated Polynomial
Regression for Autonomy
Simple surface for medium importance:
[Figure: simple surface at medium importance.]
Example: Moderated Polynomial
Regression for Autonomy
Simple surface for high importance:
[Figure: simple surface at high importance.]
Mediated Polynomial Regression
• On occasion, the effect represented by a quadratic regression equation is believed to be mediated by (i.e., transmitted through) another variable.
• Mediation can be analyzed using two regression equations, one that regresses the mediator on the five quadratic terms, and another that regresses the outcome on the five quadratic terms and the mediator:
M = a0 + a1X + a2Y + a3X2 + a4XY + a5Y2 + eM
Z = b0 + b1M + b2X + b3Y + b4X2 + b5XY + b6Y2 + eZ
Mediated Polynomial Regression
• The mediated effect represented by these two equations can be derived by substituting the equation for M into the equation for Z to obtain a reduced-form equation:
Z = b0 + b1(a0 + a1X + a2Y + a3X2 + a4XY + a5Y2 + eM)
+ b2X + b3Y + b4X2 + b5XY + b6Y2 + eZ
• Distribution yields:
Z = b0 + a0b1 + a1b1X + a2b1Y + a3b1X2 + a4b1XY +
a5b1Y2 + b1eM + b2X + b3Y + b4X2 + b5XY + b6Y2 + eZ
Mediated Polynomial Regression
• Collecting like terms yields:
Z = (b0 + a0b1) + (b2 + a1b1)X + (b3 + a2b1)Y +
(b4 + a3b1)X2 + (b5 + a4b1)XY + (b6 + a5b1)Y2 + (eZ + b1eM)
• The compound coefficients on X, Y, X2, XY, and Y2 capture the portion of the quadratic effect mediated by M as the products a1b1, a2b1, a3b1, a4b1, and a5b1.
• The portion of the quadratic effect that bypasses M is captured by b2, b3, b4, b5, and b6.
• These coefficients can be analyzed separately and jointly to examine the mediated quadratic effect.
Example: Mediated Polynomial
Regression for Autonomy
• Quadratic equation with intent to take the focal job as the outcome variable:
Dep Var: INT N: 360 Multiple R: 0.276 Squared multiple R: 0.076
Adjusted squared multiple R: 0.063 Standard error of estimate: 1.174
Effect Coefficient Std Error Std Coef Tolerance t P(2 Tail)
CONSTANT 5.851 0.092 0.000 . 63.319 0.000
AUTCA 0.161 0.111 0.142 0.273 1.449 0.148
AUTCD -0.244 0.119 -0.187 0.315 -2.056 0.041
AUTCA2 -0.076 0.052 -0.110 0.444 -1.438 0.151
AUTCAD 0.197 0.089 0.267 0.178 2.211 0.028
AUTCD2 0.008 0.070 0.013 0.242 0.121 0.904
Analysis of Variance
Source Sum-of-Squares df Mean-Square F-ratio P
Regression 40.397 5 8.079 5.858 0.000
Residual 488.231 354 1.379
Example: Mediated Polynomial
Regression for Autonomy
• Quadratic equation with satisfaction as the mediator variable:
Dep Var: SAT N: 360 Multiple R: 0.411 Squared multiple R: 0.169
Adjusted squared multiple R: 0.157 Standard error of estimate: 1.055
Effect Coefficient Std Error Std Coef Tolerance t P(2 Tail)
CONSTANT 5.825 0.083 0.000 . 70.161 0.000
AUTCA 0.197 0.100 0.182 0.273 1.966 0.050
AUTCD -0.293 0.106 -0.238 0.315 -2.754 0.006
AUTCA2 -0.056 0.047 -0.086 0.444 -1.177 0.240
AUTCAD 0.276 0.080 0.396 0.178 3.453 0.001
AUTCD2 -0.035 0.063 -0.054 0.242 -0.553 0.581
Analysis of Variance
Source Sum-of-Squares df Mean-Square F-ratio P
Regression 79.951 5 15.990 14.362 0.000
Residual 394.135 354 1.113
Example: Mediated Polynomial
Regression for Autonomy
• Quadratic equation with intent to take the focal job as the outcome variable and satisfaction as the mediating variable:
Dep Var: INT N: 360 Multiple R: 0.760 Squared multiple R: 0.578
Adjusted squared multiple R: 0.571 Standard error of estimate: 0.795
Effect Coefficient Std Error Std Coef Tolerance t P(2 Tail)
CONSTANT 1.074 0.242 0.000 . 4.445 0.000
SAT 0.820 0.040 0.777 0.831 20.480 0.000
AUTCA 0.000 0.076 0.000 0.270 0.001 0.999
AUTCD -0.003 0.081 -0.002 0.308 -0.038 0.969
AUTCA2 -0.030 0.036 -0.044 0.443 -0.842 0.401
AUTCAD -0.030 0.061 -0.040 0.173 -0.484 0.629
AUTCD2 0.037 0.047 0.055 0.242 0.780 0.436
Analysis of Variance
Source Sum-of-Squares df Mean-Square F-ratio P
Regression 305.506 6 50.918 80.556 0.000
Residual 223.122 353 0.632
Example: Mediated Polynomial
Regression for Autonomy
• The compound coefficients are:
b0 + a0b1 = 1.07 + 5.83 × 0.82 = 1.07 + 4.78 = 5.85
b2 + a1b1 = 0.00 + 0.20 × 0.82 = 0.00 + 0.16 = 0.16
b3 + a2b1 = –0.00 – 0.29 × 0.82 = –0.00 – 0.24 = –0.24
b4 + a3b1 = –0.03 – 0.06 × 0.82 = –0.03 – 0.05 = –0.08
b5 + a4b1 = –0.03 + 0.28 × 0.82 = –0.03 + 0.23 = 0.20
b6 + a5b1 = 0.04 – 0.04 × 0.82 = 0.04 – 0.03 = 0.01
• The individual coefficients can be tested using the reported standard errors, and the products of coefficients can be tested using the bootstrap.
Example: Mediated Polynomial
Regression for Autonomy
Tests of individual and compound coefficients:
             Direct¹    First     Second    Indirect    Total
             Effect     Stage     Stage     Effect      Effect
Intercept    1.07**     5.83**    0.82**    4.78**      5.85**
X            0.00       0.20*     0.82**    0.16*       0.16
Y           –0.00      –0.29**    0.82**   –0.24**     –0.24*
X2          –0.03      –0.06      0.82**   –0.05       –0.08
XY          –0.03       0.28**    0.82**    0.23**      0.20
Y2           0.04      –0.04      0.82**   –0.03        0.01
¹The direct effect of the five quadratic terms was not significant.
Example: Mediated Polynomial
Regression for Autonomy
Surface for unmediated effect:
[Figure: surface for the unmediated effect.]
Example: Mediated Polynomial
Regression for Autonomy
Surface for direct effect:
[Figure: surface for the direct effect.]
Example: Mediated Polynomial
Regression for Autonomy
Surface for first stage of indirect effect:
[Figure: surface for the first stage of the indirect effect.]
Example: Mediated Polynomial
Regression for Autonomy
Surface for indirect effect:
[Figure: surface for the indirect effect.]
Example: Mediated Polynomial
Regression for Autonomy
Surface for total effect:
[Figure: surface for the total effect.]
Difference Scores as Dependent Variables
Many of the problems that occur when difference
scores are used as independent variables also
occur when they are used as dependent variables.
Alternative procedures for difference scores as
dependent variables are fundamentally different
from those for difference scores as independent
variables.
We will briefly consider procedures when the
dependent variable is an algebraic difference and
both components are endogenous, meaning they
are caused by the independent variables.
Difference Scores as Dependent Variables
An equation that uses an algebraic difference as a dependent variable is:
(Y1 – Y2) = b0 + b1X + e
Y1 and Y2 may be recast as separate dependent variables in a multivariate regression analysis:
Y1 = b10 + b11X + e1
Y2 = b20 + b21X + e2
Difference Scores as Dependent Variables
• The correspondence between these equations can be seen by subtracting the Y2 equation from the Y1 equation, which yields:
(Y1 – Y2) = (b10 – b20) + (b11 – b21)X + (e1 – e2)
• This subtraction shows the following:
b0 = b10 – b20
b1 = b11 – b21
• These expressions reveal a fundamental ambiguity, in that b0 and b1 indicate the differences between the intercepts and slopes, respectively, from the Y1 and Y2 equations, but they provide no information regarding the absolute magnitudes of these intercepts and slopes.
Difference Scores as Dependent Variables
• This ambiguity is illustrated by the following examples, all of which yield the same value for b1.
• This pattern indicates that the effects of X on Y1 and Y2 are equal in magnitude but opposite in sign:
b11 = b1/2, b21 = –b1/2
• Here, X is positively related to Y1 and unrelated to Y2:
b11 = b1, b21 = 0
• Here, X is negatively related to Y2 and unrelated to Y1:
b11 = 0, b21 = –b1
• These examples show that b1 is essentially useless for determining the effect of X on Y1 and Y2.
Difference Scores as Dependent Variables
• The alternative procedure uses Y1 and Y2 jointly as dependent variables in multivariate regression equations.
• The multivariate equations reveal the separate effects of X on Y1 and Y2 and can be used to test whether these effects correspond to hypotheses implied when (Y1 – Y2) is used as a dependent variable.
• The procedure provides multivariate tests of the effects of X on Y1 and Y2 and differences between these effects.
• Multivariate piecewise regression equations can be used as an alternative when |Y1 – Y2| is used as a dependent variable.
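A minimal sketch of the core contrast (names are mine; plain least squares via NumPy; a joint multivariate test of both equations would need additional machinery, e.g., a multivariate linear model):

```python
import numpy as np

def separate_slopes(x, y1, y2):
    """Fit Y1 and Y2 on X separately. The contrast (b11 - b21) reproduces
    b1 from the difference-score equation, while b11 and b21 retain the
    information about absolute magnitudes that the difference discards."""
    D = np.column_stack([np.ones(len(x)), x])
    b11 = np.linalg.lstsq(D, y1, rcond=None)[0][1]
    b21 = np.linalg.lstsq(D, y2, rcond=None)[0][1]
    return b11, b21, b11 - b21
```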
Answers to Frequently Asked Questions
Q: Which higher-order terms should I use? Are squared and product
terms sufficient, or should I also use cubed terms, the products of
squared and first-order terms, etc.?
A: The higher-order terms to be included in the equation depend
entirely on one’s hypotheses regarding the joint relationships of
X and Y with Z. In most cases, I have found that the three
quadratic terms (i.e., X2, XY, and Y2) are sufficient to capture
most theoretically meaningful effects. In exploratory analyses, I
have found significant effects for cubic and quartic terms, but
these rarely survive cross-validation and are often symptoms of a
few outliers or influential cases in the data.
Answers to Frequently Asked Questions
Q: How do I interpret the coefficients on X2, XY, and Y2? I
understand what they each mean separately, but thinking about
them all together is confusing.
A: The coefficients on X2, XY, and Y2 should be interpreted along
with the coefficients on X and Y as a set, because these
coefficients collectively describe the shape of the surface relating
X and Y to Z. Trying to interpret any one of these coefficients in
the absence of the others will often yield erroneous conclusions.
Instead, surfaces indicated by quadratic regression equations
should be treated as whole entities, and features of the surfaces
can be tested using response surface methodology. A major
motivation for applying response surface methodology was my
frustration when trying to make sense of coefficients from
quadratic equations. Response surface methodology makes the
task much easier.
Answers to Frequently Asked Questions
Q: Given that the coefficients on X and Y are scale dependent when
X2, XY, and Y2 are in the equation, how can I meaningfully interpret
these coefficients?
A: The coefficients on X and Y (i.e., b1 and b2) are indeed scale
dependent. However, this simply reflects the fact that b1 and b2
indicate the slope of the surface where X and Y are zero (i.e., the
origin of the X,Y plane). One could add or subtract arbitrary
constants to X and Y and change the values of b1 and b2, but
doing so may shift the origins of X and Y beyond the bounds of
the data, where it doesn’t make sense to estimate b1 and b2 in the
first place. A more reasonable strategy is to scale X and Y such
that their origins represent a meaningful point in the distribution
of the data in the X,Y plane, such as a point midway between
their means or the midpoint of their common scale.
Answers to Frequently Asked Questions
Q: How large should my sample be?
A: The sample should be large enough to provide the statistical
power needed to test constraints and combinations of regression
coefficients required to test hypotheses. Power is important
because showing support for constraints requires support for the
null hypothesis (i.e., the R2 values for the constrained and
unconstrained equations do not differ). A related concern is that
the sample should provide adequate dispersion of cases in the
X,Y plane. For example, if cases are skewed in the direction of
X > Y or X < Y, it will be very difficult to detect changes in the
slope of the surface along the Y = –X line, which are usually of
interest in congruence research. Keep in mind that skewness on
either side of the Y = –X line cannot be detected by examining
the distributions of X and Y separately.
Answers to Frequently Asked Questions
Q: I have seen measures that ask the respondent to directly compare
the degree to which X deviates from Y. Doesn’t this approach
avoid the problems with difference scores?
A: Not really. Although it removes the need for the researcher to
calculate the difference, it does not guarantee that the respondent
will not implicitly or explicitly calculate the difference between
X and Y when providing a response (many response scales for
such items prompt the respondent to do just that). If this occurs,
then items that solicit direct comparisons are subject to the same
problems as difference scores, because these problems do not
depend on who calculates the difference. Moreover, direct
comparison items hopelessly confound X and Y (analogous to
any “double-barreled” item) and force the researcher to take a
two-dimensional view of the relationship of X and Y with Z,
even when a three-dimensional view may be more informative.
Answers to Frequently Asked Questions
Q: The unconstrained equations for profile similarity indices contain
so many items. How do I interpret all those coefficients, and
what do I do about degrees of freedom?
A: Testing the full set of constraints imposed by D1, |D|, and D2 does
indeed require using items for all of the dimensions as predictors.
However, the items constituting profiles can often be grouped
into conceptually homogeneous subsets. Scales corresponding to
these subsets can then be constructed, which can drastically
reduce the effective number of dimensions to be analyzed. This
not only makes interpretation easier, but also reduces sample size
requirements. Moreover, higher-order terms for each dimension
can be tested as sets, and those that are not significant may be
dropped (for an illustration of this, see Edwards, 1993). Of
course, models derived in this manner should be considered
exploratory, pending cross-validation.
Answers to Frequently Asked Questions
Q: By not using difference scores, aren’t we ignoring “fit?”
A: Models using difference scores are simply special cases of
general models containing the components of the difference.
Hence, these general models subsume those that use difference
scores. The general models also permit tests of the constraints
imposed by difference scores, which remain unverified when
difference scores are used. Moreover, fit hypotheses can usually
be restated in terms of relationships involving the variables that
constitute the fit construct. By stating hypotheses in these terms,
one can verify that relationships for these variables conform to
patterns depicted by fit hypotheses. Thus, the use of component
variables, supplemented by higher-order terms and response
surface analyses, permits tests of most fit hypotheses as well as
hypotheses difference scores cannot depict. This approach lets
the researcher gain much and lose little, if anything at all.
Answers to Frequently Asked Questions
Q: How can I apply the quadratic approach to structural
equations modeling?
A: Drawing from the literature on moderated structural
equation modeling, I have developed procedures for
specifying and estimating quadratic structural equation
models and applying response surface methodology.
These procedures require squares and products of the
indicators of first-order latent variables, involve
complex nonlinear constraints on parameters, and use
estimation methods for nonnormal data. I hope to finish
a manuscript describing this procedure in the near
future.
Answers to Frequently Asked Questions
Q: How do you generate those fancy graphs?
A: I have traditionally used SYSTAT, which is great for
plotting three-dimensional surfaces and adding contour
lines, principal axes, and so forth. Surfaces can also be
plotted using Microsoft Excel, and I have developed a
file that allows the user to enter coefficient estimates
from a quadratic equation and the minimum and
maximum values of X and Y to produce a surface. This
file can be downloaded from my website at:
http://public.kenan-flagler.unc.edu/faculty/edwardsj/downloads.htm
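For readers without SYSTAT or the Excel file, here is a rough matplotlib sketch of the same idea (not the author's tool; the coefficients are the autonomy estimates used throughout, and the axis range is arbitrary):

```python
import numpy as np
import matplotlib.pyplot as plt

# Fitted quadratic surface for the autonomy example
b0, b1, b2, b3, b4, b5 = 5.825, 0.197, -0.293, -0.056, 0.276, -0.035
g = np.linspace(-3, 3, 50)
X, Y = np.meshgrid(g, g)
Z = b0 + b1*X + b2*Y + b3*X**2 + b4*X*Y + b5*Y**2

fig = plt.figure()
ax = fig.add_subplot(projection="3d")
ax.plot_surface(X, Y, Z, cmap="viridis", alpha=0.8)
ax.contour(X, Y, Z, zdir="z", offset=Z.min())  # contour lines beneath the surface
ax.set_xlabel("Actual (X)")
ax.set_ylabel("Desired (Y)")
ax.set_zlabel("Z")
plt.show()
```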
Answers to Frequently Asked Questions
Q: Can you recommend empirical examples of
polynomial regression in the organizational
behavior literature?
A: The use of polynomial regression has grown
since its introduction. Examples published
through 2000 are cited in the Edwards (2001)
article on difference score myths, and more
recent examples are cited in the meta-analysis
conducted by Kristof-Brown et al. (2005).
Answers to Frequently Asked Questions
Q: Your approach looks like a real pain. Can I just pretend it
doesn’t exist? Or, can I just cite your work to make it look like
I'm doing what you recommend?
A: Some researchers tenaciously cling to difference scores. Old
habits die hard. As a case in point, in a 1992 Psychological
Bulletin article, Lee Cronbach lamented that researchers continue
to use profile similarity indices he once advocated (Cronbach,
1955; Cronbach & Gleser, 1953) but subsequently disavowed
(Cronbach, 1958). Researchers have also developed clever ways
of citing articles that criticize difference scores without following
the advice in the articles. Here are some of my favorites, quoted
from studies that cite Edwards (1994):
Answers to Frequently Asked Questions
• “Computing a correlation across dimensions for each individual to predict outcomes of fit or congruence represents a flawed measure of fit (Edwards, 1994). However, for our purposes here, correlations across individuals within a dimension provide an appropriate measure of the relationship between person and environment.”
• “The reliabilities of the difference scores created to assess similarity were relatively high, so it seemed simpler and more understandable to keep the analysis as it was rather than to apply more complicated alternatives (e.g., Edwards, 1994).”
• “Unmet expectations were assessed by subtracting scores on each item for the early expectations from scores on each item from the current situation . . . Problems in measuring and analyzing discrepancy scores, and unmet expectations in particular, have been reported recently (Edwards, 1994) . . . these problems have not been entirely overcome here.”
Key References
Bohrnstedt, G. W., & Goldberger, A. S. (1969). On the exact covariance of products of random variables.
Journal of the American Statistical Association, 64, 1439-1442.
Bohrnstedt, G. W., & Marwell, G. (1978). The reliability of products of two random variables. In K. F.
Schuessler, (Ed.), Sociological Methodology 1978 (pp. 254-273). San Francisco: Jossey-Bass.
Edwards, J. R. (1994). The study of congruence in organizational behavior research: Critique and a proposed
alternative. Organizational Behavior and Human Decision Processes, 58, 51-100 (erratum, 58, 323-325).
Edwards, J. R., & Parry, M. E. (1993). On the use of polynomial regression equations as an alternative to
difference scores in organizational research. Academy of Management Journal, 36, 1577-1613.
Edwards, J. R. (1995). Alternatives to difference scores as dependent variables in the study of congruence in
organizational research. Organizational Behavior and Human Decision Processes, 64, 307-324.
Edwards, J. R. (2001). Ten difference score myths. Organizational Research Methods, 4, 264-286.
Edwards, J. R. (2002). Alternatives to difference scores: Polynomial regression analysis and response surface
methodology. In F. Drasgow & N. W. Schmitt (Eds.), Advances in measurement and data analysis (pp. 350-
400). San Francisco: Jossey-Bass.
Kristof-Brown, A. L., Zimmerman, R. D., & Johnson, E. C. (2005). Consequences of individual's fit at work: A
meta-analysis of person-job, person-organization, person-group, and person-supervisor fit. Personnel
Psychology, 58, 281-342.
Mooney, C. Z., & Duval, R. D. (1993). Bootstrapping: A nonparametric approach to statistical inference.
Newbury Park, CA: Sage.
Stine, R. (1989). An introduction to bootstrap methods. Sociological Methods & Research, 18, 243-291.