Least Squares Regression Line


[Figure: for each data point, the total deviation of Y from its mean (Y − Ȳ) splits into the deviation explained by the regression (Ŷ − Ȳ) and the deviation not explained by the regression (Y − Ŷ).]
Regression Analysis Terms
Explained variance = R² (coefficient of determination).
Unexplained variance = residuals (error).
The simple linear regression model is y = β₀ + β₁x + ε.
The error term ε is a random variable with mean zero. Hence, E(y) = β₀ + β₁x.
The variance of the error term is σ² and is the same for all values of X. Consequently, the variance of Y also equals σ².
The values of ε are independent. Hence one error term does not depend on another error term, and consequently one Y value does not depend on another Y value.
The error terms are normally distributed.
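To make these assumptions concrete, here is a minimal Python sketch (numpy, with illustrative parameter values that are not from the text) that simulates data from this model, with independent, mean-zero, normally distributed errors of constant variance:

```python
import numpy as np

rng = np.random.default_rng(42)

# Illustrative parameter choices (assumptions for this sketch)
beta0, beta1, sigma = 2.0, 0.5, 1.0
n = 100

x = rng.uniform(0, 10, size=n)
eps = rng.normal(loc=0.0, scale=sigma, size=n)  # mean zero, constant variance, independent
y = beta0 + beta1 * x + eps                     # y = beta0 + beta1*x + error

# E(y) = beta0 + beta1*x; the simulated errors average near zero
print(f"mean of errors: {eps.mean():.3f}")
```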
Notation for Regression Equation

                                     Population Parameter    Sample Statistic
y-intercept of regression equation          β₀                     b₀
Slope of regression equation                β₁                     b₁
Equation of the regression line       y = β₀ + β₁x           ŷ = b₀ + b₁x
Sum of Squares for Errors
This is the sum of squared differences between the observed points and the regression line.
It can serve as a measure of how well the line fits the data. SSE is defined by
SSE = Σᵢ₌₁ⁿ (yᵢ − ŷᵢ)².
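As a quick illustration, a minimal Python sketch (numpy, with made-up data) that fits a least squares line and computes SSE:

```python
import numpy as np

# Made-up sample data (illustrative only)
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1])

# Fit the least squares line y-hat = b0 + b1 * x
b1, b0 = np.polyfit(x, y, deg=1)

# Sum of squared differences between observed and fitted values
y_hat = b0 + b1 * x
sse = np.sum((y - y_hat) ** 2)
print(f"b0={b0:.3f}, b1={b1:.3f}, SSE={sse:.4f}")
```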
Linear Regression Assumptions
Y can be predicted from X.
A graph of X and Y is a straight line.
The line extends infinitely in both directions.
The model explains only the variability in Y.
Each XY pair was randomly sampled.
Each XY pair was selected independently.
Linear Regression Assumptions
For each x value, y is a random variable having a normal (bell-shaped) distribution. All of these y distributions have the same variance. Also, for a given value of x, the distribution of y values has a mean that lies on the regression line. (Results are not seriously affected if departures from normal distributions and equal variances are not too extreme.)
The Normality of y
y is normally distributed with mean E(y) = β₀ + β₁x and a constant standard deviation σ_ε.

[Figure: normal y distributions centered on the regression line at x₁, x₂, and x₃, with means E(y|x₁) = β₀ + β₁x₁, E(y|x₂) = β₀ + β₁x₂, and E(y|x₃) = β₀ + β₁x₃.]

The standard deviation remains constant, but the mean value changes with x.
Standard Error of Estimate
The mean error is equal to zero.
If σ_ε is small, the errors tend to be close to zero (close to the mean error), and the model fits the data well.
Therefore, we can use σ_ε as a measure of the suitability of using a linear model.
An estimator of σ_ε is given by the standard error of estimate:
s_ε = √(SSE / (n − 2)).
Testing the Slope
We can draw inferences about β₁ from b₁ by testing
H₀: β₁ = 0
H₁: β₁ ≠ 0 (or < 0, or > 0)
The test statistic is
t = (b₁ − β₁) / s_b₁
where the standard error of b₁ is
s_b₁ = s_ε / √((n − 1)s_x²).
If the error variable is normally distributed, the statistic follows a Student t distribution with d.f. = n − 2.
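A minimal Python sketch of this test (numpy and scipy, continuing the made-up data from the SSE example above):

```python
import numpy as np
from scipy import stats

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1])
n = len(x)

b1, b0 = np.polyfit(x, y, deg=1)
sse = np.sum((y - (b0 + b1 * x)) ** 2)

s_eps = np.sqrt(sse / (n - 2))                       # standard error of estimate
s_b1 = s_eps / np.sqrt((n - 1) * np.var(x, ddof=1))  # standard error of b1

t = (b1 - 0) / s_b1                                  # test H0: beta1 = 0
p = 2 * stats.t.sf(abs(t), df=n - 2)                 # two-sided p-value
print(f"t={t:.3f}, p={p:.4f}")
```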
Multivariate Data Analysis
Selecting a Multivariate Technique
Dependency: dependent (criterion) variables and independent (predictor) variables are present.
Interdependency: variables are interrelated without designating some as dependent and others as independent.
Dependency Techniques
Multiple regression (Univariate and
multivariate)
Conjoint analysis
Discriminant analysis
Multivariate analysis of variance
(MANOVA)
Linear structural relationships (LISREL)
Interdependency Techniques
Factor analysis
Cluster analysis
Multidimensional Scaling (MDS)
Multiple Regression Model
The equation that describes how the dependent variable y is related to the independent variables x₁, x₂, . . . , x_p and an error term is called the multiple regression model.
The multiple regression model is:
y = β₀ + β₁x₁ + β₂x₂ + . . . + β_p x_p + ε
β₀, β₁, β₂, . . . , β_p are the parameters.
ε is a random variable called the error term.
In simple linear regression (SLR), the conditional mean of Y depends on X. The multiple regression model extends this idea to include more than one independent variable.
Multiple Regression Equation
The equation that describes how the mean value of y is related to x₁, x₂, . . . , x_p is called the multiple regression equation.
The multiple regression equation is:
E(y) = β₀ + β₁x₁ + β₂x₂ + . . . + β_p x_p
Estimated Multiple Regression Equation
A simple random sample is used to compute sample statistics b₀, b₁, b₂, . . . , b_p that are used as the point estimators of the parameters β₀, β₁, β₂, . . . , β_p.
The estimated multiple regression equation is:
ŷ = b₀ + b₁x₁ + b₂x₂ + . . . + b_p x_p
Estimation Process
[Diagram: the estimation process flows from the multiple regression model y = β₀ + β₁x₁ + β₂x₂ + . . . + β_p x_p + ε and the multiple regression equation E(y) = β₀ + β₁x₁ + β₂x₂ + . . . + β_p x_p, with unknown parameters β₀, β₁, β₂, . . . , β_p, through sample data on x₁, x₂, . . . , x_p and y, to the estimated multiple regression equation ŷ = b₀ + b₁x₁ + b₂x₂ + . . . + b_p x_p. The sample statistics b₀, b₁, b₂, . . . , b_p provide estimates of β₀, β₁, β₂, . . . , β_p.]
Least Squares Method
Least Squares Criterion: minimize Σ (yᵢ − ŷᵢ)².
Computation of Coefficient Values: the formulas for the regression coefficients b₀, b₁, b₂, . . . , b_p involve the use of matrix algebra. We will rely on computer software packages to perform the calculations, though a small sketch follows below.
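For the curious, a minimal numpy sketch of that matrix algebra, solving the normal equations (XᵀX)b = Xᵀy for the coefficient vector (a textbook illustration with made-up data, not how production software computes it):

```python
import numpy as np

# Made-up data: 6 observations, 2 independent variables
X_raw = np.array([[1.0, 2.0], [2.0, 1.0], [3.0, 4.0],
                  [4.0, 3.0], [5.0, 6.0], [6.0, 5.0]])
y = np.array([5.1, 5.9, 11.2, 11.8, 17.1, 17.9])

# Prepend a column of ones so b[0] plays the role of b0 (the intercept)
X = np.column_stack([np.ones(len(y)), X_raw])

# Solve the normal equations (X'X) b = X'y
b = np.linalg.solve(X.T @ X, X.T @ y)
print("b0, b1, b2 =", np.round(b, 3))
```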
A Note on Interpretation of Coefficients
bᵢ represents an estimate of the change in y corresponding to a one-unit change in xᵢ when all other independent variables are held constant.
Relationship Among SST, SSR, SSE
SST = SSR + SSE
Multiple Coefficient of Determination
R² = SSR/SST
If R² is significantly greater than 0 (as judged by the F test), we reject the null hypothesis of no relationship.
Adjusted Multiple Coefficient of Determination
R²ₐ = 1 − (1 − R²)(n − 1)/(n − p − 1)
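A short Python sketch (continuing the two-variable made-up data from the normal-equations sketch above) that verifies SST = SSR + SSE and computes R² and the adjusted R²:

```python
import numpy as np

X_raw = np.array([[1.0, 2.0], [2.0, 1.0], [3.0, 4.0],
                  [4.0, 3.0], [5.0, 6.0], [6.0, 5.0]])
y = np.array([5.1, 5.9, 11.2, 11.8, 17.1, 17.9])
n, p = X_raw.shape

X = np.column_stack([np.ones(n), X_raw])
b, *_ = np.linalg.lstsq(X, y, rcond=None)
y_hat = X @ b

sst = np.sum((y - y.mean()) ** 2)   # total variation
sse = np.sum((y - y_hat) ** 2)      # unexplained variation
ssr = sst - sse                     # explained variation (SST = SSR + SSE)

r2 = ssr / sst
r2_adj = 1 - (1 - r2) * (n - 1) / (n - p - 1)
print(f"R2={r2:.4f}, adjusted R2={r2_adj:.4f}")
```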
Model Assumptions
Assumptions About the Error Term
1. The error ε is a random variable with mean of zero. Implication: for given values of the independent variables, the expected or average value of y is given by
E(y) = β₀ + β₁x₁ + β₂x₂ + . . . + β_p x_p
2. The variance of ε, denoted by σ², is the same for all values of the independent variables. Implication: the variance of y equals σ² and is the same for all values of x₁, x₂, . . . , x_p.
3. The values of ε are independent. Implication: the size of the error for a particular set of values of the independent variables is not related to the size of the error for any other set.
4. The error ε is a normally distributed random variable reflecting the deviation between the y value and the expected value of y given by β₀ + β₁x₁ + β₂x₂ + . . . + β_p x_p. Implication: y is also a normally distributed random variable for given values of x₁, x₂, . . . , x_p.
Testing for Significance
In simple linear regression, the F and t tests provide the same conclusion.
In multiple regression, the F and t tests have different purposes.
The F test is used to determine whether a significant linear relationship exists between the dependent variable and the set of all the independent variables.
The F test is referred to as the test for overall significance.
Testing for Significance: t Test
If the F test shows an overall significance,
the t test is used to determine whether each
of the individual independent variables is
significant.
A separate t test is conducted for each of
the independent variables in the model.
We refer to each of these t tests as a test
for individual significance.
Testing for Significance: F Test
Hypotheses:
H₀: β₁ = β₂ = . . . = β_p = 0
Hₐ: one or more of the parameters is not equal to zero.
Test statistic: F = MSR/MSE
Rejection rule: reject H₀ if F > F_α, where F_α is based on an F distribution with p d.f. in the numerator and n − p − 1 d.f. in the denominator. Decide significance based on the p-value.
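A minimal sketch of the F test in Python (scipy), where MSR = SSR/p and MSE = SSE/(n − p − 1); the sums of squares here are hypothetical stand-ins for values computed as in the R² sketch above:

```python
from scipy import stats

# Illustrative numbers (made up): n observations, p predictors
n, p = 6, 2
ssr, sse = 158.0, 0.1   # hypothetical sums of squares

msr = ssr / p                    # mean square due to regression
mse = sse / (n - p - 1)          # mean square error
F = msr / mse
p_value = stats.f.sf(F, dfn=p, dfd=n - p - 1)   # P(F distribution > observed F)
print(f"F={F:.2f}, p-value={p_value:.4f}")
```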
Testing for Significance: t Test
Hypotheses:
H₀: βᵢ = 0
Hₐ: βᵢ ≠ 0
Test statistic: t = bᵢ / s_bᵢ
Rejection rule: reject H₀ if t < −t_{α/2} or t > t_{α/2}, where t_{α/2} is based on a t distribution with n − p − 1 degrees of freedom. Decide significance based on the p-value.
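In practice, both tests come straight out of a fitted model. A minimal sketch with statsmodels (made-up data), where the results object reports the overall F statistic and a t test for each coefficient:

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
X_raw = rng.normal(size=(50, 2))                   # two made-up predictors
y = 1.0 + 2.0 * X_raw[:, 0] + rng.normal(size=50)  # second predictor has no true effect

X = sm.add_constant(X_raw)        # adds the intercept column
model = sm.OLS(y, X).fit()

print(model.fvalue, model.f_pvalue)   # overall F test
print(model.tvalues, model.pvalues)   # individual t tests
```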
Multicollinearity
Is multicollinearity bad?
If the multicollinearity is perfect, the regression coefficients become indeterminate. (Substitute x₁ = ax₂ into the OLS estimates to see this.)
What causes multicollinearity?
God causes multicollinearity!
A data collection procedure that samples a limited range of values.
Built-in constraints in the model, e.g., income and wealth, or expenditure and the number of members in the household.
In time series data, regressors that share a common trend.
Multicollinearity Diagnostics
Tolerance: the amount of variance in an independent variable that is not explained by the other independent variables. If the other variables explain a lot of the variance of a particular independent variable, we have a problem with multicollinearity. Thus, small values of tolerance indicate problems of multicollinearity. The cutoff value for tolerance is typically .20; a tolerance value smaller than .20 indicates a problem of multicollinearity.
Variance Inflation Factor (VIF): measures how much the variance of the regression coefficients is inflated by multicollinearity problems. If VIF equals 1, there is no correlation among the independent measures; a VIF slightly above 1 indicates some association between predictor variables, but generally not enough to cause problems. A maximum acceptable VIF value would be 5.0; anything higher would indicate a problem with multicollinearity.
How to detect multicollinearity
Check the pair-wise correlations between the explanatory variables, as in the sketch below.
Remedy
Why bother, if the model is used only for prediction?
Carry out factor analysis.
Drop one collinear variable.
Transform the variables by taking differences between two time periods (time series data).
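A minimal diagnostic sketch in Python (pandas and statsmodels, with hypothetical column names) that checks pairwise correlations and computes VIF and tolerance for each predictor:

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm
from statsmodels.stats.outliers_influence import variance_inflation_factor

rng = np.random.default_rng(1)
x1 = rng.normal(size=100)
df = pd.DataFrame({
    "x1": x1,
    "x2": x1 * 0.9 + rng.normal(scale=0.3, size=100),  # deliberately collinear with x1
    "x3": rng.normal(size=100),
})

print(df.corr().round(2))   # pairwise correlations between explanatory variables

X = sm.add_constant(df)
for i, name in enumerate(df.columns, start=1):   # index 0 is the constant, so skip it
    vif = variance_inflation_factor(X.values, i)
    print(f"{name}: VIF={vif:.2f}, tolerance={1 / vif:.2f}")
```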
Statistical vs. Practical Significance?
The F statistic is used to determine if the overall regression model is statistically significant. If the F statistic is significant, it is unlikely your sample would produce a large R² when the population R² is actually zero. To be considered statistically significant, a rule of thumb is there must be less than a .05 probability that the results are due to chance.
If the R² is statistically significant, we then evaluate the strength of the linear association between the dependent variable and the several independent variables. R², also called the coefficient of determination, is used to measure the strength of the overall relationship. It represents the amount of variation in the dependent variable associated with all of the independent variables considered together (it is also referred to as a measure of goodness of fit). R² ranges from 0 to 1.0 and represents the proportion of variation in the dependent variable explained by the independent variables combined. A large R² indicates the straight line works well, while a small R² indicates it does not work well.
Even though an R² is statistically significant, it does not mean it is practically significant. We also must ask whether the results are meaningful. For example, is the value of knowing you have explained 4 percent of the variation worth the cost of collecting and analyzing the data?
Exercise: Multiple Regression
1. Review the data for the McDonald's restaurant case.
2. Where could multiple regression be useful for the customer survey?
3. Where could multiple regression be useful for the employee survey?
Description of Customer Survey Variables

Variable  Description                   Variable Type
Restaurant Perceptions
X1        Excellent Food Quality        Metric
X2        Attractive Interior           Metric
X3        Generous Portions             Metric
X4        Excellent Food Taste          Metric
X5        Good Value for the Money      Metric
X6        Friendly Employees            Metric
X7        Appears Clean & Neat          Metric
X8        Fun Place to Go               Metric
X9        Wide Variety of Menu Items    Metric
X10       Reasonable Prices             Metric
X11       Courteous Employees           Metric
X12       Competent Employees           Metric
Selection Factor Rankings
X13       Food Quality                  Nonmetric
X14       Atmosphere                    Nonmetric
X15       Prices                        Nonmetric
X16       Employees                     Nonmetric
Relationship & Classification Variables
X17       Satisfaction                  Metric
X18       Likely to Return in Future    Metric
X19       Recommend to Friend           Metric
X20       Frequency of Patronage        Nonmetric
X21       Who Saw Ad                    Nonmetric
X22       Which Ad Viewed               Nonmetric
X23       Ad Rating                     Metric
X24       Length of Time a Customer     Metric
X25       Gender                        Nonmetric
X26       Age                           Metric
X27       Income                        Metric
X28       Competitor                    Nonmetric
Description of Employee Survey Variables

Variable  Description                                                        Variable Type
Work Environment Measures
X1        I am paid fairly for the work I do.                                Metric
X2        I am doing the kind of work I want.                                Metric
X3        My supervisor gives credit and praise for work well done.          Metric
X4        There is a lot of cooperation among the members of my work group.  Metric
X5        My job allows me to learn new skills.                              Metric
X6        My supervisor recognizes my potential.                             Metric
X7        My work gives me a sense of accomplishment.                        Metric
X8        My immediate work group functions as a team.                       Metric
X9        My pay reflects the effort I put into doing my work.               Metric
X10       My supervisor is friendly and helpful.                             Metric
X11       The members of my work group have the skills and/or training to do their job well.  Metric
X12       The benefits I receive are reasonable.                             Metric
Relationship Measures
X13       Loyalty: I have a sense of loyalty to McDonald's restaurant.       Metric
X14       Effort: I am willing to put in a great deal of effort beyond that expected to help McDonald's restaurant be successful.  Metric
X15       Proud: I am proud to tell others that I work for McDonald's restaurant.  Metric
Classification Variables
X16       Intention to Search                                                Metric
X17       Length of Time an Employee                                         Nonmetric
X18       Work Type (Part-Time vs. Full-Time)                                Nonmetric
X19       Gender                                                             Nonmetric
X20       Age                                                                Metric
X21       Performance                                                        Metric
Using SPSS to Compute a Multiple Regression Model
We first want to see the effect of food quality on customers' future return. The SPSS click-through sequence is ANALYZE → REGRESSION → LINEAR. Highlight X18 and move it to the dependent variable box. Highlight X1 and move it to the independent variables box. Use the default "Enter" in the Method box. Click on the Statistics button and use the defaults for "Estimates" and "Model Fit". Next click on "Descriptives" and then Continue. There are several other options you could select at the bottom of this dialog box, but for now we will use the program defaults. Click on OK at the top right of the dialog box to run the regression.
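For readers working outside SPSS, a rough Python equivalent with statsmodels (assuming the survey data sits in a CSV with columns named X1 and X18; the file name here is hypothetical):

```python
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical file; substitute the actual customer survey data set
df = pd.read_csv("mcdonalds_customer_survey.csv")

# X18 (likely to return) regressed on X1 (excellent food quality)
model = smf.ols("X18 ~ X1", data=df).fit()
print(model.summary())
```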
Selected Variables from McDonald's Customer Survey
X1   Excellent Food Quality        Strongly Disagree 1 2 3 4 5 6 7 Strongly Agree
X4   Excellent Food Taste          Strongly Disagree 1 2 3 4 5 6 7 Strongly Agree
X9   Wide Variety of Menu Items    Strongly Disagree 1 2 3 4 5 6 7 Strongly Agree
X18  How likely are you to return to McDonald's restaurant in the future?
                                   Definitely Will Not Return 1 2 3 4 5 6 7 Definitely Will Return
Using SPSS to Compute a Multiple Regression Model
We want to compare McDonald's customers' perceptions with those of Domino's, so go to the Data pull-down menu to split the sample. Scroll down and click on Split File, then on Compare Groups. Highlight variable X28 and move it into the box labeled "Groups Based on:", and then click OK. Now you can run the regression and compare McDonald's and Domino's.
The SPSS click-through sequence is ANALYZE → REGRESSION → LINEAR. Highlight X18 and move it to the dependent variable box. Highlight X1, X4, and X9 and move them to the independent variables box. Use the default "Enter" in the Method box. Click on the Statistics button and use the defaults for "Estimates" and "Model Fit". Next click on "Descriptives" and then Continue. There are several other options you could select at the bottom of this dialog box, but for now we will use the program defaults. Click on OK at the top right of the dialog box to run the regression.
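A rough Python counterpart of the split-file comparison (again with hypothetical file and column names, where X28 identifies the competitor group):

```python
import pandas as pd
import statsmodels.formula.api as smf

df = pd.read_csv("mcdonalds_customer_survey.csv")  # hypothetical file

# Fit the same model separately for each X28 group (e.g., McDonald's vs. Domino's)
for group, sub in df.groupby("X28"):
    model = smf.ols("X18 ~ X1 + X4 + X9", data=sub).fit()
    print(f"Group {group}:")
    print(model.params.round(3))
```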
Multiple Regression Dialog Boxes
[Screenshots of the SPSS Linear Regression dialog boxes.]
Multiple Regression Output
Degrees of freedom (df) = the total number of observations
minus the number of estimated parameters. For example, in
estimating a regression model with a single independent
variable, we estimate two parameters, the intercept (b0) and
a regression coefficient for the independent variable (b1). If
the number of degrees of freedom is small, the resulting
prediction is less generalizable. Conversely, a large degrees-
of-freedom value indicates the prediction is fairly robust
with regard to being representative of the overall sample of
respondents.
Total Sum of Squares (SST) = total amount of variation that exists to be explained by the independent variables. SST is the sum of SSE and SSR.
Sum of Squared Errors (SSE) = the variance in the dependent
variable not accounted for by the regression model = residual. The
objective is to obtain the smallest possible sum of squared errors as a
measure of prediction accuracy.
Sum of Squares Regression (SSR) = the
amount of improvement in explanation of the
dependent variable attributable to the
independent variables.
Unstandardized Coefficient (B) interpretation = for every unit McDonald's X1 increases, X18 (the dependent variable) will increase by .260 units.
Constant term (b₀) = also referred to as the intercept, it is the value on the Y axis (dependent variable axis) where the line defined by the regression equation crosses the axis.
Only significant betas are interpreted (p < .05).
Standardized Coefficient (Beta) interpretation = this value adjusts for the different units of the independent and dependent variables. Standardized coefficients are used to compare several independent variables.
There is high multicollinearity among the independent variables. This can cause a problem with the significance of the beta coefficients. See X9 on the previous slide.
Notice that when variables X1 and X4 are eliminated, the beta for X9 becomes significant.
Multicollinearity Diagnostics
Variance Inflation Factor (VIF): measures how much the variance of the regression coefficients is inflated by multicollinearity problems. If VIF equals 1, there is no correlation among the independent measures; a VIF slightly above 1 indicates some association between predictor variables, but generally not enough to cause problems. A maximum acceptable VIF value would be 5.0; anything higher would indicate a problem with multicollinearity.
Tolerance: the amount of variance in an independent variable that is not explained by the other independent variables. If the other variables explain a lot of the variance of a particular independent variable, we have a problem with multicollinearity. Thus, small values of tolerance indicate problems of multicollinearity. The cutoff value for tolerance is typically .20; a tolerance value smaller than .20 indicates a problem of multicollinearity.
Using SPSS to Examine Multicollinearity
The SPSS click-through sequence is ANALYZE → REGRESSION → LINEAR. Go to the McDonald's employee survey data and click on X13 Loyalty and move it to the Dependent Variables box. Click on variables X1 to X12 and move them to the Independent Variables box. The box labeled Method has "Enter" as the default and we will use it. Click on the Statistics button and use the "Estimates" and "Model fit" defaults. Click on "Descriptives" and "Collinearity diagnostics", and then Continue and OK to run the regression.
Tolerance and VIF are two statistics that tell us about the extent of multicollinearity in the model. Tolerance is the reciprocal of VIF.
Residuals Plots
Histogram of standardized residuals: enables you to determine if the errors are normally distributed (see Exhibit 1).
Normal probability plot: enables you to determine if the errors are normally distributed. It compares the observed (sample) standardized residuals against the expected standardized residuals from a normal distribution (see Exhibit 2).
Scatterplot of residuals: can be used to test regression assumptions. It compares the standardized predicted values of the dependent variable against the standardized residuals from the regression equation (see Exhibit 3). If the plot exhibits a random pattern, this indicates no identifiable violations of the assumptions underlying regression analysis.
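The same three diagnostics are easy to produce outside SPSS. A minimal matplotlib/scipy sketch (using residuals from a made-up fitted model, not the survey data):

```python
import numpy as np
import matplotlib.pyplot as plt
from scipy import stats

rng = np.random.default_rng(2)
x = rng.normal(size=80)
y = 1.0 + 0.8 * x + rng.normal(size=80)

b1, b0 = np.polyfit(x, y, 1)
resid = y - (b0 + b1 * x)
z_resid = (resid - resid.mean()) / resid.std(ddof=1)  # standardized residuals
z_pred = stats.zscore(b0 + b1 * x)                    # standardized predicted values

fig, axes = plt.subplots(1, 3, figsize=(12, 3.5))
axes[0].hist(z_resid, bins=15)                        # Exhibit 1-style histogram
axes[0].set_title("Histogram of residuals")
stats.probplot(z_resid, dist="norm", plot=axes[1])    # Exhibit 2-style normal plot
axes[2].scatter(z_resid, z_pred)                      # Exhibit 3-style null plot
axes[2].set_xlabel("Standardized residual")
axes[2].set_ylabel("Standardized predicted")
plt.tight_layout()
plt.show()
```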
Using SPSS to Examine Residuals
SPSS includes several diagnostic tools to examine residuals. To run the regression that examines the residuals, first load the employee database. The click-through sequence is ANALYZE → REGRESSION → LINEAR. Highlight X15 Proud and move it to the dependent variable box. Next highlight variables X2, X5, X7, and X19 and move them to the independent variable box. "Enter" is the default in the Method box and we will use it. Click on the Statistics button; "Estimates" and "Model Fit" will be the defaults. Now click on "Collinearity Diagnostics". Next click on Continue.
This is the same sequence as earlier regression applications, but now we also must go to the Plots button to request some new information. To produce plots of the residuals to check on potential violations of the regression assumptions, click on ZPRED and move it to the Y box. Then click on ZRESID and move it to the X box. These two plots are for the Standardized Predicted Dependent Variable and Standardized Residuals. Next, click on Histogram and Normal Probability Plot under the Standardized Residual Plots box on the lower left side of the screen. Examination of these plots and tables enables us to determine whether the hypothesized relationship between the dependent variable X15 and the independent variables X2, X5, X7, and X19 is linear, and also whether the error terms in the regression model are normally distributed. Finally, click on Continue and then on OK to run the program. The results are the same as in Exhibits 1 to 3.
Exhibit 1: Histogram of Employee Survey Dependent Variable X15 Proud
[Histogram of the regression standardized residuals for dependent variable X15 Proud. Std. Dev = .97, Mean = 0.00, N = 63.]
Exhibit 2: Normal Probability Plot of Regression Standardized Residuals
[Normal P-P plot of the regression standardized residuals for dependent variable X15 Proud: expected cumulative probability plotted against observed cumulative probability.]
Normal probability plot = a graphical comparison of the shape of the sample distribution (observed) to the normal distribution. The straight line angled at 45 degrees is the normal distribution, and the actual distribution (observed) is shown as deviations from the straight line.
Exhibit 3: Scatterplot of Employee Survey Dependent Variable X15 Proud
[Scatterplot for dependent variable X15 Proud: regression standardized predicted values plotted against regression standardized residuals.]
This is a scatterplot of the standardized residuals versus the predicted dependent (Y) values. If it exhibits a random pattern, which this plot does, then it indicates no identifiable violations of the assumptions underlying regression analysis and is called a Null Plot.
Thank you
