
SPSS Regression Output

The box below is the first thing you'll see in the standard SPSS regression output. "Standard" output means that I did not click ANY boxes or options to get this printout. For bivariate regression this first box is not of much interest, because it simply lists the single variable we are using as a predictor. Later, for multiple regression, this box will show all the variables in our models, and it can also show the sets of variables for several models at once.
Variables Entered/Removed (b)

Model   Variables Entered                     Variables Removed   Method
1       SOCIO-ECONOMIC STATUS COMPOSITE (a)                       Enter

a. All requested variables entered.
b. Dependent Variable: MATH STANDARDIZED SCORE

The model summary box comes next. In it you will find R, R2, and the standard error of estimate (Sy.x), which is the square root of the mean squared error, or MSE. Here we interpret R as the correlation of the Y scores with the predicted values, Ŷ. The adjusted R2 is used in multiple regression. It is adjusted to account for the use of more predictors: simply adding more Xs can raise your R2, so this value is adjusted downward a little to penalize ourselves for just hunting around for significant predictors.
Model Summary

Model   R         R Square   Adjusted R Square   Std. Error of the Estimate
1       .531 (a)  .282       .279                8.2148

a. Predictors: (Constant), SOCIO-ECONOMIC STATUS COMPOSITE

Note that in the footnote to the model summary SPSS tells us what predictors are relevant for the R and R2, even though in this case we have only one predictor. If we were to run several bivariate regressions or several multiple regressions, we would get a list of several models. The word "constant" in parentheses refers to the intercept. This is printed because it is possible to force SPSS not to estimate an intercept. That is only done in unusual situations; for most regressions, and for all of the regressions we will run, we will allow SPSS to estimate the intercept term.
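If you want to see where the adjusted R2 and the standard error of the estimate come from, the short Python sketch below reproduces them from the printed values (this is just an illustration, not part of the SPSS output; the sample size n = 250 and MSE = 67.483 are taken from the ANOVA table further down).

    # Reproduce the Model Summary quantities from the printed SPSS values.
    n = 250            # sample size (Total df + 1 in the ANOVA table)
    k = 1              # number of predictors (bivariate regression)
    r_squared = 0.282  # R Square from the Model Summary
    mse = 67.483       # Mean Square for the residuals in the ANOVA table

    # Adjusted R2 penalizes R2 for the number of predictors used.
    adj_r_squared = 1 - (1 - r_squared) * (n - 1) / (n - k - 1)

    # The standard error of the estimate is the square root of the MSE.
    std_error_estimate = mse ** 0.5

    print(round(adj_r_squared, 3))       # 0.279, matching the Model Summary
    print(round(std_error_estimate, 4))  # 8.2148, matching the Model Summary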

The box below shows the ANOVA table for the regression. ANOVA stands for Analysis Of Variance, specifically the analysis of variation in the Y scores. Here we see the two sums of squares introduced in class: the regression and residual (or error) sums of squares. The variance of the residuals (or errors) is the value of the mean square error, or MSE; here it is 67.483. Recall that we compare the value of the MSE to the value of the variance of Y. The standard output does not give us the variance of Y; you need to click the Statistics button (in the regression menu) to get it, OR run descriptive statistics on Y.

Also in this table we find the F test. This tests the hypothesis that the predictor (here our only predictor) shows no relationship to Y. We can write this hypothesis in several ways, as mentioned in class. The F test has two numbers for its degrees of freedom (recall that our t test has one df). These are called the numerator and denominator degrees of freedom, or df1 and df2. The numerator df (df1) tells us how many predictors we have (this time it is 1), and the denominator df (df2) is n - 1 - df1, or n - 2 for bivariate regression. The value of the test for our data is F(1, 248) = 97.42. The table shows us this is significant (p < .001). Because the F is large, we determine that our predictor of math outcome (here, SES) is related to math score in our population.
ANOVA (b)

Model                Sum of Squares   df    Mean Square   F        Sig.
1   Regression         6574.387         1     6574.387    97.423   .000 (a)
    Residual          16735.828       248       67.483
    Total             23310.215       249

a. Predictors: (Constant), SOCIO-ECONOMIC STATUS COMPOSITE
b. Dependent Variable: MATH STANDARDIZED SCORE

Again the footnote tells us what predictor is being used and what outcome is being predicted. The table also provides us with the data we need to compute R2: if we divide SS-regression by SS-total, we should get R2. SS-regression / SS-total = 6574.39 / 23310.21 = .282.

The last table is full of information about the model. In it we find the slope (or slopes, in multiple regression). Our values of b0 and b1 are listed as unstandardized coefficients, and their standard errors, SE(b0) and SE(b1), are in the second column. The standardized coefficient for the predictor in a bivariate regression is simply the correlation. Check back to the value of R in the model summary and see that it is the same as Beta here, .531. In our notation from class this is b1*.
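As a quick arithmetic check (a Python sketch using the printed values, not part of the SPSS output), the lines below recover R2, F, and the standardized coefficient from the sums of squares in the ANOVA table.

    # Values copied from the ANOVA table above.
    ss_regression = 6574.387
    ss_residual = 16735.828
    ss_total = ss_regression + ss_residual        # 23310.215
    df_regression, df_residual = 1, 248

    r_squared = ss_regression / ss_total          # about .282
    f_value = (ss_regression / df_regression) / (ss_residual / df_residual)  # about 97.42

    # With one predictor, Beta is simply the correlation: the square root of R2.
    beta = r_squared ** 0.5                       # about .531

    print(round(r_squared, 3), round(f_value, 2), round(beta, 3))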

We can write the sample regression model from these slopes and also the sample standardized regression model, if we like. Those models are:

Sample regression model:
Yi = b0 + b1(SESi) + ei
Yi = 51.28 + 6.54(SESi) + ei

Sample standardized regression model:
Z(Yi) = b1* Z(SESi) + ei*
Z(Yi) = .531 Z(SESi) + ei*

Note, you could also write these models using Ŷ and omitting the error terms. Recall again the interpretation of the slopes. The unstandardized slope of 6.54 tells us that a student's predicted math score increases by about 6.5 points for every additional point on the SES scale. Higher SES scores are associated with higher math scores. The standardized slope tells us that for each standard-deviation increase in SES, we predict an increase of slightly more than half a standard deviation in math score.
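To make the unstandardized equation concrete, here is a small Python illustration (the SES values plugged in are invented for the example; they do not come from the data set).

    # Predicted math score from the fitted sample regression equation.
    b0, b1 = 51.277, 6.537        # intercept and slope from the Coefficients table below

    def predicted_math(ses):
        return b0 + b1 * ses

    print(predicted_math(0.0))    # 51.277: predicted score at SES = 0
    print(predicted_math(1.0))    # 57.814: about 6.5 points higher for one more SES point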
Coefficients (a)

                                       Unstandardized Coefficients   Standardized Coefficients
Model                                  B         Std. Error          Beta                        t        Sig.
1   (Constant)                         51.277    .520                                            98.552   .000
    SOCIO-ECONOMIC STATUS COMPOSITE     6.537    .662                .531                         9.870   .000

a. Dependent Variable: MATH STANDARDIZED SCORE
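As a quick check on this table (a Python sketch with the printed values; not part of the SPSS output), each t is just the coefficient divided by its standard error, and the slope's t ties back to the F test, as discussed next.

    # t statistics from the Coefficients table: t = coefficient / standard error.
    b1, se_b1 = 6.537, 0.662
    t_slope = b1 / se_b1          # about 9.87 (tiny differences from the printed 9.870
                                  # arise because SPSS divides unrounded values)

    # With a single predictor, F is the square of the slope's t statistic.
    print(round(9.870 ** 2, 2))   # 97.42, essentially the 97.423 in the ANOVA table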

Last, the table gives us the t tests for the slope and intercept. In multiple regression we will get individual tests for each predictor. The table does not tell us the df for the t; we need to know that the df for each t is the same as the df for the residuals in the F table above. Here the df is n - 2. Each t test examines the hypothesis H0: β = 0 for the coefficient in question. As we learned in class, the F is the square of the t when we have only one predictor; here that is 9.87 × 9.87 = 97.42. The two tests must also agree (when only one X is used). Here again we reject the null model and decide that SES is a good predictor of math score, with a slope that is not zero in the population.

Last we check our assumptions. In regression we make assumptions about the structural part of the model, that is, about the predictor(s). We assume that all of the important predictors are in our model and that no unimportant ones are included. This is usually NOT a good assumption for a bivariate regression!!

We also make assumptions about our errors. Specifically, we assume that the residuals are independent and normally distributed, and that they have equal variances for any X value. Therefore we make a normal plot (to get this we need to click on the [Plots] button in SPSS). These residuals don't look very normal; it is likely that other predictors could explain more variation in the data.

[Histogram of the regression standardized residuals. Dependent Variable: MATH STANDARDIZED SCORE. Mean = 0.00, Std. Dev = 1.00, N = 250.]

To check the assumption about variances we make a scatterplot. We plot the residuals on the Y axis and the predictor variable (or, equivalently, the predicted values) on the X axis. Using the [Plots] button, we select ZPRED for the X axis and ZRESID for the Y axis. We hope to find equal scatter in the points all along the horizontal axis. This plot looks pretty good!!
[Scatterplot of the regression standardized residuals (Y axis) against the regression standardized predicted values (X axis). Dependent Variable: MATH STANDARDIZED SCORE.]
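If you ever want to reproduce these two diagnostic plots outside of SPSS, a minimal Python sketch along the following lines would do it (it assumes the standardized residuals and predicted values have already been saved; the file names here are hypothetical, not from the SPSS session).

    import numpy as np
    import matplotlib.pyplot as plt

    # Hypothetical files of standardized residuals and standardized predicted values.
    z_resid = np.loadtxt("zresid.txt")
    z_pred = np.loadtxt("zpred.txt")

    # Histogram of standardized residuals: we hope it looks roughly normal.
    plt.figure()
    plt.hist(z_resid, bins=15)
    plt.xlabel("Regression Standardized Residual")
    plt.ylabel("Frequency")

    # Residuals vs. predicted values: we hope for an even band of scatter around zero.
    plt.figure()
    plt.scatter(z_pred, z_resid)
    plt.axhline(0)
    plt.xlabel("Regression Standardized Predicted Value")
    plt.ylabel("Regression Standardized Residual")
    plt.show()

Either way, the point of the checks is the same: roughly normal residuals and an even spread of points across the whole range of predicted values.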
