Total Variability
As in simple linear regression, the total variability in y splits into the sum of squares due to regression, SSR, and the sum of squares due to error, SSE. Both are printed on the Excel output.
The formulas are more complicated than in the simple case (they involve matrix operations).
Average Variability
Average variability (mean variability) for a group is defined as the total variability divided by the degrees of freedom associated with that group:
Mean Squares Due to Regression: MSR = SSR/DFR
Mean Squares Due to Error: MSE = SSE/DFE
Degrees of Freedom
Total degrees of freedom: DF(Total) = n - 1 (always)
Degrees of freedom for regression: DFR = the number of factors in the regression (i.e., the number of xs in the linear regression)
Degrees of freedom for error: DFE = the difference between the two = DF(Total) - DFR
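The bookkeeping above can be sketched in a few lines of Python. The function name anova_summary and the SSR and SSE values in the usage line are hypothetical, chosen only for illustration; n = 38 and k = 3 are picked so that the degrees of freedom come out as 3 and 34.

```python
def anova_summary(ssr, sse, n, k):
    """Degrees of freedom and mean squares for a regression with k xs and n observations."""
    df_total = n - 1          # DF(Total) always equals n - 1
    dfr = k                   # one degree of freedom per x in the model
    dfe = df_total - dfr      # whatever is left belongs to the error term
    return {
        "DF(Total)": df_total,
        "DFR": dfr,
        "DFE": dfe,
        "MSR": ssr / dfr,     # mean squares due to regression
        "MSE": sse / dfe,     # mean squares due to error
    }

# Hypothetical sums of squares, for illustration only.
summary = anova_summary(ssr=300.0, sse=680.0, n=38, k=3)
print(summary)
```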
The F-Statistic
The F-statistic is defined as the ratio of two measures of variability. Here,
F = MSR/MSE
Recall that if MSR is large compared to MSE, at least one β ≠ 0. Thus, if F is large, we conclude that HA is true, i.e., at least one β ≠ 0.
The F-test
Large compared to what? F-tables give critical values for given values of α.
TEST: REJECT H0 (Accept HA) if: F = MSR/MSE > Fα,DFR,DFE
RESULTS
If we do not get a large F statistic
We cannot conclude that any of the variables in this model are significant in predicting y.
F = MSR/MSE
Results
The F statistic is 20.89762. This is compared to F.05,3,34.
The F.05 table does not list F.05,3,34, but F.05,3,30 = 2.92 and F.05,3,40 = 2.84, and 20.89762 exceeds both. The exact value of F.05,3,34 can be calculated in Excel: FINV(.05,3,34) = 2.882601.
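When a table skips the row you need, a rough estimate can be read between the two neighboring rows. The sketch below uses straight-line interpolation on DFE, which is an assumption made here for illustration and only approximates the exact Excel value; the function name is hypothetical.

```python
def interpolate_critical_value(dfe, dfe_low, f_low, dfe_high, f_high):
    """Linearly interpolate a critical value between two tabulated DFE rows."""
    fraction = (dfe - dfe_low) / (dfe_high - dfe_low)
    return f_low + fraction * (f_high - f_low)

# Table rows from the text: F.05,3,30 = 2.92 and F.05,3,40 = 2.84.
approx = interpolate_critical_value(34, 30, 2.92, 40, 2.84)
print(round(approx, 3))  # 2.888, close to FINV(.05,3,34) = 2.882601
```

The interpolated value is slightly off from the exact one, but the conclusion is the same: 20.89762 is far above either.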
USE SIGNIFICANCE F
This is the p-value for the F-test.
Significance F = 7.46 x 10^-8 = .0000000746 < .05
We can conclude that at least one x is useful in predicting y.
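For readers without Excel at hand, the p-value and the critical value can be approximated with the Python standard library alone. This is a minimal numerical sketch, not how Excel computes them: it assumes DFE >= 2, uses the identity P(F > f) = I_z(DFE/2, DFR/2) with z = DFE/(DFE + DFR*f), and integrates the incomplete beta function with Simpson's rule. The function names f_upper_tail and f_critical are hypothetical.

```python
import math

def f_upper_tail(f_value, dfr, dfe, steps=2000):
    """Approximate P(F > f_value) for an F distribution with (dfr, dfe) degrees of freedom.

    Assumes dfe >= 2 so the integrand is finite at 0; steps must be even.
    """
    if f_value <= 0:
        return 1.0
    a, b = dfe / 2.0, dfr / 2.0
    z = dfe / (dfe + dfr * f_value)   # upper limit of the beta integral, 0 < z < 1

    def g(t):
        return t ** (a - 1.0) * (1.0 - t) ** (b - 1.0)

    # Composite Simpson's rule on [0, z].
    h = z / steps
    total = g(0.0) + g(z)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * g(i * h)
    integral = total * h / 3.0

    # Divide by the complete beta function B(a, b) to regularize.
    log_beta = math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)
    return integral / math.exp(log_beta)

def f_critical(alpha, dfr, dfe):
    """Find the value whose upper-tail area is alpha, by bisection."""
    lo, hi = 0.0, 1000.0
    for _ in range(60):
        mid = (lo + hi) / 2.0
        if f_upper_tail(mid, dfr, dfe) > alpha:
            lo = mid      # tail area still too big: move right
        else:
            hi = mid
    return (lo + hi) / 2.0

p_value = f_upper_tail(20.89762, 3, 34)   # plays the role of Significance F
critical = f_critical(0.05, 3, 34)        # plays the role of FINV(.05,3,34)
print(p_value < 0.05, 20.89762 > critical)
```

Both routes give the same decision: comparing the p-value to .05 is equivalent to comparing F to the critical value.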
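t = (b3 - 0)/s_b3, where b3 is the estimated coefficient of x3 and s_b3 is its standard error. Reject H0: β3 = 0 if t < -t.025,DFE or t > t.025,DFE.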
Can it even happen that F says at least one variable is significant, but none of the ts indicate a useful variable?
YES
EXAMPLES IN WHICH THIS MIGHT HAPPEN:
Miles per gallon vs. horsepower and engine size
Salary vs. GPA and GPA in major
Income vs. age and experience
House price vs. square footage of house and land
Test 3 --What Proportion of the Overall Variability in y Is Due to Changes in the xs?
R2 = .442197
Overall, 44% of the total variation in sales price is explained by changes in square footage, land, and age of the house.
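As a small illustration of where such a number comes from, R2 can be computed from the two sums of squares on the regression output. The sketch assumes the usual decomposition SS(Total) = SSR + SSE for a model with an intercept; the function name and the input values are hypothetical.

```python
def r_squared(ssr, sse):
    """Share of the total variation in y explained by the regression: SSR / (SSR + SSE)."""
    return ssr / (ssr + sse)

# Hypothetical sums of squares, for illustration only.
print(r_squared(ssr=300.0, sse=680.0))
```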
Scatterplot: Sales vs Ad Dollars (vertical axis: Sales, $0 to $140,000)
Review
Are any of the xs useful in predicting y IN THIS MODEL?
Look at the p-value for the F-test (Significance F). Equivalently, F = MSR/MSE would be compared to Fα,DFR,DFE.
What proportion of the total variation in y can be explained by changes in the xs?
R2. Adjusted R2 takes into account the degrees of freedom that the error term loses when more terms are included in the model.
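The adjustment can be sketched with the standard formula, Adjusted R2 = 1 - (1 - R2)(n - 1)/(n - k - 1). The function name is hypothetical, and the sample size n = 50 in the usage line is an assumption made only for illustration, since the sample size behind R2 = .442197 is not given here.

```python
def adjusted_r_squared(r2, n, k):
    """Adjusted R2 for a model with k xs fitted to n observations."""
    return 1.0 - (1.0 - r2) * (n - 1) / (n - k - 1)

# R2 from the text with k = 3 xs; n = 50 is a hypothetical sample size.
adj = adjusted_r_squared(r2=0.442197, n=50, k=3)
print(adj)
```

Adjusted R2 is never larger than R2 when k >= 1, and the gap widens as more xs are added relative to n.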
1-regression equation