2/10/2012
Intercept: a
Predicted value for Y when every X is 0
Regression coefficients: \(b_1, b_2, \ldots, b_k\)
The effect of each X on Y, holding all other X variables constant
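A minimal sketch of how the intercept and the regression coefficients enter a prediction. The numbers are made-up illustration values, not estimates from any data set, and `predict` is a hypothetical helper name.

```python
# Hypothetical intercept and coefficients, chosen only for illustration.
a = 10.0
b = [2.0, -3.0, 0.5]

def predict(x):
    """Predicted Y = a + b1*X1 + ... + bk*Xk."""
    return a + sum(bj * xj for bj, xj in zip(b, x))

# When every X is 0, the prediction is just the intercept a.
at_zero = predict([0.0, 0.0, 0.0])

# Raising X1 by one unit, with X2 and X3 held constant,
# changes the prediction by exactly b1.
base = predict([4.0, 1.0, 6.0])
bumped = predict([5.0, 1.0, 6.0])
effect_of_x1 = bumped - base

print(at_zero, effect_of_x1)
```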
Coefficient of Determination: \(R^2\)
F test
Tests whether the X variables, as a group, can predict Y better than just randomly
Standard errors of the regression coefficients: \(S_{b_1}, S_{b_2}, \ldots, S_{b_k}\) (with \(n - k - 1\) degrees of freedom)
Indicate the estimated sampling standard deviation of each regression coefficient
Used in the usual way to find confidence intervals and hypothesis tests for individual regression coefficients
Input Data
Intercept a = $4,043
Essentially a base rate, representing the cost of advertising in a magazine that has no audience, no male readers, and zero income level
But there are no such magazines; the intercept a is merely there to help achieve the best predictions
Predicted Page Costs for Audubon = $38,966
Audubon has Page Costs $13,651 lower than you would expect for a magazine with its characteristics (Audience, Percent Male, and Median Income)
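A residual is the actual value minus the predicted value. The short sketch below recovers the implied actual Page Costs, assuming that the $38,966 figure is the regression prediction for Audubon (an interpretation of the slide, not something it states outright).

```python
# Figures quoted in the text; reading 38,966 as Audubon's
# predicted Page Costs is an assumption.
predicted = 38_966
residual = -13_651  # actual is lower than predicted, so the residual is negative

# residual = actual - predicted, so actual = predicted + residual
actual = predicted + residual
print(actual)
```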
Standard error of estimate: \(S_e\) = S = $21,578
Actual Page Costs are about $21,578 from their predictions for this group of magazines (using regression)
Compare to \(S_Y\) = $45,446: Actual Page Costs are about $45,446 from their average (not using regression)
Using the regression equation to predict Page Costs, rather than the average, cuts the typical error from $45,446 to $21,578
Coefficient of Determination \(R^2\)
Indicates the percentage of the variation in Y that is explained by (or attributed to) all of the X variables
How well do the X variables explain Y?
For the magazine data, \(R^2\) = 0.787 = 78.7%
The X variables (Audience, Percent Male, and Median Income) taken together explain 78.7% of the variance of Page Costs
This leaves 100% − 78.7% = 21.3% of the variation in Page Costs unexplained
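As a consistency check, the coefficient of determination can be recovered from the two standard deviations quoted for the magazine data. The sketch relies on the standard identities \(S_e^2 = SSE/(n-k-1)\) and \(S_Y^2 = SST/(n-1)\), which are not stated on the slides, with n = 55 magazines and k = 3 X variables.

```python
n, k = 55, 3        # number of magazines and of X variables
s_e = 21_578.0      # standard error of estimate (using regression)
s_y = 45_446.0      # standard deviation of Y (not using regression)

# R^2 = 1 - SSE/SST, where SSE = (n-k-1)*Se^2 and SST = (n-1)*SY^2
sse_over_sst = ((n - k - 1) * s_e**2) / ((n - 1) * s_y**2)
r_squared = 1.0 - sse_over_sst
unexplained = 1.0 - r_squared

print(f"R^2 = {r_squared:.3f}, unexplained = {unexplained:.3f}")
```

Rounded to three decimals this reproduces the 78.7% explained and 21.3% unexplained figures.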
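Multiple regression linear model: \(Y = \alpha + \beta_1 X_1 + \beta_2 X_2 + \cdots + \beta_k X_k + \varepsilon\)
Where \(\varepsilon\) has a normal distribution with mean 0 and constant standard deviation \(\sigma\), and this randomness is independent from one case to another
An assumption needed for statistical inference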
Table 12.1.7
Uncertainty in Y
Do the X variables, taken together, explain a significant amount of the variation in Y?
The null hypothesis claims that, in the population, the X variables do not help explain Y; all coefficients are 0
\(H_0\colon \beta_1 = \beta_2 = \cdots = \beta_k = 0\)
The research hypothesis claims that, in the population, at least one of the X variables does help explain Y
\(H_1\colon\) at least one of \(\beta_1, \beta_2, \ldots, \beta_k \neq 0\)
Three equivalent methods for performing the F test; they always give the same result
The X variables (Audience, Percent Male, and Median Income), taken together, explain a very highly significant percentage of the variation in Page Costs
The p-value, listed as 0.000, is less than 0.0005, and is therefore very highly significant (since it is less than 0.001)
The \(R^2\) value, 78.7%, is greater than 27.1% (from the \(R^2\) table at level 0.1% with n = 55 and k = 3), and is therefore very highly significant
The F statistic, 62.84, is greater than the value (between 6.171 and 7.054) from the F table at level 0.1%, and is therefore very highly significant
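The \(R^2\) and F methods agree because F is a function of \(R^2\), n, and k. The sketch below uses the standard identity \(F = \dfrac{R^2/k}{(1-R^2)/(n-k-1)}\), which is not quoted on the slides; `f_statistic` is a hypothetical helper name.

```python
def f_statistic(r_squared, n, k):
    """F statistic for the overall test, computed from R^2."""
    explained_per_df = r_squared / k
    unexplained_per_df = (1.0 - r_squared) / (n - k - 1)
    return explained_per_df / unexplained_per_df

# Magazine data: R^2 = 0.787 (rounded), n = 55 magazines, k = 3 X variables.
f_value = f_statistic(0.787, n=55, k=3)
print(round(f_value, 2))
```

The result is about 62.8. It differs slightly from the reported 62.84 because the input \(R^2\) is rounded to three decimals.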
Does the jth X variable have a significant effect on Y, holding the other X variables constant?
Hypotheses are \(H_0\colon \beta_j = 0\), \(H_1\colon \beta_j \neq 0\)
Test using the confidence interval \(b_j \pm t\,S_{b_j}\), or the t statistic \(t_{\text{statistic}} = b_j / S_{b_j}\)
Use the t table with \(n - k - 1\) degrees of freedom
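A minimal sketch of the two equivalent ways to run the t test for one coefficient. The coefficient, its standard error, and the t table value are placeholders for illustration only. In practice the table value is looked up for \(n - k - 1\) degrees of freedom. `t_test` is a hypothetical helper name.

```python
def t_test(b_j, s_bj, t_table):
    """Return the confidence interval, the t statistic, and the test decision."""
    margin = t_table * s_bj
    interval = (b_j - margin, b_j + margin)
    t_stat = b_j / s_bj
    # Significant when |t| exceeds the table value; this is the same as
    # the confidence interval not containing 0.
    significant = abs(t_stat) > t_table
    return interval, t_stat, significant

# Placeholder inputs (not from the magazine data).
interval, t_stat, significant = t_test(b_j=3.0, s_bj=1.0, t_table=2.0)
excludes_zero = not (interval[0] <= 0.0 <= interval[1])
print(interval, t_stat, significant, excludes_zero)
```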
Standardized Regression Coefficients
Indicate relative importance of the information each X variable brings in addition to the others
Ordinary regression coefficients are in different units and cannot be compared without standardization
Standardized regression coefficient: \(b_j S_{X_j} / S_Y\)
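A minimal sketch of why standardization matters, assuming the standardized coefficient is \(b_j S_{X_j} / S_Y\). All numbers are hypothetical and chosen so that the coefficient with the larger raw magnitude turns out to be the less important one.

```python
def standardized_coefficient(b_j, s_xj, s_y):
    """Rescale b_j to 'standard deviations of Y per standard deviation of X_j'."""
    return b_j * s_xj / s_y

# Hypothetical values: the X variables are measured in very different units.
s_y = 40.0
raw = {"X1": 2.0, "X2": -50.0}   # ordinary regression coefficients
s_x = {"X1": 10.0, "X2": 0.2}    # standard deviations of each X

standardized = {
    name: standardized_coefficient(raw[name], s_x[name], s_y) for name in raw
}
print(standardized)
```

Here X2 has the larger raw coefficient in absolute value, but X1 has the larger standardized coefficient.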
Correlation Coefficients
Indicate relative importance of the information each X variable brings without adjusting for the other X variables
Multicollinearity
Variable Selection
Model Misspecification
Perhaps the multiple regression linear model is wrong