variable.
The issue of omitted variable bias occurs regardless of the size of
the sample.
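The claim that the bias does not shrink with sample size can be illustrated with a small simulation. This is only a sketch under made-up assumptions: the true model, the coefficients (2.0 and 3.0), the 0.8 link between the two regressors, and the function name `slope_when_x2_omitted` are all hypothetical and do not come from the slides.

```python
import numpy as np

def slope_when_x2_omitted(n, seed=0):
    """Regress y on x1 only, although the (hypothetical) true model also uses x2."""
    rng = np.random.default_rng(seed)
    x2 = rng.normal(size=n)
    x1 = 0.8 * x2 + rng.normal(size=n)          # x1 is correlated with x2
    y = 1.0 + 2.0 * x1 + 3.0 * x2 + rng.normal(size=n)
    X = np.column_stack([np.ones(n), x1])       # x2 is deliberately left out
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef[1]                              # estimated slope on x1

# The true slope on x1 is 2.0; the estimate stays away from it as n grows.
for n in (100, 10_000, 1_000_000):
    print(n, round(slope_when_x2_omitted(n), 3))
```

In this setup the estimated slope settles near 2.0 + 3.0 × 0.8 / 1.64, not near 2.0, so adding observations makes the estimate more precise but not less biased.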
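Minimize:
$$\sum_{i=1}^{n} e_i^2 = \sum_{i=1}^{n} \left(Y_i - \hat{Y}_i\right)^2 = \sum_{i=1}^{n} \left[Y_i - \left(b_0 + b_1 X_{1i} + b_2 X_{2i} + \dots + b_k X_{ki}\right)\right]^2$$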
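As a rough illustration of the minimization, the sketch below fits a multiple regression with NumPy's least-squares solver and checks that nudging the fitted coefficients raises the sum of squared residuals. The data, the "true" coefficients, and the helper name `sum_squared_residuals` are invented for the example.

```python
import numpy as np

rng = np.random.default_rng(42)
n, k = 200, 3
X = rng.normal(size=(n, k))
true_b = np.array([0.5, 1.0, -2.0, 0.3])    # b0, b1, b2, b3 (made up)
y = true_b[0] + X @ true_b[1:] + rng.normal(scale=0.1, size=n)

A = np.column_stack([np.ones(n), X])        # column of ones gives the intercept b0
b, *_ = np.linalg.lstsq(A, y, rcond=None)   # coefficients minimizing sum of e_i^2

def sum_squared_residuals(coefs):
    """Sum over i of [Y_i - (b0 + b1*X_1i + ... + bk*X_ki)]^2."""
    residuals = y - A @ coefs
    return float(residuals @ residuals)

ssr_ols = sum_squared_residuals(b)
ssr_perturbed = sum_squared_residuals(b + 0.01)   # any other coefficients do worse
print(b, ssr_ols, ssr_perturbed)
```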
Coefficient of Determination, R2
R2 is calculated the same way as in simple linear regression.
$$R^2 = \frac{\text{total variance} - \text{unexplained variance}}{\text{total variance}} = \frac{TSS - SSR}{TSS} = \frac{\text{explained variance}}{\text{total variance}} = \frac{ESS}{TSS}$$
Adjusted R2
Unfortunately, R2 by itself may not be a reliable measure of the explanatory power of the
multiple regression model. This is because R2 almost always increases as independent
variables are added to the model, even if the marginal contribution of the new variables is
not statistically significant. Consequently, a relatively high R2 may reflect the impact of a
large set of independent variables rather than how well the set explains the dependent
variable. This problem is often referred to as overestimating the regression. To overcome
the problem of overestimating the impact of additional variables on the explanatory
power of a regression model, many researchers recommend adjusting R2 for the number
of independent variables. The adjusted R2 value is expressed as:
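$$R_a^2 = 1 - \left[\frac{n-1}{n-k-1}\right]\left(1 - R^2\right)$$
where n is the number of observations and k is the number of independent variables.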
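The behaviour described above can be sketched with simulated data: adding regressors that are pure noise cannot lower R2, while adjusted R2 applies a penalty for the extra variables. Everything in the block (the data, the four noise columns, the helper `fit_r2`) is an assumption made for illustration.

```python
import numpy as np

def fit_r2(X, y):
    """Return (R2, adjusted R2) for an OLS fit of y on X plus an intercept."""
    n, k = X.shape
    A = np.column_stack([np.ones(n), X])
    b, *_ = np.linalg.lstsq(A, y, rcond=None)
    ssr = float(np.sum((y - A @ b) ** 2))       # unexplained variation
    tss = float(np.sum((y - y.mean()) ** 2))    # total variation
    r2 = 1.0 - ssr / tss
    adj = 1.0 - (n - 1) / (n - k - 1) * (1.0 - r2)
    return r2, adj

rng = np.random.default_rng(7)
n = 60
x = rng.normal(size=(n, 1))
y = 1.0 + 2.0 * x[:, 0] + rng.normal(size=n)
noise = rng.normal(size=(n, 4))                 # regressors unrelated to y

r2_small, adj_small = fit_r2(x, y)
r2_big, adj_big = fit_r2(np.column_stack([x, noise]), y)
print(r2_small, adj_small)
print(r2_big, adj_big)
```

Comparing the two printed rows shows how much of any rise in R2 survives the adjustment for the number of independent variables.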
Example: Calculating R2 and adjusted R2
An analyst runs a regression of monthly value-stock returns on five independent
variables over 60 months. The total sum of squares for the regression is 460, and
the sum of squared errors is 170. Calculate the R2 and adjusted R2.
Answer:
$$R^2 = \frac{460 - 170}{460} = 0.630 = 63.0\%$$
$$R_a^2 = 1 - \left[\frac{60-1}{60-5-1}\right](1 - 0.630) = 0.596 = 59.6\%$$
Example
Answer:
With nine independent variables, even though the R2 has increased from 63% to
65%, the adjusted R2 has decreased from 59.6% to 58.7%.
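The figures in the two examples can be reproduced with a few lines of Python. This is a minimal sketch: the function names are hypothetical, and the nine-variable case assumes the sample is still 60 months, which is consistent with the 58.7% quoted above.

```python
def r_squared(tss, ssr):
    """R2 = (TSS - SSR) / TSS."""
    return (tss - ssr) / tss

def adjusted_r_squared(r2, n, k):
    """Adjusted R2 = 1 - [(n - 1) / (n - k - 1)] * (1 - R2)."""
    return 1.0 - (n - 1) / (n - k - 1) * (1.0 - r2)

# Five independent variables, 60 months, TSS = 460, SSR = 170.
r2_five = r_squared(tss=460.0, ssr=170.0)
adj_five = adjusted_r_squared(r2_five, n=60, k=5)

# Nine independent variables, R2 = 65% (n = 60 assumed unchanged).
adj_nine = adjusted_r_squared(0.65, n=60, k=9)

print(f"R2 (5 vars) = {r2_five:.3f}, adjusted = {adj_five:.3f}")
print(f"R2 (9 vars) = 0.650, adjusted = {adj_nine:.3f}")
```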
201405
QUESTIONS 43 AND 44 REFER TO THE FOLLOWING INFORMATION
A bank analyst runs an ordinary least squares regression of the daily returns of the stock
on the daily returns on the S&P 500 index using the last 750 trading days of data. The
regression results are summarized in the following tables:
Predictor    Coefficient    Standard Error    t-statistic    p-value
Constant     0.0561         0.00294           19.09710       0.00000
             1.2054         0.00298           404.25225      0.00000

R2 = 87.86%

Analysis of Variance
Source            Degree of Freedom    Sum of Squares    Mean Square    F-statistic     p-value
Regression                             11.43939          11.43939       163419.87971    0.00000
Residual Error    749                  0.05425           0.00007
Total             750                  0.44677
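Two standard relationships help when reading a table like this one: each t-statistic is the coefficient divided by its standard error, and in a regression with a single independent variable the F-statistic equals the square of the slope's t-statistic. The sketch below checks both against the reported figures; small gaps are expected because the printed coefficients and standard errors are rounded. The variable names are hypothetical.

```python
# Values copied from the regression output above.
coef_const, se_const = 0.0561, 0.00294
coef_slope, se_slope = 1.2054, 0.00298
reported_t_const = 19.09710
reported_t_slope = 404.25225
reported_f = 163419.87971

# t-statistic = coefficient / standard error (approximate, inputs are rounded).
t_const = coef_const / se_const
t_slope = coef_slope / se_slope

# With one independent variable, F = t^2 for the slope coefficient.
f_from_t = reported_t_slope ** 2

print(f"t (constant): {t_const:.3f} vs reported {reported_t_const}")
print(f"t (slope):    {t_slope:.3f} vs reported {reported_t_slope}")
print(f"t^2 (slope):  {f_from_t:.3f} vs reported F {reported_f}")
```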
Detecting Multicollinearity
The most common sign of multicollinearity is a regression in which t-tests indicate
that none of the individual coefficients is significantly different from zero, while the R2 is
high.
Example:
Bob Watson runs a regression of mutual fund returns on average P/B, average P/E, and
average market capitalization, with the following results:
Variable       Coefficient    p-Value
Average P/B    3.52           0.15
Average P/E    2.78           0.21
Market Cap     4.03           0.11

R2 = 89.60%
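The rule of thumb above can be written as a simple check and applied to Bob Watson's results. This is only a sketch: the function name `looks_multicollinear`, the 5% significance level, and the 0.80 cut-off for a "high" R2 are assumptions, not values given in the slides.

```python
def looks_multicollinear(p_values, r2, alpha=0.05, high_r2=0.80):
    """Flag the classic symptom: no individually significant coefficient, yet a high R2.

    alpha and high_r2 are illustrative thresholds, not fixed standards.
    """
    none_significant = all(p > alpha for p in p_values)
    return none_significant and r2 >= high_r2

# p-values and R2 from the example table.
p_values = {"Average P/B": 0.15, "Average P/E": 0.21, "Market Cap": 0.11}
flag = looks_multicollinear(p_values.values(), r2=0.896)
print("Possible multicollinearity:", flag)
```

Here every p-value exceeds 0.05 while the R2 is 89.60%, so the check flags the regression as a candidate for multicollinearity.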
FRM