
Chapter 14

Multiple Regression
Section 14.5
14.1 The coefficients of independent variables in a multiple regression model are interpreted as the change in y
for a one-unit change in the corresponding independent variable when all other independent variables are
held constant. For example, B2 gives the change in y due to a one-unit change in x2 when x1, x3, ... , xk are
held constant.
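
The "held constant" interpretation can be checked numerically. The sketch below assumes the estimated equation from Exercise 14.7 in this chapter, ŷ = 15.065 + .167x1 − .132x2; the function name `predict` and the fixed value chosen for x1 are illustrative only.

```python
# Coefficients taken from the estimated equation in Exercise 14.7:
# y_hat = 15.065 + .167*x1 - .132*x2
a, b1, b2 = 15.065, 0.167, -0.132


def predict(x1, x2):
    """Return the predicted y for the given values of x1 and x2."""
    return a + b1 * x1 + b2 * x2


# Raise x2 by one unit while x1 stays fixed (87 is an arbitrary choice).
x1_fixed = 87
change = predict(x1_fixed, 55) - predict(x1_fixed, 54)

# The change in the predicted y equals b2, whatever value x1 is fixed at.
print(round(change, 3))
```

Repeating the calculation with a different fixed value of x1 gives the same change, which is what "held constant" means in a linear model.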

14.3 The independent variables can have a non-linear relationship but cannot be linearly related.

14.5 The following are the assumptions of a multiple regression model:


1. The mean of the probability distribution of ε is zero, that is, E(ε) = 0.
2. The errors associated with different sets of values of independent variables are independent.
Furthermore, these errors are normally distributed and have a constant standard deviation, which is
denoted by σε.
3. The independent variables are not linearly related.
4. There is no linear association between the random error term ε and each independent variable xi.

14.7 a. ŷ = 15.065 + .167x1 − .132x2

b. The value of a = 15.065 gives the value for y when x1 = 0 and x2 = 0. However, since x1 = 0 and
x2 = 0 do not occur together in the sample data, the estimate is invalid. The value b1 = .167 gives the
change in y for a one-unit change in x1 when x2 is held constant. The value b2 = −.132 gives the
change in y for a one-unit change in x2 when x1 is held constant.

c. se = 1.488, R² = .971, and adjusted R² (R̄²) = .964


d. ŷ = 15.065 + .167x1 − .132x2 = 15.065 + .167(87) − .132(54) = 22.466

e. ŷ = 15.065 + .167x1 − .132x2 = 15.065 + .167(95) − .132(49) = 24.462

f. df = n − k − 1 = 11 − 2 − 1 = 8
The 99% confidence interval for B1 is
b1 ± t sb1 = .167 ± (3.355)(.034) = .167 ± .114 = .053 to .281

g. Step 1: H0: B2 = 0, H1: B2 < 0
Step 2: Since σε is unknown, use the t distribution.
Step 3: For α = .01 with df = 8, the critical value of t is −2.896.
Step 4: t = (b2 − B2)/sb2 = −1.919
Step 5: Do not reject H0 since −1.919 > −2.896.
Conclude that B2 is not negative.
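
The arithmetic in parts d, e, and g can be reproduced with a few lines of Python. This is a minimal sketch: the coefficients, the observed t value, and the critical value are copied from the solution above (the critical value comes from a t table, not from the code), and the names `predict` and `reject_h0` are illustrative.

```python
# Estimated equation from part a: y_hat = 15.065 + .167*x1 - .132*x2
a, b1, b2 = 15.065, 0.167, -0.132


def predict(x1, x2):
    """Return the predicted y for the given x1 and x2."""
    return a + b1 * x1 + b2 * x2


y_part_d = predict(87, 54)  # part d
y_part_e = predict(95, 49)  # part e
print(round(y_part_d, 3), round(y_part_e, 3))

# Part g: left-tailed test of H0: B2 = 0 against H1: B2 < 0.
t_observed = -1.919  # test statistic reported in Step 4
t_critical = -2.896  # table value for alpha = .01, df = 8

# In a left-tailed test, H0 is rejected only when the observed t
# falls below (to the left of) the negative critical value.
reject_h0 = t_observed < t_critical
print("Reject H0" if reject_h0 else "Do not reject H0")
```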

14.9 a. ŷ = 11.258 + .011x1 + .199x2


b. The value of a = 11.258 gives the expected weekly sales for restaurants in areas with zero population
and a mean annual household income of $0. However, since the sample data does not include any
restaurants in areas with zero population and a mean annual household income of $0, the estimate is
invalid. The value b1 = .011 indicates that for each increase of 1000 in population, a restaurant's sales
are expected to increase by $11 when mean annual household income is held constant. The value b2 =
.199 indicates that for each increase of $1000 in mean annual household income, a restaurant's sales
are expected to increase by $199 when population is held constant.

c. se = 5.756, R² = .274, and adjusted R² (R̄²) = .092


d. ŷ = 11.258 + .011x1 + .199x2 = 11.258 + .011(50) + .199(55) = 22.753
The predicted sales for a restaurant with 50 thousand people living within a five-mile area surrounding
it and $55 thousand mean annual income of households in that area is $22,753.
e. ŷ = 11.258 + .011x1 + .199x2 = 11.258 + .011(45) + .199(60) = 23.693
The expected (mean) sales for all restaurants with 45 thousand people living within a five-mile area
surrounding them and $60 thousand mean annual income of households living in those areas is
$23,693.
f. df = n − k − 1 = 11 − 2 − 1 = 8
The 95% confidence interval for B2 is
b2 ± t sb2 = .199 ± (2.306)(.117) = .199 ± .270 = −.071 to .469

g. Step 1: H0: B1 = 0, H1: B1 ≠ 0
Step 2: Since σε is unknown, use the t distribution.
Step 3: For α = .01 with df = 8, the critical values of t are −3.355 and 3.355.
Step 4: t = (b1 − B1)/sb1 = .120
Step 5: Do not reject H0 since −3.355 < .120 < 3.355.
Conclude that B1 is not different from zero.
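
Parts f and g follow two general recipes: an interval of the form estimate ± t × (standard error), and a two-tailed decision rule that compares |t| with the positive critical value. The sketch below uses the figures reported above; the t values come from a t table, and the function names are illustrative rather than part of the textbook.

```python
def confidence_interval(estimate, t_value, std_error):
    """Return (lower, upper) for estimate +/- t_value * std_error."""
    margin = t_value * std_error
    return estimate - margin, estimate + margin


def two_tailed_reject(t_observed, t_critical):
    """Reject H0 when the observed t lies outside (-t_critical, t_critical)."""
    return abs(t_observed) > t_critical


# Part f: 95% confidence interval for B2 (t = 2.306 for df = 8).
lower, upper = confidence_interval(0.199, 2.306, 0.117)
print(round(lower, 3), round(upper, 3))

# Part g: two-tailed test for B1 at alpha = .01 (critical values +/-3.355).
reject_h0 = two_tailed_reject(0.120, 3.355)
print("Reject H0" if reject_h0 else "Do not reject H0")
```

Because the interval in part f runs from a negative to a positive value, it includes zero.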

Self-Review Test


1. c 2. a 3. c

4. A regression line obtained by using population data is called the population multiple regression model.
The estimated multiple regression model is obtained from sample data.

5. The regression coefficients in a multiple regression model are called the partial regression coefficients
because each of them gives the effect of the corresponding independent variable on the dependent variable
when all other independent variables are held constant.

6. R² is the proportion of the total sum of squares (SST) that is explained by the multiple regression model.
R̄² is the coefficient of multiple determination adjusted for degrees of freedom. R² generally increases as
more explanatory variables are added to the regression model, while the value of R̄² may increase,
decrease, or stay the same as more independent variables are added. R² is always non-negative; R̄² can
be negative.
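
The relationship between the two measures can be illustrated with the standard adjustment formula, R̄² = 1 − (1 − R²)(n − 1)/(n − k − 1). The sketch below assumes this formula and applies it to the values reported in Exercise 14.7 (R² = .971, n = 11, k = 2); the second call uses a made-up R² of .10 purely to show that the adjusted value can fall below zero.

```python
def adjusted_r_squared(r_squared, n, k):
    """Adjust R^2 for degrees of freedom.

    n is the number of observations and k is the number of
    independent variables in the model.
    """
    return 1 - (1 - r_squared) * (n - 1) / (n - k - 1)


# Values from Exercise 14.7; the solution reports an adjusted value of .964.
adj_14_7 = adjusted_r_squared(0.971, n=11, k=2)
print(round(adj_14_7, 3))

# Hypothetical weak fit (R^2 = .10 is not from the text): the penalty
# for k = 2 variables with only n = 11 observations pushes it below zero.
adj_weak = adjusted_r_squared(0.10, n=11, k=2)
print(round(adj_weak, 3))
```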

7. a. We would expect the relationship between sale price and lot size to be positive, the relationship
between sale price and living area to be positive, and the relationship between sale price and age to be
negative.
b. ŷ = 200.153 + 11.889x1 + .099x2 − 7.551x3
The signs of the coefficients of the independent variables obtained in the solution are consistent with
the expectations in part a.

c. The value of a = 200.153 gives the expected sale price of a house for a lot size of zero and living area
of zero at age zero. However, since x1 = 0, x2 = 0, and x3 = 0 do not occur together in the sample data,
the estimate is invalid. In fact, a lot size of zero and a living area of zero do not make sense. The
value b1 = 11.889 indicates that for an increase of one acre in the lot size, the sale price of a house is
expected to increase by $11,889 when living area and age are held constant. The value b2 = .099
indicates that for an increase of one square foot in living area, the sale price of a house is expected to
increase by $99 when lot size and age are held constant. The value b3 = −7.551 indicates that for an
increase of one year in age, the sale price of a house is expected to decrease by $7,551 when lot size
and living area are held constant.

d. se = 37.762, R² = .882, and adjusted R² (R̄²) = .842


e. ŷ = 200.153 + 11.889x1 + .099x2 − 7.551x3 = 200.153 + 11.889(2.5) + .099(3000) − 7.551(14)
= 421.162
The predicted sale price of a house that has a lot size of 2.5 acres, a living area of 3000 square feet,
and is 14 years old is $421,162.
f. ŷ = 200.153 + 11.889x1 + .099x2 − 7.551x3 = 200.153 + 11.889(2.2) + .099(2500) − 7.551(7)
= 420.952
The point estimate of the mean sale price of all houses that have a lot size of 2.2 acres, a living area of
2500 square feet, and are 7 years old is $420,952.
g. df = n − k − 1 = 13 − 3 − 1 = 9
The 99% confidence interval for B1 is
b1 ± t sb1 = 11.889 ± (3.250)(23.697) = 11.889 ± 77.015 = −65.126 to 88.904
The 99% confidence interval for B2 is
b2 ± t sb2 = .099 ± (3.250)(.043) = .099 ± .140 = −.041 to .239
The 99% confidence interval for B3 is
b3 ± t sb3 = −7.551 ± (3.250)(1.988) = −7.551 ± 6.461 = −14.012 to −1.090

h. The 98% confidence interval for A is
a ± t sa = 200.153 ± (2.821)(89.138) = 200.153 ± 251.458 = −51.305 to 451.611

i. Step 1: H0: B1 = 0, H1: B1 > 0
Step 2: Since σε is unknown, use the t distribution.
Step 3: For α = .01 with df = 9, the critical value of t is 2.821.
Step 4: t = (b1 − B1)/sb1 = .502
Step 5: Do not reject H0 since .502 < 2.821.
Conclude that B1 is not positive.
j. Step 1: H0: B2 = 0, H1: B2 > 0
Step 2: Since σε is unknown, use the t distribution.
Step 3: For α = .025 with df = 9, the critical value of t is 2.262.
Step 4: t = (b2 − B2)/sb2 = 2.319
Step 5: Reject H0 since 2.319 > 2.262.
Conclude that B2 is positive.
k. Step 1: H0: B3 = 0, H1: B3 < 0
Step 2: Since σε is unknown, use the t distribution.
Step 3: For α = .05 with df = 9, the critical value of t is −1.833.
Step 4: t = (b3 − B3)/sb3 = −3.799
Step 5: Reject H0 since −3.799 < −1.833.
Conclude that B3 is negative.
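
Parts g and i through k apply the same two calculations to each coefficient, so they can be checked in one loop. This is a sketch under stated assumptions: the estimates, standard errors, and table t values are those reported above, the dictionary layout is illustrative, and the t statistics are recomputed from the rounded figures, so they can differ slightly from the printed ones (for example, .099/.043 gives about 2.30 rather than 2.319).

```python
T_99 = 3.250  # table t value for a 99% interval with df = 9

# name: (estimate, standard error, critical t, tail of the test)
coefficients = {
    "b1": (11.889, 23.697, 2.821, "right"),   # part i, alpha = .01
    "b2": (0.099, 0.043, 2.262, "right"),     # part j, alpha = .025
    "b3": (-7.551, 1.988, -1.833, "left"),    # part k, alpha = .05
}

intervals = {}
decisions = {}
for name, (estimate, std_error, t_critical, tail) in coefficients.items():
    # 99% confidence interval: estimate +/- t * standard error.
    margin = T_99 * std_error
    intervals[name] = (estimate - margin, estimate + margin)

    # Test statistic under H0: B = 0.
    t_observed = estimate / std_error

    # Right-tailed: reject for large positive t; left-tailed: large negative t.
    if tail == "right":
        decisions[name] = t_observed > t_critical
    else:
        decisions[name] = t_observed < t_critical

for name in coefficients:
    low, high = intervals[name]
    verdict = "reject H0" if decisions[name] else "do not reject H0"
    print(f"{name}: {low:.3f} to {high:.3f}; {verdict}")
```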