This workbook demonstrates perfect and near multicollinearity between two independent variables.
It uses a subset of the data from Multireg.xls.
Both the Example1 and the Example2 sheets demonstrate perfect multicollinearity.
In Example1, one variable is directly proportional to a second variable.
In Example2, there is a slightly more complicated linear relationship between the two X variables.
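One way to see the exact linear dependence in both sheets is to regress one X variable on the other and check that the residuals vanish. The sketch below does this in Python with NumPy (an assumption, since the workbook itself uses Excel), using the Price and Income columns of Example1 and Example2. The helper name `linear_relation` is hypothetical.

```python
import numpy as np

# Price is the same in both sheets; Income differs (values from the workbook).
price = np.array([50, 50, 60, 60, 70, 70, 80, 80], dtype=float)
income_ex1 = np.array([10, 10, 12, 12, 14, 14, 16, 16], dtype=float)
income_ex2 = np.array([9, 9, 11, 11, 13, 13, 15, 15], dtype=float)


def linear_relation(x, z):
    """Fit z = a + b*x by least squares; return (a, b, largest absolute residual)."""
    design = np.column_stack([np.ones_like(x), x])
    coef, *_ = np.linalg.lstsq(design, z, rcond=None)
    resid = z - design @ coef
    return coef[0], coef[1], float(np.max(np.abs(resid)))


int1, slope1, err1 = linear_relation(price, income_ex1)
int2, slope2, err2 = linear_relation(price, income_ex2)
print(f"Example1: Income = {int1:.3f} + {slope1:.3f} * Price (max residual {err1:.1e})")
print(f"Example2: Income = {int2:.3f} + {slope2:.3f} * Price (max residual {err2:.1e})")
```

A residual that is zero up to rounding means Income adds no information beyond Price, which is what perfect multicollinearity means for these two sheets.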
The Table sheet uses the data from Example1 to demonstrate graphically that many different combinations of
b1 and b2 yield exactly the same minimum sum of squared residuals.
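The flat valley in the sum of squared residuals (SSR) can also be checked numerically. The sketch below, in Python with NumPy (an assumption, since the workbook uses Excel), fits Quantity Demanded on Income alone for the Example1 data and then builds several (b1, b2) pairs that imply the same combined slope. Because Price = 5 * Income in Example1, the model Q = b0 + b1*Price + b2*Income collapses to Q = b0 + (5*b1 + b2)*Income, so every pair with the same value of 5*b1 + b2 gives identical fitted values.

```python
import numpy as np

# Example1 data from the workbook.
price = np.array([50, 50, 60, 60, 70, 70, 80, 80], dtype=float)
income = np.array([10, 10, 12, 12, 14, 14, 16, 16], dtype=float)
qd = np.array([10.3, 11.8, 11.5, 12.6, 16.0, 9.3, 15.6, 15.8])


def ssr(b0, b1, b2):
    """Sum of squared residuals for Q = b0 + b1*Price + b2*Income."""
    resid = qd - (b0 + b1 * price + b2 * income)
    return float(resid @ resid)


# Income-only fit: np.polyfit returns [slope, intercept] for degree 1.
slope, intercept = np.polyfit(income, qd, 1)

# Every pair with 5*b1 + b2 == slope fits equally well.
for b1 in (-2.0, 0.0, 1.0, 2.0, 6.0):
    b2 = slope - 5.0 * b1
    print(f"b1 = {b1:5.1f}  b2 = {b2:8.3f}  SSR = {ssr(intercept, b1, b2):.4f}")
```

Each printed SSR should agree with the others and sit close to the minimum of about 27.1 reported in the workbook. Moving off the line 5*b1 + b2 = slope, for example b1 = 1 with b2 = 0, makes the SSR jump by several orders of magnitude.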
The NearMulti sheet demonstrates a case of near multicollinearity.
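For contrast, here is a sketch of the NearMulti case in Python with NumPy (an assumption, since the workbook uses Excel). Changing the last Income value from 16 to 16.1 breaks the exact dependence, so the design matrix has full column rank and least squares has a unique solution, yet the matrix remains badly conditioned. The workbook reports roughly b0 = 4.3896, b1 = -2.1856, b2 = 11.5689 with SSR 26.232; the values from this sketch may differ somewhat in the later digits.

```python
import numpy as np

# NearMulti data from the workbook: only the last Income value (16.1) breaks
# the exact relationship Income = 0.2 * Price.
price = np.array([50, 50, 60, 60, 70, 70, 80, 80], dtype=float)
income = np.array([10, 10, 12, 12, 14, 14, 16, 16.1])
y = np.array([10.3, 11.8, 11.5, 12.6, 16.0, 9.3, 15.6, 15.8])

X = np.column_stack([np.ones_like(price), price, income])

rank = np.linalg.matrix_rank(X)   # 3 means the columns are linearly independent
cond = np.linalg.cond(X)          # a large value signals near multicollinearity

coef, *_ = np.linalg.lstsq(X, y, rcond=None)
resid = y - X @ coef
ssr = float(resid @ resid)

print(f"rank = {rank}, condition number = {cond:.0f}")
print(f"b0 = {coef[0]:.4f}, b1 = {coef[1]:.4f}, b2 = {coef[2]:.4f}, SSR = {ssr:.3f}")
```

The individual coefficients on Price and Income are large and of opposite sign even though their combined effect is modest, which is the usual symptom of near multicollinearity.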
The Q&A sheet has questions pertaining to multicollinearity.
Excel 2003's LINEST function finds one solution, but doesn't directly warn the user about the multicollinearity.
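LINEST's silence can be contrasted with an explicit check. The following sketch (Python with NumPy, an assumption; `fit_with_rank_check` is a hypothetical helper, not part of Excel or NumPy) fits the Example1 data by least squares and warns when the design matrix is rank deficient. In that case `numpy.linalg.lstsq` still returns one solution, the minimum-norm one, much as LINEST returns one solution among infinitely many.

```python
import warnings

import numpy as np


def fit_with_rank_check(X, y):
    """Least-squares fit that warns when the columns of X are linearly dependent."""
    coef, _, rank, _ = np.linalg.lstsq(X, y, rcond=None)
    if rank < X.shape[1]:
        warnings.warn(
            f"design matrix has rank {rank} < {X.shape[1]} columns: "
            "perfect multicollinearity, coefficients are not unique"
        )
    return coef, rank


# Example1 data from the workbook (Income = 0.2 * Price exactly).
price = np.array([50, 50, 60, 60, 70, 70, 80, 80], dtype=float)
income = np.array([10, 10, 12, 12, 14, 14, 16, 16], dtype=float)
qd = np.array([10.3, 11.8, 11.5, 12.6, 16.0, 9.3, 15.6, 15.8])

X = np.column_stack([np.ones_like(price), price, income])
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    coef, rank = fit_with_rank_check(X, qd)

print("rank:", rank, "warnings raised:", len(caught))
print("one of infinitely many solutions:", coef)
```

The fitted values from this solution match those of a regression on Income alone, so the fit itself is fine; only the split between b1 and b2 is arbitrary.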
Example1

Price    Income    Quantity Demanded
50       10        10.3
50       10        11.8
60       12        11.5
60       12        12.6
70       14        16.0
70       14        9.3
80       16        15.6
80       16        15.8

Summary Statistics
        Price           Income             QD
        (cents/gal.)    ($1000s/person)    (100s gals./Month)
Mean    65.00           13.00              12.86
SD      11.95           2.39               2.63

LinEst Q = f(Price)
                Slope     Intercept
Coefficients    0.146     3.405
Estimated SE    0.067     4.433
R-Squared       0.439     2.126      RMSE
F               4.686     6          df
Reg SS          21.170    27.109     SSR

LinEst Q = f(Income)
                Slope     Intercept
Coefficients    0.728     3.405
Estimated SE    0.336     4.433
R-Squared       0.439     2.126      RMSE
F               4.686     6          df
Reg SS          21.170    27.109     SSR

[Charts: Q=f(P) and Q=f(Income), Quantity Demanded plotted against Price and against Income ($1000 p.h./per year)]

...s: A Pivot Table
Price          Income
               10       12       14       16       Grand Total
50             11.05                               11.05
60                      12.05                      12.05
70                               12.65             12.65
80                                        15.7     15.7
Grand Total    11.05    12.05    12.65    15.7     12.8625

b0 (Intercept)    3.405
b1 (Price)        1
b2 (Income)       -4.273
Multiple Regression

Price of       Income Per    Quantity Demanded    Predicted Y    Residual    Squared
Heating Oil    Capita        of Heating Oil                                  Residual
50             10            10.3                 10.68          -0.38       0.14
50             10            11.8                 10.68          1.12        1.25
60             12            11.5                 12.14          -0.64       0.40
60             12            12.6                 12.14          0.46        0.22
70             14            16.0                 13.59          2.41        5.81
70             14            9.3                  13.59          -4.29       18.40
80             16            15.6                 15.04          0.56        0.31
80             16            15.8                 15.04          0.76        0.57
                                                                 SSR         27.109

LinEst Output
                b2          b1        b0
Coefficients    1.3E+014    ###       3.6125
Estimated SE    3.0E+015    ###       ###
R-Squared       0.441       2.328     #N/A      RMSE
F               1.974       5         #N/A      df
Reg SS          21.401      27.099    #N/A      SSR

[Chart: Income ($1000 p.c./per year) plotted against Price, with trendline f(x) = 0.2x + 5.02429586778808E-015]
Table

b0 (Intercept)    3.405
b1 (Price)        2
b2 (Income)       -9.27

Price    Income    Y       Fitted Y    Residuals    Squared Residuals
50       10        10.3    10.7        -0.4         0.1
50       10        11.8    10.7        1.1          1.3
60       12        11.5    12.1        -0.6         0.4
60       12        12.6    12.1        0.5          0.2
70       14        16.0    13.6        2.4          5.8
70       14        9.3     13.6        -4.3         18.4
80       16        15.6    15.0        0.6          0.3
80       16        15.8    15.0        0.8          0.6
                                       SSR          27.1085

SSR for combinations of b1 (Price), across, and b2 (Income), down

b2 \ b1           -2         -1          0          1          2          3          4          5          6
   11             27     34,827    139,227    313,227    556,827    870,027  1,252,827  1,705,227  2,227,227
    6         34,827         27     34,827    139,227    313,227    556,827    870,027  1,252,827  1,705,227
    1        139,227     34,827         27     34,827    139,227    313,227    556,827    870,027  1,252,827
   -4        313,227    139,227     34,827         27     34,827    139,227    313,227    556,827    870,027
   -9        556,827    313,227    139,227     34,827         27     34,827    139,227    313,227    556,827
  -14        870,027    556,827    313,227    139,227     34,827         27     34,827    139,227    313,227
  -19      1,252,827    870,027    556,827    313,227    139,227     34,827         27     34,827    139,227
  -24      1,705,227  1,252,827    870,027    556,827    313,227    139,227     34,827         27     34,827
  -29      2,227,227  1,705,227  1,252,827    870,027    556,827    313,227    139,227     34,827         27

[Chart: surface of SSR (0 to 2,500,000) over b1 Price and b2 Income]

intercept    3.4
slope        15

x     y              fitted line    residuals       residuals squared
1     87.31525263    18.4           68.91525263     4749.312046
2     133.3523188    33.4           99.95231878     9990.46603
3     115.3881732    48.4           66.9881732      4487.415349
4     89.51438906    63.4           26.11438906     681.9613161
5     183.0803842    78.4           104.6803842     10957.98284
6     94.71872612    93.4           1.318726119     1.739038577
7     94.87461567    108.4          -13.5253843     182.9360212
8     225.8659367    123.4          102.4659367     10499.26818
9     91.62931046    138.4          -46.7706895     2187.4974
10    153.7234328    153.4          0.323432778     0.104608762
11    180.0987084    168.4          11.69870836     136.8597773
12    211.6802011    183.4          28.28020107     799.7697726
13    173.5201145    198.4          -24.8798855     619.0087033
14    203.3097495    213.4          -10.0902505     101.8131551
                                    SUM             45396.13423
Example2

Price    Income    Quantity Demanded
50       9         7.1
50       9         5.7
60       11        5.7
60       11        6.6
70       13        13.2
70       13        9.7
80       15        11.3
80       15        15.6

Summary Statistics
        Price           Income
        (cents/gal.)    ($1000s/person)
Mean    65.00           12.00
SD      11.95           2.39

LinEst Q = f(Price)
                Slope     Intercept
Coefficients    0.265     -7.830
Estimated SE    0.067     4.434
R-Squared       0.721     2.126      RMSE
F               15.479    6          df
Reg SS          69.960    27.119     SSR

[Chart: Q=f(P): A De..., trendline f(x) = 0.2...]
[Chart: Q=f(Income): ..., trendline f(x) = 1.32...]

...ivot Table
Price          Income
               9        11       13       15       Grand Total
50             11.05                               11.05
60                      12.05                      12.05
70                               12.65             12.65
80                                        15.7     15.7
Grand Total    11.05    12.05    12.65    15.7     12.8625
b1 (Price)
b2 (Income)
Price of       Income Per    Quantity Demanded    Predicted Y    Residual    Squared
Heating Oil    Capita        of Heating Oil                                  Residual
50             9             7.1                  5.40           1.70        2.91
50             9             5.7                  5.40           0.30        0.09
60             11            5.7                  8.04           -2.34       5.48
60             11            6.6                  8.04           -1.44       2.07
70             13            13.2                 10.68          2.52        6.33
70             13            9.7                  10.68          -0.98       0.97
80             15            11.3                 13.33          -2.03       4.12
80             15            15.6                 13.33          2.27        5.15
                                                                 SSR         27.119

[Chart: Price and I..., Income ($1000 p.c./per year) plotted against Price, with trendline f(x) = 0.2x - 1]

LINEST results de...
Excel 2003
                b2             b1        Intercept
Coefficients    ###            ###       ###
Estimated SE    ###            ###       ###
R-Squared       0.710          2.299     #N/A         RMSE
F               6.135210688    5         #N/A         df
Reg SS          64.833         26.418    #N/A         SSR
Income is zeroed out.
Earlier Versions of Excel
NearMulti

b0 (Intercept)    4.3896
b1 (Price)        -2.1856
b2 (Income)       11.5689

Price    Income    Y       Fitted Y    Residuals    Squared Residuals
50       10        10.3    10.8        -0.5         0.2
50       10        11.8    10.8        1.0          1.0
60       12        11.5    12.1        -0.6         0.3
60       12        12.6    12.1        0.5          0.3
70       14        16.0    13.4        2.6          7.0
70       14        9.3     13.4        -4.1         16.5
80       16        15.6    14.6        1.0          0.9
80       16.1      15.8    15.8        0.0          0.0
                                       SSR          26.232

SSR for combinations of b1 (Price), across, and b2 (Income), down

b2 \ b1         -4.2       -3.2       -2.2       -1.2       -0.2        0.8        1.8
   22             27     34,987    139,546    313,705    557,464    870,824  1,253,783
   17         34,747         26     34,906    139,385    313,464    557,144    870,423
   12        139,228     34,827         26     34,826    139,225    313,224    556,823
    7        313,469    139,388     34,907         26     34,746    139,065    312,984
    2        557,470    313,709    139,549     34,988         27     34,667    138,906
   -3        871,232    557,791    313,951    139,710     35,069         28     34,588
   -8      1,254,755    871,634    558,113    314,192    139,872     35,151         30

[Chart: surface of SSR (0 to 1,400,000) over b1 Price and b2 Income]

intercept    3.4
slope        15

x     y              fitted line    residuals       residuals squared
1     87.31525263    18.4           68.91525263     4749.312046
2     133.3523188    33.4           99.95231878     9990.46603
3     115.3881732    48.4           66.9881732      4487.415349
4     89.51438906    63.4           26.11438906     681.9613161
5     183.0803842    78.4           104.6803842     10957.98284
6     94.71872612    93.4           1.318726119     1.739038577
7     94.87461567    108.4          -13.5253843     182.9360212
8     225.8659367    123.4          102.4659367     10499.26818
9     91.62931046    138.4          -46.7706895     2187.4974
10    153.7234328    153.4          0.323432778     0.104608762
11    180.0987084    168.4          11.69870836     136.8597773
12    211.6802011    183.4          28.28020107     799.7697726
13    173.5201145    198.4          -24.8798855     619.0087033
14    203.3097495    213.4          -10.0902505     101.8131551
                                    SUM             45396.13423
1.814356276
Q&A

X1    X2      Y
22    47      345
28    62      450
28    62      450
29    64.5    467.5
22    47      345
21    44.5    327.5
27    59.5    432.5
27    59.5    432.5
24    52      380
23    49.5    362.5
29    64.5    467.5
21    44.5    327.5
26    57      415
23    49.5    362.5
27    59.5    432.5
20    42      310
24    52      380
21    44.5    327.5
27    59.5    432.5
27    59.5    432.5

LINEST Results
2.666667    10.83333
0.845428    2.11357
1           1.7E-014
9.0E+031    17
52430       5.0E-027
2. What does the zero mean in the top row (cell F7)?
3. Use the regression results to predict the value of Y for the first observation. Show your work and comment on how well the
4. Does R² = 1 mean that there is perfect multicollinearity between the X's? Explain.
5. Can you recover the linear relationship between X1 and X2? If so, what is it?
me for Example 2.
Note: If you are not using Excel 2003, you may get different results.
In Excel 2003, the results look like this: