
5. Regression through the origin & Multiple Regression
(Partial and overall significance test)

M. Balcılar, ECON 503, EMU


5.2
Regression through the origin
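The intercept term is absent or zero, i.e.,
$Y_i = \beta_2 X_i + u_i$
SRF: $\hat{Y}_i = \hat{\beta}_2 X_i$

[Figure: the SRF is a straight line through the origin with slope $\hat{\beta}_2$; horizontal axis $X_i$, vertical axis $Y_i$.]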
5.3
Regression through the origin
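The estimated model:
$\tilde{Y}_i = \tilde{\beta}_2 X_i$
or $Y_i = \tilde{\beta}_2 X_i + \tilde{u}_i$
Applying the OLS method:
$\tilde{\beta}_2 = \dfrac{\sum X_i Y_i}{\sum X_i^2}$ and $\operatorname{Var}(\tilde{\beta}_2) = \dfrac{\tilde{\sigma}^2}{\sum X_i^2}$
and $\tilde{\sigma}^2 = \dfrac{\sum \tilde{u}_i^2}{N-1}$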
5.4
Some features of the interceptless model
1. $\sum \tilde{u}_i$ need not be zero.
2. $R^2$ can occasionally turn out to be negative, so it may not be appropriate as a summary statistic (goodness of fit).
3. The df does not include the constant term, i.e., $(n-k)$, where $k$ counts only the slope coefficients.

In practice:
1. Drop the intercept only when there is a very strong a priori or theoretical expectation; otherwise stick to the conventional model with an intercept.
2. If an intercept is included in the regression model but turns out to be statistically insignificant, we may drop the intercept and re-run the regression.
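The first two features can be checked numerically. The sketch below uses a small hypothetical data set (not from the lecture) in which Y has a large intercept; it fits the model without an intercept and shows that the residuals do not sum to zero and that the conventional $1 - RSS/TSS$ turns negative.

```python
# Hypothetical data: Y = 9 + X exactly, so forcing the line through the origin fits badly.
X = [1.0, 2.0, 3.0, 4.0, 5.0]
Y = [10.0, 11.0, 12.0, 13.0, 14.0]


def fit_through_origin(x, y):
    """Slope of Y = b2*X + u estimated by OLS: sum(XY) / sum(X^2)."""
    return sum(a * b for a, b in zip(x, y)) / sum(a * a for a in x)


b2 = fit_through_origin(X, Y)
residuals = [y - b2 * x for x, y in zip(X, Y)]

# Feature 1: the residuals need not sum to zero without an intercept.
residual_sum = sum(residuals)

# Feature 2: the conventional R^2 = 1 - RSS/TSS can be negative.
y_bar = sum(Y) / len(Y)
rss = sum(e * e for e in residuals)
tss = sum((y - y_bar) ** 2 for y in Y)
conventional_r2 = 1.0 - rss / tss

print(f"b2 = {b2:.4f}, sum of residuals = {residual_sum:.4f}, R^2 = {conventional_r2:.4f}")
```

The only normal equation that still holds is $\sum X_i \tilde{u}_i = 0$, which is why the residual sum is unconstrained.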
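5.5
Regression through origin

With intercept: $Y = \beta_1 + \beta_2 X + u_i$
Through the origin: $Y = \beta_2 X + u_i$

Slope estimator:
$\hat{\beta}_2 = \dfrac{\sum xy}{\sum x^2}$ (with intercept), $\tilde{\beta}_2 = \dfrac{\sum XY}{\sum X^2}$ (through the origin)

Variance of the slope estimator:
$\operatorname{Var}(\hat{\beta}_2) = \dfrac{\hat{\sigma}^2}{\sum x^2}$, $\operatorname{Var}(\tilde{\beta}_2) = \dfrac{\tilde{\sigma}^2}{\sum X^2}$

Error variance:
$\hat{\sigma}^2 = \dfrac{\sum \hat{u}^2}{n-2}$, $\tilde{\sigma}^2 = \dfrac{\sum \tilde{u}^2}{n-1}$

Goodness of fit:
$R^2 = \dfrac{\left[\sum (X - \bar{X})(Y - \bar{Y})\right]^2}{\sum (X - \bar{X})^2 \sum (Y - \bar{Y})^2}$ or $R^2 = \dfrac{(\sum xy)^2}{\sum x^2 \sum y^2}$ (with intercept)
raw $R^2 = \dfrac{(\sum XY)^2}{\sum X^2 \sum Y^2}$ (through the origin)

Lowercase letters denote deviations from sample means; uppercase letters denote raw values.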
5.6
[Figure: Y against X, with the true line having intercept $\hat{\beta}_1$ on the Y axis. False SRF: $\hat{Y} = \hat{\beta}_2 X$. True and best SRF: $\hat{Y} = \hat{\beta}_1 + \hat{\beta}_2 X$.]
5.7
[Figure: Y against X, with the true line having intercept $\hat{\beta}_1$ on the Y axis. True and best SRF: $\hat{Y} = \hat{\beta}_1 + \hat{\beta}_2 X$. False SRF: $\hat{Y} = \hat{\beta}_2 X$.]
5.8
[Figure: Y against X. True and best SRF: $\hat{Y} = \hat{\beta}_2 X$. False SRF: $\hat{Y} = \hat{\beta}_1 + \hat{\beta}_2 X$, with intercept $\hat{\beta}_1$ on the Y axis.]
5.9
Example 1: Capital Asset Pricing Model (CAPM)
The expected risk premium on security i is proportional to the expected market risk premium:

$(ER_i - r_f) = \beta_2 (ER_m - r_f)$

where
$ER_i$ = expected rate of return on security i
$r_f$ = risk-free rate of return
$ER_m$ = expected rate of return on the market portfolio

$\beta_2$ = a measure of systematic risk.
$\beta_2 > 1$ ==> implies a volatile or aggressive security.
$\beta_2 < 1$ ==> implies a defensive security.
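A minimal sketch of how the CAPM slope could be estimated as a regression through the origin. The excess-return numbers are hypothetical, the function names are illustrative, and the "neutral" label for $\beta_2 = 1$ is an assumption added here for completeness.

```python
def beta_through_origin(excess_market, excess_security):
    """OLS slope without intercept: sum(XY) / sum(X^2)."""
    num = sum(m * s for m, s in zip(excess_market, excess_security))
    den = sum(m * m for m in excess_market)
    return num / den


def classify(beta):
    """Label a security by its estimated beta, following the slide's rule."""
    if beta > 1:
        return "aggressive"
    if beta < 1:
        return "defensive"
    return "neutral"  # assumption: the slide does not name the beta = 1 case


# Hypothetical excess returns (ER_m - r_f) and (ER_i - r_f), in percent.
market = [2.0, -1.0, 3.0, 0.5, -2.0]
security = [1.5 * m for m in market]  # constructed so the true beta is 1.5

beta = beta_through_origin(market, security)
print(round(beta, 4), classify(beta))
```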
5.10

Example 1 (cont.)

[Figure: security market line; vertical axis $ER_i - r_f$, horizontal axis $ER_m - r_f$; slope $\beta_2$.]


5.11
Example 2: Covered Interest Parity
$(i - i^*) = \dfrac{F_N - e}{e} = f_N$
International interest rate differentials equal the exchange rate forward premium,
i.e., $(i - i^*) = \beta_2 \left( \dfrac{F_N - e}{e} \right)$

[Figure: covered interest parity line; vertical axis $i - i^*$, horizontal axis $(F_N - e)/e$; slope $\beta_2$.]
5.12
Example 2 (cont.)
In regression form, with $E(\hat{\beta}_1) = 0$:
$(i - i^*) = \beta_1 + \beta_2 \left( \dfrac{F_N - e}{e} \right) + u_i$
If covered interest parity holds, $\beta_1$ is expected to be zero.


5.13
Y: Return on A Future Fund, %
X: Return on Fisher Index, %

Formal report:
$\hat{Y} = 1.0899\,X$    $R^2 = 0.714$    $N = 10$
      (5.689)           SEE = 19.54
5.14

$H_0: \beta_1 = 0$

$t = \dfrac{1.279 - 0}{7.668}$

$\hat{Y} = 1.2797 + 1.0691\,X$    $R^2 = 0.715$    $N = 10$
     (0.166)   (4.486)          SEE = 20.69
(t-values in parentheses)

The t-value shows that $\hat{\beta}_1$ is not statistically significantly different from zero.
5.15

Multiple Regression

$Y = \beta_1 + \beta_2 X_2 + \beta_3 X_3 + \dots + \beta_k X_k + u$


5.16
Derive OLS estimators of multiple regression
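$Y = \hat{\beta}_1 + \hat{\beta}_2 X_2 + \hat{\beta}_3 X_3 + \hat{u}$

$\hat{u} = Y - \hat{\beta}_1 - \hat{\beta}_2 X_2 - \hat{\beta}_3 X_3$
OLS minimizes the residual sum of squares, RSS $= \sum \hat{u}^2$:
$\min \text{RSS} = \min \sum \hat{u}^2 = \min \sum (Y - \hat{\beta}_1 - \hat{\beta}_2 X_2 - \hat{\beta}_3 X_3)^2$

$\dfrac{\partial \text{RSS}}{\partial \hat{\beta}_1} = 2 \sum (Y - \hat{\beta}_1 - \hat{\beta}_2 X_2 - \hat{\beta}_3 X_3)(-1) = 0$

$\dfrac{\partial \text{RSS}}{\partial \hat{\beta}_2} = 2 \sum (Y - \hat{\beta}_1 - \hat{\beta}_2 X_2 - \hat{\beta}_3 X_3)(-X_2) = 0$

$\dfrac{\partial \text{RSS}}{\partial \hat{\beta}_3} = 2 \sum (Y - \hat{\beta}_1 - \hat{\beta}_2 X_2 - \hat{\beta}_3 X_3)(-X_3) = 0$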


5.17
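Rearranging the three equations:
$n\hat{\beta}_1 + \hat{\beta}_2 \sum X_2 + \hat{\beta}_3 \sum X_3 = \sum Y$
$\hat{\beta}_1 \sum X_2 + \hat{\beta}_2 \sum X_2^2 + \hat{\beta}_3 \sum X_2 X_3 = \sum X_2 Y$
$\hat{\beta}_1 \sum X_3 + \hat{\beta}_2 \sum X_2 X_3 + \hat{\beta}_3 \sum X_3^2 = \sum X_3 Y$

Rewritten in matrix form:

$\begin{pmatrix} n & \sum X_2 & \sum X_3 \\ \sum X_2 & \sum X_2^2 & \sum X_2 X_3 \\ \sum X_3 & \sum X_2 X_3 & \sum X_3^2 \end{pmatrix} \begin{pmatrix} \hat{\beta}_1 \\ \hat{\beta}_2 \\ \hat{\beta}_3 \end{pmatrix} = \begin{pmatrix} \sum Y \\ \sum X_2 Y \\ \sum X_3 Y \end{pmatrix}$

The upper-left 2x2 block (with the first two rows of each vector) is the 2-variable case; the full system is the 3-variable case.

Matrix notation: $(X'X)\hat{\beta} = X'Y$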


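5.18
Cramer's rule:

$\hat{\beta}_2 = \dfrac{\begin{vmatrix} n & \sum Y & \sum X_3 \\ \sum X_2 & \sum X_2 Y & \sum X_2 X_3 \\ \sum X_3 & \sum X_3 Y & \sum X_3^2 \end{vmatrix}}{\begin{vmatrix} n & \sum X_2 & \sum X_3 \\ \sum X_2 & \sum X_2^2 & \sum X_2 X_3 \\ \sum X_3 & \sum X_2 X_3 & \sum X_3^2 \end{vmatrix}} = \dfrac{(\sum y x_2)(\sum x_3^2) - (\sum y x_3)(\sum x_2 x_3)}{(\sum x_2^2)(\sum x_3^2) - (\sum x_2 x_3)^2}$

$\hat{\beta}_3 = \dfrac{\begin{vmatrix} n & \sum X_2 & \sum Y \\ \sum X_2 & \sum X_2^2 & \sum X_2 Y \\ \sum X_3 & \sum X_2 X_3 & \sum X_3 Y \end{vmatrix}}{\begin{vmatrix} n & \sum X_2 & \sum X_3 \\ \sum X_2 & \sum X_2^2 & \sum X_2 X_3 \\ \sum X_3 & \sum X_2 X_3 & \sum X_3^2 \end{vmatrix}} = \dfrac{(\sum y x_3)(\sum x_2^2) - (\sum y x_2)(\sum x_2 x_3)}{(\sum x_2^2)(\sum x_3^2) - (\sum x_2 x_3)^2}$

$\hat{\beta}_1 = \bar{Y} - \hat{\beta}_2 \bar{X}_2 - \hat{\beta}_3 \bar{X}_3$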
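The deviation-form solutions can be coded directly. The following is a sketch with hypothetical data generated from $Y = 1 + 2X_2 + 3X_3$ without error, so the estimators should recover those coefficients; the function and variable names are illustrative.

```python
def mean(v):
    return sum(v) / len(v)


def deviations(v):
    """Deviations from the sample mean (the lowercase variables)."""
    m = mean(v)
    return [a - m for a in v]


def cross(a, b):
    """Sum of cross products of two deviation series."""
    return sum(p * q for p, q in zip(a, b))


def ols_three_variable(Y, X2, X3):
    """OLS estimates for Y = b1 + b2*X2 + b3*X3 + u via the deviation formulas."""
    y, x2, x3 = deviations(Y), deviations(X2), deviations(X3)
    den = cross(x2, x2) * cross(x3, x3) - cross(x2, x3) ** 2
    if den == 0:
        raise ValueError("perfect collinearity between X2 and X3")
    b2 = (cross(y, x2) * cross(x3, x3) - cross(y, x3) * cross(x2, x3)) / den
    b3 = (cross(y, x3) * cross(x2, x2) - cross(y, x2) * cross(x2, x3)) / den
    b1 = mean(Y) - b2 * mean(X2) - b3 * mean(X3)
    return b1, b2, b3


# Hypothetical data with known coefficients (1, 2, 3) and no error term.
X2 = [1.0, 2.0, 3.0, 4.0, 5.0]
X3 = [2.0, 1.0, 4.0, 3.0, 6.0]
Y = [1.0 + 2.0 * a + 3.0 * b for a, b in zip(X2, X3)]

b1, b2, b3 = ols_three_variable(Y, X2, X3)
print(round(b1, 6), round(b2, 6), round(b3, 6))
```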
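5.19
Alternative Notation

Summation form: $Y_i = \sum_{j=1}^{k} \beta_j X_{ji} + u_i$, with $X_{1i} = 1 \;\; \forall i$

Matrix form: $Y = X\beta + u$
where

$Y = \begin{pmatrix} Y_1 \\ \vdots \\ Y_n \end{pmatrix}, \quad X = \begin{pmatrix} 1 & X_{21} & \cdots & X_{k1} \\ \vdots & \vdots & \ddots & \vdots \\ 1 & X_{2n} & \cdots & X_{kn} \end{pmatrix}, \quad \beta = \begin{pmatrix} \beta_1 \\ \vdots \\ \beta_k \end{pmatrix}, \quad u = \begin{pmatrix} u_1 \\ \vdots \\ u_n \end{pmatrix}$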
5.20
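or in matrix form:
$(X'X)\hat{\beta} = X'Y$, with dimensions (3x3)(3x1) = (3x1)
==> $\hat{\beta} = (X'X)^{-1}(X'Y)$, with dimensions (3x1) = (3x3)(3x1)

$\operatorname{Var-cov}(\hat{\beta}) = \hat{\sigma}_u^2 (X'X)^{-1}$ and $\hat{\sigma}_u^2 = \dfrac{\sum \hat{u}^2}{n-3}$

Variance-covariance matrix:

$\operatorname{Var-cov}(\hat{\beta}) = \begin{pmatrix} \operatorname{Var}(\hat{\beta}_1) & \operatorname{Cov}(\hat{\beta}_1, \hat{\beta}_2) & \operatorname{Cov}(\hat{\beta}_1, \hat{\beta}_3) \\ \operatorname{Cov}(\hat{\beta}_2, \hat{\beta}_1) & \operatorname{Var}(\hat{\beta}_2) & \operatorname{Cov}(\hat{\beta}_2, \hat{\beta}_3) \\ \operatorname{Cov}(\hat{\beta}_3, \hat{\beta}_1) & \operatorname{Cov}(\hat{\beta}_3, \hat{\beta}_2) & \operatorname{Var}(\hat{\beta}_3) \end{pmatrix} = \hat{\sigma}_u^2 (X'X)^{-1}$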
5.21
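$= \hat{\sigma}_u^2 \begin{pmatrix} n & \sum X_2 & \sum X_3 \\ \sum X_2 & \sum X_2^2 & \sum X_2 X_3 \\ \sum X_3 & \sum X_3 X_2 & \sum X_3^2 \end{pmatrix}^{-1}$

and $\hat{\sigma}_u^2 = \dfrac{\sum \hat{u}^2}{n-3} = \dfrac{\sum \hat{u}^2}{n-k}$

$k = 3$: number of independent variables (including the constant term)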
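As a rough illustration of the matrix formulas, the sketch below computes $\hat{\beta} = (X'X)^{-1}X'Y$, $\hat{\sigma}_u^2$ and the variance-covariance matrix in plain Python. The data are hypothetical, the helper names are illustrative, and the Gauss-Jordan inverse is just one simple way to invert a small matrix.

```python
def transpose(A):
    return [list(col) for col in zip(*A)]


def matmul(A, B):
    """Matrix product of two lists-of-rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def inverse(A):
    """Gauss-Jordan inverse with partial pivoting; raises if A is singular."""
    n = len(A)
    M = [[float(v) for v in row] + [1.0 if i == j else 0.0 for j in range(n)]
         for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        if abs(M[p][c]) < 1e-12:
            raise ValueError("X'X is singular (perfect collinearity)")
        M[c], M[p] = M[p], M[c]
        pivot = M[c][c]
        M[c] = [v / pivot for v in M[c]]
        for r in range(n):
            if r != c:
                factor = M[r][c]
                M[r] = [vr - factor * vc for vr, vc in zip(M[r], M[c])]
    return [row[n:] for row in M]


# Hypothetical sample: n = 6 observations, k = 3 parameters.
X2 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
X3 = [2.0, 1.0, 4.0, 3.0, 6.0, 5.0]
Y = [6.1, 4.9, 12.2, 9.8, 18.1, 15.9]

X = [[1.0, a, b] for a, b in zip(X2, X3)]  # first column is the constant
n, k = len(Y), 3
Xt = transpose(X)
XtX = matmul(Xt, X)
XtY = matmul(Xt, [[y] for y in Y])
XtX_inv = inverse(XtX)

beta_hat = [row[0] for row in matmul(XtX_inv, XtY)]
residuals = [y - sum(b * x for b, x in zip(beta_hat, row)) for y, row in zip(Y, X)]
sigma2_hat = sum(e * e for e in residuals) / (n - k)
var_cov = [[sigma2_hat * v for v in row] for row in XtX_inv]

print([round(b, 4) for b in beta_hat])
print([round(var_cov[i][i], 4) for i in range(k)])  # Var(b1), Var(b2), Var(b3)
```

The diagonal of `var_cov` holds the estimated variances, whose square roots are the standard errors used in the t-tests.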


5.22

$X_3 = 2X_2 \;\Rightarrow\; r_{23} = 1$ ???

$r_{23} = \operatorname{cor}(X_2, X_3) = \dfrac{\operatorname{cov}(X_2, X_3)}{\operatorname{sd}(X_2)\,\operatorname{sd}(X_3)} = \dfrac{\sum x_2 x_3}{\sqrt{\sum x_2^2 \sum x_3^2}}$
5.23
The meaning of partial regression coefficients
$Y = \beta_1 + \beta_2 X_2 + \beta_3 X_3 + u$ (suppose this is the true model)

$\dfrac{\partial Y}{\partial X_2} = \beta_2$: $\beta_2$ measures the change in the mean value of Y per unit change in $X_2$, holding $X_3$ constant;
or: the direct or net effect of a unit change in $X_2$ on the mean value of Y.

$\dfrac{\partial Y}{\partial X_3} = \beta_3$: holding $X_2$ constant, the direct effect of a unit change in $X_3$ on the mean value of Y.

Holding constant:
To assess the true contribution of $X_2$ to the change in Y, we control for the influence of $X_3$.
5.24
Properties of multiple OLS estimators
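1. The regression line (surface) passes through the means $\bar{Y}$, $\bar{X}_2$, $\bar{X}_3$ (regression through the mean),
i.e., $\hat{\beta}_1 = \bar{Y} - \hat{\beta}_2 \bar{X}_2 - \hat{\beta}_3 \bar{X}_3$
==> $\bar{Y} = \hat{\beta}_1 + \hat{\beta}_2 \bar{X}_2 + \hat{\beta}_3 \bar{X}_3$
2. $\hat{Y} = \bar{Y} + \hat{\beta}_2 x_2 + \hat{\beta}_3 x_3$
or $\hat{y} = \hat{\beta}_2 x_2 + \hat{\beta}_3 x_3$
3. $\sum \hat{u} = 0$
4. $\sum \hat{u} X_2 = \sum \hat{u} X_3 = 0$ (in general, $\sum \hat{u} X_k = 0$)
5. $\sum \hat{u} \hat{Y} = 0$

Side notes on the slide: linear in parameters; unbiased, $E(\hat{\beta}_i) = \beta_i$; zero mean of error; constant $\operatorname{Var}(u_i) = \sigma^2$; random sample.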
5.25
Properties of multiple OLS estimators
6. As $X_2$ and $X_3$ become more closely related, $\operatorname{var}(\hat{\beta}_2)$ and $\operatorname{var}(\hat{\beta}_3)$ become large, and infinite in the limit of perfect collinearity. The true values of $\beta_2$ and $\beta_3$ are therefore difficult to pin down.

All the assumptions of the two-variable regression case also apply to multiple regression, with one additional assumption:
No exact linear relationship among the independent variables
(no perfect collinearity, i.e., $X_k \neq \sum_{i=2}^{k-1} \lambda_i X_i = \lambda_2 X_2 + \dots + \lambda_{k-1} X_{k-1}$).

7. The greater the variation in the sample values of $X_2$ or $X_3$, the smaller the variances of $\hat{\beta}_2$ and $\hat{\beta}_3$, and the more precise the estimates.
8. BLUE (Gauss-Markov Theorem)


5.26
Example involving multicollinearity
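Add wealth to the Chap 3 example, so that family expenditures (Y) depend on income ($X_2$) and wealth ($X_3$).
There is no problem as long as income is not proportional to wealth,
but suppose that $X_3 = 2X_2$:
$Y_i = \beta_1 + \beta_2 X_{2i} + \beta_3 (2X_{2i}) + u_i = \beta_1 + (\beta_2 + 2\beta_3) X_{2i} + u_i$
$= \beta_1 + \alpha X_{2i} + u_i$, where $\alpha = \beta_2 + 2\beta_3$
Only $\alpha$ can be estimated; $\beta_2$ and $\beta_3$ cannot be recovered separately.

Venn diagram illustration
[Figure: two Venn diagrams of Y, $X_2$ and $X_3$. Left panel: linearly independent Xs. Right panel: partially collinear Xs. Note that the right panel is not perfect multicollinearity.]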
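A short numerical check of the $X_3 = 2X_2$ case, using hypothetical income values. It shows that $r_{23} = 1$ and that the denominator $(\sum x_2^2)(\sum x_3^2) - (\sum x_2 x_3)^2$ shared by the formulas for $\hat{\beta}_2$ and $\hat{\beta}_3$ is zero, so the two slopes cannot be computed separately.

```python
import math


def deviations(v):
    """Deviations from the sample mean."""
    m = sum(v) / len(v)
    return [a - m for a in v]


# Hypothetical income values; wealth is exactly twice income.
X2 = [1.0, 2.0, 3.0, 4.0, 5.0]
X3 = [2.0 * v for v in X2]

x2, x3 = deviations(X2), deviations(X3)
s22 = sum(a * a for a in x2)
s33 = sum(b * b for b in x3)
s23 = sum(a * b for a, b in zip(x2, x3))

r23 = s23 / math.sqrt(s22 * s33)   # correlation between X2 and X3
denominator = s22 * s33 - s23 ** 2  # denominator of the OLS slope formulas

print(r23, denominator)
```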
5.27
Two examples
Expectations-augmented Phillips curve
US data from 1970-1982 (13 annual observations)
Relate the inflation rate (Y) to the unemployment rate ($X_2$) and the expected inflation rate ($X_3$)

             Regr 1    Regr 2    Regr 3
Intercept     7.193     6.127     1.372
  se          1.595     4.285     1.604
Unemp        -1.392     0.245
  se          0.305     0.630
Exp infl      1.470               0.955
  se          0.176               0.226
R-Sq          0.877     0.014     0.619
SEE           1.171     3.156     1.960
5.28
Two examples
Help wanted advertising
Was the high Help Wanted Advertisement Index (HWI) of 1965 indicative of structural unemployment?
Relate NHWI (Y) to unemployment U ($X_2$) and new hiring H ($X_3$) over 55 quarters (1951-64)

             Regr 1    Regr 2    Regr 3
Intercept     0.591     0.231     0.574
  se          0.520 (only one value survives; its column is unclear)
Unemp        -8.428   -12.182
  se          1.211     1.474
New hires    11.703              17.102
  se          1.712               2.100
R-Sq          0.761     0.563     0.556
5.29
Specification bias
Omitted variable bias
The simple regression of Y on $X_2$ yields $\hat{\beta}_{12}$.
$\hat{\beta}_{12}$ will be a biased estimate of the true $\beta_2$ if
- $X_3$ belongs in the model, and
- the correlation between $X_2$ and $X_3$ is non-zero.
This can be seen from the derivation in Maddala (4.4):
$\hat{\beta}_{12} = \beta_2 + \beta_3 \hat{\beta}_{32} + \text{error}$
$E(\hat{\beta}_{12}) = \beta_2 + \beta_3 \hat{\beta}_{32}$
Even though the net effect of $X_2$ on Y is $\beta_2$, the gross effect
= direct effect of $X_2$ on Y ($= \beta_2$)
+ indirect effect of $X_2$ on Y ($= \beta_3 \hat{\beta}_{32}$)
See Maddala (4.4) for a numerical example.
5.30
Specification bias
Numerical example for the Phillips curve
simple regr effect of U on Infl: $\hat{\beta}_{12} = 0.245$
multiple regr effect of U on Infl: $\hat{\beta}_{2|3} = -1.392$
multiple regr effect of exp infl on Infl: $\hat{\beta}_{3|2} = 1.470$
simple regr effect of U on exp infl: $\hat{\beta}_{32} = 1.114$
Then
$\hat{\beta}_{12} = \hat{\beta}_{2|3} + \hat{\beta}_{3|2}\,(\hat{\beta}_{32})$
$0.245 = -1.392 + 1.47\,(1.114)$
Note from the sign on $\hat{\beta}_{32}$ that U and exp infl are positively correlated (64%). Does that make sense?
Example from Help wanted: $\hat{\beta}_{32} = -0.32$ and $\hat{\beta}_{23} = -0.64$ ($r = -0.45$)
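The decomposition can be verified with the reported numbers. The sketch below reproduces the Phillips curve arithmetic and, for the Help wanted example, uses the standard identity that the squared correlation equals the product of the two simple regression slopes ($r^2 = \hat{\beta}_{32}\hat{\beta}_{23}$); variable names are illustrative.

```python
import math

# Phillips curve: gross effect = direct effect + indirect effect.
beta_2_given_3 = -1.392   # multiple regression effect of U on inflation
beta_3_given_2 = 1.470    # multiple regression effect of expected inflation
beta_32 = 1.114           # simple regression of expected inflation on U

gross_effect = beta_2_given_3 + beta_3_given_2 * beta_32
print(round(gross_effect, 3))  # compare with the simple regression slope 0.245

# Help wanted: correlation implied by the two simple regression slopes.
b32, b23 = -0.32, -0.64
r = -math.sqrt(b32 * b23)  # negative root because both slopes are negative
print(round(r, 2))         # compare with the reported r = -0.45
```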


5.31
Polynomial regression
Adding higher-order terms to the model captures nonlinear patterns,
for example the relationship between COST and OUTPUT.
Any degree is possible as long as you have enough observations (more than the number of parameters):
$Y_i = \beta_0 + \beta_1 X_i + \beta_2 X_i^2 + \beta_3 X_i^3 + \dots + \beta_k X_i^k + u_i$
But won't $X$ and $X^2$ be correlated?
Yes, but not perfectly, which is what matters.


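5.32
The adjusted $R^2$ ($\bar{R}^2$) as one indicator of the overall fit

$R^2 = \dfrac{ESS}{TSS} = 1 - \dfrac{RSS}{TSS} = 1 - \dfrac{\sum \hat{u}^2}{\sum y^2}$

$\bar{R}^2 = 1 - \dfrac{\sum \hat{u}^2 / (n-k)}{\sum y^2 / (n-1)}$

k: number of independent variables plus the constant term.
n: number of observations.

$\bar{R}^2 = 1 - \dfrac{\hat{\sigma}^2}{S_Y^2}$

$\bar{R}^2 = 1 - \dfrac{\sum \hat{u}^2}{\sum y^2} \cdot \dfrac{n-1}{n-k}$

$\bar{R}^2 = 1 - (1 - R^2)\,\dfrac{n-1}{n-k}$

$\bar{R}^2 \le R^2$, $0 < R^2 < 1$
The adjusted $R^2$ can be negative: $\bar{R}^2 < 0$ is possible.
Note: Don't misuse the adjusted $R^2$; see Gujarati (2003), p. 222.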
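A minimal sketch of the last formula. The first call uses $R^2 = 0.707665$ with $n = 64$, $k = 3$ from the F-test example later in these slides; the second uses Regr 2 of the Phillips curve table ($R^2 = 0.014$, $n = 13$, $k = 2$) to show a negative adjusted $R^2$. The function name is illustrative.

```python
def adjusted_r2(r2, n, k):
    """Adjusted R^2 = 1 - (1 - R^2) * (n - 1) / (n - k); k includes the constant."""
    if n <= k:
        raise ValueError("need more observations than parameters")
    return 1.0 - (1.0 - r2) * (n - 1) / (n - k)


adj_main = adjusted_r2(0.707665, n=64, k=3)
adj_phillips_regr2 = adjusted_r2(0.014, n=13, k=2)

print(round(adj_main, 4))            # slightly below R^2
print(round(adj_phillips_regr2, 4))  # negative: a very poor fit
```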
5.33
One rule of thumb is to add an additional explanatory variable as long as the adjusted $R^2$ increases
(the adjusted $R^2$ will increase so long as the additional explanatory variable has a t-ratio greater than one in absolute size).

However, this is a mechanical rule (just as we said maximising the $R^2$ was) and should not be used for model building.

My advice would be to include variables on the basis of economic theory.
5.34

$Y = \beta_1 + \beta_2 X_2 + \beta_3 X_3 + u$

[Figure: variation in Y (TSS, $n-1$ df) split into the fitted part $\hat{Y}$ and the residual $\hat{u}$.]


5.35

Suppose $X_4$ is not an explanatory variable but is included in the regression.

Regressors: C, $X_2$, $X_3$, $X_4$


5.36
Hypothesis Testing in multiple regression:
1. Testing an individual partial coefficient
2. Testing the overall significance of all coefficients
3. Testing a restriction on variables (add or drop): $\beta_k = 0$?
4. Testing partial coefficients under some restrictions,
   such as $\beta_2 + \beta_3 = 1$;
   or $\beta_2 = -\beta_3$ (or $\beta_2 + \beta_3 = 0$); etc.
5. Testing the functional form of the regression model
6. Testing the stability of the estimated regression model
   -- over time
   -- in different cross-sections


5.37
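1. Individual partial coefficient test
(1) Holding $X_3$ constant: does $X_2$ have an effect on Y?

$H_0: \beta_2 = 0$, i.e., $\dfrac{\partial Y}{\partial X_2} = \beta_2 = 0$?
$H_1: \beta_2 \neq 0$

$t = \dfrac{\hat{\beta}_2 - 0}{\operatorname{se}(\hat{\beta}_2)} = \dfrac{0.726}{0.048} = 14.906$
(The coefficient and standard error shown are rounded; as printed their ratio is 15.125, so 14.906 presumably comes from unrounded estimates.)

Compare with the critical value $t^c_{0.025,\,12} = 2.179$.

Since $t > t^c$ ==> reject $H_0$

Answer: Yes, $\hat{\beta}_2$ is statistically significant, i.e., significantly different from zero.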
5.38
1. Individual partial coefficient test (cont.)
(2) Holding $X_2$ constant: does $X_3$ have an effect on Y?

$H_0: \beta_3 = 0$, i.e., $\dfrac{\partial Y}{\partial X_3} = \beta_3 = 0$?
$H_1: \beta_3 \neq 0$

$t = \dfrac{\hat{\beta}_3 - 0}{\operatorname{se}(\hat{\beta}_3)} = \dfrac{2.736 - 0}{0.848} = 3.226$

Critical value: $t^c_{0.025,\,12} = 2.179$

Since $|t| > |t^c|$ ==> reject $H_0$

Answer: Yes, $\hat{\beta}_3$ is statistically significant, i.e., significantly different from zero.
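The two partial tests follow the same recipe, sketched below with the rounded estimates and the critical value 2.179 quoted on the slides. The function names are illustrative.

```python
def t_statistic(estimate, std_error, null_value=0.0):
    """t = (estimate - hypothesised value) / standard error."""
    return (estimate - null_value) / std_error


def reject_null(t, critical_value):
    """Two-sided decision rule: reject H0 when |t| exceeds the critical value."""
    return abs(t) > critical_value


T_CRITICAL = 2.179  # t(0.025, 12 df), as quoted on the slides

t_beta2 = t_statistic(0.726, 0.048)
t_beta3 = t_statistic(2.736, 0.848)

for name, t in (("beta2", t_beta2), ("beta3", t_beta3)):
    decision = "reject H0" if reject_null(t, T_CRITICAL) else "do not reject H0"
    print(f"{name}: t = {t:.3f} -> {decision}")
```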
5.39
2. Testing overall significance of the multiple regression

3-variable case: $Y = \beta_1 + \beta_2 X_2 + \beta_3 X_3 + u$
$H_0: \beta_2 = 0,\ \beta_3 = 0$ (all variables have zero effect)
$H_1: \beta_2 \neq 0$ or $\beta_3 \neq 0$ (at least one coefficient is not zero, i.e., at least one variable has an effect)

1. Compute the F-statistic.
2. Look up the critical value $F^c_{\alpha,\,k-1,\,n-k}$.
3. Compare $F$ and $F^c$: if $F > F^c$ ==> reject $H_0$.


5.40
F-Tests

F-tests of this type are always right-tailed, even for left-sided or two-sided hypotheses, because any deviation from the null will make the F value bigger (move rightward).

F follows the F-distribution with $k-1$ (numerator) d.f. and $n-k$ (denominator) d.f.

[Figure: density $f(F)$ against $F$, starting at 0; the area to the left of $F^c$ is $1 - \alpha$ and the rejection region lies in the right tail.]
5.41
Analysis of Variance: since $y = \hat{y} + \hat{u}$
==> $\sum y^2 = \sum \hat{y}^2 + \sum \hat{u}^2$
TSS = ESS + RSS

ANOVA TABLE
Source of variation        Sum of squares (SS)    df     Mean sum of squares (MSS)
Due to regression (ESS)    $\sum \hat{y}^2$       k-1    $\sum \hat{y}^2 / (k-1)$
Due to residuals (RSS)     $\sum \hat{u}^2$       n-k    $\sum \hat{u}^2 / (n-k) = \hat{\sigma}_u^2$
Total variation (TSS)      $\sum y^2$             n-1

Note: k is the total number of parameters including the intercept term.

$F = \dfrac{\text{MSS of ESS}}{\text{MSS of RSS}} = \dfrac{ESS / (k-1)}{RSS / (n-k)} = \dfrac{\sum \hat{y}^2 / (k-1)}{\sum \hat{u}^2 / (n-k)}$

$H_0: \beta_2 = \dots = \beta_k = 0$
$H_1$: not all of $\beta_2, \dots, \beta_k$ are zero
If $F > F^c_{k-1,\,n-k}$ ==> reject $H_0$
5.42
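Three-variable case:
$y = \hat{\beta}_2 x_2 + \hat{\beta}_3 x_3 + \hat{u}$
$\sum y^2 = \hat{\beta}_2 \sum x_2 y + \hat{\beta}_3 \sum x_3 y + \sum \hat{u}^2$
TSS = ESS + RSS

ANOVA TABLE
Source of variation    SS                                                        df (k=3)       MSS
ESS                    $\hat{\beta}_2 \sum x_2 y + \hat{\beta}_3 \sum x_3 y$     3-1            ESS/(3-1)
RSS                    $\sum \hat{u}^2$                                          n-3 (= n-k)    RSS/(n-3)
TSS                    $\sum y^2$                                                n-1

$F\text{-statistic} = \dfrac{ESS / (k-1)}{RSS / (n-k)} = \dfrac{(\hat{\beta}_2 \sum x_2 y + \hat{\beta}_3 \sum x_3 y) / (3-1)}{\sum \hat{u}^2 / (n-3)}$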
5.43
An important relationship between $R^2$ and F

$F = \dfrac{ESS / (k-1)}{RSS / (n-k)} = \dfrac{ESS\,(n-k)}{RSS\,(k-1)}$
$= \dfrac{ESS}{TSS - ESS} \cdot \dfrac{n-k}{k-1}$
$= \dfrac{ESS/TSS}{1 - ESS/TSS} \cdot \dfrac{n-k}{k-1}$
$= \dfrac{R^2}{1 - R^2} \cdot \dfrac{n-k}{k-1}$

$F = \dfrac{R^2 / (k-1)}{(1 - R^2) / (n-k)}$

For the three-variable case: $F = \dfrac{R^2 / 2}{(1 - R^2) / (n-3)}$

Reverse: $R^2 = \dfrac{(k-1)\,F}{(k-1)\,F + (n-k)}$
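Both directions of this relationship are easy to code. The sketch below uses $R^2 = 0.707665$, $n = 64$ and $k = 3$, the values of the Gujarati example in these slides; the function names are illustrative.

```python
def f_from_r2(r2, n, k):
    """F = [R^2 / (k - 1)] / [(1 - R^2) / (n - k)]; k includes the constant."""
    return (r2 / (k - 1)) / ((1.0 - r2) / (n - k))


def r2_from_f(f, n, k):
    """Reverse relationship: R^2 = (k - 1) F / [(k - 1) F + (n - k)]."""
    return (k - 1) * f / ((k - 1) * f + (n - k))


f_value = f_from_r2(0.707665, n=64, k=3)
r2_back = r2_from_f(f_value, n=64, k=3)

print(round(f_value, 3))  # compare with the reported F* = 73.832
print(round(r2_back, 6))  # recovers the original R^2
```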
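5.44
Overall significance test:

$H_0: \beta_2 = \beta_3 = \beta_4 = 0$
$H_1$: at least one coefficient is not zero ($\beta_2 \neq 0$, or $\beta_3 \neq 0$, or $\beta_4 \neq 0$)

$F^* = \dfrac{R^2 / (k-1)}{(1 - R^2) / (n-k)} = \dfrac{0.9710 / 3}{(1 - 0.9710) / 16} = 179.13$
(With $R^2$ rounded to 0.9710 the ratio is about 178.6; the reported 179.13 presumably uses the unrounded $R^2$.)

$F^c_{0.05,\,4-1,\,20-4} = 3.24$, with $k-1 = 3$ and $n-k = 16$

Since $F^* > F^c$ ==> reject $H_0$.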
5.45
Example: Gujarati (2003), Table 6.4, p. 185

$H_0: \beta_2 = \beta_3 = 0$

$F^* = \dfrac{ESS / (k-1)}{RSS / (n-k)} = \dfrac{R^2 / (k-1)}{(1 - R^2) / (n-k)} = \dfrac{0.707665 / 2}{(1 - 0.707665) / 61}$

$F^* = 73.832$

$F^c_{0.05,\,3-1,\,64-3} = 3.15$, with $k-1 = 2$ and $n-k = 61$

Since $F^* > F^c$ ==> reject $H_0$.
5.46
Construct the ANOVA Table (8.4) (information from EViews)

Due to regression (ESS):
  SS = $R^2 (\sum y^2) = (0.707665)(75.97807)^2 \times 63 = 257362.2121$
  df = $k-1 = 2$
  MSS = $R^2 (\sum y^2) / (k-1) = 128681.1061$

Due to residuals (RSS):
  SS = $(1 - R^2)(\sum y^2)$ or $\sum \hat{u}^2 = (0.292335)(75.97807)^2 \times 63 = 106315.8165$
  df = $n-k = 61$
  MSS = $(1 - R^2)(\sum y^2) / (n-k) = 1742.8822$

Total (TSS):
  SS = $\sum y^2 = (75.97807)^2 \times 63 = 363678.0286$
  df = $n-1 = 63$

Since $(\sigma_y)^2 = \operatorname{Var}(Y) = \sum y^2 / (n-1)$ ==> $(n-1)(\sigma_y)^2 = \sum y^2$

$F^* = \dfrac{\text{MSS of regression}}{\text{MSS of residual}} = \dfrac{128681.1061}{1742.8822} = 73.832$
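The table can be rebuilt from three reported quantities: the standard deviation of Y (75.97807), $R^2$ (0.707665) and the sample size ($n = 64$, $k = 3$). The sketch below is one way to do the arithmetic; variable names are illustrative.

```python
sd_y = 75.97807   # sample standard deviation of Y, as reported
r2 = 0.707665     # R^2, as reported
n, k = 64, 3      # observations and parameters (including the constant)

tss = sd_y ** 2 * (n - 1)   # total sum of squares: (n - 1) * Var(Y)
ess = r2 * tss              # explained sum of squares
rss = (1.0 - r2) * tss      # residual sum of squares

mss_regression = ess / (k - 1)
mss_residual = rss / (n - k)
f_star = mss_regression / mss_residual

print(f"TSS = {tss:.4f}, ESS = {ess:.4f}, RSS = {rss:.4f}")
print(f"MSS(reg) = {mss_regression:.4f}, MSS(res) = {mss_residual:.4f}, F* = {f_star:.3f}")
```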
5.47
$Y = \beta_1 + \beta_2 X_2 + \beta_3 X_3 + u$
$H_0: \beta_2 = 0,\ \beta_3 = 0$
$H_1: \beta_2 \neq 0$ or $\beta_3 \neq 0$
Compare $F^*$ with $F^c$ from the F-table:
$F^c_{0.01,\,2,\,61} = 4.98$
$F^c_{0.05,\,2,\,61} = 3.15$
Decision rule:
Since $F^* = 73.832 > F^c = 4.98$ (3.15) ==> reject $H_0$

Answer: The estimated coefficients are jointly statistically significantly different from zero.


Partial Correlation Coefficient

Consider Y, $X_2$, $X_3$.
Assume $\operatorname{cor}(X_2, X_3) \neq 0$.
What is $\operatorname{cor}(Y, X_2)$ net of the effect of $X_3$ on $X_2$? This is known as the partial correlation between Y and $X_2$, denoted $r_{12.3}$.
Conceptually, it is similar to the partial regression coefficient.
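The formula itself does not survive in this section, so the sketch below relies on the standard first-order partial correlation formula, $r_{12.3} = \dfrac{r_{12} - r_{13} r_{23}}{\sqrt{(1 - r_{13}^2)(1 - r_{23}^2)}}$, as an assumption. Subscript 1 stands for Y, 2 for $X_2$ and 3 for $X_3$; the simple correlations are hypothetical.

```python
import math


def partial_correlation(r12, r13, r23):
    """First-order partial correlation between variables 1 and 2, holding 3 fixed."""
    denominator = math.sqrt((1.0 - r13 ** 2) * (1.0 - r23 ** 2))
    if denominator == 0:
        raise ValueError("undefined when variable 3 is perfectly correlated with 1 or 2")
    return (r12 - r13 * r23) / denominator


# Hypothetical simple correlations: cor(Y, X2), cor(Y, X3), cor(X2, X3).
r12_3 = partial_correlation(r12=0.5, r13=0.4, r23=0.3)
print(round(r12_3, 4))
```

When $r_{23} = 0$ and $r_{13} = 0$ the partial correlation reduces to the simple correlation $r_{12}$.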