1. Regression Analysis

Course Overview

Date          Topic
              Regression Analysis (Lecture)
              Regression Analysis (Tutorial)
Tue. 22.04.   Event Studies (Lecture)
              Event Studies (Tutorial)
              Lecture
              Tutorial
Please bring your laptop to the tutorials to follow along during the Excel exercises.
People:
Lecture: Dr. Nikolas Breitkopf (breitkopf@bwl.lmu.de)
Tutorial: Janis Bauer (janis.bauer@bwl.lmu.de)
Grading:
60-minute exam
Exam date: 02.06.2014, 18:30-19:30 (please register for the exam via the LSF)
REGRESSION ANALYSIS
Content
Core questions
What is an estimator?
Properties of estimators
Which problems result from violation of OLS assumptions?
Agenda
Motivation
Ordinary Least Squares (OLS)
Effects of violation of assumptions
- Heteroscedasticity
- Correlation of the regressors with the error term (Endogeneity)
Literature
Basic literature
Barreto, H. and Howland, F. M.; Introductory Econometrics, Cambridge;
latest edition
Additional literature
Johnston, J. and DiNardo, J.; Econometric Methods, McGraw-Hill; latest
edition
Greene, W.; Econometric Analysis, Prentice Hall; latest edition
Motivation
What are regressions used for in the field of finance?
Estimation of beta of a stock
Asset pricing tests
Determinants of capital structure
Event studies
Determination of trends
Forecasting
Definition
A regression estimates the linear relationship between independent
variables (x) and the dependent variable (y).
The true relationship in the population is inferred from a sample.
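As a minimal sketch of this definition, the following estimates the beta of a stock (one of the finance applications listed above) by regressing stock returns on market returns. All data here are simulated and purely illustrative, not from the lecture:

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative data: market returns, and a stock whose returns
# follow the market with a true beta of 1.2 plus idiosyncratic noise
market = rng.normal(0.0, 0.01, size=250)
stock = 0.001 + 1.2 * market + rng.normal(0.0, 0.005, size=250)

# OLS slope and intercept for the model stock = a + b * market + e
b = np.cov(stock, market, ddof=1)[0, 1] / np.var(market, ddof=1)
a = stock.mean() - b * market.mean()
print(a, b)  # b should be close to the true beta of 1.2
```

The estimated slope b is the stock's beta; with 250 observations it lands close to the true value used in the simulation.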
[Table: example regression results with coefficients and t-statistics for a
constant and a Disclosure Requirement (Dummy) variable across several
specifications; the dummy coefficients are -0.20 (t = -3.42) and
-0.72 (t = -6.54), with fixed effects included ("Yes"), based on
427,164 and 4,716,000 observations for 1,306 and 15,185 stocks.]
Source: Beber/Pagano 2010, WP
Sampling experiment: eight experiments (E1-E8), each with 10 observations.
The last two rows give the sample mean and sample standard deviation of
each experiment.

Obs      E1      E2      E3      E4      E5      E6      E7      E8
1    176.27  185.11  223.63  207.93  181.65  254.48  197.73  233.67
2    185.75  182.41  222.41  214.81  202.57  189.30  195.00  164.47
3    200.09  215.10  212.60  187.94  236.40  217.17  234.46  212.61
4    201.17  203.69  193.54  182.40  188.33  202.42  199.96  218.67
5    232.02  197.46  214.99  208.36  196.82  216.74  199.91  166.11
6    202.67  225.78  219.02  165.95  189.38  168.10  210.96  221.77
7    210.55  236.46  246.32  191.26  219.76  202.98  198.10  227.80
8    203.00  202.90  218.72  218.34  228.23  203.15  210.51  194.88
9    166.32  179.36  219.29  197.47  204.82  194.76  184.78  173.09
10   221.19  162.50  209.35  194.19  201.30  199.27  151.77  199.35
Mean 199.90  199.08  217.99  196.86  204.93  204.84  198.32  201.24
SD    19.73   22.62   13.20   16.06   17.94   22.34   21.03   25.89

Mean of the eight sample means: 202.89
Standard Error (Experiments): 6.75
Standard Error (theoretical): 6.32
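The experiment above can be sketched in Python. The population parameters (mean 200, standard deviation 20) are assumptions chosen so that the theoretical standard error sigma/sqrt(n) = 20/sqrt(10) ≈ 6.32 matches the value reported above:

```python
import numpy as np

rng = np.random.default_rng(1)

# Assumed population: normal with mean 200 and standard deviation 20
# (values chosen to resemble the table, not taken from lecture data)
mu, sigma, n, experiments = 200.0, 20.0, 10, 1000

# Each row is one experiment of n = 10 draws; take its sample mean
sample_means = rng.normal(mu, sigma, size=(experiments, n)).mean(axis=1)

se_empirical = sample_means.std(ddof=1)  # spread of the sample means
se_theoretical = sigma / np.sqrt(n)      # sigma / sqrt(n), about 6.32

print(se_empirical, se_theoretical)
```

With many experiments the empirical standard error of the mean converges to the theoretical value, which is the point of the table.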
[Figure: estimated population mean]
Properties of estimators

Unbiasedness
- The expected value of the estimator equals the true parameter: E(b) = β.

Efficiency
- An efficient estimator has the smallest sampling variance among all
  unbiased estimators.
- Example: OLS is the best linear unbiased estimator (BLUE).

Consistency
- A (possibly biased) estimator is consistent if it converges
  asymptotically to the true parameter.
- Example: two estimators of the variance σ²:

  s1² = 1/(n-1) · Σ_{i=1}^{n} (x_i - x̄)²
  s2² = 1/n · Σ_{i=1}^{n} (x_i - x̄)²

  E(s1²) = σ² (unbiased), while E(s2²) = ((n-1)/n)·σ² (biased);
  but lim_{n→∞} E(s2²) = σ², so s2² is consistent.
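A small simulation can illustrate the difference between the unbiased estimator s1² and the biased-but-consistent s2². The true variance σ² = 4 and the sample sizes are arbitrary choices for illustration:

```python
import numpy as np

rng = np.random.default_rng(2)

# s1^2 (divide by n-1) is unbiased; s2^2 (divide by n) is biased
# but consistent: E(s2^2) = (n-1)/n * sigma^2 -> sigma^2 as n grows
sigma2 = 4.0

def avg_estimates(n, reps=20000):
    x = rng.normal(0.0, np.sqrt(sigma2), size=(reps, n))
    s1 = x.var(axis=1, ddof=1).mean()  # average of unbiased estimates
    s2 = x.var(axis=1, ddof=0).mean()  # average of biased estimates
    return s1, s2

s1_small, s2_small = avg_estimates(n=5)
s1_large, s2_large = avg_estimates(n=500)
print(s1_small, s2_small)  # s2 noticeably below 4 for n = 5
print(s1_large, s2_large)  # both close to 4 for n = 500
```

For n = 5 the biased estimator averages about (4/5)·4 = 3.2, while for n = 500 both estimators are essentially at the true value: biased, yet consistent.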
The simple regression y = a + bx + e

Properties
- Minimization of the residual sum of squares.
- The fitted line passes through the point (x̄, ȳ).

Slope: b = Σ(x_i - x̄)(y_i - ȳ) / Σ(x_i - x̄)²
Intercept: a = ȳ - b·x̄

[Figure: scatter plot of y against x with fitted line and residuals e1, e2]
The multiple regression model in matrix notation:

y = Xβ + u

with y an (n×1) vector, X an (n×k) matrix whose first column is a column
of ones (typical element x_{i,j}), β a (k×1) coefficient vector, and u an
(n×1) error vector.

Minimizing the residual sum of squares with respect to b:

∂RSS/∂b = -2X'y + 2X'Xb = 0  ⇒  (X'X)b = X'y  ⇒  b = (X'X)⁻¹X'y

Dimensions: X'X is (k×k), X'y is (k×1), and b is (k×1).
Matrix algebra refresher

Transpose: for A = | a  b |,  A' = | a  c |
                   | c  d |       | b  d |

Matrix product: for B = | e  f |,
                        | g  h |

AB = | ae+bg  af+bh |
     | ce+dg  cf+dh |

Identity matrix I:
AI = A,  IA = A

I = | 1  0  ...  0 |
    | 0  1  ...  0 |
    | ... ... ... |
    | 0  0  ...  1 |
Shape and interpretation

Shape: an (n×k) matrix times a (k×m) matrix gives an (n×m) matrix.

Examples:
- e'e: (1×n)(n×1) = (1×1), the residual sum of squares.
- X'X: (k×n)(n×k) = (k×k), the cross-product matrix of the regressors.
- uu': (n×1)(1×n) = (n×n), the outer product of the error vector.
Example: regression on a constant only.

X = (1, 1, 1)',  y = (1, 2, 3)'

OLS: b = (X'X)⁻¹X'y

(X'X)⁻¹ = ([1 1 1]·(1, 1, 1)')⁻¹ = (3)⁻¹ = 1/3
X'y = [1 1 1]·(1, 2, 3)' = 1 + 2 + 3 = 6
b = (1/3)·6 = 2 = ȳ
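The matrix formula and the constant-only example can be checked numerically; this is a direct transcription of b = (X'X)⁻¹X'y:

```python
import numpy as np

# b = (X'X)^{-1} X'y, reproducing the constant-only example:
# X is a column of ones, y = (1, 2, 3)'  ->  b = mean(y) = 2
X = np.ones((3, 1))
y = np.array([1.0, 2.0, 3.0])

# Solve (X'X) b = X'y instead of inverting X'X explicitly
b = np.linalg.solve(X.T @ X, X.T @ y)
print(b)  # [2.]
```

Solving the normal equations with `np.linalg.solve` is numerically preferable to forming the inverse, though both give b = ȳ = 2 here.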
Example:

X = | 1  1  0 |
    | 1  0  1 |
    | 1  1  0 |

Note that the second and third columns sum to the first (a constant plus a
full set of dummies), so X'X is singular and b = (X'X)⁻¹X'y cannot be
computed: perfect multicollinearity (the dummy variable trap).
Note:
- Var(u_i) = σ²: the variance of the error terms is constant
  (homoscedasticity).
- E(u_i·u_j) = 0 for i ≠ j: individual errors are independent;
  no autocorrelation.
Properties of OLS

Unbiasedness: E(b) = β

b = (X'X)⁻¹X'y  and  y = Xβ + u
⇒ b = (X'X)⁻¹X'(Xβ + u) = β + (X'X)⁻¹X'u
⇒ b - β = (X'X)⁻¹X'u

Taking expectations, E(b) = β provided that E(X'u) = 0.

Orthogonality of the residuals:

b = (X'X)⁻¹X'y ⇒ (X'X)b = X'y = X'(Xb + e) = (X'X)b + X'e
⇒ X'e = 0 and E(X'e) = 0, i.e. Cov(e, X) = 0.
The variance-covariance matrix of the errors:

E(uu') = | E(u1²)    E(u1u2)  ...  E(u1un) |    | σ²  0  ...  0  |
         | E(u2u1)   E(u2²)   ...  ...     |  = | 0   σ² ...  0  |  = σ²I
         | ...       ...      ...  ...     |    | ... ... ... ...|
         | E(unu1)   E(unu2)  ...  E(un²)  |    | 0   0  ...  σ² |

Inference

Estimation of σ² from the residuals: s² = e'e/(n - k), where s is the
so-called standard error of the regression.

Variance of the OLS coefficients:

var(b) = σ²(X'X)⁻¹,  estimated as  var(b) = s²(X'X)⁻¹
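The inference formulas can be sketched with simulated data; the data-generating process (intercept 1, slope 2, standard normal errors) is an arbitrary illustration:

```python
import numpy as np

rng = np.random.default_rng(3)

# Sketch: s^2 = e'e / (n - k) and var(b) = s^2 (X'X)^{-1};
# each coefficient's s.e. is the square root of the diagonal
n = 200
x = rng.normal(size=n)
y = 1.0 + 2.0 * x + rng.normal(size=n)

X = np.column_stack([np.ones(n), x])   # n x k design matrix, k = 2
b = np.linalg.solve(X.T @ X, X.T @ y)  # OLS coefficients
e = y - X @ b                          # residuals
s2 = (e @ e) / (n - X.shape[1])        # estimate of sigma^2
var_b = s2 * np.linalg.inv(X.T @ X)    # estimated var(b)
se = np.sqrt(np.diag(var_b))           # coefficient standard errors
t = b / se                             # t-statistics
print(b, se, t)
```

The t-statistics here are large because the true slope is far from zero relative to its standard error.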
Hypothesis testing

The standard errors of the coefficients are the square roots of the
diagonal elements of var(b). They can be used to calculate the t-statistic
of an estimate: t = b/se(b).

A joint hypothesis of the form Rβ = r can be tested with an F-statistic
distributed as F(q, n - k), where q is the number of restrictions and R is
a matrix of zeros and ones selecting the restricted coefficients (in the
slide's example, q = 4).
Simulation: y = α + βx + u with α = 1, β = 1

E(a) = 0.9921,  StdDev(a) = 0.1083
E(b) = 0.9993,  StdDev(b) = 0.1015
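A comparable Monte Carlo can be sketched as follows; the sample size and error distribution are assumptions, so the averages will be close to, but not exactly equal to, the numbers in the table:

```python
import numpy as np

rng = np.random.default_rng(4)

# Sketch of the simulation above: y = alpha + beta*x + u with
# alpha = beta = 1; across many samples the average OLS estimates
# should be close to the true parameters (unbiasedness)
alpha, beta, n, reps = 1.0, 1.0, 100, 1000
a_hat = np.empty(reps)
b_hat = np.empty(reps)

for r in range(reps):
    x = rng.normal(size=n)
    y = alpha + beta * x + rng.normal(size=n)
    b_hat[r] = np.cov(y, x, ddof=1)[0, 1] / np.var(x, ddof=1)
    a_hat[r] = y.mean() - b_hat[r] * x.mean()

print(a_hat.mean(), a_hat.std(ddof=1))  # mean near 1, spread near 0.1
print(b_hat.mean(), b_hat.std(ddof=1))
```

With N = 100 the standard deviation of each estimate is roughly 1/sqrt(100) = 0.1, matching the order of magnitude reported above.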
OLS makes the assumption Ω = σ²I:

Ω = E(uu') = σ²I = | σ²  0  ...  0  |
                   | 0   σ² ...  0  |
                   | ... ... ... ...|
                   | 0   0  ...  σ² |

var(b) = E[(X'X)⁻¹X'uu'X(X'X)⁻¹] = (X'X)⁻¹X'ΩX(X'X)⁻¹
       = σ²(X'X)⁻¹(X'X)(X'X)⁻¹ = σ²(X'X)⁻¹,  as AA⁻¹ = I
Heteroscedasticity

OLS assumption:
The error term u_i possesses a constant variance for all observations i
(homoscedasticity).

Example of a violation: σ_i² = f(x_i) = x_i²

Under heteroscedasticity:

E(uu') = | σ1²  0   ...  0   |
         | 0    σ2² ...  0   |
         | ...  ... ...  ... |
         | 0    0   ...  σn² |
REGRESSION ANALYSIS
Create 1000 samples of sample, each with a sample size of N = 100 observations
A regression from y to x is executed for each one of the 1000 data sets and the
resulting axis intercept and the slope is noted.
N=100
=1
1.0091
=1
0.99751
OLS s.e.
(avg. / incorrect)
0.60031
0.19875
OLS s.e.
(sim. distribution / correct)
0.38269
0.21949
White s.e.
0.37341
0.21262
Coefficients (OLS)
Standard errors of OLS are biased here: The estimated standard error of the
intercept is too large; the estimated standard error of the slope is too small.
Note: The coefficient estimates of OLS are still unbiased even in presence of
heteroscedasticity. However, for inference unbiased standard errors are essential.
24
Under heteroscedasticity the correct variance is the sandwich formula:

var(b) = (X'X)⁻¹ (X'ΩX) (X'X)⁻¹,

with Ω = | σ1²  0   ...  0   |
         | 0    σ2² ...  0   |
         | ...  ... ...  ... |
         | 0    0   ...  σn² |

The middle term S0 = X'ΩX has dimensions (k×n)(n×n)(n×k) = (k×k).

White's estimator replaces S0 with:

S0 = Σ_{i=1}^{n} e_i² x_i x_i'
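White's estimator can be sketched as follows. The heteroscedasticity σ_i² = x_i² follows the earlier example; sample size and coefficients are illustrative:

```python
import numpy as np

rng = np.random.default_rng(5)

# Heteroscedastic errors with Var(u_i | x_i) = x_i^2, as in the
# example sigma_i^2 = x_i^2
n = 1000
x = rng.normal(size=n)
u = rng.normal(size=n) * np.abs(x)
y = 1.0 + 1.0 * x + u

X = np.column_stack([np.ones(n), x])
XtX_inv = np.linalg.inv(X.T @ X)
b = XtX_inv @ X.T @ y
e = y - X @ b

# Conventional OLS standard errors (incorrect here)
s2 = (e @ e) / (n - 2)
se_ols = np.sqrt(np.diag(s2 * XtX_inv))

# White: S0 = sum_i e_i^2 x_i x_i' in the sandwich formula
S0 = (X * (e**2)[:, None]).T @ X
se_white = np.sqrt(np.diag(XtX_inv @ S0 @ XtX_inv))
print(se_ols, se_white)
```

With this form of heteroscedasticity the conventional standard error of the slope is markedly too small, while the White standard error recovers the correct magnitude, mirroring the table above.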
Endogeneity

Simulation experiment: y = β0 + β1·x + u with β0 = β1 = 0, and
x = δ·u + ε, with u, ε ~ N(0, 1).

S = 1,000 replications, δ ∈ {1, 0.5, 0.1}:

              OLS (δ = 1)   OLS (δ = 0.5)   OLS (δ = 0.1)
Parameter     0.5075        0.40766         0.10106
Avg. s.e.     0.050628      0.081588        0.10124
Std. Dev.     0.035988      0.067452        0.10102
1%tile        0.42086       0.24095         -0.14092
99%tile       0.59003       0.55394         0.32374

Because x is correlated with the error term u, OLS is biased: although
the true slope is zero, the estimates converge to δ/(δ² + 1).
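A sketch of this experiment, assuming x = δ·u + ε and a true slope of zero, which is consistent with the reported averages δ/(δ² + 1):

```python
import numpy as np

rng = np.random.default_rng(6)

# True beta = 0, but x = delta*u + eps is correlated with the
# error u, so the OLS slope converges to delta/(delta^2 + 1)
n, reps = 100, 1000

def avg_slope(delta):
    slopes = np.empty(reps)
    for r in range(reps):
        u = rng.normal(size=n)
        eps = rng.normal(size=n)
        x = delta * u + eps
        y = u                  # y = 0 + 0*x + u: true slope is zero
        slopes[r] = np.cov(y, x, ddof=1)[0, 1] / np.var(x, ddof=1)
    return slopes.mean()

for delta in (1.0, 0.5, 0.1):
    print(delta, avg_slope(delta))  # roughly 0.5, 0.4, 0.1
```

The averages reproduce the pattern of the table: the stronger the correlation between the regressor and the error, the larger the bias.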
Panel Estimation

Panel data consist of cross-section and time-series data:
- N individuals,
- repeatedly observed at T points in time.

Solutions:
- Estimate a system of equations, one for each individual.
- Estimate a system of equations with restrictions requiring some
  homogeneity (e.g. same slope, different intercepts).

Pooled OLS: y = Xb + e

[Figure: y plotted against x, with separate groups for individuals
I1, I2, I3]
Fixed effects via individual dummies: augment the regressor matrix with
one dummy d_i per individual and estimate

y = [X  d1  ...  dn]·(b', α1, ..., αn)' + e

Properties:
- Computationally intensive for large N.
- Significance of the fixed effects: F-test for the joint significance
  of the dummies.
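The dummy-variable approach can be sketched as follows; the panel dimensions and parameter values are illustrative:

```python
import numpy as np

rng = np.random.default_rng(7)

# Sketch of the dummy-variable (fixed effects) estimator: append one
# dummy column per individual and run pooled OLS; the dummies absorb
# the individual-specific intercepts
N, T = 3, 50                          # individuals, time periods
alphas = np.array([0.0, 2.0, 4.0])    # individual fixed effects
beta = 1.5                            # common slope

ids = np.repeat(np.arange(N), T)      # individual index per observation
x = rng.normal(size=N * T)
y = alphas[ids] + beta * x + rng.normal(scale=0.1, size=N * T)

D = (ids[:, None] == np.arange(N)).astype(float)  # dummy matrix d1..dN
X = np.column_stack([x, D])           # [x d1 d2 d3], no common constant
b = np.linalg.solve(X.T @ X, X.T @ y)
print(b)  # slope, then the three individual intercepts
```

Note that the common constant is dropped when a full set of dummies is included, avoiding the dummy variable trap discussed earlier.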