Matthias Parey
Outline
More on interpretation
The Expected Value of the OLS Estimators: Assumptions; Result: OLS is unbiased; Potential bias from misspecification
The Variance of the OLS Estimators: Homoscedasticity assumption; Result; Interpretation; Variance in misspecified models; Estimating σ²
Efficiency of OLS: The Gauss-Markov Theorem

Reading: Wooldridge (2009), Introductory Econometrics, Chapter 3.
Units of Measurement
What is the effect of changing the units of x or y on our results?
If the dependent variable y is multiplied by some constant c, then the OLS intercept and slope estimates are also multiplied by c.
If the independent variable x is multiplied by some nonzero constant c, then the OLS slope coefficient is divided by c; the intercept is not affected.
What happens to R² when the unit of measurement of either the independent or the dependent variable changes?
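These rescaling rules are easy to verify numerically. The sketch below uses made-up data and a hand-rolled simple-regression helper; none of the numbers come from the slides.

```python
# Numerical check of the rescaling rules (illustrative data, not from the slides).
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(10, 2, 200)
y = 3 + 0.5 * x + rng.normal(0, 1, 200)

def ols(x, y):
    """Intercept and slope from a simple OLS regression of y on x."""
    b1 = np.cov(x, y, ddof=1)[0, 1] / np.var(x, ddof=1)
    return y.mean() - b1 * x.mean(), b1

c = 100.0
b0, b1 = ols(x, y)
b0_y, b1_y = ols(x, c * y)   # dependent variable rescaled
b0_x, b1_x = ols(c * x, y)   # independent variable rescaled

# y*c scales intercept and slope by c; x*c divides the slope by c
# and leaves the intercept unchanged. R^2 is unaffected either way.
r2 = lambda x, y: np.corrcoef(x, y)[0, 1] ** 2
print(np.allclose([b0_y, b1_y], [c * b0, c * b1]))   # True
print(np.allclose([b0_x, b1_x], [b0, b1 / c]))       # True
print(np.isclose(r2(x, y), r2(c * x, y)))            # True
```

The last check answers the question on this slide: R² is invariant to rescaling either variable.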
An additional year of schooling induces an approximate percentage change in wages of 100·β₁. We write this as follows:

%Δwage ≈ (100·β₁)·Δschooling
Now consider the case where both x and y are transformed into logs:

log(y) = β₀ + β₁·log(x) + u

Here we have

β₁ = Δlog(y)/Δlog(x) = (Δy/y)/(Δx/x)

This is known as the constant-elasticity model: β₁ measures the percentage change of y in response to a one-percent change in x.
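A small simulation illustrates the point: if the data are generated by a constant-elasticity model, the slope of a regression of log(y) on log(x) recovers the elasticity. The data-generating process below is made up for illustration.

```python
# Simulate a constant-elasticity model and recover beta_1 from a
# regression of log(y) on log(x). Illustrative DGP, not from the slides.
import numpy as np

rng = np.random.default_rng(1)
beta1 = 0.7                                   # true elasticity
x = rng.uniform(1, 10, 5000)
log_y = 2.0 + beta1 * np.log(x) + rng.normal(0, 0.05, 5000)

lx = np.log(x)
slope = np.cov(lx, log_y, ddof=1)[0, 1] / np.var(lx, ddof=1)
# slope is close to 0.7: a one-percent increase in x raises y by about 0.7 percent
print(slope)
```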
There are other ways of incorporating non-linearities: e.g. interactions (set x₃ = x₁·x₂). Thus, in principle the formulation is very general. Nonetheless: the requirement is that the functional relation we end up choosing is appropriate. This is difficult to judge: economic theory rarely provides insights on what the exact functional form should be.
MLR.3 rules out the following cases:
one variable is a constant multiple of another, e.g. x₁ = c·x₂;
one variable can be expressed as an exact linear function of two or more of the other variables, e.g. x₁ = λ₀ + λ₂x₂ + λ₃x₃ + ... + λₖxₖ.
Intuition: if there is an exact linear relationship, it is impossible to tell apart the effect of one variable from the other: we have no variation to separate out the effects. There are many combinations of (β₀, β₁, ..., βₖ) which all deliver the same value of the loss function (sum of squared residuals). Given the exact restriction on the relationship between the different covariates, the ceteris paribus notion is meaningless.
Assumption MLR.3 also fails if the sample size, n, is too small in relation to the number of parameters being estimated, i.e. if n < k + 1, or if, in a particular sample, the linear relationship above happens to hold by chance (a very-small-sample problem).
Example (continued)
Note that the following relation is true for all individuals in the sample:

cohort + age = 2009

Conclusion: we have to drop either the age effect or the cohort effect from the specification. That is, we effectively have to assume that either β₁ = 0 or β₂ = 0.

In this example, both age and vintage are potentially important: both factors may be genuinely relevant, but the data do not allow us to separate out the two effects.
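The failure of MLR.3 here can be seen directly from the rank of the design matrix. The ages below are simulated; the identity cohort + age = 2009 is taken from the slide.

```python
# Exact collinearity from the slide's identity: cohort + age = 2009.
import numpy as np

rng = np.random.default_rng(2)
n = 100
age = rng.integers(20, 60, n).astype(float)
cohort = 2009.0 - age                      # holds for everyone in the sample
X = np.column_stack([np.ones(n), age, cohort])
# The cohort column is an exact linear function of the constant and age,
# so X has rank 2 instead of 3 and the OLS normal equations have no
# unique solution.
print(np.linalg.matrix_rank(X))
```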
When Assumption MLR.4 holds, we say we have exogenous explanatory variables. If xj is correlated with u for any reason, then xj is said to be an endogenous explanatory variable.
A list of cases in which assumption MLR.4 can fail:
1. misspecified functional form (misspecification);
2. omission of a variable that is correlated with any of the independent variables (omitted variable problem);
3. specific forms of measurement error in an explanatory variable (measurement error problems);
4. one or more of the explanatory variables is determined jointly with y (simultaneity problems).
Two remarks
1. Unbiasedness is a statement about the sampling distribution of an estimator
If we kept drawing fresh samples from the population, what would the distribution of the estimator look like? Thus, unbiasedness says nothing about how a particular realization relates to the true parameter value. In a particular sample, the estimated coefficient may be far away from the true value even though the estimator is unbiased.
2. Unbiasedness is a statement about the expected value, and not about dispersion
An estimator can be unbiased but still have a large dispersion around the true value. Unbiasedness says nothing about the probability of being close to the true parameter value.
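Both remarks can be illustrated with a small Monte Carlo experiment; the parameter values are hypothetical and the regressor is held fixed across replications.

```python
# Repeated sampling: the estimator's mean is at the truth (unbiasedness),
# but any single draw can be far from it (dispersion). Illustrative DGP.
import numpy as np

rng = np.random.default_rng(3)
beta0, beta1 = 1.0, 2.0
x = rng.uniform(0, 5, 30)                 # fixed design
draws = []
for _ in range(20000):
    y = beta0 + beta1 * x + rng.normal(0, 1, 30)
    draws.append(np.cov(x, y, ddof=1)[0, 1] / np.var(x, ddof=1))
draws = np.array(draws)
print(draws.mean())   # close to the true value 2.0
print(draws.std())    # clearly positive: individual estimates scatter
```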
We now ask whether we can describe what happens in specific cases of misspecification. What is the effect of including irrelevant variables? What is the effect of excluding a relevant variable?
Suppose the true population model is:

y = β₀ + β₁x₁ + β₂x₂ + u

and assume that this model satisfies Assumptions MLR.1 through MLR.4. However, due to data availability, we instead estimate the model excluding x₂:

ỹ = β̃₀ + β̃₁x₁
We use the following fact:

β̃₁ = β̂₁ + β̂₂·δ̃₁

where β̂₁ and β̂₂ are the slope estimators from the multiple regression of yᵢ on xᵢ₁ and xᵢ₂, and δ̃₁ is the slope from the simple regression of xᵢ₂ on xᵢ₁. Now compute the expected value:

E(β̃₁) = E(β̂₁ + β̂₂·δ̃₁) = E(β̂₁) + E(β̂₂)·δ̃₁ = β₁ + β₂·δ̃₁

which implies that the bias in β̃₁ (the omitted variable bias) is:

Bias(β̃₁) = E(β̃₁) − β₁ = β₂·δ̃₁

Conclusion: there are two cases in which β̃₁ is unbiased:
1. if the unobserved covariate is irrelevant for y: β₂ = 0;
2. if x₁ and x₂ are uncorrelated: δ̃₁ = 0.
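The decomposition of the short-regression slope into the long-regression slopes is an exact algebraic identity in any sample, which a quick check confirms (data simulated for illustration):

```python
# Verify b1_tilde = b1_hat + b2_hat * delta1_tilde exactly in one sample.
import numpy as np

rng = np.random.default_rng(4)
n = 200
x1 = rng.normal(size=n)
x2 = 0.5 * x1 + rng.normal(size=n)        # x1 and x2 correlated
y = 1 + 2 * x1 + 3 * x2 + rng.normal(size=n)

def slope(a, b):
    """Slope from a simple regression of b on a."""
    return np.cov(a, b, ddof=1)[0, 1] / np.var(a, ddof=1)

X = np.column_stack([np.ones(n), x1, x2])
b = np.linalg.lstsq(X, y, rcond=None)[0]  # long regression: b[1], b[2]
b1_tilde = slope(x1, y)                   # short regression omitting x2
delta1 = slope(x1, x2)                    # regression of x2 on x1
print(np.isclose(b1_tilde, b[1] + b[2] * delta1))   # True
```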
The sign of the bias in β̃₁ depends on the signs of both β₂ and δ̃₁:

            Corr(x₁, x₂) > 0    Corr(x₁, x₂) < 0
β₂ > 0      positive bias       negative bias
β₂ < 0      negative bias       positive bias
Terminology: If E(β̃₁) > β₁, then β̃₁ has an upward bias. If E(β̃₁) < β₁, then β̃₁ has a downward bias. The phrase "biased towards zero" refers to cases where E(β̃₁) is closer to zero than β₁:
if β₁ > 0, then β̃₁ is biased towards zero if it has a downward bias;
if β₁ < 0, then β̃₁ is biased towards zero if it has an upward bias.
Assumptions MLR.1 through MLR.5 are collectively known as the Gauss-Markov assumptions. Assumptions MLR.1 and MLR.4 can be written as:

E(y|x) = β₀ + β₁x₁ + β₂x₂ + ... + βₖxₖ

Assumption MLR.5 can be written as:

Var(y|x) = σ²

where x is the set of all independent variables, (x₁, ..., xₖ).
Under the Gauss-Markov assumptions,

Var(β̂ⱼ) = σ² / [SSTⱼ·(1 − Rⱼ²)]

for j = 1, 2, ..., k, where SSTⱼ = Σᵢ₌₁ⁿ (xᵢⱼ − x̄ⱼ)² is the total sample variation in xⱼ, and Rⱼ² is the R-squared from regressing xⱼ on all other independent variables (including an intercept). The size of Var(β̂ⱼ) is important: a larger variance means a less precise estimator, larger confidence intervals, and less powerful hypothesis tests.
Interpretation
The variance of β̂ⱼ depends on three factors:
1. the error variance σ²;
2. the linear relationship among the independent variables, Rⱼ²;
3. the total sample variation in xⱼ, SSTⱼ.
1. The error variance σ²: a larger σ² means larger variances for the OLS estimators. This reflects more noise in the data. To reduce the error variance for a given y, add more explanatory variables to the equation.

2. The total sample variation in xⱼ, SSTⱼ: the larger the total variation in xⱼ, the smaller Var(β̂ⱼ). To increase the sample variation in each of the independent variables, increase the sample size.

3. The linear relationships among the independent variables, Rⱼ²: a larger Rⱼ² means a larger Var(β̂ⱼ). High (but not perfect) correlation between two or more independent variables is called multicollinearity.
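The role of Rⱼ² is easy to see in a simulation: the sampling variance of β̂₁ blows up as the correlation between the regressors rises. The data-generating process below is illustrative.

```python
# Monte Carlo: higher correlation between x1 and x2 inflates Var(b1_hat).
import numpy as np

rng = np.random.default_rng(5)
n, reps = 100, 3000

def mc_sd(rho):
    """Monte Carlo sd of b1_hat when corr(x1, x2) = rho."""
    cov = [[1.0, rho], [rho, 1.0]]
    ests = []
    for _ in range(reps):
        X12 = rng.multivariate_normal([0, 0], cov, n)
        y = 1 + X12[:, 0] + X12[:, 1] + rng.normal(0, 1, n)
        X = np.column_stack([np.ones(n), X12])
        ests.append(np.linalg.lstsq(X, y, rcond=None)[0][1])
    return np.std(ests)

sd_low, sd_high = mc_sd(0.0), mc_sd(0.95)
print(sd_high > 2 * sd_low)   # near-collinearity inflates the variance
```

With corr = 0.95, the theoretical inflation factor for the standard deviation is 1/√(1 − 0.95²) ≈ 3.2, which the simulation reproduces.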
Remarks: What ultimately matters for statistical inference is how big β̂ⱼ is in relation to its standard deviation. Also note that a high degree of correlation between certain independent variables can be irrelevant to how well we can estimate other parameters in the model.
Conditional on the sample values of the regressors, the short regression on x₁ alone gives Var(β̃₁) = σ²/SST₁, while the regression on both x₁ and x₂ gives

Var(β̂₁) = σ² / [SST₁(1 − R₁²)].

Assuming that x₁ and x₂ are correlated, we can draw the following conclusions:

1. When β₂ ≠ 0, β̃₁ is biased, β̂₁ is unbiased, and Var(β̃₁) < Var(β̂₁). ⇒ When β₂ ≠ 0, there are two reasons for including x₂ in the model:
any bias in β̃₁ does not shrink as the sample size grows, but the variance does;
the variance of β̃₁ conditional only on x₁ is larger than the one shown above, where both regressors are treated as nonrandom.

2. When β₂ = 0, β̃₁ and β̂₁ are both unbiased, and Var(β̃₁) < Var(β̂₁). ⇒ β̃₁ is preferred if β₂ = 0.
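The two conditional variance formulas, Var(β̃₁) = σ²/SST₁ for the short regression and Var(β̂₁) = σ²/[SST₁(1 − R₁²)] for the long one, can be compared directly on simulated regressors (illustrative values; σ² set to 1):

```python
# Compare Var(b1_tilde) = s2/SST1 with Var(b1_hat) = s2/(SST1*(1 - R1^2)).
import numpy as np

rng = np.random.default_rng(6)
n = 500
x1 = rng.normal(size=n)
x2 = 0.8 * x1 + rng.normal(0, 0.6, n)     # correlated regressors
sigma2 = 1.0
sst1 = np.sum((x1 - x1.mean()) ** 2)
r2_1 = np.corrcoef(x1, x2)[0, 1] ** 2     # R^2 of x1 on the other regressor
var_tilde = sigma2 / sst1                 # short regression
var_hat = sigma2 / (sst1 * (1 - r2_1))    # long regression
print(var_tilde < var_hat)   # True whenever x1 and x2 are correlated
```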
Estimating σ²

The unbiased estimator of σ² in the multiple regression case is:

σ̂² = SSR / (n − k − 1) = Σᵢ₌₁ⁿ ûᵢ² / (n − k − 1)

where the degrees of freedom are:

df = n − (k + 1) = (no. observations) − (no. estimated parameters)

Recall: the division by n − k − 1 comes from the fact that, in obtaining the OLS estimates, k + 1 restrictions are imposed on the OLS residuals, so that there are only n − k − 1 df in the residuals.
σ̂ is called the standard error of the regression (SER), the standard error of the estimate, or the root mean squared error. It is an estimator of the standard deviation of the error term. Note that σ̂ can either decrease or increase when another independent variable is added to a regression. Why?
numerator: SSR goes down;
denominator: k increases, so n − k − 1 goes down;
overall effect unclear.
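The two opposing forces can be seen by adding an irrelevant regressor (simulated data): the SSR can only fall weakly, while the degrees of freedom fall too, so σ̂ can move either way.

```python
# Adding a regressor: SSR weakly falls, df falls, sigma_hat can go either way.
import numpy as np

rng = np.random.default_rng(7)
n = 30
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)                   # irrelevant regressor
y = 1 + 2 * x1 + rng.normal(0, 1, n)

def ssr(X, y):
    b = np.linalg.lstsq(X, y, rcond=None)[0]
    e = y - X @ b
    return e @ e

ssr_small = ssr(np.column_stack([np.ones(n), x1]), y)
ssr_big = ssr(np.column_stack([np.ones(n), x1, x2]), y)
sigma_small = np.sqrt(ssr_small / (n - 2))
sigma_big = np.sqrt(ssr_big / (n - 3))
print(ssr_big <= ssr_small)               # the numerator always (weakly) falls
print(sigma_small, sigma_big)             # the net effect on sigma_hat is ambiguous
```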
For constructing confidence intervals and conducting tests, we need to estimate the standard deviation of β̂ⱼ:

sd(β̂ⱼ) = σ / [SSTⱼ(1 − Rⱼ²)]^(1/2)

Since σ is unknown, we replace it with its estimator σ̂. This gives us the standard error of β̂ⱼ:

se(β̂ⱼ) = σ̂ / [SSTⱼ(1 − Rⱼ²)]^(1/2)
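This formula can be checked against the usual matrix expression σ̂²(X'X)⁻¹ on simulated data; with a single other regressor, R₁² is just the squared sample correlation between x₁ and x₂.

```python
# se(b1_hat) from the slide's formula equals the matrix-based standard error.
import numpy as np

rng = np.random.default_rng(8)
n = 200
x1 = rng.normal(size=n)
x2 = 0.5 * x1 + rng.normal(size=n)
y = 1 + 2 * x1 - x2 + rng.normal(0, 1, n)

X = np.column_stack([np.ones(n), x1, x2])
b = np.linalg.lstsq(X, y, rcond=None)[0]
e = y - X @ b
sigma2_hat = e @ e / (n - 3)              # SSR / (n - k - 1), with k = 2

sst1 = np.sum((x1 - x1.mean()) ** 2)
r2_1 = np.corrcoef(x1, x2)[0, 1] ** 2
se_formula = np.sqrt(sigma2_hat / (sst1 * (1 - r2_1)))
se_matrix = np.sqrt(sigma2_hat * np.linalg.inv(X.T @ X)[1, 1])
print(np.isclose(se_formula, se_matrix))  # True
```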
Efficiency of OLS
Gauss-Markov Theorem
Under the following set of assumptions:
MLR.1 (Linear in parameters)
MLR.2 (Random sampling)
MLR.3 (No perfect collinearity)
MLR.4 (Zero conditional mean)
MLR.5 (Homoscedasticity)
the OLS estimators (β̂₀, β̂₁, ..., β̂ₖ) are the best linear unbiased estimators (BLUEs) of (β₀, β₁, ..., βₖ), respectively. We say that OLS is BLUE.
Linear: an estimator β̃ⱼ is linear if, and only if, it can be expressed as a linear function of the data on the dependent variable:

β̃ⱼ = Σᵢ₌₁ⁿ wᵢⱼ·yᵢ

where each wᵢⱼ can be a function of the sample values of all the independent variables.

Unbiased: β̃ⱼ is an unbiased estimator of βⱼ if E(β̃ⱼ) = βⱼ.
The criterion "best" means smallest variance. ⇒ Under Assumptions MLR.1-MLR.5, for any estimator β̃ⱼ that is linear and unbiased,

Var(β̂ⱼ) ≤ Var(β̃ⱼ)

where β̂ⱼ is the OLS estimator.
Keep in mind: If any of the Gauss-Markov assumptions fail, then this theorem no longer holds.
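A Monte Carlo comparison with another linear unbiased estimator illustrates the theorem. The competitor below, a "grouping" estimator that connects the mean points of the low-x and high-x halves of the sample, is a hypothetical example, not from the slides; for a fixed design it is linear in y and unbiased.

```python
# OLS vs. an alternative linear unbiased slope estimator (group means).
import numpy as np

rng = np.random.default_rng(9)
n = 40
x = np.linspace(0, 10, n)                 # fixed design across replications
hi = x > np.median(x)
ols_draws, grp_draws = [], []
for _ in range(20000):
    y = 1 + 2 * x + rng.normal(0, 1, n)
    ols_draws.append(np.cov(x, y, ddof=1)[0, 1] / np.var(x, ddof=1))
    # grouping estimator: slope between the two half-sample mean points
    grp_draws.append((y[hi].mean() - y[~hi].mean())
                     / (x[hi].mean() - x[~hi].mean()))
ols_draws, grp_draws = np.array(ols_draws), np.array(grp_draws)
print(abs(grp_draws.mean() - 2) < 0.01)   # the competitor is unbiased too
print(ols_draws.var() < grp_draws.var())  # but OLS has the smaller variance
```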
Effects of heteroscedasticity

Under heteroscedasticity, the Gauss-Markov theorem no longer applies:
MLR.5 does not hold any longer;
the OLS estimator is still unbiased (MLR.5 is not required for Theorem 3.1), but the Gauss-Markov theorem does not apply.
Intuition for why OLS may not be efficient: heteroscedasticity means that some observations are more informative (contain less noise) than others, but the OLS objective function puts equal weight on all squared residuals ûᵢ². Thus OLS does not exploit the fact that we can extract more information from some observations, so it is not surprising that there may be a more efficient estimator.
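A sketch of this intuition: when the error standard deviation is assumed known to grow with x (a hypothetical form of heteroscedasticity), weighting each observation by the inverse error sd (weighted least squares) beats equally-weighted OLS.

```python
# Under heteroscedasticity, OLS is unbiased but no longer efficient:
# weighting by 1/sd (WLS) yields a smaller sampling variance. Illustrative DGP.
import numpy as np

rng = np.random.default_rng(10)
n = 50
x = np.linspace(1, 10, n)
sd = x                                    # error sd grows with x (assumed known)
X = np.column_stack([np.ones(n), x])
w = 1 / sd
ols_draws, wls_draws = [], []
for _ in range(20000):
    y = 1 + 2 * x + rng.normal(0, sd)
    ols_draws.append(np.linalg.lstsq(X, y, rcond=None)[0][1])
    # WLS = OLS on the reweighted data, which restores homoscedasticity
    wls_draws.append(np.linalg.lstsq(X * w[:, None], y * w, rcond=None)[0][1])
print(abs(np.mean(ols_draws) - 2) < 0.05)   # OLS still unbiased
print(np.var(wls_draws) < np.var(ols_draws))
```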
Next lecture: Multiple Regression Analysis: Inference.
This continues the numerical exercise for the simple linear regression model (from week 3).
We estimated the coefficients as follows (n = 10):

ŵage = −3.569 + 0.8597·schooling

We can predict the residuals ûᵢ in the sample, and estimate the variance of the error term:

σ̂² = (1/(n − 2))·Σᵢ₌₁ⁿ ûᵢ² = 95.54/(10 − 2) = 11.94
id         1   2   3   4   5   6   7   8   9  10
schooling  8  12  16  18  12  12  17  16  13  12
The estimated variance of the slope estimator is:

Var(β̂₁) = σ̂² / Σᵢ(xᵢ − x̄)²
We can summarize what we have learned so far about this regression by writing:

ŵage = −3.569 + 0.8597·schooling,   n = 10, R² = 0.395
        (5.23)    (0.376)

(standard errors in parentheses)
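The reported standard error of the slope can be reproduced from the schooling column of the table and the estimate σ̂² = 11.94; the wage column is not needed for this check.

```python
# Reproduce se(b1_hat) = 0.376 from the table's schooling values and
# the estimated error variance sigma2_hat = 11.94.
import numpy as np

schooling = np.array([8, 12, 16, 18, 12, 12, 17, 16, 13, 12], dtype=float)
sigma2_hat = 11.94
sst = np.sum((schooling - schooling.mean()) ** 2)   # total variation in x
se_slope = np.sqrt(sigma2_hat / sst)
print(round(se_slope, 3))   # 0.376, matching the value in parentheses
```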