Model Selection

What are the criteria for choosing a model for empirical analysis?
What types of model specification errors exist?
What are the consequences of specification errors?
How does one detect specification errors?
Having detected specification errors, what remedies can be adopted?
Model selection criteria
A good model should:
Be admissible
Be consistent with theory
Have weakly exogenous regressors
Exhibit parameter constancy
Exhibit data coherency
Be encompassing
Types of specification errors

Omission of relevant variables
Inclusion of irrelevant variables
Wrong functional form
Errors in variables / measurement errors
Omitting a variable
Suppose the correct model is
[wage] ~ [educ] + [abil]
but we estimate
[wage] ~ [educ]
Would the coefficient on [educ] in the latter regression correctly measure the effect of [educ] on [wage]?
Omitting a variable (Consequences)

Omitting a variable creates a bias in the estimated coefficients if:
The omitted variable affects the dependent variable, and
The omitted variable is correlated with some of the explanatory variables included in the estimated equation.
Sometimes we can guess the direction of the bias.
The direction of the bias

If [educ] is positively correlated with [abil] and [abil] has a positive effect on [wage], what is the direction of bias in the coefficient of [educ] in [wage] ~ [educ] ?
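A small simulation can make the direction of the bias concrete. The sketch below is illustrative only: the data-generating process, the coefficient values (0.5 on [educ], 1.0 on [abil]) and the helper name `ols` are assumptions, not part of the original slides. Under these assumptions the standard omitted-variable result predicts that the short regression overstates the effect of [educ], because the omitted variable has a positive effect and is positively correlated with the included regressor.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5000

# Hypothetical data-generating process (all numbers are illustrative assumptions).
abil = rng.normal(size=n)
educ = 12 + 2 * abil + rng.normal(size=n)                   # educ is positively correlated with abil
wage = 1.0 + 0.5 * educ + 1.0 * abil + rng.normal(size=n)   # true effect of educ is 0.5

def ols(y, *regressors):
    """Return OLS coefficients of y on a constant plus the given regressors."""
    X = np.column_stack([np.ones(len(y)), *regressors])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta

b_long = ols(wage, educ, abil)   # correct model: wage ~ educ + abil
b_short = ols(wage, educ)        # misspecified model: abil omitted

print("educ coefficient, correct model:", b_long[1])
print("educ coefficient, abil omitted: ", b_short[1])
```

With this design the short-regression coefficient should settle near 0.5 + 1.0 * cov(educ, abil) / var(educ) = 0.9, that is, above the true value of 0.5.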
Irrelevant variables
If you include irrelevant variables that are correlated with your regressors of interest, you decrease the precision with which you can measure your coefficients of interest. The tradeoff when choosing regressors: bias vs. variance.
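The loss of precision can be illustrated with a short simulation. This is a sketch under assumed values: the variable names, the correlation of about 0.9 between `x` and the irrelevant `z`, and the helper `ols_se` are hypothetical choices made for the example.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200

x = rng.normal(size=n)
# z is irrelevant for y but strongly correlated with x (correlation about 0.9).
z = 0.9 * x + np.sqrt(1 - 0.9 ** 2) * rng.normal(size=n)
y = 1.0 + 2.0 * x + rng.normal(size=n)   # z does not enter the true model

def ols_se(y, *regressors):
    """Return OLS coefficients and their conventional standard errors."""
    X = np.column_stack([np.ones(len(y)), *regressors])
    XtX_inv = np.linalg.inv(X.T @ X)
    beta = XtX_inv @ X.T @ y
    resid = y - X @ beta
    s2 = resid @ resid / (len(y) - X.shape[1])   # estimated error variance
    return beta, np.sqrt(s2 * np.diag(XtX_inv))

beta_small, se_small = ols_se(y, x)      # y ~ x
beta_big, se_big = ols_se(y, x, z)       # y ~ x + z (z is irrelevant)

print("s.e. of x coefficient without z:", se_small[1])
print("s.e. of x coefficient with z:   ", se_big[1])
```

Both regressions estimate the coefficient on `x` without bias here, but the standard error is noticeably larger once the correlated, irrelevant `z` is included.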
How to find irrelevant variables

Run an F-test or t-test on the coefficients of the candidate variables.
Functional form
Do we need squares? Higher polynomials? Logs? On the left? On the right? Both?
Sometimes F-tests can tell us.
RESET: A general test for functional form/specification error

Uses polynomials in fitted values as additional regressors. A significant F-statistic indicates specification problems.
Step 1: Run the OLS regression of y on x and obtain the fitted values of y and R² (old).
Step 2: Rerun the regression, adding the fitted values of y in some form (for example, their squares and cubes) as additional regressors, and obtain R² (new).
Step 3: Compute
F = \frac{(R^2_{new} - R^2_{old}) / \text{number of new regressors}}{(1 - R^2_{new}) / (n - \text{number of parameters in the new model})}
Step 4: If the computed F value is significant at the chosen level of significance, conclude that the first model is mis-specified.
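The RESET steps can be sketched in Python as follows. This is a minimal illustration, not a reference implementation: the function names `fit_r2` and `reset_f`, the choice of squared and cubed fitted values as the added regressors, and the simulated data are all assumptions made for the example.

```python
import numpy as np

def fit_r2(y, X):
    """Return (R^2, fitted values) from an OLS regression of y on X."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    fitted = X @ beta
    rss = np.sum((y - fitted) ** 2)
    tss = np.sum((y - y.mean()) ** 2)
    return 1.0 - rss / tss, fitted

def reset_f(y, x):
    """RESET F statistic using squared and cubed fitted values (an assumed choice)."""
    n = len(y)
    X_old = np.column_stack([np.ones(n), x])
    r2_old, y_hat = fit_r2(y, X_old)                           # Step 1
    X_new = np.column_stack([X_old, y_hat ** 2, y_hat ** 3])
    r2_new, _ = fit_r2(y, X_new)                               # Step 2
    q = X_new.shape[1] - X_old.shape[1]                        # number of new regressors
    k = X_new.shape[1]                                         # parameters in the new model
    return ((r2_new - r2_old) / q) / ((1.0 - r2_new) / (n - k))  # Step 3

# Hypothetical data: one linear and one quadratic data-generating process.
rng = np.random.default_rng(2)
n = 300
x = rng.uniform(0, 5, size=n)
e = rng.normal(size=n)
y_linear = 1.0 + x + e
y_curved = 1.0 + x + 0.5 * x ** 2 + e

f_linear = reset_f(y_linear, x)   # linear model is correctly specified here
f_curved = reset_f(y_curved, x)   # linear model omits the quadratic term here
print("F, linear data:", f_linear)
print("F, curved data:", f_curved)
```

Step 4 then compares each statistic with the critical value of the F distribution with (2, n - 4) degrees of freedom; the statistic for the curved data should be far larger than the one for the linear data.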
LM test for adding variables
Step 1: Run the OLS regression (the restricted regression) and obtain the residuals \hat{u}_i.
Step 2: Regress the residuals on the explanatory variables and their squares and cubes, i.e.
\hat{u}_i = b_1 + b_2 x_i + b_3 x_i^2 + b_4 x_i^3 + \text{error}
Step 3: For large samples, Engle has shown that n times the R² from Step 2 follows the chi-square distribution with df equal to the number of restrictions imposed by the restricted regression.
Step 4: If the chi-square value exceeds the critical value at the chosen level of significance, we reject the restricted regression.
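A minimal sketch of the LM procedure for a single regressor follows. The function name `lm_statistic` and the simulated data are hypothetical. With x² and x³ as the excluded terms there are two restrictions, and the 5% critical value of the chi-square distribution with 2 degrees of freedom is about 5.99.

```python
import numpy as np

def lm_statistic(y, x):
    """Return n * R^2 from regressing restricted-model residuals on 1, x, x^2, x^3."""
    n = len(y)
    # Step 1: restricted (linear) regression and its residuals.
    X_r = np.column_stack([np.ones(n), x])
    beta, *_ = np.linalg.lstsq(X_r, y, rcond=None)
    u_hat = y - X_r @ beta
    # Step 2: auxiliary regression of the residuals on x, x^2 and x^3.
    X_aux = np.column_stack([np.ones(n), x, x ** 2, x ** 3])
    gamma, *_ = np.linalg.lstsq(X_aux, u_hat, rcond=None)
    e = u_hat - X_aux @ gamma
    r2 = 1.0 - (e @ e) / np.sum((u_hat - u_hat.mean()) ** 2)
    # Step 3: the test statistic is n times the auxiliary R^2.
    return n * r2

# Hypothetical data in which the true relationship is quadratic.
rng = np.random.default_rng(3)
n = 300
x = rng.uniform(0, 5, size=n)
y = 1.0 + x + 0.5 * x ** 2 + rng.normal(size=n)

lm = lm_statistic(y, x)
CHI2_2DF_5PCT = 5.991   # 5% critical value, chi-square with 2 df
print("LM statistic:", lm, "reject restricted model:", lm > CHI2_2DF_5PCT)
```

Because the simulated data contain a quadratic term that the restricted regression leaves out, the statistic should exceed the critical value by a wide margin.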
SELECTING MODELS (F-tests)

If one model is a subset of another, use F-tests.
If one model is not a subset of another, build a model that encompasses both and then use F-tests.
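As an illustration of the encompassing approach, the sketch below builds a model that nests two non-nested candidates and tests each against it with the usual F statistic for linear restrictions, F = ((RSS_r - RSS_u) / q) / (RSS_u / (n - k)). The model labels A and B, the helper names and the simulated data are assumptions made for the example.

```python
import numpy as np

def rss(y, X):
    """Residual sum of squares from an OLS regression of y on X."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return resid @ resid

def f_stat(y, X_restricted, X_unrestricted):
    """F statistic for the restrictions that reduce the unrestricted model to the restricted one."""
    n, k = X_unrestricted.shape
    q = k - X_restricted.shape[1]            # number of restrictions
    rss_r, rss_u = rss(y, X_restricted), rss(y, X_unrestricted)
    return ((rss_r - rss_u) / q) / (rss_u / (n - k))

# Hypothetical data: y depends on x only; z is correlated with x but irrelevant.
rng = np.random.default_rng(4)
n = 200
x = rng.normal(size=n)
z = 0.5 * x + rng.normal(size=n)
y = 1.0 + 2.0 * x + rng.normal(size=n)

const = np.ones(n)
X_a = np.column_stack([const, x])        # Model A: y ~ x
X_b = np.column_stack([const, z])        # Model B: y ~ z
X_enc = np.column_stack([const, x, z])   # encompassing model: y ~ x + z

f_a = f_stat(y, X_a, X_enc)   # tests whether z adds anything to Model A
f_b = f_stat(y, X_b, X_enc)   # tests whether x adds anything to Model B
print("F for Model A vs encompassing:", f_a)
print("F for Model B vs encompassing:", f_b)
```

Each statistic is compared with the critical value of F(q, n - k). In this design the restriction that drops `x` should be rejected decisively, while the restriction that drops `z` tests a coefficient whose true value is zero.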
The Davidson-MacKinnon J test

Let Model C explain Y with the regressors x_2 and x_3, and Model D explain Y with the regressors z_2 and z_3.
1. Estimate Model D and obtain the estimated Y values, \hat{Y}_i^D.
2. Add the predicted Y values from step 1 as an additional regressor to Model C and estimate the following model:
Y_i = b_1 + b_2 x_{2i} + b_3 x_{3i} + b_4 \hat{Y}_i^D + u_i
3. Use a t-test to test the hypothesis that b_4 = 0.
4. If the hypothesis that b_4 = 0 is not rejected, we can accept Model C as the true model and conclude that Model C encompasses Model D.
5. Now reverse the roles of Model C and Model D. Estimate Model C and obtain the estimated Y values, \hat{Y}_i^C. Add the predicted Y values as an additional regressor to Model D and estimate the following model:
Y_i = c_1 + c_2 z_{2i} + c_3 z_{3i} + c_4 \hat{Y}_i^C + u_i
If the hypothesis that c_4 = 0 is not rejected, we can accept Model D as the true model and conclude that Model D encompasses Model C.
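The J test procedure can be sketched as below. The helper names `ols` and `j_test_t` and the simulated data (in which Model C is the true model) are hypothetical; the code simply adds the rival model's fitted values as a regressor and reports the t-statistic on their coefficient.

```python
import numpy as np

def ols(y, X):
    """Return OLS coefficients, conventional standard errors and fitted values."""
    XtX_inv = np.linalg.inv(X.T @ X)
    beta = XtX_inv @ X.T @ y
    resid = y - X @ beta
    s2 = resid @ resid / (len(y) - X.shape[1])
    return beta, np.sqrt(s2 * np.diag(XtX_inv)), X @ beta

def j_test_t(y, X_own, X_rival):
    """t-statistic on the rival model's fitted values when added to the own model."""
    _, _, y_hat_rival = ols(y, X_rival)
    beta, se, _ = ols(y, np.column_stack([X_own, y_hat_rival]))
    return beta[-1] / se[-1]

# Hypothetical data: Y is generated by x2 and x3 (Model C); z2 and z3 are noisy proxies.
rng = np.random.default_rng(5)
n = 500
x2, x3 = rng.normal(size=n), rng.normal(size=n)
z2 = 0.5 * x2 + rng.normal(size=n)
z3 = 0.5 * x3 + rng.normal(size=n)
Y = 1.0 + 2.0 * x2 - 1.0 * x3 + rng.normal(size=n)

const = np.ones(n)
X_c = np.column_stack([const, x2, x3])   # Model C
X_d = np.column_stack([const, z2, z3])   # Model D

t_b4 = j_test_t(Y, X_c, X_d)   # steps 1 to 3: D's fitted values added to Model C
t_c4 = j_test_t(Y, X_d, X_c)   # step 5: C's fitted values added to Model D
print("t-statistic for b4:", t_b4)
print("t-statistic for c4:", t_c4)
```

Since the data come from Model C, the hypothesis c_4 = 0 should be rejected strongly, while b_4 = 0 tests a coefficient whose true value is zero.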
Model selection criteria

R² criterion
Adjusted R²
Akaike information criterion (AIC)
Schwarz's Bayesian criterion (BIC)
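The four criteria can be computed from a fitted regression as in the sketch below. The formulas used here are one common textbook form (log AIC = 2k/n + ln(RSS/n), log Schwarz criterion = (k/n) ln n + ln(RSS/n)); other sources scale them differently, so treat the exact definitions, the function name `fit_criteria` and the simulated data as assumptions. For AIC and the Schwarz criterion, lower values are preferred.

```python
import numpy as np

def fit_criteria(y, X):
    """Return R^2, adjusted R^2, log AIC and log Schwarz criterion for an OLS fit."""
    n, k = X.shape                      # k counts all parameters, including the constant
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    rss = resid @ resid
    tss = np.sum((y - y.mean()) ** 2)
    r2 = 1.0 - rss / tss
    return {
        "r2": r2,
        "adj_r2": 1.0 - (1.0 - r2) * (n - 1) / (n - k),
        "aic": 2.0 * k / n + np.log(rss / n),
        "schwarz": k / n * np.log(n) + np.log(rss / n),
    }

# Hypothetical data: y depends on x only; z is pure noise.
rng = np.random.default_rng(6)
n = 100
x = rng.normal(size=n)
z = rng.normal(size=n)
y = 1.0 + 2.0 * x + rng.normal(size=n)

small = fit_criteria(y, np.column_stack([np.ones(n), x]))
big = fit_criteria(y, np.column_stack([np.ones(n), x, z]))
for name in ("r2", "adj_r2", "aic", "schwarz"):
    print(name, "small:", small[name], "big:", big[name])
```

R² can never fall when a regressor is added, which is why it is a weak selection criterion on its own; the other three measures penalize the extra parameter, with the Schwarz criterion penalizing more heavily than AIC whenever ln n exceeds 2.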
