Econometrics Final

FINAL EXAM DANIKA LI
QUESTION #1
1. Does the estimator from regressing Y on X1 suffer from omitted variable bias?

Explain.
a. No. Although X2 may be a determinant of Y, which is the first
condition for omitted variable bias, both conditions must hold: the
omitted variable must also be correlated with the included regressor.
Because X1 and X2 are given to be uncorrelated, omitting X2 causes
no omitted variable bias.
2. Calculate the variance of Bhat1 when X1i and X2i are uncorrelated.
a. Var(Bhat1) = (1/n)*[(sigmau^2)/(varX1)]*[1/(1-R1^2)]
=(1/400)*(4/6)*[1/(1-R1^2)]
=0.0025*0.6667*[1/(1-R1^2)]
=0.00167*[1/(1-0)]
=0.00167*1
=0.00167
3. Assume that cor(X1, X2)=0.5, so that R1^2=0.25. Compute the variance again.
a. Var(Bhat1) = (1/n)*[(sigmau^2)/(varX1)]*[1/(1-R1^2)]
=(1/400)*(4/6)*[1/(1-R1^2)]
=0.0025*0.6667*[1/(1-R1^2)]
=0.00167*[1/(1-0.25)]
=0.00167*(1/0.75)
=0.00222
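The two variance calculations can be reproduced with a short R sketch. The inputs (a sample-size divisor of 400, an error variance of 4, and var(X1) of 6) are read off the substitution above; the function name var_bhat1 and its argument names are made up for illustration.

```r
# Variance of Bhat1 as a function of R1^2, the R-squared from
# regressing X1 on X2. Defaults are taken from the worked
# substitution; n_term is the sample-size divisor used there.
var_bhat1 <- function(r1_sq, n_term = 400, sigma_u_sq = 4, var_x1 = 6) {
  (1 / n_term) * (sigma_u_sq / var_x1) * (1 / (1 - r1_sq))
}

v_uncorrelated <- var_bhat1(0)       # cor(X1, X2) = 0
v_correlated   <- var_bhat1(0.5^2)   # cor(X1, X2) = 0.5

print(round(c(v_uncorrelated, v_correlated), 5))
# Ratio of the two variances equals 1 / (1 - R1^2)
print(v_correlated / v_uncorrelated)
```

The ratio of the two results is the factor 1/(1-R1^2), which is the only term that changes between parts 2 and 3.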
4. When X1 and X2 are correlated, the variance of Bhat1 is larger than it would
be if X1 and X2 were uncorrelated. Thus, if you are interested in Bhat1, it is
best to leave X2 out of the regression if it is correlated with X1.
a. The first part of the statement is true: the correlation between X1
and X2 raises the variance from 0.00167 to 0.00222. However, the
second part is false. If X1 and X2 are correlated and X2 is a
determinant of Y, both conditions for omitted variable bias hold, so
leaving X2 out of the regression would bias Bhat1.
QUESTION #2
1. Is educ significant at the 5% level for model 1?

> wage<-read.csv("wage2.csv", header=T)
> model1<-lm(log(wage)~educ, data=wage)
> library(zoo)
> library(lmtest)
> library(sandwich)
> coeftest(model1, vcov=vcovHC(model1, type="HC3"))
t test of coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 5.9730625 0.0824651 72.431 < 2.2e-16 ***
educ 0.0598392 0.0060949 9.818 < 2.2e-16 ***
---
Signif. codes: 0 *** 0.001 ** 0.01 * 0.05 . 0.1 1
Yes, the variable education is significant at the 5% level, as the
p-value is far below 0.05 and the t-stat is higher than 1.96.
a. Confidence interval:
> lower_bound<-.0598392-1.96*0.006
> upper_bound<-.0598392+1.96*0.006
> print(c(lower_bound, upper_bound))
[1] 0.0480792 0.0715992
The 95% confidence interval excludes zero, which agrees with the
t-test above: education is statistically significant at the 5% level.
b. Interpretation of educ variable: For every additional year of
education, monthly wages increase by about 6.0%.
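The interval above rounds the standard error to 0.006. As a rough sketch, the same normal-approximation interval can be wrapped in a small helper and evaluated with the unrounded robust standard error from the coeftest output (0.0060949). The helper name ci95 is hypothetical.

```r
# Normal-approximation confidence interval: estimate +/- z * se
ci95 <- function(estimate, se, z = 1.96) {
  c(lower = estimate - z * se, upper = estimate + z * se)
}

# Coefficient and HC3 standard error on educ from model 1
ci_educ <- ci95(0.0598392, 0.0060949)
print(round(ci_educ, 4))

# The coefficient is significant at the 5% level when the
# interval does not contain zero
print(ci_educ[["lower"]] > 0)
```

Using the unrounded standard error widens the interval slightly, but it still excludes zero.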
2. Estimation including experience and tenure
a. Model 2:
> model2<-lm(log(wage)~educ+exper, data=wage)
> summary(model2)
Call:
lm(formula = log(wage) ~ educ + exper, data = wage)
Residuals:
Min 1Q Median 3Q Max
-1.86915 -0.24001 0.03564 0.26132 1.30062
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 5.502710 0.112037 49.115 < 2e-16 ***
educ 0.077782 0.006577 11.827 < 2e-16 ***
exper 0.019777 0.003303 5.988 3.02e-09 ***
---
Signif. codes: 0 *** 0.001 ** 0.01 * 0.05 . 0.1 1
Residual standard error: 0.393 on 932 degrees of freedom

Multiple R-squared: 0.1309, Adjusted R-squared: 0.129
F-statistic: 70.16 on 2 and 932 DF, p-value: < 2.2e-16
i. Coefficient interpretation: Holding experience constant, every
additional year of education increases monthly wages by about 7.78%.
b. Model 3:
> model3<-lm(log(wage)~educ+exper+tenure, data=wage)
> summary(model3)
Call:
lm(formula = log(wage) ~ educ + exper + tenure, data = wage)
Residuals:
Min 1Q Median 3Q Max
-1.8282 -0.2401 0.0203 0.2569 1.3400
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 5.496696 0.110528 49.731 < 2e-16 ***
educ 0.074864 0.006512 11.495 < 2e-16 ***
exper 0.015328 0.003370 4.549 6.10e-06 ***
tenure 0.013375 0.002587 5.170 2.87e-07 ***
---
Signif. codes: 0 *** 0.001 ** 0.01 * 0.05 . 0.1 1
Residual standard error: 0.3877 on 931 degrees of freedom
i. Coefficient interpretation: Holding experience and tenure constant,
every additional year of education increases monthly wages by about 7.49%.
c. Does model 1 suffer from omitted variable bias? Is bhat1
overestimated or underestimated?
i. Yes. Adding experience and tenure as additional independent
variables changes the coefficient on education (bhat1), and both
added variables are significant at the 0.001 (0.1%) level.
ii. Bhat1 in model 1 (0.0598) is therefore underestimated
compared to bhat1 in model 2 (0.0778) and model 3 (0.0749).
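The direction of the bias can be illustrated with simulated data. This is only a sketch: the data below are generated, not taken from wage2.csv, and the assumed negative relationship between schooling and experience is an illustration of one way the underestimate could arise. It uses the OLS identity that the short-regression slope equals the long-regression slope plus the omitted variable's coefficient times the slope from regressing the omitted variable on the included one.

```r
set.seed(1)
n <- 5000
educ <- rnorm(n, mean = 13, sd = 2)
# Assumption for illustration: experience falls as schooling rises
exper <- 25 - 1.0 * educ + rnorm(n, sd = 3)
lwage <- 5.5 + 0.08 * educ + 0.02 * exper + rnorm(n, sd = 0.4)

b_short <- coef(lm(lwage ~ educ))[["educ"]]   # omits exper
long    <- coef(lm(lwage ~ educ + exper))     # includes exper
delta   <- coef(lm(exper ~ educ))[["educ"]]   # slope of exper on educ

# Omitted variable bias term: coefficient on exper times delta
bias <- long[["exper"]] * delta
print(c(short = b_short, long = long[["educ"]], bias = bias))
```

Because the omitted variable raises wages but is negatively related to education in this simulation, the bias term is negative and the short regression understates the return to education.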
3. Using the results of model 3, compute predicted monthly earnings at the
averages of educ, exper, and tenure.
a. > mean_educ<-mean(wage$educ)
> mean_exper<-mean(wage$exper)
> mean_tenure<-mean(wage$tenure)
> 5.496696+(0.074864*mean_educ)+(0.015328*mean_exper)+(0.013375*mean_tenure)
[1] 6.779003
> exp(6.779003)
[1] 879.1917
Predicted monthly earnings are $879.19.

4. Which model would you select for the return to education? Why?
a. I would use model 3. Bhat1 changes noticeably after adding
experience and tenure, and both variables are statistically significant
at the 5% level, which indicates that they remove omitted variable
bias present in model 1 and, to a lesser extent, model 2. R-squared is
also the highest for model 3, indicating that it explains the most
variation in log(wage), although R-squared never falls when
regressors are added.
QUESTION #3
1. log(wage)=B0+B1educ+B2educ*pareduc+B3tenure+u
Holding all else constant and changing only educ:
Δlog(wage)=B1*Δeduc+B2*pareduc*Δeduc
Δlog(wage)=Δeduc*(B1+B2pareduc)
Δlog(wage)/Δeduc = B1+B2pareduc
2. Estimate model 4
a. > model4<-lm(log(wage)~educ+educ:pareduc+tenure, data=wage)
> summary(model4)
Call:
lm(formula = log(wage) ~ educ + educ:pareduc + tenure, data = wage)
Residuals:
Min 1Q Median 3Q Max
-1.90863 -0.24051 0.02678 0.26726 1.28671
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 6.0315779 0.1030887 58.509 < 2e-16 ***
educ 0.0325911 0.0102024 3.194 0.001462 **
tenure 0.0146925 0.0028870 5.089 4.6e-07 ***
educ:pareduc 0.0007413 0.0002138 3.467 0.000557 ***
---
Signif. codes: 0 *** 0.001 ** 0.01 * 0.05 . 0.1 1
Residual standard error: 0.3892 on 718 degrees of freedom
(213 observations deleted due to missingness)
b. Estimated return to educ when pareduc=32
i. > 0.0325911+0.0007413*32
[1] 0.0563127
When pareduc=32, every additional year of education raises
monthly wages by about 5.63%.
c. Estimated return to educ when pareduc=24
i. > 0.0325911+0.0007413*24
[1] 0.0503823
When pareduc=24, every additional year of education raises
monthly wages by about 5.04%.
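The two calculations above are the same expression, B1 + B2*pareduc, evaluated at different values of pareduc. A minimal sketch, with a hypothetical helper name and the model 4 estimates as defaults:

```r
# Estimated return to education as a function of parental education,
# using the model 4 coefficients on educ and educ:pareduc
return_to_educ <- function(pareduc, b_educ = 0.0325911, b_inter = 0.0007413) {
  b_educ + b_inter * pareduc
}

returns <- return_to_educ(c(24, 32))
print(returns)

# Difference in the return between pareduc = 32 and pareduc = 24
print(diff(returns))
```

The difference between the two returns is 8 times the interaction coefficient, since pareduc changes by 8 years.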
3. Add pareduc into the model as a separate independent variable (Model 5)
and estimate it. Does the estimated return to education now depend on
parent education?
a. >model5<-lm(log(wage)~educ+pareduc+educ:pareduc+tenure,
data=wage)
> summary(model5)
Call:
lm(formula = log(wage) ~ educ + pareduc + educ:pareduc + tenure,
data = wage)
Residuals:
Min 1Q Median 3Q Max
-1.91704 -0.23329 0.02131 0.26594 1.29484
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 5.487671 0.368315 14.899 < 2e-16 ***
educ 0.071609 0.027338 2.619 0.009 **
pareduc 0.025999 0.016903 1.538 0.124
tenure 0.014891 0.002887 5.158 3.24e-07 ***
educ:pareduc -0.001094 0.001212 -0.902 0.367
---
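Signif. codes: 0 *** 0.001 ** 0.01 * 0.05 . 0.1 1
Residual standard error: 0.3888 on 717 degrees of freedom
(213 observations deleted due to missingness)
> coeftest(model5, vcov=vcovHC(model5, type="HC3"))
Estimate Std. Error t value Pr(>|t|)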
(Intercept) 5.4876712 0.3658890 14.9982 < 2.2e-16 ***
educ 0.0716086 0.0271934 2.6333 0.008638 **
pareduc 0.0259987 0.0172617 1.5062 0.132468
tenure 0.0148915 0.0028721 5.1849 2.815e-07 ***
educ:pareduc -0.0010938 0.0012420 -0.8807 0.378786
---
Signif. codes: 0 *** 0.001 ** 0.01 * 0.05 . 0.1 1
The estimated return to education does not depend on parental
education at a statistically significant level. Although model 5
shows a coefficient on educ:pareduc, the robust t-test shows it is
not statistically significant even at the 10% level: the p-value
(0.379) is higher than 0.05 and the absolute t-stat (0.88) is lower
than 1.96. The linear hypothesis test shown below adds a
qualification: pareduc and educ:pareduc are jointly significant
(F = 7.532, p = 0.0006), so parental education matters for wages
even though neither term is individually significant, as often
happens when regressors are highly correlated.
> library(car)
>linearHypothesis(model5,c("pareduc=0","educ:pareduc=0"),
vcov=vcovHC(model5, type="HC3"))
Linear hypothesis test
Hypothesis:
pareduc = 0
educ:pareduc = 0
Model 1: restricted model

Model 2: log(wage) ~ educ + pareduc + educ:pareduc + tenure
Note: Coefficient covariance matrix supplied.
Res.Df Df F Pr(>F)
1 719
2 717 2 7.532 0.0005791 ***
---
Signif. codes: 0 *** 0.001 ** 0.01 * 0.05 . 0.1 1
4. By omitting pareduc as a separate variable, model 4 may suffer from
omitted variable bias. Including it in model 5 addresses this bias and
changes the estimated effect of pareduc on the return to education.
This is a good example of what happens when you don't include all
the necessary terms in a regression: the coefficient on educ:pareduc
did not just move by a few decimals, it changed sign, from 0.0007
to -0.0011.
QUESTION #4
1. What is the approximate difference in monthly salary between blacks and
nonblacks? Is this difference statistically significant at the 5% significance level?
a. > model6<-lm(log(wage)~educ+exper+tenure+married+black+south+urban, data=wage)
> summary(model6)
Call:
lm(formula = log(wage) ~ educ + exper + tenure + married + black +
south + urban, data = wage)
Residuals:
Min 1Q Median 3Q Max
-1.98069 -0.21996 0.00707 0.24288 1.22822
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 5.395497 0.113225 47.653 < 2e-16 ***
educ 0.065431 0.006250 10.468 < 2e-16 ***
exper 0.014043 0.003185 4.409 1.16e-05 ***
tenure 0.011747 0.002453 4.789 1.95e-06 ***
married 0.199417 0.039050 5.107 3.98e-07 ***
black -0.188350 0.037667 -5.000 6.84e-07 ***
south -0.090904 0.026249 -3.463 0.000558 ***
urban 0.183912 0.026958 6.822 1.62e-11 ***
---
Signif. codes: 0 *** 0.001 ** 0.01 * 0.05 . 0.1 1
Residual standard error: 0.3655 on 927 degrees of freedom
According to the model, blacks earn approximately 18.8% less per
month than nonblacks, holding the other regressors constant.
According to the robust t-test below, this difference is statistically
significant at the 0.001 level (and therefore at the 0.05 level as well).
The p-value is minuscule, and the absolute value of the t-stat (5.09)
is higher than the critical value of 1.96.
> coeftest(model6, vcov=vcovHC(model6, type="HC3"))
Estimate Std. Error t value Pr(>|t|)
(Intercept) 5.3954970 0.1137966 47.4135 < 2.2e-16 ***
educ 0.0654307 0.0064452 10.1519 < 2.2e-16 ***
exper 0.0140430 0.0032611 4.3062 1.838e-05 ***
tenure 0.0117473 0.0025532 4.6010 4.789e-06 ***
married 0.1994171 0.0401269 4.9697 7.986e-07 ***
black -0.1883499 0.0370303 -5.0864 4.417e-07 ***
south -0.0909037 0.0275051 -3.3050 0.0009863 ***
urban 0.1839121 0.0272624 6.7460 2.673e-11 ***

---
Signif. codes: 0 *** 0.001 ** 0.01 * 0.05 . 0.1 1
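Reading the coefficient on black directly as a percentage is an approximation that is less accurate for larger coefficients. In a log-wage regression the exact percentage difference implied by a coefficient b is 100*(exp(b)-1). A short sketch, with a hypothetical helper name, using the model 6 estimates:

```r
# Exact percentage change in wage implied by a coefficient b
# in a regression with log(wage) as the dependent variable
pct_exact <- function(b) 100 * (exp(b) - 1)

b_black <- -0.188350   # coefficient on black in model 6
b_educ  <-  0.065431   # coefficient on educ in model 6

print(round(c(approx = 100 * b_black, exact = pct_exact(b_black)), 2))
print(round(c(approx = 100 * b_educ,  exact = pct_exact(b_educ)), 2))
```

For the small coefficient on educ the two numbers are close; for the coefficient on black the exact difference is somewhat smaller in magnitude than the approximate 18.8%.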
2. Add the interaction term educ:black. Is this coefficient statistically significant
at the 5% significance level?
a. > model7<-lm(log(wage)~educ+exper+tenure+married+black+south+urban+educ:black, data=wage)
> summary(model7)
Call:
lm(formula = log(wage) ~ educ + exper + tenure + married + black +
south + urban + educ:black, data = wage)
Residuals:
Min 1Q Median 3Q Max
-1.97782 -0.21832 0.00475 0.24136 1.23226
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 5.374817 0.114703 46.859 < 2e-16 ***
educ 0.067115 0.006428 10.442 < 2e-16 ***
exper 0.013826 0.003191 4.333 1.63e-05 ***
tenure 0.011787 0.002453 4.805 1.80e-06 ***
married 0.198908 0.039047 5.094 4.25e-07 ***
black 0.094809 0.255399 0.371 0.710561
south -0.089450 0.026277 -3.404 0.000692 ***
urban 0.183852 0.026955 6.821 1.63e-11 ***
educ:black -0.022624 0.020183 -1.121 0.262603
---
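Signif. codes: 0 *** 0.001 ** 0.01 * 0.05 . 0.1 1
Residual standard error: 0.3654 on 926 degrees of freedom
> coeftest(model7, vcov=vcovHC(model7, type="HC3"))
Estimate Std. Error t value Pr(>|t|)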
(Intercept) 5.3748170 0.1161704 46.2666 < 2.2e-16 ***
educ 0.0671153 0.0067213 9.9854 < 2.2e-16 ***
exper 0.0138259 0.0032655 4.2339 2.526e-05 ***
tenure 0.0117870 0.0025532 4.6165 4.453e-06 ***
married 0.1989077 0.0400959 4.9608 8.351e-07 ***

black 0.0948087 0.2163599 0.4382 0.661344
south -0.0894495 0.0274961 -3.2532 0.001183 **
urban 0.1838523 0.0272556 6.7455 2.684e-11 ***
educ:black -0.0226236 0.0169537 -1.3344 0.182390
---
Signif. codes: 0 *** 0.001 ** 0.01 * 0.05 . 0.1 1
After running a robust t-test on model 7, you see that the interaction
term educ:black is not statistically significant, even at the 10%
level. Because the p-value (0.182) is higher than 0.05 and the
absolute value of the t-stat (1.33) is smaller than the critical
value of 1.96, educ:black is not statistically significant at the 5%
level, so the return to education does not depend on race at a
statistically significant level.
3. Test the null hypothesis that the effects of all dummy variables are equal to
zero with a heteroskedasticity-robust F-test
a. > linearHypothesis(model6, c("exper=0", "tenure=0","married=0",
"black=0", "south=0", "urban=0"), vcov=vcovHC(model6, type =
"HC3"))
Linear hypothesis test
Hypothesis:
exper = 0
tenure = 0
married = 0
black = 0
south = 0
urban = 0
Model 1: restricted model

Model 2: log(wage) ~ educ + exper + tenure + married + black + south
+
urban
Note: Coefficient covariance matrix supplied.
Res.Df Df F Pr(>F)
1 933
2 927 6 37.058 < 2.2e-16 ***
---
Signif. codes: 0 *** 0.001 ** 0.01 * 0.05 . 0.1 1
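Because the p-value is smaller than the 0.05 alpha level and the
F-stat (37.058) is far higher than the 5% critical value of about 2.10
for six restrictions, we reject the null hypothesis that the
coefficients on exper, tenure, married, black, south, and urban are
jointly zero. Because exper and tenure are not dummy variables, a
test restricted to married, black, south, and urban would match the
question more closely. In effect, the tested variables have a
statistically significant joint effect.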
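For a joint F-test the relevant critical value comes from the F distribution rather than the normal value of 1.96 used for a single two-sided t-test. A minimal sketch using base R, with the number of restrictions (6), the residual degrees of freedom (927), and the F-statistic (37.058) taken from the linearHypothesis output above:

```r
n_restrictions <- 6        # coefficients set to zero under the null
resid_df       <- 927      # residual degrees of freedom of model 6
f_stat         <- 37.058   # robust F-statistic reported above

# 5% critical value of the F distribution with (6, 927) degrees of freedom
f_crit <- qf(0.95, df1 = n_restrictions, df2 = resid_df)

# Upper-tail p-value for the reported statistic
p_value <- pf(f_stat, df1 = n_restrictions, df2 = resid_df, lower.tail = FALSE)

print(round(f_crit, 2))
print(f_stat > f_crit)
print(p_value < 0.05)
```

The statistic exceeds the critical value by a wide margin, so the conclusion to reject the null does not change.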
4. Extend the original model so that the return to education depends on the
amount of work experience (model 8 with interaction term educ:exper).
Obtain thetahat1 and a 95% confidence interval for theta1.
a. Holding all else equal:
log(wage)=B0+B1educ+B2exper+B3educ*exper+B4tenure+B5married+B6black+B7south+B8urban+u
log(wage)=B0+B1educ+B2exper+B3educ*exper+u
Let theta1 be the return to education when exper=10: theta1=B1+10B3
Plug in B1=theta1-10B3
log(wage)=B0+(theta1-10B3)educ+B2exper+B3educ*exper+u
=B0+theta1educ+B2exper-10B3educ+B3educ*exper+u
=B0+theta1educ+B2exper+B3educ(exper-10)+u
Therefore, regress log(wage) on educ, exper, and educ(exper-10)
>model8<-lm(log(wage)~educ+exper+I(educ*(exper-10)),
data=wage)
> summary(model8)
Call:
lm(formula = log(wage) ~ educ + exper + I(educ * (exper - 10)),
data = wage)
Residuals:
Min 1Q Median 3Q Max
-1.88558 -0.24553 0.03558 0.26171 1.28836
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 5.949455 0.240826 24.704 <2e-16 ***
educ 0.076080 0.006615 11.501 <2e-16 ***
exper -0.021496 0.019978 -1.076 0.2822
I(educ * (exper - 10)) 0.003203 0.001529 2.095 0.0365 *
---
Signif. codes: 0 *** 0.001 ** 0.01 * 0.05 . 0.1 1
Residual standard error: 0.3923 on 931 degrees of freedom
Theta hat is the estimated return to education when exper=10, or
0.0761. The code for the confidence interval is shown below; the 95%
confidence interval for theta1 is [0.063, 0.089]. Since the interval
excludes zero, the estimate is statistically significant at the 5% level.
> lower_bound<-0.076080-1.96*0.006615
> upper_bound<-0.076080+1.96*0.006615
> print(c(lower_bound, upper_bound))
[1] 0.0631146 0.0890454
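The reason for regressing on educ*(exper-10) is that the coefficient on educ then equals B1+10*B3 directly, so its standard error comes straight from the regression output. A sketch on simulated data (generated here for illustration, not wage2.csv) that checks this equality:

```r
set.seed(42)
n <- 500
educ  <- rnorm(n, mean = 13, sd = 2)
exper <- rnorm(n, mean = 11, sd = 4)
lwage <- 5.5 + 0.05 * educ + 0.01 * exper + 0.002 * educ * exper +
  rnorm(n, sd = 0.4)

# Original parameterization: interaction educ * exper
orig  <- lm(lwage ~ educ + exper + I(educ * exper))
# Reparameterized: interaction educ * (exper - 10)
repar <- lm(lwage ~ educ + exper + I(educ * (exper - 10)))

# theta1 = B1 + 10 * B3, computed from the original fit
# (coefficient order: intercept, educ, exper, interaction)
theta1_from_orig <- coef(orig)[["educ"]] + 10 * coef(orig)[[4]]
# theta1 read directly as the coefficient on educ
theta1_direct <- coef(repar)[["educ"]]
se_theta1 <- summary(repar)$coefficients["educ", "Std. Error"]

print(c(from_orig = theta1_from_orig, direct = theta1_direct, se = se_theta1))
```

Both routes give the same point estimate; the reparameterized regression also supplies the standard error needed for the confidence interval.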
QUESTION #5
1. Estimate the model with only 1988 data. What are the estimated effects of
education and union membership? Are they significant at the 5% level?
a. > nls_panel<-read.csv("nls_panel.csv", header=TRUE)
> nls88<-subset(nls_panel, year==88)
> model88<-lm(log(wage)~educ+exper+I(exper^2)+tenure+I(tenure^2)+black+south+union, data=nls88)
> summary(model88)
Call:
lm(formula = log(wage) ~ educ + exper + I(exper^2) + tenure +
I(tenure^2) + black + south + union, data = nls88)
Residuals:
Min 1Q Median 3Q Max
-1.55873 -0.23842 -0.00052 0.23490 1.82679
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 0.2237350 0.2240258 0.999 0.318281
educ 0.0776627 0.0063978 12.139 < 2e-16 ***
exper 0.0787905 0.0307279 2.564 0.010549 *
I(exper^2) -0.0016709 0.0010510 -1.590 0.112327
tenure 0.0076095 0.0098219 0.775 0.438745
I(tenure^2) -0.0002872 0.0005024 -0.572 0.567701
black -0.1310958 0.0372783 -3.517 0.000465 ***
south -0.1370122 0.0336513 -4.072 5.2e-05 ***
union 0.1300245 0.0356196 3.650 0.000281 ***
---
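Signif. codes: 0 *** 0.001 ** 0.01 * 0.05 . 0.1 1
Residual standard error: 0.4044 on 707 degrees of freedom
Multiple R-squared: 0.3132, Adjusted R-squared: 0.3054
> coeftest(model88, vcov=vcovHC(model88, type="HC3"))
Estimate Std. Error t value Pr(>|t|)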
(Intercept) 0.22373504 0.20407101 1.0964 0.2732951
educ 0.07766268 0.00683453 11.3633 < 2.2e-16 ***
exper 0.07879051 0.02890784 2.7256 0.0065778 **
I(exper^2) -0.00167093 0.00102090 -1.6367 0.1021328
tenure 0.00760953 0.01128314 0.6744 0.5002670
I(tenure^2) -0.00028724 0.00055969 -0.5132 0.6079599
black -0.13109578 0.03383677 -3.8744 0.0001168 ***
south -0.13701219 0.03338922 -4.1035 4.544e-05 ***
union 0.13002448 0.03474406 3.7424 0.0001971 ***
---
Signif. codes: 0 *** 0.001 ** 0.01 * 0.05 . 0.1 1
Education: In 1988, each additional year of education led to an
increase of about 7.8% in hourly wage. This effect is statistically
significant at the 0.05 level, because its p-value is smaller than 0.05
and its t-stat is higher than the 1.96 critical value.
Union membership: In 1988, membership in a union led to an increase
of about 13% in hourly wage. This effect is statistically significant at
the 0.05 level, because its p-value is smaller than 0.05 and its t-stat
is higher than the 1.96 critical value.
2. Estimate the model above with 1987 data. What are the estimated effects of
education and union membership? Are they similar to the results of 1988
data? Explain.
a. > nls87<-subset(nls_panel, year==87)
> model87<-lm(log(wage)~educ+exper+I(exper^2)+tenure+I(tenure^2)+black+south+union, data=nls87)
> summary(model87)
Call:
lm(formula = log(wage) ~ educ + exper + I(exper^2) + tenure +
I(tenure^2) + black + south + union, data = nls87)
Residuals:
Min 1Q Median 3Q Max
-1.52585 -0.25020 -0.01483 0.21843 2.61713
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 0.2242087 0.1890339 1.186 0.23599
educ 0.0759663 0.0062708 12.114 < 2e-16 ***
exper 0.0854817 0.0280038 3.053 0.00235 **
I(exper^2) -0.0020485 0.0010488 -1.953 0.05119 .
tenure 0.0068705 0.0097102 0.708 0.47945
I(tenure^2) -0.0001893 0.0005442 -0.348 0.72801
black -0.1574320 0.0366493 -4.296 1.99e-05 ***
south -0.1014177 0.0328986 -3.083 0.00213 **
union 0.1662697 0.0352498 4.717 2.89e-06 ***
---
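Signif. codes: 0 *** 0.001 ** 0.01 * 0.05 . 0.1 1
Residual standard error: 0.3956 on 707 degrees of freedom
> coeftest(model87, vcov=vcovHC(model87, type="HC3"))
Estimate Std. Error t value Pr(>|t|)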
(Intercept) 0.22420875 0.16043638 1.3975 0.1627037
educ 0.07596626 0.00764732 9.9337 < 2.2e-16 ***
exper 0.08548166 0.02559507 3.3398 0.0008825 ***
I(exper^2) -0.00204848 0.00099797 -2.0526 0.0404751 *
tenure 0.00687051 0.01026421 0.6694 0.5034808
I(tenure^2) -0.00018933 0.00056956 -0.3324 0.7396668
black -0.15743205 0.03350536 -4.6987 3.147e-06 ***
south -0.10141769 0.03165266 -3.2041 0.0014158 **
union 0.16626968 0.03784696 4.3932 1.288e-05 ***
---
Signif. codes: 0 *** 0.001 ** 0.01 * 0.05 . 0.1 1
Education: In 1987, an additional year of education yielded an
increase of about 7.6% in hourly wage. This effect is statistically
significant at the 0.05 level, as its p-value is smaller than 0.05 and
its t-value is higher than the 1.96 critical value.
Union membership: In 1987, membership in a union yielded an
increase of about 16.63% in hourly wage. This effect is statistically
significant at the 0.05 level, as its p-value is smaller than 0.05 and
its t-value is higher than the 1.96 critical value.
Similarities to 88: The effect of education was very similar to the
1988 effect, with 1987 about 0.2 percentage points lower (7.60%
versus 7.77%). The union effect was 3.62 percentage points higher in
1987 (16.63% versus 13.00%). Neither change is large, and the
similarity can be attributed to the fact that the two samples are
only one year apart.
3. Using the original dataset, estimate the pooled OLS model. What's the
estimated effect of being in a union? Using robust standard errors, did you
find any insignificant variables at the 5% significance level?
a. > library(plm)
> pooledmodel<-plm(log(wage)~educ+exper+I(exper^2)+tenure+I(tenure^2)+black+south+union, data=nls.plm, model="pooling")
> summary(pooledmodel)
Pooling Model
Call:
plm(formula = log(wage) ~ educ + exper + I(exper^2) + tenure +
I(tenure^2) + black + south + union, data = nls.plm, model =
"pooling")
Balanced Panel: n=716, T=5, N=3580
Residuals :
Min. 1st Qu. Median 3rd Qu. Max.
-1.70000 -0.23300 -0.00438 0.21500 2.58000
Coefficients :
Estimate Std. Error t-value Pr(>|t|)
(Intercept) 0.47660008 0.05615585 8.4871 < 2.2e-16 ***
educ 0.07144879 0.00268939 26.5669 < 2.2e-16 ***
exper 0.05568504 0.00860716 6.4696 1.116e-10 ***
I(exper^2) -0.00114754 0.00036129 -3.1762 0.0015046 **
tenure 0.01496002 0.00440728 3.3944 0.0006953 ***
I(tenure^2) -0.00048604 0.00025770 -1.8860 0.0593697 .
black -0.11671387 0.01571590 -7.4265 1.387e-13 ***
south -0.10600256 0.01420083 -7.4645 1.045e-13 ***
union 0.13224321 0.01496161 8.8388 < 2.2e-16 ***
---
Signif. codes: 0 *** 0.001 ** 0.01 * 0.05 . 0.1 1
Total Sum of Squares: 772.56

Residual Sum of Squares: 521.03
R-Squared: 0.32559
Adj. R-Squared: 0.32408
> coeftest(pooledmodel, vcov=vcovHC(pooledmodel, type="HC3"))
Estimate Std. Error t value Pr(>|t|)
(Intercept) 0.47660008 0.08480039 5.6203 2.053e-08 ***
educ 0.07144879 0.00550493 12.9790 < 2.2e-16 ***
exper 0.05568504 0.01139134 4.8884 1.062e-06 ***
I(exper^2) -0.00114754 0.00049662 -2.3107 0.02091 *
tenure 0.01496002 0.00714088 2.0950 0.03624 *
I(tenure^2) -0.00048604 0.00041159 -1.1809 0.23772
black -0.11671387 0.02816113 -4.1445 3.485e-05 ***
south -0.10600256 0.02708239 -3.9141 9.244e-05 ***
union 0.13224321 0.02710551 4.8788 1.114e-06 ***
---
Signif. codes: 0 *** 0.001 ** 0.01 * 0.05 . 0.1 1
Union member: Being a member of a union is associated with a 13.22%
higher hourly wage. Based on the robust t-test, I(tenure^2) is the only
variable that is statistically insignificant at the 5% level. All other
variables are significant at the 5% level or better.
4. Estimate the individual fixed effects model. What is the sample size? Can you
find the coefficient on educ? Why was it dropped? Explain.
a. > fixed_id<-plm(log(wage)~educ+exper+I(exper^2)+tenure+I(tenure^2)+black+south+union, data=nls.plm, model="within")
> coeftest(fixed_id, vcov=vcovHC(fixed_id, type="HC3", cluster="group"))
Estimate Std. Error t value Pr(>|t|)
exper 0.04108314 0.00825483 4.9769 6.846e-07 ***
I(exper^2) -0.00040905 0.00033058 -1.2374 0.2160467
tenure 0.01390895 0.00422300 3.2936 0.0010011 **
I(tenure^2) -0.00089623 0.00024992 -3.5860 0.0003414 ***
south -0.01632239 0.05921706 -0.2756 0.7828471
union 0.06369724 0.01690438 3.7681 0.0001678 ***
---
Signif. codes: 0 *** 0.001 ** 0.01 * 0.05 . 0.1 1
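R dropped the coefficient on educ automatically when the fixed
effects model was estimated, which is why you're unable to find a
coefficient for educ in the output. The within transformation
subtracts each individual's mean from every variable, so a variable
that does not change over time for any individual, such as educ
(and black) in this sample, becomes zero for every observation and
is perfectly collinear with the individual fixed effects.
Sample size: the R command below shows N=3580 observations
(716 individuals over 5 years).
> pdim(fixed_id)
Balanced Panel: n=716, T=5, N=3580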
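Why a time-invariant variable cannot be estimated in an individual fixed effects model can be seen on a toy panel. The data below are invented for illustration, and demean is a hypothetical helper that applies the within transformation with base R's ave().

```r
# Toy balanced panel: 3 individuals observed for 4 years each
panel <- data.frame(
  id    = rep(1:3, each = 4),
  educ  = rep(c(12, 16, 14), each = 4),                    # fixed within person
  exper = rep(c(5, 8, 3), each = 4) + rep(0:3, times = 3)  # grows each year
)

# Within transformation: subtract each individual's own mean
demean <- function(x, g) x - ave(x, g)

panel$educ_within  <- demean(panel$educ, panel$id)
panel$exper_within <- demean(panel$exper, panel$id)

print(range(panel$educ_within))      # no variation left in educ
print(var(panel$exper_within) > 0)   # exper still varies within person
```

After demeaning, educ is identically zero, so there is no variation left from which to estimate its coefficient, while exper keeps its within-person variation.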
5. What is the estimated effect of union membership from the fixed_id model?
Compare it with the result from the pooled OLS model. Which model is
statistically reliable between the pooled OLS and fixed_id model?
a. Union membership in the fixed_id model is shown to yield a
6.37% increase in hourly wage. This is much smaller than the
estimated effect of union membership in the pooled OLS model
(13.22%).
> pFtest(fixed_id, pooledmodel)
F test for individual effects
data: log(wage) ~ educ + exper + I(exper^2) + tenure + I(tenure^2) +

...
F = 15.188, df1 = 713, df2 = 2858, p-value < 2.2e-16
alternative hypothesis: significant effects
The results of this pFtest show that the null hypothesis (no
significant individual fixed effects) is rejected in favor of the
alternative hypothesis of significant individual fixed effects.
Statistically, this test shows that you should use the individual
fixed effects model.
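The degrees of freedom reported by the test can be checked by hand from the panel dimensions. This is a sketch under the assumption that the test compares the number of estimated parameters in the two models: 8 slopes plus an intercept in the pooled model, and 6 slopes plus 716 individual effects in the within model (educ and black are dropped).

```r
n_ind <- 716              # individuals
t_per <- 5                # years per individual
n_obs <- n_ind * t_per    # total observations

k_pooled <- 8   # slope coefficients in the pooled model
k_within <- 6   # slope coefficients left in the within model

# Extra parameters in the fixed effects model relative to pooled OLS
df1 <- (n_ind + k_within) - (1 + k_pooled)
# Residual degrees of freedom of the fixed effects model
df2 <- n_obs - n_ind - k_within

# Upper-tail p-value for the reported F-statistic
p_value <- pf(15.188, df1, df2, lower.tail = FALSE)

print(c(df1 = df1, df2 = df2))
print(p_value < 0.05)
```

The computed values match the df1 = 713 and df2 = 2858 shown in the test output.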
During this examination, all work has been my own. I give my word that I have not
resorted to any ethically questionable means of improving my grade or anyone else's
on this examination and that I have not discussed this exam with anyone other than
my instructor.
