Econometrics 206-1

Exam III: 10.10 AM -11.40 AM, 24 April 2017

Aditya Khemka

In answering these below, paste the Stata output only when it is asked. When
pasting output, use the copy as picture option. When testing a hypothesis, be sure
to mention the distribution of the test statistic, its degrees of freedom, the level of
significance and the associated critical value. DO NOT USE THE STATA test
COMMAND.

It would be easiest if you inserted your answer between the questions below and
returned this document. Rename the document as 'your name.docx' and upload it
on LMS.

You have to do this exam by yourself. You are allowed to consult the textbook and
your notes. You are NOT allowed to consult anybody whether by speaking, by text
messages or email or any other means. Violations will attract penalties as per
Ashoka policy.

1. (a) Regress log of wages on a constant and the female dummy. Paste
output here.

(b) Interpret the coefficient on the female dummy.

The coefficient on the female dummy is -0.8553. Because the regression contains
only a constant and the female dummy, this coefficient is the difference in the
average log wage between females (female = 1) and males (female = 0). Read as an
approximation, it says that women earn about 85.5% less than men, but for a
coefficient this large the approximation is poor: the exact difference is
100*[exp(-0.8553) - 1], or about 57.5% lower wages for women.
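As a quick check on how a dummy coefficient in a log-wage regression translates into a percentage difference, the conversion can be sketched in Python. This is a minimal sketch: the function name `pct_effect_from_log_coef` is made up for illustration, and only the value -0.8553 comes from the answer above.

```python
import math

def pct_effect_from_log_coef(beta):
    """Exact % difference in the level of y implied by a dummy
    coefficient `beta` in a regression of log(y) on the dummy."""
    return 100.0 * (math.exp(beta) - 1.0)

beta_female = -0.8553                      # coefficient reported in the answer
approx_pct = 100.0 * beta_female           # usual "100 * beta" approximation
exact_pct = pct_effect_from_log_coef(beta_female)

print(f"approximate: {approx_pct:.1f}%")
print(f"exact:       {exact_pct:.1f}%")
```

The two figures differ by a wide margin here (roughly -85.5% against -57.5%), which is why the approximation is usually reserved for small coefficients.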
(c) Test the null hypothesis that the coefficient on female dummy is -0.5 against
the alternative that the coefficient on female dummy is less than -0.5. Show your
workings.
H0: Deltafem = -0.5
HA: Deltafem < -0.5

The t statistic = [Deltafem (from regression) - Deltafem (if H0 is true)]/s.e.(female)
= (-0.855 + 0.5)/0.063
= -5.63
The critical value (c) of the t distribution with df = 998 at the 5% significance
level for the one-sided (left-tail) test is approximately -1.646.
t statistic < c
Therefore the null is rejected at the 5% significance level (and also at the 1%
significance level, where the critical value is approximately -2.33). The
evidence against the null is very strong.
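The arithmetic of this one-sided t test can be reproduced with a short Python sketch. It assumes the standard normal distribution as a stand-in for the t distribution with 998 degrees of freedom (with so many degrees of freedom the t critical values are only slightly larger in absolute value); the helper name `t_stat` is illustrative.

```python
from statistics import NormalDist

def t_stat(estimate, hypothesized, std_err):
    """t statistic for H0: coefficient = hypothesized."""
    return (estimate - hypothesized) / std_err

# Figures taken from the answer above.
t = t_stat(estimate=-0.855, hypothesized=-0.5, std_err=0.063)

# ASSUMPTION: normal approximation to the t(998) distribution.
crit_5 = NormalDist().inv_cdf(0.05)   # left-tail 5% critical value
crit_1 = NormalDist().inv_cdf(0.01)   # left-tail 1% critical value

# Left-tailed test: reject H0 when t falls below the critical value.
reject_5 = t < crit_5
reject_1 = t < crit_1

print(f"t = {t:.2f}, c(5%) = {crit_5:.3f}, c(1%) = {crit_1:.3f}")
print(f"reject at 5%: {reject_5}, reject at 1%: {reject_1}")
```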

2. (a) Regress log of wages on a constant, the female dummy, age of the
individual and the square of age. Paste your output here.

(b) Controlling for age and the square of age does not seem to substantially
change the coefficient of the female dummy. Why is that so?

The coefficient on the female dummy barely changes because the omitted variable
bias is small. Write Beta(fem)* = Beta(fem) + Delta*Beta(age), where Beta(fem)* is
the coefficient from the regression that omits age, Beta(fem) and Beta(age) come
from the regression that includes it, and Delta is the slope from regressing age
on the female dummy (Delta has the same sign as Corr(age, fem) and is zero when
the correlation is zero). The correlation is very low (0.02) and Beta(age) is
also small (0.06); the product of these two figures is only about 0.0012. The
bias term is therefore small, and the coefficient on the female dummy changes
very little when age is added. This also shows that, holding age constant, the
wages of women remain significantly lower than those of men.
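For reference, the textbook omitted-variable relationship behind this argument can be written out as below. This is a sketch for a single omitted regressor (age); with the square of age also omitted, a second analogous term would be added. The tilde and hat notation is introduced here for illustration and does not appear in the answer.

```latex
% \tilde{\beta}_{fem}: coefficient on female when age is omitted
% \hat{\beta}_{fem}, \hat{\beta}_{age}: coefficients when age is included
% \tilde{\delta}: slope from regressing age on the female dummy
\tilde{\beta}_{fem} = \hat{\beta}_{fem} + \hat{\beta}_{age}\,\tilde{\delta},
\qquad
\tilde{\delta}
  = \frac{\widehat{\mathrm{Cov}}(age, fem)}{\widehat{\mathrm{Var}}(fem)}
  = \widehat{\mathrm{Corr}}(age, fem)\,
    \frac{\widehat{\mathrm{sd}}(age)}{\widehat{\mathrm{sd}}(fem)}
```

The bias term vanishes when either the coefficient on age or the correlation between age and the female dummy is zero.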
3. (a) Regress log of wages on a constant, the female dummy, age of the
individual the square of age and the social group dummies for scheduled
caste, for scheduled tribe and for other backward caste. Note the omitted
category is the general castes (or forward castes). Paste your output here.

(b) Test the null hypothesis that none of the social group dummies matter, i.e.,
controlling for sex, age and square of age, the average of log wages is the same
for all categories: scheduled castes, scheduled tribes, other backward castes and
the general (forward) castes. Do NOT use the Stata test command.

We will make use of the F-test to impose these restrictions on the model.

H0: Beta(scd) = 0, Beta(std) = 0, Beta(obc) = 0


HA: H0 is not true

The restricted model is the same as the regression modeled in the previous
question.

F statistic = ([SSR(r) - SSR(ur)]/j)/(SSR(ur)/(n-k-1))
= ((707.43 - 679.66)/3)/(679.66/993)
= 13.52
The critical value (c) for the F test at the 5% significance level with df (3, 993) is
~2.61.
F > c (5%)
Incidentally, F > c (1%) as well, since the 1% critical value is ~3.80
Hence, there is enough evidence at the 5% and the 1% significance level to reject
the null. Social group dummies do matter in explaining wages.
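The F statistic above can be recomputed from the two sums of squared residuals with a few lines of Python. This is a minimal sketch; the function name `f_stat` and its argument names are illustrative, while the numbers are the ones reported in the answer.

```python
def f_stat(ssr_r, ssr_ur, q, df_ur):
    """F statistic for q exclusion restrictions.

    ssr_r  : sum of squared residuals, restricted model
    ssr_ur : sum of squared residuals, unrestricted model
    q      : number of restrictions (numerator df)
    df_ur  : n - k - 1 of the unrestricted model (denominator df)
    """
    return ((ssr_r - ssr_ur) / q) / (ssr_ur / df_ur)

# Figures from the answer: three social group dummies are dropped.
F = f_stat(ssr_r=707.43, ssr_ur=679.66, q=3, df_ur=993)

print(f"F = {F:.2f}")
```

The statistic is then compared with the critical value of the F distribution with (3, 993) degrees of freedom taken from a table.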
(c) Test the null hypothesis that relative to the general (forward) castes,
scheduled castes and other backward castes suffer the same extent of
discrimination. If this requires new regressions, paste the output in your
answer.

H0: Beta(scd) = Beta(obc) => Beta(scd) - Beta(obc) = 0
HA: Beta(scd) - Beta(obc) not equal to 0
Let Theta = Beta(scd) - Beta(obc)
Then H0: Theta = 0
Define new variable caste = scd + obc
Re-run the regression with caste in place of obc. Theta is now the coefficient
on scd.

The t statistic on scd (the estimate of Theta) is -0.90 and the p-value is 0.367.
The null cannot be rejected at the 5% significance level, as |t| = 0.90 is below
the two-sided critical value of 1.96 (equivalently, p > 0.05). Hence, there is
not enough evidence to reject the null that scheduled castes and other backward
castes suffer the same extent of discrimination.
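The substitution behind this test can be spelled out in one short derivation. It is a sketch using the variable names from the answer (scd, obc, caste = scd + obc) and shows why the coefficient on scd in the re-run regression equals Theta.

```latex
% Substitute \beta_{scd} = \theta + \beta_{obc} into the two caste terms.
\begin{aligned}
\beta_{scd}\,scd + \beta_{obc}\,obc
  &= (\theta + \beta_{obc})\,scd + \beta_{obc}\,obc \\
  &= \theta\,scd + \beta_{obc}\,(scd + obc) \\
  &= \theta\,scd + \beta_{obc}\,caste
\end{aligned}
```

The standard error reported for scd in that regression is therefore the standard error of the estimated difference, which is what the t test needs.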
4. (a) Regress log of wages on a constant, the female dummy, age of the
individual the square of age, the social group dummies for scheduled
caste, for scheduled tribe and for other backward caste, and the education
dummies for illiterate, literate, primary, secondary, and higher secondary.
Paste the output here.

(b) Compare the above regression with the regression in question 3 (without the
education dummies). Does the inclusion of education dummies alter the
discrimination against women, scheduled castes, scheduled tribes and other
backward castes? Why?

The inclusion of education dummies alters the measured discrimination against
women, SC, ST and OBC by greatly reducing the economic significance of their
coefficients. It can be expected that, at the same level of education, the
discrimination against women relative to men, or against these castes relative
to the general castes, is smaller. Hence, the regression without education
dummies suffers from omitted variable bias. The coefficients on the education
dummies are quite large, and the education dummies are strongly correlated with
the female and social group dummies. Therefore, the bias Beta(x1)* - Beta(x1),
which depends on the product of Beta(x2) and the association between x1 and x2
(where Beta(x1)* is the coefficient on x1 when the education dummies x2 are
omitted), is quite large. This is why the estimated discrimination against
women, SC, ST and OBC changes.
5. (a) To the explanatory variables in the regression in Qn 4(a), add land owned
(LandO) and land possessed (LandP) and re-run the regression. DO NOT paste
the output.

(b) Is either of the land variables individually significant at the 5 or 10% level?

No. Their p-values (0.208 and 0.494) both exceed 0.10, and therefore 0.05 as well,
so neither is individually significant at the 5% or the 10% level.

(c) Now drop land owned (LandO) and re-run the regression. Is the included
land variable significant at the 5 or 10% level?

The included land variable (LandP) has a t statistic of 2.20 and a p-value of 0.028.
Hence, it is significant at both the 5% and the 10% level, as p < 0.05 and p < 0.1.

(d) Explain the pattern of results observed in (b) and (c).

The pattern of results can be explained by the statistical and economic
significance of the land variables in the two cases. LandO and LandP are highly
correlated (0.9). When both are included, this multicollinearity inflates the
standard errors of both coefficients, so neither is individually significant.
When LandO is dropped from the regression, the coefficient on LandP also absorbs
the effect of LandO through the bias term, so its economic significance
increases, and its standard error falls. Both changes raise its t statistic, and
it becomes statistically significant. The high correlation between the two land
variables therefore explains the pattern of results in (b) and (c).
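The effect of a 0.9 correlation on the precision of the estimates can be illustrated with the variance inflation factor, 1/(1 - R^2). The Python sketch below assumes, for simplicity, that the R-squared from regressing one land variable on the other regressors is just the squared correlation between LandO and LandP; in the full regression the other controls could only raise it. The function name `vif` is illustrative.

```python
import math

def vif(r_squared):
    """Variance inflation factor for a regressor whose R-squared
    on the remaining regressors is `r_squared`."""
    return 1.0 / (1.0 - r_squared)

corr_land = 0.9                  # correlation between LandO and LandP (from the answer)
r2 = corr_land ** 2              # ASSUMPTION: R^2 taken as the squared pairwise correlation
variance_inflation = vif(r2)
se_inflation = math.sqrt(variance_inflation)

print(f"variance inflated by a factor of {variance_inflation:.2f}")
print(f"standard error inflated by a factor of {se_inflation:.2f}")
```

Under this assumption the standard error on each land variable is more than twice as large as it would be without the other one in the regression, which is consistent with both being individually insignificant in (b).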
