Aditya Khemka
In answering these below, paste the Stata output only when it is asked. When
pasting output, use the copy as picture option. When testing a hypothesis, be sure
to mention the distribution of the test statistic, its degrees of freedom, the level of
significance and the associated critical value. DO NOT USE THE STATA test
COMMAND.
It would be easiest if you inserted your answer between the questions below and
returned this document. Rename the document as 'your name.docx' and upload it
on LMS.
You have to do this exam by yourself. You are allowed to consult the textbook and
your notes. You are NOT allowed to consult anybody whether by speaking, by text
messages or email or any other means. Violations will attract penalties as per
Ashoka policy.
1. (a) Regress log of wages on a constant and the female dummy. Paste
output here.
2. (a) Regress log of wages on a constant, the female dummy, age of the
individual and the square of age. Paste your output here.
(b) Controlling for age and the square of age does not seem to substantially
change the coefficient of the female dummy. Why is that so?
The coefficient of the female dummy barely changes because of how omitted
variable bias works: Beta(fem)* = Beta(fem) + Corr(age,fem)*Beta(age), where
Beta(fem)* is the coefficient from the regression that includes the previously
omitted variable. Both Corr(age,fem) (0.02) and Beta(age) (0.06) are very low,
so together they produce only a very small change (~0.0012) relative to the
coefficient from the regression where age was omitted. The result also shows
that, holding age constant, females earn significantly lower wages than their
male counterparts.
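The arithmetic behind the ~0.0012 figure can be checked with a short Python sketch. It only multiplies the two numbers quoted in the answer (0.02 and 0.06). The function name ovb_shift is a hypothetical helper, and the product form is the approximation used in this answer, not Stata output.

```python
def ovb_shift(association: float, beta_omitted: float) -> float:
    """Approximate change in a coefficient when an omitted regressor is added.

    Uses the product form quoted in the answer:
    association(included, omitted) * coefficient on the omitted variable.
    """
    return association * beta_omitted


# Values quoted in the answer: Corr(age, fem) = 0.02 and Beta(age) = 0.06.
shift = ovb_shift(0.02, 0.06)
print(f"{shift:.4f}")  # 0.0012
```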
3. (a) Regress log of wages on a constant, the female dummy, age of the
individual, the square of age and the social group dummies for scheduled
caste, for scheduled tribe and for other backward caste. Note the omitted
category is the general castes (or forward castes). Paste your output here.
(b) Test the null hypothesis that none of the social group dummies matter, i.e.,
controlling for sex, age and square of age, the average of log wages is the same
for all categories: scheduled castes, scheduled tribes, other backward castes and
the general (forward) castes. Do NOT use the Stata test command.
We use an F-test of the joint restrictions that the coefficients on all three
social group dummies are zero. The restricted model is the regression from the
previous question (Question 2(a)), which omits the social group dummies.
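Since the Stata test command is not allowed, the F statistic has to be built from the residual sums of squares (RSS) of the restricted and unrestricted regressions. The Python sketch below shows that calculation under stated assumptions: q = 3 restrictions (the three social group dummies) and k = 7 unrestricted parameters (constant, female, age, age squared and the three dummies) follow from the questions, while the sample size and both RSS values are made-up placeholders, not results from the data.

```python
def f_statistic(rss_restricted, rss_unrestricted, n_obs, k_unrestricted, q):
    """F = ((RSS_r - RSS_u) / q) / (RSS_u / (n - k)).

    Under the null it follows an F distribution with (q, n - k) degrees of freedom.
    """
    df_denominator = n_obs - k_unrestricted
    numerator = (rss_restricted - rss_unrestricted) / q
    denominator = rss_unrestricted / df_denominator
    return numerator / denominator, (q, df_denominator)


# Hypothetical placeholder inputs (NOT from the exam data set).
f_value, dof = f_statistic(
    rss_restricted=1050.0,    # RSS without the social group dummies (assumed)
    rss_unrestricted=1000.0,  # RSS with the social group dummies (assumed)
    n_obs=1007,               # sample size (assumed)
    k_unrestricted=7,         # constant, female, age, age^2, SC, ST, OBC
    q=3,                      # three dummies set to zero under the null
)
print(f_value, dof)
```

The computed value is then compared with the critical value of the F distribution with (q, n - k) degrees of freedom at the chosen significance level, and the null is rejected if the statistic exceeds it.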
The t statistic of Theta(scd) is -0.90 and the p-value is 0.367. The null cannot
be rejected at the 5% significance level, because |t| = 0.90 is below the
two-sided critical value of 1.96. Hence, there is not enough evidence to reject
the null that scheduled castes and other backward classes suffer the same extent
of discrimination.
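The decision rule used here can be sketched in Python. The sketch assumes the sample is large enough for the standard normal to approximate the t distribution, which is what a critical value of 1.96 implies. The function name two_sided_test is hypothetical, and the only input taken from the answer is the t statistic of -0.90.

```python
import math


def two_sided_test(t_stat: float, critical_value: float = 1.96):
    """Return (p_value, reject) for a two-sided test using the normal approximation."""
    # P(|Z| > |t|) for a standard normal Z equals erfc(|t| / sqrt(2)).
    p_value = math.erfc(abs(t_stat) / math.sqrt(2))
    reject = abs(t_stat) > critical_value
    return p_value, reject


p_value, reject = two_sided_test(-0.90)
print(round(p_value, 2), reject)  # 0.37 False
```

The approximate p-value is close to the 0.367 quoted in the answer. Small differences can come from rounding the t statistic and from using the normal rather than the t distribution.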
4. (a) Regress log of wages on a constant, the female dummy, age of the
individual, the square of age, the social group dummies for scheduled
caste, for scheduled tribe and for other backward caste, and the education
dummies for illiterate, literate, primary, secondary, and higher secondary.
Paste the output here.
(b) Compare the above regression with the regression in question 3 (without the
education dummies). Does the inclusion of education dummies alter the
discrimination against women, scheduled castes, scheduled tribes and other
backward castes? Why?
The inclusion of education dummies alters the measured discrimination against
women, SC, ST and OBC by greatly reducing the economic significance of those
coefficients. It can be expected that, at the same level of education, the
discrimination against women relative to their male counterparts, or against
these castes relative to the general caste, is smaller. Hence the regression
without education dummies suffers from omitted variable bias. The betas of the
education dummies are quite large, and so is the correlation between the
education dummies and the other variables mentioned here. Therefore the bias
Beta(x1)* - Beta(x1) = Corr(x1,x2)Beta(x2), where Beta(x1)* is the beta on x1
after the inclusion of the omitted variables, is quite large. This alters the
estimated discrimination against women, SC, ST and OBC.
5. (a) To the explanatory variables in the regression in Qn 4(a), add land owned
(LandO) and land possessed (LandP) and re-run the regression. DO NOT paste
the output.
(b) Is either of the land variables individually significant at the 5 or 10% level?
No. Their p-values (0.208 and 0.494) are both greater than 0.1, and therefore
also greater than 0.05, so neither variable is individually significant at the
5% or the 10% level.
(c) Now drop land owned (LandO) and re-run the regression. Is the included
land variable significant at the 5 or 10% level?
The included land variable (LandP) has a t statistic of 2.20 and a p-value of
0.028. Hence, it is significant at both the 5% and the 10% level, as p < 0.05.
The pattern of results can be explained by the statistical and economic
significance of the land variables in the two cases. LandO and LandP are highly
correlated (0.9), so when LandO is dropped from the regression, the coefficient
on LandP gains economic significance through the bias effect of the omitted
LandO. As LandP's economic significance increases, its t-statistic rises and it
becomes statistically significant. This high correlation between the two land
variables accounts for the pattern of results in (b) and (c).