Econometrics 206-1

Exam III: 10.10 AM - 11.40 AM, 24 April 2017

In answering the questions below, paste the Stata output only when it is asked
for. When pasting output, use the copy as picture option. When testing a
hypothesis, be sure to mention the distribution of the test statistic, its degrees
of freedom, the level of significance and the associated critical value. DO NOT
USE THE STATA test COMMAND.

It would be easiest if you inserted your answer between the questions below and
returned this document. Rename the document as 'your name.docx' and upload it
on LMS.

You have to do this exam by yourself. You are allowed to consult the textbook and
your notes. You are NOT allowed to consult anybody whether by speaking, by text
messages or email or any other means. Violations will attract penalties as per
Ashoka policy.

1. (a) Regress log of wages on a constant and the female dummy. Paste output
here.

(b) Interpret the coefficient on the female dummy.

The coefficient on the female dummy is -0.8553. Read as a rough approximation,
this means wages are about 85.53% lower when the employee is female rather
than male.
More accurately, since the dependent variable is in logs, the exact percentage
difference is 100*[exp(-0.8553) - 1] = 100*(0.425 - 1) = 100*(-0.575) = -57.5%,
i.e. wages are about 57.5% lower for women.
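The conversion from a log-wage coefficient to an exact percentage difference can be checked with a short calculation. This is only an illustrative sketch; the function name `exact_percent_change` is made up here, and the coefficient is the one quoted in the answer.

```python
import math

def exact_percent_change(log_coef):
    """Exact percentage change in the level of y implied by a dummy
    coefficient in a regression whose dependent variable is log(y)."""
    return 100.0 * (math.exp(log_coef) - 1.0)

b_female = -0.8553                           # coefficient quoted in the answer
approx_pct = 100.0 * b_female                # rough reading of the coefficient
exact_pct = exact_percent_change(b_female)   # exact conversion

print(f"approximate: {approx_pct:.2f}%  exact: {exact_pct:.2f}%")
```

The gap between the two numbers is large here because the approximation 100*b is only reliable for coefficients close to zero.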

(c) Test the null hypothesis that the coefficient on female dummy is -0.5 against
the alternative that the coefficient on female dummy is less than -0.5. Show your
workings.
[5+5+10]

To test the null hypothesis that the coefficient on the female dummy is -0.5, we
can check the confidence interval for the female dummy. Since the question does
not specify a level of significance, we take the 5% level. The 95% confidence
interval reported by Stata (from the reg command) runs from -0.979 to -0.730. It
does not include -0.5, and it lies entirely below -0.5. Because a null value lying
above the two-sided 95% interval implies a t statistic below -1.96, which is also
below the one-sided 5% critical value of about -1.645, we can reject the null
hypothesis that the coefficient on the female dummy is -0.5 in favour of the
alternative that it is less than -0.5.
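The same one-sided test can be written out as an explicit t statistic. The sketch below rests on two assumptions that are not in the answer: the standard error is backed out from the reported 95% confidence interval, and the t distribution is approximated by the standard normal because the sample is large (roughly a thousand observations, judging by the n-k-1 = 993 used in Question 3).

```python
from statistics import NormalDist

b_hat = -0.8553                   # estimated coefficient on female
ci_low, ci_high = -0.979, -0.730  # 95% confidence interval quoted in the answer
null_value = -0.5                 # H0: beta = -0.5, H1: beta < -0.5

std_normal = NormalDist()

# Assumption: the interval is b_hat +/- z_0.975 * se, so se can be recovered.
z_975 = std_normal.inv_cdf(0.975)
se = (ci_high - ci_low) / (2 * z_975)

# t statistic for the null, and the lower-tail 5% critical value.
t_stat = (b_hat - null_value) / se
critical_value = std_normal.inv_cdf(0.05)

reject = t_stat < critical_value
print(f"se = {se:.4f}, t = {t_stat:.2f}, critical = {critical_value:.3f}, reject = {reject}")
```

With the exact standard error from the Stata output the numbers would differ slightly, but the statistic is far enough into the lower tail that the conclusion should not change.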

2. (a) Regress log of wages on a constant, the female dummy, age of the
individual and the square of age. Paste your output here.

(b) Controlling for age and the square of age does not seem to substantially
change the coefficient of the female dummy. Why is that so?
[5+5]

Despite controlling for age and agesq, we do not see a substantial change in the
coefficient of the female dummy. The coefficient on age itself is quite low, only
around 0.06, compared with the female dummy coefficient of about -0.857, so
age appears to have only a small effect on wages in this data set. Agesq similarly
has a small effect and, being just a function of age, is not expected to alter the
female dummy coefficient either. More importantly, adding a control changes the
coefficient on the female dummy only to the extent that the control is correlated
with the dummy; if men and women in the sample have similar age distributions,
the coefficient on female barely moves.

3. (a) Regress log of wages on a constant, the female dummy, age of the
individual, the square of age and the social group dummies for scheduled caste,
for scheduled tribe and for other backward caste. Note the omitted category is
the general castes (or forward castes). Paste your output here.

(b) Test the null hypothesis that none of the social group dummies matter, i.e.,
controlling for sex, age and square of age, the average of log wages is the same
for all categories: scheduled castes, scheduled tribes, other backward castes and
the general (forward) castes. Do NOT use the Stata test command.

We have to compare the R^2 of two regressions: one that includes the social
group dummies (unrestricted R^2) and one that excludes them (restricted R^2).

The table from 3a gives the unrestricted R^2 as 0.25. The table in 2a gives the
restricted R^2 as 0.22.
Using: F = [(R^2unrestricted - R^2restricted)/J]/[(1-R^2unrestricted)/(n-k-1)]
where J is the number of restrictions (3), n is the number of observations, and k
is the number of explanatory variables in the unrestricted model (6), so that
n-k-1 = 993.

F = [(0.25-0.22)/3]/(0.75/993) = 0.01/0.000755 = 13.24

Under the null, this statistic follows an F distribution with (3, 993) degrees of
freedom, whose critical values are approximately 2.61 at the 5% level and 3.80 at
the 1% level. Since 13.24 exceeds both, we reject the null hypothesis that the
social group dummies jointly do not matter, at both the 5% and the 1%
significance levels.
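Recomputing the R^2 form of the F statistic from the quoted values is a useful arithmetic check. The sketch below uses only the numbers given in the answer; the function name `f_stat_from_r2` is illustrative, and the critical values are approximate table values for F(3, 993) rather than computed ones.

```python
def f_stat_from_r2(r2_unrestricted, r2_restricted, num_restrictions, resid_df):
    """R^2 form of the F statistic for testing exclusion restrictions."""
    numerator = (r2_unrestricted - r2_restricted) / num_restrictions
    denominator = (1.0 - r2_unrestricted) / resid_df
    return numerator / denominator

r2_with_groups = 0.25     # regression in 3(a): includes scd, std, obc
r2_without_groups = 0.22  # regression in 2(a): excludes them
J = 3                     # number of dummies being tested
resid_df = 993            # n - k - 1 in the unrestricted model

F = f_stat_from_r2(r2_with_groups, r2_without_groups, J, resid_df)

# Approximate critical values read from F tables (assumed, not computed here).
crit_5pct, crit_1pct = 2.61, 3.80
print(f"F = {F:.2f}, reject at 5%: {F > crit_5pct}, reject at 1%: {F > crit_1pct}")
```

Because the R^2 values are rounded to two decimals, the statistic is itself only approximate; the unrounded values from the Stata output would give a more precise figure.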

Lnwages = beta0 + beta1 female + beta2 age + beta3 agesq + beta4 scd + beta5 std
+ beta6 obc + u

Also, we can see that compared to the general castes, all three of the social
groups included in the equation are at a disadvantage when it comes to wages.
This can be seen from the negative coefficients on the variables: -0.044, -0.24
and -0.038 mean that the groups have wage disadvantages of roughly 4.4%, 24%
and 3.8% respectively.

(c) Test the null hypothesis that relative to the general (forward) castes,
scheduled castes and other backward castes suffer the same extent of
discrimination. If this requires new regressions, paste the output in your
answer.
[5+15+15]

The question wants us to check whether beta4 = beta6.

Let theta = beta4 - beta6.
Ho: theta = 0
Ha: theta =/= 0
Then beta4 = theta + beta6. Substituting into the model:

Lnwages = beta0 + beta1 female + beta2 age + beta3 agesq + (theta + beta6) scd
+ beta5 std + beta6 obc + u
Lnwages = beta0 + beta1 female + beta2 age + beta3 agesq + theta scd + beta5 std
+ beta6 (obc + scd) + u

We therefore generate a new variable called bwc, equal to obc + scd, and regress
log wages on female, age, agesq, scd, std and bwc. The coefficient on scd in this
regression is theta.

In this regression the confidence interval for the coefficient on scd does not
include 0, which means that the null hypothesis that theta = 0 is to be rejected.
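The substitution used above can be sanity-checked numerically: for any coefficient values, the original and the reparameterised equations assign the same caste effect to every individual. The sketch below uses hypothetical coefficient values chosen purely for illustration; they are not estimates from the exam data.

```python
import math

def caste_effect_original(scd, obc, beta4, beta6):
    """Contribution of scd and obc in the original parameterisation."""
    return beta4 * scd + beta6 * obc

def caste_effect_reparam(scd, obc, theta, beta6):
    """Same contribution after substituting beta4 = theta + beta6."""
    return theta * scd + beta6 * (obc + scd)

# Hypothetical coefficients, for illustration only.
beta4, beta6 = -0.30, -0.10
theta = beta4 - beta6

# Dummy combinations: general/ST (0, 0), scheduled caste (1, 0), OBC (0, 1).
cases = [(0, 0), (1, 0), (0, 1)]
matches = [
    math.isclose(
        caste_effect_original(scd, obc, beta4, beta6),
        caste_effect_reparam(scd, obc, theta, beta6),
        abs_tol=1e-12,
    )
    for scd, obc in cases
]
print(matches)
```

Since the two forms are algebraically identical, the regression on scd and obc + scd fits exactly as well as the original one; the only change is that the coefficient on scd now estimates theta directly, together with its standard error.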

4. (a) Regress log of wages on a constant, the female dummy, age of the
individual the square of age, the social group dummies for scheduled caste, for
scheduled tribe and for other backward caste, and the education dummies for
illiterate, literate, primary, secondary, and higher secondary. Paste the output
here.

(b) Compare the above regression with the regression in question 3 (without the
education dummies). Does the inclusion of education dummies alter the
discrimination against women, scheduled castes, scheduled tribes and other
backward castes? Why?
[5+15]

Just by looking at the coefficients, you can see that the included education
dummies have a negative effect on wages relative to the omitted category
(graduate and above). You can also compare the R^2 of the two models. The
model that includes the education dummies has the higher R^2 (0.48), which
means that it explains more of the variation in wages than the model without
the education dummies, where R^2 is 0.25.

You can also claim that the inclusion of the education dummies alters the
discrimination against women, scheduled castes, scheduled tribes, and other
backward castes, since the coefficient on each of these variables decreases.

5. (a) To the explanatory variables in the regression in Q4(a), add land owned
(LandO) and land possessed (LandP) and re-run the regression. DO NOT paste
the output.

(b) Is either of the land variables individually significant at the 5 or 10% level?

Neither LandO nor LandP is significant at the 5% level, since 0 falls inside the
95% confidence interval for each of them, so we fail to reject the null that the
coefficient is 0.
Neither LandO nor LandP is significant at the 10% level either, since both p
values are above 0.1. The p values would have to be below 0.1 for the variables
to be significant.

(c) Now drop land owned (LandO) and re-run the regression. Is the included
land variable significant at the 5 or 10% level?

LandP's p value is 0.028, which is below 0.05, so we reject the null that LandP's
coefficient is 0, and LandP is significant at the 5% level.
At the 10% level we also reject the null that LandP's coefficient is 0, since the
p value of 0.028 is less than 0.1, so LandP is significant at that level too.
(Anything that is significant at the 5% level is also significant at the 10% level.)

(d) Explain the pattern of results observed in (b) and (c).

When both are included, LandP and LandO are individually not significant (as we
see in 5b). But once LandO is dropped (in 5c), LandP is significant. When only
LandP is included, as in 5c, it picks up the combined effect of land ownership and
land possession. A likely reason is that the two land variables are highly
correlated, so that with both in the regression their separate effects are
estimated imprecisely. LandP's coefficient is positive when it stands in for both
variables, but negative when the two are included separately.

Omitting the variable LandO therefore changes both the coefficient and the level
of significance of LandP.

[0+4+4+7]