20123 2013
Econometrics: Final term paper
Title: The Determinants of College Professor Salary
(ID): 12D62382
2 Multiple regression
2.1 Data source
As already mentioned, the salary data used in this paper (see the reference below) consist of
observations on six variables for 52 tenure-track professors in a small college. The variables are:
sx = Sex, coded 1 for female and 0 for male
rk = Rank, coded
Reference: S. Weisberg (1985). Discrimination in Salaries. New York: John Wiley and Sons. Page 194.
In the first step the variable dg is removed, because the t-test shows that it is not significant at
the 10% level.
Next, sx is removed for the same reason as dg in the previous step: according to the t-test, sx is
not significant at the 10% level.
This decision also took into account the change in the estimated coefficients, which does not suggest
collinearity in this case. The same applies to the standard errors.
At the same time R² stays almost the same, which means that the variables still in the model explain
nearly the same share of the variance in salary.
Finally, yd is removed for the same reason as sx and dg in the previous steps: according to the
t-test, yd is not significant at the 10% level. The coefficients and standard errors again show no
substantial change.
At the same time R² stays almost the same, which means that the variables still in the model explain
nearly the same share of the variance in salary.
All the remaining variables are now highly significant at the 1% level (indeed at any conventional level).
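The step-by-step removal described above can be sketched as a short loop. This is only an illustration under assumptions: `fit_pvalues` is a hypothetical callback that re-estimates the model for a given list of variables and returns their t-test p-values, and the rule "drop the variable with the highest p-value first" is assumed here, since the paper does not state how the removal order was chosen.

```python
from typing import Callable, Dict, List


def backward_eliminate(
    variables: List[str],
    fit_pvalues: Callable[[List[str]], Dict[str, float]],
    alpha: float = 0.10,
) -> List[str]:
    """Drop the least significant variable until all p-values are <= alpha.

    fit_pvalues is a hypothetical callback (not part of the paper): it must
    re-estimate the regression with the given variables and return the
    p-value of each variable's t-test.
    """
    kept = list(variables)
    while kept:
        pvalues = fit_pvalues(kept)  # re-estimate after every removal
        worst = max(kept, key=lambda name: pvalues[name])
        if pvalues[worst] <= alpha:
            break  # every remaining variable is significant at alpha
        kept.remove(worst)  # assumed rule: remove the highest p-value first
    return kept
```

Re-estimating after every removal matters, because dropping one variable changes the coefficients, standard errors and p-values of the others.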
(70.45792)
- Interpretation of coefficients:
4731.256 rk: An increase in rank by one unit (for example from rank 1 to rank 2) raises the
estimated salary by 4,731.26 USD, holding yr constant.
376.4993 yr: Each additional year in the current rank raises the estimated salary by 376.50 USD,
holding rk constant.
- Significance of explanatory variables: All the explanatory variables are highly significant at the 1% level.
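As a small illustration of the interpretation above, the two reported coefficients can be used to compute the change in estimated salary for given changes in rk and yr. The function name `salary_change` is an assumption made for this sketch. Only differences are computed, because the constant of the estimated equation is not reproduced in this text.

```python
# Coefficients reported in the interpretation above (USD per unit).
B_RK = 4731.256  # rk: rank
B_YR = 376.4993  # yr: years in current rank


def salary_change(d_rk: float = 0.0, d_yr: float = 0.0) -> float:
    """Change in estimated salary for given changes in rk and yr.

    The intercept cancels out in a difference, so it is not needed here.
    """
    return B_RK * d_rk + B_YR * d_yr


print(round(salary_change(d_rk=1), 2))          # one rank higher, yr unchanged
print(round(salary_change(d_yr=1), 2))          # one more year in rank, rk unchanged
print(round(salary_change(d_rk=1, d_yr=3), 2))  # both changes together
```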
2.4 Conclusion
The data used in this paper consist of observations on six variables for 52 tenure-track professors in
a small college. The purpose was to find out which factors have a significant influence on the salary
of those professors. First, the insignificant variables were removed in a step-by-step process. It is
notable that sex, highest degree earned and number of years since the highest degree was earned do
not play a significant role in the model. Only rk (rank) and yr (number of years in current rank)
remain, and both are highly significant.
When the explanatory variable yr is dropped, both the coefficient and the standard error of rk change
substantially. The coefficient changes from 4731.256 to 5952.779 and the standard error changes
from 450.0083 to 482.7553. This indicates considerable omitted variable bias when yr is left out.
R² falls from 0.8436 to 0.7525, which is also a substantial change. The adjusted R² falls as well;
if yr could be omitted from the equation, the adjusted R² would instead stay the same or even
increase.
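The size of these changes can be checked with a few lines of arithmetic. The sketch below uses only the figures quoted above and n = 52. The adjusted R² values are recomputed from the reported R² with the standard formula, so they are approximations that may differ from the Stata output in the last digits. The helper names are assumptions made for this example.

```python
N = 52  # number of observations


def relative_change(old: float, new: float) -> float:
    """Relative change from old to new (0.25 means +25%)."""
    return (new - old) / old


def adjusted_r2(r2: float, n: int, k: int) -> float:
    """Adjusted R-squared for a model with k regressors plus a constant."""
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)


coef_change = relative_change(4731.256, 5952.779)  # coefficient of rk
se_change = relative_change(450.0083, 482.7553)    # standard error of rk
adj_full = adjusted_r2(0.8436, N, k=2)             # model with rk and yr
adj_restricted = adjusted_r2(0.7525, N, k=1)       # model with rk only

print(f"coefficient of rk: {coef_change:+.1%}")
print(f"standard error of rk: {se_change:+.1%}")
print(f"adjusted R2: {adj_full:.4f} -> {adj_restricted:.4f}")
```

The coefficient of rk rises by roughly a quarter, and the adjusted R² falls together with R², which is consistent with the conclusion that yr should stay in the model.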
3.2
As discussed in class, it is interesting to consider a variable for which no data can be obtained.
For this data set, it would be valuable to know the impact of years of practical experience outside
the university. If that explanatory variable were included, the coefficients of both rk and yr would
probably decrease, especially in an environment where practical experience is seen as a positive
contributor to success in teaching.
As already explained in task 2.3, all the explanatory variables are highly significant at the 1% level.
This means that the probability of being wrong when claiming that rk has an effect on salary is, at
the reported precision, practically 0%.
- F test
F test (of the variable rk):
The F tests for both variables show significance at the 1% level. One can even say at any level,
since the reported p-value is 0.0000. This means that the null hypothesis can be rejected at any
conventional level.
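The Stata output of the F tests is not reproduced in this text, but the F statistic for excluding yr can be approximated from the R² values reported elsewhere in the paper (0.8436 with rk and yr, 0.7525 with rk only, n = 52). This is a sketch using the standard R² form of the F statistic; the function name is an assumption, and the result is subject to the rounding of the reported R² values.

```python
def f_stat_from_r2(r2_unrestricted: float, r2_restricted: float,
                   n: int, k: int, q: int) -> float:
    """F statistic for q exclusion restrictions, R-squared form.

    n: observations, k: regressors in the unrestricted model (without constant).
    """
    numerator = (r2_unrestricted - r2_restricted) / q
    denominator = (1 - r2_unrestricted) / (n - k - 1)
    return numerator / denominator


# Test of H0: coefficient of yr = 0 (one restriction, so q = 1).
f_yr = f_stat_from_r2(0.8436, 0.7525, n=52, k=2, q=1)
print(round(f_yr, 2))
```

With a single restriction the F statistic equals the square of the corresponding t statistic, so a value this far above the usual 1% critical value (about 7 with 1 and 49 degrees of freedom) agrees with the rejection stated above.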
4.2
a = 1000 This means that the null hypothesis can be rejected at any level.
4.3
This only makes sense with more than two variables, which is not the case here.
5 Functional form
5.1
1.) log-level:
2.) level-log:
3.) log-log:
5.2
As requested, three different functional forms (log-log, log-level and level-log) were analyzed.
Only sl and yr were considered here, because rk is not a continuous variable, so a log model does
not make sense for it.
Among the above models, the first (log-level) is the best, because R² increases and the t-test shows
higher significance. However, as agreed with the teacher, this model will not be used in task 2.2.
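For reference, the three functional forms named in 5.1 can be written in generic form with sl as the dependent variable and yr as the regressor. The symbols \beta_0, \beta_1 and u are notation assumed for this sketch; the estimated values are not reproduced here.

```latex
\begin{align}
\text{log-level:} \quad & \log(sl) = \beta_0 + \beta_1 \, yr + u \\
\text{level-log:} \quad & sl = \beta_0 + \beta_1 \log(yr) + u \\
\text{log-log:}   \quad & \log(sl) = \beta_0 + \beta_1 \log(yr) + u
\end{align}
```

In the log-level form, 100 \beta_1 is approximately the percentage change in salary for one additional year in rank. In the log-log form, \beta_1 is the elasticity of salary with respect to yr.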
6 Prediction
0 = 15308.22 to 17580.63
7 Heteroskedasticity
7.1 Perform BP test (hint: estat hettest)
This means that the null hypothesis of constant variance (homoskedasticity) can be rejected at the 5% level, which implies heteroskedasticity.
7.2 Report estimated equation with heteroskedasticity-robust standard error (hint: reg y x, robust)
(63.01736)
robust
n = 52, R² = 0.8436
Compared with the earlier estimation, only the standard errors differ: the coefficients and R² are
unchanged, while the robust standard errors account for the heteroskedasticity found in 7.1.
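To see what the different standard error means for inference, the t ratio of yr can be compared under both estimates. This sketch rests on an assumption: the two values in parentheses that survive in this text (70.45792 from the original estimation and 63.01736 from the robust estimation) are both taken to be the standard error of yr, which the text itself does not state explicitly.

```python
B_YR = 376.4993             # coefficient of yr (same in both estimations)
SE_CONVENTIONAL = 70.45792  # assumed: usual OLS standard error of yr
SE_ROBUST = 63.01736        # assumed: heteroskedasticity-robust standard error of yr


def t_ratio(coef: float, se: float) -> float:
    """t statistic for H0: coefficient = 0."""
    return coef / se


t_conventional = t_ratio(B_YR, SE_CONVENTIONAL)
t_robust = t_ratio(B_YR, SE_ROBUST)

print(f"t (conventional): {t_conventional:.2f}")
print(f"t (robust):       {t_robust:.2f}")
```

Under this assumption yr remains highly significant with robust standard errors, so the conclusion about yr does not depend on which standard error is used.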