
Question 1 (10 points) – Participation in Pension Plans

The data in 401K.dta are a subset of data analyzed by Papke (1995) to study the
relationship between participation in a 401(k) pension plan and the generosity of the plan.
The variable prate is the percentage of eligible workers with an active account; this is the
variable we would like to explain. The measure of generosity is the plan match rate,
mrate. For example, if mrate = 0.5, then a $1 contribution by the worker is matched by a
50-cent contribution by the firm. Answer the following questions based on the STATA
output below.
. use "C:\Econ3210\BookResources\statafiles\401K.dta", clear
. regress prate mrate

      Source |       SS       df       MS              Number of obs =    1534
-------------+------------------------------           F(  1,  1532) =  123.68
       Model |  32001.7271     1  32001.7271           Prob > F      =  0.0000
    Residual |  396383.812  1532   258.73617           R-squared     =  0.0747
-------------+------------------------------           Adj R-squared =  0.0741
       Total |  428385.539  1533  279.442622           Root MSE      =  16.085

------------------------------------------------------------------------------
       prate |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       mrate |   5.861079   .5270107    11.12   0.000      4.82734    6.894818
       _cons |   83.07546   .5632844   147.48   0.000     81.97057    84.18035
------------------------------------------------------------------------------

a. Write out the population regression model and population regression function
(PRF).

b. Write out the sample regression function (SRF) and interpret the slope coefficient.

c. Differentiate between regression error term and regression residual.

d. The third observation has prate = 97.6 and mrate = 0.91. What is the fitted value
of the participation rate, and what is the regression residual?
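
As a hedged sketch of how the fitted value and residual in part d could be checked in STATA: the commands below re-run the regression and store the fitted values and residuals. The variable names prate_hat and u_hat are hypothetical, and the file path assumes 401K.dta sits in the current working directory.

```stata
* Sketch only: assumes 401K.dta is in the current working directory
use "401K.dta", clear
regress prate mrate

* Fitted values and residuals for every observation
* (prate_hat and u_hat are hypothetical variable names)
predict prate_hat, xb
predict u_hat, residuals

* Inspect the third observation
list prate mrate prate_hat u_hat in 3

* The same fitted value computed by hand from the stored coefficients
display _b[_cons] + _b[mrate]*0.91
```

The residual is the observed prate minus the fitted value, so the u_hat entry listed for observation 3 should equal 97.6 minus the displayed fitted value.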

e. Find the following quantities in the STATA output above (some may require a
simple calculation):
SST =
SSE =
SSR =
R^2 =
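
A minimal sketch of where these quantities live in STATA after regress: the stored results e(mss) and e(rss) hold the Model and Residual sums of squares from the ANOVA table. Mapping them to SSE (explained) and SSR (residual) is an assumption about the labeling convention; textbooks differ, and some use the two abbreviations the other way round.

```stata
* Run after: regress prate mrate
* e(mss) = model (explained) sum of squares, e(rss) = residual sum of squares
display "Total SS     = " e(mss)+e(rss)
display "Explained SS = " e(mss)
display "Residual SS  = " e(rss)

* R-squared is the explained share of the total sum of squares
display "R-squared    = " e(mss)/(e(mss)+e(rss))
```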

Question 2 (4 points) – Interpret Regression Results I


Let grthemp denote the proportionate growth in employment, at the county level, from
1990 to 1995, and let salestax denote the county sales tax rate, stated as a proportion.
The sample regression function from county-level data is

predicted grthemp = 0.043 - 0.78 salestax

a. For a hypothetical county, if its 1990 employment rate was 65% and its sales tax
increased by 2 percentage points, what would be the predicted employment rate in
1995?

b. The average sales tax in the sample is 0.06. Given the SRF, if the county-level
sales tax increased by 20% (using the sample mean as the benchmark), what would be
the predicted growth in employment from 1990 to 1995?

Question 3 (6 points) – Interpret Regression Results II

Using the data in WAGE1.dta, we obtain the following sample regression function:

predicted log(wage) = 0.584 + 0.083 educ
where wage is the hourly wage in 1976 US dollars and educ denotes years of schooling.

a. Interpret the slope coefficient.

b. If educ = 16, what is the predicted hourly wage?

c. What is the predicted percentage difference in hourly wage between high school
graduates (typically with 12 years of schooling) and college graduates (typically
with 16 years of schooling)? Calculate this percentage difference first relying on
the slope coefficient directly as an approximation and then relying on the
exponential function to get the accurate number.
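
The two calculations asked for in part c can be sketched with STATA's display command. This is only an illustration: it assumes the slope on educ is 0.083, as in the sample regression function for log(wage), and a 4-year gap in schooling (16 - 12).

```stata
* Approximation: 100 * slope * change in educ
display 100 * 0.083 * (16 - 12)

* Exact: 100 * [exp(slope * change in educ) - 1]
display 100 * (exp(0.083 * (16 - 12)) - 1)
```

The approximation reads the slope directly as a proportionate change, which becomes less accurate as the change in log(wage) grows; the exponential form undoes the log transformation exactly.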
