
Homework 3

Due Monday, October 23

Answer all questions.

You should find all datasets in the practice Lab folder.

1. Using the wage data (wage2.dta), answer the following questions.
A. Obtain basic summary statistics for educ, sibs, meduc, and feduc for
non-missing observations of meduc and feduc. (Hint: You should get 722 observations for each of the 4
variables.)
B. Estimate a linear regression of educ on sibs, meduc, and feduc. Write down the predicted
equation. How much of the variation in educ is explained by sibs, meduc, and feduc?
Attach the Stata program you used as well as the output. Rerun the linear regression of educ on
sibs, meduc, and feduc to answer parts C through F.
C. Which of the factors are individually statistically significant at the 5% level of significance?
D. Construct the 99% confidence interval for each of the population coefficients associated with
the explanatory variables: sibs, meduc, and feduc.
E. Test the joint significance of sibs, meduc and feduc.
F. Test the joint significance of meduc and feduc.
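
The steps in Question 1 could be laid out in a do-file along the following lines. This is a minimal sketch, not a model answer: it assumes wage2.dta sits in the current working directory and that educ and sibs have no missing values of their own (otherwise the observation counts in part A may differ from the hint).

```stata
* Sketch only: assumes wage2.dta is in the current working directory.
use wage2.dta, clear

* Part A: keep only rows where both meduc and feduc are observed.
* missing(meduc, feduc) is true if either argument is missing.
summarize educ sibs meduc feduc if !missing(meduc, feduc)

* Part B: regress drops rows with a missing value in any model variable.
regress educ sibs meduc feduc

* Part D: replay the last regression with 99% confidence intervals.
regress, level(99)

* Parts E and F: F tests of joint significance after regress.
test sibs meduc feduc
test meduc feduc
```

The three-variable test in part E should match the overall F statistic reported in the regression header, which is a useful check that the commands ran on the intended sample.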

2. We are interested in estimating the relationship between firm size - usually measured by annual sales -
and spending on research and development (R&D), controlling for profit margin. For the population of
firms in the chemical industry, let rd denote annual expenditures on research and development, and let
sales denote annual sales (both are in millions of dollars). Provide descriptive statistics for rd, sales and
profits. You can then generate new variables needed for the regression analysis below.

A. Estimate a model that implies a constant elasticity of rd with respect to sales. Write down the estimated
equation in the usual form, including the sample size and R-squared. Write down the estimate of the
elasticity of rd with respect to sales and explain in words what this elasticity means.

B. One might also be interested in the ceteris paribus effect of the profit margin (say profmarg) - that is,
profits as a percentage of sales - on R&D. Estimate the equation obtained by adding the variable profmarg
to the model estimated in A above. Write down the estimated
equation in the usual form. What is the estimated elasticity of R&D with respect to sales, holding profit
margin fixed? Interpret the estimated coefficient on the profit margin.
C. Which of the explanatory variables are individually significant?
D. Test the null hypothesis $H_0: \beta_1 = 1$ against $H_1: \beta_1 > 1$, where $\beta_1$ is the parameter associated with
log(sales) in the regression model. What do you conclude? (Remarks: It is not enough to just say reject the
null or fail to reject the null hypothesis.)
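
The test in part D can be set up as sketched below, assuming it refers to the model from part B. The symbols $\beta_2$ (coefficient on profmarg), $c$ (critical value), $n$ (sample size), and $k$ (number of regressors) are labels introduced here for illustration and do not come from the assignment.

```latex
\begin{align*}
\log(rd) &= \beta_0 + \beta_1 \log(sales) + \beta_2\, profmarg + u, \\[4pt]
t &= \frac{\hat{\beta}_1 - 1}{\operatorname{se}(\hat{\beta}_1)}, \qquad
\text{reject } H_0 \text{ in favor of } H_1 \text{ if } t > c,
\end{align*}
```

Here $c$ is the one-sided critical value of the $t$ distribution with $n - k - 1$ degrees of freedom at the chosen significance level. The t statistic printed in standard regression output tests a coefficient against zero, so the statistic above has to be computed separately from the reported estimate and standard error.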

3. In this problem, you will analyze medical expenditures of individuals 65 years and older who qualify for
health care under the U.S. Medicare program. The original data source is the Medical Expenditure Panel
Survey for year 2003. You will use Stata data in file mpe1.dta. The file contains 3064 observations on 10
variables. The basic equation you will estimate can be written as

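$$y = \beta_0 + \beta_1\,\text{suppins} + \beta_2\,\text{phylim} + \beta_3\,\text{actlim} + \beta_4\,\text{totchr} + \beta_5\,\text{age} + \beta_6\,\text{age}^2 + \beta_7\,\text{female} + \beta_8\,\text{income} + \beta_9\,\text{white} + \beta_{10}\,\text{msa} + u \qquad (1)$$
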
where y is either totexp or log(totexp), all the observed basic variables are defined in the data file,
and u is the disturbance term. Here totexp is the total annual medical expenditures of an individual,
measured in dollars.

Use data for all individuals to answer the following questions.

A. Obtain the description of the data in the file. Then find summary statistics for all variables in the data.

B. Provide detailed summary statistics for totexp. (Hint: use the sum command with the detail option). Then
obtain the histogram of totexp. What can you say about the skewness of the distribution of total
medical expenditure?

C. Create a new variable named totexp_group, which equals 0 if totexp = 0 and equals 1 if totexp > 0. Then
find summary statistics for all variables in the data by totexp_group. How many individuals in the
data have zero medical expenditures? How many have positive medical expenditures?

D. Recode age so that a new variable called age_category takes on a value of 1, 2, ..., or 6 corresponding
to, respectively, age 65-69, 70-74, 75-79, 80-84, 85-89, and 90-94. Then tabulate age_category and
indicate what percent of the individuals have age 75 or higher.
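
One possible way to organize the commands for Question 3 is sketched below. It assumes mpe1.dta is in the current working directory and that age takes integer values; neither is stated in the assignment.

```stata
* Sketch only: assumes mpe1.dta is in the current working directory.
use mpe1.dta, clear

* Part A: variable descriptions, then summary statistics for everything.
describe
summarize

* Part B: detailed statistics (including skewness) and a histogram.
summarize totexp, detail
histogram totexp

* Part C: 0 if totexp is zero, 1 if positive; left missing if totexp is missing.
generate totexp_group = (totexp > 0) if !missing(totexp)
bysort totexp_group: summarize
tabulate totexp_group

* Part D: map five-year age bands to 1..6 in a new variable.
* Ages outside 65-94 would be copied over unchanged, so check the tabulation.
recode age (65/69 = 1) (70/74 = 2) (75/79 = 3) (80/84 = 4) (85/89 = 5) (90/94 = 6), generate(age_category)
tabulate age_category
```

The cumulative-percent column of the last tabulation gives the share aged 74 or younger, so the share aged 75 or higher is 100 minus the cumulative percent for category 2.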

4. Restricting your data to individuals with positive medical expenditures, answer the following questions.
(Hint: You should get 2955 observations.)

A. Obtain the correlations (correlation matrix) of the following 10 variables: totexp suppins phylim
actlim totchr age female income white msa. Which of the latter 9 variables has the highest correlation
with the dependent variable totexp?

B. Making sure to generate new variables you may need, estimate equation (1) in level form of the
dependent variable (y = totexp). Present your regression results in a summary table; see the remarks
below. Use the results from this regression to answer parts (C) through (E) of this question.

C. Discuss the sign and the magnitude of the estimated coefficient on totchr. Does this regressor have a
significant effect on total expenditure? Justify your answer.

D. Are age and age-squared individually significant? Are they jointly significant? Justify your answers.

E. At what age is total medical expenditure the highest? Explain. (Hint: Find the turning point of total
medical expenditures function as it relates to age).
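
The turning point mentioned in the hint for part E follows from equation (1) with y = totexp. As a sketch, holding all other regressors fixed and writing $age^{*}$ (a label introduced here) for the turning point:

```latex
\begin{align*}
\frac{\partial\, \widehat{totexp}}{\partial\, age}
  &= \hat{\beta}_5 + 2\hat{\beta}_6\, age = 0
  \quad\Longrightarrow\quad
  age^{*} = -\frac{\hat{\beta}_5}{2\hat{\beta}_6}.
\end{align*}
```

This point is a maximum only if $\hat{\beta}_6 < 0$; if $\hat{\beta}_6 > 0$ it is a minimum. It is also worth checking whether $age^{*}$ falls inside the range of ages observed in the sample.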

5. Again restricting your data to individuals with positive medical expenditures, answer the following
questions.

A. Making sure to generate new variables you may need, estimate equation (1) in log form of the
dependent variable (y = log(totexp)). Present your regression results in a summary table; see the
remarks below. Use the results from this regression to answer the remaining parts of this question.

B. Interpret the estimate associated with totchr.

C. Test the joint significance of the health status variables: phylim actlim totchr.
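
For parts B and C, the standard results for a log-level model and for an F test of exclusion restrictions are sketched below, using the coefficient indices of equation (1). The symbols $SSR_r$ and $SSR_{ur}$ (sums of squared residuals from the restricted and unrestricted regressions) and $n$ (sample size) are labels introduced here.

```latex
\begin{align*}
\%\Delta\, totexp &\approx 100\,\hat{\beta}_4\,\Delta\, totchr,
  \qquad \text{exact change for } \Delta\, totchr = 1:\;
  100\left[\exp(\hat{\beta}_4) - 1\right], \\[4pt]
H_0 &: \beta_2 = \beta_3 = \beta_4 = 0, \qquad
F = \frac{(SSR_r - SSR_{ur})/3}{SSR_{ur}/(n - 11)}.
\end{align*}
```

The denominator degrees of freedom are $n - 11$ because equation (1) has 10 slope coefficients plus an intercept, and the numerator has 3 because three restrictions are tested.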

Remarks - Organize your submission in the usual format and order: answers to all parts of the problem;
printout of the command (do) file; well-labeled output. Staple all the pages together. Use outreg to produce
your tables.
