
Summer 2008 examination
EC220
Introduction to Econometrics
Suitable for all candidates
Instructions to candidates Time allowed: 3 hours + 15 minutes reading time This paper contains NINE questions. Answer any FOUR questions. All questions will be given equal weight (25%). You are supplied with: Graph paper Statistical tables Logarithm tables (available on request).
Calculators are NOT allowed in this examination.
LSE 2008/EC220
Page 1 of 11
1. A variable $Y_i$ is generated as

$Y_i = \beta_1 + u_i$    (1.1)

where $\beta_1$ is a fixed parameter and $u_i$ is a disturbance term that is independently and identically distributed with expected value 0 and population variance $\sigma_u^2$. The least squares estimator of $\beta_1$ is $\bar{Y}$, the sample mean of Y, with population variance $\sigma_u^2 / n$, where n is the number of observations in the sample. However a researcher believes that Y is a linear function of another variable X and uses ordinary least squares to fit the relationship

$\hat{Y} = b_1 + b_2 X$    (1.2)

calculating $b_1$ as $\bar{Y} - b_2 \bar{X}$, where $\bar{X}$ is the sample mean of X. The population variance of $b_1$ is

$\sigma_u^2 \left( \frac{1}{n} + \frac{\bar{X}^2}{\sum (X_i - \bar{X})^2} \right)$.

X may be assumed to be a nonstochastic variable.
(a) [4 marks] Given that (1.1) is the true relationship, demonstrate that $\bar{Y}$ is indeed the least squares estimator of $\beta_1$. (b) [4 marks] What would be the value of $R^2$ in such a regression (a regression with only a constant and no explanatory variables)? Give a mathematical explanation. (c) [2 marks] Demonstrate that $\bar{Y}$ is an unbiased estimator of $\beta_1$. (d) [2 marks] Demonstrate that the population variance of this estimator is $\sigma_u^2 / n$. (e) [5 marks] Determine whether the researcher's estimator $b_1$ in equation (1.2) is biased or unbiased, and if biased, determine the direction of the bias. (f) [3 marks] Mathematically, the expression for the variance of the researcher's estimator shows that it is an inverse function of the sum of the squared deviations of the observations on X around the sample mean $\bar{X}$. Explain intuitively why this should be the case. (g) [2 marks] Mathematically, the expression for the variance of the researcher's estimator shows that it is the same as that of $\bar{Y}$ for the special case $\bar{X} = 0$. Explain why this should be the case. (h) [3 marks] Suppose that it is not known whether Y depends on X or not and that the coefficient of X in a regression of Y on X is not significant. Explain the potential advantages and disadvantages of using $\bar{Y} - b_2 \bar{X}$, rather than $\bar{Y}$, as an estimator of $\beta_1$.
2. A researcher investigating the determinants of juvenile delinquency has the following data for 2007 for a sample of 100 cities in a certain country: A, the number of arrests per 1,000 juveniles, defined as persons aged 14–18, in the city, P, the number of households per 1,000 in the city with incomes below the poverty line, and S, the number of single-parent households per 1,000 in the city. He is considering fitting the model

$A = \beta_1 + \beta_2 P + \beta_3 S + u$    (2.1)
where u is a disturbance term that may be assumed to satisfy the usual regression model assumptions. The correlation between P and S is 0.96. State what is correct, mistaken, confused or incomplete in the following statements, giving an explanation when the statement is not correct. (a) [4 marks] The high correlation between P and S will give rise to the problem known as multicollinearity. (b) [5 marks] Multicollinearity does not cause the estimates of the coefficients to be biased but it does cause them to be inconsistent. (c) [4 marks] The standard errors will be biased, probably downwards. (d) [1 mark] The t tests and F test will be invalid.
(e) [2 marks] The problem will be even worse if there is a high correlation between A and P or between A and S. (f) [2 marks] One way of dealing with the problem is to run two separate regressions, one with A regressed on P, the other with A regressed on S. Multicollinearity is not usually a serious problem in models with only one explanatory variable. (g) [3 marks] Another way of dealing with the problem is to use the following procedure: choose values a2 and a3 and construct Z = a2P + a3S. Regress A on Z. Do this for a large number of combinations of a2 and a3. Choose the combination that yields the smallest residual sum of squares when A is regressed on Z. This is known as Two Stage Least Squares. (h) [4 marks] Of course it is possible that A depends only on P and not on S. In this case we should just perform the simple regression of A on P. We should, however, be careful to check whether the residuals from this regression are significantly correlated with A, P, or S. A high correlation with any of these variables would be evidence of misspecification.
3. A researcher believes that the expenditure of an enterprise on training per worker, T, is related to the size of the enterprise, S, and value added per worker, V, where T and V are measured in thousands of pounds and S is measured as the number of employees of the enterprise. She has data for the year 2005 on T, S, and V for a sample that consists of 1,000 enterprises in the manufacturing sector and another 1,000 enterprises in the service sector. She hypothesizes that T is determined by the linear relationships

Manufacturing: $T = \alpha_1 + \alpha_2 S + \alpha_3 V + u_M$    (3.1)
Services: $T = \beta_1 + \beta_2 S + \beta_3 V + u_S$    (3.2)
where uM and uS are independent disturbance terms that satisfy the usual regression model assumptions. In 2005, 500 of the manufacturing enterprises, chosen randomly from the sample, were offered access to an incentive scheme that was designed to increase their expenditure on training. Similarly, 500 of the service enterprises, also chosen randomly, were offered access to the same scheme. (a) [4 marks] The researcher has been asked to determine whether the incentive scheme had a significant impact on training expenditure (1) for the manufacturing enterprises, and (2) for the service enterprises. How should she do this? (b) [7 marks] The researcher has been asked to test whether the impact of the incentive scheme was significantly different for the manufacturing and the service enterprises. How should she do this? State your regression specification and explain the reasons for your proposed procedure. (c) [6 marks] When the researcher presents her results at a workshop, one of the participants suggests that she could have used a Chow test to test the hypothesis in part (b). Describe how you would perform the Chow test and evaluate whether the suggestion is correct. (d) [5 marks] Some of the 2,000 enterprises have dedicated training departments, while the others do not. Provide an outline of a method that the researcher could use for testing the hypothesis that manufacturing enterprises are more likely than service enterprises to have training departments, controlling for size of enterprise and value added. (e) [3 marks] If the researcher restricted her sample to manufacturing enterprises with the same values of V, would a simple correlation between having a training department and S be an adequate test statistic for testing the hypothesis in part (d)?
4. A researcher investigating whether government expenditure tends to crowd out investment has data on government recurrent expenditure, G, investment, I, and gross domestic product, Y, all measured in US$ billion, for 30 countries in 2005. She fits two regressions (standard errors in parentheses; t statistics in square brackets; RSS = residual sum of squares).

(1) A regression of log I on log G and log Y:

\widehat{\log I} = -2.44 - 0.63 \log G + 1.60 \log Y    R^2 = 0.98, RSS = 0.90    (4.1)
standard errors: (0.26) (0.12) (0.12)
t statistics: [-9.42] [-5.23] [12.42]

(2) A regression of log(I/Y) on log(G/Y):

\widehat{\log (I/Y)} = -2.65 - 0.63 \log (G/Y)    R^2 = 0.48, RSS = 0.99    (4.2)
standard errors: (0.23) (0.12)
t statistics: [-11.58] [-5.07]

The correlation between log G and log Y in the sample is 0.98. The table gives some further basic data on log G, log Y, and log(G/Y).

              sample mean    mean square deviation
log G         3.75           2.00
log Y         5.57           1.95
log(G/Y)      -1.81          0.08

(a) [3 marks] Explain why the second specification is a restricted version of the first. State the restriction. (b) [5 marks] Perform a test of the restriction. (c) [2 marks] The researcher expected the standard error of the coefficient of log(G/Y) in (4.2) to be smaller than the standard error of the coefficient of log G in (4.1). Explain why she expected this. (d) [5 marks] However the standard error is the same, at least to two decimal places. Give an explanation. (e) [5 marks] Show how the restriction could be tested using a t test in a reparameterized version of the specification for (4.1). (f) [5 marks] When the researcher presents her results at a seminar, one of the participants says that, since I and G have been divided by Y, (4.2) is less likely to be subject to heteroscedasticity than (4.1). Evaluate this suggestion.
5. (a) [4 marks] In the context of explanatory variables in a regression model, explain the difference between a proxy variable and an instrumental variable. (b) A researcher hypothesizes that annual health expenditure per capita, H, is related to annual aggregate income per capita, Y, by the specification
$\log H = \beta_1 + \beta_2 \log Y + u$    (5.1)

where u is a disturbance term that may be assumed to satisfy the usual regression model assumptions. Her data relate to a sample of developing countries. Unfortunately the data on Y are not reliable. Much economic activity is in the informal sector, for which there is little hard information. The relationship between the logarithm of the published estimate of output per capita, log Z, and the logarithm of actual output is

$\log Z = \log Y + w$    (5.2)

where w is a random quantity. The data on health expenditure are no better. The relationship between the logarithm of the published estimate, log K, and the logarithm of actual expenditure is given by

$\log K = \log H + r$    (5.3)

where r is a random quantity. The model in terms of observable variables thus becomes

$\log K = \beta_1 + \beta_2 \log Z + u + r - \beta_2 w$    (5.4)

It may be assumed that w and r are both identically and independently distributed with zero means and that they are not related to Y or H or each other. (i) [2 marks] Explain mathematically the likely impact of r on the ordinary least squares (OLS) estimate of $\beta_2$ obtained by the researcher. (ii) [5 marks] Explain mathematically the likely impact of w on the OLS estimate of $\beta_2$ obtained by the researcher. (c) The researcher believes Y is proportional to E, annual consumption of electricity per capita, apart from a multiplicative random factor v, so that

$Y = \lambda E v$    (5.5)
The random factor v has a lognormal distribution such that log v is normally distributed with zero mean and constant variance. E may be assumed to be measured accurately. (i) [5 marks] Explain the advantages and disadvantages of using E as a proxy for Y in (5.4) regressing log K on log E instead of log Z. (ii) [5 marks] Explain mathematically the effect of using log E as an instrument for log Z in (5.4). (iii) [4 marks] The researcher presents her results at a seminar. One of the participants suggests fitting (5.5), regressing log Z on log E, and using the fitted values of log Z as an instrument for log Z in (5.4). Discuss whether this procedure is likely to yield improved estimates.
6. A researcher investigating the relationship between aggregate wages, W, aggregate profits, P, and aggregate income, Y, postulates the following model:
$W = \beta_1 + \beta_2 Y + u$    (6.1)
$P = \alpha_1 + \alpha_2 Y + \alpha_3 K + v$    (6.2)
$Y = W + P$    (6.3)

where K is aggregate stock of capital and u and v are disturbance terms that satisfy the usual regression model assumptions and may be assumed to be distributed independently of each other. The third equation is an identity, all forms of income being classified either as wages or as profits. The researcher intends to fit the model using data from a sample of industrialized countries, with the variables measured on a per capita basis in a common currency. K may be assumed to be exogenous. (a) [2 marks] Explain what is meant by exogenous. (b) [8 marks] Explain why ordinary least squares (OLS) would yield inconsistent estimates if it were used to fit (6.1) and derive the large-sample bias in the slope coefficient. (c) [4 marks] Explain what can be inferred about the finite-sample properties of OLS if used to fit (6.1). (d) [4 marks] Demonstrate mathematically how one might obtain a consistent estimate of $\beta_2$ in (6.1). (e) [2 marks] Explain why (6.2) is not identified (underidentified). (f) [2 marks] Explain whether (6.3) is identified. (g) [3 marks] At a seminar, one of the participants asserts that it is possible to obtain an estimate of $\alpha_2$ even though equation (6.2) is underidentified. Any change in income that is not a change in wages must be a change in profits, by definition, and so one can estimate $\alpha_2$ as $(1 - b_2)$, where $b_2$ is the estimate of $\beta_2$ found in (d). The researcher does not think that this is right but is confused and says that he will look into it after the seminar. What should he have said?
7. In a simulation experiment, a sample of 20 observations on a variable $Y_t$ is generated by the process

$Y_t = 10 + 0.9 Y_{t-1} + u_t$    (t = 1, ..., 20)    (7.1)

where $u_t$ is independently and identically distributed, its values being drawn randomly from a normal distribution with mean zero and unit variance, and $Y_0 = 0$. $Y_t$ is then regressed on $Y_{t-1}$ using ordinary least squares (OLS). The simulation consists of 100,000 such samples and regressions. The distribution of the coefficient of $Y_{t-1}$ in the regressions is shown in the figure.
[Figure: distribution of the coefficient of Y(t-1) across the simulated regressions. Horizontal axis: coefficient of Y(t-1), from -0.4 to 1.2; vertical axis: probability density. Mean = 0.70.]
(a) [5 marks] Demonstrate that the OLS estimator of the slope coefficient may be expressed as the true value of the slope coefficient plus a linear combination of the values of the disturbance term in the observations in the sample. (b) [5 marks] Hence explain why it is not possible to demonstrate that the OLS estimator of the coefficient of $Y_{t-1}$ is unbiased. [Note: You are not expected to investigate the direction of the bias and no credit will be given for it.] (c) [5 marks] Explain analytically how you would expect the distribution to change if the simulation were repeated with a very large number of observations in each sample. (d) [7 marks] What difference would it make to your answers to (a), (b), and (c) if the disturbance term were generated by the process

$u_t = \rho u_{t-1} + \varepsilon_t$    (7.2)

where $\varepsilon_t$ is independently and identically distributed? (e) [3 marks] Suppose that $\rho = 1$ in (7.2). Demonstrate that $Y_t$ is nonstationary. For convenience, you may assume $u_0 = 0$.
8. A researcher has annual data for the years 1966–2005 on aggregate output, Y, aggregate capital input, K, and aggregate labour input, L, all measured in real terms at 2000 constant prices, for a certain country. He hypothesizes that output is determined by the Cobb–Douglas production function

$Y_t = \beta_1 K_t^{\beta_2} L_t^{\beta_3} e^{\beta_4 t}$    (8.1)
modified by an appropriately-defined disturbance term. t is a time index running from 1 (for 1966) to 40 (for 2005). The researcher performs the following ordinary least squares (OLS) regressions: (1) log Yt on log Kt, log Lt and t. All logarithms are natural logarithms to base e. (2) log Yt on log Kt, log Lt, t, and log Y, log K, and log L lagged one time period. (3) log Yt on log Kt, log Lt, t, and log Y lagged one time period. He also fits the first regression using a technique appropriate for AR(1) autocorrelation in the disturbance term. The results of the four regressions are shown in the table. t statistics are given in parentheses. RSS is the residual sum of squares. A constant was included in each specification but is not shown. Potential problems relating to nonstationary processes may be ignored.
                (1) OLS         (2) OLS         (3) OLS         (4) AR(1)
log K_t         0.401 (3.32)    0.207 (1.24)    0.235 (2.23)    0.351 (2.68)
log L_t         0.503 (4.07)    0.300 (1.55)    0.340 (2.73)    0.458 (2.95)
t               0.022 (2.94)    0.017 (1.82)    0.017 (1.87)    0.021 (2.20)
log Y_{t-1}     --              0.800 (3.23)    0.750 (4.03)    --
log K_{t-1}     --              0.123 (1.13)    --              --
log L_{t-1}     --              0.152 (0.98)    --              --
d               0.80            2.00            1.99            1.95
\hat{\rho}      --              --              --              0.770
RSS             0.035           0.020           0.022           0.022
(a) [2 marks] In regression (1), what assumptions are being made concerning the disturbance term in the Cobb–Douglas production function? (b) [2 marks] Give interpretations of the coefficients of log $L_t$ and t in regression (1). (c) [5 marks] Comparing the results for regressions (1) and (2), explain which appears to be more satisfactory. (d) [4 marks] Comparing the results for regressions (2) and (3), explain which appears to be more satisfactory. (e) [3 marks] Explain the dynamics implicit in the specification of regression (3). (f) [3 marks] Explain mathematically the theory underlying the specification of regression (4). (g) [4 marks] You are asked to compare regressions (2) and (4). Describe the test that you would perform, specifying the null hypothesis and test statistic. You are not expected to calculate the test statistic, and no credit will be given for doing so. (h) [2 marks] Assume that the test statistic in (g) is not significant. What would you conclude? Explain whether the estimates of the coefficients in the table appear to be compatible with your conclusion.
9. A researcher has the following data for 3,763 respondents in the United States National Longitudinal Survey of Youth 1979: hourly earnings in dollars in 1994 and 2000, years of schooling as recorded in 1994 and 2000, and years of work experience as recorded in 1994 and 2000. The respondents were aged 14–21 in 1979, so they were aged 29–36 in 1994 and 35–42 in 2000. 371 of the respondents had increased their formal schooling between 1994 and 2000, 210 by one year, 101 by two years, 47 by three years, and 13 by more than three years, mostly at college level in non-degree courses. The researcher performs the following regressions: (1) the logarithm of hourly earnings in 1994 on schooling and work experience in 1994; (2) the logarithm of hourly earnings in 2000 on schooling and work experience in 2000; (3) the change in the logarithm of hourly earnings from 1994 to 2000 on the changes in schooling and work experience in that interval. The results are shown in columns (1)–(3) in the table (t statistics in parentheses), and are presented at a seminar.
                          (1)              (2)              (3)              (4)              (5)
Dependent variable        log earnings     log earnings     change in log    log earnings     change in log
                          1994             2000             earnings         2000             earnings
                                                            1994–2000                         1994–2000
Schooling                 0.114 (30.16)    0.116 (28.99)    --               0.108 (24.53)    --
Experience                0.052 (18.81)    0.038 (14.59)    --               0.037 (14.10)    --
Cognitive ability score   --               --               --               0.004 (4.79)     --
Male                      0.214 (12.03)    0.229 (11.77)    --               0.230 (11.88)    --
Black                     0.149 (5.23)     0.199 (6.44)     --               0.167 (5.29)     --
Hispanic                  0.039 (1.11)     0.053 (1.38)     --               0.071 (1.84)     --
Change in schooling       --               --               0.090 (5.00)     --               0.006 (0.16)
Change in experience      --               --               0.024 (2.75)     --               0.003 (0.15)
constant                  4.899 (74.59)    5.023 (65.02)    0.102 (2.13)     4.966 (63.69)    0.389 (3.05)
R2                        0.265            0.243            0.007            0.248            0.0002
n                         3,763            3,763            3,763            3,763            371
(a) [7 marks] He is unable to explain why the coefficient of the change in schooling in regression (3) is so much lower than the schooling coefficients in (1) and (2). Someone says that it is because he has left out relevant variables such as cognitive ability, region of residence, etc, and the coefficients in (1) and (2) are therefore biased. Someone else says that cannot be the explanation because these variables are also omitted from regression (3). Explain what would be your view. (b) [3 marks] He runs regressions (1) and (2) again, adding a measure of cognitive ability. The results for the 2000 regression are shown in column (4). The results for 1994 were very similar. Discuss possible reasons for the fact that the estimate of the schooling coefficient differs from those in (2) and (3). (c) [4 marks] Someone says that the researcher should not have included a constant in regression (3). Explain why she made this remark and assess whether it is valid.
(d) [3 marks] Someone else at the seminar says that the reason for the relatively low coefficient of schooling in regression (3) is that it mostly represented non-degree schooling. Hence one would not expect to find the same relationship between schooling and earnings as for the regular pre-employment schooling of young people. Explain in general verbal terms what investigation the researcher should undertake in response to this suggestion. Mathematical analysis is not required and will not earn marks. (e) [3 marks] Another person suggests that the small minority of individuals who went back to school or college in their thirties might have characteristics different from those of the individuals who did not, and that this could account for a different coefficient. Explain in general verbal terms what investigation the researcher should undertake in response to this suggestion. Mathematical analysis is not required and will not earn marks. (f) [5 marks] Finally, another person says that it might be a good idea to look at the relationship between earnings and schooling for the subsample who went back to school or college, restricting the analysis to these 371 individuals. The researcher responds by running the regression for that group alone. The result is shown in column (5) in the table. The researcher also plots a scatter diagram, reproduced below, showing the change in the logarithm of earnings and the change in schooling. For those with one extra year of schooling, the mean change in log earnings was 0.40. For those with two extra years, 0.37. For those with three extra years, 0.47. What conclusions might be drawn from the regression results? Mathematical analysis is not required and will not earn marks.
[Figure: scatter diagram of the change in log earnings (vertical axis, from -4 to 4) against the change in schooling (horizontal axis).]
EC220 INTRODUCTION TO ECONOMETRICS Marking Scheme for the 2008 Examination
Notes to examiners:
This marking scheme will in due course be posted on the EC220 website as a resource for future students. For this reason some of the explanations are more detailed than is necessary for a marking scheme. Please mark each item in each question to the nearest half mark. Examiners are encouraged to award additional marks if the candidate makes points that are not included in the marking scheme, provided that they are wholly relevant and provided that they are not just a statement of unnecessary detail. A few possible additional marks have been included in the marking scheme. If, contrary to instructions, a candidate has answered more than four questions, all those after the first four should be disregarded. Candidates have been told that this will be the case.
1. (a) [4 marks] Let the fitted model be $\hat{Y} = b_1$. Then the residual in observation i, $e_i$, is given by

$e_i = Y_i - b_1$

and the residual sum of squares by

$RSS = \sum e_i^2 = \sum (Y_i - b_1)^2 = \sum Y_i^2 - 2 b_1 \sum Y_i + n b_1^2$

Then

$\frac{dRSS}{db_1} = -2 \sum Y_i + 2 n b_1$

Setting this equal to zero, we have $b_1 = \bar{Y}$. This is the least squares estimator since the second differential, 2n, is positive and we have therefore found a minimum. Give 3 marks for the method and 1 mark for getting the mathematics right.

(b) [4 marks] Zero [1 mark].

$R^2 = \frac{\sum (\hat{Y}_i - \bar{Y})^2}{\sum (Y_i - \bar{Y})^2}$

and $\hat{Y}_i = \bar{Y}$ for all i [3 marks].
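The results in (a) and (b) can be checked numerically. The following is a minimal Python sketch, not part of the marking scheme: the data values and the function names `rss` and `r_squared_constant_only` are hypothetical and chosen only for illustration. It compares the RSS at the sample mean with the RSS at nearby values of b1, and computes R2 when every fitted value equals the sample mean.

```python
def rss(ys, b1):
    """Residual sum of squares when every fitted value equals b1."""
    return sum((y - b1) ** 2 for y in ys)


def r_squared_constant_only(ys):
    """R^2 for a regression on a constant only: explained SS / total SS."""
    y_bar = sum(ys) / len(ys)
    fitted = [y_bar] * len(ys)  # each fitted value is the sample mean
    explained = sum((f - y_bar) ** 2 for f in fitted)
    total = sum((y - y_bar) ** 2 for y in ys)
    return explained / total


ys = [2.0, 3.5, 1.0, 4.5, 4.0]  # hypothetical observations on Y
y_bar = sum(ys) / len(ys)       # solution of dRSS/db1 = 0

print("b1 =", y_bar)
print("RSS at b1:", rss(ys, y_bar))
print("RSS at b1 + 0.1:", rss(ys, y_bar + 0.1))
print("R^2:", r_squared_constant_only(ys))
```

Moving b1 away from the sample mean in either direction raises the RSS, and the explained sum of squares is zero because the fitted values do not vary.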
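(c) [2 marks]

$E(\bar{Y}) = E(\beta_1 + \bar{u}) = \beta_1 + E(\bar{u}) = \beta_1$

since $E(\bar{u}) = 0$. $E(\bar{u}) = 0$ because $E(u_i) = 0$ for all i.

(d) [2 marks]

$\sigma_{b_1}^2 = E\left[ (\bar{Y} - E(\bar{Y}))^2 \right] = E\left[ (\beta_1 + \bar{u} - \beta_1)^2 \right] = E(\bar{u}^2) = \frac{\sigma_u^2}{n}$

[Alternatively, and equivalently,

$\mathrm{var}(b_1) = \mathrm{var}(\bar{Y}) = \mathrm{var}(\beta_1 + \bar{u}) = \mathrm{var}(\bar{u}) = \mathrm{var}\left( \frac{\sum u_i}{n} \right) = \frac{1}{n^2} \mathrm{var}\left( \sum u_i \right) = \frac{n \sigma_u^2}{n^2} = \frac{\sigma_u^2}{n}$ ]

(e) [5 marks] First we need to show that $E(b_2) = 0$.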
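$b_2 = \frac{\sum (X_i - \bar{X})(Y_i - \bar{Y})}{\sum (X_i - \bar{X})^2} = \frac{\sum (X_i - \bar{X})(\beta_1 + u_i - \beta_1 - \bar{u})}{\sum (X_i - \bar{X})^2} = \frac{\sum (X_i - \bar{X})(u_i - \bar{u})}{\sum (X_i - \bar{X})^2}$

Hence, given that we are told that X is nonstochastic,

$E(b_2) = E\left[ \frac{\sum (X_i - \bar{X})(u_i - \bar{u})}{\sum (X_i - \bar{X})^2} \right] = \frac{1}{\sum (X_i - \bar{X})^2} E\left[ \sum (X_i - \bar{X})(u_i - \bar{u}) \right] = \frac{1}{\sum (X_i - \bar{X})^2} \sum (X_i - \bar{X}) E(u_i - \bar{u}) = 0$

since E(u) = 0 [3 marks]. Thus

$E(b_1) = E(\bar{Y} - b_2 \bar{X}) = \beta_1 - \bar{X} E(b_2) = \beta_1$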
and the estimator is unbiased [2 marks]. (f) [3 marks] [Note to examiners: Candidates do not have to follow the suggested procedure. Award appropriate credit for alternative methods]. The denominator of the second term may be written n MSD(X) where the second factor, the mean square deviation of X about its sample mean, is $\frac{1}{n} \sum (X_i - \bar{X})^2$ [1 mark]. The larger the value of n or the MSD, the larger will be the denominator and hence the lower will be the variance. In the case of n, the intuition is that the greater the amount of information, the more precise should be the estimates [1 mark]. In the case of MSD, the greater the variation in X (holding the variation in u constant), the easier it will be to separate the effects of X and u on Y, and hence the more precise should be the estimates of the parameters [1 mark]. (g) [2 marks] If $\bar{X} = 0$, the estimators are identical. $\bar{Y} - b_2 \bar{X}$ reduces to $\bar{Y}$. (h) [3 marks] If X really is a determinant, $\bar{Y}$ will yield biased estimates (unless $\bar{X} = 0$: candidates are not expected to say this) [1 mark]. If it is not, $\bar{Y}$ will be more efficient since it will have a smaller variance (again, unless $\bar{X} = 0$: candidates are not expected to say this) [2 marks].
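The variance comparison in (f) to (h) can be illustrated with a short Python sketch. It is not part of the marking scheme: the function names and the X values are hypothetical, and the disturbance variance is set to 1 purely for convenience. It evaluates the two population variance expressions given in the question, once for a sample where the mean of X is zero and once where it is not.

```python
def var_sample_mean(sigma2_u, n):
    """Population variance of the sample mean of Y: sigma_u^2 / n."""
    return sigma2_u / n


def var_b1(sigma2_u, xs):
    """Population variance of b1 = Ybar - b2*Xbar for nonstochastic X:
    sigma_u^2 * (1/n + Xbar^2 / sum((X_i - Xbar)^2))."""
    n = len(xs)
    x_bar = sum(xs) / n
    ssd = sum((x - x_bar) ** 2 for x in xs)  # sum of squared deviations of X
    return sigma2_u * (1.0 / n + x_bar ** 2 / ssd)


sigma2_u = 1.0                             # assumed disturbance variance
xs_centred = [-2.0, -1.0, 0.0, 1.0, 2.0]   # hypothetical X with mean zero
xs_shifted = [1.0, 2.0, 3.0, 4.0, 5.0]     # same spread, mean 3

print(var_sample_mean(sigma2_u, 5))        # variance of Ybar
print(var_b1(sigma2_u, xs_centred))        # equal to the variance of Ybar
print(var_b1(sigma2_u, xs_shifted))        # larger, because Xbar != 0
```

With the mean of X equal to zero the two variances coincide, as in (g). Otherwise the second term is positive, which is the efficiency cost referred to in (h).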
2. (a) [4 marks] The high correlation between P and S will give rise to the problem known as multicollinearity. 3 marks for agreeing, explicitly or implicitly, that the potential problem is multicollinearity. 1 mark for explaining that "will" is too strong because a high correlation is a necessary but not sufficient condition. If the variance of the disturbance term is small, or the number of observations large, or the mean square deviations of P and S are large, there may be no real problem of multicollinearity. (b) [5 marks] Multicollinearity does not cause the estimates of the coefficients to be biased but it does cause them to be inconsistent. 2 marks for agreeing with the statement about unbiasedness. 3 marks for disagreeing with the statement about inconsistency, explaining that the estimators would remain unbiased as the number of observations became large and there is no reason for the variance not to tend to zero. (c) [4 marks] The standard errors will be biased, probably downwards. Incorrect. They remain valid [2 marks; give 1 extra mark if the candidate explains that multicollinearity does not cause a violation of the regression model assumptions]. Multicollinearity may cause the standard errors to be large, warning that there is a loss of precision and that the point estimates of the coefficients may be erratic [2 marks]. (d) [1 mark] The t tests and F test will be invalid. Incorrect. They remain valid. (e) [2 marks] The problem will be even worse if there is a high correlation between A and P or between A and S. This is nonsense. The researcher wants A to be highly correlated with the explanatory variables and a high correlation between it and them has no bearing on a problem of multicollinearity. (f) [2 marks] One way of dealing with the problem is to run two separate regressions, one with A regressed on P, the other with A regressed on S. Multicollinearity is not usually a serious problem in models with only one explanatory variable.
1 mark for saying that this would merely give rise to omitted variable bias. 1 mark for saying that "not usually" should be "never" because the problem cannot arise in a simple regression model. (g) [3 marks] Another way of dealing with the problem is to use the following procedure: choose values a2 and a3, construct
Z = a2P + a3S
and regress A on Z. Do this for a large number of combinations of a2 and a3. Choose the combination that yields the smallest residual sum of squares when A is regressed on Z. This is known as Two Stage Least Squares. 2 marks for saying that the estimates would effectively be the same as in a linear regression. 1 mark for saying that this is not what is known as TSLS. (h) [4 marks] Of course it is possible that A depends only on P and not on S. In this case we should just perform the simple regression of A on P. We should, however, be careful to check whether the residuals from this regression are significantly correlated with A, P, or S. A high correlation with any of these variables would be evidence of misspecification. 1 mark for agreeing that simple regression would indeed be appropriate if A depends only on P. 1 mark for saying that the residuals must be positively correlated with A. Give 1 extra mark for a mathematical explanation, for example: $\mathrm{cov}(A, e) = \mathrm{cov}([\hat{A} + e], e) = \mathrm{var}(e) > 0$, $\mathrm{cov}(\hat{A}, e)$ necessarily being equal to zero (basic OLS property). 1 mark for saying that the residuals cannot be correlated with P (basic OLS property). 1 mark for saying that a correlation of the residuals with S would indeed be evidence of misspecification.
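The two OLS properties used in (h) can be verified on a small example. The Python sketch below is illustrative only: the five data points for P and A are hypothetical, and the helper names `mean`, `cov` and `ols_simple` are not from the source. It fits a simple regression of A on P and checks that the residuals are uncorrelated with P while their covariance with A equals their own variance.

```python
def mean(values):
    return sum(values) / len(values)


def cov(x, y):
    """Sample covariance (dividing by n) of two equal-length lists."""
    mx, my = mean(x), mean(y)
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / len(x)


def ols_simple(y, x):
    """Intercept and slope of a simple OLS regression of y on x."""
    b2 = cov(x, y) / cov(x, x)
    b1 = mean(y) - b2 * mean(x)
    return b1, b2


P = [10.0, 20.0, 30.0, 40.0, 50.0]  # hypothetical poverty data
A = [12.0, 15.0, 25.0, 24.0, 34.0]  # hypothetical arrest data

b1, b2 = ols_simple(A, P)
fitted = [b1 + b2 * p for p in P]
resid = [a - f for a, f in zip(A, fitted)]

print("cov(e, P)   =", cov(resid, P))      # zero up to rounding error
print("cov(A, e)   =", cov(A, resid))      # equals var(e)
print("var(e)      =", cov(resid, resid))  # strictly positive
```

A check of the residuals against A or P therefore carries no information about misspecification. Only the correlation with the omitted variable S is informative.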
3. (a) [4 marks] Define a dummy variable I that is equal to 1 if the firm were offered access to the incentive scheme and 0 if it were not [2 marks]. Add this variable to both (3.1) and (3.2) and perform t tests on its coefficients in those regressions [2 marks]. (b) [7 marks] Define a dummy variable M that is equal to 1 if the firm is in the manufacturing sector and 0 if it is in the service sector [1 mark]. Define interactive variables MS, MV, and MI defined as the product of M and S, V, and I, respectively [2 marks]. We now have a model on the lines of
$T = \beta_1 + \beta_2 S + \beta_3 V + \delta I + \lambda_1 M + \lambda_2 MS + \lambda_3 MV + \lambda_4 MI + u$
and we would perform a t test on the coefficient of MI [1 mark]. The coefficient of I provides an estimate of the impact of the incentive scheme in the services sector and the coefficient of MI provides an estimate of the extra impact in the manufacturing sector [1 mark]. We need to include the dummy variable M and the interactive terms MS and MV to allow the basic coefficients of the relationship to differ for the two sectors. Equations (3.1) and (3.2) make it clear that we should not assume that they are the same [2 marks]. (c) [6 marks] For a Chow test, the appropriate regression specification would include the incentive scheme dummy variable, but not the manufacturing sector dummy variable or the interactive terms [1 mark; the candidate may earn it either verbally or by stating the specification mathematically]. The regression is performed three times, once with only the manufacturing enterprises, a second time with only the service enterprises, and a third time with the pooled sample. Let the residual sums of squares be RSSM, RSSS, and RSSP, respectively. The null hypothesis is that all the coefficients in the regression specification are the same for the two subsamples, and as a consequence there will be no significant improvement in fit on dividing the sample [1 mark]. The test statistic is
$F(4, 1992) = \frac{(RSS_P - [RSS_M + RSS_S]) / 4}{(RSS_M + RSS_S) / 1992}$
since there are 2,000 observations and relaxing the restrictions consumes 4 degrees of freedom [2 marks]. However, the test is inappropriate since rejection of the null hypothesis could be attributable to differences in the intercept or coefficients of S or V, rather than a difference in the coefficients of I [2 marks]. (d) [5 marks] Define a variable D equal to 1 if the enterprise had a training department, 0 otherwise, and regress D on S, V, and M using probit or logit analysis. Test the coefficient of M [5 marks]. Give up to 2 extra marks if the candidate should provide an outline of the probit or logit model. (e) [3 marks] The correlation coefficient is not an adequate test statistic because it tests the wrong hypothesis.
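The Chow test statistic described in (c) can be sketched as a small Python function. This is an illustration under stated assumptions: the function name `chow_f` is hypothetical, and the three RSS values in the usage lines are invented solely to show the arithmetic, since the examination gives no regression output for this question.

```python
def chow_f(rss_pooled, rss_sub1, rss_sub2, k, n):
    """Chow test F statistic and its degrees of freedom.

    k is the number of parameters in the specification (here 4: the
    intercept and the coefficients of S, V and I) and n is the total
    number of observations in the pooled sample.
    """
    rss_split = rss_sub1 + rss_sub2          # RSS with separate regressions
    df_num = k                               # restrictions relaxed by splitting
    df_den = n - 2 * k                       # degrees of freedom remaining
    f_stat = ((rss_pooled - rss_split) / df_num) / (rss_split / df_den)
    return f_stat, (df_num, df_den)


# Hypothetical residual sums of squares, for illustration only.
f_stat, dfs = chow_f(rss_pooled=1100.0, rss_sub1=500.0, rss_sub2=496.0, k=4, n=2000)
print(f_stat, dfs)
```

The degrees of freedom come out as (4, 1992), matching the expression in the marking scheme. A significant value would still not isolate a difference in the coefficient of I, which is why the test is judged inappropriate for (b).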
4. (a) [3 marks] Write the second equation as

$\log \frac{I}{Y} = \beta_1 + \beta_2 \log \frac{G}{Y} + u$

It may be re-written

$\log I = \beta_1 + \beta_2 \log G + (1 - \beta_2) \log Y + u$

This is a special case of the specification of the first equation,

$\log I = \beta_1 + \beta_2 \log G + \beta_3 \log Y + u$

with the restriction $\beta_3 = 1 - \beta_2$. (b) [5 marks] The null hypothesis is $H_0: \beta_2 + \beta_3 = 1$ [1 mark]. The test statistic is

$F(1, 27) = \frac{(0.99 - 0.90) / 1}{0.90 / 27} = 2.7$ [3 marks]
The critical value of F(1, 27) is 4.21 at the 5 percent level. Hence we do not reject the null hypothesis that the restriction is valid [1 mark]. (c) [2 marks] The imposition of the restriction, if valid, should lead to a gain in efficiency [1 mark] and this should be reflected in lower standard errors [1 mark]. (d) [5 marks] The standard errors of the coefficients of G in (4.1) and G/Y in (4.2) are given by
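The arithmetic of the F test in (b) can be reproduced with a few lines of Python. This is only a sketch: the function name `f_restriction` is hypothetical, while the RSS values, the degrees of freedom and the 5 percent critical value are those quoted in the question and the marking scheme.

```python
def f_restriction(rss_restricted, rss_unrestricted, n_restrictions, df_unrestricted):
    """F statistic for testing linear restrictions by comparing RSS."""
    improvement = (rss_restricted - rss_unrestricted) / n_restrictions
    return improvement / (rss_unrestricted / df_unrestricted)


# 30 observations and 3 parameters in (4.1) give 27 degrees of freedom.
f_stat = f_restriction(0.99, 0.90, 1, 30 - 3)
critical_5_percent = 4.21  # F(1, 27) critical value quoted in the marking scheme

print(round(f_stat, 2))
print("reject H0" if f_stat > critical_5_percent else "do not reject H0")
```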
$$\sqrt{\frac{s_u^2}{n\,\mathrm{MSD}(G)} \times \frac{1}{1 - r_{G,Y}^2}} \quad \text{and} \quad \sqrt{\frac{s_u^2}{n\,\mathrm{MSD}(G/Y)}}$$
respectively, where $s_u^2$ is an estimate of the variance of the disturbance term, n is the number of observations, MSD is the mean square deviation in the sample, and $r_{G,Y}$ is the sample correlation coefficient of G and Y [2 marks]. n is the same for both standard errors and $s_u^2$ will be very similar. We are told that $r_{G,Y}$ = 0.98, so its square is 0.96 and the second factor in the expression for the standard error of G is (1/0.04) = 25. Hence, other things being equal, the standard error of G/Y should be much lower than that of G. However the table shows that the MSD of G/Y is only 1/25 as great as that of G. This just about exactly negates the gain in efficiency attributable to the elimination of the correlation between G and Y [3 marks; give 2 marks if the candidate offers a verbal explanation on the right lines, instead of a numerical one]. (e) [5 marks] Note to examiners: The reparameterization technique, as taught in lectures this year and as described in the new version of the study guide, is different from that in the text. Candidates may use either procedure. Both are described here. Either way, give 4 marks for the reparameterization and 1 mark for describing the t test.
Lectures version: Define $\theta = \beta_2 + \beta_3 - 1$, so that the restriction may be written $\theta = 0$. Then $\beta_3 = \theta - \beta_2 + 1$. Use this to substitute for $\beta_3$ in the unrestricted model:
$$\log I = \beta_1 + \beta_2 \log G + \beta_3 \log Y + u = \beta_1 + \beta_2 \log G + (\theta - \beta_2 + 1) \log Y + u$$
Then
$$\log I - \log Y = \beta_1 + \beta_2 (\log G - \log Y) + \theta \log Y + u$$
and
$$\log \frac{I}{Y} = \beta_1 + \beta_2 \log \frac{G}{Y} + \theta \log Y + u$$
Hence the restriction may be tested by a t test of the coefficient of log Y in a regression using this specification.
Text version:
The unrestricted version is
$$\log I = \beta_1 + \beta_2 \log G + \beta_3 \log Y + u$$
The restricted version may be written $\log I = \beta_1 + \beta_2 \log G + (1 - \beta_2) \log Y + u$ (see above). Subtracting the second equation from the first, we have $0 = (\beta_3 - 1 + \beta_2) \log Y$. Adding this term converts the restricted version back into the unrestricted version:
$$\log \frac{I}{Y} = \beta_1 + \beta_2 \log \frac{G}{Y} + (\beta_3 - 1 + \beta_2) \log Y + u$$
A t test on the coefficient of log Y is a test of the restriction. If the coefficient is not significant, we can drop the term and use the restricted version. If the coefficient is significantly different from zero, we cannot drop the term and must stay with the unrestricted version. (f) [5 marks] If the restriction is valid, imposing it will have no implications for the disturbance term and so it could not lead to any mitigation of a potential problem of heteroscedasticity. [This is enough for the 5 marks. EC220 students have seen how scaling through a linear specification by a variable proportional in observation i to the standard deviation of ui in observation i would lead to the elimination of heteroscedasticity. The present specification is logarithmic and dividing I and G by Y does not affect the disturbance term.]
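The arithmetic behind parts (b) and (d) of this question can be checked with a short sketch. The numbers (RSS of 0.99 and 0.90, 27 degrees of freedom, the critical value 4.21, the correlation 0.98, and the MSD ratio of 1/25) are those quoted in the marking scheme. The function and variable names are illustrative only.

```python
def restriction_f(rss_restricted, rss_unrestricted, n_restrictions, df_unrestricted):
    """F statistic for testing linear restrictions by comparing residual sums of squares."""
    improvement = (rss_restricted - rss_unrestricted) / n_restrictions
    return improvement / (rss_unrestricted / df_unrestricted)


# Part (b): one restriction, 27 degrees of freedom in the unrestricted regression.
f_stat = restriction_f(0.99, 0.90, 1, 27)
reject_restriction = f_stat > 4.21          # 5 percent critical value quoted in the text

# Part (d): ratio of the variance of the coefficient of G to that of G/Y.
r_gy = 0.98
correlation_factor = 1.0 / (1.0 - r_gy ** 2)   # about 25 (the text rounds r squared to 0.96)
msd_ratio = 1.0 / 25.0                         # MSD(G/Y) / MSD(G), from the text
variance_ratio = correlation_factor * msd_ratio

print(round(f_stat, 2), reject_restriction, round(variance_ratio, 3))
```

A variance ratio close to one is the numerical counterpart of the statement that the smaller MSD of G/Y roughly cancels the gain from removing the correlation between G and Y.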
5. (a) [4 marks] A proxy is used to mitigate a potential problem of omitted variable bias when there are no data on a regressor [2 marks]. An instrument is used when there are data on the regressor, but the regressor is not distributed independently of the disturbance term. Provided that the instrument satisfies certain conditions, the IV estimator will be consistent [2 marks]. (b) (i) [2 marks] Standard theory. r is effectively an addition to the disturbance term and therefore increases the variance of the estimator of $\beta_2$, but does not impart bias. (ii) [5 marks] Standard theory. The OLS estimator of $\beta_2$ will be inconsistent, the estimator tending to
$$\beta_2 - \beta_2 \frac{\sigma_w^2}{\sigma_Y^2 + \sigma_w^2}$$
Candidates are expected to provide a proof [3 marks]. They should explain why they are taking plims rather than expectations [1 mark] and they should give a reason when asserting that the plims of certain terms are zero [1 mark].
(c) (i) [5 marks] Substituting for Y in (5.1), we have
$$\log H = \beta_1 + \beta_2 \log \lambda E v + u = \beta_1 + \beta_2 \log \lambda + \beta_2 \log E + u + \beta_2 \log v$$
[1 mark]. The researcher still has to use K rather than H, but the OLS estimator of $\beta_2$ will now be unbiased [2 marks] and the standard errors and test statistics will be valid [1 mark]. The only potential disadvantage is that the variance could be large if the variance of log v is large [1 mark]. (ii) [5 marks] The use of log E as an instrumental variable will yield a consistent estimate. The IV estimator is
$$b_2^{IV} = \frac{\sum (\log E_i - \overline{\log E})(\log K_i - \overline{\log K})}{\sum (\log E_i - \overline{\log E})(\log Z_i - \overline{\log Z})}$$
$$= \frac{\sum (\log E_i - \overline{\log E})\left([\beta_1 + \beta_2 \log Z_i + u_i + r_i - \beta_2 w_i] - [\beta_1 + \beta_2 \overline{\log Z} + \bar{u} + \bar{r} - \beta_2 \bar{w}]\right)}{\sum (\log E_i - \overline{\log E})(\log Z_i - \overline{\log Z})}$$
$$= \beta_2 + \frac{\sum (\log E_i - \overline{\log E})\left([u_i + r_i - \beta_2 w_i] - [\bar{u} + \bar{r} - \beta_2 \bar{w}]\right)}{\sum (\log E_i - \overline{\log E})(\log Z_i - \overline{\log Z})}$$
It will not be possible to obtain a closed-form expression for the expectation because w is a determinant of the denominator, via Z, as well as of the numerator. Instead we take plims, having first divided the numerator and denominator of the error term by n so that they have limits:
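$$\operatorname{plim} b_2^{IV} = \beta_2 + \frac{\operatorname{plim} \frac{1}{n} \sum (\log E_i - \overline{\log E})\left([u_i + r_i - \beta_2 w_i] - [\bar{u} + \bar{r} - \beta_2 \bar{w}]\right)}{\operatorname{plim} \frac{1}{n} \sum (\log E_i - \overline{\log E})(\log Z_i - \overline{\log Z})}$$
$$= \beta_2 + \frac{\operatorname{cov}(\log E, u) + \operatorname{cov}(\log E, r) - \beta_2 \operatorname{cov}(\log E, w)}{\operatorname{cov}(\log E, \log Z)} = \beta_2$$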
assuming that log E is distributed independently of u, r, and w. [3 marks for standard theory, 1 mark for explaining why plims, not expectations, 1 mark for explaining why some plim terms are zero] (iii) [4 marks] Let the fitted relationship be $\widehat{\log Z} = a_1 + a_2 \log E$ (strictly speaking, we should exploit the fact that the coefficient of log E is one, but this makes no difference). $\widehat{\log Z}$ will be a linear function of log E and therefore will yield the same estimator when used as an instrument:
$$b_2^{IV} = \frac{\sum (\widehat{\log Z}_i - \overline{\widehat{\log Z}})(\log K_i - \overline{\log K})}{\sum (\widehat{\log Z}_i - \overline{\widehat{\log Z}})(\log Z_i - \overline{\log Z})}$$
$$= \frac{\sum ([a_1 + a_2 \log E_i] - [a_1 + a_2 \overline{\log E}])(\log K_i - \overline{\log K})}{\sum ([a_1 + a_2 \log E_i] - [a_1 + a_2 \overline{\log E}])(\log Z_i - \overline{\log Z})}$$
$$= \frac{\sum (a_2 \log E_i - a_2 \overline{\log E})(\log K_i - \overline{\log K})}{\sum (a_2 \log E_i - a_2 \overline{\log E})(\log Z_i - \overline{\log Z})}$$
$$= \frac{\sum (\log E_i - \overline{\log E})(\log K_i - \overline{\log K})}{\sum (\log E_i - \overline{\log E})(\log Z_i - \overline{\log Z})}$$
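The results in parts (b)(ii), (c)(ii) and (c)(iii) can be illustrated with a small simulation. This is a sketch under stated assumptions, not part of the marking scheme: all parameter values, the normal distributions and the function names are hypothetical, and the scale constant in the relationship between Y and E is set so that its logarithm is zero. It shows OLS of log K on log Z being attenuated, IV with log E recovering the slope, and the fitted values of log Z giving the same IV estimate as log E itself.

```python
import random


def mean(values):
    return sum(values) / len(values)


def iv_slope(instrument, regressor, outcome):
    """IV slope estimator; passing the regressor as its own instrument gives OLS."""
    m_i, m_x, m_y = mean(instrument), mean(regressor), mean(outcome)
    num = sum((i - m_i) * (y - m_y) for i, y in zip(instrument, outcome))
    den = sum((i - m_i) * (x - m_x) for i, x in zip(instrument, regressor))
    return num / den


def simulate(n=20000, beta1=1.0, beta2=0.5, seed=0):
    rng = random.Random(seed)
    log_e = [rng.gauss(0.0, 1.0) for _ in range(n)]
    log_y = [e + rng.gauss(0.0, 0.5) for e in log_e]                  # log Y = log E + log v
    log_h = [beta1 + beta2 * y + rng.gauss(0.0, 0.3) for y in log_y]  # true relationship
    log_k = [h + rng.gauss(0.0, 0.3) for h in log_h]                  # H measured with error r
    log_z = [y + rng.gauss(0.0, 1.0) for y in log_y]                  # Y measured with error w

    b_ols = iv_slope(log_z, log_z, log_k)   # inconsistent: attenuated towards zero
    b_iv = iv_slope(log_e, log_z, log_k)    # consistent: log E is a valid instrument

    # Part (iii): use fitted values from a regression of log Z on log E as the instrument.
    a2 = iv_slope(log_e, log_e, log_z)
    a1 = mean(log_z) - a2 * mean(log_e)
    fitted = [a1 + a2 * e for e in log_e]
    b_fitted = iv_slope(fitted, log_z, log_k)
    return b_ols, b_iv, b_fitted


b_ols, b_iv, b_fitted = simulate()
print(round(b_ols, 3), round(b_iv, 3), round(b_fitted, 3))
```

With these assumed variances the OLS estimate should settle near 0.5 × 1.25 / 2.25, about 0.28, while both IV estimates should be close to the assumed slope of 0.5.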
6. (a) [2 marks] For the purposes of EC220, it is a variable whose value in each observation is determined outside the model in question and therefore may be taken as given. The critical requirement is that it should be distributed independently of the disturbance term (b) [8 marks] At some point we will need the reduced form equation for Y. Substituting into the third equation from the first two, and re-arranging, it is
$$Y = \frac{1}{1 - \alpha_2 - \beta_2}(\alpha_1 + \beta_1 + \alpha_3 K + u + v)$$
[2 marks]
Since Y depends on u, the assumption that the disturbance term be distributed independently of the regressors is violated in (6.1). [2 marks; candidates may earn these two marks without making this statement if they successfully demonstrate mathematically that OLS yields inconsistent estimates.]
$$b_2^{OLS} = \frac{\sum (Y_i - \bar{Y})(W_i - \bar{W})}{\sum (Y_i - \bar{Y})^2} = \beta_2 + \frac{\sum (Y_i - \bar{Y})(u_i - \bar{u})}{\sum (Y_i - \bar{Y})^2}$$
after substituting for W from (6.1) and simplifying. We are not able to obtain a closed form expression for the expectation of the error term because u influences both its numerator and denominator, directly and by virtue of being a component of Y, as seen in the reduced form [1 mark; candidates are expected to explain why they take plims rather than expectations]. Dividing both the numerator and denominator by n, and noting that
$$\operatorname{plim} \frac{1}{n} \sum (Y_i - \bar{Y})^2 = \operatorname{var}(Y)$$
as a consequence of a law of large numbers, and that it can also be shown that
$$\operatorname{plim} \frac{1}{n} \sum (Y_i - \bar{Y})(u_i - \bar{u}) = \operatorname{cov}(Y, u)$$
we can write
$$\operatorname{plim} b_2^{OLS} = \beta_2 + \frac{\operatorname{plim} \frac{1}{n} \sum (Y_i - \bar{Y})(u_i - \bar{u})}{\operatorname{plim} \frac{1}{n} \sum (Y_i - \bar{Y})^2} = \beta_2 + \frac{\operatorname{cov}(Y, u)}{\operatorname{var}(Y)}$$
Now
$$\operatorname{cov}(Y, u) = \operatorname{cov}\left(\frac{1}{1 - \alpha_2 - \beta_2}(\alpha_1 + \beta_1 + \alpha_3 K + u + v),\ u\right) = \frac{1}{1 - \alpha_2 - \beta_2}\left(\alpha_3 \operatorname{cov}(K, u) + \operatorname{var}(u) + \operatorname{cov}(v, u)\right)$$
the covariance of u with the constants being zero. Since K is exogenous, cov(K, u) = 0. We are told that u and v are distributed independently of each other, and so cov(u, v) = 0. Hence
$$\operatorname{plim} b_2^{OLS} = \beta_2 + \frac{1}{1 - \alpha_2 - \beta_2} \cdot \frac{\sigma_u^2}{\operatorname{plim} \operatorname{var}(Y)}$$
From the reduced form equation for Y it is evident that $(1 - \alpha_2 - \beta_2) > 0$, and so the large-sample bias will be positive [3 marks]. (c) [4 marks] It is not possible for an estimator that is unbiased in a finite sample to develop a bias if the sample size increases. Therefore, since the estimator is biased in large samples, it must also be biased in finite ones [3 marks]. [Technically, the expectation might not even exist in a finite sample, but candidates are not expected to say this.] The plim may well be a guide to the mean of the estimator in a finite sample, but this is not guaranteed and it is unlikely to be exactly equal to the mean. [1 mark; give 1 extra mark for a particularly good discussion] (d) [4 marks] Use K as an instrument for Y.
$$b_2^{IV} = \frac{\sum (K_i - \bar{K})(W_i - \bar{W})}{\sum (K_i - \bar{K})(Y_i - \bar{Y})} = \beta_2 + \frac{\sum (K_i - \bar{K})(u_i - \bar{u})}{\sum (K_i - \bar{K})(Y_i - \bar{Y})}$$
after substituting for W from (6.1) and simplifying. We are not able to obtain a closed form expression for the expectation of the error term because u influences both its numerator and denominator, directly and by virtue of being a component of Y, as seen in the reduced form [1 mark; candidates are again expected to explain why they take plims rather than expectations]. Dividing both the numerator and denominator by n, and noting that it can be shown that
$$\operatorname{plim} \frac{1}{n} \sum (K_i - \bar{K})(u_i - \bar{u}) = \operatorname{cov}(K, u) = 0$$
since K is exogenous, and that
$$\operatorname{plim} \frac{1}{n} \sum (K_i - \bar{K})(Y_i - \bar{Y}) = \operatorname{cov}(K, Y)$$
we can write
$$\operatorname{plim} b_2^{IV} = \beta_2 + \frac{\operatorname{cov}(K, u)}{\operatorname{cov}(K, Y)} = \beta_2$$
cov(K, Y) is non-zero since the reduced form equation for Y reveals that K is a determinant of Y. Hence the instrumental variable estimator is consistent [3 marks]. (e) [2 marks] (6.2) is underidentified because the endogenous variable Y is a regressor and there is no valid instrument to use with it. The only potential instrument is the exogenous variable K and it is already a regressor in its own right. (f) [2 marks] (6.3) is an identity so the issue of identification does not arise. (g) [3 marks] The argument would be valid if Y were exogenous, in which case one could characterize $\beta_2$ and $\alpha_2$ as being the effects of Y on W and P, holding other variables constant. But Y is endogenous, and so the coefficients represent only part of an adjustment process. Y cannot change autonomously, only in response to variations in K, u, or v. [Give the 3 marks for this or similar. The following is not necessary, but give 1 extra mark, or even 2, for a mathematical explanation]. The reduced form equations for W and P are
$$W = \beta_1 + \frac{\beta_2}{1 - \alpha_2 - \beta_2}(\alpha_1 + \beta_1 + \alpha_3 K + u + v) + u = \frac{1}{1 - \alpha_2 - \beta_2}\left(\beta_1 + \alpha_1 \beta_2 - \alpha_2 \beta_1 + \alpha_3 \beta_2 K + (1 - \alpha_2) u + \beta_2 v\right)$$
$$P = \alpha_1 + \frac{\alpha_2}{1 - \alpha_2 - \beta_2}(\alpha_1 + \beta_1 + \alpha_3 K + u + v) + \alpha_3 K + v = \frac{1}{1 - \alpha_2 - \beta_2}\left(\alpha_1 - \alpha_1 \beta_2 + \alpha_2 \beta_1 + \alpha_3 (1 - \beta_2) K + \alpha_2 u + (1 - \beta_2) v\right)$$
Thus, for example, a change in K will lead to changes in W and P in the proportions $\beta_2 : (1 - \beta_2)$, not $\beta_2 : \alpha_2$. The same is true of changes caused by a variation in v. For a variation in u, the proportions would be $(1 - \alpha_2) : \alpha_2$. If instead the candidate simply insists that, given standard theory, it is not possible for the profits equation to be identified, and that the argument therefore somehow must be wrong, give 1.5 marks.
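The large-sample results in parts (b) and (d) of this question can be illustrated by simulation. The sketch below assumes the structural form W = β1 + β2 Y + u, P = α1 + α2 Y + α3 K + v, Y = W + P, with the Greek labelling and all numerical values being hypothetical choices made for the illustration. With these values the predicted OLS limit is β2 + 0.1 = 0.5, while IV with K as the instrument should be close to β2 = 0.4.

```python
import random


def mean(values):
    return sum(values) / len(values)


def iv_slope(instrument, regressor, outcome):
    """IV slope estimator; passing the regressor as its own instrument gives OLS."""
    m_i, m_x, m_y = mean(instrument), mean(regressor), mean(outcome)
    num = sum((i - m_i) * (y - m_y) for i, y in zip(instrument, outcome))
    den = sum((i - m_i) * (x - m_x) for i, x in zip(instrument, regressor))
    return num / den


def simulate(n=20000, seed=1):
    beta1, beta2 = 1.0, 0.4                 # assumed wage equation parameters
    alpha1, alpha2, alpha3 = 2.0, 0.3, 0.5  # assumed profits equation parameters
    rng = random.Random(seed)
    k = [rng.gauss(10.0, 2.0) for _ in range(n)]   # exogenous variable
    u = [rng.gauss(0.0, 1.0) for _ in range(n)]
    v = [rng.gauss(0.0, 1.0) for _ in range(n)]

    # Reduced form for Y, then the structural wage equation.
    scale = 1.0 / (1.0 - alpha2 - beta2)
    y = [scale * (alpha1 + beta1 + alpha3 * ki + ui + vi) for ki, ui, vi in zip(k, u, v)]
    w = [beta1 + beta2 * yi + ui for yi, ui in zip(y, u)]

    b_ols = iv_slope(y, y, w)   # Y is correlated with u, so OLS is inconsistent
    b_iv = iv_slope(k, y, w)    # K is exogenous and a determinant of Y
    return b_ols, b_iv


b_ols, b_iv = simulate()
print(round(b_ols, 3), round(b_iv, 3))
```

The upward gap between the two estimates corresponds to the positive large-sample bias derived in part (b).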
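7. (a) [5 marks] The OLS estimator of the slope coefficient of $Y_t = \beta_1 + \beta_2 X_t + u_t$ is
$$b_2^{OLS} = \frac{\sum (X_t - \bar{X})(Y_t - \bar{Y})}{\sum (X_t - \bar{X})^2}$$
Substituting for Y, this may be decomposed as
$$b_2^{OLS} = \beta_2 + \frac{\sum (X_t - \bar{X})(u_t - \bar{u})}{\sum (X_t - \bar{X})^2} = \beta_2 + \frac{1}{\Delta} \sum (X_t - \bar{X}) u_t$$
where $\Delta = \sum (X_t - \bar{X})^2$. Hence one may write
$$b_2^{OLS} = \beta_2 + \sum a_t u_t$$
where $a_t = \dfrac{X_t - \bar{X}}{\Delta}$. In the present case, $X_t = Y_{t-1}$.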
(b) [5 marks] For unbiasedness we need $E(a_t u_t) = 0$ for all t. Since $E(u_t) = 0$, this follows if $E(a_t u_t) = E(a_t)E(u_t)$. However, the decomposition requires that $a_t$ and $u_t$ be distributed independently. This cannot be the case since $a_t$ depends on all the values of Y in the sample, not just $Y_t$, and $u_t$ influences $Y_{t+1}$ and all subsequent values of Y. (c) [5 marks] The OLS estimator will be consistent. Still writing $Y_{t-1}$ as $X_t$, from (a) we can decompose $b_2$ as
$$b_2^{OLS} = \beta_2 + \frac{\sum (X_t - \bar{X})(u_t - \bar{u})}{\sum (X_t - \bar{X})^2} = \beta_2 + \frac{\frac{1}{n} \sum (X_t - \bar{X})(u_t - \bar{u})}{\frac{1}{n} \sum (X_t - \bar{X})^2}$$
Thus
$$\operatorname{plim} b_2^{OLS} = \beta_2 + \frac{\operatorname{cov}(X_t, u_t)}{\operatorname{var}(X_t)}$$
since $\operatorname{plim} \frac{1}{n} \sum (X_t - \bar{X})^2 = \operatorname{var}(X_t)$ by a law of large numbers and it can be shown that $\operatorname{plim} \frac{1}{n} \sum (X_t - \bar{X})(u_t - \bar{u}) = \operatorname{cov}(X_t, u_t)$. Now $X_t = Y_{t-1}$. Since $Y_{t-1}$ has already been determined by the time $u_t$ is generated, the covariance is zero and $\operatorname{plim} b_2^{OLS} = \beta_2$. The distribution will collapse to a spike at the true value as the number of observations in each sample becomes large. (d) [7 marks] The answer to (a) would be unaffected [2 marks]. Likewise we would reach the same conclusion for (b) [2 marks], except that $u_t$ is no longer independent of values of Y prior to $Y_{t+1}$ (candidates do not have to say this). However it is no longer true that $\operatorname{cov}(Y_{t-1}, u_t) = 0$ since both $Y_{t-1}$ and $u_t$ depend on $u_{t-1}$. As a consequence the OLS estimator is inconsistent [3 marks]. (e) [3 marks] If $\rho = 1$, $u_t = \sum_{s=1}^{t} \varepsilon_s$. Hence
$$Y_t = 10\,(1 + 0.9 + 0.9^2 + \dots) + \sum_{s=1}^{t} \varepsilon_s + 0.9 \sum_{s=1}^{t-1} \varepsilon_s + \dots$$
The expectation of $Y_t$ will be constant at 100 for sufficiently large t but the variance will grow with t. $Y_t$ is therefore nonstationary.
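Parts (b) to (d) of this question can be illustrated with a simulation. The sketch assumes the process $Y_t = 10 + 0.9 Y_{t-1} + u_t$, inferred from the constants 10, 0.9 and 100 in the answer to (e), with standard normal innovations and an autocorrelation coefficient of 0.5 in the second experiment. These choices and the function names are assumptions made for illustration.

```python
import random


def ols_slope(x, y):
    """Slope coefficient from a simple regression of y on x with an intercept."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


def simulate_series(n, rho, rng, intercept=10.0, slope=0.9):
    """Generate n + 1 values of Y with disturbance u_t = rho * u_{t-1} + innovation."""
    y_prev = intercept / (1.0 - slope)   # start at the long-run mean, 100
    u_prev = 0.0
    ys = [y_prev]
    for _ in range(n):
        u = rho * u_prev + rng.gauss(0.0, 1.0)
        y = intercept + slope * y_prev + u
        ys.append(y)
        y_prev, u_prev = y, u
    return ys


def lagged_slope(ys):
    """Regress Y_t on Y_{t-1}."""
    return ols_slope(ys[:-1], ys[1:])


rng = random.Random(2)

# (c) Large sample, independent disturbances: the estimate should be close to 0.9.
b_large_iid = lagged_slope(simulate_series(50000, 0.0, rng))

# (d) Large sample, autocorrelated disturbances: the estimate stays away from 0.9.
b_large_ar = lagged_slope(simulate_series(50000, 0.5, rng))

# (b) Many samples of 20 observations: the average estimate differs from 0.9.
small = [lagged_slope(simulate_series(20, 0.0, rng)) for _ in range(2000)]
mean_small = sum(small) / len(small)

print(round(b_large_iid, 3), round(b_large_ar, 3), round(mean_small, 3))
```

The small-sample average illustrates finite-sample bias even though the estimator is consistent. The autocorrelated case illustrates inconsistency, since the gap does not close as the sample grows.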
8. (a) [2 marks] It is assumed to be multiplicative, so that the relationship would more properly be written $Y_t = \beta_1 K_t^{\beta_2} L_t^{\beta_3} e^{\beta_4 T} v_t$ where v is the disturbance term [1 mark], and for the purpose of calculating standard errors and hence t statistics, lognormally distributed [0.5 mark]. It will then be additive and normally distributed in the regression model. It is also assumed to be distributed independently of K and L [0.5 mark]. (b) [2 marks] The elasticity of output with respect to labour input is 0.50 [0.5 mark], meaning that if labour input increases by 1 percent, output will increase by 0.5 percent [0.5 mark]. The proportional growth rate per year is 0.022, or 2.2 percent [1 mark; candidates may earn the mark with only the percentage interpretation]. (c) [5 marks] The Durbin-Watson statistic in (1) is very low. With three explanatory variables and 40 observations, $d_L$ is 1.15 at the 1 percent level, so we reject the null hypothesis of no autocorrelation at that level [1 mark]. Since d = 2.00, the h statistic for (2) is exactly zero, so there is no evidence of autocorrelation in that specification [1 mark; actually, it cannot be computed, because 39*squared standard error of the coefficient of $\log Y_{t-1}$ is going to be greater than one, but candidates are not expected to notice this]. The coefficient of $\log Y_{t-1}$ is significantly different from zero at the 1 percent level, so (1) is suffering from omitted variable bias [1 mark]. RSS is significantly lower for regression (2), the test statistic being
$$F(3, 32) = \frac{(0.035 - 0.020)/3}{0.020/32} = 8.00$$
and the critical value at the 1 percent significance level being about 4.5 [2 marks]. [Strictly speaking, we should run (1) a second time dropping the first observation, so that the sample periods for the two regressions are the same. Candidates are not expected to say this.] (d) [4 marks] (3) seems more satisfactory since the lagged capital and labour terms appear to be superfluous. They both have insignificant t statistics and the F statistic for their joint explanatory power is
$$F(2, 32) = \frac{(0.022 - 0.020)/2}{0.020/32} = 1.60$$
The critical value at the 5 percent level is about 3.3 [2 marks]. Dropping these variables alleviates multicollinearity and as a consequence the standard errors of the coefficients of current capital and labour are smaller (they are not shown, but this must be the case, given that the coefficients are about the same and the t statistics are higher) and their t statistics are higher [1 mark]. Since d = 1.99 in (3), the h statistic must be low and so this specification also appears to be free from autocorrelation [1 mark]. [Actually, again, one would not be able to compute the h statistic.] (e) [3 marks] The coefficients of K and L are estimates of the short-run elasticities [1 mark]. One may also obtain estimates of long-run elasticities by setting the logarithms of Y, K, and L at equilibrium values:
$$\log Y = 0.235 \log K + 0.340 \log L + 0.017 t + 0.750 \log Y$$
which gives
$$\log Y = 0.940 \log K + 1.360 \log L + 0.068 t$$
(ignoring the constant and not worrying about the fact that a static equilibrium is not actually possible). Thus one obtains estimates of 0.94 and 1.36 for the long run elasticities of K and L and a long-term annual growth rate of 6.8 percent [2 marks]. (f) [3 marks] Standard. (g) [4 marks] From the answer to (f), the coefficient of lagged capital in (4) is theoretically equal to minus the coefficient of current capital multiplied by the coefficient of lagged output. The same relationship links the theoretical coefficients of lagged and current labour. We would perform a common factor test, the null hypothesis being that both restrictions are valid [1 mark for explaining the restrictions with reference to the mathematics in (f)]. The test statistic would be $(40 - 1) \log(RSS_4/RSS_2) = 39 \log(0.022/0.020)$. The first factor is the number of observations in regressions (2) and (3), which is 39 because the data contain 40 observations but the first cannot be used, given the presence of lagged variables in the specification [2 marks]. Under the null hypothesis that the restrictions are valid, the test statistic would be distributed as a chi-squared statistic with two degrees of freedom (degrees of freedom equal to the number of restrictions) [1 mark]. (h) [2 marks] We would conclude that we would not reject the hypothesis that the AR(1) specification is an adequate representation of the data [1 mark]. If it is valid, the coefficients in specification (2) should conform to the nonlinear restrictions described in (f). Minus the coefficient of $\log K_t$ multiplied by the coefficient of $\log Y_{t-1}$ = −0.207*0.800 = −0.166, which is of the same order of magnitude as the coefficient of $\log K_{t-1}$, −0.123. Minus the coefficient of $\log L_t$ multiplied by the coefficient of $\log Y_{t-1}$ = −0.300*0.800 = −0.240, which seems high compared with the coefficient of $\log L_{t-1}$, −0.152. [1 mark]
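The numerical steps in parts (c), (d), (e) and (g) of this question can be reproduced with the short sketch below. All input figures are those quoted in the marking scheme. The variable names are illustrative, and the 5 percent critical value of chi-squared with two degrees of freedom (5.99) is a standard table value added here for comparison.

```python
import math


def f_statistic(rss_restricted, rss_unrestricted, n_restrictions, df_unrestricted):
    """F statistic comparing the fit of a restricted and an unrestricted regression."""
    improvement = (rss_restricted - rss_unrestricted) / n_restrictions
    return improvement / (rss_unrestricted / df_unrestricted)


# (c) Adding the three lagged variables to regression (1).
f_add_lags = f_statistic(0.035, 0.020, 3, 32)

# (d) Dropping lagged capital and lagged labour from regression (2).
f_drop_lags = f_statistic(0.022, 0.020, 2, 32)

# (e) Long-run values: divide each short-run coefficient by (1 - coefficient of lagged output).
adjustment = 1.0 - 0.750
long_run_capital = 0.235 / adjustment
long_run_labour = 0.340 / adjustment
long_run_growth = 0.017 / adjustment

# (g) Common factor test statistic, to be compared with chi-squared(2).
common_factor_stat = 39 * math.log(0.022 / 0.020)
reject_common_factor = common_factor_stat > 5.99

print(round(f_add_lags, 2), round(f_drop_lags, 2))
print(round(long_run_capital, 3), round(long_run_labour, 3), round(long_run_growth, 3))
print(round(common_factor_stat, 3), reject_common_factor)
```

The common factor statistic falls below the critical value, which is consistent with the conclusion in (h) that the AR(1) specification is not rejected.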
9. (a) [7 marks] [Note to examiners: EC220 students have fitted wage equations of this kind in their practical work and these are the names of the variables used; ASVABC is the composite score on the cognitive components of the Armed Services Vocational Aptitude Battery] Suppose that the true model is
$$LGEARN = \beta_1 + \beta_2 S + \beta_3 EXP + \beta_4 ASVABC + \beta_5 MALE + \beta_6 ETHBLACK + \beta_7 ETHHISP + \beta_8 X_8 + u$$
where $X_8$ is some further fixed characteristic of the respondent. ASVABC and $X_8$ are absent from regressions (1) and (2) and so those regressions will be subject to omitted variable bias. In particular, since ASVABC is likely to be positively correlated with S, and to have a positive coefficient, its omission will tend to bias the coefficient of S upwards [3 marks]. However, if the specification is valid for both 1994 and 2000 and unchanged, one can eliminate the omitted variable bias by taking first differences as in regression (3):
$$\Delta LGEARN = \beta_2 \Delta S + \beta_3 \Delta EXP + \Delta u$$
By fitting this specification one should obtain unbiased estimates of the coefficients of schooling and experience, and the former should therefore be smaller than in (1) and (2). Note that all the fixed characteristics have been washed out [3 marks]. The suggestion that ASVABC should have been included in (3) is therefore incorrect [1 mark]. Note that (3) should not have included an intercept. This is discussed in part (c). (b) [3 marks] The estimate of the coefficient of S differs from that in (2) because the omitted variable bias attributable to the omission of ASVABC in that specification has now been corrected [1.5 marks]. However it is still biased if $X_8$ (representing other omitted characteristics) is a determinant of earnings and is correlated with S. This partial rectification of the omitted variable problem accounts for the fact that the coefficient of S in (4) lies between those in (2) and (3) [1.5 marks]. (c) [4 marks] Given the specification in (1) and (2), there should have been no intercept in the first differences specification (3) [2 marks]. One would therefore expect the estimate of the intercept to be somewhere near zero in the sense of not being significantly different from it [1 mark]. Nevertheless, it is significantly different at the 5 percent level [1 mark]. However, suppose that the relationship shifted between 1994 and 2000, and that the shift could be represented by a dummy variable D equal to zero in 1994 and 1 in 2000, with coefficient $\delta$. Then (3) should have an intercept [1 extra mark]. Its estimate, 0.102, suggests that earnings grew by 10 percent from 1994 to 2000, holding other factors constant. This seems entirely reasonable, perhaps even a little low [1 extra mark]. Give 1 extra mark if, alternatively, the candidate suggests that the apparently significant t statistic may have arisen as a matter of Type I error.
(d) [3 marks] Give the 3 marks for something on the following lines. Divide S into two variables, schooling as of 1994 and extra schooling as of 2000, with separate coefficients. Then use a standard F test (or t test) of a restriction to test whether the coefficients are significantly different. (e) [3 marks] Give the 3 marks for something on the following lines. The issue is sample selection bias and an appropriate procedure would be that proposed by Heckman. One would use probit analysis with an appropriate set of determinants to model the decision to return to school between 1994 and 2000, and a regression model to explain variations in the logarithm of earnings of those respondents who do return to school, linking the two models by allowing their disturbance terms to be correlated. One would test whether the estimate of this correlation is significantly different from zero. (f) [5 marks] It is difficult to predict what candidates may write, so be receptive to good ideas and mark flexibly and generously. For this reason a detailed allocation of marks has not been suggested. The schooling coefficient is effectively zero! [These are real data, incidentally.] The scatter diagram shows why. Irrespective of whether the respondent had one, two, or three years of extra schooling, the gain is about the same, on average. (These are the only categories with large numbers of observations, given the information at the beginning of the question, confirmed by the scatter diagram.) So the results indicate that the fact of going back to school, rather than the duration of the schooling, is the relevant determinant of the change in earnings. The intercept indicates that this subsample on average increased their earnings between 1994 and 2000 by 38.9 percent. (This is good enough for exam purposes; the actual proportion would be better estimated as $e^{0.389} - 1 = 0.476$, but the candidates do not have calculators.)
This figure is confirmed by the diagram, and it would appear to be much greater than the effect of regular schooling. Candidates are not expected to explain the effect but 1 extra mark, or even 2, could be given for suggestions. One could be sample selection bias, as in (e). A more likely possibility is that the respondents were presented with opportunities to increase their earnings substantially if they undertook certain types of formal course, and they took advantage of these opportunities.
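The argument in part (a), that first differencing removes the bias caused by omitted fixed characteristics, can be illustrated with a simulation. This is a sketch under hypothetical assumptions: a single unobserved ability variable stands in for ASVABC, experience is left out, the extra schooling is drawn independently of ability, and all parameter values are invented. The last lines check the conversion of the intercept 0.389 into a proportional change, as quoted in part (f).

```python
import math
import random


def ols_slope(x, y):
    """Slope coefficient from a simple regression of y on x with an intercept."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


def simulate(n=20000, beta2=0.10, beta4=0.20, seed=3):
    rng = random.Random(seed)
    ability = [rng.gauss(0.0, 1.0) for _ in range(n)]               # fixed, unobserved
    s_1994 = [12.0 + 2.0 * a + rng.gauss(0.0, 1.0) for a in ability]
    extra = [rng.choice([0, 1, 2, 3]) for _ in range(n)]            # extra schooling by 2000
    s_2000 = [s + d for s, d in zip(s_1994, extra)]

    earn_1994 = [1.0 + beta2 * s + beta4 * a + rng.gauss(0.0, 0.3)
                 for s, a in zip(s_1994, ability)]
    earn_2000 = [1.0 + beta2 * s + beta4 * a + rng.gauss(0.0, 0.3)
                 for s, a in zip(s_2000, ability)]

    # Cross-section regression omitting ability: biased upwards.
    b_cross = ols_slope(s_1994, earn_1994)

    # First differences: ability cancels, so the slope is no longer biased.
    d_earn = [e1 - e0 for e0, e1 in zip(earn_1994, earn_2000)]
    b_fd = ols_slope(extra, d_earn)
    return b_cross, b_fd


b_cross, b_fd = simulate()
print(round(b_cross, 3), round(b_fd, 3))

# Part (f): proportional change implied by a change of 0.389 in log earnings.
proportional_change = math.exp(0.389) - 1.0
print(round(proportional_change, 3))
```

The helper `ols_slope` fits an intercept in the differenced regression as well. This does not affect the slope here because the simulated relationship does not shift between the two years.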
