ECON232

Problem set 2: Randomized Evaluations


This problem set consists of two parts. Part 1 is additional Stata practice. Part 2 asks you to analyze data
from a randomized controlled trial; it is a slightly modified version of a problem set by Garrett S.
Christensen in his course “Global Poverty & Impact Evaluation: Learning what works for the world’s
poor”.
Please submit a log file, a do file, and a write-up of your answers to the exercises.

Part 1: More Stata Practice


Background: The 2000 Welfare Monitoring Survey (WMS2000) is a survey conducted by the Ethiopian
Central Statistical Authority. The WMS2000 is a survey of approximately 25,900 households (8,600
urban and 17,300 rural households) and the 125,000 individuals residing in those households. The full
data are representative (if weighted) of about 11.5 million Ethiopian households. For this problem set we
will use only a subset of the data.
You have two data files for the WMS2000:
- ur20_1st_restr.dta - The dataset contains basic population characteristics, health, education and
anthropometry information of the household. The dataset is organized at the individual level.
- ur20_2nd_restr.dta - The dataset contains information about housing amenities, access to
facilities, household assets and land property information, household major expenditure and
income.

Households in the data sets are uniquely identified by the combination of the following variables:
killil zone wereda town keftegna kebele ea hhld_id
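Because a household is identified only by the combination of these eight variables, any link between the individual-level file and the household-level file has to match on all eight at once. The sketch below illustrates that logic in Python on made-up rows. The problem set itself is to be completed in Stata, and the non-identifier fields (`age`, `owns_land`) and all values shown are hypothetical, not taken from the WMS2000 files.

```python
# The eight identifier variables named in the problem set.
HH_KEY = ("killil", "zone", "wereda", "town", "keftegna", "kebele", "ea", "hhld_id")


def hh_id(record):
    """Return the composite household identifier as a tuple."""
    return tuple(record[k] for k in HH_KEY)


def merge_individuals_to_households(individuals, households):
    """Attach each individual row to its household row (many-to-one merge)."""
    by_key = {}
    for hh in households:
        key = hh_id(hh)
        if key in by_key:
            # The key is supposed to be unique in the household-level file.
            raise ValueError(f"duplicate household key: {key}")
        by_key[key] = hh
    merged = []
    for person in individuals:
        hh = by_key.get(hh_id(person))
        if hh is not None:  # keep only individuals with a matching household
            merged.append({**hh, **person})
    return merged


# Hypothetical toy rows: two households, three individuals.
households = [
    {**dict(zip(HH_KEY, (1, 1, 1, 0, 0, 1, 1, 1))), "owns_land": 1},
    {**dict(zip(HH_KEY, (1, 1, 1, 0, 0, 1, 1, 2))), "owns_land": 0},
]
individuals = [
    {**dict(zip(HH_KEY, (1, 1, 1, 0, 0, 1, 1, 1))), "age": 34},
    {**dict(zip(HH_KEY, (1, 1, 1, 0, 0, 1, 1, 1))), "age": 5},
    {**dict(zip(HH_KEY, (1, 1, 1, 0, 0, 1, 1, 2))), "age": 61},
]

merged = merge_individuals_to_households(individuals, households)
print(len(merged))  # one merged row per matched individual
```

Checking that the key is unique in the household-level file before merging guards against silently duplicating individuals.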

Region is the variable killil.

Using these two data files, compute the statistics marked “xx” in Tables 1 and 2 below.

Table 1: Rural Household Wage Income and Rental Income in the Last 6 Months

                                          All          Female-Headed   Households whose Head is
                                          Households   Households      less than 40 years old
                                          (Et Birr)    (Et Birr)       (Et Birr)
Mean income from wages and rent           xx           xx              xx
Median income from wages and rent         xx           xx              xx
Mean income from wages and rent, among
households with any such income           xx           xx              xx

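The three rows of Table 1 differ only in which households enter the calculation: the first two use every household in the column (zeros included), while the third keeps only households with positive income. A minimal Python sketch of that distinction follows, on invented numbers. The field names `income`, `female_head` and `head_age` are hypothetical stand-ins for whatever the actual variables are called, the figures are unweighted, and the exercise itself is to be done in Stata.

```python
from statistics import mean, median


def table1_stats(incomes):
    """Mean, median, and mean among households with any income (> 0)."""
    earners = [v for v in incomes if v > 0]
    return {
        "mean": mean(incomes),
        "median": median(incomes),
        # Undefined if no household in the column has any such income.
        "mean_if_any": mean(earners) if earners else None,
    }


# Hypothetical rural households (six-month wage plus rental income, Et Birr).
households = [
    {"income": 0, "female_head": 1, "head_age": 35},
    {"income": 0, "female_head": 0, "head_age": 52},
    {"income": 120, "female_head": 1, "head_age": 44},
    {"income": 0, "female_head": 0, "head_age": 29},
    {"income": 300, "female_head": 0, "head_age": 38},
    {"income": 60, "female_head": 1, "head_age": 61},
]

# One list of incomes per table column.
columns = {
    "all": [h["income"] for h in households],
    "female_headed": [h["income"] for h in households if h["female_head"] == 1],
    "head_under_40": [h["income"] for h in households if h["head_age"] < 40],
}

results = {name: table1_stats(vals) for name, vals in columns.items()}
for name, stats in results.items():
    print(name, stats)
```

Note how a column can have a median of zero while its mean among earners is large; that gap is what the third row of the table is meant to expose.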
Table 2: Rural Households with any Income from Non-farm Enterprises in the Last 6 Months (percent)

Region                          All          Female-Headed   Households whose Head is
                                Households   Households      less than 40 years old
Tigray                          xx           xx              xx
Afar                            xx           xx              xx
Amhara                          xx           xx              xx
Oromia                          xx           xx              xx
Somali                          xx           xx              xx
Benshangu                       xx           xx              xx
SNNPR                           xx           xx              xx
Gambela                         xx           xx              xx
Harari                          xx           xx              xx
Addis Aba                       xx           xx              xx
Dire Dawa                       xx           xx              xx
Total                           xx           xx              xx
Total number of observations    xx           xx              xx

Part 2: Analyzing Data from Randomized Controlled Trials
The goal of this part of the problem set is to analyze data from a randomized impact evaluation. We will be
using data from a paper by Thornton (AER 2008) titled “The Demand for, and Impact of, Learning HIV
Status”. The questions in this problem set will lead you through a series of exercises that are standard
practice when analyzing data from a randomized evaluation.
Background: Thornton (2008) examines whether varying the cost of HIV testing can increase the number
of people who get their test results. Two interventions vary that cost: 1) cash payments to individuals who
receive their HIV test results and 2) the distance a person needs to travel to obtain their HIV test results.
Both interventions are randomly assigned at the individual level.
Key Variables:
any = 1 if randomly assigned cash incentive to obtain HIV test result
under = 1 if randomly assigned distance to get test result is under 1.5 km
Ti = the randomly assigned cash incentives (amounts) to obtain HIV test results
Going forward, those who were assigned a cash incentive or have to travel less than 1.5 km to get their
test results are known as the “treatment” group and those who are not receiving a cash incentive or have
to travel further than 1.5 km are known as the “control” group.
Q1) Present summary statistics for the study sample for age, gender, education, marital status and HIV
rates. (I.e., what is the average age? What percentage of the sample is male? What is the average
number of years of education? What proportion of people are married? What percentage of people are
infected with HIV?)
Q2) Present summary statistics separately for those in the control and treatment groups. Are there major
differences in any of the variables (i.e., age, gender, education, marital status, HIV rates)? Test
whether the difference in each variable between the treatment and control groups is statistically
significant. Do you see any differences (using a p-value of 0.05 as the threshold for statistical significance)?
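A balance check of this kind compares the treatment and control means of a baseline variable and asks whether the gap is large relative to its standard error. The Python sketch below shows the arithmetic for one variable on invented rows. It assumes the treatment group is everyone with `any == 1` or `under == 1` and the control group is everyone else, uses an unequal-variance standard error, and approximates the p-value with the normal distribution, which is only reasonable in large samples. The `age` values are hypothetical, and the exercise itself is to be done in Stata.

```python
from math import sqrt
from statistics import NormalDist, mean, variance


def balance_test(treated, control):
    """Difference in means, its standard error, z-statistic and two-sided p-value."""
    diff = mean(treated) - mean(control)
    # Unequal-variance (Welch-type) standard error of the difference.
    se = sqrt(variance(treated) / len(treated) + variance(control) / len(control))
    z = diff / se
    # Large-sample normal approximation; a t distribution is more
    # appropriate when the groups are small.
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    return diff, se, z, p


# Hypothetical rows; the values are invented for illustration only.
rows = [
    {"any": 1, "under": 0, "age": 30},
    {"any": 0, "under": 1, "age": 35},
    {"any": 1, "under": 1, "age": 40},
    {"any": 1, "under": 0, "age": 45},
    {"any": 0, "under": 0, "age": 31},
    {"any": 0, "under": 0, "age": 36},
    {"any": 0, "under": 0, "age": 41},
    {"any": 0, "under": 0, "age": 46},
]

# Assumed grouping: treated if any == 1 or under == 1, control otherwise.
treated = [r["age"] for r in rows if r["any"] == 1 or r["under"] == 1]
control = [r["age"] for r in rows if not (r["any"] == 1 or r["under"] == 1)]

diff, se, z, p = balance_test(treated, control)
print(diff, se, z, p)
print("significant at 5%" if p < 0.05 else "not significant at 5%")
```

The same function can be applied in turn to each baseline variable (gender, education, marital status, HIV status).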
Q3) OLS regression is one of the most common tools used to estimate the effect of a
treatment or randomly assigned intervention. Run an OLS regression testing whether getting your HIV
test result is affected by receiving a cash incentive. Let the treatment effect be denoted by 𝛽.
a) What is your estimate of 𝛽?
b) Is 𝛽 statistically significantly different from zero (at the 5% level)? Use the standard error and
t-statistic to tell.
c) What happens when you include additional control variables (age, male, education, marriage)? Does
your estimate of the treatment effect change? Is this what you would have expected and why?
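With a single binary regressor, the OLS slope is exactly the difference between the treatment and control means of the outcome, which is why a regression on a randomized treatment indicator estimates the treatment effect. The Python sketch below works through that arithmetic on invented data. The outcome name `got` is a hypothetical label for "obtained the test result", the standard error is the conventional homoskedastic one, and 1.96 is the large-sample 5% critical value. The exercise itself is to be done in Stata.

```python
from math import sqrt
from statistics import mean


def ols_simple(x, y):
    """Bivariate OLS of y on x: intercept, slope, slope SE and t-statistic."""
    n = len(x)
    xbar, ybar = mean(x), mean(y)
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    beta = sxy / sxx
    alpha = ybar - beta * xbar
    # Sum of squared residuals, then the conventional (homoskedastic) SE.
    ssr = sum((yi - alpha - beta * xi) ** 2 for xi, yi in zip(x, y))
    se = sqrt(ssr / (n - 2) / sxx)
    return alpha, beta, se, beta / se


# Hypothetical data: any = 1 if offered a cash incentive,
# got = 1 if the person obtained the test result.
any_incentive = [1, 1, 1, 1, 0, 0, 0, 0]
got = [1, 1, 1, 0, 0, 1, 0, 0]

alpha, beta, se, t = ols_simple(any_incentive, got)

# The slope equals the treated-minus-control difference in means.
diff_in_means = (
    mean(g for a, g in zip(any_incentive, got) if a == 1)
    - mean(g for a, g in zip(any_incentive, got) if a == 0)
)
print(alpha, beta, se, t, diff_in_means)
print("reject beta = 0 at 5%" if abs(t) > 1.96 else "cannot reject beta = 0 at 5%")
```

The intercept is the control-group mean, so `alpha + beta` is the treatment-group mean. The same function also accepts a continuous regressor such as a cash amount, in which case the slope is the change in the outcome per unit of that amount.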
Q4) Now run a similar regression, but this time replace “Any Cash Incentive” with “Cash Amount”.
a) What is your estimate of 𝛽?
b) Is 𝛽 statistically significant? Use the p-value to tell.
c) What happens when you include additional control variables (age, male, education, marriage)? Is this
what you would have expected and why?
Q5) Now interpret your findings.
a) Based on your estimates from Q3, what can you say about the effect of offering cash incentives on
people learning their HIV status? Would you say that this is a big or small effect?
b) Now look at your estimates from Q4. What is the effect of doubling the cash incentive from 100
Kwacha to 200 Kwacha? Put this number in a sentence.
Q6) We might be interested in whether the treatment has different effects for sub-populations. For
example, does giving cash incentives have a different effect for men and women? How would you specify
your regression model if you wanted to test this? Run your newly specified model and interpret your
results.
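One common way to test for a differential effect is to add the subgroup indicator and its interaction with the treatment, for example y = a + b1*any + b2*male + b3*(any*male) + e, where b3 measures how much the effect of the incentive differs for men relative to women. When both regressors are binary and no other controls are included, this model is saturated, so its OLS coefficients can be read directly off the four cell means. The Python sketch below illustrates that equivalence on invented data. It is one possible specification under these assumptions, not necessarily the intended answer, and the outcome name `got` is hypothetical.

```python
from statistics import mean


def saturated_interaction(y, treat, male):
    """Coefficients of y = a + b1*treat + b2*male + b3*treat*male from cell means."""
    cell = {}
    for t in (0, 1):
        for m in (0, 1):
            cell[(t, m)] = mean(
                [yi for yi, ti, mi in zip(y, treat, male) if ti == t and mi == m]
            )
    a = cell[(0, 0)]                    # control women
    b_treat = cell[(1, 0)] - cell[(0, 0)]  # effect of treatment for women
    b_male = cell[(0, 1)] - cell[(0, 0)]   # male-female gap in the control group
    # Interaction: treatment effect for men minus treatment effect for women.
    b_inter = (cell[(1, 1)] - cell[(0, 1)]) - (cell[(1, 0)] - cell[(0, 0)])
    return a, b_treat, b_male, b_inter


# Hypothetical data, two observations per cell.
any_incentive = [0, 0, 0, 0, 1, 1, 1, 1]
male = [0, 0, 1, 1, 0, 0, 1, 1]
got = [0, 1, 0, 0, 1, 1, 1, 1]

a, b_treat, b_male, b_inter = saturated_interaction(got, any_incentive, male)
print(a, b_treat, b_male, b_inter)
# Effect of the incentive for men is b_treat + b_inter.
print(b_treat + b_inter)
```

In a regression package the same coefficients come from regressing the outcome on the treatment, the male indicator and their product; the test of interest is whether the interaction coefficient differs from zero.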
