Households in the data sets are uniquely identified by the combination of the following variables:
killil zone wereda town keftegna kebele ea hhld_id
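As a minimal sketch of how the composite identifier can be used, the Python below builds one key per household from the eight variables listed above and performs a one-to-one merge of two files held as lists of dict rows. The file contents and the columns `head_female` and `wage_income` are hypothetical placeholders, not names taken from the actual data sets.

```python
# Identifier variables named in the problem set.
KEY_VARS = ["killil", "zone", "wereda", "town", "keftegna", "kebele", "ea", "hhld_id"]


def household_key(row):
    """Return the tuple of identifier values that pins down one household."""
    return tuple(row[v] for v in KEY_VARS)


def merge_on_household(left_rows, right_rows):
    """One-to-one inner merge of two lists of dict rows on the household key.

    Raises ValueError if a key is duplicated in either file; households that
    appear in only one file are dropped.
    """
    right_index = {}
    for row in right_rows:
        key = household_key(row)
        if key in right_index:
            raise ValueError(f"duplicate household key in right file: {key}")
        right_index[key] = row

    merged, seen = [], set()
    for row in left_rows:
        key = household_key(row)
        if key in seen:
            raise ValueError(f"duplicate household key in left file: {key}")
        seen.add(key)
        if key in right_index:
            merged.append({**row, **right_index[key]})
    return merged


# Toy rows; head_female and wage_income are hypothetical column names.
ids_a = dict(killil=1, zone=2, wereda=3, town=0, keftegna=0, kebele=5, ea=1, hhld_id=7)
ids_b = dict(ids_a, hhld_id=8)
roster = [dict(ids_a, head_female=1), dict(ids_b, head_female=0)]
income = [dict(ids_a, wage_income=120.0)]

merged = merge_on_household(roster, income)
print(len(merged), merged[0]["wage_income"])
```

Checking for duplicate keys before merging is a cheap way to confirm that the eight variables really do identify households uniquely in each file.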
Using these two data files, compute the statistics marked “xx” in Tables 1 and 2 below.
Table 1: Rural Household Wage Income and Rental Income in the Last 6 months
                                                   Female-Headed   Households whose Head is
                                  All Households   Households      less than 40 years old
                                  (Et Birr)        (Et Birr)       (Et Birr)
Mean income from wages and rent   xx               xx              xx
Afar                              xx               xx              xx
Amhara                            xx               xx              xx
Oromia                            xx               xx              xx
Somali                            xx               xx              xx
Benshangu                         xx               xx              xx
SNNPR                             xx               xx              xx
Gambela                           xx               xx              xx
Harari                            xx               xx              xx
Addis Aba                         xx               xx              xx
Dire Dawa                         xx               xx              xx
Total                             xx               xx              xx
Total number of observations      xx               xx              xx
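One way the cells of Table 1 could be filled is sketched below: an unweighted mean of wage plus rental income by region, computed for all households and then for each subgroup. The column names (`region`, `wage_income`, `rent_income`, `head_female`, `head_age`) and the toy values are assumptions for illustration only, and the sketch ignores any survey weights the real data may require.

```python
from collections import defaultdict


def mean_by_region(rows, keep=lambda r: True):
    """Mean of wage + rental income per region among rows passing `keep`.

    Returns {region: (mean, n_obs)}, including a "Total" entry over all
    kept rows. Unweighted; column names are hypothetical.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for r in rows:
        if not keep(r):
            continue
        income = r["wage_income"] + r["rent_income"]
        for group in (r["region"], "Total"):
            sums[group] += income
            counts[group] += 1
    return {g: (sums[g] / counts[g], counts[g]) for g in sums}


# Toy households, not real survey values.
rows = [
    {"region": "Afar", "wage_income": 100.0, "rent_income": 0.0, "head_female": 1, "head_age": 35},
    {"region": "Afar", "wage_income": 50.0, "rent_income": 50.0, "head_female": 0, "head_age": 52},
    {"region": "Amhara", "wage_income": 0.0, "rent_income": 30.0, "head_female": 1, "head_age": 61},
]

all_hh = mean_by_region(rows)
female = mean_by_region(rows, keep=lambda r: r["head_female"] == 1)
young = mean_by_region(rows, keep=lambda r: r["head_age"] < 40)

for label, table in [("All", all_hh), ("Female-headed", female), ("Head under 40", young)]:
    print(label, table)
```

Each call produces one column of the table; the second element of each tuple gives the observation count for the last row.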
Part 2: Analyzing Data from Randomized Controlled Trials
The goal of this part of the problem set is to analyze data from a randomized impact evaluation. We will be
using data from a paper by Thornton (AER 2008) titled “The Demand for, and Impact of, Learning HIV
Status”. The questions in this problem set will lead you through a series of exercises that are standard
practice when analyzing data from a randomized evaluation.
Background: Thornton (2008) examines whether varying the cost of HIV testing can increase the number
of people who get their test results. There are two interventions: 1) cash payments to individuals who
receive their HIV test results and 2) the distance a person needs to travel to obtain their HIV test results.
Both interventions are randomly assigned on an individual level.
Key Variables:
any = 1 if randomly assigned cash incentive to obtain HIV test result
under = 1 if randomly assigned distance to get test result is under 1.5 km
Ti = the randomly assigned cash incentives (amounts) to obtain HIV test results
Going forward, those who were assigned a cash incentive or had to travel less than 1.5 km to get their
test results are known as the “treatment” group, and those who were not assigned a cash incentive or had
to travel farther than 1.5 km are known as the “control” group.
Q1) Present summary statistics for the study sample for age, gender, education, marital status and HIV
rates. (That is: What is the average age? What percentage of the study sample is male? What is the average
number of years of education? What proportion of people are married? What percentage of people are
infected with HIV?)
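A minimal sketch of such a summary table, assuming the data have been loaded as a list of dict rows, is given below. The variable names `age`, `male`, `educ`, `married` and `hiv` are hypothetical stand-ins for whatever the data set actually calls them, and missing values are assumed to be stored as `None`.

```python
from statistics import mean, stdev

# Hypothetical variable names; replace with those in the actual data set.
VARIABLES = ["age", "male", "educ", "married", "hiv"]


def summarize(rows, variables=VARIABLES):
    """Return {variable: {"n", "mean", "sd"}}, skipping missing (None) values."""
    table = {}
    for var in variables:
        values = [row[var] for row in rows if row.get(var) is not None]
        table[var] = {"n": len(values), "mean": mean(values), "sd": stdev(values)}
    return table


# Toy sample, not real study data.
sample = [
    {"age": 20.0, "male": 1, "educ": 4.0, "married": 0, "hiv": 0},
    {"age": 30.0, "male": 0, "educ": 6.0, "married": 1, "hiv": 0},
    {"age": 40.0, "male": 0, "educ": 2.0, "married": 1, "hiv": 1},
    {"age": 50.0, "male": 1, "educ": 8.0, "married": 1, "hiv": None},
]

summary = summarize(sample)
for var, s in summary.items():
    print(f"{var:8s} n={s['n']} mean={s['mean']:.3f} sd={s['sd']:.3f}")
```

For 0/1 variables such as `male`, `married` and `hiv`, the mean is the sample proportion, so multiplying by 100 gives the percentage asked for.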
Q2) Present summary statistics separately for those in the control and treatment groups. Are there major
differences in any of the variables (i.e. age, gender, education, marital status, HIV rates)? Test
whether the difference in each variable between the treatment and control groups is statistically
significant. Do you see any differences (using a p-value of 0.05 as the threshold for statistical significance)?
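The balance test could be sketched as a two-sample comparison of means. The version below uses the unequal-variance (Welch) standard error and, as a simplifying assumption, approximates the two-sided p-value with the standard normal distribution instead of the t distribution, which is reasonable only in large samples. The toy values are illustrative and do not come from the study.

```python
from math import erfc, sqrt
from statistics import mean, variance


def welch_test(treated, control):
    """Difference in means with an unequal-variance standard error.

    Returns (diff, se, t, p), where p is a two-sided p-value from the
    standard normal approximation (an assumption suited to large samples).
    """
    diff = mean(treated) - mean(control)
    se = sqrt(variance(treated) / len(treated) + variance(control) / len(control))
    t = diff / se
    p = erfc(abs(t) / sqrt(2))  # equals 2 * (1 - Phi(|t|))
    return diff, se, t, p


# Toy ages for the two groups.
age_treated = [30.0, 32.0, 34.0, 36.0]
age_control = [29.0, 31.0, 33.0, 35.0]

diff, se, t, p = welch_test(age_treated, age_control)
print(f"diff={diff:.3f} se={se:.3f} t={t:.3f} p={p:.3f}")
print("significant at 5%" if p < 0.05 else "not significant at 5%")
```

Running the same function once per baseline variable gives one row of a balance table per variable.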
Q3) OLS regression is one of the most common tools used to estimate the effect of a
treatment or randomly assigned intervention. Run an OLS regression testing whether getting your HIV
test result is affected by receiving a cash incentive. Let the treatment effect be denoted by 𝛽.
a) What is your estimate of 𝛽?
b) Is 𝛽 statistically significantly different from zero (at the 5% level)? Use the standard error and
t-statistic to tell.
c) What happens when you include additional control variables (age, male, education, marriage)? Does
your estimate of the treatment effect change? Is this what you would have expected and why?
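To make the mechanics concrete, here is a sketch of a one-regressor OLS fit with classical (homoskedastic) standard errors, written in plain Python. The outcome name `got` (1 if the person obtained the test result) and the toy data are assumptions for illustration; with a 0/1 regressor such as `any`, the slope equals the difference in mean outcomes between the two groups.

```python
from math import sqrt


def simple_ols(x, y):
    """Fit y = alpha + beta * x by OLS.

    Returns (alpha, beta, se_beta, t_beta) using the classical
    homoskedastic standard error with n - 2 degrees of freedom.
    """
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    beta = sxy / sxx
    alpha = ybar - beta * xbar
    rss = sum((yi - alpha - beta * xi) ** 2 for xi, yi in zip(x, y))
    se_beta = sqrt(rss / (n - 2) / sxx)
    return alpha, beta, se_beta, beta / se_beta


# Toy data: `any_` is the incentive indicator, `got` a hypothetical outcome.
any_ = [1, 1, 1, 1, 0, 0, 0, 0]
got = [1, 1, 1, 0, 1, 0, 0, 0]

alpha, beta, se_beta, t_beta = simple_ols(any_, got)
print(f"alpha={alpha:.3f} beta={beta:.3f} se={se_beta:.3f} t={t_beta:.3f}")
```

A rough rule for part b) is that |t| above about 1.96 indicates significance at the 5% level in large samples. The same function applies unchanged when `x` holds a cash amount instead of a 0/1 indicator.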
Q4) Now run a similar regression, but this time replace “Any Cash Incentive” with “Cash Amount”.
a) What is your estimate of 𝛽?
b) Is 𝛽 statistically significant? Use the p-value to tell.
c) What happens when you include additional control variables (age, male, education, marriage)? Is this
what you would have expected and why?
Q5) Now interpret your findings.
a) Based on your estimates from Q3, what can you say about the effect of offering cash incentives on
people learning their HIV status? Would you say that this is a big or small effect?
b) Now look at your estimates from Q4. What is the effect of doubling the cash incentive from 100
Kwacha to 200 Kwacha? Put this number in a sentence.
Q6) We might be interested in whether the treatment has different effects for sub-populations. For
example, does giving cash incentives have a different effect for men and women? How would you specify
your regression model if you wanted to test this? Run your newly specified model and interpret your
results.
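One common specification for this kind of question adds the subgroup indicator and its interaction with the treatment: outcome = b0 + b1*any + b2*male + b3*(any*male). The sketch below fits that model by solving the normal equations in plain Python. The outcome name `got` and the toy data are assumptions, and standard errors are omitted for brevity.

```python
def ols(X, y):
    """Solve the normal equations (X'X) b = X'y by Gauss-Jordan elimination."""
    k = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [
        [sum(row[i] * row[j] for row in X) for j in range(k)]
        + [sum(row[i] * yi for row, yi in zip(X, y))]
        for i in range(k)
    ]
    for col in range(k):
        # Partial pivoting: move the largest remaining entry onto the diagonal.
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        if abs(A[pivot][col]) < 1e-12:
            raise ValueError("regressors are collinear")
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]


def interaction_design(treat, group):
    """Rows of [1, treat, group, treat * group]."""
    return [[1.0, t, g, t * g] for t, g in zip(treat, group)]


# Toy data: incentive indicator, male indicator, hypothetical outcome `got`.
any_ = [0, 0, 0, 0, 1, 1, 1, 1, 0, 0, 1, 1, 1, 1]
male = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1]
got = [0, 0, 0, 1, 1, 1, 1, 0, 0, 1, 1, 1, 0, 1]

b0, b_any, b_male, b_inter = ols(interaction_design(any_, male), got)
print(f"const={b0:.3f} any={b_any:.3f} male={b_male:.3f} any*male={b_inter:.3f}")
```

In this specification `b_any` is the incentive effect for women, `b_any + b_inter` is the effect for men, and `b_inter` is the difference between the two, so testing whether the interaction coefficient is zero tests whether the effect differs by gender.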