Warning!
Goal of presentation is not to make you a sampling expert
Goal is also not to give you a headache.
Rather an overview: How do sampling features affect what it is
possible to learn from an impact evaluation?
Outline
1. Sampling frame
What populations or groups are we interested in?
How do we find them?
2. Sample size
Why it is so important: confidence in results
Determinants of appropriate sample size
Further issues
Examples
3. Budgets
Sampling frame
Who are we interested in?
a) All SMEs?
b) All formal SMEs?
c) All formal SMEs in a particular sector?
d) All formal SMEs in a particular sector in a particular region?
But should also keep in mind feasibility and what you want to
learn
Might not be possible or desirable to pilot a very broadly defined
program or policy
Sampling frame:
Finding the units we're interested in
Depends on size and type of experiment
Lottery among applicants
Example: BDS program among informal firms in a particular area
Can use treatment and comparison units from applicant pool
If surveying everyone is not feasible (e.g. 50,000 get the treatment), need to
draw a sample to measure impact
Policy change
Example: A change in business registration rules in randomly selected
districts
To measure impact on profits, cannot sample all informal businesses in
treatment and comparison districts.
Will need to draw a sample of firms within districts.
Required information before sampling
Complete listing of all units of observation available for sampling in each area or group
Tricky for units like informal firms, but there are techniques to overcome this
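Once a complete listing exists, drawing the sample within each district is mechanical. A minimal sketch, assuming the listing is a mapping from district to firm IDs (the district names, firm IDs, and the function name are hypothetical, not from the presentation):

```python
import random

def sample_within_districts(frame, n_per_district, seed=0):
    """frame maps district name -> complete list of firm IDs listed in that district."""
    rng = random.Random(seed)  # fixed seed so the draw can be reproduced and audited
    sample = {}
    for district, firms in sorted(frame.items()):
        k = min(n_per_district, len(firms))  # cannot draw more firms than are listed
        sample[district] = rng.sample(firms, k)  # simple random sample, no replacement
    return sample

# Hypothetical listing: one district with 500 listed firms, one with 30
frame = {
    "district_A": [f"A{i:03d}" for i in range(500)],
    "district_B": [f"B{i:03d}" for i in range(30)],
}
drawn = sample_within_districts(frame, n_per_district=50)
```

The draw is only as good as the listing: firms missing from the frame can never be sampled.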
Sample size and confidence
Types of errors
Type 1 error: You say there is a program impact when
there really isn't one.
Type 2 error: There really is a program impact but you
cannot detect it.
Sample size and confidence
Type 1 error: Find program impact when there's none
Error can be minimized after data collection, during statistical analysis
Need to adjust the significance levels of impact estimates (e.g. 99% or
95% confidence intervals)
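Both error types can be seen in a small simulation. This is an illustrative sketch only: the normally distributed outcomes, the group size of 100, the effect of 0.2 standard deviations, and the large-sample z-test are all assumptions chosen for the example, not details from the presentation.

```python
import random
from statistics import NormalDist

def mean_and_variance(xs):
    m = sum(xs) / len(xs)
    v = sum((x - m) ** 2 for x in xs) / (len(xs) - 1)  # sample variance
    return m, v

def rejects_null(treat, comp, alpha=0.05):
    """Two-sided z-test for a difference in means (large-sample approximation)."""
    m_t, v_t = mean_and_variance(treat)
    m_c, v_c = mean_and_variance(comp)
    se = (v_t / len(treat) + v_c / len(comp)) ** 0.5
    return abs((m_t - m_c) / se) > NormalDist().inv_cdf(1 - alpha / 2)

def rejection_rate(true_effect, n_per_group=100, sd=1.0, reps=2000, seed=1):
    """Share of simulated experiments in which we declare a program impact."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(reps):
        comp = [rng.gauss(0.0, sd) for _ in range(n_per_group)]
        treat = [rng.gauss(true_effect, sd) for _ in range(n_per_group)]
        hits += rejects_null(treat, comp)
    return hits / reps

type1_rate = rejection_rate(true_effect=0.0)      # impact declared, but none exists
type2_rate = 1 - rejection_rate(true_effect=0.2)  # real impact exists, but is missed
```

The Type 1 rate should land close to the chosen 5% level whatever the sample size. The Type 2 rate is the one that sample size controls: with these settings most real impacts of this size go undetected.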
Variance of outcomes
Less underlying variance -> easier to detect
difference -> can have lower sample size
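The link between variance and sample size can be made explicit with the standard approximation for comparing two group means. This is the textbook formula, not necessarily the exact calculation behind these slides. The symbols are introduced here for illustration: sigma^2 is the variance of the outcome, delta the smallest difference to be detected, alpha the significance level (Type 1 error rate), and 1 - beta the power (one minus the Type 2 error rate).

```latex
n_{\text{per group}} \;\approx\;
\frac{2\,\bigl(z_{1-\alpha/2} + z_{1-\beta}\bigr)^{2}\,\sigma^{2}}{\delta^{2}}
```

Because n is proportional to sigma^2, halving the standard deviation of the outcome cuts the required sample to a quarter, while halving delta quadruples it.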
Calculating sample size
Variance of outcomes
How do we know this before we decide our
sample size and collect our data?
Ideal pre-existing data often non-existent
Can use pre-existing data from a similar
population
Example: Enterprise Surveys, labor force surveys
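In practice this means computing the mean and standard deviation of the outcome in the earlier data and carrying them into the sample size calculation. A minimal sketch, using made-up profit figures (the values below are hypothetical and do not come from any real survey):

```python
from statistics import mean, stdev

def coefficient_of_variation(values):
    """Standard deviation divided by the mean: a unit-free measure of noise."""
    return stdev(values) / mean(values)

# Hypothetical monthly profits from an earlier survey of similar firms
earlier_profits = [10, 20, 30, 40, 150]

baseline_mean = mean(earlier_profits)
baseline_sd = stdev(earlier_profits)
cv = coefficient_of_variation(earlier_profits)
```

A single very profitable firm is enough to push the coefficient of variation above 1 here, which is why profit data usually call for large samples.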
Group-disaggregated results
Are effects different for men and women? For different
sectors?
If genders/sectors are expected to react in a similar way, the
difference in treatment impact between them is small, so
estimating it requires very large samples
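A rough calculation shows why. The sketch below assumes two equal-sized subgroups, equal treatment and comparison arms, the same outcome variance everywhere, 5% significance and 80% power. All of these are simplifying assumptions for illustration, and the function names are hypothetical.

```python
import math
from statistics import NormalDist

def _z_sum_squared(alpha, power):
    z = NormalDist().inv_cdf(1 - alpha / 2) + NormalDist().inv_cdf(power)
    return z ** 2

def total_n_main_effect(sd, effect, alpha=0.05, power=0.80):
    """Total sample (both arms) to detect an average treatment effect."""
    # Two cells of N/2 each, so Var(estimate) = 4 * sd^2 / N
    return math.ceil(4 * _z_sum_squared(alpha, power) * sd ** 2 / effect ** 2)

def total_n_subgroup_difference(sd, difference, alpha=0.05, power=0.80):
    """Total sample to detect a difference in effects between two subgroups."""
    # Four cells of N/4 each, so Var(estimate) = 16 * sd^2 / N
    return math.ceil(16 * _z_sum_squared(alpha, power) * sd ** 2 / difference ** 2)

n_main = total_n_main_effect(sd=50, effect=10)
n_diff_same_size = total_n_subgroup_difference(sd=50, difference=10)
n_diff_half_size = total_n_subgroup_difference(sd=50, difference=5)
```

Detecting a gap between subgroups that is as large as the main effect already takes about four times the sample. If the gap is half that size, the requirement is about sixteen times the sample needed for the main effect.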
Who is taller?
Detecting smaller differences is harder
Further issues
Group-disaggregated results
To ensure balance across treatment and comparison
groups, good to divide sample into strata before
assigning treatment
Strata
Sub-populations
Common strata: geography, gender, sector, initial
values of outcome variable
Treatment assignment (or sampling) occurs within
these groups
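Assignment within strata can be sketched as follows. The firm records, the choice of region and gender as strata, and the function name are hypothetical. The sketch assumes half of each stratum goes to treatment.

```python
import random
from collections import defaultdict

def stratified_assignment(units, stratum_of, seed=0):
    """Randomly assign half of each stratum to treatment ("T"), the rest to comparison ("C")."""
    rng = random.Random(seed)
    by_stratum = defaultdict(list)
    for unit in units:
        by_stratum[stratum_of(unit)].append(unit)

    assignment = {}
    for stratum in sorted(by_stratum):
        members = by_stratum[stratum][:]
        rng.shuffle(members)            # random order within the stratum
        n_treat = len(members) // 2     # an odd-sized stratum gets one extra "C"
        for i, unit in enumerate(members):
            assignment[unit] = "T" if i < n_treat else "C"
    return assignment

# Hypothetical firms: (id, region, owner gender), six firms per region-gender cell
firms = []
for region in ("north", "south"):
    for gender in ("female", "male"):
        for _ in range(6):
            firms.append((len(firms), region, gender))

assignment = stratified_assignment(firms, stratum_of=lambda f: (f[1], f[2]))
```

Every region-gender cell ends up with the same number of treatment and comparison firms, which a single unstratified lottery cannot guarantee.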
Why do we need strata?
Geography example
[Figure not reproduced. Legend: T = treatment, C = comparison]
Take-up
Low take-up increases the minimum detectable effect size
Can only find an effect if it is really large
Effectively decreases sample size
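The size of the take-up penalty can be approximated with a standard rule of thumb. The sketch assumes that units offered the program but not taking it up experience no effect, and that nobody in the comparison group gets the program. The function names are hypothetical.

```python
def diluted_effect(effect_on_participants, take_up):
    """Average effect across everyone offered the program."""
    return effect_on_participants * take_up

def sample_inflation(take_up):
    """Factor by which the sample must grow to keep the same power."""
    # Required sample is proportional to 1 / effect^2, and the effect shrinks by take_up
    return 1 / take_up ** 2

effect_seen = diluted_effect(effect_on_participants=10, take_up=0.5)
inflation = sample_inflation(take_up=0.5)
```

With 50% take-up, an impact of 10 on participants shows up as an average difference of 5 between groups, and the sample has to be four times larger to detect it.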
Data quality
Poor data quality effectively increases required
sample size
Missing observations
Increased noise
Can be partly addressed with field coordinator on
the ground monitoring data collection
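Both problems can be translated into extra sample with simple adjustments. The sketch assumes observations go missing at random and that measurement noise is independent of the true outcome. Both are simplifying assumptions, and the function names and figures are hypothetical.

```python
import math

def inflate_for_missing(n_required, share_missing):
    """Units to survey so that n_required usable observations remain."""
    return math.ceil(n_required / (1 - share_missing))

def inflate_for_noise(n_required, true_sd, noise_sd):
    """Units needed once measurement noise is added to the true outcome variance."""
    # Independent noise adds to the variance, and sample size is proportional to variance
    return math.ceil(n_required * (true_sd ** 2 + noise_sd ** 2) / true_sd ** 2)

n_with_missing = inflate_for_missing(n_required=295, share_missing=0.10)
n_with_noise = inflate_for_noise(n_required=295, true_sd=50, noise_sd=25)
```

If missing observations are related to treatment status, a larger sample does not fix the resulting bias, which is one more reason to monitor data collection closely.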
Example from Ghana
Calculations can be made in many statistical packages, e.g. Stata, Optimal Design (OD)
Experiment in Ghana designed to increase the profits of microenterprise firms
Baseline profits
50 cedi per month.
Profits data typically noisy, so a coefficient of variation >1 common.
Results
10% increase (from 50 to 55): 1,178 firms in each group
20% increase (from 50 to 60): 295 firms in each group.
50% increase (from 50 to 75): 48 firms in each group (But this effect size not realistic)
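The slide does not state the settings behind these figures. The sketch below is a guess: it assumes 5% two-sided significance, 80% power, a standard deviation of 50 cedi (coefficient of variation of 1), and an analysis that controls for baseline profits with a 0.5 correlation between baseline and follow-up profits. Under those assumptions the standard formula gives the same numbers, but the original calculation may have been set up differently.

```python
import math
from statistics import NormalDist

def firms_per_group(baseline_mean, effect, cv=1.0, alpha=0.05, power=0.80, rho=0.0):
    """Approximate firms per group needed to detect `effect` on mean profits.

    rho is the assumed correlation between baseline and follow-up profits;
    controlling for the baseline scales the residual variance by (1 - rho^2).
    """
    sd = cv * baseline_mean                          # CV = sd / mean
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)    # two-sided significance
    z_power = NormalDist().inv_cdf(power)
    n = 2 * (z_alpha + z_power) ** 2 * sd ** 2 / effect ** 2
    return math.ceil(n * (1 - rho ** 2))

# Assumed settings (not from the slide): alpha = 0.05, power = 0.80, rho = 0.5
n_10_percent = firms_per_group(50, effect=5, rho=0.5)
n_20_percent = firms_per_group(50, effect=10, rho=0.5)
n_50_percent = firms_per_group(50, effect=25, rho=0.5)
```

With these settings the function returns 1,178, 295 and 48 firms per group. Doubling the effect size cuts the requirement to roughly a quarter.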
Budgets
What is required?
Data collection
Survey firm
Data entry
Data analysis
Budgets
How much will all of this cost?
Huge range. Often depends on
Length of survey
Ease of finding respondents
Spatial dispersion of respondents
Security issues
Formal vs informal firms
Required human capital of enumerator
Et cetera.
Firm-level survey data: $40-$350/firm
Household survey data: $40+/household
Field coordinator: $10,000-$40,000/year
Depends on whether you can find a local hire
Administrative data: Usually free
Sometimes has limited outcomes, can miss most of the informal sector
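These unit costs can be combined into a rough budget range. In the sketch below the per-firm and coordinator costs are the ranges quoted above. The number of firms, the two survey rounds, the single coordinator year, and the reading of the per-firm cost as a cost per survey round are assumptions made for illustration.

```python
def survey_budget_range(n_units, rounds, unit_cost_low, unit_cost_high,
                        coordinator_years=0, coordinator_low=10_000,
                        coordinator_high=40_000):
    """Return (low, high) total cost in dollars for data collection plus coordination."""
    low = n_units * rounds * unit_cost_low + coordinator_years * coordinator_low
    high = n_units * rounds * unit_cost_high + coordinator_years * coordinator_high
    return low, high

# Hypothetical design: 295 firms in each of two groups, baseline plus one follow-up
low, high = survey_budget_range(
    n_units=2 * 295,
    rounds=2,
    unit_cost_low=40,
    unit_cost_high=350,
    coordinator_years=1,
)
```

For this hypothetical design the range runs from $57,200 to $453,000, so pinning down the per-firm cost early matters as much as the sample size itself.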
Summing up
The sample size of your impact evaluation will
determine how much you can learn from your
experiment
Some judgment and guesswork in calculations but
important to spend time on them
If sample size is too low: waste of time and money
because you will not be able to detect a non-zero impact
with any confidence
If little effort put into sample design and data collection:
See above.
Questions?