1810–1823, 2012
© 2012 Elsevier Ltd. All rights reserved.
0305-750X/$ - see front matter
www.elsevier.com/locate/worlddev
http://dx.doi.org/10.1016/j.worlddev.2012.04.030
1. INTRODUCTION
Academic researchers, development practitioners, donors,
and others increasingly recognize the importance of understanding micro-level socioeconomic dimensions of the rural
economy. Our interest in livelihood portfolios, often characterized by income, consumption, expenditure, or time use patterns, is motivated by the desire to understand the lives of the
poor, how to best design development interventions, and to
provide information about how development policies and projects affect livelihood outcomes. Demand for high quality
information on rural livelihoods has resulted in the development of a diversity of methods for eliciting information from
local people (e.g., Basic Necessities Survey (Davies, 1997);
Stages of Progress (Krishna, 2005); Sustainable Livelihoods
Framework (Carney, 1998; Ellis & Freeman, 2004); Household Livelihood Security Assessments (CARE, 2002); the
Living Standards Measurement Study Surveys (LSMS) etc.).
Data collection methods vary greatly with respect to: the use
of qualitative and quantitative data collection methods;
whether questions are put to individuals or groups of people;
and the degree to which information is collected by an interviewer vs. generated through joint learning with representative
groups. Discussions regarding the strengths and weaknesses of
alternative approaches have accompanied the emergence of
these various methods (Campbell, 2002; Davis & Whittington,
1998; Fisher, Reimer, & Carr, 2010; Menton, Lawrence,
Merry, & Brown, 2010; Scoones, 1995; Schrekenberg et al.,
2010).
Figure 1. Study area (highlighted in box), Hoima and Kibaale Districts, western Uganda.
livelihoods, one would have to follow and document the production, consumption, expenditure, and/or time allocation
decisions of all household members for an extended period
of time. This approach is both intrusive to the respondent
and costly, in both money and time, for the person collecting
the data. Further, there is no consensus regarding what set of
variables provides the most accurate information about household welfare. Development economists often favor detailed
consumption and expenditure data (e.g., LSMS, the RePEAT
survey (Research on Poverty, Environment, and Agriculture
in Ethiopia, Kenya, and Uganda)) whereas other social scientists may consider subjective assessments of the relative importance of income sources to more accurately represent
household livelihood portfolios (Chambers, 1994). These
inquiries are informed by a diversity of social science research
methods drawing on a wide range of disciplines including economics, sociology, anthropology, and geography.
Virtually every researcher engaged in social science field
work has faced decisions regarding the appropriate level of
aggregation at which to collect data. But these decisions are made largely based on individual experiences and disciplinary training,
as there is little empirical literature on this topic. Our goal in
this paper is to discuss issues regarding the accuracy and precision of data associated with aggregated vs. disaggregated approaches to household livelihood surveys. We hope to inform
further research that addresses the biases inherent in various
approaches to collecting data on livelihood portfolios in developing country settings.
One means of differentiating between data collection approaches is to consider how specific vs. general the questions
are that researchers ask of respondents. In the extreme, the
WORLD DEVELOPMENT
aggregated approach, in the context of our study, is a participatory rural appraisal exercise that asks household members
to think in an aggregate way about categories of livelihood
activities. Both methods rely on participation from the diversity of household members in responding to questions, including household heads, spouses, children, and other individuals
who contribute to the economic welfare of the household. 1
We use two indicators in our analysis of livelihood portfolios: income and household expenditures. Our aim is to better
understand how to collect accurate and precise livelihoods
data, to consider the potential biases arising from the use of
various data collection methods, and to contribute to a discussion about methodological choices and their implications for
our ability to design development projects, formulate policies
to improve rural livelihoods, and assess the impact of development programs and policies.
We begin in the next section by identifying potential
strengths and weaknesses associated with disaggregated vs.
aggregated approaches to data collection. We then present
an experiment where we collect data on indicators of rural
livelihood portfolios for two sub-samples of the same population of households in western Uganda using different survey
instruments: a highly disaggregated income and expenditure
survey, and a participatory rural appraisal (PRA) survey
instrument that collects more aggregated information. We
compare the results of these two approaches to see whether
and why they are different. We conclude with a summary of
our assessment about the appropriateness of the methods for
various research and data collection efforts.
2. ACCURACY AND PRECISION OF
DISAGGREGATED AND AGGREGATED
APPROACHES TO COLLECTING DATA
(a) Data accuracy and precision
In this paper we focus our efforts on issues of data quality
associated with disaggregated vs. aggregated approaches to
collecting household level data that may be used to characterize livelihood portfolios. 2 We focus on two attributes that
contribute to the quality of data: accuracy and precision.
Accuracy implies that the data are representative of the truth,
although the data may be widely distributed around the true
value. For example, a grouping of arrows closer to the
bull's-eye of a target is more accurate than a similarly dispersed grouping that is farther away. 3 The concept of accuracy is parallel to the concept of bias in statistics. More
accurate estimates are less biased. Precision refers to the tightness of the distribution of the data collected, whether or not
the distribution is around the true value. For example, you
could have a tight grouping of arrows that is precise, but it
could be inaccurate if not grouped around the bull's-eye. The
concept of precision is parallel to the concept of variance in
statistics. More precise estimates have smaller variances.
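To make the distinction concrete, the following minimal sketch computes bias (the statistical counterpart of inaccuracy) and variance (the counterpart of imprecision) for two sets of estimates. The true value and both sets of estimates are invented purely for illustration; they are not data from this study.

```python
from statistics import mean, pvariance

# Hypothetical true value of the quantity being measured (an assumption).
TRUE_VALUE = 50.0

# Hypothetical repeated estimates: A is centered on the truth but dispersed;
# B is tightly grouped but centered away from the truth.
estimates_a = [48.0, 52.0, 45.0, 55.0, 50.0]
estimates_b = [59.0, 60.0, 61.0, 60.0, 60.0]


def bias(estimates, truth):
    """Mean estimate minus the true value (0 means accurate on average)."""
    return mean(estimates) - truth


def spread(estimates):
    """Population variance of the estimates (smaller means more precise)."""
    return pvariance(estimates)


for name, est in (("A", estimates_a), ("B", estimates_b)):
    print(f"{name}: bias = {bias(est, TRUE_VALUE):+.1f}, variance = {spread(est):.1f}")
```

In this illustration, set A corresponds to the dispersed grouping of arrows around the bull's-eye (accurate but imprecise), and set B to the tight grouping away from it (precise but inaccurate).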
The precision of data can be conceptualized at various levels. At one level, the precision of a single piece of information
collected from a household may be assessed by repeated measures to see if there is measurement error. This type of precision can be difficult to measure in household livelihood
surveys. If one were to visit the same household repeatedly,
to see if the same answers were forthcoming, the conditions
surrounding a household's livelihood, such as seasonality,
would likely change between visits, causing answers to change.
Alternatively, if one were to repeatedly ask the same questions
within a short time frame, respondents may get fatigued or
to day livelihood decisions of respondents are made in the context of bounded rationality (Jones, 2001) that recognizes the
limits in cognition discussed above. Therefore, information
which reflects the more general perceptions of respondents
may be more relevant to understanding values and behavior
than the disaggregated information compiled by researchers.
Moreover, these general perceptions may, in some ways, be
more holistic indications of respondents' values than disaggregated approaches. For example, in disaggregated approaches
that collect data on prices and quantities of sold and consumed forest products, there may be questions regarding the
relevance of prices in reflecting local values. Many forest products are harvested for home consumption and not marketed.
Responses about products that have market prices may not
internalize negative environmental effects of activities such as
forest clearing. The market-based information that the researcher is aggregating may be less reflective of social values
than the values that respondents are aggregating in their perceptions.
The advantages regarding the relevance and holistic qualities of respondents' perceptions suggest that an aggregated approach could increase the accuracy of the information
collected. If aggregated approaches are successful in capturing
these perceptions then, all other things being equal, the information may be thought of as being closer to the true values of
the respondent. But, similar to the situation described above,
it is not clear whether such an approach will lead to greater
precision of estimates of central tendencies. Perceptions
regarding the importance of livelihood activities can vary significantly within a population even if they are accurately collected.
A final advantage of aggregated information is the cost of
data collection. The repeated and intensive sampling associated with more disaggregated approaches generally involves
higher costs. Data collection costs frequently involve expenses
to hire, train, and monitor teams of enumerators and supervisors. Once the data have been collected, coding, entering, processing, and analysis of disaggregated data can be costly. In
addition to saving researchers time, aggregated methods also
save respondents time. Disaggregated household surveys can
take a very long time to administer, particularly during the
early rounds of data collection when respondents are asked
to recall very detailed information in a structured format.
Aggregated surveys generally demand far less respondent
time.
(d) Levels of aggregation and data quality
The above discussion points out that, conceptually, there
are potential data quality problems associated with both
researchers and respondents aggregating information. With
little guidance regarding how severe the problems are, or
which problems are likely to be worse, we turn to an empirical approach to attempt to shed light on these issues. With
potential biases influencing both aggregated and disaggregated approaches, we cannot interpret either approach as being
more accurate than the other. Therefore, in the empirical
analysis that follows, we are restricted to assessing whether
different levels of respondent and researcher aggregation
yield different results, and then conjecturing about why the
results may be different. Moreover, with no a priori expectations about the relative precision of central tendencies of
data collected with aggregated vs. disaggregated approaches,
we can only report results with the hope that they provide
further insights.
Table 1. Summary of survey questions, recall periods and respondents for disaggregated (PEN) and aggregated (UFRIC) questionnaires

Income from: unprocessed forest products; processed forest products; fishing/aquaculture; wild products; wage labor; businesses
  Recall period: 30 days (disaggregated); 1 year (aggregated)
  Respondents (disaggregated): At least 1 adult; preferably 1 adult male and 1 adult female and any other household members
  Respondents (aggregated): At least 1 adult; preferably 1 adult male and 1 adult female and any other household members

Income from: agriculture; livestock; livestock products; other
  Recall period: 3 months (disaggregated); 1 year (aggregated)
  Respondents (disaggregated): At least 1 adult; preferably 1 adult male and 1 adult female and any other household members
  Respondents (aggregated): At least 1 adult; preferably 1 adult male and 1 adult female and any other household members

Expenditures
  Recall period: 1 week (disaggregated); 1 year (aggregated)
  Respondents: At least 1 adult; preferably 1 adult male and 1 adult female and any other household members
ably are part of the household responses in the aggregated approach. Given the nature of cash expenditures in this
particular setting, we feel that expenditure data for one adult
female and one adult male are fairly representative of total
expenditures for a household (i.e., most children and elderly
are not making significant cash expenditures).
4. EMPIRICAL ANALYSIS
(a) How comparable are the disaggregated (PEN) and aggregated (UFRIC) samples?
Before making comparisons between our variables of interest
(i.e., income and expenditure portfolios) we investigate whether
the samples we drew for the disaggregated and aggregated surveys are comprised of comparable households (Table 2). We
conducted t-tests (i.e., two-group mean comparison tests) to
compare means of common demographic and socioeconomic
characteristics of the two groups. 11 The test assumes the variances for the two populations are the same. If the two-tailed p value
is <0.05 we conclude that the difference of means between the two samples is different from 0. The data demonstrate that there
are few differences between the samples. The variable minutes to
nearest forest is the only variable where we observe mean values
that are significantly different between the two samples at the
5% level. Though the lack of significant differences in means
could also be due to large variances, we note, subjectively, that
neither the variances nor the differences in the absolute values of
the means appear to be large. From these data we conclude that
we have administered the two separate questionnaires to relatively comparable sub-samples of households, and that if different methods yield different characterizations of livelihood
portfolios, these differences are likely to be due to differences
in survey methods.
(b) Household income portfolios
Estimates of the proportion of annual net household income
from various activities are presented in Table 3. Households
were asked to consider sources of subsistence and cash income, less cash payments for variable inputs and hired labor
(i.e., net income), for both the disaggregated and aggregated
questionnaires. The results indicate the disaggregated and
aggregated methods yield quantitatively dierent pictures of
the relative importance of various sources of income. The
mean values or share of total net income from various
Table 2. Descriptive statistics for disaggregated (PEN) and aggregated (UFRIC) samplesa
Variable
Forest owned by household, hectares
Arable land owned by household, hectares
Female headed household (0/1)
Household size, number
Dependency ratiob
Education of household head, years
Household head is migrant (0/1)
Household owns bike (0/1)
Household owns mobile phone (0/1)
Distance to nearest forest, minutes
a
(1.45)
(3.14)
(0.32)
(2.6)
(115.9)
(3.2)
(0.46)
(0.54)
(0.26)
(7.8)
Table 3. Measures of income, shares and ranks for disaggregated (PEN) and aggregated (UFRIC) samples

Net total income
Source of income              Disaggregated (N = 170)       Aggregated (N = 88)
                              % share of total    Rank      % share of total    Rank
Agriculture                   47.0* (17.2)        1         34.7 (17.6)         1
Business                      12.7* (17.2)        2         6.1 (11.0)          6
Unprocessed forest products   10.9* (6.7)         3         14.4 (6.6)          2
Wild products                 10.2* (5.9)         4         4.5 (4.7)           8
Livestock                     5.7* (7.4)          5         9.5 (8.3)           3
Wages                         5.1* (9.9)          6         9.5 (13.0)          3
Other                         3.4 (4.5)           7         4.1 (8.6)           9
Livestock products            2.4 (4.8)           8         2.7 (4.4)           10
Remittances                   1.3* (5.6)          9         6.7 (13.8)          5
Processed forest products     1.2* (3.0)          10        4.6 (7.1)           7

Cash income
Source of income              Disaggregated (N = 170)       Aggregated (N = 88)
                              % share of total    Rank      % share of total    Rank
Agriculture                   27.5* (23.6)        1         39.5 (21.3)         1
Business                      27.5* (28.7)        1         7.2 (13.8)          5
Unprocessed forest products   1.1* (5.0)          8         4.8 (5.6)           9
Wild products                 0.3* (1.3)          10        4.9 (5.3)           8
Livestock                     14.6* (17.8)        3         10.6 (10.0)         3
Wages                         12.5 (20.4)         4         10.9 (15.2)         2
Other                         10.8* (14.9)        5         5.2 (11.6)          7
Livestock products            1.1* (4.1)          8         3.0 (5.2)           10
Remittances                   3.0* (11.3)         6         8.4 (16.6)          4
Processed forest products     1.4* (6.7)          7         5.5 (9.3)           6
livelihood activities are significantly different for the two survey methods for all categories, with the exception of the relative shares of income from livestock products and other
income. 12 The largest differences are observed for agriculture
(+12.3% for disaggregated approach), business (+6.6% for
disaggregated approach), and wild products (+5.5% for disaggregated approach).
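As a rough check on comparisons of this kind, the sketch below recomputes a two-group mean comparison from summary statistics alone, using the agriculture row of Table 3. It rests on several assumptions that the text does not confirm: that the values in parentheses in Table 3 are standard deviations, that the test is the equal-variance t-test described in Section 4(a), and that a normal approximation to the t distribution is adequate for samples of this size. The function name `pooled_t_test` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def pooled_t_test(mean1, sd1, n1, mean2, sd2, n2):
    """Two-group mean comparison assuming equal population variances.

    Returns (t statistic, degrees of freedom, approximate two-tailed p value).
    The p value uses a normal approximation to the t distribution, which is
    an assumption of this sketch and is reasonable only for large samples.
    """
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    std_err = sqrt(pooled_var * (1 / n1 + 1 / n2))
    t_stat = (mean1 - mean2) / std_err
    p_two_tailed = 2 * (1 - NormalDist().cdf(abs(t_stat)))
    return t_stat, df, p_two_tailed


# Agriculture share of net total income as reported in Table 3:
# disaggregated 47.0 (17.2), N = 170; aggregated 34.7 (17.6), N = 88.
t_stat, df, p_value = pooled_t_test(47.0, 17.2, 170, 34.7, 17.6, 88)
print(f"t = {t_stat:.2f}, df = {df}, approx. two-tailed p = {p_value:.2g}")
```

Under these assumptions the 12.3 percentage point difference in the agriculture share is large relative to its standard error, which is consistent with the asterisk on that row of Table 3.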
Another way of comparing data on income portfolios is by
ranking the categories used in both surveys and observing
whether there is general consensus between the two methods
on the order of importance of each income category. In
Table 3, agriculture is ranked first using both survey methods, for both net total and cash income, but beyond this first
category there is considerable variation. For example, if you
wanted to design a policy intervention around the three most
important sources of net income for rural households in the
study area, the disaggregated method points to focusing
attention on agriculture, business, and the harvesting of
unprocessed forest products. Analysis of the data using the
aggregated method points to agriculture, unprocessed forest
products, and wage income. If we look at the top five categories, three sources of net income are common to both
survey methods.
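The ranking comparison can be reproduced from the mean shares in Table 3. The sketch below is one possible way to do it, not necessarily the procedure the authors followed: it assigns tied shares the same rank and treats the "top k" as every category ranked k or better. The function names are hypothetical.

```python
# Mean shares of net total income (%), taken from Table 3.
disaggregated = {
    "Agriculture": 47.0, "Business": 12.7, "Unprocessed forest products": 10.9,
    "Wild products": 10.2, "Livestock": 5.7, "Wages": 5.1, "Other": 3.4,
    "Livestock products": 2.4, "Remittances": 1.3, "Processed forest products": 1.2,
}
aggregated = {
    "Agriculture": 34.7, "Business": 6.1, "Unprocessed forest products": 14.4,
    "Wild products": 4.5, "Livestock": 9.5, "Wages": 9.5, "Other": 4.1,
    "Livestock products": 2.7, "Remittances": 6.7, "Processed forest products": 4.6,
}


def competition_ranks(shares):
    """Rank categories by share, largest first; tied shares get the same rank."""
    return {
        name: 1 + sum(other > value for other in shares.values())
        for name, value in shares.items()
    }


def top_k(shares, k):
    """Categories whose rank is k or better (ties can return more than k)."""
    ranks = competition_ranks(shares)
    return {name for name, rank in ranks.items() if rank <= k}


common_top5 = top_k(disaggregated, 5) & top_k(aggregated, 5)
print(sorted(common_top5))
```

Because livestock and wages have the same aggregated share (9.5%), both receive rank 3, so a "top three" list for the aggregated method depends on how that tie is broken.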
To test whether respondents are better at recalling cash (vs.
subsistence) income we present relative shares and rankings
for sources of household cash income. A limitation of collecting income data in rural settings where a considerable share
of production is for home use is that it may be cognitively difficult for respondents to aggregate items they have not marketed (e.g., number of cobs of corn harvested and consumed
by all household members, or number of head loads of fuel
wood collected by women and children). With the exception
of wage income all estimates of sources of income are significantly different between the two survey methods. There are
large differences in the relative importance of agriculture
(+12% for aggregated approach), and business income
(+20.3% for disaggregated approach). Rankings of cash income suggest that there is considerable agreement between the
two methods. Agriculture, business, livestock, and wages are
among the top five sources of cash income using both methods.
Further to the discussion above, we hypothesize that some
of these differences may arise because the aggregated approach
is more directed toward subjective measures of importance,
while the disaggregated approach is more directed toward
objective quantitative measures of income. The subjective
measures may include considerations such as the importance
of income sources as safety nets during hard times from events
such as droughts or a household member falling ill. If this
interpretation is correct, we would expect unprocessed forest
products, livestock, wage income, and remittances to be more
important in the subjective, aggregated approach because
these activities play larger safety net functions than agriculture
and business income which are more regularized activities.
This explanation seems to be supported by the results, as local
returns to agriculture and business are likely to be disrupted
by droughts and other shocks, while forest products (processed and unprocessed), remittances, wages, and livestock
may act as safety nets for livelihoods (Dercon, 2002; Pattanayak & Sills, 2001).
Considering measures of net cash income also seems to
support this line of reasoning. For those sectors that have
cash flows that persist during droughts and other shocks
to the household, we would expect subjective measures of
their importance to be higher than the more quantitative
measures. Accordingly, as per above results, we see higher
rankings for unprocessed and processed forest products,
and remittances for aggregated approaches. But the same logic does not hold for agriculture, wild products, or livestock
which have reversed patterns compared to net total income
measures. For agriculture and wild products, higher aggregated values suggest that the cash part of income may be
Table 4. Proportion of expenditures for various categories of goods and services

Including the other category
Categories of expenditures         Disaggregated (N = 173)    Aggregated (N = 80)
                                   Share of total    Rank     Share of total    Rank
Other                              46.3 (27.7)       1        9.2 (11.9)        4
Agricultural production and food   19.6 (26.3)       2        23.9 (12.0)       1
Livestock                          11.0* (14.1)      3        5.8 (7.4)         7
Medical                            10.3* (17.2)      4        23.2 (14.7)       2
Entertainment                      5.4* (11.0)       5        11.2 (7.5)        3
Transportation                     3.6 (11.6)        6        3.4 (4.5)         9
School fees and other education    2.5 (9.3)         7        8.9 (9.4)         5
Formal social occasions            1.2* (3.8)        8        5.4 (5.0)         8
Forest products                    0.1* (0.7)        9        8.9 (9.4)         5
Fish                               0                 10       0.1 (1.1)         10

Excluding the other category
Categories of expenditures         Disaggregated (N = 173)    Aggregated (N = 80)
                                   Share of total    Rank     Share of total    Rank
Other                              NA                NA       NA                NA
Agricultural production and food   32.9 (33.0)       1        25.8 (12.2)       2
Livestock                          23.4* (26.3)      2        6.1 (7.7)         7
Medical                            18.3* (26.6)      3        26.5 (16.9)       1
Entertainment                      10.9 (20.1)       4        12.5 (8.3)        3
Transportation                     6.1 (0.36)        5        3.6 (4.8)         8
School fees and other education    4.1* (14.1)       6        9.7 (10.4)        4
Formal social occasions            4.0 (14.4)        7        6.3 (6.3)         6
Forest products                    0.2* (1.8)        8        9.3 (10.4)        5
Fish                               0                 9        0.2 (1.5)         9
same bias is occurring as may have been the case for the
other category. Livestock includes frequent and numerous
types of expenditures that may be underestimated if not delineated with disaggregated methods. On the other hand, the
disaggregated approach, which only collected data for expenditures seven days prior to each quarterly visit, may have
missed some irregular, yet relatively large household expenditures such as medical and school fees. The high standard deviations associated with these categories for the disaggregated
approach support this hypothesis. The rank order of household expenditure categories is also somewhat inconsistent between the two approaches. For the disaggregated approach
without the other category, the three most important expenditure categories are agricultural production and food, livestock,
and medical expenses. The aggregated method tells us that
medical expenses, agricultural production and food, and entertainment are the three most important expenditure categories.
Considering the top five categories for each method, three categories are common. Comparisons of the standard deviations
between the two approaches suggest that the precision of estimates of means is somewhat greater for the aggregated approach, which generally has standard deviations similar to,
or lower than, mean values. In contrast, the disaggregated approach tends to have standard deviations larger than the
means.
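One way to express this comparison is the coefficient of variation (standard deviation divided by the mean). The sketch below applies it to three rows of Table 4. The pairing of values with columns is an assumption based on our reading of the table as extracted (the columns that include the other category), and the choice of the coefficient of variation as a summary is ours rather than a measure reported in the paper.

```python
# (mean share %, standard deviation) for selected rows of Table 4,
# read from the columns that include the "other" category (an assumption).
disaggregated = {
    "Agricultural production and food": (19.6, 26.3),
    "Medical": (10.3, 17.2),
    "School fees and other education": (2.5, 9.3),
}
aggregated = {
    "Agricultural production and food": (23.9, 12.0),
    "Medical": (23.2, 14.7),
    "School fees and other education": (8.9, 9.4),
}


def coefficient_of_variation(mean_share, sd):
    """Standard deviation relative to the mean; above 1 means SD exceeds the mean."""
    return sd / mean_share


for category in disaggregated:
    cv_d = coefficient_of_variation(*disaggregated[category])
    cv_a = coefficient_of_variation(*aggregated[category])
    print(f"{category}: CV disaggregated = {cv_d:.2f}, aggregated = {cv_a:.2f}")
```

For these three rows the disaggregated coefficient of variation exceeds 1 and is larger than the aggregated one, in line with the pattern described above.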
As noted in the discussion of sampling and questionnaire design, the data we present for the disaggregated method reflect
expenditures over a seven day time period (for each of four
visits) for one adult male and one adult female from each
household. Given the limited amount of cash income that
households in these communities have, and limited opportunities for spending cash, we feel that our analysis of the disaggregated data is a reasonable proxy for household expenditure
patterns for the entire household. That is, we do not feel that
we missed a large share of expenditures by failing to collect
Source of income                                Aggregated (N = 88)
Agriculture                   47.0* (17.2)      34.7 (17.6)           37.3 (29.5)
Business income               12.7* (17.2)      6.1 (11.0)            12.9** (21.6)
Unprocessed forest products   10.9* (6.7)       14.4 (6.6)            13.4 (10.5)
Wild products                 10.2* (5.9)       4.5 (4.7)             15.2** (14.1)
Livestock                     5.7* (7.4)        9.5 (8.3)             6.4** (12.7)
Wage income                   5.1* (9.9)        9.5 (13.0)            4.3** (10.7)
Other income                  3.4 (4.5)         4.1 (8.6)             3.6 (6.1)
Livestock products            2.4 (4.8)         2.7 (4.4)             3.2 (7.9)
Remittances                   1.3* (5.6)        6.7 (13.8)            1.5** (8.0)
Processed forest products     1.2* (3.0)        4.6 (7.1)             1.9** (5.1)
finding highlights the importance of recall periods, and demonstrates the potential limitations of collecting household income data using a one year recall period.
5. DIFFERENT METHODS, DIFFERENT STORIES
Table 6. Determinants of share of forest incomea,b
Disaggregated
Aggregated
2.355***R
(0.844)
1.772***R
(0.597)
0.959
(1.788)
0.006
(0.006)
1.900
(1.461)
0.218
(0.202)
1.287
(1.318)
0.030**R+
(0.047)
0.029
(0.048)
167
0.1663
0.992
(1.044)
0.304
(0.481)
0.623
(3.644)
0.031***
(0.009)
6.321***
(2.306)
0.409
(0.339)
1.362
(2.097)
10.752**
(4.161)
0.116
(0.134)
81
0.2815
NOTES
1. For a discussion on issues associated with asking questions of different
household members, see Fisher et al. (2010).
Land Act tenants were granted freehold status for mailo parcels held since
1986. While the 1998 Land Act clearly outlines the provisions of various
land tenure systems and land rights, the act is weakly enforced (Nkonya et
al., 2004).
8. The categories of wage income, business income, and remittances are
assumed to be entirely cash income.
9. Note that the problem of interdependent observations is an inherent
problem in using data that rely on relative rather than absolute measures.
10. In the study area the average number of adults per household is 2.2
(i.e. those over the age of 15).
11. We calculated means and standard deviations for the sub-population
of the disaggregated dataset (N = 28) for the villages where the
aggregated survey was conducted. These households are not statistically
significantly different from the other households in the disaggregated
sample for variables important to forest-based livelihoods including
hectares of forest owned by the household, distance to forest etc. There
were differences with respect to the number of migrant households (40%),
and asset ownership (bicycle and mobile phone ownership was 64% and
21% respectively).
12. Other income includes: support from the government or NGOs; gifts
including condolences; pension income; payments for forest services;
payments for renting out land/houses; sale of land/standing trees; etc.
13. Recall that we hypothesize above that results for forest products
from the aggregated approach may be biased by the strategic behavior of
households.
14. Recent innovations in tracking household activities (cf. Shirima et al.
(2007) who use PDAs to collect and aggregate data in real time) could help
us understand, for a set of representative households, what the most
accurate representation of income, expenditures, and time use is.
REFERENCES
Bradley, S. (1995). How people use pictures: An annotated bibliography.
London, UK: International Institute for Environment and Development.
CARE. (2002). Household livelihood security assessments: a toolkit for
practitioners. Atlanta, GA: CARE, USA.
Campbell, J. (2002). A critical appraisal of participatory methods in
development research. International Journal of Social Research Methodology, 5(1), 19–29.
Campbell, B. M., Jeffrey, S., Kozanayi, W., Luckert, M. K., Mutamba,
M., & Zindi, C. (2002). Household livelihoods in semi-arid regions:
Options and constraints. Bogor, Indonesia: Center for International
Forestry Research.
Campbell, B. M., & Luckert, M. K. (Eds.) (2002). Uncovering the hidden
harvest: Valuation methods for woodland and forest resources. People
and Plants Conservation Series. London, UK: Earthscan Publications
Inc.
Carney, D. (Ed.) (1998). Sustainable rural livelihoods: What contribution
can we make?. London, UK: DFID.
Cavendish, W. (2000). Empirical regularities in the poverty-environment
relationship of rural households: Evidence from Zimbabwe. World
Development, 28(11), 1979–2003.
Cavendish, W. (2002). Quantitative methods for estimating the economic
value of resource use to rural households. In B. M. Campbell, & M. K.
Luckert (Eds.), Uncovering the hidden harvest: Valuation methods for
woodland and forest resources. London, UK: Earthscan Publications
Ltd.
Chambers, R. (1994). The origins and practice of participatory rural
appraisal. World Development, 22(7), 953–969.
Shirima, K., Mukasa, O., Armstrong Schellenberg, J., Manzi, F., John,
D., Mushi, A., et al. (2007). The use of personal digital assistants for
data entry at the point of collection in a large household survey in
southern Tanzania. Emerging Themes in Epidemiology, 4(5), 1–8.
Sjaastad, E., Angelsen, A., Vedeld, P., & Bojo, J. (2005). What is
environmental income?. Ecological Economics, 55, 37–46.
UBOS. (2006). 2002 Uganda population and housing census. Kampala,
Uganda: Uganda Bureau of Statistics.
Vedeld, P., Angelsen, A., Sjaastad, E., & Berg, G. K. (2004). Counting on
the environment: Forest incomes and the rural poor. In Environmental
Economics Series, No. 98. Washington, DC: World Bank.
Wonnacott, T., & Wonnacott, R. (1977). Introductory statistics for
business and economics (2nd ed.). New York: Wiley.