
Measuring Outputs & Outcomes

The Ignored Dimension of Evaluation


Keith T Linard
Senior Lecturer, School of Civil Engineering
University College (University of New South Wales) Australian Defence Force Academy
Email: keithlinard#@#yahoo.co.uk (Remove hashes to email)

SUMMARY
The Australian Department of Defence's Defence Managers Toolbox, a CD ROM which
incorporates the Commonwealth Managers Toolbox, has tens of thousands of management
reports, reviews, guidelines, directives and laws. Of these, some 2,500 documents refer to
"measuring" (or grammatical variations of the term) in the context of evaluation. A great
number of these emphasise the critical importance of, or the difficulty in measuring outputs
and outcomes. NOT ONE gives useful practical guidance on the task of measuring. The
presumption seems to be that, having defined what to measure, the actual task of
measurement is trivial and can be left to the technicians.
The field of data specification, collection, measurement and analysis, however, is a highly
specialised one. It is in this area that many evaluations are deficient.
This paper is a practical guide for managers on the subject of the measurement of program
outputs and outcomes. It does not pretend to be a comprehensive checklist. Rather it aims to
raise awareness of some of the key issues and to provide guidance on managing this
important facet of program evaluation.
Keywords: Public sector management; program evaluation; program performance
measurement; evaluation design.

Keith Linard, as Chief Finance Officer, Australian Department of Finance, was responsible
for the Machinery of Government Section and later the Financial Management Improvement
Section during the 1983-88 'reform' of the Australian Federal Public Service. Keith
currently runs the postgraduate system dynamics program at the Australian Defence Force
Academy and co-directs the postgraduate program in project management.
Introduction

Government documents on evaluation tend to have an air of a Lewis Carroll nonsense rhyme
about them. The literally thousands of such documents I have studied tend, in the main, to
generality and platitude. This is especially the case in relation to the critical area of
measuring evaluation outputs and outcomes. It is apparent that most of the writers or editors
have little if any familiarity with either the statistical or mathematical side of measurement,
or with the data specification and measuring side of the issue.1

"Would you tell me please, which way


I ought to go from here?"
"That depends a good deal on where
you want to get to", said the Cat.
"I don't much care where", said Alice.
"Then it doesn't matter which way
you go", said the Cat.
Alice's Adventures in Wonderland: Lewis Carroll

In preparing for this paper I scanned thousands of evaluation related documents in The
Australian Department of Defence's Defence Managers Toolbox2, a CD ROM which
incorporates the Commonwealth Managers Toolbox. The CD ROM contains some 2,500
documents which refer to "measuring" (or grammatical variations of the term) in the context
of evaluation. A great number of these emphasise the critical importance of, or the difficulty
in measuring outputs and outcomes. NOT ONE gives useful practical guidance on the task of
measuring. The presumption seems to be that, having defined what to measure, the actual
task of measurement is trivial and can be left to the technicians.

This paper does not address in any depth what measures of performance might be
appropriate. It assumes that the 'what' of measurement, i.e. the specification of the
performance criteria, has been determined. Rather, it focuses on the data aspects of
measurement:
the nature of the data to be acquired
the sources of the data
methods to be used in sampling
methods of collection
timing and frequency of data collection
basis of comparing outcomes (to analyse cause and effect)
data analysis methods to be applied.

1
It is possibly unfair to Lewis Carroll to compare him to the evaluation bureaucrats. Carroll after all
was mathematically and statistically literate.
2
Department of Defence, Defence Managers Toolbox. Directorate of Publishing, Defence Centre
Canberra. June 1996.
Sage advice from the bureaucracy
The 'how to do it' manual, Doing Evaluations - A practical guide3, emphasises that
evaluation is about measuring. It notes:
Evaluations of effectiveness are primarily concerned with:
measuring outcomes;
measuring factors that affect those outcomes . . .
Measurement should focus on the most important attributes (rather than those most
easily measured), especially those which are crucial to achieving higher level
outcomes.
But how should we undertake this task? Our practical guide gives impeccable advice: get
the consultant to figure it out.
Most evaluations, therefore, should also include terms of reference which relate to:
the identification and measurement of unanticipated outcomes; and
recommendations as to how unanticipated outcomes may be maximised if positive,
or minimised if negative.
Quality for our Clients: Improvement for the Future4 talks discursively on the subject of
measurement, but again with little enlightenment on the 'how'.
Some activities can be measured "objectively", ie, quantified, and others assessed
"qualitatively", ie, subjectively. The idea that complex service delivery can only be
assessed qualitatively is only justified when quantitative measures are exclusively
used as unit cost measures or productivity measures (inputs/outputs), because these
are not always applicable to some activities (such as teaching or health).
The Management Advisory Board (MAB) was charged under the then Public Service Act
with advising the Commonwealth Government, through the Prime Minister, on significant
issues on the management of the Australian Public Service (APS). The MAB acknowledged
that measurement of outputs was critically important, but was difficult.
Measurement of the performance of individuals and of groups is difficult and
challenging in both the private and public sectors. The measurement of overall
performance in the public sector is, however, conceptually different. There is often,
for example, no market model to apply. In the public sector the term performance is
taken to mean the achievement of planned outcomes or results, and the taking of
actions designed to stimulate such outcomes. The yardsticks of performance are many
and varied: there is generally no identifiable profit measure (or equivalent) which
can be applied. 5

3
Department of Finance, Doing Evaluations - A practical guide. Commonwealth of Australia, Canberra.
1994
4
Department of Finance, Quality for our Clients: Improvement for the Future. Internal Working Group
Report. Canberra. 1995
5
MAB-MIAC Publication Series No 10, Performance Information and the Management Cycle.
February 1993
The MAB also seemed to think that we solve the problem of measurement with "clear and
precise terms" in a contract specification.6 Unfortunately they give no practical guidelines
on how to specify this difficult and challenging task.
116. The contract specification should define in clear and precise terms the scope of
the work to be undertaken, including the output in amount and quality,
standards to be met, response times, how and when the work will be measured,
the frequency and measurement of the work, the responsibilities of the agency
and contractor, key milestones and the contractor's continuing responsibilities.
In this paper I seek to redress some of this imbalance by focusing specifically on the 'how',
identifying the key issues involved in output and outcome measurement and suggesting
sources for more detailed understanding.

Designing for data measurement, collection and analysis


The issue of measurement is appropriately considered within the overall context of evaluation
design. Developing a valid evaluation methodology is a crucial step in evaluation design.
The evaluation methodology concerns how we can answer the evaluation questions with
credibility; what precisely we have to measure; what are the sources of the data; how that
data will be measured and collected; and how it will be analysed.
Examples abound of evaluations launched or data collections embarked upon with no clear
idea of what specific hypothesis the data is to address or of how the data is to be analysed.
These have been costly, in wasted resources and in loss of credibility.
It is essential that a valid methodology is designed before any resources are devoted to data
measurement or collection.
The main considerations which influence the choice of evaluation tools are:
the type of evaluation (compliance, efficiency, effectiveness etc)
the time and cost constraints
the significance of the expected pay-offs
the degree of controversy surrounding the program or the evaluation itself
Whilst this paper focuses specifically on measurement of outputs and outcomes, it must be
remembered that there may be many common data elements relevant at varying stages of the
program cycle. Also, the data may well be maintained in common databases. Figure 1,
overleaf, summarises some important characteristics of different types of evaluation,
including the purpose of each evaluation type (which in turn is indicative of the type of
evaluation question to be addressed), the typical timing of data collections, the type of data
used and the key evaluators.

6
MAB-MIAC Publication Series No. 8, Contracting for the Provision of Services in Commonwealth
Agencies, December 1992
Implementation analysis
Purpose of evaluation: help decide how best to set up a new program
Key users of evaluation: corporate, program and line managers
Data collection in evaluation: ad hoc during the program planning phase
Type of data used: input and process variables
Key evaluators: program and line managers
Key payoff expected: improved design and management of initial program implementation

Compliance audit
Purpose of evaluation: assess whether program activities are in line with rules
Key users of evaluation: Parliament, corporate, program and line managers
Data collection in evaluation: periodic or cyclical
Type of data used: input, process and output variables
Key evaluators: internal or external audit unit; program and line managers
Key payoff expected: maintenance of minimum acceptable standards

Efficiency evaluation
Purpose of evaluation: determine whether program activities are done efficiently
Key users of evaluation: corporate, program and line managers
Data collection in evaluation: ad hoc or periodic
Type of data used: input, process and output variables
Key evaluators: program and line managers; audit unit
Key payoff expected: more efficient management of program activities; value for money

Performance monitoring
Purpose of evaluation: measure and record actual results compared with planned
Key users of evaluation: Parliament, public, Ministers, corporate, program and line managers
Data collection in evaluation: continuous, supplemented by periodic special collections
Type of data used: environmental, input, process, output and outcome variables
Key evaluators: program and line managers
Key payoff expected: improved program management; warning of need for corrective action; input to all types of evaluations

Ex-ante evaluation
Purpose of evaluation: assess needs for policy development; assess best options
Key users of evaluation: Parliament, Minister, corporate management
Data collection in evaluation: ad hoc during the program planning phase
Type of data used: environmental, input and output variables; best guesses
Key evaluators: evaluation unit; policy unit; program manager; task force
Key payoff expected: improved policy development; best choice of options

Effectiveness evaluation
Purpose of evaluation: determine achievement and continued relevance of the program
Key users of evaluation: Parliament, Minister, corporate and program managers
Data collection in evaluation: collection at program beginning, during and end; analysis periodic
Type of data used: environmental, input, process, output and outcome variables
Key evaluators: evaluation unit; policy unit; program manager; task force
Key payoff expected: improved resource allocation; better achievement of Government policies

Meta evaluation
Purpose of evaluation: determine the value of evaluation activities for the decision makers
Key users of evaluation: corporate and program managers; evaluators
Data collection in evaluation: ad hoc
Type of data used: input, process and output variables
Key evaluators: research institutions and universities
Key payoff expected: improved evaluations

FIGURE 1: TYPES OF EVALUATION AND THEIR CHARACTERISTICS


Examples abound of evaluations launched or expensive data collections embarked upon with
no clear idea of how the data is to be analysed. These are costly, in wasted resources and in
loss of credibility. It is essential that a valid methodology be designed before any
resources are devoted to data collection.

KEY ELEMENTS OF EVALUATION METHODOLOGY

kind of data to be acquired
sources of data
methods to be used for sampling data sources (e.g., random sampling)
methods of collecting the data (e.g., self-administered questionnaires)
timing and frequency of data collection
basis for comparing outcomes with and without a program (to ascertain cause and effect)
analysis plan (e.g., discriminant analysis etc.)

In every evaluation there will be a trade-off between the theoretical purity of the
methodology and resource and timing constraints. Corporate management agreement should
be sought if such constraints are likely to endanger the evaluation's credibility or value.
The design is, of course, dependent on the nature of the evaluation questions being asked.
Figure 2 groups some common types of analytical tools according to the type of evaluation
question for which they are most relevant.
Based on the evaluation objectives, the evaluation design should specify the hypotheses to
be tested or the questions to be answered. The nature of these questions or hypotheses will
suggest which statistical techniques are relevant.
Evaluation question: What is the difference between what is and what should be?
Method of analysis: gap analysis
Analytical tools: statistical inference (analysis of variance, hypothesis testing, multi-dimensional scaling)

Evaluation question: What priority should options have?
Method of analysis: scaling methods
Analytical tools: rating scales; rankings; nominal group techniques

Evaluation question: What is the best / most efficient / most effective . . . ?
Method of analysis: optimisation analysis
Analytical tools: operations research tools; system dynamics modelling

Evaluation question: To what extent was the program responsible for observed changes?
Method of analysis: cause and effect analysis
Analytical tools: systems analysis; logic analysis; simulation modelling

Evaluation question: What are the common patterns?
Method of analysis: classification methods
Analytical tools: statistical inference (cluster analysis, discriminant analysis, factor analysis)

Evaluation question: What will happen if . . . ?
Method of analysis: trend analysis
Analytical tools: simulation modelling; statistical inference

FIGURE 2: COMMON EVALUATION QUESTIONS AND RELATED METHODS OF ANALYSIS
Gap Analysis: A common need in evaluation is to examine the gap between what is and
what is desired or what was planned. Here we need to examine whether the observed gap
is statistically significant, or whether it is simply a random variation due to the particular
sample chosen. Statistical inference techniques are most appropriate to such problems.
Advanced statistical texts, and the major statistical software packages provide guidance on
appropriate tests and their application. This is, however, an area requiring mathematical
training.
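By way of illustration only (the example and figures below are invented, not drawn from any of the documents cited), a minimal Python sketch of a gap analysis might use a one-sample t-test to ask whether observed performance differs significantly from a planned standard, or whether the gap is plausibly sampling variation:

```python
import numpy as np
from scipy import stats

# Hypothetical example: measured processing times (days) for a sample of 12 cases,
# compared against a planned standard of 10 days.
observed = np.array([11.2, 9.8, 12.5, 10.4, 13.1, 9.9, 11.7, 12.0, 10.8, 11.5, 12.9, 10.1])
planned_standard = 10.0

# One-sample t-test: is mean observed performance different from the standard,
# or could the gap plausibly be random sampling variation?
t_stat, p_value = stats.ttest_1samp(observed, planned_standard)

print(f"mean observed = {observed.mean():.2f} days, gap = {observed.mean() - planned_standard:.2f}")
print(f"t = {t_stat:.2f}, p = {p_value:.3f}")
if p_value < 0.05:
    print("Gap is statistically significant at the 5% level.")
else:
    print("Gap could plausibly be sampling variation.")
```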
Scaling Methods: Scaling methods are used to establish the relative significance of issues.
A frequently used one is the Likert scale, on which respondents indicate their level of
agreement, for example, from "strongly agree" to "strongly disagree". Other frequently used
measurements include ranges of importance or desirability. Such data, however, comprise
ordinal numbers, and only a limited number of statistical tests are valid.
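As a sketch of the point about ordinal data (the responses below are invented for illustration), a rank-based test such as the Mann-Whitney U is one of the limited set of tests that remains valid for Likert-type responses, since it does not treat the response categories as equally spaced numbers:

```python
from scipy import stats

# Hypothetical Likert responses (1 = strongly disagree ... 5 = strongly agree)
# from program participants and non-participants. Because these are ordinal data,
# a rank-based test is used rather than a comparison of means.
participants     = [4, 5, 3, 4, 4, 5, 2, 4, 5, 3]
non_participants = [3, 2, 3, 4, 2, 3, 3, 2, 4, 3]

u_stat, p_value = stats.mannwhitneyu(participants, non_participants, alternative="two-sided")
print(f"U = {u_stat:.1f}, p = {p_value:.3f}")
```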
Cause and Effect Analysis: Unless the evaluation involves a full experimental design with
pilot and (randomly assigned) control groups(i), statistical analysis can only suggest that an
apparent relationship exists between program activities and the observed results. It cannot
prove causality. Different approaches to addressing causality are discussed later in this
paper.

What do we measure?
All evaluation involves measurement. Figure 3 illustrates some key measurement definitions
which we will call upon later.
The criterion (that is, the performance to be measured) is depicted in the figure as a vertical
scale. The performance standard level indicates the desired (or acceptable) level of the
criterion towards which the program is aiming. The target (T) level is the actual performance
level being aimed at, consistent with budget constraints.
When we come to measurement, two points are critical:
the base-line measure (B) - what was the state of performance when the program was initiated; and
the current measure (M) - the performance at some point in time after the program has commenced.
The difference between the two measurements (M - B) is a measure of the change, as
operationally defined on this criterion. The performance discrepancy is shown as the
difference between the target and the current measure (T - M).

Figure 3: EVALUATION MEASURE OR CRITERION7
[Figure 3 depicts a vertical criterion scale marked with the standard, target (T), current
measure (M) and baseline (B) levels; the change is the gap between B and M, and the
discrepancy the gap between M and T.]
Sources of data
Availability of data is often a critical constraint in evaluation.

7
Source: A Guide to Measuring Performance and Outcomes. Children Services Program Victoria.
1985.
Rarely is much thought given to the data necessary for comprehensive assessment of impacts
until the program is up and running. In such cases effectiveness evaluations, in particular,
must rely on special post-program collection efforts to establish a base-line datum. This is
usually costly, and it is often difficult to get the desired accuracy.
Future evaluation data needs, the data sources and the mechanisms for collecting and storing
them should be addressed during the initial planning for the program. Data sources may be
considered under five broad categories: management information systems, special collection
efforts, existing records and statistics, simulation modelling and expert judgement.

Management information systems (MIS):


Every agency should have in place an MIS
which captures input, process and output
data. As far as is practical such data should
be collected as an automatic by-product of
normal work processes.
In practice government agencies are far
from this ideal. All agencies hold vast
amounts of data, of unknown quality, on
diverse, and often incompatible, computer
systems, on manual indexes, in files, in
reports.
The advent of CD ROM storage, Intranets and the World Wide Web potentially gives easier
access to these data. However, issues of collation and quality control, especially with
respect to data definition, become, if anything, more critical.

Figure 4: "Our biggest problem around here is lack of information. Of course, I have
nothing to base that on."
Special collection efforts:
These range from social surveys relying on subjective responses, to task analyses that
identify the specific attributes of a job, to precise measurement by scientific equipment (e.g.,
automatic traffic counters). In an ideal situation special collection efforts would normally
consist of three discrete activities:
base-line survey to determine the conditions at the beginning of the project;
re-survey of the same populations at intervals after program implementation;
special topic surveys to analyse specific program operating problems, especially
deviations from the assumed logic.
Existing records and statistics:
General purpose statistical collections by the national Bureau of Statistics or other
government or private sector bodies may provide surrogate measures for variables on which
statistics are not available, or may be useful to check the representativeness of sample
populations.
In using such sources it is important to check data definitions, population characteristics and
any adjustments made to the original data (e.g., re-basing or smoothing).
Simulation and modelling:
Sometimes physical or mathematical models can be developed to simulate the program
operations. All State Road Authorities in Australia, for example, have computer models
which help estimate the likely impact of changes in the road system. Powerful, yet
inexpensive, computer modelling tools are now being used to model a wide range of social
systems. In particular, the graphically oriented system dynamics simulation packages such as
Powersim and Ithink, will in the future become standard tools for the evaluator. Such
models may, of course, require extensive historical data for calibration.
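The following is only a minimal sketch of the stock-and-flow logic that such packages implement graphically; the program, variable names and parameter values are invented for illustration:

```python
# Minimal stock-and-flow sketch of the kind that packages such as Powersim or
# Ithink build graphically. All parameter values are invented for illustration.
dt = 0.25                 # time step (years)
horizon = 10.0            # simulation horizon (years)
waiting_list = 500.0      # stock: clients waiting for service
intake_rate = 120.0       # inflow: new referrals per year
service_capacity = 150.0  # outflow limit: clients the program can treat per year

time, stock = [0.0], [waiting_list]
t = 0.0
while t < horizon:
    outflow = min(service_capacity, waiting_list / dt)  # cannot treat more than are waiting
    waiting_list += (intake_rate - outflow) * dt        # Euler integration of the stock
    t += dt
    time.append(round(t, 2))
    stock.append(waiting_list)

print(list(zip(time[::8], stock[::8])))  # sample the trajectory every two years
```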
Expert judgement:
Not all change can be measured directly. In social fields the assessment of qualitative
changes often depends on expert judgement. It is important that the rating procedures used
allow the expression of judgements in a comparable and reproducible way.

Issues in data sampling


Data collection is expensive. Costs can be cut by using sample survey techniques. The crux
of survey design is its method of probability sampling, which permits generalisations to be
made, from the findings about the sample, to the population as a whole.
The validity of generalising from the sample to the entire population requires that the sample
be selected according to rigorous rules and that there is a uniform approach to data collection
to every unit in the sample.
The sample size affects how statistically reliable the findings will be when projected to the
entire population. Other important considerations include response rate, uniformity in the
sampling technique, and ambiguity in the data collection instruments and questions.
Sample surveys are specialist tasks, and advice should be sought before embarking on
their use.

Methods of data collection


Having decided whether and what type of sampling technique is appropriate, the next step is
to determine how to collect the needed data from the various sources available. Many
approaches are available. Some require the involvement of individuals or groups; others,
such as observation and review of existing data can largely be done by the researcher alone.
Figure 5 is but a small sample of data collection techniques. For discussion of these and
many others one of the most comprehensive summaries is still Van Gundy, Techniques of
Structured Problem Solving, Van Nostrand Reinhold, NY, 1981.
There is a tendency to assume that `any fool' can design a questionnaire, run a brainstorming
session or undertake systematic observation. These also are areas which demand expertise
and experience. Undertaking major collections without staff with the requisite skills can be
costly in terms of dubious quality data and loss of credibility.
Individually oriented methods: interviews; questionnaires; polls; tests
Group-oriented methods: sensing interviews; committees; DELPHI techniques; nominal-group technique; brainstorming
Observation: systematic observation; complete observation; participant observation
Review of existing data: records analysis; usage rates; use traces
Other: simulation
Figure 5: Data Collection Techniques

The fundamental issue of cause and effect


With effectiveness evaluation in particular, but also with other evaluation types (see figure 1)
a major task of data collection relates to measuring change. However, it is equally important
to be able to determine how much of that change is due to the program itself, and how much
results from other factors.
CASE 1: After measurement only
No before measurement or comparison or control group
We measure population characteristics at a single point in time, and compare these with the
target performance. For example, if a program aims to eliminate child poverty by 1999,
measurement of the extent of child poverty in 1999 will suggest whether the objective has
been met.
While this approach is cheap, and quite common, it has two critical defects:
without baseline data the basis of the target is questionable and the amount of change is
uncertain (change from what?);
even if change has occurred there is no valid basis for ascribing its cause to the program.
We can often mitigate the first deficiency by estimating baseline conditions from existing
data, through `expert judgement' etc. The issue of causality remains.
CASE 2: Before and after measurement
No comparison or control group
This is one of the more common evaluation designs. As an example, to test the efficacy of a
public service management improvement program we might survey departmental
management practices in 1990 and again in 1996, and ascribe any improvement to the
program.
This design is better than the preceding one in that the before measurements provide a
baseline from which change may be estimated. However, it still has serious shortcomings.
In particular, without other supporting evidence there is no basis for ascribing the change to
the program:
the change may reflect non-program factors or simply be random fluctuations;
the change may be due to the program but may not reflect longer term trends.
CASE 3: After measurement only
Of both pilot and comparison groups
This is a variant on the simple `after' measurement. By measuring the characteristics of a
pilot grouping which benefits from the program, against a comparison grouping which is not
affected by the program, we can estimate change due to the program.
For example, we might compare the crop yield of treated paddocks with untreated ones
(which have the same basic characteristics). We assume that any difference between the
pilot and the control is due solely to the treatment program. The doubt remains:
were the baseline conditions between the pilot and the control identical?
are the program effects the sole differences in impacts on the respective populations?
(With social programs even the awareness of a pilot program can impact on control
groups.)
CASE 4: Time series and econometric analyses
In a time series design the trends in the relevant indicators are analysed for the period prior to
the program. We project these forward in time and assume the projections represent what
would have been without the program. The difference between the projections and the
actuals is presumed to be solely due to the program.
Econometric techniques are statistical methods that estimate, from historical data, the
mathematical relationship between social and economic variables which are considered to
have a causal link to the evaluation question. The mathematical formula is used to predict
what would have been in the absence of the program. We then compare this with the actuals.
Because they consider more variables than simply time, econometric approaches have a
better predictive value than time series methods.
These approaches provide more reliable information than the previous two designs, and are
relatively inexpensive provided the requisite data is available. Their limitations are that:
they presume rather than prove causality; hence
we cannot be certain that the projections validly represent what would have been without the program.
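A minimal sketch of the time series approach, using invented indicator values, fits a trend to the pre-program years, projects it forward as the assumed 'without program' path, and treats the gap between that projection and the actuals as the presumed program effect:

```python
import numpy as np

# Hypothetical annual indicator values; the program starts after year 5.
years       = np.arange(1, 11)
actuals     = np.array([52, 54, 55, 57, 58, 63, 66, 70, 73, 77], dtype=float)
pre_program = years <= 5

# Fit a linear trend to the pre-program years only.
slope, intercept = np.polyfit(years[pre_program], actuals[pre_program], deg=1)

# Project the pre-program trend forward as the assumed "without program" counterfactual.
counterfactual = intercept + slope * years
program_effect = actuals - counterfactual   # presumed (not proven) program effect

for y, a, c, e in zip(years[~pre_program], actuals[~pre_program],
                      counterfactual[~pre_program], program_effect[~pre_program]):
    print(f"year {y}: actual {a:.0f}, projected {c:.1f}, presumed effect {e:+.1f}")
```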
CASE 5: Quasi-experimental design: before & after
measurements of pilot and comparison group
This design involves two or more measurements over time on both a pilot and a comparison
group. Both rates of change and amount of change between the two groups are then
compared. This protects to a large degree against changes which might have resulted from
other factors.
The main problem with this design is ensuring the pilot and control groups have sufficiently
similar characteristics. For example, if a pilot lifestyle education program is run in
Homebush, NSW, and the control group is located in Broadmeadows, Victoria, subsequent
health differences could be due to non-program factors, such as climate, ethnicity etc.
CASE 6: Experimental design: before & after measurements
of randomly assigned pilot and control groups
This design is the most rigorous, but also the most time consuming and costly. It is similar to
the quasi-experimental design in that specific changes in pilot and control groups are
analysed. It differs in the way these groups are chosen. In this design a target population is
assigned in a statistically random way to either the pilot or the control group. Initial
differences between the two groups can then be described through precise statistical
parameters.
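A minimal sketch of random assignment, with an invented target population, shows how initial differences between the two groups can then be described statistically (here with a simple baseline balance check):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)

# Hypothetical target population: a baseline score for each of 200 eligible people.
baseline_scores = rng.normal(loc=50, scale=10, size=200)

# Randomly assign half to the pilot group and half to the control group.
shuffled = rng.permutation(200)
pilot_idx, control_idx = shuffled[:100], shuffled[100:]

# With random assignment, initial differences can be described with precise statistics:
t_stat, p_value = stats.ttest_ind(baseline_scores[pilot_idx], baseline_scores[control_idx])
print(f"baseline means: pilot {baseline_scores[pilot_idx].mean():.1f}, "
      f"control {baseline_scores[control_idx].mean():.1f} (t = {t_stat:.2f}, p = {p_value:.2f})")
```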
CASE 7: Systems modelling
Systems modelling seeks to model mathematically the operations of a program, emphasising
actual causation rather than statistical correlation (which underlies econometric and time
series analyses). In the past this required a mainframe computer and skilled programmers, in
addition to the modellers. The availability of powerful system dynamics modelling tools for
PCs will increase the use of this approach. Such models, of course, require calibration data,
which will generally come from one or more of the evaluation designs above.
DATA ANALYSIS METHODS
Data analysis is often the weakest link in evaluation. A full treatment is outside the scope of
this paper; however, three points will suffice here:
there is often a lack of understanding of the differences between number scales (nominal,
ordinal, interval and ratio);
there seems to be little understanding of appropriate statistical techniques for analysing data
where there are multiple variables;
there appears to be little awareness of, or skill in the use of, pattern recognition or
classification techniques such as cluster analysis, factor analysis, multidimensional
scaling, discriminant analysis etc.
How the data will be analysed should be considered in the evaluation design stage. It should
not be left, as is so often the case, until the data has been collected.
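As a hedged example of one such classification technique, a cluster analysis looks for natural groupings among projects rather than testing a prior hypothesis; the data, variables and library choice (scikit-learn) below are illustrative assumptions, not part of the original paper:

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)

# Hypothetical project-level data: [cost per participant ($'000), placement rate (%)]
projects = np.vstack([
    rng.normal([5, 70], [1, 5], size=(30, 2)),    # cheap, effective projects
    rng.normal([12, 65], [2, 6], size=(30, 2)),   # expensive, effective projects
    rng.normal([9, 35], [2, 7], size=(30, 2)),    # mid-cost, ineffective projects
])

# Cluster analysis: let the data suggest groupings of similar projects.
kmeans = KMeans(n_clusters=3, n_init=10, random_state=0).fit(projects)
for label in range(3):
    members = projects[kmeans.labels_ == label]
    print(f"cluster {label}: n={len(members)}, "
          f"mean cost {members[:, 0].mean():.1f}k, mean placement {members[:, 1].mean():.0f}%")
```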
PRINCIPLES FOR SELECTING THE APPROPRIATE EVALUATION DESIGN
Four practical guidelines will help in choosing between the alternative evaluation designs:
The needs of the decision makers are paramount: Where do their 'certainty needs' lie on a
scale from gut feel to absolute certainty? Will 70% confidence suffice? Are they
vitally concerned about the full gamut of issues, or do they want reassurance on a few
crucial factors? How would they trade off scientific precision against timeliness and cost?
Be realistic about time, budget and skills constraints: Timeliness is often crucial. It is
better to have a simple evaluation whose limitations are known, than a sophisticated
design whose requirements must later be compromised because of budget, time or
personnel constraints.
Resist the urge to collect those additional data items which might be useful, one day.
Data collection and maintenance is costly. In many agency evaluations much of the data
collected is never used.
Get expert advice on the validity and appropriateness of the design. The most costly
design is that which is inappropriate to the problem.

SAMPLE TERMS OF REFERENCE RE MEASUREMENT


Evaluation Terms of Reference should provide clear guidance for the consultant or
evaluation team, and give the project manager adequate leverage to ensure quality control.
The following abbreviated version of the terms of reference from a scientifically rigorous
evaluation illustrates some key points of such a specification, including:
the evaluation questions;
the data collection methodology, including key data items and data sources;
the approach to validating presumed program effects (control groups etc).
1 Evaluation Questions:
1.1 Assess the efficiency of program implementation and operation:
1. Has the program operated according to its guide-lines?
2. How do sponsored activities differ according to sponsor, activity undertaken and
regional location?
3. How have administrative procedures helped or hindered program efficiency?
1.2 Assess the effectiveness of the program in meeting objectives:
1. By how much has the program improved the longer term employment prospects of
the participants?
2 Methodology of the Evaluation:
The evaluation will use, as appropriate, administrative data held in departmental files on each
sponsored project; special surveys of project sponsors, project participants and a control-
group of non-participants; and field investigations and discussions. The method of
investigation for each aspect of the evaluation will be as follows:
(a) Has the program operated according to its guide-lines?
Administrative and survey data will be used to assess extent of compliance with
program guide-lines, particularly regarding labour-capital ratio and equal employment
opportunity.
(b) How do sponsored activities differ according to sponsor, activity
undertaken and regional location?
Administrative and survey data will be used to identify factors likely to explain the
relative success or failure of projects in terms of objectives achievement. This will
necessitate analysis of projects according to type of sponsor (government, private
sector or community group), the activity undertaken (103 categories) and the regional
location. Other characteristics to be examined would include project cost, number of
persons employed and sponsor contribution.
(c) How have administrative procedures helped or hindered program
efficiency?
Administrative and survey data, on site examination of administrative procedures and
field discussions will be used to assess appropriateness of administrative procedures.
Issues to be examined will include delays or problems resulting from lack of local
decision making authority or from the level of professional or administrative
resources.
(d) By how much has the program improved the longer term employment
prospects of the participants?
This analysis will use the participants' initial and follow-up surveys, which will be
matched against a control group of non-participants. Labour market performance will
be assessed in the light of participants' previous employment history and the state of
the labour market at the time of the surveys. The perceptions of participants and their
employers of the impact of the program on subsequent employment history will also
be assessed.
3 Guide-lines for Special Surveys:
Participant Survey: Surveys will be based on stratified random samples. The sample frame
of 600 projects will be stratified according to nature of program, type of sponsor and region.
Participants will be surveyed within the selected projects by self-administered questionnaires.
A minimum of three follow-up contacts are planned.
Participant Follow-Up Survey: Follow-up survey will be posted three months after the
initial participant survey. Reply-paid envelopes will be provided and three follow-up
contacts made to ensure high response rate. Special attention will be given to the
characteristics of non-respondents using available administrative and survey data.
Survey of Non-Participants (Quasi-Control Group): A sample of 600 persons will be
selected whose socio-economic and geographic characteristics are as close as possible to the
participant sample. The sample will be drawn from people referred to the program but not
placed.
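A minimal sketch of the stratified sampling step described above, with an invented project frame and proportional allocation across strata, might look as follows (the 20% sampling fraction is an illustrative assumption):

```python
import random
from collections import defaultdict

random.seed(1)

# Hypothetical sample frame: 600 projects, each tagged with sponsor type and region.
sponsors = ["government", "private", "community"]
regions  = ["metro", "regional", "remote"]
frame = [{"id": i, "sponsor": random.choice(sponsors), "region": random.choice(regions)}
         for i in range(600)]

# Stratify by (sponsor, region) and draw a proportional random sample of ~20% per stratum.
strata = defaultdict(list)
for project in frame:
    strata[(project["sponsor"], project["region"])].append(project)

sample = []
for key, members in strata.items():
    k = max(1, round(0.2 * len(members)))   # at least one project per stratum
    sample.extend(random.sample(members, k))

print(f"frame {len(frame)} projects, stratified sample {len(sample)} projects "
      f"across {len(strata)} strata")
```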

i
. Refer Evaluating Government Programs - A Handbook. Canberra, AGPS, 1987.
