
Basic Concepts in Assessment

How can we use assessment as a tool to improve our teaching?

Assessments as Tools
• Assessment is a process of observing a sample of students’ behavior
and drawing inferences about their knowledge and abilities.
• We use a sample of student behavior to draw inferences about
student achievement.

Forms of Educational Assessment


• Informal vs. formal assessment
• Paper-pencil assessment vs. performance assessment
• Traditional assessment vs. authentic assessment
• Standardized test vs. teacher-developed assessment

Informal vs. formal assessment


• Informal assessments are spontaneous, day-to-day observations of
students’ performance in class.
• Formal assessment is planned in advance & used for a specific
purpose to determine what is learned in a specific domain.

Paper-pencil vs. Performance assessment


• Paper-pencil: asks students to respond in writing to questions.
• Performance: asks students to demonstrate knowledge or skills in
some other fashion. Students perform in some way.

Traditional vs. authentic assessment


• Traditional: assesses basic knowledge & skills separate from real-
world tasks.
• Authentic: assesses students’ ability to use what they’ve learned in
tasks similar to those in the outside world.

Standardized test vs. teacher-developed test


• Standardized test: developed by test experts, published for use in
many schools.
• Teacher-developed tests: developed by a teacher for use in an individual
classroom.

Purposes for assessment
• Formative evaluation: assessing what students know before & during
instruction. We can redesign lesson plans as needed.
• Summative evaluation: assessment after instruction to determine what
students have learned, to compute grades.

Promoting learning
• Assessments as motivators
• Assessments as mechanisms for review
• Assessments as influences on cognitive processing- students study more
effectively when they know what types of test items to expect.
• Assessments as learning experiences
• Assessments as feedback

Qualities of good assessments- RSVP


• Reliability
• Standardization
• Validity
• Practicality

Reliability
• The extent to which the instrument gives consistent information about
the abilities being measured.
• Reliability coefficient- a correlation coefficient ranging from -1 to +1; values close to +1 indicate highly consistent results.
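
For illustration only, a minimal sketch (with invented scores) of how a test-retest reliability coefficient might be computed: it is simply the Pearson correlation between two administrations of the same instrument to the same students.

    # Test-retest reliability sketch: the reliability coefficient is the Pearson
    # correlation between two administrations of the same instrument (hypothetical data).
    from statistics import correlation  # available in Python 3.10+

    first_administration = [72, 85, 90, 64, 78, 88, 70]
    second_administration = [70, 83, 92, 66, 75, 90, 73]

    reliability = correlation(first_administration, second_administration)
    print(f"Reliability coefficient: {reliability:.2f}")  # values near +1 mean consistent scores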

Standard error of measurement


• SEM- indicates how much a student’s observed score is likely to vary from
his or her true score.
• A true score is the hypothetical score a student would earn if the
instrument measured with perfect accuracy.
• The test manual reports the SEM; a score should be interpreted as a range
around the observed score- the confidence interval.
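
A small worked sketch of the SEM and the confidence interval around an observed score, using assumed values for the test's standard deviation and reliability (the usual formula is SEM = SD × √(1 − reliability)):

    # Standard error of measurement: SEM = SD * sqrt(1 - reliability).
    # SD and reliability below are assumed values for illustration.
    import math

    sd = 15              # test's standard deviation (assumed)
    reliability = 0.91   # reliability coefficient (assumed)
    observed_score = 104

    sem = sd * math.sqrt(1 - reliability)                    # 4.5 points here
    low, high = observed_score - sem, observed_score + sem   # roughly a 68% confidence interval
    print(f"SEM = {sem:.1f}; true score likely lies between {low:.1f} and {high:.1f}")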

Enhancing the reliability of classroom assessments


• Use several tasks in each instrument.
• Define each task clearly enough that students know what is being
asked.
• Use specific, concrete scoring criteria.
• Keep your expectations about individual students from influencing your judgments.
• Avoid assessing a child when s/he is ill, tired, or out of sorts in some
way.
• Use the same techniques and environment when assessing all students.

Standardization
• The concept that assessment instruments must have similar, consistent
content, format, & be administered & scored in the same way for
everyone.
• Standardized tests reduce error in assessment results & are considered
to be more reliable.

Validity
• The extent to which an instrument measures what it is designed to measure.
• Content validity- items are representative of the knowledge and skills being assessed.
• Predictive validity- how well an instrument predicts future
performance. SAT, ACT
• Construct validity- how well an instrument measures an abstract,
internal characteristic- motivation, intelligence, visual-spatial ability.

Essentials of testing
• An assessment tool may be more valid for some purposes than for
others.
• Reliability is necessary to produce validity.
• But reliability doesn’t guarantee validity.

Practicality
• The extent to which instruments are easy to use.
• How much time will it take?
• How easily is it administered to a group of children?
• Are expensive materials needed?
• How easily can performance be evaluated?

Standardized tests
• Criterion-referenced scores show what a student can do in accord with
certain standards.
• Norm-referenced scores compare a student’s performance with other
students on the same task.
• Norms are derived from testing large numbers of students.

Types of standardized tests


• Achievement tests- to assess how much students have learned of what
has been taught
• Scholastic aptitude tests- to assess students’ capability to learn, to
predict general academic success.
• Specific aptitude tests- to predict how students are likely to perform in
a content area.

Technology and Assessment


• Allows adaptive testing (see the sketch after this list)
• Can include animation, simulation, videos, audios
• Enables easy assessment of specific problems
• Assesses students’ abilities with varying levels of support
• Provides immediate scoring
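
As a rough illustration of the adaptive-testing idea above (not any particular testing system), the next item can be chosen closer to the student's apparent ability after each response; real adaptive tests use item response theory rather than this simple step rule.

    # Core loop of an adaptive quiz (illustrative only): step the difficulty up
    # after a correct answer and down after an incorrect one.
    def run_adaptive_quiz(items_by_difficulty, answer_is_correct, start=5, n_items=5):
        """items_by_difficulty: dict mapping difficulty 1-10 to a question.
        answer_is_correct: callable that presents a question and returns True/False."""
        difficulty = start
        results = []
        for _ in range(n_items):
            question = items_by_difficulty[difficulty]
            correct = answer_is_correct(question)
            results.append((difficulty, correct))
            difficulty = min(10, difficulty + 1) if correct else max(1, difficulty - 1)
        return results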

Guidelines for choosing standardized tests


• Choose a test with high validity for your purpose & high reliability.
• Be sure the test’s norm group is relevant to your population.
• Follow directions closely.

Types of test scores


• Raw scores- based on number of correct responses.
• Criterion-referenced scores- compare performance to criteria or
standards for success.
• Norm-referenced scores- compare a student’s performance with that of
other students of the same age or grade.

Norm-referenced scores
• Grade-equivalents and age-equivalents compare a student’s performance to the
average performance of students at the same age/grade.
• Percentile ranks- show the percentage of students at the same age/grade who
made lower scores than the individual.
• Standard scores- show how far the individual’s performance is from the mean
in standard deviation units.
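
A brief sketch (with an invented norm group) of how a percentile rank is obtained, following the definition above: count the norm-group scores that fall below the student's score and express that count as a percentage.

    # Percentile rank sketch: percentage of norm-group scores below the student's score.
    norm_group = [42, 55, 61, 63, 70, 72, 75, 78, 81, 90]   # invented norm-group scores
    student_score = 75

    below = sum(1 for score in norm_group if score < student_score)
    percentile_rank = 100 * below / len(norm_group)
    print(f"Percentile rank: {percentile_rank:.0f}")   # 60 -> outscored 60% of the norm group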

Standard scores
• Normal distribution- bell curve
• Mean
• Standard deviation- variability of a set of scores.
• IQ scores
• ETS scores
• Stanines
• Z-scores

Standard deviation
• IQ scores- mean of 100, SD of 15
• ETS scores- (Educational Testing Service tests- SAT, GRE)
mean of 500, SD of 100
• Stanines- for standardized achievement tests- mean- 5, SD- 2
• z-scores- mean of 0, SD of 1- used statistically
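
A short worked example tying these scales together: once a raw score is converted to a z-score, the other standard scores follow from each scale's mean and SD. The raw-score mean and SD below are assumed purely for illustration.

    # Converting one raw score to the standard-score scales listed above.
    # The raw-score mean (60) and SD (8) are assumed for illustration.
    raw_score, mean, sd = 72, 60, 8

    z = (raw_score - mean) / sd        # z-score: mean 0, SD 1  -> 1.5
    iq_style = 100 + 15 * z            # IQ-style score: mean 100, SD 15 -> 122.5
    ets_style = 500 + 100 * z          # ETS-style score: mean 500, SD 100 -> 650.0
    stanine = max(1, min(9, round(5 + 2 * z)))   # stanine: mean 5, SD 2, reported 1-9 -> 8
    print(z, iq_style, ets_style, stanine)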

Norm- vs. criterion-referenced scores


• Norm-referenced scores- grading on the curve, based on the class
average. Sets up a competitive environment, not a sense of
community. May be used in performance tests- who gets to be first
chair in band.
• Criterion-referenced scores show whether students have mastered objectives.

Interpreting test scores
• Compare 2 norm-referenced test scores only when those scores come
from equivalent norm groups.
• Have a clear rationale for cutoff scores for acceptable performance.
• Never use a single test score to make important decisions.

High-stakes testing and accountability


• High-stakes testing- Making major decisions on the basis of a single
assessment.
• Accountability- holding teachers, administrators responsible for
students’ performance on those tests.
• Performance on some tests determines whether a student passes a grade or graduates.

Problems with high-stakes testing


• Tests don’t always reflect instructional objectives.
• Teachers spend time teaching to the tests.
• Low achievers and special education students are often not included.
• Test criteria are often biased against students from lower-SES backgrounds.
• There is not enough emphasis on helping schools and students improve.

Potential solutions to the problems


• Identify what is most important for students to know.
• Educate the public about what test scores can do.
• Look at alternatives to tests.
• Use multiple measures in making high-stakes decisions.

Confidentiality & communication of test results


• Family Educational Rights & Privacy Act- limits school testing practices to
assessments of achievement and scholastic aptitude.
• Restricts access to test results to students, their parents, & their teachers.
• Restricts practices such as having students grade one another’s papers,
posting scores publicly, or having students sort through classmates’ papers
to find their own.
• Parents/ students can review test scores & school records.

Communicating classroom assessment results
• Assessment is primarily to help students learn & achieve more
effectively.
• Class results must be communicated to parents to enable student
success.

Explaining standardized test results
• Be sure you understand the test results yourself.
• It may be sufficient to explain test results in general terms.
• Use percentile ranks rather than IQ or grade equivalents.
• Describe the SEM & confidence intervals if you know them.

Taking student diversity into account


• Developmental differences
• Test anxiety
• Cultural bias
• Language differences
• Testwiseness

Accommodating students with special needs


• Modify format of test
• Modify response format
• Modify timing
• Modify setting
• Administer only part of the test rather than the whole test
• Use instruments that are more compatible with students’ ability levels

Testing

definition -

In general, testing is finding out how well something works. In terms of human beings,
testing tells what level of knowledge or skill has been acquired. In computer hardware
and software development, testing is used at key checkpoints in the overall process to
determine whether objectives are being met. For example, in software development,
product objectives are sometimes tested by product user representatives. When the design
is complete, coding follows and the finished code is then tested at the unit or module
level by each programmer; at the component level by the group of programmers
involved; and at the system level when all components are combined. At early or
late stages, a product or service may also be tested for usability.

At the system level, the manufacturer or independent reviewer may subject a product or
service to one or more performance tests, possibly using one or more benchmarks.
Whether viewed as a product or a service or both, a Web site can also be tested in various
ways - by observing user experiences, by asking questions of users, by timing the flow
through specific usage scenarios, and by comparing it with other sites.
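
As a hedged sketch of the unit-level testing mentioned above, a programmer might check a single module against its objectives with a small automated test; the function and values here are invented purely for illustration.

    # Unit-level test sketch: one module (an invented discount function) checked
    # against its objectives with Python's built-in unittest framework.
    import unittest

    def apply_discount(price: float, percent: float) -> float:
        """Return the price after applying a percentage discount."""
        if not 0 <= percent <= 100:
            raise ValueError("percent must be between 0 and 100")
        return round(price * (1 - percent / 100), 2)

    class ApplyDiscountTest(unittest.TestCase):
        def test_typical_discount(self):
            self.assertEqual(apply_discount(200.0, 25), 150.0)

        def test_invalid_percent_rejected(self):
            with self.assertRaises(ValueError):
                apply_discount(200.0, 150)

    if __name__ == "__main__":
        unittest.main()
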
 Meaning of Evaluation

• Evaluation has its origin in the Latin root “valere,” which refers to the value of a particular thing, idea or action. Evaluation thus helps us to understand the worth, quality, significance, amount, degree or condition of any intervention designed to tackle a social problem.

• Meaning of evaluation:

• Evaluation means finding out the value of something.

• Evaluation simply refers to the procedures of fact finding

• Evaluation consists of assessments of whether or not certain activities, treatments and interventions are in conformity with generally accepted professional standards.

• Any information obtained by any means on either the conduct or the outcome of
interventions, treatment or of social change projects is considered to be
evaluation.

• Evaluation is designed to provide systematic, reliable and valid information on the conduct, impact and effectiveness of projects.

• Evaluation is essentially the study and review of past operating experience.

 Purpose of Evaluation

• From an accountability perspective:


• The purpose of evaluation is to make the best possible use of funds by the
program managers who are accountable for the worth of their programs.

• Measuring accomplishment in order to avoid weaknesses and future mistakes.

• Observing the efficiency of the techniques and skills employed.

• Scope for modification and improvement.

• Verifying whether the benefits reached the people for whom the program was meant.

• From a knowledge perspective:

• The purpose of evaluation is to establish new knowledge about social problems and the effectiveness of policies and programs designed to alleviate them.

• Understanding people’s participation & reasons for the same.

• Evaluation helps to make plans for future work.




 Principles of Evaluation

• The following are some of the principles, which should be kept in view in
evaluation.

• 1. Evaluation is a continuous process (continuity).

• 2. Evaluation should involve minimum possible costs (inexpensive).

• 3. Evaluation should be done without prejudice to day-to-day work (minimum hindrance to day-to-day work).

• 4. Evaluation must be done on a co-operative basis in which the entire staff and
the board members should participate (total participation).

• 5. As far as possible, the agency should itself evaluate its program but
occasionally outside evaluation machinery should also be made use of (external
evaluation).

• 6. Total overall examination of the agency will reveal strengths and weaknesses (agency / program totality).
• 7. The result of evaluation should be shared with workers of the agency (sharing).

 Stages in Evaluation.

• 1. Program Planning Stage: pre-investment evaluation, formative evaluation, ex-ante evaluation (early / formulation), pre-project evaluation, exploratory evaluation, or needs assessment.

• 2. Program Monitoring Stage: monitoring evaluation (ongoing / interim) or concurrent evaluation.

• 3. Program Completion Stage: impact evaluation, ex-post evaluation (summative / terminal / final), or final evaluation.

 Steps in Evaluation :
 Types of Evaluation

• Evaluation can be categorized under different headings

• A) By timing (when to evaluate)

• Formative Evaluation

• Done during the program development stages

• (Process Evaluation, ex-ante evaluation, project appraisals)


• Summative Evaluation

• Taken up when the program achieves a stable state of operation or when it is terminated

• (Outcome evaluation, ex post evaluation etc.)

• B) By Agency. Who is evaluating?

• Internal evaluation: progress / impact monitoring by the management itself (ongoing / concurrent evaluation).

• External evaluation: unbiased, objective, detailed assessment by an outsider.

• C) By Stages

• Ongoing: during the implementation of a project.

• Terminal: at the end of, or immediately after, the completion of a project.

• Ex-post: after a time lag from completion.

 Types of Evaluation (timeline figure): a mid-term review takes place during the project, an end-of-project or final evaluation at its close, and an ex-post or impact evaluation after a time lag, tracing the path from the present situation to the desired situation of sustained benefits and impact.
 Internal / External Evaluation:

• Internal Evaluation: (Enterprise Self Audit)

• Internal evaluation (otherwise called monitoring or concurrent evaluation) is a continuous process which is done at various points, and in respect of various aspects of the working of an agency, by the agency itself, i.e. staff, board members and beneficiaries.

• External / Outside Evaluation: (done by outsiders / Certified Management Audit)

• Grant-giving bodies send experienced and qualified evaluators (inspectors) to assess the work, in order to find out how the money given is utilized by the agency or how the program is implemented, e.g. the Central Social Welfare Board.
• Some donors may send consultants in order to see how far the standards laid
down are put into practice.

• Inter-agency evaluation: two agencies mutually agree to have their programs evaluated by each other.

• Inter-agency tours.

 Methods of Evaluation: (Tools / techniques)

• Over the years, a variety of methodologies have been evolved by academicians, practitioners and professionals for evaluating any program / project. Some of the commonly used practices are given below.

• First hand Information :

• One of the simplest and easiest methods of evaluation is getting first-hand information about the progress, performance, problem areas etc. of a project from a host of staff, line officers, field personnel, other specialists and members of the public who are directly associated with the project. Direct observation & hearing about the performance and pitfalls further facilitate the chances of an effective evaluation.

• Formal / Informal Periodic Reports.

• Evaluation is also carried out through formal and informal reports.

• Formal reports consist of

• -Project Status Report

• -Project Schedule chart

• -Project financial status Report.

• Project Status Report:

• From this one can understand the current status, performance, schedule, cost,
hold-ups, and deviations from the original schedule.

• Project Schedule Chart:

• This indicates the time schedule for implementation of the project. From this one
can understand any delay, the cost of delay and the ultimate loss.

• Project Financial Status Report:


• The financial report shows at a glance whether the project is being
implemented within a realistic budget and time frame.

• Informal Reports:

• Informal reports such as anonymous letters, press reports, complaints by beneficiaries & petitions sometimes reveal the true nature of the project, even though these reports may be biased and contain malicious information.

• Graphic presentations:

• Graphic presentations through the display of charts, graphs, pictures, illustrations etc. in the project office are yet another instrument for a close evaluation.

• Standing Evaluation Review Committees:

• Some organizations have set up standing committees, consisting of a host of experts and specialists, who meet regularly at frequent intervals to discuss problems and to suggest remedial measures.

• Project Profiles:

• Preparation of project profiles by the investigating teams, on the basis of standardized guidelines and models developed for the purpose, is another method of evaluation.

 Views about evaluation

• Evaluation is primarily perceived from three perspectives.

• Evaluation as an analysis – determining the merits or deficiencies of a program, methods and process.

• Evaluation as an audit – systematic and continuous enquiry to measure the efficiency of means to reach their particular preconceived ends.

• In the agency context

• Evaluation of administration means appraisal or judgement of the worth and effectiveness of all the processes (e.g. planning, organizing, staffing etc.) designed to ensure that the agency accomplishes its objectives.

 Areas of evaluation:

• Evaluation may be split into various aspects, so that each area of the work of the
agency, or of its particular project is evaluated. These may be,
• 1.Purpose 2.Programs 3.Staff 4.Financial Administration 5.General.

• Purpose:

• To review the objectives of the agency / project and how far these are being
fulfilled.

• Programs:

• Aspects like number of beneficiaries, nature of services rendered to them, their


reaction to the services, effectiveness and adequacy of services etc. may be
evaluated.

• Staff:

• The success of any welfare program / agency depends upon the type of staff
the agency employs. Their attitudes, qualifications, recruitment policy, pay and
other benefits, and the organizational environment are the areas which help to
understand the effectiveness of the project / agency.

• Financial Administration:

• The flow of resources and their consumption is a crucial factor in any project /
agency. Whether the project money is rightly spent, whether there is overspending
under some headings, and whether funds are properly appropriated or misappropriated
are some of the indicators that reveal the reasons for the success or failure of any project.

• General:

• Factors like the public relations strategies employed by the project / agency, the
constitution of the agency board or project advisory committee and their
contribution, and the future plans of the agency are important for understanding
the success or failure of any project.

 Evaluation ……

• Efficiency: analysis of how successful the project has been in transforming the means (i.e. the resources and inputs allocated to the project) through project activities into concrete project results. Provides the stakeholders with information on inputs / costs per unit produced.

• (Logframe diagram: overall objectives; project purpose + assumptions; results + assumptions; activities + assumptions; means + preconditions; linked by allocation, action, utilisation and change.)

• Effectiveness: analysis of how well the production of project results contributes to the achievement of the project purpose, i.e.: are there clear indications of changes and improvements that benefit the beneficiaries of the project? Uses baseline information on the pre-project situation as a starting point.

• Impact: analysis of the overall effects of the project and of the contribution of the project purpose to the overall objectives. Focus on long-term changes in the environment of the project; collection and analysis of information at the levels of communities and society at large, focusing on the final beneficiaries of the project. Also analysis of unintended impacts (negative and positive).


 Criteria for Evaluating Development Assistance

• Relevance = the extent to which the aid intervention is suited to the priorities and policies of the target group, partner country and donor.

• Possible questions:

• To what extent are the objectives of the program still valid?

• Are the activities and outputs of the program consistent with the overall goal and the attainment of its objectives?

• Are the activities and outputs of the program consistent with the intended impacts and effects?

• Efficiency = a measure of the outputs – qualitative and quantitative – in relation to the inputs. It signifies that the aid uses the least costly resources in order to achieve the desired results. This generally requires comparing alternative approaches to achieving the same outputs, to see whether the most efficient process has been adopted.

• Possible questions:

• Were the activities cost-efficient?

• Were objectives achieved on time?

• What were the major factors influencing the achievement of the results?

• Effectiveness = a measure of the extent to which an aid intervention attains its objectives.

• Possible questions: To what extent were the objectives achieved / are likely to be achieved? What were the major factors influencing the achievement or non-achievement of the objectives?

• Impact = the positive and negative changes produced by an intervention, directly or indirectly, intended or unintended.

• Possible questions: What has happened as a result of the programme or project? What real difference has the activity made to the beneficiaries? How many people have been affected?

• Sustainability = concerned with measuring whether the benefits of an activity are likely to continue after donor funding has been withdrawn.

• Possible questions: To what extent did the benefits of a programme or project continue after donor funding ceased? What were the major factors which influenced the achievement or non-achievement of sustainability of the program or project?
