
Principles of Language Assessment

Group I Citra Puji Rahmanie Larissa Huda Lukman Fitrah Renita Maria Pane

Practicality
An effective test is PRACTICAL. This means that:
- It is not excessively expensive (an expensive test is impractical)
- It stays within appropriate time constraints (a language proficiency test that takes a student five hours to complete is impractical)
- It is relatively easy to administer
- It has a scoring/evaluation procedure that is specific and time-efficient

Reliability
A reliable test is consistent and dependable. The issue of the reliability of a test may best be addressed by considering a number of factors that can contribute to its unreliability. Consider the following possibilities (adapted from Mousavi, 2002, p. 804): fluctuations in the student, in scoring, in test administration, and in the test itself.

Student-Related Reliability
The most common learner-related issues in reliability are caused by temporary illness, fatigue, a "bad day," anxiety, and other physical or psychological factors, which may make an observed score deviate from one's true score. Also included in this category are such factors as a test-taker's "test-wiseness," or strategies for efficient test taking (Mousavi, 2002, p. 804).

Rater Reliability
Inter-rater unreliability occurs when two or more scorers yield inconsistent scores on the same test, possibly because of lack of attention to scoring criteria, inexperience, inattention, or even preconceived biases. Rater reliability issues are not limited to contexts where two or more scorers are involved. Intra-rater unreliability is a common occurrence for classroom teachers because of unclear scoring criteria, fatigue, bias toward particular "good" and "bad" students, or simple carelessness. One solution to such intra-rater unreliability is to read through about half of the tests before rendering any final scores or grades, then to recycle back through the whole set of tests to ensure even-handed judgment. In tests of writing skills, rater reliability is particularly hard to achieve, since writing proficiency involves numerous traits that are difficult to define.

Test Administration Reliability
Unreliability may also result from the conditions in which the test is administered. Sources of such unreliability include photocopying variations, the amount of light in different parts of the room, variations in temperature, and even the condition of desks and chairs.

Test Reliability
Sometimes the nature of the test itself can cause measurement errors. Timed tests may discriminate against students who do not perform well under a time limit. Poorly written test items may be a further source of test unreliability.

Validity
Gronlund (1998) defines validity as "the extent to which inferences made from assessment results are appropriate, meaningful, and useful in terms of the purpose of the assessment."

Five Types of Evidence


- Content-Related Evidence
- Criterion-Related Evidence
- Construct-Related Evidence
- Consequential Validity
- Face Validity

Content-Related Evidence
A test that actually samples the subject matter about which conclusions are to be drawn, and that requires the test-taker to perform the behavior being measured, can claim content-related evidence of validity (Mousavi, 2002; Hughes, 2003). Another way of understanding content validity is to consider the difference between direct and indirect testing.

Criterion-Related Evidence
Criterion-related evidence usually falls into one of two categories:
- Concurrent validity: an assessment has concurrent validity if its results are supported by other concurrent performance beyond the assessment itself.
- Predictive validity: the predictive validity of an assessment becomes important in the case of placement tests, admission assessment batteries, language aptitude tests, and the like. The assessment criterion in such cases is to assess (and predict) a test-taker's likelihood of future success.

Construct-Related Evidence
A construct is any theory, hypothesis, or model that attempts to explain observed phenomena in our universe of perception. Constructs may or may not be directly or empirically measurable. Proficiency and communicative competence are linguistic constructs; self-esteem and motivation are psychological constructs. Construct validity is a major issue in validating large-scale standardized tests of proficiency, because such tests may not be able to contain all the content of a particular field or skill.

Consequential Validity
Consequential validity encompasses all the consequences of a test, including such considerations as its accuracy in measuring intended criteria, its impact on the preparation of test-takers, its effect on the learner, and the (intended and unintended) social consequences of a test's interpretation and use.

Face Validity
Face validity refers to the degree to which a test "looks right" and appears to measure the knowledge or ability it claims to measure; it means that students view the assessment as fair, relevant, and useful for improving learning. Face validity will likely be high if learners encounter:
- A well-constructed, expected format with familiar tasks
- A test that is clearly doable within the allotted time limit
- Items that are clear and uncomplicated
- Directions that are crystal clear
- Tasks that relate to their course work
- A difficulty level that presents a reasonable challenge
