
PROBLEMS IN LANGUAGE TESTING

Constructing a good language test is not as easy and simple as one may imagine.

It must fulfill certain criteria and must not deviate from its aims. Some test writers develop tests of certain categories merely because those categories are easier to construct, while others are more difficult. For example, apart from reading, structure and vocabulary dominate English language tests at secondary schools merely because they are easier to test (Jabu, 2001). We can attempt to identify the problems that may appear in language testing. They are as follows.

A. Sampling

Language testing is designed to measure the actual competence and performance of the learners in the language they are learning. Heaton (1988) states that the longer the test, the more reliable and representative a measuring instrument it will be, although length, in itself, is no guarantee of a good test. The construction of short tests that function efficiently is often a difficult matter. The test must cover an adequate and representative section of those areas and skills it is desired to test. Before starting to construct any test items, the test writer should draw up a table of specifications showing aspects of the skills being tested and giving a comprehensive coverage of the specific language elements to be included.

Heaton (1988) proposes that a classroom test should be closely related to the ground covered in class teaching. There should be an association between the different areas covered in the test and the length of time spent on teaching those areas in class. There is a constant danger of concentrating too much on testing those areas and skills which most easily lend themselves to being tested. It may be helpful for teachers to draw up a rough inventory of the areas they wish to test, for example grammatical features or functions and notions, assigning to each one a percentage according to its importance. For example, a teacher wishing to construct a test of grammar might start by examining the relative weighting to be given to the various areas in the light of the teaching that has just taken place.
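
An inventory of this kind can be turned directly into a test blueprint. The sketch below is only an illustration: it takes the example weights used in this section (tenses 30%, prepositions 15%, comparison of adjectives 20%, clauses 30%, concord 5%) and spreads a test of 40 items across them. The 40-item length, the function name allocate_items, and the largest-remainder rounding are assumptions made for the example, not part of Heaton's proposal.

```python
def allocate_items(weights, total_items):
    """Split total_items across areas in proportion to percentage weights."""
    if sum(weights.values()) != 100:
        raise ValueError("weights must sum to 100 percent")
    # Exact (possibly fractional) share of items for each area.
    exact = {area: total_items * pct / 100 for area, pct in weights.items()}
    # Round down first, then hand out any leftover items to the areas
    # with the largest fractional parts (largest-remainder method).
    counts = {area: int(share) for area, share in exact.items()}
    leftover = total_items - sum(counts.values())
    by_remainder = sorted(
        exact, key=lambda area: exact[area] - counts[area], reverse=True
    )
    for area in by_remainder[:leftover]:
        counts[area] += 1
    return counts


# Example weights from this section; the test length of 40 is assumed.
weights = {
    "tenses": 30,
    "prepositions": 15,
    "comparison of adjectives": 20,
    "clauses": 30,
    "concord": 5,
}
blueprint = allocate_items(weights, total_items=40)
for area, count in blueprint.items():
    print(f"{area}: {count} items")
```

With these assumed figures the blueprint gives 12 items to tenses, 6 to prepositions, 8 to comparison of adjectives, 12 to clauses, and 2 to concord. A teacher would still need to judge whether two items are enough to say anything useful about concord.
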
Say, for example, the areas to be included are tenses (30%), prepositions (15%), comparison of adjectives (20%), clauses (30%), and concord (5%). It can be inferred that two main sampling problems are to be anticipated: obtaining representative samples that cover the content adequately, and keeping the test in correspondence with the teaching.

B. Teachers' Deviations (Sins) in Testing

Language testing is designed to measure the actual language competence and performance of the learners correctly and accurately. In practice, however, there are some deviations that teachers might commit in testing their students. Some teachers treat their tests as a punishment rather than a constructive tool. A good test should never be constructed in such a way as to trap the students into giving an incorrect answer. When techniques of error analysis are used, the setting of intentional traps or pitfalls for incautious students should be avoided. Heaton (1988) states that many testers are themselves caught out by constructing test items which succeed only in trapping the more able students.

Teachers often do not return the students' test papers. They might not correct them. If they do, they delay returning them, so that the feedback, when given, has become irrelevant. They do not provide satisfactory corrections and explanations of errors.

C. Criteria of Tests

Language testing is a form of measurement. Tests of language abilities may be inaccurate or unreliable in the sense that repeated measures may give different results. They may also be invalid in the sense that other abilities are mixed in.

1. Validity

The validity of a test is the extent to which it measures what it is supposed to measure and nothing else (Underhill, 1987; Madsen, 1983; Heaton, 1988). The test must aim to provide a true measure of the particular skill which it is intended to measure. If it measures external knowledge and other skills at the same time, it will not be a valid test. A frequent problem is that some teachers, without adequate knowledge of test construction, tend to develop their own tests regardless of their validity. Whether the tests measure what is intended is not taken into account.

2. Reliability

Reliability is a necessary characteristic of any good test. If the test is administered to the same learners on different occasions (with no language practice taking place between these occasions), then, to the extent that it produces similar results, it is considered reliable.
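
One common way to put a number on this kind of consistency is to correlate the scores from the two occasions. The sketch below is an illustration rather than a method prescribed by the sources cited here: it computes a Pearson correlation, and the scores and the name pearson_r are hypothetical.

```python
from math import sqrt


def pearson_r(first, second):
    """Pearson correlation between two equal-length lists of scores."""
    if len(first) != len(second) or len(first) < 2:
        raise ValueError("need two equal-length lists with at least two scores")
    n = len(first)
    mean_a = sum(first) / n
    mean_b = sum(second) / n
    # Sum of cross-products and sums of squares of deviations from the means.
    cov = sum((a - mean_a) * (b - mean_b) for a, b in zip(first, second))
    var_a = sum((a - mean_a) ** 2 for a in first)
    var_b = sum((b - mean_b) ** 2 for b in second)
    if var_a == 0 or var_b == 0:
        raise ValueError("scores must vary in both administrations")
    return cov / sqrt(var_a * var_b)


# Hypothetical scores of the same five learners on two occasions.
first_sitting = [55, 62, 70, 81, 90]
second_sitting = [58, 60, 73, 79, 92]
reliability = pearson_r(first_sitting, second_sitting)
print(round(reliability, 2))
```

A value close to 1 suggests that the test ranks the learners in much the same way on both occasions. A low value suggests that the scores depend heavily on something other than the ability being tested.
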
In short, in order to be reliable a test must be consistent in its measurements. Henning (1987) states that reliability is a measure of the accuracy, consistency, dependability, or fairness of scores resulting from the administration of a particular examination.

3. Discrimination

An important feature of a good test is its capacity to discriminate among the different learners and to reflect the differences in the performances of the individuals in the group. To find out whether a test has discriminating power, the test is first tried out on a representative sample of students. The results of the test are then examined to determine the extent to which it discriminates between individuals. The test should be constructed so as to discriminate as much as possible. Heaton (1988) suggests that the items in a test should be spread over a wide range of difficulty, from extremely easy items through to extremely difficult ones.

4. Administration

A test must be practicable. In other words, it must be fairly straightforward to administer. The length of time available for the administration of the test is frequently misjudged, especially if the complete test consists of a number of sub-tests. In such cases, sufficient time may not be allowed for the administration of the test, the collection of the answer sheets, the reading of the test instructions, etc. The time to be allowed should be decided on as a result of a pilot administration, or tryout, of the test.

To come to the point, the criteria discussed above are frequently neglected by teachers when constructing their own tests, because the teachers might not be equipped with sufficient knowledge of how to construct good tests. They also might not have sufficient time or proper payment to do so.

D. Test Techniques

When a communicative approach to language learning has been applied in the learning process, the test techniques should be based on the types of language task practiced in the learning program. More specifically, the decision to choose a certain test technique to assess a certain language skill or element should be based on the learning activities that have taken place. The important point is that assumptions about the value of particular types of acquisition activities must be defined relative to the testing context (Bransford, 1979). In other words, if the process of acquisition is recognition or identification, the test should not require evaluation or analysis. It may be a multiple-choice, true-false, or completion test, but not an essay test.
The reason is that recognition or identification requires a simpler level of cognitive processing than evaluation or analysis.

E. Directions

Most students taking any test are working under certain mental pressures. It is, therefore, essential that all instructions or rubrics are clearly written and that examples are given. It is sometimes difficult to avoid writing clumsy rubrics or rubrics consisting of complex sentences above the difficulty level being tested. Such rubrics should be rewritten clearly and concisely in short sentences.

Sometimes students do poorly on a test not because they do not know the answers to the test items, but because they do not understand the rubrics well. A rubric should never, in itself, become a test of reading comprehension. Each word should be carefully considered. For example, the word best is used in certain instances instead of correct if the test items contain several correct answers.

F. Scoring System

Since the marking or scoring system is a vital part of a test, it must be integrated into the whole process of test design from the beginning. Constructing the test items is only half of the job. Scoring the test is equally challenging. Scoring objective tests is quite simple: each correct response is awarded a certain score. Subjective tests, such as tests of writing or speaking, are rather more difficult to score. Madsen (1983) proposes two approaches to scoring: discrete scoring, which assesses nearly every utterance or response the learner makes, and holistic scoring, which evaluates the entire body of a student's speech simultaneously.

G. System of Management (Policy)

Testing is highly influenced by the system of management or policy that is applied in a certain area or at schools. In the administration of tests at schools, the principal may require the teachers to score the students' tests at not less than a certain level, although the students cannot actually reach that level. The value of scores awarded by the same teacher to students of different schools may therefore differ.

H. Conclusion

Language testing is designed to measure the language competence and performance of the learners correctly and accurately. Some possible problems can influence the construction and administration of a test. A teacher or test constructor may find difficulties in determining the extent to which the test items are considered representative. Teachers might commit some deviations in testing their students. Some teachers treat their tests as a punishment, trap students into giving incorrect answers, and do not return their test papers.
Many tests administered to the students are not tried out beforehand. Therefore, they do not fulfill the test criteria: validity, reliability, discrimination, and administration. Another problem in test construction is that it is difficult to decide which test technique matches the teaching and learning techniques that have been used. The test directions or rubrics should be brief and accompanied by examples.
