English, are instructed as follows: when assessing pronunciation, examiners should try to put themselves in the position of a non-EFL specialist, native speaker of English and assess the amount of strain on the listener and the degree of patience and effort required to understand the candidate. This procedure raises the following doubts:

1. A professional teacher of English cannot be required to pretend to be a non-EFL specialist who, in addition, is a native speaker of English; not everyone has a talent for pretending to be a completely different person (what if the examiner fails?).

2. It is not clear what kind of native speaker the examiner is supposed to impersonate: a well-travelled university professor, familiar with many nonnative varieties of English, or a small-town housewife who has never left her birthplace?

3. A nonnative teacher can in most cases understand even the very bad English of his or her fellow-countrymen because of frequent exposure to it. Such a teacher is, therefore, in no position to judge its intelligibility to users of English of other nationalities.

4. Having no precise criteria of pronunciation assessment, the examiner is likely to adopt his or her own subjective principles of evaluation (see section 3). This often happens in spite of standardization procedures and examiner training.

We can conclude that the examinations under analysis do not provide clear-cut criteria for assessing the examinees' pronunciation: they rely too heavily on very imprecise impressionistic judgements and make unreasonable demands on nonnative examiners. This, in turn, seriously undermines their inter-rater reliability.

3. Holistic versus atomistic pronunciation testing

As shown in the preceding section, Cambridge English Examinations, similarly to many other language tests, employ rather objectionable impressionistic evaluation. It is therefore crucial to examine its logical alternative, i.e. analytic testing.
In this section these two approaches to pronunciation assessment are compared and verified.

In the holistic approach to language testing (Alderson et al. 1996:289), examiners are asked not to pay too much attention to any one aspect of a candidate's performance, but rather to judge its overall effectiveness. The greatest advantage of this procedure is that it can be administered to large groups of learners within a short period of time. Moreover, according to Underhill (1987:101), impression marking is used for the kind of categories that are very hard to define but everybody agrees are important: fluency, ability to communicate, style, naturalness of speech, and so on. For these reasons it is advocated by many researchers (e.g. Celce-Murcia et al. 1996, Hughes 1991, Koren 1995).

Nevertheless, global pronunciation testing has many drawbacks. It is often too general and imprecise since the assessment criteria in the rating scales, as has been shown in section 2, tend to be vague. In consequence, different raters may adopt their own criteria of evaluation. Finally, as pointed out by Underhill (1987:101), making accurate impression-based assessments requires a lot of experience. (...) Even experienced assessors find it difficult to make consistent impression-based judgements. In other words, this procedure raises problems of both intra-rater and inter-rater reliability.

Analytic evaluation consists in establishing a detailed marking scheme in which specific aspects of the learner's performance are evaluated separately. Subsequently
these different ratings are combined to provide an overall mark. An atomistic approach to pronunciation testing thus involves judgements on the correctness of the learner's production of particular vowels, consonants, stress, rhythm, intonation, etc. This method of pronunciation testing is claimed to be more objective than the holistic approach as it provides a more detailed diagnosis of the learner's problems and achievements. It is generally preferred by pronunciation specialists and phoneticians (e.g. Vaughan-Rees 1989).

On the other hand, the atomistic procedure is not without its problems. It is extremely time-consuming and requires recording the learners' speech samples, which the raters must then listen to several times. For these reasons this approach seems unsuitable for large classes and examinations with many participants.

According to Hughes (1991), the choice between holistic and analytic scoring depends to some extent on the purpose of testing; atomistic tests are more reliable for diagnostic purposes in the language classroom and in situations in which scoring is carried out in many places by different judges, while holistic evaluation, which is faster, is more appropriate for experienced scorers who are thoroughly familiar with the grading system.

In order to compare the two approaches, we carried out an experiment whose primary goal was to examine whether the holistic and atomistic procedures of pronunciation testing are equivalent and bring about the same results. In the experiment reported here 10 judges, all teachers of English, evaluated the pronunciation of 10 randomly selected intermediate Polish learners, secondary school pupils, who were asked to read aloud a short passage, which was recorded. The raters were first asked to evaluate the pupils' recorded pronunciation holistically, using the ordinary scale of Polish school marks of 1, 2, 2.5, 3, 3.5, 4, 4.5, 5 and 6, where 1 = failure and 6 = excellent.
After a break of two weeks the same group of raters assessed the recordings once again. On this occasion they were given the following 6 criteria to be employed in the evaluation: pronunciation of individual words, vowel quality (the /i/ - /i:/ distinction in particular), the interdental fricatives, the -ing suffix, word stress and other phonetic features. Each of these aspects was rated individually using the same scoring scale as before. Subsequently, the means were calculated. Finally, the assessors were asked to comment on the strengths and weaknesses of both approaches.

The questionnaires revealed that in making the holistic evaluation the raters in fact adopted various analytic criteria (such as the pronunciation of silent letters, intonation, pauses, devoicing of final obstruents, etc.), which differed from person to person. Moreover, 90% of the assessors regarded atomistic testing as more reliable and objective.

The table below contains the results of the experiment. We provide the averaged atomistic and holistic marks given by the raters.
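The averaging step of the analytic procedure described above can be sketched as follows. This is only an illustration: the criterion labels paraphrase the six aspects listed in the text, the ratings are invented, and the function name `atomistic_mark` is hypothetical. The sketch assumes an unweighted mean over the six criteria for each rater, followed by a mean over raters; the text does not say whether any weighting was applied.

```python
from statistics import mean

# Polish school marks used in the experiment (1 = failure, 6 = excellent).
ALLOWED_MARKS = {1, 2, 2.5, 3, 3.5, 4, 4.5, 5, 6}

# Labels paraphrasing the six criteria given to the raters.
CRITERIA = [
    "individual words",
    "vowel quality",
    "interdental fricatives",
    "-ing suffix",
    "word stress",
    "other phonetic features",
]


def atomistic_mark(ratings):
    """Return the unweighted mean of one rater's six criterion marks."""
    missing = [c for c in CRITERIA if c not in ratings]
    if missing:
        raise ValueError(f"missing criteria: {missing}")
    for criterion in CRITERIA:
        if ratings[criterion] not in ALLOWED_MARKS:
            raise ValueError(f"mark not on the scale for {criterion!r}")
    return mean(ratings[c] for c in CRITERIA)


# Invented ratings from two hypothetical raters for a single learner.
rater_a = dict(zip(CRITERIA, [3, 2.5, 2, 3, 4, 3.5]))
rater_b = dict(zip(CRITERIA, [3.5, 3, 2.5, 3, 4, 3]))

# The learner's atomistic mark is the mean of the raters' individual means.
learner_mark = mean([atomistic_mark(rater_a), atomistic_mark(rater_b)])
print(round(learner_mark, 2))
```

A weighted scheme, in which for example word stress counts for more than the -ing suffix, would only require replacing the call to `mean` inside `atomistic_mark`.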
Learner    Holistic assessment    Atomistic assessment
1          3.7                    3.0
2          3.0                    2.5
3          4.0                    3.2
4          3.2                    3.1
5          4.0                    3.3
6          4.4                    4.1
7          4.1                    3.6
8          3.0                    3.0
9          2.5                    2.8
10         3.4                    3.1
Mean       3.53                   3.17
Table 1. Results of holistic and atomistic assessment

As can clearly be observed, in 8 cases out of 10 the mean atomistic marks are lower than the holistic marks. In one case the results are reversed and in one they are the same. The obtained means are 3.53 in the holistic evaluation and 3.17 in the analytic procedure.

To verify the obtained results, another experiment, a replica of the previous one, was conducted with a different group of 5 raters and 5 other learners. This time the mean scores were 3.56 in the holistic and 3.04 in the analytic assessment.

Thus, a conclusion can be drawn that the holistic and atomistic approaches to pronunciation testing are not equivalent; the former usually results in higher scores than the latter. This means that raters generally tend to be more lenient in their overall impressions than in judgements made on the basis of more specific criteria. An explanation of this phenomenon can be sought in the likely assumption that atomistic testing focuses on error finding more than the holistic procedure does; in the latter the criterion of intelligibility is employed, which allows for a more tolerant approach to phonetic inaccuracies.

4. Final remarks

Pronunciation is extremely difficult to test in an objective and reliable fashion. We have demonstrated that Cambridge English Examinations, just like other similar tests, are based entirely on impressionistic evaluation and raise many objections with regard to their reliability. We have considered an alternative procedure of analytic evaluation and demonstrated that the two methods are not exactly equivalent, holistic evaluation being more lenient and permissive than analytic evaluation. The atomistic approach can be regarded as more objective and reliable, and is particularly well-suited for diagnostic purposes as it allows the teacher to identify specific pronunciation problems of the learners to be dealt with in the course of subsequent instruction.
It is, however, time-consuming and not easy to execute with large groups of learners or examinees. Holistic testing, on the other hand, is technically simpler to carry out. It is invaluable in assessing the overall impression, the intelligibility of the learner's speech and other aspects of his or her pronunciation which cannot be easily expressed by means of definite, clear-cut criteria. Its reliability, however, is questionable. Apparently, neither of these two methods can be viewed as fulfilling all the necessary requirements of objectivity, reliability and practicality.
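As a rough check on the comparison reported in section 3, the learner-level marks in Table 1 can be re-tabulated with a short script. The sketch below uses only the ten pairs of marks given in the table, in the order in which they appear there; the variable names are illustrative and not part of the original study.

```python
from statistics import mean

# (holistic, atomistic) mean marks for the ten learners, in the order of Table 1.
TABLE_1 = [
    (3.7, 3.0), (3.0, 2.5), (4.0, 3.2), (3.2, 3.1), (4.0, 3.3),
    (4.4, 4.1), (4.1, 3.6), (3.0, 3.0), (2.5, 2.8), (3.4, 3.1),
]

holistic = [h for h, _ in TABLE_1]
atomistic = [a for _, a in TABLE_1]

# Count the direction of the difference for each learner.
atomistic_lower = sum(a < h for h, a in TABLE_1)
atomistic_higher = sum(a > h for h, a in TABLE_1)
equal = sum(a == h for h, a in TABLE_1)

print(f"mean holistic:  {mean(holistic):.2f}")
print(f"mean atomistic: {mean(atomistic):.2f}")
print(f"atomistic lower / higher / equal: "
      f"{atomistic_lower} / {atomistic_higher} / {equal}")
```

The script reproduces the two overall means and the 8 / 1 / 1 split discussed in section 3. It says nothing about whether the difference would hold for other raters or learners.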
References

Alderson, J. C., Wall, D. & C. Clapham. (1996). Language Test Construction and Evaluation. Cambridge: Cambridge University Press.
Celce-Murcia, M., Brinton, D. & J. Goodwin. (1996). Teaching Pronunciation: a Reference for Teachers of English to Speakers of Other Languages. Cambridge: Cambridge University Press.
Heaton, J. B. (1988). Writing English Language Tests. London: Longman.
Hughes, A. (1991). Testing for Language Teachers. Cambridge: Cambridge University Press.
Koren, S. (1995). Foreign language pronunciation testing: a new approach. System 23 (3). 387-400.
Szpyra-Kozłowska, J. (2003). Miejsce i rola fonetyki w międzynarodowych egzaminach Cambridge, TOEFL i TSE. Zeszyty Naukowe PWSZ w Płocku. Neofilologia. Tom V. 181-191.
Underhill, N. (1987). Testing Spoken Language. A handbook of oral testing techniques. Cambridge: Cambridge University Press.
Vaughan-Rees, M. (1989). The testing of pronunciation receptive skills. Speak Out! 4. p. 8.