High-Stakes Testing Washback

High-Stakes Testing Washback: A Survey on the Effect of Iranian MA Entrance Examination on Teaching
Mojtaba Mohammadi
Islamic Azad University - Roudehen Branch

High-stakes tests are effective instruments of change. They affect the participants in an educational system as well as its processes and products. The MA Entrance Examination in Iran is a case in point. It is designed primarily to screen candidates for postgraduate studies. Nevertheless, its effects in the classroom, generally known as washback in applied linguistics, often go beyond what its designers expect. This paper reports a survey of the washback effect of the MA Entrance Examination on teachers' methodology and attitudes. Forty-five subjects, all of them university professors, were selected through convenience sampling. A validated researcher-made questionnaire was then administered. To obtain more reliable data, some respondents were randomly selected for interview in order to cross-check the data collected through the questionnaire. The data analysis revealed that the majority of the subjects were positively affected by the examination. Moreover, they were fully aware that their methodology and attitudes had gradually been adjusted to the demands of the examination.
Keywords: Washback, High-stakes tests, MA Entrance Examination, Teachers
Introduction
It has been widely acknowledged that tests, especially high-stakes ones such as school-leaving examinations, employment exams, or university-entrance exams, can directly or indirectly influence educational systems. The reason probably lies in the fact that they usually serve a set of decisive functions in testees' lives, ranging from employment and promotion to placement and achievement. Brown (1996) distinguishes two categories of tests: those that help administrators and teachers make program-level decisions, on the one hand, and those that help them make classroom-level decisions, on the other. The results of tests, which Wall (2000) called differentiating rituals, can sometimes be so crucial to the testees' future lives that testees take any possible measure to pass them. The story is much the same for the other human elements (e.g., teachers and administrators) and non-human elements (e.g., materials and curriculum) in an educational system. Each of these elements is also expected to adopt and adapt certain skills, techniques, and tasks in order to meet the test demands and satisfy the students' needs. With these issues in mind, I conducted this survey to examine the effect of the MA Entrance Examination, held annually in Iran to screen postgraduate applicants, on the methodology of university professors in their undergraduate courses and on their attitudes toward the probable effect of the examination on them.
Washback and Related Concepts
A relatively new topic in general education is the phenomenon called backwash, a term used to refer to the influence of tests on teaching. Within language assessment, however, the term washback has recently been preferred (Alderson and Wall 1993; Messick 1996; Cheng 1997; Alderson 2004). Shohamy (1992) also focuses on washback in terms of language learners as test-takers when she describes "the utilization of external language tests to affect and drive foreign language learning in the school context" (p. 513; as cited in Bailey, 1996, p. 3). Some scholars, like Cheng (1997), see washback as a change in curriculum and use the term to indicate an active direction and function of intended curriculum change by means of the change of public examinations (p. 38). As part of consequential validity, Messick (1996: 261) says:
"Washback refers to the extent to which the introduction and the use of a test influences language teachers and learners to do things that they would not otherwise do that promote or inhibit language learning." (Fulcher and Davidson, 2007: 221)
The focus of any washback study, as Fulcher and Davidson (2007) claim, is on those things that we do in the classroom because of the test, but would not otherwise do. A number of other key terms have also emerged in the literature which, as Cheng (2005) claims, seem to convey a similar meaning and are equated with washback (Green, 2007). Shohamy (1993a, p. 4) summarized some of these key concepts:
1. Measurement driven instruction refers to the notion that tests should drive learning.
2. Curriculum alignment focuses on the connection between testing and the teaching syllabus.
3. Systemic validity implies the integration of tests into the educational system and the need to demonstrate that the introduction of a new test can improve learning. (Bailey, 1999)
Morrow (1986) coined the term washback validity and defined a valid test as follows: a test is valid when it has good washback and, conversely, invalid when it has negative washback (Alderson and Wall, 1993).
More recently, Bachman and Palmer (1996, pp. 29-35) have discussed the impact of a test as distinguished from its washback. The impact of test use, in their view, operates at two levels: the micro level (i.e., the effect of the test on individual students and teachers) and the macro level (i.e., the impact on society and its educational systems). Many scholars place these two concepts within a theoretical notion first introduced by Messick. According to this notion, called consequential validity, the social consequences of testing are part of a broader, unified concept of test validity (Messick, 1989, 1996). Arguing against Morrow's washback validity, Messick suggests that tests which satisfy validity criteria are more likely to have a positive influence on teaching and learning, and so counsels that washback is not a sign of test validity, but that a valid test is likely to generate positive washback (Green, 2007).
Aspects of Washback
Wall (2000) considered tests as having both positive (beneficial) and negative (harmful) effects. The positive effects she listed included inducing teachers to cover their subjects thoroughly, forcing them to complete their syllabuses within a prescribed time limit, compelling them to pay as much attention to weak pupils as to strong ones, and making them familiar with the standards which other teachers and schools were able to achieve. Quoting Wiseman (1961), Wall mentioned the possible negative effects of tests as encouraging teachers to 'watch the examiner's foibles and to note his idiosyncrasies' in order to prepare pupils for questions that were likely to appear, limiting the teachers' freedom to teach subjects in their own way, encouraging them to do the work that the pupils should be doing, tempting them to overvalue the type of skills that led to successful examination performance, and convincing them to pay attention to the 'purely examinable side' of their professional work and to neglect the side which would not be tested.
To reduce the complexity of these concepts, Alderson and Wall (1993) explicitly stated 15 washback hypotheses drawn from their reading of the literature and their own experience. The factors which a test may influence are teaching, learning, content, rate, sequence, degree, depth, and attitudes, as well as the number of teachers or learners affected. Hughes (1993) suggested a trichotomy model for washback, considering participants, process, and products as components of washback. In Hughes' framework, participants include language learners and teachers, administrators, materials developers, and publishers, "all of whose perceptions and attitudes toward their work may be affected by a test". The term process covers any actions taken by the participants which may contribute to the process of learning. According to Hughes, such processes include materials development, syllabus design, changes in teaching methods or content, learning and/or test-taking strategies, etc. Finally, in Hughes' framework, product refers to "what is learned (facts, skills, etc.) and the quality of learning (fluency, etc.)" (as cited in Bailey, 1999).
Watanabe (2004) conceptualized washback in terms of its dimensions (specificity, intensity, length, intentionality, and value of the washback), the aspects of learning and teaching that may be influenced by the examination, and the factors mediating the process by which washback is generated (test factors, prestige factors, personal factors, and macro-context factors). Andrews et al. (2002) found in their study that the impact of a test can be immediate or delayed. According to these researchers, washback seems to be associated primarily with high-stakes tests, that is, tests used for making important decisions that affect different sectors. That is why this paper deals with the influence of the MA Entrance Examination as a high-stakes test on the aspects mentioned in Alderson and Wall's washback hypotheses. Before turning to the study itself, I would like to touch upon some of the empirical studies on what this paper is concerned with, that is, teachers' methodology, attitudes, and content.
Washback and Language Teachers
Among groups such as counselors, administrators, course designers, and materials developers, the most visible participants in washback studies, according to Hughes' framework, are language teachers. A number of studies of teachers in washback research have enriched the literature. Cheng's (1997) report on the revised Hong Kong Certificate of Education Examination (HKCEE) found that "84% of the teachers commented that they would change their teaching methodology as a result of the introduction of the revised HKCEE" (p. 45). Focusing on methodology washback, Lam (1994) concluded that experienced teachers were much more examination-oriented than their younger counterparts (p. 91) and identified changing the teaching culture as the challenge (as cited in Bailey, 1999, p. 20). A landmark study in the investigation of washback is undoubtedly Wall and Alderson's (1993) study in Sri Lanka, which ended with the following summary statements (p. 67):
1. A considerable number of teachers do not understand the philosophy/approach of the textbook. Many have not received adequate training and do not find that the Teacher's Guides on their own give enough guidance.
2. Many teachers are unable, or feel unable, to implement the recommended methodology. They either lack the skills or feel factors in their teaching situation prevent them from teaching the way they understood they should.
3. Many teachers are not aware of the nature of the exam - what is really being tested. They may never have received the official exam support documents or attended training sessions that would explain the skills students need to succeed at various exam tasks.
4. All teachers seem willing to go along with the demands of the exam (if only they knew what they were).
5. Many teachers are unable, or feel unable, to prepare their students for everything that might appear on the exam.
In her report revisiting the Sri Lankan impact study, Wall (1996) stated that the examination had had considerable impact on the content of English lessons and on the way teachers designed their classroom tests, but little to no impact on the methodology they used in the classroom or on the way they marked their pupils' test performance. In another study, Shohamy et al. (1996) observed that a new test of Arabic as a second language (ASL) made class activities more test-like and made teachers and students highly motivated to master the materials (p. 301). Experienced teachers, they added, turned to the test as their main source of guidance for teaching oral language, while novice teachers used a variety of additional activities to do so (as cited in Bailey, 1999). In the context of Japan, Watanabe (1996) found that the entrance exam did not influence all teachers in the same way and proposed three factors that can mediate washback on teachers: 1) teachers' educational background and/or experiences, 2) differences in teachers' beliefs about effective teaching methods, and 3) the timing of the researcher's observations. Watanabe concluded that "teacher factors may outweigh the influence of an examination" (ibid., p. 331) in terms of how exam preparation courses are actually taught. Chen (2002) also investigated the effects of public exams on teachers and concluded by enumerating the factors that can influence the degree of washback on teachers: teaching experience; teachers' education; teachers' fear of, or embarrassment at, their students' poor performance; teachers' awareness of test content; level of stakes; and gender.
Methodology
The participants of this study were 45 Iranian university professors who were teaching English to undergraduate students. All held either an MA or a PhD in Teaching English as a Foreign Language (TEFL) or English Language and Literature (ELL). The population comprised all Islamic Azad University (IAU) branches in zones 8 and 12. The subjects, 26 male and 19 female teachers, were selected through convenience sampling. To determine the extent to which the subjects (their methodology, teaching content, and attitudes toward teaching and testing) were influenced by the national MA Entrance Examination, a researcher-made questionnaire was administered. Of the 67 people who received a copy of the questionnaire, 45 returned it completed. The questionnaire consisted of two sections. The first comprised 20 statements with a Likert-scale response format ranging from very much (given a weight of 5) to not at all (given a weight of 1). The second was a brief set of mostly selection-type items on the subjects' demographics, including their gender, major, teaching experience, etc. To check the reliability of the Washback Effect Questionnaire, a pilot study was conducted with 20 English teachers. The test-retest method was applied with a three-week interval between the two administrations, and the Pearson product-moment correlation coefficient between them was calculated as 0.85. To obtain more reliable and valid results, I triangulated the data by using a structured interview to cross-check the data collected through the questionnaire. Thus, after an interval of three weeks, long enough for respondents not to remember their responses to the questionnaire items, 15 of the teachers who had returned their questionnaires were interviewed. The questions were mostly reworded forms of the questionnaire statements. The responses were tape-recorded for later detailed analysis.
Data Analysis
After careful analysis of the answers given to the questionnaire items, I arrived at some interesting results. First, to characterize the sample, some preliminary statistics are presented. The subjects were 45 English language professors majoring in Teaching English as a Foreign Language (TEFL) (69%) or English Language and Literature (ELL) (31%). They were 57% male and 43% female. Their age ranges are summarized in Table 1:
Age range       N     Percentage
26-35           14    31.1
36-45           19    42.2
46 and above    12    26.6
Table 1: Number and percentage of the subjects' age ranges
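The percentages in Table 1 follow from the counts and the sample size of 45. The arithmetic can be sketched as below; the names `age_counts` and `percentages` are illustrative and do not come from the study.

```python
# Counts per age range as reported in Table 1 (N = 45).
age_counts = {"26-35": 14, "36-45": 19, "46 and above": 12}


def percentages(counts):
    """Return each category's share of the total, in percent, rounded to one decimal."""
    total = sum(counts.values())
    return {label: round(100 * n / total, 1) for label, n in counts.items()}


age_percentages = percentages(age_counts)
for label, pct in age_percentages.items():
    print(f"{label}: {pct}%")
```

Conventional rounding gives 26.7 for the oldest group, whereas the table reports 26.6, which suggests the reported figures may have been truncated and not rounded.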
Regarding their experience as language teachers, they are categorized as in Table 2:

Teaching experience (years)    N     Percentage
1-3                            1     2.2
4-6                            6     13.3
7-10                           15    33.3
Over 10                        23    51
Table 2: Number and percentage of the subjects' teaching experience
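The Methodology section reports test-retest reliability as a Pearson product-moment correlation of 0.85 between two administrations of the questionnaire. A minimal sketch of that calculation follows. The pilot data are not reported in the paper, so the scores below are invented purely for illustration, and the function name `pearson_r` is an assumption.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length score lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length lists with at least two scores")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sum of cross-products of deviations, and sums of squared deviations.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


# Hypothetical total scores (20 items, each weighted 1-5) for five pilot teachers.
# These numbers are invented; they are NOT the study's pilot data.
first_administration = [62, 71, 55, 80, 67]
second_administration = [60, 74, 58, 78, 70]

r = pearson_r(first_administration, second_administration)
print(f"test-retest r = {r:.2f}")
```

A coefficient close to 1 indicates that respondents ranked similarly on both administrations, which is the sense in which the reported index of 0.85 is taken as evidence of reliability.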
Moreover, 76% of the subjects reported that they usually check the MA exam items every year, while the rest (24%) answered that they never or hardly ever do so. The second part of the analysis addresses the two areas this paper set out to investigate: teaching methods and testing methods.
1. Teaching Methods
Items 7, 9, 10, 14, and 16 deal with teaching method in one way or another. The following figures show the responses to these items.
Item No. 7: I use MA Examination items, as examples, while teaching in my classes.
Figure 1: Response Percentage for the options in item 7 (options: not at all, not really, sometimes, quite a lot, very much)
The figure shows that just 28% of the subjects use MA Exam items.
Item No. 9: If I were supposed to teach in an MA preparation course, I would use the same methods and techniques I am using now.
Figure 2: Response Percentage for the options in item 9
Item No. 10: I teach the contents according to their sequence of importance in MA examination.
Figure 3: Response Percentage for the options in item 10
Item No. 14: I think my teaching method is helping students to get ready for both final exam and MA exam.
Figure 4: Response Percentage for the options in item 14
Item No. 16: I teach the students the tips and tricks to answer the MA exam items.
Figure 5: Response Percentage for the options in item 16
2. Testing Methods
Items 6, 12, and 13 are all related to the second area, which is testing method.
Item No. 6: In my class, I explain about the content or type of MA exam's items.
Figure 6: Response Percentage for the options in item 6
Item No. 12: I use MA exam items in my mid-term or final exams.
Figure 7: Response Percentage for the options in item 12
Item No. 13: My final exams items are essay-type.
Figure 8: Response Percentage for the options in item 13
Moreover, comparing these responses with other factors such as experience, gender, and major yielded some interesting results. The most notable one concerns the percentage of subjects in each teaching-experience category who answered 4 (Quite a lot) or 5 (Very much) to the above items:
Teaching experience (years)    Percentage
4-6                            10
6-9                            45.3
Over 10                        50.4
Table 3: Percentage of the subjects who selected 4 or 5 in the teaching method category, by teaching experience
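The kind of tabulation behind Table 3 can be sketched as follows. The paper does not state how the responses were aggregated, so this sketch assumes that responses are pooled across the teaching-method items within each experience group. The records and the function name `share_high` are hypothetical and are not the study's data.

```python
from collections import defaultdict

# Hypothetical records: (teaching-experience group, responses to items 7, 9, 10, 14, 16).
# Each response uses the questionnaire weights 1 (not at all) to 5 (very much).
responses = [
    ("4-6", [2, 3, 1, 4, 2]),
    ("4-6", [1, 2, 2, 3, 1]),
    ("Over 10", [4, 5, 3, 4, 4]),
    ("Over 10", [5, 4, 4, 2, 3]),
]


def share_high(records, threshold=4):
    """Percentage of pooled responses at or above `threshold`, per group."""
    high = defaultdict(int)
    total = defaultdict(int)
    for group, answers in records:
        high[group] += sum(1 for a in answers if a >= threshold)
        total[group] += len(answers)
    return {g: round(100 * high[g] / total[g], 1) for g in total}


shares = share_high(responses)
for group, pct in shares.items():
    print(f"{group}: {pct}% chose 4 or 5")
```

An alternative reading, counting a respondent only if he or she chose 4 or 5 on every item, would yield lower percentages; which of the two the original analysis used is not stated.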
Taking gender into consideration, the percentage of those who selected choices 4 and 5 is as follows:
Gender    Percentage
Male      40
Female    38.3
When the major of the subjects was considered, there was no significant difference between those majoring in TEFL and those majoring in ELL:
Major    Percentage
TEFL     36.2
ELL      38.3
Conclusions and Implications
In the teaching method category, while Figure 1 shows the subjects' unwillingness to use MA Exam items as examples, Figure 2 indicates that more than half of them tend to teach in a way appropriate for the Exam. Moreover, the majority (60%) thought their teaching method can help prepare learners for the Exam. Nevertheless, as seen in Figure 5, the teachers teach the tips and tricks of the Exam only sometimes or not really. In the second category, where the effect of the MA Entrance Exam on teachers' testing method is studied, the figures show that no convincing majority of the subjects explain the tips and tricks of the Exam (Fig. 6). Figure 7 illustrates that more than half of them avoid using exact MA items in their mid-term or final exams. As depicted in Figure 8, the selection of item type for their exams seems to be affected by the form of the MA exam, which is not essay-type. This, however, may have other reasons, such as ease of scoring.
As the above charts and tables indicate, the findings of the survey suggest that teachers are positively affected by the Iranian MA Entrance Examination as a high-stakes test. The impact of this exam on teaching methods is positive in that teachers teach in a way that helps students get ready for the exam, use the same methods and techniques that are appropriate for the exam, and teach the contents according to the sequence of their importance in the exam. At the same time, they do not reduce the class to a mere introduction to the types of the target items, and they do not spend class time teaching tips and tricks, which would turn the class into an exam-oriented one. In addition, regarding the testing method of the teachers in this study, it is quite clear that, apart from their disinclination to use essay-type items, which may have other reasons, practices such as explaining MA exam items or using those items in class tests are to a great extent uncommon. This survey also supports the studies by Lam (1994) and Shohamy et al. (1996) in finding that experienced teachers were much more examination-oriented than their younger counterparts. Nevertheless, the washback effect did not differ noticeably by teachers' gender or field of study.
The pedagogical implication of the present survey is that teachers' awareness of the Iranian MA Entrance Examination can to a great extent influence how well they manage the class period, apply the right techniques to teach the content with an eye to the MA exam, and design their classroom tests.
References
Alderson, J. C. & Wall, D. (1993). Does washback exist? Applied Linguistics, 14(2), 115-129.
Andrews, S., Fullilove, J. & Wong, Y. (2002). Targeting washback: a case study. System, 30, 207-223.
Alderson, J. C. & Banerjee, J. (2000). Impact and washback research in language testing. In Elder, C., Brown, A., Grove, E., Hill, K., Iwashita, N., Lumley, T., McNamara, T. & O'Loughlin, K. (Eds.), Studies in Language Testing 11: Experimenting with Uncertainty. Cambridge University Press.
Bailey, K. M. (1999). Washback in language testing. TOEFL Monograph Series. Princeton, NJ: ETS.
Bachman, L. (1990). Fundamental considerations in language testing. Oxford University Press.
Bachman, L. & Palmer, A. (1996). Language testing in practice: Designing and developing useful language tests. Oxford University Press.
Bailey, K. (1996). Working for washback: a review of the washback concept in language testing. Language Testing, 13(3), 257-279.
Banerjee, J. (1996). The design of the classroom observation instruments. UCLES Internal Report, University of Cambridge Local Examinations Syndicate.
Brown, J. D. (1996). Testing in language programs. Upper Saddle River, NJ: Prentice Hall Regents.
Cheng, L. (2005). Changing language teaching through language testing: A washback study. Studies in Language Testing, 21. Cambridge: Cambridge University Press.
Cheng, L. (1997). How does washback influence teaching? Implications for Hong Kong. Language in Education, 11(1), 38-54.
Cheng, L. (forthcoming). Teacher perspectives and actions toward a public examination change. Unpublished manuscript. Hong Kong: University of Hong Kong, Department of Curriculum Studies.
Cheng, L., Watanabe, Y. & Curtis, A. (Eds.). (2004). Washback in language testing: Research contexts and methods. Mahwah, NJ; London: Lawrence Erlbaum.
Fulcher, G. & Davidson, F. (2007). Language testing and assessment: An advanced resource book. New York: Routledge.
Hamp-Lyons, L. (1997). Washback, impact and validity: Ethical concerns. Language Testing, 14(3), 295-303.
Henning, G. (1987). A guide to language testing: Development, evaluation, research. New York: Newbury House.
Hughes, A. (1988). Introducing a needs-based test of English language proficiency into an English medium university in Turkey. In A. Hughes (Ed.), Testing English for university study (pp. 134-146). (ELT Documents #127). London: Modern English Publications in association with the British Council.
Lam, H. P. (1994). Methodology washback - an insider's view. In D. Nunan, R. Berry, & V. Berry (Eds.), Bringing about change in language education: Proceedings of the International Language in Education Conference 1994 (pp. 83-102). Hong Kong: University of Hong Kong.
Messick, S. (1996). Validity and washback in language testing. Language Testing, 13(3), 241-256.
Qi, L. (2005). Stakeholders' conflicting aims undermine the washback function of a high-stakes test. Applied Linguistics, 22(2), 142-173.
Saif, S. (2006). Aiming for positive washback: a case study of international teaching assistants. Language Testing, 23(1), 1-34.
Shohamy, E., Donitsa-Schmidt, S., & Ferman, I. (1996). Test impact revisited: Washback effect over time. Language Testing, 13(3), 298-317.
Shohamy, E. & Hornberger, N. H. (Eds.). (2008). Encyclopedia of Language and Education, 2nd Edition, Vol. 7: Language Testing and Assessment, ixi. Available online at http://spiina1001z/womat/production/PRODENV/ 0000000005/0000001 817/0000000016/0000590828.3D.
Taylor, L. (2005). Washback and impact. ELT Journal, 59, 154-155. OUP.
Turner, C. E. (2000). The need for impact studies of L2 performance testing and rating: Identifying areas of potential consequences at all levels of the testing cycle. In Elder, C., Brown, A., Grove, E., Hill, K., Iwashita, N., Lumley, T., McNamara, T. & O'Loughlin, K. (Eds.), Studies in Language Testing 11: Experimenting with Uncertainty. Cambridge University Press.
Wall, D. (2000). The impact of high-stakes testing on teaching and learning: can this be predicted or controlled? System, 28, 499-509.
Wall, D. & Alderson, J. C. (1993). Examining washback: The Sri Lankan impact study. Language Testing, 10(1), 41-69.
Watanabe, Y. (1996). Does grammar translation come from the entrance examination? Preliminary findings from classroom-based research. Language Testing, 13(3), 318-333.
APPENDIX
Interview Questions:
1. To what extent do you think the MA Entrance Examination influences your instruction?
2. Did you have to change the teaching techniques to meet the needs of the testing syllabus?
3. Do you use MC items in your final exams?
4. If the MA examination items were changed to essay-type, would you generally change your final exam items?
