
Journal of English for Academic Purposes 7 (2008) 180–190
www.elsevier.com/locate/jeap

Identifying academic language needs through diagnostic assessment


John Read*
Department of Applied Language Studies and Linguistics, University of Auckland, Private Bag 92019, Auckland 1142, New Zealand

Abstract

The increasing linguistic diversity among both international and domestic students in English-medium universities creates new challenges for the institutions in addressing the students' needs in the area of academic literacy. In order to identify students with such needs, a major New Zealand university has implemented the Diagnostic English Language Needs Assessment (DELNA) programme, which is now a requirement for all first-year undergraduate students, regardless of their language background. The results of the assessment are used to guide students to appropriate forms of academic language support where applicable. This article examines the rationale for the assessment programme, which takes account of some specific provisions governing university admission in New Zealand law. Then, drawing on the test validation framework by Read and Chapelle [Read, J., & Chapelle, C. A. (2001). A framework for second language vocabulary assessment. Language Testing, 18, 1–32], the article considers in some detail: 1) the way in which DELNA is presented to staff and students of the university, and 2) the procedures for reporting the results. It also considers the criteria by which the programme should be evaluated.
© 2008 Elsevier Ltd. All rights reserved.
Keywords: Language assessment; English for academic purposes; Diagnosis; University admission; Undergraduate students; Language support

1. Introduction

The internationalisation of education in the major English-speaking countries has long created the need to provide various forms of academic language support for those international students who have been admitted to the institution, but whose proficiency is still not fully adequate to meet the language demands of their degree studies. Language support most often takes the form of English for academic purposes (EAP) courses targeting specific skills such as writing or listening, but it can also include adjunct language classes linked to a particular content course, writing clinics, peer editing programmes, self-access centres, and so on. A typical strategy is to require incoming international students to take an in-house placement test, the results of which are used either to exempt individuals from the EAP programme or to direct them into the appropriate courses to address their needs. Accounts of tests designed broadly for this purpose at various universities can be found in Brown (1993), Fox (2004), Fulcher (1997), and Wall, Clapham, and Alderson (1994).
* Tel.: +64 9 373 7599 x87673; fax: +64 9 308 2360. E-mail address: ja.read@auckland.ac.nz
doi:10.1016/j.jeap.2008.02.001

At the same time, it is now well recognised that many students who are not on student visas also have academic language needs. This may result from the success of policies to recruit students from indigenous ethnic or linguistic minority groups which have traditionally been underrepresented in tertiary education. Another major category consists of relatively recent migrants or refugees, who have received much if not all of their secondary education in the host country and thus have met the academic requirements for university admission, but who still experience difficulties with academic reading and writing in particular (Harklau, Losey, & Siegal, 1999). The term Generation 1.5 has been coined in the US to refer to the fact that these students are separated from the country of their birth but often not fully integrated – linguistically, educationally or culturally – into their new society.

Beyond these two identifiable categories, there is a broader continuum of academic literacy needs within the student body in the contemporary English-medium university, including many students who are monolingual in English. Although various forms of language support may be available to these domestic students on campus, the issue is how to identify the ones who need such support and to what extent they should be required to take advantage of it. There can be legal or ethical constraints on directing students into language support on the basis of their language background or other demographic characteristics. It may also be counterproductive to make it obligatory for students to participate in a support programme when they have no wish to be set apart from their peers and are reluctant to acknowledge that they have language needs.

One way to address the situation is to introduce some form of diagnostic assessment, comparable to the in-house placement tests for international students. In fact, one of the tests cited above (Fulcher, 1997) was designed to be administered at the University of Surrey in the UK to all incoming students, regardless of their immigration status or language background. A similar solution is emerging at the university which is the subject of the present article. Having regard for these various considerations, it is necessary to give some careful thought to the development of an assessment procedure for this purpose. There are technical issues, such as how to assess native and non-native speakers by means of a common metric and how to reliably identify those with no need of language support within the minimum amount of testing time. However, the focus of this discussion will be on the need to present the assessment to the students and to the university community in a manner that will achieve its desired goals while at the same time avoiding unnecessary compulsion.

2. The context

The particular case to be considered here is a programme called Diagnostic English Language Needs Assessment (DELNA), which has been implemented at the University of Auckland in New Zealand. The programme was introduced to address concerns that developed through the 1990s with the influx of students who are now collectively identified as having English as an additional language (EAL). During that decade New Zealand tertiary institutions vigorously recruited international students, particularly from East Asia. These students were required to demonstrate their proficiency in English as a condition of admission.
However, the typical requirement for undergraduates of Band 6.0 in IELTS came to be recognised as a relatively modest level of English proficiency, particularly for students whose cultural background and previous educational experience made it difficult to meet the academic expectations of their lecturers and tutors (Read & Hayes, 2003). In the absence of any moves to raise the minimum English requirement for entry, then, the University of Auckland – like other New Zealand universities and polytechnics – needed to provide various forms of ongoing language support for international students.

The liberalisation of immigration policy in the late 1980s also opened up opportunities for skilled migrants and business investors to migrate to New Zealand with their families. This led to an inflow of new immigrants from Taiwan, China, South Korea, India and Hong Kong, peaking in 1995 but continuing at lower levels to this day. The vast majority of the new immigrants settled in the Auckland metropolitan area and in time these communities produced substantial numbers of students for tertiary institutions in the region, and for the University of Auckland in particular. The students from these communities had quite similar linguistic, educational and cultural profiles to international students; many students in both categories had attended a New Zealand secondary school for one, two or more years before entering the university. However, there was one crucial difference. Under New Zealand law (the Education Act 1989), permanent residents are classified as domestic students for the purpose of university admission and cannot be subjected to any entry requirement that is not also imposed on citizens of the country. This means specifically that new migrants cannot be targeted to take an English proficiency test or enrol in ESL classes as a condition of being admitted into a university.

Another provision in the Education Act creates further challenges. The law allows any domestic student who has reached the age of 20 to apply for special admission to a New Zealand university, regardless of their level of prior educational achievement. Thus, in principle adult migrants as well as citizens have had open entry to tertiary education, although in practice their choices have been constrained by admission requirements for particular degree programmes, and those lacking a New Zealand secondary school qualification are likely to be strongly counselled to initially take on a light, part-time workload.

Students accepted for special admission have diverse language needs. Whereas those from the East Asian migrant communities may resemble international students linguistically and culturally, others are mature students from English-speaking backgrounds who may not lack proficiency in the language as such but rather academic literacy. These students include members of the Pacific Nations communities (particularly from Samoa, Tonga, the Cook Islands, Niue and the Tokelau Islands) who may have native proficiency in general conversational English but whose low level of achievement in their secondary schooling would have excluded them from further educational opportunity, had the special admission provision not been available. Although the Pacific communities are long established in New Zealand, it has only been in more recent years that the universities have made systematic efforts to recruit Pasifika students, with a particular emphasis on programmes in Education, Health Sciences and Theology.

Thus, through the 1990s the University of Auckland faced various challenges in responding to the growing linguistic diversity of its student body, not least because of the constraints imposed by the Education Act. Proposals from two leading professors (Ellis, 1998; Ellis & Hattie, 1999) that the university should introduce an entrance examination in English for students who could not produce evidence of adequate competence in the language received support from the Faculty of Arts and were accepted by the central administration of the university. The development and piloting of the DELNA instruments took place in 2000–01 (Elder & Erlam, 2001) and the programme became operational in 2002.

3. DELNA: its philosophy and design

Before looking at how DELNA operates in practice, it is useful to outline several basic principles underlying its development. To some extent, the principles reflect the constraints imposed on the university by the Education Act, but they can also be seen as a positive commitment by the institution to enhancing the educational opportunities of the whole student body.

• One principle was that the test results would not play any role in admissions decisions; students were to be assessed only after they had been accepted into the university for their chosen degree programme. In this sense, then, the administration of DELNA represents a low-stakes situation, although from another point of view the stakes are higher for students who are at serious risk of failing courses or not achieving their academic potential as a result of their limited proficiency in the language. The university, too, has a stake in preserving academic standards and maintaining good completion rates, particularly on equity grounds for Māori, Pasifika and other students from historically underrepresented groups on the campus.
• As a means of emphasising the point that DELNA was not IELTS under another guise, it was deliberately called an assessment rather than a test, and the individual components are known as measures.

• There was to be an important element of personal choice for students in their participation in DELNA and their subsequent uptake of opportunities for language support and enhancement. In practice, particular departments and degree programmes have required their students to take DELNA and/or to participate in some form of language support, but the principle remains that students should be strongly encouraged to take advantage of this initiative rather than being compelled to do so against their will.

• DELNA represented a recognition by the university that it shares with students a joint responsibility to address academic language needs. This contrasts with the situation of international students applying for admission, where the onus is on the students to demonstrate, by paying a substantial fee for an international English test, that they have adequate competence in the language. For students and for departments, DELNA is free of charge and several of the language support options are available to students at no additional cost to them.

In operation, DELNA involves two phases of assessment, Screening and Diagnosis, as shown in Table 1. The Screening measures were designed to provide a quick and efficient means of separating out native speakers and other proficient users of the language who were unlikely to encounter difficulties with academic English, and exempting them from further assessment.

Table 1
The structure of DELNA

Screening (30 min)
• Vocabulary
• Speed Reading

Diagnosis (2 hours)
• Listening to a mini-lecture
• Reading academic-type texts
• Writing an interpretation of a graph

Both of the Screening measures are computer-based. One is a vocabulary test, assessing knowledge of a sample of academic words by means of a simple word–definition matching format (Beglar & Hunt, 1999). The other, variously known in the literature as a speed reading (Davies, 1975, 1990) or cloze-elide (Manning, 1987) format, is a kind of reverse cloze procedure. In each line of an academic-style text an extraneous word is inserted and the test takers must identify each inserted word under a speeded condition which means that only the most proficient students complete all 73 items within the time available. In a validation study (Elder & Erlam, 2001), the reliability estimates were 0.87 for Vocabulary and 0.88 for Speed Reading. The two tests correlated with a composite listening, reading and writing score from the Diagnosis (see below) individually at 0.74 (vocabulary) and 0.77 (speed reading), and collectively at 0.82.

For students who score below a threshold level on the Screening, the three measures in the Diagnosis phase provide a more extensive, task-based assessment of their academic language skills. Unlike the computerised Screening measures, they are all paper-based instruments. In the Listening test (30 min), the students hear an audio-recorded mini-lecture on a non-specialist topic and respond to short answer, multiple-choice and information transfer items. The Reading test (45 min) is based on one or two reading texts on topics of general interest totalling about 1200 words. Various item types are used, including cloze, information transfer, matching, multiple-choice, true-false and short answer. For the Writing task (30 min), the candidates write 200 words of commentary on a social trend, as presented to them in the form of a simple table or graph. Their writing is rated on three analytic scales: fluency, content, and grammar and vocabulary.

The Diagnosis phase takes 2 hours to administer, as compared to 30 min for the Screening, and is obviously more expensive in other respects, in that it requires manual scoring and, in the case of the writing task, double rating on the three scales by trained examiners (for research on the training procedures, see Elder, Barkhuizen, Knoch, & von Randow, 2007; Knoch, Read, & von Randow, 2007). The Elder and Erlam (2001) validation study obtained reliability estimates of 0.82 for Listening and 0.83 for Reading. In the case of Writing, the two recent studies just cited (Elder et al., 2007; Knoch et al., 2007) produced estimates of 0.95–0.97 for the reliability of candidate separation, using the FACETS program. Further details of the two phases of the DELNA assessment, including sample items and tasks, can be found in the DELNA Handbook, which is downloadable from the programme website: www.delna.auckland.ac.nz.

Set out this way, DELNA looks very much like a conventional language test. Certainly the Diagnosis tasks are similar to those found in IELTS and other EAP proficiency tests. However, the intended purpose of the instrument is different and this means that it needs to be presented in a distinctive manner, in keeping with the principles outlined at the beginning of this section.

4. An analysis of test purpose

A useful framework for analysing how test purpose should influence test design and delivery is that developed by Read and Chapelle (2001).
Although the framework is exemplified in terms of vocabulary testing, it has general applicability to various forms of language assessment. As shown in Fig. 1, the framework has numerous components and it is beyond the scope of the present article to consider them all in detail. At the top level of the framework, test purpose is decomposed into three components – inferences, uses and intended impacts – which in turn lead to validity considerations and mediating factors. It is the second and third mediating factors which are of particular concern here, but it is also necessary to address the first component briefly.

[Fig. 1 is a flow diagram: Test Purpose (Inferences, Uses, Intended Impacts) leads to Validity Considerations (Construct Validity, Relevance and Utility, Actual Consequences) and Mediating Factors (Construct Definition, Performance Summary and Reporting, Test Presentation), which inform Test Design (decisions about the structure and formats of the test) and Validation (arguments based on theory, evidence and consequences).]

Fig. 1. A framework for incorporating a systematic analysis of test purpose into test validation (adapted from Read & Chapelle, 2001, p. 10).

4.1. Construct definition

The inferences to be made on the basis of performance in DELNA can be defined in terms of academic literacy in English: the ability of incoming undergraduate students to cope with the language demands of their degree programme. Although ultimately the assessment is targeted at students for whom English is an additional language (EAL), the construct is broader than academic literacy in English as an additional language because many of those to be assessed come from English-speaking backgrounds, and the whole function of the initial Screening phase of DELNA is to separate out students for whom adequate academic literacy is unlikely to be at issue. Designing a test for students with English as both a first and an additional language creates a special challenge because it cannot be assumed that items and tasks will perform the same way for the two groups. Elder, McNamara, and Congdon (2003) used Rasch analysis to investigate this issue and found a somewhat complex pattern, whereby each of the DELNA tasks except the vocabulary measure exhibited some significant bias in favour of either native or non-native speakers. However, since the bias was in both directions and relatively small in magnitude overall, the researchers considered that it was within tolerable limits for a low-stakes assessment of this kind.

Read and Chapelle (2001) distinguish three levels of inference: whole test, sub-test and item. For DELNA, item-level inferences are not appropriate. In the Screening phase, the construct is defined specifically in terms of efficient access to academic language knowledge and it is sufficient to make inferences at the level of the whole test. Thus, the vocabulary and speed reading scores are combined into a single result to determine whether the student should proceed to the Diagnosis phase.

Elder and von Randow (in press) have investigated the validity of inferences based on the Screening score, examining its suitability as a basis for determining whether students needed to proceed to the Diagnosis. Their study involved an analysis of the performance of 353 students who took both the Screening and Diagnosis measures. A minimum criterion score was set on the basis of performance in the listening, reading and writing tests of the Diagnosis phase. Then, by means of regression analysis, an optimum cut score (combining the vocabulary and speed reading scores) was established for the Screening phase. This cut score successfully identified 93% of the students whose performance fell below the criterion level in the Diagnosis phase. However, it also meant that relatively few students would be exempted from taking the costly Diagnosis measures and so, with financial considerations in mind, a lower cut score was set. The lower score identified only 81% of the students who were under the criterion level, but on the other hand it resulted in less than 1% of false negatives: students below the cut score who nevertheless had achieved the criterion level in the Diagnosis. Therefore, for operational purposes it is only students whose Screening performance falls under the lower cut score who are required to proceed to the Diagnosis. Those who are between the two cut scores receive a general recommendation to seek academic language support (see 4.3 below).
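To make the cut-score logic concrete, the sketch below shows how a candidate Screening cut score could be checked against a Diagnosis criterion: what proportion of the below-criterion students it captures, and how many students it refers to the Diagnosis even though they met the criterion. It is only an illustrative reconstruction of the kind of analysis described above, not the procedure Elder and von Randow actually used; the sample data, scores and names (evaluate_cut_score and so on) are invented for the example.

# Illustrative sketch only: all scores, names and cut-offs below are hypothetical.
# Each record pairs a combined Screening score with whether the student later
# met the criterion level on the Diagnosis measures.

def evaluate_cut_score(records, cut_score):
    """Return (capture rate, false-flag rate, number referred to the Diagnosis)."""
    below_criterion = [r for r in records if not r[1]]       # students needing support
    referred = [r for r in records if r[0] < cut_score]       # sent on to the Diagnosis
    captured = sum(1 for score, met in below_criterion if score < cut_score)
    capture_rate = captured / len(below_criterion) if below_criterion else 0.0
    # Students referred to the Diagnosis even though they had met the criterion
    false_flags = sum(1 for score, met in referred if met)
    false_flag_rate = false_flags / len(records) if records else 0.0
    return capture_rate, false_flag_rate, len(referred)

# Hypothetical data: (combined vocabulary + speed-reading score, met criterion?)
sample = [(42, False), (55, True), (61, True), (38, False), (49, False), (70, True)]
for cut in (50, 60):
    capture, false_rate, n_referred = evaluate_cut_score(sample, cut)
    print(f"cut={cut}: capture={capture:.0%}, "
          f"false flags={false_rate:.1%}, referred={n_referred}")

The published study reports the same kind of trade-off on real data: the higher cut score captured 93% of the students below the criterion level but referred many more candidates to the Diagnosis, while the lower operational cut score captured 81% with under 1% referred unnecessarily (the study's "false negatives").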

For those who complete the Diagnosis, sub-test inferences are desirable so that students can be advised on whether they should seek language support in each of the three skill areas of listening, reading and writing. This means that each sub-test needs to provide a reliable measure of the skill involved. The reliability estimates quoted in Section 3 are very satisfactory from this perspective.

4.2. Test presentation

Although test presentation comes third in the Read and Chapelle framework, it is more appropriate to discuss it next in this account of DELNA. Presentation is a mediating factor that comes from a consideration of the impact of a test (Messick, 1996). Read and Chapelle (2001) point out that most research on impact in language testing has focused on the washback effects of existing tests and examinations (see, e.g. Alderson, 1996; Cheng & Watanabe, 2004). However, Read and Chapelle argue that if the consequences of implementing a test are to be seen as an integral element in evaluating its quality, a statement of the intended impact of the instrument needs to be included in the specification of test purpose early in the development of a new test. Thus, the actual consequences of putting the test into operation can be evaluated by reference to the prior statement of intended impact. This means in turn that the test developers should consider how the intended impact can be achieved through the way that the test is presented.

Test presentation is a concept that has not received much attention in the literature and it deserves some consideration here. It consists of a series of steps, taken as part of the process of developing and implementing the test, to influence its impact in a positive direction. Since there are numerous stakeholders in assessment, particularly when the stakes are high, "[t]est developers choose to portray their tests in ways that will appeal to particular audiences" (Read & Chapelle, 2001, p. 18). These can include educational administrators, teachers, parents, users of the test results, and of course the test takers, who need to be familiar with the test formats and willing to accept that the test is a fair assessment of their language abilities.

Seen in this light, test presentation has a strong connection to that much maligned concept in testing, face validity. Authors of introductory texts on language testing, starting with Lado (1961), have generally dismissed this concept as not being a credible form of evidence to support a validity argument, since it is based on "simple inspection" (Lado, 1961, p. 321) or the judgment of an "untrained observer" (Davies et al., 1999, p. 59). However, this rejection of the concept has generally been accompanied by an acknowledgement that, although the term may be a misnomer, it represents a matter of genuine concern in testing. That is to say, test developers are confronted with a real problem if – regardless of the technical merits of the test – one or more of the stakeholder groups are not convinced that the content or the formats are suitable for the assessment purpose. Thus, Alderson, Clapham, and Wall (1995) give face validity a positive gloss as meaning "acceptable to users" (p. 173), echoing Carroll (1980), who had earlier proposed acceptability as one of the four desirable characteristics (along with relevance, comparability and economy) of a communicative test. In addition, Bachman and Palmer (1996, p. 42) and Davies et al. (1999, p. 59) refer to the even more positive notion of test appeal.
Thus, test presentation can be seen as a proactive approach to promoting the acceptability of the test to the various stakeholders, and above all to the test takers, in order to achieve the intended impact. The test developer needs to ensure that the purpose and nature of the assessment is clearly understood, that it meets stakeholder expectations as much as possible, and that the test takers in particular engage with the test tasks in a manner that will help produce a valid measure of their language ability. Major proficiency tests generate a strong external motivation for students because of the stakes involved, whereas with a programme like DELNA it is more important to create a positive internal motivation based on a recognition of the benefits that the results of the assessment may bring for the student.

4.2.1. Presentation of DELNA to students

The general principles underlying the presentation of DELNA are those that were introduced in Section 3 above: the fact that the results are never used for admissions purposes; the term assessment is preferred to test; there is a significant element of personal choice for students; and the university shares with its students the responsibility for addressing their academic language needs. The slogan "Increases your chance of success", which has featured in DELNA publicity, is also intended to express the positive intent of the programme.

There have been two main pathways to DELNA for students entering the university each semester.

The first was literally by invitation. In the admissions office the records of incoming domestic students (citizens and permanent residents) were reviewed to identify those who had not provided evidence of their competence in English for tertiary-level study. Students coming directly from secondary school in the last few years hold the National Certificate of Educational Achievement (NCEA), which includes a literacy requirement to demonstrate proficiency in academic reading and writing in English. However, mature students and recently arrived immigrants who enter the university under special admission often lack a recognised secondary school qualification or any other evidence of academic literacy. Students thus identified received a letter inviting them to take the DELNA diagnosis. For statutory reasons, as previously explained, the university could not make it mandatory but, that consideration aside, the wording of the letter had a positive tone which emphasised the intended role of DELNA in enhancing the students' study experience. Initially, the uptake of these invitations was relatively low but by 2005 it had reached about 40% (116/295).

The other main pathway, which has now essentially superseded the first one, results from decisions by departments or faculties to require all students in designated first-year courses to take DELNA. Initially, this applied to programmes which attracted a high proportion of EAL students, such as the Bachelor degrees in Business and Information Management, and in Film, TV and Media Studies. However, from 2007 it has officially become a requirement for almost all first-year students, regardless of their language background, to take the DELNA Screening. This not only observes the legal niceties but also highlights the important role of the Screening phase in efficiently separating out academically literate students for exemption from further assessment.

In 2007 a total of 5427 students were assessed through the DELNA programme. These students are estimated to represent around 70% of all the first-year students at the university that year, although the percentage is higher if groups such as transferring and exchange students are excluded. Of those who completed the Screening, 1208 were recommended to return for the Diagnosis phase; however, only 504 (42%) did so. This shortfall is discussed in Section 5 below.

In terms of presentation, as DELNA assessment has become the norm for first-year students, it is increasingly accepted as just another part of the experience of entering the university. Students are informed of the assessment requirement in their department's handbook and can obtain further information from the programme website, including the downloadable DELNA Handbook, with its sample versions of the assessment tasks and advice on completing them. In addition, it is easy for students to book for a DELNA session online at their preferred day and time. One other appealing feature of the Screening measures in particular is that they are computer-administered, which adds a novelty value for students who may never have taken such a language test before.

4.2.2. Presentation of DELNA to staff

Much of the initial impetus for the development of DELNA came from the concerns of teaching staff across the university in the 1990s about the academic literacy needs of students in their classes. This created a positive environment for the acceptance of a programme like DELNA to address those needs, but of course that is not the same as an understanding of how this particular programme works.
The establishment of DELNA saw the formation of a Reference Group chaired by the Deputy Vice-Chancellor (Academic) and composed of representatives from all the university faculties as well as from the various language support programmes. The group meets regularly to discuss policy issues, monitor the implementation of DELNA and provide a channel of communication from the programme staff to the faculties. The departments which were the early participants in DELNA are well represented on the group but, as the assessments have expanded, it has been necessary to open new avenues of communication to academic and administrative staff across the university to ensure that: a) an informed decision is made when departments or faculties decide to require their first-year students to take the assessment; b) the relevant staff correctly interpret the DELNA results when they receive them; and c) effective follow-up action is taken to give students access to language support if they need it. In 2005 an information guide for staff was produced in pamphlet form and it has been followed by an FAQ document. However, experience has shown that the printed material must be backed up by face-to-face meetings with key staff members responsible for DELNA administration in particular faculties or departments.

4.3. Performance summary and reporting

This brings us back to the second mediating factor of the Read and Chapelle (2001) framework, performance summary and reporting, which relates to the intended use of the test. The assessment results are used to identify students who may be at risk because of their low academic literacy and then to advise them on suitable forms of language support and enhancement.

Where participation in DELNA is a course requirement, the results also go to the academic programme or department for follow-up action as appropriate. Thus, the two main recipient groups for the results are the students and their departments. Given that the whole purpose of the programme is to address academic literacy needs, the reporting of student performance includes not only the assessment result but also a recommendation for language enhancement where appropriate. At this point, then, it is useful to list the main language support options available on the campus.

• Credit courses in academic language skills: ESOL100–102 for EAL students, and ENGWRIT101, a writing course for students from English-speaking backgrounds.

• Workshops, short non-credit courses, individual consultations and self-access study facilities offered by the Student Learning Centre (SLC – available to all students) and the English Language Self-Access Centre (ELSAC – specialising in services for EAL students).

• Discipline-specific language tutorials linked to particular courses (following a kind of adjunct model) which have for some time attracted a high proportion of EAL students. Currently these courses are in Commerce, Health Sciences, Theology, and Film, TV and Media Studies.

The Screening phase of DELNA is primarily intended to exempt highly proficient students from further assessment. Thus, the scores from the vocabulary and speed reading measures are combined to divide the test-takers into three categories with deliberately non-technical labels:

Good – no language enrichment required.
Satisfactory – some independent activity at SLC or ELSAC recommended.
Recommended for Diagnosis – should take the DELNA Diagnosis.

The Screening result is sent individually to each student by email and, when DELNA is a departmental requirement, an Excel file of results for each course is forwarded to a designated staff member. Until 2006, the Screening reports included the two actual test scores for vocabulary and speed reading. However, the fact that the cut scores for the three categories varied according to which form of the test each student took caused some confusion and, in addition, there were indications that the scores were being used in at least one academic programme as quasi-proficiency measures to assign students to tutorial groups according to their language ability. This led to the current policy of reporting just the student's category.

In the case of the Diagnosis phase, a scale modelled on the IELTS band scores (from a top level of 9 down to 4) has been used for rating performance and reporting the results to students. However, for reporting to staff a simpler A-B-C-D system is used for each of the three skills (listening, reading and writing). The A and B grades correspond to the Good and Satisfactory categories respectively in the Screening, and students whose three-grade average is at one of these levels receive an email report. On the other hand, students averaging in the C and D range, who are considered to be at significant risk, are sent an email request to collect their results in person from the DELNA language adviser. As with the Screening, the results are also sent to the designated staff member when the Diagnosis is required by the department.
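The reporting rules just described can be summarised in a short sketch. Everything below is illustrative rather than the programme's actual implementation: the numeric cut-offs and the function names (screening_category, diagnosis_follow_up) are invented, and only the three category labels and the A-D grading with its email-versus-adviser routing follow the description above.

# Hypothetical sketch of the DELNA reporting logic described in the text;
# the numeric cut-offs are invented and do not come from the programme.

GOOD_CUT = 70          # assumed combined Screening score for "Good"
SATISFACTORY_CUT = 55  # assumed combined score for "Satisfactory"

def screening_category(vocab_score: int, speed_reading_score: int) -> str:
    """Map a combined Screening score to one of the three reporting labels."""
    combined = vocab_score + speed_reading_score
    if combined >= GOOD_CUT:
        return "Good - no language enrichment required"
    if combined >= SATISFACTORY_CUT:
        return "Satisfactory - independent activity at SLC or ELSAC recommended"
    return "Recommended for Diagnosis"

def diagnosis_follow_up(listening: str, reading: str, writing: str) -> str:
    """Grades are A-D per skill; averages in the C-D range trigger an adviser meeting."""
    points = {"A": 4, "B": 3, "C": 2, "D": 1}
    average = sum(points[grade] for grade in (listening, reading, writing)) / 3
    if average >= points["B"]:
        return "Email report with recommendations"
    return "Collect results in person from the DELNA language adviser"

print(screening_category(40, 25))          # combined score falls in the middle band
print(diagnosis_follow_up("B", "C", "D"))  # low average: adviser follow-up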
The appointment of the language adviser, beginning in 2005, resulted from a recognition that students scoring low in the Diagnosis were generally not accessing the recommended language support. A small-scale study by Bright and von Randow (2004), involving follow-up interviews with eighteen DELNA candidates at the end of the academic year, found that only four of them had taken the specific advice given in the results notification. Although most of the participants had in fact passed their courses, they acknowledged that they had really struggled to meet the language demands of their studies. One strong message from the interviews was that the students would have greatly appreciated the opportunity to discuss their DELNA results and their language support options face-to-face, rather than just receiving the impersonal emailed report. Thus, the language adviser now meets with each student individually, goes over their profile of results, and directs them to the most appropriate form(s) of support. She often follows up the initial meeting with ongoing monitoring of their progress through the semester or even longer. Thus performance summary and reporting in this case involves not simply the form of the report but also, for the less proficient students, the medium by which the result is communicated to them.

5. Evaluating the programme

The extended discussion in Section 4, drawing on the Read and Chapelle (2001) framework, has shown how the purpose of the assessment has been worked out through the design and delivery mechanisms of DELNA. At the time of writing, the programme is still being rolled out. It has yet to achieve full participation by the incoming first-year student population in the Screening phase, and furthermore in 2006 only 30% of students (444 out of 1340) who were recommended for Diagnosis on the basis of their Screening results actually went on to the second phase of the assessment. Higher levels of participation will depend on the extent to which faculties and departments enforce the requirement that their students should take one or both phases of DELNA. Some academic programmes have introduced specific incentives for students to take the assessment, by, for instance, withholding the first essay grade or subtracting a few percent of the final course grade of students who do not comply.

However, the point of the exercise is not just to assess the students but rather to address their academic language needs where appropriate. As noted in the previous section, there is now provision within the DELNA programme itself, through the work of the language adviser, to provide intensive counselling for those students whose results in the Diagnosis phase show that they have the most serious language needs. Some academic units have introduced their own follow-up measures for such students. For example, the Bachelor's degree in Business and Information Management has a well-established Language and Communication Support Programme (LanCom), which integrates various forms of support into the delivery of its courses. In the Faculty of Engineering, students who score below a minimum level in the DELNA Diagnosis must undertake a quasi-course with its own course code, involving attendance at 15 hours of workshops at the Student Learning Centre (SLC) and satisfactory completion of another 15 hours of directed study at the English Language Self-Access Centre (ELSAC).

With the expansion of DELNA assessment into the Faculties of Arts and Science, it is more of a challenge to respond to the language needs of students enrolled for a degree which includes courses offered by several different departments. In the first instance, the Screening results may simply provide course conveners with a broad profile of the language needs of their students, who may be several hundred in number. Many departments lack the resources to offer specialised language support to their students. One realistic option for them is the introduction of systematic procedures for referring students in need to SLC or ELSAC; another option may be to review their teaching and assessment practices to avoid creating unnecessary difficulties for EAL students in their courses.

Returning briefly to the Read and Chapelle (2001) framework, one element in the validation of a test or assessment procedure is an investigation of its actual consequences as compared to its intended impact. At the institutional level, the intended impact can be defined in terms of levels of academic literacy in the student population. The implementation of DELNA is supposed to lead to a meaningful reduction over time in the number of students whose academic performance is hampered by language-related difficulties. The question is what kind of data counts as evidence that the goal is being achieved for the undergraduate student body as a whole.
Davies and Elder (2005) took up this point in their review of current theory and practice in language test validation, using DELNA as a case study. They formulated a series of eight hypotheses that can be investigated to build an argument for the validity of DELNA. Most of the hypotheses relate either to the technical qualities of the DELNA tests as measures of academic literacy or to the utility of the scores to the users. However, the final hypothesis takes up the issue of the wider impact of the programme: "H.8 The student population will benefit from the information that the test provides" (2005, p. 805). Davies and Elder highlight a number of challenges in, first, defining the nature of the benefit and then gathering evidence in support of the hypothesis.

One way to address the hypothesis would be to define the benefit as an increase in academic literacy, as measured by a further assessment of the students' language proficiency after, say, a semester or two of study. However, DELNA is set up as a one-time assessment for each student and the system blocks them from taking it more than once. In addition, there are currently no plans to introduce an exit test of English proficiency for graduating students. This means that we need to look for benefits in other ways.

One kind of evidence relates to student uptake of the DELNA advice by accessing the various language support options available to them. If they enrol in an ESOL credit course or attend a tutorial linked to one of their subject courses, their progress and end-of-course achievement will be assessed by their tutors. On the other hand, it is more of a challenge to monitor the benefit gained by students who participate in the support programmes at SLC and ELSAC. Students are required to register when they first access these programmes and records are kept of their attendance at workshops and individual consultations, but that is not the same as assessing the benefit of these language support opportunities in improving the students' academic language proficiency.

A broader approach to the situation is to look at grade point averages and retention rates for whole undergraduate cohorts, particularly in courses with large EAL student enrolments. As Davies and Elder (2005) point out, though, it is difficult to separate out language proficiency from academic ability, motivation, sociocultural adjustment and the range of other factors that influence student achievement in their university studies, particularly if underachievement is represented not just by dropout or failure rates but also by lower grades than the student might otherwise have achieved. The issues involved are reminiscent of those which have complicated research on the predictive validity of major proficiency tests like TOEFL and IELTS (see, e.g., Hill, Storch, & Lynch, 1999; Light, Xu, & Mossop, 1987).

Thus, global university-wide measures of impact may prove to be less useful than more focused investigations of particular groups of students. One such study, being conducted by the DELNA Programme in conjunction with the Department of Film, TV and Media Studies, is tracking a cohort of students through their three years of study towards a BA major in FTVMS. The data include annual interviews with the students as well as the quantitative measures provided by the initial DELNA results and their course grades. Through this kind of targeted research, it will become possible to develop a validity argument that combines rich qualitative evidence with more objective measures of students' language proficiency and academic achievement.

6. Conclusion

The DELNA assessment programme has a number of features that differentiate it from other tests of English for academic purposes. First, it does not function as a gatekeeping device for university admission, and students cannot be excluded from the institution on the basis of their results in either phase of the assessment. The fact that some students find this hard to believe helps to account for the relatively low participation rate in the Diagnosis phase of DELNA among those who are recommended to take it. Secondly, it is not simply a placement procedure to direct students into one or more courses within a required EAP programme according to their level and areas of need. There is a range of language support options that students are recommended to participate in as appropriate.

A related feature is the distinctive philosophy behind the programme, which holds that students should retain a degree of personal choice as to whether they take advantage of the opportunities for language and study support which are available to them. Although it partly reflects the constraints imposed by national education legislation, this approach is also based on the assumption that academic language support will be more effective if students recognise for themselves the extent of their language needs and make a commitment to attend to them.

One other important characteristic of DELNA is that it is centrally funded with a direct management line to the office of the Deputy Vice-Chancellor (Academic). Although its offices are located in the Department of Applied Language Studies and Linguistics, the programme has always been conceived as a university-wide initiative.
This helps to avoid any perception that DELNA is just serving the interests of a particular department or faculty. It is an issue that has emerged in discussions with staff from other New Zealand universities about the possibility of introducing a DELNA programme on their own campuses. Initial enquiries have typically come from student learning advisers or ESOL tutors who have thought in terms of purchasing a set of diagnostic tests for their own institution. However, a briefing on the full scope of DELNA and its associated language support provisions reveals how much more is involved, with a firm commitment by senior management being a crucial element in the successful operation of the programme at Auckland.

The DELNA programme is moving to a consolidation phase after the considerable expansion in the coverage of incoming undergraduate students over the past couple of years. There is a consequent need to ensure that effective use is made of the DELNA results and that an increasing proportion of the targeted students participate in the appropriate forms of language support and enhancement. The position of DELNA as a centrally funded programme is secure for the foreseeable future, although it remains to be seen to what extent the university will be able to commit sufficient resources to meet the range of language needs that the assessment results are revealing. Other related issues may yet emerge, such as the need to set language proficiency standards for students graduating from Bachelor's programmes or concerns about the academic literacy of postgraduate students. For now, though, it is widely accepted within the institution that DELNA is a very worthwhile means of addressing the language needs of incoming undergraduates.

References
Alderson, J. C. (Ed.). (1996). Washback in language testing [Special issue]. Language Testing, 13(3).
Alderson, J. C., Clapham, C., & Wall, D. (1995). Language test construction and evaluation. Cambridge: Cambridge University Press.
Bachman, L. F., & Palmer, A. S. (1996). Language testing in practice. Oxford: Oxford University Press.
Beglar, D., & Hunt, A. (1999). Revising and validating the 2000 word level and university word level vocabulary tests. Language Testing, 16, 131–162.
Bright, C., & von Randow, J. (2004, September). Tracking language test consequences: The student perspective. Paper presented at the Ninth National Conference on Community Languages and English for Speakers of Other Languages (CLESOL), Christchurch, New Zealand.
Brown, J. D. (1993). A comprehensive criterion-referenced language testing project. In D. Douglas, & C. Chapelle (Eds.), A new decade of language testing research (pp. 163–184). Washington, DC: TESOL.
Carroll, B. J. (1980). Testing communicative performance. Oxford: Pergamon.
Cheng, L., & Watanabe, Y. (2004). Washback in language testing: Research contexts and methods. Mahwah, NJ: Lawrence Erlbaum Associates.
Davies, A. (1975). Two tests of speed reading. In R. L. Jones, & B. Spolsky (Eds.), Testing language proficiency (pp. 119–130). Arlington, VA: Center for Applied Linguistics.
Davies, A. (1990). Principles of language testing. Oxford: Blackwell.
Davies, A., Brown, A., Elder, C., Hill, K., Lumley, T., & McNamara, T. (1999). A dictionary of language testing. Cambridge: Cambridge University Press.
Davies, A., & Elder, C. (2005). Validity and validation in language testing. In E. Hinkel (Ed.), Handbook of research in second language teaching and learning (pp. 795–813). Mahwah, NJ: Lawrence Erlbaum.
Elder, C., Barkhuizen, G., Knoch, U., & von Randow, J. (2007). Evaluating rater responses to an online training program for L2 writing assessment. Language Testing, 24, 37–64.
Elder, C., & Erlam, R. (2001). Development and validation of the Diagnostic English Language Needs Assessment (DELNA): Final report. Auckland: Department of Applied Language Studies and Linguistics, University of Auckland.
Elder, C., McNamara, T., & Congdon, P. (2003). Rasch techniques for detecting bias in performance assessments: An example comparing the performance of native and non-native speakers on a test of academic English. Journal of Applied Measurement, 4, 181–197.
Elder, C., & von Randow, J. (in press). Exploring the utility of a web-based English language screening tool. Language Assessment Quarterly.
Ellis, R. (1998). Proposal for a language proficiency entrance examination. Unpublished manuscript, University of Auckland, New Zealand.
Ellis, R., & Hattie, J. (1999). English language proficiency at the University of Auckland: A proposal. Unpublished manuscript, University of Auckland, New Zealand.
Fox, J. (2004). Test decisions over time: Tracking validity. Language Testing, 21, 437–465.
Fulcher, G. (1997). An English language placement test: Issues in reliability and validity. Language Testing, 14, 113–139.
Harklau, L., Losey, K. M., & Siegal, M. (Eds.). (1999). Generation 1.5 meets college composition: Issues in the teaching of writing to U.S.-educated learners of ESL. Mahwah, NJ: Lawrence Erlbaum.
Hill, K., Storch, N., & Lynch, B. (1999). A comparison of IELTS and TOEFL as predictors of academic success. In R. Tulloh (Ed.), IELTS research reports (Vol. 2, pp. 52–63). Canberra: IELTS Australia.
Knoch, U., Read, J., & von Randow, J. (2007). Re-training writing raters online: How does it compare with face-to-face training? Assessing Writing, 12, 26–43.
Lado, R. (1961). Language testing. London: Longman.
Light, R. L., Xu, M., & Mossop, J. (1987). English proficiency and academic performance of international students. TESOL Quarterly, 21, 251–261.
Manning, W. H. (1987). Development of cloze-elide tests of English as a second language. TOEFL Research Report No. 23. Princeton, NJ: Educational Testing Service.
Messick, S. (1996). Validity and washback in language testing. Language Testing, 13, 241–256.
Read, J., & Chapelle, C. A. (2001). A framework for second language vocabulary assessment. Language Testing, 18, 1–32.
Read, J., & Hayes, B. (2003). The impact of IELTS on preparation for academic study in New Zealand. In R. Tulloh (Ed.), IELTS research reports 2003 (Vol. 4, pp. 153–205). Canberra: IELTS Australia.
Wall, D., Clapham, C., & Alderson, J. C. (1994). Evaluating a placement test. Language Testing, 11, 321–344.

John Read is Head of the Department of Applied Language Studies and Linguistics at the University of Auckland. His primary research interests are in vocabulary assessment and testing English for academic and professional purposes. He is the author of Assessing Vocabulary (Cambridge, 2000) and has been co-editor of Language Testing.
