Chapter 4

Standardized Testing
Prepared by
Fransiska Marsela Hambur
13020317410014
Question for Discussion

What is Standardization?

Standardization means that there is a set of criteria or objectives applied to broad competencies that are not exclusive to a particular curriculum (Brown, p. 67).

Every one of you, at some point in your academic life, has been affected by a standardized test.

Do you agree?
What is standardization?

• A standardized test presupposes certain standard objectives or criteria that are held constant from one form of the test to another.

• A good standardized test is the product of a thorough process of empirical research and development. It has standard procedures for administration and scoring.

• It measures children’s mastery of the standards or competencies that have been prescribed for specific grade levels.
Examples of Standardized Tests

• The Scholastic Aptitude Test (SAT) is a college entrance exam taken by many high school seniors seeking further education.

• The Graduate Record Exam (GRE) is used for entry into many graduate school programs; tests such as the Graduate Management Admission Test (GMAT) and the Law School Admission Test (LSAT) specialize in particular disciplines.

• The Test of English as a Foreign Language (TOEFL) is produced by the Educational Testing Service (ETS).
What is the use of standardization?

• Tests for entry into many graduate school programs
• Graduate Management Admission Test (GMAT) and Law School Admission Test (LSAT): tests that specialize in particular disciplines
• Test of English as a Foreign Language (TOEFL)
It is produced by the Educational Testing Service (ETS) in the United States; its British counterpart is the International English Language Testing System (IELTS).
Based on the examples above, can you infer the characteristics of a standardized test?
Characteristics of Standardized Tests

Set of competencies

Given domain (writing, listening, etc.)

Process of construct validation

Set of tasks to measure competencies


Characteristics of Standardized Tests
IELTS as a standardized test

• The TOEFL in the U.S. and its British counterpart, the International English Language Testing System (IELTS), are standardized.
• These tests specify a set of competencies for a given domain and, through a process of construct validation, program a set of tasks designed to measure those competencies.
• In standardized tests, items are generally in multiple-choice (MC) form.
• MC items provide an ‘objective’ means for determining correct and incorrect responses.
• However, MC is not the only item type in standardized tests.
• Human-scored tests of oral and written production are also involved.
Advantages of Standardized Tests

• It is a ready-made, previously validated product that frees the teacher from having to spend hours creating a test.
• Administration to large groups can be accomplished within
reasonable time limits.
• In the case of multiple-choice formats, scoring procedures are
streamlined for either scannable computerized scoring or
handscoring with a hole-punched grid for fast turnaround time.
Disadvantages of Standardized Tests
• Inappropriate use of such tests.
Example: using an overall proficiency test as an achievement test simply because of the convenience of the standardization.

• Potential misunderstanding of the difference between direct and indirect testing.
Example: some standardized tests include tasks that do not directly measure performance on the target objective.
E.g., before 1996 the TOEFL included neither a written nor an oral production section, even though statistics showed a correspondence between performance on the TOEFL and a student’s written and oral production.
DEVELOPING A STANDARDIZED TEST
• Knowing how to develop a standardized test can help you revise an existing test, adapt or expand an existing test, or create a smaller-scale standardized test.

Three different standardized tests are used to illustrate the process of standardized test design:
(A) The Test of English as a Foreign Language (TOEFL): ‘general ability or proficiency’
(B) The English as a Second Language Placement Test (ESLPT), San Francisco State University (SFSU): ‘placement test at a university’
(C) The Graduate Essay Test (GET), SFSU: ‘gate-keeping essay test’
Steps in Creating a Standardized Test

1 • Determine the purpose and the objectives of the test

2 • Design test specifications

3 • Design, select and arrange test tasks/items

4 • Make appropriate evaluations of different kinds of items

5 • Specify scoring procedures or formats

6 • Perform ongoing construct validation studies


1. Determine the purpose and objectives of the test.
• Standardized tests are expected to be valid and practical.
• TOEFL: to evaluate the English proficiency of people whose native language (NL) is not English.
• Colleges and universities in the US use TOEFL scores to admit or refuse international applicants.
• ESLPT: to place already admitted students in an appropriate academic course and to provide teachers with some diagnostic information about students’ oral production and grammar.
• GET: to determine whether students’ writing ability is sufficient to permit them to enter graduate-level courses in their programs. It is offered at the beginning of each term.
2. Design test specifications

• TOEFL: This is the step of laying the foundation stones of the test.
Example:
TOEFL – Specifications:
1. Listening Section
2. Structure Section
3. Reading Section
4. Writing Section
Each spec is not merely named in this way; it should state what it measures, what it covers, and what material it uses.
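
As a rough illustration of that point, the sketch below treats a section spec as a record with three required parts and flags any part that is still empty. The class `SectionSpec`, its field names, and the method `missing_parts` are hypothetical, and the sample values are paraphrased from this chapter's own description of the Reading section; this is not an official TOEFL specification format.

```python
from dataclasses import dataclass, field


@dataclass
class SectionSpec:
    """Hypothetical record for one section of a test specification."""
    name: str
    measures: str = ""                              # what the section measures
    covers: list = field(default_factory=list)      # what content it covers
    materials: list = field(default_factory=list)   # what material it uses

    def missing_parts(self):
        """Return the names of the required parts that are still empty."""
        return [part for part in ("measures", "covers", "materials")
                if not getattr(self, part)]


# Illustrative values only, paraphrased from the slides.
reading = SectionSpec(
    name="Reading",
    measures="comprehension of passages, sentences, phrases, or words",
    covers=["main idea", "stated details", "unstated details",
            "implied details", "vocabulary in context"],
    materials=["long and short reading passages"],
)
structure = SectionSpec(name="Structure")  # named only, nothing specified yet

for spec in (reading, structure):
    print(spec.name, "is missing:", spec.missing_parts() or "nothing")
```

A spec that is only named, like `structure` here, reports all three parts as missing, which is the situation the slide warns against.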
EXAMPLE OF TEST SPECIFICATION
TOEFL SPECIFICATION: LISTENING SECTION
EXAMPLE OF TEST SPECIFICATION
TOEFL SPECIFICATION: READING SECTION
Comparison of Listening and Reading Specifications
• The TOEFL Listening section focuses on a particular feature of language or on overall listening comprehension. It includes various listening stimuli, such as dialogues, order of process, supporting ideas, etc.
• Real-world situations are emphasized more in the Listening section than in the Reading section.
• The Reading section aims to test comprehension of long or short passages, single sentences, phrases, or words. Its test items target stated or implied information without emphasizing real-world situations.
3. Design, select and arrange test tasks/items
• Once specifications for a standardized test have been stipulated, the task of
designing, selecting and arranging test tasks/items begins.
• The specs act much like a blueprint in determining the number and types of
items to be created.
(A) TOEFL test design specifies that each item be coded for content and statistical characteristics.
Content coding ensures that the test assesses a variety of skills and covers a variety of subject matter, regardless of test-takers' backgrounds.
Example: in the TOEFL reading section, some items may target comprehension of the main idea, stated details, unstated details, implied details, and vocabulary in context.
EXAMPLES OF TOEFL READING SECTION
(Brown, pp. 74-75)
DESIGNING ESLPT and GET

(B) The selection of items in the ESLPT entailed two processes: summary of reading and response to reading.
The main hurdles were (a) selecting appropriate passages for test-takers to
read, (b) providing appropriate prompts, and (c) processing data from pilot
testing.

(C) The GET prompts are designed by a faculty committee of examiners who are specialists in the field of university academic writing.
The assumption is that the topics are appealing and capable of yielding essays that require an organized logical argument and conclusion.
Examples of GET prompts (Brown, pp. 76-77)
4. Make appropriate evaluations of different kinds of items.
Terminology:
Item Facility (IF) – the % of people who give the right answer.
Item Discrimination (IDis) – the extent to which success on an item corresponds to success on the whole test.
Item Difficulty (ID) – how hard an item is, found from the % of people who get the item right in the try-out group.

Performing these item analyses may not be practical, especially if the classroom-based test is a one-time test.
But for a standardized multiple-choice test that is designed to be marketed commercially, administered a number of times, or administered in different forms, these statistics are a must.
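
To make the two statistics concrete, here is a minimal sketch in Python. It assumes each test-taker's answers are stored as a list of 1 (right) and 0 (wrong). Item facility is the share of test-takers who got the item right. For item discrimination the sketch uses one common approach, comparing the top-scoring third with the bottom-scoring third of the try-out group; the slides do not prescribe this formula, and the function names and sample data are made up for illustration.

```python
def item_facility(responses, item):
    """Proportion of test-takers who answered `item` correctly (0.0 to 1.0)."""
    return sum(person[item] for person in responses) / len(responses)


def item_discrimination(responses, item):
    """Upper-third facility minus lower-third facility for `item`.

    Values near 1.0 mean high scorers succeed where low scorers fail;
    values near 0.0 mean the item does not separate the two groups.
    """
    # Rank test-takers by total score, highest first.
    ranked = sorted(responses, key=sum, reverse=True)
    n = max(1, len(ranked) // 3)          # size of each comparison group
    upper, lower = ranked[:n], ranked[-n:]
    p_upper = sum(person[item] for person in upper) / n
    p_lower = sum(person[item] for person in lower) / n
    return p_upper - p_lower


# Made-up try-out data: 6 test-takers, 3 items (1 = right, 0 = wrong).
responses = [
    [1, 1, 1],
    [1, 1, 0],
    [1, 0, 1],
    [1, 0, 0],
    [0, 1, 0],
    [0, 0, 0],
]

for item in range(3):
    print(f"item {item}: IF = {item_facility(responses, item):.2f}, "
          f"IDis = {item_discrimination(responses, item):.2f}")
```

In this made-up data the first item separates the two groups completely (both top scorers get it right, both bottom scorers get it wrong), while the second item is answered correctly by half of the whole group.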
• There are different forms of evaluation for other types of response formats (e.g., production responses).
Practicality aspects must be accommodated in standardized items. They involve:
Clarity of directions,
Timing of the test,
Ease of administration,
Time required to score responses.
Information on these aspects was gathered through questionnaires and interviews (for open-ended questions in the ESLPT).
That information proved to be invaluable in the revision of prompts and stimulus reading passages in the ESLPT.
• Reliability – is the degree to which an assessment tool produces stable
and consistent results.
• Facility – to keep items from being harder than intended, standardized items must be checked for:
Unclear directions
Complex language
Obscure topics
Fuzzy data
Culturally biased information
5. Specify scoring procedures or formats
• A systematic assembly of test items in preselected arrangements and
sequences, all of which are validated to conform to an expected level of
difficulty, should yield a test that can then be scored accurately and
reported back to test-takers and institutions efficiently.
• TOEFL scores are calculated and reported as (a) scores for the three sections of the TOEFL and (b) a total score; (c) a separate score for the essay is also provided.
• ESLPT reports a score for each of the essay sections, but the rating scale
differs between them because in one case the objective is to write a
summary, and in the other to write a response.
• The GET has its own scoring guide. Each GET essay is read by two trained readers, who each give a score between 1 and 4. The two readers' scores are added to give a total score between 2 and 8. Test administrators recommend a score of 6 as the minimum.
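
The GET arithmetic above can be sketched in a few lines of Python. The function and constant names are hypothetical, and the sketch assumes readers give whole-number scores; the slides state only the 1 to 4 range, the 2 to 8 total, and the recommended minimum of 6.

```python
RECOMMENDED_MINIMUM = 6  # minimum total recommended by test administrators


def get_total(reader_a, reader_b):
    """Add two readers' scores (each 1 to 4) into a total of 2 to 8."""
    for score in (reader_a, reader_b):
        if score not in (1, 2, 3, 4):   # assumes whole-number scores
            raise ValueError(f"reader score must be 1-4, got {score}")
    return reader_a + reader_b


def meets_recommended_minimum(reader_a, reader_b):
    """True when the combined score reaches the recommended minimum."""
    return get_total(reader_a, reader_b) >= RECOMMENDED_MINIMUM


print(get_total(3, 3), meets_recommended_minimum(3, 3))
print(get_total(2, 3), meets_recommended_minimum(2, 3))
```

Two scores of 3 reach the recommended minimum, while scores of 2 and 3 fall one point short.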
Graduate Essay Test: Scoring Guide (Brown, p. 81)
6. Perform ongoing construct validation studies
No standardized instrument is expected to be used repeatedly without a rigorous program of ongoing construct validation.
A. The TOEFL program has an impressive program of research. An early example of such a study was the seminal Duran et al. (1985) study, TOEFL from a Communicative Viewpoint on Language Proficiency, which examined the content characteristics of the TOEFL from a communicative perspective based on current research in applied linguistics and language proficiency assessment.
B. The development of the new ESLPT involved a lengthy process of both content and construct validation, along with practical issues such as scoring the written sections.
C. The GET uses a holistic scoring rubric. Its administrative conditions are to some extent patterned on university-level academic writing tests.
• Any standardized test, once developed, must be accompanied by systematic
corroboration of its effectiveness and by steps towards its improvement.
STANDARDIZED LANGUAGE PROFICIENCY TESTING
• Swain (1990) offered a multidimensional view of proficiency
assessment by referring to three linguistic traits (grammar, discourse,
and sociolinguistics) that can be assessed by means of oral, multiple-
choice, and written responses.
• Another definition and conceptualization of proficiency is suggested by the ACTFL association.
• ACTFL takes a holistic and more unitary view of proficiency in
describing four levels: superior, advanced, intermediate, and novice.
Traits of second language proficiency (Swain, 1990, p. 403)
ACTFL speaking guidelines, summary, superior-level (Brown, p. 82)
Four Standardized Language Proficiency Tests

Commercially produced standardized tests of English language proficiency:
1. TOEFL – Test of English as a Foreign Language
2. MELAB – Michigan English Language Assessment Battery
3. IELTS – International English Language Testing System
4. TOEIC – Test of English for International Communication
CONCLUSION

• The construction of a valid standardized test is no minor accomplishment.
• The designing of specifications requires a sophisticated process of construct
validation coupled with considerations of practicality.
• The construction of items and scoring/interpretation procedures may require
a lengthy period of trial and error with prototypes of the final form of the
test.
• With attention to all the details of construction, the end product can be a cost-effective, time-saving, accurate instrument.
