People do not have equal talents, but all individuals should have an equal opportunity to develop their talents.
A standardized test should not be the ONLY method for evaluating a student's learning. Nor should standardized tests by themselves be considered sufficient information for holding schools accountable for students' learning (Popham, 2005; Taylor & Nolen, 2005).
Important criteria for evaluating standardized tests are:
a. Norms. To understand an individual student's performance on a test, it needs to be compared with the performance of the norm group (a group of similar individuals who were previously given the test by the test maker). The test is said to be based on national norms when the norm group consists of a nationally representative group of students.
The norm group should include students from urban, suburban, and rural areas; different geographical regions; private and public schools; boys and girls; and different ethnic groups. Based on an individual student's score on the standardized test, the teacher can determine whether the student is performing above, at, or below a national norm (Freeland, 2005; Gregory, 2007). The evaluation of a student's test performance might differ, depending on which norm group is used.
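The comparison with a norm group can be sketched as a percentile rank. The example below is only an illustration: the function name `percentile_rank`, the two norm groups, and every score in them are invented for this sketch and do not come from any real test. It shows how the same raw score can look strong against one norm group and weak against another.

```python
from bisect import bisect_left, bisect_right

def percentile_rank(score, norm_scores):
    """Percent of the norm group scoring below `score`, counting ties as half."""
    ordered = sorted(norm_scores)
    below = bisect_left(ordered, score)           # norm scores strictly lower
    ties = bisect_right(ordered, score) - below   # norm scores equal to `score`
    return 100.0 * (below + 0.5 * ties) / len(ordered)

# Hypothetical norm-group scores, invented for illustration only.
national_norm = [40, 45, 50, 50, 55, 60, 65, 70, 75, 80]
selective_norm = [60, 65, 70, 75, 80, 85, 90, 95, 95, 100]

student_score = 70
print(percentile_rank(student_score, national_norm))   # 75.0
print(percentile_rank(student_score, selective_norm))  # 25.0
```

With these made-up numbers, a score of 70 is above most of the first norm group but below most of the second, which is why the choice of norm group matters when a score is interpreted.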
iii. Construct validity is the extent to which there is evidence that a test measures a particular construct. A construct is an unobservable trait or characteristic of a person, such as intelligence, learning style, personality, or anxiety.
c. Reliability is the extent to which a test produces a consistent, reproducible score. To be called reliable, scores must be stable, dependable, and relatively free from errors of measurement (Gronlund, 2006; Popham, 2006).
Reliability can be measured in several ways:
i. Test-retest reliability: the extent to which a test yields the same performance when a student is given the same test on two occasions.
ii. Alternate-forms reliability: judged by giving different forms of the same test on two different occasions to the same group of students to determine how consistent their scores are.
iii. Split-half reliability: judged by dividing the test items into two halves, such as the odd-numbered and even-numbered items. The scores on the two sets of items are compared to determine how consistently the students performed across each set.
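One common way to put a number on this kind of consistency is a correlation coefficient between the two sets of scores. The sketch below assumes Pearson's correlation for that purpose. All names and scores are hypothetical and invented for illustration. The Spearman-Brown adjustment in `split_half_reliability` is a standard addition that these notes do not mention. It is included because each half is only half as long as the full test.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)

def split_half_reliability(item_scores):
    """item_scores: one row per student, one 0/1 entry per test item."""
    odd_totals = [sum(row[0::2]) for row in item_scores]   # items 1, 3, 5, ...
    even_totals = [sum(row[1::2]) for row in item_scores]  # items 2, 4, 6, ...
    r_half = pearson(odd_totals, even_totals)
    # Spearman-Brown adjustment (an assumption of this sketch, not from the notes).
    r_full = 2 * r_half / (1 + r_half)
    return r_half, r_full

# Hypothetical data, invented for illustration only.
first_sitting = [10, 12, 15, 18, 20]    # same five students, first occasion
second_sitting = [11, 12, 16, 17, 21]   # same test, second occasion
test_retest_r = pearson(first_sitting, second_sitting)

answers = [
    [1, 1, 1, 1, 1, 1],
    [1, 1, 1, 0, 1, 1],
    [1, 0, 1, 1, 0, 0],
    [0, 1, 0, 0, 1, 0],
    [0, 0, 0, 0, 0, 1],
]
r_half, r_full = split_half_reliability(answers)
print(round(test_retest_r, 2), round(r_half, 2), round(r_full, 2))
```

A coefficient close to 1 indicates that students kept roughly the same standing across the two occasions or the two halves. Alternate-forms reliability can be estimated the same way by passing the scores from the two forms to `pearson`.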
Validity and reliability are related (Gregory, 2007). A test that is valid is reliable, but a test that is reliable is not necessarily valid. People can respond consistently on a test, but the test might not be measuring what it purports to measure. (You have three darts to throw. If all three fall close together, you have reliability. However, you have validity only if all three hit the bull's-eye.)
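The dart analogy can be made concrete with a small sketch. It treats the spread of the throws around their own center as a stand-in for reliability and the distance of that center from the bull's-eye as a stand-in for validity. The function name, the coordinates, and the idea of measuring both as distances are assumptions made for illustration only.

```python
from math import dist

def spread_and_offset(throws, bullseye=(0.0, 0.0)):
    """Return (average distance from the throws' own center, distance of that center from the bull's-eye)."""
    n = len(throws)
    center = (sum(x for x, _ in throws) / n, sum(y for _, y in throws) / n)
    spread = sum(dist(t, center) for t in throws) / n   # small spread: consistent throws
    offset = dist(center, bullseye)                     # small offset: throws are on target
    return spread, offset

# Hypothetical dart positions, invented for illustration only.
clustered_off_target = [(5.0, 5.0), (5.2, 4.9), (4.9, 5.1)]
clustered_on_target = [(0.1, 0.0), (-0.1, 0.1), (0.0, -0.1)]

print(spread_and_offset(clustered_off_target))  # small spread, large offset
print(spread_and_offset(clustered_on_target))   # small spread, small offset
```

Both sets of throws are tightly grouped, so both are consistent in the sense of the analogy, but only the second set lands on the bull's-eye.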
d. Fairness and Bias. Fair tests are unbiased and nondiscriminatory (McMillan, 2004). They are not influenced by factors such as gender or ethnicity, or by subjective factors such as the bias of a scorer. When tests are fair, students have the opportunity to demonstrate their learning, and their performance is not affected by their gender, ethnicity, disability, or other factors unrelated to the purpose of the test. An unfair test is a test that puts a particular group of students at a disadvantage (Popham, 2006; Reynolds, Livingston & Willson, 2006).
For instance, suppose a test that is supposed to assess writing skills asks students to write a short story about a boy who practices very hard to be good at football and makes the team. This type of item is likely to be easier for boys than for girls, because boys are generally more familiar with football, so the test will be unfair to girls as an assessment of their writing skills.
Discussion:
1. What is meant by a standardized test? What are the uses of standardized tests?
2. What do norms, validity, reliability, and fairness have to do with judging the quality of a standardized test?
3. Can a test be valid but not reliable? Reliable but not valid? Explain in your own words.
Discussion:
1. How clearly do aptitude and achievement tests differ in purpose? In form?
2. What are survey batteries, specific subject tests, and diagnostic tests?
3. What are some possible advantages of high-stakes state standards-based testing, and what are some ways the results are being used? What are some criticisms of high-stakes state standards-based testing?
4. What is the argument for national standardized testing? Why is it resisted?
5. How can standardized tests of teachers be characterized?