
TEST CONSTRUCTION

SCALING

Measurement: the assignment of numbers according to rules.

Scaling
- A process of setting rules for assigning numbers in measurement.
- A process by which a measuring device is designed and calibrated and by
which numbers (or other indices), called scale values, are assigned to different
amounts of the trait, attribute, or characteristic being measured.

L.L. Thurstone
- Credited for being at the forefront of efforts to develop methodologically
sound scaling methods.
- Adapted psychophysical scaling methods to the study of psychological
variables such as attitudes and values.
- His article "A Method of Scaling Psychological and Educational Tests"
introduced the notion of absolute scaling, a procedure for obtaining a
measure of item difficulty across samples of test takers who vary in ability.
A. Types of Scales:
1. Age-based
2. Grade-based
3. Stanine: all raw scores on the test are transformed into scores
that can range from 1 to 9.
4. Unidimensional: only one dimension is presumed to underlie the
ratings.
5. Multidimensional: more than one dimension is thought to guide the
test taker's responses.
6. Comparative
7. Categorical
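The stanine transformation listed above can be sketched in code. This is a minimal illustration, assuming the conventional stanine bands (4%, 7%, 12%, 17%, 20%, 17%, 12%, 7%, 4% of cases) and a percentile rank that counts half of any tied scores; the names `STANINE_CUTOFFS`, `percentile_rank`, and `to_stanine` are hypothetical and do not come from the notes.

```python
from bisect import bisect_right

# Assumed cumulative percentage cutoffs separating stanines 1..9
# (bands of 4, 7, 12, 17, 20, 17, 12, 7, and 4 percent of cases).
STANINE_CUTOFFS = [4, 11, 23, 40, 60, 77, 89, 96]


def percentile_rank(score, all_scores):
    """Percent of scores below `score`, counting half of the ties."""
    below = sum(s < score for s in all_scores)
    ties = sum(s == score for s in all_scores)
    return 100.0 * (below + 0.5 * ties) / len(all_scores)


def to_stanine(score, all_scores):
    """Map a raw score to a stanine (1 to 9) within its reference group."""
    rank = percentile_rank(score, all_scores)
    # Count how many cutoffs the rank has reached, then shift to 1..9.
    return bisect_right(STANINE_CUTOFFS, rank) + 1


if __name__ == "__main__":
    raw_scores = list(range(1, 101))  # hypothetical group: scores 1..100
    for raw in (1, 50, 100):
        print(raw, to_stanine(raw, raw_scores))
```

Whatever the raw score range of the test, the output always falls between 1 and 9, which is the property the notes describe.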
B. Scaling Methods
1. Rating Scale
- A grouping of words, statements, or symbols on which judgments of
the strength of a particular trait, attitude, or emotion are indicated
by the test taker.
- Can be used to record judgments of oneself, others, experiences, or
objects, and they can take several forms.
2. Summative Scale: the final test score is obtained by summing the
ratings across all the items.
3. Likert Scale
- A type of summative rating scale.
- Each item presents the test taker with 5 alternative responses
(sometimes 7), usually to scale attitudes.
4. Method of Paired Comparison
- Test takers are presented with pairs of stimuli (2 photos, 2 objects, 2
statements), which they are asked to compare.
- Produces ordinal data.
5. Comparative Scaling
- A method of sorting.
- Entails judgments of a stimulus in comparison with every other
stimulus on the scale.
6. Categorical Scaling: stimuli are placed into one of two or more
alternative categories that differ quantitatively with respect to some
continuum.
7. Guttman Scale: items on it range sequentially from weaker to stronger
expressions of the attitude, belief, or feeling being measured.
8. Method of Equal-Appearing Intervals: used to obtain data that are
presumed to be interval in nature.
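
Two of the scaling methods above lend themselves to a short sketch: summative (Likert-type) scoring and the method of paired comparison. The code below is an illustration only. The function names, the `choose` callback standing in for the test taker's judgment, and the sample values are all assumptions, not part of the notes.

```python
from collections import Counter
from itertools import combinations


def summative_score(ratings):
    """Summative (Likert-type) score: sum the ratings across all items."""
    return sum(ratings)


def paired_comparison_ranking(stimuli, choose):
    """Present every pair of stimuli once and rank by times chosen.

    `choose(a, b)` is a hypothetical stand-in for the test taker's
    judgment and must return either `a` or `b`.
    The result is an ordering only, that is, ordinal data.
    """
    wins = Counter({s: 0 for s in stimuli})
    for a, b in combinations(stimuli, 2):
        wins[choose(a, b)] += 1
    return sorted(stimuli, key=lambda s: wins[s], reverse=True)


if __name__ == "__main__":
    # Five items rated on a 5-point scale (made-up responses).
    print(summative_score([4, 5, 3, 2, 4]))

    # A made-up test taker who values honesty over loyalty over wealth.
    preference = {"honesty": 3, "loyalty": 2, "wealth": 1}
    pick = lambda a, b: a if preference[a] > preference[b] else b
    print(paired_comparison_ranking(["wealth", "honesty", "loyalty"], pick))
```

The ranking says which stimulus is preferred over which, but nothing about how large the gaps between them are, which is why the method is described as producing ordinal data.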

WRITING ITEMS
o What range of content should the items cover?
o Which of the many different types of item format should be employed?
o How many items should be written in total and for each content area
covered?

Item Pool: the reservoir or well from which items will or will not be drawn for
the final version of the test.
A. Item Format: variables such as the form, plan, structure, arrangement,
and layout of individual test items.
1. Selected-Response Format: requires test takers to select a response
from a set of alternative responses.
a. Multiple-Choice Format
Stem
Correct alternative/option
Several incorrect alternatives/options (distractors/foils)
b. Matching Item
Premises: left column
Responses: right column
c. Binary-Choice Item
True or False
Agree or Disagree
Yes or No
Right or Wrong
Fact or Opinion
2. Constructed-Response Format
a. Completion Item: requires the examinee to provide a word or
phrase that completes a sentence.
b. Short Answer
c. Essay Item: requires the test taker to respond to a question by
writing a composition, typically one that demonstrates recall of
facts, understanding, analysis, and/or interpretation.

B. Writing Items for Computer Administration


Item Bank: a relatively large and easily accessible collection of test
questions.
Item Branching
- A technique for individualizing testing.
- The ability of the computer to tailor the content and order of
presentation of test items on the basis of responses to previous
items.
Computerized Adaptive Testing (CAT): an interactive, computer-administered
test-taking process wherein items presented to the test
taker are based in part on the test taker's performance on previous
items.
o Floor Effect: the diminished utility of an assessment tool for
distinguishing test takers at the low end of the ability, trait, or
other attribute being measured.
o Ceiling Effect: the diminished utility of an assessment tool for
distinguishing test takers at the high end of the ability, trait, or
other attribute being measured.
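
Item branching can be illustrated with a deliberately simplified rule: after a correct response, move to a harder item; after an incorrect one, move to an easier item. The sketch below assumes an item bank already ordered from easiest to hardest and uses a hypothetical `answer_is_correct` callback in place of a real test taker. Operational CAT systems typically select items with more elaborate statistical models, so this shows the idea rather than any particular program's method.

```python
def run_branching_test(item_bank, answer_is_correct, n_items=5):
    """Administer up to `n_items`, branching on each response.

    item_bank: items ordered from easiest to hardest (an assumption).
    answer_is_correct: hypothetical callable(item) -> bool standing in
        for the test taker's response to that item.
    Returns the list of (item, was_correct) pairs in presentation order.
    """
    low, high = 0, len(item_bank) - 1
    administered = []
    while low <= high and len(administered) < n_items:
        mid = (low + high) // 2
        item = item_bank[mid]
        correct = answer_is_correct(item)
        administered.append((item, correct))
        if correct:
            low = mid + 1   # branch to harder items
        else:
            high = mid - 1  # branch to easier items
    return administered


if __name__ == "__main__":
    bank = list(range(1, 11))  # made-up items labeled by difficulty 1..10
    # Made-up test taker who can answer items up to difficulty 7.
    for item, correct in run_branching_test(bank, lambda d: d <= 7):
        print(item, correct)
```

Because each new item is chosen near the level suggested by earlier responses, the test taker spends less time on items that are far too easy or far too hard, which is one way adaptive testing can lessen floor and ceiling effects.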
SCORING ITEMS
1. Cumulative Model: the higher the score on the test, the higher the test
taker is on the ability, trait, or other characteristic that the test
purports to measure.
2. Class or Category Scoring: a model in which, for example, individuals
must exhibit a certain number of symptoms to qualify for a specific diagnosis.
3. Ipsative Scoring: comparing a test taker's score on one scale within a
test to another scale within that same test.
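
The three scoring models can be contrasted in a few lines of code. This is a minimal sketch under stated assumptions: the function names, the symptom threshold, and the scale names are invented for illustration and are not taken from any actual test.

```python
def cumulative_score(item_scores):
    """Cumulative model: a higher total means more of the measured trait."""
    return sum(item_scores)


def meets_category(symptoms_present, required_count):
    """Class or category scoring: qualify only if enough symptoms are present.

    `symptoms_present` is a list of booleans; `required_count` is a
    hypothetical cutoff chosen for the example.
    """
    return sum(symptoms_present) >= required_count


def ipsative_comparison(scale_scores):
    """Ipsative scoring: order one test taker's own scales, strongest first.

    The comparison is within the person, not against other test takers.
    """
    return sorted(scale_scores, key=scale_scores.get, reverse=True)


if __name__ == "__main__":
    print(cumulative_score([1, 0, 1, 1]))
    print(meets_category([True, False, True, True, False], required_count=3))
    print(ipsative_comparison({"achievement": 12, "affiliation": 7, "autonomy": 9}))
```

The ipsative result only says that this test taker's achievement score is higher than their own autonomy and affiliation scores; it does not say how they compare with anyone else.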
