
Measurement and Scaling

Farzin Madjidi, Ed.D. Pepperdine University Graduate School of Education and Psychology

Variables

Independent: Precedes, influences, or predicts the results

Dependent: Affected by or predicted by the independent variable

Extraneous: Affects the dependent variable but is not controlled or measured; causes error

Variables

Confounding: An extraneous variable that varies systematically (has a relationship) with the I.V.

Intervening: An unobservable trait that influences behavior (e.g., the effect of a new intervention on self-esteem may be affected by the motivation level of subjects)

Variables

Control: Used to eliminate the effect of extraneous variables; a.k.a. measured or assigned variables

Organismic: Characteristics of the subjects that cannot be manipulated

Levels of Measurement

Four levels of measurement:

Nominal: Measures categories
Ordinal: Categories + rank and order
Interval: Equal distance between any two consecutive measures
Ratio: Intervals + a meaningful zero
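The interval/ratio distinction can be illustrated with temperature, a standard example not in the slides: Celsius has an arbitrary zero (interval level), while Kelvin has a true zero (ratio level), so only Kelvin supports meaningful ratios.

```python
# Illustrative sketch (temperature example, not from the slides):
# ratios are only meaningful on a scale with a true zero.
c1, c2 = 10.0, 20.0                # Celsius readings (interval level)
k1, k2 = c1 + 273.15, c2 + 273.15  # same readings in Kelvin (ratio level)

print(c2 / c1)  # 2.0 -- but 20 C is not "twice as hot" as 10 C
print(k2 / k1)  # ~1.035 -- the physically meaningful ratio
```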

Categories of Scales

Categorical (ratings): Score without comparison, e.g., 1-to-5 scales

Comparative (ranking): Score by comparing, e.g., "smartest"

Preference: Subjective, e.g., which do you prefer
Non-preference: Objective, e.g., which solution is less costly

Categories of Scales

Unidimensional: Involves only one aspect of the measurement; measurement by one construct

Multi-dimensional: Involves several aspects of a measurement; uses several dimensions to measure a single construct

Types of Scales
Likert/Summated Rating Scales
Semantic Differential Scales
Magnitude Scaling
Thurstone Scales
Guttman Scales

Likert Scales
A very popular rating scale
Measures the feelings/degree of agreement of the respondents
Ideally, 4 to 7 points

Examples of 5-point surveys:

Agreement:    SD | D | ND/NA   | A | SA
Satisfaction: SD | D | ND/NS   | S | SS
Quality:      VP | P | Average | G | VG

Summative Ratings
A number of items collectively measure one construct (Job Satisfaction)

A number of items collectively measure a dimension of a construct, and a collection of dimensions will measure the construct (Self-esteem)

Summative Likert Scales


Must contain multiple items
Each individual item must measure something that has an underlying, quantitative measurement continuum
There can be no right/wrong answers, as opposed to multiple-choice questions
Items must be statements to which the respondent assigns a rating
Cannot be used to measure knowledge or ability, but can measure familiarity
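The summated scoring described above can be sketched as follows; the items, ratings, and reversed-item index are invented for illustration. Negatively phrased items are reverse-scored before summing.

```python
# Sketch of summated Likert scoring on a 5-point scale
# (1 = Strongly Disagree ... 5 = Strongly Agree).
# The construct score is the sum of the item ratings, with
# negatively phrased items reverse-scored first.

def reverse_score(rating, low=1, high=5):
    """Flip a rating on a low..high scale: 1 <-> 5, 2 <-> 4, etc."""
    return low + high - rating

def summated_score(ratings, reversed_items=()):
    """Sum item ratings, reverse-scoring the items listed by index."""
    total = 0
    for i, r in enumerate(ratings):
        total += reverse_score(r) if i in reversed_items else r
    return total

# One respondent's answers to a four-item scale (invented data);
# item 2 is negatively phrased (e.g., "I hate my job"), so it is reversed.
answers = [4, 5, 2, 4]
score = summated_score(answers, reversed_items={2})
print(score)  # 4 + 5 + (1+5-2) + 4 = 17
```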

Semantic Differential Scales


Uses a set of scales anchored at their extremes by words of opposite meaning. Example:

Dark ___ ___ ___ ___ ___ Light Short ___ ___ ___ ___ ___ Tall Evil ___ ___ ___ ___ ___ Good

Four to seven categories are ideal



Magnitude Scaling

Attempts to measure constructs along a numerical, ratio-level scale

The respondent is given an item with a preassigned numerical value attached to it, to establish a norm
The respondent is asked to rate other items with numerical values as a proportion of the norm
Very powerful if reliability is established
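The norm-and-proportion procedure above can be sketched as follows; the item names and judgment values are invented for illustration.

```python
# Sketch of magnitude scaling (item names and values invented).
# A reference item carries a preassigned value; every other judgment
# is interpreted as a proportion of that norm, giving ratio-level data.
NORM_VALUE = 100  # preassigned value of the reference item

# Raw magnitude judgments from one respondent
judgments = {"burglary": 150, "shoplifting": 40, "jaywalking": 5}

# Each judgment expressed as a multiple of the norm
ratios = {item: value / NORM_VALUE for item, value in judgments.items()}
print(ratios["burglary"])  # 1.5 -- judged half again as serious as the norm
```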


Thurstone Scales

Items are formed
A panel of experts assigns values from 1 to 11 to each item
Mean or median scores are calculated for each item
Statements evenly spread across the scale are selected


Thurstone Scales

Example:
Please check the item that best describes your level of willingness to try new tasks
I seldom feel willing to take on new tasks (1.7)
I will occasionally try new tasks (3.6)
I look forward to new tasks (6.9)
I am excited to try new tasks (9.8)
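The construction and scoring steps can be sketched as follows; the judge ratings are invented, while the four scale values come from the example above.

```python
# Sketch of Thurstone scaling (judge ratings invented for illustration).
from statistics import mean

# Panel step: each expert places a candidate statement on the 1-11
# scale; the item's scale value is the mean of the judges' ratings.
judge_ratings = [3, 4, 3, 4, 4]
item_value = mean(judge_ratings)  # 3.6

# Scoring step: a respondent's score is the scale value of the
# endorsed statement (or the mean if several are checked).
scale_values = {
    "seldom willing": 1.7,
    "occasionally try": 3.6,
    "look forward": 6.9,
    "excited to try": 9.8,
}
endorsed = ["look forward", "excited to try"]
respondent_score = mean(scale_values[s] for s in endorsed)
print(round(respondent_score, 2))  # (6.9 + 9.8) / 2 = 8.35
```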


Guttman Scales

Also known as scalograms
Both the respondents and items are ranked
Cutting points are determined (Goodenough-Edwards technique)
Coefficient of Reproducibility (CRep): a measure of goodness of fit between the observed and predicted ideal response patterns
Keep items with a CRep of 0.90 or higher
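The reproducibility check can be sketched as follows, assuming the Goodenough-Edwards convention that a respondent with total score s is predicted to endorse the s easiest items; the response patterns are invented for illustration.

```python
# Sketch of the coefficient of reproducibility for a Guttman scale:
# 1 minus the proportion of responses that deviate from the pattern
# predicted by each respondent's total score.

def reproducibility(patterns):
    """patterns: list of 0/1 response lists, items ordered easiest first."""
    errors = 0
    total = 0
    for obs in patterns:
        s = sum(obs)
        predicted = [1] * s + [0] * (len(obs) - s)
        errors += sum(o != p for o, p in zip(obs, predicted))
        total += len(obs)
    return 1 - errors / total

responses = [
    [1, 1, 1, 0],  # perfect cumulative pattern: no errors
    [1, 1, 0, 0],  # perfect
    [1, 0, 1, 0],  # score 2 predicts [1,1,0,0]: two mismatches
]
print(round(reproducibility(responses), 3))  # 0.833 -- below the 0.90 cutoff
```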


Scale Construction

Define Constructs

Conceptual/theoretical basis from the literature
Are there sub-scales (dimensions) to the scale?
Multiple-item sub-scales
Principle of parsimony: the simplest explanation among a number of equally valid explanations must be used

Item Construction

Agreement items

Write declarative statements:
Death penalty should be abolished
I like to listen to classical music

Frequency items (how often)

I like to read

Evaluation items

How well did your team play?
How well do the police serve your community?

Item Writing

Mutually exclusive and collectively exhaustive items
Use positively and negatively phrased questions
Avoid colloquialisms, expressions, and jargon
Avoid the use of negatives to reverse the wording of an item

Don't use: "I am not satisfied with my job"
Use: "I hate my job!"

Be brief, focused, and clear
Use simple, unbiased questions

Sources of Error

Social desirability: Giving politically correct answers

Response sets: All yes, or all no, responses

Acquiescence: Telling you what you want to hear

Personal bias: Wants to send a message

Sources of Error

Response order

Recency: the respondent stops reading once s/he gets to the response s/he likes
Primacy: the initial choices are remembered better
Fatigue

Item order

Answers to later items may be affected by earlier items (put simple, factual items first)
The respondent may not know how to answer earlier questions

Assessing Instruments

Three issues to consider:

Validity: Does the instrument measure what it's supposed to measure?
Reliability: Does it consistently repeat the same measurement?
Practicality: Is this a practical instrument?

Types of Validity

Face validity: Does the instrument, on its face, appear to measure what it is supposed to measure?

Content validity: Degree to which the content of the items adequately represents the universe of all relevant items under study; generally arrived at through a panel of experts

Types of Validity

Criterion-related

Degree to which the predictor is adequate in capturing the relevant aspects of the criterion
Uses correlation analysis

Concurrent validity: criterion data are available at the same time as the predictor score; requires high correlation between the two

Predictive validity: the criterion is measured after the passage of time

Known-groups: a retrospective look at the validity of the measurement

Types of Validity

Construct Validity

Measures what accounts for the variance
Attempts to identify the underlying constructs

Techniques used:
Correlation of the proposed test with other existing tests
Factor analysis
Multitrait-multimethod analysis

Convergent validity: calls for high correlation between the different measures of the same construct
Discriminant validity: calls for low correlation between sub-scales within a construct

Types of Reliability

Stability

Test-retest: the same test is administered twice to the same subjects over a short interval (3 weeks to 6 months)
Look for a high correlation between the test and the retest
Situational factors must be minimized
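The test-retest check reduces to a Pearson correlation between the two administrations; the scores below are invented for illustration.

```python
# Sketch of test-retest reliability: the same test is administered
# twice to the same subjects and the two score sets are correlated.
import math

def pearson_r(x, y):
    """Pearson product-moment correlation between two score lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

test   = [70, 82, 65, 90, 75]  # first administration (invented)
retest = [72, 80, 66, 88, 77]  # second administration (invented)
print(round(pearson_r(test, retest), 3))  # close to 1: scores are stable
```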


Types of Reliability

Equivalence

Degree to which alternative forms of the same measure produce the same or similar results
Give parallel forms of the same test to the same group, with a short delay to avoid fatigue
Look for a high correlation between the scores of the two forms of the test
Inter-rater reliability
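The slides list inter-rater reliability without detail; one common statistic for two raters assigning categorical codes is Cohen's kappa (an addition here, not named in the slides), sketched with invented ratings.

```python
# Sketch of Cohen's kappa for two raters (ratings invented).
from collections import Counter

def cohens_kappa(rater_a, rater_b):
    """Agreement between two raters, corrected for chance agreement."""
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    # Chance agreement: probability both raters pick the same category
    # if each chose independently at their observed base rates
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / n ** 2
    return (observed - expected) / (1 - expected)

a = ["yes", "yes", "no", "yes", "no", "no"]
b = ["yes", "no", "no", "yes", "no", "yes"]
print(round(cohens_kappa(a, b), 3))  # 0.333 -- only modest agreement
```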

Types of Reliability

Internal Consistency

Degree to which instrument items are homogeneous and reflect the same underlying construct
Split-half testing, where the test is split into two halves that contain the same types of questions
Uses Cronbach's alpha to determine internal consistency; only one administration of the test is required
Kuder-Richardson (KR-20) for items with right and wrong answers
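Cronbach's alpha from a single administration can be sketched as follows (response data invented); with 0/1 right/wrong items the same computation gives KR-20.

```python
# Sketch of Cronbach's alpha: compares the sum of item-level variances
# to the variance of the total scores.
from statistics import pvariance

def cronbach_alpha(item_responses):
    """item_responses: one list of respondent scores per item."""
    k = len(item_responses)
    totals = [sum(resp) for resp in zip(*item_responses)]  # per-respondent totals
    item_var = sum(pvariance(item) for item in item_responses)
    return k / (k - 1) * (1 - item_var / pvariance(totals))

# Three 5-point items answered by four respondents (invented data)
item_responses = [
    [4, 5, 3, 4],
    [4, 4, 2, 5],
    [5, 5, 3, 4],
]
print(round(cronbach_alpha(item_responses), 3))  # 0.852 -- acceptably high
```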

Practicality

Is the survey economical?

Cost of producing and administering the survey
Time requirement
Common sense!

Convenience

Adequacy of instructions
Easy to administer

Can the measurement be interpreted by others?

Scoring keys
Evidence of validity and reliability
Established norms
