RELIABILITY
• Reliability is a proportion-of-variance measure (a squared quantity)
• Defined as the proportion of observed score (x) variance due to true score (τ) variance:
  ρxx' = σ²τ / σ²x
VENN DIAGRAM REPRESENTATION
[Venn diagram: Var(x) as the whole; the Var(τ) portion of Var(x) is the reliability, the remainder is Var(e)]
PARALLEL FORMS OF TESTS
• If two items x1 and x2 are parallel, they have
  – equal true scores: τ1 = τ2
  – equal true score variance: Var(τ1) = Var(τ2)
  – equal error variance: Var(e1) = Var(e2)
  – uncorrelated errors: ρ(e1, e2) = 0
Reliability: 2 parallel forms
• x1 = τ + e1, x2 = τ + e2
• ρ(x1, x2) = reliability = ρxx' = correlation between parallel forms
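A quick simulation illustrates this identity: two parallel forms built from the same true score correlate at approximately σ²τ / σ²x. This is a minimal sketch, assuming unit true-score and error variances (my numbers, not the slides'):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# True scores and two independent error terms with equal variance (parallel forms)
var_tau, var_e = 1.0, 1.0
tau = rng.normal(0.0, np.sqrt(var_tau), n)
x1 = tau + rng.normal(0.0, np.sqrt(var_e), n)
x2 = tau + rng.normal(0.0, np.sqrt(var_e), n)

# Theoretical reliability: proportion of observed variance due to true scores
rho_xx = var_tau / (var_tau + var_e)   # 0.5 here

# Empirical correlation between the parallel forms approximates rho_xx
r = np.corrcoef(x1, x2)[0, 1]
print(rho_xx, round(r, 3))
```

With 100,000 simulees the sample correlation lands within a couple hundredths of the theoretical value 0.5.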
Reliability: parallel forms
[Path diagram: true score τ with equal loadings λx to x1 and x2; an error term e on each]
Spearman-Brown at half length (n = .5) applied to a full-length reliability of .95:
ρxx' = .5(.95)/[1 + (.5 − 1)(.95)]
     = .905
Thus, a short form with a random sample of half the items will produce a test with adequate score reliability.
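The half-length figure above follows from the Spearman-Brown prophecy formula. A minimal sketch (the function name is mine):

```python
def spearman_brown(rho: float, n: float) -> float:
    """Spearman-Brown prophecy: reliability of a test whose length is
    changed by a factor of n, given current reliability rho."""
    return n * rho / (1 + (n - 1) * rho)

# Half-length form of a test with reliability .95, as on the slide:
print(round(spearman_brown(0.95, 0.5), 3))  # 0.905

# The same formula also projects lengthening, e.g. doubling a .5-reliable test:
print(round(spearman_brown(0.5, 2), 3))  # 0.667
```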
Reliability: KR-20 for parallel or tau-equivalent items/scores
• Items are scored 0 or 1 (dichotomous scoring)
• Kuder and Richardson (1937): special cases of Cronbach's more general equation for parallel tests
• KR-20 = [k/(k−1)] [1 − Σpᵢqᵢ / σ²y],
  where pᵢ = proportion of respondents obtaining a score of 1 and qᵢ = 1 − pᵢ
• pᵢ is the item difficulty
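The formula can be computed directly from a 0/1 response matrix. A sketch with made-up data (six respondents, three items; not from the slides):

```python
def kr20(data):
    """KR-20 for dichotomous (0/1) item scores.
    data: list of rows, one per respondent, one 0/1 entry per item."""
    n = len(data)
    k = len(data[0])
    # Item difficulties p_i and Bernoulli item variances p_i * q_i
    p = [sum(row[i] for row in data) / n for i in range(k)]
    sum_pq = sum(pi * (1 - pi) for pi in p)
    # Total-score variance (population form, matching p*q above)
    totals = [sum(row) for row in data]
    mean_t = sum(totals) / n
    var_y = sum((t - mean_t) ** 2 for t in totals) / n
    return (k / (k - 1)) * (1 - sum_pq / var_y)

data = [[1, 1, 0], [1, 0, 0], [1, 1, 1],
        [0, 0, 0], [1, 1, 1], [0, 1, 0]]
print(round(kr20(data), 4))  # 0.6818
```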
Reliability: KR-21 for parallel forms assumption
• Items are scored 0 or 1 (dichotomous scoring)
• Kuder and Richardson (1937)
• KR-21 = [k/(k−1)] [1 − k·p̄·q̄ / σ²y],
  where p̄ is the mean item difficulty and q̄ = 1 − p̄
• KR-21 assumes that all items have the same difficulty (parallel forms);
  the item mean gives the best estimate of the population values
• KR-21 ≤ KR-20
Reliability: congeneric scores
• If two items x1 and x2 are congeneric,
  1. τ1 ≠ τ2
  2. unequal true score variance: Var(τ1) ≠ Var(τ2)
  3. unequal error variance: Var(e1) ≠ Var(e2)
  4. errors e1 and e2 are uncorrelated: ρ(e1, e2) = 0
Reliability: congeneric scores
• x1 = τ1 + e1, x2 = τ2 + e2
[Path diagram: τ1 → x1 with loading λx1, τ2 → x2 with loading λx2, factor correlation ρ12; errors e1 and e2]
• α ≤ 1 − Σσ²(eₖ) / σ²C = ρxx' (alpha is a lower bound for congeneric scores)
• α̂ = [k/(k−1)][1 − Σs²ₖ / s²C]
Reliability: Coefficient alpha
Alpha equals:
1. the Spearman-Brown result for parallel or tau-equivalent tests
2. KR-20 for dichotomous items (tau equivalence)
3. Hoyt reliability, even for congeneric scores
Hoyt reliability
• Based on ANOVA concepts extended during
the 1930s by Cyrus Hoyt at U. Minnesota
• Considers items and subjects as factors that
are either random or fixed (different models
with respect to expected mean squares)
• Presaged the more general derivation of coefficient alpha
Reliability: Hoyt ANOVA
[Table: Source, df, and Expected Mean Square for the items × subjects ANOVA]
g-coefficients
• Cronbach's alpha
• inter-rater
• parallel form
• Hoyt
Name      Item1  Item2  Item3  Item4
JOE         1      1      1      0
SUZY        1      0      1      1
FRANK       0      0      1      0
JUAN        0      1      1      1
SHAMIKA     1      1      1      1
ERIN        0      0      0      1
MICHAEL     0      1      1      1
BRANDY      1      1      0      0
WALID       1      0      1      1
KURT        0      0      1      0
ERIC        1      1      1      0
MAY         1      0      0      0
SPSS RELIABILITY OUTPUT
R E L I A B I L I T Y   A N A L Y S I S  -  S C A L E  (A L P H A)
Reliability Coefficients
N of Cases = 12.0
N of Items = 4
Alpha = .1579
SPSS RELIABILITY OUTPUT
R E L I A B I L I T Y   A N A L Y S I S  -  S C A L E  (A L P H A)
Reliability Coefficients
N of Cases = 12.0
N of Items = 8
Alpha = .6391
Note: same items duplicated
TRUE SCORE THEORY AND
STRUCTURAL EQUATION
MODELING
True score theory is consistent with the
concepts of SEM
- latent score (true score) called a factor in
SEM
- error of measurement
- the path coefficient between observed score x and the latent score is the same as the index of reliability (√ρxx')
COMPOSITES AND FACTOR
STRUCTURE
• 3 Manifest (Observed) Variables required
for a unique identification of a single factor
• Parallel forms implies
– Equal path coefficients (termed factor loadings)
for the manifest variables
– Equal error variances
– Independence of errors
Parallel forms factor diagram
[Path diagram: one factor with equal loadings λx to x1, x2, and x3; equal error terms e on each]
ρ = [k/(k−1)][1 − 1/(k·λ²x)]
• example: λ²x = .8, k = 11
  ρ = (11/10)[1 − 1/8.8] = .975
TAU EQUIVALENCE
• ITEM TRUE SCORES DIFFER BY A
CONSTANT:
τᵢ = τⱼ + k
• ERROR STRUCTURE UNCHANGED AS
TO EQUAL VARIANCES,
INDEPENDENCE
CONGENERIC MODEL
• LESS RESTRICTIVE THAN PARALLEL
FORMS OR TAU EQUIVALENCE:
– LOADINGS MAY DIFFER
– ERROR VARIANCES MAY DIFFER
• MOST COMPLEX COMPOSITES ARE
CONGENERIC:
– WAIS, WISC-III, K-ABC, MMPI, etc.
[Path diagram: one factor τ with loadings λx1, λx2, λx3 to x1, x2, x3; errors e1, e2, e3]
ρ(x1, x2) = λx1 · λx2
COEFFICIENT ALPHA
• ρxx' = 1 − σ²E / σ²X
•       = 1 − [Σσ²ᵢ(1 − ρᵢᵢ)] / σ²X, since errors are uncorrelated
• α = [k/(k−1)][1 − Σs²ᵢ / s²C]
• where C = Σxᵢ (composite score)
• s²ᵢ = variance of subtest xᵢ
• s²C = variance of the composite
• Does not assume knowledge of subtest reliabilities ρᵢᵢ
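The sample formula above can be coded directly from a data matrix. A sketch using the same made-up dichotomous data as the KR-20 example (for 0/1 items, alpha and KR-20 coincide):

```python
def cronbach_alpha(data):
    """Coefficient alpha: [k/(k-1)] * (1 - sum of item variances /
    variance of the composite). data: rows = respondents, cols = items."""
    n = len(data)
    k = len(data[0])

    def var(xs):
        m = sum(xs) / n
        return sum((x - m) ** 2 for x in xs) / (n - 1)   # sample variance

    items = [[row[i] for row in data] for i in range(k)]
    composite = [sum(row) for row in data]
    sum_item_var = sum(var(col) for col in items)
    return (k / (k - 1)) * (1 - sum_item_var / var(composite))

data = [[1, 1, 0], [1, 0, 0], [1, 1, 1],
        [0, 0, 0], [1, 1, 1], [0, 1, 0]]
print(round(cronbach_alpha(data), 4))  # 0.6818 (equals KR-20 here)
```

Note the variance-ratio is unchanged whether sample (n−1) or population (n) variances are used, as long as the choice is consistent in numerator and denominator.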
COEFFICIENT ALPHA-
NUNNALLY’S COEFFICIENT
• IF WE KNOW THE RELIABILITIES OF EACH SUBTEST, ρᵢᵢ
• ρN = [K/(K−1)][1 − Σs²ᵢ(1 − rᵢᵢ) / s²X]
• where rᵢᵢ = coefficient alpha of each subtest
• Willson (1996) showed ρN ≥ ρxx'
NUNNALLY’S RELIABILITY CASE
[Path diagram: factor τ → x1, x2, x3 with loadings λx1, λx2, λx3; each xᵢ also loaded by a specific factor sᵢ and an error eᵢ]
ρXᵢXᵢ' = λ²xᵢ + s²ᵢ
Reliability Formula for SEM with
Multiple factors (congeneric with
subtests)
Single factor model:
ρ = (Σλᵢ)² / [(Σλᵢ)² + Σθᵢᵢ + ΣΣθᵢⱼ]
If the error covariances θᵢⱼ = 0, this reduces to
ρ = (Σλᵢ)² / [(Σλᵢ)² + Σθᵢᵢ]
  = (sum of factor loadings on the 1st factor)² / total composite variance (the sum of the observed variances and covariances)
This generalizes (Bentler, 2004) to the sum of factor loadings on the 1st factor divided by the sum of variances and covariances of the factors for multifactor congeneric tests.
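The single-factor case with uncorrelated errors is a one-line computation. A sketch with hypothetical standardized loadings (my values, chosen so error variance = 1 − λ²):

```python
def composite_reliability(loadings, error_vars):
    """Single-factor composite reliability:
    (sum of loadings)^2 / [(sum of loadings)^2 + sum of error variances],
    assuming uncorrelated errors (theta_ij = 0)."""
    true_var = sum(loadings) ** 2
    return true_var / (true_var + sum(error_vars))

# Hypothetical standardized loadings; error variance = 1 - loading^2
lam = [0.7, 0.8, 0.6]
theta = [1 - l ** 2 for l in lam]
print(round(composite_reliability(lam, theta), 4))  # 0.7449
```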
CORRELATED ERROR PROBLEMS
[Path diagram: factor τ → x1, x2, x3 with loadings λx1, λx2, λx3; errors e1, e2, e3; a specific factor s3 on x3]
Specificities can be misinterpreted as a correlated error model if the specificities are correlated or form a second factor.
SPSS SCALE ANALYSIS
• ITEM DATA
• EXAMPLE: (Likert items, 0-4 scale)
• Mean, Std Dev, and Cases reported for each item
• Correlation Matrix
• Item Variances
      Mean     Minimum  Maximum  Range    Max/Min   Variance
      1.1976   .1242    2.2408   2.1166   18.0415   .7132
• Inter-item Correlations
      Mean     Minimum  Maximum  Range    Max/Min   Variance
      .0822    -.1188   .2985    .4173    -2.5130   .0189
ITEM-TOTAL STATS
• Item-total Statistics columns: Scale Mean if Item Deleted, Scale Variance if Item Deleted, Corrected Item-Total Correlation, Squared Multiple R, Alpha if Item Deleted
• Analysis of Variance columns: Source of Variation, Sum of Sq., DF, Mean Square, F, Prob.
• Reliability Coefficients (5 items)
• Alpha = .2625      Standardized item alpha = .3093
• "Standardized" treats all items as parallel (standardized to equal variances)
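The standardized item alpha is the Spearman-Brown formula applied to the mean inter-item correlation. Plugging in the output above (5 items, mean inter-item correlation .0822) reproduces the reported .3093:

```python
def standardized_alpha(k, mean_r):
    """Standardized item alpha: Spearman-Brown applied to the mean
    inter-item correlation, i.e. k*r / (1 + (k-1)*r)."""
    return k * mean_r / (1 + (k - 1) * mean_r)

# Values from the SPSS output: 5 items, mean inter-item correlation .0822
print(round(standardized_alpha(5, 0.0822), 4))  # 0.3093
```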
RELIABILITY:
APPLICATIONS
STANDARD ERRORS
• se = standard error of measurement
     = sx[1 − ρxx']^½
• can be computed if ρxx' is estimable
• provides an error band around an observed score:
  [x − 1.96se, x + 1.96se]
[Figure: normal curve centered at the observed score x, with −1.96se and +1.96se marking the band]
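The band is straightforward to compute. A sketch using hypothetical IQ-style values (sd = 15, reliability = .89; my numbers, not the slides'):

```python
import math

def sem(sd_x, rho_xx):
    """Standard error of measurement: s_x * sqrt(1 - reliability)."""
    return sd_x * math.sqrt(1 - rho_xx)

def band_95(x, sd_x, rho_xx):
    """95% error band around an observed score x."""
    se = sem(sd_x, rho_xx)
    return (x - 1.96 * se, x + 1.96 * se)

# Hypothetical scale: sd = 15, reliability = .89
print(round(sem(15, 0.89), 2))   # 4.97
print(band_95(100, 15, 0.89))    # roughly (90.25, 109.75)
```

Note how quickly the band widens as reliability falls: at ρxx' = .89 a score of 100 already carries a ±10-point band.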