Current Developments in Testing Item Response Theory (IRT) : Prepared by

Current Developments in Testing:
Item Response Theory (IRT)
Prepared by:
1. Logamalar a/p Chegaran 823545
2. Tilagawathy a/p Palasupmaniam 823565
Overview
What we will cover today
Current developments in testing: Item Response Theory (IRT)
- Assumptions of Classical Test Theory (CTT)
- Item Response Theory
- Similarities and differences between IRT and CTT
- Test bias
- Test fairness
- Test accommodations
Assumptions of Classical Test Theory (CTT)
There are three main assumptions in Classical Test Theory (CTT):
1. The error scores and the true scores from the same test have a correlation of zero. Hence, the variance of the observed score is expected to equal the sum of the variances of the true score and the error score (Lord, 1980),
i.e. $\rho_{TE} = 0$, so $\sigma_X^2 = \sigma_T^2 + \sigma_E^2$.
Source: Prof. 'Dibu Ojerinde, OON, Joint Admissions and Matriculation Board (JAMB), Abuja, Nigeria
2. The error term has an expected mean of zero. Because the errors average out to zero, the observed score is, on average, equal to the true score:
$\frac{1}{n}\sum_{i=1}^{n} E_i = 0$, so $E(X) = T$.
Source: Prof. 'Dibu Ojerinde, OON, Joint Admissions and Matriculation Board (JAMB), Abuja, Nigeria
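3. The errors from parallel measurements are uncorrelated. Two measurements are parallel, $X_1 \parallel X_2$, if $X_1 = T + E_1$ and $X_2 = T + E_2$ share the same true score $T$; then $\rho_{E_1 E_2} = 0$.
Source: Prof. 'Dibu Ojerinde, OON, Joint Admissions and Matriculation Board (JAMB), Abuja, Nigeria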
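The three assumptions above can be illustrated with a small simulation. This is only a sketch: the function names, the score scale (true-score mean 50, SD 10, error SD 5) and the sample size are assumptions made for illustration, and because the errors are drawn independently by construction, the output illustrates the assumptions rather than testing them.

```python
import math
import random
import statistics


def pearson_r(xs, ys):
    """Pearson correlation between two equally long lists."""
    mean_x, mean_y = statistics.fmean(xs), statistics.fmean(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def simulate_ctt(n=20000, true_sd=10.0, error_sd=5.0, seed=0):
    """Simulate X = T + E for two parallel measurements (hypothetical scale)."""
    rng = random.Random(seed)
    true = [rng.gauss(50.0, true_sd) for _ in range(n)]
    err_1 = [rng.gauss(0.0, error_sd) for _ in range(n)]  # errors on form 1
    err_2 = [rng.gauss(0.0, error_sd) for _ in range(n)]  # errors on form 2
    obs_1 = [t + e for t, e in zip(true, err_1)]
    return {
        # Assumption 1: true and error scores are uncorrelated ...
        "corr_true_error": pearson_r(true, err_1),
        # ... so the observed variance is the sum of the two variances.
        "var_observed": statistics.pvariance(obs_1),
        "var_true_plus_var_error": statistics.pvariance(true)
        + statistics.pvariance(err_1),
        # Assumption 2: the errors have a mean of (about) zero.
        "mean_error": statistics.fmean(err_1),
        # Assumption 3: errors on parallel measurements are uncorrelated.
        "corr_parallel_errors": pearson_r(err_1, err_2),
    }


results = simulate_ctt()
for name, value in results.items():
    print(f"{name}: {value:.3f}")
```

With a large sample, both correlations and the mean error should be close to zero, and the two variance figures should nearly agree.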
Descriptions of IRT
• “IRT refers to a set of mathematical models that describe, in probabilistic terms, the relationship between a person’s response to a survey question/test item and his or her level of the ‘latent variable’ being measured by the scale.”
  Fayers and Hays, p. 55, Assessing Quality of Life in Clinical Trials, Oxford Univ Press: chapter on applying IRT for evaluating questionnaire item and scale properties.
• This latent variable is usually a hypothetical construct (trait/domain or ability) which is postulated to exist but cannot be measured by a single observable variable/item.
• Instead, it is indirectly measured by using multiple items or questions in a multi-item test/scale.
Assumptions in IRT
• Unidimensionality
– Examinee performance is explained by a single ability.
• Response: dichotomous
– The relationship between examinee performance on each item and the ability measured by the test is described as monotonically increasing.
• Monotonicity of item performance and ability is typified in an item characteristic curve (ICC).
• Examinees with more ability have higher probabilities of giving correct answers to items than lower-ability students (Hambleton, 1989).
• Mathematical model linking the observable, dichotomously scored data (item performance) to the unobservable data (ability).
• $P_i(\theta)$ gives the probability of a correct response to item i as a function of ability ($\theta$).
• b is the ability level at which the probability of a correct answer is (1 + c)/2.
[Figure: item characteristic curve with the parameters a, b and c marked]
a = item discrimination, b = item difficulty, c = pseudoguessing parameter
• Three items showing different item difficulties (b)
• Two-parameter model: c = 0
• One-parameter model: c = 0, a = 1
• Different levels of item discrimination (a)
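The item characteristic curve described on these slides can be sketched as a single function. The slides do not print the formula, so the logistic form used here, P(θ) = c + (1 - c) / (1 + exp(-a(θ - b))), is the standard three-parameter logistic model taken as an assumption; some texts also include a scaling constant D = 1.7, which is omitted. The function name `icc` and the example item values are hypothetical.

```python
import math


def icc(theta, a=1.0, b=0.0, c=0.0):
    """Probability of a correct response under the 3PL model.

    a: item discrimination, b: item difficulty, c: pseudoguessing parameter.
    The defaults (a=1, c=0) reduce the curve to the one-parameter model;
    passing a but leaving c=0 gives the two-parameter model.
    """
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))


# At theta == b the probability is (1 + c) / 2, whatever a is.
print(icc(0.5, a=1.2, b=0.5, c=0.2))

# Three hypothetical items that differ only in difficulty (b).
for b in (-1.0, 0.0, 1.0):
    curve = [round(icc(theta, b=b), 2) for theta in (-2, -1, 0, 1, 2)]
    print(f"b={b:+.1f}: {curve}")

# Two hypothetical items that differ only in discrimination (a):
# the larger a gives a steeper curve around theta == b.
for a in (0.5, 2.0):
    print(f"a={a}: rise from theta=-0.5 to 0.5 is {icc(0.5, a=a) - icc(-0.5, a=a):.2f}")
```

Keeping one function with defaults mirrors the way the one- and two-parameter models are described above as restricted cases of the three-parameter model.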
• IRT has almost completely replaced CTT as the method of choice.
• IRT has many advantages over CTT that have brought IRT into more frequent use.
• IRT allows for greater reliability.
• IRT can be used in computerized adaptive testing (CAT).
• IRT allows difficulty and ability to be placed on the same scale.
• IRT can be analyzed using multi-level modeling.
Arsaythamby Veloo/Rosna Awang Hashim, Teori Ujian dan Pentaksiran Pendidikan, UUM, Sintok, 2016, p. 28
3 Basic Components of IRT
1. Item Response Function (IRF):
A mathematical function that relates the latent trait to the probability of endorsing an item.
2. Item Information Function:
An indication of item quality; an item’s ability to differentiate among respondents.
3. Invariance:
Position on the latent trait can be estimated by any items with known IRFs, and item characteristics are population independent within a linear transformation.
Source: Psy 427, Cal State Northridge, Andrew Ainsworth, PhD, slides.
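The item information function listed above can be made concrete for the two-parameter logistic model, where the information at ability θ is a² · P(θ) · (1 - P(θ)). This is a sketch under that assumption; the slides do not state which model they have in mind. The item bank and the helper `most_informative_item` are hypothetical and only hint at how an adaptive test might pick its next item.

```python
import math


def p_2pl(theta, a, b):
    """Probability of a correct response under the 2PL model."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


def item_information(theta, a, b):
    """2PL item information: a^2 * P * (1 - P)."""
    p = p_2pl(theta, a, b)
    return a * a * p * (1.0 - p)


def most_informative_item(theta, items):
    """Return the name of the item with the highest information at theta.

    items maps a name to an (a, b) tuple.
    """
    return max(items, key=lambda name: item_information(theta, *items[name]))


# Hypothetical item bank: equal discrimination, different difficulties.
bank = {"easy": (1.0, -1.5), "medium": (1.0, 0.0), "hard": (1.0, 1.5)}

# Information peaks where ability equals difficulty, at a^2 / 4.
print(item_information(0.0, a=1.0, b=0.0))  # 0.25

# For a respondent with an estimated ability of 1.2, the "hard" item
# differentiates best, because its difficulty is closest to 1.2.
print(most_informative_item(1.2, bank))  # hard
```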
Differences between IRT and CTT
Dimension | CTT | IRT
--- | --- | ---
Definition | CTT is a theory about test scores that introduces 3 concepts: test score (often called observed score), true score and error score. | IRT is a general statistical theory about examinee item and test performance and how performance relates to the abilities that are measured by the items in the test.
Model | Linear | Nonlinear
Level of assumptions | Weak (i.e. easy to meet with test data) | Strong (i.e. more difficult to meet with test data)
Item-ability relationship | Not specified | Item characteristic functions
Test | Parallel tests | Non-parallel tests
Invariance | Sample dependent | Sample independent
Error | Equal chance for everyone | Not equal chance
Performance | Predictable | Non-predictable
Cttirt1-150715175719-1val-app6891.pdf
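The "Invariance" row can be illustrated with a short simulation: the classical item difficulty (the proportion of examinees answering correctly) changes with the ability of the sample, even though the item itself is unchanged. This is a sketch only; the 2PL response model, the group means of -1 and +1, and all names are assumptions for illustration. It does not estimate IRT parameters, which would require a separate calibration step.

```python
import math
import random


def p_correct(theta, a=1.0, b=0.0):
    """2PL probability of a correct response (assumed response model)."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


def classical_p_value(ability_mean, n=5000, a=1.0, b=0.0, seed=0):
    """Proportion correct in a simulated sample: the CTT item difficulty."""
    rng = random.Random(seed)
    correct = 0
    for _ in range(n):
        theta = rng.gauss(ability_mean, 1.0)  # one examinee's ability
        if rng.random() < p_correct(theta, a, b):
            correct += 1
    return correct / n


# The same item (a=1, b=0) given to a weaker and a stronger group.
low_group = classical_p_value(ability_mean=-1.0)
high_group = classical_p_value(ability_mean=1.0)
print(f"proportion correct, low-ability group:  {low_group:.2f}")
print(f"proportion correct, high-ability group: {high_group:.2f}")
```

The item looks hard in the first group and easy in the second, while the difficulty parameter b that generated the responses is 0 in both cases.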
Test Bias
Test Fairness
Test Accommodations
TEST BIAS
DEFINITION
A test is considered biased when the scores of one group are significantly different from those of another group and have higher predictive validity for that group. Predictive validity is the extent to which a score on an assessment predicts future performance.
http://www.academia.edu/9336249/Academic_Achievement_Test_Bias_and_Fairness
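One way to read the definition above is to compare predictive validity group by group, here taken as the correlation between test scores and a later performance measure. The sketch below uses made-up numbers for two hypothetical groups purely to show the calculation; a real bias study would need large samples and a formal significance test.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation between two equally long lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Illustrative, invented data: test score and later performance.
test_scores = [50, 60, 70, 80, 90]
later_performance = {
    "group_a": [2.0, 2.4, 3.0, 3.4, 3.9],  # tracks the test score closely
    "group_b": [3.0, 2.2, 3.4, 2.6, 3.1],  # barely related to the test score
}

validity = {
    group: pearson_r(test_scores, outcome)
    for group, outcome in later_performance.items()
}
for group, r in validity.items():
    print(f"{group}: predictive validity r = {r:.2f}")
```

A large gap between the two coefficients, as in this invented data, is the pattern the definition describes: the test predicts future performance much better for one group than for the other.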
SOURCES OF TEST BIAS
Types of Test Bias
Construct bias
• Occurs when the construct measured yields significantly different results for test-takers from the original culture for which the test was developed and test-takers from a new culture.
• A construct refers to an internal trait that cannot be directly observed but must be inferred from consistent behavior observed in people.
• Self-esteem, intelligence and motivation are all examples of constructs.
Arsaythamby Veloo/Rosna Awang Hashim, Teori Ujian dan Pentaksiran Pendidikan, UUM, Sintok, 2016, pp. 41-43

Types of Test Bias
Method bias
• Method bias refers to factors surrounding the administration of the test that may impact the results.
• The testing environment, length of test and assistance provided by the teacher administering the test are all factors that may lead to method bias.
• For example, if a student from one culture is used to, and expects to, receive assistance on standardized tests, but is faced with a situation in which the teacher is unable to provide any guidance, this may lead to inaccurate test results.
Arsaythamby Veloo/Rosna Awang Hashim, Teori Ujian dan Pentaksiran Pendidikan, UUM, Sintok, 2016, p. 42

Types of Test Bias
Item bias
• Item bias refers to problems that occur with individual items on the assessment. These biases may arise from poor grammar, culture-specific phrases or otherwise poorly written assessment items.
Arsaythamby Veloo/Rosna Awang Hashim, Teori Ujian dan Pentaksiran Pendidikan, UUM, Sintok, 2016, p. 42

TEST FAIRNESS
DEFINITION
A fair test is one that yields comparably valid inferences from person to person and group to group.
Test Fairness
About fairness
• Fairness means that test items should not be biased or offensive to any examinee subgroup.
• A test can only be good if it is fair to all the examinees.
• A fair assessment provides all students with an equal opportunity to demonstrate achievement.
Test Accommodations
DEFINITION
• Any test accommodations which may enhance student performance beyond providing equal access are considered inappropriate and therefore are not permitted.

Students with Disabilities: Guidelines for Special Test Accommodations, August 2015, p. 3
TEST ACCOMMODATIONS
• Timing/scheduling
• Setting
• Presentation
• Response
2015–2016 Test Implementation Manuals, Appendix C
Students with Disabilities: Guidelines for Special Test Accommodations, August 2015, pp. 10-14
THANK YOU
