When the NES gives you lemons: Making the most of less-than-perfect data
Brad Jones
As social scientists, we are often forced to make the most of data that falls far short of the
ideal. Researchers who work with survey data are painfully familiar with having to make do with
much less than what they might have hoped for given greater resources or the opportunity to
field their own study. In this paper, I will show how, with a little extra work at the front end, we
can squeeze the most out of what we have to work with. In particular, I focus on modeling
multidimensional latent variables.
Ideology, for example, has long been theorized as consisting of multiple dimensions (Conover
and Feldman 1981; Jost, Federico, and Napier 2009). A recent article by Treier and Hillygus
(2009) confirms the idea that ideology falls along two dimensions. Personality research is
another example of a multidimensional construct that has received increased attention in recent
years (Baril and Stone 1984; Eysenck 1990; Gerber et al. 2010; Mondak et al. 2010; Olver and
Mooradian 2003; Tetlock 1981). Indeed, many of the scales and measures political scientists
routinely borrow from social psychology can be thought of as multidimensional latent variables.
In this paper, I will develop a multidimensional item response theory (MIRT) model for
ordered, polychotomous responses. Item response theory is growing in popularity among social
scientists as a method for measuring latent traits from batteries of questions—the type of data
routinely collected in the form of sample surveys.
However, many of the existing models for item response data impose restrictive assumptions that
the researcher may not always be comfortable with. Using simulation studies, I demonstrate how
taking advantage of the multidimensional nature of the latent variables one is trying to measure
can yield more efficient estimates.
Item response theory has quickly become almost commonplace in the political science
literature (Delli Carpini and Keeter 1993; Huber and Lapinski 2008; Jacoby 2008; Mondak 2001;
Treier and Jackman 2002; Treier and Jackman 2008). The method was originally
developed in educational testing as a way of measuring latent ability. Rather than just adding up
the total number of correct answers on an exam, item response theory suggests that we can learn
something about the nature of the items and produce better estimates of latent ability by taking
these item-level factors into account. It has since been extended beyond the simple dichotomous
case. The basic model I will use in this paper is derived from the Generalized Partial Credit
Model (GPCM) originally developed by Muraki (1992). I extend the model into multiple
dimensions.
The model assumes that responses to the ordered items are governed by a respondent's
latent trait, θ, which follows a multivariate normal distribution. Responses are modeled through a
logistic link. The GPCM models the probability of a respondent (indexed by i) selecting the kth option
on item j as

P(Y_{ij} = k \mid \theta_i) = \frac{\exp\left(\sum_{v=1}^{k} \alpha_j (\theta_i - \beta_{jv})\right)}{\sum_{c=1}^{K} \exp\left(\sum_{v=1}^{c} \alpha_j (\theta_i - \beta_{jv})\right)}

where \beta_{j1} \equiv 0, and K is the total number of response categories to item j. For the
purposes of my model, \theta_i gives the coordinates of the individual in the three-dimensional latent
space, and each question, j, is assumed to measure one dimension of that space. The model could
be extended into a compensatory item response model, where responses to any question could be
a function of more than one dimension of θ (Bolt and Lall 2003), but for my purposes, each item
loads on a single dimension. The \beta_{jk} describe the thresholds for the ordinal responses. One advantage of the
GPCM over other ordinal response models (the graded response model, for example) is that it
does not put any ordering constraint on the \beta_{jk}. The summation in the numerator of the model
runs over the first k thresholds, so each category's probability reflects the accumulated
\alpha_j(\theta_i - \beta_{jv}) terms.
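To make the response function concrete, here is a minimal Python sketch of the GPCM category probabilities (the paper's estimation is done in WinBUGS; the function name and parameter values below are hypothetical illustrations, not the paper's code):

```python
import numpy as np

def gpcm_probs(theta, alpha, beta):
    """Category probabilities under the generalized partial credit model.

    theta : scalar position on the dimension this item measures
    alpha : item discrimination
    beta  : length-K array of thresholds; beta[0] is fixed at 0
    """
    beta = np.asarray(beta, dtype=float)
    # Cumulative sums of alpha * (theta - beta_v) over the first k thresholds
    psum = np.cumsum(alpha * (theta - beta))
    # Normalize with a numerically stable softmax
    epsum = np.exp(psum - psum.max())
    return epsum / epsum.sum()

# A hypothetical six-category item
probs = gpcm_probs(theta=0.5, alpha=1.5, beta=[0.0, -1.0, -0.5, 0.0, 0.5, 1.0])
print(probs.sum())  # the six category probabilities sum to 1
```

Note that nothing in the sketch sorts beta: consistent with the GPCM's lack of an ordering constraint, the thresholds may appear in any order.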
Bock and Aitkin (1981) describe an EM procedure to estimate models such as the one
above, where θ appears on both the right- and left-hand sides. Item response models present a
challenge for normal maximum likelihood estimation because the number of parameters grows with
the number of respondents. The EM procedure can provide estimates of the item parameters that
maximize the likelihood of any particular response pattern by integrating over θ to produce
marginal maximum likelihood (MML) estimates. Muraki (1992) describes how this would work
for the generalized partial credit model he developed, and his solution could be extended to three
dimensions by integrating over a trivariate normal distribution. Estimation would proceed by
integrating over θ and iterating between different values of the item parameters at different points
in the three-dimensional space we are integrating across. Leung (2008) describes a Bayesian
approach to a similar three-dimensional model; with uninformative priors, this is analogous to
the MML solution. The Bayesian solution has the advantage of being relatively easier to
implement through WinBUGS. It also provides direct estimates of the θ and some measure of
uncertainty around each estimated parameter (Wollack et al. 2002). An EM algorithm would
require more serious programming to take the necessary integrals.
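As a rough illustration of what the Bock and Aitkin (1981) marginalization involves, the following Python sketch integrates one response pattern's likelihood over a one-dimensional standard normal trait using Gauss-Hermite quadrature. It is a deliberate simplification of the trivariate case discussed above, and all item values are hypothetical:

```python
import numpy as np

def gpcm_prob(theta, alpha, beta):
    """GPCM category probabilities for one item at a given theta."""
    psum = np.cumsum(alpha * (theta - np.asarray(beta, dtype=float)))
    e = np.exp(psum - psum.max())
    return e / e.sum()

def marginal_likelihood(responses, alphas, betas, n_nodes=21):
    """Likelihood of one response pattern with theta integrated out against a
    standard normal density, via Gauss-Hermite quadrature (1-D analogue of MML)."""
    x, w = np.polynomial.hermite.hermgauss(n_nodes)
    thetas = x * np.sqrt(2.0)        # rescale nodes for the N(0, 1) density
    weights = w / np.sqrt(np.pi)     # rescaled weights sum to 1
    lik = 0.0
    for t, wt in zip(thetas, weights):
        pattern_prob = 1.0
        for resp, a, b in zip(responses, alphas, betas):
            pattern_prob *= gpcm_prob(t, a, b)[resp]
        lik += wt * pattern_prob
    return lik

# A hypothetical three-item, four-category questionnaire; one response pattern
betas = [[0.0, -1.0, 0.0, 1.0]] * 3
ml = marginal_likelihood(responses=[2, 1, 3], alphas=[1.0, 0.8, 1.2], betas=betas)
```

In an EM implementation this quantity would be recomputed at each iteration for every observed pattern; in three dimensions the quadrature grid grows as the cube of the number of nodes, which is part of why the Bayesian route is attractive.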
The prior for the latent trait is

\theta_i \sim \mathrm{MVN}(\mathbf{0}, \Sigma)

where \mathbf{0} is a column vector of zeros and \Sigma is the variance-covariance matrix. For identification,
the diagonal of \Sigma is constrained to be unity. There are different options to model the dependence
between the different elements of θ. One option is to model \Sigma as if it were drawn from an
inverse Wishart distribution. Alternatively, one could model each correlation separately. I choose
to follow the latter course. We can estimate two of the correlations unconstrained, but the third
must be restricted so that \Sigma remains positive definite:

\rho_{23} \in \left( \rho_{12}\rho_{13} - \sqrt{(1-\rho_{12}^2)(1-\rho_{13}^2)},\ \rho_{12}\rho_{13} + \sqrt{(1-\rho_{12}^2)(1-\rho_{13}^2)} \right)
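The admissible range for the third correlation can be checked numerically. This Python sketch, with hypothetical correlation values, mirrors the constraint implemented in the appendix BUGS code and confirms that values inside the interval keep the correlation matrix positive definite:

```python
import numpy as np

def rho23_bounds(rho12, rho13):
    """Interval of rho23 values keeping the 3x3 correlation matrix positive
    definite, given rho12 and rho13."""
    half_width = np.sqrt((1 - rho12**2) * (1 - rho13**2))
    center = rho12 * rho13
    return center - half_width, center + half_width

def is_pos_def(rho12, rho13, rho23):
    R = np.array([[1.0,  rho12, rho13],
                  [rho12, 1.0,  rho23],
                  [rho13, rho23, 1.0]])
    # A symmetric matrix is positive definite iff all eigenvalues are positive
    return bool(np.all(np.linalg.eigvalsh(R) > 0))

lo, hi = rho23_bounds(0.4, 0.6)
print(is_pos_def(0.4, 0.6, (lo + hi) / 2))  # True: inside the interval
print(is_pos_def(0.4, 0.6, hi + 0.05))      # False: outside it
```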
All MCMC simulations were run in WinBUGS via the "R2WinBUGS" package in R. I
used a relatively small number of iterations (300) with a "burn-in" of 50. The simulated dataset
on which all analyses were conducted had a sample size of 500. In its current implementation,
the method takes several hours to run with a k of 20. The relevant BUGS code is included
in the appendix.
Simulation Studies
I conducted a series of simulation studies systematically varying two factors: the αj parameter and the total number of
items in the questionnaire. In this paper, I am primarily interested in the differences between the
results of the one-dimensional approach and a multidimensional approach. Table 1 shows the
different conditions I will be examining. The conditions in the table represent real tradeoffs that
researchers are often forced to make. We often find ourselves in a position where we do not have
ideal measurement instruments. The αj parameter is one way of thinking about how well the particular
item measures the latent variable. In an ideal world, we would have unlimited questionnaire
space that we could fill with items that provide precise measurements of the latent variables we
are interested in studying. However, in reality, we often have only very limited space and less-than-ideal
items. Modeling the multiple dimensions of the latent variable directly should allow us to
borrow information across the different dimensions of the latent
variable to produce more efficient estimates. The method is more computationally intensive, but
if it produces real gains in efficiency, the extra time and resources devoted at the
analysis stage should be able to compensate for unavoidable problems at the design stage.
[Table 1 here]
Figures 1 and 2 show how different values of αj relate the latent variable, θ, to the
probability of giving a particular response in the GPCM. Comparing Figure 1 and Figure 2
reveals how the α parameter affects the probability of giving a particular response over the range
of θ. The β parameters were kept constant. Items with high values of α convey more information
about the respondent's position on the latent variable.
Results
In the analyses that follow, I will use simulated responses to four questionnaires
corresponding to the different columns in Table 1. The latent variable follows the multivariate
normal distribution:
The true values of the parameters were fixed, and responses follow the GPCM.
By examining the raw correlation between the predicted latent variable and the true
values from the simulation, we can begin to get an idea of the advantages of multidimensional item
response models. Table 2 shows the Pearson’s correlation between the predictions from the
estimation and the true values by the factors listed in Table 1. The table produces many relevant
comparisons. For my purposes, I am most interested in comparing the simultaneous and separate
estimations. A naïve estimate (an additive index of responses) is included for comparison.
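The naïve additive index is easy to reproduce in simulation. The following Python sketch, a simplified one-dimensional version with hypothetical parameter values rather than the paper's three-dimensional design, simulates GPCM responses and correlates the summed score with the true trait:

```python
import numpy as np

rng = np.random.default_rng(0)

def simulate_item(theta, alpha, beta):
    """Draw one GPCM response per respondent for a single item."""
    psum = np.cumsum(alpha * (theta[:, None] - beta[None, :]), axis=1)
    e = np.exp(psum - psum.max(axis=1, keepdims=True))
    p = e / e.sum(axis=1, keepdims=True)
    cum = p.cumsum(axis=1)
    u = rng.random((len(theta), 1))
    return (u > cum).sum(axis=1)  # category index 0..K-1

n, k = 500, 20                       # sample size and item count as in the paper
theta = rng.standard_normal(n)       # one latent dimension, for simplicity
beta = np.array([0.0, -1.5, -0.5, 0.5, 1.5, 2.0])
responses = np.column_stack([simulate_item(theta, 1.5, beta) for _ in range(k)])
naive = responses.sum(axis=1)        # the additive index
r = np.corrcoef(theta, naive)[0, 1]
print(r > 0.5)  # True: with many discriminating items the index tracks theta
```

With high α and many items this correlation is high, which is the pattern the naïve row of Table 2 reflects; what the sum score cannot do is recover the correlation structure across dimensions.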
[Table 2 here]
In every case, the simultaneous estimation outperforms the separate estimation. In some
situations, the improvement is dramatic. However, the naïve estimate performs remarkably well.
Indeed, as α or k increases, the correlation between the true values of the latent variable and the
naïve estimator rivals the most computationally intense simultaneous estimation. In most cases,
the naïve estimator actually outperforms the separate estimation of the latent variables. This
initial result seems disheartening on its face. If we can get a reasonably good approximation of
the latent construct by simply summing the responses to the questionnaire, why bother with
difficult estimation?
Where the simultaneous estimation shows its real advantage is in modeling the structure of the
latent variable. Table 3 shows the estimates of the
correlations between each element of the latent variable. The true correlations are listed across
the top of the table.
[Table 3 here]
Comparing the naïve estimate and the separate estimation in Table 3 reveals a similar story as
was told in Table 2. In nearly every case, the naïve estimate significantly outperforms the
separate estimation when it comes to modeling the dependence between the elements of the
latent variable. However, by estimating the correlations directly in the simultaneous estimation,
we see marked improvement. Even in the worst case where k is low and α is low, the estimated
correlations are remarkably close to the actual correlations. Indeed, in this sample, increasing the
values of k and α did not lead to dramatically better estimates of the correlations.
In most cases, we are interested in modeling latent constructs with the ultimate aim to
include them on the right-hand side of a regression equation. Using the true values of the
parameters, I created a simple model with a continuous dependent variable, which follows the
linear specification

y_i = \beta_1 \theta_{i1} + \beta_2 \theta_{i2} + \beta_3 \theta_{i3} + \varepsilon_i

where \varepsilon_i \sim N(0, \sigma^2) and the \beta coefficients were fixed at known values.
Table 4 shows the resultant estimates of the β obtained from the different modeling strategies.
The row labeled "True Values" contains the estimated coefficients when we use the true values
of the latent variables in the regression. The other cell entries are the coefficients obtained when
the estimated latent variables are used in their place.
[Table 4 here]
Comparing the results of Table 4 to those of Table 2 reveals just how important it is to
account for the correlations between the different dimensions of the latent variables. Although
the naïve and separate strategies produced reasonable approximations of the latent variables, they
led to regression coefficients that are substantially smaller (in absolute terms) than the true
values. Even with a small value for k composed of questions that only poorly measure the latent
variable, the simultaneous estimation produces estimates that are reasonably close to the true
values.
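The attenuation pattern described above is consistent with classical measurement error: regressing on a noisy estimate of a latent variable biases the coefficient toward zero. A minimal Python sketch with hypothetical values (not the paper's simulation design):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 500
theta = rng.standard_normal(n)             # true latent variable
y = 2.0 * theta + rng.standard_normal(n)   # outcome with a true slope of 2

# A noisy estimate of theta, as a separate per-dimension estimation might yield
theta_hat = theta + 0.5 * rng.standard_normal(n)

def ols_slope(x, y):
    """Simple-regression slope: cov(x, y) / var(x)."""
    return np.cov(x, y)[0, 1] / np.var(x, ddof=1)

print(ols_slope(theta, y))      # close to the true value of 2
print(ols_slope(theta_hat, y))  # attenuated toward zero
```

With this amount of measurement error the expected attenuation factor is 1/(1 + 0.25) = 0.8, so the second slope sits near 1.6; modeling the latent structure directly, as the simultaneous estimation does, is one way to avoid paying this penalty.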
Conclusion
In this paper, I have developed a multidimensional item response theory model for ordinal
responses. I have shown that modeling the multidimensional nature of
the latent variable directly can add significantly to the quality of the final estimates. This is
especially evident when using the estimates obtained from the model as independent variables in
subsequent analyses.
There were several possibilities that I did not test directly that might make for interesting
future research. In all of my analyses, I assumed knowledge of the model that generated the data.
It would be worth investigating how violations of the modeling assumption affect subsequent
stages of the analysis. The GPCM is fairly flexible and seems a reasonable choice for most
ordered response data, but one can definitely imagine circumstances where a different model
would be more appropriate.
Another interesting avenue of investigation might be to vary the correlations between the
dimensions of the latent variables. I held that aspect of the data constant in my analyses here, but
it would be interesting to know how increasing or decreasing the correlations affects the other
parts of estimation. I would suspect that the multidimensional approach will show the most
benefit when the dimensions are strongly correlated.
It would also be interesting to discover how exploiting the multidimensional nature of the
latent variable might be able to compensate for poor measures across one or more dimensions. I
varied all values of α simultaneously, but there are several situations where we might have
excellent measures of one dimension and poor measures of another. As I showed here, the MIRT
approach produces valid estimates of the correlation structure of the latent variable even when
we have only a few, imprecise measurements (see Table 3). The MIRT method described in this
paper allows us to borrow strength from the other dimensions of the latent variable in the
estimation.
Multidimensional Item Response Theory models hold a great deal of promise for social
scientists. In this paper, I have shown through simulation studies how they might allow us to
transform a small number of noisy measures into estimates that we can have a great deal of
confidence in.
References
Baril, Galen L., and William F. Stone. 1984. "Mixed Messages: Verbal-Vocal Discrepancy in
Freshman Legislators." Political Psychology 5(1): 83-98.
Bock, R. Darrell, and Murray Aitkin. 1981. "Marginal Maximum Likelihood Estimation of Item
Parameters: Application of an EM Algorithm." Psychometrika 46(4): 443-459.
Bolt, Daniel M., and Venessa F. Lall. 2003. "Estimation of Compensatory and Noncompensatory
Multidimensional Item Response Models Using Markov Chain Monte Carlo." Applied
Psychological Measurement 27(6): 395-414.
Conover, Pamela Johnston, and Stanley Feldman. 1981. "The Origins and Meaning of
Liberal/Conservative Self-Identifications." American Journal of Political Science 25(4):
617-645.
Curtis, S. McKay. 2010. "BUGS Code for Item Response Theory." Journal of Statistical
Software 36: 1-34.
Delli Carpini, Michael X., and Scott Keeter. 1993. "Measuring Political Knowledge:
Putting First Things First." American Journal of Political Science 37(4): 1179-1206.
Gerber, Alan S. et al. 2010. "Personality and Political Attitudes: Relationships Across Issue
Domains and Political Contexts." American Political Science Review 104(1): 111-133.
Huber, Gregory A., and John S. Lapinski. 2008. "Testing the Implicit-Explicit Model of
Racialized Political Communication." Perspectives on Politics 6(1): 125-134.
Jost, John T., Christopher M. Federico, and Jaime L. Napier. 2009. "Political Ideology: Its
Structure, Functions, and Elective Affinities." Annual Review of Psychology 60: 307-337.
Leung, Shing-On. 2008. "A Three-Dimensional Latent Variable Model for Attitude Scales."
Sociological Methods & Research 37(1): 135-154.
Mondak, Jeffrey J. 2001. "Developing Valid Knowledge Scales." American Journal of Political
Science 45(1): 224-238.
Mondak, Jeffrey J. et al. 2010. "Personality and Civic Engagement: An Integrative Framework
for the Study of Trait Effects on Political Behavior." American Political Science Review
104(1): 85-110.
Muraki, Eiji. 1992. "A Generalized Partial Credit Model: Application of an EM Algorithm."
Applied Psychological Measurement 16(2): 159-176.
Olver, James M., and Todd A. Mooradian. 2003. "Personality Traits and Personal Values: A
Conceptual and Empirical Integration." Personality and Individual Differences 35(1): 109-125.
Tetlock, Philip E. 1981. "Personality and Isolationism: Content Analysis of Senatorial Speeches."
Journal of Personality and Social Psychology 41(4): 737-743.
Treier, Shawn, and D. Sunshine Hillygus. 2009. "The Nature of Political Ideology in the
Contemporary Electorate." Public Opinion Quarterly 73(4): 679-703.
Treier, Shawn, and Simon Jackman. 2002. "Beyond Factor Analysis: Modern Tools for Social
Measurement." Paper presented in Chicago, IL.
Treier, Shawn, and Simon Jackman. 2008. "Democracy as a Latent Variable." American Journal
of Political Science 52(1): 201-217.
Van Der Ark, L. Andries. 2001. "Relationships and Properties of Polytomous Item Response
Theory Models." Applied Psychological Measurement 25(3): 273-282.
Wollack, James A. et al. 2002. "Recovery of Item Parameters in the Nominal Response Model:
A Comparison of Marginal Maximum Likelihood Estimation and Markov Chain Monte
Carlo Estimation." Applied Psychological Measurement 26(3): 339-352.
[Figure 1 here: Low α (α = 0.5). Category response probabilities P(Y = 1) through P(Y = 6) plotted against θ over the range −4 to 4.]
[Figure 2 here: High α (α = 1.5). Category response probabilities P(Y = 1) through P(Y = 6) plotted against θ over the range −4 to 4.]
Table 2
Table 3
Note: cell entries are correlations between the estimated factors for the naïve and
separate estimations; the entries for the simultaneous estimation are the estimates
of the hyperparameters.
Table 4
Appendix
My code was adapted from Curtis (2010, 10). To estimate the model, I ran the following BUGS code:
model {
  for (i in 1:n) {
    for (j in 1:p) {
      Y[i, j] ~ dcat(prob[i, j, 1:6])
    }
    # This code ensures that the thetas will have the desired correlation structure
    theta1[i] ~ dnorm(0.0, 1.0)
    rand2[i] ~ dnorm(0.0, 1.0)
    theta2[i] <- theta1[i]*rho12 + rand2[i]*temp2
    rand3[i] ~ dnorm(0.0, 1.0)
    theta3[i] <- theta1[i]*rho13 + rand2[i]*temp3 + rand3[i]*pow((1 - rho13*rho13 - temp3*temp3), .5)
  }
  for (i in 1:n) {
    for (j in 1:p) {
      for (k in 1:6) {
        # This breaks down the model into manageable pieces.
        # The d1, d2, and d3 variables are row vectors of 0s and 1s indicating
        # which questions fall on which dimension.
        eta[i, j, k] <- alpha[j]*(theta1[i]*d1[j] + theta2[i]*d2[j] + theta3[i]*d3[j] - beta[j, k])
        psum[i, j, k] <- sum(eta[i, j, 1:k])
        epsum[i, j, k] <- exp(psum[i, j, k])
        prob[i, j, k] <- epsum[i, j, k]/sum(epsum[i, j, 1:6])
      }
    }
  }
  for (j in 1:p) {
    alpha[j] ~ dnorm(m.alpha, pr.alpha) I(0,)
    beta[j, 1] <- 0.0
    for (k in 2:6) {
      beta[j, k] ~ dnorm(m.beta, pr.beta)
    }
  }
  pr.alpha <- pow(s.alpha, -2)
  pr.beta <- pow(s.beta, -2)
  rho12 ~ dunif(-1, 1)
  rho13 ~ dunif(-1, 1)
  # This code ensures that the correlation matrix will be positive definite
  rho23.star ~ dunif(0, 1)
  min <- rho12*rho13 - pow(((1 - rho12*rho12)*(1 - rho13*rho13)), .5)
  max <- rho12*rho13 + pow(((1 - rho12*rho12)*(1 - rho13*rho13)), .5)
  range <- max - min
  rho23 <- rho23.star*range + min
  temp2 <- pow((1 - rho12*rho12), .5)
  temp3 <- (rho23 - rho12*rho13)/temp2
}