
Brad Jones

When the NES gives you lemons: Making the most of less-than-perfect data

As social scientists, we are often forced to make the most of data that falls far short of the

ideal. Researchers who work with survey data are painfully familiar with having to make do with

much less than what they might have hoped for given greater resources or the opportunity to

field their own study. In this paper, I will show how, with a little extra work at the front end, we

can squeeze the most out of what we have to work with. In particular, I focus on modeling

multidimensional latent variables.

Many political science concepts can be thought of as multidimensional constructs.

Ideology, for example, has long been theorized as consisting of multiple dimensions (Conover

and Feldman 1981; Jost, Federico, and Napier 2009). A recent article by Treier and Hillygus

(2009) confirms the idea that ideology falls along two dimensions. Personality is another example of a multidimensional construct that has received increased attention in recent

years (Baril and Stone 1984; Eysenck 1990; Gerber et al. 2010; Mondak et al. 2010; Olver and

Mooradian 2003; Tetlock 1981). Indeed, many of the scales and measures political scientists

routinely borrow from social psychology can be thought of as multidimensional latent variables.

In this paper, I will develop a multidimensional item response theory (MIRT) model for

ordered, polychotomous responses. Item response theory is growing in popularity among social

scientists, as it provides a theoretically grounded method of recovering latent variables from

batteries of questions—the type of data routinely collected in the form of sample surveys.

However, many of the existing models for item response data impose restrictive assumptions that

the researcher may not always be comfortable with. Using simulation studies, I demonstrate how

taking advantage of the multidimensional nature of the latent variables one is trying to measure

can increase the efficiency and precision of the estimates.



Item Response Theory

Item response theory has quickly become almost commonplace in the political science

literature (Delli Carpini and Keeter 1993; Huber and Lapinski 2008; Jacoby 2008; Mondak 2001;

Treier and Jackman 2002, 2008). The method was originally

developed in educational testing as a way of measuring latent ability. Rather than just adding up

the total number of correct answers on an exam, item response theory suggests that we can learn

something about the nature of the items and produce better estimates of latent ability by taking

these item-level factors into account. It has since been extended beyond the simple dichotomous

case. The basic model I will use in this paper is derived from the Generalized Partial Credit

Model (GPCM) originally developed by Muraki (1992). I extend the model into multiple

dimensions.

Model and Estimation

The model assumes that responses to the ordered items are governed by a respondent’s

latent trait, θ, which follows a multivariate normal distribution. Responses are modeled through a

multivariate extension of Muraki’s (1992) GPCM.

Under the GPCM, the probability that respondent i selects the kth of the ordered response options on question j is

P(Y_{ij} = k \mid \theta_i) = \frac{\exp\left(\sum_{v=1}^{k} \alpha_j(\theta_i - \beta_{jv})\right)}{\sum_{c=1}^{K} \exp\left(\sum_{v=1}^{c} \alpha_j(\theta_i - \beta_{jv})\right)},

where β_j1 ≡ 0 and K is the total number of response categories for item j. For the purposes of my model, θ_i gives the coordinates of the individual in the three-dimensional latent space, and each question, j, is assumed to measure one dimension of that space (so only that coordinate of θ_i enters the expression above). The model could

be extended into a compensatory item response model, where responses to any question could be

a function of more than one dimension of θ (Bolt and Lall 2003), but for my purposes, each item

is assumed to measure only one dimension.

The β_jk parameters are analogous to the difficulty parameter in a two-parameter logistic item response model: they describe the thresholds for the ordinal responses, while α_j plays the role of the discrimination parameter, governing how sharply the item separates respondents along its dimension. One advantage of the GPCM over other ordinal response models (the graded response model, for example) is that it does not put any ordering constraint on the β_jk. The summation in the numerator of the model ensures that the inherent ordering of the response options is preserved.
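To make the response function concrete, the following R sketch computes the category probabilities for a single item; it mirrors the eta/psum/prob lines of the BUGS code in the appendix. The threshold values in the example are illustrative, not taken from the simulations.

# Minimal sketch (not the estimation code): GPCM category probabilities for one
# item with discrimination a, thresholds b (with b[1] = 0), and a scalar theta
# on the dimension the item is assumed to measure.
gpcm_probs <- function(theta, a, b) {
  eta  <- a * (theta - b)        # alpha_j * (theta_i - beta_jk), k = 1, ..., K
  psum <- cumsum(eta)            # the running sum preserves the category ordering
  exp(psum) / sum(exp(psum))     # normalize over the K response categories
}

# Example with six ordered categories and arbitrary thresholds
gpcm_probs(theta = 0.5, a = 1.5, b = c(0, -1.5, -0.5, 0.5, 1.5, 2.5))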

Bock and Aitkin (1981) describe an EM procedure to estimate models such as the one above, in which θ appears on both the right- and left-hand sides. Item response models present a challenge for ordinary maximum likelihood estimation because the number of parameters grows with the sample size. However, if we impose a distributional constraint on θ, an EM algorithm can provide estimates of the item parameters that maximize the likelihood of any particular response pattern by integrating over θ, producing marginal maximum likelihood (MML) estimates. Muraki (1992) describes how this works for the generalized partial credit model he developed, and his solution can be extended to three dimensions by integrating over a trivariate normal distribution:

L(\alpha, \beta \mid Y) = \prod_{i=1}^{n} \int_{\mathbb{R}^{3}} \left[ \prod_{j=1}^{p} P(Y_{ij} \mid \theta, \alpha_j, \beta_j) \right] \phi_3(\theta; \mu, \Sigma) \, d\theta.

For purposes of identification, we can assume \mu = 0 and \operatorname{diag}(\Sigma) = 1. The EM algorithm would proceed by evaluating this integral at a grid of quadrature points in the three-dimensional space and iterating between updating the item parameters and re-weighting those points. Leung (2008) describes this procedure in more detail.
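As a rough illustration of what the marginalization involves, the sketch below approximates the marginal likelihood of a single response pattern by Monte Carlo integration over the trivariate normal prior. It is not the Bock-Aitkin EM algorithm itself, and the object names (dims, the item parameter objects) are hypothetical.

# Sketch: marginal likelihood of one response pattern, integrating theta out by
# simulation. gpcm_probs() is the sketch defined above; dims[j] indicates which
# of the three dimensions item j is assumed to measure.
library(MASS)
marginal_lik <- function(y, alpha, beta, dims, Sigma, M = 5000) {
  theta_draws <- mvrnorm(M, mu = rep(0, 3), Sigma = Sigma)   # draws from the prior
  lik <- sapply(seq_len(M), function(m) {
    prod(sapply(seq_along(y), function(j) {
      gpcm_probs(theta_draws[m, dims[j]], alpha[j], beta[j, ])[y[j]]
    }))
  })
  mean(lik)   # approximates the integral of P(y | theta) over the prior for theta
}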

Alternatively, one could estimate the parameters in a Bayesian framework. With

uninformative priors, this is analogous to the MML solution. The Bayesian solution has the

advantage of being relatively easy to implement through WinBUGS. It also provides direct estimates of the θ_i and some measure of uncertainty around each estimated parameter (Wollack et al. 2002). An EM algorithm would require more substantial programming to compute the necessary integrals.

The Bayesian method requires specifying hyperparameter distributions for each of the parameters in the model. These distributions are listed below:

\theta_i \sim \mathrm{MVN}_3(\mu, \Sigma), \qquad \alpha_j \sim \mathrm{N}(m_\alpha, s_\alpha^2) \text{ truncated to } \alpha_j > 0, \qquad \beta_{jk} \sim \mathrm{N}(m_\beta, s_\beta^2) \text{ for } k > 1, \text{ with } \beta_{j1} \equiv 0,

where \mu is a column vector of zeros and \Sigma is the variance-covariance matrix. For identification, the diagonal of \Sigma is constrained to be unity. There are different options for modeling the dependence between the elements of θ. One option is to model \Sigma as if it were drawn from an inverse Wishart distribution. Alternatively, one could model each correlation separately. I choose to follow the latter course. Two of the correlations can be estimated unconstrained,

\rho_{12} \sim \mathrm{Uniform}(-1, 1), \qquad \rho_{13} \sim \mathrm{Uniform}(-1, 1),

but the following condition must be placed on the remaining correlation:

a \le \rho_{23} \le b, \quad \text{where } a = \rho_{12}\rho_{13} - \sqrt{(1 - \rho_{12}^{2})(1 - \rho_{13}^{2})} \text{ and } b = \rho_{12}\rho_{13} + \sqrt{(1 - \rho_{12}^{2})(1 - \rho_{13}^{2})}.

The conditions defining a and b ensure that the correlation matrix will be positive-definite.
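The constraint is easy to check numerically. The short R sketch below computes the admissible bounds for ρ23 (the values of ρ12 and ρ13 are illustrative) and verifies that a correlation matrix built inside those bounds is positive-definite; the bounds match the min and max lines of the BUGS code in the appendix.

# Admissible range for rho23 given rho12 and rho13 (illustrative values)
rho12 <- 0.5
rho13 <- 0.3
a <- rho12 * rho13 - sqrt((1 - rho12^2) * (1 - rho13^2))
b <- rho12 * rho13 + sqrt((1 - rho12^2) * (1 - rho13^2))
c(lower = a, upper = b)

# Any rho23 strictly inside (a, b) gives a positive-definite correlation matrix
rho23 <- 0.4
R <- matrix(c(1,     rho12, rho13,
              rho12, 1,     rho23,
              rho13, rho23, 1), nrow = 3, byrow = TRUE)
all(eigen(R, only.values = TRUE)$values > 0)   # TRUE when the constraint holds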

All MCMC simulations were run in WinBUGS via the "R2WinBUGS" package in R. I used a relatively small number of iterations (300) with a "burn-in" of 50. The simulated dataset on which all analyses were conducted had a sample size of 500. In its current implementation,

the method takes several hours to run with a k of 20. The relevant BUGS code is included

in the appendix.
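For readers who want to reproduce the setup, a hedged sketch of the estimation call follows. The data objects (Y, the d1-d3 indicator vectors, the hyperparameter values) and the model file name are illustrative stand-ins rather than the exact script used here, but the list matches the quantities the appendix model expects.

# Sketch of the estimation call; Y is an n x p matrix of ordinal responses and
# d1, d2, d3 are 0/1 vectors indicating which dimension each item measures.
library(R2WinBUGS)

bugs_data <- list(Y = Y, n = nrow(Y), p = ncol(Y),
                  d1 = d1, d2 = d2, d3 = d3,
                  m.alpha = 1, s.alpha = 1,    # illustrative hyperparameter values
                  m.beta = 0, s.beta = 2)

fit <- bugs(data = bugs_data,
            inits = NULL,
            parameters.to.save = c("theta1", "theta2", "theta3", "alpha", "beta",
                                   "rho12", "rho13", "rho23"),
            model.file = "mirt_gpcm.bug",      # the BUGS model listed in the appendix
            n.chains = 1, n.iter = 300, n.burnin = 50)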

Simulation Studies

To investigate the advantages of adopting the multidimensional framework, I conducted

simulation studies systematically varying two factors: the αj parameter and the total number of

items in the questionnaire. In this paper, I am primarily interested in the differences between the

results of the one-dimensional approach and a multidimensional approach. Table 1 shows the

different conditions I will be examining. The conditions in the table represent real tradeoffs that

researchers are often forced to make. We often find ourselves in a position where we do not have

ideal measurement instruments. The α_j parameter is one way of thinking about how well a particular

item measures the latent variable. In an ideal world, we would have unlimited questionnaire

space that we could fill with items that provide precise measurements of the latent variables we

are interested in studying. However, in reality, we often have only very limited space and less-

than-perfect measures. Modeling the correlation structure in a multidimensional random variable

directly should allow us to borrow information across the different dimensions of the latent

variable to produce more efficient estimates. The method is more computationally intensive, but

if it produces real gains in efficiency, the extra time and resources devoted at the analysis stage should help compensate for unavoidable problems at the design stage.

[Table 1 here]

Figures 1 and 2 show how different values of αj relate the latent variable, θ, to the

probability of giving a particular response in the GPCM. Comparing Figure 1 and Figure 2

reveals how the α parameter affects the probability of giving a particular response over the range

of θ. The β parameters were kept constant. Items with high values of α convey more information

about the underlying latent variable.

[Figures 1 and 2 here]

Results

In the analyses that follow, I will use simulated responses to four questionnaires

corresponding to the different columns in Table 1. The latent variable follows a trivariate normal distribution,

\theta_i \sim \mathrm{MVN}_3(0, \Sigma),

where \Sigma is a correlation matrix whose off-diagonal elements were held fixed across all conditions. The true values of the item parameters were fixed, and responses follow the GPCM.
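A sketch of the data-generating process is below; the correlation values and item parameters shown are illustrative placeholders rather than the true values used in the simulations.

# Sketch of the simulated data-generating process (placeholder parameter values)
library(MASS)
set.seed(42)

n <- 500
Sigma <- matrix(c(1.0, 0.6, 0.5,
                  0.6, 1.0, 0.3,
                  0.5, 0.3, 1.0), nrow = 3)   # unit variances; illustrative correlations
theta <- mvrnorm(n, mu = rep(0, 3), Sigma = Sigma)

# One item measuring dimension 1, with six ordered categories; gpcm_probs() is
# the sketch defined earlier
a <- 1.5
b <- c(0, -1.5, -0.5, 0.5, 1.5, 2.5)
y <- sapply(theta[, 1], function(t) sample(1:6, size = 1, prob = gpcm_probs(t, a, b)))
table(y)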

Implications for parameter estimation

By examining the raw correlation between the predicted latent variable and the true

values from the simulation, we can begin to get an idea of the advantages of multidimensional item response models. Table 2 shows the Pearson's correlation between the predictions from the estimation and the true values, broken down by the factors listed in Table 1. The table permits many relevant

comparisons. For my purposes, I am most interested in comparing the simultaneous and separate

estimations. A naïve estimate (an additive index of responses) is included for comparison.
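The naïve estimate is simply the sum of the responses to the items assumed to measure each dimension; a short sketch (with illustrative object names) is:

# Naive estimate: additive indices over the items loading on each dimension.
# Y is the response matrix; items_dim1, items_dim2, and items_dim3 are
# illustrative vectors of column indices.
naive <- cbind(rowSums(Y[, items_dim1]),
               rowSums(Y[, items_dim2]),
               rowSums(Y[, items_dim3]))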

[Table 2 here]

In every case, the simultaneous estimation outperforms the separate estimation. In some

situations, the improvement is dramatic. However, the naïve estimate performs remarkably well.

Indeed, as α or k increases, the correlation between the true values of the latent variable and the naïve estimator rivals that of the most computationally intensive simultaneous estimation. In most cases,

the naïve estimator actually outperforms the separate estimation of the latent variables. This

initial result seems disheartening on its face. If we can get a reasonably good approximation of

the latent construct by simply summing the responses to the questionnaire, why bother with

difficult estimation?

As we might expect, the multidimensional framework shows its most marked improvement in modeling the structure of the latent variable. Table 3 shows the estimates of the

correlations between each element of the latent variable. The true correlations are listed across

the top of the columns.

[Table 3 here]

Comparing the naïve estimate and the separate estimation in Table 3 reveals a story similar to the one told in Table 2. In nearly every case, the naïve estimate significantly outperforms the

separate estimation when it comes to modeling the dependence between the elements of the

latent variable. However, by estimating the correlations directly in the simultaneous estimation,

we see marked improvement. Even in the worst case where k is low and α is low, the estimated

correlations are remarkably close to the actual correlations. Indeed, in this sample, increasing the

values of k and α did not lead to dramatically better estimates of the correlations.

Implications for Statistical Analysis

In most cases, we are interested in modeling latent constructs with the ultimate aim of including them on the right-hand side of a regression equation. Using the true values of the latent variables, I created a simple model with a continuous dependent variable that follows the OLS specification

y_i = \gamma_1 \theta_{i1} + \gamma_2 \theta_{i2} + \gamma_3 \theta_{i3} + \varepsilon_i,

where \varepsilon_i \sim \mathrm{N}(0, \sigma^2) and the coefficients γ1, γ2, and γ3 were fixed at known values.
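In practice, the comparison amounts to rerunning the same regression with the estimated scores in place of the true scores; a brief sketch (with illustrative object names y_dep, theta_true, and theta_hat) is:

# Sketch: regression on the true latent variables versus the estimated scores
fit_true <- lm(y_dep ~ theta_true[, 1] + theta_true[, 2] + theta_true[, 3])
fit_est  <- lm(y_dep ~ theta_hat[, 1] + theta_hat[, 2] + theta_hat[, 3])
cbind(true = coef(fit_true), estimated = coef(fit_est))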

Table 4 shows the resulting estimates of the coefficients (γ1, γ2, γ3) obtained from the different modeling strategies. The row labeled "True Values" contains the estimated coefficients when we use the true values

of the latent variables in the regression. The other cell entries are the coefficients obtained when

using the different estimation strategies outlined in Table 1.

[Table 4 here]

Comparing the results of Table 4 to those of Table 2 reveals just how important it is to

account for the correlations between the different dimensions of the latent variables. Although

the naïve and separate strategies produced reasonable approximations of the latent variables, they yield regression coefficients that are substantially smaller (in absolute terms) than the true values. Even with a short questionnaire (small k) composed of questions that only poorly measure the latent variable, the simultaneous estimation produces estimates that are reasonably close to the true

values.

Discussion and Conclusion

In this paper, I have investigated the properties of a multidimensional item response

theory model for ordinal responses. I have shown that modeling the multidimensional nature of

the latent variable directly can add significantly to the quality of the final estimates. This is

especially evident when using the estimates obtained from the model as independent variables in

subsequent analyses.

There were several possibilities that I did not test directly that might make for interesting

future research. In all of my analyses, I assumed knowledge of the model that generated the data.

It would be worth investigating how violations of this modeling assumption affect subsequent

stages of the analysis. The GPCM is fairly flexible and seems a reasonable choice for most

ordered response data, but one can definitely imagine circumstances where a different model

might be more appropriate (Van Der Ark 2001).



Another interesting avenue of investigation might be to vary the correlations between the

dimensions of the latent variables. I held that aspect of the data constant in my analyses here, but

it would be interesting to know how increasing or decreasing the correlations affects the other

parts of estimation. I would suspect that the multidimensional approach will show the most

improvement in cases where the correlations are high.

It would also be interesting to discover how exploiting the multidimensional nature of the

latent variable might be able to compensate for poor measures across one or more dimensions. I

varied all values of α simultaneously, but there are several situations where we might have

excellent measures of one dimension and poor measures of another. As I showed here, the MIRT

approach produces valid estimates of the correlation structure of the latent variable even when

we have only a few, imprecise measurements (see Table 3). The MIRT method described in this

paper allows us to borrow strength from the other dimensions of the latent variable in the

estimation.

Multidimensional Item Response Theory models hold a great deal of promise for social

scientists. In this paper, I have shown through simulation studies how they might allow us to

transform a small number of noisy measures into estimates in which we can place a great deal of confidence.

References

Baril, Galen L., and William F. Stone. 1984. "Mixed Messages: Verbal-Vocal Discrepancy in Freshman Legislators." Political Psychology 5(1): 83-98.

Bock, R. Darrell, and Murray Aitkin. 1981. "Marginal Maximum Likelihood Estimation of Item Parameters: Application of an EM Algorithm." Psychometrika 46(4): 443-459.

Bolt, Daniel M., and Venessa F. Lall. 2003. "Estimation of Compensatory and Noncompensatory Multidimensional Item Response Models Using Markov Chain Monte Carlo." Applied Psychological Measurement 27(6): 395-414.

Conover, Pamela Johnston, and Stanley Feldman. 1981. "The Origins and Meaning of Liberal/Conservative Self-Identifications." American Journal of Political Science 25(4): 617-645.

Curtis, S. McKay. 2010. "BUGS Code for Item Response Theory." Journal of Statistical Software 36: 1-34.

Delli Carpini, Michael X., and Scott Keeter. 1993. "Measuring Political Knowledge: Putting First Things First." American Journal of Political Science 37(4): 1179-1206.

Eysenck, H. J. 1990. "Genetic and Environmental Contributions to Individual Differences: The Three Major Dimensions of Personality." Journal of Personality 58(1): 245-261.

Gerber, Alan S., et al. 2010. "Personality and Political Attitudes: Relationships Across Issue Domains and Political Contexts." American Political Science Review 104(1): 111-133.

Huber, Gregory A., and John S. Lapinski. 2008. "Testing the Implicit-Explicit Model of Racialized Political Communication." Perspectives on Politics 6(1): 125-134.

Jacoby, William G. 2008. "Comment: The Dimensionality of Public Attitudes toward Government Spending." Political Research Quarterly 61(1): 158-161.

Jost, John T., Christopher M. Federico, and Jaime L. Napier. 2009. "Political Ideology: Its Structure, Functions, and Elective Affinities." Annual Review of Psychology 60: 307-337.

Leung, Shing-On. 2008. "A Three-Dimensional Latent Variable Model for Attitude Scales." Sociological Methods & Research 37(1): 135-154.

Mondak, Jeffrey J. 2001. "Developing Valid Knowledge Scales." American Journal of Political Science 45(1): 224-238.

Mondak, Jeffrey J., et al. 2010. "Personality and Civic Engagement: An Integrative Framework for the Study of Trait Effects on Political Behavior." American Political Science Review 104(1): 85-110.

Muraki, Eiji. 1992. "A Generalized Partial Credit Model: Application of an EM Algorithm." Applied Psychological Measurement 16(2): 159-176.

Olver, James M., and Todd A. Mooradian. 2003. "Personality Traits and Personal Values: A Conceptual and Empirical Integration." Personality and Individual Differences 35(1): 109-125.

Tetlock, Philip E. 1981. "Personality and Isolationism: Content Analysis of Senatorial Speeches." Journal of Personality and Social Psychology 41(4): 737-743.

Treier, Shawn, and D. Sunshine Hillygus. 2009. "The Nature of Political Ideology in the Contemporary Electorate." Public Opinion Quarterly 73(4): 679-703.

Treier, Shawn, and Simon Jackman. 2002. "Beyond Factor Analysis: Modern Tools for Social Measurement." Paper presented in Chicago, IL.

Treier, Shawn, and Simon Jackman. 2008. "Democracy as a Latent Variable." American Journal of Political Science 52(1): 201-217.

Van Der Ark, L. Andries. 2001. "Relationships and Properties of Polytomous Item Response Theory Models." Applied Psychological Measurement 25(3): 273-282.

Wollack, James A., et al. 2002. "Recovery of Item Parameters in the Nominal Response Model: A Comparison of Marginal Maximum Likelihood Estimation and Markov Chain Monte Carlo Estimation." Applied Psychological Measurement 26(3): 339-352.

Tables and Figures

Table 1: Experimental Factors

Columns (the four simulated questionnaires): Small Questionnaire (k = 6) and Large Questionnaire (k = 15), each with Low α (0.5) or High α (1.5) items.
Rows (estimation strategies): Estimated Separately (three sets of unidimensional estimates) and Estimated Simultaneously (one set of multidimensional estimates).
Note: separate estimation is the quickest to compute but yields the least efficient estimates; simultaneous estimation yields the most efficient estimates but requires the longest computation.

Figure 1: Low α (α = 0.5). P(Response) for each of the six categories, P(Y = 1) through P(Y = 6), plotted against θ from -4 to 4.

Figure 2: High α (α = 1.5). P(Response) for each of the six categories, P(Y = 1) through P(Y = 6), plotted against θ from -4 to 4.

Table 2

                                      θ1       θ2       θ3
Low k    Low α    Naïve Estimate    0.618    0.671    0.600
                  Separately        0.618    0.663    0.596
                  Simultaneously    0.663    0.725    0.635
         High α   Naïve Estimate    0.811    0.829    0.850
                  Separately        0.806    0.826    0.851
                  Simultaneously    0.845    0.840    0.859
High k   Low α    Naïve Estimate    0.797    0.794    0.780
                  Separately        0.657    0.659    0.633
                  Simultaneously    0.819    0.823    0.797
         High α   Naïve Estimate    0.920    0.925    0.927
                  Separately        0.816    0.828    0.849
                  Simultaneously    0.921    0.932    0.932

Note: cell entries are Pearson's correlation coefficients between the estimated latent variable and the true value.

Table 3

                                      ρ12      ρ13      ρ23
Low k    Low α    Naïve Estimate    0.259    0.178    0.148
                  Separately        0.251    0.183    0.157
                  Simultaneously    0.676    0.442    0.363
         High α   Naïve Estimate    0.387    0.345    0.215
                  Separately        0.385    0.343    0.213
                  Simultaneously    0.567    0.528    0.333
High k   Low α    Naïve Estimate    0.348    0.299    0.162
                  Separately        0.228    0.153    0.139
                  Simultaneously    0.589    0.533    0.298
         High α   Naïve Estimate    0.456    0.421    0.219
                  Separately        0.365    0.340    0.209
                  Simultaneously    0.582    0.551    0.312

Note: cell entries are correlations between the estimated factors for the naïve and separate estimations; the entries for the simultaneous estimation are the estimates of the correlation hyperparameters.

Table 4

                                      γ1       γ2       γ3
True Values                          0.52    -0.40    -0.20
Low k    Low α    Naïve Estimate     0.11    -0.14    -0.09
                  Separately         0.19    -0.24     0.13
                  Simultaneously     0.48    -0.46    -0.19
         High α   Naïve Estimate     0.24    -0.21    -0.09
                  Separately         0.29    -0.25    -0.11
                  Simultaneously     0.45    -0.36    -0.18
High k   Low α    Naïve Estimate     0.20    -0.20    -0.12
                  Separately         0.24    -0.23    -0.10
                  Simultaneously     0.45    -0.37    -0.24
         High α   Naïve Estimate     0.37    -0.30    -0.15
                  Separately         0.33    -0.31    -0.11
                  Simultaneously     0.54    -0.42    -0.22

Appendix

My code was adapted from Curtis (2010, 10). To estimate the model, I ran the following BUGS code:

model{
for (i in 1:n) {
for (j in 1:p) {
Y[i, j] ~ dcat(prob[i, j, 1:6])
}
#This code ensures that the theta will have the desired correlation structure
theta1[i] ~ dnorm(0.0, 1.0)
rand2[i] ~ dnorm(0.0, 1.0)
theta2[i] <- theta1[i]*rho12 + rand2[i]*temp2
rand3[i] ~ dnorm(0.0, 1.0)
theta3[i] <- theta1[i]*rho13 + rand2[i]*temp3 + rand3[i]*pow((1-rho13*rho13-temp3*temp3),.5)
}

for (i in 1:n) {
for (j in 1:p) {
for (k in 1:6) {
#This breaks down the model into manageable pieces
#The d1, d2, and d3 variables are row vectors of 0s and 1s to indicate which questions
#fall on which dimension
eta[i, j, k] <-alpha[j]*(theta1[i]*d1[j]+theta2[i]*d2[j]
+theta3[i]*d3[j]-beta[j,k])
psum[i, j, k] <- sum(eta[i, j, 1:k])
epsum[i, j, k] <- exp(psum[i,j,k])
prob[i,j,k] <- epsum[i,j,k]/sum(epsum[i,j,1:6])
}
}
}

for (j in 1:p) {
alpha[j] ~ dnorm(m.alpha, pr.alpha) I(0,)
beta[j,1] <- 0.0
for (k in 2:6) {
beta[j, k] ~ dnorm(m.beta, pr.beta)
}
}
pr.alpha <- pow(s.alpha, -2)
pr.beta <- pow(s.beta, -2)
rho12 ~ dunif(-1,1)
rho13 ~ dunif(-1,1)
#this code ensures that the correlation matrix will be positive definite
rho23.star ~ dunif(0,1)
min <- pow(((1-rho12*rho12)*(1-rho13*rho13)), .5)*-1 + rho12*rho13
max <- pow(((1-rho12*rho12)*(1-rho13*rho13)), .5) + rho12*rho13
range <- max - min
rho23 <- rho23.star*range+min
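#temp2 and temp3 are the second-column entries of the Cholesky factor of the
#correlation matrix, so theta2 and theta3 (defined above) end up with the
#desired correlations with theta1 and with each other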
temp2 <- pow((1-rho12*rho12),.5)
temp3 <- (rho23-rho12*rho13)/temp2
}
