
Software Review

Applied Psychological Measurement
37(2) 162–172
© The Author(s) 2012
Reprints and permission: sagepub.com/journalsPermissions.nav
DOI: 10.1177/0146621612465487
http://apm.sagepub.com

A Review of DIMPACK Version 1.0: Conditional Covariance–Based Test Dimensionality Analysis Package

Nina Deng1, Kyung T. Han2, and Ronald K. Hambleton3

Abstract
DIMPACK Version 1.0 for assessing test dimensionality based on a nonparametric conditional
covariance approach is reviewed. This software was originally distributed by Assessment
Systems Corporation and now can be freely accessed online. The software consists of
Windows-based interfaces of three components: DIMTEST, DETECT, and CCPROX/HAC,
which conduct a hypothesis test of unidimensionality, cluster items, and perform hierarchical
cluster analysis, respectively. Two simulation studies were conducted to evaluate the software
in confirming test unidimensionality (a Type I error study) and detecting multidimensionality (a
statistical power study). The results suggested that a different data set should always be used
for selecting assessment subtest items, independent of the data used to calculate the DIMTEST
statistic; otherwise, the Type I error rate was excessively inflated. Statistical power was found
to be low when the sample size was small or the dimensions were highly correlated. It is
suggested that some major changes be made to the software before it can be recommended to
practitioners.

Keywords
test dimensionality, DIMPACK, item response theory software

Introduction
Item response theory (IRT) and its applications have been widely used in various educational
and psychological testing practices, including test construction, ability estimation, score report-
ing, equating, detection of differential item functioning (DIF), and computer adaptive testing.
One of the fundamental assumptions for the most commonly used IRT models is unidimension-
ality (Hambleton, Swaminathan, & Rogers, 1991), that is, only one single latent variable
accounts for the item responses. To justify the uses of IRT, unidimensionality needs to be
1 University of Massachusetts Medical School, Worcester, USA
2 Graduate Management Admission Council, Reston, VA, USA
3 University of Massachusetts–Amherst, USA

Corresponding Author:
Nina Deng, University of Massachusetts Medical School, 55 Lake Ave. N., AC7-053, Worcester, MA 01655, USA
Email: nina.deng@umassmed.edu
evaluated before any unidimensional IRT model is applied. However, it is often the
case that many educational and psychological tests are not strictly unidimensional. Therefore, it
is important to check whether the test satisfies the unidimensionality assumption and to detect
how many dimensions may be needed to fit the test data. Checking test dimensionality structure
is, therefore, a fundamental practice and should always be conducted before applying an IRT
model to avoid negative consequences.

Conditional Covariance (CCOV)–Based Dimensionality Assessment


Many methods and procedures have been developed to assess test dimensionality over the
past decades. Stout (1987) proposed a nonparametric approach based on the concept called
essential unidimensionality. Under this approach, a test is regarded as essentially unidimen-
sional if the item-pair covariances, conditional on the dominant dimension, are close to zero.
The item-pair CCOV for items i and l based on the unidimensional proficiency or trait,
$\theta_{TT}$, is defined as follows:

$$\mathrm{CCOV}_{i,l} = \int_{-\infty}^{\infty} \mathrm{Cov}(U_i, U_l \mid \Theta_{TT} = \theta_{TT})\, f(\theta_{TT})\, d\theta_{TT}, \qquad (1)$$

where $f(\theta_{TT})$ is the assumed probability density of $\theta_{TT}$ in the examinee population.


The estimates of $\mathrm{CCOV}_{i,l}$ for a given unidimensional IRT model can be used to analyze
whether the underlying structure of data is essentially unidimensional or multidimensional. This
idea has been implemented in the procedures of DIMTEST, DETECT, and CCPROX/HAC
(Stout et al., 1996).
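Because $\theta_{TT}$ is unobservable, implementations of this approach typically condition on an observed proxy such as the number-correct score on the remaining items. The sketch below illustrates the idea for a single item pair; it is not DIMPACK's actual algorithm, and the function name and rest-score conditioning are our simplifying assumptions:

```python
import numpy as np

def conditional_covariance(responses, i, l):
    """Estimate CCOV for items i and l, conditioning on the
    number-correct score over all *other* items (the rest score),
    a common observable proxy for theta_TT."""
    n_items = responses.shape[1]
    rest = [j for j in range(n_items) if j not in (i, l)]
    rest_score = responses[:, rest].sum(axis=1)

    ccov, n_total = 0.0, len(responses)
    for s in np.unique(rest_score):
        group = responses[rest_score == s]
        if len(group) < 2:   # covariance is undefined for a single examinee
            continue
        # within-group covariance of the item pair, weighted by the
        # group's relative frequency (a discrete analogue of Equation 1)
        ccov += np.cov(group[:, i], group[:, l])[0, 1] * len(group) / n_total
    return ccov
```

For essentially unidimensional data this estimate should hover near zero, whereas item pairs that share a secondary dimension yield clearly positive values.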
DIMTEST was originally developed by Stout (1987) and enhanced by Stout, Froelich, and
Gao (2001). DIMTEST provides a nonparametric hypothesis test of the assumption of
unidimensionality. The null hypothesis is that the data satisfy unidimensionality and local
independence. The procedure has two stages. First, the test is partitioned into two sets of
items: AT (assessment subtest) and PT (partitioning subtest). The AT items are usually chosen
because they are believed to measure a dimension different from the one the majority of
items (PT) measure. Once the AT and PT items are chosen, the estimate
of the T statistic is calculated as follows:

$$T = \sum_{i<l \,\in\, AT} \int_{-\infty}^{\infty} \mathrm{Cov}\left(U_i, U_l \mid \Theta_{TT} = \theta_{TT}\right) f(\theta_{TT})\, d\theta_{TT}. \qquad (2)$$

The statistic, T*, is assumed to asymptotically follow a standard normal distribution.


The null hypothesis states that AT is dimensionally similar to PT. It is rejected, at the chosen
alpha level (the nominal Type I error rate), if the test statistic exceeds the predetermined
critical value from the normal distribution, leading to the conclusion that AT is dimensionally
distinct from PT. Otherwise, the test is viewed as essentially unidimensional (i.e., any residual
covariance can be practically treated as ignorable).
Previous studies suggest that the number of PT items should always be larger than that of AT
items, and the number of AT items should be no less than four.
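To make the two-stage logic concrete, the numerator of the statistic can be sketched as the sum of conditional covariances over AT item pairs, conditioning on the PT number-correct score. This is an illustrative simplification with a hypothetical function name; the operational DIMTEST statistic additionally standardizes this sum and applies a resampling-based bias correction (Stout, Froelich, & Gao, 2001), which the sketch omits:

```python
import numpy as np
from itertools import combinations

def at_ccov_sum(responses, at_items):
    """Sum of estimated conditional covariances over all AT item
    pairs, conditioning on the PT number-correct score -- a crude
    stand-in for the quantity summed in Equation 2."""
    n_items = responses.shape[1]
    pt_items = [j for j in range(n_items) if j not in at_items]
    pt_score = responses[:, pt_items].sum(axis=1)

    total, n = 0.0, len(responses)
    for i, l in combinations(at_items, 2):
        for s in np.unique(pt_score):
            group = responses[pt_score == s]
            if len(group) < 2:
                continue
            # conditional covariance weighted by group frequency
            total += np.cov(group[:, i], group[:, l])[0, 1] * len(group) / n
    return total
```

A clearly positive sum suggests the AT items share a dimension beyond the one the PT items measure; a value near zero is consistent with essential unidimensionality.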
The DETECT procedure was developed by Kim (1994) and Zhang and Stout (1999) to
explore the plausible structure of multidimensional data. It is the first nonparametric technique
to actually estimate the number of dimensions (vs. merely rejecting an assumption of unidimen-
sionality) and to identify which dimension is predominantly measured by each item. DETECT
is based on the assumption that the data follow an approximate simple structure. This means
the items can be grouped into a number of nonoverlapping clusters, each cluster measuring a
distinct dimension. Each item can only be viewed as measuring one dimension. The items are
split so that items in the same cluster have positive $\mathrm{CCOV}_{i,l}$, whereas items in different
clusters have negative $\mathrm{CCOV}_{i,l}$. DETECT searches through all possible item partitions
(see description of CCPROX/HAC below) and tries to find the optimal partition by maximizing
the index D:

$$D(P, \Theta_{TT}) = \frac{2}{n(n-1)} \sum_{1 \le i < l \le n} \delta_{i,l}\, \mathrm{CCOV}_{i,l}, \qquad (3)$$

where $\delta_{i,l} = 1$ if items i and l are in the same cluster and $-1$ if they are in different clusters. The number
of clusters should approximate the test dimensionality. Unlike DIMTEST, DETECT does not
provide a hypothesis test.
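A direct transcription of the index is straightforward, assuming a precomputed CCOV matrix (names here are illustrative, not DIMPACK's):

```python
import numpy as np

def detect_index(ccov, cluster_of):
    """Compute D for a given partition: delta is +1 for item pairs
    in the same cluster and -1 otherwise, as in Equation 3."""
    n = len(cluster_of)
    d = 0.0
    for i in range(n):
        for l in range(i + 1, n):
            delta = 1.0 if cluster_of[i] == cluster_of[l] else -1.0
            d += delta * ccov[i, l]
    return 2.0 * d / (n * (n - 1))
```

A partition matching the true simple structure maximizes D, because it pairs the positive within-cluster CCOVs with delta = +1 and the negative between-cluster CCOVs with delta = -1.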
CCPROX/HAC (Roussos, Stout, & Marden, 1998) is a dimensionality-sensitive cluster anal-
ysis procedure. HAC performs an agglomerative hierarchical cluster analysis, based on the
dimensionality-sensitive proximity matrix provided by CCPROX. The agglomerative HAC
quickly clusters the items progressively into larger dimensionally homogeneous groups. At each
level of the hierarchy, the two clusters with the minimum proximity are joined at the next level.
The proximity measure for a pair of items i and l is given by

$$\rho_{ccov}(U_i, U_l) = -\,\mathrm{CCOV}_{i,l} + \mathrm{constant}. \qquad (4)$$

By adding a constant and reversing the sign of $\mathrm{CCOV}_{i,l}$, items with a positive CCOV will
have a small $\rho_{ccov}$ value, whereas items with a negative CCOV will have a large $\rho_{ccov}$ value.
CCPROX and HAC are typically used together to suggest a plausible dimensional structure
at each level of the cluster hierarchy. In particular, these procedures can jointly provide
the initial item clustering for DETECT and possible AT items to be tested in DIMTEST
(Froelich & Habing, 2008). However, the algorithms do not automatically determine which
stage of the cluster analysis presents the optimal clustering of the items.
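The agglomerative merge loop can be sketched in a few lines, here with single linkage purely for brevity (DIMPACK's SAHN implementation offers several linkage options; this naive version is only meant to show the logic):

```python
def hac(proximity):
    """Sequential agglomerative clustering on a symmetric proximity
    matrix: repeatedly merge the two clusters with minimum proximity,
    returning the merge history as (cluster_a, cluster_b, proximity)."""
    clusters = [[i] for i in range(len(proximity))]
    merges = []
    while len(clusters) > 1:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                # single linkage: closest pair of items across clusters
                d = min(proximity[i][j] for i in clusters[a]
                                        for j in clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        d, a, b = best
        merges.append((list(clusters[a]), list(clusters[b]), d))
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return merges
```

With the proximity of Equation 4, dimensionally similar items (small $\rho_{ccov}$) merge first, so the early levels of the hierarchy collect items measuring the same dimension.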

Program Description
DIMPACK (Version 1.0) is a comprehensive computer software package incorporating the var-
ious nonparametric CCOV-based dimensionality analyses for dichotomous items. It was devel-
oped by Roussos and Stout at the William Stout Institute for Measurement (2006) and is
distributed by Assessment Systems Corporation. Now, it can be freely accessed at the
PsychoSource website maintained by Measured Progress (http://psychometrictools.measuredprogress.org/dif2).
The program has Microsoft Windows®-based interfaces for three software components:
DIMTEST, DETECT, and CCPROX/HAC, each of which was previously distributed as
stand-alone DOS programs. DIMPACK integrates the three components into one package and
streamlines the analyses of clustering items, detecting multidimensionality and testing the uni-
dimensionality hypothesis.
The software limits the maximum number of items to 150 and the maximum number of
examinees to 7,000. It requires Microsoft Windows 98® or later and .NET Framework 1.1 or
later to run. It is easy to install and very user-friendly to run. To install the package after down-
loading, the user simply runs the ‘‘setup.exe’’ program and follows the on-screen instructions.
The default directory to which the program is installed is ‘‘c:\Program Files\Dimpack1.0\.’’
Below is the description of each interface component in the package.

DIMTEST
The overall interface of DIMTEST is quite user-friendly and the steps are straightforward.
There are three general program execution steps: (a) specify and load the data set(s), (b) choose
the AT items, and (c) specify the output file and run the program.

AT Item Selection
As DIMTEST tests a unidimensionality hypothesis based on the covariances of pairs of AT
items conditional on PT items, the selection of AT items is quite important and can dramati-
cally affect the performance of DIMTEST and the results. Compared with the earlier
DIMTEST_DOS program, there are two major changes made in the current version of
DIMPACK. First, the earlier DIMTEST_DOS program divided the test into three required subt-
ests: AT, AT2, and PT. In DIMPACK, the AT2 requirement was eliminated and instead
replaced by an AT-based simulated data set. This change was apparently implemented to
enable the program to be used with shorter tests. Second, the DIMTEST_DOS program used a
linear factor analysis (FAC) for automatic selection of AT items. In DIMPACK, FAC was
replaced by ATFIND, a procedure combining CCPROX/HAC and DETECT analyses.
Two methods are available for selecting AT items in DIMPACK: (a) a confirmatory analysis
(called user specified in DIMPACK) and (b) an exploratory analysis. The confirmatory analysis
requires a preidentified list of AT items. If the users are unable to identify the AT items, they
can elect to use the exploratory analysis approach, which uses an automatic selection proce-
dure, ATFIND, to statistically identify the AT items. The exploratory analysis can either use
the current data set or accept as input an independent data set to carry out the AT item selec-
tion. Using the current data is the default. Unfortunately, it implies that the same data set is
used for AT item selection and the DIMTEST statistic calculation. In our study, described
further on, we found that using an independent data set was essential as cross-validation
evidence and to avoid inflated Type I error rates due to capitalization on chance. To
use independent data (two separate data sets), the user needs to prepare beforehand a second
data set for input. One option might be to randomly divide the original data into two sets with
sampling ratios ideally between 1:2 and 1:1 (where a smaller data set can be used for AT item
selection and the larger data set for the DIMTEST analysis). Alternatively, users can manually
select the AT items, perhaps based on subject matter or expert judgment, or based on externally
run linear or nonlinear factor analyses. A minimum of 15 PT items is recommended in the
DIMPACK help file. The DIMTEST output file (with the appended file name extension ".dim")
provides the AT and PT item lists, the DIMTEST T* statistic, and the associated p value.
Depending on the chosen Type I error level, users can decide whether to reject the null hypoth-
esis of essential unidimensionality.
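DIMPACK itself has no option to split one file automatically for DIMTEST, but the recommended split is easy to prepare externally. A minimal sketch follows; the one-third fraction and the function name are our own illustrative choices:

```python
import numpy as np

def split_for_dimtest(responses, at_fraction=1/3, seed=0):
    """Randomly partition examinees into an AT-selection set
    (at_fraction of the sample) and an independent set for the
    DIMTEST analysis, to avoid capitalization on chance."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(len(responses))
    cut = int(len(responses) * at_fraction)
    return responses[order[:cut]], responses[order[cut:]]
```

The smaller set is then saved as the AT-selection input file and the larger set as the DIMTEST analysis file.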

DETECT
Similarly, there are confirmatory and exploratory approaches available in DETECT to partition
the items into clusters. Under the confirmatory approach, users must specify the number of clus-
ters and the items associated with each cluster. The interface allows users to either manually
select the items or to provide an input file (referred to as a "cluster file"). For the
exploratory approach, users can decide whether to do cross-validation, and furthermore, whether
to use one file or two files for the cross-validation analysis. If only one file is available for
cross-validation, the user must specify the number of examinees set aside for cross-validation;
DETECT will then randomly split the original group of examinees into two samples. The output
file generated by DETECT, with a file name extension ‘‘.det’’ appended, provides the clusters
and their associated items. As noted earlier, no dimensionality hypothesis test is available in
DETECT.

CCPROX/HAC
The third interface consists of two sections: CCPROX and HAC. CCPROX should be used prior
to HAC as it provides the proximity matrix that is subsequently required as the input for the
HAC analysis. Users specify the analysis parameters for CCPROX. For example, if guessing is
assumed for a multiple-choice test, the software authors recommend using the estimated lowest
number-correct score divided by the total number of items as the ‘‘guessing’’ parameter. In
addition, the user can specify whether CCPROX should produce CCOV, the proximity matrix
based on the conditional covariances, or conditional correlations (CCOR), the proximity matrix
based on product–moment correlations.
The proximity matrix file (*.prx) will be automatically loaded if HAC is run immediately
after executing CCPROX. The user also has a wide variety of HAC options from which to
choose regarding the nature and type of cluster analysis conducted. The actual hierarchical clus-
ter analysis is based on the ‘‘Sequential Agglomerative Hierarchical Nonoverlapping (SAHN)’’
algorithm. The HAC output file is saved with the appended file name extension ‘‘.hac.’’

Documentation and Support


The program has an easy-to-access HTML help window available from the program menu. The
Help contents consist of an overview of the program, instructions for loading data files, detailed
descriptions of each of the three component interfaces, and a reference list of relevant journal
articles. There is also a search function where the user can type in keywords to search on desired
topics. One rather obvious deficit is a lack of concrete examples. There should be examples of
the data files and required formats, as well as set-up parameters for at least several typical
applications. There is no digital or hard copy of a users' guide or installation documentation.
In fact, other than the Help system, the only other documentation is a collection of research arti-
cle citations related to nonparametric dimensionality detection methods. The referenced articles
are very technical and may not be very helpful for practitioners. In our judgment, good docu-
mentation should show how to actually conduct the analyses and how to interpret the results
with examples provided.

Simulation Studies
The earlier DOS versions of DIMTEST and DETECT have been widely disseminated and used
in many simulation studies that investigated their utility and performance. Previous studies tend
to show that DIMTEST adequately detects multidimensionality when the sample size is large,
when the test is quite long, and/or when the between-trait correlations are relatively low (e.g.,
Deng & Ansley, 2000). At least one study has shown that the Type I error is quite low
when there is no guessing or similar patterns of random responses evident in the data (Finch &
Habing, 2007). On the other hand, the performance of DIMTEST was found to be less satisfac-
tory with shorter tests or with data conforming to compensatory and partially compensatory
multidimensional IRT or factor analytic models (Hattie, Krakowski, Rogers, & Swaminathan,
1996; Meara, Robin, & Sireci, 2000). The literature does suggest that DETECT identifies sim-
ple structure effectively, but it does not perform well when the data exhibit complex item-factor
loading patterns (so-called ‘‘complex factor structures’’) and/or highly correlated dimensions.

To evaluate the new integrated DIMPACK package for this review, two simulation studies
were conducted: First, a Type I error study was conducted to evaluate the software in terms of
confirming test unidimensionality (where the null hypothesis is true). Second, a statistical power
study was conducted to evaluate the software in terms of detecting test multidimensionality
(where the alternative hypothesis, rejecting the unidimensionality assumption, is true).

Type I Error Study


The Type I error rates were evaluated in conditions crossed by three sample sizes (n = 500,
1,000, 5,000) and three test lengths (30, 50, 100 items). The data were generated from the unidi-
mensional 1-, 2-, and 3-parameter logistic (PL) dichotomous IRT models. The latent ability
distribution was generated as normal with mean 0 and SD 1. The 3PL IRT model is shown in
Equation 5 as an example:

$$P_i(\theta_j) = c_i + \frac{1 - c_i}{1 + \exp[-a_i(\theta_j - b_i)]}, \qquad (5)$$

where $a_i = 1$ and $c_i = 0$ in the 1PL model, and $c_i = 0$ in the 2PL model.
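The review does not list the generating item-parameter distributions, so the sketch below fills them in with common but assumed choices (lognormal discriminations, normal difficulties, modest guessing) just to show how such data can be produced:

```python
import numpy as np

def simulate_3pl(n_persons, n_items, seed=0):
    """Generate dichotomous responses from a unidimensional 3PL model
    (Equation 5) with theta ~ N(0, 1); the parameter distributions
    below are illustrative assumptions, not those of the review."""
    rng = np.random.default_rng(seed)
    theta = rng.normal(0.0, 1.0, n_persons)
    a = rng.lognormal(0.0, 0.3, n_items)        # discriminations
    b = rng.normal(0.0, 1.0, n_items)           # difficulties
    c = rng.uniform(0.10, 0.25, n_items)        # pseudo-guessing
    p = c + (1 - c) / (1 + np.exp(-a * (theta[:, None] - b)))
    return (rng.random((n_persons, n_items)) < p).astype(int)
```

Fixing a_i = 1 and c_i = 0 reduces the generator to the 1PL model; c_i = 0 alone gives the 2PL model.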


As discussed above, AT selection is an important step in DIMTEST, which may greatly
affect the program performance. Both AT selection methods provided by DIMTEST were com-
pared: that is, reusing the same data set versus using two different data sets for AT item selec-
tion and DIMTEST analysis. When using ‘‘different data,’’ the original, generated data set was
divided into two subsets: (a) one third of the sample was used for AT item selection and (b) the
remaining two thirds of the sample were used for DIMTEST analysis. It is noteworthy that
although the ‘‘different data’’ option is actually recommended as the preferred method in the
Help file, the option of ‘‘current data’’ is set as the default choice.
As a step further in evaluating the package, DIMPACK was compared with its earlier coun-
terpart DIMTEST_DOS program (Stout, Nandakumar, Junker, Chang, & Steidinger, 1992)
using the same data. Only conditions of 30 and 50 items were studied because of the upper limit
of the number of items in the DOS version of the program; nevertheless, both test lengths are
commonly found in practice. There are two types of statistics available in the DIMTEST_DOS
program: conservative and powerful statistics. Both are reported and considered in the simula-
tions. For each procedure and under each condition, 100 replications were generated. This pro-
vided empirical sampling distributions for the detection rates under each condition. The number
of rejections out of 100 replications was calculated and comparisons were then made with
respect to a nominal Type I error rate of a = .05 (the probability of falsely rejecting the unidi-
mensionality assumption when it is true).
The Type I error rates for different approaches are presented in Figures 1 and 2 for the
2PL and 3PL data, respectively. Because the results for the 1PL data were highly similar to
those for the 2PL data, they are not presented here. The results suggested that when using the
current data for AT item selection in DIMPACK (i.e., reusing the same data file for AT item
selection and the DIMTEST analysis), the Type I error rates may be highly inflated. The
improper rejection of a true null hypothesis reached 60% or even higher, far above the
nominal level of α = .05 and highly unsatisfactory on statistical or practical grounds. On the other
hand, when using two different data sets, one for the AT item selection and one for the inde-
pendent DIMTEST statistical analysis, the Type I error rates dropped dramatically and were
much closer to the nominal level. This study suggests that, due to the capitalization on chance
based on data reuse, it was very likely for the software to reject the essentially unidimen-
sional assumption even when the data were actually generated from a particular unidimen-
sional model. These are practically unacceptable results and therefore it is recommended that
the option ‘‘current data’’ should probably never be selected when conducting exploratory
analysis for AT selection in DIMPACK. At the very least, additional research is needed and
the default settings in DIMPACK should be reconsidered until the issues can be more fully
understood.
Although the Type I error rates under the studied conditions using "different data" (two
independent data sets) improved considerably, the large-sample conditions (5,000 examinees)
with short to moderate test lengths (30 or 50 items) still tended to have large Type I
error rates. For the smaller sample sizes (500 and 1,000 examinees), the Type I error rates
tended to be too low (.00-.02) compared with the nominal level (α = .05), which suggests an
increased potential for making Type II errors with so few examinees and limited usefulness
with small data sets. Interestingly, the results showed that both types
of statistics produced in DIMTEST_DOS had much lower Type I error rates when the sam-
ple size was large.

Statistical Power Study


Another simulation study was conducted to evaluate the software’s performance in terms of sta-
tistical power in detecting multidimensionality. The response data were generated based on a
two-dimensional compensatory 3PL model defined in Equation 6:

$$P_i(\theta_j) = c_i + \frac{1 - c_i}{1 + \exp[-(a_{i1}\theta_{j1} + a_{i2}\theta_{j2} + d_i)]}, \qquad (6)$$

where $a_{i1}$ and $a_{i2}$ are the slopes related to item i's discriminating power on Dimensions 1
and 2; $\theta_{j1}$ and $\theta_{j2}$ are person j's abilities on Dimensions 1 and 2; and $d_i$ is the
intercept, which is negatively related to the difficulty of item i.
The person abilities uj1 and uj2 were randomly drawn from a bivariate normal distribution
with both means set to 0 and both variances set to 1. Conditions were crossed by three sam-
ple sizes (n = 500, 1,000, 5,000), three test lengths (30, 50, 100 items), and four degrees of
correlations between the two dimensions (r = 0, 0.3, 0.6, 0.9), which resulted in 36 conditions
in total (3 × 3 × 4). Based on the results of the Type I error rate study, only the option
of ‘‘different data’’ was used in AT selection in the statistical power study. One hundred
replications were simulated in each condition. The numbers of rejections out of 100 replica-
tions were recorded.
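A data-generation sketch for one cell of this design follows; the approximate simple structure (half the items loading on each dimension) and the parameter distributions are our assumptions for illustration, not necessarily those of the study:

```python
import numpy as np

def simulate_2d_3pl(n_persons, n_items, rho, seed=0):
    """Generate responses from the two-dimensional compensatory 3PL
    model of Equation 6, with theta drawn from a bivariate normal
    with unit variances and correlation rho. Each item loads on
    exactly one dimension (approximate simple structure)."""
    rng = np.random.default_rng(seed)
    theta = rng.multivariate_normal([0.0, 0.0],
                                    [[1.0, rho], [rho, 1.0]], n_persons)
    a = np.zeros((n_items, 2))
    half = n_items // 2
    a[:half, 0] = rng.lognormal(0.0, 0.3, half)            # dimension 1
    a[half:, 1] = rng.lognormal(0.0, 0.3, n_items - half)  # dimension 2
    d = rng.normal(0.0, 1.0, n_items)    # intercepts
    c = np.full(n_items, 0.2)            # pseudo-guessing
    p = c + (1 - c) / (1 + np.exp(-(theta @ a.T + d)))
    return (rng.random((n_persons, n_items)) < p).astype(int)
```

As rho approaches 1, the two latent variables become nearly indistinguishable, which is exactly the regime where detection power is expected to drop.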
Results (Figure 3) showed that DIMPACK had excellent power when the sample size was
large (5,000 examinees) and the correlations between the dimensions were low to moderate
(correlations of 0, 0.3, and 0.6). In general, the higher the correlation was, the larger the sample
size needed to be to achieve a satisfactory level of statistical detection power. Small samples
(500 examinees) with short test lengths had very low power and are not recommended for
practical work, at least based on the results of this study.
DIMTEST_DOS was also run on all of the generated multidimensional data. Compared
with DIMTEST_DOS, DIMPACK ran more smoothly and without abnormal program termi-
nations or interruptions. Still, DIMTEST_DOS could not analyze certain two-dimensional
3PL-generated data sets. Finally, when DIMTEST_DOS processed a large number of such data
sets in a short period of time, as the simulations in this study did, the executable file for
AT selection (ASN.EXE) crashed and the program failed to produce any valid results.

[Figure 1 about here: four panels (DIMPACK: Current Data for AT; DIMPACK: Different Data for AT; DIMTEST_DOS: Conservative; DIMTEST_DOS: Powerful) plotting Type I error rate against test length (30, 50, 100) for N = 500, 1,000, and 5,000.]

Figure 1. Type I error rate for DIMPACK and DIMTEST_DOS with 2PL data.
Note: 2PL = two-parameter logistic model; AT = assessment subtest.

Conclusion and Limitations


DIMPACK is a comprehensive computer software package for dimensionality assessment
based on conditional covariance and, to date, is the only readily accessible nonparametric
dimensionality detection package. With the integration of three different dimensionality analy-
ses components, DIMTEST, DETECT, and CCPROX/HAC, the package allows for a variety
of interesting research activities, including a hypothesis test for lack of unidimensionality, clus-
tering the items into subtests, detecting dimensionality for simple structure data, calculating a
proximity matrix, and conducting a hierarchical cluster analysis.
A handful of features are made available in the package such as cross-validation, guessing
parameter specification, ignoring certain items, and so on. The user can choose either an
exploratory or a confirmatory analysis approach in selecting AT items that are suspected of
measuring a different dimension. The package is easy to install and user-friendly. In addition, it
ran smoothly during the simulation studies with no crashes, pauses, or processing interruptions.

[Figure 2 about here: the same four panels as Figure 1, plotting Type I error rate against test length for 3PL data.]

Figure 2. Type I error rate for DIMPACK and DIMTEST_DOS with 3PL data.
Note: 3PL = three-parameter logistic model; AT = assessment subtest.

Although the earlier DOS programs may have been slightly more efficient in terms of apparent
processing speed, possibly due to choices of compilers and other technical programming
issues, they failed to finish the analyses for some of the two-dimensional data studied by the
authors.
Based on our experience, the DIMPACK package appears to have some major shortcomings
that need to be addressed before it can be endorsed as practically useful. First and foremost, it
was found that the option of ‘‘current data’’ in AT selection resulted in distressingly high Type
I error rates under many typical situations that arise in practice and therefore should probably
be discouraged and investigated further. Certainly, the authors would not use the option in any
of their research work. It is suggested that "different data" always be selected; in fact, it
should become the new default data selection choice. In addition, it would be more convenient
for users if the software could provide an option to specify the exact number of examinees set
aside for AT selection and then automatically and randomly split the data into two sets for
cross-validation purposes. Second, the authors found that the statistical power was low under
many conditions except with very large sample sizes. This finding seems to limit the utility of

[Figure 3 about here: four panels (theta correlations of 0, 0.3, 0.6, and 0.9) plotting power rate against test length (30, 50, 100) for N = 500, 1,000, and 5,000.]

Figure 3. Power rate for DIMPACK with compensatory two-dimensional three-parameter logistic (3PL) data.

the software for many practitioners dealing with small or moderate data sets. Third, some dis-
crepancies were found in the Type I error rates with large sample size between DIMPACK and
DIMTEST_DOS programs. The discrepancies were substantial and deserve further investiga-
tion. Finally, successful applications and more widespread use may be hampered unless the
software developers provide better documentation with concrete examples of data files and
common applications.

Acknowledgment
The authors are grateful for the valuable comments from Richard M. Luecht, which strengthened the
review considerably.

Declaration of Conflicting Interests


The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publi-
cation of this article.

Funding
The authors received no financial support for the research, authorship, and/or publication of this article.

References
Deng, H., & Ansley, T. N. (2000, April). Detecting compensatory and noncompensatory multi-
dimensionality using DIMTEST. Paper presented at the meeting of the National Council on
Measurement in Education, New Orleans, LA.
Finch, H., & Habing, B. (2007). Performance of DIMTEST- and NOHARM-based statistics for testing
unidimensionality. Applied Psychological Measurement, 31, 292-307.
Froelich, A. G., & Habing, B. (2008). Conditional covariance-based subtest selection for DIMTEST.
Applied Psychological Measurement, 32, 138-155.
Hambleton, R. K., Swaminathan, H., & Rogers, H. J. (1991). Fundamentals of item response theory.
Newbury Park, CA: SAGE.
Hattie, J., Krakowski, K., Rogers, H. J., & Swaminathan, H. (1996). An assessment of Stout’s index of
essential unidimensionality. Applied Psychological Measurement, 20, 1-14.
Kim, H. R. (1994). New techniques for the dimensionality assessment of standardized test data
(Unpublished doctoral dissertation). Department of Statistics, University of Illinois at Urbana-
Champaign, IL.
Meara, K., Robin, F., & Sireci, S. G. (2000). Using multidimensional scaling to assess the dimensionality
of dichotomous item data. Multivariate Behavioral Research, 35, 229-259.
Roussos, L., Stout, W., & Marden, J. (1998). Using new proximity measures with hierarchical cluster
analysis to detect multidimensionality. Journal of Educational Measurement, 35, 1-30.
Stout, W. (1987). A nonparametric approach for assessing latent trait unidimensionality. Psychometrika, 52,
589-617.
Stout, W., Froelich, A., & Gao, F. (2001). Using resampling methods to produce an improved DIMTEST
procedure. In A. Boomsma, M. A. J. van Duijn, & T. A. B. Snijders (Eds.), Essays on item response
theory (pp. 357-376). New York, NY: Springer-Verlag.
Stout, W., Habing, B., Douglas, J., Kim, H., Roussos, L., & Zhang, J. (1996). Conditional covariance-based
nonparametric multidimensionality assessment. Applied Psychological Measurement, 20, 331-354.
Stout, W., Nandakumar, R., Junker, B., Chang, H.-H., & Steidinger, D. (1992). DIMTEST: A Fortran
program for assessing dimensionality of binary item responses. Applied Psychological Measurement,
16, 236.
William Stout Institute for Measurement. (2006). Nonparametric dimensionality assessment package
DIMPACK (Version 1.0) [Computer software]. St. Paul, MN: Assessment Systems Corporation.
Zhang, J., & Stout, W. (1999). The theoretical DETECT index of dimensionality and its application to
approximate simple structure. Psychometrika, 64, 213-249.
