Assessment & Evaluation in Higher Education
Vol. 33, No. 4, August 2008, 409–419

Assessing test-taking strategies of university students: developing a scale and estimating its psychometric indices

Hamzeh Dodeen*
Hamzeh Dodeen*

Psychology Program, UAE University, UAE



Test-taking strategies are important cognitive skills that strongly affect students' performance in tests. Using appropriate test-taking strategies improves students' achievement and grades, improves students' attitudes toward tests and reduces test anxiety. This in turn improves test accuracy and validity. This study aimed at developing a scale to assess students' test-taking strategies at university level. The developed scale went through several validation procedures that included content, construct and criterion-related validity. Similarly, scale reliability (internal reliability and stability over time) was assessed through several procedures. Four samples of students (50, 828, 553 and 235) participated by responding to different versions of the scale. The developed scale consists of 31 items distributed into four sub-scales: Before-test, Time management, During-test and After-test. To the researcher's knowledge, this is the first comprehensive scale developed to assess the test-taking strategies used by university students.

Introduction
Test scores should reflect the level of students' knowledge of the test content as well as related skills. This is a critical aspect of tests that are used to make decisions regarding particular persons. For example, "When test scores become the most important factor determining who gets included in and excluded from educational opportunities, scores that accurately reflect students' knowledge and skills become imperative" (Taylor and Walton 1997, 67). However, do test scores reflect only students' knowledge? Are there other variables that influence test scores? During test-taking, ability is not the only factor that affects students' performance. There are several cognitive and psychological factors, such as subject matter, level of test anxiety, attitudes toward the subject of the test, attitudes towards tests in general and test-taking strategies, that influence test scores (Hambleton et al. 1991). Therefore, several atypical student responses or behaviors may be observed during tests (Meijer 1996):

• An examinee having difficulty beginning the test may show 'sleeping' behavior.
• 'Plodding' behavior results from working slowly and not moving on to the next item.
• An alignment error may occur when a high-ability student skips an item in the test but forgets to skip it on the answer sheet.

Other unusual responses might be due to poorly managing test time, fatigue, unfamiliarity
with the topic or the test format (Swearingen 1998) and scoring errors (Hulin et al. 1983).

*Email: hdodeen@uaeu.ac.ae

ISSN 0260-2938 print/ISSN 1469-297X online
© 2008 Taylor & Francis
DOI: 10.1080/02602930701562874
http://www.informaworld.com

In addition, some students who are prepared for a test do not do well, while others perform better than expected (Vattanapath and Jaiprayoon 1999).
Students who are able to do better than others of the same ability level are called test-wise. Test-wiseness is "a subject's capacity to utilize the characteristics and formats of the test and/or the test-taking situation to receive a high score" (Hyde 1981, 3). Test-wise students have strategies or skills that help them do well in tests independent of their knowledge of the test content or materials (Sarnacki 1979). Their strategies or skills, usually called test-taking strategies, are the cognitive abilities that allow them to approach any testing situation in an appropriate manner and to know what to do before, during and after the test.

Importance of test-taking strategies


Testing strategies help students translate their knowledge from classroom learning (McLellan and Craig 1989). Students who have or acquire test-taking strategies or skills improve their testing competence and, hence, their academic performance. This is particularly true for low-ability students, who perform better than expected (Dolly and Williams 1986). On the other hand, students who are expected to do well in tests but do not either lack testing strategies or use poor ones (Vattanapath and Jaiprayoon 1999). In fact, some argue that test-taking strategies are just as important as having the basic knowledge and information to answer the test questions (Langerquist 1982). Studies indicate that students with test-taking strategies: (1) have improved attitudes toward tests; (2) have lower levels of test anxiety; and (3) achieve better grades (Vattanapath and Jaiprayoon 1999). Even students who are familiar with the subject matter may do poorly in tests because they lack test-taking skills (Sweetnam 2003). For example, Dreisbach and Keogh (1982) studied the effect of test-taking strategies on students' performance on a school readiness test. Strategies taught in their study included familiarization with the test, practice of different written responses and practice with test directions. Results showed that test-taking strategies have an important influence on students' performance. Dolly and Williams (1986) investigated the effect of using test-taking strategies on multiple-choice test scores. Participants who received test-taking strategy training for several weeks outperformed their counterparts on tests.
In explaining why females generally do better than males in mathematics classes but show poorer performance on tests, Kimball (1989) suggested that the difference in test-taking strategies (e.g. problem-solving strategy) used by males and females could be the reason behind the difference in test performance. Gallagher (1992) studied sex differences in problem-solving strategies used by high-scoring examinees in the mathematical section of the Scholastic Aptitude Test (SAT). In one part of this study, Gallagher analyzed the relationship among students' performance in the SAT, the types of strategies they used, their attitude toward mathematics and their test-taking strategies. A strong relationship was found between performance in mathematics and test-taking strategies.
Since test anxiety is a fairly common problem among college students, having test-taking strategies can be extremely useful in reducing this anxiety. According to Hembree (1988), more than 20% of college students experience this problem, with tension or uneasiness occurring before, during or after an exam. While a reasonable level of test anxiety is useful to motivate students to do better in tests, a high level of test anxiety may interfere with how students perform (Strnad 2003). Highly anxious students generally have poor test-taking strategies. They do poorly on essay questions and take-home tests, and they have difficulties with multiple-choice verbal items (Culler and Holahan 1980; Rocklin and Thompson 1985).

Test-taking strategies used effectively help examinees cope with the problem of test anxiety. For example, Carraway (1987) investigated the effect of a test-taking strategies seminar on improving students' test scores and on reducing their level of test-related anxiety. The results of this study indicated that students who participated in the seminar had lower test-anxiety levels and higher test scores than their matched peers who did not participate in the seminar.
Tests are usually designed to assess students' knowledge of particular content or materials. When other factors affect students' performance, test scores are no longer valid measures of students' knowledge or ability levels. Test-taking strategies can improve the overall validity of test scores so that they accurately reflect what students really know. This can be done by ensuring that students lose points only because they do not know the information and not for unrelated reasons. Ebel (1965) stated that "more error in measurement is likely to originate from students who have too little, rather than too much, skill in test-taking" (3). Assessing the test-taking strategies of university students is useful for studying and understanding students' behavior in tests. This can be an initial step in understanding several related phenomena, such as why some students do poorly in exams. The purpose of this study is to develop a scale to assess the test-taking strategies of university students and to estimate its psychometric indices.

Scale development
Participants
This study was conducted on students of the United Arab Emirates University (UAEU). UAEU is a medium-sized four-year public university with an enrolment of approximately 15,000 students. Four random samples (50, 828, 553 and 235 students) participated in this study by responding to different versions of the test-taking strategies scale. These samples reflected the actual percentage of each gender at the university. Sample 1 consisted of 50 students (31 females [62%] and 19 males [38%]). Sample 2 had 828 students (534 females [64.5%] and 294 males [35.5%]). Sample 3 consisted of 553 students (342 females [61.8%] and 211 males [38.2%]). Finally, Sample 4 had 235 students (160 females [68%] and 75 males [32%]). All colleges at UAE University were also represented in these samples. Table 1 shows the number and percentage of students from each college in Samples 2, 3 and 4.

Development steps
The development of the scale to assess students' test-taking strategies and the estimation of its psychometric indices proceeded as follows:

1. Determining basic test-taking strategies


Through an extensive review of the literature on test-taking strategies and skills used by students in tests at different school levels, and by reviewing related literature in educational measurement, testing and educational psychology, 74 strategies (or skills) were identified. These strategies were classified into four main categories:

• Before-test: strategies employed before answering the test questions;
• Time management: strategies for managing time during the test;
• During-test: strategies used in answering the test questions;
• After-test: strategies employed after taking the test.

Table 1. Number and percentage of students per college in Samples 2, 3 and 4.

                                  Sample 2        Sample 3        Sample 4
College                          No.      %      No.      %      No.      %
Humanities & Social Sciences     202     24.4    138     25.0     48     20.4
Science                          140     16.9     87     15.7     25     10.6
Education                        100     12.0     41      7.4     27     11.5
Business & Economics             110     13.3     99     17.9     35     14.9
Law                               55      6.6     36      6.5     21      9.0
Food Systems                      24      2.9     31      5.6     18      7.6
Engineering                      127     15.4     91     16.5     32     13.6
Information Technology            70      8.5     30      5.4     29     12.4
Total                            828    100      553    100      235    100

2. Developing and reviewing the scale items


The 74 strategies or skills were used to develop the items of the scale. Each item represented only one strategy or skill. Items were developed for each of the four categories. Then, a panel of 10 faculty members from UAE University with a background in educational measurement and evaluation, educational psychology or education reviewed the items. In this process, reviewers were asked to check each item's content, clarity of wording, appropriateness to the assessed skill, relevance to its scale sub-category, and any other related issue that might improve the items or the scale as a whole. All reviewers' comments and suggestions were collected, analyzed and considered. This resulted in changing, deleting or adding a few items to the existing scale. As a result, an improved draft of the scale was developed with a total of 62 items distributed into the four previously mentioned categories as follows: 13 items in Before-test, 13 items in Time management, 24 items in During-test, and 12 items in After-test. To assess how often students apply these test-taking strategies or skills in their regular tests, a five-point Likert scale was used, ranging from never (1) to always (5). Some items were stated in the positive direction, such that the higher the score the better, in terms of using or having the appropriate test-taking strategies or skills. Conversely, other items on the scale were stated in the negative direction. These latter items had to be recoded before conducting any further statistical analysis (a brief recoding sketch follows below).
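As a minimal illustration (not from the paper, which does not describe its analysis software), reverse-coding five-point Likert items in Python might look like this; the DataFrame and the choice of which columns are negatively worded are hypothetical:

```python
import pandas as pd

# Hypothetical responses to four items on a 1-5 Likert scale
# (1 = never, 5 = always); q2 and q4 stand in for negatively
# worded items that must be recoded before analysis.
responses = pd.DataFrame({
    "q1": [5, 4, 3],
    "q2": [1, 2, 5],
    "q3": [4, 4, 2],
    "q4": [2, 1, 3],
})

NEGATIVE_ITEMS = ["q2", "q4"]  # assumed for illustration only

# On a 1-5 scale, reverse-coding maps x -> 6 - x (5 becomes 1, and
# so on), so a higher score always means a more appropriate strategy.
responses[NEGATIVE_ITEMS] = 6 - responses[NEGATIVE_ITEMS]
print(responses)
```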

3. Piloting the scale


In order to verify the functionality and applicability of the scale and to estimate response time, a random sample of 50 students (Sample 1) responded to this draft of the scale. As respondents were anonymous, no further personal details were requested. The review of the scale included checking the clarity and appropriateness of the items for the students' level of understanding. Respondents were also asked to write any comments or thoughts that they might have about the instrument. After analysis, the scale was modified and a revised version was prepared. This version consisted of 59 items, distributed into the four categories as follows: 13 items in Before-test, 11 items in Time management, 22 items in During-test, and 13 items in After-test.

4. Assessing validity
Validity is an indication of how well an instrument actually measures what it claims to measure and helps to ensure that there are no logical errors in drawing conclusions from the data (Garson 1998). To validate an instrument, several pieces of validity evidence are usually assessed. The most widely used are content, construct and criterion-related validity (Crocker and Algina 1986). These three types were assessed for the present scale.

• Content validity: this is the degree to which the content of the scale is relevant, representative and of technical quality with respect to what the scale is intended to measure. Content validity of an instrument is established if content experts agree that the instrument items cover the issues to be assessed. To assess the content validity of the current scale, a panel of 15 faculty members with a background in education, measurement and evaluation, or educational psychology was asked to review the scale. Revisions were made that addressed rewording, appropriateness, clarity and some technical issues. Ambiguous items were either removed or rewritten. Following this step, a revised version of the scale was prepared. This version consisted of only 48 items, distributed into the four categories as follows: eight items in Before-test, nine items in Time management, 20 items in During-test, and 11 items in After-test.
• Construct validity: this refers to the degree to which a scale measures an intended hypothetical construct (Gay 1996). This evidence of validity can be established by relating the scale or instrument of interest to other measures consistent with the hypothesis or construct being assessed. Statistically, construct validity can be assessed through a factor analysis procedure. The aim of this analysis is to identify the main components or categories that underlie the scale. Sample 2 (828 students: 534 females and 294 males) was used in this analysis. For a clear interpretation of the extracted components, a factor analysis with Varimax rotation was applied to the data (a computational sketch of this step follows below). The Kaiser–Meyer–Olkin (KMO) value was 0.87, which indicated that the sample was suitable for factor analysis (a minimum value of KMO = 0.60 is acceptable; Stevens 1996). Four factors were extracted from this analysis. Together, these factors explained more than 50% of the total variance in the scale items. Items with low loading values were identified and deleted. In the social sciences, the common minimum cut-off loading value is 0.30 or 0.35 (Stevens 1996; Tabachnick and Fidell 2001). Using this criterion, only variables with loading values of 0.30 or higher were retained. On that basis, all eight variables in the first factor, Before-test strategies, were retained. In the second factor, Time-management strategies, one variable was deleted because of a low loading value, and another variable was deleted because it loaded on two factors. In the third factor, During-test strategies, eight variables were deleted because of their low loading values, and one variable was deleted because it loaded on more than one factor. Finally, in the fourth factor, After-test strategies, six variables were deleted. Table 2 summarizes the loading values of all variables, with deleted variables emboldened. At the end of this analysis, the total number of items in the four sub-scales was reduced to 31.
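As a rough sketch of this analytic step (not the author's actual code; the paper does not name the software it used), the KMO check and a Varimax-rotated factor analysis could be run in Python with the factor_analyzer package. The data below are simulated stand-ins so the example runs end to end:

```python
import numpy as np
import pandas as pd
from factor_analyzer import FactorAnalyzer
from factor_analyzer.factor_analyzer import calculate_kmo

rng = np.random.default_rng(0)

# Stand-in for the real responses: 828 students x 48 Likert items
# (replace with the actual data; random answers are used here only
# so the sketch is self-contained).
items = pd.DataFrame(
    rng.integers(1, 6, size=(828, 48)).astype(float),
    columns=[f"item{i + 1}" for i in range(48)],
)

# Sampling adequacy: the paper reports an overall KMO of 0.87
# (0.60 is usually taken as the acceptable minimum).
_, kmo_overall = calculate_kmo(items)
print(f"KMO = {kmo_overall:.2f}")

# Four factors with Varimax rotation, as in the paper.
fa = FactorAnalyzer(n_factors=4, rotation="varimax")
fa.fit(items)

# Flag items whose largest absolute loading falls below the common
# 0.30 cut-off; cross-loading items would also be reviewed by hand.
loadings = pd.DataFrame(fa.loadings_, index=items.columns)
weak = loadings.abs().max(axis=1) < 0.30
print("Candidates for deletion:", list(loadings.index[weak]))
```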
• Criterion-related validity: this refers to the degree to which scores on the scale are related to scores on another valid criterion available at the same time (Gay 1996). Two types of criterion-related validity, convergent and divergent, were assessed using two instruments administered at the same time to a random sample of 553 students (Sample 3).

Table 2. Loading of items on the four extracted components.

Item (loadings on components 1–4)
I do not attend the last few classes before the test  0.082  0.131  0.269  0.343
I spend most of the night before the test studying  0.415  0.084  0.064  0.091
I drink lots of coffee or soda drinks before the test  0.104  0.044  0.090  0.363
I bring to the test all necessary materials  0.245  0.185  0.295  0.413
I do not take my breakfast when I have a test  0.056  0.027  0.116  0.361
I continue studying and reviewing until the last minute  0.063  0.029  0.038  0.448
Before the test I talk with other students about the test  0.107  0.159  0.106  0.345
I read the test instructions extremely carefully  0.300  0.276  0.273  0.321
I think carefully about the total test time and plan how to use it  0.266  0.349  0.236  0.057
I use the full time allowed for the test  0.123  0.415  0.311  0.167
I leave a few minutes before the end for checking answers  0.223  0.344  0.433  0.107
When other students leave the test room, I feel I should leave it too  0.064  0.506  0.151  0.171
I estimate how much time I have to answer each question  0.020  0.460  0.005  0.062
I use the extra time I have to review my answers  0.192  0.469  0.415  0.093
If I run out of time, I outline the remaining information  0.223  0.312  0.156  0.131
I am committed to the time assigned for each question  0.118  0.290  0.023  0.137
I mark the question that I do not know  0.080  0.302  0.264  0.022
I read each question carefully before trying to answer it  0.376  0.140  0.349  0.024
I underline important words and phrases in a question  0.320  0.291  0.092  0.021
If the question is complex, I restate it in my own words  0.231  0.114  0.011  0.090
I answer first the question that I think is easiest  0.295  0.083  0.003  0.138
If I do not know a question, I quickly leave it and move on  0.267  0.031  0.149  0.051
If I do not understand a question, I give all the answers I know  0.377  0.039  0.065  0.208
If I do not know the answer, I leave it blank  0.378  0.312  0.271  0.220
I do not read the whole question if it looks familiar to me  0.432  0.138  0.007  0.319
I leave several questions unanswered  0.284  0.304  0.161  0.233
I leave a space at the end of each question for corrections  0.285  0.196  0.026  0.022
I organize my answer in my mind before writing it down  0.442  0.364  0.092  0.044
I do not review my test even when there is some extra time  0.292  0.016  0.270  0.112
I think of the result more than the test itself  0.318  0.004  0.198  0.218
If I do not know the answer, I make some intelligent guesses  0.333  0.075  0.322  0.176
I try to show all my work in any question  0.320  0.077  0.340  0.203
I keep thinking of the difficult questions while answering other ones  0.283  0.009  0.247  0.209
I check my work even when it looks right  0.317  0.303  0.161  0.225
If something is unclear, I ask for clarification  0.300  0.229  0.156  0.058
I survey the whole test before trying to answer any question  0.268  0.079  0.030  0.036
I consider the score of each question before trying to answer it  0.368  0.282  0.188  0.156
I review my paper carefully and read what is written by my teacher  0.089  0.256  0.272  0.156
I sum my sub-grades and compare them with my total score  0.074  0.232  0.394  0.018
I identify the origin of each question  0.113  0.390  0.393  0.033
I determine the reasons that effectively reduced my scores  0.063  0.264  0.320  0.097
I determine the most difficult part of the test  0.109  0.231  0.278  0.059
I carefully review my mistakes  0.049  0.259  0.231  0.058
I objectively evaluate my efforts on the test  0.091  0.257  0.238  0.033
Based on the current test results, I improve my preparation methods  0.280  0.093  0.342  0.007
I determine what will improve my performance in the next test  0.280  0.041  0.293  0.054
I listen carefully to any in-class review when the test is handed back  0.056  0.248  0.499  0.173
I understand the correct answer for each missed question  0.035  0.247  0.291  0.177

Note: Items in bold were deleted because their loading values were below 0.30 or because they loaded on more than one factor.

The first instrument was Attitudes toward Tests. This instrument was designed to assess students' attitudes toward tests. Examples from the instrument are: 'Tests motivate me to study hard', 'For me, taking tests is a painful experience' and 'I try my best to avoid any course that requires a lot of tests'. A high score on this instrument suggests a positive attitude toward tests. The correlation between students' scores on the scale and their attitudes toward tests was used to estimate the convergent validity of the scale. Generally, attitudes toward the subject matter have a positive relationship with achievement (Schofield 1982; Wilson 1983). It is assumed that students who have good test-taking strategies develop more positive attitudes toward tests.
The second instrument was the Test Anxiety Inventory (TAI). This inventory has been widely used in measuring the level of adult test anxiety. It was originally developed by Spielberger (1980), then used and validated to fit several cultures. Examples from the TAI questions are: 'While taking an examination, I have an uneasy, upset feeling' (Item 2), 'Thoughts of doing poorly interfere with my concentration on tests' (Item 7), and 'During examinations, I get so nervous that I forget facts I really know' (Item 20). The Arabic version of the TAI, validated by Tayb (1984), was used in this study. The first item ('I feel confident and comfortable during tests'), which is the only positively stated item on the scale, was recoded before being added to the others. A high score on this inventory indicates a high level of test anxiety. In this study, it is assumed that students who have more appropriate test-taking strategies are less anxious about tests.
The two instruments, Attitudes toward Tests and Test Anxiety, were administered at the same time as the developed scale. Cronbach's alpha values were 0.89 and 0.93 for Attitudes toward Tests and Test Anxiety respectively. Based on these results, the two instruments were judged to have adequate internal reliability. Correlations between each category and the instruments were calculated; the results are summarized in Table 3.
As shown in Table 3, there was a significant positive correlation between each category and students' attitudes toward tests. For example, the correlation between students' attitudes toward tests and the Before-test sub-scale was 0.42, p < 0.01. Similar results were observed for the other sub-scales. This can be seen as evidence of criterion-related validity for each of the four categories. Likewise, the significant negative correlations between each sub-scale and test-anxiety level can also be seen as evidence of validity for these sub-scales. For example, the correlation between Before-test and the Test Anxiety scale was −0.30, p < 0.01.

Table 3. Correlations between each sub-scale and attitudes toward tests and test anxiety.

                          Before-test   Time management   During-test   After-test
Attitudes toward tests       0.42**          0.43**          0.52**        0.44**
Test anxiety                −0.30**         −0.19**         −0.26**       −0.16**

Note: **Denotes significant at 0.01.
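For readers who want to reproduce this kind of convergent/divergent check, a sketch of the correlation step is shown below. The data are simulated and the variable names are hypothetical; the paper does not specify the software it used:

```python
import numpy as np
from scipy.stats import pearsonr

rng = np.random.default_rng(0)

# Hypothetical per-student totals: a sub-scale score (e.g. Before-test),
# an attitudes-toward-tests score and a test-anxiety (TAI) score,
# simulated for 553 respondents as in Sample 3.
before_test = rng.normal(28, 5, size=553)
attitudes = 0.5 * before_test + rng.normal(0, 5, size=553)
anxiety = -0.3 * before_test + rng.normal(0, 6, size=553)

# Convergent evidence: the sub-scale should correlate positively
# with attitudes; divergent evidence: negatively with anxiety.
r_att, p_att = pearsonr(before_test, attitudes)
r_anx, p_anx = pearsonr(before_test, anxiety)
print(f"attitudes: r = {r_att:.2f} (p = {p_att:.3g})")
print(f"anxiety:   r = {r_anx:.2f} (p = {p_anx:.3g})")
```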

5. Assessing reliability
Reliability of an instrument refers to "the degree to which the results could be replicated if the same individuals were tested again under similar circumstances" (Crocker and Algina 1986, 105). Two types of reliability were assessed: stability over time and internal reliability. A computational sketch of both estimates follows this list.

• Stability over time: this refers to the degree to which the scale gives similar results over time. A random sample of 235 students (Sample 4) was used in this analysis. Students responded to the scale twice within a three-week interval. Stability was estimated by correlating students' responses across the two administrations. Results showed a high correlation between the two sets of responses. Correlations for the four categories were 0.89, 0.82, 0.92 and 0.86 for Before-test, Time management, During-test and After-test respectively.
• Internal reliability: the internal consistency and homogeneity of the four categories of the scale were assessed using Cronbach's alpha. The minimum advisable level is 0.70 (Nunnally and Bernstein 1994). Cronbach's alpha values for the four categories were as follows: Before-test 0.71, Time management 0.75, During-test 0.76, and After-test 0.81. Based on these results, the four categories were judged to have adequate internal reliability.
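A minimal sketch of both reliability estimates on hypothetical data (the alpha formula is the standard one; nothing here is the author's actual code):

```python
import numpy as np

def cronbach_alpha(items: np.ndarray) -> float:
    """Cronbach's alpha for a (respondents x items) score matrix:
    alpha = k/(k-1) * (1 - sum of item variances / variance of total)."""
    k = items.shape[1]
    item_vars = items.var(axis=0, ddof=1).sum()
    total_var = items.sum(axis=1).var(ddof=1)
    return k / (k - 1) * (1 - item_vars / total_var)

rng = np.random.default_rng(1)

# Hypothetical 1-5 responses to the eight Before-test items
# from a sample of 828 respondents.
before_items = rng.integers(1, 6, size=(828, 8)).astype(float)
print(f"alpha = {cronbach_alpha(before_items):.2f}")

# Test-retest stability: correlate sub-scale totals from two
# administrations three weeks apart (235 respondents, simulated).
time1 = rng.normal(28, 5, size=235)
time2 = time1 + rng.normal(0, 2, size=235)
print(f"test-retest r = {np.corrcoef(time1, time2)[0, 1]:.2f}")
```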

6. Item discrimination
This was used as evidence of item quality. Item discrimination was assessed by calculating the correlation between each item and its main component category (a sketch of this calculation follows below). Correlations between the eight items of the first category and their scale were 0.52, 0.54, 0.65, 0.31, 0.61, 0.62, 0.46 and 0.34. Correlations between the seven items that make up the second category and their scale were 0.69, 0.61, 0.41, 0.62, 0.68, 0.56 and 0.69. Correlations between the 11 items that make up the third category and their scale were as follows: 0.51, 0.46, 0.47, 0.42, 0.47, 0.56, 0.42, 0.58, 0.58, 0.45 and 0.53. Finally, correlations between the five items that make up the fourth category and their scale were 0.63, 0.56, 0.58, 0.55 and 0.68. These values were higher than the correlations between items and the scales to which they were unrelated. This indicates that the items have acceptable discrimination values.
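A sketch of this item discrimination check on hypothetical data; whether the original analysis used the raw or the corrected item-total correlation is not stated, so both are noted in the comments:

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical 1-5 responses to the seven Time-management items.
items = rng.integers(1, 6, size=(828, 7)).astype(float)
total = items.sum(axis=1)

for j in range(items.shape[1]):
    # Raw item-total correlation; using (total - items[:, j]) instead
    # would give the "corrected" variant that excludes the item itself.
    r = np.corrcoef(items[:, j], total)[0, 1]
    print(f"item {j + 1}: r = {r:.2f}")
```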

Conclusion
Test-taking strategies are important cognitive skills that strongly affect students' performance in tests. With the increasing use of tests in different academic and non-academic contexts, using appropriate test-taking strategies becomes a critical factor in helping students' test performance better match their preparation and ability level. This results in improved test accuracy and validity. In addition, having test-taking strategies improves students' attitudes toward tests and reduces test anxiety.
A scale for assessing the strategies and skills used by university students in test-taking was developed in this study (see Appendix). The developmental process involved several steps and procedures to ensure a high-quality scale from a psychometric point of view. Developing the scale depended on an extensive review of related literature on important, common test strategies highly recommended by educators and test specialists. The developed items went through several validation procedures that included content, construct and criterion-related validity. Similarly, scale reliability was assessed through several procedures and using several samples of responses. This included internal reliability as well as the stability of the scale over time. Two panels of faculty members with a background in measurement, education or educational psychology reviewed the scale items and validated their content. Four samples of students (50, 828, 553 and 235) participated by responding to different versions of the scale. To the researcher's knowledge, this is the first comprehensive scale developed to assess test-taking strategies used by university students. Additional applications, however, are needed to replicate and validate the scale using different samples from different educational levels.

Acknowledgements
The author would like to thank the Scientific Research Affairs Sector at UAE University for funding this research.

Notes on contributor
Hamzeh M. Dodeen is associate professor in measurement and evaluation at UAE University. His
research interests include item analysis in both Classical Test Theory (CTT) and Item Response
Theory (IRT), person- and item-fit analysis, differential item functioning (DIF), and test-related
characteristics.

References
Carraway, C. 1987. Determining the relationship of nursing test scores and test-anxiety levels before and after a test-taking strategy seminar. (ERIC Document Reproduction Service No. ED 318 498.)
Colosi, L. 1997. The layman's guide to social research methods. Available online at http://www.socialresearchmethods.net/tutorial/Colosi/lcolosi1.htm
Crocker, L., and J. Algina. 1986. Introduction to classical and modern test theory. Orlando, FL: Harcourt Brace Jovanovich.
Culler, R.E., and C.J. Holahan. 1980. Test anxiety and academic performance: the effect of study-related behavior. Journal of Educational Psychology 72: 16–20.
Dolly, J.P., and K.S. Williams. 1986. Using test-taking strategies to maximize multiple-choice test scores. Educational and Psychological Measurement 46: 619–625.
Dreisbach, M., and B. Keogh. 1982. Testwiseness as a factor in readiness test performance of young Mexican-American children. Journal of Educational Psychology 72, no. 2: 224–229.
Ebel, R. 1965. Measuring educational achievement. Englewood Cliffs, NJ: Prentice-Hall.
Gallagher, A.M. 1992. Sex differences in problem-solving strategies used by high-scoring examinees on SAT-Math. (ERIC Document Reproduction Service No. ED 352 420.)
Garson, D. 1998. Quantitative research in public administration. Available online at http://www2.chass.ncsu.edu/garson/pa765/validity.htm
Gay, L.R. 1996. Educational research. Englewood Cliffs, NJ: Prentice-Hall.
Hambleton, R.K., H. Swaminathan, and H.J. Rogers. 1991. Fundamentals of item response theory. Newbury Park, CA: Sage Publications.
Hembree, R. 1988. Correlates, causes, effects, and treatment of test anxiety. Review of Educational Research 58, no. 1: 47–77.
Hulin, C.L., F. Drasgow, and C.K. Parsons. 1983. Item response theory. Homewood, IL: Dow Jones-Irwin.
Hyde, R.E. 1981. Successful test-taking strategies for nursing students. Paper presented at the Annual Meeting of the College Reading Association, Louisville, KY.
Kimball, M. 1989. A new perspective on women's math achievement. Psychological Bulletin 105: 198–214.
Langerquist, S. 1982. Nursing examination review: test-taking strategies. Menlo Park, CA: Addison-Wesley.
McLellan, J., and C. Craig. 1989. Facing the reality of achievement tests. Education Canada, 36–40.
Meijer, R.R. 1996. Person-fit research: an introduction. Applied Measurement in Education 9, no. 1: 3–8.
Nunnally, J., and I. Bernstein. 1994. Psychometric theory. New York: McGraw-Hill.
Rocklin, T., and J.M. Thompson. 1985. Interactive effects of test anxiety, test difficulty, and feedback. Journal of Educational Psychology 77: 368–372.
Sarnacki, R.E. 1979. An examination of test-wiseness in the cognitive test domain. Review of Educational Research 49: 252–279.
Schofield, H. 1982. Sex, grade level, and the relationship between mathematics attitudes and achievement in children. Journal of Educational Research 75, no. 5: 280–284.
Spielberger, C.D. 1980. Conceptual and methodological issues in anxiety research. In Anxiety: current trends in theory and research, ed. C.D. Spielberger, Vol. 2. New York: Academic Press.
Stevens, J. 1996. Applied multivariate statistics for the social sciences. Mahwah, NJ: Lawrence Erlbaum.
Strnad, K. 2003. Coping with college series: handling test anxiety. Available online at http://www.counseling.ilstu.edu/files/downloads/articles/coping-test_anxiety.pdf
Swearingen, D.L. 1998. Person-fit and its relationship with other measures of response set. Paper presented at the Annual Meeting of the American Educational Research Association, San Diego, CA.
Sweetnam, K.R. 2003. Test-taking strategies and student achievement. Available online at http://www.cloquet.k12.mn.us/chu/class/fourth/ks/stratigies.htm
Tabachnick, B.G., and L.S. Fidell. 2001. Using multivariate statistics. Boston: Allyn & Bacon.
Tayb, M.A. 1984. The test-anxiety scale. Cairo: Dar Al-Maref.
Taylor, K., and S. Walton. 1997. Co-opting standardized tests in the service of learning. Phi Delta Kappan, 66–70.
Vattanapath, R., and K. Jaiprayoon. 1999. An assessment of the effectiveness of teaching test-taking strategies for multiple-choice English reading comprehension tests. Occasional Papers 8: 57–71.
Wilson, U. 1983. A meta-analysis of the relationship between science achievement and science attitude: kindergarten through college. Journal of Research in Science Teaching 20, no. 4: 839–850.
