To cite this article: Elise Boyas, Lois D. Bryan & Tanya Lee (2012) Conditions affecting the
usefulness of pre- and post-tests for assessment purposes, Assessment & Evaluation in Higher
Education, 37:4, 427-437, DOI: 10.1080/02602938.2010.538665
To link to this article: http://dx.doi.org/10.1080/02602938.2010.538665
Introduction
Colleges and universities are being called upon to be accountable for student learning.
The interest in measuring student learning outcomes began in the USA in public K-12
education with a call for more rigorous standards (Department of Education 1983) and
was operationalised in The No Child Left Behind Act of 2001. This interest has now
moved to higher education. Stakeholders want institutions of higher education to
demonstrate that students have achieved competence in general intellectual skills and
discipline-specific skills.
Stakeholders in this process include students and their parents, employers,
institutions of higher learning, and accrediting bodies. For a school of business, both
the Association of Collegiate Business Schools and Programs (ACBSP) and the
Association to Advance Collegiate Schools of Business (AACSB) believe that learning objectives for students should be linked to institutional learning objectives. These
accrediting bodies want student mastery of these objectives to be measured and
reported and this information to be used for improvement of curriculum and course
*Corresponding author. Email: leeta@rmu.edu
ISSN 0260-2938 print/ISSN 1469-297X online
© 2012 Taylor & Francis
content (ACBSP 2006; AACSB 2008). Neither accrediting body provides specific
requirements for how to accomplish this. The choice of learning objectives and
assessment instruments is left to the discretion of each institution's faculty and
administrators.
One assessment tool often used to measure student learning is a pre/post-test.
Students are given a test at the beginning of a course to determine their level of
mastery of course learning objectives. At the end of the course, the same assessment
is administered. The expectation is that student performance on the assessment will
improve, presumably as a result of completing course-related requirements.
Despite the popularity of this assessment tool, criticisms of the use of pre/post-tests exist (Suskie 2004, 110). Therefore, it is useful to consider when a pre/post-test
may be an appropriate choice. Identification of factors that should be considered when
an instructor chooses an assessment instrument could be helpful. It may be that
students in core courses (required but not in their major subject area) will be less motivated to perform well on an ungraded pre/post-test than will students in courses in
their major. Similarly, some course learning objectives may be more or less suited to
pre/post-testing. It may be difficult to measure learning for some of the higher order
skills using this mechanism since answering questions at this level requires something
beyond the recall needed to answer more basic knowledge and comprehension
questions. These are empirical questions addressed in this paper.
Literature review
Student test performance on entry to and exit from university courses has been used
to measure student learning. Measuring the learning of students in this manner, a
value-added approach, has intuitive appeal but has also been criticised. One criticism
is that there is the possibility that students actually learned the skills they needed for
success on the post-test outside the classroom (Suskie 2004, 110).
Another criticism of the use of the pre/post-test to measure student learning is that
students will necessarily perform poorly on the pre-test since they know nothing about
the material at the beginning of the course; the lower the pre-test score the higher the
potential learning gains (Warren 1984; Pascarella and Wolniak 2004). Post-tests under
such conditions will always show improvement over the pre-test. Additionally,
instructors may teach to the post-test in order to improve results, potentially neglecting desirable skills not tested (Ewell 2002).
A final criticism of the use of pre/post-tests to measure learning is the presence of
response-shift bias, which occurs because students' frames of reference during the
pre-test are not the same as during the post-test (Mann 1997; Drennan and Hyde
2008). A response-shift bias can occur when students are self-evaluating their skills
prospectively without a real understanding of the skills they are evaluating. Thus, the
pre-test results may reflect an overestimation of their skills while the post-test results
reflect a more informed judgement of these skills, resulting in an understatement of
the value added by the course (Drennan and Hyde 2008). Statistical analysis of pre/
post-test results presumes that the frame of reference of the student has not been
altered by the course content, hardly a realistic assumption (Mann 1997).
Despite these criticisms, attention has refocused on the use of value-added measures
of student learning to evaluate the effectiveness of course content (Kimmell, Marquette,
and Olsen 1998; Banta and Pike 2007). The American Association of State Colleges and
Universities (AASCU) states, 'It is time for states and their colleges and universities, in conjunction with the regional accrediting agencies, to lead the development of a consensus model for assessing the value added from undergraduate student learning' (2006, 1).
The AASCU points out that general intellectual skills required for students to be successful across career disciplines should be assessed (2006).
Recently, educators have concluded that the use of a pre/post-test can yield 'exceptionally compelling information on what students have learned during a course or program' (Suskie 2004, 110). Suskie goes on to point out that even with its shortfalls, value-added assessment may still be a good choice when it is important to document significant gains in student learning at the course or discipline level. This
opinion is shared by the AACSB which clearly identifies the use of pre/post-tests as a
viable measure of student learning:
Evaluation of the effectiveness of instruction begins with an examination of learning
goals. It goes on to include such things as student reactions, peer observation, expert
observation, and periodic assessment of the impact of instruction on later performance.
To ensure quality, the school's faculty members measure overall student achievement by
use of such techniques as pre-testing and post-testing, assessment in subsequent coursework, surveys of employers, etc. (AACSB 2008, 54)
The AACSB expects business schools to use direct methods to assess student learning
(Calderon 2005) and has documented best practices in assessment which include the
use of pre/post-tests (AACSB 2008). Meyers and Nulty (2009) emphasise the important role that assessment has in achieving desired student learning outcomes and
suggest that assessment should have a central role in the design of curriculum.
In accounting, the Accounting Education Change Commission (1990a, 1990b),
Albrecht and Sack (2000) and the Core Competency Framework developed by the
American Institute of Certified Public Accountants (1999) call for accounting
graduates to be active lifelong learners with skill sets that include both competency in
discipline-specific accounting knowledge and inter-disciplinary skills such as critical
reasoning and problem solving. This combination of the desired characteristics for
accounting graduates and the need to actively assess student learning gains has motivated many accounting programmes to define specific student learning objectives for
courses and programmes and to develop assessment tools for these objectives.
Calderon, Green, and Harkness (2004) reported that over half of the accounting
department chairs responding to a survey believed that formalised assessment of
student learning has improved student success in their programmes.
Clearly, assessing student learning has become an important focus in higher
education, and the use of pre/post-tests has increased. This study attempts to identify
conditions that could affect the usefulness of pre/post-tests to assess changes in the
student mastery of particular learning objectives over the course of a semester.
Hypotheses
Pre/post assessments involve the administration of tests which are of importance to
the institution but of limited personal importance to the individual student. A significant body of research exists addressing the question of whether students are equally
motivated to succeed on low-stakes testing and high-stakes testing (Harlen and Crick
2003). A useful way of representing student test-taking behaviour has been defined by
expectancy-value models of achievement motivation. Expectancy is the student's
belief that he or she can successfully complete the assessment, and value embodies the
beliefs a student holds as to exactly why he or she should try to complete the task
successfully (Wise and DeMars 2005). When the outcome of the test has great value
to the student, and the assessment tool is well-designed, demonstrated levels of proficiency are a good proxy for actual levels of proficiency (Wise and DeMars 2005).
However, some research has shown that students do not perform to their actual levels
of proficiency on low-stakes tests (Wise and DeMars 2005; Cole, Bergin, and
Whittaker 2008). We expect the outcome of ungraded pre/post-tests to have limited
value to students, leading to the following hypothesis:
H1: The use of a graded assessment instrument (questions embedded in a course final
exam) will elicit a higher level of performance than will the use of an ungraded post-test.
One factor that could affect the degree of differences in performance between a
graded assessment and an ungraded post-test is the difficulty level of the learning
objectives being assessed. Bloom's Taxonomy (Bloom 1956) is widely used to categorise the cognitive difficulty level of course material. Applying Bloom's Taxonomy
to accounting, basic knowledge and comprehension learning objectives require
students to be able to define terms or basic relationships. Application and analysis
learning objectives require students to be able to combine or manipulate financial
information during problem solving. Student learning objectives for courses in
colleges and universities generally reflect a combination of basic and higher level
skills. Basic knowledge questions can be answered by recalling information. Questions related to higher level skills require students to both recall the related knowledge
and use the knowledge to perform application or analysis tasks, leading to the
following hypothesis:
H2: Application and analysis questions will result in larger differences in performance
between graded (course final exam) and ungraded (post-test) assessment instruments
than will knowledge and comprehension questions.
Research design
This research used pre/post-tests in several undergraduate classes in accounting in
order to evaluate the usefulness of this technique to assess student learning outcomes.
This study used a total of 120 subjects, all of whom were undergraduates attending a
school of business located in the mid-Atlantic region of the USA. Of these subjects,
90 were enrolled in three sections of an introductory accounting course required in the
undergraduate business programme; 60 were taught by one instructor and 30 by
another (see Table 1). All sections were treated as much alike as possible to minimise
differences due to students being in different sections. Both instructors were full-time
faculty members experienced in teaching this course.
Table 1. Sample demographics (number of subjects by gender).

Students        Male    Female    Total
Lower level       60        30       90
Higher level      18        12       30
Total             78        42      120
Table 2. Comparison of student performance on the final exam vs. the post-test across student groups.

Students       Number of   Average % correct   Average % correct   Statistical   p value
               subjects    post-test           final exam          test(a)
Lower level    n = 90      68                  80                  t-test        <.0001
                                                                   WSR           <.0001
Higher level   n = 30      91                  95                  t-test        .0264
                                                                   WSR           .0193

Note: (a) t-test = paired t-test; WSR = Wilcoxon signed-rank test.
For the lower level students, the average score on the final exam was significantly
higher than on the post-test, with an average of 68% correct on the post-test and 80%
correct on the final exam (p < 0.0001); for the higher level students, the difference was also significant, with an average of
91% correct on the post-test and 95% correct on the final exam (p < 0.03). Thus H1
was supported, indicating that an ungraded post-test may not reflect all the actual
course learning gains for all students.
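The paired comparisons reported above rely on a paired t-test alongside its non-parametric counterpart, the Wilcoxon signed-rank test, applied to each student's two scores. A minimal sketch of that analysis is shown below; the score arrays are hypothetical stand-ins, not the study's data.

```python
from scipy import stats

# Hypothetical per-student percentage scores (NOT the study's data):
# each index is one student measured twice, post-test then final exam.
post_test = [62, 70, 55, 68, 74, 60, 71, 66, 58, 73]
final_exam = [75, 82, 70, 79, 85, 72, 80, 78, 69, 84]

# Paired t-test: parametric test that the mean within-student
# difference (final exam minus post-test) is non-zero.
t_stat, t_p = stats.ttest_rel(final_exam, post_test)

# Wilcoxon signed-rank test: non-parametric counterpart on the
# same within-student pairs, using ranks of the differences.
w_stat, w_p = stats.wilcoxon(final_exam, post_test)

print(f"paired t-test: t = {t_stat:.2f}, p = {t_p:.4f}")
print(f"Wilcoxon SR:   W = {w_stat:.1f}, p = {w_p:.4f}")
```

Because the two tests are computed on the same pairs, reporting both (as the tables here do) guards against the t-test's normality assumption being violated in small samples.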
The second hypothesis states that more difficult questions will result in larger
differences in performance gains between graded (course final exam) and ungraded
(post-test) assessment instruments. More difficult questions are those that test the
students' ability to conduct analysis and apply their knowledge, categorised as A/A
(application and analysis) questions in this research. To analyse this hypothesis,
performance on the K/C (knowledge and comprehension) questions was analysed
separately from performance on the A/A questions for both subject groups. The results
are summarised in Table 3.
The hypothesis is supported for the lower level students. This group had a significant increase for the A/A questions but not for the K/C questions from the post-test
to the final examination. It was not supported for the higher level students. The higher
level students achieved very high scores on the post-test (the average is 91%); thus,
there is little left to gain between the post-test and the final exam since these students
are already achieving near the ceiling on the post-test. Support for this hypothesis for
the lower level students provides evidence that post-tests may be a more accurate indicator of student learning in an introductory class for simpler K/C questions than for
more difficult A/A questions.
Table 3. Test of Hypothesis 2: comparison of student performance on the final exam vs. the post-test for K/C and A/A type questions.

Students       Number of   Average % gain,   Average % gain,   Statistical   p value
               subjects    K/C questions     A/A questions     test(a)
Lower level    n = 90      5.7               20                t-test        <.0001
                                                               WSR           <.0001
Higher level   n = 30      3.3               4.6               t-test        .7157
                                                               WSR           1.0000

Note: (a) t-test = paired t-test; WSR = Wilcoxon signed-rank test. Gains are from the post-test to the final exam.

The third hypothesis states that students taking a required course in their major (higher level students) are expected to demonstrate a smaller difference between the post-test and the final exam than students taking a required course probably not in their major. The expectation is that higher level students enrolled in a course in their major area will be more motivated, generally, throughout the course when compared with the lower level students. They will, as a result, experience lower achievement gains between the post-test and the final exam than the lower level students. The results of the comparison of the two groups using both parametric and non-parametric tests (one-sided) supported the hypothesis and are presented in Table 4.

Table 4. Comparison of student performance gains (post-test to final exam) across student groups.

Students       Number of   Average % correct   Average % correct   Average % gain
               subjects    post-test           final exam
Lower level    n = 90      68                  80                  12
Higher level   n = 30      91                  95                  4
Both the Kruskal-Wallis and Wilcoxon two-sample tests are significant. Thus, while pre-test and post-test score differences can be used to indicate student learning, these differences may be less indicative of actual student learning in lower level (non-major) classes with lower level students than in upper level classes with higher level students.
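Unlike the within-student comparisons above, this comparison involves two independent groups, so two-sample rank tests apply: the Wilcoxon rank-sum (Mann-Whitney) test and the Kruskal-Wallis test, which for two groups tests the same hypothesis two-sided. A minimal sketch follows; the gain scores are hypothetical illustrations, not the study's data.

```python
from scipy import stats

# Hypothetical post-test-to-final-exam gains in percentage points
# for two independent groups of students (NOT the study's data).
lower_level_gains = [14, 10, 16, 12, 9, 15, 11, 13, 17, 8]
higher_level_gains = [3, 5, 2, 6, 4, 3, 5, 4]

# Wilcoxon rank-sum / Mann-Whitney U: one-sided test of whether
# lower-level gains tend to exceed higher-level gains.
u_stat, u_p = stats.mannwhitneyu(
    lower_level_gains, higher_level_gains, alternative="greater"
)

# Kruskal-Wallis: rank-based comparison across k >= 2 groups;
# with two groups it is the two-sided analogue of the test above.
h_stat, h_p = stats.kruskal(lower_level_gains, higher_level_gains)

print(f"Mann-Whitney U: U = {u_stat:.1f}, one-sided p = {u_p:.4f}")
print(f"Kruskal-Wallis: H = {h_stat:.2f}, p = {h_p:.4f}")
```

Rank-based tests are a natural choice here because gain scores are bounded and often skewed, so between-group mean comparisons can be distorted by ceiling effects of the kind noted for the higher level students.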
It is also interesting to note (see Table 5) that there are significant differences across
pre-test and post-test results for both groups despite the limitations of the use of pre/
post-tests. Lower level students in the introductory course scored significantly higher
on the post-test than on the pre-test (an average of 49% correct on the pre-test and 68%
correct on the post-test, p < 0.0001). Higher level students in the required taxation
course also scored significantly higher on the post-test than on the pre-test (an average
of 83% correct on the pre-test and 91% correct on the post-test, p < 0.006). Given the
design of this study, we cannot make statistically valid inferences as to the cause of the
increase in scores from pre-test to post-test. While the delivery of curriculum could be
causal, the increases may also be at least partly a result of an increase in students' desires to do what their instructors asked of them over the course of the semester.
Table 5. Comparison of student performance on the post-test vs. the pre-test across student groups.

Students       Number of   Average % correct   Average % correct   Statistical
               subjects    pre-test            post-test           test(a)
Lower level    n = 90      49                  68                  t-test
                                                                   WSR
Higher level   n = 30      83                  91                  t-test
                                                                   WSR

Notes: (a) t-test = paired t-test, WSR = Wilcoxon signed-rank test; (b) t-test = Wilcoxon and KW = Kruskal-Wallis two-sample tests.
not complete the course and whose pre-test results were, therefore, not included in the
study. Third, lower level students participated in a review after the post-test but before
the final exam. This may have increased the differences between the ungraded post-test and the graded final exam.
Additionally, the results could have been affected by the nature of the courses and
course content. While lower level students are not very likely to have encountered
accounting terminology outside the classroom, the higher level students are likely to
have had some exposure to the subject matter covered in an individual tax course.
They are likely to have been filing their own tax returns since most have some work
experience and may have been exposed to some tax topics in other courses. The higher
level students may also, on average, have taken the entire exercise, right from the
beginning, more seriously than the lower level students since non-performing students
will have already abandoned the degree programme.
In conclusion, this study provides evidence that while pre/post-tests can be useful
in assessing student mastery of learning objectives for both lower and higher level
students, some conditions may affect this usefulness. Pre/post-test differences may
understate student mastery of learning objectives especially for students in lower level
(non-major) classes and for more complex/higher order learning objectives.
Notes on contributors
Elise Boyas is a clinical assistant professor of accounting at the Katz Graduate School of Business, University of Pittsburgh in Pittsburgh, Pennsylvania. She earned her PhD from Rutgers
and has taught graduate and undergraduate accounting courses at a variety of institutions
including Robert Morris University, the University of Pittsburgh, and Rutgers. Prior to entering
academia, she was an accounting practitioner. Her research interests centre on pedagogical
issues in the delivery of undergraduate and graduate accounting curriculum.
Lois D. Bryan earned her doctorate in information systems and communication from Robert
Morris University and is a licensed certified public accountant. She is a professor of accounting
at Robert Morris University in Moon Township, Pennsylvania. She teaches taxation, managerial accounting and financial accounting and conducts research in the areas of assessment,
Sarbanes-Oxley compliance and taxation.
Tanya Lee earned her PhD at Arizona State University. She has taught at several universities
and is an associate professor of accounting at Robert Morris University in Moon Township,
Pennsylvania. She has published articles in a number of journals in both managerial accounting
and accounting information systems. Her current research interests relate to managerial
accounting, accounting information systems and assessment.
References
AACSB. 2008. Eligibility procedures and accreditation standards for business accreditation.
http://www.aacsb.edu/accreditation/process/documents/AACSB_STANDARDS_Revised_
Jan08.pdf (accessed May 8, 2009).
AASCU. 2006. Value-added assessment: Accountability's new frontier. Perspectives, Spring.
ACBSP. 2006. Best practices in outcomes assessment. http://www.acbsp.org/download.php?sid=175 (accessed May 8, 2009).
Accounting Education Change Commission (AECC). 1990a. Issues statement number one:
AECC urges priority for teaching in higher education. Sarasota, FL: American Accounting
Association.
AECC. 1990b. Position statement number one: Objectives of education for accountants.
Sarasota, FL: American Accounting Association.
Albrecht, W.S., and R.J. Sack. 2000. Accounting education: Charting the course through a
perilous future. Sarasota, FL: American Accounting Association.
American Institute of Certified Public Accountants. 1999. Core competency framework for
entry into the accounting profession. http://www.aicpa.org/edu/corecomp.htm (accessed
September 14, 2008).
Banta, T.W., and G.R. Pike. 2007. Revisiting the blind alley of value added. Assessment
Update 19, no. 1: 1–2, 14–16.
Bloom, B.S. 1956. Taxonomy of educational objectives, handbook I: The cognitive domain.
New York: David McKay.
Calderon, T.G. 2005. Assessment in the accounting discipline: Review and reflection. In
Assessment of student learning in business schools: Best practices each step of the way,
ed. K. Martell and T.G. Calderon, vol. 1, 187206. Tallahassee, FL: Association for
Institutional Research.
Calderon, T.G., B.P. Green, and M. Harkness. 2004. Best practices in accounting program
assessment. Sarasota, FL: American Accounting Association.
Cole, J.S., D.A. Bergin, and T.A. Whittaker. 2008. Predicting student achievement for low
stakes tests with effort and task values. Contemporary Educational Psychology 33, no. 4:
609–24.
Department of Education. 1983. A nation at risk: The imperative for educational reform. http://www.ed.gov/pubs/NatAtRisk/recomm.html (accessed September 15, 2008).
Drennan, J., and A. Hyde. 2008. Controlling response shift bias: The use of the retrospective pre-test design in the evaluation of a master's programme. Assessment & Evaluation in Higher Education 33, no. 6: 699–709.
Ewell, P.T. 2002. An emerging scholarship: A brief history of assessment. In Building a
scholarship of assessment, ed. T.W. Banta and associates, 3–25. San Francisco, CA: John
Wiley.
Harlen, W., and R.D. Crick. 2003. Testing and motivation for learning. Assessment in
Education 10, no. 2: 169–207.
Kimmell, S.L., R.P. Marquette, and D.H. Olsen. 1998. Outcomes assessment programs:
Historical perspective and state of the art. Issues in Accounting Education 13, no. 4: 851–68.
Mann, S. 1997. Implications of the response-shift bias for management. Journal of Management Development 16, no. 5: 328–36.
Meyers, N.M., and D.D. Nulty. 2009. How to use (five) curriculum design principles to align
authentic learning environments, assessment, students' approaches to thinking and learning outcomes. Assessment & Evaluation in Higher Education 34, no. 5: 565–77.
Pascarella, E.T., and G.C. Wolniak. 2004. Change or not to change: Is there a question? Journal of College Student Development 45, no. 3: 353–5.
Suskie, L. 2004. Assessing student learning: A common sense guide. Bolton, MA: Anker
Publishing Company.
Warren, J. 1984. The blind alley of value added. American Association for Higher Education
Bulletin 37, no. 1: 10–3.
Wise, S.L., and C.E. DeMars. 2005. Low examinee effort in low-stakes assessment: Problems
and potential solutions. Educational Assessment 10, no. 1: 1–17.