
MEASURING MASTERY

BEST PRACTICES FOR ASSESSMENT
IN COMPETENCY-BASED EDUCATION

Katie Larsen McClarty and Matthew N. Gaertner
Center for College & Career Success, Pearson

April 2015
AEI Series on Competency-Based Higher Education

CENTER ON HIGHER EDUCATION REFORM
AMERICAN ENTERPRISE INSTITUTE
Foreword

Rising tuition prices and finite public budgets have spawned a lively policy debate about innovation in higher education. In particular, competency-based models have garnered a lot of attention from policymakers, reformers, and funders. Unlike online college courses, which often leave the basic semester-long structure intact, competency-based models award credit based on student learning, not time spent in class. As soon as a student can prove mastery of a particular set of competencies, he or she is free to move on to the next set. A number of institutions are currently engaged in these efforts, including Western Governors University, Excelsior College, Northern Arizona University, and the University of Wisconsin's UW Flexible Option.

The competency-based model presents opportunities for improvement on two dimensions: first, it allows students to move at their own pace, perhaps shortening the time to complete a degree, and second, competencies can provide a clearer signal of what graduates know and are able to do. Yet for all the enthusiasm that surrounds competency-based approaches, a number of fundamental questions remain: What kinds of students are likely to choose competency-based programs? How do students in these programs fare in terms of persistence, completion, and labor market outcomes? Are these programs more affordable than traditional degrees? What does the regulatory environment look like for competency-based providers? Do employers value the credential?

Despite increasing attention being paid to the potential of competency-based education, researchers and policymakers still have few answers to these questions. To provide some early insight, AEI's Center on Higher Education Reform has commissioned a series of papers that examine various aspects of competency-based education. In the third paper of the series, Katie Larsen McClarty and Matthew N. Gaertner of Pearson Education introduce a set of best practices for high-stakes competency-based education assessment, detailing how providers can work to validate their assessments and establish performance levels that map to real-world mastery.

As always, our goal is not to come up with a verdict as to whether this innovation is good or bad, but to provide a look under the hood that is useful to policymakers and other observers. I hope you find it useful, and stay tuned for more.

Andrew P. Kelly
Resident Scholar in Education Policy Studies
Director, Center on Higher Education Reform
American Enterprise Institute
Executive Summary

Competency-based education (CBE) programs are growing in popularity as an alternative path to a postsecondary degree. Freed from the seat-time constraints of traditional higher education programs, CBE students can progress at their own pace and complete their postsecondary education having gained relevant and demonstrable skills. The CBE model has proven particularly attractive for nontraditional students juggling work and family commitments that make conventional higher education class schedules unrealistic. But the long-term viability of CBE programs hinges on the credibility of these programs' credentials in the eyes of employers. That credibility, in turn, depends on the quality of the assessments CBE programs use to decide who earns a credential.

In this paper we introduce a set of best practices for high-stakes assessment in CBE, drawing from both the educational-measurement literature and current practices in prior-learning and CBE assessment. Broadly speaking, there are two areas in assessment design and implementation that require significant and sustained attention from test developers and program administrators: (1) validating the assessment instrument itself and (2) setting meaningful competency thresholds based on multiple sources of evidence. Both areas are critical for supporting the legitimacy and value of CBE credentials in the marketplace.

This paper therefore details how providers can work to validate their assessments and establish performance levels that map to real-world mastery, paying particular attention to the kinds of research and development common in other areas of assessment. We also provide illustrative examples of these concepts from prior-learning assessments (for example, Advanced Placement exams) and existing CBE programs. Our goal is to provide a resource to institutions currently developing CBE offerings and to other stakeholders (regulators and employers, for instance) who will encounter an increasing number of CBE programs.

Based on our review of the current landscape, we argue that CBE programs have dedicated most of their attention to defining discrete competencies and embedding those competencies in a broader framework associated with degree programs. Many programs clearly document not only the competencies but also the types of assessments they use to measure student proficiency. This is a good start.

We argue that, moving forward, CBE programs should focus on providing evidence that supports the validity of their assessments and their interpretation of assessment results. Specifically, program designers should work to clarify the links between the tasks students complete on an assessment and the competencies those tasks are designed to measure. Moreover, external-validity studies, relating performance on CBE assessments with performance in future courses or in the workplace, are crucial if CBE programs want employers to view their assessments and their competency thresholds as credible evidence of students' career readiness.

External validity is the central component of our recommendations:

1. CBE programs should clearly define their competencies and clearly link those competencies to material covered in their assessments.

2. To support valid test-score interpretations, CBE assessments should be empirically linked to external measures such as future outcomes.

3. Those empirical links should also be used in the standard-setting process so providers develop cut scores that truly differentiate masters from nonmasters.
4. In addition to rigorous test development and standard setting, CBE programs should continue to collect and monitor graduates' life outcomes in order to provide evidence that a CBE credential stands for a level of rigor and preparation equivalent to a traditional postsecondary degree.
Measuring Mastery: Best Practices for Assessment
in Competency-Based Education
Katie Larsen McClarty and Matthew N. Gaertner

This paper is the third in a series examining competency-based higher education from a number of perspectives.

While college costs have risen dramatically over the past decade, degree completion rates have remained stubbornly flat, leading policymakers and advocates to look for new models of education that can reduce costs and raise productivity. Reformers have increasingly touted competency-based education (CBE) as a potential remedy for escalating prices and stagnant graduation rates.1

The case for CBE is intuitively appealing: Students can earn college credit by demonstrating competencies rather than accruing a certain amount of seat time, the conventional metric. In simple econometric terms, traditional higher education programs hold time constant (for example, students must complete 120 credit hours to earn a bachelor's degree) but allow the amount of demonstrated learning during that time to vary (for example, students can earn different course grades and still receive the same number of credit hours). CBE programs aim for the opposite: the standards for demonstrated learning are held constant, but the amount of time students must spend to reach them can vary.

CBE is particularly appealing for students whose work or family commitments make educational flexibility a priority. Such students represent a large and growing share of the college-going population. Twenty percent of undergraduate students work full time, with more than 70 percent working at least part time.2 Nearly a quarter of undergraduate students are parents, and half of those are single parents.3 Work and family priorities compete with class schedules and may make it difficult for some students to adhere to the seat-time requirements of traditional education models where classes often meet in the middle of the day and in the middle of the week. CBE can help these students work at their own pace and on a more feasible schedule. And they can use the program to show they have mastered a predetermined set of competencies.

The idea of CBE is not new. In the 1970s, the US Department of Education Fund for the Improvement of Postsecondary Education made grants to support the development of new CBE programs at institutions that were already providing adult-learning programs. One grant recipient, a consortium of Minnesota community colleges, began developing a CBE program in 1973 and, two years later, 250 students across the St. Paul metropolitan area were enrolled. An evaluation of competency-based teacher education programs in Minnesota and Nebraska showed improved performance for beginning teachers, and higher levels of teacher and student satisfaction.4

Although CBE programs remained a small part of higher education for many years, their focus on student knowledge and outcomes rather than time spent in a traditional classroom led to advances in the movement to grant credit for prior learning. When Western Governors University (WGU) was founded in the late 1990s, it represented the first higher education institution to award degrees based solely on competencies. CBE programs are now firmly established elsewhere, at institutions such as Alverno College, Capella University, Excelsior College, Lipscomb University, and Southern New Hampshire University.

The emerging completion agenda has taken CBE from a niche market to the forefront of federal and state higher education policy discussions. In March 2013, the Department of Education announced that students

participating in approved CBE programs could be eligible for federal financial aid, echoing what advocates have been saying about the model for years:

    Competency-based approaches to education have the potential for assuring the quality and extent of learning, shortening the time to degree/certificate completion, developing stackable credentials that ease student transitions between school and work, and reducing the overall cost of education for both career-technical and degree programs. The Department plans to collaborate with both accrediting agencies and the higher education community to encourage the use of this innovative approach when appropriate, to identify the most promising practices in this arena, and to gather information to inform future policy regarding competency-based education.5

Students can currently receive federal financial aid under two types of CBE models. The first is a course-based model with credit equivalency. In this approach, student competencies are built into particular courses and then mapped back to credit hours. Although the credit hour is not the underlying metric of student learning, credit-hour equivalence is used to qualify students for financial aid. This was the original CBE model and is still the most popular. The second model, direct assessment, abandons consideration of credit hours altogether in favor of a direct measure of student learning such as projects, papers, examinations, presentations, performances, and portfolios. So far, though, regulators have only tentatively granted access to CBE models that are entirely divorced from the credit hour: only two institutions, Southern New Hampshire University and Capella University, have received both regional accreditor and Department of Education approval for direct-assessment programs.6

From a regulator's perspective, such caution is understandable given the pace of change and the calls for expansion.7 Despite CBE's rising popularity, many important questions remain. A measure of learning is more intuitively appealing than a measure of time, but a CBE model is workable only insofar as its measures of learning yield trustworthy data about students' prospects for future success. Fortunately, CBE providers can help prove the value of the model by providing regulators and employers with clear, concrete evidence that their competencies and assessments truly differentiate students who have mastered necessary material from those who have not. Marshalling this evidence, in turn, requires the kind of best practices in assessment development, standard-setting processes, and evaluation that have been developed in psychometrics.

This paper therefore seeks to explore the current state of CBE assessment relative to best practices in assessment development and validation. We describe how prior-learning assessments have been implemented in higher education and how sound assessment principles and lessons learned have been or could be applied to CBE programs. We begin with a review of two frameworks: the first describes industry standards for developing and validating assessments, and the second focuses on determining mastery. Next, we apply each of the frameworks to existing prior-learning assessments and CBE programs, concluding with a set of recommendations for institutions implementing or planning to implement CBE programs.

The Common Elements of Competency-Based Education

CBE models can take a variety of forms, but most programs include two common elements: (1) a competency framework and (2) competency assessments. The competency framework describes the skills, abilities, and knowledge needed to perform a specific task.8 Competencies must be clearly defined, measurable, and related to the knowledge or skills needed for future endeavors, such as additional education or employment.9 Often, competencies are specific to a particular course or degree program. For example, competencies in a public health

program may include being able to "identify public health laws, regulations, and policies related to prevention programs" or "use statistical software to analyze health-related data."10 The second common element of CBE models is competency assessment. Because competency assessments are used to determine mastery and award credit, the value of CBE credentials hinges on the reliability and validity of those assessments.

Assessment quality has been an important research topic for as long as CBE programs have existed. In 1976, John Harris and Stephen Keller outlined several key considerations in competency assessment and concluded that "the major development effort in competency-based education should not lie in design of instructional materials but in design of appropriate performance assessments." Furthermore, institutions "should not commit themselves to competency-based curricula unless they possess means to directly assess students' performance."11

Nearly 40 years later, that imperative persists. In his book about higher education accreditation, Paul Gaston states: "Qualifying [CBE] programs should be expected to demonstrate that meaningful program-level outcomes are equivalent to those accomplished by more traditional means and, thereby, deserving of recognition through equivalent credentials."12 The implications of this statement bear emphasis: Reliable assessment is a necessary but insufficient precondition for CBE program success. Programs must also produce students who are just as well prepared for future success as comparable students who earn credentials through more traditional avenues. It seems evident, then, that widespread acceptance and adoption of the CBE model will require high-quality competency assessments linked to meaningful labor market outcomes.

When developing competency assessments, there are two important stages. The first is assessment development and score validation; in other words, do scores on the assessment reflect the different levels of knowledge and skills that assessment designers are trying to measure? The second is determining how well a student must perform on the assessment in order to demonstrate competency; in other words, what is the cut score that separates the competent from the not-yet-competent? In this section we address each stage separately, drawing on best practices in each area.

Framework for Assessment Design. Assessment designers should start with the Standards for Educational and Psychological Testing, the book that describes industry standards for assessment development and validation.13 The Standards provides guidance for developing and evaluating assessments and outlines the types of evidence needed to support valid inferences about assessment results. Basically, an assessment is valid if there is appropriate evidence to support the intended score interpretations and the ways in which those who give the test will use it. Validity is obviously crucial for assessment development in CBE programs, where test scores may be used to confer not only course credits but also degrees or certificates.14

Imagine a test developed to measure a student's knowledge of public-health laws, regulations, and policies. Students with higher scores should exhibit a greater level of knowledge of public-health concepts. Their level of knowledge, as evidenced by their test scores, could be used to determine whether they are awarded competency credits in this area and, by extension, whether they are prepared for future endeavors in public health.

Although this understanding of test scores may seem intuitive, the ability to make valid inferences from assessment results relies on these simple axioms. For example, can the test developers demonstrate that knowledge of public-health laws, regulations, and policies, and not some irrelevant trait, explains test-score variability? Moreover, do higher scores relate to higher levels of subsequent job performance? Validation is the process of accumulating evidence to answer these fundamental questions. According to the Standards, validity evidence can come from five sources: (1) test content, (2) response processes, (3) internal test structure, (4) relations to other variables, and (5) test consequences.15

The first three sources of evidence generally reflect the test instrument itself, whereas the last two rely on data external to the assessment. Although not all sources of validity evidence may be present for every assessment, programs can make a strong validity argument by integrating evidence from multiple sources. For example, it is important to show that a competency-based assessment does test the knowledge and skills associated with the specified competency (evidence

based on test content). It is just as important, however, to show that students who score higher on the assessment also do well on other tasks, such as job performance, that require that competency (evidence based on relations to other variables). In a later section on assessment design in practice, we detail specific validity evidence that supports intended score interpretations for existing prior-learning and CBE assessments.

Framework for Determining Mastery. Once an assessment has been developed, test designers must establish cut scores to separate masters from nonmasters. In the case of CBE, the assessment cut scores distinguish those who receive credit (or various levels of credit) from those who do not. Because cut scores are central to the use and interpretation of CBE assessments, test designers must also gather validity evidence to support cut-score placement.

One particularly relevant approach for setting cut scores and determining mastery is Evidence-Based Standard Setting (EBSS), which is especially useful when an assessment makes claims about future performance (for example, a test-taker's ability to pass future courses or succeed in the workplace).16 In K-12 settings, EBSS approaches have been used to identify college-ready high school students by using data that link secondary school test scores with how those students perform once they reach college.17 CBE credentials imply preparedness for future work, so EBSS may be similarly well suited to setting cut scores on CBE assessments.

In an EBSS approach, the judgments of subject-matter experts are combined with data from research studies to determine the cut scores for different performance levels. The five EBSS steps are described in detail in a subsequent section, but first, we turn to assessment design in practice.

Assessment Design in Practice

CBE assessment can take a variety of formats: objectively scored assessments (for example, those with multiple-choice or true-false questions), performance-based assessments (for example, those including essays, group projects, or simulated environments), and real-world observations (for example, preservice teachers in the classroom). Regardless of format, however, the credibility of inferences drawn from assessment results depends on evidence of their validity. A 2002 Department of Education review of CBE programs, for instance, stated that few programs report robust reliability or validity information. The authors note, "By attending to concerns about validity and reliability, institutions can glean meaningful information to improve their initiatives and to satisfy external demands for accountability."18 In this section, we describe potential sources of such validity evidence and provide examples of evidence from CBE programs and prior-learning assessments. We also note examples of evidence that CBE providers could collect to validate and promote their model going forward.

Intended Score Interpretations and Test Use. The first step in developing an assessment and amassing the appropriate validity evidence is specifying the purpose of the assessment, or the intended interpretation of test scores for a given use. Once specified, that interpretation must be validated. Because CBE assessments provide evidence of student learning and are used to award credits, degrees, or other certifications (qualifications that students can take with them from one institution to another or from institution to employer), this evidence should theoretically be transportable across educational institutions and sectors. At present, the Department of Education notes that CBE programs in the United States are far from achieving this goal (although, to be fair, most traditional colleges suffer from the same portability challenges).19

To support portability, CBE programs should gather evidence corresponding to the five validity elements described in Standards for Educational and Psychological Testing. Specifically, CBE programs should

1. Clearly define the competencies;

2. Provide an explicit link between the skills measured by the assessments and those competencies;

3. Demonstrate that student behaviors or thought processes during testing reflect the competencies;

4. Relate performance on competency assessments with other measures of the same competencies; and

5. Document the empirical relationship between assessment scores and future outcomes (such as success in the workplace or attainment of a more advanced competency).

CBE programs must also provide detailed information about the intended interpretations and uses of their assessments. For example, Excelsior College's CBE nursing students are expected to demonstrate "theoretical learning and clinical competence, including critical thinking, at a level required for beginning practice as an associate-degree-level registered nurse."20 Accordingly, the Excelsior nursing assessments should be designed to measure students' theoretical knowledge, clinical competence, and critical thinking. To earn a CBE nursing degree, student performance on the assessments should be similar to that of nurses with associate degrees, and their performance on the job should be similar to that of other nurses at that level.

Defining Competencies. Perhaps the most important step in assessment design is defining the competencies. As Richard Voorhees has argued, competencies must be clearly defined and measurable; otherwise, they cannot be considered competencies.21 Therefore, designers of CBE programs must clearly define the competency or set of competencies an assessment will measure. An exemplar in this area is WGU. For each degree WGU awards, a set of domains is specified. For example, a bachelor of science in accounting consists of 10 domains: (1) accounting; (2) accounting/finance and information technology; (3) business law and ethics; (4) cost/managerial accounting; (5) economics, global business, and quantitative analysis; (6) foundations; (7) liberal arts; (8) marketing and communication; (9) organizational behavior and management; and (10) system administration and management.

For each domain, a set of subdomains elaborates the specific competencies that a student must demonstrate. The subdomains within accounting include "the student understands the nature and purpose of information systems" and "the student understands the need for and uses of internal control systems."
Figure 1
Hierarchical Structure of Degrees, Domains, and Competencies

Source: The authors
Note: This structure is based on Western Governors University's bachelor's degree in accounting.
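The degree-to-domain-to-subdomain hierarchy that Figure 1 depicts can be sketched as a simple nested data structure. This is only an illustrative sketch, not WGU's actual data model: the class and field names are our own, and only three of the ten domains and two of the accounting subdomains named in the text are shown.

```python
from dataclasses import dataclass, field

@dataclass
class Subdomain:
    """A specific competency the student must demonstrate."""
    statement: str

@dataclass
class Domain:
    """A broad content area within a degree program."""
    name: str
    subdomains: list[Subdomain] = field(default_factory=list)

@dataclass
class Degree:
    """A degree defined entirely by its domains and their competencies."""
    title: str
    domains: list[Domain] = field(default_factory=list)

accounting = Degree(
    title="Bachelor of Science in Accounting",
    domains=[
        Domain("Accounting", subdomains=[
            Subdomain("Understands the nature and purpose of information systems"),
            Subdomain("Understands the need for and uses of internal control systems"),
        ]),
        Domain("Business Law and Ethics"),
        Domain("Cost/Managerial Accounting"),
        # ...the remaining domains listed in the text would follow here
    ],
)

# In a CBE program, the degree is awarded once every subdomain
# competency has been demonstrated, regardless of time spent.
competency_count = sum(len(d.subdomains) for d in accounting.domains)
print(competency_count)  # 2
```

The point of the structure is that the degree bottoms out in discrete, checkable competency statements rather than credit hours; an assessment can then be attached to each subdomain.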

A third subdomain within accounting is "the student understands information systems auditing." The domains and subdomains are jointly developed by subject-matter experts and employers in that field.22 Figure 1 shows the hierarchical structure of the degree program, domains, and subdomains.

A second example of degree-level competency specification comes from Alverno College, which began exploring CBE programs in the late 1960s. First, educators used a faculty survey to capture the learning outcomes that professors saw as critical for individual courses and academic departments. A subsequent analysis identified the similarities across courses and departments, from which Alverno established four institution-wide learning outcomes. Alverno College regularly expanded and revised those learning outcomes, and today the institution lists eight competencies required of all students: communication, analysis, problem solving, value in decision making, social interaction, developing a global perspective, effective citizenship, and aesthetic engagement.23

Another approach to defining competencies in CBE programs is to establish a large master set of competencies and then require students to demonstrate proficiency in a subset depending on the degree program or job requirements. Lipscomb University uses this approach. Lipscomb licensed the Polaris business competency framework, which defines 41 competencies across seven general categories: interpersonal, communication, management, leadership, conceptual, personal, and contextual.24

The Polaris system has been implemented by numerous companies across a variety of industries to hire staff, provide training, and develop leaders. When Lipscomb applied the Polaris competency model, feedback from local business and industry stakeholders suggested that 17 of the 41 competencies would be relevant and appropriate qualifiers for an undergraduate degree. Lipscomb now requires students seeking a CBE undergraduate degree to demonstrate mastery of these 17 competencies.

The Validity of the Test Instrument. Before educators can use an assessment to award credit or degrees for demonstrated competencies, they must determine that the assessment is a valid measure of those competencies. That means that (1) the test must fully measure the competency, (2) the processes students use to complete the assessment tasks must be an authentic reflection of the competency, and (3) students would receive the same test results if they were to take a different form of the test scored by different raters. These ideas correspond with validity evidence based on test content, response processes, and internal structure, respectively. Each will be briefly described in the following paragraphs.

Evidence Based on Test Content. Once program designers have clearly defined relevant competencies, they should collect evidence that their test content fully reflects those competencies. Specifically, providing validity evidence based on test content means showing the relationships between test questions or tasks and the defined competencies. Test developers must consider how well the breadth and depth of their test relates to defined competencies.

The Advanced Placement (AP) program is one example. Its test content validity evidence is grounded in a process known as evidence-centered design (ECD).25 Using ECD, assessment designers connect three components: (1) the intended claims about students' content knowledge and skill proficiency, (2) the observable evidence a student would need to produce to support those claims, and (3) the tasks or situations that would elicit that evidence. By designing assessment tasks that enable students to demonstrate the relevant knowledge and skills, test developers provide evidence of validity.

An example from the AP Chemistry assessment starts with the following claim: "Students can apply mathematics in which they evaluate the reasonableness of quantities found in stoichiometric calculations."26
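The claim-evidence-task chain that ECD prescribes can be made concrete with a small sketch built around this stoichiometry claim. This is our own illustration of the idea, not College Board's implementation; the class and field names are assumed, and the evidence and task wording is adapted loosely from the examples discussed in the text.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Claim:
    """What the program wants to assert about a student's proficiency."""
    text: str

@dataclass(frozen=True)
class Evidence:
    """Observable student work that would support a claim."""
    description: str
    supports: Claim

@dataclass(frozen=True)
class Task:
    """An assessment item or situation designed to elicit the evidence."""
    prompt: str
    elicits: Evidence

claim = Claim("Students can apply mathematics in which they evaluate the "
              "reasonableness of quantities found in stoichiometric calculations")
evidence = Evidence("Coefficients of a chemical equation interpreted as mole ratios",
                    supports=claim)
task = Task("Balance the equation and interpret its coefficients as mole ratios",
            elicits=evidence)

# Content-validity check in the ECD spirit: every task must trace back,
# through the evidence it elicits, to a defined claim.
assert task.elicits.supports is claim
```

Writing the three components as linked records makes the content-validity argument auditable: any task that cannot be traced back to a claim has no place on the assessment.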

Examples of supporting evidence would include the correctness of a chemical equation, chemical formulas, application of mathematical routine, or coefficients interpreted as mole ratios. Assessment items can then be written such that students demonstrate the application of a correct chemical formula or interpret the coefficients of a problem as mole ratios. Test content is thereby directly linked to defined competencies through the ECD process.

Although not necessarily developed with ECD principles in mind, some CBE assessments make an explicit link between test content and defined competencies. For example, in Southern New Hampshire University's direct-assessment CBE program, students show proficiency in various competencies through authentic project tasks. Students are able to select from multiple simple projects that assess one competency at a time or a single complex project that assesses multiple competencies. A simple project assessing students' ability to write a paragraph involves describing a recent purchase, specifically why the item was purchased and why it was selected over other items.

A more complex project assessing multiple competencies, on the other hand, requires a student to write a formal memo to his or her boss evaluating two vending machine companies and recommending one over the other. The vending machine recommendation project assesses five competencies: (1) can use logic, reasoning, and analysis to address a problem; (2) can write a business memo; (3) can use a spreadsheet to perform calculations; (4) can synthesize material from multiple sources; and (5) can evaluate information and its sources critically.27 This explicit project-to-competency link provides strong validity evidence based on test content for Southern New Hampshire's CBE program.

Evidence Based on Response Processes. Students' response processes, that is, the thoughts, behaviors, and actions required of a student to complete an assessment, are another source of validity evidence, usually gathered during initial test development. For example, students taking the AP Music Theory course and assessment must demonstrate a variety of skills through different processes. Aural skills are measured through exercises requiring students to listen to a piece of music and write the notation on a staff.28 Sight-singing skills, on the other hand, are best measured through a performance assessment where the student sings a set of notes into a recorder. Using novel pieces of music and sets of notes helps ensure that the assessments are measuring specifically aural skills or sight singing rather than memory or musical experience.

CBE programs can and should gather similar evidence. For example, in Excelsior College's nursing program, a computer-based exam is given to assess nursing theory, but critical thinking and clinical reasoning are measured through simulated clinical experiences and actual performance in a clinical setting.29 While it seems preferable to assess clinical reasoning in a clinical setting, assessment designers must clearly describe how adequate reasoning skills are demonstrated (or insufficient reasoning skills identified) in such a test-taking scenario. In the case of this nursing exam, establishing explicit links between the desired thinking and reasoning processes and successful task completion would provide validity evidence based on response process.

Evidence Based on Internal Structure. A third type of validity evidence is based on the internal structure of the assessment, that is, how the assessment items or tasks relate to each other. For example, the developers of the AP World Languages and Cultures assessments hypothesized that their tests measured four factors: reading, writing, listening, and speaking. Factor analysis (a common and useful tool for determining the number of factors a test measures) supported their hypothesis.30

Another way to consider this is to compare test structure across different examinee groups. For example, College Board conducted several studies to determine whether the AP World Languages and Cultures exams kept their four-factor structure for native speakers, bilingual students, and second-language learners. Results supported a similar factor structure across all population groups.31

The most common way to demonstrate validity evidence based on internal structure is through reliability. There are different ways to measure different types of reliability, including test-retest (where students take the same test form on different occasions), internal consistency (which measures the extent to which students respond similarly to items within a single test
form), and inter-rater reliability (where two or more raters evaluate the same student performance on a test). Students should receive approximately the same score if they take a test multiple times, regardless of the test form administered or the raters scoring it. Using Cronbach's alpha (a reliability statistic that ranges from 0 to 1.0) as a measure of internal consistency, for example, values above 0.80 are considered acceptable, although most standardized tests typically have values above 0.90.32 The AP program reports the reliability of each section (multiple choice and free response), of the raters, of the composite score, and of the subsequent score classification.33

All three of these analyses could be applied to CBE programs. Program designers could apply statistical analyses to the WGU assessments (described previously in this section) to determine whether their structure reflects the domain and subdomain structure specified in the competency frameworks. Additional analyses could help them evaluate whether the structure is consistent across relevant population groups. Finally, CBE program designers should always report reliability statistics when tests are used for high-stakes purposes.

Although many CBE programs report developing reliable and valid assessments, reliability statistics are rarely publicly documented. Some programs rely on assessments developed by external organizations, and those organizations typically provide reliability information for their instruments. For example, the reliability of the Polaris assessments (used by Lipscomb University) exceeds the 0.80 threshold for all but a few dimensions. Many more institutions, however, are developing their own assessments and should work to provide their own reliability evidence.

Validity Associated with External Evidence. While we just outlined three sources of validity related to the test itself (test content, response processes, and internal structure), this section describes sources based on external evidence. External-validity evidence is critical to supporting the claims that CBE programs can make about the relationship between their measures of competence and workplace success, and about comparability of graduates from CBE and non-CBE programs.

Concurrent Validity Evidence. Validity evidence based on relationships with other variables can come at two points. First, educators could compare assessment results with other measures collected concurrently. For example, students completing a college algebra course may be administered a College-Level Examination Program (CLEP) exam to evaluate the relationship between performance on CLEP and performance in the course. A strong positive relationship between test performance and course performance would support using the CLEP test to place out of college algebra. Indeed, there is evidence that CLEP scores are moderately correlated with college course grades.34

Similar evidence is limited for CBE programs. Examples do exist, however, from programs outside the United States. The National Cancer Action Team in England developed a competency assessment tool for technical surgical performance. Test developers validated their assessment tool against other measures of examinee performance, namely a measure of observed errors. As one would hope, scores on the competency assessment tool were inversely related to the number of surgical errors.35

Predictive Validity Evidence. In addition to concurrent validity evidence, predictive validity evidence is critical when assessment scores will be used to predict a future outcome. Performance on AP exams, for example, should predict postsecondary outcomes. Accordingly, College Board has provided evidence that, after controlling for SAT scores and high school grades, students who scored higher on the AP exam had higher levels
of college success (in other words, higher grades, retention, and selectivity) than lower-scoring AP students or students not taking the AP exam.36

Colleges also use CLEP exams to award credit and therefore require similar validity evidence. That body of evidence suggests that students who receive college credit via CLEP perform comparatively well in their subsequent college courses: CLEP students typically have higher grade-point averages (GPAs) and earn a higher proportion of As and Bs relative to their classmates.37 Even after controlling for prior achievement and demographic characteristics, CLEP students had higher GPAs than non-CLEP students and graduated sooner, enrolled in fewer semesters, and graduated with fewer credits.38

These studies provide strong evidence of validity based on test consequences. Similar performance patterns in subsequent courses help demonstrate that students who succeed on a placement exam have indeed mastered the requisite skills; this is the evidentiary sine qua non for prior-learning assessments. For CBE programs to become widely accepted as an alternative path for earning a college degree, the programs must likewise provide evidence that they are just as good as corresponding traditional degree programs at imparting (or at least measuring) the relevant knowledge and skills.

Although such external-validity data for CBE assessments is relatively scant, some programs are developing infrastructure to support these important analyses. Lipscomb University students, for example, are rated by their employers at the beginning and end of the CBE program. Employers' ratings at the beginning of the CBE program could provide concurrent evidence when linked with students' initial performance on CBE assessments. Further, employers' postprogram ratings could provide evidence of the CBE assessments' predictive value.

Other, more mature CBE programs do report limited information related to later-life outcomes. For example, on its website WGU reports that its senior students performed better than students at 78 percent of institutions participating in the Collegiate Learning Assessment, a standardized measure of critical thinking and communication.39 In addition, 94 percent of employers felt that WGU graduates performed at least as well as graduates from other institutions. In fact, 53 percent of employers reported higher performance from the WGU graduates.40

Excelsior College also reports outcomes data in terms of standardized-test performance and subsequent job performance. Graduates from the Excelsior nursing program pass the nursing licensure exam at rates comparable to the national average. Once employed, 82 percent of surveyed nurse supervisors rated Excelsior nursing graduates similar or higher in terms of clinical competency compared to other associate-degree-level nursing graduates.41

Posting student outcomes to a website or publishing job performance results via commissioned reports is a step in the right direction. But the educational research community needs more examples similar to those provided by Excelsior College and WGU. Furthermore, submitting claims about student outcomes to rigorous scientific peer review could substantially expand the CBE knowledge base and allow policymakers to fairly assess the value these programs provide. While that kind of research takes time, we hope that as years pass and CBE programs mature, more institutions undertake and publish rigorous validity studies to establish a research base commensurate with CBE's growing popularity.

Determining Mastery

The previous section focused on assessment design and the need to collect validity evidence for assessment results. In this section we focus on the equally important task of determining the level of performance
required to receive credit. In educational assessment, this is known as standard setting, a process that is common in primary and secondary schooling, but less frequently discussed in higher education. Standard setting is the process of defining discrete levels of achievement on an assessment and setting cut scores to separate those levels. In some cases, such as with licensure exams, two performance levels are sufficient: pass or fail. In other cases, more levels are useful to further differentiate performance. AP exams, for example, have five score points.

Standard setting not only relates assessment scores to performance levels, but also determines which performance levels are sufficient to receive credit. As described earlier, EBSS may provide a particularly attractive standard-setting framework for both prior-learning and CBE programs, because each performance level in those programs is associated with not only a competency level but also an empirical likelihood of future success. The following paragraphs describe each of the five steps of EBSS:

1. Define the relevant outcomes;

2. Design appropriate studies;

3. Conduct studies and synthesize results;

4. Stakeholder review and recommendations; and

5. Ongoing monitoring.42

Step 1: Define the Relevant Outcomes. The first step in EBSS is defining the competencies and corresponding indicators of future success for each performance level. For example, students generally need to demonstrate specific knowledge and skills to earn an AP exam score of 4, a claim about competency. Moreover, students who score a 4 would typically earn an A-, B+, or B in a corresponding college course, a claim about future success.

CBE program designers have established similar definitions. Lipscomb University's CBE program, for example, has four levels for each competency: basic/elementary, proficient practitioner, exceptional/expert, and master/guru. Each performance level is associated with a particular set of knowledge, skills, and behaviors. Lipscomb's influence competency, required for an undergraduate CBE degree, provides a good example. Students at the basic/elementary level of the influence competency are responsive: they acknowledge requests quickly, listen attentively, and gain respect and admiration. At the next level, proficient practitioners are reliable team leaders who identify and communicate compelling motivators. They adjust their influence style to meet the needs of individual team members and offer recognition and encouragement to keep the team moving forward.

The exceptional/expert influencer communicates a legitimate, consistent agenda across a variety of functions, understands power dynamics and the responsibilities of leadership, clearly articulates situational advantages, and validates potential concerns. Finally, individuals at the master/guru level develop and implement appropriate and creative recognition, rewards, and incentives to activate an organization. They influence all levels of the organization and external stakeholders through strong communication, impactful messages, and personal appeal. In addition, masters/gurus remain persistently optimistic, particularly in the face of challenges.43

Lipscomb's CBE program has also made associations with external measures. While each of these four categories describes a distinct level of competence, each is also linked to success in various tiers of employment. For example, students at the basic/elementary level are ready to become entry-level, individual contributors, while proficient practitioners are prepared for supervisor or entry-level manager positions. The exceptional/expert-level competencies are needed for functional managers or managers of managers. Finally, strategic leaders or corporate executives are associated with the master/guru level of performance.44

CBE program designers must also consider how many distinct performance categories can be clearly differentiated by their assessments, and the consequences of landing in any given level. Many CBE programs divide their assessment scales into two levels (one in which students receive credit, and one in which they do not), but such a stark dichotomy is not required. Instead, different performance levels could translate to different numbers of credits awarded, or CBE programs
could establish graduated distinctions for exceptional performance (similar to course grades in traditional degree programs).

Step 2: Design Appropriate Studies. In step two, CBE program leaders must develop research studies that can produce evidence for the claims implied by their performance levels. For AP exams, college professors and high school AP course teachers first described what students should be able to do at each level and then estimated how many test questions a student would need to correctly answer to attain each score point. But to support claims about future outcomes, leaders had to design research studies that compared college course performance with AP exam scores.45

To support the competency claims made by CBE programs, assessment designers have implemented similar processes. In WGU's nursing program, for instance, a panel of university faculty and external experts reviewed the objectively scored assessment items and indicated how they thought a student with sufficient mastery of the competency would perform.46 Similarly, designers of the Lipscomb University Adult Degree Program gathered recommendations from several different groups, including faculty and local employers, who recommended competency levels appropriate for an undergraduate degree, based on their expert knowledge of course and job requirements.

To our knowledge, however, neither university designed studies to support the claims about future outcomes as part of its initial standard-setting process. To be sure, this is typical of most standard-setting processes for educational assessments. Initial standards are often set based on expert judgments, while designers collect validity data based on relationships with future outcomes or testing consequences after the fact.

However, when assessments are linked to significant claims about future outcomes (as is the case for both prior-learning assessments and CBE programs), we argue that, where possible, program designers should seek out empirical evidence relating assessment results to external outcomes to inform the initial standard setting. Establishing these external links a priori takes time and careful planning but has proven feasible in large- and small-scale testing programs.47

Step 3: Conduct Studies and Synthesize Results. In step three, program designers carry out the research studies designed in step two and then combine and synthesize the results. For each AP exam, for instance, the College Board conducts a college comparability study, administering a shorter version of the AP exam to students enrolled in corresponding introductory college courses.48 Students' performance on the AP exam is then compared to their course grades.

To establish empirical links between test scores and relevant outcomes, AP scores are averaged within discrete college grades. (For example, what is the average AP score for students earning an A-, B+, or B in the course?) Average performance can support claims about future outcomes at each of the five AP performance levels. In general, an AP score of 5 is equivalent to a college course grade of A; a 4 maps to college grades of A-, B+, and B; and a 3 maps to B-, C+, and C.49

This kind of external-validity research is rare in CBE programs. Few programs have linked performance on their assessments with future outcomes, so these links are also absent during the standard-setting process. This is a clear area for improvement: when CBE programs set minimum test scores for course credit, external data linking those judgments to future performance should play a central role.

Step 4: Stakeholder Review and Recommendations. In step four, stakeholders review both the assessments and the study evidence to determine the score that best differentiates students who have demonstrated mastery from those who have not (or students in one performance level from those in an adjacent level). For the AP program, the most relevant stakeholders are high school AP teachers and college faculty. For CBE programs, relevant stakeholders include not only the faculty who will implement the programs and their assessments but also colleagues, employers, or industry representatives who will be hiring CBE graduates.

For the AP program, stakeholders can consider information both from experts' judgments on the assessment items and from student performance in college courses. There are general guidelines about the relationship between AP exam performance and college course grades, but the stakeholders must also consider
variation among college courses and the judgments made by the expert reviewers in recommending cut scores for each performance level.50

In addition, stakeholders can use college comparability study results to consider the consequences of different cut-score placements. Because most colleges use an AP score of 3 or higher to award college credit, this cut score is particularly important. If the cut score is set too high, many students would still be required to complete an entry-level course even though they already have the knowledge and skills necessary to perform well in that course. If the cut score is set too low, students may place out of the entry-level course but not have the knowledge and skills to be successful in a subsequent course. The goal is finding the Goldilocks cut score that is not so high that overprepared students are unfairly sent to introductory courses, but not so low that underprepared students are allowed to skip material they have not yet mastered.

Stakeholder review is also an important step in setting standards for CBE programs. Stakeholders usually have a voice in defining the original competency framework; they should also be involved in determining the level of competency required for a credential. Furthermore, when setting standards for K-12 assessments, test developers typically publish a technical report describing the composition of stakeholders and their role in the standard-setting process. Technical reports are common practice in assessment development. Reports provide transparency and allow external review of test development processes and the associated validity evidence. CBE providers should make efforts to publish similar technical documentation for their assessments.

Step 5: Ongoing Monitoring. CBE program designers should collect data throughout the life of an assessment program to provide continuing support for the interpretation of scores and performance-level classifications. As appropriate, assessment developers and stakeholders may revisit and revise performance-level cut scores to reflect new evidence generated from multiple student cohorts progressing through a course or academic unit. For AP, this means additional iterations of the studies noted previously, where entry-level college course grades are linked to AP scores. Obviously, higher AP scores should continue to predict better grades in entry-level courses. Furthermore, if students receive AP credit and skip an entry-level course, in subsequent courses they should perform as well as comparable students who did not advance via AP credit.

Meanwhile, programs that do not include outcomes-based study evidence in the initial standard-setting process should monitor (and, if necessary, revise) cut scores after they have collected and analyzed such data. For example, Lipscomb can use employer ratings once graduates return to the workforce to evaluate whether reaching the proficient practitioner level predicts successful job performance, or whether requiring exceptional/expert-level performance might be more prudent for more demanding jobs. Likewise, Excelsior College can monitor the relationship between student performance on CBE nursing assessments and subsequent performance on the nursing licensure exam to ensure that students who pass the CBE program assessments are at least as well prepared for the licensure exam as students who opted for a traditional program. As CBE programs continue to mature, they should continue to gather data from students and graduates for use in regular reviews of their competency thresholds.

Conclusion and Recommendations

Competency-based education programs may well represent a viable alternative pathway to a postsecondary degree. Ideally, CBE students would progress at their own pace and demonstrate mastery of important competencies, free from the restrictions of traditional seat-time requirements. This would allow graduates to clearly describe (and provide evidence for) the knowledge, skills, and abilities they demonstrated to earn their degree. Employers could match their needs to candidates with relevant competencies. Practically speaking, though, the credibility of CBE credentials in the marketplace (and therefore the viability of the CBE model in general) rests on reliable and valid assessments with evidence-based performance levels.

Based on our review, it seems that most CBE efforts to date have focused on defining the competencies and developing the competency frameworks associated with various degree programs. Many programs have clear documentation of the competencies they seek to
teach and measure and the types of assessments they will use to determine mastery. Their next step should be to provide more specific documentation linking assessment tasks (such as test questions) with the competencies those tasks are designed to measure.

More importantly, CBE programs would be wise to begin longitudinal research linking their assessments to other relevant student outcomes, such as job performance. This type of evidence is crucial for establishing the validity of both CBE assessments and the cut scores separating those who receive credit from those who do not.

Throughout this paper, we have described the assessment industry's evidence standards and the current state of CBE assessment relative to those basic principles. By way of summarizing our observations and prodding further research, we conclude with some recommendations for institutions that currently offer (or are considering developing) CBE programs:

1. Clearly define competencies, and document evidence that assessments fully measure those competencies. Contemporary higher education debates are focused on the knowledge, skills, and abilities that college graduates should possess. The Lumina Foundation's Degree Qualifications Profile and the Association of American Colleges & Universities' Essential Learning Outcomes have set out to identify competencies for general education. So have several of the programs mentioned in this paper. Those competencies, however, must be defined with enough detail that they can be measured. Although many current CBE programs have detailed descriptions of the competencies, there is less documentation linking those competencies to the assessments that measure them. Such clarity would not only provide validity evidence but would also improve transparency around the processes and expectations of CBE programs.

2. Conduct research to relate CBE assessments to other assessments measuring similar competencies and to future outcomes that assessments are designed to predict. Many CBE programs develop their own assessments and have good reasons for not wanting to adopt large-scale standardized tests. But it is no less important that these local assessments be validated against other measures. CBE programs could collaborate to collect the necessary evidence. For example, students could complete both Alverno College's problem-solving measures and Lipscomb University's problem-solving and decision-making competency assessments. Collaborating institutions should not expect scores to be identical, but where competencies are conceptually related, the assessments of those competencies should show an empirical relationship.

3. Use the results of empirical research studies in the initial standard-setting process. Data relating CBE assessment scores to other outcomes should be used not only to validate the assessment post hoc but also to set competency standards a priori. Although performance standards must usually be set before longitudinal data linking assessment performance with future outcomes can be collected, there are viable alternatives for gathering outcomes data. CBE assessments could be administered to employees currently working in fields relevant for the assessment. For example, WGU offers degree programs in education, business, information technology, and health care. Entry-level workers in those fields could complete WGU's CBE assessments, and those students' on-the-job performance could be compared to their performance on CBE assessments. These empirical links could be evaluated in conjunction with the expert judgments currently collected, helping bolster the validity evidence supporting chosen performance levels.
4. Continue to gather and report validity evidence for CBE assessments and performance standards, including comparisons of student outcomes against relevant comparison groups. For CBE programs to be viewed as an attractive alternative to traditional programs, students and employers need evidence that (1) CBE graduates possess the same knowledge and skills as comparable traditional graduates and (2) CBE graduates are equally successful after graduation. These outcomes could be measured in terms of subsequent academic performance or through job attainment, job performance, occupational prestige, or earnings. Some CBE programs may be collecting these data already; they should focus on rigorous analysis and publication. Other programs will need to develop the necessary infrastructure and timelines to start data collection. It will take time to gather robust long-term outcomes data, but these data can provide compelling evidence for the effectiveness of CBE programs and support their continued growth.

Notes

1. US Department of Education, "Applying for Title IV Eligibility for Direct Assessment (Competency-Based) Programs," March 19, 2013, http://ifap.ed.gov/dpcletters/GEN1310.html.

2. Jessica Davis, "School Enrollment and Work Status: 2011," US Census Bureau, October 2012, www.census.gov/prod/2013pubs/acsbr11-14.pdf.

3. Center for Law and Social Policy, "Yesterday's Nontraditional Student Is Today's Traditional Student" [Fact Sheet], June 2011, www.clasp.org/resources-and-publications/publication-1/Nontraditional-Students-Facts-2011.pdf.

4. Roland L. Peterson, "Review and Synthesis of Research in Vocational Teacher Education," Ohio State University, Center for Vocational Education, 1973, http://files.eric.ed.gov/fulltext/ED087898.pdf.

5. White House, Office of the Press Secretary, "Fact Sheet on the President's Plan to Make College More Affordable: A Better Bargain for the Middle Class," August 22, 2013, www.whitehouse.gov/the-press-office/2013/08/22/fact-sheet-president-s-plan-make-college-more-affordable-better-bargain-.

6. Patricia Book, "All Hands on Deck: Ten Lessons from Early Adopters of Competency-Based Education," Western Interstate Commission for Higher Education, May 2014, http://wcet.wiche.edu/wcet/docs/summit/AllHandsOnDeck-Final.pdf.

7. In 2014, the Department of Education's inspector general issued a report warning of the potential for waste, fraud, and abuse to result from the department's approval of direct assessment programs. See US Department of Education, Office of the Inspector General, Final Audit Report, September 30, 2014, www2.ed.gov/about/offices/list/oig/auditreports/fy2014/a05n0004.pdf.

8. Ibid., 7.

9. Sally Johnstone and Louis Soares, "Principles for Developing Competency-Based Education Programs," Change: The Magazine of Higher Learning 46, no. 2 (2014): 12–19.

10. Council on Education for Public Health, "Competencies and Learning Objectives," June 2011, http://ceph.org/assets/Competencies_TA.pdf.

11. John Harris and Stephen Keller, "Assessment Measures Needed for Competency-Based Higher Education," Peabody Journal of Education 53, no. 4 (1976): 241–47.

12. Paul Gaston, Higher Education Accreditation: How It's Changing, Why It Must (Sterling, VA: Stylus Publishing, 2014).

13. Standards for Educational and Psychological Testing (Washington, DC: American Educational Research Association, 2014).

14. Ibid. According to Standard 1.0 of the Standards for Educational and Psychological Testing, "Clear articulation of each intended test score interpretation for a specified use should be set forth, and appropriate validity evidence in support of each intended interpretation should be provided."

15. Ibid.

16. Katie Larsen McClarty et al., "Evidence-Based Standard Setting: Establishing a Validity Framework for Cut Scores," Educational Researcher 42, no. 2 (2013): 78–88.

17. Edward Haertel, Jennifer N. Beimers, and Julie A. Miles, "The Briefing Book Method," in Setting Performance Standards: Foundations, Methods, and Innovations, 2nd ed., ed. Gregory Cizek (New York, NY: Routledge, 2012); and Kimberly O'Malley, Leslie Keng, and Julie A. Miles, "From Z to A: Using Validity Evidence to Set Performance Standards," in Setting Performance Standards: Foundations, Methods, and Innovations, 2nd ed., ed. Gregory Cizek (New York, NY: Routledge, 2012).
MEASURING MASTERY KATIE LARSEN MCCLARTY AND MATTHEW N. GAERTNER

18. Elizabeth A. Jones, Richard A. Voorhees, and Karen Paulson, "Defining and Assessing Learning: Exploring Competency-Based Initiatives," Institute of Education Sciences, National Center for Education Statistics, November 6, 2002, http://nces.ed.gov/pubsearch/pubsinfo.asp?pubid=2002159.
19. Ibid.
20. Rebecca Klein-Collins, "Competency-Based Degree Programs in the US: Postsecondary Credentials for Measurable Student Learning and Performance," Council for Adult and Experiential Learning, 2012, www.cael.org/pdfs/2012_competencybasedprograms.
21. Richard A. Voorhees, "Competency-Based Learning Models: A Necessary Future," New Directions for Institutional Research 110 (2001): 5–13.
22. Klein-Collins, "Competency-Based Degree Programs."
23. Ibid.
24. Council for Adult and Experiential Learning, "Customized, Outcome-Based, Relevant Evaluation (CORE) at Lipscomb University: A Competency-Based Education Case Study," 2014, www.cael.org/cael_lipscomb_case_study.
25. Robert J. Mislevy and Geneva D. Haertel, "Implications of Evidence-Centered Design for Educational Testing," Educational Measurement: Issues and Practice 25, no. 4 (2006): 6–20.
26. Maureen Ewing et al., "Representing Targets of Measurement within Evidence-Centered Design," Applied Measurement in Education 23, no. 4 (2010): 325–41.
27. Jennifer Share, "College for America: A New Approach for a New Workforce That Is Accessible, Affordable, and Relevant," in 2013 CAEL Forum & News: Competency-Based Education, ed. Diana Bamford-Rees et al. (Council for Adult and Experiential Learning, 2013), www.cael.org/pdfs/cael_competency_based_education_2013.
28. College Board, Music Theory Course Description, 2012, https://secure-media.collegeboard.org/ap-student/course/ap-music-theory-2012-course-exam-description.pdf.
29. Klein-Collins, "Competency-Based Degree Programs."
30. April Ginther and Joseph Stevens, "Language Background, Ethnicity, and the Internal Construct Validity of the Advanced Placement Spanish Language Examination," in Validation in Language Assessment, ed. Antony John Kunnan (Mahwah, NJ: Lawrence Erlbaum, 1998); and Rick Morgan and John Mazzeo, "A Comparison of the Structural Relationships among Reading, Listening, Writing, and Speaking Components of the AP French Language Examination for AP Candidates and College Students," Educational Testing Service, 1988.
31. Ibid.
32. Robert F. DeVellis, Scale Development: Theory and Applications (Newbury Park, CA: Sage Publications, 1991).
33. Brent Bridgeman, Rick Morgan, and Ming-mei Wang, "Reliability of Advanced Placement Examinations," Educational Testing Service, 1996.
34. Amiel T. Sharon, "The Use and Validity of the GED and CLEP Examinations in Higher Education" (presentation, American Personnel and Guidance Association Annual Convention, Atlantic City, NJ, April 1971).
35. Danilo Miskovic et al., "Is Competency Assessment at the Specialist Level Achievable? A Study for the National Training Programme in Laparoscopic Colorectal Surgery in England," Annals of Surgery 257 (2013): 476–82.
36. Krista Mattern, Emily Shaw, and Xinhui Xiong, "The Relationship between AP Exam Performance and College Outcomes," College Board, 2009, https://research.collegeboard.org/sites/default/files/publications/2012/7/researchreport-2009-4-relationship-between-ap-exam-performance-college-outcomes.pdf.
37. Brad Moulder, Abdulbaset Abdulla, and Deanna L. Morgan, "Validity and Fairness of CLEP Exams," College Board, 2005, http://media.collegeboard.com/digitalServices/pdf/clep/validity-fairness-clep-exam.pdf.
38. Carol Barry, "A Comparison of CLEP and Non-CLEP Students with Respect to Postsecondary Outcomes," College Board, 2013, http://media.collegeboard.com/digitalServices/pdf/clep/clep_research_report.pdf.
39. Western Governors University, "WGU Student and Graduate Success," http://texas.wgu.edu/about_wgu_texas/learning_results.
40. Western Governors University, "WGU Is Focused on Student and Graduate Success," www.wgu.edu/about_WGU/graduate_success.
41. Li Gwatkin, Mary P. Hancock, and Harold A. Javitz, "As Well Prepared, and Often Better: Surveying the Work Performance of Excelsior College Associate's Degree in Nursing Graduates," SRI International, November 25, 2009, https://my.excelsior.edu/documents/78666/102207/Work_Performance_of_Excelsior_Associate_Nursing_Graduates.pdf/357ed375-41ce-436a-be9e-73a96a34ec51.
42. McClarty et al., "Evidence-Based Standard Setting."
43. Organization Systems International, "OSI Polaris Competency Continuums."
44. Other CBE programs have developed similarly detailed definitions of performance levels. For the sake of parsimony,
we will not describe each in detail here. For a thorough description of the performance levels for Alverno College's problem-solving competency and Tusculum College's coherence competency, see Rebecca Klein-Collins, "Competency-Based Degree Programs."
45. For an example of this type of study, see Brian F. Patterson and Maureen Ewing, "Validating the Use of AP Exam Scores for College Course Placement," College Board, 2013, http://research.collegeboard.org/sites/default/files/publications/2013/7/researchreport-2013-2-validating-AP-exam-scores-college-course-placement.pdf.
46. Jan Jones-Schenk, "Nursing Education at Western Governors University: A Modern, Disruptive Approach," Journal of Professional Nursing 30, no. 2 (2014): 168–74.
47. For example, a consortium of states developing an Algebra II assessment as part of the American Diploma Project used this approach, and Texas also used this approach in developing standards for its statewide assessment program. See Haertel, Beimers, and Miles, "The Briefing Book Method"; and Texas Education Agency, "STAAR Performance Standards," http://tea.texas.gov/WorkArea/linkit.aspx?LinkIdentifier=id&ItemID=25769804117&libID=25769804117.
48. Patterson and Ewing, "Validating the Use of AP Exam Scores for College Course Placement."
49. College Board, Music Theory Course Description.
50. Deborah Lokai Bischof et al., "Validating AP Modern Foreign Language Exams through College Comparability Studies," Foreign Language Annals 37, no. 4 (2004): 616–22.

Other Papers in This Series

"The Landscape of Competency-Based Education: Enrollments, Demographics, and Affordability," Robert Kelchen

"Employer Perspectives on Competency-Based Education," Chip Franklin and Robert Lytle

About the Authors

Katie McClarty, director of the Center for College & Career Success in Pearson's Research & Innovation Network, leads a team of researchers who plan and execute research in support of the center's mission, which is to identify and measure the skills needed to be successful in college and careers, determine pathways for students to be college and career ready, track their progress along those pathways, and evaluate effective ways to keep students on track. McClarty has authored papers and presentations related to college readiness, standard setting, assessment design, and talent development. Her work has been published in journals such as the American Psychologist, Research in Higher Education, Educational Measurement: Issues and Practice, and Educational Researcher.

Matthew Gaertner is a senior research scientist at the Center for College & Career Success in Pearson's Research & Innovation Network. His methodological interests include multilevel models, categorical data analysis, and item response theory. Substantively, his research focuses on the effects of educational policies and reforms on disadvantaged students' access, persistence, and achievement. Gaertner's work has been published in Harvard Law Review, Harvard Educational Review, Educational Evaluation and Policy Analysis, Research in Higher Education, and Educational Measurement: Issues and Practice. He was awarded a Spencer Foundation Dissertation Fellowship and an Association for Institutional Research Dissertation Grant. He also received the 2013 and 2011 Charles F. Elton Best Paper Awards from the Association for Institutional Research.

Acknowledgments

We would like to express our gratitude to Charla Long at Lipscomb University for providing detailed information about the school's CBE program. We also thank Andrew Kelly and Rooney Columbus of AEI for supporting this project and providing feedback on earlier drafts. All remaining errors are our own.
