Chapter IV
Assessing Creativity Using the
Consensual Assessment
Technique
John Baer
Rider University, USA

Sharon S. McKool
Rider University, USA

Abstract

The Consensual Assessment Technique is a powerful tool used by creativity researchers in which panels
of expert judges are asked to rate the creativity of creative products such as stories, collages, poems, and
other artifacts. Experts in the domain in question serve as judges; thus, for a study of creativity using
stories and poems, a panel of writers and/or teachers of creative writing might judge the creativity of
the stories, and a separate panel of poets and/or poetry critics might judge the creativity of the poems.
The Consensual Assessment Technique is based on the idea that the best measure of the creativity of a
work of art, a theory, a research proposal, or any other artifact is the combined assessment of experts in
that field. Unlike other measures of creativity, such as divergent-thinking tests, the Consensual Assess-
ment Technique is not based on any particular theory of creativity, which means that its validity (which
has been well established empirically) is not dependent upon the validity of any particular theory of
creativity. This chapter explains the Consensual Assessment Technique, discusses how it has been used
in research, and explores ways it might be employed in assessment in higher education.

Copyright 2009, IGI Global, distributing in print or electronic forms without written permission of IGI Global is prohibited.

INTRODUCTION

Assessment of creativity presents a unique challenge in higher education. Although there are tools on the market for assessing creativity, most are designed for young children, and all tend either to lack sufficient validity and reliability or to assess only rather trivial aspects of creativity (or, in many cases, both). If creativity is to be assessed in college settings in a meaningful way, divergent-thinking tests like the Torrance Tests of Creative Thinking and other commonly used creativity tests are inadequate because they fail to meet even the loosest standards of validity. (And unless we are teaching masonry, do we really care how many uses someone can think of for a brick? Sadly, this is the kind of question that most creativity tests are based on.) Self-report measures of creativity and global assessments of students' creativity by others (such as teachers) have also failed to demonstrate sufficient validity to be trusted for most uses (Baer, 1993; Kaufman, Plucker, & Baer, in press). Despite the importance of creativity, its assessment has proven to be extremely difficult.

The Consensual Assessment Technique is a fairly new method of measuring creativity that could open up new avenues for creativity assessment in higher education. First proposed by Teresa Amabile in 1982 and further developed by her and other researchers in the last quarter century (Amabile, 1982, 1983, 1996; Baer, 1993, 1994a, 1994b; Baer, Kaufman, & Gentile, 2004; Hennessey, 1994; Kaufman, Baer, Cole, & Sexton, in press), the Consensual Assessment Technique is now a well validated tool for assessing creativity. It has been called the "gold standard" of creativity assessment (Carson, 2006), but its use has been limited primarily to research settings. It can be used in any field; for example, it can be used for judging the creativity of (a) students' research designs or theories in science, (b) their artistic creations and their musical compositions, or (c) the poems, stories, and essays that they write. It therefore has enormous potential for assessing creativity in higher education settings.

BACKGROUND

Why do you believe that Van Gogh's paintings of sunflowers are creative? On what basis do you judge the special theory of relativity to be highly creative? Why do you think Shakespeare was a more creative dramatist than Marlowe? And how would you judge the creativity of some recent ten- and eleven-dimensional string theories?

You may be comfortable answering some of these questions, but unless you are truly a Renaissance person, it's unlikely that you feel qualified to make a defensible response to all four of them. And even though you might know enough about, say, the works of Shakespeare and Marlowe to give an informed opinion, does your opinion really count as much as the opinions of recognized experts in the field of English literature?

How is creativity judged at the highest levels? Why are some works of art treasured and others forgotten? Why do some theories, compositions, books, and inventions win prizes? These kinds of decisions aren't based on a procedure or rubric that awards points for different attributes of a painting, composition, or theory. There is no test to determine which historians' theories, which biochemists' models, or which screenwriters' movies are the most creative. Nobel Prize committees don't apply rubrics, complete checklists, or score tests. What do they do? They ask experts. The most valid assessment of the creativity of an idea or creation in any field is the collective judgment of recognized experts in that field. And while it's true that experts in different times and places may come to different conclusions (and pity the unfortunate artists and scientists whose genius is only recognized when it is too late for them to enjoy their posthumous fame), at any given time,
the best judgment one can make of the creativity of anyone's ideas, poems, theories, artworks, compositions, or other creations is the overall judgment of experts in their field (note 1).

The Consensual Assessment Technique is based on the rather simple idea that the best measure of the creativity of a work of art, a theory, or any other artifact is the combined assessment of experts in that field. Whether one is selecting a short story for a prestigious award or judging the creativity of a painting in an undergraduate art show, one doesn't compute a creativity score by following some checklist or applying a general creativity-assessment rubric. The most valid judgments of the creativity of such artifacts that can be produced -- imperfect though these may be -- are the combined opinions of experts in the field. That's what most prize committees do (which is why only the opinions of a few experts matter when choosing, say, the winner of the Fields Medal in mathematics -- the opinions of the rest of us just don't count). The Consensual Assessment Technique uses essentially the same procedure to judge the creativity of more everyday creations.

Creativity assessment is made difficult by many things, not the least of which are disagreements about the nature of creativity. One of the most fundamental questions in creativity theory and research is the issue of domain specificity. Are the skills, talents, personality characteristics, ways of thinking, and other determinants of creative performance general-purpose traits that a person possessing them can bring to bear on any kind of task? Can one's creativity as a composer of music help her produce more creative paintings? Can one's creativity as a chef help him write more creative short stories? Is a creative biologist likely also to be rather creative as a teacher, a poet, and a dancer? Or, on the other hand, is creativity quite domain specific, such that whatever leads to creativity in one domain may be different from that which leads to creativity in other domains?

In the only Point-Counterpoint exchange in its history, the Creativity Research Journal asked two leading researchers in the field to make the case for these opposing conceptualizations of creativity (Baer, 1998a; Plucker, 1998). This issue remains unresolved (for recent developments, see Baer & Kaufman, 2005; Kaufman & Baer, 2005a), and because most creativity tests are tied to one or the other of these models (almost all assume domain-generality, which until recent years was the most commonly accepted hypothesis), the validity of creativity assessment is tied to the validity of particular models of creativity (in addition to all the usual issues that validity raises regarding any test).

Unlike just about every other technique for creativity assessment, the Consensual Assessment Technique is not tied to any particular theory of creativity (note 2). It works equally well no matter how the domain generality/specificity issue may one day be resolved (or not resolved; as in many contentious issues, the truth is probably somewhere in between this polarity, and the most likely resolution is perhaps a hierarchical model of some type that includes both domain-general and domain-specific features, such as the theory proposed by Kaufman and Baer (2005b)). The Consensual Assessment Technique is based on actual creative performances or artifacts, and it mimics the way creativity is assessed in the real world. This approach is not without limitations, however. The Consensual Assessment Technique relies on comparisons of levels of creativity within a particular group, and it is therefore not possible to create any kind of standardized scoring using Consensual Assessment Technique ratings that might allow comparisons to be made across settings. Its widest use to date has been in research, but it can also be used for many kinds of assessment in higher education, as will be explained below.

PROCEDURE FOR USING THE CONSENSUAL ASSESSMENT TECHNIQUE

The basic technique is quite simple:

1. Subjects are asked to create something (e.g., a poem, a short story, a collage, a composition, an experimental design).
2. Experts in the domain in question are then asked to evaluate the creativity of the things they have made.

The experts work independently and do not influence one another's judgments in any way. The most common kinds of tasks have been writing poems, creating collages, and writing short stories, but the potential range of creative products that one could use is quite wide. No attempt is made to measure some skill, attribute, or disposition that is theoretically linked to creativity; instead, it is the actual creativity of things that subjects have produced that is assessed. The focus is therefore on creative products, not creativity-relevant talents or attributes that are hypothesized to influence creativity. It is the product or performance itself that is of interest. As Csikszentmihalyi (1999) wrote, "If creativity is to have a useful meaning, it must refer to a process that results in an idea or product that is recognized and adopted by others. Originality, freshness of perception, and divergent-thinking ability are all well and good in their own right, as desirable personal traits. But without some sort of public recognition they do not constitute creativity. . . The underlying assumption [in all creativity tests] is that an objective quality called creativity is revealed in the products, and that judges and raters can recognize it" (p. 314). So instead of trying to measure things that might be associated with creativity or that might be predictive of creativity, the Consensual Assessment Technique goes right to the heart of creativity by looking at the creative (or not-so-creative) products that subjects have produced.

Here's the basic Consensual Assessment Technique procedure: Subjects are given some basic instructions and, where necessary, materials, for creating some kind of product. All subjects are given the same materials and instructions. Then a group of experts, each working independently of one another, assesses the creativity of those creations. In one study, for example, students "were given a line drawing of a girl and a boy . . . [and] asked to write an original story in which the boy and the girl played some part" (Baer, 1994a, p. 39). Experts in the area of children's writing were then asked to rate the creativity of the stories on a 1.0-to-5.0 scale. (The range of the scale is a matter of choice, but should have at least three score points so that there can be some diversity of ratings. Typically judges are free to use fractions if they choose -- e.g., a judge might give a creativity rating of 3.5 -- but in practice, few judges actually employ fractions even when the option exists.) The judges are not asked to explain or defend their ratings in any way, and it is important that no such instructions be given. Judges are simply instructed to use their expert sense of what is creative in the domain in question to rate the creativity of the products in relation to one another. That is, the ratings can be compared only within the pool of artifacts being judged by a particular panel of experts. High or low levels of creativity, as revealed by the Consensual Assessment Technique, refer to differences within the group of artifacts judged, not in comparison to any external standard. Judges are asked to use the full scale (that is, not to rate all the artifacts as 1s or 2s, or all as 4s or 5s). The goal is to get ratings of the comparative creativity of the things being judged. For this reason, a poem that might be judged to be highly creative in one group of rather pedestrian poems might receive a much lower creativity rating if it were included in a group of much more creative poems.
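
As a rough illustration of the procedure just described, the sketch below pools independent ratings from a small panel and orders the artifacts within that pool. The ratings, the judge and story names, and the function names are all invented for illustration. Using a simple mean to combine the judges' ratings is also an assumption; the technique as described here only requires that independent expert ratings be combined.

```python
from statistics import mean

# Hypothetical data: three judges independently rate the same four stories
# on a 1-to-5 scale. All names and numbers are invented for illustration.
ratings = {
    "judge_a": {"story_1": 2, "story_2": 4, "story_3": 5, "story_4": 1},
    "judge_b": {"story_1": 3, "story_2": 4, "story_3": 5, "story_4": 1},
    "judge_c": {"story_1": 2, "story_2": 5, "story_3": 4, "story_4": 2},
}


def pooled_scores(ratings):
    """Average each artifact's ratings across judges (an assumed way of combining them)."""
    artifacts = sorted(next(iter(ratings.values())))
    return {a: mean(judge[a] for judge in ratings.values()) for a in artifacts}


def rank_within_pool(scores):
    """Order artifacts from highest to lowest pooled rating, within this pool only."""
    return sorted(scores, key=scores.get, reverse=True)


scores = pooled_scores(ratings)
ranking = rank_within_pool(scores)
print(ranking)
```

Because the ratings are relative to the pool, the resulting order says nothing about how these stories would fare in a different group of stories or with a different panel.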

VALIDITY AND RELIABILITY

The Consensual Assessment Technique assesses creativity at all levels -- everyday creativity as well as creativity at the highest levels -- in the same way that creativity is assessed at the genius level, by asking experts in that field. This is the standard against which any other judgment of creativity would be measured. Rather than use a test, a rubric, or some other device to approximate the judgments of experts, the Consensual Assessment Technique goes directly to the most valid yardstick, the experts in a given domain. It is of course true that experts don't always agree and expert opinion may change over time, but at any point in time there is no more objective or valid measure of the creativity of a work of art than the collective judgments of artists and art critics, just as there is no more valid measure of the creativity of a scientific theory than the collective opinions of scientists working in that field. And for the more everyday, garden-variety creativity of most creativity research and most creativity assessments in higher education, the fact that fields may experience paradigm shifts over time is of little significance because few if any of the products being judged will be at the cutting edge of a domain.

But do experts agree? Are they of one opinion regarding which poems, collages, theories, etc. are the most and least creative? A very large number of studies have shown that they consistently do agree, and to a remarkable degree (especially when judging everyday, garden-variety creativity), although of course they do not agree completely (which is why a group of experts, working independently, is needed). Inter-rater reliability using the Consensual Assessment Technique is typically measured using Cronbach's coefficient alpha, the Spearman-Brown prediction formula, or the intraclass correlation method. These methods generally yield similar inter-rater reliability estimates. Amabile (1983) described a series of 21 studies of artistic (collage-making) and verbal (poetry-writing and story-telling) creativity. The inter-rater reliabilities ranged from .72 to .93. In her more recent work Amabile (1996) has found a similar range of inter-rater reliability correlations (from .70 to .89), and other researchers have generally reported similar inter-rater reliabilities among expert judges, typically in the .70-to-.90 range (e.g., Baer, 1993, 1997, 1998b; Baer, Kaufman, & Gentile, 2004; Conti, Coon, & Amabile, 1996; Hennessey, 1994; Kaufman et al., in press; Runco, 1989). Just as longer tests generally have better reliability, the greater the number of judges who assess the products independently, the higher the overall inter-rater reliability correlations. The average number of expert judges reported by Amabile (1996) was just over 10, with a low of 2 and a high of 40.

But perhaps these ratings are really judgments of something other than creativity. To find out, Amabile (1982, 1983) had raters judge creativity and also a number of other attributes of the products they were evaluating. For example, working with the artistic creativity task of collage-making, Amabile found that while experts tended to agree in their judgments of creativity, these creativity ratings were not the same as judgments of such attributes as technical goodness (correlation with creativity ratings = .13), neatness (correlation with creativity ratings = -.26), or expression (correlation with creativity ratings = -.05). There were significant positive correlations with many other judgments, such as novel use of materials (correlation with creativity ratings = .81), complexity (correlation with creativity ratings = .76), and aesthetic appeal (correlation with creativity ratings = .43), but these are all aspects of a collage that should be related to the creativity of that collage. A factor analysis of 23 different ratings produced two factors, creativity and technical goodness, and a similar study using poetry-writing produced similar results, with three factors emerging: creativity, style, and technical correctness (Amabile, 1983). So the creativity ratings obtained using the Consensual Assessment Technique have been
shown to have good discriminant validity and to be assessments of creativity, not of unrelated attributes of the artifacts being judged.

Consensual Assessment Technique ratings of stories, collages, poems, and many other artifacts have been shown to be highly valid measures of creativity in their respective domains, but a caution is in order. The Consensual Assessment Technique does not claim to provide evidence of more general creativity-relevant abilities, a topic about which there has been much debate (see, e.g., Amabile, 1983, 1996; Baer, 1993, 1994a, 1996, 1998a; Conti, Coon, & Amabile, 1996; Plucker, 1998; Plucker & Runco, 1998; Runco, 1987). Some have argued that such general creativity-relevant skills simply do not exist, and therefore there is nothing to measure (and any creativity test that purports to measure such a general skill cannot possibly be valid, which is perhaps why it has been so difficult to produce a valid creativity test of that kind).

This is to many people a counter-intuitive idea. Of course creativity (as a general skill or trait) exists, many will protest: we see it all the time. And there are many people who are creative in many areas, and others who seem to show little creativity in any endeavor. But this is exactly what one would expect if creativity were totally domain specific (that is, if creativity in one domain did not predict creativity in other domains). If creativity were totally domain specific, creativity in different domains would be uncorrelated (not negatively correlated). There would therefore be a normal distribution of creativity in each domain, and these abilities would be essentially randomly distributed across domains, with some people evidencing creativity in many areas, most people exhibiting varying levels of creativity across domains, and some people showing very little creativity in any domain (note 3).

If creativity were a general trait or set of skills that could be applied in any field (so that the same creativity-relevant skills could help a person be a more creative dramatist, a more creative chemist, or a more creative accountant), then one could use one's poetry-writing creativity to be a more creative chef. This has been tested using Consensual Assessment Technique ratings of creativity in diverse domains, and these in fact show very little domain generality (that is, correlations of ratings of subjects' creativity in different domains tend to hover near zero, especially if differences attributable to general intelligence are removed; see, e.g., Baer, 1992, 1993, 1998a). Creativity researchers are not in complete agreement on the question of how much domain generality there may be, and the best bet is probably on a hierarchical model of some kind (with some abilities contributing modestly to creativity across domains, others only to creativity within a given domain, and others only on specific tasks within a domain, such as poetry within the larger domain of creative writing; see, e.g., Baer & Kaufman, 2005; Kaufman & Baer, 2005b).

In research assessing the impact of a wide variety of interventions, training, or experimental constraints on creative performance, Consensual Assessment Technique ratings have been shown to work well. The technique is not tied to any one theory of creativity, and because it is uncommitted (and therefore unbiased) regarding most of the big questions in creativity research, it can be used equally well by researchers on either side of most research questions. Consensual Assessment Technique ratings are also generally quite stable across time (Baer, 1994b), but they nonetheless respond well to real within-subject changes in motivation. For example:

1. Amabile (1996) found in a series of studies that experimental conditions that make extrinsic constraints salient (such as offering rewards for completing a task, or leading subjects to expect that their work would be evaluated) lead to generally lower creative performance.
2. Baer (1997, 1998b) discovered that this decrement in creative performance under
conditions of reward or expected evaluation is much more prominent among girls than boys.
3. Baer (1994a) found that increases in skill based on training were very narrowly domain-specific. Subjects trained using divergent-thinking exercises aimed at poetry-relevant skills wrote more creative poems, but not more creative short stories, than subjects who had not received such training.

This has made the Consensual Assessment Technique useful in assessing the impact of varying constraints on creative performance.

GENDER, RACE, ETHNICITY AND THE CONSENSUAL ASSESSMENT TECHNIQUE

Most intelligence, aptitude, and achievement tests report different mean scores for different races, ethnicities, and sometimes genders. The validity of such assessments has been fiercely debated (see, e.g., Gould, 1981; Halpern, 2000; Herrnstein & Murray, 1994; Jacoby & Glauberman, 1995; Pinker & Spelke, 2005), and we won't enter that contentious arena. Consensual Assessment Technique scores, in contrast, show very little evidence of differences based on race/ethnicity. Kaufman, Baer, and Gentile (2004) conducted the largest study of this type. They performed three separate analyses of the creativity ratings of 103 poems, 104 fictional stories, and 103 personal narratives written by Caucasian, African American, Latino/a, and Asian eighth-grade students as a part of a study using student work collected by the National Assessment of Educational Progress. Each poem, story, and narrative was rated for creativity by 10 experts in those areas. There were no significant African American-Caucasian differences, and no gender differences (note 4), on any of the writing tasks. The only significant difference on any of the tasks was in poetry, where there were small but statistically significant differences between the Latino/a-Caucasian groups and Latino/a-Asian groups.

THE RANGE OF CONSENSUAL ASSESSMENT TECHNIQUE APPLICATIONS

The Consensual Assessment Technique has been used in many ways:

1. to compare creative performance under different (intrinsic v. extrinsic) motivational constraints (e.g., Amabile, 1983, 1996);
2. to measure the impact of teaching different skills and content knowledge on creative performance (e.g., Baer, 1993, 2003);
3. to study how varying motivational constraints influence the creativity of boys and girls differently (e.g., Baer, 1997, 1998b);
4. to look for possible gender and ethnicity differences in creativity (e.g., Kaufman, Baer, & Gentile, 2004);
5. to compare and evaluate domain-general and domain-specific models of creativity (e.g., Baer, 1993; Conti, Coon, & Amabile, 1996; Runco, 1987; Ruscio, Whitney, & Amabile, 1998);
6. to study the relationship between process and product in creativity (e.g., Hennessey, 1994);
7. to look at creativity in cross-cultural settings (e.g., Niu, in press; Niu & Sternberg, 2001);
8. to investigate the long-term stability of creativity in a given domain (e.g., Baer, 1994b); and
9. to analyze ways that people with different levels of expertise in a domain conceptualize creativity differently (e.g., Kaufman et al., in press; Kaufman, Gentile, & Baer, 2005).
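
Most of the applications listed above depend on first confirming that the judges agree with one another. The sketch below illustrates two of the statistics named in the VALIDITY AND RELIABILITY section: Cronbach's coefficient alpha, computed here with each judge treated as an "item," and the Spearman-Brown prediction formula, which projects how reliability changes as judges are added. The ratings are invented, the function names are hypothetical, and the use of sample variances is an assumption; none of this is taken from the studies cited in this chapter.

```python
from statistics import variance


def cronbach_alpha(ratings_by_judge):
    """Coefficient alpha with judges as 'items'.

    ratings_by_judge: one list of ratings per judge, with the artifacts
    in the same order in every list.
    """
    k = len(ratings_by_judge)
    # Summed rating each artifact received across all judges.
    totals = [sum(per_artifact) for per_artifact in zip(*ratings_by_judge)]
    sum_of_judge_variances = sum(variance(judge) for judge in ratings_by_judge)
    return k / (k - 1) * (1 - sum_of_judge_variances / variance(totals))


def spearman_brown(single_judge_reliability, n_judges):
    """Projected reliability of the pooled ratings of n_judges judges."""
    r = single_judge_reliability
    return n_judges * r / (1 + (n_judges - 1) * r)


# Hypothetical panel: three judges, five artifacts, 1-to-5 scale.
panel = [
    [1, 2, 3, 4, 5],
    [2, 2, 3, 5, 4],
    [1, 3, 3, 4, 5],
]

alpha = cronbach_alpha(panel)
projected_3 = spearman_brown(0.5, 3)
projected_10 = spearman_brown(0.5, 10)
print(round(alpha, 3), projected_3, round(projected_10, 3))
```

With an assumed single-judge reliability of .50, the Spearman-Brown projection for three judges is .75, and it rises further with ten judges, which is consistent with the observation that a larger panel of independent judges yields higher overall inter-rater reliability.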

The Consensual Assessment Technique has also been used to judge the creativity of such diverse tasks as dramatic performance (Myford, 1989), musical compositions (Hickey, 2001), mathematical equations created by children and adolescents (Baer, 1993), captions written to pictures (Sternberg & Lubart, 1995), personal narratives (Baer, Kaufman, & Gentile, 2002), and mathematical word problems (Baer, 1993).

The standard format for the Consensual Assessment Technique is to have experts judge the creativity of products that have been created under identical conditions (with all subjects receiving the same instructions and time limits), but recent research has shown that the Consensual Assessment Technique also works when the things to be judged have been created under different conditions (Baer, Kaufman, & Gentile, 2004). This makes possible such uses as comparing how different prompts or assignments impact creative performance differently.

USING THE CONSENSUAL ASSESSMENT TECHNIQUE IN HIGHER EDUCATION

The Consensual Assessment Technique is not limited to use in fields most commonly associated with creativity, such as the arts and sciences. As Emerson (1837/1998) reminded us, "There are creative manners, there are creative actions, and creative words; manners, actions, words, that is, indicative of no custom or authority, but springing spontaneous from the mind's own sense of good and fair" (p. 4; and four paragraphs later he adds "creative reading as well as creative writing" to the list). One might use the Consensual Assessment Technique to judge the creativity of just about anything in which one finds imaginative or original work, such as wedding cakes, cartoons, or even the graffiti found on the walls of buildings.

To date, the Consensual Assessment Technique has not been widely used in higher education, except as a research tool. Although its primary use has been in research, it has also sometimes been used in elementary and secondary education to judge student creativity in a particular area (or several areas) for such purposes as admission to a program for gifted and talented students.

Here are a few arenas in which the Consensual Assessment Technique could be used in higher education:

1. Research on the effectiveness of college majors or programs. Colleges want to know how well they are succeeding in their various missions (an interest accreditation boards share). Nurturing student creativity is a goal of some college programs, and in those areas the Consensual Assessment Technique could be helpful. For example, in a program in which students produce a portfolio of creative work, samples of students' creations from different years in a program could be taken. A group of experts in that field could be asked to rate the creativity of the various creations (not knowing which students produced which work, or in what academic year the work was produced, of course). If the creativity ratings are higher the longer students are in a program -- a very easily computed statistic -- that is very strong evidence that the program is successfully nurturing student creativity. (One could also ask the expert judges to rate the artifacts on other dimensions as well as creativity, of course.)
2. Selection for admission to competitive programs. Colleges have long been using an informal Consensual Assessment Technique for selecting students for programs in creative writing, music, art, theater, and other areas. Validation of the Consensual Assessment Technique supports such selection techniques and can help guide their use. We know that it is important to use multiple judges; for the judges to make their creativity ratings independently; and for the judges to do what are in effect blind reviews -- that is, they should not know anything about the candidate other than the work being judged. (This is why selections of musicians these days are now often done with the candidate playing from behind a screen, so that other student characteristics -- such as appearance, gender, race, etc. -- cannot be factors in the judges' decisions.)
3. Evaluations of students in regular courses. In many courses creativity is one aspect of students' work that is to be evaluated, and in such cases it is often the most difficult evaluation professors need to make. Professors might find it helpful to ask colleagues who do not know the students to make independent judgments of the creativity of students' work. This is a bit tricky because Consensual Assessment Technique ratings are always, in effect, norm-referenced ratings based on comparisons within the group of creations being judged. As such, a moderately creative work that is part of a group of very uncreative works will earn top ratings, but the same work would receive low ratings in a group of very creative works. Because some classes have higher levels of creativity than others, this could lead to unfair grading-on-the-curve kinds of assessments.

To get around this and to make the creativity ratings more criterion-referenced, one can do what testing companies like the Educational Testing Service do to make sure different versions of tests are of equal difficulty, and what holistic rating systems do to make sure that multiple raters are using the same standards. One needs to include in one's sample of work some items whose creativity has been previously assessed and for which one has a creativity rating that one trusts. Including a handful of such items that one knows show varying levels of creativity allows one to make adjustments for the varying creativity of the works being judged. Rather than base one's ratings on how well the students in the class perform in comparison to each other, one can use these extra, previously vetted works as one's standards. If a student's work receives a creativity rating equal to a work that one knows to be highly creative, then that is the score one would use, not how well it did in comparison to others in the class. (Of course, norm-referenced and criterion-referenced scores typically line up rather closely, but this technique avoids the danger of mis-judging a student's creativity because of the varying creativity of the group of students who happen to be in her class.)

One can also use the Consensual Assessment Technique to compare the work of students at the beginning and the end of a course, as discussed above in the "Research on the effectiveness of college majors or programs" section, if students will have produced several different works during the semester.

4. Selecting winners of prizes, fellowships, and other honors. Many colleges already use a procedure similar to that used by major prize committees to select winners of competitions -- that is, by having experts in the domain in question judge submissions. Following the procedures of the Consensual Assessment Technique ensures that this process is conducted in a fair and well validated manner. As noted above under "Selection for admission to competitive programs," it is important to use multiple judges, for the judges to make their creativity ratings independently, and for the judges to make their judgments without knowing whose work is whose among the artifacts being judged. In competitions such as these, in which some of the judges may know some
9
Assessing Creativity Using the Consensual Assessment Technique

of the candidates, blind review is especially important.

CONCLUSIONS AND RECOMMENDATIONS

The Consensual Assessment Technique is a powerful tool for assessing creativity. It has been
well validated and is used widely in creativity research. Unlike most tests of creativity, the
Consensual Assessment Technique does not measure skills or traits that are hypothesized to be
part of creative thinking or performance. The Consensual Assessment Technique assesses actual
creative performance.

The Consensual Assessment Technique has many potential applications in higher education
assessment, but it is not without limitations and drawbacks. It is very resource intensive:
assembling groups of expert judges is not simple and it may be expensive. And one cannot
replace expert judges with novices (such as by having students judge one another's work)
unless the students themselves have a high level of expertise. While gifted and highly
creative students have been shown to rate creativity in ways very similar to experts, college
students in general do not (Kaufman, Baer, Cole, & Sexton, in press; Kaufman, Gentile, & Baer,
2005).

The Consensual Assessment Technique is not linked to any particular theory of creativity, and
its validity does not rise or fall with the success or failure of any theory. It has also been
shown to be free of gender and race/ethnicity biases. It has great potential for creativity
assessment in many areas of higher education.

REFERENCES

Amabile, T. M. (1982). Social psychology of creativity: A consensual assessment technique.
Journal of Personality and Social Psychology, 43, 997-1013.

Amabile, T. M. (1983). The social psychology of creativity. New York: Springer-Verlag.

Amabile, T. M. (1996). Creativity in context: Update to the social psychology of creativity.
Boulder, CO: Westview.

Baer, J. (1993). Creativity and divergent thinking: A task-specific approach. Hillsdale, NJ:
Lawrence Erlbaum Associates.

Baer, J. (1994a). Divergent thinking is not a general trait: A multi-domain training
experiment. Creativity Research Journal, 7, 35-46.

Baer, J. (1994b). Performance assessments of creativity: Do they have long-term stability?
Roeper Review, 7(1), 7-11.

Baer, J. (1996). The effects of task-specific divergent-thinking training. Journal of Creative
Behavior, 30, 183-187.

Baer, J. (1997). Gender differences in the effects of anticipated evaluation on creativity.
Creativity Research Journal, 10, 25-31.

Baer, J. (1998a). The case for domain specificity in creativity. Creativity Research Journal,
11, 173-177.

Baer, J. (1998b). Gender differences in the effects of extrinsic motivation on creativity.
Journal of Creative Behavior, 32, 18-37.

Baer, J. (2003). Impact of the Core Knowledge Curriculum on creativity. Creativity Research
Journal, 15, 297-300.

Baer, J. (2005). Gender and Creativity. Paper presented at the annual meeting of the American
Psychological Association, Washington, DC, August 18-21, 2005.

Baer, J., & Kaufman, J. C. (in press). Gender differences in creativity. Journal of Creative
Behavior.

Baer, J., & Kaufman, J. C. (2005). Bridging generality and specificity: The Amusement Park
Theoretical (APT) Model of Creativity. Roeper Review, 27, 158-163.

Baer, J., Kaufman, J. C., & Gentile, C. A. (2004). Extension of the Consensual Assessment
Technique to nonparallel creative products. Creativity Research Journal, 16, 113-117.

Carson, S. (2006). Creativity and Mental Illness. Invitational Panel Discussion Hosted by
Yale's Mind Matters Consortium, New Haven, CT, April 19, 2006.

Conti, R., Coon, H., & Amabile, T. M. (1996). Evidence to support the componential model of
creativity: Secondary analyses of three studies. Creativity Research Journal, 9, 385-389.

Csikszentmihalyi, M. (1999). Implications of a systems perspective for the study of
creativity. In R. J. Sternberg (Ed.), Handbook of creativity (pp. 313-335). Cambridge:
Cambridge University Press.

Emerson, R. W. (1837/1998). The American Scholar. An oration delivered before the Phi Beta
Kappa Society, at Cambridge, August 31, 1837, published in Nature; Addresses and Lectures, and
retrieved December 14, 2007, from
http://rwe.org/works/Nature_addresses_1_The_American_Scholar.htm

Gardner, H. (1983). Frames of mind: The theory of multiple intelligences. New York: Basic
Books.

Gould, S. J. (1981). The mismeasure of man. New York: W. W. Norton.

Halpern, D. F. (2000). Sex differences in cognitive abilities (3rd ed.). Hillsdale, NJ:
Erlbaum.

Hennessey, B. A. (1994). The Consensual Assessment Technique: An examination of the
relationship between ratings of product and process creativity. Creativity Research Journal,
7, 193-208.

Herrnstein, R. J., & Murray, C. (1994). The bell curve. New York: The Free Press.

Hickey, M. (2001). An application of Amabile's Consensual Assessment Technique for rating the
creativity of children's musical compositions. Journal of Research in Music Education, 49,
234-244.

Jacoby, R., & Glauberman, N. (1995). The bell curve debate. New York: Times Books.

Kaufman, J. C., & Baer, J. (Eds.). (2005a). Creativity across domains: Faces of the muse.
Hillsdale, NJ: Lawrence Erlbaum Associates.

Kaufman, J. C., & Baer, J. (2005b). The amusement park theory of creativity. In J. C. Kaufman
& J. Baer (Eds.), Creativity across domains: Faces of the muse (pp. 321-328). Hillsdale, NJ:
Lawrence Erlbaum Associates.

Kaufman, J. C., Baer, J., Cole, J. C., & Sexton, J. D. (in press). A comparison of expert and
nonexpert raters using the Consensual Assessment Technique. Creativity Research Journal.

Kaufman, J. C., Baer, J., & Gentile, C. A. (2004). Differences in gender and ethnicity as
measured by ratings of three writing tasks. Journal of Creative Behavior, 39, 56-69.

Kaufman, J. C., Gentile, C. A., & Baer, J. (2005). Do gifted student writers and creative
writing experts rate creativity the same way? Gifted Child Quarterly, 49, 260-265.

Kaufman, J. C., Plucker, J. A., & Baer, J. (in press). Essentials of creativity assessment.
New York: Wiley.

Myford, C. M. (1989). The nature of expertise in aesthetic judgment: Beyond inter-judge
agreement. Unpublished doctoral dissertation, University of Georgia.

Niu, W. (in press). Individual and environmental influence of Chinese creativity. Journal of
Creative Behavior.

Niu, W., & Sternberg, R. J. (2001). Cultural influence of artistic creativity and its
evaluation. International Journal of Psychology, 36(4), 225-241.

Pinker, S., & Spelke, E. (April 22, 2005). The science of gender and science: Pinker vs.
Spelke: A debate sponsored by Harvard's Mind Brain and Behavior Inter-Faculty Initiative.
Retrieved May 11, 2006, from the Edge Foundation Web site:
http://www.edge.org/3rd_culture/debate05/debate05_index.html

Plucker, J. A. (1998). Beware of simple conclusions: The case for the content generality of
creativity. Creativity Research Journal, 11, 179-182.

Plucker, J., & Runco, M. (1998). The death of creativity measurement has been greatly
exaggerated: Current issues, recent advances, and future directions in creativity assessment.
Roeper Review, 21, 36-39.

Runco, M. A. (1987). The generality of creative performance in gifted and nongifted children.
Gifted Child Quarterly, 31, 121-125.

Runco, M. A. (1989). The creativity of children's art. Child Study Journal, 19, 177-190.

Ruscio, J., Whitney, D. M., & Amabile, T. M. (1998). Looking inside the fishbowl of
creativity: Verbal and behavioral predictors of creative performance. Creativity Research
Journal, 11, 243-263.

Sternberg, R. J., & Lubart, T. I. (1995). Defying the crowd. New York: Free Press.

KEY TERMS

Consensual Assessment Technique: a method for assessing creativity in which panels of expert
judges are asked to rate the creativity of creative products such as stories, collages, poems,
and other artifacts.

Creativity: anything someone does in a way that is original to the creator and that is
appropriate to the purpose or goal of the creator.

Divergent Thinking: a kind of thinking that produces a variety of unusual and often original
ideas in response to an open-ended question.

Domain Generality: a theory of creativity that assumes that the skills or traits that underlie
creative performance are essentially the same in all domains.

Domain Specificity: a theory of creativity that argues that the skills or traits that underlie
creative performance vary from domain to domain.

Reliability: the degree to which scores on a test are consistent -- that is, the degree to
which test scores do not vary from day to day or depend on who is scoring the test.

Validity: how well a test measures what it is supposed to measure (and that it is not instead
measuring other, unrelated variables).

Endnotes

1. Even within a given field, different experts might be more appropriate for judging
different kinds of works. For example, Pulitzer Prize committees might not be ideal judges of
the creativity of compositions by 12-year-old writers; it might be better in that case to have
writers and critics who also have familiarity with writings by students of that age serve as
judges. Similarly, one might find judgments of the Academy of Motion Picture Arts and Sciences
or the Directors Guild useful for judging the creativity of a film, but for judging a film's
likely commercial success (or its entrepreneurial film-making creativity) one might instead
consult the People's Choice Awards.

2. Tests of divergent thinking -- the most commonly used tools for measuring creativity -- are
examples of a kind of creativity test that is
anchored to a particular theory of creativity. Divergent thinking tests ask test-takers to do
things like list as many uses for empty tin cans as they can in a short period of time. The
theory behind these tests claims that (a) this kind of thinking is important in creativity and
(b) the particular content or domain from which the exercise is drawn does not matter. If this
kind of divergent thinking is an important component of creativity, and if it doesn't matter
what domain one uses to test it, then divergent thinking tests might indeed be valid measures
of creativity. But if either the divergent thinking theory is wrong or the domain generality
theory of creativity is wrong, then these tests cannot be valid ways to assess creativity. In
contrast, the validity of the Consensual Assessment Technique is not dependent on the validity
of any theory of creativity. It is equally valid no matter which creativity theories prove to
be most useful or widely accepted, and because it is not linked to any theory, it can also be
used to compare and evaluate theories.

3. This argument is parallel to that made by Gardner's (1983) Theory of Multiple
Intelligences. Gardner argues that his intelligences are orthogonal, and therefore one should
expect essentially zero correlations between any two intelligences. That does not mean that
there will not be some people who have a great deal of all eight intelligences, however (or
some who might score low on all eight). It simply means that the intelligences are randomly
distributed, and one's level of intelligence in one area does not in any way predict one's
levels of intelligence in any other areas. Creativity, it has been argued, shows even more
domain specificity than Gardner's eight intelligences (Baer, 1993).

4. This is in line with hundreds of studies of creativity using a variety of assessment
techniques. Gender differences in such studies tend to be the exception, not the rule (Baer,
2005; Baer & Kaufman, 2005).
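
The norm-referencing problem raised in the discussion of course grading, and the idea of including previously assessed items in each pool of work, can be illustrated with a small numerical sketch. Everything in it is an assumption made for illustration: the ratings are invented, the names (`consensus`, `relative_to_anchors`, `anchor_a`, `anchor_b`) are hypothetical, and expressing each score as a difference from the anchors' mean is only one simple way to use anchor items, not a procedure prescribed by the Consensual Assessment Technique literature.

```python
from statistics import mean


def consensus(ratings_by_judge):
    """Average the independent ratings each work received across judges."""
    works = ratings_by_judge[0].keys()
    return {work: mean(judge[work] for judge in ratings_by_judge) for work in works}


def relative_to_anchors(scores, anchors):
    """Re-express each non-anchor score as a difference from the anchors' mean score."""
    baseline = mean(scores[a] for a in anchors)
    return {w: s - baseline for w, s in scores.items() if w not in anchors}


# Hypothetical anchor works whose creativity was assessed in an earlier session.
ANCHORS = ("anchor_a", "anchor_b")

# Invented ratings (1-5 scale): one dict per judge, each judge rating independently.
# The same two anchors are mixed into both classes' pools of student work.
weak_class = [
    {"anchor_a": 5, "anchor_b": 4, "w1": 3, "w2": 2},
    {"anchor_a": 5, "anchor_b": 3, "w1": 3, "w2": 1},
    {"anchor_a": 5, "anchor_b": 5, "w1": 3, "w2": 3},
]
strong_class = [
    {"anchor_a": 3, "anchor_b": 2, "s1": 5, "s2": 4},
    {"anchor_a": 3, "anchor_b": 1, "s1": 5, "s2": 4},
    {"anchor_a": 3, "anchor_b": 3, "s1": 5, "s2": 4},
]

weak_scores = consensus(weak_class)
strong_scores = consensus(strong_class)
weak_relative = relative_to_anchors(weak_scores, ANCHORS)
strong_relative = relative_to_anchors(strong_scores, ANCHORS)

print("weak class, relative to anchors:", weak_relative)
print("strong class, relative to anchors:", strong_relative)
```

In this invented data, `w1` is the most creative student work in its class and `s2` the least creative in its class, so a purely within-class comparison would favor `w1`. Measured against the shared anchors, however, `w1` falls below the anchors and `s2` above them, which is the kind of cross-class comparison that within-group ratings alone cannot support.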
