
Intrinsic Motivation and Extrinsic Rewards: A Commentary on Cameron and Pierce's Meta-Analysis
Author(s): Mark R. Lepper, Mark Keavney, Michael Drake
Source: Review of Educational Research, Vol. 66, No. 1 (Spring, 1996), pp. 5-32
Published by: American Educational Research Association
Stable URL: http://www.jstor.org/stable/1170723

Review of Educational Research, Spring 1996, Vol. 66, No. 1, pp. 5-32

Intrinsic Motivation and Extrinsic Rewards: A Commentary on Cameron and Pierce's Meta-Analysis
Mark R. Lepper, Mark Keavney, and Michael Drake Stanford University
This article provides a critical analysis of Cameron and Pierce's (1994) meta-analytic review of the experimental literature on the effects of extrinsic rewards on intrinsic motivation. It is suggested that Cameron and Pierce's overly simplistic conclusion has little theoretical or practical value and is instead the direct consequence of their systematic and consistent misuse of meta-analytic procedures. A more nuanced analysis of the several different processes by which extrinsic rewards may affect motivation is also offered.

In an article published in a recent issue of this journal, Cameron and Pierce (1994) presented a meta-analytic review of the literature concerning the effects of reinforcement or reward on intrinsic motivation. In this review, the authors repeatedly emphasized a simple "main-effect" conclusion that "our overall findings suggest that there is no detrimental effect [of extrinsic rewards] on intrinsic motivation" (p. 394). Such a conclusion directly contradicts dozens of narrative reviews of this same literature (e.g., Bates, 1979; Deci & Ryan, 1985, 1987; Kohn, 1993; Lepper, 1988; Lepper & Hodell, 1989; Morgan, 1984; Quattrone, 1985), as well as other recent but more theoretically nuanced meta-analytic reviews (e.g., Rummel & Feinberg, 1988; Tang & Hall, 1995). As a result, both their findings and their methods warrant careful attention.

In the present commentary, we examine the procedures used by Cameron and Pierce to arrive at their sweeping and anomalous conclusions. We contend that the results of this particular review are a direct consequence of the authors' systematic use of inappropriate analyses and unwarranted inferences, strategies of analysis and inference that virtually guarantee a finding of small or nonexistent effects. Then, to provide a more balanced framework for future discussions of this literature, we offer a more detailed conceptual analysis of the various processes by which extrinsic rewards have been shown to affect subsequent behavior.

Preparation of this article was supported, in part, by Research Grants HD-25258 from the National Institute of Child Health and Human Development and MH-44321 from the National Institute of Mental Health. Requests for reprints may be sent to the first author.

How to "Verify" the Null Hypothesis

We should note that our criticism of Cameron and Pierce's methods is not a criticism of meta-analytic techniques in general. Certainly it would be possible to conduct a meta-analysis of this literature that avoids Cameron and Pierce's emphasis on a simplistic conclusion, their problematic categorization of findings, and their selective interpretation of results. Indeed, Tang and Hall (1995) have done so and have been led to conclusions very different from those of Cameron and Pierce. Conversely, were Cameron and Pierce's strategies employed in the context of a traditional narrative review of this literature, we would find them equally objectionable. We do believe, however, that such a narrative review would never have been published, for without the veneer of objectivity associated with meta-analysis and the added credence that seemingly accompanies its use, the flaws in these procedures would be clearly recognizable. Let us examine these procedures, and their problems, in detail.
Step 1: Start With the Answer

To begin with, Cameron and Pierce clearly have an ax to grind. They are disturbed that the literature on intrinsic motivation suggests that there are conditions under which extrinsic rewards may have detrimental effects on intrinsic interest. "Reinforcement theory," they tell us at the outset, "has had a significant impact on education"; its principles are "often used to promote learning and to motivate students" (p. 363). Yet somehow "several researchers [italics added] have presented evidence and argued that incentive systems based on reinforcement may have detrimental effects" (p. 363). Worse yet, these findings have been publicized, such that "one commonly finds general statements condemning reinforcement and/or reward" (p. 395), which leads "teachers to resist implementing incentive systems in the classroom" (p. 397). As Kohn (1996) notes in this issue of Review of Educational Research, from the minute we hear about those "several researchers" who distinguish between intrinsic and extrinsic motivation, "it is not difficult to predict what conclusion [Cameron and Pierce] will reach regarding the effects of rewards on the former."1

Now let us be absolutely clear on this point: We do not object to Cameron and Pierce's having strong views on this subject. We have strong views, too. In fact, we believe that the authors of many, if not most, literature reviews in psychology have strong prior convictions (e.g., Mahoney, 1976).2 Indeed, the prevalence of such biases was one of the main problems that meta-analytic procedures were intended to address (Cooper, 1989; Hedges & Olkin, 1985; Rosenthal, 1984). Rather, our point is that Cameron and Pierce are neither less, nor necessarily more, biased than the authors of the many reviews that have come to the opposite conclusion. We mention this point at the outset only because it seems important that readers unfamiliar with the long history of conflicting paradigms and strident controversy in this area recognize that the use of meta-analytic techniques does not guarantee objectivity.
Step 2: Select an Appropriate Straw Man

When one knows what conclusion one would like to draw, the next step is to identify a relevant straw man against which to inveigh. Thus, Cameron and Pierce state the problem as follows: "Overall, what is the effect of reward [or, slightly later, reinforcement] on intrinsic motivation?" (p. 373). At first glance this might seem like a sensible question, especially to readers unfamiliar with this particular literature, but in this instance appearances are deceiving. To ask about the "overall" or "in general" effects of rewards or reinforcers is to pose a fundamentally meaningless question. This is true for several reasons.

First, it has been known for over 20 years (indeed, since the earliest experimental studies cited by Cameron and Pierce) that rewards have demonstrably varied effects on intrinsic motivation as a function of differences in the nature of the activities and rewards, the manner in which rewards are administered, and the situation surrounding their administration. In Deci's (1971) very first paper, he predicted and found that whereas tangible rewards may decrease intrinsic motivation, verbal rewards may enhance intrinsic motivation. In a second article, Deci (1972) similarly suggested and provided evidence that noncontingent rewards produce little change in intrinsic interest. Likewise, Lepper, Greene, and Nisbett (1973) predicted and showed that whereas expected tangible rewards may undermine intrinsic motivation, identical rewards delivered unexpectedly do not.

Given that these interactions were demonstrated over two decades ago, to ask a question now about the "main effect" of reward is sorely misguided. The relevant question is clearly under what conditions (e.g., with what types of rewards, contingencies, subjects, activities, contexts, etc.) extrinsic rewards are likely to have positive, negative, or no effects on intrinsic motivation (e.g., Cronbach & Snow, 1977; Deci & Ryan, 1985; Lepper & Greene, 1978a, 1978b).

Notice that if one were aware of only these three earliest studies (Deci, 1971, 1972; Lepper et al., 1973) and were to extrapolate from their findings (as if these initial results were the whole truth and nothing but), one would be able to reconstruct virtually all of the results involving the free time measures used by these investigators that Cameron and Pierce report in their final summary in Figure 3 (p. 392). Thus, verbal rewards show a positive effect on intrinsic motivation (dw = +.38), whereas tangible rewards (overall) show a negative effect (dw = -.21). Among tangible rewards, it is only expected (dw = -.25) and not unexpected rewards (ns) that produce these negative effects; among expected tangible rewards, those contingent upon task engagement or correct solutions show significant negative effects (dw = -.23), whereas those not contingent on either success or engagement do not (ns). Indeed, the only findings from Cameron and Pierce's analysis of free time measures that are not already apparent in these three initial experiments are the findings concerning performance-contingent rewards.3

One might think that such findings would be cause for praise, because the models that produced these earliest findings have stood the test of time so well. Indeed, both of the other recent meta-analyses (Rummel & Feinberg, 1988; Tang & Hall, 1995), as well as numerous previous narrative reviews, have reached exactly this conclusion. However, Cameron and Pierce, by pretending that the important question is the "overall" or "general" effects of rewards on intrinsic motivation, reach exactly the opposite conclusion. A literature that, by their own analyses, provides clear evidence of differential effects of different types of rewards is now taken as evidence against, rather than for, the theories that predicted those effects!

Cameron and Pierce appear to argue that their focus on this general question is somehow justified by the fact that there are some authors who have overstated the
conclusions from this literature. This is simply a red herring. No doubt there are authors who have read this literature and have jumped to conclusions far more general than are warranted, but that does not justify jumping to equally extreme and unwarranted conclusions in the opposite direction.

In addition, most authors in this area have sought to point out the boundary conditions of the phenomena they were investigating. Lepper, Greene, and Nisbett (1973), for example, described at considerable length the expected limitations in the phenomena they had demonstrated and offered explicit caveats regarding generalizations from their study:

At the same time, because the implications of this point of view for social control and socialization are potentially so great, it is important to point immediately to the hazards of overgeneralization from the present experiment. Certainly there is nothing in the present line of reasoning or the present data to suggest that contracting to engage in an activity for an extrinsic reward will always, or even usually, result in a decrement in intrinsic interest in the activity. . . . There is considerable evidence from token economy programs (Fargo, Behrs, & Nolen, 1970; O'Leary & Drabman, 1971) supporting the proposition that extrinsic incentives may often be effectively used to increase interest in certain broad classes of activities. On the present line of reasoning, this proposition should be particularly true when (a) the level of initial interest in the activity is very low and some extrinsic device is essential for producing involvement with the activity; or (b) the activity is one whose attractiveness becomes apparent only through engaging in it for a long time or only after some minimal level of mastery has been attained. In fact, such conditions characterize the prototypical token-economy program, in that tangible extrinsic rewards are necessary to elicit the desired behavior. Hence, it would be a mistaken overgeneralization from the present study to proscribe broadly the use of token-economy programs to modify children's behavior. (p. 136)

In our view, then, Cameron and Pierce's hypothesis concerning the absence of "general" or "overall" reward effects is clearly a straw man. We would expect it to appear to be true, in their meta-analytic sense, even if the theories that led to this literature were completely accurate. It glosses over critical interactions and qualifications that have been recognized and studied for over 20 years.4

Step 3: Average Across Demonstrably Competing Effects

As if this were not enough, Cameron and Pierce go even further by averaging across competing effects and across experimental and control groups within individual studies, thereby eliminating from their analysis the additional variables those studies investigated and further guaranteeing that the "general" or "overall" effect size will be small or nonexistent.

Consider, for example, the pattern of data illustrated in Figure 1: a full or crossover interaction between Variables A and B. In this hypothetical example, let us assume that the two "competing effects" of Variable A as a function of the level of Variable B (namely, a decrease from A1 to A2 at Level B1, but an increase from A1 to A2 at Level B2) both proved individually statistically significant, as would the overall interaction term in such a case. How might such a pattern of results be reasonably summarized?

[Figure 1 plot: lines for the levels of Factor B (label B2 legible) across A1 and A2 on the horizontal axis]

FIGURE 1. A generic illustration of a full or crossover interaction, in which variations in Factor A produce opposite effects as a function of variations in Factor B
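The pattern in Figure 1 can be made concrete with a small numeric sketch. The cell means below are hypothetical values chosen only to mimic a full crossover; they are not taken from any study discussed here, and the variable names are illustrative.

```python
# Hypothetical cell means for a 2 x 2 crossover design (illustrative values only,
# not data from any study discussed in the text).
cell_means = {
    ("A1", "B1"): 6.0, ("A2", "B1"): 4.0,  # moving from A1 to A2 lowers the outcome at B1
    ("A1", "B2"): 4.0, ("A2", "B2"): 6.0,  # moving from A1 to A2 raises the outcome at B2
}

def simple_effect(b_level):
    """Effect of A (A2 minus A1) at one level of B."""
    return cell_means[("A2", b_level)] - cell_means[("A1", b_level)]

effect_at_b1 = simple_effect("B1")
effect_at_b2 = simple_effect("B2")

# "Overall" effect of A after collapsing across B: the average of the simple effects.
main_effect_of_a = (effect_at_b1 + effect_at_b2) / 2

# Interaction contrast: how much the effect of A differs between B2 and B1.
interaction = effect_at_b2 - effect_at_b1

print(effect_at_b1, effect_at_b2, main_effect_of_a, interaction)
```

In this sketch the two simple effects are equal and opposite, so the collapsed "overall" effect of A is exactly zero even though A matters at both levels of B.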

In our view, there is little merit in summarizing such a study merely as showing no "general" or "overall" effect of Variable A, after collapsing across Variable B. Such a conclusion not only ignores the main point of such a study, namely, that Treatment A will produce completely opposite effects at different levels of B; it actually lessens our understanding of the effects of A and will lead to inappropriate conclusions about the application of Treatment A in real-world settings.

Suppose that A1 and A2 represent the presence and absence of some drug and that B1 and B2 represent two different populations to which that drug had been applied, such as males and females, or children with and without some specific cognitive or social deficit. Or suppose that B1 and B2 are two different dosages of the same drug within a single subject population. If the results in Figure 1 were obtained in either of these sorts of studies, would one want to conclude that "overall" there is no effect of this drug and hence no reason to consider further, or to be concerned about, its use? If tests of a new medical treatment yielding such results were reported in this way, the authors would be laughed out of the field.

Alternatively, suppose that one examines the effects of an instructional intervention and finds that it produces opposite effects for students of differing aptitudes, proving a significant help for low-ability learners but a significant impediment for high-ability learners (or vice versa). Or perhaps this treatment works wonderfully for boys but has negative effects for girls. Or maybe it is beneficial when introduced with sufficient prior teacher education but detrimental when introduced without sufficient prior instruction for teachers. Once again, if someone were to summarize such studies as showing no "general" effect, or if someone were to draw the conclusion that there is no reason to consider or worry
about the use of such an intervention in the classroom, that summary or that conclusion would be completely misleading. Indeed, were one to apply this reasoning to the entire vast literature on aptitude-treatment interactions (e.g., Cronbach & Snow, 1977; Snow, 1994; Snow, Federico, & Montague, 1980a, 1980b; Snow & Lohman, 1984), the results would no doubt also show minimal "overall" effects of the treatments in those studies. As in the present case, we believe this would be a grossly mistaken generalization. Similar spurious conclusions would also be reached by subjecting literatures in developmental psychology to "overall" analyses that ignored the interaction of treatments with age.

This, however, is precisely the sort of summary that Cameron and Pierce provide of studies that have shown opposing effects of identical extrinsic rewards administered under different conditions. Consider but one example. In three separate studies, carried out by three different groups of investigators, the effects of expected, tangible rewards have been shown to vary completely as a function of subjects' level of initial interest in the rewarded activity.5 Thus, Calder and Staw (1975), McLoyd (1979), and Loveland and Olley (1979) all report the same crossover interaction: Subjects offered a tangible, extrinsic reward for engagement in or completion of a highly interesting activity showed significantly lower subsequent intrinsic motivation, whereas subjects offered the same reward for engagement in or completion of a relatively uninteresting activity showed significantly higher subsequent intrinsic motivation.6

The convergence of results across these three studies seems quite impressive. Each involved different procedures, activities, rewards, and contingencies. In some studies, interest was experimentally manipulated; in others, it was based on measures of preexisting individual preferences. The subject populations ranged from preschool children to college students. Despite these differences, the findings were the same, a result that would seem to have clear practical implications regarding the appropriate and inappropriate uses of rewards in classrooms.

In Cameron and Pierce's article, however, each of these studies is presented as if there had only been two groups, reward and no reward. The opposing effects are averaged, and only a single effect size is reported. Not surprisingly, given this procedure, Cameron and Pierce's analysis yields an average effect size for these three studies of -.02, clearly contributing to the authors' "no effects" conclusion. This is not an isolated example: Exactly the same procedure is followed with other predicted and significant crossover interactions (e.g., Kruglanski et al., 1975, Experiments 1 and 2; Staw, Calder, Hess, & Sandelands, 1980).

The inappropriateness of this procedure can even be demonstrated using Cameron and Pierce's own meta-analytic procedures. If one considers separately the high-interest and low-interest conditions in these three studies, these studies provide six separate comparisons between groups offered an expected, tangible reward and groups offered no such reward. Following the procedures outlined in Hedges and Olkin (1985), one may first ask whether the six effect sizes obtained from these comparisons do seem to come from a single population, and therefore whether it is appropriate to draw a conclusion about the "overall" effect size. The answer is clearly no. The Q statistic for the six comparisons is 24.10, p < .001. For the three comparisons involving activities of initial high intrinsic interest, the overall effect size estimate d is -1.03, which is significantly different from 0 at p < .001. By
contrast, for the three comparisons involving activities of low initial interest, the overall effect size estimate d is +.86, which is significantly different from 0, in the opposite direction, at p < .005. Were one to compute a single effect size estimate for these six comparisons by ignoring the interest variable and its interactions with the reward variable, one would of course produce an estimate of d = -.08, an effect size not significantly different from zero.7

In short, in Cameron and Pierce's article, even robust, replicable, and statistically significant crossover interactions are averaged, so that they contribute a single effect size to the analysis. In the process, the major findings of these studies are lost, and minor effect sizes are virtually guaranteed. In addition, these studies are reported in Cameron and Pierce's Appendix C (pp. 405-419) in such a form that a reader unfamiliar with these articles would have no idea what was actually studied or found. This practice seems especially ironic given Cameron and Pierce's own dictum that "even an informed reader can have difficulty keeping in mind what a particular study is investigating.... It is hoped the present meta-analysis has helped to clarify the issue" (p. 395).

As Cameron and Pierce themselves note elsewhere, "It is important to point out that these main effect results should be viewed with caution. This is because many studies show interaction effects that are obscured when results are aggregated" (p. 383). This problem remains true of their analysis at every level of aggregation presented in the article; yet, unfortunately, it seems not to have produced in them the sort of caution about generalizing from these analyses that they urge upon their readers.
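The homogeneity check described above can be sketched in a few lines. This is a minimal illustration of an inverse-variance pooled effect size and a Q statistic of the kind described by Hedges and Olkin (1985); the six effect sizes and the group size of 20 are invented for illustration and are not the values from the three studies, so the output will not reproduce the figures reported in the text.

```python
# Hypothetical standardized mean differences (d) for six reward vs. no-reward
# comparisons; the values and group sizes are invented for illustration.
high_interest = [-1.00, -0.90, -1.10]
low_interest = [0.80, 0.90, 0.85]
N_PER_GROUP = 20  # assumed equal group sizes in every comparison

def variance_of_d(d, n1=N_PER_GROUP, n2=N_PER_GROUP):
    """Large-sample variance of a standardized mean difference."""
    return (n1 + n2) / (n1 * n2) + d ** 2 / (2 * (n1 + n2))

def pooled_and_q(effects):
    """Return the inverse-variance weighted mean effect and the Q statistic."""
    weights = [1.0 / variance_of_d(d) for d in effects]
    pooled = sum(w * d for w, d in zip(weights, effects)) / sum(weights)
    q = sum(w * (d - pooled) ** 2 for w, d in zip(weights, effects))
    return pooled, q

pooled_all, q_all = pooled_and_q(high_interest + low_interest)
pooled_high, q_high = pooled_and_q(high_interest)
pooled_low, q_low = pooled_and_q(low_interest)

# Chi-square critical values at alpha = .01 (Q is referred to chi-square, df = k - 1).
CHI2_CRIT_DF5 = 15.09
CHI2_CRIT_DF2 = 9.21

print("all six:      ", round(pooled_all, 2), round(q_all, 2), q_all > CHI2_CRIT_DF5)
print("high interest:", round(pooled_high, 2), round(q_high, 2), q_high > CHI2_CRIT_DF2)
print("low interest: ", round(pooled_low, 2), round(q_low, 2), q_low > CHI2_CRIT_DF2)
```

Under these assumed values, the six comparisons pooled together give an effect near zero with a Q well above the critical value, whereas each subgroup is internally homogeneous and has a large pooled effect of opposite sign.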
Step 4: Average Across Experimental and Control Groups

Of course, in this literature, experiments yielding full crossover interactions are clearly in the minority. Less extreme versions of these same problems, however, occur in the much more widespread case of experiments deliberately designed to contrast theoretically "active" experimental treatments with theoretically "inactive," or control, procedures. In these cases, once again, to average across theoretically contrasting conditions in order to derive a single effect size is to throw out whatever important information these studies may have found.

Consider the simplest case: a study that contains one baseline, no-treatment condition and two additional conditions to be compared to that baseline condition, one an "experimental" group and the other a "control" group. That is, one comparison condition, the experimental group, has been explicitly designed to present the conditions under which some particular effect is predicted to occur, whereas a second comparison condition, the control group, has been explicitly designed to introduce one specific change in procedure from the experimental condition that is intended to eliminate the effect of interest.8 This basic design, which is routinely used to test the importance of some hypothesized underlying process or variable, is presented schematically in Figure 2.

Again, let us assume for the sake of argument that all the effects in such a design are exactly as predicted by some theory. In the case displayed in Figure 2, this means that the experimental group is significantly different from both the baseline group and the control group, and that the control group results are identical to those in the baseline group. In short, as predicted by the theory, the experimental group produced some effect, and the control group did not.

[Figure 2 plot: Experimental and Control groups compared with a No Treatment baseline; horizontal axis: Treatment]

FIGURE 2. A generic illustration of the use of a control group to identify critical variables underlying an experimental effect

Now imagine that, as before, one were to average effect sizes across the different groups in such a design in order to ask about the effects of variations in A in general. The result would be obvious. The significant effect sizes deriving from the experimental conditions would be cut in half by the addition of a noneffect in the control condition. Again, this procedure helps to assure a finding of minimal overall effects when used on a literature in which explicit attempts have been made to study the conditions under which an effect will and will not occur.

In practice, this sort of design is extremely common in the literature on intrinsic motivation, as it is in many other theoretically derived literatures in psychology. It is, in fact, the most common design for eliminating alternative explanations: showing that a second set of conditions identical to the original ones in every way (except for one theoretically critical difference) will eliminate the phenomenon in question (e.g., Abelson, 1995; Aronson, Ellsworth, Carlsmith, & Gonzales, 1990; Neale & Liebert, 1986). Indeed, after the first dozen papers had been published in this area, the presence of some control group of this sort (or the demonstration of a crossover interaction) became a virtual requirement for publication, at least in the most highly rated journals, because editors recognized that progress in such a field required a more precisely nuanced and articulated understanding of the conditions under which undermining effects would and would not occur.
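The halving noted above follows from simple arithmetic. As a minimal sketch, assume (hypothetically) that a study contributes one effect size for the experimental condition, d_E, and one for the control condition, d_C, each measured against the no-treatment baseline, and that the two are combined by an unweighted average. The symbols are illustrative and are not taken from Cameron and Pierce's tables.

```latex
\bar{d} = \frac{d_E + d_C}{2},
\qquad
d_C = 0 \;\Longrightarrow\; \bar{d} = \frac{d_E}{2}
```

With an assumed value of d_E = -0.80, for instance, the single averaged entry would be -0.40, even though the theory's predictions held in both conditions.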


Consider some specific examples. Several studies (M. Ross, 1975, Experiments 1 and 2; Sarafino, 1984) have examined the hypothesis that the detrimental effects of expected, tangible rewards will depend on the salience of those rewards. Thus, M. Ross (1975, Experiment 2) added to a "standard"9 task-contingent reward condition and a no-reward condition two other cells: one in which the children were given explicit instructions to think about the reward as they engaged in the target activity, and another in which the children were given explicit instructions to think about an irrelevant distractor (i.e., "snow") as they engaged in the activity. As predicted, children in the "standard" task-contingent reward group and in the reward-ideation group showed significant decreases in intrinsic motivation compared to control subjects; also as predicted, by contrast, children in the distraction-ideation group showed no such effect. Yet, here, as in the other two studies on reward salience which yielded comparable findings, these differential results are summarized by Cameron and Pierce with a single effect size.

Similarly, Pittman, Cooper, and Smith (1977) predicted that the detrimental effects of an expected reward on intrinsic motivation could be eliminated if subjects were explicitly led to see themselves as intrinsically interested in an activity despite the offer of tangible rewards. To test this hypothesis, these authors added to "standard" reward and no-reward conditions two additional cells in which students were given explicit false feedback showing that their own physiological responses had indicated high levels of either intrinsic or extrinsic motivation. As predicted, the detrimental effects of an expected, tangible reward shown in the standard and extrinsic feedback conditions were eliminated in the intrinsic feedback condition. Of course, these theoretically telling differential findings are summarized by Cameron and Pierce with a single effect size.

Once again, the important point lies not so much in the manner in which the original data from these specific studies are distorted by the present analysis as in the ubiquity of this practice. An examination of the actual conditions reported in each of the 90 original articles available to us (and published after 1973, when most of the initial distinctions of importance had already been made) revealed that a vast majority (over 80%) involved further comparisons not reported in the summaries in Cameron and Pierce's Appendix C. In this issue of Review of Educational Research, both Kohn (1996) and Ryan and Deci (1996) discuss additional examples of this practice. Again, it is no surprise that the consequence of averaging across conditions specifically designed to produce different effects will be an "overall" effect size near zero.

Step 5: If at First You Don't Succeed ...

In short, most of the effect size estimates included in Cameron and Pierce's meta-analysis are averaged across diametrically opposing effects, or across experimental and control conditions, even before their entry into the analysis pool. Then, to perform the "overall" analyses that Cameron and Pierce so emphasize, these effect sizes from their Appendix C are again averaged across all of the relevant dimensions (expected versus unexpected, tangible versus verbal, and noncontingent versus task-contingent versus performance-contingent) examined in later analyses.

To justify this overall analysis, the authors present in their Figure 1 (p. 380) a set of frequency distributions that appear to have the approximately normal shape
that might justify a meta-analysis. Because these figures are based on effect size estimates that have already been averaged, twice in many cases, the data displayed in these figures should come as little surprise. Note, however, that two additional decisions by the authors, concerning the reporting of effect size estimates for studies in cases where the actual effect sizes cannot be calculated, also contribute to the seeming normality of these distributions. First, in any case in which the original article does not provide enough information to compute the overall means, but in which the direction of a nonsignificant difference is specified, the authors assign a random number between .01 and the value representing an effect that would have been minimally statistically significant. Second, if the difference in direction in the overall means is not available, the study is given an effect size of 0.0. Both approximations are used frequently: At least one effect size of 0.0 is reported for 18 of the articles, and at least one effect size estimated randomly is reported for 15 of the articles. In total, nearly a third (32%) of the articles include some estimate of this sort.

What is important from the present perspective, however, is that both these procedures are inherently "conservative," in the sense that they presume a distribution centered at zero and are hence biased toward an acceptance of the null hypothesis (Cooper & Hedges, 1994; Mullen, 1989). Their widespread use, therefore, should bias the result toward a "no effects" conclusion. To their credit, Cameron and Pierce do report means with and without the pure zero cases included in their following tables, but only after these "data" have already been included in their Figures 1 and 2. Small wonder these distributions look relatively "normal" and centered around zero.
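The pull toward zero that these two imputation rules exert can be illustrated with a rough sketch. Everything below is hypothetical: the known effect sizes, the assumed significance threshold of 0.40, the counts of imputed studies, and the assumption that the reported directions were negative. It only shows the arithmetic of mixing such placeholders into an average; it does not reconstruct Cameron and Pierce's data.

```python
import random

# Hypothetical effect sizes; none of these numbers come from the studies reviewed.
known_effects = [-0.60, -0.45, -0.55, -0.50]  # studies with computable effect sizes
minimally_significant = 0.40                  # assumed |d| needed to reach significance
n_direction_only = 3   # nonsignificant, direction reported (assumed negative here)
n_no_information = 3   # no direction available

rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Rule 1: a random value between .01 and the minimally significant effect,
# signed in the reported direction.
imputed_random = [
    -rng.uniform(0.01, minimally_significant) for _ in range(n_direction_only)
]

# Rule 2: an effect size of exactly 0.0 when no direction is available.
imputed_zero = [0.0] * n_no_information

def mean(values):
    return sum(values) / len(values)

mean_known = mean(known_effects)
mean_with_imputed = mean(known_effects + imputed_random + imputed_zero)

print(round(mean_known, 3), round(mean_with_imputed, 3))
```

Because every placeholder lies between zero and the significance threshold, adding them can only move this sketch's average toward zero relative to the mean of the computable effects.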
A much more serious problem than these estimates for unavailable data, however, lies in the authors' liberal discarding of "outlying" cases, as shown in the Q statistics reported by Cameron and Pierce in their Tables 2 through 5. The Q statistic was created by Hedges and Olkin (1985) as a test for whether a set of effect size estimates from different studies can be considered a sample from some common population of effect sizes, and therefore whether the use of meta-analytic procedures based on such an assumption is appropriate.

Interestingly enough, despite all the multiple averaging and the use of zero or random numbers as estimates of estimates, the data provide a clear answer: No, the Q statistics in Tables 2 through 5 say, it is not reasonable, within traditional statistical limits, to assume that these studies have been sampled from a common underlying population. Rather, for 15 of the 18 distributions reported (83%), with all known effects included, the samples are significantly nonhomogeneous at the p < .01 level. Yet, a 16th is significantly nonhomogeneous at the p < .05 level.

One might imagine that such results, which would normally be interpreted to suggest that there are additional factors that need to be taken into account before a meta-analysis is warranted, might dissuade the authors from doing, or putting much faith in, these demonstrably too-general analyses. Not so. Instead, Cameron and Pierce proceed to eliminate "outliers," using the procedures for trimming distributions described by Tukey (1977). Even then, they find that fully half of the distributions overall remain significantly nonhomogeneous at the p < .01 level, and that 11 of 18 (61%) remain significant at the p < .05 level. For the crucial distributions involving expected, tangible rewards (the only cases where there is any real controversy), 5 of the 6 (again, 83%) remain significantly
nonhomogeneous. Again, one might imagine that the data have spoken. Once again, not so. Instead, the authors proceed to eliminate as many more "outliers" as prove necessary for them to perform the too-general, overall tests they emphasize. In their Table 2, this requires the elimination of roughly 20% of all known effect sizes listed. Presumably, this percentage includes a much higher fraction of findings that had not already been diluted by initial averaging across competing or disparate effects. Again, this procedure is a conservative one, which makes it harder, on average, to disconfirm the null hypothesis.

The authors seek to legitimize their liberal discarding of "outliers" by referring to a paper by Hedges (1987) in which it is claimed that even physical scientists discard outliers in large numbers. The context that Hedges examined, however, involved successive attempts to determine, to many decimal places, the values of physical constants, where (among other things) improvements in equipment and technology may be expected to provide superior estimates. It did not deal with literatures in which opposing effects and complex interactions have been predicted and replicated, or in which competing predictions are at issue. In fact, it bore little resemblance to the literature at hand, and to most literatures in psychology.

Step 6: Selectively Interpret the "Results" Obtained

Even if the results obtained using the techniques outlined in the preceding sections are highly probative (and we believe that they certainly are not), the final step in Cameron and Pierce's analysis involves a continued selective interpretation of those results. As noted above, even given the putative results of their analysis, one might summarize the differences obtained for free-time measures as consistent with the effects predicted and demonstrated in the earliest studies in this area.
Alternatively, drawing on the most recent theoretical analyses, one might conclude, as do Tang and Hall (1995) in their more recent and more articulated meta-analysis of this same literature, that "the overjustification effect has been consistently demonstrated in situations when it should be expected to occur" (p. 379). Instead, Cameron and Pierce reiterate their main effect conclusion that "overall, the results indicate that reward does not negatively affect intrinsic motivation on any of the four measures" (p. 391) and that "in terms of rewards and extrinsic reinforcement, our overall findings suggest that there is no detrimental effect on intrinsic motivation" (p. 394). In addition, when they do discuss more fine-grained analyses, they use a variety of strategies to minimize the importance of significant negative effects and to maximize the importance of significant positive effects. For example, as Ryan and Deci (1996) note, Cameron and Pierce use aggregation freely when it suits their goals but avoid it entirely when it produces results contrary to their interests, for example, drawing general conclusions about the significant positive effects of verbal rewards but not about the significant negative effects for the precisely comparable category of tangible rewards.

Additional Difficulties With Cameron and Pierce's Analysis

In our view, Cameron and Pierce have created and employed a recipe that will work with almost any complex and diverse literature to produce a summary that

"overall" there is no "general" effect. This simplistic, main effect conclusion, derived from procedures which virtually preordain it, seems to us to have no probative value. Moreover, although the problems described in the previous section represent to us the most distinctive and perhaps the most troubling aspects of Cameron and Pierce's review, they do not exhaust our concerns regarding their application of meta-analytic procedures to the intrinsic motivation literature. In this second section, we consider a number of these other concerns.
The Quality Problem

The first of these, the quality problem, is straightforward. In computing statistical effect sizes, Cameron and Pierce explicitly give equal weight to "good" and "bad" studies. The potential costs and benefits of such procedures have been discussed at length within the meta-analytic community, and many proposals have been made for ways of taking into account variations in methodological rigor or relevance across studies (e.g., Cooper, 1989; Rosenthal, 1979, 1991; Slavin, 1986). Cameron and Pierce, however, make no attempt to examine or take into account the methodological adequacy of the experiments they review, although there are wide variations in quality across these studies.

One central methodological issue, for instance, concerns the design of dependent measures that effectively assess intrinsic motivation. For free choice measures, for instance, one needs to be certain not only that subjects no longer expect a tangible reward for continued engagement in the target activity but also that subjects do not infer that their continued engagement in the activity is likely to please the experimenter or otherwise lead to continued social approval. In some of the studies reviewed here, subjects are observed weeks later in a naturally occurring situation in which they are not even aware that they are being observed or that they are still part of an experiment; in others, subjects are confronted immediately following the receipt of reward with the same narrow set of choices by the same experimenter who remains in the room to observe their behavior.10 Clearly, these are qualitatively different measures. In the latter case, but not the former, subjects may continue to choose the previously rewarded activity, not necessarily out of intrinsic interest, but because they believe that the experimenter would still prefer that they do so. Theoretically crucial distinctions of this sort, however, are not considered in Cameron and Pierce's review.
To take another example, recall the interaction findings predicted and obtained by Calder and Staw (1975), Loveland and Olley (1979), McLoyd (1979), and Newman and Layton (1984): that identical tangible, expected rewards may increase intrinsic interest in initially boring tasks and yet decrease interest in initially interesting tasks. Similar variations in the initial interest value of different activities are also apparent across studies; indeed, readers may themselves note in Cameron and Pierce's Appendix C a number of tasks that most people would not find at all intrinsically motivating, such as scoring questions, coding data, and proofreading.11 Naturally, the predictably different results of these different procedures are simply more "error variance" for Cameron and Pierce's analysis.
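A small worked sketch may make this point concrete. It pools hypothetical effect sizes of opposite sign (negative for initially interesting tasks, positive for initially boring ones) and computes the inverse-variance weighted mean together with the Hedges and Olkin (1985) homogeneity statistic Q discussed earlier. All numbers, names, and the four-study setup are assumptions chosen for illustration, not values from the literature under review.

```python
def weighted_mean_and_q(effects, variances):
    """Inverse-variance weighted mean effect size and homogeneity statistic Q.

    Q is the weighted sum of squared deviations of each effect size from the
    weighted mean; under homogeneity it is approximately chi-square
    distributed with k - 1 degrees of freedom.
    """
    weights = [1.0 / v for v in variances]
    d_bar = sum(w * d for w, d in zip(weights, effects)) / sum(weights)
    q = sum(w * (d - d_bar) ** 2 for w, d in zip(weights, effects))
    return d_bar, q


# Hypothetical studies: two with interesting tasks, two with boring tasks.
effects = [-0.5, -0.5, 0.5, 0.5]
variances = [0.04, 0.04, 0.04, 0.04]  # assumed sampling variances

d_bar, q = weighted_mean_and_q(effects, variances)

# Chi-square critical value for df = 3 at p = .01 (about 11.345).
CHI2_CRIT_DF3_P01 = 11.345

print(d_bar, q, q > CHI2_CRIT_DF3_P01)
```

In this constructed case the pooled mean is zero, yet Q is well above the critical value. The "no overall effect" summary and the evidence of two systematically different populations of effects come from the same four numbers.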
The Quantity Problem

An equally critical but subtler problem with Cameron and Pierce's analysis of the intrinsic motivation literature could be termed the quantity problem. Because

many theoretically critical variations in procedures in this literature are not represented in sufficient numbers to be included as separate factors in a meta-analysis, they are grouped with other, more common variations, and any information that one might have gained from them is effectively discarded.

Consider an early study by Kruglanski, Alon, and Lewis (1972), designed to demonstrate that the critical conceptual variable in the undermining of intrinsic interest by tangible rewards is the subject's perception of having engaged in an interesting task in order to receive some extrinsic incentive. If so, then under normal circumstances, providing unexpected, tangible rewards to subjects after their engagement in an interesting task should not have detrimental effects on interest, as other studies have shown. However, if the subject's perceptions are indeed crucial, it should be possible to design an objectively "unexpected" reward procedure that would undermine intrinsic interest by producing a subjective perception of instrumentality just like a comparable "expected" reward. These authors proceeded to do precisely this by overtly lying to subjects who had undertaken an interesting activity with no expectation of a reward, telling them that they were now receiving the reward that "had been promised to them" at the outset.

Using this deliberate, intricate, and unusual deception, Kruglanski et al. (1972) were indeed able to show a decrease in intrinsic interest following the delivery of an objectively unexpected reward. In doing so, they provided a particularly informative demonstration. However, in part because no other studies in this area have used such a procedure, this finding enters Cameron and Pierce's analysis as simply one more unexpected-reward procedure (one that happens to have produced results opposite from most) rather than as a theoretically telling "exception that proves the rule." Once again, a potentially important finding adds only "error" to this meta-analysis.12
The Confounded Variables Problem

Closely related to the quantity problem is the problem of consistent confoundings of different variables and procedural characteristics across studies in this literature. To take just one example, consider the category of "verbal," as contrasted with "tangible," rewards in the present review. Although this might seem like a straightforward distinction, it actually involves a large number of confounded but potentially critical variables. In this literature, it turns out that most "verbal" rewards are also contingent on performance, are unexpected, and provide information about subjects' ability to perform the activity, whereas most "tangible" rewards are not performance contingent, are expected, and give little information to subjects about their abilities. Hence, differences in the effects of these two types of rewards could be due to any or all of these correlated factors.13

Fortunately, more highly controlled and more theoretically telling comparisons are available within a number of individual experiments, as both Kohn (1996) and Ryan and Deci (1996) note. Unfortunately, perhaps because these particular within-experiment comparisons have not been repeated a sufficient number of times to be entered as a feature in their meta-analysis, they are not discussed in Cameron and Pierce's review. Again, the most theoretically important findings are ignored.

The "Psychological" Effect Size Problem

A fourth additional problem with Cameron and Pierce's review derives from the frequent antinomy in the intrinsic motivation literature between measures of statistical effect size and measures of "psychological" effect size. Thus, measures of statistical effect size depend entirely on the means, standard deviations, and ns of the relevant comparison conditions, and procedures that minimize within-condition variance and/or maximize mean differences will produce larger effect size estimates.

By contrast, when we evaluate the power or significance of an experimental effect as psychologists, rather than as statisticians, we use a quite different metric. In particular, the "power" of a given comparison, in this psychological sense, depends not only on the statistical significance of the difference between two conditions, but also on the strength of the manipulation producing that difference, the sensitivity of the measures assessing that difference, and the range and power of factors that were not controlled in the comparison and thereby constitute the error variance against which the effect must "compete" for significance (e.g., Abelson, 1995; Lepper, 1995; Prentice & Miller, 1992; L. Ross & Nisbett, 1991).

Consider, as one prominent example of this process from the field of educational research, Rosenthal and Jacobson's (1968) famous study of "Pygmalion effects" in elementary school classrooms. In this study, Rosenthal and Jacobson told elementary school teachers at the beginning of the school year that certain (randomly selected) pupils in their classes had been identified by a new personality test as "late bloomers" who were likely to show particular improvement during the coming school year. Nine months later, standardized tests, including IQ measures, were administered to experimental and "control" pupils by testers blind to students' experimental conditions.
The results clearly struck the field as dramatic and powerful, but in psychological, not statistical, terms. What made this study an overnight classic were the particular details of Rosenthal and Jacobson's procedure, details that suggested that even weakly statistically significant differences would be unlikely even if their hypothesis were completely correct. First, the manipulation of teachers' expectancies was extremely subtle and involved only a single brief communication from the researchers; readers would have been much less impressed had the experimenters from Harvard stopped by the involved classrooms once a week to remind the teachers of the supposed late bloomers and to inquire about their progress. Second, this manipulation took place in students' regular classrooms, where any experimental effect would necessarily be in competition with a host of other important, real-world factors known to influence student performance. Had the same manipulation been used in a novel laboratory setting in which the only expectations teachers had about their students were those based on the "blooming" test, a statistically equivalent effect would have generated far less interest and controversy. Finally, the effects were apparent 9 months later, on standardized tests known for their resistance to short-term interventions and administered by personnel blind to experimental conditions. Had the same effects been demonstrated only on measures designed by the classroom teachers, or only for 1 week after the manipulation, their findings would have been considerably less provocative.

Thus, it is only by virtue of considerable background knowledge that readers came to identify Rosenthal and Jacobson's results as particularly interesting, powerful, and surprising. In the abstract, it was already well-known and well-documented that people's expectations about others can influence the way they act toward those others. That a single brief communication should produce apparently significant effects on aptitude measures we know and care about a year later, however, was a psychologically impressive result that immediately garnered considerable attention. The drama and the attending psychological significance of such a study, like the fabled difference between success and failure in business, lie in the details.14

Unfortunately, the problem is not just that these sorts of criteria are not taken into account in Cameron and Pierce's meta-analysis. Actually the situation is much worse than that, because the same factors that, ceteris paribus, heighten psychological effect size in the intrinsic motivation literature also, ceteris paribus, reduce statistical effect size, and vice versa. One can compare studies, for instance, that use different dependent measures. In some studies, actual behavior is observed 2 weeks after the reward manipulation in a natural classroom setting in which subjects do not know that they are being observed (or even that they are still in an experiment) and in which they are free to choose among scores of different activities of interest. In other studies, subjects are asked merely to place a check mark on a single 5-point scale immediately after the end of a reward session, often in the presence of the same experimenter who just conducted that session. Given equal statistical effect sizes in these two studies, we would argue that one should give greater weight to the former study, because in that study the cards have been deliberately stacked, as in Rosenthal and Jacobson (1968), against the demonstration of a significant effect.

Note the important trade-offs here. Using more consequential, more distant, and more dissociated measures entails a major risk. Other things being equal, a study using such measures will be less likely to obtain statistically significant results, even if the hypothesis is entirely true. Therefore, equally statistically significant effects on such measures suggest a larger psychological effect size than do effects on less consequential, distal, and dissociated measures. Yet, because the use of better measures entails putting one's hypothesis to a much more stringent test, a study employing such measures is less likely to produce an equally statistically powerful effect.
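The statistical half of this trade-off can be sketched numerically. The example below uses Cohen's d with a pooled standard deviation as one common standardized index; that choice is an assumption for illustration, since the passage does not name a formula, and every number is hypothetical. The same raw difference between reward and control conditions is paired with a low-noise immediate rating and with a noisier delayed, naturalistic measure.

```python
from math import sqrt


def cohens_d(mean_a, mean_b, sd_a, sd_b, n_a, n_b):
    """Standardized mean difference using the pooled standard deviation."""
    pooled_var = ((n_a - 1) * sd_a ** 2 + (n_b - 1) * sd_b ** 2) / (n_a + n_b - 2)
    return (mean_a - mean_b) / sqrt(pooled_var)


# Hypothetical immediate rating: little uncontrolled variance.
d_immediate = cohens_d(4.0, 3.0, sd_a=1.0, sd_b=1.0, n_a=20, n_b=20)

# Hypothetical delayed classroom observation: same raw difference,
# but many uncontrolled real-world influences inflate the variance.
d_delayed = cohens_d(4.0, 3.0, sd_a=2.5, sd_b=2.5, n_a=20, n_b=20)

print(d_immediate, d_delayed)
```

With these assumed values the identical raw difference yields d = 1.0 on the immediate rating and d = 0.4 on the delayed measure. A procedure that compares only standardized effect sizes would therefore rank the less stringent test as the stronger finding.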
More psychologically powerful and methodologically sound choices will, ceteris paribus, lead to smaller average statistical effect sizes. Because statistical effect size and psychological effect size are negatively correlated in this particular literature, the use of meta-analytic procedures to compare statistical effect sizes, without regard to psychological effect sizes, will result in greater weight being given to precisely those findings that we should otherwise find of least evidential value.

A Gresham's Law of Research?

The net result of these difficulties with Cameron and Pierce's application of meta-analytic procedures to the intrinsic motivation literature is potentially distressing. By lumping together good studies and bad, by ignoring informative comparisons not represented in sufficient numbers for aggregated statistical treatment, and by focusing on statistical criteria at the expense of psychological

criteria (as well as by averaging across competing effects and explicit control procedures, as documented in the previous section), Cameron and Pierce's analysis seems to value less probative, less methodologically sound, and less theoretically telling experiments as much, or more, than the very best studies.

In economics, Gresham's law describes the fact that "bad" currency drives out "good": if two currencies have the same purchasing value, the one with greater intrinsic value will be driven from the marketplace. We believe a similar effect can occur if Cameron and Pierce's procedures are applied to literatures like the present one. Doing good experimental research is more difficult; it "costs" more than doing bad research. If the standards for judging the contributions of a particular experiment to a field of knowledge treat the two as equally valuable, there will be little incentive for investigators to do the sorts of research we believe the field should value most highly. Instead, bad research will drive out good.

The same phenomenon may also occur at a different level. In our view, comparisons that show significant differences between conditions within a single well-designed experiment will routinely be more informative than comparable comparisons across two equally well-designed experiments. For within-experiment comparisons in a good study, we know that extraneous confounding factors have been carefully controlled, that different conditions differ only in specified ways. For across-experiment comparisons, by contrast, differences in relevant theoretical variables will necessarily be confounded with a host of incidental variables, including different subject populations, different experimenters, and different settings, as well as specific procedural details. Yet, in Cameron and Pierce's meta-analytic review, a wide array of critical within-experiment comparisons never even enter the analysis. Again, bad comparisons may drive out good ones.
Real-World Implications

We have argued that Cameron and Pierce's focus on the "overall" main effect of rewards is misguided and theoretically meaningless. One counterargument, however, is that this is somehow the correct question for addressing real-world applications of this literature, as opposed to theory. Cameron and Pierce seem to make this argument when they state what Ryan and Deci (1996) describe as their "not-to-worry" conclusion that "overall, the present review suggests that teachers have no reason to resist implementing incentive systems in the classroom" (p. 397).

This point of view does have some support within the meta-analytic community. Indeed, in three separate anonymous reviews, Cameron and Pierce's focus on a simplistic main effect conclusion from this literature was strongly defended. To these reviewers, Cameron and Pierce's emphasis on an "overall" effect of rewards was deemed just as interesting and valuable as our "theoretical" focus on the "interactions" in this literature and was touted as "the relevant guide to the potential effects" of a variety of vague policy proposals that have been (or might be) made concerning the use of extrinsic rewards in education.

We cannot accept this position. Is there some reason why policy questions need to be phrased in terms of simple main effects? Is providing more precise information actually expected to lead to worse policy decisions? Even if one worried about possible information overload, could one not draw from Cameron and

Pierce's own Figure 3 the more accurate yet equally clear, memorable, and easy-to-implement recommendation that educators should rely on praise but avoid the use of tangible rewards?

Alternatively, suppose that one performed an analysis identical to Cameron and Pierce's but restricted the literature examined to studies involving tangible rewards. This would yield, even with their methods, a significant "main effect" conclusion that tangible rewards will "overall" have adverse effects on subsequent spontaneous engagement in the previously rewarded activity. Given the replicable interactions that have already been shown to exist in this literature, we would find such a broad cautionary conclusion just as inappropriate as the broad reassurance offered by Cameron and Pierce. Each seems ludicrously too abstract a summary of this literature on which to base either theory or practice, just as would a medical article that purported to evaluate the effects of "surgery in general" or a defense department study summarizing the battlefield effectiveness of "weapons overall."
Predicting the Effects of Rewards on Subsequent Motivation

If one does not settle for a one-line generalization as an adequate guide to understanding the effects that extrinsic rewards may have on subsequent behavior, how might one seek to predict these effects? In our view, rewards may be thought of as serving three theoretically distinct functions (an instrumental or incentive function, an evaluation or feedback function, and a constraint or social control function), and each of these three factors may influence an individual's decisions about when, how, and whether to engage in a previously rewarded activity. Thus, receiving an extrinsic reward for engaging in a task may influence (a) an individual's expectations that further extrinsic rewards may follow task engagement in the future, (b) an individual's sense of personal competence and task mastery, and (c) an individual's attributions of personal control versus extrinsic constraint. To understand and predict the effects of any particular use of rewards, therefore, will require simultaneous attention to each of these potentially competing factors. A rough schematic illustration encompassing these different factors appears in Figure 3. Let us briefly consider each of these factors, in turn.
The instrumental or incentive function of rewards. A first type of information that rewards can provide concerns the likely extrinsic payoffs that various responses will produce in future situations. Hence, one primary determinant of whether, when, and how an individual will continue to engage in a previously rewarded activity will be that individual's perceptions of the continued instrumental value of the activity. In addition, if a reward initially produces differences in the amount or the nature of engagement with the activity, those differences may themselves influence the individual's subsequent involvement with the activity. Both sorts of effects are shown in the top half of Figure 3.

Consider first these latter, performance effects. Often, the presence of a contingent reward in a particular setting will increase task engagement in that setting. These increases, in turn, may lead to the acquisition of new skills or increased proficiency, which should increase perceived competence and subsequent intrinsic motivation (e.g., Deci & Ryan, 1985; Harter, 1978a; Hunt, 1961; Lepper & Greene, 1978c). A child just learning to read who is offered payment for each book read, for instance, might eventually develop sufficient skill at reading to

[Figure 3: schematic diagram, not reproduced here. Recoverable panel labels: instrumental function; evaluation or feedback function; social control function; expectation of further tangible or social rewards in particular functionally similar situations; definition of criteria (quantitative and qualitative) determining minimum performance standards necessary to obtain the reward; potential increases in task engagement; possible acquisition of further skills or proficiency; possible effects of satiation, familiarity, etc.; possible effects on approach to task, preference for easy vs. difficult tasks; perceptions of relative competence or incompetence and challenge offered by the task; attributions concerning the causes of success or failure at the task; perceptions of intrinsic vs. extrinsic motivation, coding of the activity as "work" vs. "play"; attributions concerning one's reasons for engaging in the activity. Several further labels are truncated in this copy.]

FIGURE 3. The multiple processes by which rewards may influence subsequent behavior
Note. From "The Multiple Functions of Reward: A Social-Developmental Perspective," by M[...] S. M. Kassin, & F. X. Gibbons (Eds.), Developmental Social Psychology (p. 22), 1981, New Yo[...] by Oxford University Press. Reprinted with permission.

begin to enjoy the activity for its own sake. At the same time, other factors may also enter the picture. As Zajonc (e.g., 1968, 1980) has repeatedly shown, increases in mere exposure to an activity may increase a person's liking for new activities. Alternatively, continuous engagement in an activity over a period of time can also produce at least short-term feelings of boredom or satiation.

Similarly, contingent reward procedures can also change how an activity is performed, by defining specific performance criteria for reward attainment. If this happens, one may expect changes in performance along these instrumentally relevant dimensions. If these changes then produce higher levels of accomplishment, the result may be increased motivation. If our hypothetical child offered money for reading books is also required to summarize each book in a written report, what that child learns from each book might well increase.

Often, however, contingent reward procedures may inadvertently produce negative effects on other aspects of immediate task performance. For example, people faced with a salient extrinsic contingency will typically seek to do the minimum required to obtain the proffered reward (Kruglanski, Stein, & Riter, 1977). Thus, unless special steps are taken to avoid such effects, people expecting a performance-contingent reward will simultaneously show both superior performance along instrumentally relevant dimensions and inferior performance along instrumentally irrelevant dimensions (e.g., Condry & Chambers, 1978; Harter, 1978b; Kruglanski et al., 1977; McGraw, 1978). Likewise, if engaging in different versions of an activity will meet some extrinsic contingency equally well, people will select the easiest means of obtaining the reward, even if that means undertaking an activity that they find less inherently interesting (e.g., Harter, 1978b; Lepper, 1981).
Finally, for activities involving insight, creativity, or heuristic problem solving, extrinsic rewards may produce detrimental effects even on instrumentally relevant dimensions (e.g., Condry & Chambers, 1978; Cordova & Lepper, 1995; McGraw, 1978).

Thus, we might also expect our entrepreneurial scholar to be quite alert to possibilities of beating the system or of doing the minimum required to receive payment. Our student might well become an expert at skimming books (or even at consulting Cliffs Notes instead) while continuing to give the appearance of having read them. Or, if offered payment for each and every book read, our student might begin to seek out particularly short or simple books, even if they prove utterly boring, as long as they lead to a financial reward. If any of these performance effects, in turn, were to influence enjoyment or perceptions of competence, they could also influence our student's subsequent intrinsic motivation.

At the same time, there is a second and perhaps even more critical aspect to the incentive function of rewards: simply, that people who are rewarded for engaging in an activity in one situation may believe that they will continue to be rewarded for engaging in that activity later in that same situation or in seemingly related situations (e.g., Bandura, 1977, 1986; Bandura & Walters, 1963; Estes, 1972; Mischel, 1973). Even if previous tangible rewards are known to be no longer available in a later setting, the individual may still believe that social approval will follow continued engagement in the same activity. In either case, people who expect to be rewarded for an activity will continue to engage in it, if the reward is large enough, whether or not they like it. By contrast, when there is no

expectation of continued extrinsic reward, intrinsic interest will play a central role. Consider, once more, our young capitalistic reader. If the "bucks for books" program had been instituted by grandmother during a stay at her house, we would expect our student to be motivated by continuing pecuniary interests during subsequent stays at grandmother's. Unless a similar payment schedule were instituted at home, however, such extrinsic incentives would be irrelevant there, and other factors would control our student's actions.

The evaluation or feedback function of rewards. A second type of information that rewards may convey concerns the individual's competence at an activity. Thus, receipt of a reward may signal success at an activity relative to some absolute standard of performance, relative to one's own prior performance, or relative to the performance of others. Other things being equal, rewards that enhance an individual's sense of competence and accomplishment will increase his or her intrinsic motivation; conversely, contingencies that reduce a person's sense of competence or accomplishment will decrease his or her intrinsic motivation (e.g., Bandura, 1977; Deci, 1975; Deci & Ryan, 1985; Harter, 1978a; Lepper & Greene, 1978c). Enhanced perceptions of competence may also play a role in determining motivation in situations in which extrinsic rewards remain available, but only following successful performance at the activity. If one lacks confidence in one's ability to perform well enough to earn a reward, that reward is not likely to prove an effective incentive. These processes are shown in the lower left-hand portion of Figure 3.

Our hypothetical insolvent student, for instance, might react quite differently to payments that signal the mere completion of a book and payments that explicitly signal a significant increase in reading proficiency. Likewise, we would expect quite different reactions to payments that identified the student as "best in the school," as a "solid B" student, or as merely having "passed" some minimum cutoff for receipt of reward.

The constraint or social control function of rewards. A third type of information that rewards can carry involves their social control or constraint function. Thus, extrinsic rewards may often signal an attempt by some external agent to control the individual's actions, and may therefore lead the individual to view his or her engagement in the rewarded activity as extrinsically, rather than as intrinsically, motivated (e.g., Deci, 1971; Kruglanski, Friedman, & Zeevi, 1971; Lepper et al., 1973). As a result, when the instrumental and the evaluative functions of reward are controlled, as is most clearly the case when (a) subsequent behavior is observed in a setting in which it is clear to subjects that neither further tangible nor further social rewards will be contingent upon engagement in the activity and (b) the rewards employed do not produce performance differences or convey salient competence information (e.g., as in the task-contingent reward conditions in the studies under discussion), offers of superfluous expected rewards for engagement in an activity of initial high interest should and do decrease subsequent intrinsic interest in the activity (e.g., Lepper et al., 1973). This last process is displayed in the lower right-hand section of Figure 3.

Perhaps the clearest demonstration of this process involves the imposition of a purely nominal contingency on children's engagement in two equally interesting activities. Lepper, Sagotsky, Dafoe, and Greene (1982), for example, asked preschool children to engage in two art activities of high, and equivalent, initial interest under one of two conditions. In the means-end condition, children were told that they would have a chance to play with Activity 2 only if they first played with Activity 1. In the control condition, children were asked to engage in the two activities without the imposition of any contingency between engagement in Activity 1 and engagement in Activity 2. Two weeks later, when the children's intrinsic interest in these same two art activities was covertly observed during free-play periods in their regular classrooms, children who had engaged in Activity 1 in order to earn the chance to engage in Activity 2 showed significantly less intrinsic interest in that first activity. Similar and quite strong effects have been subsequently shown with contingencies placed on the consumption of initially equivalent foods and other tasks (e.g., Birch, Birch, Marlin, & Kramer, 1982; Birch, Marlin, & Rotter, 1984).

Thus, a final detrimental effect of offering financial payments to our mercenary student in return for books read is that the student begins to view reading as extrinsically rather than intrinsically motivated and, later, is less likely to choose to read when continued extrinsic rewards are no longer expected. For instance, in one early study on token economies, Meichenbaum, Bowers, and Ross (1968) established a powerful token incentive program to control the academically disruptive behaviors of delinquent adolescents during the morning class periods in their institutional school. As the weeks progressed, these students' behavior improved markedly in the mornings; at the same time, however, their behavior deteriorated correspondingly in the afternoons, when the extrinsic incentive program was not in effect. "Why should we behave ourselves in the afternoons," the students asked the experimenters, "when we're not being paid for it?"

The three factors combined.
Of course, in most real-world situations these different effects of rewards are not easy to separate. Because any particular reward may affect behavior in any or all of these ways simultaneously, its net effect will necessarily depend both on the potentially competing effects of these different factors and on the specific situation in which subsequent behavior is observed. Hence, there is virtually nothing to be learned from any analysis in which the effects of rewards on performance and motivation are subsumed into a single main effect hypothesis. We already know a great deal about the multiple, theoretically independent, and potentially competing factors that determine the net effects of any given reward procedure. To pretend this knowledge does not exist is to take a major step backwards.15

Concluding Comments

At one time, one of the leading manufacturers of baby food decided to add to its regular line of strained carrots, mashed potatoes, and pureed spinach a more exotic choice of peach melba. This potential gourmet treat, however, bore little resemblance to what one might expect to receive if one ordered such a dessert in a restaurant. Indeed, from the homogenized mass of liquid in the small jar, it was impossible to visually distinguish this dish from dozens of others. In fact, it was hard to imagine exactly how this dish had been prepared. One might have imagined a chef arranging carefully sliced peach sections in syrup on a plate, then drizzling raspberry sauce over them, adding just the right amount of French vanilla ice cream with a bit more raspberry sauce, and topping the dish with an attractive dollop of hand-whipped cream. Once completed, the dish was then presumably handed to the chef's assistant, who slid the entire concoction into an industrial strength blender and flipped the switch to "liquefy." In our view, the thoroughly homogenized result of this process bears roughly the same relationship to the original dish as Cameron and Pierce's meta-analysis of the intrinsic motivation literature does to the studies that comprise that literature.

Just as the giant blender in this example rendered unrecognizable the original ingredients and the actual concoction from which a homogenized soup was made, so Cameron and Pierce's techniques render unrecognizable the studies and findings regarding the effects of extrinsic rewards on intrinsic motivation. By posing a meaningless main effect null hypothesis in the face of numerous crossover interactions, by averaging across both diametrically opposed effects and specifically designed experimental and control conditions, by giving equal weight to the worst-controlled and the best-designed experiments, by completely ignoring both theoretically telling but uncommon findings and highly probative within-experiment comparisons, their meta-analysis has reduced an extensive, rich, and predictably complex research literature to pap.

As we have already noted, our criticism of Cameron and Pierce's review should not be taken as a general indictment of meta-analysis. Most of these difficulties are not inherent in the statistical technique of meta-analysis, and we would find them equally problematic in a narrative review.
However, we are unaware of any narrative review of this literature in which studies are discussed only after averaging across competing or differential effects, in which between-study comparisons are given effective priority over within-study comparisons, in which singular but theoretically telling variations in procedure are completely ignored, or in which a main effect hypothesis is championed in the face of demonstrable and replicable crossover interactions. Indeed, we find it difficult to imagine that comparable procedures, if employed in a narrative review, would have survived the editorial process; and we believe that Cameron and Pierce's use of these techniques has gained acceptance only because they have been hidden behind the aura of objectivity generally associated with meta-analysis. Unfortunately, the mere use of meta-analytic procedures is not, as is sometimes supposed, any guarantee of objectivity or accuracy.

Cameron and Pierce appear to have felt justified in their efforts by the fact that some authors have unfairly overgeneralized from the literature on intrinsic motivation. In our view, their equally inappropriate overgeneralization in the other direction, based on egregiously flawed and overly simplistic analyses, remains unwarranted and of no probative value. As the old saying has it, "Two wrongs do not make a right."

Over a century ago, Benjamin Disraeli opined that "there are three kinds of lies: lies, damned lies, and statistics." For years, as consumers and as proponents of statistics, the present authors have found Disraeli's description of statistics as the blackest of all lies, as an altogether too harsh and sweeping condemnation. Perhaps, we see now, he was simply anticipating the day when the sorts of procedures employed by Cameron and Pierce could be used to turn silk purses into sows' ears.
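The averaging problem criticized above can be made concrete with a small numerical sketch. The Python fragment below is an illustration only, not an analysis from any of the studies discussed: the effect sizes are invented for the purpose, and the names `effects`, `pooled_main_effect`, and `conditional_effects` are hypothetical. It assumes a simple crossover pattern in which a reward lowers a standardized measure of later interest for initially interesting tasks and raises it for initially uninteresting ones, and it then compares the pooled average with the averages computed within each condition.

```python
from statistics import mean

# Hypothetical standardized effect sizes (illustrative values only,
# not taken from any study in the literature under discussion).
effects = {
    "high initial interest": [-0.60, -0.50, -0.70],
    "low initial interest": [0.55, 0.65, 0.60],
}


def pooled_main_effect(groups):
    """Unweighted mean over every effect size, ignoring the moderator."""
    return mean(d for values in groups.values() for d in values)


def conditional_effects(groups):
    """Mean effect size within each level of the moderator."""
    return {level: mean(values) for level, values in groups.items()}


overall = pooled_main_effect(effects)
by_level = conditional_effects(effects)

print(f"pooled main effect: {overall:.2f}")
for level, d in by_level.items():
    print(f"{level}: {d:.2f}")
```

With these made-up values the pooled mean is essentially zero even though each condition shows a sizable effect, which is the sense in which a single main-effect estimate can conceal a crossover interaction. An unweighted mean is used only for simplicity; a real synthesis would normally weight studies, for example by inverse variance.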


Notes

1. That this allusion to "several researchers" is not an accurate depiction of the authors' personal views is evident from their own discussions elsewhere in the paper concerning the "large body of research" on this topic, "the major contention in education and psychology . . . that rewards and reinforcement negatively impact a person's intrinsic motivation," and the like -- not to mention the roughly 100 studies they cite and another several dozen that might also have been included.

2. Indeed, the long history of theoretical controversy surrounding this literature is highlighted by a study in which Mahoney (1976), then an editor of an avowedly behavioristic journal, sent reviewers what were alleged to be sections of an actual manuscript for review. In the study described in this phony manuscript, reinforcement was provided and then withdrawn in a particular setting. In a version sent to some reviewers, the alleged result was a decrease in subsequent performance to below baseline levels; in a version sent to other reviewers, the alleged result was simply a return to baseline levels. Given identical procedures, behavioral reviewers were vastly more likely to reject the article if it appeared to show a decrease in performance below baseline. We presume that reviewers in the intrinsic motivation camp (such reviewers were not included in Mahoney's study) would have shown comparable, though opposite, biases.

3. Although this case had not yet been investigated, even in those early days the additional complexity of performance-contingent rewards was predictable, because such rewards are expected to involve multiple and inherently contradictory effects on intrinsic motivation (e.g., Condry, 1977; Deci, 1975).
4. It is important to note here that our argument is not that main-effect hypotheses will always, or even generally, be inappropriate; they will be so, however, when applied to literatures in which the "sample" of experimental procedures included in the studies is nonrepresentative, as will typically be the case when researchers have deliberately sought to introduce control procedures and/or have deliberately developed theoretically informative but ecologically uncharacteristic procedures.

5. Newman and Layton (1984) also demonstrate this same significant crossover interaction between initial interest and expected, tangible reward, significant at the .025 level. Because this study is not mentioned by Cameron and Pierce, its findings are not included in the current analysis.

6. In fact, Cameron and Pierce themselves offer the speculation that "reinforcement does not interrupt intrinsic motivation for low interest activities" to explain other findings discussed briefly on page 393 of their article. They make no mention, however, of the fact that this exact hypothesis has been extensively tested in a number of the very studies they have reviewed.

7. We should note that this "average" effect size of -.08 is not identical to the "average" effect size of -.02 for these three studies computed from Cameron and Pierce's appendix, for two primary reasons. First, we have computed actual effect sizes in some cases where Cameron and Pierce employed random estimates, and second, we have averaged effect sizes across multiple measures within studies.

8. Alternatively, in some experiments, the same theoretical end may be accomplished through the use of a factorial design in which variations in A can be demonstrated to produce some effect under conditions B1, but not to produce that effect under slightly different conditions B2 (e.g., Pretty & Seligman, 1984).

9. By "standard" here and in the following several paragraphs we mean only to indicate an expected tangible reward condition in which the additional variable of special interest in a particular study is not manipulated.

10. In the realm of attitude measures, similarly, one needs to ensure that subjects are clear that they are being asked to provide their attitudes concerning engagement in the activity itself, and not engagement in the activity coupled with the receipt of some attractive reward. Salancik and Conway (1975), for example, have shown empirically that a simple "How much did you like/enjoy X?" question will be taken by subjects to mean the former in a context that emphasizes intrinsic interest but to mean the latter in a context that emphasizes extrinsic rewards or constraints. Naturally, these two interpretations of the question produced quite different answers.

11. Unlike the cases cited here, a few other studies that also used tasks that might be initially coded as relatively uninteresting (e.g., stabilometer) do provide direct evidence of the initial attractiveness of the activity to the subject population under study.

12. Note that unless there were some reason to doubt Kruglanski et al.'s findings, it is unlikely that others would repeat this procedure, because the procedure is neither likely to occur in everyday life nor likely to be seen as a viable ameliorative strategy. Hence, it is not an accident that there has been no follow-up study using this sort of procedure in the more than 2 decades since the publication of this study.

13. Indeed, we should be clear that these are not at all simple problems. An expected verbal reward procedure, which typically consists of telling subjects at the outset that they will receive feedback concerning their performance, differs in many other theoretically significant respects from a comparable expected, tangible reward procedure in which subjects know in advance exactly what they will receive (e.g., 1 dollar or 10 candies per puzzle solved).

14. Even more striking though less educationally relevant illustrations of this basic point can be seen in still famous and widely cited studies like Asch's (1951) investigations of conformity to group pressure and Milgram's (1965, 1974) experiments on obedience to authorities.
In these classic cases, no formal hypotheses were stated, no comparisons were needed, and no statistics were required. Yet, all of us were good enough psychologists to realize that a 65% rate of obedience or a 35% rate of conformity -- in the specific situations created in these classic studies -- represented extraordinarily powerful results (see Lepper, 1995, and Prentice & Miller, 1992, for more extended discussions of these points).

15. At the same time, we want to acknowledge clearly that this analysis will not always permit one to derive an unequivocal prediction about the absolute effects of a given reward procedure on an individual's subsequent actions. In those situations in which competing forces are at work, we would need to be able to assess the power of these competing forces effectively. Nonetheless, it is equally important to be clear that it is quite possible to generate clear relative predictions within experiments in which individual features of a reward procedure can be systematically varied in settings in which the other features of that procedure have been experimentally controlled.

References
Abelson, R. P. (1995). Statistics as principled argument. Hillsdale, NJ: Erlbaum.
Aronson, E., Ellsworth, P. C., Carlsmith, J. M., & Gonzales, M. H. (1990). Methods of research in social psychology (2nd ed.). New York: McGraw-Hill.
Asch, S. (1951). Effects of group pressure upon the modification and distortion of judgments. In H. Guetzkow (Ed.), Groups, leadership, and men (pp. 177-190). Pittsburgh, PA: Carnegie Press.
Bandura, A. (1977). Social learning theory. Englewood Cliffs, NJ: Prentice-Hall.
Bandura, A. (1986). Social foundations of thought and action. Englewood Cliffs, NJ: Prentice-Hall.
Bandura, A., & Walters, R. H. (1963). Social learning and personality development. New York: Holt, Rinehart and Winston.
Bates, J. A. (1979). Extrinsic reward and intrinsic motivation: A review with implications for the classroom. Review of Educational Research, 49, 557-576.
Birch, L. L., Birch, D., Marlin, D. W., & Kramer, L. (1982). Effects of instrumental consumption on children's food preference. Appetite: Journal for Intake Research, 3, 125-134.
Birch, L. L., Marlin, D. W., & Rotter, J. (1984). Eating as the "means" activity in a contingency: Effects on young children's food preference. Child Development, 55, 431-439.
Boggiano, A. K., Main, D. S., & Katz, P. A. (1988). Children's preference for challenge: The role of perceived competence and control. Journal of Personality and Social Psychology, 54, 916-924.
Calder, B. J., & Staw, B. M. (1975). Self-perception of intrinsic and extrinsic motivation. Journal of Personality and Social Psychology, 31, 599-605.
Cameron, J., & Pierce, W. D. (1994). Reinforcement, reward, and intrinsic motivation: A meta-analysis. Review of Educational Research, 64, 363-423.
Condry, J. (1977). Enemies of exploration: Self-initiated versus other-initiated learning. Journal of Personality and Social Psychology, 35, 459-477.
Condry, J., & Chambers, J. (1978). Intrinsic motivation and the process of learning. In M. R. Lepper & D. Greene (Eds.), The hidden costs of reward (pp. 61-84). Hillsdale, NJ: Erlbaum.
Cooper, H. M. (1989). Integrating research: A guide for literature reviews (2nd ed.). Beverly Hills, CA: Sage.
Cooper, H., & Hedges, L. V. (Eds.). (1994). The handbook of research synthesis. New York: Russell Sage Foundation.
Cordova, D. I., & Lepper, M. R. (1994). Intrinsic motivation and the process of learning: Beneficial effects of contextualization, personalization, and choice. Unpublished manuscript, Stanford University, Stanford, CA.
Cordova, D. I., & Lepper, M. R. (1995). The effects of intrinsic versus extrinsic incentives on the process of learning. Unpublished manuscript, Stanford University, Stanford, CA.
Cronbach, L. J., & Snow, R. E. (1977). Aptitudes and instructional methods: Handbook for research on interactions. New York: Irvington.
Deci, E. L. (1971). Effects of externally mediated rewards on intrinsic motivation. Journal of Personality and Social Psychology, 18, 105-155.
Deci, E. L. (1972). The effects of contingent and non-contingent rewards and controls on intrinsic motivation. Organizational Behavior and Human Performance, 8, 217-229.
Deci, E. L. (1975). Intrinsic motivation. New York: Plenum Press.
Deci, E. L., & Ryan, R. M. (1985). Intrinsic motivation and self-determination in human behavior. New York: Plenum Press.
Deci, E. L., & Ryan, R. M. (1987). The support of autonomy and the control of behavior. Journal of Personality and Social Psychology, 53, 1024-1037.
Estes, W. K. (1972). Reinforcement in human behavior. American Scientist, 60, 723-729.
Fargo, G. A., Behms, C., & Nolen, P. (1970). Behavior modification in the classroom. Belmont, CA: Wadsworth.
Harter, S. (1978a). Effectance motivation reconsidered: Toward a developmental model. Human Development, 1, 34-64.
Harter, S. (1978b). Pleasure derived from challenge and the effects of receiving grades on children's difficulty level choices. Child Development, 49, 788-799.
Hedges, L. V. (1987). How hard is hard science, how soft is soft science? The empirical cumulativeness of research. American Psychologist, 42, 443-455.
Hedges, L., & Olkin, I. (1985). Statistical methods for meta-analysis. Orlando, FL: Academic.
Hunt, J. M. V. (1961). Intelligence and experience. New York: Ronald Press.

Kohn, A. (1993). Punished by rewards: The trouble with gold stars, incentive plans, A's, praise, and other bribes. New York: Houghton Mifflin.
Kohn, A. (1996). By all available means: Cameron and Pierce's defense of extrinsic motivators. Review of Educational Research, 66, 1-4.
Kruglanski, A. W., Alon, S., & Lewis, T. (1972). Retrospective misattribution and task enjoyment. Journal of Experimental and Social Psychology, 8, 493-501.
Kruglanski, A. W., Friedman, I., & Zeevi, G. (1971). The effects of extrinsic incentives on some qualitative aspects of task performance. Journal of Personality, 39, 606-617.
Kruglanski, A. W., Riter, A., Amitai, A., Margolin, B., Shabtai, L., & Zaksh, D. (1975). Can money enhance intrinsic motivation: A test of the content-consequences hypothesis. Journal of Personality and Social Psychology, 31, 744-750.
Kruglanski, A. W., Stein, C., & Riter, A. (1977). Contingencies of exogenous reward and task performance: On the "minimax" principle in instrumental behavior. Journal of Applied Social Psychology, 7, 141-148.
Lepper, M. R. (1981). Intrinsic and extrinsic motivation in children: Detrimental effects of superfluous social controls. In W. A. Collins (Ed.), Minnesota symposium on child psychology (Vol. 14, pp. 155-214). Hillsdale, NJ: Erlbaum.
Lepper, M. R. (1983a). Extrinsic reward and intrinsic motivation: Implications for the classroom. In J. M. Levine & M. C. Wang (Eds.), Teacher and student perceptions: Implications for learning (pp. 281-317). Hillsdale, NJ: Erlbaum.
Lepper, M. R. (1983b). Social control processes and the internalization of social values: An attributional perspective. In E. T. Higgins, D. N. Ruble, & W. W. Hartup (Eds.), Developmental social cognition: A sociocultural perspective (pp. 294-330). New York: Cambridge University Press.
Lepper, M. R. (1988). Motivational considerations in the study of instruction. Cognition and Instruction, 5, 289-310.
Lepper, M. R. (1995). Theory by the numbers? Some concerns about meta-analysis as a theoretical tool. Applied Cognitive Psychology, 9, 411-422.
Lepper, M. R., & Gilovich, T. (1981). The multiple functions of reward: A social-developmental perspective. In S. S. Brehm, S. M. Kassin, & F. X. Gibbons (Eds.), Developmental social psychology (pp. 5-31). New York: Oxford University Press.
Lepper, M. R., & Greene, D. (1978a). Divergent approaches to the study of rewards. In M. R. Lepper & D. Greene (Eds.), The hidden costs of reward (pp. 217-244). Hillsdale, NJ: Erlbaum.
Lepper, M. R., & Greene, D. (Eds.). (1978b). The hidden costs of reward. Hillsdale, NJ: Erlbaum.
Lepper, M. R., & Greene, D. (1978c). Overjustification research and beyond: Toward a means-end analysis of intrinsic and extrinsic motivation. In M. R. Lepper & D. Greene (Eds.), The hidden costs of reward (pp. 109-148). Hillsdale, NJ: Erlbaum.
Lepper, M. R., Greene, D., & Nisbett, R. E. (1973). Undermining children's intrinsic interest with extrinsic rewards: A test of the "overjustification" hypothesis. Journal of Personality and Social Psychology, 28, 129-137.
Lepper, M. R., & Hodell, M. (1989). Intrinsic motivation in the classroom. In C. Ames & R. Ames (Eds.), Research on motivation in education (Vol. 3, pp. 73-105). New York: Academic Press.
Lepper, M. R., Sagotsky, G., Dafoe, J., & Greene, D. (1982). Consequences of superfluous social constraints: Effects of nominal contingencies on children's subsequent intrinsic interest. Journal of Personality and Social Psychology, 42, 51-65.
Loveland, K. K., & Olley, J. G. (1979). The effect of external reward on interest and quality of task performance in children of high and low intrinsic motivation. Child Development, 50, 1207-1210.
Mahoney, M. (1976). Scientist as subject: The psychological imperative. Cambridge,

MA: Ballinger.
Mahoney, M. J. (1974). Cognition and behavior modification. Cambridge, MA: Ballinger.
McGraw, K. O. (1978). The detrimental effects of reward on performance: A literature review and a prediction model. In M. R. Lepper & D. Greene (Eds.), The hidden costs of reward (pp. 33-60). Hillsdale, NJ: Erlbaum.
McLoyd, V. C. (1979). The effects of extrinsic rewards of differential value on high and low intrinsic interest. Child Development, 50, 1010-1019.
Meichenbaum, D. H., Bowers, K. S., & Ross, R. R. (1968). Modification of classroom behavior of institutionalized female adolescent offenders. Behavior Research and Therapy, 6, 343-353.
Milgram, S. (1965). Some conditions of obedience and disobedience to authority. Human Relations, 18, 57-76.
Milgram, S. (1974). Obedience to authority: An experimental view. New York: Harper and Row.
Mischel, W. (1973). Towards a cognitive social learning reconceptualization of personality. Psychological Review, 80, 252-283.
Morgan, M. (1984). Reward-induced decrements and increments in intrinsic motivation. Review of Educational Research, 54, 683-692.
Mullen, B. (1989). Advanced BASIC meta-analysis. Hillsdale, NJ: Erlbaum.
Neale, J. M., & Liebert, R. M. (1986). Science and behavior: An introduction to methods of research (3rd ed.). Englewood Cliffs, NJ: Prentice-Hall.
Newman, J., & Layton, B. D. (1984). Overjustification: A self-perception perspective. Personality and Social Psychology Bulletin, 10(3), 419-425.
O'Leary, K. D., & Drabman, R. (1971). Token reinforcement programs in the classroom: A review. Psychological Bulletin, 75, 379-398.
Pittman, T. S., Cooper, E. E., & Smith, T. W. (1977). Attribution of causality and the overjustification effect. Personality and Social Psychology Bulletin, 3, 280-283.
Pittman, T. S., Davey, M. E., Alafat, K. A., Wetherill, K. V., & Kramer, N. A. (1980). Informational vs. controlling verbal rewards. Personality and Social Psychology Bulletin, 6, 228-233.
Prentice, D. A., & Miller, D. T. (1992). When small effects are impressive. Psychological Bulletin, 112, 160-164.
Pretty, G. H., & Seligman, C. (1984). Affect and the overjustification effect. Journal of Personality and Social Psychology, 46, 1241-1253.
Quattrone, G. (1985). On the congruity between internal states and action. Psychological Bulletin, 98, 3-40.
Rosenthal, R. (1979). The "file drawer problem" and tolerance for null results. Psychological Bulletin, 86, 638-641.
Rosenthal, R. (1984). Meta-analytic procedures for social research. Beverly Hills, CA: Sage.
Rosenthal, R. (1991). Meta-analytic procedures for social research (Rev. ed.). Newbury Park, CA: Sage.
Rosenthal, R., & Jacobson, L. (1968). Pygmalion in the classroom: Teacher expectation and pupils' intellectual development. New York: Holt.
Ross, L., & Nisbett, R. E. (1991). The person and the situation: Perspectives of social psychology. Philadelphia, PA: Temple University Press.
Ross, M. (1975). Salience of reward and intrinsic motivation. Journal of Personality and Social Psychology, 32, 245-254.
Rummel, A., & Feinberg, R. (1988). Cognitive evaluation theory: A meta-analytic review of the literature. Social Behavior and Personality, 16, 147-164.
Ryan, R. M. (1982). Control and information in the intrapersonal sphere: An extension of cognitive evaluation theory. Journal of Personality and Social Psychology, 43,

450-461.
Ryan, R. M., & Deci, E. L. (1996). When paradigms clash: Comments on Cameron and Pierce's claim that rewards do not undermine intrinsic motivation. Review of Educational Research, 66, 33-38.
Ryan, R. M., Mims, V., & Koestner, R. (1983). Relation of reward contingency and interpersonal context to intrinsic motivation: A review and test using cognitive evaluation theory. Journal of Personality and Social Psychology, 45, 736-750.
Salancik, G. R., & Conway, M. (1975). Attitude inferences from salient and relevant cognitive content about behavior. Journal of Personality and Social Psychology, 32, 829-840.
Sarafino, E. P. (1984). Intrinsic motivation and delay of gratification in preschoolers: The variables of reward salience and length of expected delay. British Journal of Developmental Psychology, 2, 149-156.
Slavin, R. E. (1986). Best evidence synthesis: An alternative to meta-analytic and traditional reviews. Educational Researcher, 15(9), 5-11.
Snow, R. E. (1994). Abilities in academic tasks. In R. J. Sternberg & R. K. Wagner (Eds.), Mind in context: Interactionist perspectives on human intelligence (pp. 3-37). New York: Cambridge University Press.
Snow, R. E., Federico, P.-A., & Montague, W. E. (Eds.). (1980a). Aptitude, learning, and instruction: Vol. 1. Cognitive process analyses of aptitude. Hillsdale, NJ: Erlbaum.
Snow, R. E., Federico, P.-A., & Montague, W. E. (Eds.). (1980b). Aptitude, learning, and instruction: Vol. 2. Cognitive process analyses of learning and problem-solving. Hillsdale, NJ: Erlbaum.
Snow, R. E., & Lohman, D. F. (1984). Toward a theory of cognitive aptitude for learning from instruction. Journal of Educational Psychology, 76, 347-376.
Staw, B. M., Calder, B. J., Hess, R. K., & Sandelands, L. E. (1980). Intrinsic motivation and norms about payment. Journal of Personality, 48, 1-14.
Tang, S., & Hall, V. (1995). The overjustification effect: A meta-analysis. Applied Cognitive Psychology, 9, 365-404.
Tukey, J. W. (1977). Exploratory data analysis. Reading, MA: Addison-Wesley.
Zajonc, R. B. (1968). Attitudinal effects of mere exposure. Journal of Personality and Social Psychology, 9(Monograph suppl. 2, part 2), 1-68.
Zajonc, R. B. (1980). Feeling and thinking: Preferences need no inferences. American Psychologist, 5, 151-175.

Authors

MARK R. LEPPER is Professor, Department of Psychology, Stanford University, Jordan Hall-Building 420, Stanford, CA 94305-2130; lepper@psych.stanford.edu. He specializes in social, developmental, and educational psychology.

MARK KEAVNEY is President, Serious Fun, 4265 Alma Street, Palo Alto, CA 94306. He specializes in social psychology.

MICHAEL DRAKE is Research Assistant, Department of Psychology, Stanford University, Jordan Hall-Building 420, Stanford, CA 94305-2130. He specializes in social psychology.

Received July 18, 1995
Revision received August 1, 1995
Second revision received October 23, 1995
Accepted November 20, 1995
