


Animal Learning & Cognition


An Introduction
Third Edition

John M. Pearce

Cardiff University


Published in 2008 by Psychology Press, 27 Church Road, Hove, East Sussex, BN3 2FA

Simultaneously published in the USA and Canada by Psychology Press, 270 Madison Ave, New York, NY 10016

www.psypress.com

Psychology Press is an imprint of the Taylor & Francis Group, an informa business

© 2008 Psychology Press. All rights reserved. No part of this book may be reprinted or reproduced or utilized in any form or by any electronic, mechanical, or other means, now known or hereafter invented, including photocopying and recording, or in any information storage or retrieval system, without permission in writing from the publishers.

The publisher makes no representation, express or implied, with regard to the accuracy of the information contained in this book and cannot accept any legal responsibility or liability for any errors or omissions that may be made.

British Library Cataloguing in Publication Data
A catalogue record for this book is available from the British Library

Library of Congress Cataloging-in-Publication Data
Pearce, John M.
Animal learning and cognition: an introduction / John M. Pearce.
p. cm.
Includes bibliographical references and index.
ISBN 9781841696553; ISBN 9781841696560
1. Animal intelligence. I. Title.
QL785.P32 2008
591.5'13 dc22
2007034019

ISBN: 9781841696553 (hbk)
ISBN: 9781841696560 (pbk)

Typeset by Newgen Imaging Systems (P) Ltd, Chennai, India
Printed and bound in Slovenia


For Victoria


Contents

Preface ix

1 The study of animal intelligence 2
The distribution of intelligence 4
Defining animal intelligence 12
Why study animal intelligence? 16
Methods for studying animal intelligence 20
Historical background 22

2 Associative learning 34
Conditioning techniques 36
The nature of associative learning 42
Stimulus–stimulus learning 49
The nature of US representations 52
The conditioned response 55
Concluding comment: the reflexive nature of the conditioned response 60

3 The conditions for learning: Surprise and attention 62
Part 1: Surprise and conditioning 64
Conditioning with a single CS 64
Conditioning with a compound CS 68
Evaluation of the Rescorla–Wagner model 72
Part 2: Attention and conditioning 74
Wagner's theory 76
Stimulus significance 80
The Pearce–Hall theory 86
Concluding comments 91

4 Instrumental conditioning 92
The nature of instrumental learning 93
The conditions of learning 97
The performance of instrumental behavior 106
The Law of Effect and problem solving 111

5 Extinction 122
Extinction as generalization decrement 123
The conditions for extinction 125
Associative changes during extinction 134
Are trials important for Pavlovian extinction? 142

6 Discrimination learning 148
Theories of discrimination learning 149
Connectionist models of discrimination learning 161
Metacognition and discrimination learning 166

7 Category formation 170
Examples of categorization 171
Theories of categorization 173
Abstract categories 179
Relationships as categories 180
The representation of knowledge 188

8 Short-term retention 190
Methods of study 191
Forgetting 199
Theoretical interpretation 202
Serial position effects 206
Metamemory 207

9 Long-term retention 212
Capacity 214
Durability 215
Theoretical interpretation 218
Episodic memory 225

10 Time, number, and serial order 232
Time 233
Number 243
Serial order 253
Transitive inference 259
Concluding comments 262

11 Navigation 264
Part 1: Short-distance travel 265
Methods of navigation 265
Part 2: Long-distance travel 283
Navigational cues 284
Homing 286
Migration 289
Concluding comments 293

12 Social learning 296
Diet selection and foraging 298
Choosing a mate 301
Fear of predators 301
Copying behavior: mimicry 302
Copying behavior: imitation 304
Theory of mind 312
Self-recognition 319
Concluding comments 324

13 Animal communication and language 326
Animal communication 327
Communication and language 336
Can an ape create a sentence? 339
Language training with other species 350
The requirements for learning a language 356

14 The distribution of intelligence 360
Intelligence and brain size 361
The null hypothesis 364
Intelligence and evolution 369

References 373
Author index 403
Subject index 411


Preface

In preparing the third edition of this book, my aim, as it was for the previous editions, has been to provide an overview of what has been learned by pursuing one particular approach to the study of animal intelligence. It is my belief that the intelligence of animals is the product of a number of mental processes. I think the best way of understanding these processes is by studying the behavior of animals in an experimental setting. This book, therefore, presents what is known about animal intelligence by considering experimental findings from the laboratory and from more naturalistic settings. I do not attach any great importance to the distinction between animal learning and animal cognition. Research in both areas has the common goal of elucidating the mechanisms of animal intelligence and, very often, this research is conducted using similar procedures. If there is any significance to the distinction, then it is that the fields of animal learning and animal cognition are concerned with different aspects of intelligence. Chapters 2 to 6 are concerned predominantly with issues that fall under the traditional heading of animal learning theory. My main concern in these chapters is to show how it is possible with a few simple principles of associative learning to explain a surprisingly wide range of experimental findings. Readers familiar with the previous edition will notice that apart from a new chapter devoted to extinction, there are relatively few changes to this part of the book. This lack of change does not mean that researchers are no longer actively investigating the basic learning processes in animals. Rather, it means that the fundamental principles of

learning are now reasonably well established and that current research is directed towards issues that are too advanced to be considered in an introductory textbook.

The second half of the book covers material that is generally treated under the heading of animal cognition. My overall aim in these chapters is to examine what has been learned from studying animal behavior about such topics as memory, the representation of knowledge, navigation, social learning, communication, and language. I also hope to show that the principles developed in the earlier chapters are of relevance to understanding research that is reviewed in the later chapters. It is in this part of the book that the most changes have been made. Research on animal cognition during the last 10 years has headed in many new directions. I have tried to present a clear summary of this research, as well as a balanced evaluation of its theoretical implications.

Those who wish to study the intelligence of animals face a daunting task. Not only are there numerous different species to study, but there is also an array of intellectual skills to be explored, each posing a unique set of challenging theoretical problems. As a result, many of the topics that I discuss are still in their infancy. Some readers may therefore be disappointed to discover that we are still trying to answer many of the interesting questions that can be asked about the intelligence of animals. On the other hand, it is just this lack of knowledge that makes the study of animal learning and cognition so exciting. Many fascinating discoveries remain to be made once the appropriate experiments have been conducted.

One of the rewards for writing a book is the opportunity it provides to thank the many friends and colleagues who have been so generous with the help they have given me. The way in which this book is organized, and much of the material it contains, have been greatly influenced by numerous discussions with A. Dickinson, G. Hall, N. J. Mackintosh, and E. M. Macphail. Different chapters have benefited greatly from the critical comments on earlier versions by A. Aydin, N. Clayton, M. Haselgrove, C. Heyes, V. LoLordo, A. McGregor, E. Redhead, and P. Wilson. A special word of thanks is due to Dave Lieberman, whose thoughtful comments on an earlier draft of the present edition identified numerous errors and helped to clarify the manner in which much of the material is presented. The present edition has also greatly benefited from the detailed comments on the two previous editions by N. J. Mackintosh. I should also like to express my gratitude to the staff at Psychology Press. Without the cajoling and encouragement of the Assistant Editor, Tara Stebnicky, it is unlikely that I would have embarked on this revision. I am particularly grateful to the Production Editor, Veronica Lyons, who, with generous amounts of enthusiasm and imagination, has done a wonderful job in trying to transform a sow's ear into a silk purse. Thanks are also due to the colleagues who were kind enough to send me photographs of their subjects while they were being tested. Finally, there is the pleasure of expressing gratitude to Victoria, my wife, who once again patiently tolerated the demands made on her while this edition was being prepared. In previous editions I offered similar thanks to my children, but there is no need on this occasion now that they have left home. Even so, Jess, Alex, and Tim would never forgive me if I neglected to mention their names.
While preparing for this revision I read a little about Darwin's visit to the Galapagos Islands. I was so intrigued by the influence they had on him that I felt compelled to visit the islands myself. During the final stages of preparing this edition, Veronica and Tania, somewhat reluctantly, allowed me a two-week break to travel to the Galapagos Islands. The holiday was one of the highlights of my life.


The sheer number of animals, and their absolute indifference to the presence of humans, was overwhelming. The picture on the previous page shows me trying unsuccessfully to engage a giant tortoise in conversation. This, and many other photographs, were taken without any elaborate equipment, and thus reveal how the animals allowed me to approach as close as I wished in order to photograph them. I came away from the islands having discovered little that is new about the intelligence of animals, but with a deeper appreciation of how the environment shapes not only their form, but also their behavior.

John M. Pearce
October, 2007


CHAPTER 4

CONTENTS
The nature of instrumental learning 93
The conditions of learning 97
The performance of instrumental behavior 106
The Law of Effect and problem solving 111


Instrumental conditioning
Behavior is affected by its consequences. Responses that lead to reward are repeated, whereas those that lead to punishment are withheld. Instrumental conditioning refers to the method of using reward and punishment in order to modify an animal's behavior. The first laboratory demonstration of instrumental conditioning was provided by Thorndike (1898) who, as we saw in Chapter 1, trained cats to make a response in order to escape from a puzzle box and earn a small amount of fish. Since this pioneering work, there have been many thousands of successful demonstrations of instrumental conditioning, employing a wide range of species and a variety of experimental designs. Skinner, for example, taught two pigeons, by means of instrumental conditioning, to play ping-pong with each other.

From the point of view of understanding the mechanisms of animal intelligence, three important issues are raised by a successful demonstration of instrumental conditioning. First, we need to know what information an animal acquires as a result of its training. Pavlovian conditioning was shown to promote the growth of stimulus–stimulus associations, but what sort of associations develop when a response is followed by a reward or punishment? Second, once the nature of the associations formed during instrumental conditioning has been identified, we need to specify the conditions that promote their growth. Surprise, for example, is important for successful Pavlovian conditioning, but what are the necessary ingredients to ensure the success of instrumental conditioning? Finally, we need to understand the factors that determine when, and how vigorously, an instrumental response will be performed.

Before turning to a detailed discussion of these issues, we must be clear about what is meant by the term reinforcer. This term refers to events that result in the strengthening of an instrumental response. These events are classified as positive reinforcers, when they consist of the delivery of a stimulus, or negative reinforcers, when they involve the removal of a stimulus.

KEY TERM
Reinforcer An event that increases the probability of a response when presented after it. If the event is the occurrence of a stimulus, such as food, it is referred to as a positive reinforcer; but if the event is the removal of a stimulus, such as shock, it is referred to as a negative reinforcer.

THE NATURE OF INSTRUMENTAL LEARNING

Historical background
Thorndike (1898) was the first to propose that instrumental conditioning is based on learning about responses. According to his Law of Effect, when a response is followed by a reinforcer, then a stimulus–response (S–R) connection is strengthened. In the case of a rat that must press a lever for food, the stimulus might be the lever itself and the response would be the action of pressing the lever. Each successful lever press would thus serve to strengthen a connection between the sight of the lever and the response of pressing it. As a result, whenever the rat came across the lever in the future, it would be likely to press it and thus gain reward. This analysis of instrumental conditioning has formed the basis of a number of extremely influential theories of learning (e.g. Hull, 1943).

A feature of the Law of Effect that has proved unacceptable to the intuitions of many psychologists is that it fails to allow the animal to anticipate the goal for which it is responding. The only knowledge that an S–R connection permits an animal to possess is the knowledge that it must make a particular response in the presence of a given stimulus. The delivery of food after the response will, according to the Law of Effect, effectively come as a complete surprise to the animal. In addition to sounding implausible, this proposal has for many years conflicted with a variety of experimental findings. One early finding was reported by Tinklepaugh (1928), who required monkeys to select one of two food wells to obtain reward. On some trials the reward was a banana, which was greatly preferred to the other reward, a lettuce leaf. Once the animals had been trained they were occasionally presented with a lettuce leaf when they should have received a banana. The following quote, which is cited in Mackintosh (1974), provides a clear indication that the monkey expected a more attractive reward for making the correct response (Tinklepaugh, 1928, p. 224):

She extends her hand to seize the food. But her hand drops to the floor without touching it. She looks at the lettuce but (unless very hungry) does not touch it. She looks around the cup and behind the board. She stands up and looks under and around her. She picks the cup up and examines it thoroughly inside and out. She had on occasion turned toward the observers present in the room and shrieked at them in apparent anger.
FIGURE 4.1 The mean number of errors made by two groups of rats in a multiple-unit maze. For the first nine trials the reward for the control group was more attractive than for the experimental group, but for the remaining trials both groups received the same reward (adapted from Elliott, 1928).

A rather different type of finding that shows animals anticipate the rewards for which they are responding can be found in experiments in which rats ran down an alley, or through a maze, for food. If a rat is trained first with one reward which is then changed in attractiveness, there is a remarkably rapid change in its performance on subsequent trials. Elliott (1928) found that the number of errors in a multiple-unit maze increased dramatically when the quality of reward in the goal box was reduced. Indeed, the animals were so dejected by this change that they made more errors than a control group that had been trained throughout with the less attractive reward (Figure 4.1). According to S–R theory, the change in performance by the experimental group should have taken place more slowly, and should not have resulted in less accurate responding than that shown by the control group. As an alternative explanation, these findings imply that the animals had some expectancy of the reward they would receive in the goal box, which allowed them to detect when it was made less attractive.

Tolman (1932) argued that findings such as these indicate that rats form R–unconditioned stimulus (US) associations as a result of instrumental conditioning. They are assumed to learn that a response will be followed by a particular outcome. There is no doubt that the results are consistent with this proposal, but they do not force us to accept it. Several S–R theorists have pointed out that the anticipation of reward could have been based on conditioned stimulus (CS)–US, rather than R–US


associations. In Elliott's (1928) experiment, for example, the animal consumed the reward in the goal box. It is possible that the stimuli created by this part of the apparatus served as a CS that became associated with food. After a number of training trials, therefore, the sight of the goal box would activate a representation of the reward and thereby permit the animal to detect when its value was changed. Both Hull (1943) and Spence (1956) seized on this possibility and proposed that the strength of instrumental responding is influenced by the Pavlovian properties of the context in which the response is performed. The debate between S–R theorists and what might be called the expectancy (R–US) theorists continued until the 1970s (see, for example, Bolles, 1972). In the last 20 years or so, however, experiments have provided new insights into the nature of the associations that are formed during instrumental conditioning. To anticipate the following discussion, these experiments show that both the S–R and the expectancy theorists were correct. The experiments also show that these theorists underestimated the complexity of the information that animals can acquire in even quite simple instrumental conditioning tasks.

Evidence for R–US associations


To demonstrate support for an expectancy theory of instrumental conditioning, Colwill and Rescorla (1985) adopted a reinforcer devaluation design (see also Adams & Dickinson, 1981). A single group of rats was trained in the manner summarized in Table 4.1. In the first (training) stage of the experiment subjects were able to make one response (R1) to earn one reinforcer (US1) and another response (R2) to earn a different reinforcer (US2). The two responses were lever pressing or pulling a small chain that was suspended from the ceiling, and the two reinforcers were food pellets or sucrose solution. After a number of sessions of this training, an aversion was formed to US1 by allowing subjects free access to it and then injecting them with a mild poison (lithium chloride; LiCl). This treatment was so effective that subjects completely rejected US1 when it was subsequently presented to them. For the test trials subjects were again allowed to make either of the two responses, but this time neither response led to the delivery of a reinforcer. The results from the experiment are shown in Figure 4.2, which indicates that R2 was performed more vigorously than R1. The figure also shows a gradual decline in the strength of R2, which reflects the fact that neither response was followed by reward. This pattern of results can be most readily explained by assuming that during their training rats formed R1–US1 and R2–US2 associations. They would then be reluctant to perform R1 in the test phase because of their knowledge that this response produced a reinforcer that was no longer attractive.
TABLE 4.1 Summary of the training given to a single group of rats in an experiment by Colwill and Rescorla (1985)

Training: R1 → US1; R2 → US2
Devaluation: US1 → LiCl
Test: R1 versus R2

LiCl, lithium chloride; R, response; US, unconditioned stimulus.

KEY TERM
Reinforcer devaluation A technique in which the positive reinforcer for an instrumental response is subsequently devalued, normally by pairing its consumption with illness.
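The logic of the devaluation test can be put in a short sketch: responding driven by R–US knowledge should track the current value of each outcome. An outcome-blind "habit" term is included here to foreshadow the S–R evidence discussed below. This is only an illustration; the association strengths, the blending weight W, and the idea of summing the two influences are assumptions, not values estimated by Colwill and Rescorla (1985).

```python
# A minimal sketch, under assumed values, of how R-US knowledge combined
# with the current value of each outcome, plus an outcome-blind S-R habit,
# could generate the test preference in a devaluation design.

outcome_value = {"US1": 0.0,   # devalued by pairing with LiCl
                 "US2": 1.0}   # still attractive

r_us = {"R1": "US1", "R2": "US2"}    # response-outcome knowledge from training
habit = {"R1": 1.0, "R2": 1.0}       # S-R strengths, insensitive to outcome value

W = 0.8  # assumed weight on goal-directed (R-US) control

for r in ("R1", "R2"):
    tendency = W * outcome_value[r_us[r]] + (1 - W) * habit[r]
    print(r, round(tendency, 2))
# R1 retains some strength through the habit term even though its outcome
# is rejected, mirroring the residual R1 responding in Figure 4.2.
```

Shifting weight from the R–US term to the habit term makes the choice between R1 and R2 indifferent to devaluation, which is the pattern the following sections consider.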


Evidence for S–R associations


The evidence that instrumental conditioning results in the development of S–R associations is perhaps less convincing than that concerning the development of R–US associations. A re-examination of Figure 4.2 reveals that after the devaluation treatment there remained a tendency to perform R1. This tendency was sustained even though the response never resulted in the delivery of a reinforcer and, more importantly, it was sustained even though the devaluation training resulted in a complete rejection of US1. The fact that an animal is willing to make a response, even though it will reject the reinforcer that normally follows the response, is just what would be expected if the original training resulted in the growth of an S–R connection. In other words, because an S–R connection does not allow an animal to anticipate the reward it will receive for its responses, once such a connection has formed the animal will respond for the reward even if it is no longer attractive. Thus the results of the experiment by Colwill and Rescorla (1985) indicate that during the course of their training rats acquired both R–US and S–R associations.

FIGURE 4.2 The mean rates at which a single group of rats performed two responses, R1 and R2, that had previously been associated with two different rewards. Before the test sessions, the reward for R1, but not R2, had been devalued. No rewards were presented in the test session (adapted from Rescorla, 1991).

Readers who are struck by the rather low rate at which R1 was performed might conclude that the S–R connection is normally of little importance in determining responding. Note, however, that for the test trials there was the opportunity of performing either R1 or R2. Even a slight preference for R2 would then have a suppressive effect on the performance of R1. On the basis of the present results, therefore, it is difficult to draw precise conclusions concerning the relative contribution of S–R and R–US associations to instrumental responding.

To complicate matters even further, it seems that the relative contribution of S–R and R–US associations to instrumental behavior is influenced by the training given. Adams and Dickinson (1981) conducted a series of experiments in which rats had to press a lever for food. An aversion to the food was then conditioned using a technique similar to that adopted by Colwill and Rescorla (1985). If a small amount of instrumental training had been given initially, then subjects showed a marked reluctance to press the lever in a final test session. But if extensive instrumental training had been given initially, there was little evidence of any effect at all of the devaluation treatment. Adams and Dickinson (1981) were thus led to conclude that R–US associations underlie the acquisition and early stages of instrumental training, but with extended practice this learning is transformed into an S–R habit. There is some debate about the reasons for this change in influence of the two associations, or whether it always takes place (see Dickinson & Balleine, 1994).

Evidence for S–(R–US) associations


Animals can thus learn to perform a particular response in the presence of a given stimulus (S–R learning), and they can also learn that a certain reinforcer will follow a response (R–US learning). The next question to ask is whether this information can be integrated to provide the knowledge that, in the presence of a certain stimulus, a certain response will be followed by a certain outcome. Table 4.2 summarizes


TABLE 4.2 Summary of the training given to a single group of rats in an experiment by Rescorla (1991)

Discrimination training: S1: R1 → US1 and R2 → US2; S2: R1 → US2 and R2 → US1
Devaluation: US2 → LiCl
Test: S1: R1 versus R2; S2: R2 versus R1

LiCl, lithium chloride; R, response; S, stimulus; US, unconditioned stimulus.

the design of an experiment by Rescorla (1991) that was conducted to test this possibility. A group of rats first received discrimination training in which a light or a noise (S1 or S2) was presented for 30 seconds at a time. During each stimulus the rats were trained to perform two responses (pulling a chain or pressing a lever), each of which resulted in a different reinforcer (food pellets or sucrose solution). The design of conditioning experiments is rarely simple and, in this case, it was made more difficult by reversing the response–reinforcer relationships for the two stimuli. Thus in S1, R1 led to US1 and R2 led to US2; but in S2, R1 led to US2 and R2 led to US1. For the second stage of the experiment, the reinforcer devaluation technique was used to condition an aversion to US2. Finally, test trials were conducted in extinction in which subjects were provided with the opportunity of performing the two responses in the presence of each stimulus. The result from these test trials was quite clear. There was a marked preference to perform R1, rather than R2, in the presence of S1; but in the presence of S2 there was a preference to perform R2 rather than R1. These findings cannot be explained by assuming that the only associations acquired during the first stage were S–R, otherwise the devaluation technique would have been ineffective. Nor can the results be explained by assuming that only R–US associations developed, otherwise the devaluation treatment should have weakened R1 and R2 to the same extent in both stimuli. Instead, the results can be most readily explained by assuming that the subjects were sensitive to the fact that the devalued reinforcer followed R2 in S1, and followed R1 in S2.

Rescorla (1991) has argued that this conclusion indicates the development of a hierarchical associative structure that he characterizes as S–(R–US). Animals are first believed to acquire an R–US association, and this association in its entirety is then assumed to enter into a new association with S. Whether it is useful to propose that an association can itself enter into an association remains to be seen. There are certainly problems with this type of suggestion (see, for example, Holland, 1992). In addition, as Dickinson (1994) points out, there are alternative ways of explaining the findings of Rescorla (1991). Despite these words of caution, the experiment demonstrates clearly that animals are able to anticipate the reward they will receive for making a certain response in the presence of a given stimulus.
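The hierarchical S–(R–US) structure can be pictured as a stimulus-specific lookup of response–outcome relations. The sketch below is only an illustration of the idea with made-up values; it is not Rescorla's formal proposal.

```python
# Each stimulus selects its own response-outcome mapping, so the value of
# a response can differ across stimuli even though the responses and
# outcomes themselves are the same.

s_r_us = {
    "S1": {"R1": "US1", "R2": "US2"},
    "S2": {"R1": "US2", "R2": "US1"},
}
outcome_value = {"US1": 1.0, "US2": 0.0}   # US2 devalued with LiCl

for s, mapping in s_r_us.items():
    preferred = max(mapping, key=lambda r: outcome_value[mapping[r]])
    print(f"In {s} the preferred response is {preferred}")
# In S1 the preferred response is R1; in S2 it is R2 -- the pattern
# observed on the test trials.
```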

THE CONDITIONS OF LEARNING


There is, therefore, abundant evidence to show that animals are capable of learning about the consequences of their actions. We turn now to consider the conditions that enable this learning to take place.


Contiguity

A fundamental principle of the early theories of learning was that instrumental conditioning is most effective when the response is contiguous with or, in other words, followed immediately by the reinforcer. An early demonstration of this influence of contiguity on instrumental conditioning was made by Logan (1960), who trained rats to run down an alley for food. He found that the speed of running was substantially faster if the rats received food as soon as they reached the goal box, as opposed to waiting in the goal box before food was made available. This disruptive effect of waiting was found with delays of as little as 3 seconds. Moreover, the speed of running down the alley was directly related to the duration of the delay in the goal box. This effect, which is referred to as the gradient of delay, has been reported on numerous occasions (e.g. Dickinson, Watt, & Griffiths, 1992).

It is apparent from Logan's (1960) study that even relatively short delays between a response and a reinforcer disrupt instrumental conditioning. Once this finding has been established, it then becomes pertinent to consider by how much the reinforcer can be delayed before instrumental conditioning is no longer possible. The precise answer to this question remains to be sought, but a study by Lattal and Gleeson (1990) indicates that it may be greater than 30 seconds. Rats were required to press a lever for food, which was delivered 30 seconds after the response. If another response was made before food was delivered then the timer was reset and the rat had to wait another 30 seconds before receiving food. This schedule ensured that the delay between any response and food was at least 30 seconds. Despite being exposed to such a demanding method of training, each of the three rats in the experiment showed an increase in the rate of lever pressing as training progressed. The results from one rat are shown in Figure 4.3. The remarkable finding from this experiment is that rats with no prior experience of lever pressing can increase the rate of performing this response when the only response-produced stimulus change occurs 30 seconds after a response has been made.

FIGURE 4.3 The mean rate of pressing a lever by a single rat when food was presented 30 seconds after a response (adapted from Lattal & Gleeson, 1990).

Temporal contiguity is an important factor in the effectiveness of instrumental conditioning. This golden retriever's obedience training will be much more effective if the owner rewards his dog with a treat straight after the desired response.

KEY TERM
Gradient of delay The progressive weakening of an instrumental response as a result of increasing the delay between the completion of the response and the delivery of the reinforcer.

It should be emphasized that the rate of lever pressing by the three rats was relatively slow, and would have been considerably faster if food had been presented immediately after the response. Temporal contiguity is thus important for instrumental conditioning, but such conditioning is still effective, albeit to a lesser extent, when there is a gap between the response and the delivery of reward.
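The resetting delay schedule used by Lattal and Gleeson (1990) is easy to state procedurally: food arrives only once 30 seconds have elapsed since the most recent response. The simulation below is a sketch of that contingency; the random "rat" that presses with a fixed probability each second is an assumption for illustration, not a claim about real behavior.

```python
import random

def session(duration_s=3600, p_press=0.01, delay_s=30):
    """Simulate one session; return the number of food deliveries earned."""
    since_press = None   # seconds since the last press; None until the first press
    food = 0
    for _ in range(duration_s):
        if random.random() < p_press:    # the rat presses the lever
            since_press = 0              # any press resets the 30-s timer
        elif since_press is not None:
            since_press += 1
            if since_press == delay_s:   # 30 s with no further press: food
                food += 1
                since_press = None       # the next delivery needs a new press
    return food

random.seed(1)
print(session())   # only a modest number of pellets, as the schedule demands
```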


Response–reinforcer contingency

We saw in Chapter 3 that the CS–US contingency is important for Pavlovian conditioning because learning is more effective when the US occurs only in the presence of the CS than when the US occurs both in the presence and absence of the CS. An experiment by Hammond (1980) makes a similar point for instrumental behavior, by demonstrating the importance of the response–reinforcer contingency for effective conditioning. The training schedule was quite complex and required that the experimental session was divided into 1-second intervals. If a response occurred in any interval then, for three groups of thirsty rats, water was delivered at the end of the interval with a probability of 0.12. The results from a group that received only this training, and no water in the absence of lever pressing (Group 0), are shown in the left-hand histogram of Figure 4.4. By the end of training this group was responding at more than 50 responses a minute. For the remaining two groups, water was delivered after some of the 1-second intervals in which a response did not occur. For Group 0.08, the probability of one of these intervals being followed by water was 0.08, whereas for Group 0.12 this probability was 0.12. The remaining two histograms show the final response rates for these two groups. Both groups responded more slowly than Group 0, but responding was weakest in the group for which water was just as likely to be delivered whether or not a response had been made. The contingency between response and reinforcer thus influences the rate at which the response will be performed. We now need to ask why this should be the case. In fact, there are two answers to this question.

FIGURE 4.4 The mean rates of lever pressing for water by three groups of thirsty rats in their final session of training. The groups differed in the probability with which free water was delivered during the intervals between responses. Group 0 received no water during these intervals; Group 0.08 and Group 0.12 received water with a probability of 0.08 and 0.12, respectively, at the end of each period of 1 second in which a response did not occur (adapted from Hammond, 1980).

KEY TERMS
Response–reinforcer contingency The degree to which the occurrence of the reinforcer depends on the instrumental response. Positive contingency: the frequency of the reinforcer is increased by making the response. Negative contingency: the frequency of the reinforcer is reduced by making the response. Zero contingency: the frequency of the reinforcer is unaffected by making the response.
Molar theory of reinforcement The assumption that the rate of instrumental responding is determined by the response–reinforcer contingency.

One answer is based on a quite different view of instrumental conditioning to that considered thus far. According to this account, instrumental conditioning will be effective whenever a response results in an increase in the rate of reinforcement (e.g. Baum, 1973). Thus there is no need for a response to be followed closely by reward for successful conditioning; all that is necessary is for the overall probability of reward being delivered to increase. In other words, the contingency between a response and reward is regarded as the critical determinant of the outcome of instrumental conditioning. This position is referred to as a molar theory of reinforcement because animals are assumed to compute the rate at which they make a response over a substantial period of time and, at the same time, compute the rate at which reward is delivered over the same period. If they should detect that an increase in the rate of responding is correlated with an increase in the rate of reward delivery, then the response will be performed more vigorously in the future. Moreover, the closer the correlation between the two rates, the more rapidly will the response be performed.

Group 0 of Hammond's (1980) experiment demonstrated a high correlation between the rate at which the lever was pressed and the rate at which reward was delivered, and this molar analysis correctly predicts that rats will learn to respond rapidly on the lever. In the case of Group 0.12, however, the rate of lever pressing had some influence on the rate at which reward was delivered, but this influence was slight because the reward would be delivered even if a rat refused to press the lever. In these circumstances, responding is predicted to be slow and again the theory is supported by the findings.
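A sketch of Hammond's schedule shows how adding free water erodes the response–reinforcer contingency while leaving individual response–water pairings intact. The response probability per second and the session length are arbitrary assumptions made only for the simulation.

```python
import random

def contingency(p_free, p_respond=0.5, n_intervals=100_000):
    """Return P(water | response interval) - P(water | quiet interval)."""
    earned = free = resp_ints = quiet_ints = 0
    for _ in range(n_intervals):
        if random.random() < p_respond:          # a response occurred this second
            resp_ints += 1
            earned += random.random() < 0.12     # earned water, p = 0.12
        else:
            quiet_ints += 1
            free += random.random() < p_free     # free water for the degraded groups
    return earned / resp_ints - free / quiet_ints

random.seed(0)
for p in (0.0, 0.08, 0.12):
    print(f"Group {p}: contingency = {contingency(p):+.3f}")
# Group 0.0 retains a clearly positive contingency; for Group 0.12 the
# difference is essentially zero, the condition under which responding
# was weakest.
```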

The molar analysis of instrumental behavior has received a considerable amount of attention and generated a considerable body of experimental research, but there are good reasons for believing that it may be incorrect. In an experiment by Thomas (1981), rats in a test chamber containing a lever were given a free pellet of food once every 20 seconds, even if they did nothing. At the same time, if they pressed the lever during any 20-second interval then the pellet was delivered immediately, and the pellet at the end of the interval was cancelled. Subsequent responses during the remainder of the interval were without effect. This treatment ensured that rats received three pellets of food a minute whether or not they pressed the lever. Thus lever pressing in this experiment did not result in an increase in the rate of food delivery and, according to the molar point of view, the rate of making this response should not increase. The mean rate of responding during successive sessions is shown for one rat in Figure 4.5. Although the rat took some time to press the lever, it eventually pressed at a reasonably high rate. A similar pattern of results was observed with the other rats in the experiment, which clearly contradicts the prediction drawn from a molar analysis of instrumental behavior.

Thomas (1981) reports a second experiment, the design of which was much the same as for the first experiment, except that lever pressing not only resulted in the occasional, immediate delivery of food but also in an overall reduction of food by postponing the start of the next 20-second interval by 20 seconds. On this occasion, the effect of lever pressing was to reduce the rate at which food was delivered, yet each of six new rats demonstrated an increase in the rate of lever pressing as training progressed. The result is opposite to that predicted by a molar analysis of instrumental conditioning.

Although molar theories of instrumental behavior (e.g. Baum, 1973) are ideally suited to explaining results such as those reported by Hammond (1980), it is difficult to see how they can overcome the problem posed by the findings of Thomas (1981). It is therefore appropriate to seek an alternative explanation for the influence of the response–reinforcer contingency on instrumental conditioning. One alternative, which by now should be familiar, is that instrumental conditioning depends on the formation of associations. This position is referred to as a molecular theory of reinforcement because it assumes that the effectiveness of instrumental conditioning depends on specific episodes of the response being paired with a reinforcer. The results from the experiments by Thomas (1981) can be readily explained by a molecular analysis of instrumental conditioning, because contiguity between a response and a reinforcer is regarded as the important condition for successful conditioning. Each lever press that resulted in food would allow an association involving the response to gain in strength, which would then encourage more vigorous responding as training progressed. At first glance, Hammond's (1980) results appear to contradict a molecular analysis because the response was paired with reward in all three groups and they would therefore be expected to respond at a similar rate, which was not the case.
It is, however, possible to reconcile these results with a molecular analysis of instrumental conditioning by appealing to the effects of associative competition, as the following section shows.


FIGURE 4.5 The total number of lever presses recorded in each session for a rat in the experiment by Thomas (1981).

KEY TERM
Molecular theory of reinforcement The assumption that the rate of instrumental responding is determined by response–reinforcer contiguity.
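Thomas's first schedule can also be stated compactly in code. The sketch below expresses only the schedule's bookkeeping: the overall food rate is fixed at one pellet per 20-second interval whatever the animal does, while pressing changes which pellets are response-contiguous. The probability of pressing within an interval is an arbitrary assumption.

```python
import random

def thomas_session(n_intervals=90, p_press=0.3):
    """Return (total pellets, pellets delivered immediately after a press)."""
    pellets = contiguous = 0
    for _ in range(n_intervals):
        if random.random() < p_press:
            pellets += 1      # immediate pellet; the end-of-interval pellet is cancelled
            contiguous += 1
        else:
            pellets += 1      # free pellet at the end of the interval
    return pellets, contiguous

random.seed(2)
total, paired = thomas_session()
print(total, paired)   # total always equals n_intervals: pressing never raises the rate
```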


Associative competition
If two stimuli are presented together for Pavlovian conditioning, the strength of the conditioned response (CR) that each elicits when tested individually is often weaker than if they are presented for conditioning separately. This overshadowing effect is explained by assuming the two stimuli are in competition for associative strength, so that the more strength acquired by one the less is available for the other (Rescorla & Wagner, 1972). Overshadowing is normally assumed to take place between stimuli but, if it is accepted that overshadowing can also occur between stimuli and responses that signal the same reinforcer, then it is possible for molecular theories of instrumental behavior to explain the contingency effects reported by Hammond (1980). In Group 0 of his study, each delivery of water would strengthen a lever-press–water association and result eventually in rapid lever pressing. In the other groups, however, the delivery of free water would allow the context to enter into an association with this reinforcer. The delivery of water after a response will then mean that it is signaled by both the context and the response, and theories of associative learning predict that the context–water association will restrict, through overshadowing, the growth of the response–water association. As the strength of the response–water association determines the rate at which the response is performed, responding will be slower when some free reinforcers accompany the instrumental training than when all the reinforcers are earned. Furthermore, the more often that water is delivered free, the stronger will be the context–water association and the weaker will be the response–water association. Thus the pattern of results shown in Figure 4.4 can be explained by a molecular analysis of instrumental conditioning, providing it is assumed that responses and stimuli compete with each other for their associative strength. The results from two different experiments lend support to this assumption.

The first experiment directly supports the claim that overshadowing is possible between stimuli and responses. Pearce and Hall (1978) required rats to press a lever for food on a variable interval schedule, in which only a few responses were followed by reward. For an experimental group, each rewarded response was followed by a brief burst of white noise before the food was delivered. The noise, which accompanied only rewarded responses, resulted in a substantially lower rate of lever pressing by the experimental group than by control groups that received either similar exposure to the noise (but after nonrewarded responses) or no exposure to the noise at all (Figure 4.6). Geoffrey Hall and I argued that the most plausible explanation for these findings is that instrumental learning involves the formation of R–US associations and that these were weakened through overshadowing by a noise–food association that developed in the experimental group.

FIGURE 4.6 The mean rates of lever pressing by three groups of rats that received a burst of noise after each rewarded response (Corr), after some nonrewarded responses (Uncorr), or no noise at all (Food alone) (adapted from Pearce & Hall, 1978).

The second source of support for a molecular analysis of the effect of contingency on instrumental responding can be found in experiments in which a brief stimulus signals the delivery of each free reinforcer. The brief stimulus should itself enter into an association with the reinforcer and thus overshadow the development of an association between the context and the reinforcer. Whenever a response is followed by the reinforcer it will now be able to enter a normal R–US association, because of the lack of competition from the context. Responding in these conditions should thus be more vigorous than if the free US is not signaled. In support of this argument, both Hammond and Weinberg (1984) and Dickinson and Charnock (1985) have shown that free reinforcers disrupt instrumental responding to a greater extent when they are unsignaled than when they are signaled. These findings make a particularly convincing case for the belief that competition for associative strength is an important influence on the strength of an instrumental response. They also indicate that this competition is responsible for the influence of the response–reinforcer contingency on the rate of instrumental responding.
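The overshadowing account can be made concrete with a toy Rescorla–Wagner simulation in which the response and the context act as two "cues" competing to predict water. The learning rate, asymptote, trial count, and trial mixture are illustrative assumptions, not parameters from any of the experiments above.

```python
import random

def simulate(p_free, n_trials=200, alpha=0.1, lam=1.0):
    """Return (V_response, V_context) after a mix of earned and free rewards."""
    v_resp = v_ctx = 0.0
    for _ in range(n_trials):
        if random.random() < p_free:        # free water: context alone is present
            v_ctx += alpha * (lam - v_ctx)
        else:                               # earned water: response + context
            error = lam - (v_resp + v_ctx)  # prediction error shared by both cues
            v_resp += alpha * error
            v_ctx += alpha * error
    return v_resp, v_ctx

random.seed(0)
for p in (0.0, 0.3, 0.6):
    v_resp, v_ctx = simulate(p)
    print(f"proportion of free rewards {p}: V_resp = {v_resp:.2f}, V_ctx = {v_ctx:.2f}")
# The more often water is free, the stronger the context-water association
# and the weaker the response-water association, as in Figure 4.4.
```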

The nature of the reinforcer


Perhaps the most important requirement for successful instrumental conditioning is that the response is followed by a reinforcer. But what makes a reinforcer? In nearly all the experiments that have been described thus far, the reinforcer has been food for a hungry animal, or water for a thirsty animal. As these stimuli are of obvious biological importance, it is hardly surprising to discover that animals are prepared to engage in an activity such as lever pressing in order to earn them. However, this does not mean that a reinforcer is necessarily a stimulus that is of biological significance to the animal. As Schwartz (1989) notes, animals will press a lever to turn on a light, and it is difficult to imagine the biological need that is satisfied on these occasions.

Thorndike (1911) was the first to appreciate the need to identify the defining characteristics of a reinforcer, and his solution was contained within the Law of Effect. He maintained that a reinforcer was a stimulus that resulted in a satisfying state of affairs. A satisfying state of affairs was then defined as ". . . one which the animal does nothing to avoid, often doing things which maintain or renew it" (Thorndike, 1913, p. 2). In other words, Thorndike effectively proposed that a stimulus would serve as a reinforcer (increase the likelihood of a response) if animals were willing to respond in order to receive that stimulus. The circularity in this definition should be obvious and has served as a valid source of criticism of the Law of Effect on more than one occasion (e.g. Meehl, 1950).

Thorndike was not alone in providing a circular definition of a reinforcer. Skinner has been perhaps the most blatant in this respect, as the following quotation reveals (Skinner, 1953, pp. 72–73):

The only way to tell whether or not a given event is reinforcing to a given organism under given conditions is to make a direct test. We observe the frequency of a selected response, then make an event contingent upon it and observe any change in frequency. If there is a change, we classify the event as reinforcing.

To be fair, for practical purposes this definition is quite adequate. It provides a useful and unambiguous terminology. At the same time, once we have decided that a stimulus, such as food, is a positive reinforcer, we can turn to a study of a number of issues that are important to the analysis of instrumental learning. For instance, we have been able to study the role of the reinforcer in the associations that are formed during instrumental learning, without worrying unduly about what it is that makes a stimulus a reinforcer.
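Skinner's direct test is, in effect, a two-step procedure, and it can be written down as one. The sketch below is merely a rendering of that operational definition; the rate values are invented for illustration.

```python
def is_reinforcer(rate_before, rate_after):
    """Skinner's criterion: an event is classified as reinforcing if making
    it contingent on a response increases the frequency of that response."""
    return rate_after > rate_before

# Invented example rates (responses per minute) before and after the
# event is made contingent on lever pressing:
print(is_reinforcer(rate_before=5.0, rate_after=12.0))   # True: classify as reinforcing
print(is_reinforcer(rate_before=5.0, rate_after=5.0))    # False: no change in frequency
```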


But the definitions offered by Thorndike and Skinner are not very helpful if a general statement is being sought about the characteristics of a stimulus that dictate whether or not it will function as a reinforcer. And the absence of such a general statement makes our understanding of the conditions that promote instrumental learning incomplete.

A particularly elegant solution to the problem of deciding whether a stimulus will function as a reinforcer is provided by the work of Premack (1959, 1962, 1965), who put forward what is now called the Premack principle. He proposed that reinforcers were not stimuli but opportunities to engage in behavior. Thus the activity of eating, not the stimulus of food, should be regarded as the reinforcer when an animal has been trained to lever press for food. To determine if one activity will serve as the reinforcer for another activity, Premack proposed that the animal should be allowed to engage freely in both activities. For example, a rat might be placed into a chamber containing a lever and some food pellets. If it shows a greater willingness to eat the food than to press the lever, then we can conclude that the opportunity to eat will reinforce lever pressing, but the opportunity to lever press will not reinforce eating.

It is perhaps natural to think of the properties of a reinforcer as being absolute. That is, if eating is an effective reinforcer for one response, such as lever pressing, then it might be expected to serve as a reinforcer for any response. But Premack (1965) has argued that this assumption is unjustified. An activity will only be reinforcing if subjects would rather engage in it than in the activity that is to be reinforced. To demonstrate this relative property of a reinforcer, Premack (1971a) placed rats into a running wheel, similar to the one sketched in Figure 4.7, for 15 minutes a day. When the rats were thirsty, they preferred to drink rather than to run in the wheel, but when they were not thirsty, they preferred to run rather than to drink. For the test phase of the experiment, the wheel was locked and the rats had to lick the drinking tube to free it and so gain the opportunity to run for 5 seconds. Running is not normally regarded as a reinforcing activity but, because rats that are not thirsty prefer to run rather than drink, it follows from Premack's (1965) argument that they should increase the amount they drink in the wheel in order to earn the opportunity to run. Conversely, running would not be expected to reinforce drinking for thirsty rats, because in this state of deprivation they prefer drinking to running. In clear support of this analysis, Premack (1971a) found that running could serve as a reinforcer for drinking, but only with rats that were not thirsty.

As Allison (1989) has pointed out, Premack's proposals can be expressed succinctly by paraphrasing Thorndike's Law of Effect. For instrumental conditioning to be effective it is necessary for a response to be followed not by a satisfying state of affairs, but by a preferred response. Despite the improvement this change affords with respect to the problem of defining a reinforcer, experiments have shown that it does not account adequately for all the circumstances where one activity will serve as a reinforcer for another.

FIGURE 4.7 A sketch of the apparatus used by Premack (1971a) to determine if being given the opportunity to run could serve as a reinforcer for drinking in rats that were not thirsty (adapted from Premack, 1971a).

KEY TERM
Premack principle The proposal that activity A will reinforce activity B, if activity A is more probable than activity B.
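The key term above amounts to a simple decision rule, sketched below with invented baseline figures loosely modeled on the non-thirsty rats of Premack (1971a).

```python
def reinforces(a, b, baseline):
    """Premack principle: activity `a` will reinforce activity `b` only if
    `a` is the more probable activity at baseline."""
    return baseline[a] > baseline[b]

# Assumed proportions of free time spent on each activity by a rat that
# is not thirsty:
baseline = {"run": 0.6, "drink": 0.4}
print(reinforces("run", "drink", baseline))   # True: running can reinforce drinking
print(reinforces("drink", "run", baseline))   # False: drinking should not reinforce running
```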

Consider an experiment by Allison and Timberlake (1974) in which rats were first allowed to drink from two spouts that provided different concentrations of saccharin solution. This baseline test session revealed a preference for the sweeter solution. According to Premack's proposals, therefore, rats should be willing to increase their consumption of the weaker solution if drinking it is the only means by which they can gain access to the sweeter solution. By contrast, rats should not be willing to increase their consumption of the sweeter solution to gain access to the weaker one. To test this second prediction, rats were allowed to drink from the spout supplying the sweeter solution and, after every 10 licks, they were permitted one lick at the spout offering the less-sweet solution. This 10:1 ratio meant that, relative to the amount of sweet solution consumed, the rats received less of the weaker solution than they chose to consume in the baseline test session. As a consequence of this constraint imposed by the experiment, Allison and Timberlake (1974) found that rats increased their consumption of the stronger solution. It is important to emphasize that this increase occurred in order to allow the rats to gain access to the less preferred solution, which, according to Premack's theory, should not have taken place.

Timberlake and Allison (1974) explained their results in terms of an equilibrium theory of behavior. They argued that when an animal is able to engage in a variety of activities, it will have a natural tendency to allocate more time to some than others. The ideal amount of time that would be devoted to an activity is referred to as its bliss point, and each activity is assumed to have its own bliss point. By preventing an animal from engaging in even its least preferred activity, a schedule will displace it from the bliss point, and the animal will do its best to restore responding to this point. In the experiment by Allison and Timberlake (1974), therefore, forcing the subjects to drink much more of the strong than the weak solution meant that they were effectively deprived of the weak solution. As the only way to overcome this deficit was to drink more of the sweet solution, this is what they did. Of course, as the rats approached their bliss point for the consumption of the weak solution, they would go beyond their bliss point for the consumption of the sweet solution. To cope with this type of conflict, animals are believed to seek a compromise, or state of equilibrium, in which the amount of each activity they perform will lead them as close as possible to the bliss points for all activities. Thus the rats completed the experiment by drinking rather more than they would prefer of the strong solution, and rather less than they would prefer of the weak solution.

By referring to bliss points, we can thus predict when the opportunity to engage in one activity will serve as a reinforcer for another activity. But this does not mean that we have now identified completely the circumstances in which the delivery of a particular event will function as a reinforcer. Some reinforcers do not elicit responses that can be analyzed usefully by equilibrium theory. Rats will press a lever to receive stimulation to certain regions of the brain, or to turn on a light, or to turn off an electric shock to the feet. I find it difficult to envisage how any measure of baseline activity in the presence of these events would reveal that they will serve as reinforcers for lever pressing. In the next section we will find that a stimulus that has been paired with food can reinforce lever pressing in hungry rats. Again, simply by observing an animal's behavior in the presence of the stimulus, it is hard to imagine how one could predict that the stimulus will function as a reinforcer.

Our understanding of the nature of a reinforcer has advanced considerably since Thorndike proposed the Law of Effect. However, if we wish to determine with confidence if a certain event will act as a reinforcer for a particular response, at times there will be no better alternative than to adopt Skinner's suggestion of testing for this property directly.
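Equilibrium theory lends itself to a small numerical sketch: choose the allocation of licks that minimizes the distance from the bliss points, subject to the 10:1 schedule constraint. The bliss-point values below are invented; only the 10:1 ratio comes from the experiment.

```python
BLISS_SWEET = 400.0    # assumed preferred licks of the sweet solution
BLISS_WEAK = 100.0     # assumed preferred licks of the weak solution
RATIO = 10.0           # schedule: 10 sweet licks buy 1 weak lick

def distance_from_bliss(sweet_licks):
    """Squared distance from both bliss points under the schedule."""
    weak_licks = sweet_licks / RATIO
    return (sweet_licks - BLISS_SWEET) ** 2 + (weak_licks - BLISS_WEAK) ** 2

best_sweet = min(range(0, 2000), key=distance_from_bliss)
print(best_sweet, best_sweet / RATIO)   # about 406 sweet licks and 40.6 weak licks
# The compromise: more sweet solution than preferred (>400) in order to
# reduce, without eliminating, the deficit of the weak solution (<100).
```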


Conditioned reinforcement
The discussion has been concerned thus far with primary reinforcers, that is, with stimuli that do not need to be paired with another stimulus in order to function as reinforcers for instrumental conditioning. There are, in addition, numerous studies showing that even a neutral stimulus may serve as an instrumental reinforcer by virtue of being paired with a primary reinforcer. An experiment by Hyde (1976) provides a good example of a stimulus acting in this capacity as a conditioned reinforcer. In the first stage of the experiment, an experimental group of hungry rats had a number of sessions in which the occasional delivery of food was signaled by a brief tone. A control group was treated in much the same way, except that the tone and food were presented randomly with respect to each other. Both groups were then given the opportunity to press a lever to present the tone. The results from the eight sessions of this testing are displayed in Figure 4.8. Even though no food was presented in this test phase, the experimental group initially showed a considerable willingness to press the lever. The superior rate of pressing by the experimental group compared to the control group strongly suggests that pairing the tone with food resulted in it becoming a conditioned reinforcer.

In the experiment by Hyde (1976), the effect of the conditioned reinforcer was relatively short-lived, which should not be surprising, because a conditioned reinforcer will lose its properties by virtue of being presented in the absence of food. The effects of conditioned reinforcers can be considerably more robust if their relationship with the primary reinforcer is maintained, albeit intermittently. Experiments using token reinforcers provide a particularly forceful demonstration of how the influence of a conditioned reinforcer may be sustained in this way. Token reinforcers are typically small plastic discs that are earned by performing some response and, once earned, can be exchanged for food. In an experiment by Kelleher (1958), chimpanzees had to press a key 125 times to receive a single token, and when they had collected 50 tokens they were allowed to push them all into a slot to receive food. In this experiment, therefore, the effect of the token reinforcers was sufficiently strong to sustain a sequence of more than 6000 responses (50 tokens at 125 key presses each, or 6250 responses in all) for each delivery of food.

KEY TERMS
Conditioned reinforcer: An originally neutral stimulus that serves as a reinforcer through training, usually by being paired with a positive reinforcer.
Token reinforcer: A conditioned reinforcer in the form of a plastic chip that can be held by the subject.

FIGURE 4.8 The mean rates of lever pressing for a brief tone by two groups of rats. For the experimental group the tone had previously been paired with food, whereas for the control group the tone and food had been presented randomly with respect to each other (adapted from Hyde, 1976).

A straightforward explanation for the results of the experiment by Hyde (1976) is that the tone became an appetitive Pavlovian CS and thus effectively served as a substitute for food. The results from experiments such as that by Kelleher (1958) have led Schwartz (1989) to argue that there are additional ways in which conditioned reinforcers can be effective (see also Golub, 1977):

- They provide feedback that the correct response has been made. Delivering a token after the completion of 125 responses would provide a useful signal that the subject is engaged in the correct activity.
- Conditioned reinforcers might act as a cue for the next response to be performed. Kelleher (1958) observed that his chimpanzees often waited for several hours before making their first response in a session. This delay was virtually eliminated by giving the subject some tokens at the start of the session, indicating that the tokens acted as a cue for key pressing.
- Conditioned reinforcers may be effective because they help to counteract the disruptive effects of imposing a long delay between a response and the delivery of a primary reinforcer. Interestingly, as far as tokens are concerned, this property is seen only when the chimpanzee is allowed to hold the token during the delay.

Taken together, these proposals imply that the properties of a conditioned reinforcer are considerably more complex than would be expected if they were based solely on its Pavlovian properties.

THE PERFORMANCE OF INSTRUMENTAL BEHAVIOR
The experiments considered so far have been concerned with revealing the knowledge that is acquired during the course of instrumental conditioning. They have also indicated some of the factors that influence the acquisition of this knowledge. We turn our attention now to examining the factors that determine the vigor with which an animal will perform an instrumental response. We have already seen that certain devaluation treatments can influence instrumental responding, and so too can manipulations designed to modify the strength of the instrumental association. But there remain a number of other factors that influence instrumental behavior. In the discussion that follows we shall consider two of these influences in some detail: deprivation state and the presence of Pavlovian CSs.

Deprivation
The level of food deprivation has been shown, up to a point, to be directly related to the vigor with which an animal responds for food. This is true when the response is running down an alley (Cotton, 1953) or pressing a lever (Clark, 1958). To explain this relationship, Hull (1943) suggested that motivational effects are mediated by activity in a drive center. Drive is a central state that is excited by needs and energizes behavior. It was proposed that the greater the level of drive, the more vigorous will be the response that the animal is currently performing. Thus, if a rat is pressing a lever for food, then hunger will excite drive, which, in turn, will invigorate this activity.
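Hull's proposal amounts to a simple multiplicative rule. The sketch below, with hypothetical names and numbers of my own choosing, shows the rule and the consequence of treating drive as a nonspecific pool of needs; the prediction in the final line is the one criticized in the next paragraph.

```python
# A minimal sketch of Hull's (1943) multiplicative account: response
# vigor (reaction potential) = habit strength x drive, where drive is
# nonspecific and pools every current need. All values are hypothetical.

def reaction_potential(habit_strength, needs):
    drive = sum(needs.values())   # nonspecific: any need raises drive
    return habit_strength * drive

lever_habit = 0.5
print(reaction_potential(lever_habit, {"hunger": 4.0}))               # 2.0
# Because drive is nonspecific, adding shock-induced pain is predicted
# to make lever pressing for food MORE vigorous:
print(reaction_potential(lever_habit, {"hunger": 4.0, "pain": 3.0}))  # 3.5
```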


A serious shortcoming of Hull's (1943) account is the claim that drive is nonspecific, so that it can be enhanced by an increase in any need of the animal. A number of curious predictions follow from this basic aspect of his theorizing. For example, the pain produced by electric shock is assumed to increase drive, so that if animals are given shocks while lever pressing for food, they should respond more rapidly than in the absence of shock. By far the most frequent finding is that this manipulation has the opposite effect of decreasing appetitive instrumental responding (e.g. Boe & Church, 1967). Conversely, the theory predicts that enhancing drive by making animals hungrier should facilitate the rate at which they press a lever to escape or avoid shock. Again, this prediction is generally not confirmed: increases in deprivation have been found, in this respect, to be either without effect (Misanin & Campbell, 1969) or to reduce the rate of such behavior (Meyer, Adams, & Worthen, 1969; Leander, 1973).

In response to this problem, more recent theorists have proposed that animals possess two drive centers: one concerned with energizing behavior that leads to reward, the other responsible for invigorating activity that minimizes contact with aversive stimuli. These can be referred to, respectively, as the positive and negative motivational systems. A number of such dual-system theories of motivation have been proposed (Konorski, 1967; Rescorla & Solomon, 1967; Estes, 1969). The assumption that there are two motivational systems, rather than a single drive center, allows these theories to overcome many of the problems encountered by Hull's (1943) theory. For example, deprivation states like hunger and thirst are believed to increase activity only in the positive system, so that a change in deprivation should not influence the vigor of behavior that minimizes contact with aversive stimuli such as shock. Conversely, electric shock should not invigorate responding for food, as it will excite only the negative system.

But even this characterization of the way in which deprivation states influence behavior may be too simple. Suppose that an animal that has been trained to lever press for food when hungry is satiated, by being granted unrestricted access to food, before it is returned to the conditioning chamber. The account just developed predicts that satiating the animal will reduce the motivational support for lever pressing by lowering the activity in the positive system; the animal would thus be expected to respond less vigorously than one that was still hungry. There is some evidence to support this prediction (e.g. Balleine, Garner, Gonzalez, & Dickinson, 1995), but additional findings by Balleine (1992) demonstrate that dual-system theories of motivation are in need of elaboration if they are to provide a complete account of the way in which deprivation states influence responding.

In one experiment by Balleine (1992), two groups of rats were trained to press a bar for food while they were hungry (H). For reasons that will be made evident shortly, it is important to note that the food pellets used as the instrumental reinforcer were different to the food presented at all other times in the experiment. Group HS was then satiated (S) by being allowed unrestricted access to their normal food for 24 hours, whereas Group HH remained on the deprivation schedule.
Finally, both groups were again given the opportunity to press the bar, but responding never resulted in the delivery of the reinforcer. Because of the different deprivation states of the two groups, dual-system theories of motivation, as well as our intuitions, predict that Group HH should respond more vigorously than Group HS in this test session. But it seems that our intuitions are wrong on this occasion. The mean number of responses made by each group in the test session is shown in the two gray

KEY TERM
Dual-system theories of motivation: Theories that assume that behavior is motivated by activity in a positive system, which energizes approach to an object, and a negative system, which energizes withdrawal from an object.

histograms on the left-hand side of Figure 4.9, which reveal that both groups responded quite vigorously, and at a similar rate. The equivalent histograms on the right-hand side of Figure 4.9 show the results of two further groups from this study, which were trained to lever press for food while they were satiated by being fed unrestricted food in their home cages. Rats will learn to respond for food in these conditions, provided that the pellets are of a different flavor to the unrestricted food presented in the home cages. Group SS was then tested while satiated, whereas Group SH was tested while hungry. Once again, and contrary to our intuitions, both groups performed similarly in the test session despite their different levels of deprivation. When the results of the four groups are compared, it is evident that the groups trained hungry responded somewhat more on the test trials than those trained while satiated. But, to labor the point, there is no indication that changing the deprivation level for the test session had any influence on responding.

Balleine's (1992) explanation for these findings is that the incentive value, or attractiveness, of the reinforcer is an important determinant of how willing animals will be to respond for it. If an animal consumes a reinforcer while it is hungry, then that reinforcer may well be more attractive than if it is consumed while the animal is satiated. Thus Group HH and Group HS may have responded rapidly in the test session because they anticipated a food that had proved attractive in the past, having eaten it only while they were hungry. By way of contrast, the slower responding by Groups SS and SH can be attributed to their anticipating a food that had not been particularly attractive in the past, because they had eaten it only while they were not hungry.

This explanation was tested with two additional groups. Prior to the experiment, animals in Group Pre(S) HS were given reward pellets while they were satiated to

FIGURE 4.9 The mean number of responses made by six groups of rats in an extinction test session. The left-hand letter of each pair indicates the level of deprivation when subjects were trained to lever press for reward, either satiated (S) or hungry (H); the right-hand letter indicates the deprivation level during the test trials. Two of the groups were allowed to consume the reward either satiated, Pre(S), or hungry, Pre(H), prior to instrumental conditioning (adapted from Balleine, 1992).


demonstrate that the pellets are not particularly attractive in this deprivation state. The group was then trained to lever press while hungry, and received test trials while satiated. On the test trials the subjects should know that, because of their low level of deprivation, the reward pellets are no longer attractive, and they should therefore be reluctant to press the lever. The results, which are shown in the blue histogram on the left-hand side of Figure 4.9, confirmed this prediction. The final group to be considered, Group Pre(H) SH, was first allowed to eat reward pellets in the home cage while hungry; instrumental conditioning was then conducted while the group was satiated, and the test trials were conducted while the group was hungry. In contrast to Group SH and Group SS, this group should appreciate that the reward pellets are attractive when hungry, and should respond more rapidly than the other two groups during the test trials. Once again, the results confirmed this prediction (see the blue histogram on the right-hand side of Figure 4.9).

By now it should be evident that no simple conclusion can be drawn concerning the way in which deprivation states influence the vigor of instrumental responding. On some occasions a change in deprivation state is able to modify the rate of responding directly, as dual-system theories of motivation predict. On other occasions the influence is more indirect, operating through a change in the attractiveness of the reinforcer. An informative account of the way in which these findings may be integrated can be found in Balleine et al. (1995).

Pavlovian-instrumental interactions
For a long time, theorists have been interested in the way in which Pavlovian CSs influence the strength of instrumental responses that are performed in their presence. One reason for this interest is that Pavlovian and instrumental conditioning are regarded as two fundamental learning processes, and it is important to appreciate the way in which they work together to determine how an animal behaves. A second reason was mentioned at the end of Chapter 2, where we saw that Pavlovian CSs tend to elicit reflexive responses that may not always be in the best interests of the animal. If a Pavlovian CS was also able to modulate the vigor of instrumental responding, then this would allow it to have a more general, and more flexible, influence on behavior than has so far been implied. For example, if a CS for food were to invigorate instrumental responses that normally lead to food, then such responses would be strongest at a time when they are most needed, that is, in a context where food is likely to occur. The experiments described in this section show that Pavlovian stimuli can modulate the strength of instrumental responding. They also show that there are at least two ways in which this influence takes place.

Motivational influences
Konorski (1967), it should be recalled from Chapter 2, believed that a CS can excite an affective representation of the US, which is responsible for arousing a preparatory CR. He further believed that a component of this CR consists of a change in the level of activity in a motivational system: a CS for food, say, was said to increase activity in the positive motivational system, whereas a CS for shock should excite the negative system. If these proposals are correct, then it should be possible to alter the strength of instrumental responding by presenting the appropriate Pavlovian CS (see also Rescorla & Solomon, 1967).

An experiment by Lovibond (1983), using a Pavlovian-instrumental transfer design, provides good support for this prediction. Hungry rabbits were first trained to operate a lever with their snouts to receive a squirt of sucrose into the mouth. The levers were then withdrawn for a number of sessions of Pavlovian conditioning, in which a clicker lasting 10 seconds signaled the delivery of sucrose. In a final test stage, subjects were again able to press the lever and, as they were doing so, the clicker was occasionally operated. The effect of this appetitive CS was to increase the rate of lever pressing, both during its presence and for a short while after it was turned off. A similar effect has been reported in a study using an aversive US: Rescorla and LoLordo (1965) found that the presentation of a CS previously paired with shock enhanced the rate at which dogs responded to avoid shock.

In addition to explaining the findings that have just been described, a further advantage of dual-system theories of motivation is that they can account for many of the effects of exposing animals simultaneously to both appetitive and aversive stimuli. For example, an animal may be exposed to one stimulus that signals reward and another that indicates danger. In these circumstances, instead of the two systems working independently, they are assumed to be connected by mutually inhibitory links, so that activity in one will inhibit the other (Dickinson & Pearce, 1977). To understand this relationship, consider the effect of presenting a signal for shock to a rat while it is lever pressing for food. Prior to the signal, the level of activity in the positive system will be solely responsible for the rate of pressing. When the aversive CS is presented, it will arouse the negative system. The inhibitory link will then allow the negative system to suppress activity in the positive system and weaken instrumental responding. As soon as the aversive CS is turned off, the inhibition will be removed and the original response rate restored. By assuming the existence of inhibitory links, dual-system theories thus provide a very simple explanation for conditioned suppression: it occurs because the aversive CS reduces the positive motivational support for the instrumental response.
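The directional predictions of this account can be summarized in a small sketch. The function and all of its values below are hypothetical illustrations of the dual-system idea, not a fitted model from any of the studies cited.

```python
# A minimal sketch of a dual-system account with a mutually inhibitory
# link (in the spirit of Konorski, 1967, and Dickinson & Pearce, 1977).

def lever_press_rate(appetitive_input, aversive_input, inhibition=0.8):
    # An aversive CS excites the negative system which, via the
    # inhibitory link, suppresses activity in the positive system.
    positive_activity = max(0.0, appetitive_input - inhibition * aversive_input)
    return positive_activity      # responding for food tracks this system

print(lever_press_rate(1.0, 0.0))  # baseline responding:            1.0
print(lever_press_rate(1.5, 0.0))  # appetitive CS adds excitation:  1.5
print(lever_press_rate(1.0, 1.0))  # aversive CS:                    0.2
```

The last line is conditioned suppression: the aversive CS leaves the appetitive input untouched but strips away its motivational support.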

Response-cueing properties of Pavlovian CRs


In addition to modulating activity in motivational systems, Pavlovian stimuli can influence instrumental responding through a response-cueing process (Trapold & Overmier, 1972). To demonstrate this point we shall consider an experiment by Colwill and Rescorla (1988), which is very similar in design to an earlier study by Kruse, Overmier, Konz, and Rokke (1983). In the first stage of the experiment, hungry rats received Pavlovian conditioning in which US1 was occasionally delivered during a 30-second CS. Training was then given, in separate sessions, in which R1 produced US1 and R2 produced US2. The two responses were chain pulling and lever pressing, and the two reinforcers were food pellets and sucrose solution. For the test stage, animals had the opportunity for the first time to perform R1 and R2 in the presence of the CS, but neither response led to a reinforcer. As Figure 4.10 shows, R1 was performed more vigorously than R2.

The first point to note is that these findings cannot be explained by appealing to the motivational properties of the CS. The CS should, of course, enhance the level of activity in the positive system, but this increase in activity should invigorate R1 to exactly the same extent as R2, because the motivational support for both responses is provided by the same, positive, system.

In developing an alternative explanation for the findings of Colwill and Rescorla (1988), note that instrumental conditioning with the two responses was conducted in

KEY TERM
Pavlovian-instrumental transfer: Training in which a CS is paired with a US and then the CS is presented while the subject is performing an instrumental response.


separate sessions. Thus R1 was acquired against a background of presentations of US1 and, likewise, R2 was acquired against a background of US2 presentations. If we now accept that the training resulted in the development of S-R associations, it is conceivable that certain properties of the two rewards contributed to the S component of these associations. For example, a memory of US1 might form part of the set of stimuli responsible for eliciting R1. When the CS was presented for testing, it would activate a memory of US1, which in turn would elicit R1 rather than R2. In other words, the Pavlovian CS was able to invigorate the instrumental response by providing cues that had previously become associated with that response.
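The chain of associations proposed here can be laid out in a few lines. The sketch below is only a schematic restatement of the argument, with hypothetical labels; it is not intended to capture the gradual, graded nature of the real associations.

```python
# A minimal sketch of the response-cueing account of Colwill and
# Rescorla's (1988) result. The labels are hypothetical placeholders.

# Instrumental training in separate sessions leaves each response partly
# controlled by a memory of the reinforcer that accompanied it (S-R):
s_r_associations = {"memory of US1": "R1", "memory of US2": "R2"}

# Pavlovian training gives the CS the power to retrieve a memory of US1:
pavlovian_associations = {"CS": "memory of US1"}

def response_to(stimulus):
    retrieved_memory = pavlovian_associations.get(stimulus)
    return s_r_associations.get(retrieved_memory)

print(response_to("CS"))  # -> "R1": the CS cues R1 rather than R2
```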

FIGURE 4.10 The mean rates of performing two responses, R1 and R2, in the presence of an established Pavlovian conditioned stimulus (CS). Prior to testing, instrumental conditioning had been given in which the reinforcer for R1 was the same as the Pavlovian unconditioned stimulus (US), whereas the reinforcer for R2 was different to the Pavlovian US. Testing was conducted in the absence of any reinforcers in a single session (adapted from Colwill & Rescorla, 1988).

Concluding comments
The research reviewed so far in this chapter shows that a considerable amount has been discovered about the associations that are formed during instrumental conditioning, and about the factors that influence the strength of instrumental responding. In Chapter 2 a simple memory model was developed to show how the associations formed during Pavlovian conditioning influence responding. It would be helpful if a similar model could be developed for instrumental conditioning, but this may not be an easy task. We would need to take account of the three different associations that have been shown to be involved in instrumental behavior: S-R, R-US, and S-(R-US). We would also need to take account of the motivational and response-cueing properties of any Pavlovian CS-US associations that may develop. Finally, the model would need to explain how changes in deprivation can influence responding. It hardly needs to be said that any model able to take account of all these factors satisfactorily will be complex, and would not fit comfortably into an introductory text. The interested reader is, however, referred to Dickinson (1994), who shows how much of our knowledge about instrumental behavior can be explained by what he calls an associative-cybernetic model. In essence, this model is a more complex version of the dual-system theories of motivation that we have considered. The reader might also wish to consult Balleine (2001) for a more recent account of the influence of motivational processes on instrumental behavior.

Our discussion of the basic processes of instrumental conditioning is now complete, but there is one final topic to consider in this chapter: whether the principles we have considered can provide a satisfactory account of the problem-solving abilities of animals.


THE LAW OF EFFECT AND PROBLEM SOLVING
Animals can be said to have solved a problem whenever they overcome an obstacle to attain a goal. The problem may be artificial, such as having to press a lever for

reward, or it might be one that occurs naturally, such as having to locate a new source of food. Early studies of problem solving in animals were conducted by collecting anecdotes, but this unsatisfactory method was soon replaced by experimental tests in the laboratory (see Chapter 1). As a result of his experiments, Thorndike (1911) argued that, despite the range of potential problems that can confront an animal, they are all solved in the same manner: animals behave randomly until, by trial and error, the correct response is made and reward is forthcoming. To capture this idea, Thorndike (1911) proposed the Law of Effect, which stipulates that one effect of reward is to strengthen the accidentally occurring response and to make its occurrence more likely in the future.

This account may explain adequately the way cats learn to escape from puzzle boxes, but is it suitable for all aspects of problem solving? A number of researchers have argued that animals are more sophisticated at solving problems than the Law of Effect implies. It has been suggested that they are able to solve problems through insight. It has also been suggested that animals can solve problems because they have an understanding of the causal properties of the objects in their environment or, as it is sometimes described, an understanding of folk physics. We shall consider each of these possibilities.
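Before turning to these alternatives, it may help to see how little machinery the trial-and-error account requires. The following sketch is a toy illustration of the Law of Effect, with hypothetical response names and parameters of my own choosing; it is not Thorndike's own formulation.

```python
# A minimal trial-and-error learner in the spirit of the Law of Effect:
# reward simply strengthens whichever response happened to precede it.

import random

strengths = {"press lever": 1.0, "groom": 1.0, "rear": 1.0}

def emit(strengths):
    # Responses are emitted with probability proportional to strength.
    r = random.uniform(0, sum(strengths.values()))
    for response, s in strengths.items():
        r -= s
        if r <= 0:
            return response
    return response

for trial in range(200):
    response = emit(strengths)
    if response == "press lever":    # only this response brings reward
        strengths[response] += 0.5   # reward "stamps in" the response

print(strengths)  # "press lever" now dominates, without any insight
```

Nothing in the loop represents the problem or its solution; the correct response comes to dominate simply because it alone is followed by reward.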

KEY TERM
Insight: An abrupt change in behavior that leads to a problem being solved. The change in behavior is sometimes attributed to a period of thought followed by a flash of inspiration.

Insight
An early objector to Thorndike's (1911) account of problem solving was Köhler (1925). Thorndike's experiments were so restrictive, he argued, that they prevented animals from revealing their capacity to solve problems by any means other than the most simple. Köhler spent the First World War on the Canary Islands, where he conducted a number of studies intended to reveal sophisticated intellectual processes in animals. He is best known for experiments that, he claimed, demonstrate the importance of insight in problem solving. Many of his findings are described in his book The Mentality of Apes, which documents some remarkable feats of problem solving by chimpanzees and other animals. Two examples should be sufficient to give an indication of his methodology. These examples involve Sultan (Figure 4.11), whom Köhler (1925) regarded as the brightest of his chimpanzees.

On one occasion, Sultan was in a cage in which there was also a small stick. Outside the cage was a longer stick, which was beyond Sultan's reach, and even further away was a reward of fruit (p. 151):

Sultan tries to reach the fruit with the smaller of the sticks. Not succeeding, he tries a piece of wire that projects from the netting in his cage, but that, too, is in vain. Then he gazes about him (there are always in the course of these tests some long pauses, during which the animal scrutinizes the whole visible area). He suddenly picks up the little stick once more,



goes to the bars directly opposite to the long stick, scratches it towards him with the auxiliary, seizes it and goes with it to the point opposite the objective, which he secures. From the moment that his eyes fell upon the long stick, his procedure forms one consecutive whole.

In the other study, Köhler (1925) hung a piece of fruit from the ceiling of a cage housing six apes, including Sultan. There was a wooden box in the cage (p. 41):

All six apes vainly endeavored to reach the fruit by leaping up from the ground. Sultan soon relinquished this attempt, paced restlessly up and down, suddenly stood still in front of the box, seized it, tipped it hastily straight towards the objective, but began to climb upon it at a (horizontal) distance of 1/2 meter and, springing upwards with all his force, tore down the banana.

FIGURE 4.11 Sultan stacking boxes in an attempt to reach a banana (drawing based on Köhler, 1956).

In both examples there is a period when the animal responds incorrectly; this is then followed by activity that, as it is reported, suggests that the solution to the problem suddenly occurred to the subject. There is certainly no hint in these reports that the problem was solved by trial and error. Does this mean, then, that Köhler (1925) was correct in his criticism of Thorndike's (1911) theorizing?

A problem with interpreting Köhler's (1925) findings is that all of the apes had played with boxes and sticks prior to the studies just described. The absence of trial-and-error responding may thus have been due to the previous experience of the animals. Sultan may, by accident, have learned about the consequences of jumping from boxes in earlier sessions, and he was perhaps doing no more than acting on the basis of his previous trial-and-error learning. This criticism of Köhler's (1925) work is by no means original: Birch (1945) and Schiller (1952) both suggested that, without prior experience with sticks and so forth, there is very little reason for believing that apes can solve Köhler's problems in the manner just described.

The absence of trial-and-error responses in Köhler's (1925) findings might have been due to the fact that most apes would have had prior experience of playing with sticks.

An amusing experiment by Epstein, Kirshnit, Lanza, and Rubin (1984) also shows the importance of past experience in problem solving and, at the same time, raises some important issues concerning the intellectual abilities of animals. Pigeons were given two different types of training. They were rewarded with food for pushing a box towards a spot randomly located at the base of a wall of the test chamber; pushing in the absence of the spot was never rewarded. They were also trained to stand on the box, when it was fixed to the floor, and peck for food at a plastic banana suspended from the ceiling; attempts to peck the banana when not standing on the box were never rewarded. Finally, in a test session they were confronted with a novel situation in which the banana was suspended from the ceiling and the box was placed some distance from beneath it. Epstein et al. (1984) report that (p. 61):

At first each pigeon appeared to be confused; it stretched and turned beneath the banana, looked back and forth from banana to box, and so on. Then each subject began rather suddenly to push the box in what was clearly the direction of the banana. Each subject sighted the banana as it pushed and readjusted the box as necessary to move it towards the banana. Each subject stopped pushing it in the appropriate place, climbed and pecked the banana.

This quite remarkable performance was achieved by one bird in 49 seconds, which compares very favorably with the 5 minutes it took Sultan to solve his similar problem. There can be no doubt from this study that the prior training of the pigeons played an important role in helping them solve the problem. Even so, the study clearly reveals that the pigeons performed in the test session in a manner that extends beyond trial-and-error responding. The act of pecking the banana might have been acquired by trial-and-error learning, and so, too, might the act of moving the box around. But the way in which the box was moved to below the banana does not seem to be compatible with this analysis.

The description by Epstein et al. (1984) of the pigeons' behavior bears a striking similarity to Köhler's (1925) account of Sultan's reaction to the similar problem. It might be thought, therefore, that it would be appropriate to account for the pigeons' success in terms of insight. In truth, this would not be a particularly useful approach, as it does not really offer an account of the way in which the problem was solved. Other than indicating that the problem was solved suddenly, and not by trial and error, the term insight adds little to our understanding of these results.

I regret that I find it impossible to offer, with confidence, any explanation for the findings of Epstein et al. (1984). But one possibility is that, during their training with the blue spot, the pigeons learned that certain responses moved the box towards the spot, and that the box by the spot was a signal for food. The combination of these associations would then result in them pushing the box towards the spot. During their training with the banana, one of the things the pigeons may have learned is that the banana is associated with food. Then, in the test session, although they would be unable to push the box towards the blue spot, generalization from their previous training might result in them pushing the box in the direction of another signal for food: the banana.

Causal inference and folk physics


The term insight is now rarely used in discussions of problem solving by animals. As an alternative, it has been proposed that animals have some understanding of


causal relationships and that they can draw inferences based on this understanding to solve problems. When a problem is encountered, therefore, animals are believed to solve it through reasoning based on their understanding of the physical and causal properties of the objects at their disposal. Take the example of Sultan joining two sticks together to reach food: if he understood that this action would create a longer stick that would allow him to reach further, then he would be able to solve the problem in a manner considerably more sophisticated than relying on trial and error. Of course, we have just seen that the studies by Köhler (1925) do not provide evidence that animals can solve problems in this way, but the results from other experiments have been taken as evidence that animals are capable of making causal inferences. The following discussion will consider this work separately for primates and birds.

Primates
Premack (1976, pp. 249-261) describes an experiment with chimpanzees in which a single subject was shown an array of objects similar to the one in Figure 4.12. To gain reward, the chimpanzee was required to replace the strange shape in the upper row with the knife from the lower row. The choice of the knife was intended to reveal that the ape understood that this object causes an apple to be cut in half. Two of the four subjects that were tested performed consistently well on this task. They received a novel problem on each trial, so their success could not have depended on solving the problem by associating a given choice with a particular array of objects.

An alternative explanation for the problem shown in Figure 4.12 is that the apes had repeatedly seen an apple being cut with a knife, and may simply have learned to select the object from the lower row that was most strongly associated with the one in the upper row. Although this explanation will work for many of the test trials, Premack (1976) argues that on certain occasions it provides an implausible explanation for the successful choice. For instance, one trial was similar to that shown in Figure 4.12, except that the apple was replaced by a whole ball and a ball cut into pieces. Even though the subjects had rarely seen a knife and a ball together, they still made the correct choice (see also Premack & Premack, 1994, pp. 354-357).

FIGURE 4.12 Sketch of an array of objects used by Premack (1976) to test for causal inference in chimpanzees (adapted from Premack, 1976).

The findings from the experiment are encouraging, but there is some doubt about how they should be interpreted. The two apes that performed well on the task had received extensive training on other tasks, and one of them, Sarah, had even been taught an artificial language (see Chapter 13). Perhaps this extensive training resulted in the chimpanzees acquiring a rich array of associations that helped them perform correctly on the tests, without any need to understand the relevant causal relationships. It is also possible that, because of the similarity between the shapes of an apple and a ball, stimulus generalization rather than causal inference was responsible for the selection of the knife during the test with the ball. To evaluate the results properly it would be necessary to have a complete account of the training that was given before the experiment started, together with a full description of the method and results of the experiment itself. Unfortunately, the information that is available is rather brief, and the reader is left in some doubt as to how Premack's findings should be interpreted. Another possibility is that, because of their extensive training, the two chimpanzees were able to appreciate causal relationships and to draw inferences from them in a way that is not open to relatively naïve chimpanzees. Once again, there is insufficient information available for this possibility to be assessed. There is no denying that the experiment by Premack has revealed some intriguing findings but, before its full significance can be appreciated, additional experiments are needed to pursue the issues that have just been raised.

Rather than study causal inference, some researchers have investigated what they call folk physics, which refers to a common-sense appreciation of the causal properties of objects in the environment. Problems could then be solved by drawing inferences from an understanding of these properties. Povinelli (2000) has conducted a thorough series of experiments to explore whether chimpanzees make use of folk physics in problem solving, and they point to rather different conclusions from those drawn by Premack (1976).

In one of Povinelli's experiments, chimpanzees were confronted with a clear tube that contained a peanut. To retrieve the food, they were required to push it out of the tube with a stick. Once they had mastered this skill they were given the same task, but this time there was a trap in the tube: pushing the stick in one direction caused the peanut to fall out of the tube, whereas pushing it in the other direction caused the peanut to fall into the trap, where it was inaccessible. A sketch of this simple apparatus can be seen in Figure 4.13. Three of the four chimpanzees given this task never mastered it, and the fourth came to terms with it only after many practice trials. The conclusion to be drawn from this study is that the chimpanzees had no appreciation of the problem created by the presence of the trap; that is, they lacked the understanding of the properties of traps that folk physics would provide. Instead, the eventual success of the single chimpanzee can be explained by assuming that she learned, through trial and error, to avoid pushing the food into the trap by inserting the stick into the end of the tube that was furthest from the peanut.

FIGURE 4.13 Right: Diagram of the apparatus used by Povinelli (2000) and by Visalberghi and Limongelli (1994) to test whether an animal will push a peanut in the direction that ensures it does not fall into a trap. From Visalberghi and Limongelli, 1994. Copyright 1994 American Psychological Association. Reproduced with permission. Left: A monkey about to attempt to retrieve a nut from the apparatus. Photograph by Elisabetta Visalberghi.
A similar failure on the same problem has been reported with capuchin monkeys (Visalberghi & Limongelli, 1994). Povinelli (2000) cites a total of 27 experiments, using a variety of tests, all of which suggest that chimpanzees lack an understanding of the physical properties of the problems that confront them. The interested reader might also refer to an article by Nissani (2006), which reports a failure by elephants to display causal reasoning in a tool-use task. These negative results make it all the more important to determine whether Premack (1976) was correct in claiming that chimpanzees are



capable of causal inference. For the present, it is perhaps wisest to keep an open mind about the capacity of primates to use folk physics when solving problems. But what about other species? Clayton and Dickinson (2006) have suggested that the most compelling evidence that animals have an appreciation of folk physics comes from certain species of birds.

Birds
In one study, Seed, Tebbich, Emery, and Clayton (2006) presented a group of naïve rooks with a version of the trap problem used by Povinelli (2000; described above). When first confronted with this problem, the direction in which the birds pushed the food was determined by chance but, as training progressed, they showed a marked improvement in avoiding the trap. To test whether this improvement reflected anything more than learning through trial and error, the birds were given a new problem on which performance was not expected to be influenced by any prior trial-and-error learning. Six out of seven birds performed poorly on the new problem, but one bird performed extremely well from the outset. As the authors point out, it is hard to know what conclusions to draw when one bird passes a test that six others have failed, but the performance of the one successful bird encourages the view that future research with rooks might yield promising results.

Another species of bird that might possess an understanding of folk physics is the raven. Heinrich (2000; see also Heinrich & Bugnyar, 2005) describes a series of experiments with hand-reared ravens in which the birds were presented with a piece of meat hanging by a string from a perch (see the left-hand side of Figure 4.14). The meat could not be obtained by flying towards it and clasping it in the beak. Instead, to reach the meat, some birds settled on the perch where the string was attached, grasped the string below the perch with their beak, and pulled it upwards. To stop the meat falling back, they placed a foot on the string and then let it drop from their beak, whereupon they bent down to grasp the string below the perch again. This operation was repeated until the meat was near enough to be grasped directly with the beak. In another test, ravens were confronted with the arrangement shown in the right-hand side of Figure 4.14. On this occasion, meat could be retrieved by standing on the

FIGURE 4.14 Diagram of the apparatus used by Heinrich and Bugnyar (2005). A raven stood on the perch and was expected to retrieve food by pulling the string upwards (left-hand side) or downwards (right-hand side).

perch and pulling the string downwards. Birds that had mastered the original task were also adept at mastering this new task, but birds without prior experience of pulling string never mastered it. Heinrich and Bugnyar (2005) believe these results show that ravens have some kind of understanding of means-end relationships, "i.e. an apprehension of a cause-effect relation between string, food, and certain body parts" (p. 973). In other words, they have an appreciation of folk physics. This conclusion is based on the finding that birds spontaneously solved the first problem but not the second one. It was assumed that an understanding of cause-effect relations would allow the birds to appreciate that pulling the string in one direction would result in the food moving in the same direction (Figure 4.15). Such knowledge would be beneficial when the birds had to pull the string upwards to make the meat rise, but it would be a hindrance in the second problem, in which the birds were required to pull the string downwards to make the meat rise.

FIGURE 4.15 A raven solving the problem set by Heinrich and Bugnyar (2005). Photographs by Bernd Heinrich and Thomas Bugnyar. Reprinted with permission.

Although the performance of the ravens is impressive, it does not necessarily demonstrate that the birds relied on folk physics to solve the problem that initially confronted them. As Heinrich and Bugnyar (2005) acknowledge, the solution to the first problem might have been a product of trial-and-error learning, in which the sight of food being drawn ever closer served as the reward for the sequence of stepping and pulling that the birds engaged in. According to this analysis, the initial contact with the string would have to occur by chance, which seems plausible because the bird's beak may have been close to the string as it peered down from the perch at the food. It is also worth noting that the birds had experience of eating road-kill carcasses, which may have allowed them to refine their skills of pulling and stepping to retrieve edible constituents. In the case of the second problem, naïve birds would be unlikely to make contact with the string as they looked down on the food, and they would therefore be unlikely to initiate a response that would be rewarded by the sight of food being drawn upwards. In support of this claim, it is noteworthy that the authors observed that naïve birds made rather few contacts with the string in the second problem. The success on the second problem of birds with experience of the first can also be readily explained, by assuming that the original experience increased the likelihood that they would pull on string attached to the perch in the new problem. A similar experiment has been conducted with elephants by Nissani (2004), who concluded that even their successful performance was a consequence of nothing more than learning through trial and error.

Before describing one final laboratory study, it is worth considering an example of tool use by birds in their natural environment. Woodpecker finches live on the Galapagos Islands, where many of them use twigs or cactus spines held in their beaks to extract insects from holes in trees. They will even modify these tools, shortening them if they are too long and removing twiglets if they prevent the twig from being inserted into a hole. Although it might be thought that this behavior reflects an understanding of how sticks can be used as tools to extend the reach of the beak, and of how such tools can be modified to make them more effective, a careful study by Tebbich, Taborsky, Fessl, and Blomqvist (2001) provides a more mundane explanation. It seems that juvenile woodpecker finches have a natural tendency to pick up twigs and cactus spines and to insert them into holes in trees. If this activity results in food, then the particular action that has been performed will be repeated in other holes. Not all adult woodpecker finches display this skill, which has led Tebbich et al. (2001) to argue that tool use can be acquired only when the bird is young, and only if it is exposed to the appropriate environment. In other words, the skill of inserting twigs into holes is no more than a consequence of the interaction between trial-and-error learning and the maturation of a species-typical behavior.

Perhaps the most dramatic example of tool use in birds has been shown in New Caledonian crows (Weir, Chappell, & Kacelnik, 2002). These birds live on New Caledonia, an island about 1600 km east of the north-east coast of Australia.

They use long thin strips of leaf, with barbs running down one edge, to draw prey from cracks and holes in trees. It is not known whether this ability is learned or inherited but, if the conclusions drawn from the study by Tebbich et al. (2001) have any generality, it will be a mixture of the two.

In the experiment by Weir et al. (2002), a male and a female crow were required to retrieve a piece of food from a bucket with a handle that was placed in a clear, vertical tube (Figure 4.16). The tube was so deep that it was impossible for the birds to reach the handle of the bucket with their beaks. A piece of straight wire and a piece of wire with a hook at one end were placed near the tube, and the birds were expected to use the hooked wire to lift the bucket out of the tube by its handle. On one occasion the male crow selected the hooked wire, which left the female with the straight wire. She picked up one end in her beak, inserted the other end into a small opening, and then bent the wire to create a hook of a suitable size to enable her to lift the bucket from the tube. A video clip of this sequence can be seen at http://www.sciencemag.org/cgi/content/full/297/5583/981/DC1.

As one watches the female crow bend the wire, it is hard not to agree with the authors that she was deliberately modifying the wire to create a tool, and that this modification relied on an understanding of folk physics and causality. However, appearances can be deceptive, and it would be a mistake to ignore the possibility that the bird's behavior was a consequence of less sophisticated processes. As with the woodpecker finches, it is possible that the skill displayed by the female crow was a consequence of the interaction between inherited tendencies and learning based on prior experience with sticks, twigs, and so on. Before this explanation can be rejected with complete confidence, more needs to be known about the development of tool use in New Caledonian crows in their natural habitat. It is also a pity that rather little is known about the prior experiences of the bird in question, which was captured in the wild.

The results from these tests for an understanding of folk physics in birds can perhaps most fairly be described as ambiguous in their theoretical significance. On the one hand, it is possible to explain most, if not all, of them in terms of the trial-and-error principles advocated by Thorndike (1911) almost a century ago.

FIGURE 4.16 A New Caledonian crow lifting a bucket out of a tube in order to retrieve food in an experiment by Weir et al. (2002). Photograph by Alex Weir, Behavioural Ecology Research Group, University of Oxford.


However, a critic of this type of explanation would argue that it is so versatile that it can explain almost any result that is obtained. Moreover, although the trial-and-error explanations we have considered may be plausible, there is no evidence to confirm that they are necessarily correct. On the other hand, some of the behavior that has been discovered with birds is so impressive to watch that many researchers find it hard to believe the birds lack any understanding of the problem that confronts them.

For myself, my sympathies rest with an analysis of problem solving in terms of trial-and-error learning. The great advantage of this explanation is that it is based on firmly established principles of associative learning: problem solving relies on the capacity of rewards to strengthen associations involving responses, and the transfer of a solution from one problem to another is explained through stimulus generalization. By contrast, much less is known about the mental processes that would permit animals to make causal inferences, or to reason using folk physics. Seed et al. (2006) note briefly that these processes may involve the capacity for acquiring abstract rules about simple physical properties of the environment. Given such a proposal, two questions arise: first, how is such knowledge about the environment acquired; and second, are animals capable of abstract thought? As far as I am aware, no-one has offered an answer to the first question and, as for the second, we shall see in later chapters that whether or not animals are capable of abstract thought is a contentious issue that has yet to be fully resolved.

Copyright 2008 Psychology Press http://www.psypress.com/animal-learning-and-cognition/

Subject index
Note: Page numbers in italic refer to information contained in tables and diagrams. abstract categories 17980, 189, 358 abstract mental codes 325, 359, 368, 371 abstract thought 121 acquired distinctiveness 1578, 157, 158 acquired equivalence 1578, 157, 158 active memory 220, 221 adaptability 1213 addition 24950, 250 African claw-toed frog 216, 216 African grey parrot and category formation 172, 182, 182 and communication 3513 and mimicry 304 and mirror-use 322, 323 number skills of 24850, 2489, 251, 252 aggression, displays of 3378, 337 AI see artificial intelligence air pressure 285 Akeakamai (dolphin) 3535 alarm calls 331, 3325, 336, 337 albatross 286 Alex (parrot) 24850, 2489, 251, 252, 3513, 368 alley running 94, 98, 106 and extinction 1313, 132 and navigation 2678 and number skills 2456, 246 alpha conditioning 44 American goldfinch 331 American sign language 3412, 34950, 349 amnesia, drug-induced 2245 amodal representation 2401, 241 amygdala 21, 224 analogical reasoning 187, 188, 368, 371 anecdotes 23, 112 animal intelligence 233 and brain size 1012, 10, 3614 definition 1216, 3623 distribution 412, 4, 36071 ecological view 228, 229, 231 and evolution 36971 general process view 2301 historical background 2233 null hypothesis of 261, 3649 reasons to study 1620 research methods 202 animal kingdom 45 animal welfare 1920 anisomycin 2245 anthropomorphism 234, 1689, 2623 ants 2658, 267 apes 16 and category formation 182 and communication 32950, 3356, 340, 3567 and language 116, 32950, 3567 and self-recognition 369 vocalizations 33940, 340 see also chimpanzee; gorilla; orang-utan Aplysia californica (marine snail) 43, 435, 46, 47, 149, 263 Arctic tern 289 artificial intelligence (AI) 19 associability/conditionability 678, 745, 91 associative competition 1012, 101 associative learning 3561, 123, 143 and attention 3656 conditioning techniques 2642 CRs 5561 and deception studies 313, 315 definition 35 and evolution 3701 memory model of 469, 469, 59 nature of 429 and the null hypothesis 3646 and problem solving 121 and the reflexive nature of the CR 601 and stimulus-stimulus learning 4952 and surprise 3656 and time 233 and US representations 525 see also instrumental (operant) conditioning; Pavlovian (classical) conditioning associative strength 6471, 66, 68, 83, 889, 91, 101, 102 and category formation 1757, 179 and discrimination learning 1523 equations 6570, 83, 87, 88, 89 and extinction 12531, 134, 142 negative 1278, 130, 176 and stimulus significance 834 associative-cybernetic model 111 asymptote 36 attention 3656, 371 automatic 86 and conditioning 63, 7491, 86 controlled/deliberate 86 and the CS 63, 76, 77, 7880 dual 86 multiple modes of 86 audience effects 3356 auditory templates 332 Austin (chimpanzee) 345, 346, 357 autoshaping 37, 37, 38, 60, 126, 127 omission schedules 60 baboon and deception 314, 314, 315 predatory nature 333, 336 Bach, J.S. 1723 back-propagation networks 1645 two-layer 164 bees light detection 285 navigation 26971, 271, 280 see also bumble-bee; honey-bee behaviorists 28 bicoordinate navigation 288 bidirectional control 3089, 3089, 310 biological significance 801, 845, 102 birds evolution 6 and imitation 3045, 305 and navigation 275 and problem solving 11721, 11820 short-term memory of 228, 22930 song birds 3312, 363 see also specific species black and white images 1734 black-capped chickadee 305, 363, 363 blackbird 302 bliss points 104 blocking 53, 634, 63, 6970, 78, 3656 LePelley on 91 and Mackintoshs theory 834 and navigation 293, 294 and the PearceHall model 889
  and the Rescorla–Wagner model 72–3, 73
blue tit 230, 304–5
body concept 323–4
bombykol 265
bonobo, and communication 343–5, 344, 346, 347–8, 348, 349, 357
bottle-nosed dolphin
  and communication 338, 353–6, 354–5
  and learning set studies 362
brain 20–1
  human 19
  size 10–12, 10, 11, 361–4
brain size-body ratio 10–11, 11, 361
brown bear 266
budgerigar 309–10, 367
bumble-bee 298
Burmese jungle fowl 298
cache 226
canary 363
capuchin monkey
  and category formation 184
  and diet selection 300
  and problem solving 116, 117
  see also cebus monkey
carp 173
cat
  and communication 328
  and the puzzle box escape task 26–7, 26–7, 28
category formation 170–89, 358
  abstract categories 179–80, 189, 358
  categories as concepts 179–80
  examples of 171–3
  exemplar theories of 177–8, 178
  feature theory of 173–9
  and knowledge representation 188–9
  prototype theory of 178–9
  relationships as categories 180–7
  theories of 173–9
causal inference 114–21, 115
cebus monkey
  and imitation 312, 312
  and serial order 256–7, 256–7
  see also capuchin monkey
cephalization index (K) 11–12, 12, 13, 361
chaffinch 331
chain pulling tasks 95, 97, 110, 141, 253
cheremes 341
chick, and discrimination learning 159–61, 159–61
chicken
  and communication 331
  and discrimination learning 150
chimpanzee 4, 368, 371
  and abstract representation 189
  and category formation 179–81, 180, 186–7, 188, 189
  and communication 24, 116, 327, 336, 339–50, 342–4, 357, 359
  dominant 317–18
  and emulation learning 310
  and imitation 306, 307, 310, 311, 367
  and instrumental conditioning 105, 106
  lexigram use 343–5, 343–4, 346
  and mimicry 303
  and number skills 245, 245, 248, 250, 250, 252
  and plastic token use 342–3, 343
  and problem solving 112–17, 112–13, 115, 117
  and self-recognition 319–24, 320, 322
  short-term memory of 228
  and social learning 367
  subordinate 317–18
  and theory of mind 312–13, 314–18, 316, 317, 324
  and transitive inference 259–60, 262
  vocal tracts 340, 340
  welfare issues 19
  see also bonobo
chinchilla 172, 172
chunking 255–6
circadian rhythms 233–4
circannual rhythms (internal calendars) 291
Clark's nutcracker 214, 228, 230, 272–3, 294, 370
Clever Hans 16, 243, 243
clock-shift experiment 288
coal tit 230
cockroach 233–4, 262–3
cognition, and language 358–9
cognitive maps 276–80, 330, 369–70
color perception 173–4
communication 326–59, 369
  definition 327
  development of 330–2
  honey-bees and 328–30, 328–30, 331, 338–9
  as innate process 330–3, 336, 356
  and intention 335–6
  and interpretation of the signal 330–1
  as learnt process 330–3, 336, 337
  and representation 334–5
  and the significance of signals 334–5
  vervet monkeys and 332–7, 333–4
  see also language
compass bearings 269–70, 271, 272
compound stimuli 68–73, 83–4, 88–9, 126–9, 151–3, 160, 176–7
concepts, categories as 179–80
concrete codes 188, 371
conditioned emotional response (CER) see conditioned suppression
conditioned inhibition see inhibitory conditioning
conditioned reinforcement 105–6, 105
conditioned response (CR) 29, 35, 39–40, 44, 55–60, 285
  air pressure as 285
  and autoshaping 37
  and blocking 63
  compensatory 56–8, 58
  consummatory 55–6
  and discrimination learning 149, 155, 156, 164, 165
  and eye-blink conditioning 36–7
  and fear of predators 302
  influence of the CS on 58–9
  and inhibitory conditioning 42
  and the memory model of conditioning 47
  preparatory 56
  reflexive nature of 60–1
  and representation of the US 109
  response-cueing properties 110–11, 111
  and stimulus-stimulus conditioning 50–2
  strength of 64, 65, 70, 71, 73
conditioned stimulus (CS) 29, 30, 32–3, 123
  associability/conditionability of 67–8, 75
  associative strength 125–31, 134, 142
  and attention 63, 76, 77, 78–80
  aversive 110
  and compensatory CRs 57, 58
  compound 68–73, 83–4, 88–9, 126–9, 151–3, 160
  and conditioned suppression 38
  and discrimination learning 162
  and excitatory conditioning 35, 36–7, 38, 39, 40
  and extinction 124, 125–30, 134–5, 136–46
  and eye-blink conditioning 36–7
  and fear of predators 302
  and imitation 305
  influence on the CR 58–9
  inhibitory 40, 41–2, 217
  and instrumental conditioning 106, 109–11
  intensity 67–8, 68
  and long-term memory studies 216, 217
  memory model of 46–8, 59
  and the nature of US representations 52–4
  neural mechanisms of 43–5
  protection from extinction 126–8
  and the reflexive nature of the CR 60–1
  and the renewal effect 136–8
  single 64–8
  spontaneous recovery 134–6
  and stimulus-stimulus conditioning 50–2
  and taste aversion conditioning 39
  and timing 242
  and trace conditioning 194
  see also CS–noUS association; CS–R association; CS–US association/contingency
conditioned suppression (conditioned emotional response) 37–8, 38, 41, 42, 63, 67, 67–8, 71–2, 72, 85, 85, 89–90, 89
conditioning
  alpha 44
  and attention 63, 74–91, 86
  compound 68–73, 83–4, 88–9, 126–9, 151–3, 160
  excitatory 35–40, 42–5, 44–5, 52–4, 72–3
  eye-blink 36–7, 36–7, 53–4, 64, 64, 68–9, 69, 124, 125, 135, 135
  information-processing model of 46, 46
  instrumental 92–121
  observational 302, 308, 309
  second-order (higher-order) 50–2, 51, 185–7, 186
  serial 49–50
  single CS 64–8
  and surprise 62–74
  trace 194–6, 195–6
  see also inhibitory conditioning; instrumental (operant) conditioning; Pavlovian (classical) conditioning; taste aversion conditioning
conditioning techniques 26–42
  autoshaping 37, 37, 38
  conditioned suppression 37–8, 38
  control groups 39–40, 40
  excitatory conditioning 35, 36–40
  inhibitory conditioning 35, 40–2
configural cues 153
configural theory of discrimination learning 155–7, 156
connectionist models
  of discrimination learning 161–6, 162–6
  exemplar-based networks 165–6, 165
  hidden layers 165
  multi-layer networks 164–5, 165
  single-layer networks 162–4, 162–4
  of time 241
conscious thought 210–11
consolidation theory (rehearsal theory) of long-term memory 218–20, 218, 225
conspecifics 300, 324, 335
context 72, 138, 229
context-stimulus associations 78–80, 90–1
contextual variables 15
contiguity 31, 63, 98, 98
continuous reinforcement 130–1, 131–2, 134, 134, 145–6, 145
control groups 39–40, 40
  truly random control 40
cooperation, and language 338–9
copying behavior 297, 298, 301, 302–12, 324
  mimicry 303–4, 312, 324
  see also imitation
coriolis force 288
corvids 181, 182, 368
counting 233, 243, 245–9, 251–2
  cardinal principle 252
  one–one principle 252
  stable-ordering principle 252
cow 229
coyote 58–9
CS–noUS association 135–6
CS–R (response) association 364–5
  inhibitory 140, 142
CS–US association/contingency 71–2, 88–90, 91, 94–5, 99, 111, 346, 364
  associative strength 64–71, 83, 88
  and discrimination learning 162
  and extinction 123, 136–8, 138–40
  and inhibitory connections 136–8, 142
  and learned irrelevance 85–6, 90
  negative 71–2
  positive 71
  and stimulus significance 83–4, 85–7
  and the strength of the CR 64
  zero 71
dark-eyed juncos 363
dead reckoning/path integration 266–9, 269–70, 278, 280, 293, 367
decay theory 202–3
deception 313–15, 314
delay, gradient of 98, 98
delay conditioning 194
delayed matching to sample (DMTS) 197–9, 228
  and decay theory 202–3
  and forgetting 200, 201–3, 204, 205
density discrimination 167–9, 167, 168
deprivation 106–9, 108, 111
detours 276–8, 277
diana monkeys 335
diet selection 298–301, 299
  see also food
discrimination learning 148–69, 171, 178, 368
  configural theory of 155–7, 156
  connectionist models of 161–6, 162–6
  elemental theories of 155, 156
  and learning sets 361–2, 362
  and metacognition 166–9, 167–8
  and relational learning 149–50
  Rescorla–Wagner theory of 152–5, 153–4
  Spence on 150–2, 155, 156
  stimulus preexposure 157–61
  theories of 149–61
  and theory of mind studies 317, 318
  and transposition 149–51
dishabituation 79, 193–4
displacement (language) 337, 345, 354–6
distance effect 257, 259
distractors 193–4, 201–2, 203–4
  surprising 203
DMTS see delayed matching to sample
Do as I do test 307–8, 310, 311, 367
dog 22–3
  and classical conditioning 29–30, 29
  and communication 337, 337, 350–1
  and trial and error learning 25–6
dolphin 16, 16, 124
  and category formation 181
  and communication 338–9, 353–6, 354–5
  and learning set studies 362
  and self-recognition 322, 324, 369
  short-term retention of 198, 201, 206, 206, 207, 228
  see also bottle-nosed dolphin
drive centers 106, 107
drives 30–1, 106–7
drug tolerance 56–8, 58
eagle 80, 333, 334, 335
ecological niches 369–70
ecological view of animal intelligence 228, 229, 231
EDS see extradimensional shift
electric shocks 14, 31–2, 43–4, 50, 53, 63–4, 67, 69, 71–3, 84–6, 89–90, 107, 124–5, 129–30, 135
  and conditioned suppression 38
  and eye-blink conditioning 36
  and inhibitory conditioning 41, 42
  and long-term memory studies 216, 216, 217, 219–21, 219
  and navigation experiments 285
  and preparatory CRs 56
  and stimulus-stimulus learning 51–2
electroconvulsive shock (ECS) 218, 218–19, 220, 224–5, 228
elemental theories of discrimination learning 155, 156
elephant 3–6
  long-term memory of 213, 213
  and problem solving 116
  and self-recognition 322, 324, 369
  and theory of mind 317
emulation learning 310
endogenous control, of migration 290–2, 290
English language, spoken 339–40, 344–5
environment 7–8
episodic memory 225–31
episodic-like memories 226–8
equilibrium theories of behavior 104
European robin 291–2
evolutionary development 3, 4–10, 22, 23
  and intelligence 369–71
  tree of evolution 6, 7
excitatory conditioning 35–40, 72–3
  and the nature of US representations 52–4
  of neural processes 42–5, 44–5
exemplar effect 177
exemplars 175, 177–9, 178
expectancy (R–US) theorists 94–5
  see also R–US associations
experience 113–14
extinction 36, 51–2, 57–8, 65–6, 66, 108, 109, 122–47, 224
  associative changes during 134–42
  of conditioned inhibition 129–30
  conditions for 125–42
  enhanced 128–9, 128–9
  as generalization decrement 123–5, 125, 128, 132–4, 144, 146
  and an inhibitory S–R connection 140–2, 141
  not affecting CS–US associations 138–40, 139
  partial reinforcement 130–4, 131, 132, 134, 145–7, 145
  and Pavlovian conditioning 142–7, 143–6
  protection from 126–8, 126–7
  and the renewal effect 136–8, 136, 137
  and surprise 125–30
  trial-by-trial basis 142–7, 143–6
  as unlearning 123
  with a US 129
extradimensional shift (EDS) 81–3, 82
eye-blink conditioning 36–7, 36–7, 53–4, 64, 64, 68–9, 69
  extinction 124, 125, 135, 135
face recognition 173–4, 176, 176, 178
family trees 6, 8–9
fear, of predators 301–2
fear conditioning see electric shocks; electroconvulsive shock
feature theory of category formation 173–9
feature-positive discrimination 151–2, 155–6
Fellow (dog) 350–1
fish
  and category formation 173
  and learning 13, 15
  and navigation 274
  short-term memory of 228
  see also specific species
folk physics 116–21
food 8, 9
  food-pulling tasks 117–19, 118–19
  hiding/storing behavior 191, 214, 226–7, 229–30, 363, 370
  rewards 13–14, 14, 15, 27–8
  see also diet selection
foraging behavior 300–1, 301
forgetting 216–25, 228
  decay theory of 202–3
  deliberate 204–5
  and proactive interference 200–1, 201, 203, 204
  and retroactive interference 163, 200, 201–2, 203
  and short-term retention 199–205
fossil record 6
fruit-fly 327
frustration theory 132–3, 135
Galapagos Islands 5, 6, 119
geese 268–9, 268–9, 270
general process view of animal intelligence 230–1
generalization
  and category formation 174, 177, 178
  gradients 150, 150, 156
  mediated 180
  numerosity 244, 244
  temporal 236–8, 237, 240–1
  see also stimulus generalization
generalization decrement 37
  as extinction 123–5, 125, 128, 132–4, 144, 146
geometric modules 274–6
geometric relations 272–4, 273–4, 367
geraniol 234
gerbils 268, 269–72, 270, 272
ghost control 310
goal directed learning 32
goldfish 364–5
gorilla
  and communication 339
  and self-recognition 319, 321, 322
gradient of delay 98, 98
grain selection 80–1, 81
grammar 337–8, 346–50, 353–6, 355, 358
  of American sign language 341
  and lexigrams 343
  using plastic tokens 342
gravity 288
great spotted woodpecker 191
great tit 304–5, 305, 363
green turtle 283, 284, 291
grief 341
guppy 301
habituation 75, 77, 79, 80, 91
  definition 192
  and retroactive interference 201, 203
  and short-term retention 192–4, 193, 201, 203
  see also dishabituation
hamster
  and navigation 268, 277–8, 277
  social behavior of 59–60
harbor seal 303
heading vectors 272
hedgehog 229
higher vocal centre 363
hippocampal place cells 280–3
hippocampus 21, 363–4
homing 284, 286–9
  and bicoordinate navigation 288
  and landmark use 286
  map and compass hypothesis of 287–8
  and olfaction 289
  by retracing the outward route 287, 287
Homo sapiens 5
  see also human beings
honey-bee
  communication (waggle dance) of 328–30, 328–30, 331, 338–9
  and navigation 268
  short-term memory of 228
  sounds made by 329–30
  and time perception 234, 235, 236, 262
Hooker, John Lee 173
Horizon (TV programme) 338
horse 16, 243, 243
human beings 371
  brain 19
  and category formation 177, 178
  episodic memory of 225
  and the exemplar effect 177
  and language 327, 338, 339–40, 344–5, 356
  and learning 74, 86
  long-term memory of 213
  metamemory of 207–8
  navigation skills 266, 275
  and number 248
  and serial position effects 207
  see also Homo sapiens
hunger 106–9, 110
hunting by search image 80–1, 80–1
hyena 370
IDS see intradimensional shift
iguana 5, 6
imaginal codes 188
imitation 303, 304–12, 324, 367–8
  and bidirectional control 308–9, 308–9, 310
  chimpanzees and 306, 307, 310, 311, 367
  Do as I do test 307–8, 310, 311, 367
  and emulation learning 310
  laboratory studies of 306–10
  mechanisms of 311–12
  monkeys and 311–12, 312, 367, 368
  naturalistic evidence 304–6
  and two-action control 309–10, 309
inactive memory 220
indigo bunting 292
information-processing 15–16, 21
information-processing model
  of conditioning 46, 46
  of time 237–8, 237, 239, 239, 240, 241
infrasound 285, 288
inheritance, of problem solving abilities 119–20
inhibitory conditioning 35, 40–2, 41–3, 70–1, 73, 127–8
  detection 41–2
  extinction 129–30
  and latent inhibition 76
  and long-term memory 217
  and the nature of US representations 54–5, 55
innate processes
  communication 330–3, 336, 356
  fear 302
insight 112–14
instrumental (operant) conditioning 28, 30, 32–3, 92–121
  conditions of 97–106
  and discrimination learning 153
  extinction 123–8, 131–4, 141
  historical background 93–5
  and long-term memory studies 216–17
  and the memory model 111
  and mimicry 303
  nature of 93–7
  and the null hypothesis 364, 365–6
  Pavlovian interactions 106, 109–11
  and problem solving 111–21
  and serial order 253
  and vigor of performance 106–11
intellectual curiosity 16
intelligence see animal intelligence
intention, and communication 335–6
internal clocks 233–6, 262, 287–8, 292
  alternatives to 241–2
interneurons 45
intertrial interval 197
interval scales 250
interval timing 233, 236–41, 237–41, 242, 263
  midpoints 238, 239–40
  pacemaker 241, 263
intradimensional shift (IDS) 81–3, 82
Japanese macaque monkey 300, 306
Japanese quail
  and imitation 309, 309
  and mate choice 301
jaw movement 39–40, 40
Kanzi (bonobo) 344–5, 344, 346, 347–8, 348, 349, 357
kinesthetic self concept 323–4
knowledge attribution 315–19
knowledge representation 188–9, 358–9, 368–9
  abstract mental code 325, 359, 368, 371
  concrete mental code 188, 371
  self-recognition 319–24, 320–2, 368–9
  see also number; serial order; time
Koko (gorilla) 319, 321
Lana (chimpanzee) 343, 346, 352
landmarks 293–4
  and cognitive maps 278–80
  and compass bearings 269–70, 271, 272
  in distinctively shaped environments 274–6, 274–5
  and geometric relations 272–4, 273–4, 367
  and heading vectors 272
  and hippocampal place cells 281–3
  and homing 286
  piloting with multiple 271–2, 271–2, 367
  piloting with single 269–71, 367
  retinal snapshots of 270, 271
language 327, 336–59, 371
  American sign language 341–2, 349–50, 349
  apes and 116, 329–50, 356–7
  arbitrariness of units 337, 338
  and cognition 358–9
  and cooperation 338–9
  definition 336–8
  discreteness of 337, 338, 345
  and displacement 337, 345, 354–6
  grammar 337–8, 341–3, 346–50, 353–6, 355, 358
  human 327, 338, 339–40, 344–5, 356
  as innate process 356
  and lexigrams 343–5, 343–4, 346
  and motivation 357
  and the null hypothesis 369
  and plastic tokens 342–3
  and problem solving 19
  productivity of 337–8, 346–7
  requirements for learning 356–9
  and semanticity 337, 346
  sentences 342–3, 346–50, 352–4, 355, 356, 358
  spoken English 339–40, 344–5
  spontaneous 357
  training assessment (apes) 345–50
  training methods (apes) 339–40
  training methods (non-primates) 350–6
language acquisition device 356–7
latent inhibition 76, 79–80, 88–91, 365–6
Law of Effect 27–8, 30–1, 93–4, 102–4
  and problem solving 111–21
learned irrelevance 84, 85–6, 85, 90
learning 8–9, 13–15
  communication and 330–3, 336, 337
  definition 13
  emulation learning 310
  goal directed 32
  human beings and 74, 86
  and migration 292–3
  perceptual 158–61, 159–61
  relational 149–50
  speed of 13–15, 14
  stimulus-stimulus 49–52
  and surprise 62–74
  trial and error 25–7, 112–14, 116–17, 119–21, 169, 211, 297, 323, 336
  see also associative learning; discrimination learning; social learning
learning curve 64, 64, 65
learning sets 361–2, 362
leopard 335
lever pressing 93, 95–6, 98–102, 105–8, 110
  and conditioned suppression 37–8
  and CS–US contingency 71
  and discrimination learning 151–2, 153
  and extinction 124, 141
  and imitation studies 306–7, 307
  and interval timing 236–8, 237, 239–41, 240–1
  and number skills 245, 246, 251
  and preparatory CRs 56
  and serial order 253
lexigrams 343–5, 343–4, 346
light
  polarized 285
  ultraviolet 285
light-dark cycle 287–8, 292
  artificial 288, 291
limited capacity theory 202, 203–4
loggerhead turtle 290–1, 290
long-distance travel 265, 283–93
  and air pressure 285
  homing 284, 286–9
  and magnetic fields 284, 288, 291
  migration 284, 289–93
  and navigational cues 284–5, 293–5
  and polarized light 285
  and ultraviolet light 285
long-term memory 191, 212–31
  capacity 214–15, 214, 215
  comparative studies of 228–31
  consolidation theory of 218–20, 218, 225
  durability 215–17, 216, 217
  episodic 225–31
  neural circuitry 218, 224–5
  retrieval theory of 218, 220–5, 222, 223
loris 266
Loulis (chimpanzee) 341–2, 342, 350
magnetic fields 269–70, 284, 288, 291
magnitude effect 256, 259
mammalian evolution 6
Manx shearwater 286
map and compass hypothesis 287–8
maps, cognitive 276–80, 330, 369–70
marmoset monkey 83
  and imitation 310, 367
  short-term memory of 366–7, 366
marsh tit 363
marsh warbler 332
Matata (bonobo) 344
matching to sample 180–3, 181, 184, 186–7, 186, 189, 215, 368
  see also delayed matching to sample
mating
  and deception 314–15
  mate choice 301
maze running 94, 94
  and foraging behaviour 300–1, 301
  and long-term memory 222–4, 223
  and navigation skills 276–7, 277, 281–3, 281–2, 294
  and trace conditioning 195–6, 195, 196
  see also radial maze; T-mazes
mean length of utterances (mlu) 349, 349
meaning 337, 346
Melton (baboon) 314, 315
memory 188–9
  active 220, 221
  episodic 225–31
  episodic-like 226–8
  and evolution 370
  human 371
  inactive 220
  and the null hypothesis 366–7, 366
  reference 237–8
  retrieval 21
  spatial 363–4, 366, 366–7
  and standard operating procedures 76–9
  and stimulus significance 81
  visual 304
  see also long-term memory; short-term retention
memory model of associative learning 46–9, 46–9, 59, 111
memory traces 202, 211, 241–2
mental states 28, 312, 324–5
metacognition
  definition 166
  and discrimination learning 166–9, 167–8
metamemory
  definition 208
  and short-term retention 207–11, 209
methamphetamine 241
Mexican jay 230
migration 284, 289–93
  definition 284, 289
  endogenous control of 290–2, 290
  and learning 292–3
  and the null hypothesis 367
mimicry 303–4, 312, 324
  vocal 303–4
mind 12
mirror neurons 311–12, 324
mirrors 319–24, 320, 321, 368–9
Missouri cave bat 286
mlu see mean length of utterances
molar theory of reinforcement 99–100
molecular theory of reinforcement 100, 101–2
monkeys 368
  and category formation 179, 187
  and discrimination learning 166–9, 167, 168
  episodic-like memory of 227
  fear of predators 301–2
  and imitation 311–12, 312, 367, 368
  and instrumental conditioning 93
  and metacognition 166–9, 167, 168
  mirror-use 322, 323
  number skills of 246–8, 247, 251, 252
  and self-recognition 322
  and serial order 253, 256–9, 256–8, 358
  short-term retention of 198, 201–2, 204, 228
  and trace conditioning 195
  and transitive inference 260–1, 260, 262
  see also specific species
morphine 56–8, 58
motivation
  dual-system theories of 107–8, 109–10
  and language 357
motor neurons 44–5, 45, 46
mouse 233
music 172–3
mynah bird 303
natural selection 5
navigation 21, 264–95
  bicoordinate 288
  and communication in honey-bees 328–30, 328–30, 331
  long-distance travel 265, 283–93
  methods of 265–83
  and the null hypothesis 367
  and the Rescorla–Wagner theory 293
  short-distance travel 265–83, 293
  and the sun 267
navigational cues 284–5, 293–5
need 30, 106–7
negative patterning 153, 156, 162
neocortex 364
neophobia 298–9, 300
neural circuitry
  of associative learning 42–5, 44, 45
  of long-term memory 218, 224–5
neural net theories see connectionist models
New Caledonian crow 119–20, 120
Nim (Neam Chimpsky) (chimpanzee) 341, 346, 349–50, 357
noUS center 134–6, 135
noUS representation 55
nominal scales 247
nous (mind) 12
null hypothesis of intelligence 12, 261, 364–9
  and associative learning 364–6
  as impossible to refute 369
  and language 369
  and memory 366–7, 366
  and navigation 367
  and the representation of knowledge 368–9
  and social learning 367–8
number 233, 243–52, 263, 368
  absolute number 245–7, 245–7
  addition 249–50, 250
  interval scales 250
  nominal scales 247
  numerical symbols 248–50, 248–9, 252
  ordinal scales 247–8
  and perceptual matching 251–2
  relative numerosity 243–4, 244
  representation 247–8
  subitizing 251
number-identification task 86
numerons 252
numerosity generalizations 244, 244
oak leaf silhouette experiment 171, 172, 173
observational conditioning 302, 308, 309
octopus 298
odour cues 309
olfaction 289
omission schedules 60
operant conditioning see instrumental (operant) conditioning
orang-utan
  and communication 339–40, 340
  and mimicry 303
  and self-recognition 323–4
ordinal scales 247–8
orienting response (OR) 74–6, 75, 87–8, 87
oscillators 236, 263
overshadowing 70, 101–2, 293
Pacific sea anemone 192
paramecium 149, 149, 192, 193
parrot
  and communication 351–3
  and mimicry 303
  see also African grey parrot
partial reinforcement 130–4, 131, 132, 134, 145–7, 145
partial reinforcement effect (PRE) 130, 132, 133, 145–7
path integration see dead reckoning/path integration
Pavlovian (classical) conditioning 20, 28–30, 29, 32–3, 35–61, 93, 99
  using air pressure 285
  and autoshaping 37
  and the conditioned response 55–61
  and conditioned suppression 38
  and diet selection 299
  and discrimination learning 149–50, 162
  and dual attention 86
  and evolution 370–1
  and extinction 123–6, 128–31, 133–4, 138–47
  eye-blink conditioning 36–7
  and fear of predators 302
  and imitation 305
  using infrasound 285
  instrumental interactions 106, 109–11
  and long-term memory studies 217
  using magnetic fields 284
  memory model of 46–9, 46
  neural mechanisms of 43–5, 44, 45
  and the null hypothesis 364–6
  and overshadowing 101
  and stimulus-stimulus learning 49–52
  taste aversion conditioning 39
  and timing 242
  using ultraviolet and polarized light 285
Pavlovian-instrumental transfer design 110
peak procedure 239, 239
peak shift 151, 153, 178
Pearce–Hall theory 86–91
pedometers 267
penguin 24, 24–5
pentobarbital 224
perception, colour 173–4
perceptual learning 158–61, 159–61
perceptual processing 21
periodic timing 233–6, 263
  oscillators 236, 263
pheromone trails 265–6
Phoenix (dolphin) 353
phonemes 173, 341
phylogenetic scale 5
physiological techniques 20–1
pigeon 16, 17–18, 55–6, 365–6, 368
  and autoshaping 37, 37, 38, 60
  and category formation 171–2, 172, 173–5, 174–5, 176, 177, 182–5, 182, 183–5, 358
  and communication 339
  and discrimination learning 151, 151, 154
  episodic-like memory of 227–8
  and extinction 124, 126–8
  and homing 286–9, 293
  and imitation 310, 311
  and inhibitory conditioning 40
  and long-term memory 214–17, 214, 215, 227–8
  metamemory of 211
  navigation skills of 264, 272, 284–9, 293
  number skills of 243–4, 244, 251
  and problem solving 114
  and selective association 84
  and self-recognition 322
  and serial order 253–6, 254–5, 259
  short-term retention of 197–8, 200, 201–2, 204, 205, 206, 211, 228
  and stimulus significance 80–3, 84
  and stimulus-stimulus conditioning 50
  and time 241–2
  and transitive inference 261–2, 261
piloting
  with multiple landmarks 271–2, 271–2
  with single landmarks 269–71
pinyon jay 363
place cells 280–3
plastic tokens 342–3, 343
population-specific behavioral traditions 305–6
PRE see partial reinforcement effect
predators
  and alarm calls 331, 332–5, 334, 337
  fear of 301–2
prefrontal lobe 21
Premack principle 103–4
primacy effect 206–7
primates 6, 8–9
  and imitation 305–6
  and navigation 275
  and social learning 367
  see also specific species
proactive interference 200–1, 201, 203, 204
problem solving 26–7
  and causal inference 114–21, 115
  and folk physics 116–21
  and insight 112–14
  and language 19
  and the Law of Effect 111–21
prospective code 198
protection from extinction 126–8, 126–7
prototype theory 178–9
protozoan 327
punishment 14, 31–2, 93
  see also electric shocks; electroconvulsive shock
puzzle boxes 26–7, 26–7, 28
quail 367
R–US associations 94–5, 95, 97, 101–2, 110–11, 311, 365
rabbit 64, 64, 68–9, 69, 77–80, 110
  and extinction 124–5, 135
  and habituation 192–4, 193
  and Pavlovian conditioning 36–7, 39–40, 53–4
raccoon 60–1, 191
radial maze 228–9, 366
  and forgetting 200–1, 201, 202, 203, 204–5
  and navigation 294
  and serial position effects 206–7, 208
  and short-term retention 196–7, 196–7, 198, 199
rat 14, 31–2, 32, 59
  and attention and conditioning 75–6, 75, 78, 83–8, 86, 89–90
  and compensatory CRs 56–8
  and conditioned suppression 37–8
  and CS–US contingency 71
  and diet selection 298–301, 299, 301
  and discrimination learning 151–2, 153, 153, 157–8, 163
  episodic-like memory of 227, 228
  and extinction 128, 128, 130–4, 139, 140–2, 146
  and imitation 306–7, 307, 308–9, 308, 367–8
  and inhibitory conditioning 42
  and instrumental conditioning 93–101, 94–101, 103–8, 103, 105, 108, 110, 365
  and knowledge representation 188
  and learned irrelevance 85–6, 86
  long-term memory of 216–25, 218–19, 222, 227–8
  and the memory model of conditioning 47–9, 47
  and method of learning 14–15
  navigation skills of 268, 273–9, 273–5, 277–9, 281–3, 281, 293, 294
  number skills of 245–6, 246, 251–2
  and orienting responses 87–8
  and preparatory CRs 56
  and selective association 84, 85
  and serial order 253
  short-term retention of 194–7, 195–7, 199–207, 201, 204, 208, 228–9
  and stimulus significance 83, 84, 85
  and stimulus-response theorists 31–2, 32
  and stimulus-stimulus learning 49–50
  and time 235–8, 237, 239–41, 240, 241
  and trace conditioning 194–6, 195–6
raven 117–19, 118–19
reactivation effects 221–5, 222, 223
reasoning, analogical 187, 188, 368, 371
recency effect 206–7
reference memory 237–8
reinforcement
  conditioned 105–6, 105
  continuous 130–1, 131–2, 134, 134, 145–6, 145
  molar theory of 99–100
  molecular theory of 100, 101–2
reinforcer devaluation design 95–6, 95, 97
reinforcers 93
  and associative competition 101–2
  conditioned 105–6
  definition 102–4
  incentive value of the 108, 109
  nature of the 102–4
  negative 93
  positive 93
  response contiguity 98
  token 105–6
relational learning 149–50
relationships between objects 180–7, 188–9
  second-order 185–7, 186
renewal effect 136–8, 136, 137
representation 21–2
  A1 76–8, 76, 81
  A2 76–80, 76, 81
  amodal 240–1, 241
  and communication 334–5
  inactive 76, 76
  of number 247–8
  retrieval-generated A2 representations 78–9
  self-generated A2 representations 77–8, 79
  spatial 258–9, 262
  standard operating procedures model 76, 76
  US 109
  see also knowledge representation
reptiles 6
Rescorla–Wagner model 64–5, 68–71, 72–4, 83, 84, 91
  and category formation 175, 176
  and discrimination learning 152–5, 153–4, 156, 162, 163, 164–5
  and extinction 123, 125–6, 127–31, 142
  and navigation 293
research methods 20–2
response-chaining 253–7, 254–7
response-cueing 110–11, 111
response-reinforcer contingency 99–100, 99, 100, 101–2
retardation test 41–2
retention interval 197–8, 216
retinal snapshots 270, 271
retrieval theory 218, 220–5, 222, 223
retroactive interference 163, 200, 201–2, 203
retrospective code 198
reward 93
  anticipation of 94–5, 96, 108
  and category formation 175–6
  and gradient of delay 98, 98
  and the Law of Effect 112
  and response-reinforcer contingency 99–100
  and short-term memory studies 199
  and stimulus-response theorists 31, 32
rhesus monkey
  and category formation 181, 184
  and metamemory 208–11, 210
  and serial order 257–8, 257–8
  and short-term retention 206, 208–11
Rico (dog) 351
rook 117
running wheels 103, 103
S–R associations 365
  and imitation 311
  and inhibition in extinction 140–2, 141
  and instrumental conditioning 93–4, 96
  and serial order 253
  and theory of mind studies 318
S–R theorists 46, 48–9, 52, 94–5, 96, 111
S(R–US) associations 96–7, 111
sage grouse 301
sameness 180–5, 181–5, 188–9
Sarah (chimpanzee) 116, 187, 188, 260, 342–3, 346, 347, 349, 359, 368, 371
satiety 107, 108–9
satisfaction 28, 30
scala naturae 4–5
scrub jay 318–19, 324
sea lion 181, 181, 215, 216
Second World War 266
second-order (higher-order) conditioning 50–2, 51, 185–7, 186
selective association 84–6
self-awareness 323
self-recognition 319–24, 320–2, 368–9
semanticity 337, 346
sensory neurons 44, 45, 46
sensory preconditioning 50
sentences 342–3, 346–50, 352–4, 355, 356, 358
  comprehension 347–8, 352, 353–4, 355
  production 348–50, 352–3, 356, 358
sequential stimuli 245–6, 245
sequential theory 132, 133
serial conditioning 49–50
serial delayed alternation 203
serial order 233, 253–9, 254–8, 263, 358, 368
  and chunking 255–6
  and the distance effect 257, 259
  and the magnitude effect 256, 259
  spatial representation of 258–9
  and transitive inference 259–61
serial position effects 206–7, 208
serial recognition tasks 253–7, 254–7
serotonin 45
Sheba (chimpanzee) 248, 250, 252, 368
Sherman (chimpanzee) 345, 346, 357
short-distance travel 265–83, 293
  and cognitive maps 276–80
  and dead reckoning/path integration 266–9, 269, 270, 278, 280, 293, 367
  and detours 276–8, 277
  and distinctively shaped environments 274–6, 274–5
  and geometric modules 274–6
  and geometric relations 272–4, 273–4
  and hippocampal place cells 280–3
  methods of navigation 265–83
  and novel release sites 278–80, 279
  and pheromone trails 265–6
  piloting with multiple landmarks 271–2, 271–2
  piloting with single landmarks 269–71
short-term retention 190–211, 366, 366
  comparative studies of 228–31
  and decay theory 202–3
  forgetting 199–205
  and habituation 192–4, 193
  limited capacity theory of 202
  and metamemory 207–11, 209
  methods of study 191–9
  and serial position effects 206–7
  and trace conditioning 194–6
shuttle-boxes 220–1
Siamese fighting fish 229
sign language 24, 341–2, 349–50, 349
signals 334–5
silkworm moth 265
simultaneous chaining 253–7, 254–7
simultaneous stimuli 246–7, 247
Slocum (sailor) 266
snakes 301–2, 302, 333, 334
social behavior 59–60
social enhancement 305
social groups 367
social learning 296–325
  and copying behavior 302–12
  and diet selection 298–301, 299
  and foraging behavior 300–1, 301
  and mate choice 301
  and the null hypothesis 367–8
  and predator fear 301–2
  and self-recognition 319–24, 320–2
  and theory of mind 312–19, 324–5
song birds 331–2, 363
SOP model see standard operating procedures (SOP) model
spatial memory 363–4, 366, 366–7
spatial representation 258–9, 262
spontaneous recovery 134–6
squirrel monkey 300
  short-term retention of 206
  and transitive inference 260, 260
standard operating procedures (SOP) model 76–80, 76, 81, 83, 192
starling 292–3
stars 292
Steller's jay 328, 328, 337
stickleback 192, 228, 328, 334
stimulus 20
  context-stimulus associations 78–80, 90–1
  and habituation 192–4
  preexposure 157–61
  significance 80–6
  see also conditioned stimulus; S–R associations; S(R–US) associations
stimulus enhancement 298, 306, 307, 308, 309
stimulus generalization
  definition 37
  and discrimination learning 150, 153, 155–6, 157–8
  neural basis of 45
  and problem solving 116, 121
stimulus-response theorists 30–3, 32
  see also S–R associations
stimulus-stimulus learning 49–52
Stravinsky, Igor 172–3
strawberry finch 332
subitizing 251
Sultan (chimpanzee) 112–13, 112, 114, 115
summation test 42, 43
sun 267, 287–8
sun compass 287–8
surprise
  and distractors 203
  and extinction 125–30
  and learning 62–74, 84, 86–7, 90
symbolic distance effect 260–1
T-mazes 195, 203–4, 204
tamarin monkey 229, 300
  and self-recognition 322
  and short-term memory 366–7, 366
taste aversion conditioning 14, 39, 47–9, 58–9, 78, 78, 95–6, 300
  and discrimination learning 158, 159–61
  and trace conditioning 194–5, 195
temporal generalization 236–8, 237, 240–1
theory of mind 312–19, 324–5, 371
  and communication 336
  and deception 313–15, 314
  and knowledge attribution 315–19
three-spined stickleback 192
tiger 297
time 233–42, 262–3, 368
  circadian rhythms 233–4
  connectionist model of 241
  information-processing model of 237–8, 237, 239, 239, 240, 241
  interval timing 233, 236–41, 237–41, 242, 263
  and memory traces 241–2
  periodic timing 233–6, 263
  scalar timing theory 238, 238, 239–40, 240
tool use 116–17, 117, 119–20
trace conditioning 194–6, 195–6
trace intervals 194
transitive inference 259–62, 260–1
  and spatial representation 262
  and the symbolic distance effect 260–1
  and value transfer 261–2
transposition 149–51
transposition test 150
trial and error learning 25–7, 112–14, 116–17, 119–21, 169, 211, 297, 323, 336
trial-based theories 142
truly random control 40
two-action control 309–10, 309
ultraviolet (UV) light 285
unconditioned stimulus (US) 29, 30, 32–3, 35, 40, 123
  and attention 78–9, 80
  aversive 110
  and compensatory CRs 58
  and diet selection 299
  and discrimination learning 155, 156, 162, 165
  and extinction 125–6, 128–9, 135, 139–43, 145–6
  and eye-blink conditioning 36
  and fear of predators 302
  and imitation 305
  and inhibitory conditioning 40, 41, 42
  intensity 66–7
  and long-term memory studies 217
  memory model of 46–8, 59
  and the nature of representations 52–5, 54
  and the reflexive nature of the CR 60
  representation 109
  and stimulus-stimulus conditioning 49–50
  surprising 63–4, 69, 365–6
  and trace conditioning 194
  see also CS–US association/contingency; noUS center; noUS representation; R–US associations; S(R–US) associations
unobservable processes 21–2
urine 266
utterances, mean length of (mlu) 349, 349
value transfer 261–2
vasoconstriction 77–8, 192–3, 193
vasopressin 241
vervet monkey 332–7, 333–4
vestibular system 268
Vicki (chimpanzee) 307, 340
visual category formation 171–2, 173–6
visual memory 304
vocal mimicry 303–4
vocal tract 340, 340
waggle dance (bees) 328–30, 328–30, 331, 338–9
Washoe (chimpanzee) 24, 341–2, 342, 344, 346, 348–9, 350, 357
Western scrub jay 226–7, 226
white-crowned sparrow 331–2, 370
wind direction 289
wolf 58–9, 59
Woodpecker finch 119, 120
words 337
working memory 237–8
wrasse 228
Yerkish 343
Zugunruhe 291, 292