
Journal of Experimental Psychology:

Human Learning and Memory

VOL. 4, No. 3, MAY 1978

Recognition Memory for Aspects of Dialogue


Elizabeth Bates, University of Colorado
Mark Masling, Stanford University
Walter Kintsch, University of Colorado

Recent research has suggested that memory for surface form in natural discourse may be more robust than laboratory studies of connected prose have led us to believe. This experiment examined recognition memory for anaphoric versus explicit reference in a 20-minute videotaped drama. The results demonstrated significant memory for meaning, manifested as the ability to reject a false paraphrase. Furthermore, there was significant memory for surface form for several types of reference, including pronouns versus proper names, role versus proper names, and elliptical versus full clauses. In general, surface memory tended to be higher for explicit reference than for anaphoric utterances. Finally, there were systematic and significant biases toward several surface forms in control groups that were guessing about the relative naturalness of alternative utterances. The surface-meaning distinction, at least in natural discourse, should be reassessed as a distinction between semantic meaning and the pragmatic function of various surface forms. Some problems concerning retrieval versus reconstruction in memory are considered.

We are grateful to Herb Clark for discussions concerning the possible role of reconstruction versus [...]
Requests for reprints should be sent to Elizabeth Bates, Department of Psychology, University of Colorado, Boulder, Colorado 80309.
Copyright 1978 by the American Psychological Association, Inc. 0096-1515/78/0403-0187$00.75

This study is part of a series of investigations into the nature of memory for naturally occurring linguistic materials in natural settings (see also Keenan, MacWhinney & Mayhew, 1977; Kintsch & Bates, 1977). The purpose of these studies is to examine whether the processes used in natural linguistic memory are the same as those used in the laboratory and in particular to investigate memory for those structures that are characteristic of natural discourse.

Recent research on memory for discourse has focused our attention on several processes that were less apparent in traditional research on individual sentences or words. For example, Bransford, Barclay, and Franks (1972) have demonstrated that memory for a series of related sentences is strongly influenced by a process of semantic integration. In this process, subjects construct intersentential meanings based on both explicitly presented information and on information that was implicit in the relations among the stimulus sentences. When asked to determine which sentences actually


occurred in the stimulus set, subjects are often unable to distinguish inferences implicit in the text from material that was explicitly presented. The theory of semantic integration has in turn led to two further emphases: (a) the distinction between memory for surface form and memory for meaning and (b) the distinction between memory-as-retrieval, in which some sort of trace (surface or meaning) is recovered from storage, versus memory-as-reconstruction, in which meaning or surface material is recognized or recalled not through retrieval of traces from storage but rather by rebuilding the input on the basis of one's fragmented memory for sentences and one's knowledge about probable structures in a given context.

With regard to the first distinction, it has been argued that memory for surface form in discourse is extremely weak and short-lived. True paraphrases (i.e., paraphrases consistent with the meaning of the text) are indistinguishable from the actual input sentence within 80 syllables (Sachs, 1967) or within four to eight intervening sentences (Garrod & Trabasso, 1973). Furthermore, with regard to the second distinction, researchers such as James, Thompson, and Baldwin (1973) have argued that the apparent recall or recognition of surface forms may be based on reconstructive processes rather than on "true" retrieval. A subject can often use his or her knowledge about the construction of sentences in general, or in a given context, to choose correctly between competing expressions for a given semantic structure, without having actually seen or heard the target sentence at all. At the very least, this finding suggests that in recognition memory research, we must be very careful to examine the natural biases favoring one paraphrase over another.

In general, the predominance of semantic memory over memory for surface form has been explained in terms of the ecological validity of semantic integration in memory for natural language, assuming that the passages used in these laboratory studies are an approximation to language in natural settings and that the expectations subjects bring into the laboratory do not radically alter the comprehension and memory processes that they use with these texts. The argument has been made that, if anything, the artificial aspects of the laboratory setting should enhance surface memory beyond its level in natural settings. In a summary of the literature on memory for prose, Clark and Clark (1977) conclude as follows:

    Normally, people "study" speech by listening to it for its meaning and by discarding its word for word content quickly. They try to identify referents, draw inferences, get at indirect meaning, and in general build global representations of the situation being described. When they later try to remember this speech, they fail miserably on its verbatim content, and they confuse two names for the same referent, a sentence and its implications, and one piece of a global representation with another. Yet when they have to, people can "study" speech word for word and later recall it verbatim. Memorization, however, usually requires repetition and special concentration on the surface features of the speech to be remembered. (p. 172)

However, two recent studies have suggested instead that recognition memory for surface form in natural settings is in fact surprisingly robust without memorization or, for that matter, without awareness that memory will be tested at all. Kintsch and Bates (1977) carried out two experiments on recognition memory for statements from a classroom lecture, including topic sentences, detail sentences, and a category of jokes and other statements extraneous to the "text act" of the lecture. In both experiments, the lectures were given during the usual class period, and students were unaware that they would be tested for sentence memory (except, of course, for the usual expectations that classroom material would be relevant for course examinations). In the first experiment, after a 48-hour delay, students were able to distinguish target statements from true paraphrases for topics, for details, and for jokes and other extraneous statements. In the second experiment, after a 5-day delay, surface memory was significant for jokes and extraneous

material, although it had disappeared for topics and details. In other words, surface memory for discourse in a classroom lecture is much stronger than surface memory in laboratory experiments; furthermore, memory for surface form is a function of the role of a statement in the whole text unit.

In a similar study, Keenan et al. (1977) tape recorded a faculty lunchroom conversation and administered a recognition memory test to the unsuspecting participants 36 hours after the conversation had occurred. After this delay, subjects were able to distinguish target sentences from true paraphrases for utterances high in what Keenan et al. term "interactive value," that is, figures of speech, mock insults, and jokes. Descriptive statements low in interactive value showed significant memory for meaning (i.e., rejection rates for statements that did not occur in the conversation) but no significant memory for surface form. Control experiments demonstrated that the advantage accrued to high interactive utterances is a function of participation in the whole discourse rather than the relative "memorability" of the individual utterances taken out of context. In a list-learning experiment, with the same utterances presented in random order to nonparticipants in the conversation, there was no difference in memory for surface form for high versus low interactive utterances.

In the present study, we have focused on memory for anaphoric processes in dialogue, in particular pronominalization and ellipsis versus explicit reference. There are two reasons for this choice. First, anaphora is a pervasive process in natural discourse. High-frequency use of pronouns and elliptical reference characterizes natural conversations and distinguishes them from the more formal types of prose typically used in studies of discourse memory (e.g., stories and paragraphs). Second, the question of memory for surface form seems particularly salient with regard to anaphoric processes. If the function of pronouns and other short forms is merely to identify previously established referents, why should we retain any information other than knowledge of the referent itself? If surface form for anaphora is retained over any significant period of time, we may have to reassess our definition of the nature and role of surface form in semantic integration of natural discourse.

To have some measure of control over these subtle processes in a large enough sample of subjects, we have had to take one step backward into the laboratory. In this experiment, the stimulus is a videotape of an afternoon television drama (a "soap opera" entitled Another World). This piece of discourse was chosen over other candidates because the afternoon serial, in comparison with other dramatic forms, more closely resembles natural conversation in the number of assumptions that are made about ongoing knowledge of characters and events. Hence, this dramatic form mimics the proportion of "old" versus "new" references to characters and events that typically occurs in natural conversation. Obviously, the subjects in the study were not participants in the discourse, and the research was necessarily carried out in a laboratory setting. However, until the recognition memory test was presented, there were no cues to suggest that a memory test was at hand. The videotape was rich enough in explicit versus anaphoric reference to yield a sufficiently large item pool for testing, while at the same time approximating natural conversation as closely as possible.

Method

Subjects

There were 120 college students who participated as subjects in small groups. All obtained credit for an introductory psychology course.

Materials

A half-hour daily segment from an afternoon dramatic serial was videorecorded in March of 1976, 6 months prior to its use in the present experiment. After editing to remove commercial messages, the entire segment lasted approximately 20 minutes, divided into six separate scenes with changes of setting and characters. The tape was transcribed and checked to insure correct wording. From this transcript, 43 target sentences were chosen, distributed evenly across the six segments.

The utterances were chosen to reflect six sentence types, in three reciprocal sets. The pronoun set contained expressions that referred to main characters by name and expressions that referred to main characters by pronoun. The clausal set contained utterances expressing information with fully formed clauses versus utterances referring to similar information with elliptical clauses. The role set contained utterances referring to main characters by role (e.g., "his wife") versus utterances referring to main characters by name or by pronoun. This last set, though infrequent in the text, provides a particularly interesting contrast in anaphora, since in the pronoun set, proper names contain more explicit information than their reciprocal pronouns, whereas in the role set, proper names are less explicit, marked, or informative than the role references. There were different numbers of items in each item type. In the pronoun set, there were 10 pronoun-target utterances. In the clausal set, there were 10 elliptical-clause-target utterances and 8 full-clause-target utterances. Finally, in the role set there were only 2 role-target utterances and 3 name-target utterances. This latter set was necessarily small, reflecting the relatively low frequency of this type of construction in the text.

For each of the 43 target utterances, a multiple-choice item was constructed containing the target utterance, a true paraphrase, and a false paraphrase. The true paraphrase contained the same information as the target but reversed the direction of the surface form. In other words, a pronoun sentence was paraphrased with identical wording, except that the character's name was substituted for the pronoun. A name sentence in the same set was also paraphrased with identical wording, except that a pronoun was substituted for the character's name. Similarly, in true paraphrases of the clausal set, elliptical clauses were substituted for full clauses with the same meaning. Elliptical clauses were paraphrased with an equivalent full clause. Finally, in true paraphrases of the role set, the character's role was paraphrased with his or her proper name or a pronoun, whereas in name sentences in the same set, an appropriate role term was substituted for a name. The false paraphrases also maintained exactly the same wording as the targets and true paraphrases, except that a false referent was substituted in the same name, role, or clause slot undergoing contrast. For example, a typical multiple-choice item from the pronoun-name set would be the following:

(a) I wanted to get that Pendleton work done while he was out of the office. (target)
(b) I wanted to get that Pendleton work done while Robert was out of the office. (true paraphrase)
(c) I wanted to get that Pendleton work done while Willis was out of the office. (false paraphrase)

Similarly, a typical multiple-choice item from the clausal set would be the following:

(a) We're doing everything we can to make sure that she does, Ada. (target)
(b) We're doing everything that we can to make sure that she keeps the baby, Ada. (false paraphrase)
(c) We're doing everything that we can to make sure that she regains consciousness, Ada. (true paraphrase)

All of the false paraphrases contained names or information relevant to the script but either untrue or inappropriate in the particular utterance in question.

The multiple-choice items were typed in random order in a recognition memory test booklet. In addition, a 250-word synopsis of the story, episode by episode, was constructed for use with half of the subjects. This synopsis was also accompanied by a list of the full cast of characters and their relations to one another. This manipulation was included to approximate a real-life situation in which a listener has background information about characters and events.

Procedure

The 120 subjects were assigned randomly to four groups, with 30 subjects in each. Group 1 received neither the synopsis nor the television program. They were simply asked to judge, within each multiple-choice item, the utterance most likely to have occurred in an afternoon television serial. Group 2 did receive the synopsis and cast of characters but did not view the program. They too were asked, after reading the synopsis, to select the most likely alternative within each of the multiple-choice items. Group 3 viewed the videotape but did not receive the synopsis. Their instructions were to circle the alternative within each item that they had actually heard in the program. Group 4 read the synopsis prior to viewing the tape and were then given the multiple-choice test with the same instructions as Group 3. This procedure yields a 2 × 2 × 6 design, with program/no program and synopsis/no synopsis as between-subjects factors and the six item types as within-subjects factors.

Within each of the four groups, subjects were tested in smaller groups of four to six members. Testing took place in a laboratory room with no visible apparatus other than a large television monitor and a playback machine. The two program groups (Groups 3 and 4) were simply told that they would see a television program and be asked about it afterward. Although no deception was involved, there were also no cues to suggest that the purpose of the experiment was to assess sentence memory.

We should note that the test items were not thematically independent of one another. Merely by reading through the test booklet, without a synopsis, a subject could obtain some idea of the
plot and the relations among characters (e.g., that Rachel is in some sort of medical trouble and that either her husband or her father is worried about her). Hence, neither of the control groups was expected to perform at chance levels in rejecting the false paraphrases. However, there was nothing in the randomly ordered test items to favor one surface form over another (e.g., to distinguish "Rachel" from "her" in a given sentence). If control subjects perform beyond chance levels in distinguishing true paraphrases from targets, their performance presumably reflects systematic biases about the nature of conversation, or at least the conversations that are thought to occur in television serials.

Results

Because the six item types contained unequal numbers of utterances, percentage scores were used rather than absolute frequencies. Memory for meaning was calculated for each subject, for each item type, by subtracting the percent choice of false paraphrases from 100%. Hence, if performance were random, memory for meaning would average 66%. Memory for surface form was calculated for each subject, for each item type, by subtracting the percent choice of the true paraphrase from the percent choice of the target utterance. In this case, random performance should yield surface memory scores of around 0%. Two separate analyses of variance were carried out on arc sine transformations of the percentage scores, one analysis on memory for meaning (the likelihood of choosing either the target utterance or its paraphrase), the other on memory for surface form (the probability of choosing the target minus the probability of choosing its paraphrase).

In addition, separate analyses of variance were carried out to determine (a) the effect of episode order and (b) the effect of item order in the test booklet. Neither of these order effects reached significance. We can, then, conclude that the memory effects investigated here are stable for at least 35 minutes (20 minutes viewing the program and approximately 15 minutes answering the test booklet).

Memory for Meaning: Analysis Across Subjects

Table 1 presents the memory-for-meaning scores for all four groups, for each of the six item types. The analysis of variance on these scores yielded significant main effects for all three factors.

First, there were two significant between-subjects effects, MSe = 351.5. The main effect of having seen the program was significant, F(1, 116) = 98.37, p < .001. Hence, subjects who had actually seen the videotape were better at rejecting false paraphrases than those who had not. However, the two control groups were also performing at well beyond the 66% chance level (see Table 1). There was apparently sufficient information available in the internal structure of the test booklet alone to facilitate rejection of false alternatives. The synopsis also had a significant effect on memory for meaning, F(1, 116) = 5.00, p < .025. In other words, the synopsis and cast

Table 1
Mean Percentage Scores for Memory for Meaning

Target type        Group 1         Group 2         Group 3         Group 4
                   (no program,    (synopsis       (program        (program &
                   no synopsis)    only)           only)           synopsis)

Pronoun            83.66           88.33           94.66           97.33
Name               77.66           81.33           93.66           93.66
Ellipsis           68.33           78.33           93.66           93.33
Full clause        61.13           71.06           85.43           87.90
Name (role set)    74.93           87.06           88.10           95.10
Role               91.66           78.33           93.33           92.00

Note. Chance performance level = 66%.
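
As a hedged illustration of the two scoring rules defined in the Results, the following Python sketch computes both percentage scores from one subject's choices on one item type. The function names and the labels "target", "true", and "false" are assumptions made for this example and are not part of the original analysis; the arc sine transformation applied before the analyses of variance is not reproduced here.

```python
from collections import Counter

# Hypothetical labels for the three alternatives in each multiple-choice item.
LABELS = ("target", "true", "false")


def score_item_type(choices):
    """Return (memory_for_meaning, memory_for_surface_form) in percent.

    `choices` holds one label per multiple-choice item of a single item type,
    for a single subject.
    """
    counts = Counter(choices)
    n = len(choices)
    pct = {label: 100.0 * counts[label] / n for label in LABELS}
    meaning = 100.0 - pct["false"]         # 100% minus percent false paraphrases chosen
    surface = pct["target"] - pct["true"]  # percent targets minus percent true paraphrases
    return meaning, surface


def chance_scores(n_alternatives=3):
    """Expected scores when every alternative is equally likely to be chosen."""
    p = 100.0 / n_alternatives
    return 100.0 - p, p - p


# One made-up subject on a 10-item type: 6 targets, 3 true and 1 false paraphrase.
example = ["target"] * 6 + ["true"] * 3 + ["false"]
meaning, surface = score_item_type(example)
chance_meaning, chance_surface = chance_scores()
```

With these made-up choices, memory for meaning is 90% and memory for surface form is 30%. Uniform guessing among three alternatives gives about 66.7% for meaning (reported as 66% in the note to Table 1) and 0% for surface form.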



of characters also aided all subjects in reconstructing the story and rejecting false alternatives. The interaction between the synopsis and program was not significant.

For the within-subjects factor of sentence type, the MSe = 225.9. This main effect was significant, F(5, 580) = 13.87, p < .001. Hence, memory for meaning was in part a function of the type of material tested. The three-way interaction just missed significance (p < .07). However, there were significant interactions between sentence type and viewing the program, F(5, 580) = 3.95, p < .002, and between sentence type and reading the synopsis, F(5, 580) = 4.38, p < .001.

The interaction between program and sentence type is illustrated in Figure 1. While memory for meaning was always higher for subjects who had actually seen the program, the improvement beyond guessing levels was more apparent for the clausal items than for either the pronoun-name set or the role-name set. Apparently, subjects who had not seen the program could nevertheless use the internal information of the test booklet itself as well as the information in the synopsis to reject false reference to characters, either in the pronoun-name set or the role-name set. For the clausal items, guessing was close to chance levels. Examination of cell means in Table 1 suggests that guessing on the clausal items was close to random for Group 1, while Group 2 (who had at least read the synopsis) were performing somewhat beyond chance levels.

Figure 1. Memory for meaning as a function of program by item type. (Filled circles = program; empty circles = no program.) Horizontal axis: A = pronoun target, B = name target, C = ellipsis target, D = full clause target, E = name target (role set), F = role target (role set).

The interaction between synopsis and sentence type is illustrated in Figure 2. Here, too, we see that guessing levels were lower for the clausal items than for items involving reference to characters in the story. Perhaps because of these baseline differences, the synopsis had a more dramatic effect on the clausal items. However, there was also a peculiar interaction within the role-name set, such that subjects who had seen the synopsis were better at rejecting false referents for name-target items than for role-target items. For subjects who had not read the synopsis, the reverse was true. These effects are probably related to biases about surface form as well as memory for meaning in the strict sense. Hence, these findings should be considered together with the results of the second analysis of variance.

Memory for Surface Form: Analysis over Subjects

Table 2 presents the mean surface memory scores for each of the four groups, for each of the six item types. The analysis of variance on these scores yields a significant main effect of having seen the program, F(1, 116) = 98.16, p < .001, MSe = 1880.2. However, in contrast to the results for memory for meaning, there was no significant main effect of the synopsis on surface memory scores.

There was, however, a significant interaction between program and synopsis, F(1, 116) = 3.85, p < .005. In general, it appears that the synopsis served to increase surface memory in those subjects who had seen the program. However, for subjects who were merely guessing which alternative was more likely to occur, the synopsis seemed to hinder performance rather than help it. The synopsis and cast of characters may have set off response biases of subjects

in Group 2 that differed systematically from the response biases of subjects in Group 1 (i.e., those who had seen neither the synopsis nor the program). In particular, examination of Table 2 suggests that subjects in the guessing bias group who were given a synopsis to read adopted the strategy of using the explicit information in the synopsis and cast of characters whenever possible, that is, avoiding pronominal reference and selecting names in the name-pronoun set. Hence their guessing biases may reflect not only general biases about the nature of conversation but also the particular demand characteristics of their experimental group (i.e., if given a cast of characters to read, use it in carrying out task demands).

Figure 2. Memory for meaning as a function of synopsis by item type. (Filled circles = synopsis; empty circles = no synopsis.) Horizontal axis: A = pronoun target, B = name target, C = ellipsis target, D = full clause target, E = name target (role set), F = role target (role set).

Figure 3. Memory for surface form (Groups 3 and 4) versus guessing bias (Groups 1 and 2) as a function of item type. (Filled circles = memory; empty circles = guessing.) Horizontal axis: item types A to F, as in Figure 2.

The three-way interaction was not significant, nor was there a significant interaction between synopsis and sentence type. However, there was a strong interaction between program and sentence type, F(5, 580) = 7.89, p < .001. Hence, across item types the pattern of surface memory differed significantly from the pattern of response bias in the guessing groups, as illustrated in Figure 3. For example, in the guessing groups there was a clear bias toward elliptical forms and against expanded clauses. Hence, subjects were more likely to obtain correct scores when the target was actually an elliptical clause and less likely to guess correctly when the target was a full clause. For

Table 2
Mean Percentage Scores for Memory for Surface Form

Target type        Group 1         Group 2         Group 3         Group 4
                   (no program,    (synopsis       (program        (program &
                   no synopsis)    only)           only)           synopsis)

Pronoun             -6.33          -27.00           -3.33           -4.00
Name                -0.33            6.66           17.00           29.00
Ellipsis            15.66           -1.66           19.00           30.66
Full clause        -13.46            1.60           24.16           25.03
Name (role set)     -8.86            1.13           28.90           39.90
Role                11.66          -11.66           66.66           78.66

Note. Chance performance level = 0%.
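
The surface scores in Table 2 feed the correction for guessing that the article applies later in the Results, M_true = (M_observed - G) / (1 - G). The sketch below is only an illustration: it applies that formula to the Group 3 scores with Group 1 as the guessing baseline, assuming the percentages are first converted to proportions. Under that assumption the output agrees with the corrected surface memory scores the article reports for this comparison (.03, .17, .04, .33, .35, .62). The names `TABLE_2_SURFACE` and `correct_for_guessing` are hypothetical.

```python
# Surface memory scores (percent) copied from Table 2, Groups 1 and 3 only.
# Keys and variable names are illustrative, not taken from the original article.
TABLE_2_SURFACE = {
    "pronoun":         {"group1": -6.33,  "group3": -3.33},
    "name":            {"group1": -0.33,  "group3": 17.00},
    "ellipsis":        {"group1": 15.66,  "group3": 19.00},
    "full clause":     {"group1": -13.46, "group3": 24.16},
    "name (role set)": {"group1": -8.86,  "group3": 28.90},
    "role":            {"group1": 11.66,  "group3": 66.66},
}


def correct_for_guessing(observed, guess):
    """Correction for guessing on proportions: (observed - guess) / (1 - guess)."""
    return (observed - guess) / (1.0 - guess)


# Assumption: percentages are divided by 100 before the correction is applied.
corrected = {
    item: round(correct_for_guessing(row["group3"] / 100.0, row["group1"] / 100.0), 2)
    for item, row in TABLE_2_SURFACE.items()
}

for item, value in corrected.items():
    print(f"{item:16s} {value:.2f}")
```

Within each reciprocal set, the corrected score is higher for the more explicit form (name over pronoun, full clause over ellipsis, role over name), which is the markedness pattern discussed in the text.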



the pronoun-name set, response patterns were in the opposite direction, that is, in favor of more explicit forms. Subjects were more likely to guess that the referent was a proper name than a pronoun, so that scores were higher when the target actually was a proper name and lower when the target was a pronoun. If we compare this particular bias with performance on memory for meaning on the same pronoun-name items, it appears that the guessing groups had "figured out" who did what to whom and hence may have been anxious to show their knowledge by circling correct names rather than pronouns when given a choice. Finally, among the guessing groups, performance on the role-name set averaged to chance levels, although the cell means in Table 2 suggest that biases were somewhat different depending on whether subjects had read the synopsis. Control subjects without the synopsis tended to avoid names; control subjects with the synopsis tended to avoid roles.

Figure 3 also illustrates the average levels of performance by item type for those subjects who actually saw the program. However, this particular illustration of surface memory may be somewhat misleading, since it emphasizes deviation from random baselines. Given the finding that guessing biases were far from random among those who had not seen the program, the best illustration of surface memory should contain a correction taking into account the amount of baseline bias that must also have been overcome by those who saw the program. Figure 4 presents both memory for meaning and memory for surface form in the memory groups, using the standard correction for guessing,

$$M_{\text{true}} = \frac{M_{\text{observed}} - G}{1 - G},$$

in which M_observed is the average proportion correct for Groups 3 and 4, and G is the average for Groups 1 and 2.

When the results are plotted in this manner, as in Figure 4, it appears that memory for surface form is better for the more explicit form within each of the three reciprocal item sets. Within the pronoun set, surface memory is better if a name is heard than a pronoun. For the clausal items, there is more surface memory for full clauses than for elliptical clauses. Within the role-name set, surface memory is better for roles than for names. This trend is consistent with the theory of markedness in discourse (e.g., Givon, Note 1). According to markedness theory, utterances which introduce characters or facts in explicit or marked surface form are generally more novel and informative than utterances that refer anaphorically to referents that are already mentally (if not physically) present in the discourse context. The same effect of markedness does not appear in the results for memory for meaning.

Figure 4. Memory for meaning versus memory for surface form as a function of sentence type (corrected for guessing bias). (Filled circles = meaning; empty circles = surface form.) Horizontal axis: A = pronoun target, B = name target, C = ellipsis target, D = full clause target, E = name target (role set), F = role target (role set).

An analysis of variance of the correct recognition performance scores upon which Figure 4 is based yielded a significant effect for sentence types, F(4, 145) = 8.09, p < .01, MSe = .21, as well as a significant contrast for unmarked versus marked sentences (A, C, E versus B, D, F), F(1, 145) = 4.25, p < .05, for surface memory, but no statistically significant differences for meaning memory (F < 1). In these analyses, subjects were treated as the random variable. In a parallel analysis with sentences as the random variable, the differences in surface memory between sentence types were still significant, F(5, 37) = 4.53, p < .002, whereas the contrast between marked and unmarked sentences did not reach significance, F(1, 37) = 2.41, .05 < p < .10. No significant differences were found when memory for meaning was analyzed. It should be noted that the analyses over sentences may not be suitable for the design of the present experiment, since the number of sentences per condition was too small, varying between 2 and 10.

The analyses of the data summarized in Figure 4 combine the scores for Groups 1 and 2 in the correction for guessing bias. Since the patterns of guessing bias were quite different for those two groups, it seemed advisable to compare the above findings with the patterns that obtain when the different types of guessing bias are considered separately. As noted earlier, the control group with the synopsis seemed to be particularly influenced by the demand characteristics of the experiment (i.e., the tendency to use a cast of characters, given that one is available). By contrast, the guessing bias patterns for subjects with neither the synopsis nor the program may be a closer approximation to biases about the nature of dialogue per se. Considering only the findings for Group 3 (program only) corrected for the bias obtained in the corresponding control group, Group 1 (no program, no synopsis), we obtain the following corrected surface memory scores: pronoun target = .03, name target = .17; ellipsis target = .04, full clause target = .33; name target = .35, role target = .62. Hence, the markedness patterns reported above were substantially the same or increased over the levels illustrated in Figure 4. The corresponding memory-for-meaning scores (i.e., Group 3 corrected for the bias in Group 1) are these:

[...] with either type of correction for bias, surface form memory is influenced by both sentence type and markedness, whereas memory for meaning is not.

Discussion

The principal findings from this study can be summarized as follows:

(a) Contrary to findings for other types of prose, there was significant recognition memory for surface form for at least these few aspects of dialogue, despite the large amount of material presented in a 20-minute drama.

(b) There were systematic and significant response biases in the two guessing groups, suggesting that "memory" (for meaning or for surface form) should be defined taking into account the relative gain or overcoming of bias by those who actually saw the videotaped program.

(c) The type of item tested had a significant effect on both memory for meaning and memory for surface form, in analyses across subjects. In analyses across materials, there was a significant effect of item type on surface memory only. The pattern of results across sentence types was quite different for surface versus meaning memory, with a tendency for more marked surface forms to be retained better than less explicit forms.

The first finding is consistent with results by Kintsch and Bates (1977) and Keenan et al. (1977) regarding surface memory for natural language materials in contexts where subjects do not expect a memory test. Furthermore, in all three studies, surface memory is at least in part a product of the type of material tested and the relationship of that material to the discourse unit as a whole. In the Kintsch and Bates study, surface memory was particularly strong for
pronoun target = .69, name target = .79; jokes and other statements that deviated
ellipsis target = .81, full clause target = from the "text act" of the lecture itself. In
.61; name target = .52, role target = 1.00. the Keenan et al. study, surface memory was
These corrections do yield a pattern some- significant only for material high in "inter-
what different from the one illustrated in active value"—a poorly defined dimension,
Figure 4. However, for both the Figure 4 but one that clearly involves the role of an
pattern and the pattern for Groups 1 and 3 utterance in conversation as opposed to the
only, sentence effects and markedness effects "memorability" of the individual sentences
are nonsignificant. We can conclude that out of context. In the present study, the
effect of item type on either memory for meaning or memory for surface form was quite complex. In general, the amount of surface form retained seems to be affected by the relative explicitness or discourse markedness of a given form, a pattern that does not apply to memory for meaning.

How do we explain the contrast between these findings and reports by Garrod and Trabasso (1973), Sachs (1967), and others that surface memory decays within four to eight intervening sentences or 80 syllables? The difference may be due at least in part to the role of surface form in natural discourse versus laboratory prose. The distinction between meaning and surface in passages administered out of context, with no real communicative purpose, is indeed a sort of wheat-versus-chaff distinction. The surface form has no purpose other than to convey semantic structure, that is, information about who did what to whom, where, when, and so on. Shifts from active to passive, adverb reorderings, and other paraphrases may have some slight stylistic or rhythmic effect, but in laboratory settings these differences rarely serve the kind of pragmatic function of highlighting or focusing information that they serve in a continuous piece of natural discourse. In natural settings, the surface form may often be the whole point of an utterance, particularly in jokes and figures of speech. Similarly, in establishing reference in discourse, explicit forms are chosen intentionally to draw attention to a referent as either new information or important and topicalized old information. We suggest that in natural speech, the distinction between meaning and surface form often corresponds to a distinction between semantic versus pragmatic meaning. The probability that a given surface form will be retained will, at least in part, be a function of the pragmatic role that surface form plays in a given context.

There are some further problems, however, particularly with regard to the distinction between memory-as-retrieval and memory-as-reconstruction. It is possible that subjects in our study did not actually "store" surface form in the strict sense. Instead, they may have used their knowledge of the temporal order of events and settings, and the general flow of the discourse, to decide which surface form a given speaker should have used at a given point in conversation. For example, given the sentence "I wanted to get the Pendleton work done while he was out of the office," a subject will have a higher probability of selecting the correct, pronominal form over the nominal form if he or she remembers that Robert has already been discussed at length at that point in the conversation. Hence it is more likely that Robert would be referred to by pronoun.

There is a further possibility that subjects were using a mixture of retrieval and reconstruction processes, based on the different decay functions for several types of surface information. Recall that in this study (as well as in the two other experiments on language in natural settings), both visual and auditory information were present in the stimulus materials. This is in notable contrast to the typical laboratory study with written prose. Baggett (1975) has shown that, although there is a semantic integration effect with simple visually presented stories, the decay function for visually presented information is much longer than the surface decay for equivalent verbal stories. Indeed, there was some evidence for visual surface memory 6 weeks after presentation. Similarly, memory for auditorily presented material is stronger than memory for written versions of the same verbal material (e.g., Murdock, 1974). It is possible that subjects in the present study used bits and pieces of visual and auditory surface memory to reconstruct the surface form of the verbal expression. For example, the explicit versus anaphoric forms in the videotape typically differed in their intonation contours. If the subject could recall a contrastive stress pattern at a given point in conversation (or, alternatively, an emphatic gesture), he or she may well have concluded that the referent was probably encoded in explicit form rather than as a pronoun.

Some of these questions could be answered by further research presenting the auditory portion of the drama alone and/or presenting the written transcript of the dialogue alone, for comparison with memory
for the auditory-visual presentation used in the present study. However, these controls still leave open the possibility that reconstruction on the basis of event chronology is responsible for apparent "retrieval" of surface forms. Undoubtedly reconstruction and retrieval are both used in the processing of natural language materials. To what extent these two processes can be separated empirically remains to be seen.

The Clark and Clark (1977) summary cited earlier applied to research on memory for language in laboratory settings. Our findings do not necessarily contradict that research. First of all, what is undoubtedly meaning-free surface form in a laboratory context may be quite meaningful in a natural setting. Second, the auditory and visual cues available in natural dialogue may facilitate memory over the levels observed with written materials in the laboratory. Third, in natural settings, the listener has a great deal more world knowledge available to aid in reconstruction as well as retrieval of discourse material. Our findings do suggest that the generalizability of laboratory research is limited and that more research is necessary on memory for language in natural settings and with natural materials.

Reference Note

1. Givon, T. Towards a discourse definition of syntax. Unpublished manuscript, Department of Linguistics, University of California, Los Angeles, 1975.

References

Baggett, P. Memory for explicit and implicit information in picture stories. Journal of Verbal Learning and Verbal Behavior, 1975, 14, 538-548.
Bransford, J., Barclay, J. R., & Franks, J. J. Sentence memory: A constructive versus interpretive approach. Cognitive Psychology, 1972, 3, 193-209.
Clark, H., & Clark, E. Psychology and language. New York: Harcourt Brace Jovanovich, 1977.
Garrod, S., & Trabasso, T. A dual memory information processing interpretation of sentence comprehension. Journal of Verbal Learning and Verbal Behavior, 1973, 12, 155-168.
James, C., Thompson, J., & Baldwin, J. The reconstructive process in sentence memory. Journal of Verbal Learning and Verbal Behavior, 1973, 12, 51-63.
Keenan, J., MacWhinney, B., & Mayhew, D. Pragmatics in memory: A study of natural conversation. Journal of Verbal Learning and Verbal Behavior, 1977, 16, 549-560.
Kintsch, W., & Bates, E. Recognition memory for statements from a classroom lecture. Journal of Experimental Psychology: Human Learning and Memory, 1977, 3, 150-168.
Murdock, B. B., Jr. Human memory: Theory and data. Hillsdale, N. J.: Erlbaum, 1974.
Sachs, J. Recognition memory for syntactic and semantic aspects of connected discourse. Perception & Psychophysics, 1967, 2, 437-444.

Received November 30, 1977
