Cognitive Linguistics Issue1 4.vol.21

Metaphor and metonymy: Making their connections more slippery
JOHN A. BARNDEN*
Abstract
This paper continues the debate about how to distinguish metaphor from metonymy, and whether this can be done. It examines some of the differences that have been alleged to exist, and augments the already existing doubt about them. The main differences addressed are the similarity/contiguity distinction and the issue of whether source-target links are part of the message in metonymy or metaphor. In particular, the paper argues that metaphorical links can always be used metonymically and regarded as contiguities, and conversely that two particular, central types of metonymic contiguity essentially involve similarity. The paper also touches briefly on how metaphor and metonymy interact with domains, frames, etc. and on the role of imaginary identification/categorization of target as/under source items. With the possible exception of this last issue, the paper suggests that no combination of the alleged differences addressed can serve cleanly to categorize source/target associations into metaphorical ones and metonymic ones. It also suggests that it can be more profitable to analyse utterances at the level of the dimensions involved in the differences than at the higher level of metaphor and metonymy as such.
* Correspondence address: School of Computer Science, The University of Birmingham, Birmingham, B15 2TT, UK. Tel: (44) (0)121 414-3816; Fax: (44) (0)121 414-4281. E-mail: <J.A.Barnden@cs.bham.ac.uk>. Acknowledgments: This research was supported in part by grant EP/C538943/1 from the UK's Engineering and Physical Sciences Research Council (EPSRC), and grant RES-328-25-0009 from the UK's Economic and Social Research Council and EPSRC under the People at the Centre of Communications and Information Technologies programme. I am grateful to colleagues Rodrigo Agerri, Sheila Glasbey, Mark Lee and Alan Wallington, and to the journal editors and anonymous reviewers, for important suggestions.
Cognitive Linguistics 21-1 (2010), 1-34. DOI 10.1515/COGL.2010.001. 0936-5907/10/0021-0001 © Walter de Gruyter
Keywords: metaphor; metonymy; metaphor/metonymy distinction; contiguity; similarity; part/whole metonymy; representational metonymy; resemblance metaphors; image metaphors.
1. Introduction
Specifying the nature of metaphor and metonymy has long been a difficult problem. It has been particularly difficult to specify convincing grounds for differentiating the two figures from each other (Barcelona 2000a; Cameron 1999a,b; Dirven and Pörings 2002; Fass 1997; Haser 2005; etc.). The literature exhibits a wide variety of opinion. In this paper we look mainly at two important alleged grounds for differentiation, namely (i) that metaphor involves similarity whereas metonymy involves contiguity or related notions of semantic/pragmatic connection (see, e.g., Dirven 2002; Jakobson 2002 [1956]; Lodge 1977; Norrick 1981; Nunberg 1978; Riemer 2001; and many others), and (ii) that metonymy preserves links to the source domain item as part of the message whereas metaphor does not (e.g., Dirven 2002; Haser 2005; Warren 2002). We briefly consider several other issues, including the question of whether metaphor and metonymy interact differently with some postulated structure of domains, frames, idealized cognitive models, or other compartmentalizations of conceptual information.
The paper will conclude that these various possible grounds for differentiation do not, as currently conceived at any rate, provide a firm distinction between metaphor and metonymy. This failure holds even if the putative grounds are combined rather than considered in isolation. The conclusion supports a similar one by Haser (2005: 15), but the paper adds qualitatively new evidence and critique. It leaves it open whether more careful accounts of the alleged differences could lead to a crisp metaphor/metonymy distinction, or whether additional differences could help. Our considerations will also not stop metaphor and metonymy having some tendency to differ in particular ways; with, for instance, metaphor tending to involve a rich form of similarity but metonymy tending not to.
There have been various notions, in the literature, of a cline or spectrum of phenomena incorporating metaphor and metonymy. Radden (2002) makes one such proposal. Dirven (2002) discusses a phenomenon of post-metonymy, intermediate between metaphor and metonymy, although Riemer (2002) argues that (his own version of) post-metonymy need not lead in the direction of metaphor. Croft and Cruse (2004: 220) give examples that suggest intermediate possibilities between metaphor and metonymy, while also warning that what may appear to be intermediacy may be the result of combining distinctly different processes. The present article regards the idea of a spectrum as broadly being on the right track, but pushes the idea further. It doubts that we should think of just a one-dimensional space of possibilities: rather, there are several and perhaps numerous different ways in which metaphor and metonymy can vary in their essence, including variance as regards amounts and types of similarity and contiguity. The positioning of metaphor and metonymy within the space created by these dimensions and others may be very complex.
We will come back to the question of whether there may be intermediate possibilities between metaphor and metonymy in Section 4. Intermediacy is the question of whether there are phenomena that have some of the qualities of both metaphor and metonymy but do not qualify as either. For now we focus briefly on the contrasting notion of overlap of metaphor and metonymy. Overlap is the question of whether there is a phenomenon that at one and the same time qualifies as being both metaphor and metonymy. We will see that there is evidence of a type of overlap that is distinct from a type that has already been much discussed in the literature. The latter type consists of the sorts of mixing of metaphor and metonymy that have been discussed under the headings of metaphor within metonymy, metonymy within metaphor, chaining of metaphor and metonymy, and so forth (Fass 1997; Goossens 1990; Kövecses 2002; Langlotz 2006; Ruiz de Mendoza Ibáñez and Díez Velasco 2002; Warren 2006). Those types of mixing involve some conceptual item A being linked to some item X, and X to B, where either the A-X link is metaphorical and the X-B link is metonymic, or vice versa.
Rather, the type of mixing or overlap addressed in this paper is where an item A is linked to an item B in a way that qualifies simultaneously as both metaphor and metonymy, but where the situation is not analysable as a chaining of an A-X link and an X-B link. This simultaneity is also to be distinguished from alternativity: the possibility of alternative interpretations of an utterance taking the A-B link to be just metaphorical or just metonymic.
Ritchie (2006: 156) says, in referring mainly to metaphor, that figurative use of language may itself constitute a field of meaning, with dimensions such as conceptuality, opaqueness, literalness, triteness, formality, folkishness among others, and Cameron (1999b) has provided nine dimensions on which metaphor can vary. Many dimensions have been important elsewhere in the literature, such as aptness, vividness, memorability, imageability, evaluativeness, persuasiveness, literariness, social divisiveness/cohesiveness, entrenchedness and cultural specificity. But with the probable exception of literalness, such dimensions arguably do not affect decisions as to whether metaphor or metonymy is involved. The present article is more concerned with dimensions which could be said to be genuinely constitutive of metaphor and metonymy.
Metaphoricity and metonymicity are, arguably, language-user-relative in a deep way. They are affected by such things as the particular lexicon, encyclopaedic knowledge, and interconceptual relationships held by a particular language user (whether utterer or understander). Thus, in principle, an expression should not be said to be metaphorical or metonymic in any absolute sense, but only for a particular user. Of course, in practice, many expressions may be metaphorical or metonymic for the vast majority of native users of a language, and the way in which expressions are metaphorical or metonymic may also be the same or similar across such users (e.g., involve the same conceptual metaphor such as LOVE AS JOURNEY, or the same metonymical schema such as CONTAINER FOR CONTENTS). Relativity has been pointed out by various other authors (Cameron 1999b; Dirven 2002; Geeraerts 2002; Norrick 1981; Pragglejaz Group 2007; Radden 2002; Radman 1997; Riemer 2002; Ritchie 2006; Ruiz de Mendoza Ibáñez 1999). We will not be exploring it in this article, but we need to distinguish it from the issue of the differentiation of metaphor and metonymy. User-relativity does not necessarily imply that metaphor and metonymy cannot in general be cleanly differentiated or that, for a given user, particular cases of metaphor and metonymy cannot be cleanly differentiated; and, conversely, a lack of a clean differentiation between the notions of metaphor and metonymy does not necessarily imply user-relativity.
Because of its aims, this paper does not rest upon any particular definition of metaphor or metonymy, but instead on other authors' claims for metaphoricity or metonymicity of examples used, or on the present author's judgments of how particular examples would be classified in the field. The paper does nevertheless embody a cognitive assumption, in viewing metaphor and metonymy as being largely to do with cognitive representation and processing issues as opposed to the surface form of utterances. (In the case of lexicalized metaphor or metonymy the representation and processing may have occurred in the past, and thus merely be part of etymological motivation). To the extent that metaphor and metonymy are matters of processing, the issues in this paper amount to questions such as the extent to which the processing creates or traverses similarity and/or contiguity links between conceptual items.
However, the word "link" will be meant in a very general and theory-neutral way. Links certainly include those between target aspects and source aspects that are proposed in an explicitly mapping-based account such as Conceptual Metaphor Theory, for instance when the participants in a love relationship are linked to occupants of a travelling car. But the notion is broader: for example, even though the sentence "My job is a jail" is analysed in the class-inclusion approach (Glucksberg 2001; Glucksberg and Keysar 1990) in terms of a category of entities that includes both the job and physical jails, we will still describe it as implicitly involving a link between the job and either a hypothetical physical jail or the general category or concept of physical jails, even if no direct link is established in the understander's mind or proposed in the theory. This way of talking is just in the service of having a uniform way of describing the fact that, in metaphor, at least one target item is explicitly or implicitly, and directly or highly indirectly, associated with at least one source item. Although the class-inclusion theory's attention is on the idea of class inclusion as such, our attention is on the nature of the implied link between target item and source item, e.g., between the job and the (hypothetical) physical jail, or between the job and the physical-jail category. This link could, for instance, be regarded as a similarity link based on possession of whatever property or set of properties is held to define the superordinate category under which both target item and source item are placed. In the job/jail example, the similarity might consist of both the job and a physical jail being constraining and difficult to escape. (Care is needed here to postulate suitably abstract notions of constraining and escape, in light of the circularity objections raised by Ritchie (2006) and others).
Also, a link can take a collapsed, degenerate form: namely, that of an imaginary identification. For example, if in a blend space (Fauconnier and Turner 1998; Turner and Fauconnier 1995, 2000, 2002) one entity is identified with another, that identification is itself a link in our sense.
And if the entities come from input mental spaces, the corresponding entities in those spaces can also be said to be linked indirectly with each other via the identification link in the blend space. Finally, to the extent that metaphor and metonymy are in part processing issues, we need to keep in mind the possibility that links might only be classifiable as metonymic or metaphorical by taking into account the way the links are used in processing rather than or as well as any other aspect of their nature.
The structure of the rest of the paper is as follows. In Section 2, we mainly consider whether a distinction between metaphor and metonymy can be found in a distinction between similarity and contiguity. In Section 3 we look at the extent to which source/target links are themselves kept as part of the messages conveyed by metaphorical or metonymic utterances. Section 4 briefly examines two further possible grounds for differentiation, namely the interaction with conceptual compartments such as domains and frames, and the role of imaginary identification/categorization of target items and/or source items. Section 4 also comments further on intermediacy and overlap of metaphor and metonymy, and briefly considers claims that (some) metaphor can be viewed as double metonymy. Section 5 concludes, and explains how it can be profitable to analyse utterances at the level of the dimensions of variation discussed in the article (similarity, etc.), as opposed to the higher level of metaphor and metonymy as such.
2. Similarity versus contiguity?
A traditional view has been that metaphor is a matter of similarity between source and target items, and metonymy a matter of contiguity between them (Dirven 2002; Feyaerts 2000; Jakobson 2002 [1956]; Lodge 1977; Norrick 1981; Ullman 1962). Haser (2005) provides a review. There is an enduring intuition that in metonymy the source and target item are related in some salient and easily accessed way, making the metaphor of spatial contiguity reasonable. In this sense, a composer is contiguous to his/her music, a time period to an important event occupying it, etc. Contiguity also includes, of course, more physical cases, as of a bottle being contiguous to its contents. Contiguity is meant to have no whiff of likeness.
But many authors have noted the slipperiness of the notions of similarity and contiguity (e.g., Chiappe 1998; Cooper 1986; Dirven 2002; Haser 2005; Riemer 2002). Norrick (1981: 27) says that the line between principles of similarity and of contiguity is at times fuzzy (although he does not go on to explore this fuzziness). The slipperiness of the notions compromises their ability to differentiate metaphor from metonymy.
In addition, metaphor can impose similarity rather than resting entirely on already noticed similarity (Black 1993 [1979]; Haser 2005; Indurkhya 1992). Consider an utterance that metaphorically casts a particular cloud as a camel, such as "The camel was high up in the sky." For speaker or hearer, the cloud might initially only have a slight visual similarity to a camel; but the act of using the metaphor causes speaker and/or hearer to view the cloud much more as a camel. The structure of a camel is imposed on the cloud, giving the latter a structure that would not, or could not as easily, have been discerned otherwise, and may be partly artificial. Effects can include the division of a part of the cloud into subparts in a non-obvious way, or conversely the agglomeration of two distinct parts of the cloud into one undifferentiated part from the point of view of camel shape. Analogously, viewing a marriage as a business may cause one to add a structure to marriage that one had not previously perceived.
But we will see that similarity and contiguity are not as distinct as is assumed even by previous critics of them as a basis for differentiation. The argument is in two parts. First we argue (in Section 2.1) that there appears to be no bar to viewing metaphorical linkage between source and target in metaphor as a special type of contiguity. Thus, if or when metaphorical linkage amounts to similarity, similarity is a special type of contiguity. Secondly, we argue a partial converse to this (in Sections 2.2 to 2.4): that certain familiar forms of contiguity involve similarity in an essential way, where moreover this similarity can sometimes be akin or even identical to the similarity underlying some metaphor. The discussion takes contiguity to include not just relations that have specifically been labelled as such but also other relations such as pragmatic functions (Barcelona 2002, 2004; Nunberg 1978). The sets of relations discussed under the heading of contiguity and those under the heading of pragmatic function are very similar. In effect the discussion regards any relationship that has been held to be metonymic as a possible type of contiguity.
Similarity can at one extreme be a matter of sharing some simple features and at the other a matter of a complex structural analogy. We also need to keep in mind a broad distinction in the ways in which things can be similar. A road can be similar to a snake in virtue of the shape of each, and one plan of action can be similar to another in virtue of their structure, similarity of individual steps, etc. These are examples of two things being intrinsically similar: similar because of their own natures, independently of relationships to other entities outside themselves. On the other hand, in the context of metaphorically casting an organization as a solar system, the term "planets" could describe major employees even though there is no (relevant) intrinsic similarity between a planet and an employee. Relative to the overall similarity between the organization and a solar system we can say that employees and planets are extrinsically similar: similar because of their relationships with other things taking part in an overall structural analogy. Of course, two things may be similar because of some mix of intrinsic and extrinsic similarity.
We will not assume that all metaphor is necessarily based on similarity. In particular, some metaphors have been held to be based on experiential correlations between source and target (Barcelona 2000b; Grady 1997; Kövecses 1990; Radden 2002). For example, an experienced correlation between seeing and knowing (in that seeing can lead to knowledge) may be at the root of a metaphorical view of KNOWING AS SEEING. Now, whether this metaphorical view, once created, is a matter of similarity is a contentious matter. For instance, in one such case (SADNESS IS DOWN), Barcelona (2000b) claims that the metaphor involves similarity, but Haser (2005: 44) throws doubt on this. Fortunately, we do not need to adjudicate the matter in this paper. The argument of Section 2.1 actually applies to correlation-based metaphorical links as well as to similarity links; and the arguments of Sections 2.2 to 2.4 are not affected in their nature or significance by the presence or otherwise of non-similarity-based metaphor. Hence, the remainder of the paper will refer both to similarity-based and to correlation-based metaphor, but will leave open the possibility that some or all of the latter is also within the former.
One move that might be attempted is simply to disallow similarity from the class of associations called contiguity (Feyaerts 2000, following Ullman 1962). Then, of course, one would get a crisp distinction between (similarity-based) metaphor and metonymy. However, such a move seems unprincipled and to be made just to save the distinction, as opposed to examining the phenomena to see what the useful distinctions are, if any. So, we will assume that the notion of contiguity does not in itself contain a stipulation against similarity.
2.1. Metaphorical linkage as contiguity
Basically, we ask why metaphorical links shouldn't themselves be regarded as contiguities. The fact that (at least) similarity links in metaphor are not normally regarded as contiguity links, or the fact that authors uncritically portray similarity and contiguity as being different types of relationship, is hardly a valid answer to the question, as there is no accepted definition of how broad contiguity is. The term "contiguity" is in itself highly metaphorical and susceptible to a wide range of interpretations, as has often been observed. It is thus perhaps surprising that the question of metaphorical linkage counting as contiguity has not been raised more often.
Consider the widespread phenomenon of referential metaphor. Typically, referential metaphor is said to occur when a definite noun phrase is used metaphorically to refer to some target item, as in (1), from Gibbs (1990):
(1) The creampuff didn't even show up.
A boxer in the context is being metaphorically viewed as a creampuff and is being referred to by the phrase "The creampuff". Another, more mundane and conventional, example would be "They have reached the third milestone on the project", using the phrase "the milestone" to refer metaphorically to an important, planned event in the project. Thus, assuming that underlying (1) there is some postulated similarity link between the boxer in question and a hypothetical creampuff (in the literal sense), we can use this link to achieve indirect reference to the boxer (target item) via direct reference to the creampuff (source item), just as we can use an alleged contiguity link in a metonymy to achieve indirect reference to a target item via a direct reference to a source item.
Referential metaphor can also use correlation-based metaphorical schemata. Consider the passage: "Susan sank into a pit of sadness. She stayed at the bottom for many months." Let us assume for the sake of argument that the passage is to be analysed using the SADNESS IS DOWN metaphor and that this is correlation based. Then in the second sentence the phrase "the bottom" is a referential metaphor for the worst phase of her sadness state.
So, is there anything about metaphorical links that should prevent us from regarding them as a special case of contiguity links, at least when they are being used in referential metaphor? One sharpened version of this question could be: If contiguity links in general are salient semantic or pragmatic associations or salient applications of pragmatic functions, is there anything about similarity or correlation-based links in referential metaphor that should prevent them from qualifying as contiguities along with other types of salient association/function?
Before going on we should dispose of one alternative to an assumption made a moment ago: the assumption that the phrase "The creampuff" in (1) refers to a hypothetical literal creampuff. One might argue instead that while "creampuff" in the noun phrase does refer to the category of literal creampuffs, there is no act of postulating a member of that category: rather, the noun phrase acts much as if it had been "The person who is, metaphorically speaking, a creampuff", using "a creampuff" purely predicatively. (This would be consistent with a class-inclusion account of metaphor).
However, we can still say that there is an (alleged) similarity between the boxer and (literal) creampuffs in general, or a similarity relationship between our concept of the particular boxer and the general concept of (literal) creampuffs. Our question would then become: Is there anything about this similarity that disqualifies it from being a type of contiguity? For simplicity, in the following we will stick to the assumption that the phrase "The creampuff" in (1) does refer to a hypothetical literal creampuff, on the understanding that the discussion could be adjusted to fit category-based accounts.
Another distinction to note before going on is that the issue of whether correlation-based metaphorical links can be regarded as contiguities is different from the issue of whether the original correlations themselves are so regarded. If Susan's state of sadness does not in fact cause any physical downness (e.g., drooping body), but is metaphorically cast as an imaginary physical downness, then the fact that some sadness states can cause, and therefore be contiguous to, downness does not force us to consider Susan's sadness and the imaginary downness to be contiguous. However, it is certainly possible to allow the notion of contiguity to encompass potential as opposed to actual causation, in which case the metaphorical link would be one of contiguity, whatever else it might be. However, we do not rely on this argument in the following.
Some grounds could be imagined for trying to maintain that metaphorical links, whether similarity-based or correlation-based, should not qualify as contiguity links. We will treat these in turn and argue against them. Naturally, there may be grounds beyond those considered here.
First, it might be claimed that metaphorical links are more a matter of (possibly culture-wide) mental imposition upon the world than are the contiguity links in generally recognized forms of metonymy. Or, to paraphrase, perhaps metaphorical links are much more in the mind whereas metonymic links are much more a case of reflecting what is objectively in the world. Thinking of a love relationship as a physical container is arguably more an imposed, mental view than regarding a physical container as being related to its physical contents.¹ However, it is difficult to sustain a rigid contrast in general. As Dirven (2002) says, contiguity is itself to some extent in the eye of the beholder, and Norrick (1981) takes a similar view. It is partly a mentally, socially and culturally constituted matter that, for instance, a particular group of people is the football team representing Finland, and yet the word "Finland" can metonymically refer to the group, as in "Finland lost the match." The situation is similar for many other types of metonymy where the source item plays some sort of social or political role with respect to the target item or vice versa, as in "Bush attacked Iraq", with Bush as source item and the USA or the US military as actual attacker.
The Representational metonymies to be discussed in Section 2.2 (where, say, a pictorial image in a painting is used to refer to the depicted object, or vice versa) involve a mentally imposed representational link; also, Goodman (1968) argues that there are conventional and stipulative aspects to the way that paintings, etc. do their representing. Finally, in the celebrated example of using the phrase "The ham sandwich" to refer to the restaurant customer who ordered the sandwich (Nunberg 1995), the act of ordering something in a restaurant only makes sense given a suitably constituted socioeconomic culture, one where certain discourse acts are regarded as constituting ordering.
1. Of course, we might claim that the structuring of the world into, say, containers and contents is itself a mental imposition, and not part of the objective, real world. If that is factored in, then the distinction being drawn is between the real, objective world as naively conceived to exist in common sense and what even common sense would concede is imposed on the world by minds.
Furthermore, we can argue that many metaphorical similarity relationships do exist in the world, or arise objectively from it, just as much as many metonymic relationships do. This is clearest when the metaphor rests on a complex structural analogy such as that between an army and a society of army ants (example from Goatly (1997: 163)) or that between a commercial company and a solar system. There is a sense in which the partial isomorphism of structure really exists. It is a mathematical aspect of the world that exists just as much as a simple, familiar mathematical object such as the number 9 does. And, the partial isomorphisms exist just as much as the link between, say, the date 11th September 2001 and certain terrorist events does. So, given that dates and events are used metonymically for each other, e.g., the (abbreviated, US-style) date 9/11 for some terrorist events, and given that their relationship is a contiguity, it seems artificial not to regard the above-mentioned analogical links as contiguities.
Secondly, perhaps contiguity and metaphorical linkage could be distinguished on grounds of structural correspondence (Barcelona (2004) provisionally suggests this). Perhaps similarity in metaphor involves a correspondence of some structure between source and target whereas contiguity does not: contiguity just relates two wholes that it leaves unanalysed (so that their intrinsic similarity or otherwise is not an issue) and whose relationships to other things are irrelevant (so that extrinsic similarity is not an issue either). Now, similarity-based metaphor can indeed be seen to involve at least some small amount of structural correspondence.
Even when the similarity consists only of the source and target items having one corresponding feature, such as perhaps some sort of weakness in the boxer/creampuff case, we can see a structural correspondence: the boxer corresponds to the creampuff, the weakness on the target side corresponds to the weakness on the source side, and the boxer having the former weakness property corresponds to the creampuff having the latter weakness property. But, the trouble is that some metonymy involves structural correspondence as well, as we will see in Sections 2.2 and 2.4, where the structural correspondence is at least as rich as the minimal sort found in the boxer/creampuff example of metaphor. As for correlation-based metaphor, if this does not (always) involve structural correspondence, then, given that metonymy does not usually involve structural correspondence, structural correspondence cannot be used as a metaphor/metonymy differentiator. If on the other hand (some) correlation-based metaphor does involve some structural correspondence, as claimed by Barcelona (2000b), then we are back to the point that so does some metonymy.
Thirdly, perhaps contiguity links should be restricted to associations that are conventional or firmly established (cf. discussion in Haser (2005: 22)). But even if this is correct it will not work to provide a distinction with similarity in metaphor, because most similarity in metaphor is highly conventional or firmly established, and the metaphorical links of correlation-based metaphor are also firmly established.
Finally, it might be claimed that in metaphorical similarity there is no real source-side entity corresponding to the target-side entities, whereas in metonymy there is. For example, metaphorically casting a person Richard as a lion does not involve a particular, real lion, whereas metonymically referring to some real artworks via an artist does involve the artist being real as well. However, some core types of metonymy are open to having merely hypothetical source items. In a conversation in a library about the location of books about certain topics, we can say "Santa Claus is on the top shelf", meaning that books about Santa Claus are there, just as we can say "Alexander the Great is on the top shelf" or "Car engines are on the top shelf" with analogous intent. Equally, we can say "Santa Claus is in the left-hand part of the picture", with "Santa Claus" being a case of Representational metonymy (Section 2.2). We can also similarly say "Lions are on the top shelf" and "There's a lion in the left-hand part of the picture" without assuming that any particular real lion is discussed or depicted. Conversely, a metaphor source item can be real, as in "Singapore is the Britain of the Far East" (example quoted by Wee (2006) and following a common pattern of using a well-known existing entity as a metaphor for another entity).
The conclusion so far is that there is nothing to stop us regarding the metaphorical links traversed in (at least) referential metaphor as special cases of contiguity.
However, our claim is not restricted to referential metaphor as normally conceived, i.e., as being about metaphorically used definite noun phrases, any more than metonymy is confined to definite noun phrases. Rather, whenever a metaphorical link is used for accessing something in the target via something in the source, irrespective of the surface linguistic forms involved, we can claim the link is being used as a type of contiguity just as much as we can in standard referential metaphor examples such as (1). For one thing, our discussion would not be essentially changed if (1) had used an indefinite noun phrase, as in "Some creampuffs didn't even show up". More distantly, consider the common use of the phrasal verb "eat up" to refer to commercial taking-over, as in a sentence of form "Company A tried to eat up Company B". We can say that there is a hypothetical act of physical eating-up that is conceptually
contiguous to the real taking-over, just as much as we can say that in the case of (1) there is a hypothetical creampuff that is conceptually contiguous to the real boxer.

2.2. Contiguity involving similarity, 1: Representational metonymy
In this subsection and the next (2.3) we will be arguing that two (salient) types of contiguity can be viewed as involving similarity. This point is a partial converse to that of the previous subsection (2.1). Representations (things that represent) and their representatees (the things they represent) are often used to stand for each other in metonymy. Thus we have REPRESENTATEE FOR REPRESENTATION and REPRESENTATION FOR REPRESENTATEE. These are both covered by Warren (2006), but with different names from ours. We will use the term Representational metonymy to cover both directions of the metonymy. Warren's examples of REPRESENTATEE FOR REPRESENTATION include "Ari painted a tanker" (quoted by Warren from Fass (1997); "a tanker" is the metonymic source phrase). Let us assume that in context the sentence means that Ari painted a picture of a tanker, or a picture of various things including a tanker. Here the source REPRESENTATEE is the (possibly imaginary) tanker and the target REPRESENTATION is either the picture as a whole or the image of the tanker in the picture. Other examples would be "There's a tanker in the left hand side of the picture" and "Tony Blair is on the left hand side of the photo", both of which make it more explicit that a physical representation is being indirectly referred to (by the phrases "a tanker" and "Tony Blair" respectively). But the REPRESENTATION need not be a visual representation of the REPRESENTATEE. For instance, it can be an acoustic representation, as in ". . . Knecht's symphony begins amid beautiful countryside, but, rather more rapidly than in [Beethoven's Pastoral Symphony] a storm approaches, breaks and fades away . . ." (emphasis added).[2] It is clear from the surrounding text that what is under discussion is an acoustic image of a represented storm.
(This is worth noting because the above quotation, taken by itself, bears a possible contrasting interpretation in which "storm" is a metaphorical description of non-representational musical events in the symphony). And the REPRESENTATION in a Representational metonymy need not be any type of perceptual representation, as it could be an idea in someone's mind and could concern something
2. From notes on the Pastoral symphony by Jan Smaczny for City of Birmingham Symphony Orchestra, England, April 2000.
non-perceptual, as in "Sally's disappointment was at the back of John's mind all day long". Here the phrase "Sally's disappointment" metonymically refers to some IDEA OF Sally's disappointment (which is itself being metaphorically viewed as a physical object in John's mind, which is metaphorically viewed as a container). An example of the reverse metonymic pattern, REPRESENTATION FOR REPRESENTATEE, is: "In Goldfinger Sean Connery saves the world from a nuclear disaster" (Warren 2006: 48), with actor Sean Connery himself or the moving images of him in the film as the REPRESENTATION and character James Bond as the REPRESENTATEE. It is of course the (fictional) person James Bond who saves the (fictional) world. Now, it is certainly true that even when the REPRESENTATION is a visual item or physical object (e.g., an image or an actor) and the REPRESENTATEE is a physical object, the REPRESENTATION need not bear any significant visual resemblance to the representatee. In the sentence "The town is on the left hand side of the map" the visual representation of the town could be just a small black dot, and thus have little resemblance to the actual appearance of the town. However, we are concerned henceforth with the prominent special case of Representational metonymies where the representation is indeed based, at least in part, on some substantial sort of intrinsic perceptual similarity, and will take the visual-similarity subcase of Representational metonymy as particularly salient. In situations involving photos, pictures and the like the representation relationship is normally based at least in part on visual similarity, though other factors such as convention and stipulation can also be present (cf. Goodman 1968). Naturally, the visual similarity often involves a high degree of structural correspondence.
Clearly, if we are to claim that metonymic links in general are contiguities, then we must admit that in similarity-based Representational metonymies the contiguity between source and target happens to take the form (at least in part) of similarity. Moreover, the similarity is central to the metonymy, not some incidental feature of it. These obvious points have not been given due weight in discussions of the metaphor-metonymy distinction. Perhaps this is because examples of Representational metonymy are rare in the metonymy literature, according to Warren (2006). This investigative rarity, however, does not do justice to the prevalence and ordinariness of the phenomenon. Warren (2006: 49) herself seeks to distinguish the type of similarity involved in similarity-based Representational metonymies from the type of resemblance involved in metaphor. She says that the former simply involves matching the REPRESENTATION with the REPRESENTATEE whereas in (many) metaphors the resemblance involves property selection and adaptation. These claims of Warren's are disputable. First, she does not explain what this matching in the metonymic case amounts to. Secondly, the visual similarity in the metonymic case will almost always involve significant property selection (except in the most photographic of depictions) and often some measure of adaptation (e.g., colours may be intensified, shapes simplified). Thirdly, she only says "many" metaphors. Fourthly, other authors have regarded the resemblance relation that visual depictions have to what they depict as usable in metaphor: for example, such resemblance is an example of one of Norrick's (1981) iconic semiotic principles, and this general semiotic principle induces a metaphoric principle in the special case of language.[3] Finally, it is difficult to see any fundamental difference between the types of visual resemblance possible in the metonymies and those possible in image-based metaphors (Lakoff 1993; Lakoff and Turner 1989). In image-based metaphors (also sometimes called resemblance metaphors, though this is too vague a term) two physical objects or images are put into a metaphorical relationship on the basis of their visual appearance (which can include motion), examples being "the road snaked through the desert", where the word "snaked" uses an image metaphor to describe the shape of the road, and "The rock that saved him was lathered and fringed with leaping strings of foam" (quoted by Goatly (1997: 271)), where the word "lathered" uses an image metaphor to describe the foamy rock and the words "leaping" and "strings" set up further image metaphors to describe the foam itself. It could be argued (and perhaps this is Warren's point) that in image metaphor one needs to understand just how the two things are similar in order to understand the utterance, whereas in Representational metonymy one can simply take it on trust that there is a similarity or some other sort of representational connection.
The claim would be that to understand, say, what it is for a road to snake we need to think about what a moving snake looks like; whereas to understand "There's a snake in the left hand side of the picture" we only need to know that there is some subimage or other in the picture that is intended to depict a snake. However, this difference is at most a matter of degree and particular circumstances, and does not support any sharp distinction between Representational metonymy and image-based metaphor. In the case of "Tony Blair is on the left hand side of the photo", could we be said to have properly understood the utterance without understanding something about how a representation in a photo could be similar to Tony Blair's actual appearance,
3. However, Norrick (1981) does not discuss Representational metonymy.
so that the understanding involves more than just realizing that some similarity exists? Conversely, in the case of "The road snaked through the desert" we do not, for most purposes, need to have more than a vague idea of the bendiness of the road. It is not clear that the visual similarity considered by the understander is more detailed than the visual similarity the understander needs to consider in the Blair photo case. In response one might claim that in the Blair photo case, the act of metonymy as such is a very bare one, consisting simply in assuming that there is some visual representation of Blair in the photo, and does not require knowledge of what Blair looks like; but understanders perform an additional, pragmatic inference that that representation is probably a shape that looks like Blair, and if they happen to know what Blair looks like they can flesh this looks-like relationship out. However, we could counter this with the parallel claim that in a sentence like "The road snaked across the desert", the act of metaphorical understanding is a very bare one, consisting simply in assuming that there is some visual similarity between the road and a snake, and does not require knowledge of what a snake looks like; and there is an additional pragmatic inference that that similarity is probably one of physical shape, and if understanders happen to know what a snake and its movements look like they can enrich the similarity. Ultimately, the point is that in both Representational metonymy and image-based metaphor we have a relationship of similarity; the degree to which that similarity is apprehended by the understander is a matter of how rich an understanding the understander comes to, irrespective of whether we have a case of metonymy or a case of metaphor.
Also, note that even if a town is represented by merely a small dot on a map, so that there is little or no intrinsic similarity, the spatial relationships within the map itself between the dot and other representations will be structurally analogous to the spatial relationships of the town to the representatees of those other representations. Thus, just as in metaphor, it is often the external associations of source and target items, rather than their internal structure, that are important in source/target similarity, so that the similarity is largely extrinsic to any given source/target link. In Representational metonymy the external associations are often an implicit part of the representational relationship between a particular REPRESENTATION and a particular REPRESENTATEE. Analysed in this way the map could be said to be a metaphorical representation of the geographical region in question, and the metaphorical links between town-denoting dots on the map to real towns are similarity links used as contiguities, on the lines of Section 2.1. It is just that we are now arguing that something we started off assuming was a contiguity link is also a similarity link like those used in a type of metaphor, rather than arguing as
we did in Section 2.1 that a similarity link in general can also be used as a contiguity link. It is not just that it is difficult to put a wedge between the general nature of similarity in Representational metonymy and that of similarity in image-based metaphor. Rather, we can even argue that just the same particular similarity can operate in both metonymy and metaphor. On the metonymy side, consider the following, where the label "2my" is short for "metonymy case of 2":
(2my) There's a snake on the left-hand side of the drawing.
referring to a wavy line in the drawing that is intended to depict a snake. On the metaphor side, consider the following, where "2or" is short for "metaphor case of 2":
(2or) There's a snake on the left-hand side of the drawing.
as a way simply of describing a line in a drawing, where the line is not intended to depict a snake (it might depict something unconnected with snakes). (2my) and (2or) are deliberately the same sentence, but involved in different utterance events and we use different labels for ease of reference. Suppose the snake lines in the two drawings are identical. Then both examples involve exactly the same type and degree of visual snake/line similarity, and it is likely that there is no other type of similarity in play (such as would arise if, say, in one of the examples the line were drawn in ink containing snake bile!). This raises the possibility that we should classify at least some Representational metonymy that relies on similarity as also simultaneously being metaphor. But we should also wonder whether the contiguity in Representational metonymy is more than just the similarity. We might argue that in Representational metonymy we also have the extra feature that the similarity is being used in a particular way, namely in an act of representation, whereas metaphorical source items do not represent corresponding target items. But there is a case for suggesting that metaphor itself does involve a representation relationship: that source-domain items in metaphor do represent corresponding target-domain items, in the mind of someone using the metaphor. For example, in (2or) an imagined snake could be said to represent the line. Or, the person's idea of a snake could be said to represent the line (in the special source-to-target sense envisaged), as well as representing a snake in the ordinary way. It all depends on what one means by the word "representation". Whatever the relative merits of these arguments as to whether Representational metonymy has a feature beyond similarity that metaphor does
not have, the point holds that a difference between metaphor and metonymy cannot solely be based on metaphorical and metonymic links always differing as to whether similarity is involved or even on what forms of similarity they involve.

2.3. Contiguity involving similarity, 2: Partitive metonymy
We now turn to a very different type of contiguity, namely the contiguity involved in WHOLE FOR PART and PART FOR WHOLE metonymy. We will use the term Partitive contiguity or metonymy to cover both directions. We will see another important way in which contiguity can involve similarity. Similarity will be highly relevant to the way that some Partitive metonymy works, and in these cases we will say that the PART and WHOLE are relevantly Partitively similar or that the PART is relevantly whole-similar for short. Relevant Partitive similarity arises when the WHOLE and the PART to some extent share particular features that are important in the motivation for the metonymy. This applies, for example, in the traditional metonymic use of "hand" to mean a sailor. Important (traditional) functions of sailors, such as grasping a rope, are performed partly by their hands. So, there is a sense in which a sailor and his/her hands are functionally similar to some degree (of course, what the hands do in grasping is only part of what the whole body does in grasping). It is precisely this partial function sharing (hence, partial functional similarity) that motivates the metonymy. But we can go further: to a degree, the whole person has the function in question because of having a part that has that function or an approximation to it. The parthood is central to the similarity, and the similarity is central to the significance (in context) of the parthood. We are not concluding that, because some sort of similarity is involved, metaphor is therefore involved. This would prejudge the question of whether metaphor restricts the type of similarity, or involves more than just similarity. However, it is worth mentioning here the "like" test for distinguishing whether two items A and B have a metonymic or metaphorical relationship (Gibbs 1999). The core intent of the test is presumably to test for (metaphor-supporting) similarity.
According to the test, if saying "A is like B" is appropriate in the context then we have a case of metaphor. But it might be thought odd to say that "Sailors are like their hands" or "The hands are like the sailor", even in a context where sailors' use of their hands is salient. However, if anything this shows the dubious validity of the test for our current purposes. The conditions under which we might judge any given form of words like "Sailors are like their hands" to be appropriate will be affected by many factors. Also, the test is unfair
in using an impoverished likeness sentence: we should really be assessing a sentence like "Sailors are like their hands in that they have functions such as that of grasping rope in common". Surely this sentence is appropriate and true in the relevant context. In short, the suspicion is that the test picks up on, if anything, just default preconceptions of likeness rather than the more specialized, less obvious but nevertheless relevant and technically important forms arising in specific contexts. We will look at further examples of relevant Partitive similarity. The shared features in the examples happen to be ones of function or appearance, but the phenomenon could apply much more widely. The central observation about our examples will be that matters of appearance or function will be what the utterance is getting at, and the PART will be similar to the WHOLE precisely in appearance or function respectively. We will also continue to see in the examples that the WHOLE has the appearance or function it has partly because the PART has it, so the similarity arises in part from the parthood; and (in the PART FOR WHOLE direction) the particular part is chosen in the metonymy because of the similarity. It is common for one important contribution, or even the main contribution, to the appearance of something to come from a certain type of part of the thing, notably the outer surface of the whole thing, or the outer surface of an especially salient part of the thing. Warren (2006: 42) mentions the metonymic PART-FOR-WHOLE use of the word "palefaces" by Native Americans (at one time) to refer to white people. So consider a sentence such as "We run away when we see palefaces", uttered perhaps in a 1950s cowboy film.
The person's face is relevantly similar, in appearance, to the person as a whole (or more precisely to their skin as a whole): you can normally tell someone is a white person by looking just at their face, and the fact that someone is white is highly relevant to the understanding of the sentence. Also, the very motivation for the particular type of metonymy in question is precisely the similarity of appearance of face to whole skin and then to the whole person. Consider now the following example (mentioned by Warren (2006: 43), but of course an example of a common way of speaking):
(3) Everyone who wants a roof should have one.
Although the phrase "a roof" could be referring literally just to roofs, it is more likely to be metonymically referring to roofed dwellings. Part of the function of a dwelling is to shelter the occupants, and an important aspect of that function is provided by the roof. Assuming the sheltering function is relevant to the understanding of the sentence in context, we see that roofs and dwellings are relevantly Partitively similar.
The above examples are about parthood within physical objects, but relevant Partitive similarity is not confined to these. A more abstract example is "The meal was enjoyable" when the phrase "the meal" refers not just to the food served and the eating of it but the whole occasion, including conversation, etc. (This metonymy is similar in style to the metonymy that Norrick (1981: 93-94) discusses of "[to] cook" referring to the whole meal-preparation process). The meal in the narrow sense is a PART of the whole occasion. They share the function of providing food to the eaters; the occasion has that function because (in part) the narrow meal does so; and the narrow meal is by default an important aspect of the enjoyability. The "hand" (for sailor), "palefaces", "roof", and "meal" examples are all PART-FOR-WHOLE, but the point works for WHOLE-FOR-PART as well. Consider the sentence "She has a good head" (Warren 2006: 44). Warren mentions an interpretation in which the head links metonymically to the person's intelligence viewed as a PART of the head. Clearly, the head's function of engaging in intelligent thought comes directly from the PART in this analysis. Some researchers might object that what is happening in WHOLE FOR PART cases is merely zone activation rather than metonymy; see, e.g., Croft and Cruse (2004), following Paradis (2004). There is no room to argue against this stance here, but we do not need to rely on the WHOLE FOR PART direction anyway: the above consideration of PART FOR WHOLE is enough for our purposes. The similarity in Partitive metonymy can be like that used in some metaphor. It is difficult to drive a wedge between appearance-based relevant Partitive similarity and appearance-based similarity in image-based metaphor. As regards function-based relevant Partitive similarity, note that functional similarity is key in some metaphors, as in brain-as-computer or vice versa, insofar as both entities are viewed as having problem-solving as one function.
Finally, we return to the issue addressed in Section 2.1 of whether metaphor and metonymy can be clearly divided as regards the degree of subjectivity and mental imposition. While it is plausible that metaphor generally has these qualities to a higher degree, it is by no means clear that the difference across the board is enough for a clear differentiation. In particular, parthood can be subjective and mentally imposed, especially when the entities are at least somewhat abstract. For instance, in the meal/occasion case, the dividing line between the whole occasion and the meal as a sub-occasion of eating is fuzzy and subjective. Are drinks included within the (narrow) meal? Are snacks, coffee etc. away from the main table included within it? Also, food might be available buffet-style on a table for guests to get and take to other positions whenever they want,
so that there are no spatial or temporal boundaries to the narrow meal at all, and yet it is still appropriate to use the word meal to refer metonymically to the whole occasion (and the whole occasion is still more than just a narrow meal aspect). In such cases the very idea that the occasion has a part constituting a narrow meal is a mental imposition of structure, quite aside from the question of what the exact boundaries of that part are.
2.4. Contiguity involving similarity, 3: Other cases
Other types of metonymy may involve similarity in an essential way. Panther (2006) discusses phrases like "a Pearl Harbour" where a proper name is used as if it were a common noun. He analyses this particular example as involving, first, a metonymic step to an event, and then a metonymic step from that specific event to other events of the same kind. Panther (2006: 181 at note 24) says that this second metonymic step goes from the specific event to events that are, as he says, "like the original one in significant respects". Thus, the essential nature of the metonymic step is to traverse a similarity link. There is even some appreciable similarity in the case of a sports team or a single athlete representing a country. Such representation only makes sense in the context of more than one country (or other social, political or geographical unit) being represented. So, there is a one-to-one correspondence between some countries and their athletes/teams. This can by itself be viewed as a minimal type of structural correspondence (hence similarity), minimal in the sense that correspondence of relationships is not yet involved. But we can enrich the correspondence by taking competitive relationships between teams/athletes to correspond to competitive relationships between countries. The structural correspondence is extrinsic to any one country and its team/athlete, but this is not a problem as the similarity in metaphor generally can also be extrinsic relative to individual items involved (cf. the solar-system/organization similarity in the preface of Section 2). In addition, in the case of a team as opposed to a single athlete, there is additional intrinsic similarity that conceivably contributes to the metonymy. The team is similar to the country in being composed of people who work cooperatively for the good of the team (and hence also for the country), just as the people in a country work cooperatively in some measure for the good of the country.
Part of the very reason the team is viewed as representing the country may be that it has the features just mentioned, though this deserves further discussion.
3. Source/target links as part of the message (link survival)
Warren (1999, 2002, 2006) asserted versions of the claim that, in a metonymy, the source/target link itself is kept as part of the message of the utterance. Similar or strongly related views have also been expressed by Croft (2006), Dirven (2002), Haser (2005), Panther (2006) and Radden and Kövecses (1999). As an example of the phenomenon, when people understand the sentence "Finland lost the [football] match" they surely construct semantic mental representations in which the football team in question is indeed identified as being the team associated with the country Finland; the mental identification of the team is not (or at least not solely) done by some other means. Thus, the metonymic link between Finland and the team is preserved as part of the representation of sentence meaning: the role that the target item plays in relation to the source item is an important part of the message, not just a processing route to determining the message. The Finland-to-team link is not (just) used to determine a mental list of player names, say. On the other hand, inclusion of links within the message supposedly does not happen in metaphor. Dirven (2002) claims this, in effect; and Warren (2006: 15) even says that metaphor involves the annihilation of the source by the target. Consider the sentence "They have reached the third milestone on the project". Even if understanding of this is obtained on-line by considering a hypothetical physical milestone or the physical-milestone category, it is reasonable to suppose that there is no need to have, within the resulting representation for the figurative meaning of the sentence, a record of the fact that the plan component in question is linked to that hypothetical milestone or milestone category. The claim that, in metonymy, the source-target link (at least often) survives into sentence meaning is appealing. Indeed, in many cases of metonymy there is no way of specifying the target item other than by reference to the source.
Someone can utter or understand "Finland lost the match" without having any knowledge of the players' names, or any way of referring mentally to the team other than by some analogue of the description "the football team of Finland". And even when there is some other readily available way of mentally referring to the target, part of the point of the sentence would often be lost if the explicit link to the source were thrown away. However, it remains to be seen how universally the source-target link needs to survive. For instance, consider the sentence "John has brains" where "brains" is being used metonymically to refer to intelligence (cf. Haser 2005: 46). As Haser indicates, it is perfectly adequate for the mental representation of the meaning to state simply that John has intelligence, with no reference to brains.
And, on the other hand, we now argue that many uses of metaphor do appear to keep links to source items. (See also Croft (2006), and the notion of knowledge by metaphorical character in Stern (2000)). Part of the point of a poetic or other literary work is often the way in which information is expressed, and in particular the metaphors used. For example, in Shakespeare's play As You Like It the character Jaques says "All the world's a stage . . ." and then lengthily elaborates this idea. Part of the message is the comparison of world to theatre (especially in view of the dramatic irony that Jaques's real world is already theatre for us), not just the information about the world in itself that we may get as a result of comparing world to theatre, and then forgetting about the link to the theatre. More mundanely, many names for things have a metaphorical quality, and it is plausible that use of the names involves remembering the links to source items. An example is the name "army ant". The reason army ants are so-called is a rich behavioral similarity to soldiers and other army units (see the popular science exposition quoted by Goatly 1997: 163), and at least in the case of someone first learning about them it is difficult to believe that reference back to the behavior of real army units is not active in the person's mind as an important part of the conception of the ants. Again, consider someone using the phrase "the camel" to refer to a cloud that looks like a camel, and saying "The camel has broken its neck" to describe the cloud coming apart at the place of its so-called neck. Now, Indurkhya (1992) uses the case of a similarity between a camel and a cloud as an example of how structure from a metaphor source (the camel) can be imposed on a target (the cloud), rather than residing intrinsically in the target. (See comments above, in the preface of Section 2).
Thus, the identification of some part of the cloud as corresponding to the neck may be ineliminably dependent on the comparison to the camel, and indeed the head and the rest of the body may only be vaguely related to the subshapes in the cloud, so that we cannot necessarily just regard the neck part as the place where the head part joins the rest-of-body part, because those parts may not themselves be clearly specified. Under these conditions, consider what the understander's representation of the sequence of events has been. How is the place of the breakage to be internally represented? We have to assume either that (a) he/she has kept a detailed, entirely spatial representation of the original cloud, and, after somehow picking out a subregion N of it as the referent of "its neck" by considering the shape of camels, remembers only the spatial characterization of N as the internal representation of where the breakage was, or that (b) he/she refers mentally (whether consciously or not) to the supposed neck part
by some representation that could be glossed as "the cloud part, whatever it is, corresponding to the camel's neck" or "the cloud part between the head part and the body part" (perhaps together with a representation of roughly where that part is). Consider also: "There's a camel in the sky", with the word "There's" interpreted existentially (as opposed to deictically), and in a context where it is clear that a cloud is being mentioned. Especially if the understander is not looking at the sky, there is now no specific cloud to compare to a camel, so unless the understander arbitrarily imagines a particular cloud shape, the most natural suggestion is that he/she mentally describes the cloud as having a shape similar to that of a camel. This point transfers directly to metaphor example (2or) in Section 2.2 in a context where the understander has not seen the drawing and so has no specific line shape to describe unless he/she mentally invents one. Thus, the link is likely to survive into the message of (2or) under certain conditions. Equally, the mental representation of the shape alluded to by metonymy example (2my) is presumably something like "a shape representing a snake". So, for both (2my) and (2or), having links as part of the message is potentially equally important, depending on discourse context. Although the cloud/camel example and line/snake example (2or) are cases of image-based metaphor, the point they make is not specific to such metaphor. In particular, in any metaphor where source structure is imposed on the target, aspects of the target may best be mentally identified via corresponding aspects of the source. By contrast, there is no particular reason to insist that, with unreflective use of unremarkable metaphorical phraseology such as "milestones" in everyday discourse, we should take source/target links (even if used during understanding) to be part of the message.
4. Additional discussion
4.1. Two other possible differences between metaphor and metonymy
We consider first the idea that a metonymic step stays within a single conceptual "compartment" of some sort (a domain, Idealized Cognitive Model, frame, etc.) whereas metaphor crosses between different compartments. We concentrate here on compartmentalizations that are "static" in the sense of not relying on decisions made about the interpretation of the very utterances under consideration, so that when A and B are alluded to in an utterance, the question of whether A and B are in the same compartment or not cannot be varied by the utterance itself.
The idea of using a static compartmentalization to distinguish metonymy and metaphor has major problems. For instance, a country can not only be used metonymically to stand for its team, but a country and a football team (even its own actual team) can be put into metaphorical relationship to each other, either way round. The captain can be likened to a national leader, other roles within the team can be likened to roles within society, manipulations of the ball can be likened to interactions within society, etc. Either a country and its team are in the same compartment or they are not, and, in either case, either metaphor or metonymy can link the country and team in an utterance. The particular utterance is not allowed to help define whether the country and team are in the same compartment or not.
Utterances (2my) and (2or) supply another example, especially if we assume that in each situation the same line is present, and in each situation the snake is the same imagined, prototypical snake. Utterance (2my) is metonymic and (2or) is metaphorical, but they involve exactly the same interaction with any static compartmentalization. Even if different lines and imagined snakes were allowed in the two situations it would be difficult to find a principled way in which the line and snake could be in the same compartment for (2my) and different compartments for (2or).
A variety of other authors have found problems with compartment-based differentiations between metaphor and metonymy, including Barcelona (2002), Cameron (1999a), Croft (2002), Feyaerts (2000), Haser (2005), Kittay (1989), Moore (2006), Panther (2006) and Peirsman and Geeraerts (2006). However, some authors criticize one proposed compartment-based distinction only to introduce a proposal that itself has flaws. As just one example, Moore (2006) claims that metonymy operates within a frame whereas metaphor crosses between frames. He says that the days of the week form a frame.
But days can be used metaphorically for each other, as in the following part of a blog posting, created on a Monday: "Sunday felt like Monday, so in honor of it [i.e., today] being honorary Tuesday, I am doing two minis [mini-blogs] today . . ."⁴
Of course, if one were allowed to propose a compartmentalization that depended on decisions about metonymy and metaphoricity of utterances (e.g., one put a snake and a line in the same compartment because of a metonymic utterance like (2my)), then the compartmentalization would not be of much use in defining the difference between metaphor and metonymy. But one general problem even about static compartmentalizations is that rarely if ever is any clear constraint placed on what one is allowed to propose as being packaged into a compartment, so that for any particular set of examples one can often suggest a static compartmentalization that makes the postulated metaphorical links cross between compartments and the postulated metonymic links not do so. Haser (2005: Ch. 2) makes similar points.
4. http://ladynicole.blogspot.com/2005_08_01_archive.html (accessed 3rd July 2008).
We now turn to another possible ground for differentiating metaphor and metonymy, namely imaginary identification or categorization. As a feature of metaphor this has appeared in various forms and under a variety of names, for example in blending theory (Fauconnier and Turner 1998; Turner and Fauconnier 1995, 2000, 2002), in that corresponding target and source items become identified as a single item in a metaphorical blend, and somewhat similarly in the ATT-Meta approach (Barnden 2001, 2006; Barnden et al. 2004). It has also been suggested by Warren (2002). The idea is that in the course of understanding a metaphorical utterance the understander imagines that the target item and source item are the same thing (e.g., imagines Richard to be a specific hypothetical lion) or imagines the target item to be in the source item when this is construed as a category (e.g., imagines Richard to be in the physical lion category).
Not only does imaginary identification or categorization fail to be a generally accepted tenet about metaphor, but also it is possible to construe metonymy as involving it. Turner and Fauconnier (2000, 2002) and Fauconnier (2009) apply imaginary identification to some metonymically linked items, as well as to metaphorically linked ones: for example, in Turner and Fauconnier (2000, 2002) a printing press and a newspaper company that are metonymically related to each other become one item in a blend.
Fauconnier (2009) reaffirms the point from Turner and Fauconnier (2002) that in their blending treatment of anger as heat, the heat, the anger and the bodily reactions correlated with anger (and thereby metonymically related to anger) become identified as one element in the blend. Thus, even if imaginary identification were established as essential in metaphor it would not uncontroversially distinguish it from metonymy.
4.2. Overlap, intermediacy, and combinations of differences
Our arguments do not prevent some combination of the discussed dimensions, rather than some single dimension, from serving to distinguish metaphor and metonymy. But it is going to be very difficult to come up with such a combination. The snake/line examples (2my) and (2or) from Section 2.2 are similarly if not identically situated on the dimensions of (a) compartmentalization, (b) similarity, (c) contiguity, (d) structural correspondence, (e) link survival, and (f) source-item hypotheticality. As for (a): the two examples interact identically with any static compartmentalization, under the assumptions of Section 4.1. As for (b) and (c), in metaphorical (2or) the link is not only a similarity link but can be regarded as a metonymically-used contiguity link on the lines of Section 2.1. The contiguity, while different from that in (2my), is similar to it in many respects, e.g., the extent to which it is mentally imposed. For (d): both examples involve the same type and degree of visual similarity, and therefore of structural correspondence. For (e): link survival into the message may need to happen just as much for metaphorical (2or) as for metonymic (2my), as we saw at the end of Section 3. For (f): example (2or) involves a hypothetical snake and (2my) may well do so. It is possible also that the question of imaginary identification/categorization would not distinguish (2my) and (2or), given the discussion in Section 4.1.
Even if there is exact co-positioning of two different utterances in the space (ensuring an overlap of the metaphor and metonymy regions in the space), it does not necessarily follow that there is overlap in the stronger sense defined in the Introduction: namely that there is a phenomenon that at one and the same time qualifies fully as being both metaphor and metonymy (in a sense different from chaining or metaphtonymy, as mentioned in the Introduction). This sort of overlap might fail to occur because there might be constitutive dimensions that we have not considered. However, the arguments in Section 2.1 do suggest that we should take seriously the idea that referential metaphor is always also a type of metonymy, and the arguments in Section 2.2 suggest that similarity-based Representational metonymy is (often or perhaps always) also a type of metaphor.
Of course, the reason for the snake/line link in (2my) being metonymic can be construed as being different from the reason for its being metaphorical in (2or). The metonymicity could reside just in the fact that the link is crossed for the purpose of achieving the appropriate mental reference, rather than in the intrinsic nature of the link; and the metaphoricity could reside in the use of the link as part of likening one thing to another and/or imaginatively identifying one thing with another. This would make metonymy and metaphor be, by definition, different properties of the use of links, no matter how much these properties coincide in application and no matter what the links are like intrinsically. In this view, the overlap of metaphor and metonymy in the use of a given link is a mixing of two different phenomena, but it is a different sort of mixing from the chaining of two links, one metonymic and the other metaphorical.
The situation as regards intermediacy is different. Recall from the Introduction that intermediacy is the issue of whether there are phenomena
that have some of the qualities of both metaphor and metonymy but are not classified as either. Notice first that intermediateness and overlap do not imply each other, and are strongly contrasting concepts. Two types of thing can overlap without there being anything outside both types that is nevertheless close enough to both to qualify as intermediate between them; and two non-overlapping types of thing can have things between them. Prime numbers and even numbers overlap on the number 2, but this does not imply there is anything intermediate between evenness and primeness; and a 30-year-old is intermediate between being a teenager and a middle-aged person whereas of course teenagers and middle-aged people have no overlap. However, the question of intermediacy versus overlap can be more complicated. The fringes of two categories might overlap and we could say that items within this overlap are intermediate between more central parts of the categories.
Now, although each individual dimension that we have discussed has some similarity to a spectrum or continuum as proposed by other researchers, and the word "spectrum" or "continuum" suggests possibilities intermediate between metaphor and metonymy, our arguments do not necessarily imply intermediate phenomena. This article leaves the issue to future investigation. If we could show that on at least one constitutive dimension there were phenomena intermediate between metaphor and metonymy then the phenomena would necessarily be intermediate in the whole space. The intermediacy on the one dimension would be enough to stop the phenomena qualifying fully as either metaphor or metonymy. But in principle at least there could be phenomena that are intermediate in the whole space but that are not intermediate between them on any one dimension.
Indeed, metaphor and metonymy might use largely the same portions of every dimension separately and yet still be completely separate in the whole space and allow intermediate possibilities there, because of complex positioning within the whole space.⁵ The way in which the dimensions enrich and sharpen the analysis of overlap and intermediacy between metaphor and metonymy is one way in which it is fruitful to consider the dimensions as opposed to confining discussion to the broader concepts of metaphor and metonymy.
5. This could happen much as with shapes in geometric space. Consider an upright square in 2D space. Split the square on a diagonal and separate the two resulting triangles a little. The points within both triangles involve roughly the same range of horizontal positions and the same range of vertical positions. Yet the triangles do not overlap at all and there are intermediate points.
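The geometric point made in this footnote can be checked mechanically. The following Python sketch is only an illustration under assumed values: the side length `SIDE`, the separation `GAP`, the integer grid, and all function names are hypothetical choices made here, not part of the article. It splits an upright square along its diagonal, pushes the two triangles slightly apart, and then compares their coordinate ranges and tests for shared points.

```python
# Illustrative sketch only: SIDE, GAP and all function names are assumptions.
SIDE = 100  # side of the upright square, in arbitrary integer units
GAP = 5     # how far each triangle is pushed away from the diagonal


def in_lower(x, y):
    """True if (x, y) lies in the below-diagonal triangle, shifted by (+GAP, -GAP)."""
    u, v = x - GAP, y + GAP  # undo the shift to recover original square coordinates
    return 0 <= u <= SIDE and 0 <= v <= SIDE and v <= u


def in_upper(x, y):
    """True if (x, y) lies in the above-diagonal triangle, shifted by (-GAP, +GAP)."""
    u, v = x + GAP, y - GAP  # undo the shift to recover original square coordinates
    return 0 <= u <= SIDE and 0 <= v <= SIDE and v >= u


def coordinate_ranges(member):
    """Return ((min_x, max_x), (min_y, max_y)) over integer grid points in a region."""
    grid = range(-GAP, SIDE + GAP + 1)
    points = [(x, y) for x in grid for y in grid if member(x, y)]
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    return (min(xs), max(xs)), (min(ys), max(ys))


def regions_overlap():
    """True if any integer grid point belongs to both triangles."""
    grid = range(-GAP, SIDE + GAP + 1)
    return any(in_lower(x, y) and in_upper(x, y) for x in grid for y in grid)


print(coordinate_ranges(in_lower))
print(coordinate_ranges(in_upper))
print(regions_overlap())
print(in_lower(50, 50), in_upper(50, 50))
```

With these assumed values, the two triangles share the horizontal span from 5 to 95 and the vertical span from 5 to 95, yet no grid point lies in both, and the centre point (50, 50) lies in neither while sitting between points of each. That is the pattern the footnote describes: near-identical use of each dimension taken separately, with separation and intermediate positions appearing only in the space as a whole.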
4.3. (Some) metaphor as double metonymy
One type of theoretical move that considerably muddies the figurative water is to claim that metaphor in general, or some type of metaphor, is actually composed of metonymy in some way. For instance, the analyses of Riemer (2001, 2002) and Barcelona (2000b) indicate that any metaphor whose meaning is construed as transferring features identically from source to target can be viewed as metonymy, because those features can be viewed as metonymically linked to both the source item and the target item. (See also Haser 2005: 25.) In this vein Riemer mentions the view of Group μ (1981) that metaphor is double synecdoche (therefore double metonymy). That is, with some deviation from Group μ's own terminology, there is a metonymy from the source concept to a feature also possessed by the target concept, and from that feature to the target concept. Alternatively, this can be viewed as a metonymy from the source concept up to a shared superordinate category (the category of things that possess the feature in question) and then down to the target concept.
A theoretician is free to regard metaphor as composed of metonymy in some such way, but there is a sense in which it does not really matter: it is a labelling move that leaves undisturbed the important point, namely that certain types of language involve certain types of conceptual link, or collections of links, where individual links may have qualities of similarity, contiguity or both, or may have other qualities; where the assembly of links used may be some strange mixture of types; and where different instances of language can structurally and procedurally arrange the links involved in a large number of different ways. Notice also that casting metaphor as metonymy does not of itself mean that the importance of, say, similarity in metaphor is downgraded.
For example, in a double-metonymy view we still have the issue of what particular shared attribute (or set of attributes) or covering abstraction is being used, i.e., what the similarity is. It is just that this similarity is being theoretically structured as being via metonymies involving shared properties or via a covering abstraction. And of course we cannot couple just any two metonymies together to get a metaphor.
5. Conclusions
We have seen that several ways in which metaphor and metonymy have been thought to differ do not work, because the alleged differences are fuzzy or are at best only general tendencies. The fuzziness and slipperiness of the differences is even greater than some authors have already considered them to be. In particular, we have argued that distinctions between
metaphor and metonymy based on the following issues fail: contiguity versus similarity; source/target links surviving as part of the message; interaction with conceptual compartments; and (in passing) structural correspondence and hypotheticality of source items. While the author's own ATT-Meta theory of metaphor involves imaginary identification, as does blending theory, it is possible that metonymy involves it as well. Not only do the mentioned dimensions not serve individually to distinguish metaphor from metonymy, we have seen evidence that no combination of them does, because clear cases of metaphor and metonymy can be at the same position in the multidimensional space; and even that sometimes the use of a source/target link can be simultaneously metonymic and metaphorical. Some of our arguments echo points made by other authors, but we have added qualitatively new evidence and critique. Of course, the arguments do not bear upon the distinguishing power of other possible dimensions.
Another question is whether our arguments bear against the possibility that there are (proto)typical forms of metaphoricity and metonymicity that can be cleanly distinguished. Haser (2005) argues against even this being possible on the differentiating grounds that have been put forward in Cognitive Linguistics, but the present article leaves the possibility open, partly because it is by no means clear what counts as (proto)typical.
In that metaphor and metonymy involve fuzzily defined ranges of complex combinations of contiguity, similarity, link survival, etc., it is helpful in the interests of more precise, richer, deeper and more liberated analysis to disentangle these properties from each other, even though the individual notions of contiguity, similarity, link survival, etc. are themselves fuzzily defined.
We have argued that metaphor and metonymy can each involve types of contiguity and similarity, thus violating tacit, simplistic assumptions that these properties are opposed to each other. By arguing that some metonymy involves similarity in an essential way we encourage attention on investigating just what are the forms and extents of similarity that appear in metonymy and metaphor. Again, instead of seeking a way of firmly differentiating metaphor and metonymy through their interaction with conceptual compartments such as domains and frames, we can concentrate on neutrally examining the ways in which metonymy and metaphor stay within or cross between compartments in particular regimes of compartments.
We have also seen that the dimensions enrich the analysis of overlap and intermediacy between metaphor and metonymy. In particular, we advance beyond single-spectrum views and bring to light new possibilities for intermediacy. The question of whether more distant regions of the multidimensional space are interesting also arises.
Thus, a major conclusion from the discussion in this article is that instead of worrying about whether some utterance is metaphorical or metonymic, or even about how far along a literal/metonymic/metaphorical continuum it is, we should often be asking instead: What degree and type of similarity does it involve, if any? What sort of contiguity does it involve, if any? Does it involve link survival? Is the source item hypothetical, and in what way? Is there any imaginary identification? And so forth. Considering the dimensions in themselves helps to free us from a mindset that seeks clear-cut differences between metaphor and metonymy when these may not exist.
The most radical form this conclusion might take is the eliminativist possibility that the words "metaphor" and "metonymy" are just pragmatically useful labels in approximate discussions, not legitimate foci for detailed technical attention. Ritchie (2006: 11) says that "Metaphor, and figurative language generally, is but a convenient way of identifying and discussing a widely-recognized but fuzzily defined subset of [certain interpretive connections he discusses]." Fauconnier (2009) says that metaphor, metonymy, etc. elude rigorous definition and that these categories do not provide deep insight; that insight comes from looking at the detailed underlying cognitive operations involved, such as blending, and the way they are combined. However, irrespective of whether such eliminativist suggestions are correct, the points made above about the usefulness of the dimensional analysis hold good.
Received 8 July 2008
Revision received 14 May 2009
University of Birmingham
References
Barcelona, Antonio (ed.). 2000a. Metaphor and metonymy at the crossroads: A cognitive perspective. Berlin & New York: Mouton de Gruyter.
Barcelona, Antonio. 2000b. On the plausibility of claiming a metonymic motivation for conceptual metaphor. In Antonio Barcelona (ed.), Metaphor and metonymy at the crossroads: A cognitive perspective, 31–58. Berlin & New York: Mouton de Gruyter.
Barcelona, Antonio. 2002. Clarifying and applying the notions of metaphor and metonymy within cognitive linguistics: An update. In René Dirven & Ralf Pörings (eds.), Metaphor and metonymy in comparison and contrast, 207–277. Berlin & New York: Mouton de Gruyter.
Barcelona, Antonio. 2004. Metonymy behind grammar: The motivation of the seemingly "irregular" grammatical behavior of English paragon names. In Günter Radden & Klaus-Uwe Panther (eds.), Studies in linguistic motivation, 357–374. Berlin & New York: Mouton de Gruyter.
Barnden, John A. 2001. Uncertainty and conflict handling in the ATT-Meta context-based system for metaphorical reasoning. In Varol Akman, Paolo Bouquet, Richmond Thomason & Roger A. Young (eds.), Modeling and using context: Third International and Interdisciplinary Conference (Lecture Notes in Artificial Intelligence 2116), 15–29. Berlin: Springer.
Barnden, John A. 2006. Artificial intelligence, figurative language and cognitive linguistics. In Gitte Kristiansen, Michel Achard, René Dirven & Francisco J. Ruiz de Mendoza Ibáñez (eds.), Cognitive linguistics: Current applications and future perspectives, 431–459. Berlin & New York: Mouton de Gruyter.
Barnden, John A., Sheila R. Glasbey, Mark G. Lee & Alan M. Wallington. 2004. Varieties and directions of inter-domain influence in metaphor. Metaphor and Symbol 19(1). 1–30.
Black, Max. 1993 [1979]. More about metaphor. In Andrew Ortony (ed.), Metaphor and thought, 2nd edn., 19–41. Cambridge & New York: Cambridge University Press.
Cameron, Lynne. 1999a. Operationalising "metaphor" for applied linguistic research. In Lynne Cameron & Graham Low (eds.), Researching and applying metaphor, 1–28. Cambridge & New York: Cambridge University Press.
Cameron, Lynne. 1999b. Identifying and describing metaphor in spoken discourse data. In Lynne Cameron & Graham Low (eds.), Researching and applying metaphor, 105–132. Cambridge & New York: Cambridge University Press.
Chiappe, Dan L. 1998. Similarity, relevance, and the comparison process. Metaphor and Symbol 13(1). 17–30.
Cooper, David E. 1986. Metaphor. Oxford: Blackwell.
Croft, William. 2002. The role of domains in the interpretation of metaphors and metonymies. In René Dirven & Ralf Pörings (eds.), Metaphor and metonymy in comparison and contrast, 161–205. Berlin & New York: Mouton de Gruyter.
Croft, William. 2006. On explaining metonymy: Comments on Peirsman and Geeraerts, "Metonymy as a prototypical category". Cognitive Linguistics 17(3). 317–326.
Croft, William & D. Alan Cruse. 2004. Cognitive linguistics. Cambridge & New York: Cambridge University Press.
Dirven, René. 2002. Metonymy and metaphor: Different mental strategies of conceptualization. In René Dirven & Ralf Pörings (eds.), Metaphor and metonymy in comparison and contrast, 75–111. Berlin & New York: Mouton de Gruyter.
Dirven, René & Ralf Pörings (eds.). 2002. Metaphor and metonymy in comparison and contrast. Berlin & New York: Mouton de Gruyter.
Fass, Dan. 1997. Processing metaphor and metonymy. Greenwich, CT: Ablex.
Fauconnier, Gilles. 2009. Generalized integration networks. In Vyvyan Evans & Stephanie Pourcel (eds.), New directions in cognitive linguistics, 147–160. Amsterdam: John Benjamins.
Fauconnier, Gilles & Mark Turner. 1998. Conceptual integration networks. Cognitive Science 22(2). 133–187.
Feyaerts, Kurt. 2000. Refining the Inheritance Hypothesis: Interaction between metaphoric and metonymic hierarchies. In Antonio Barcelona (ed.), Metaphor and metonymy at the crossroads: A cognitive perspective, 59–78. Berlin & New York: Mouton de Gruyter.
Geeraerts, Dirk. 2002. The interaction of metaphor and metonymy in composite expressions. In René Dirven & Ralf Pörings (eds.), Metaphor and metonymy in comparison and contrast, 435–465. Berlin & New York: Mouton de Gruyter.
Gibbs, Ray W., Jr. 1990. Comprehending figurative referential descriptions. Journal of Experimental Psychology: Learning, Memory and Cognition 16(1). 56–66.
Gibbs, Ray W., Jr. 1999. Researching metaphor. In Lynne Cameron & Graham Low (eds.), Researching and applying metaphor, 29–47. Cambridge & New York: Cambridge University Press.
Goatly, Andrew. 1997. The language of metaphors. London & New York: Routledge.
Goodman, Nelson. 1968. Languages of art: An approach to a theory of symbols. Indianapolis: Bobbs-Merrill.
Goossens, Louis. 1990. Metaphtonymy: The interaction of metaphor and metonymy in expressions for linguistic action. Cognitive Linguistics 1. 323–340.
Glucksberg, S. 2001. Understanding figurative language. Oxford & New York: Oxford University Press.
Glucksberg, Sam & Boaz Keysar. 1990. Understanding metaphorical comparisons: Beyond similarity. Psychological Review 97(1). 3–18.
Grady, Joseph E. 1997. THEORIES ARE BUILDINGS revisited. Cognitive Linguistics 8(4). 267–290.
Group μ. 1981. A general rhetoric. Baltimore & London: The Johns Hopkins University Press.
Haser, Verena. 2005. Metaphor, metonymy and experientialist philosophy: Challenging cognitive semantics. Berlin & New York: Mouton de Gruyter.
Indurkhya, Bipin. 1992. Metaphor and cognition: An interactionist approach. Dordrecht: Kluwer.
Jakobson, Roman. 2002 [1956]. The metaphoric and metonymic poles. In René Dirven & Ralf Pörings (eds.), Metaphor and metonymy in comparison and contrast, 41–47. Berlin & New York: Mouton de Gruyter. (Reprinted with added abstract by René Dirven from Roman Jakobson & M. Halle (eds.) 1971 [1956], Fundamentals of Language 2, 90–96. The Hague & Paris: Mouton.)
Kittay, Eva F. 1989. Metaphor: Its cognitive force and linguistic structure. Oxford: Clarendon Press.
Kövecses, Zoltán. 1990. Emotion concepts. Berlin & New York: Springer.
Kövecses, Zoltán. 2002. Metaphor: A practical introduction. Oxford & New York: Oxford University Press.
Lakoff, George. 1993. The contemporary theory of metaphor. In Andrew Ortony (ed.), Metaphor and thought, 2nd edn., 202–251. Cambridge & New York: Cambridge University Press.
Lakoff, George & Mark Turner. 1989. More than cool reason: A field guide to poetic metaphor. Chicago: University of Chicago Press.
Langlotz, Andreas. 2006. Idiom creativity: A cognitive-linguistic model of idiom-representation and idiom-variation in English. Amsterdam & Philadelphia: John Benjamins.
Lodge, David. 1977. The modes of modern writing: Metaphor, metonymy and the typology of modern literature. London: Edward Arnold; Ithaca, NY: Cornell University Press.
Moore, Kevin E. 2006. Space-to-time mappings and temporal concepts. Cognitive Linguistics 17(2). 199–244.
Norrick, Neal R. 1981. Semiotic principles in semantic theory. Amsterdam: John Benjamins.
Nunberg, Geoffrey. 1978. The pragmatics of reference. Bloomington, IN: Indiana University Linguistics Club.
Nunberg, Geoffrey. 1995. Transfers of meaning. Journal of Semantics 12(2). 109–132.
Panther, Klaus-Uwe. 2006. Metonymy as a usage event. In Gitte Kristiansen, Michel Achard, René Dirven & Francisco J. Ruiz de Mendoza Ibáñez (eds.), Cognitive linguistics: Current applications and future perspectives, 147–185. Berlin & New York: Mouton de Gruyter.
Paradis, Carita. 2004. Where does metonymy stop? Senses, facets, and active zones. Metaphor and Symbol 19(4). 245–264.
Peirsman, Yves & Dirk Geeraerts. 2006. Metonymy as a prototypical category. Cognitive Linguistics 17(3). 269–316.
Pragglejaz Group. 2007. MIP: A method for identifying metaphorically used words in discourse. Metaphor and Symbol 22(1). 1–39.
Radden, Günter. 2002. How metonymic are metaphors? In René Dirven & Ralf Pörings (eds.), Metaphor and metonymy in comparison and contrast, 407–434. Berlin & New York: Mouton de Gruyter.
Radden, Günter & Zoltán Kövecses. 1999. Towards a theory of metonymy. In Klaus-Uwe Panther & Günter Radden, Metonymy in language and thought, 17–59. Amsterdam & Philadelphia: John Benjamins.
Radman, Zdravko. 1997. Difficulties with diagnosing the death of a metaphor. Metaphor and Symbol 12(2). 149–157.
Riemer, Nick. 2001. Remetonymizing metaphor: Hypercategories in semantic extension. Cognitive Linguistics 12(4). 379–401.
Riemer, Nick. 2002. When is a metonymy no longer a metonymy? In René Dirven & Ralf Pörings (eds.), Metaphor and metonymy in comparison and contrast, 379–406. Berlin & New York: Mouton de Gruyter.
Ritchie, L. David. 2006. Context and connection in metaphor. Basingstoke & New York: Palgrave Macmillan.
Ruiz de Mendoza Ibáñez, Francisco J. 1999. From semantic underdetermination via metaphor and metonymy to conceptual interaction. Paper No. 492, LAUD Linguistic Agency, SERIES A: General and Theoretical Papers.
Ruiz de Mendoza Ibáñez, Francisco J. & Olga I. Díez Velasco. 2002. Patterns of conceptual interaction. In René Dirven & Ralf Pörings (eds.), Metaphor and metonymy in comparison and contrast, 489–532. Berlin & New York: Mouton de Gruyter.
Stern, Josef. 2000. Metaphor in context. Cambridge, MA & London, UK: Bradford Books, MIT Press.
Turner, Mark & Gilles Fauconnier. 1995. Conceptual integration and formal expression. Metaphor and Symbolic Activity 10(3). 183–204.
Turner, Mark & Gilles Fauconnier. 2000. Metaphor, metonymy, and binding. In Antonio Barcelona (ed.), Metaphor and metonymy at the crossroads: A cognitive perspective, 133–145. Berlin & New York: Mouton de Gruyter.
Turner, Mark & Gilles Fauconnier. 2002. Metaphor, metonymy, and binding. In René Dirven & Ralf Pörings (eds.), Metaphor and metonymy in comparison and contrast, 469–487. Berlin & New York: Mouton de Gruyter.
Ullmann, Stephen. 1962. Semantics: An introduction to the science of meaning. Oxford: Blackwell.
Warren, Beatrice. 1999. Aspects of referential metonymy. In Klaus-Uwe Panther & Günter Radden, Metonymy in language and thought, 121–135. Amsterdam & Philadelphia: John Benjamins.
Warren, Beatrice. 2002. An alternative account of the interpretation of referential metonymy and metaphor. In René Dirven & Ralf Pörings (eds.), Metaphor and metonymy in comparison and contrast, 113–130. Berlin & New York: Mouton de Gruyter.
Warren, Beatrice. 2006. Referential metonymy. Scripta Minora of the Royal Society of Letters at Lund, 2003–2004: 1. Stockholm: Almqvist and Wiksell International.
Wee, Lionel. 2006. Proper names and the theory of metaphor. Journal of Linguistics 42. 355–371.
Grammatical weight and relative clause extraposition in English

ELAINE J. FRANCIS*
Abstract In relative clause extraposition (RCE) in English, a noun is modified by a non-adjacent RC, resulting in a discontinuous dependency, as in: Three people arrived here yesterday who were from Chicago. Although discourse focus is known to influence the choice of RCE over truth-conditionally equivalent sentences with canonical structure (Rochemont and Culicover 1990; Takami 1999), Hawkins (2004) and Wasow (2002) have proposed in addition that RCE should be preferred when the relative clause is long (or "heavy") relative to the VP because such structures are processed more efficiently in comprehension and production. The current study tested this hypothesis based on Hawkins' (2004) domain minimization principles. In an acceptability judgment task, canonical sentences were rated significantly higher than extraposition sentences when the RC was light, but this difference disappeared when the RC was heavy. In a self-paced reading task, extraposition sentences were read significantly faster than canonical sentences when the RC was heavy, but there was no difference when the RC was light. In an analysis of RCE in the ICE-GB corpus, extraposed RCs were significantly longer than the VP on average, whereas canonical RCs were significantly shorter, and the proportion of sentences with extraposition decreased as the ratio of VP-to-RC length increased. These findings support Hawkins' (2004) domain minimization principles and help explain why a discontinuous dependency is allowed and sometimes preferred even in a language with relatively fixed word order.
Keywords: grammatical weight; relative clause extraposition; sentence processing; English syntax.
* Correspondence address: Department of English and Linguistics Program, 500 Oval Drive, Purdue University, West Lafayette, IN 47907, USA. E-mail: <ejfranci@purdue.edu>. Acknowledgements: I am grateful to research assistants Yanhong Zhang and Najeong Kim for all their help with data collection and processing. I would also like to thank Ewa Dąbrowska, Stephen Matthews, and three anonymous CL referees for their detailed and insightful comments on earlier versions of the paper, as well as Bill Croft, Pat Deevy, Alex Francis, Jack Hawkins, and Tom Wasow for their helpful discussions of the data. This research was funded by the Department of English and the College of Liberal Arts at Purdue University. Cognitive Linguistics 21–1 (2010), 35–74. DOI 10.1515/COGL.2010.002. 0936–5907/10/0021–0035 © Walter de Gruyter
1. Introduction
In relative clause extraposition from subject NP, a relative clause modifies a noun within the subject NP but occurs in a position following the VP. In sentence (1a) from the International Corpus of English – Great Britain (henceforth, ICE-GB), for example, the VP "soon appeared" intervenes between the noun "sets" and the relative clause "that were able to receive all the TV channels". The extraposition sentence in (1a) can be used to express the same proposition as the canonical sentence in (1b), in which the relative clause directly follows its head noun.

(1) a. New sets soon appeared that were able to receive all the TV channels. (ICE-GB)
    b. New sets that were able to receive all the TV channels soon appeared.
Relative clause extraposition (henceforth RCE) in (1a) violates the prescriptive rule banning "misplaced modifiers" and would likely be corrected to (1b) if found in a student's essay (see Trenga 2006: 51–56). Nevertheless, this type of extraposition occurs naturally in formal speech and formal writing as well as informal styles, as illustrated in the additional examples in (2) from the ICE-GB corpus:

(2) a. The indications are that Europe will once again become the priority because slowly and awkwardly, a treaty is taking shape which will most likely emerge by the end of this year as a new European union. (formal speech)
    b. However, a close look at the nature of these countries' development experiences will reveal that certain conditions existed which cannot be applied to all other countries at all times. (formal writing)
    c. I think it would take uh oh three-quarters of an hour to an hour for somebody to start who didn't have any any any experience at all. (informal speech)
Linguists agree that the "misplaced modifier" in RCE involves some kind of discontinuous dependency between the subject NP and the extraposed relative clause.1 However, the precise nature of this dependency has been a matter of debate within formal approaches to syntax. Since the relative clause (henceforth, RC) occurs in a position completely outside the subject NP, this dependency apparently violates the requirement of X-bar structure that modifiers should occur as either complements to the head or as adjuncts to a phrasal projection of the head. As a way of reconciling RCE with X-bar theory, the earliest generative accounts assumed rightward movement of the RC from its canonical position (e.g., Baltin 1981; Ross 1967). However, as Rochemont and Culicover (1990) point out, this solution was problematic because RCE is both more and less constrained than typical movements like wh-movement. For example, RCE is subject to a locality condition that prevents the extraposed RC from occurring outside the clause containing its antecedent (originally formulated as the Right Roof Constraint, Ross 1967), while at the same time freely allowing certain island violations, for example extraction of the extraposed element out of the subject NP, as in the examples in (1a) and (2a–c) above.

Because of their special properties, RCE and related extraposition constructions have been problematic for syntactic theories and have resulted in a number of different analyses, all of which involve special formal or interpretive mechanisms. In addition to rightward movement (Ross 1967, Baltin 1981), other accounts have included discontinuous constituency of the NP (McCawley 1987, 1998) and stranding of the extraposed phrase due to leftward movement of non-extraposed elements (Kayne 1994). Another more surface-oriented approach favors base-generated adjunction of the extraposed constituent to IP or VP (a normal case of adjunction as far as constituent structure is concerned), with the addition of a special interpretive rule to state the discontinuous dependency and the relevant locality constraints (Rochemont and Culicover 1990).
Along the same lines, Culicover and Jackendoff (2005) characterize RCE as a case of simple adjunction in the syntax, but as involving a radical mismatch between syntax and semantics. In his review of the literature on extraposition, Baltin (2006) argues that each of these approaches has its merits, but that none of them provides a fully satisfactory explanation of all the relevant data (see also Bianchi 2002).

Regardless of which formal approach one takes, however, it is clear that RCE is unexpected from a purely formal perspective. Why should English allow a discontinuous dependency that requires special syntactic or interpretive mechanisms when a canonical structure can be used to express the same proposition? Why are RCE structures allowed and even preferred in some contexts in a language with relatively fixed word order and constituent structure?

1. I remain neutral as to whether nominals with determiners are best analyzed as NP or DP since this issue is not important for the research questions addressed in the current study.

The answer that has been proposed most often is that RCE is used to express a particular type of discourse information. Although the details of various accounts differ, the general consensus is that RCE tends to occur when the RC expresses new, contrastive or important information and the VP expresses old or backgrounded information (cf. Huck and Na 1990, 1992; Kuno and Takami 2004; Rochemont and Culicover 1990; Takami 1999). In (2b) above, for example, the RC expresses the most important information of the sentence while the VP "existed" merely asserts the existence of the subject "certain conditions", helping to introduce the subject. More generally, RCE adheres to the tendency of focused constituents to occur later in a sentence than old or backgrounded information. Thus, the marked word order of RCE may be exploited by speakers to highlight the information status of the RC.

Another factor that has been discussed in relation to this type of extraposition is predicate type. Rochemont and Culicover (1990: 65) observe that extraposition from subject typically occurs with unaccusative predicates such as predicates of existence and appearance. However, they show that other predicate types such as unergative and transitive predicates are also permissible as long as the predicate can be understood as c-construable (i.e., old or backgrounded) information. This is in contrast to other constructions such as Presentational there Insertion (PTI), which appear to be more strictly limited to unaccusative predicates, as illustrated by the contrast in (3a–b).

(3) a. A man phoned her up who she didn't know. (RCE)
    b. *There phoned her up a man who she didn't know. (PTI)
Thus, Rochemont and Culicover (1990: 66) argue that the preference for unaccusative predicates derives from their typical use in discourse rather than from any lexically specified constraints on extraposition.

While acknowledging the importance of discourse information structure, the current study investigates the possible role of a different factor, grammatical weight, in licensing RCE in English. Exact definitions vary, but the term "grammatical weight" usually refers to the length and/or complexity of a phrase in relation to other phrases in the same sentence: heavier constituents are longer or structurally more complex than lighter constituents. Grammatical weight has been implicated in a number of phenomena involving non-canonical word order in several languages (Arnold et al. 2000, 2004; Cheung 2006; Hawkins 1994, 2004; Konieczny 2000; Lohse et al. 2004; Matthews and Yeung 2001; Siewierska 1993; Stallings et al. 1998; Uszkoreit et al. 1998; Wasow 1997; Yamashita and Chang 2001) and I will propose that it plays an important role in English RCE as well. Following Quirk et al. (1972), Wasow (2002: 3) presents the following descriptive generalization:

(4) Principle of End Weight (PEW): Phrases are presented in order of increasing weight.2
For example, in a phenomenon known as Heavy NP Shift (henceforth, HNPS), the direct object of a transitive verb occurs at the end of the sentence following an oblique argument or adjunct (usually a PP), as in (5b), rather than occurring in its canonical position adjacent to the verb, as in (5a). In accordance with the PEW, corpus studies have shown that HNPS normally occurs when the object NP is heavier than the PP, as in (5b) (Arnold et al. 2000; Wasow 1997). Sentences with a light object NP in a shifted position, as in (5c), do sometimes occur, but are uncommon and are typically judged as less acceptable in the absence of special intonation or discourse conditions.

(5) a. The waiter brought the wine we had ordered to the table. (Canonical)
    b. The waiter brought to the table the wine we had ordered. (HNPS) (Arnold et al. 2000: 28)
    c. ?The waiter brought to the table the wine.
Why should longer, more complex constituents tend to occur later in the sentence? Wasow (1997) and Arnold et al. (2000) have argued that placing heavy constituents later facilitates production and planning of utterances by giving speakers extra time to formulate the heavier constituent (see also Arnold et al. 2004, Stallings et al. 1998, Yamashita 2002, and Yamashita and Chang 2001). In addition, Hawkins (1994) has proposed that placing heavy constituents at the end allows listeners and readers to process sentences more efficiently by allowing faster recognition of the major constituents of the sentence (see also Matthews and Yeung 2001).3 Similarly, Gibson (1998) has proposed that moving heavy constituents to the end can reduce integration costs in the processing of non-local dependencies. To the extent that these explanations can be distinguished from each other empirically, experimental and corpus studies of production and comprehension of alternative word orders have provided evidence in favor of both production and parsing-based explanations. As a result of such findings, Hawkins (2004) has extended his approach to apply more globally to both production and comprehension through a general principle of domain minimization (see also Cheung 2006, Lohse et al. 2004).

2. Wasow (1997: 82) points out that this idea has been around for a long time and was, for example, noted by Otto Behaghel (1909).
3. Note that in Hawkins' theory, this is only true for head-initial languages. For head-final languages such as Japanese and Turkish, it is predicted that heavy constituents should be shifted frontward to facilitate constituent recognition (2004: 108–109).

Culicover and Jackendoff (2005: 167) and Wasow (2002: 67) have suggested that RCE in English should be sensitive to grammatical weight in a similar manner to HNPS. As Wasow states, "Both the final position of the (usually heavy) extraposed element and the lightening of the NP serve to increase the probability of satisfying the PEW" (Wasow 2002: 7). Although Wasow's remarks are suggestive, there have been no previous empirical studies testing this prediction for English. The current study seeks to fill this gap by investigating the role of grammatical weight in the processing, acceptability, and usage of RCE in English.

One challenge for investigating the role of grammatical weight is that it is correlated with discourse status: lighter constituents tend to express old information while heavier constituents tend to express new information (e.g., Givón 1983). Furthermore, only discourse status has been investigated in previous studies of RCE, and even this factor has not been investigated using quantitative data. However, previous studies of other syntactic alternations have shown that discourse status and grammatical weight can have independent effects. For example, Arnold et al. (2000) found that each factor had an independent effect on the likelihood of people producing a particular variant for HNPS and Dative sentences.
While discourse status is not directly investigated in the current study, it is held constant in the experiments so that the effect of weight can be isolated, and it is investigated indirectly in the corpus study by means of an analysis of predicate types.

In the following sections, I report on the results of two psycholinguistic experiments and a corpus study of weight effects in RCE. The results of Experiment 1, a self-paced reading experiment, showed a significant reading time advantage for RCE when the RC was heavy and no difference between RCE and canonical structure when the RC was light. The results of Experiment 2, an acceptability judgment task, showed that canonical sentences were judged as significantly more acceptable than RCE sentences when the RC was light, but that this difference disappeared when the RC was heavy. Finally, a corpus study found that on average, extraposed RCs were longer than the VP while non-extraposed RCs were shorter, and that the proportion of sentences with RCE decreased as the ratio of VP length to RC length increased. I argue that these results support Hawkins' (2004) theory and help explain why RCE is preferred in some contexts despite the discontinuous dependency.

The paper is structured as follows. Section 2 discusses previous research on grammatical weight and RCE in German and sets out the predictions of Hawkins' (2004) theory for RCE in English. Sections 3–4 discuss two psycholinguistic experiments that tested the effects of grammatical weight on sentence processing: a reading time task (Experiment 1) and an acceptability judgment task (Experiment 2). Section 5 discusses a corpus study of naturally-occurring examples of RCE to show how grammatical weight affects speakers' and writers' structural choices. Section 6 concludes the paper and discusses some of the implications of the results reported here.

2. Relative clause extraposition, grammatical weight, and processing
As Wasow (2002: 7) points out, the discontinuous dependency in RCE not only complicates the syntax but also presumably increases processing complexity. However, studies of grammatical weight have shown that moving heavy constituents to the end, as typically happens with RCE, can also facilitate processing. To see how these two opposing constraints might interact in the case of RCE, it is necessary to consider some examples within a particular theoretical framework.

Two major theories have dealt specifically with weight effects in sentence processing: Gibson's (1998) Syntactic Prediction Locality Theory (SPLT) and Hawkins' (1994, 2004) performance-based theory of constituent order. Both theories are locality-based in that both predict that there should be a greater cost to working memory when listeners or readers must integrate linguistic information across a distance. Thus, in the case of RCE, integrating the RC with its head noun across an intervening VP constituent should incur some cost to working memory. However, the same theories predict that in cases where the RC is heavy, the cost of integrating the subject NP with the verb will be greater, and processing efficiency may be maximized by placing the heavier constituent at the end. The two theories differ in a number of details, most importantly in the way that the distance between constituents is measured and in how the processing domains are defined. However, they make similar predictions with respect to end weight effects, and I will not attempt to distinguish between them empirically in this study. Rather, I will couch the study in terms of Hawkins' (2004) theory because this theory spells out the predictions for extraposition phenomena in a more detailed manner. In addition, Hawkins' theory is more comprehensive in making predictions for choice of structure in production as well as for processing ease in comprehension, and in dealing with diverse linguistic phenomena in a variety of typologically distinct languages.

Hawkins' (2004) theory of efficiency and complexity in grammars sets out very specific predictions for weight effects in the production and comprehension of language. The main thrust of this theory is the principle of Minimize Domains, as defined in (6):

(6) Minimize Domains: The human processor prefers to minimize the connected sequences of linguistic forms and their conventionally associated syntactic and semantic properties in which relations of combination and/or dependency are processed. The degree of this preference is proportional to the number of relations whose domains can be minimized in competing sequences or structures, and to the extent of the minimization difference in each domain. (Hawkins 2004: 104)
One thing this principle predicts is that speakers should prefer to rearrange heavy constituents to minimize the domains in which relations between linguistic elements are processed. Several syntactic and semantic domains are relevant for processing, including syntactic dependencies in phrase structure, lexically specified dependencies between head words and their complements (e.g., subcategorization, theta roles, collocations), semantic dependencies between modifiers and modified phrases, and various kinds of co-indexation (e.g., pronominal coreference, filler-gap dependencies). The most important domain for our purposes is the Phrasal Combination Domain (called Constituent Recognition Domain in earlier versions of the theory) as defined in (7):

(7) Phrasal Combination Domain (PCD): the smallest string of elements required to construct a mother node and its immediate constituents (Hawkins 2004: 107).
The assumption is that constituents can be constructed as soon as the head word (or other constructing category) is encountered. In English Heavy NP-Shift, for example, the PCD for the VP includes the verb and head word of each of its complements (i.e., the verb, the preposition head of PP, and the determiner or noun introducing the NP). When the NP is heavy, the PCD for VP can be made smaller by moving the NP to a position following the PP, as shown in (8a–b).

(8) a. PCD for canonical VP
       The waiter brought the wine we had ordered to the table.
    b. PCD for VP with HNPS
       The waiter brought to the table the wine we had ordered.

A simple way of calculating PCDs is in terms of IC-to-word ratios (Hawkins 1994). In (8a–b) the VP has three immediate constituents: the verb, the object NP, and the PP. Thus, the IC-to-word ratio for the VP in (8a) is 3/7 (43%), while the IC-to-word ratio for the VP in (8b) is 3/5 (60%). The shifted structure in (8b) is therefore predicted to be more efficient than the canonical structure in (8a).

Although no previous studies of grammatical weight have investigated RCE in English, Hawkins' theory has been tested in two studies of a similar type of extraposition in German (Uszkoreit et al. 1998; Konieczny 2000). Both studies looked at the effect of grammatical weight on preferences for constituent order in sentences similar to those in (9a–b) below from Hawkins (2004: 137). Note that in German, the main verb comes at the end of the verb phrase (when there is an auxiliary verb), and the object NP comes before the verb. In this type of extraposition, the RC is moved from the canonical object position as in (9a) to a position following the verb as in (9b). I will refer to this type of extraposition as RCE from Object NP.

(9) a. PCDs for canonical sentence
       Er hat gestern das Buch das der alte Professor verloren hatte gefunden.
       he has yesterday the book that the old professor lost had found
       'He found the book yesterday that the old professor had lost.'
       IC-to-word ratios: Object NP 3/3 (100%), VP 3/10 (30%)
    b. PCDs for extraposition sentence
       Er hat gestern das Buch gefunden das der alte Professor verloren hatte.
       he has yesterday the book found that the old professor lost had
       'He found the book yesterday that the old professor had lost.'
       IC-to-word ratios: Object NP 3/4 (75%), VP 3/4 (75%)

As shown above in (9a–b), the PCD for the object NP (i.e., the distance for integrating the head noun with its RC) is minimized in the canonical sentence, with an IC-to-word ratio of 100%. However, the PCD for VP (i.e., the distance for integrating the verb with its object) is minimized in the extraposition sentence, with an IC-to-word ratio of 75%. To determine which structure is most efficient for a given sentence, what matters is the overall minimization of all the relevant domains. For example, when the RC is heavy and there is only a one-word intervening verb, as in (9b), the most efficient structure should be extraposition because the combined PCDs for Object NP and VP are minimized. This is shown in both the higher average IC-to-word ratio for extraposition (75% vs. 65%) and in the lower number of words needed to construct both the NP and VP domains in the extraposition sentence (8 words vs. 13 words) (Hawkins 2004: 138). More generally, extraposition should be more efficient than the corresponding canonical structure to the extent that RC length exceeds extraposition distance (i.e., length of the main verb and any modifiers preceding it). This is because making the RC longer increases the VP domain in the canonical sentence, while making the extraposition distance shorter decreases the NP domain in the extraposition sentence.

Both Uszkoreit et al. (1998) and Konieczny (2000) found some evidence in support of Hawkins' predictions for German RCE. In a corpus study that looked at sentences similar to those in (9a–b) above, Uszkoreit et al. (1998) found that the rate of extraposition was highest when extraposition occurred over a short distance (a one-word verb) and when the RC itself was long. For example, extraposition occurred about 95% of the time for sentences with a one-word main verb ("gefunden" in (9) above) but only about 10% of the time when the main verb was preceded by a four-word modifying phrase (Uszkoreit et al. 1998: 7). The length of the RC also affected the rate of extraposition. For example, when a one-word phrase preceded the verb and the RC was short (two or three words), extraposition only occurred in about 33% of the relevant sentences. However, this increased to 82% when the RC was long (10–15 words) (Uszkoreit et al. 1998: 9).
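The IC-to-word arithmetic behind (8) and (9) can be checked with a short sketch. The function names `ic_to_word_ratio` and `summarize_domains` are illustrative and are not part of Hawkins' own formulation; the IC and word counts passed in are simply the ones reported in the ratios above.

```python
from fractions import Fraction


def ic_to_word_ratio(n_ics: int, n_words: int) -> Fraction:
    """Immediate constituents divided by the number of words in the domain."""
    if n_ics <= 0 or n_words <= 0:
        raise ValueError("counts must be positive")
    return Fraction(n_ics, n_words)


def summarize_domains(domains):
    """Return (mean IC-to-word ratio in percent, total words) for a list of
    (n_ics, n_words) pairs, one pair per processing domain."""
    ratios = [ic_to_word_ratio(ics, words) for ics, words in domains]
    mean_percent = float(sum(ratios) / len(ratios) * 100)
    total_words = sum(words for _, words in domains)
    return mean_percent, total_words


# Heavy NP Shift, example (8): the VP domain alone (canonical, then shifted).
print(float(ic_to_word_ratio(3, 7)), float(ic_to_word_ratio(3, 5)))

# German RCE, example (9): Object NP and VP domains combined.
print(summarize_domains([(3, 3), (3, 10)]))  # canonical (9a)
print(summarize_domains([(3, 4), (3, 4)]))   # extraposed (9b)
```

With these counts the canonical order comes out at a mean of 65% over 13 words and the extraposed order at 75% over 8 words, which agrees with the figures cited from Hawkins (2004: 138).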
Uszkoreit et al. also conducted an acceptability judgment task for which they systematically varied the extraposition distance and the length of the RC. As expected, extraposition sentences were rated higher as RC length increased and as extraposition distance decreased. Contrary to expectation, however, canonical sentences were rated as more acceptable than extraposition sentences in almost all cases. Extraposition sentences were rated as highly as canonical sentences only when the extraposition distance was short (one word) and the RC was long (1998: 12).

Konieczny (2000) conducted an acceptability judgment experiment and a self-paced reading experiment to investigate weight effects in the processing of German RCE. Results of the acceptability judgment task were very similar to the results of the acceptability study by Uszkoreit et al. (1998). Acceptability of extraposition was highest when the extraposition distance was short (one word) and when the RC was medium or long (Konieczny 2000: 638–639). Also similar to Uszkoreit et al.'s (1998) results, acceptability of canonical sentences was overall higher than acceptability of extraposition sentences in all conditions. Konieczny (2000) also conducted a self-paced reading experiment in which it was found that, as expected, the relative pronoun was read slower when the RC was extraposed, indicating some processing cost for integrating the noun with a nonadjacent RC. However, contrary to locality-based predictions, reading time at the main verb was significantly slower in the extraposition sentences than in the canonical sentences even though the verb was closer to its object in the extraposition sentences. Konieczny (2000: 643) explains this effect (sometimes called "antilocality" since there is a processing advantage for non-local structures) as possibly the result of the RC in the canonical sentence helping readers anticipate the phrase-final verb by providing additional information about one of its arguments and by allowing extra time for readers to deduce information about the verb. Also contrary to locality-based predictions, longer RCs did not significantly affect reading time of the verb, nor did longer extraposition distance have any significant effect on reading time of the relative pronoun. Given the very different results of the acceptability task (which patterned with Uszkoreit et al.'s acceptability data) and the self-paced reading task, Konieczny (2000: 644) suggests that locality-based predictions are clearest for production, whereas online comprehension is subject to other effects such as anticipatory processing. However, he does not explain why locality effects in comprehension have been shown in other studies.4

RCE from Object NP in German is not quite the same as RCE from Subject NP in English, but the predictions of Hawkins' theory are quite similar. As illustrated in (10a–b), the PCD for the subject NP is minimized in the canonical sentence, while the PCD for the matrix clause is minimized in the extraposition sentence.
Similar to the case of German RCE, extraposition is preferred to the extent that the length of the RC exceeds the length of the matrix VP. When the RC is heavier than the VP, as in (10a–b), there is predicted to be an overall advantage for extraposition because the combined size of the two domains, as measured in the total number of words needed to construct them (Hawkins 2004: 138), is smaller for extraposition. In (10a–b), for example, the total is 14 for the canonical sentence and 6 for the extraposition sentence. The same advantage for extraposition is also shown in the IC-to-word ratios, where the average ratio is higher for extraposition than for canonical structure (87.5% vs. 59%).

(10) a. PCDs for canonical sentence
        New sets that were able to receive all the TV channels appeared.
        IC-to-word ratios: Subject NP 3/3 (100%), Matrix S 2/11 (18%)
     b. PCDs for extraposition sentence
        New sets appeared that were able to receive all the TV channels.
        IC-to-word ratios: Subject NP 3/4 (75%), Matrix S 2/2 (100%)

The experiments reported here are based on the predictions of Hawkins' theory as outlined above.5 Experiment 1 uses a self-paced reading task to test the effects of grammatical weight on the processing of canonical vs. extraposition structures in English. Similarly, Experiment 2 investigates the effects of grammatical weight on acceptability judgments for the same kind of sentences as in Experiment 1. With the weight of the VP held constant (longer than the RC in the light RC condition and shorter than the RC in the heavy RC condition), predictions for these experiments (based on PCDs alone) are as follows:6

- When the RC is shorter than the VP, canonical sentences should be read faster and judged as more acceptable than extraposition sentences.
- When the RC is longer than the VP, extraposition sentences should be read faster and judged as more acceptable than canonical sentences.
- As grammatical weight of the RC increases, acceptability ratings for canonical sentences should decrease and reading times for canonical sentences (mean RT per word) should slow down because of the longer distance for integrating the subject NP with the main verb.

4. See Vasishth and Lewis (2006) for additional evidence of anti-locality effects with head-final structures in Hindi. The authors attempt to explain both locality and anti-locality effects using a theory of activation decay.
5. It should be noted that Hawkins' (2004) theory takes account of several domains in addition to PCD. Within the Subject NP, there is a semantic dependency between the head noun and the RC, since the RC restricts the meaning of the noun, and there is co-indexing between the head noun and the relative pronoun. Similarly, within the Matrix S, there are additional dependencies between the predicate and the subject (e.g., selectional restrictions, theta role assignment). However, since there are additional dependencies within both the NP and VP domains, the overall predictions turn out to be very similar to the predictions for PCD alone. Therefore, following John Hawkins' suggestion (p.c. 2007), the hypotheses for the current study are based on the simpler metric of PCDs alone.
6. As an anonymous reviewer points out, additional factors other than weight or domain minimization could invalidate the first two hypotheses. However, since weight is the only factor that was manipulated, with other known factors held constant, and since sentence materials were designed to be fully felicitous with both extraposed and canonical structures (apart from any weight effects), I will assume as a starting point that the first two hypotheses should hold.
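The word totals given for (10a–b) can be recovered by counting the words between the elements that bound each domain. The sketch below is an illustration only: the helper `pcd_sizes` and its choice of boundary words (first NP word through the relativizer for the Subject NP, head noun through the main verb for the Matrix S) are assumptions read off the reported ratios, and it is not a general PCD parser.

```python
def pcd_sizes(sentence, np_first, head_noun, relativizer, verb):
    """Return (subject_np_words, matrix_s_words) for one sentence.

    Assumed boundaries (a hypothetical simplification):
    - Subject NP domain: first NP word through the relativizer.
    - Matrix S domain: head noun through the main verb.
    """
    tokens = sentence.rstrip(".").split()
    first = tokens.index(np_first)
    noun = tokens.index(head_noun)
    rel = tokens.index(relativizer)
    main_verb = tokens.index(verb)
    subject_np = rel - first + 1
    matrix_s = abs(main_verb - noun) + 1
    return subject_np, matrix_s


canonical = "New sets that were able to receive all the TV channels appeared."
extraposed = "New sets appeared that were able to receive all the TV channels."

for label, sentence in (("canonical", canonical), ("extraposed", extraposed)):
    np_words, s_words = pcd_sizes(sentence, "New", "sets", "that", "appeared")
    # ICs per domain as reported in (10): 3 for the Subject NP, 2 for the Matrix S.
    mean_ratio = (3 / np_words + 2 / s_words) / 2
    print(label, np_words + s_words, round(mean_ratio * 100, 1))
```

Under these boundary assumptions the combined totals are 14 words for the canonical sentence and 6 for the extraposed one, as stated above, and the mean ratios agree with the rounded 59% and 87.5%.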
However, note that if reading time is subject to anti-locality effects as found in Konieczny's (2000) study of German RCE, canonical sentences should be read faster than extraposition sentences regardless of RC weight because the RC should facilitate processing of the verb.

An analysis of RCE in the ICE-GB corpus investigates the related issue of how grammatical weight affects speakers' choice of structure in language use. Since the principle of Minimize Domains applies to production as well as perception, the theoretical predictions for the corpus analysis are similar to those of the experiments. All else being equal, extraposition should be preferred in language use in cases where RC length exceeds VP length, and this preference should be stronger as the difference in length becomes greater. When all relevant examples of sentences with RCs modifying the subject are included in the analysis, we predict the following:

- The proportion of sentences with extraposition should be highest when the ratio of VP length to RC length is lowest (i.e., when the RC is much longer than the VP) and should decrease as this ratio increases.
- For extraposition sentences, the RC should be longer on average than the VP, while the converse should be true of canonical sentences.
- These weight effects are distinct from the effects of predicate type and should hold when predicate type is held constant.
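The first corpus prediction can be made concrete with a small sketch of how the VP-to-RC ratio and the per-bin extraposition rate might be computed. Everything here is an assumption for illustration: length is counted in whitespace-separated words, the bin edges are chosen by the caller, and the function names are hypothetical rather than taken from the study.

```python
from bisect import bisect_left
from collections import defaultdict


def vp_to_rc_ratio(vp: str, rc: str) -> float:
    """Ratio of VP length to RC length, both counted in words."""
    return len(vp.split()) / len(rc.split())


def extraposition_rate_by_bin(records, upper_edges):
    """Proportion of extraposed sentences per ratio bin.

    records: iterable of (ratio, is_extraposed) pairs.
    upper_edges: ascending upper bounds of the bins; ratios above the last
    edge fall into a final open-ended bin keyed by float("inf").
    """
    counts = defaultdict(lambda: [0, 0])  # bin -> [extraposed, total]
    for ratio, is_extraposed in records:
        i = bisect_left(upper_edges, ratio)
        key = upper_edges[i] if i < len(upper_edges) else float("inf")
        counts[key][0] += int(is_extraposed)
        counts[key][1] += 1
    return {key: ext / total for key, (ext, total) in sorted(counts.items())}


# Example (1a): VP "soon appeared", RC "that were able to receive all the TV channels".
print(vp_to_rc_ratio("soon appeared", "that were able to receive all the TV channels"))
```

For (1a) the ratio is 2/9, well below 1, which is the region where the prediction expects extraposition to be most frequent.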
Sections 3–5 report the results of Experiments 1–2 and the corpus analysis. We will see that all of the results showed significant effects of grammatical weight in the expected direction, and that the predictions were borne out most clearly in the corpus analysis.
3. Experiment 1: Reading time
The goal of Experiment 1 was to test whether grammatical weight affected processing efficiency of sentences with an extraposed or non-extraposed RC modifying the subject NP. A self-paced reading task, for which whole-sentence reading time was measured, was used to assess processing efficiency. As is standard for self-paced reading tasks, it was assumed that faster reading times indicate faster, more efficient processing, at least for sentences that are understood correctly.

3.1. Methods
3.1.1. Participants. Forty participants were recruited from the student body at Purdue University. All were native speakers of American English between the ages of 18 and 39 (average age 22). There were 17 men and 23 women. Participants gave informed consent and were each paid $6 for completing a 35–45 minute session.

3.1.2. Materials. Experimental stimuli consisted of five sets of nine sentences each in a 3 × 3 repeated measures design. Three levels of RC weight (4, 8, and 15 words) and three levels of structure (canonical RC, extraposed RC, adjunct clause) were manipulated, and VP length was held constant at five words. Lexical content of the sentences was chosen to be maximally felicitous in both RCE and canonical configurations. To satisfy semantic and pragmatic conditions on RCE, only indefinite, quantified subject NPs were used, and only intransitive, unaccusative VPs were used (see Rochemont and Culicover 1990: 60–68). In addition, sentences were constructed so that the RC could readily be construed as new information. A sample stimulus set is shown in Table 1. Grammatically acceptable filler sentences of varying lengths were also included in the experiment.

3.1.3. Procedure. Following a brief background questionnaire, participants were presented with a series of sentences on a computer screen. Each sentence was presented in its entirety, and participants were instructed to press a button as soon as they had read and understood the sentence. Following each sentence, participants were presented with a true-or-false question about the content of the sentence (see Appendix A for exact instructions). To ensure validity of the results, only sentences with accurate responses to the comprehension question were included in the analysis of reading time. An E-Prime program was used to present the stimuli and record the whole-sentence reading times and true-false responses. Sentences were presented in 4 blocks of 33 sentences each.
Each block consisted of 11 test sentences and 22 fillers, except for block 4, which consisted of 12 test sentences and 23 fillers. For each participant, sentences were ordered randomly within each block, and ordering of blocks was
also random. Participants were given the opportunity to take a break at the end of each block.

3.1.4. Hypotheses. Following Hawkins (2004), it was predicted that there should be a reading time advantage for extraposition over canonical structure when the RC is heavier than the VP due to the longer distance for integrating the verb with its subject. Conversely, there should be a reading time advantage for canonical structure over extraposition when the RC is lighter than the VP because integration of the subject noun with the RC is easier when they are adjacent. Finally, reading time for canonical sentences should get slower as RC weight increases due to the increased distance for integrating the subject with the main verb. These predictions are reflected in the total number of words for combined PCDs, as shown in Table 1. Extraposed relatives stay the same in all three weight conditions, at 10 words. Canonical relatives start at 9 words in the light condition, where they should be read faster than extraposed relatives, but increase to 13 and 20 words in the medium and heavy conditions, where they should be read slower than extraposed relatives.

A third structure, the adjunct clause, was included as a control condition. This type consisted of a main clause identical to the main clause of the other two structures and a finite subordinate clause adjoined to the main clause following the VP (see examples in Table 1). Similar to extraposed RCs, adjunct clauses occur at the end of the sentence, but adjunct clauses are not involved in any discontinuous dependency with the subject NP. Therefore, reading times for sentences with adjunct clauses are predicted to be faster than reading times for extraposition sentences when the clause is light, similar to canonical sentences. However, since adjunct clauses do not intervene between the subject and the verb (unlike canonical RCs), reading times for adjunct sentences are not expected to slow down in the medium and heavy conditions.
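The word totals cited for the combined PCDs can be reproduced with a short sketch. The domain formulas below are not stated in the text; they are an assumption inferred from the IC-to-word ratios listed in Table 1, and the class and function names are hypothetical.

```python
from dataclasses import dataclass

VP_LENGTH = 5  # words; held constant across the experimental stimuli


@dataclass(frozen=True)
class Domains:
    """Word counts of the two domains listed in Table 1 (assumed reading)."""
    subject_np: int  # words spanned to recognize the parts of the subject NP
    matrix_s: int    # words spanned to recognize the parts of the matrix clause

    @property
    def total(self) -> int:
        return self.subject_np + self.matrix_s


def canonical(rc_len: int) -> Domains:
    # NP domain: determiner + noun + relativizer (3 words).
    # S domain: head noun + the whole RC + main verb.
    return Domains(subject_np=3, matrix_s=rc_len + 2)


def extraposed(vp_len: int = VP_LENGTH) -> Domains:
    # NP domain: determiner + noun + the whole VP + relativizer.
    # S domain: head noun + main verb (2 words).
    return Domains(subject_np=vp_len + 3, matrix_s=2)


def adjunct(vp_len: int = VP_LENGTH) -> Domains:
    # NP domain: determiner + noun (2 words).
    # S domain: head noun + the whole VP + subordinator.
    return Domains(subject_np=2, matrix_s=vp_len + 2)


for label, rc_len in [("light", 4), ("medium", 8), ("heavy", 15)]:
    print(label, canonical(rc_len).total, extraposed().total, adjunct().total)
# light 9 10 9
# medium 13 10 9
# heavy 20 10 9
```

Under these assumed formulas, only the canonical total depends on RC length, which is why the predicted advantage switches from canonical structure to extraposition between the light and medium conditions.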
In terms of total number of words for combined PCDs, sentences with an adjunct clause stay at 9 words in all three conditions.

3.2. Results
For purposes of comparison across light, medium, and heavy conditions, mean reading time per word rather than whole sentence reading time was used in the analysis. Mean reading time per word (henceforth RTW) was calculated by dividing each whole sentence reading time by the number of words in the sentence. All test sentences within each category (light, medium, heavy) were of the same length in words and the three length conditions differed only in the length of the relative clause. Sentences within
Table 1. Sample stimulus set consisting of three clause weights and three structures

Light (4 words)
  Canonical RC: Three people who were from Chicago arrived here early yesterday morning.
    IC-to-word ratios: Subject NP 3/3 (100%); Matrix S 2/6 (33%). Total words: 9
  Extraposed RC: Three people arrived here early yesterday morning who were from Chicago.
    IC-to-word ratios: Subject NP 3/8 (37.5%); Matrix S 2/2 (100%). Total words: 10
  Adjunct clause: Three people arrived here early yesterday morning after they left Chicago.
    IC-to-word ratios: Subject NP 2/2 (100%); Matrix S 3/7 (43%). Total words: 9

Medium (8 words)
  Canonical RC: Three people who were from a northern suburb of Chicago arrived here early yesterday morning.
    IC-to-word ratios: Subject NP 3/3 (100%); Matrix S 2/10 (20%). Total words: 13
  Extraposed RC: Three people arrived here early yesterday morning who were from a northern suburb of Chicago.
    IC-to-word ratios: Subject NP 3/8 (37.5%); Matrix S 2/2 (100%). Total words: 10
  Adjunct clause: Three people arrived here early yesterday morning after they left a northern suburb of Chicago.
    IC-to-word ratios: Subject NP 2/2 (100%); Matrix S 3/7 (43%). Total words: 9

Heavy (15 words)
  Canonical RC: Three people who were originally from a far northern suburb of Chicago which is called Lake Forest arrived here early yesterday morning.
    IC-to-word ratios: Subject NP 3/3 (100%); Matrix S 2/17 (12%). Total words: 20
  Extraposed RC: Three people arrived here early yesterday morning who were originally from a far northern suburb of Chicago which is called Lake Forest.
    IC-to-word ratios: Subject NP 3/8 (37.5%); Matrix S 2/2 (100%). Total words: 10
  Adjunct clause: Three people arrived here early yesterday morning after they secretly left a far northern suburb of Chicago which is called Lake Forest.
    IC-to-word ratios: Subject NP 2/2 (100%); Matrix S 3/7 (43%). Total words: 9
each condition of each stimulus set consisted of the same words, with the exception of the adjunct clause condition where the first three words of the adjunct clause had to be changed to accommodate the different clause type (see Table 1). As shown in Figure 1 and Table 2, RTW for all three structures decreased (got faster) as clause weight increased. This trend was strongest
Figure 1. Mean reading time per word by clause weight
Table 2. Mean reading time per word by clause weight (mean and std. error in ms)

Structure     Statistic      Light     Medium    Heavy
Canonical     mean           356.44    322.07    330.53
              std. error      15.38     16.90     21.22
              n                  40        40        40
Extraposed    mean           381.88    346.40    292.94
              std. error      23.84     21.66     14.83
              n                  40        40        40
Adjunct       mean           361.93    341.43    346.62
              std. error      15.32     16.43     20.99
              n                  40        40        40
for extraposition sentences, for which mean RTW decreased from 382 ms in the light condition to 293 ms in the heavy condition, a difference of 89 ms. In contrast, RTW for canonical and adjunct sentences decreased by only 25 ms and 15 ms, respectively. A two-way repeated measures ANOVA showed a significant main effect of clause weight by participants, but the effect did not reach significance in the by-items analysis: F1(2, 38) = 6.56, p < 0.01; F2(2, 3) = 4.07, p = 0.14. There was also a significant interaction between clause weight and structure in the participant analysis, but the interaction did not reach significance in the item analysis: F1(4, 36) = 3.18, p = 0.02; F2(4, 1) = 11.15, p = 0.22. No main effect for structure was found:
F1(2, 38) = 0.66, p = 0.52; F2(2, 3) = 0.33, p = 0.74. The difference between participant and item analyses is most likely due to the fact that there were forty participants, but only five items (sentence sets) were used. Pairwise t-tests confirm that the main effect of weight and the interaction between weight and structure were primarily due to the decrease in RTW for heavy extraposition sentences. While there was a significant difference in RTW between light and heavy extraposition sentences (t = 4.77, p < 0.01), there was no significant difference between light and heavy canonical sentences (t = 1.33, p = 0.19) or between light and heavy adjunct sentences (t = 0.91, p = 0.37). RTWs for extraposition sentences were significantly faster than for canonical sentences in the heavy clause condition (t = 2.59, p = 0.01). In the light clause condition, RTWs for canonical sentences were slightly faster than for extraposition sentences, but this difference was not significant (t = 1.45, p = 0.16).

3.3. Discussion
As predicted by Hawkins' (2004) theory, there was a significant reading time advantage for extraposition over canonical structure in the heavy clause condition. Unlike Konieczny's (2000) findings for German RCE, in which reading times on the main verb were uniformly faster for canonical sentences than for extraposition, the present study found no antilocality effects. The different results for German vs. English are not necessarily at odds with each other, however. Konieczny (2000) measured reading time on the main verb, whereas the present study measured only whole sentence reading time.7 While Konieczny's results show a consistent advantage for canonical structure with respect to reading time on the verb, possibly reflecting readers' early anticipation of the verb, it is still possible that the reading time for the entire sentence might have shown no such advantage, or even an advantage for extraposition, if such data had been collected. Arguably, the whole sentence reading times collected for the present study are a more direct reflection of overall processing time for the sentence than are the localized reading times reported in the German study, and therefore might be expected to align more closely with corpus data, as they in fact do (see Section 5 below).8 Due
7. Another difference between Konieczny's (2000) study and the present study is that in German RCE from Object NP, a head-final transitive verb must be integrated with the direct object preceding it, but in English RCE from Subject NP, a head-initial intransitive verb must be integrated with the subject. Since all verbs need a subject in English, and since the subjects in the stimuli were equally plausible with or without the relative clause, there might be less reason for anticipatory effects to occur. I am grateful to an anonymous CL reviewer for pointing this out.
8.
to limitations of both studies, however, it is not possible to compare the results directly.

Some of the other findings are not exactly as predicted by Hawkins (2004). Although reading times were slightly faster for canonical and adjunct sentences than for extraposition sentences in the light condition, these differences were not significant. Despite the discontinuous dependency in the extraposition sentences, there was no evidence of added processing cost for extraposition. This result may, however, be due to limitations of the experimental design. In the light clause condition, the RC had four words and the VP had five words, so that only a one-word advantage for canonical structure was predicted. It may be that the difference between the relevant NP and Matrix S domains was too small to show up as a difference in reading time. If, instead, the experiment had included a light clause condition in which the VP was twice as long as the RC, as in (11a–b) below, we might have found a significant advantage of canonical structure over extraposition.

(11) a. Three people who were from Chicago arrived in Albuquerque early yesterday morning at 6 am. (Canonical)
     b. Three people arrived in Albuquerque early yesterday morning at 6 am who were from Chicago. (Extraposition)
Experimental evidence for a difference in reading time between sentences like (11a) and (11b) awaits future research. However, in Section 5 below, we will see evidence from a corpus analysis that canonical structure is in fact strongly preferred in language use when VP length exceeds RC length.

Also contrary to prediction, RTW for canonical sentences did not slow down as clause weight increased, but instead got slightly (though not significantly) faster. In addition, RTW for extraposition sentences got significantly faster as clause weight increased even though increased clause weight would have had no effect on the relevant PCDs. Because only whole sentence reading times were measured, it is difficult to determine why this happened. However, one possibility is that there was a general effect of sentence length such that longer sentences were read faster per word than shorter sentences. If so, this could help explain both patterns. For canonical sentences, the general speeding up with longer sentences might have counteracted the predicted slowing effect of heavy RCs, resulting in no significant change in RTW. For extraposition sentences, the speeding up of RTW in the heavy condition could be entirely due to a general effect of sentence length, since heavier RCs would have had no effect on the PCDs for extraposition. However, since I am unaware of any
previous studies showing such an effect, this explanation must be considered speculative.

4. Experiment 2: Acceptability judgment task
The goal of Experiment 2 was to test whether grammatical weight affected acceptability judgments of sentences with an extraposed or nonextraposed RC modifying the subject NP. A written survey was used to collect sentence judgments using a 9-point scale. It was expected that acceptability judgments should reflect domain minimization preferences in a similar manner to reading time.

4.1. Methods
4.1.1. Participants. Thirty-one participants were recruited from the student body at Purdue University. All were native speakers of American English between the ages of 18 and 54 (average age 24). There were 19 men and 12 women. Participants gave informed consent and were each paid $6 for completing a 35–45 minute session. Data from one participant were excluded from the analysis because the participant turned out to be a native speaker of Spanish. Data from thirty participants were included in the analysis. People who had participated in Experiment 1 were excluded from the subject pool.

4.1.2. Materials. Sentence materials were the same as in Experiment 1 (see Table 1), with the exception of the filler sentences. As in the previous experiment, five sets of nine test sentences were used. Each set included three levels of RC weight (4, 8, and 15 words) and three levels of structure (canonical RC, extraposed RC, adjunct clause). VP length was held constant at five words. Unlike in Experiment 1, where only grammatically acceptable filler sentences were used, filler sentences covered a wide range of different levels of acceptability as well as different sentence lengths. Fillers were categorized in advance as good, medium, or bad in acceptability based on judgments of the same or similar sentences given in published sources such as syntax textbooks and research articles.

4.1.3. Procedure. Following a brief background questionnaire, participants were asked to complete a written survey. The survey consisted of a series of sentences, each of which participants were asked to rate on a 9-point scale, where 9 means completely acceptable and 1 means completely unacceptable (see Appendix B for exact instructions). Rating scores were entered by circling a number from 1 to 9 below each sentence.
As in Experiment 1, sentences were presented in 4 blocks of 33 sentences each. Each block consisted of 11 test sentences and 22 fillers, except for block 4, which consisted of 12 test sentences and 23 fillers. Participants filled out one of four different survey scripts with two different orderings of sentences within each block and two different orderings of blocks. Pseudo-random order was used to arrange sentences within each block to avoid similar sentences on the same page of the survey. Participants were given the opportunity to take a break at the end of each block. Responses were later entered by hand into an Excel spreadsheet.

4.1.4. Hypotheses. Because canonical and extraposition sentences are equally grammatical (in the sense of being permitted by the grammar), and acceptability judgments are predicted to follow domain minimization preferences, weight-based predictions for acceptability are similar to those for reading time in Experiment 1. Following the domain minimization preferences shown in Table 1, it was predicted that extraposition should be rated higher than canonical structure when the RC is heavier than the VP (medium and heavy conditions) due to the longer distance for integrating the verb with its subject. Conversely, canonical sentences should be rated higher than extraposition sentences when the RC is lighter than the VP (light condition) due to the longer distance between the subject noun and the RC. Acceptability of canonical sentences should decrease as RC weight increases due to the increased distance between the subject and the main verb. Finally, adjunct sentences should be rated similarly to canonical sentences (and higher than extraposition sentences) in the light condition, but unlike canonical sentences, ratings of adjunct sentences should not decrease in the medium and heavy conditions.

4.2. Results
As shown in Figure 2 and Table 3, mean ratings for canonical sentences decreased from 8.05 in the light clause condition to 6.69 in the heavy clause condition. Ratings for adjunct sentences similarly decreased from 7.63 in the light condition to 6.97 in the heavy condition. In contrast, ratings for extraposition sentences started lower, at 6.33 in the light condition, and increased slightly to 6.41 in the heavy condition. A two-way repeated measures ANOVA showed a significant main effect for weight in both the participant and item analyses: F1(2, 28) = 11.06, p < 0.01; F2(2, 3) = 19.15, p = 0.02. There was also a significant main effect of structure for both participant and item analyses: F1(2, 28) = 13.13, p < 0.01; F2(2, 3) = 60.84, p < 0.01. There was a significant interaction between clause weight and structure in the participant analysis
Figure 2. Mean acceptability ratings by clause weight9
Table 3. Mean acceptability ratings by clause weight

Structure     Statistic      Light     Medium    Heavy
Canonical     mean            8.05      8.09      6.69
              std. error      0.16      0.15      0.25
              n                 30        30        30
Extraposed    mean            6.33      6.67      6.41
              std. error      0.33      0.29      0.31
              n                 30        30        30
Adjunct       mean            7.63      7.60      6.97
              std. error      0.24      0.18      0.24
              n                 30        30        30
but this trend did not reach significance in the item analysis: F1(4, 26) = 8.59, p < 0.01; F2(4, 1) = 17.87, p = 0.18.

Pairwise comparisons suggest that the main effect of structure was due to the lower acceptability of extraposition sentences as compared with canonical and adjunct sentences in the light and medium conditions. Canonical sentences were rated significantly higher than extraposition sentences in the light condition (t = 5.94, p < 0.01), and in the medium condition (t = 5.83, p < 0.01), while there was no significant difference between canonical and extraposition sentences in the heavy condition (t = 1.24, p = 0.23). Adjunct sentences patterned similarly to canonical sentences, except that ratings did not decrease as much in the heavy condition. Adjunct sentences were rated significantly higher than extraposition sentences even in the heavy condition (t = 2.60, p = 0.01).

The main effect of weight and the interaction between weight and structure were apparently due to the decrease in ratings for canonical and adjunct sentences (but not for extraposition sentences) in the heavy condition. Pairwise t-tests confirm that there was a significant difference between light and heavy canonical sentences (t = 6.34, p < 0.01) and between light and heavy adjunct sentences (t = 3.25, p < 0.01), but no significant difference between light and heavy extraposition sentences (t = 0.45, p = 0.66).

9. Error bars in all figures represent standard error of the mean and are based on the by-participant data.

4.3. Discussion
Canonical and adjunct sentences started at high acceptability in the light condition and decreased in acceptability in the heavy condition. On the other hand, extraposition sentences stayed at moderate acceptability in all three conditions, converging with canonical sentences in the heavy condition (though adjunct sentences were still rated higher). This pattern of results partially confirms the initial hypotheses. As predicted, acceptability ratings for canonical sentences decreased as clause weight increased. Also as predicted, there was a significant advantage for canonical sentences and adjunct sentences over extraposition sentences in the light clause condition. Contrary to prediction and unlike in Experiment 1, however, there was no advantage for extraposition in the heavy condition. Rather, the advantage for canonical sentences shown in the light and medium conditions simply disappeared in the heavy condition. Also unexpectedly, ratings for adjunct sentences decreased in the heavy condition, though not as much as ratings for canonical sentences did.

While clear effects of grammatical weight were shown in both Experiment 1 and Experiment 2, it is interesting that the results patterned in a different way. Reading time results showed no difference among structures in the light condition, but an advantage for extraposition in the heavy condition. In contrast, acceptability results showed an advantage for canonical structure in the light condition, but no significant difference among structures in the heavy condition. It is likely that the different results are due to differences between the two tasks. While reading time is an online measure assumed to reflect processing difficulty in a relatively
direct manner, acceptability judgments are collected offline and may be subject to additional factors such as prescriptive rules and stylistic considerations. Since extraposition from NP occurs naturally in both spoken and written English, we know that this structure is permitted by the grammar of English. Therefore, all the test sentences used in Experiments 1 and 2 should be grammatical in the sense of being part of speakers' grammatical competence. However, in prescriptive terms, extraposed RCs are a type of misplaced modifier of which highly educated people (in this case, university students) are likely to be aware. The misplaced modifier is especially noticeable since the subject-modifying RC occurs completely outside the subject NP in a position normally reserved for VP or sentence modifiers. In addition, it is possible that the lower acceptability for RCE reflects a frequency-based preference for adjacency between the head noun and its modifying relative clause. As shown in a corpus study reported in Section 5 below, RCE occurred in only 15% of the relevant cases. Although both canonical order and RCE are licensed by the competence grammar, speakers may have a sense that the canonical order is more frequent and therefore more basic, resulting in lower acceptability ratings for RCE sentences.10 Finally, although the lexical content was chosen to be fully felicitous with extraposition, the isolated sentences in the experiment were not situated in any discourse context, possibly drawing more attention to the extraposition structure than a natural context would have done.

It is plausible, therefore, that the discrepancy between reading time and acceptability results could be explained as follows. In the light clause condition, there was no difference in reading time between canonical and extraposition sentences, suggesting that processing efficiency for the two structures is similar.
Thus, a prescriptive and/or frequency-based bias in favor of canonical order might help explain the lower acceptability of extraposition in the light condition. In the heavy clause condition, there was a clear reading time advantage for extraposition over canonical structure, suggesting that extraposition sentences are processed more efficiently than canonical sentences when the RC is heavy (just as the theory predicts). The weight-based advantage in processing efficiency for extraposition sentences might have counteracted the normative bias in favor of canonical structure, resulting in approximately equal acceptability for canonical and extraposition structures in the heavy condition. However, it is still not clear why adjunct sentences decreased in acceptability in the heavy condition rather than staying the same in all conditions.
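The contrast between the two measures can be made concrete with a small arithmetic sketch over the condition means reported in Table 2 and Table 3. The dictionary layout and the function name are hypothetical, and the values are raw differences of means only; they say nothing about statistical significance, which is reported in the pairwise tests of Sections 3.2 and 4.2.

```python
# Condition means copied from Table 2 (reading time per word, ms)
# and Table 3 (acceptability rating on the 9-point scale).
RTW_MS = {
    "light": {"canonical": 356.44, "extraposed": 381.88},
    "heavy": {"canonical": 330.53, "extraposed": 292.94},
}
RATING = {
    "light": {"canonical": 8.05, "extraposed": 6.33},
    "heavy": {"canonical": 6.69, "extraposed": 6.41},
}


def extraposition_advantage(weight: str) -> tuple[float, float]:
    """Return (reading-time advantage in ms, rating advantage in points).

    Positive values favor extraposition: faster reading per word and a
    higher acceptability rating than the canonical structure.
    """
    rt_gain = RTW_MS[weight]["canonical"] - RTW_MS[weight]["extraposed"]
    rating_gain = RATING[weight]["extraposed"] - RATING[weight]["canonical"]
    return round(rt_gain, 2), round(rating_gain, 2)


for weight in ("light", "heavy"):
    print(weight, extraposition_advantage(weight))
# light (-25.44, -1.72)
# heavy (37.59, -0.28)
```

In the heavy condition the reading-time difference favors extraposition while the rating gap shrinks from 1.72 to 0.28 points, which is the pattern the proposed explanation is meant to account for.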
10. For example, see Bresnan (2006) for evidence that fully grammatical but infrequent or improbable structures are commonly judged as less acceptable.
It is interesting that Uszkoreit et al. (1998) and Konieczny (2000) got very similar results to these in their acceptability judgment experiments for RCE sentences in German. Uszkoreit et al. (1998: 12) found that canonical sentences were rated higher than extraposition sentences overall, but that this difference was neutralized when the extraposition distance was short and the RC was long. Similarly, Konieczny (2000: 638–639) found that canonical sentences were rated higher in all conditions, but that there was the least difference between canonical and extraposition sentences when the extraposition distance was short and the RC was long. A frequency-based explanation is less plausible for German than for English, since RCE is more common in German. For example, Uszkoreit et al. (1998: 6) found that extraposition occurred in 43% (340 out of 789) of the relevant cases. Therefore, I conjecture that prescriptive rules or other stylistic considerations affected metalinguistic judgments in both English and German, causing extraposition sentences to be rated somewhat lower than the domain minimization preferences would predict, while a frequency-based bias in favor of canonical structure might have been an additional factor for English.

We now turn to an additional kind of evidence for grammatical weight effects: a corpus study of naturally occurring speech and writing.
5. Corpus analysis
Hawkins' (2004) idea of processing efficiency through domain minimization applies not just to comprehension but also to production. It is therefore predicted that preferences for choice of one structure over another in language use should reflect domain minimization preferences. This prediction has been confirmed in previous corpus studies of Heavy NP Shift, Particle Shift, and Dative Shift in English (Arnold et al. 2000; Lohse et al. 2004; Wasow 1997) as well as RCE in German (Uszkoreit et al. 1998). For the current study, it was predicted that RCE from Subject NP should occur most often in naturally occurring speech and writing when the RC is longer than the VP, and that incidence of extraposition should decrease as the ratio of VP to RC length increases.

5.1. Methods
5.1.1. Corpus. The International Corpus of English Great Britain (henceforth ICE-GB) was used for this study (Nelson, Wallis, and Aarts 2002). The corpus includes about one million words of British English in a variety of genres of both speech and writing.
5.1.2. Data collection and coding. Finite and non-finite clauses for which the subject NP was modified by a RC were collected by means of category and tree fragment searches in the ICE-GB corpus. Each clause was then coded by hand for the following categories: extraposition status, Subject NP length, VP length, RC length, verb complex, main predicate, predicate type, RC type, head noun of RC, relative pronoun, and discourse type. RCs were classified as extraposed or canonical depending on their position, with canonical RCs occurring within the subject NP and extraposed RCs occurring at the end of the sentence following the VP. Phrase length was counted in words as defined by units of text separated by blank spaces. Repetitions and restarts were excluded from the word count. Verb complex included the main verb and its auxiliaries in the exact form used in the sentence. Main predicate was coded as the lemma form of the main verb (or adjective, in the case of predicate adjectives). The main predicate was then assigned to one of the following predicate types: transitive action verb, transitive stative verb, intransitive unergative verb, intransitive unaccusative verb, passive verb, copular verb, raising verb, and predicate adjective. The distinction between unaccusative and unergative was based on the presentational there test: unaccusative verbs are those verbs that fit into a sentence such as: There arrived three guests. Passive verbs, which were almost always forms of transitive action verbs, were included as a category separate from both transitive and unaccusative verbs. RC type was coded according to the following categories based on the grammatical function of the relative pronoun within the RC: subject, direct object, object of preposition, possessive, and adjunct. Grammatical function of the head noun was not coded, since the head noun was always the subject. The form of the relative pronoun was also noted (e.g., which, that, who, where, to whom, etc.).
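The hand-coding scheme above can be pictured as one record per clause. The sketch below is illustrative only: the field names are hypothetical, it covers just a subset of the coded categories, and the example clause is the constructed light extraposed sentence from Table 1 rather than an actual ICE-GB sentence. Word counts follow the stated rule (units of text separated by blank spaces), on the assumption that repetitions and restarts have already been removed by hand.

```python
from dataclasses import dataclass


def count_words(phrase: str) -> int:
    # A word is any unit of text separated by blank spaces. Repetitions and
    # restarts are assumed to have been removed before this is called.
    return len(phrase.split())


@dataclass
class CodedClause:
    """A subset of the coding categories, with hypothetical field names."""
    extraposed: bool        # extraposition status
    vp: str                 # text of the VP
    rc: str                 # text of the relative clause
    main_predicate: str     # lemma of the main verb or predicate adjective
    predicate_type: str     # e.g. "intransitive unaccusative verb"
    rc_type: str            # grammatical function of the relative pronoun
    relative_pronoun: str   # form of the relative pronoun
    discourse_type: str     # "speech" or "writing"

    @property
    def vp_length(self) -> int:
        return count_words(self.vp)

    @property
    def rc_length(self) -> int:
        return count_words(self.rc)


# Constructed example based on Table 1, not a corpus sentence; the discourse
# type is an arbitrary placeholder for illustration.
example = CodedClause(
    extraposed=True,
    vp="arrived here early yesterday morning",
    rc="who were from Chicago",
    main_predicate="arrive",
    predicate_type="intransitive unaccusative verb",
    rc_type="subject",
    relative_pronoun="who",
    discourse_type="writing",
)
print(example.vp_length, example.rc_length)  # 5 4
```

Keeping the phrase text in the record and deriving the lengths from it means the word-count rule is applied in one place, which makes it easy to change if a different definition of a word is wanted.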
Discourse type was coded as either speech or writing depending on whether the sentence came from a spoken or written source.

5.1.3. Hypotheses. Based on the same domain minimization preferences used in Experiments 1 and 2, extraposition should be preferred to the extent that RC length exceeds VP length. This is because extraposition minimizes the combined PCDs for Subject NP and Matrix S in just those cases for which the RC is longer than the VP. (See predictions for reading time in Section 3 above.) It was therefore predicted that, in a corpus of sentences with a subject-modifying RC, the RC should be longer on average than the VP when extraposition is used. Conversely, the RC should be shorter on average than the VP when the canonical structure is used. In addition, the proportion of sentences with extraposition should
be highest when the ratio of VP length to RC length is lowest and should decrease as this ratio increases. Although spoken English is arguably a more direct reflection of online demands in sentence production than written English is, previous studies of grammatical weight have found similar effects in speech and writing (Wasow 1997: 99). Therefore, it was predicted that similar weight effects would show up in both spoken and written sentences in the ICE-GB corpus.

The category of predicate type was coded to see whether lexical and semantic properties of the RCE sentences were in line with previous findings and to test whether grammatical weight effects are independent of predicate type. Guéron (1980), Kuno and Takami (2004), Rochemont and Culicover (1990), and Takami (1999) have proposed that extraposition from subject NP is subject to certain semantic or pragmatic constraints that affect which predicate types occur with RCE. Based on this previous research, intransitive, unaccusative predicates were predicted to occur most frequently in RCE sentences because of the tendency for predicates of this type to represent old or backgrounded information and to serve a presentational function with respect to the information following the verb (see Rochemont and Culicover 1990: 65–68). Transitive predicates, which rarely serve this kind of function, were predicted to occur only infrequently with extraposition. However, following Rochemont and Culicover (1990: 65–68), it was expected that no predicate types should be completely excluded from occurring with extraposition because the relevant restriction is contextual rather than strictly lexical or semantic in nature. Finally, the effects of grammatical weight were predicted to hold independently of predicate type.

5.2. Results
Of the 391 sentences collected for this study, 332 (85%) had canonical structure, while 59 (15%) had extraposition. As predicted, extraposed RCs were significantly longer than the VP on average (12.36 vs. 3.44 words; t = 8.19, p < 0.01), while nonextraposed RCs were significantly shorter than the VP (8.37 vs. 12.94 words; t = 8.06, p < 0.01), as shown in Figure 3. Also as predicted, the proportion of sentences with extraposition consistently decreased as the ratio of VP length to RC length increased. While 91% of sentences for which the RC was at least five times longer than the VP (VP-to-RC ratio of 0.2 or less) involved extraposition, only 2% of sentences for which the RC was the same length or shorter than the VP (VP-to-RC ratio of 1.0 or greater) involved extraposition, as shown in
Figure 3. Mean length of VP and RC for sentences with extraposed vs. canonical RCs
Figure 4. Percentage of extraposed RCs by ratio of VP length to RC length
Figure 4. No examples of extraposition were found when the VP was more than 1.3 times longer than the RC. Thus, the weight of the RC in relation to the VP appears to be a strong predictor of extraposition. Similar results were found looking at VP length alone: 90% of sentences with a one-word VP involved extraposition while only 32% of sentences with four-word VPs had extraposition, and no examples of extraposition were found when the VP had more than 11 words. Results for RC length alone were less dramatic, but in the predicted direction. Only 12% of three- and four-word RCs were extraposed, but this increased to 33% for RCs of 15 words or more.

Results for discourse type (spoken vs. written) and predicate type were in line with previous findings. Similar to the results of Wasow's (1997: 99) study of weight effects in double object constructions, length of VP in relation to RC showed the same general pattern for speech and writing. As shown in Table 4, canonical RCs were shorter than the VP on average while extraposed RCs were longer than the VP on average for both spoken and written sentences.

Table 4. Mean VP and RC length (words) in spoken vs. written sentences

            Extraposition            Canonical
            VP length   RC length    VP length   RC length
Speech      3.60        12.88        12.95       6.92
Writing     3.32        11.97        12.94       9.31

Findings for predicate type were compatible with Rochemont and Culicover's (1990) observations. As shown in Figure 5, the two most common predicate types occurring with extraposition were passive (46%) and unaccusative (24%), together accounting for 70% of all RCE sentences. In contrast, transitive and copular predicates were the two most common predicate types occurring with canonical clauses. It is interesting that passives were more frequent than unaccusatives, since only unaccusatives had been discussed in previous work on RCE. This novel finding underscores an advantage of using quantitative corpus data, since previous studies used mostly constructed examples or individual attested examples (e.g., Huck and Na 1990, 1992; Rochemont and Culicover 1990; Takami 1998). However, these results are still in line with the prediction that unaccusatives should be the most frequent predicate type, provided that passive predicates are classified together with unaccusatives. Although passives and unaccusatives do not have exactly the same behavior, this categorization is justifiable based on the similar syntactic, semantic, and aspectual properties of passive and unaccusative predicates in English and other languages (cf. Burzio 1986) as well as their similar discourse functions.¹¹ As expected, there were no strict constraints, since all six predicate types were represented in both data sets (Figure 5). These results are consistent with Rochemont and Culicover's claim that the apparent restriction on predicate type is pragmatic in nature rather than strictly semantic, syntactic, or lexical (1990: 66–68). As shown in Table 5, the frequency of extraposition was lower than the frequency of canonical structure overall (15% vs. 85%), but it was much
11. For this study, (active) unaccusative predicates were identified based on their ability to occur with presentational there. Note that many passive predicates also allow presentational there:
    i. There appeared no evidence of any mistakes. (unaccusative)
    ii. There was found no evidence of any mistakes. (passive)
Figure 5. Predicate types for canonical and extraposition sentences
Table 5. Frequency of extraposed and canonical clauses by predicate type

Predicate type          Extraposed      Canonical        Total
Unaccusative/passive    37% (n = 41)    63% (n = 69)     100% (n = 110)
Other predicate types   6% (n = 18)     94% (n = 263)    100% (n = 281)
Total                   15% (n = 59)    85% (n = 332)    100% (n = 391)
higher with unaccusative/passive predicates (37%) than with other predicate types (6%). Thus, it appears that both predicate type and weight help predict the occurrence of extraposition. Furthermore, it appears that predicate type is independent of weight, since weight effects were apparent even when predicate type was held constant. Figure 6 shows the proportion of clauses with extraposition by weight ratio for all clauses containing passive or unaccusative predicates. Similar to the overall results shown in Figure 4, 100% (17 of 17) showed extraposition when the RC was at least five times longer than the VP, while only 8% (4 of 51) showed extraposition when the RC was the same length or shorter than the VP. Similar results were found when examining only the clauses containing other predicate types (i.e., all clauses except those with a passive or unaccusative predicate): 67% (4 of 6) showed extraposition when the RC was at least five times longer than the VP, while only 1% (2 of 190)
Figure 6. Percentage of extraposed RCs by ratio of VP length to RC length: Unaccusative and passive verbs only
showed extraposition when the RC was the same length or shorter than the VP.

Following the approach of Diessel (2008), a binary logistic regression analysis was conducted to confirm whether grammatical weight and predicate type were really independent factors predicting the occurrence of extraposition vs. canonical structure. The two independent variables included in the model were weight ratio (i.e., ratio of VP to RC length, a continuous variable) and predicate type (i.e., passive/unaccusative vs. all others, a binary categorical variable). Both factors were significant in predicting extraposition, with weight being the stronger of the two predictors: weight ratio (χ²(1) = 38.91, p < 0.01), predicate type (χ²(1) = 10.47, p < 0.01). In addition, there was no significant interaction between the two factors (χ²(1) = 0.28, p = 0.59), meaning that the effects of weight were independent of the effects of predicate type. Note that these results are consistent with an account in which discourse status is also independent of weight and perhaps a stronger predictor of extraposition than predicate type. However, since the corpus was not coded for discourse status, additional studies would be needed to confirm this possibility.

5.3. Discussion
Results of the corpus analysis showed a strong effect of RC weight in accordance with the domain minimization preferences of Hawkins' (2004) theory. Although RCE was relatively infrequent overall, at only about 15% of the sentences with subject-modifying RCs, extraposition was
strongly preferred in cases where the VP length (extraposition distance) was one or two words or where the RC was at least four times longer than the VP. In contrast, extraposition happened in only about 2% of cases in which the RC was the same length or shorter than the VP. Furthermore, in sentences where extraposition occurred, the RC was almost always longer than the VP and was more than three times longer on average. Conversely, in sentences with canonical structure, the VP was about 1.5 times longer than the RC on average.

Although the trends are clearly in the expected direction, these results are not exactly as the domain minimization preferences predict. In cases where the RC and VP were exactly the same length, for example, extraposition occurred in only 9% of sentences even though the PCDs for Subject NP and Matrix S domains would have been approximately equal. These results suggest that grammatical weight needs to be strongly skewed in favor of extraposition before speakers will use extraposition consistently. But of course grammatical weight is not the only relevant factor. As we have already seen, semantic and pragmatic constraints, as identified in previous work on RCE, limit the occurrence of extraposition. Since canonical structure tends to occur in a wider range of semantic and pragmatic contexts, we would expect canonical structure to occur more frequently in general. Indeed, we saw that the rate of extraposition went up to 37% with passive and unaccusative verbs, as compared with only 6% with other predicate types (Table 5). However, 37% is still a minority of cases, and most of those cases occurred when the RC was at least twice as long as the VP (Figure 6). Thus, even when considering only semantically felicitous unaccusative and passive predicates, the rate of extraposition was still lower than domain minimization principles would predict.
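The rates cited from Table 5 can be rechecked from the raw counts. The following is only a minimal sketch: the counts are copied from Table 5, while the dictionary layout and the function names are assumptions introduced for this illustration and are not part of the original analysis.

```python
# Counts copied from Table 5; the data layout and names are illustrative only.
TABLE_5 = {
    "unaccusative/passive": {"extraposed": 41, "canonical": 69},
    "other": {"extraposed": 18, "canonical": 263},
}


def extraposition_rate(counts):
    """Return the share of extraposed clauses among all clauses in one row."""
    total = counts["extraposed"] + counts["canonical"]
    return counts["extraposed"] / total


def overall_rate(table):
    """Pool all rows and return the overall share of extraposed clauses."""
    extraposed = sum(row["extraposed"] for row in table.values())
    total = sum(row["extraposed"] + row["canonical"] for row in table.values())
    return extraposed / total


for label, row in TABLE_5.items():
    print(f"{label}: {extraposition_rate(row):.0%}")
print(f"overall: {overall_rate(TABLE_5):.0%}")
```

Rounded to whole percentages, the three values agree with the 37%, 6%, and 15% reported in the text.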
An additional reason for the relative infrequency of RCE may be that RCE is in competition with other constructions besides the canonical structure. For example, RCE sentences with unaccusative main verbs can often be paraphrased using the presentational there construction, as illustrated in (11c).

(11) a. New sets that were able to receive all the TV channels soon appeared. (Canonical)
     b. New sets soon appeared that were able to receive all the TV channels. (Extraposition)
     c. There soon appeared new sets that were able to receive all the TV channels. (Presentational there)
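As a side illustration, the word-based weight measure used in the corpus study (ratio of VP length to RC length) can be applied to (11a) and (11b). This is a sketch only: treating "soon appeared" as the VP and "that were able to receive all the TV channels" as the RC is a segmentation assumed here, and the function names are hypothetical.

```python
def word_count(phrase):
    """Count whitespace-separated words; punctuation is assumed to be stripped already."""
    return len(phrase.split())


def vp_to_rc_ratio(vp, rc):
    """Ratio of VP length to RC length in words; lower values mean a relatively heavier RC."""
    return word_count(vp) / word_count(rc)


# Segmentation of (11a)/(11b) assumed for this illustration.
VP = "soon appeared"
RC = "that were able to receive all the TV channels"

ratio = vp_to_rc_ratio(VP, RC)
print(f"VP = {word_count(VP)} words, RC = {word_count(RC)} words, ratio = {ratio:.2f}")
```

Under this segmentation the RC is 4.5 times as long as the two-word VP. That places the sentence in the range where the corpus results above show extraposition to be strongly preferred.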
Presentational there serves a discourse function similar to that of extraposition, and, in addition, allows a heavy subject NP to occur at the end
of the sentence together with its modifiers.¹² Because the presentational there construction does not require extraposition and minimizes the NP and Matrix S domains at the same time, it may be preferable to either canonical structure or RCE in sentences with a heavy subject-modifying RC and an unaccusative main verb.¹³ Further investigations would be needed, however, to confirm this hypothesis.

A final reason why RCE only occurred consistently when the RC was much heavier than the VP might be related to conventionalized preferences for certain phrase structure configurations. Hawkins (1994: 90) observed a similar result for HNPS (Heavy NP Shift) in a corpus study of English: HNPS only applied in about 30% of cases for which the NP was longer than the PP and only occurred consistently when the NP was at least five words longer than the PP. Hawkins (1994) proposed that the large weight difference required to induce a preference for Heavy NP Shift (in contrast to the smaller difference required to show a preference for placing a heavy PP at the end in a V-PP-PP sequence) may be due to a conventionalized preference for verb-object order in English phrase structure. A similar explanation could apply to the present case as well, since adjacency between a head noun and its modifying RC is also a highly conventional property of English phrase structure, with extraposition requiring special syntactic and interpretive rules. It is not clear, however, where such an adjacency preference would come from. One possibility is that the preference reflects speakers' sensitivity to stylistic considerations, such as the prescriptive rule banning misplaced modifiers. Alternatively, an adjacency preference could be simply the result of biases developed through encountering the canonical structure much more frequently in language use: a case of infrequent use of RCE perpetuating infrequent use of RCE.
This would then be similar to the idea of frequency bias used to help explain the acceptability results in Section 4. Finally, the adjacency preference might be encoded in the grammar itself. Hawkins (1994: 89–90) proposed that canonical verb-object order is licensed by the phrase structure grammar whereas HNPS may be licensed only by weight-based principles. I believe that all of these possibilities may be at least partially correct, and I will suggest in Section 6 that such
12. The underlined portion of (11c) is not the surface grammatical subject, but it is the logical subject in that it represents the single argument of a one-place predicate (with there being an expletive subject) and corresponds to the grammatical subject of the canonical sentence in (11a).
13. Presentational there is restricted to occur with unaccusative predicates, and unlike RCE is usually infelicitous with other types of predicates regardless of pragmatic context (Rochemont and Culicover 1990: 66).
an adjacency preference might best be stated in the grammar in terms of a default constraint on the syntax-semantics interface, as proposed by Culicover and Jackendoff (2005).

6. General discussion and conclusions
RCE from Subject NP involves a discontinuous dependency which deviates from the normal X-bar structure of phrases (at least on the surface), adding complexity to the grammar with little or no effect on the meaning of the sentence. In addition, RCE is subject to a prescriptive rule banning misplaced modifiers and is singled out in some style guides as an example of poor writing (e.g., Trenga 2006: 56). From this perspective, it is perhaps surprising that RCE occurs naturally in both formal and informal language use. While it is known from previous research that discourse information structure plays a role in licensing this construction, the current study shows that grammatical weight is also an important factor in determining how and when speakers use RCE.

Furthermore, the current study provides evidence that the increased incidence of RCE with heavy RCs is related to processing efficiency. Hawkins' (2004) theory of domain minimization predicts that the degree to which RCE is preferred in both comprehension and production should be correlated with the degree to which RC length exceeds VP length. In support of this hypothesis, corpus analyses reported here showed that the incidence of RCE increased as the proportion of VP to RC length decreased and that RCE was strongly preferred over canonical structure when the RC was four or more times longer than the VP. Incidence of RCE was lower overall than was predicted by domain minimization principles alone, but this can plausibly be explained by semantic and pragmatic constraints on RCE, competition from other non-canonical sentence types, and perhaps a conventionalized preference for adjacency between a head noun and its modifying clause. Importantly, a logistic regression analysis showed that weight was a significant predictor of extraposition and that the effects of weight were independent of predicate type.
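The regression equation itself is not spelled out in the text. One standard way to write a binary logistic model with the two predictors and their interaction, offered here only as a sketch, is given below. The symbols are labels introduced for this illustration: r stands for the VP-to-RC weight ratio, u is 1 for passive/unaccusative predicates and 0 otherwise, and β₀ to β₃ are the coefficients to be estimated.

```latex
\[
\log \frac{P(\text{extraposition})}{1 - P(\text{extraposition})}
  = \beta_0 + \beta_1 r + \beta_2 u + \beta_3 (r \times u)
\]
```

On this formulation, the independence of weight and predicate type corresponds to the interaction coefficient β₃ not differing reliably from zero, which is how the nonsignificant interaction reported in Section 5.2 would be read.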
Also in support of Hawkins' proposals, reading time results showed that when the RC was three times heavier than the VP (the heavy condition), RCE was processed significantly faster than canonical structure. Importantly, lexical content and information structure were controlled in the reading time experiment and only grammatical weight was manipulated, meaning that there was an effect of weight independent of lexical and informational factors. The results of the acceptability judgment task also supported the idea of domain minimization, though perhaps not as clearly as the results for
reading time and corpus analysis. Results showed that canonical structure was rated higher than RCE overall, but that this preference disappeared when the RC was heavy. Although RCE was not the preferred structure in the heavy condition, the effect of grammatical weight was still in the predicted direction and was very similar to the effect shown in previous acceptability judgment studies of German RCE (Konieczny 2000; Uszkoreit et al. 1998). It may be that the lack of preference for RCE in the heavy condition is related to the nature of the judgment task, since asking for judgments of isolated sentences may serve to highlight the fact that RCE deviates from the canonical (and much more frequent) phrase structure pattern in which the relative clause is adjacent to its head noun and also violates a prescriptive rule. In contrast, the self-paced reading task (which supported Hawkins' predictions more clearly) did not ask participants to make any evaluative judgments and may be viewed as a more direct measure of processing efficiency.

The overall pattern of these results strongly suggests that grammatical weight is a gradient factor related to processing efficiency rather than part of a categorical rule of grammar. Nevertheless, these results also have implications for the syntactic analysis of RCE. For example, one finding of the corpus analysis was that RCE never occurred in cases where the VP was more than 1.3 times longer than the RC. Similarly, in the acceptability judgment experiment, RCE sentences in the light and medium conditions were rated significantly lower than canonical sentences. If one were to ignore the gradient effects of weight that were also shown in these studies, this might give the appearance of a categorical constraint in which weight is somehow specified as a binary syntactic feature.
Although weight (or heaviness) is not a feature of any of the major theories of RCE syntax (unlike for some analyses of Heavy NP Shift, for example), syntactic locality conditions such as Subjacency are. Given the controversial nature of locality conditions in the literature on extraposition (cf. Baltin 2006) and the heavy reliance on constructed examples within this body of research, there is a real danger that gradient effects related to processing might be mistaken for categorical syntactic rules. Strunk and Snider (2008) in fact provide systematic evidence that syntactic locality conditions on RCE from Object NP are gradient rather than categorical. In their acceptability experiments on RCE from Object NP in English and German, they found that many Subjacency-violating sentences were judged no worse than non-violating sentences and that the Subjacency effects that did show up were gradient in nature, depending on the depth of embedding of the extraposed RC's antecedent. In addition, Strunk and Snider (2008) report a corpus study of RCE in German showing that Subjacency-violating sentences occurred naturally in discourse and that
the effects of syntactic locality on frequency of RCE were gradient, patterning similarly to the effects of grammatical weight in Uszkoreit et al.'s (1998) corpus study of German and the current study of English. Thus, if structurally-defined locality conditions are gradient and reside outside the syntactic component of grammar, the syntax of RCE may be simpler than has been assumed in most generative approaches to extraposition. The current study does not speak to this issue directly, but provides a pattern of data to look for in determining whether a proposed constraint is gradient and performance-based, or whether it is more accurately stated as a categorical syntactic rule.¹⁴

An additional implication for the grammatical analysis of RCE comes from the reading time results. These results suggest that the dependency between the RC and its head noun is processed similarly to the dependency between the verb and its subject: both are subject to weight-based locality effects that can be measured in terms of IC-to-word ratios. Following Hawkins (1994, 2004), the ratios are calculated as though the head noun and its extraposed RC were part of the same NP just as the subject NP and its predicate are part of the same clause. This is reminiscent of McCawley's (1987, 1998) analysis of RCE as involving a single, discontinuous NP constituent. While McCawley's proposal is probably too strong a conclusion on the basis of these results, the similarity of the two dependencies suggests that extraposing the RC is not, in itself, a significant source of additional complexity. Rather, what contributes to complexity is the cost of integrating information across a distance.
Culicover and Jackendoff's (2005: 166–167) approach is appealing from this perspective because it states the usual correspondence between syntactic constituency and semantic modification as a default pattern (soft constraint) that is violated in RCE rather than trying to reconcile RCE with X-bar structure through some kind of rightward or leftward syntactic movement. Hawkins' (1994) idea of a conventionalized grammatical preference for adjacent word order can perhaps best be thought of in terms of such a default constraint. Rather than claiming, as Hawkins (1994: 89) does for HNPS, that the non-canonical word order pattern is not licensed by the grammar at all, I would propose instead that RCE is licensed by
14. Following the general approach of Culicover and Jackendoff (2005), my assumption is that a categorical syntax exists, but need not include rules for patterns that are predictable from semantics, pragmatics, processing, or other non-syntactic factors. Another possibility I have not considered here is that of gradient rules within the syntax (cf. Featherston 2005). Since grammatical weight is evidently a performance-based phenomenon in the current study, I will not address this possibility any further here. See Wasow (2002: 115–158) for relevant discussion.
the grammar, but that it violates a default constraint which specifies the preferred syntax-semantics correspondence. Such a constraint violation captures the intuition that canonical ordering is the more basic option, as reflected in corpus frequency and acceptability judgments, but does so without complicating the syntactic structure (extraposed RCs are simply clausal adjuncts in syntax). Although the current study does not directly argue in favor of a particular syntactic analysis, the reading time results, which showed no preference for canonical structure, lend themselves nicely to such an approach.

In sum, the results of all three experiments (self-paced reading, acceptability judgment, and corpus analysis) suggest that grammatical weight plays a significant role in licensing RCE from Subject NP in English. Grammatical weight is therefore helpful for explaining why RCE is permitted by the grammar and preferred in certain contexts of language use, despite the discontinuous dependency that is incurred.

Appendix A: Instructions for Experiment 1

You will be presented with a series of sentences, each followed by a statement. Your first task is to read the sentence and then press the left button as soon as you have understood the sentence. Following each sentence, you will be presented with a simple statement. Your second task is to decide whether the statement is true or false based on the information in the sentence you just read. To make your response, press the left button for true and the right button for false. In making your decision, use only the information contained in the sentence itself. Avoid making any inferences beyond the actual content of the sentence. Please make your responses as quickly and accurately as possible. After you have made your selection, there will be a brief pause and the next sentence will appear. There will be four sets of sentences. Following each set, the computer will prompt you to take a short break.
After you have rested, you may press the space bar to continue with the next set. When the last set is finished, you will be prompted to inform the experimenter. Any questions? Please place your index and middle finger on the response pad to select the left button for true, right button for false. You may press the spacebar with your other hand when you are ready to begin.

Appendix B: Instructions for Experiment 2

Please read each of the sentences listed below. For each sentence, we would like you to indicate your reaction to the sentence. Mark your
response sheet by circling a number from (1) to (9). Use (9) for sentences that seem fully normal, and understandable to you. Use (1) for sentences that seem very odd, awkward, or difficult for you to understand. If your feelings about the sentence are somewhere between these extremes, use one of the middle responses from (2) to (8). Please try to use the entire scale, not just the end points of the scale. THERE ARE NO RIGHT OR WRONG ANSWERS. Please base your responses solely on your gut reaction, not on rules you may have learned about what is proper or correct English. Please work straight through the survey and DO NOT turn back to a page after you have completed it. You will have the opportunity to take three short rest breaks during the survey. Rest breaks are indicated at the bottom of certain pages. For example, you may encounter sentences like the following in the survey:

                                                                       Worst                  Best
We persuaded there to be strike.                                       1  2  3  4  5  6  7  8  9
Phil wrote a poem for his mother in honor of her fortieth birthday.    1  2  3  4  5  6  7  8  9
She asked that whether she should come to the party.                   1  2  3  4  5  6  7  8  9
Please turn to the following page when you are ready to begin.
References
Arnold, Jennifer E., Thomas Wasow, Ash Asudeh & Peter Alrenga. 2004. Avoiding attachment ambiguities: The role of constituent ordering. Journal of Memory and Language 51. 55–70.
Arnold, Jennifer E., Thomas Wasow, Anthony Losongco & Ryan Ginstrom. 2000. Heaviness vs. newness: The effects of structural complexity and discourse status on constituent ordering. Language 76(1). 28–55.
Baltin, Mark. 1981. Strict bounding. In Carl L. Baker & John McCarthy (eds.), The logical problem of language acquisition, 257–295. Cambridge, MA: The MIT Press.
Baltin, Mark. 2006. Extraposition. In Martin Everaert & Henk van Riemsdijk (eds.), The Blackwell companion to syntax, vol. 2, 237–271. Malden, MA: Blackwell.
Behaghel, Otto. 1909. Beziehungen zwischen Umfang und Reihenfolge von Satzgliedern. Indogermanische Forschungen 25. 110–142.
Bianchi, Valentina. 2002. Headed relative clauses in generative syntax: Part I. Glot International 6(7). 197–204.
Bresnan, Joan W. 2006. Is syntactic knowledge probabilistic? Experiments with the English dative alternation. In Sam Featherston & Wolfgang Sternefeld (eds.), Roots: Linguistics in search of its evidential base, 75–96. Berlin: Mouton de Gruyter.
Burzio, Luigi. 1986. Italian syntax. Dordrecht: Reidel.
Cheung, Ki Shun Antonio. 2006. Processing factors in language comprehension and production: The case of Cantonese dative constructions. Hong Kong: University of Hong Kong MPhil thesis.
Culicover, Peter W. & Ray S. Jackendoff. 2005. Simpler syntax. Oxford: Oxford University Press.
Diessel, Holger. 2008. Iconicity of sequence: A corpus-based analysis of the positioning of temporal adverbial clauses in English. Cognitive Linguistics 19(3). 457–482.
Featherston, Sam. 2005. That-trace in German. Lingua 115(9). 1277–1302.
Gibson, Edward. 1998. Linguistic complexity: Locality of syntactic dependencies. Cognition 68(1). 1–76.
Givón, Talmy. 1983. Topic continuity in discourse: Quantitative cross-language studies (Typological Studies in Language 3). Amsterdam: John Benjamins.
Guéron, Jacqueline. 1980. On the syntax and semantics of PP-Extraposition. Linguistic Inquiry 11. 637–678.
Hawkins, John A. 1994. A performance theory of order and constituency. Cambridge, UK: Cambridge University Press.
Hawkins, John A. 2001. Why are categories adjacent? Journal of Linguistics 37(1). 1–34.
Hawkins, John A. 2004. Efficiency and complexity in grammars. Oxford: Oxford University Press.
Huck, Geoffrey J. & Younghee Na. 1990. Extraposition and focus. Language 66(1). 51–77.
Huck, Geoffrey J. & Younghee Na. 1992. Information and contrast. Studies in Language 16(2). 325–334.
Kayne, Richard S. 1994. The antisymmetry of syntax. Cambridge, MA: MIT Press.
Konieczny, Lars. 2000. Locality and parsing complexity. Journal of Psycholinguistic Research 29(6). 627–645.
Kuno, Susumu & Ken-ichi Takami. 2004. Functional constraints in grammar: On the unergative-unaccusative distinction. Amsterdam: John Benjamins.
Lohse, Barbara L., John A. Hawkins & Thomas Wasow. 2004. Domain minimization in English verb-particle constructions. Language 80(2). 238–261.
Matthews, Stephen & Louisa Y. Y. Yeung. 2001. Processing motivations for topicalization in Cantonese. In Kaoru Horie & Shigeru Sato (eds.), Cognitive-functional linguistics in an East Asian context, 81–102. Tokyo: Kurosio.
McCawley, James D. 1987. Some further evidence for discontinuity. In Geoffrey J. Huck & Almerindo E. Ojeda (eds.), Discontinuous constituency (Syntax and Semantics 20), 185–200. New York: Academic Press.
McCawley, James D. 1998. The syntactic phenomena of English. Chicago: University of Chicago Press.
Nelson, Gerald, Sean Wallis & Bas Aarts. 2002. Exploring natural language: The British component of the International Corpus of English. Amsterdam: John Benjamins.
Quirk, Randolph, Sidney Greenbaum, Geoffrey Leech & Jan Svartvik. 1972. A grammar of contemporary English. London: Longman.
Rochemont, Michael S. & Peter W. Culicover. 1990. English focus constructions and the theory of grammar. Cambridge: Cambridge University Press.
Ross, John R. 1967. Constraints on variables in syntax. Cambridge, MA: Massachusetts Institute of Technology dissertation.
Siewierska, Anna. 1993. Syntactic weight vs. pragmatic factors and word order variation in Polish. Journal of Linguistics 29(2). 233–265.
Stallings, Lynne M., Maryellen C. MacDonald & Padraig G. O'Seaghdha. 1998. Phrasal ordering constraints in sentence production: Phrase length and verb disposition in Heavy-NP Shift. Journal of Memory and Language 39. 392–417.
Strunk, Jan & Neal Snider. 2008. Extraposition without subjacency. Paper presented at the 30th Annual Convention of the German Society of Linguistics (DGfS 2008), Bamberg, Germany, 27–29 February.
Takami, Ken-ichi. 1999. A functional constraint on Extraposition from NP. In Akio Kamio & Ken-ichi Takami (eds.), Function and Structure, 23–56. Amsterdam: John Benjamins.
Trenga, Bonnie. 2006. The curious case of the misplaced modifier: How to solve the mysteries of weak writing. Cincinnati, OH: Writer's Digest Books.
Uszkoreit, Hans, Thorsten Brants, Denys Duchier, Brigitte Krenn, Lars Konieczny, Stephan Oepen & Wojciech Skut. 1998. Studien zur performanzorientierten Linguistik: Aspekte der Relativsatzextraposition im Deutschen. In CLAUS Report No. 99, 1–14. Saarbrücken: Universität des Saarlandes.
Vasishth, Shravan & Richard L. Lewis. 2006. Argument-head distance and processing complexity: Explaining both locality and antilocality effects. Language 82(4). 767–794.
Wasow, Thomas. 1997. End-weight from the speaker's perspective. Journal of Psycholinguistic Research 26(3). 347–361.
Wasow, Thomas. 2002. Postverbal behavior. Stanford: CSLI Publications.
Yamashita, Hiroko. 2002. Scrambled sentences in Japanese: Linguistic properties and motivations for production. Text 22(4). 597–633.
Yamashita, Hiroko & Franklin Chang. 2001. "Long before short" preference in the production of a head-final language. Cognition 81. B45–B55.
Magari*
FRANCESCA MASINI and PAOLA PIETRANDREA
Abstract
We propose a constructionist approach to the polyfunctionality of the Italian focus particle magari (roughly corresponding to 'maybe', but also 'I wish'). The sheer syntactic versatility of this word leads us to detect its formal regularities at the level of discourse configurations. This level of analysis, identified within the French linguistic tradition, is defined by the maintenance of a predicate-argument-adjunct structure in discourse. The salient feature of discourse configurations is their shape, which can be described by referring to a number of topological patterns: lists of elements in the same syntactic position, repetition of syntactic structures, shifting of elements from a post-verbal to a pre-verbal position and so on. These topological patterns are meaningful to an extent and they are eligible to be regarded as constructions. Magari is shown to be regularly associated with a general topological pattern, namely a list of items that occupy the same syntactic position as the item focused by magari. Each semantic function of magari correlates with one particular kind of list. These associations of a form (the different types of lists) and a meaning (the functions of magari) are shown to be related to one another by means of inheritance links.
* Correspondence address: Dipartimento di Linguistica, Università Roma Tre, Via Ostiense 236, 00146 Rome, Italy. E-mails: <fmasini@uniroma3.it>; <pietrand@uniroma3.it>. Acknowledgements: The research was carried out within a project on the topology of grammatical meaning in discourse constructions (Topogram, Università Roma Tre). We would like to express our gratitude to Claire Blanche-Benveniste, Elisabetta Bonvino, Claudio Iacobini, Sylvain Kahane, Alessandro Lenci, Henning Nølke and Raffaele Simone, who read previous drafts of this paper and provided invaluably helpful comments and suggestions. Special thanks go to Ewa Dąbrowska, Editor-in-Chief at Cognitive Linguistics, who guided us through a very fair and stimulating revision process, and to an anonymous Associate Editor and two anonymous referees, who considerably improved our paper with their comments. The usual disclaimers apply. Cognitive Linguistics 21–1 (2010), 75–121. DOI 10.1515/COGL.2010.003 0936–5907/10/0021–0075 © Walter de Gruyter
Keywords: non factuality; focus particle; construction grammar; discourse configuration; topology; lists.
1. The polyfunctionality of magari
The word magari has attracted considerable attention among Italian linguists because of its especially intriguing polyfunctionality that knows no parallel in its counterparts in other European languages (cf. Arcaini 1997, 2000; Licari and Stame 1989; Schiemann 2008). Firstly, magari can have the function of a general marker of non factuality. In this case, it roughly corresponds to the English adverb maybe. See example (1):

(1) Magari è a casa
    'Maybe (s)he is at home'
Magari can also function as a scalar operator (in the sense of Fillmore et al. 1988 and Kay 1990), triggering a scale of non factuality whose extreme position is occupied by the constituent in the focus of magari. See (2): (2) Bisognerebbe negoziare una tregua, un armistizio, magari la pace It would be necessary to negotiate a ceasere, an armistice and maybe peace
Besides, magari may act as a non factual concessive marker, as in (3), where the speaker concedes that the subject is clever despite thinking that he has not studied enough: (3) ` ` Magari e intelligente, ma non e abbastanza preparato He might be clever, but he has not studied enough
In imperatival contexts, magari weakens the illocutionary force of the order, as in the following example: (4) Magari parlagliene tu! Perhaps you yourself could talk to him about it!
Finally, magari functions as an optative marker. This happens when it occurs in exclamative contexts: (5) Vorrei tanto vedere un lm come quello. Magari ne facessero ancora! I really would like to watch a movie like that. I wish they still made some! A: Vuoi un po di riposo? Would you like to rest a bit?
(6)
Magari B: Magari! Id love to! / I wish I could!
77
2. The problem
The problem that arises is: how can we account for the polyfunctionality of magari? First of all, one should decide whether the functions of magari are somehow related to each other or are completely independent, i.e., homonymous. There is a good reason for rejecting the latter hypothesis, namely: the set of functions held by magari (non factual, scalar non factual, non factual concessive, imperative, optative) recalls in most respects the semantic network developed by several irrealis markers of non factuality in various non-European languages (cf., e.g., Elliott 2000; Lazard 1998). The crosslinguistic presence of similar semantic networks makes it fairly unlikely that we are dealing with pure homonymy. We therefore consider the various functions of magari as microsenses of this word, that is, 'distinct sense units [...] that occur in different contexts and whose default construals stand in a relation of mutual incompatibility at the same hierarchical level' (Croft and Cruse 2004: 126–127). Under this perspective, the word magari has a hyperonymic reading and a cluster of hyponymous readings, whose default construals are 'sister incompatibles' (Croft and Cruse 2004: 127).
The question now arises of identifying the contexts that license the various functions of magari and the nature of the relations holding between these functions. Such a task is made more complicated by the sheer syntactic versatility of magari. Indeed, the contexts in which magari occurs can be properly detected only by adopting a wide-ranging notion of context. In Section 3 we describe the practical difficulties encountered in the analysis of magari and the theoretical approach and tools adopted for solving them, whereas in Section 4 we provide a qualitative and quantitative description of magari and its various functions. In Section 5, we give a construction grammar account of our findings.
3. The theoretical approach
3.1. Construction grammar
A fruitful theoretical approach to the kind of problem outlined in the previous section is to place the analysis of magari in the wide framework of construction grammar. As is well known, construction grammar comprises a number of different models (cf., among others, Croft 2001; Fillmore et al. 1988; Goldberg 1995, 2006; Kay and Fillmore 1999), which nevertheless share a set of basic tenets. The main tenet regards the very notion of construction, which is defined as a conventionalized association of a form and a meaning and is considered the basic unit of linguistic analysis. This definition captures virtually every meaningful unit of language, ranging from simple words to more complex and abstract sentence-level structures, such as argument structure constructions (Goldberg 1995) or sentence types (Michaelis and Lambrecht 1996). It is thus clear that certain higher-level abstract patterns, those that are commonly considered as context for lower-level lexically specified units, may be treated as full linguistic objects in this framework, provided that they prove to be meaningful to some extent.
Another assumption of constructionist approaches that is crucial for our purposes is that they take into account not only syntactic and semantic information, but also lexical and/or pragmatic information (Kay 1990: 61) and that all this information is coded simultaneously in the construction and contributes to characterizing the construction itself. This provides the tools for the detection of the correlations between certain contexts, or rather constructions, and the various functions of magari. This latter theoretical issue has been recently addressed by Fried (2007), who has convincingly argued that the relations between the different functions of the same polyfunctional lexical unit are better understood if one takes into account the entire construction in which they occur, rather than the single item under examination. Therefore, the entire construction becomes the true linguistic form to be investigated.
Still another aspect of constructionist approaches, and in particular of Goldberg's cognitive construction grammar (Goldberg 2006), that will prove useful in our analysis is the use of inheritance links, which account for the relations holding among constructions.
The inheritance system works this way: if construction A shares some formal properties with construction B, then construction A also shares some semantic properties with B, and the two constructions are related by an inheritance link. As is well known, according to Goldberg (1995: 75ff.), there are four major types of inheritance links: polysemy links (IP), subpart links (IS), instance links (II) and finally metaphorical extension links (IM).1 Given this framework, all the possible abstract constructions that host the adverb magari can be regarded as constructions that share at least one formal property, namely the presence of magari, and that are eligible to be linked to one another at the representational level. Therefore, once we have identified the form of the constructions in which magari occurs, we will have to determine which kind of inheritance links connect these various constructions to one another. First of all, though, we have to address the formal analysis of the contexts of magari, which is not unproblematic.

1. Polysemy links (IP) capture the nature of the semantic relations between a particular sense of a construction and any extensions from this sense (Goldberg 1995: 75); subpart links (IS) are posited when one construction is a proper subpart of another construction and exists independently (1995: 78); instance links (II) are posited when a construction is a more fully specified version of the construction it is linked to (1995: 79); finally, metaphorical extension links (IM) are posited when two constructions are found to be related by a metaphorical mapping (1995: 81).

3.2. A practical difficulty
A practical difficulty in the formal analysis of magari concerns the above-mentioned syntactic versatility of this word, which makes it particularly hard to define its contexts of occurrence. Indeed, magari occurs in every illocutionary act: assertions (cf. examples (1) to (3)), orders (4), exclamations (5). It can also occur in questions, as exemplified in (7):2
(7) Non potrebbe essere uscito con un amico? Non sarà magari con suo fratello?
    'Don't you think he might have gone out with a friend? Couldn't he be with his brother?'
What is more, the categorial status of magari is not easy to define: it can be used either as an interjection (6) or as an adverb (cf. examples (1) through (5) and (7)). Occasionally, and retaining some of the semantic properties of its adverbial function, it can also be understood as a clause connective, as exemplified in (8):
(8) Magari un po' debolina, magari me la sono immaginata, magari è solo un effetto ottico . . . Ma vi giuro che l'ho vista [Web]
    'Maybe a bit feeble, maybe I dreamt it, maybe it's just an optical effect, but I swear I saw it'
2. The analysis presented in this article was carried out with the aid of real examples. In particular, we made use of corpora of contemporary written and spoken Italian, respectively the la Repubblica corpus [laR] and the Lessico di frequenza dell'italiano parlato [LIP] (see Section 4 for further information about these corpora); in addition, we took examples from the Web [Web] and from contemporary novels. Examples taken from these corpora or from the Web are marked with the corresponding abbreviation in square brackets at the end of the example. Examples taken from novels include the full reference of the novel. Intuition-based examples, by contrast, carry no indication.
As an adverb, magari can have scope on units of different size and category. First, it can have both a clausal scope, as in (1), (3), (4) and (5), and a phrasal scope, as in examples (2), (7) and (9) to (11):
(9) È un piacere venir qui e vedere tutta questa gente che si commuove per me e che magari ha pianto per la mia vittoria [Web]
    'It's nice to come here and see all these people who are moved because of me and who, maybe, cried for my victory'
(10) Gli aerei piccoli e molto utilizzati, con personale poco pagato e magari stanco, non possono dare la massima tranquillità né agli utenti né ai sindacati [Web]
    'Small and thoroughly exploited aircraft, with a badly paid and possibly tired staff, cannot fully reassure either users or trade unions'
(11) Si discute magari male, ma sempre molto a lungo
    'We discuss perhaps badly, but always at great length'
When magari has scope on a phrase, the latter can be a verb phrase (9), an adjectival phrase (10), an adverbial phrase (11), a prepositional phrase (7) and even a noun phrase as in (2). Besides, magari is endowed with an almost unrestrained syntactic mobility: it can in fact occur at every major phrasal boundary. For example, if we consider the proposition in (12), we may have the patterns in (13):
(12) Luigi è venuto
     'Luigi has come'
(13) a. Magari Luigi è venuto
     b. Luigi magari è venuto
     c. Luigi è venuto, magari
Some regularities can be easily detected even in such a complex picture. For example, the optative use of magari is preferably expressed with an interjection in exclamative contexts. The imperative use is associated with orders. Nevertheless, some difficulties remain in detecting the relevant context associated with the non factual, scalar and concessive uses of magari. The assertive context, for example, appears to be associated with all these functions. The size and category of the unit within the scope of magari is not a relevant factor in determining the function of this word. For example, we have shown above that magari retains the same scalar function whether it has scope on a verb phrase (9), an adjectival phrase (10) or a nominal phrase (2). Such a function is also compatible with a clausal scope of magari, as shown in (14):
(14) Forse è venuto ieri, ha passato qui tutto il pomeriggio e magari si è fermato a dormire
     'Perhaps he came yesterday, he spent the whole afternoon here and possibly he stopped for the night'
Not even the distribution of magari within the clause seems to be a relevant parameter to detect its correct function. For instance, when it has a sentential scope, magari may have the same concessive function whether it fronts the clause (15), is pre-verbal (16) or, finally, post-verbal (17):
(15) Magari Luigi ha sbagliato, ma io non me ne sono accorta
     'Luigi might have made a mistake, but I couldn't spot it'
(16) Luigi magari ha sbagliato, ma io non me ne sono accorta
(17) Luigi ha sbagliato, magari, ma io non me ne sono accorta
This syntactic versatility of magari makes it far from trivial to identify the structural constraints that characterize the maximally abstract magari construction (the hyperonymic magari) and all other sub-constructions. As things stand, we could simply propose that, in all its functions, magari can be described as the only lexically specified part of a maximally abstract construction in which the only relevant information regards the internal make-up, i.e., the presence of the unit magari with its phonetic properties (even the information about its categorial status is underspecified, since it may behave both as an adverb and as an interjection). Such a characterization is obviously largely unsatisfactory for our purposes.

3.3. Theoretical tools: Discourse configurations, topological structures, topological patterns
The difficulty described in the previous section has led us to look for tools that could help in better defining the trans-categorial and trans-level nature of magari. Within the French linguistic tradition, a level of analysis has been identified that is called configuration de discours ('discourse configuration') (cf., among others, Blanche-Benveniste 1993, 1997; Blanche-Benveniste et al. 1979, 1990; Gerdes and Kahane in print). In order to define discourse configurations, we assume as a primitive what Blanche-Benveniste et al. (1979) called construction maximale ('maximal construction'), i.e., the predicate-argument-adjunct structure. The predicate-argument-adjunct structure is hardly ever realized together in a sequence in discourse. More frequently, it is gradually built by means of repetitions, rewordings, additions, and other kinds of insistences on one or more of its positions. So, for example, the predicate-argument-adjunct structure in (18) may be realized as in (19) as well as in (20):3
(18) ADJ1-ARG1-PRE-ARG2
(19) Forse chissà io ho scelto il momento sbagliato
     'Maybe, who knows, I have chosen the wrong moment'
     1 | forse 'maybe'      |        |                         |
     2 | chissà 'who knows' | io 'I' | ho scelto 'have chosen' | il momento sbagliato 'the wrong moment'
       | ADJ1               | ARG1   | PRE                     | ARG2
(20) Magari lui rincorre un sogno, un'utopia, un ideale qualunque
     'Maybe he pursues a dream, a utopia, an ideal whatsoever'
     1 | magari 'maybe' | lui 'he' | rincorre 'pursues' | un sogno 'a dream'
     2 |                |          |                    | un'utopia 'a utopia'
     3 |                |          |                    | un ideale qualunque 'an ideal whatsoever'
       | ADJ1           | ARG1     | PRE                | ARG2
A given predicate-argument-adjunct structure can also be instantiated more than once in discourse. For example, the spoken sequence in (21) features two repetitions of the ADJ1-ARG1-PRE-ARG2 structure, beside the multiple instantiations of the ARG1 and ARG2 positions:
(21) praticamente per ogni tipo di gioco c'era un edificio specifico. Per esempio il circo serviva alle corse dei carri, l'anfiteatro alle lotte dei gladiatori, lo stadio per i giochi atletici
     'In practice, for every kind of game there was a specific building. For example the circus was for the chariot races, the amphitheatre for the gladiator fights, the stadium for athletic games' [from Bonvino 2005: 61]
     1 | praticamente 'in practice' | per ogni tipo di gioco 'for every kind of game' | c'era 'there was' | un edificio specifico 'a specific building'
     2 | per esempio 'for example'  | il circo 'the circus'                           | serviva 'was'     | alle corse dei carri 'for the chariot races'
     3 |                            | l'anfiteatro 'the amphitheatre'                 |                   | alle lotte dei gladiatori 'for the gladiator fights'
     4 |                            | lo stadio 'the stadium'                         |                   | per i giochi atletici 'for athletic games'
       | ADJ1                       | ARG1                                            | PRE               | ARG2

3. Apart from the ARG, ADJ and PRE abbreviations for argument, adjunct and predicate, respectively, we sometimes use other labels, namely: ASP (aspectual element), CAUSE (causative element) and MOD (modal element). Besides, note that the translations of the examples throughout the paper are deliberately as literal as possible in order to facilitate the grid representation (see below).
The chunk made up of the sequence of units that instantiate or repeat a given predicate-argument-adjunct structure is called a discourse configuration (Pietrandrea 2008a). As this definition makes clear, discourse configurations are objects defined in purely syntactic terms. Interestingly, though, they may carry semantic content. Let us consider the following example:
(22) Io mangio. Il mondo crolla
     'I eat. The world collapses'
In this case there is a discourse configuration defined by one repetition of the syntactic structure ARG-PRE. The two sentences making up the discourse configuration depict totally unrelated situations. Still, the syntactic parallelism between the two sentences forces the addressee to find a semantic relation between them: in this case a relation of contrast. Interestingly, when the same situations are depicted by two contiguous but distinct structures, which hence do not form a proper discourse configuration, the addressee is not invited to postulate a relation between the two situations:
(23) ??Sono io che mangio. Il mondo crolla
     '??It is me who eat. The world collapses'
Discourse configurations can be more or less extended: while the discourse configurations in (19) and (20) are limited to a clause, the discourse configuration in (21) spans an entire text. Therefore, crucially, discourse configurations are defined regardless of the boundary between the clausal and the supra-clausal level.
The discourse configurations from (19) through (21) are represented in grids, i.e., through a rewriting procedure elaborated mainly by Blanche-Benveniste and colleagues (1979), Bilger (1982), Blanche-Benveniste and colleagues (1990) and Bilger and colleagues (1997), and by Bonvino (2005) and Pietrandrea (2008a) for Italian. This rewriting procedure consists in a representation of the speech flow on a bi-dimensional plane and is constrained by three rules: (i) the horizontal axis of the plane should feature the sequence of the positions that define the predicate-argument-adjunct structure; (ii) the vertical axis should list all the actual realizations within each position; (iii) a left-to-right and top-down reading of the string contained in the grid should render the linear order of the represented chunk. This representation highlights an important fact, namely that the salient feature of discourse configurations is their shape and not the categories they are made up of. We refer to the shape of a discourse configuration with the term topological structure. Such a topological structure can be described by referring to a number of bi-dimensional topological patterns: lists of elements in the same position, repetitions of syntactic structures, chiasms of elements shifting from a pre-verbal to a post-verbal position (or vice versa) and so on.4
It should be noted that units belonging to different levels and categories may enter the same topological structure. For example, the two discourse configurations used as answers in (24) and (25) are made up of clauses and nominal constituents respectively. Nevertheless, they present the very same topological structure, characterized by a list of instantiations of the rightmost position, whose last element is preceded by magari (cf. Section 4.4 for more details on this structure).
4. For example, the discourse configuration in (21) is characterized by two lists of arguments in the ARG1 and ARG2 positions, one repetition of syntactic structure (line 1 and line 2) and a chiasm between the first two realizations of the ARG1 and ARG2 positions. The pre-verbal hyperonym in the ARG1-line 1 position, per ogni tipo di gioco 'for every kind of game', is exemplified by a post-verbal hyponym in ARG2-line 2, alle corse dei carri 'for the chariot races', whereas the post-verbal hyperonym in the ARG2-line 1 position, un edificio specifico 'a specific building', is exemplified by a pre-verbal hyponym in ARG1-line 2, il circo 'the circus' (Bonvino 2005: 61).
(24) A: Come mai è così tranquillo?
        'Why is he so calm?'
     B: Sarà rientrato presto, si sarà riposato, magari avrà dormito
        'He probably came back home early, rested, maybe slept'
     1 |                | sarà rientrato presto 'he probably came back home early'
     2 |                | si sarà riposato 'he probably rested'
     3 | magari 'maybe' | avrà dormito 'he probably slept'
       | ADJ1           | PRE
(25) A: Chi può essere stato?
        'Who could have done this?'
     B: Un gatto, un cane, magari una scimmietta
        'A cat, a dog, possibly a small monkey'
     1 |                | un gatto 'a cat'
     2 |                | un cane 'a dog'
     3 | magari 'maybe' | una scimmietta 'a small monkey'
       | ADJ1           | ARG1
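The three rules governing the grid representation can be made concrete with a small sketch. The code below is only an illustration under our own assumptions: the names Row, linearize, lists_in and shape are hypothetical and do not come from the literature on discourse configurations, and a grid is simplified to a sequence of rows, each mapping a position label to its realization. It encodes the grids of (24) and (25), checks that a left-to-right, top-down reading returns the linear order of the utterance, and compares the category-free outlines of the two grids.

```python
from typing import Dict, List, Tuple

Row = Dict[str, str]  # maps a position label (e.g. "ARG1") to its realization


def linearize(positions: List[str], rows: List[Row]) -> str:
    """Rule (iii): read the grid left to right, top to bottom."""
    return " ".join(row[p] for row in rows for p in positions if p in row)


def lists_in(positions: List[str], rows: List[Row]) -> Dict[str, List[str]]:
    """Rule (ii): positions realized in more than one row, i.e. lists."""
    found: Dict[str, List[str]] = {}
    for p in positions:
        items = [row[p] for row in rows if p in row]
        if len(items) > 1:
            found[p] = items
    return found


def shape(positions: List[str], rows: List[Row]) -> List[Tuple[int, ...]]:
    """Category-free outline: the indices of the columns filled in each row."""
    return [tuple(i for i, p in enumerate(positions) if p in row) for row in rows]


# Rule (i): the horizontal axis lists the positions of the structure.
positions_24 = ["ADJ1", "PRE"]
grid_24 = [
    {"PRE": "sarà rientrato presto"},
    {"PRE": "si sarà riposato"},
    {"ADJ1": "magari", "PRE": "avrà dormito"},
]

positions_25 = ["ADJ1", "ARG1"]
grid_25 = [
    {"ARG1": "un gatto"},
    {"ARG1": "un cane"},
    {"ADJ1": "magari", "ARG1": "una scimmietta"},
]

print(linearize(positions_25, grid_25))
print(lists_in(positions_25, grid_25))
print(shape(positions_24, grid_24) == shape(positions_25, grid_25))
```

In this simplified encoding the two grids have identical outlines although one lists clauses and the other noun phrases, which is the sense in which the shape of a configuration is independent of the categories that fill it.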
The body of research carried out on discourse configurations has shown, albeit incidentally, that certain topological patterns, as well as the various topological structures within which those patterns are unified, may have very abstract meanings. For example, we have seen above in (22) that the repetition of the same syntactic structure may carry a meaning of contrast (see also Blanche-Benveniste 1997: 113). Another example of a meaningful topological pattern comes from the listing of elements in only one position of the grid. The abstract topological pattern 'list' has the very general meaning of 'relation among the conjuncts' and may assume more specific meanings according to the exact way in which it is instantiated. For instance, it is acknowledged that a list instantiated by conjuncts preceded by one or more additive conjunctions is interpreted as an additive list (26).
(26) Ha comprato il pane e il latte
     '(S)he bought bread and milk'
     1 | ha comprato '(s)he bought' |         | il pane 'bread'
     2 |                            | e 'and' | il latte 'milk'
       | PRE                        | ADJ1    | ARG1
Accordingly, a list instantiated by conjuncts preceded by one or more disjunctive conjunctions is interpreted as an alternative list (27).
(27) Torna domani o dopodomani
     '(S)he will come back tomorrow or the day after'
     1 | torna '(s)he comes back' |        | domani 'tomorrow'
     2 |                          | o 'or' | dopodomani 'the day after'
       | PRE                      | ADJ1   | ADJ2
Yet other types of lists are possible (cf. Bonvino et al. 2009 for a preliminary study, but also Gerdes and Kahane in print). A list which features the repetition (two, three or more times) of the same lexical material in the same position conveys a general meaning of intensification that, we may suppose, specialises according to the categorial nature of the repeated constituent. For example, the repetition of the adjective piccola 'small' in (28) acts as a superlative, while the repetition of the verb in (29), as noted by Bertinetto (1991: 50), is a special way to express continuous aspect in Italian.
(28) Ho visto una casa piccola piccola
     'I saw a little little house'
     1 | ho visto 'I saw' | una casa 'a house' | piccola 'little'
     2 |                  |                    | piccola 'little'
       | PRE              | ARG1               | ADJ1
(29) L'eroe cerca cerca cerca ma non trova nulla
     'The hero searches, searches, searches but does not find anything' [from Bertinetto 1991: 50]
     1 | l'eroe 'the hero' cerca 'searches'
     2 | cerca 'searches'
     3 | cerca 'searches'
     4 | ma 'but' non 'not' trova 'finds' niente 'anything'
     Position labels as printed: ADJ1 ADJ1 ARG1 PRE PRE ARG1 ARG2
A list which features the repetition of semantically related elements, especially co-hyponyms, may convey a meaning of lexical approximation (30).
(30) C'era un elenco, un sommario, un indice insomma
     'There was a list, a summary, an index let's say'
     1 | c'era 'there was' | un elenco 'a list'      |
     2 |                   | un sommario 'a summary' |
     3 |                   | un indice 'an index'    | insomma 'let's say'
       | PRE               | ARG1                    | ADJ1
We propose in Figure 1 a tentative representation of the relations observed between the listing phenomena mentioned above. Albeit preliminary, this representation shows that the use of a topological methodology makes it possible to provide a unified account of a number of constructional phenomena usually treated under different domains.

Figure 1. The constructional network for lists

To sum up, topological patterns can be viewed as indefinitely extended, bi-dimensional, syntactic patterns, defined regardless of the boundary between the clausal and the supra-clausal level and (at least at the most abstract level) regardless of the categories they are made up of. These formal patterns are meaningful to a certain extent.

3.4. Topological patterns as constructions
The existence of meaningful abstract patterns naturally recalls the notion of construction in construction grammar. We would propose therefore to consider topological patterns as a type of construction that operates at the level of discourse configurations. Including topological patterns among the array of constructions is in line with some important recent attempts to break the boundary of the clause/sentence (Mithun 2005, 2008) and to extend the notion of construction to upper-level entities (Fried and Östman 2004; Fried and Östman 2005; Östman 2005). In fact, as made clear by Fried and Östman (2005) and Östman (2005), construction grammar, as a theory, has no built-in limitation with respect to the extension of the notion of construction to larger stretches of discourse. Yet, only a very limited number of works on constructions deal with upper-level entities. Among them, Östman (2005) should be mentioned, who suggests that constructions can be detected at the textual level, claiming that there exist discourse patterns with a form (i.e., text type) and a meaning or function (i.e., genre). Our paper follows this line of research. However, whereas Östman (2005) extends the notion of construction to entire texts and claims that this textual setting is essential for interpreting certain sentences correctly, in this paper we hypothesize that there exist constructions insensitive to the boundary of the clause/sentence and defined by their topological structure. Like all constructions, they can be more or less specified and enter inheritance systems. The analysis of magari in what follows will be driven by these theoretical hypotheses.

4. The analysis of magari
Pietrandrea (2007) has observed, albeit cursorily, that, although distributional regularities cannot be found at the clause level, magari has a regular topological distribution in discourse configurations. In particular, she noted that 42 out of the 75 tokens of magari occurring in a small corpus of the Roman variety of spoken Italian (about 56 percent) are associated with a specific kind of topological pattern: the focus of magari belongs to a list of items that realize the same syntactic position. In utterance (31), for example, the ARG2 position is realized by four different arguments ('in a scene', 'in a forest', 'in a jungle', 'in the desert'), the first of which is in the focus of magari.
(31) Che ne so poteva comparire una scenografia che che magari li riportava ne in un ambiente, in una foresta piuttosto che in una giungla nel deserto [LIP]
     'I don't know a set could appear that that maybe reconveyed them in in a scene, in a forest rather than in a jungle, in the desert'
     Grid for (31), cells in linear order: che ne so 'I don't know' | poteva comparire 'could appear' | una scenografia 'a set' | che 'that' | che 'that' | magari 'maybe' | li riportava 'reconveyed them' | ne 'in' | in un ambiente 'in a scene' | in una foresta 'in a forest' | piuttosto che 'rather than' | in una giungla 'in a jungle' | nel deserto 'in the desert'
     Position labels as printed: ARG1 ADJ1 ADJ1 PRE ARG1 ADJ2 PRE ARG2
This strong tendency to occur in lists induced Pietrandrea (2007) to semantically characterize magari, commonly understood as an epistemic adverb, as a generic marker of non factuality, more precisely as a marker of 'non exclusion of factuality' (NEF). In other words, putting forward the constituent in the focus of magari as but one of a set of possible options, the speaker does not fully subscribe to the factuality of the proposition realized through that constituent ('that reconveyed them in a scene'): (s)he simply does not exclude that that proposition could be factual. We have further extended Pietrandrea's analysis with the aid of two large, diatopically balanced corpora of both spoken and written contemporary Italian, namely: the la Repubblica corpus (written, approx. 380 million tokens, cf. Baroni et al. 2004) and the Lessico di Frequenza dell'Italiano Parlato (LIP) corpus (spoken, approx. 500,000 tokens, cf. De Mauro et al. 1993).
We randomly selected 600 occurrences of magari (300 in the written corpus and 300 in the spoken corpus). For the sake of consistency, we subtracted from this first corpus 35 occurrences (32 spoken and 3 written) that could not be easily interpreted, such as for example (32), where the speaker interrupts himself right after uttering magari, thus making a proper classification impossible:
(32) Cioè non vorrei scartare questa possibilità a priori insomma magari
     'I mean I wouldn't want to rule this out from the outset, I mean, magari'
We analyzed the remaining 565 occurrences of magari within the context of their discourse configurations; i.e., we took as a relevant unit of analysis the whole chunk made up of the sequence of units that instantiate or repeat the predicate-argument-adjunct structure that each occurrence of magari contributes to define. A first thorough analysis of this corpus allowed us to identify the five main functions of magari mentioned in Section 1. Afterwards, both authors coded the entire data set separately. The more problematic cases were discussed together. This led us to exclude, for the sake of simplicity, the 20 occurrences of magari (9 spoken and 11 written) that fulfilled more than one function. Thus, for example, we excluded occurrences such as (33), where magari functions at the same time as a concessive and as a scalar marker.5 The final corpus therefore amounts to a total of 545 occurrences of magari (286 written and 259 spoken).
(33) Ce la mettono tutta, magari scrivono anche bei pezzi, ma sono troppo limitati, possono esprimersi solo parzialmente [laR]
     'They try hard, they might even write nice pieces, but they are too limited, they can only partially express themselves'
The discussion of the more problematic cases also led to a more precise semantic characterization of the five functions associated with magari, which can be defined as follows: equipotential non exclusion of factuality (ENEF); scalar non exclusion of factuality; scalar concessive conditional; weakened imperative; optative.

5. These are cases of unification, which are extremely interesting from a theoretical point of view. However, for our current purposes, they would have biased the overall picture.
Table 1. Functions of magari in the corpus

                 | ENEF | Scalar | Concessive | Imperative | Optative | Total | %
Written          |  30  |  223   |     27     |      3     |     3    |  286  | 100%
  With list      |  23  |  145   |     27     |      1     |     0    |  196  | 69%
  Without list   |   7  |   78   |      0     |      2     |     3    |   90  | 31%
Spoken           |  87  |   91   |     28     |     47     |     6    |  259  | 100%
  With list      |  47  |   63   |     28     |     15     |     0    |  153  | 60%
  Without list   |  40  |   28   |      0     |     32     |     6    |  106  | 40%
Total            | 117  |  314   |     55     |     50     |     9    |  545  | 100%
  With list      |  70  |  208   |     55     |     16     |     0    |  349  | 64%
  Without list   |  47  |  106   |      0     |     34     |     9    |  196  | 36%
%                | 21%  |  58%   |    10%     |     9%     |    2%    | 100%  |
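The with-list and without-list totals in Table 1 form a 2 × 5 contingency table, the kind of data to which a chi-square test of independence applies. The following is a minimal sketch of that computation in plain Python, not the authors' own procedure: the function names are hypothetical, the counts are copied from the Total rows of Table 1 as transcribed here, and the p-value uses the closed-form survival function of the chi-square distribution, which in this simple form holds for 4 degrees of freedom only. A statistic recomputed in this way need not coincide to the last decimal with a figure reported in the text.

```python
import math
from typing import List, Tuple

# Total rows of Table 1 (columns: ENEF, Scalar, Concessive, Imperative, Optative).
WITH_LIST = [70, 208, 55, 16, 0]
WITHOUT_LIST = [47, 106, 0, 34, 9]


def chi_square(table: List[List[int]]) -> Tuple[float, int]:
    """Pearson chi-square statistic and degrees of freedom for an r x c table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of rows and columns.
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return stat, df


def p_value_df4(stat: float) -> float:
    """Upper-tail probability of a chi-square variable with exactly 4 df."""
    return math.exp(-stat / 2) * (1 + stat / 2)


stat, df = chi_square([WITH_LIST, WITHOUT_LIST])
print(f"chi-square = {stat:.2f}, df = {df}, p = {p_value_df4(stat):.3g}")
```

The same chi_square function can be applied to the Written and Spoken rows of the table to examine the distribution of the functions across modalities.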
The quantitative results of our investigation on the final sample are given in Table 1. The data in Table 1 show a complex picture. First of all, the five functions of magari are not equally distributed across the corpus: the scalar and the ENEF functions together cover 79 percent of the occurrences of magari in the corpus (58 percent and 21 percent respectively), whereas concessives and imperatives are more marginal (10 percent and 9 percent respectively) and optatives are very infrequent (2 percent of the occurrences). Secondly, the association of magari with lists is regular: 349 occurrences of magari out of 545 (64 percent) have in their focus a constituent belonging to a list. It should be noted that, far from being typical of spoken language, the association with lists is even more regular in the written occurrences of magari (69 percent). It can also be observed that, with the exception of optatives, which are in any case very rare, magari tends to be associated with lists whatever its exact function. See Figure 2.

Figure 2. The association of the five functions of magari with lists

Yet the association of magari with lists is not equally distributed across the various functions (χ² = 71.48, df = 4, p < 0.001). As is shown in Figure 3, scalar and concessive magari clearly prefer the association with lists, whereas optative, imperative and (to a lesser extent) ENEF magari tend to be associated with lists less than expected.

Figure 3. Interaction between the five functions of magari and their association with lists

Nor are the five functions of magari equally distributed across modalities: indeed, there is a significant interaction between the various functions of magari and the spoken vs. written modality (χ² = 121.05, df = 4, p < 0.001).6 In particular, scalar uses are more frequent in writing than expected, while all other uses are preferred in spoken data. These regularities are indicated in Figure 4.

Figure 4. Interaction between the five functions of magari and their occurrence across modalities

Another significant regularity emerges from the observation of the distribution of the five functions across modalities × [+list] vs. [−list] constructions (χ² = 204.57, p < 0.001).7 As represented in Figure 5, in fact, it is clear that the ENEF magari retains its preference for spoken modality, regardless of its association with lists, whereas the strong preference of imperatives for spoken modality breaks down with list constructions. The concessive magari prefers list constructions regardless of modality, whereas the scalar magari prefers written modality regardless of its association with lists. Finally, optatives disprefer the association with list constructions regardless of modality.

Figure 5. Interaction between the functions of magari and their occurrence across modalities and list constructions

All in all, the analysis of the distribution of the five functions of magari underlines the peculiar behaviour of the optative function, which is much less frequent than the others and does not occur in lists regardless of modality. It will become clearer in what follows that, as theoretically hypothesized within the Behavioral Profile approach (Divjak 2006; Divjak and Gries 2006; Gries 2006), the behavioral regularities observed at the distributional level are relevant at the cognitive and constructional levels as well.

6. We thank the Associate Editor who reviewed the paper for pointing this out.
7. See note 6.

4.1. Magari as a focus particle
The tendency of the elements focused by magari to occur in lists has led us to consider this word as a particular type of focus particle. As shown by Nølke (1983, 2001) and König (1991), focus particles, such as the English also, even, only or the French même, are particles endowed with a remarkable syntactic mobility, which have scope on a constituent and focus on a part of it, thereby interacting with the focus structure of the sentence in which they occur (König 1991: 10). By focusing on a part of the scope, in fact, focus particles relate the value of the focused expression to a set of paradigmatic alternatives. For example, in (34) also has scope on the entire sentence and focuses on Piero, relating the value 'Piero' to a set of paradigmatic alternatives. This entails the presupposition that someone else has left:

(34) Piero has also left

This property, which derives from the very notion of focus (Rooth 1992), has been highlighted by Nølke (1983, 2001), who defines the focus particles of French as adverbes paradigmatisants ('paradigmatizing adverbs'), i.e., adverbs presupposing the existence of a paradigm of variables that act as alternatives to the element in their focus. Magari presents all the features that are typical of focus particles: it is characterized by a noticeable syntactic mobility, it has scope on constituents of various types and sizes, and it focuses on a part of them, relating this focused part to a set of alternatives. In (31), for example, magari has scope on the constituent li riportava in un ambiente 'reconveyed them in a scene' and focuses on in un ambiente 'in a scene', which is therefore related to a set of alternatives ('in a forest', 'in the jungle', 'in the desert'). The peculiarity of magari is that, in the vast majority of its occurrences, the set of alternatives is not merely presupposed, but concretely realized by the list of units occupying the same position as the focused element. As we will see, the characterization of magari as a paradigmatizing adverb will have important consequences for our analysis.

4.2. Defining the topological structure associated with magari
Considering magari as a focus particle anchored to both a focus and a scope enables us to provide a more rigorous definition of the relevant portion of topological space associated with this word. This can be defined as the space on the grid delimited on the horizontal axis by the position of magari plus the extension of its scope, and on the vertical axis by the extension of the list of elements occupying the same position as the focus of magari. While the focus of magari is easily identifiable through classical tests, the extension of its scope is a less apparent matter. Following the rules established by Nølke (2001: 274) for detecting the extension of the scope of French focus particles, we will distinguish two cases. If magari is pronounced with a neutral intonation, as in (35), it scopes over the whole sequence of units to its right, until the intonational phrase ends. If magari is pronounced with a parenthetical intonation, as in (36), not only the sequence of units to its right, but also the immediately preceding phrase is included in its scope:

(35) Magari TORNA SUBITO, se non è proprio scemo
     'He might come back immediately, if he's not completely stupid'
(36) STARÀ CANTANDO, magari SOTTO LA DOCCIA, LA SUA CANZONE PREFERITA
     'He might be singing, maybe in the shower, his favourite song'
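The two scope rules just stated can be sketched as a small function. This is only an illustration under simplifying assumptions: an intonational phrase is modelled as a flat list of phrases, magari occurs exactly once in it, and in (35) the conditional clause is treated as lying outside the intonational phrase (as the capitals in the example suggest). The function name `scope_of_magari` is hypothetical and not part of the original analysis.

```python
def scope_of_magari(phrases, parenthetical):
    """Return the phrases in the scope of magari within one intonational phrase.

    phrases: units of a single intonational phrase, in linear order,
             containing the string "magari" exactly once (a simplification).
    parenthetical: True if magari carries a parenthetical intonation.
    """
    position = phrases.index("magari")
    # Neutral case: everything to the right, up to the end of the phrase.
    scope = list(phrases[position + 1:])
    if parenthetical and position > 0:
        # Parenthetical case: the immediately preceding phrase is included too.
        scope.insert(0, phrases[position - 1])
    return scope


# Example (35): neutral intonation.
neutral = scope_of_magari(["magari", "torna subito"], parenthetical=False)

# Example (36): parenthetical intonation, yielding a "broken" scope.
broken = scope_of_magari(
    ["starà cantando", "magari", "sotto la doccia", "la sua canzone preferita"],
    parenthetical=True,
)

print(neutral)
print(broken)
```

The returned lists correspond to the capitalized stretches in (35) and (36).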
How are sentences with broken scope like (36) to be represented in grids? The rules mentioned in Section 3.3 impose a representation of the sentence in (36) as in (37). In order to account for the fact that in the abstract predicate-argument-adjunct structure magari has scope over the entire clause, we should write it in the lower left position; however, in order to preserve the linear order of the sequence, we should also write it one line below with respect to the first constituent uttered.

(37) Grid representation of (36)

     1              starà cantando
                    'he will be singing'
         magari                             sotto la doccia    la sua canzone preferita
         'maybe'                            'in the shower'    'his favorite song'
         ADJ1       PRE                     ADJ2               ARG1
The topological structure relevant for our analysis is now univocally defined. Henceforth, it will be visually delimited by a thicker border, as shown in (38) and (39). It should be noted that this is a mere topological unit, which can be instantiated by items of very different type and size, ranging from the sole magari, as in (38), to an entire text, as in (39).

(38) Magari!
     'I wish (it were like this)!'

     1   magari
         'I wish'
         PRE

(39) Magari stava mangiando, o passeggiando, semplicemente, sul ponte . . . magari era lì che si stava aggiustando i pantaloni
     'Maybe he was eating, or strolling, simply, on the deck . . . maybe he was over there straightening his trousers'
     [from Alessandro Baricco, Novecento, Milan, Feltrinelli, 1994]

     1 2 magari maybe o or stava he was mangiando eating passeggiando semplicemente sul ponte simply strolling on the deck era lì che si stava he was over there ASP ADJ1 PRE aggiustando i pantaloni straightening his trousers PRE magari maybe

This structure can be defined in constructional terms as a semi-specified topological structure characterized not only by the presence of a fully lexically specified item (magari), but also, in most cases, by a specific topological pattern. The latter can be described as a list of equivalent items that occupy the very same position as the focus of magari and that can be of different type and size. Although this abstract structure recurs whatever the exact function of magari, the exact form of the list changes according to that function. In the following sections, this phenomenon will be examined in detail.

4.3. Equipotential non exclusion of factuality (ENEF)
About 21 percent of the occurrences of magari fulfill the function of presenting the focus of magari as an element whose factuality is not excluded, on a par with the factuality of other elements. We call this function equipotential non exclusion of factuality (ENEF). The speaker puts the element in the focus of magari and its alternatives on the same level. In doing so, (s)he neither excludes nor subscribes to the focused element, which is considered equally possible with respect to the other options. Examples of ENEF magari are provided in (40) through (42):

(40) Tenterò magari la corona Ibf o Wbc, insomma continuerò [laR]
     'Maybe I will try (to win) the Ibf or the Wbc title, in any case I will go on'

     1 2 3 insomma in any case continuerò I will go on ARG1 ADJ1 PRE ADJ2 ARG1 ADJ1 tenterò I will try magari maybe o or la corona the title Ibf Ibf Wbc Wbc

(41) Avremo modo di discutere sui nostri capolavori e sui titoli che magari sono stati messi una o poche volte [Web]
     'We will have a chance to talk about our masterpieces and about the titles that maybe have been quoted one or few times'

     1 avremo modo we'll have a chance di discutere to talk dei nostri capolavori about our masterpieces e dei titoli and about the titles che that magari maybe sono stati messi have been quoted una one o or poche volte few times PRE ADJ2 MOD PRE PRE ARG1 ARG1 ADJ1 ADJ1

(42) Magari è arrivato l'autobus o è passato un suo amico in macchina [Web]
     'Maybe the bus has come or maybe a friend of his in a car has passed by'

     1   magari     è arrivato          l'autobus
         'maybe'    'has come'          'the bus'
     2   o          è passato           un suo amico         in macchina
         'or'       'has passed by'     'a friend of his'    'in a car'
         ADJ1       PRE                 ARG1                 ADJ2
When used in this function, the focus of magari regularly occurs (60 percent of the occurrences in the two corpora, 77 percent in the written corpus) at the top of a list of elements that either occupy one and the same position, as in (40) and (41), or instantiate the same syntactic structure, as in (42). Although in all the examples above magari fulfills the same semantic function, it should be noted that the constituent in its focus may belong to very different categories: it can be a prepositional argument (31), a nominal argument (40), an adjunct (41) or even a clause (42). This fact supports the hypothesis that it is the topological structure of the construction (in particular the position of the focus of magari at the top of a list), rather than other categorial variables, that is relevant for licensing the non factual reading of magari.
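The notion of a focus sitting "at the top of a list" can be made concrete with a toy encoding of a grid. The sketch below is an illustration only, not the annotation scheme used for the corpus: a grid is assumed to be a list of lines, each mapping a position label to the item written in that cell, and a list is taken to be any column with at least two filled cells. For (42), where the listed units are whole clauses, the PRE column is used as a stand-in for the clause. The names `GRID_42`, `stacked_items` and `focus_position` are hypothetical.

```python
# Grid for example (42): one dict per grid line, keyed by position label.
GRID_42 = [
    {"ADJ1": "magari", "PRE": "è arrivato", "ARG1": "l'autobus"},
    {"ADJ1": "o", "PRE": "è passato", "ARG1": "un suo amico", "ADJ2": "in macchina"},
]


def stacked_items(grid, column):
    """Return (line index, item) pairs for every filled cell in a column."""
    return [(i, line[column]) for i, line in enumerate(grid) if column in line]


def focus_position(grid, column, focus_line):
    """Locate a focused item within the list realized in its column."""
    lines = [i for i, _ in stacked_items(grid, column)]
    if focus_line not in lines:
        raise ValueError("no item on that line in this column")
    if len(lines) < 2:
        return "no list"  # a single cell does not make a list
    if focus_line == lines[0]:
        return "top"
    if focus_line == lines[-1]:
        return "bottom"
    return "middle"


print(stacked_items(GRID_42, "PRE"))
print(focus_position(GRID_42, "PRE", focus_line=0))
print(focus_position(GRID_42, "ADJ2", focus_line=1))
```

Under this encoding the category of the stacked items plays no role, which mirrors the point made above: only the position of the focus within the list is inspected.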
It is worth mentioning that the conjuncts that are listed below the constituent focused by magari can be introduced by disjunctive conjunctions, such as piuttosto che 'rather than' in (31) or o 'or' in (40) through (42), or by a second occurrence of magari, as in (43), which can function as a disjunctive connective according to Mauri (2008a, 2008b). Sometimes, especially when the list is long enough, the items can be listed without explicit conjunction markers. In all cases the list is interpreted as a disjunctive list.

(43) Magari è lì da un attimo magari è lì da sempre
     'Maybe he's been there for a second, maybe he's been there forever'
     [from Alessandro Baricco, Oceano Mare, Milan, BUR, 1999]
The regular association of magari with disjunctive lists suggests that the overall effect of equipotential non exclusion of factuality is constructional in nature. As already argued, magari is a general marker of non factuality. This marker happens to be regularly associated with lists. It is precisely this regular association with lists that turns magari into a more specific kind of marker, i.e., a marker of non exclusion of factuality. The fact that the list we are dealing with is disjunctive in nature adds still another feature. Indeed, the disjunctive list can be characterized as "the semantic relation which obtains between two (or more) items that are equally possible [ . . . ] and are potential substitutes for each other" (Mauri 2008a: 25). Therefore, the fact that the focus of magari belongs to a disjunctive list suggests that it is put forward as an option not to be excluded, on a par with the other listed options. This combination of features contributes to producing the interpretation of magari as a marker of equipotential non exclusion of factuality.

4.4. Scalar non exclusion of factuality
As many as 58 percent of the occurrences of magari in the corpus (78 percent in the written corpus) fulfill the function of scalar operator of non factuality, in that they trigger a scale of non factuality whose extreme position is occupied by the element in the focus of magari. Examples are in (44) through (47):

(44) I film di oggi saranno stati approvati dall'alto tre, quattro, magari cinque volte [laR]
     'Today's movies have been probably approved from on high three, four, maybe five times'

     1 i film d'oggi Today's movies saranno stati approvati have been probably approved dall'alto from on high tre three 2 3 magari maybe quattro four cinque volte five times ADJ2 ADJ1 ARG1 PRE ADJ1 ADJ2

(45) Vorrei strapparle una parola, una battuta, magari un mezzo sorriso [Web]
     'I would like to get a word, a quip, maybe a faint smile out of her'

     1 vorrei I would like to strapparle get out of her una parola a word una battuta a quip magari maybe MOD PRE PRE ADJ1 ARG2 un mezzo sorriso a faint smile 2 3
(46) Li condanna a vivere in una società che non a torto e non per razzismo li vede con sospetto, li sfugge e magari li respinge [laR]
     'It condemns them to live in a society that not unjustly and not for racism views them with suspicion, keeps away from them and maybe rejects them'

     1 li condanna it condemns them a vivere to live in una società in a society che that non a torto not unjustly e non per razzismo and not for racism li vede con sospetto views them with suspicion li sfugge keeps away from them e magari and maybe ARG1 PRE ARG1 ADJ1 ADJ2 ADJ1 ADJ2 li respinge rejects them PRE

(47) Alla fine io mi sarei sentita in colpa e magari lui avrebbe finito per detestarmi [Web]
     'In the end I would have felt guilty and maybe he would have ended up hating me'

     1   alla fine                    io      mi sarei sentita in colpa
         'in the end'                 'I'     'would have felt guilty'
                       e magari       lui     avrebbe finito per detestarmi
                       'and maybe'    'he'    'would have ended up hating me'
         ADJ1          ADJ2           ARG1    PRE
The scalar function can be considered a particular instance of the focusing character of magari. It was shown in Section 4.1 that, as a focus particle, magari entails the existence of a certain number of propositions that form a paradigm. In (44), for example, the paradigm comprises the following propositions: 'they have been approved from on high three times', 'they have been approved from on high four times', 'they have been approved from on high five times'. In the function under examination, magari indicates that the constituent in its focus (the proposition 'they have been approved from on high five times') realizes the most extreme proposition in the paradigm, i.e., the most non factual one or, rather, the last one for which the speaker would not exclude the factuality. This imposes a directionality, and consequently a scalarity, on the paradigm, which turns into a scalar domain of non factuality, in which the proposition realized by the constituent in the focus of magari has the highest degree of non factuality.

As examples (44) through (47) make clear, when fulfilling this function, the focus of magari regularly (66 percent) occurs at the bottom of a list of constituents occupying the same syntactic position or realizing the same syntactic structure. Also in this case, it is clear that the constituents in the list may differ largely from each other in size and category: they can be nominal arguments (45), predicates (46), adjuncts (44) or even entire clauses (47). This suggests that the scalar meaning of magari is licensed by the peculiar topological structure associated with it, i.e., by the occurrence of the focus of magari at the bottom of a list of constituents.

Our analysis so far shows that the function of magari as a scalar operator of non exclusion of factuality is constructional in nature. The general non factual meaning of magari combined with a list yields an overall meaning of non exclusion of factuality, as shown in Section 4.3. The fact that magari focuses on the last conjunct of a list (as already noted by Fauconnier (1976) and Kay (1990) in their analyses of the French word même and the English word even) introduces in the same construction an entire domain (corresponding to the items listed above the one in the focus of magari plus the item in the focus of magari) and, at the same time, the most extreme item of that domain (corresponding to the focus of magari). Representing one of the listed items as the most extreme in the domain induces a ranking. Given the semantic nature of magari, the listed items are ordered by increasing degree of non factuality; more precisely, they are ordered from the most likely to the last one for which the speaker would not exclude the factuality.

In spite of the well established association of scalar magari with list constructions, 34 percent of the occurrences of magari with this function (106 out of 314 in Table 1) are not associated with a list. We will account for these exceptions in Section 4.8.

4.5. Scalar concessive conditional
About 10 percent of the occurrences of magari have a particular type of scalar function: they are scalar concessive conditionals. An example was provided in (3); another one is in (48):

(48) Ciascuna di queste vicende è, magari, piccola; ma la loro somma è un grande dramma [laR]
     'Each of these events is, maybe, small; but their sum is a great tragedy'

The name scalar concessive conditional has been proposed by Haspelmath and König (1998) to indicate concessive constructions such as that in (49):

(49) Even if we do not get any financial support, we will go ahead with our project
This construction can be regarded as a particular conditional construction in which a set of protases is related to an apodosis (Haspelmath and König 1998: 565). In (49) the set of protases is made up of the various conditions evoked by the scalar operator even ('if we get great financial support', 'if we get some financial support', 'if we do not get any financial support'). These conditions are clearly ranked in a scalar domain according to degree of adversity for the situation described in the apodosis, the condition in the focus of even ('if we do not get any financial support') being considered as the most adverse. In scalar concessive conditionals, the set of protases describes non factual conditions, whereas the apodosis is normally factual.8 It is exactly the combined effect of the factuality of the apodosis and the adversity of the circumstances described in the adverbial clause that triggers the concessive interpretation of this conditional construction.

8. Haspelmath and König (1998: 573) discuss some marginal exceptions to the factuality of the apodosis, but they are not relevant for our purposes.

Examples such as (48) can be considered as particular cases of scalar concessive conditionals. Magari in fact evokes a set of conditions arranged in a scalar domain: 'whether each of these events is remarkable', 'whether each of these events is normal', or 'whether each of these events is small'. These conditions are non factual, due to the presence of magari, and they are ranked not only according to degree of adversity but also according to degree of non factuality. The condition in the focus of magari (ciascuna di queste vicende è piccola 'each of these events is small') is therefore not only the most unfavorable, but also the most unlikely. The main clause, however, is clearly factual.

As shown in the grid representations in (50) and (51), when magari fulfills the function of scalar concessive conditional, its focus is always a non factual item that occupies the first position of a list made up of at least two conjuncts, the last of which is introduced by an adversative conjunction (ma/però 'but', invece 'whereas'):

(50) Il comandante Arguelles si aspettava quindi un temporale, magari violento ma facile da superare [laR]
     'Captain Arguelles therefore expected a possibly violent, but easy to overcome storm'
il comandante Arguelles Captain Arguelles si aspettava expected quindi therefore un temporale a storm
magari arguably ma but
violento violent facile da superare easy to overcome ARG1 ADJ2
ADJ1 ARG1 PRE ADJ1 ARG2
(51) Magari andrà per le lunghe, ma non finisce così [laR]
     'Maybe it will go overtime, but it doesn't end like this'

     1 3 magari maybe ma but non not ADJ1 ADJ1 PRE andrà it will go finisce it ends PRE per le lunghe overtime così like this ARG1
Also in this case, the conjuncts in the focus of magari may be constituents of different category and size, such as adjuncts (50) or clauses (51). Therefore, it is only the kind of topological structure associated with this use of magari that licenses its scalar concessive meaning. On the one hand, the list within the topological structure of magari instantiates a particular type of list: [x1 (x2, . . .), adversative conjunction, xlast, where x1 (x2, . . .) = <non factual>, xlast = <factual>]. This abstract scheme is also typically associated with non factual concessive meanings in Italian:

(52) Può essere che sia intelligente, però non lo dimostra
     'It is possible that (s)he is clever, but (s)he doesn't show it'
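The abstract scheme [x1 (x2, . . .), adversative conjunction, xlast] can be spelled out as a simple check over an annotated list of conjuncts. The sketch below is illustrative only: factuality is assumed to be supplied by hand for each conjunct (nothing here detects it automatically), and the names `Conjunct`, `ADVERSATIVE` and `matches_concessive_scheme` are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional

# Adversative conjunctions mentioned in the text (ma/però 'but', invece 'whereas').
ADVERSATIVE = {"ma", "però", "invece"}


@dataclass
class Conjunct:
    text: str
    factual: bool                      # hand-annotated, an assumption of this sketch
    conjunction: Optional[str] = None  # word introducing the conjunct, if any


def matches_concessive_scheme(conjuncts: List[Conjunct]) -> bool:
    """Check the scheme: non factual x1 (x2, ...), then a factual xlast
    introduced by an adversative conjunction."""
    if len(conjuncts) < 2:
        return False
    *initial, last = conjuncts
    return (
        all(not c.factual for c in initial)
        and last.factual
        and last.conjunction in ADVERSATIVE
    )


# The list in (50): magari violento ma facile da superare.
storm = [
    Conjunct("violento", factual=False),
    Conjunct("facile da superare", factual=True, conjunction="ma"),
]

# The disjunctive list in (42), for contrast: neither conjunct is factual.
bus = [
    Conjunct("è arrivato l'autobus", factual=False),
    Conjunct("è passato un suo amico in macchina", factual=False, conjunction="o"),
]

print(matches_concessive_scheme(storm))
print(matches_concessive_scheme(bus))
```

The check deliberately ignores the category and size of the conjuncts, in line with the claim that only the topological shape of the list matters for this reading.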
On the other hand, the occurrence of magari in a conditional concessive context licenses its scalar reading. Magari, like other focus particles (such as the Italian anche/pure 'also' or the German auch 'also'), always acquires a scalar meaning in conditional concessive contexts. As shown by König (1991: 64), this reinterpretation depends on the Gricean maxim of Relevance: if a conditional connection between two eventualities is asserted and presupposed, it is invariably the more remarkable case that is asserted. Thus in (50) it would be trivial to assert that the storm expected by the captain may be mild. This eventuality is presupposed, whereas the more remarkable case, 'the storm may be violent', is asserted. This tendency entails that the focus of the conditional concessive magari is usually interpreted as the most extreme item in a scalar domain of conditions not to be excluded, i.e., the conditional concessive magari is always interpreted as scalar.

4.6. Imperative
About 9 percent of the occurrences of magari in the corpus have in their focus an imperative (53) or a related construction (König and Siemund 2007), such as a hortative (54) or a deontically modalized sentence (55).

(53) Magari diglielo, faglielo comunque capire che ci tieni a lui!
     'Maybe tell him, anyway make him understand that you care about him!'
(54) Senti questo teniamolo, magari vediamolo alle prime bozze! [LIP]
     'Listen, let's keep this, maybe let's see it at the proofreading stage!'
(55) Bisogna seguire un certo regime alimentare, bisogna magari mangiare un po' meno
     'It is necessary to follow a certain diet, it is necessary, maybe, to eat a little less'
As shown by Elliott (2000: 76) and De Haan (2004), the presence of a marker of non factuality in imperative and related constructions is quite a common phenomenon across languages. This association may be motivated by the fact that commands describe non factual situations, which favors the presence of a non factual marker (see Elliott 2000: 76). Nevertheless, as mentioned in Section 1, magari does not merely harmonically mark the non factuality of the command but, as often happens crosslinguistically (Mithun 1995), it also fulfills another function: it serves to weaken the force of the command (or exhortation). For example, the commands in (53) are considered as less mandatory, and consequently more polite, than their counterpart in (56) where magari is absent:

(56) Diglielo, faglielo comunque capire che ci tieni a lui!
     'Tell him, anyway make him understand that you care about him!'
Needless to say, the weakened imperative function of magari is marginal in the written corpus, while it is attested in as many as 18 percent of the occurrences in the spoken corpus. As the examples (53) to (55) and the grid representations below show, the imperatives in the focus of magari often occur in lists. This holds for about 32 percent of the imperative magari in our corpus. The imperatives may occur at the top of a disjunctive list, as in (57), or at the bottom, as in (58), in which case they also have a scalar meaning:

(57) Grid representation of (53)

     1 2 magari maybe diglielo tell him faglielo make him comunque anyway capire understand che ci tieni a lui that you care about him CAUSE ADJ1 PRE ADJ1 PRE ARG1

(58) Prova a calmarti un po' [ . . . ] e magari chiedi scusa alla mamma [Web]
     'Try to calm down a bit and possibly apologize to your mother'

     1                     prova           a calmarti          un po'
                           'try'           'to calm down'      'a bit'
     2   e magari          chiedi scusa    alla mamma
         'and possibly'    'apologize'     'to your mother'
         ADJ1              PRE             ARG1                ADJ2
The occurrence of the imperative focused by magari in a list of imperatives makes it clear how magari weakens the illocutionary force of the imperative. When the speaker puts the focus of magari at the top of a list of alternatives, (s)he invites the listener to take the command into account as but one of several options. When (s)he puts the focus of magari at the bottom of a list of commands, (s)he puts forward the focused command as one to be executed as a last resort. It is clear that the weakened imperative function of magari is, like the other functions, constructional in nature. The imperative (or its related constructions) endows the construction with the illocutionary force of a command. The non exclusion value of magari, combined with the occurrence of its focus in a list, is used to present the focused command as an option which is not to be excluded.

4.7. Optative
Less than 2 percent of the occurrences of magari in the corpus have an optative function such as that represented in the following example:

(59) Magari fosse così semplice!
     'I wish it were so simple!'
It has been shown by Pietrandrea (2008b) that, when introduced in Italian in the 13th century, the word magari, etymologically related to the Greek makarios ('blissful'), only had an optative meaning. It was usually employed as a predicative adjective uttered with an exclamative intonation referring to a sentential subject introduced by the complementizer ke 'that', as in (60):

(60) Makare ke mme abberanno uccisa!
     'If only they killed me!'
     [Iacopone da Todi, XIII laude del Laudario Urbinate, 13th century]
When fulfilling the optative function, magari is always associated with an exclamative intonational profile. As the examples below show, in the scope of magari there can be a past subjunctive (61), an infinitive (62), a non verbal element (63) or even a ∅ element as in (6), reproduced in (64):

(61) Magari venisse!
     'I wish he would come!'
(62) Magari averne!
     'I wish I had!'
(63) Magari due!
     'I wish (there would be) two of them'
(64) A: Vuoi un po' di riposo?
        'Do you want to rest a bit?'
     B: Magari!
        'I'd love to!'
Apparently the focus of the optative magari is never associated with topological structures characterized by lists. Pietrandrea (2008b) suggests that a relation of semantic bleaching exists between the optative and the non exclusion of factuality function of magari. Such a bleaching may have been historically induced by ambiguous contexts and reinforced precisely by the coalescence of magari within constructions characterized by lists. In fact, as pointed out by Pietrandrea (in print), an optative meaning can be conceived of as the indication of a selection among a set of alternative states of affairs (SoAs). Consequently, the occurrence of magari within list constructions (where more than one alternative option is expressed) has the effect of weakening the meaning of selection and favouring a more general non factual reading. Synchronically speaking, the optative and the non exclusion of factuality uses of magari are nevertheless related to one another in that they both express non factuality.

4.8. Exceptions
The fact that the focus of the optative magari never belongs to a list has been theoretically explained in the previous section. However, we know from the data in Table 1 that the foci of ENEF, scalar and imperative magari are also not necessarily part of a list. This phenomenon characterizes 36 percent of the occurrences in the corpus and, therefore, it represents an exception to be explained. Let us consider the cases in which magari fulfills an ENEF function, but its focus does not belong to a list, as in the following example:

(65) Non rischiò, non operò scelte che magari potevano suscitare contrasti [laR]
     '(S)he didn't risk, (s)he didn't make choices that maybe could cause disagreements'

     1 non rischiò (s)he didn't risk non operò (s)he didn't make scelte choices che that magari maybe potevano suscitare could cause PRE contrasti disagreements ARG1 PRE ARG1 ADJ1 ADJ1 ARG2
We may hypothesize that magari behaves in this context as a regular focus particle. It merely presupposes, without realizing it, the existence of a paradigm of unspecified alternatives to the focus of magari. For example, the clause in the focus of magari in (65) (potevano suscitare contrasti 'could cause disagreements') evokes a paradigm of other unspecified, but semantically related, possible alternatives which are not explicitly mentioned in the text, such as: potevano provocare proteste 'could cause protests', potevano procurare inimicizie 'could cause hostility', etc. Therefore, it may be hypothesized that this type of magari has inherited the property of evoking paradigms of elements alternative to its focus precisely from its more frequent association with fully realized lists. A similar line of reasoning might be applied to the occurrences of magari in imperatival contexts. Thus a sentence like (66) evokes, without realizing it, a paradigm of alternative commands ('tell him', 'don't tell him'), thereby weakening the illocutionary force of the imperative.

(66) Magari diglielo
     'Maybe tell him/her'
Another exception to be dealt with regards the scalar function of magari. About 34 percent of the scalar occurrences of magari appear in contexts without a list. These are cases like those represented in the following example:

(67) [ . . . ] dovrei parlarvi di vini, magari toscani [laR]
     '[ . . . ] I should talk to you about wines, possibly Tuscan (wines)'
In this construction magari and its focus have a parenthetical intonation. This characteristic leads us to put forward a possible explanation for these apparently exceptional cases. Indeed, according to the rules established in Section 3.3 for grid representations, when magari is parenthetical, the constituent immediately preceding it should be included in its scope. As a consequence, the grid representation of (67) would be akin to that provided in (68):

(68) Grid representation of (67)

     1 dovrei I should parlarvi talk to you magari possibly MOD PRE PRE ADJ1 ARG1 ARG1 di vini about wines toscani Tuscan ADJ2
If we rely on this representation, we can hypothesize that magari has focus on a constituent that lies at the bottom of a partially instantiated list. The constituent in the focus of magari is an adjunct that modifies the backgrounded part of the scope. In (67), for example, magari focuses on the adjunct toscani 'Tuscan', which modifies the backgrounded item (di vini 'about wines'). This item is given the first time without modifications in the sequence dovrei parlarvi di vini 'I should talk to you about wines'. It is then modified, without an explicit repetition, in the sequence magari toscani 'possibly Tuscan'. The modification, without reiteration of the backgrounded part of the scope, means that the latter is elided. The overall effect is that a sequence such as (67) is interpreted as equivalent to the following:

(69) [ . . . ] dovrei parlarvi di vini, magari di vini toscani 9
     '[ . . . ] I should talk to you about wines, possibly Tuscan wines'

     1 dovrei I should parlarvi talk to you magari possibly MOD PRE PRE ADJ1 ARG1 di vini about wines di vini about wines ARG1 toscani Tuscan ADJ2
In (69), the modifying adjunct focused by magari (toscani 'Tuscan') lies at the bottom of a list of two adjuncts. The first of these adjuncts (position ADJ2, line 1) is a ∅ element, i.e., the position is empty. Consequently, it conveys a meaning such as 'non qualified (wines)' and the whole sequence is interpretable as:
9. Constructions like (69) are indeed grammatical and attested. See, for example, the following instances:
(i) Poi si vedrà, se troverò il tempo per dedicarmi ad un uomo, magari un uomo vero
    'Then we will see, if I will find the time to dedicate myself to a man, possibly a real man'
(ii) C'era una volta sarebbe un inizio perfetto per cominciare una storia, magari una storia per bambini
    'Once upon a time would be a perfect beginning to start a story, possibly a story for children'
(70) [ . . . ] dovrei parlarvi di vini quali che siano, magari di vini toscani
     '[ . . . ] I should talk to you about whichever wines, possibly Tuscan wines'
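As a rough illustration only, the step from the grid in (68) to the expanded reading in (69) can be sketched in Python. The encoding of each grid line as a dictionary, the column order in `COLUMNS` and the function names `restore_elided` and `linearize` are assumptions introduced for this sketch; they are not part of the grid formalism itself.

```python
# Assumed left-to-right order of grid positions (hypothetical, for illustration).
COLUMNS = ["MOD", "PRE", "ADJ1", "ARG1", "ADJ2"]


def restore_elided(rows, position):
    """Copy the filler of `position` from the nearest earlier line
    into every later line that leaves that position empty."""
    restored, last_seen = [], None
    for row in rows:
        row = dict(row)  # work on a copy so the input grid is not modified
        if position in row:
            last_seen = row[position]
        elif last_seen is not None:
            row[position] = last_seen
        restored.append(row)
    return restored


def linearize(rows):
    """Read each grid line from left to right and join the lines with commas."""
    return ", ".join(
        " ".join(row[col] for col in COLUMNS if col in row) for row in rows
    )


# Grid (68): the backgrounded ARG1 is elided in the second line.
grid_68 = [
    {"MOD": "dovrei", "PRE": "parlarvi", "ARG1": "di vini"},
    {"ADJ1": "magari", "ADJ2": "toscani"},
]

# Grid (69): the elided ARG1 is made explicit again.
grid_69 = restore_elided(grid_68, "ARG1")

print(linearize(grid_68))
print(linearize(grid_69))
```

Only the ARG1 position is restored here, because (69) repeats di vini but not the modal or the predicate; which positions count as backgrounded is passed in by hand rather than computed.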
In conclusion, the presence of the modifying adjunct at the bottom of the partially instantiated list would trigger a scale of non factual propositions. The actualization of a more specific event, such as that described by the proposition 'I talk to you about Tuscan wines', is in fact to be conceived as less likely than the actualization of a more general event, such as that described by the proposition 'I talk to you about wines'.

To sum up, two hypotheses can be put forward in order to explain the cases in which the focus of magari does not belong to a list and to relate these cases to the other, more frequent cases with lists. The first hypothesis, which applies to virtually all exceptional cases, is that magari evokes a paradigm of possible alternatives with respect to the element in its focus by virtue of its frequent association with concretely realized lists. The second hypothesis only applies to scalar occurrences. In these cases the constituent in the focus of magari is always parenthetical and can be seen as the second conjunct of a partially instantiated list. In both cases, the presence of a list is posited. This entails that from a cognitive, if not linguistic, point of view the focus of magari always belongs to a list, either fully instantiated, or partially instantiated, or simply evoked. This association of magari with a list would mark its general 'non exclusion of factuality' meaning.

5. The network of magari constructions
As mentioned in the introductory sections of this paper, the main goal of our investigation is to understand which contexts license the various functions of magari and whether there is a relation between these functions (and, of course, which sort of relation). In order to reach this goal, we set our analysis within the general theoretical framework of construction grammar, which in principle allows us to treat contexts as linguistic objects, and we then used a specific working methodology, namely the topological methodology, which allowed us to identify a set of topological structures in which magari regularly occurs. In this section we will give a more refined constructionist account of our findings.

The analysis carried out in Section 4 shows that we can distinguish two main magari constructions:
- the optative magari construction;
- the map of non exclusion of factuality (NEF) magari constructions:
  - ENEF;
  - scalar NEF (with fully and partially instantiated list);
  - scalar concessive conditional;
  - weakened imperative.
As we briefly discussed in Section 4.7, the optative magari presents specific distributional properties that distinguish it from other magari constructions: it is very infrequent, it does not occur with lists and it is always associated with an exclamative intonational profile. However, the two constructions are not completely independent of one another. Firstly, Pietrandrea (2008b) showed that there exists a diachronic semantic bleaching from optative to non exclusion of factuality, thus positing a sort of diachronic link between the two constructions. Secondly, from the point of view of our synchronic analysis, the two constructions share the presence of magari and a general non factuality feature.

As for the set of NEF magari constructions, we identified a class of topological structures in which magari occurs regularly. All these structures refer to a more general topological structure that can be represented as in Figure 6: the lexically specified adverb magari is followed by its scope, which is made up of a background and a focus; the latter is part of a list, i.e., it is one of the listed elements. It is important to note that there is no explicit information about levels, categories, word order or sentence types, so all this information is underspecified, as is the type of list involved. All the different magari constructions analysed in the sections above are more specified instances of the maximally abstract construction outlined in Figure 6. In other words, these constructions have some properties that specify the abstract construction (partially) described in Figure 6. These properties are listed below:
- ENEF magari construction: list = disjunctive; focus = x1
- scalar NEF magari construction (fully instantiated list): focus = xlast
- scalar concessive conditional magari construction: x1 (x2, . . .) <non factual> vs. xlast <factual>; focus = x1
- imperative magari construction: sentence type = imperative; speech act = command, exhortation, etc.

Figure 6. The topological structure of the Abstract NEF magari construction
If we accept the hypothesis of the partially instantiated list put forward for the exceptional cases of magari with a scalar function (cf. Section 4.8), then we have one further construction, with the following overriding properties:
- scalar NEF magari construction (partially instantiated list): list = [x1, x2=last], in which x1 = ∅; focus = x2=last; intonation: parenthetical
Therefore, from a constructionist perspective, the distribution of the non exclusion of factuality magari can be accounted for by positing a hierarchy of closely related topological structures, each of which is regularly associated with one determined function of magari and all of which are linked to a more abstract construction with the general meaning of <non exclusion of factuality> and the topological structure described in Figure 6.

The network of magari constructions emerging from our results is presented in Figure 7. Before commenting on this figure, it is worth discussing some of the conventions used. First, we made use of the inheritance links proposed by Goldberg (1995) (cf. Section 3.1) to relate the various constructions at issue. Second, regarding the representation of the constructions themselves, we adapted the boxes-within-boxes notation (Fried 2007; Fried and Östman 2004) to our needs by incorporating the outline of the topological structure associated with magari as the formal part of the construction. Third, some elements are graphically highlighted in order to facilitate the reading of the network: the various magari constructions endowed with a topological structure are enclosed in boxes with thicker borders, whereas the overriding properties of each subconstruction are put in boldface. Constructions with an uncertain status are marked by a question mark near the inheritance link and enclosed in a box framed by a dotted line. Finally, as can be seen, Figure 7 does not contain the imperative magari. The representation of this construction is given in Figure 8, which we will comment on later. Now let us return to Figure 7.
Figure 7. The network of magari constructions
Overall, Figure 7 reveals that the network of magari constructions is basically governed by Instance inheritance links (II). A maximally abstract Non factual magari construction is instantiated by both the Optative magari construction and the Abstract NEF magari construction. The other magari subconstructions are inherited from the Abstract NEF magari construction by means of instance inheritance links. The Abstract NEF magari construction as represented in Figure 6 is also linked, by means of a Subpart inheritance link (IS), to an independent List construction with the maximally abstract meaning of <relation between the listed items>, whose existence has been proposed in Section 3.2. As already pointed out, it is precisely the presence of a list that somehow turns the general non factual meaning of magari into a <non exclusion of factuality> meaning. At the same time, the ENEF magari construction is linked, by means of an IS, to the Disjunctive list construction, which is an instance of the general List construction.¹⁰

The main property of the Abstract NEF magari construction is that it features a topological list that includes the element which is in the focus of magari. At this level, however, the interaction between the list and the focus is still underspecified, as is the list itself. This information becomes more specified as we reach the lower levels of the hierarchy. Both the ENEF magari construction and the Scalar NEF magari construction (with a fully specified list) specify which element of the list is in the focus of magari. In the former, the focus is at the top of the list, and the list is disjunctive; in the latter, the focus is at the bottom of the list. According to the hypothesis put forward in Section 4.8, the Scalar NEF magari might be instantiated by another construction, the Scalar NEF magari construction with a partially instantiated list, whose topological structure is even more constrained.
The list cannot contain more than two elements and the first one is a null element. Finally, the Abstract NEF magari construction is instantiated by the Scalar concessive conditional construction, which includes a contrastive list in which one or more non factual elements are contrasted with a factual element. The Scalar concessive conditional construction is an instantiation of both the Abstract NEF magari construction and the (abstract) concessive construction; we are therefore dealing with a case of multiple inheritance. In addition, this construction is linked by a purely semantic link (represented here by a dotted line) to the Scalar NEF magari construction, since they share the scalarity feature.

10. It should be noted that this link is not strictly necessary, since the Disjunctive list may be instantiated directly within the ENEF magari construction, which inherits the more general List construction from the Abstract NEF magari construction. We nevertheless decided to maintain this link for the sake of explicitness, and more precisely to highlight the semantic contribution of the disjunctive list to the whole ENEF construction.

Figure 8. Imperative magari constructions

As mentioned above, the Imperative magari construction is not present in this network. This is because we interpret the Imperative magari construction as a further instantiation of both the ENEF magari construction and the Scalar NEF magari construction. As mentioned in Section 4.6, and as reproduced in Figure 8, the Imperative magari construction may have the topological structure of either the former or the latter. In both cases, the corresponding meaning of magari is maintained and a general function of weakening of the command/exhortation is added. Therefore, the Imperative magari construction can be seen as a lower-level construction in which sentence type and speech act information is specified. Also, the two constructions are linked to one another (by a dotted link), since they share the Sentence type and Speech act features.

In conclusion, the proposed constructionist analysis allows us to connect all magari constructions with one another in an inheritance hierarchy and therefore gives us a better understanding of the speakers' knowledge of this piece of grammar.
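For readers who find a computational analogy helpful, the instance inheritance part of the network can be sketched as a small Python structure in which each construction inherits the properties of its parents and adds its own overriding properties. This is only a sketch under stated assumptions: the class `Construction`, its method `properties`, the string values used for the properties and the placeholder parent `concessive` are all hypothetical names introduced here. Subpart links and purely semantic (dotted) links are not modelled.

```python
from dataclasses import dataclass, field


@dataclass
class Construction:
    name: str
    parents: tuple = ()                             # instance inheritance links
    overrides: dict = field(default_factory=dict)   # overriding properties

    def properties(self):
        """Merge the properties of all parents, then apply local overrides."""
        merged = {}
        for parent in self.parents:
            merged.update(parent.properties())
        merged.update(self.overrides)
        return merged


# Top of the network: the maximally abstract construction.
non_factual = Construction("Non factual magari", overrides={"meaning": "non factual"})

optative = Construction("Optative magari", (non_factual,), {"intonation": "exclamative"})

abstract_nef = Construction(
    "Abstract NEF magari",
    (non_factual,),
    {
        "meaning": "non exclusion of factuality",
        "list": "underspecified",
        "focus": "underspecified",
    },
)

enef = Construction("ENEF magari", (abstract_nef,), {"list": "disjunctive", "focus": "x1"})

scalar = Construction(
    "Scalar NEF magari (fully instantiated list)", (abstract_nef,), {"focus": "xlast"}
)

scalar_partial = Construction(
    "Scalar NEF magari (partially instantiated list)",
    (scalar,),
    {"list": "two items, first one null", "intonation": "parenthetical"},
)

# Placeholder for the (abstract) concessive construction; its content is assumed.
concessive = Construction("Concessive (abstract)", overrides={"relation": "concessive"})

# Multiple inheritance: two parents.
scalar_cc = Construction(
    "Scalar concessive conditional magari",
    (abstract_nef, concessive),
    {"list": "contrastive", "focus": "x1"},
)

for cxn in (enef, scalar, scalar_partial, scalar_cc):
    print(cxn.name, cxn.properties())
```

The design mirrors the prose in one respect only: lower-level constructions restate nothing they inherit and list just the properties that make them more specific, so the general <non exclusion of factuality> meaning is stated once, on the abstract construction.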
6. Conclusions
The word magari has a number of grammatical meanings: equipotential non exclusion of factuality, scalarity, concessivity, weakening of the illocutionary force of the imperative and optativity. All these meanings proved to be constructional in nature, i.e., they are determined by the various constructions in which magari occurs. These constructions were efficiently identified by looking at topological patterns, i.e., structures that are recognizable at the discourse configuration level. This level of analysis, defined by the maintenance of a given predicate-argument-adjunct structure in discourse, crosses the traditional divide between the clausal and the supra-clausal level and can only be characterized in terms of its topological structure.

The distribution of magari within discourse configurations has revealed interesting regularities. With the exception of the more marginal and more ancient optative function, magari is regularly associated with certain abstract topological patterns that can be characterized as lists containing the element focused by magari. The exact shape of this topological pattern is the only distinctive property that allows us to identify each type of magari univocally.

Finally, these meaningful topological structures can be reinterpreted as proper constructions, whose peculiarity is that they are insensitive to the boundary between clauses and are bi-dimensional in nature. They can also be represented in an inheritance hierarchy, which shows how the different magari constructions are inherited from a maximally abstract construction.

Received 7 April 2008
Revision received 27 October 2008

Dipartimento di Linguistica
Università Roma Tre
Rome, Italy
References
Arcaini, Enrico. 1997. Le connecteur magari dans une perspective comparative. In Gerd Wotjak (ed.), Studien zum romanisch-deutschen und innerromanischen Sprachvergleich, 59–76. Frankfurt am Main: Peter Lang.
Arcaini, Enrico. 2000. Italiano e francese. Un'analisi comparativa. Turin: Paravia Scriptorium.
Baroni, Marco, Silvia Bernardini, Federica Comastri, Lorenzo Piccioni, Alessandra Volpi, Guy Aston & Marco Mazzoleni. 2004. Introducing the La Repubblica corpus: A large, annotated, TEI(XML)-compliant corpus of newspaper Italian. In Proceedings of the 4th International Conference on Language Resources and Evaluation (LREC 2004, Lisbon, 26–28 May), 1771–1774. Paris: ELRA.
Bertinetto, Pier Marco. 1991. Il verbo. In Lorenzo Renzi, Giampaolo Salvi & Anna Cardinaletti (eds.), Grande grammatica italiana di consultazione, vol. II, 13–161. Bologna: Il Mulino.
Bilger, Mireille. 1982. Contribution à l'analyse en grille. Recherches sur le français parlé 4. 195–215.
Bilger, Mireille, Mylène Blasco, Paul Cappeau, Frédéric Sabio & Marie-Josée Savelli. 1997. Transcription de l'oral et interprétation: illustration de quelques difficultés. Recherches sur le français parlé 14. 55–85.
Blanche-Benveniste, Claire. 1993. Répétitions de lexique et glissement vers la gauche. Recherches sur le français parlé 12. 9–34.
Blanche-Benveniste, Claire. 1997. Approches de la langue parlée en français. Paris: Ophrys.
Blanche-Benveniste, Claire, Bernard Borel, José Deulofeu, Jacky Durand, Alain Giacomi, Claude Loufrani, Boudjema Meziane & Nelly Pazery. 1979. Des grilles pour le français parlé. Recherches sur le français parlé 2. 163–205.
Blanche-Benveniste, Claire, Mireille Bilger, Christine Rouget & Karel Van den Eynde. 1990. Le français parlé. Études grammaticales. Paris: Éditions du Centre National de la Recherche Scientifique.
Bonvino, Elisabetta. 2005. Le sujet postverbal. Une étude sur l'italien parlé. Paris: Ophrys.
Bonvino, Elisabetta, Francesca Masini & Paola Pietrandrea. 2009. List Constructions: a semantic network. Paper presented at the Third International Conference of the French Association of Cognitive Linguistics (AFLiCo) "Grammars in Construction(s)", Paris, Nanterre, 27–29 May.
Croft, William. 2001. Radical Construction Grammar. Oxford: Oxford University Press.
Croft, William & Alan D. Cruse. 2004. Cognitive Linguistics. Oxford: Oxford University Press.
de Haan, Ferdinand. 2004. On representing semantic maps. Tucson, AZ: University of Arizona manuscript. URL: http://www.u.arizona.edu/~fdehaan/papers/semmap.pdf (accessed 3 March 2008).
De Mauro, Tullio, Federico Mancini, Massimo Vedovelli & Miriam Voghera. 1993. Lessico di frequenza dell'italiano parlato. Milan: ETAS libri.
Divjak, Dagmar S. 2006. Ways of intending: Delineating and structuring near synonyms. In Stefan Th. Gries & Anatol Stefanowitsch (eds.), Corpora in cognitive linguistics: Corpus-based approaches to syntax and lexis, 19–56. Berlin & New York: Mouton de Gruyter.
Divjak, Dagmar S. & Stefan Th. Gries. 2006. Ways of trying in Russian: Clustering behavioral profiles. Corpus Linguistics and Linguistic Theory 2(1). 23–60.
Elliott, Jennifer R. 2000. Realis and irrealis: Forms and concepts of the grammaticalization of reality. Linguistic Typology 4. 55–90.
Fauconnier, Gilles. 1976. Étude de certains aspects logiques et grammaticaux de la quantification et de l'anaphore en français et en anglais. Paris: Champion.
Fillmore, Charles J., Paul Kay & Mary Catherine O'Connor. 1988. Regularity and idiomaticity in grammatical constructions: The case of let alone. Language 64(3). 501–538.
Fried, Mirjam. 2007. Constructing grammatical meaning. Isomorphism and polysemy in Czech reflexivization. Studies in Language 31(4). 721–764.
Fried, Mirjam & Jan-Ola Östman. 2004. Construction Grammar: A thumbnail sketch. In Mirjam Fried & Jan-Ola Östman (eds.), Construction Grammar in a cross-language perspective, 11–86. Amsterdam & Philadelphia: John Benjamins.
Fried, Mirjam & Jan-Ola Östman. 2005. Construction Grammar and spoken language: The case of pragmatic particles. Journal of Pragmatics 37(11). 1752–1778.
Gerdes, Kim & Sylvain Kahane. In print. Speaking in Piles. Paradigmatic Annotation of a Spoken French Corpus. In Proceedings of the Fifth Corpus Linguistics Conference, Liverpool.
Goldberg, Adele. 1995. Constructions. A Construction Grammar approach to argument structures. Chicago: The University of Chicago Press.
Goldberg, Adele. 2006. Constructions at work. Oxford: Oxford University Press.
Gries, Stefan Th. 2006. Corpus-based methods and cognitive semantics: The many meanings of to run. In Stefan Th. Gries & Anatol Stefanowitsch (eds.), Corpora in cognitive linguistics: Corpus-based approaches to syntax and lexis, 57–99. Berlin & New York: Mouton de Gruyter.
Haspelmath, Martin & Ekkehard König. 1998. Concessive conditionals in the languages of Europe. In Johan van der Auwera (ed.), Adverbial relations in the languages of Europe, 277–334. Berlin & New York: Mouton de Gruyter.
Kay, Paul. 1990. Even. Linguistics and Philosophy 13. 59–111.
Kay, Paul & Charles J. Fillmore. 1999. Grammatical constructions and linguistic generalizations: The what's X doing Y? construction. Language 75(1). 1–33.
König, Ekkehard. 1991. The meaning of focus particles: A comparative perspective. London: Routledge.
König, Ekkehard & Peter Siemund. 2007. Speech acts distinctions in grammar. In Timothy Shopen (ed.), Language typology and syntactic description, vol. I: Clause structure, 2nd edn, 276–304. Cambridge: Cambridge University Press.
Lazard, Gilbert. 1998. L'expression de l'irréel: essai de typologie. In Leonid Kulikov & Heinz Vater (eds.), Typology of verbal categories: Papers presented to Vladimir Nedjalkov on the occasion of his 70th birthday, 237–248. Tübingen: Max Niemeyer.
Licari, Carmen & Stefania Stame. 1989. Pour une analyse contrastive des connecteurs pragmatiques italiens et français: magari/peut-être, anzi/au contraire. Studi Italiani di Linguistica Teorica e Applicata 18. 153–161.
Mauri, Caterina. 2008a. The irreality of alternatives: Towards a typology of disjunction. Studies in Language 32(1). 22–55.
Mauri, Caterina. 2008b. Coordination relations in the languages of Europe and beyond. Berlin & New York: Mouton de Gruyter.
Michaelis, Laura & Knud Lambrecht. 1996. Towards a construction-based theory of language function: The case of nominal extraposition. Language 72(2). 215–247.
Mithun, Marianne. 1995. On the relativity of irreality. In Joan Bybee & Suzanne Fleischman (eds.), Modality in grammar and discourse, 367–388. Amsterdam & Philadelphia: John Benjamins.
Mithun, Marianne. 2005. On the assumption of the sentence as the basic unit of syntactic structure. In Zygmunt Frajzyngier, Adam Hodges & David S. Rood (eds.), Linguistic diversity and language theories, 169–183. Amsterdam & Philadelphia: John Benjamins.
Mithun, Marianne. 2008. The extension of dependency beyond the sentence. Language 84(1). 69–119.
Nølke, Henning. 1983. Les adverbes paradigmatisants: fonction et analyse. Copenhagen: Akademisk Forlag.
Nølke, Henning. 2001. Le regard du locuteur 2: Pour une linguistique des traces énonciatives. Paris: Kimé.
Östman, Jan-Ola. 2005. Construction discourse: A prolegomenon. In Jan-Ola Östman & Mirjam Fried (eds.), Construction grammars. Cognitive grounding and theoretical extensions, 121–144. Amsterdam & Philadelphia: John Benjamins.
Pietrandrea, Paola. 2007. The grammatical nature of some epistemic-evidential adverbs in Spoken Italian. Italian Journal of Linguistics 1. 39–64.
Pietrandrea, Paola. 2008a. Certamente and sicuramente: Encoding dynamic and discursive aspects of commitment in Italian. Belgian Journal of Linguistics 22. 221–246.
Pietrandrea, Paola. 2008b. Constructionalization, grammaticalization and discourse. The case of magari. Paper presented at the 4th New Reflections on Grammaticalization Conference, Leuven, 16–19 July.
Pietrandrea, Paola. In print. The conceptual structure of irreality. A focus on non-exclusion-of-factuality as a conceptual and a linguistic category. Language Sciences.
Rooth, Mats. 1992. A theory of focus interpretation. Natural Language Semantics 1. 75–116.
Schiemann, Anika. 2008. La polisemia di magari (e forse). Analisi corpus based su C-ORAL-ROM italiano. In Emanuela Cresti (ed.), Prospettive nello studio del lessico italiano. Atti SILFI 2006, vol. I, 299–307. Florence: Firenze University Press.
Reviewing imagery in resemblance and non-resemblance metaphors

JOSÉ MANUEL UREÑA and PAMELA FABER*
Abstract

This article analyses the nature of mental imagery in metaphoric thought as envisaged by the contemporary theory of metaphor in Cognitive Linguistics (Lakoff 1993). Our study of metaphor in the field of marine biology draws on two crucial aspects of mental imagery, namely dynamicity and pervasiveness. Image metaphors and behaviour-based metaphors have generally been regarded as two different types of resemblance metaphor. In our view, the dynamicity of certain mental images highlights inherent similarities between these two types of metaphor, and makes the differences between them more apparent than real. For this reason, we propose a more refined description of resemblance metaphors in terms of the static or dynamic nature of the mental images underlying them. Our study also underlines the fact that mental images permeate all classes of metaphor, and that the pervasiveness and dynamicity of mental images afford insights into both resemblance metaphors and non-resemblance metaphors.

Keywords: mental imagery; metaphor; dynamicity; marine biology; cognitive linguistics.
1. Introduction
Conceptual Metaphor Theory and Primary Metaphor Theory establish a sharp distinction between metaphors that arise from physical or behavioural analogy and metaphors motivated by abstract or subjective cognitive processes. In the case of Conceptual Metaphor Theory, Lakoff (1993) and Lakoff and Turner (1989) distinguish between conceptual-structural/conventional metaphors and image metaphors. Grady's (1997, 1999) Primary Metaphor Theory, in turn, distinguishes between correlation metaphors and resemblance metaphors. These classes of metaphor arise by virtue of our embodied conceptualisation system.

In this article, we use the term resemblance metaphor to refer to image metaphors and behaviour-based metaphors, and non-resemblance metaphor to refer to conceptual/conventional metaphor and correlation metaphor. We prefer the term non-resemblance metaphor for three reasons. Firstly, it encompasses any type of metaphor that does not arise from resemblance. Secondly, the term conceptual metaphor is not felicitous because resemblance metaphors are also conceptual, as underlined in other studies (Kövecses 2002; Alexiev 2005). Thirdly, the term conventional metaphor is not a good choice because resemblance metaphors are also conventional.

According to Grady's (1999) characterisation of resemblance metaphors, image metaphors are associated with motionless visual images, whose motivation for metaphorical transfer is based on physical properties (e.g., shape and colour). In contrast, there are other metaphors that result from behavioural comparison, and are therefore typically linked to motion and dynamicity. This article challenges this classification. Our study of resemblance metaphor in the field of marine biology indicates that image metaphors and behaviour-based metaphors are closely linked. In fact, what truly differentiates these metaphors is the static or dynamic nature of their underlying images.¹ Rather than belonging to two different categories, they should be regarded as belonging to a graded category in which members differ in terms of the dynamicity of their images. As shall be seen, imagery is also important in non-resemblance metaphors. This means that images and their analysis should go far beyond mere physical or behavioural resemblance.

* Correspondence address: University of Granada, Buensuceso Street 11, Postcode 18002, Granada, Spain. Tel.: (+34) 958 240517. E-mails: <jmurena@ugr.es>; <pfaber@ugr.es>.
Cognitive Linguistics 21–1 (2010), 123–149. DOI 10.1515/COGL.2010.004. 0936–5907/10/0021–0123 © Walter de Gruyter

2. Defining imagery
Imagery has two related senses. First of all, it refers to quasi-perceptual experience, which significantly resembles perceptual experience, but occurs in the absence of the appropriate perceptual stimuli (Thomas 1999: 208). This definition includes image schemas and mental images, both of which are key ideas in Cognitive Linguistics (cf. Johnson 1987; Lakoff and Johnson 1999). An image schema can be regarded as an instance of imagery simpliciter or an especially unsaturated form of imagery, produced by simulating only the very earliest and most generally applicable stages of the process of a perceptual exploration (Thomas 2009, A note on schema and image schema, para. 9). Apart from being an unsaturated form of imagery, image schemas are non-intentional because they do not participate in the conscious act of perceiving. In other words, image schemas are emergent properties of unreflective bodily experience (Gibbs and Colston 2006: 247).

In contrast, a mental image is a more substantiated kind of mental representation. It cross-cuts any sensory mode, and embodies our perceptual and imagistic awareness.² Mental images are intentional, insofar as they involve a conscious mental act of perceiving. In other words, they are the result of more effortful cognitive processes (Gibbs and Colston 2006: 247). Mental images also have content specificity, the complexity of which is constrained by linguistic and environmental situatedness. Image schemas may also be complex, but in the sense that they can combine to give structure to conceptual domains (cf. Cienki 1997: 9; Kimmel 2005). This structure entails conceptual relationships, and accordingly, it has been shown that some image schemas are subsidiary or subordinate to others (Peña 1999).

The second sense of imagery is related to a well-entrenched view in cognitive psychology that mental images are a key factor in creative thought (cf. Finke et al. 1992; Weisberg 1986). Consequently, imagery not only refers to true imagination, but also to the production of mental images that arise from our capacity to separate, shuffle, distort and recombine simpler mental images in the first sense. This ability has its cognitive uses (Finke et al. 1989; Finke et al. 1992), including the generation of metaphoric thought. In our view, both senses of imagery are essential to account for mental images in metaphoric thought.

1. In this regard, this research work is on a par with other studies. For instance, Caballero (2006) explores a host of image metaphors that are dynamic, and Peña (2003) makes a distinction between situational and non-situational metaphors that is based on the feature of dynamicity.

2. The notion of mental image is admittedly rather vague. Because of this, a mental image need not refer to a mental picture, but can also refer to sensory images or image simulations in different sensory modes. As Pylyshyn (2003: 113) aptly points out, this is due to the fact that neither language nor pictures are sufficient to represent the content of thought and that most thought is not available to conscious inspection [ . . . ] If there is something special about the format in which we think when we have the experience of seeing with the mind's eye, nobody has satisfactorily articulated what it is.
3. Image metaphors and behaviour-based metaphors
Grady (1999), Lakoff (1993), and Lakoff and Turner (1989) agree that the core feature of image metaphors is the comparison between the images of two entities. Lakoff (1993: 230) writes that because two images are being compared, these metaphors are called image metaphors:
Metaphoric image-mappings work in just the same way as all other metaphoric mappings: by mapping the structure of one domain onto the structure of another. But here, the domains are conventional mental images.
The prototypical conception of an image metaphor is a metaphor based on resemblance in shape and/or colour. Clear instances of marine biology image metaphors are seahorse (Hippocampus), which refers to a fish with a horse-like head (see picture in Table 1 in the Appendix), and milkfish (Chanos chanos), so called because of the white underside of this fish (see picture in Table 1 in the Appendix). These metaphors are based on visual perception, which is the dominant component of our embodied conceptualisation system (Watt 1991). For this reason, this type of metaphor has the highest degree of iconicity or mental imagery shared by the source and target concepts. These metaphors clearly differ from behaviour-based metaphors, such as hermit crab (Paguroidea), in which the crab acts like a hermit instead of looking like one.

According to Grady (1999), behaviour-based metaphors cannot be called image metaphors because they are based on behavioural rather than physical resemblance. By way of example, Grady (1999: 89) mentions the well-known metaphor Achilles is a lion. Since Achilles' courageous actions resemble the aggressive behaviour of lions without any claim about his physical appearance, this metaphor cannot be considered an image metaphor. Strictly speaking, according to Grady's classification, marine biology terms such as sea nettle, archerfish or triggerfish are not image metaphors either, since they are based on behavioural or functional resemblance. Nevertheless, we argue that these metaphors also evoke mental images, and that mental images are not exclusively associated with metaphors based on physical comparison.
3.1. Images in behaviour-based metaphors
There is a growing body of research in cognitive psychology showing that mental representations of perceptual experience are central to cognition (cf. Damasio 1994; Finke 1989; Paivio 1971, 1986; Thomas 1999³). In this section we show that behaviour-based metaphors are grounded in mental images that can be either dynamic or static.

3.1.1. Dynamic images in behaviour-based metaphors. Behaviour-based metaphors, such as Achilles is a lion, emerge from the visual experience of a motor action, which yields a set of images that are fleshed out by spatial-dynamic actions. In this metaphor, we evoke images of Achilles bravely confronting his enemies and a lion fiercely fighting other animals for survival (also in Ruiz de Mendoza and Peña 2008). The nature of these images is constrained by the image-schematic topology of the target domain, which cannot be violated by the cognitive topology of the source domain, while still remaining consistent with it. This is in consonance with Lakoff's Invariance Principle (Lakoff 1990, 1993). In spite of this constraint, the images retrieved by this type of metaphor are dispersed, since each individual may re-create these actions in different mental scenarios. As Deane (2005: 247) points out, the same spatial relation may receive distinct representations in multiple representational modalities.

Although most research on imagery in contemporary cognitive psychology focuses on visual perception (and, to a much lesser extent, on audition), there is growing evidence that kinaesthetic, somaesthetic and haptic perception is also pivotal to mental image formation (cf. Gibbs 2006; Gibbs et al. 1994; Gibbs and Colston 2006 [1995]; Popova 2005). This means that mental images need not necessarily be visual in nature, and that visual imagery and kinaesthetic imagery share a common representational, and possibly neuropsychological, substrate (Gibbs 2006: 124).
According to Paivio's (1971, 1986) dual coding approach, cognitive tasks are mediated not only by linguistic processes, but also by a nonverbal imagery model of thought. What Paivio calls the image system in our brains refers to both non-verbal objects and events, and arises not only from visual stimuli, but also from auditory, kinaesthetic, and other
3. Thomas (1999) dwells on the three major theories of imagery in Conceptual Science, namely Picture Theory, Description or Propositional Theory, and Perceptual Activity Theory. He aligns himself with Perceptual Activity Theory, according to which perceptual learning is not viewed as a matter of storing descriptions (or pictures) of perceived scenes or objects, but as the continual updating and refining of procedures or schemata (Thomas 1999: 218).
sensory components of non-verbal information. Accordingly, behaviour-based and function-based metaphors can also be regarded as image metaphors because they are closely linked to conventional mental images representing events, which are not necessarily based on visual stimuli. Moreover, since behaviour and function mostly involve (loco)motion on account of a correlation or cause-effect event, most behaviour-based images (i.e., images that feature the behaviour of a living being) and function-based images (i.e., images that feature the functioning of an instrument, device or machine) are unquestionably dynamic.
3.1.1.1. Sea wasp. In the field of marine biology, many specialised concepts have basic-level category denominations. This guarantees richly contoured and easily retrievable mental images, since the basic level is the level of rich mental images and rich knowledge structure (Lakoff 1993: 212). For example, the metaphor sea wasp, which is an alternative scientific name for the jellyfish Chironex fleckeri (see picture in Table 1 in the Appendix), evokes an easily retrievable image that primes kinaesthetic perception. It also gives priority to the more subjective sensory image of actually participating in an event, rather than to the objective and visual pattern of observing it. In this case, the perceptual experience foregrounded is touch, which is a somaesthetic and kinaesthetic sense, and like vision, also a spatial sense (Popova 2005: 402). This metaphor evokes the dynamic event image of our touching a wasp, its stinging us, and our subsequent experience of pain. This image is mapped or superimposed onto the image of a jellyfish injecting its stinging capsules or nematocysts under our skin, which causes the pain. This metaphor has a metonymic basis. The close relationship and interaction between metaphor and metonymy has been underlined in recent research (cf. Barcelona 2003; Radden 2002).
More precisely, the sea wasp metaphor is based on two conceptual metonymies operating on the two domains or categories connected by the metaphor. In the metonymies, the source is the stinging capacity, which is a shared attribute of the targets wasp and jellyfish. In other words, both wasps and this type of jellyfish have to be metonymically understood from their salient property, stinging capacity, as metonymic source, which creates the abstract similarity that makes the metaphorical connection between the source (wasp) and the target (jellyfish).
3.1.1.2. Archerfish. Another example of a behaviour-based metaphor relying on dynamic images is archerfish (Toxotidae). The behaviour of this fish is compared to that of an archer, which includes the function of an archer's bow, which shoots arrows at a target. The reason for this
comparison is that archerfish have the ability to spit water droplets at aerial insects (either on the wing or resting on surfaces above the water), and thus knock them onto the water to be eaten (see picture in Table 1 in the Appendix). Thus, the dynamic image of an archer shooting an arrow at his target is superimposed onto the image of an archerfish spitting water at an insect. This metaphor also has a metonymic basis. The source domain of the metonymies is shooting capacity as instantiated by: (i) the archer's use of a bow and arrow; (ii) the archerfish's projection of water droplets to hit insects. The source domain of the metonymies stands for the targets, archer and archerfish, and is in turn responsible for the abstract similarity that makes the metaphorical connection between the source (archer) and the target (archerfish). All of these metaphors can also be approached from the perspective of Conceptual Blending Theory (Fauconnier and Turner 1998, 2002). They are clear instances of formal blending, more specifically, of compounding. For example, archerfish involves two input spaces relating to archer and fish, plus the conventional array of meanings linked to these lexical items. However, the projection to the blended space is selective, including only the subset of semantic features associated with the concepts of both archer and fish, along with their forms (i.e., word projection). Thus, both conceptual structure and linguistic structure are projected onto the blend, giving rise to a new emergent structure. Figure 1 illustrates this scenario. This structure is novel both from a linguistic point of view (the creation of a new word) and from a semantic point of view (the creation of a new meaning).
3.1.2. Static images in behaviour-based metaphors. Although behaviour most frequently implies dynamic mental images, curiously enough, we have found behaviour-based metaphors in the field of marine biology that are based on static images.
For instance, the metaphor hawkfish (Cirrhitidae) refers to a fish that behaves like a hawk because it rests atop the highest point on the coral reefs, waiting for suitable prey to appear (see Table 1 in the Appendix). The fish then dives down to capture its prey. The initial image of a motionless hawkfish awaiting its prey on a high vantage point maps onto that of a motionless hawk on a tree branch or cliff, waiting to capture its prey. Still another example is the metaphor garden eel (Heterocongridae). Garden eels receive this name because they live in colonies, keeping the main portion of their bodies buried in the sandy sea bottom while the rest remains upright in the open sea (see Table 2 in the Appendix). This behaviour retrieves a motionless image which resembles that of slim
Figure 1. Blended space of archerfish
plants in a garden. The garden eel metaphor also entails a physical aspect motivation: the mass-effect shape of the eels allows for the comparison between these animals and a garden. On this basis, we argue that this is another metonymy-based metaphor. The source of the metonymies, i.e., the state of standing still, maps onto the targets, i.e., plants in the garden and eel, and prompts the metaphorical connection between them.
3.2. Dynamic image metaphors
In marine biology, most of the examples refer to either a behavioural/functional model or a physical-aspect model. For instance, the metaphors sea nettle and sea wasp are based on behaviour. Archerfish integrates behavioural and functional motivations, whereas triggerfish arises from resemblance in function. Independently, horseshoe crab is a shape-induced metaphor, and sea lettuce emerges as a result of comparison in shape and colour (see Table 1 in the Appendix).
Generally speaking, we tend to think of shape and colour as more static than dynamic attributes. However, it is evident that an entity can change its shape as well as its colour. Accordingly, there are also dynamic metaphors based on physical comparison. This fact supports the claim that people find it easier to make sense of [ . . . ] moving objects over those that are stationary (Gibbs and Colston 2006: 252). Concerning shape, Deane (2005: 249–250) affirms that there are multiple representations of shape: one representation depicts static forms; the other depicts dynamic form. Lakoff (1993: 229) provides the following example when describing the characteristics of image metaphors: the image of the slow, sinuous walk of an Indian woman is mapped onto the image of the slow, sinuous, shimmering flow of a river. Though involving dynamicity, this example features an image metaphor because it is moving shapes or lines that are compared. However, dynamism also entails behavioural or functional patterns. These are processed in our brains, and create interrelated experience-based concepts that become meaningful because of these regular patterns. This evidently leads to behaviour-based or function-based metaphors. Thereby, the slow and sinuous walk of the Indian woman is part of the way she walks, and thus, of her behaviour. Likewise, the slow and sinuous flow of a river is also part of its behaviour. Thus, this is a resemblance metaphor which integrates physical and behavioural motivations. There are also resemblance metaphors in marine biology that combine both behaviour and physical appearance. Such is the case of the anglerfish (Lophius). This fish behaves like, and thus, resembles an angler for two reasons: (i) the shape of the foremost spine of its dorsal fin looks like a fishing rod with its fishing line and fleshy bait at its tip (see picture in Table 2 in the Appendix); (ii) this spinal fishing rod is used as a lure for attracting prey which stray close enough for the anglerfish to swallow.
Since catching a prey is an action or event, this can be regarded as a dynamic image. Still another example is the metaphor boxer crab (Lybia tessellata). This crab holds an anemone in each pincer, and uses these anemones for protection (usually against octopuses) in the same way as a boxer uses his fists against his opponent (see picture in Table 2 in the Appendix). These little round-shaped anemones resemble boxing gloves, while the action of attacking predators with the anemones is a type of behaviour that resembles that of a boxer. Regarding colour, an example of a dynamic resemblance metaphor is chameleon fish (Badis badis). This is a freshwater fish that changes its skin colour when hungry, threatened or protecting its eggs, offspring, or territory (see pictures in Table 2 in the Appendix). This change of skin
colour occurs within a single static locus (i.e., locomotion is not involved). Yet, this type of effect creates mental video-clips of sequentially unfolding images, which naturally implies change or dynamic structure. This is the reason why we recruit the superimposed dynamic images of a real chameleon and of this fish, altering their skin colour. This physical ability is part of their behaviour. In summary, image metaphors and behaviour-based metaphors are not clearly differentiated categories, since there is a group of metaphors that possess characteristics of both, and thus reside in a transition zone between the two.
3.3. Fictive dynamicity in resemblance metaphors
Our tendency to think in terms of dynamic patterns has been documented (cf. Talmy 1999 [1996]). Such a tendency hinges upon representations that are motionless in nature. These representations emerge from what Talmy (1999 [1996]: 245) calls ception or fictive motion, which involves sensory stimulation, mental imagery, and ongoingly experienced thought and affect. Metaphor is found in fictive motion constructions dealing with spatial description (Talmy 1996). Regarding specialised language, Caballero (2006) identifies instances in architectural discourse where metaphor plays a role in fictive motion. Example (1) given by Caballero (2006: 180) includes motion verbs codifying actual static scenes, which are conceptualised as non-veridical dynamic images through metaphorisation:
(1) Based on a boomerang shaped plan, the new building steps down from a prow at its south end to embrace a new public space.
In the field of marine biology, we have also found metaphorically extended motion verbs that evoke visual mental images involving fictive dynamic structure, as shown in the following examples:
(2) The Røst Reef, the world's largest known deep water coral reef, forms a structure that fades away to depths between 300 and 400 m.
(3) The seaward edge of a reef is fairly steep and slopes down to deeper water. Since the water is generally clearer, corals may grow to the depths of 50 m depending on light available.
(4) In the Pulmonate the rudimentary velum, v, is marked by a line of granular ciliated cells, which [ . . . ] bends up towards the dorsal surface, in such a way as to almost encircle the tentacles.
These metaphors are clearly imagistic in nature, and form a part of the expert's visual thinking (Caballero 2006: 3). What makes this type of
metaphors interesting is their complex nature. They can be regarded as instantiations of the more general metaphor FORM IS MOTION (Lakoff and Turner 1989: 142–144). At the same time, these metaphors emerge because the form that they evoke matches the actual shape of the entities, and is based on how they are visually scanned. In other words, despite the fact that they are often classified as non-resemblance metaphors, resemblance is involved here, but of a more sophisticated kind. We can thus conclude that while the two types of resemblance metaphor cannot be regarded as clear-cut categories because dynamic structure and static structure permeate both categories, in some resemblance metaphors it is the boundaries between static structure and dynamic structure that are fuzzy. However, the fuzzy boundaries between both schemas in these resemblance metaphors answer to psychological strategies rather than reflect the actual state of affairs. In short, the conceptualisation of factive stasis or stationariness through this kind of metaphor is biased because it results in images involving fictive change.
4. Non-resemblance metaphors
Resemblance metaphors emerge from the superimposition of easily retrievable mental images. Yet, non-resemblance metaphors also involve the retrieval of mental images. Indeed, the great bulk of current research on figurative mental imagery deals with non-resemblance metaphors. Consequently, both types of metaphor are more closely linked than previously assumed. The pervasiveness of mental imagery is due to the logic of our embodied conceptual system, which licenses the creation of any type of metaphor on the basis of mental images. Therefore, strictly speaking, mental imagery constitutes the grounding of metaphoric thought.
4.1. Mental images in non-resemblance metaphors
Lakoff (1993: 229) writes that the rationale of conceptual metaphors, namely understanding abstract concepts through concepts directly grounded in bodily experience, involves mental imagery, which is the mental realisation of such experience:
Abstract reasoning is a special case of image-based reasoning. Image-based reasoning is fundamental and abstract reasoning is image-based reasoning under metaphorical projections to abstract domains.
As a general rule, words can designate portions of conventional mental images (Lakoff and Johnson 1999: 69). Recent research provides evidence
that language makes much greater use of the brain's mental imagery than previously thought (Rohrer 2005: 166). In keeping with the two-domain-of-experience mapping system proposed by Conceptual Metaphor Theory, when both domains are active, imagery associated with source-domain entities can be activated, and thereby associated with the target-domain entities neurally connected to them (Lakoff and Johnson 1999: 56). As previously discussed, kinaesthetic perception involves motor activity or bodily (loco)motion, which occurs in space. In fact, mental imagery is prominent in the form of spatial-dynamic images, especially when it comes to real or imagined body action. As pointed out by Rohrer (2005: 169), mental imagery can also be kinaesthetic, as in the felt sense of one's own body image. Mediation of the lived body action for mental image formation is called embodied simulation (Gallese 2005). Gibbs and Perlman (2006: 223) affirm that processing metaphoric meaning is not just a purely cognitive act, but involves some imaginative understanding of the body's role in structuring abstract concepts. Examples of embodied simulation can be found in expressions such as chewing on the idea and grasping an idea, which arise from the conceptual metaphor IDEAS ARE OBJECTS. Gibbs et al. (2006) demonstrated that people imaginatively engage in the act of chewing or grasping something to better understand these metaphorical phrases. Furthermore, it has been shown that the literal re-enactment of figurative verbal cues activates the primary motor and somatosensory cortices in our brains (Rohrer 2005). This underscores the significance of embodiment or sensorimotor experience for metaphorical concept formation. Since Conceptual Metaphor Theory posits that abstract concepts are ultimately grounded in perceptual or bodily grounded experience (Kövecses 2005; Lakoff 1990; Lakoff and Johnson 1980), mental imagery is thus an integral part of all metaphors.
As Caballero (2003a: 152) stresses in the field of architecture, if a distinction is to be made between images and concepts, such a distinction should not lie in the image component, since all the information organised and processed in our minds is essentially imagistic. In a like way, it can also be argued that all the information organised and processed in our minds, including images, is also conceptual. Mental images are likewise present in an extensive class of metaphorical or imageable idioms (Lakoff 1987; Lakoff and Johnson 1999). An imageable idiom comes with a conventional rich mental image and knowledge about that image (Lakoff and Johnson 1999: 68). According to Lakoff and Johnson (1999), a significant portion of the array of linguistic expressions stemming from the conceptual metaphor LOVE IS A JOURNEY
consists of idioms. They give the expression spinning one's wheels as an example. In marine biology we have also found non-resemblance metaphors based on dynamic mental images. This is the case of recruitment, which refers to the incorporation of new members of one species to the stock of the already existing individuals, particularly those living in communities. This includes shoals of fish and planktonic aggregates. This metaphor can be subsumed by the more general metaphor MARINE COMMUNITIES ARE MILITARY STRUCTURES, which gives rise to metaphorical terms, such as intrusion, cohort, sentinel organism, invasive exotic species, evolutionary arms race and line of defence. Recruitment activates the generic dynamic mental image of a group of organisms that increases as new organisms join them. The specific details of this image largely depend on the context in which the metaphor is embedded. For instance, the recruitment of individuals of the species Engraulis encrasicolus, which is a type of anchovy, evokes a different mental image from that evoked by the recruitment of individuals of the species Labidocera scotti, a kind of marine planktonic copepod (i.e., a small crustacean). A second factor constraining and modelling the mental image activated is encyclopaedic meaning. According to Lakoff and Johnson (1999: 69), a metaphorical word is not just a linguistic expression of a metaphorical mapping, but the linguistic expression of an image plus knowledge about the image plus one or more metaphorical mappings. Thus, the mental image of the recruitment of anchovies evoked by an expert in marine biology is certainly richer than that evoked by a layman. However, it is also true that cognitive patterns give priority to the objective aspects of images rather than to their subjective implications (Dewell 2005: 386).
Moreover, culture, as a specific type of contextual factor, also has a decisive role in forming conventional rich images, which appear to be pretty much the same from person to person in the same culture (Lakoff 1987: 450). However, when cultures differ, so do images. Consequently, when a European biologist builds a mental image of the recruitment of anchovies, in all likelihood the image brought to mind is that of an anchovy of the species Engraulis encrasicolus, most frequently found in the Mediterranean Sea. The physical features of this species of anchovy are different from those of the species Encrasicholina heterolobus, which inhabits the Indo-Pacific region. A biologist from Australia would probably activate an image of this species of anchovy, when he or she is thinking about recruitment. It can thus be concluded that mental images permeate both resemblance metaphors and non-resemblance metaphors.
J. M. Ureña and P. Faber
4.2. Similarities between non-resemblance metaphors and resemblance metaphors
The previous section showed that mental images, traditionally associated with resemblance metaphors, have an important role in non-resemblance metaphors as well. In this section we argue that resemblance metaphors also have features that are traditionally considered to pertain exclusively to non-resemblance metaphors. Conceptual Metaphor Theory has primarily focused on conceptual/conventional metaphors, which emerge from multiple mappings between two content-rich domains of experience. In other words, they have rich knowledge and rich inferential structure (Lakoff and Turner 1989: 91). Although this work is of undeniable interest, it has also meant that image metaphor has been more or less left out in the cold, and has been regarded as a kind of second-class metaphor. The main reason for this is that image metaphor is regarded by Lakoff and co-workers as a fleeting, ad hoc kind of metaphor with an impoverished inner structure (Lakoff 1987, 1993; Lakoff and Turner 1989). Nevertheless, in recent years there has been a renewed interest in resemblance metaphor. Corpus-based research both in general language (Deignan 2007) and specialised discourse (Caballero 2003a, b, 2006 in architecture) shows that resemblance metaphors are well-established, conventional metaphors that arise from enduring and productive patterns of figurative thought. For example, our research in marine biology shows that certain well-entrenched resemblance metaphors can be brought together under productive, encompassing metaphors. Accordingly, terms like elephant seal (Mirounga), seahorse (Hippocampus), sea lion (Otariidae), hawkfish (Cirrhitidae), spider crab (Maiidae), boarfish (Capros aper) and sand tiger shark (Carcharias taurus) can be subsumed by the general metaphor SEA ANIMALS ARE LAND ANIMALS.
Another such metaphor is MARINE ORGANISMS ARE WORKERS, which stems from the multiple-correspondence process involving metaphorical terms, such as surgeonfish (Acanthuridae), pilot fish (Naucrates ductor), anglerfish (Lophius), fiddler crab (Uca), harvestfish (Peprilus alepidotus), by-the-wind sailor (Velella spirans), nurse shark (Ginglymostoma cirratum), innkeeper worm (Urechis), and rock cook (Centrolabrus exoletus)⁴. Still another aspect that places resemblance metaphors on the same level as non-resemblance metaphors is that resemblance metaphors
4. As shown in Section 3.3, the resemblance metaphors involving fictive dynamicity can also be subsumed by a more general metaphor (i.e., FORM IS MOTION).
meet the two generalisation principles proposed by Lakoff (1993: 209) for non-resemblance metaphors, namely, the polysemy generalisation and the inferential generalisation. According to the polysemy generalisation, certain linguistic expressions of the source domain acquire related senses. For example, terms like thresher and sponge have two or more senses, one of which refers to marine organisms. The main sense of thresher refers to a man who threshes the grain by beating it with a flail (a long, thin tool). In marine biology, a thresher is a shark of the genus Alopias. The metaphorical motivation is resemblance in both shape and behaviour. Regarding shape, the shark's abnormally long, thin caudal fin looks like a flail, and insofar as behaviour is concerned, the shark uses its flail-like fin to strike its prey and render them dazed. In the case of sponge, the central meaning of the concept is the marine biology sense. Sponge refers to a marine invertebrate animal of the phylum Porifera, characteristically having a porous skeleton composed of fibrous material or siliceous or calcareous spicules. The metaphoric sense of sponge, namely, porous plastics, rubber, cellulose, or other material chiefly used for washing, bathing, and cleaning, arises on account of physical resemblance because the porous structure of this object looks like the skeleton of the marine organism. By virtue of the inferential generalisation, each mapping defines an open-ended class of potential correspondences across inference patterns (Lakoff 1993: 10). This is true for metaphors in the domain of marine biology because new species are continually being discovered. Such species are usually given metaphorical names that fit into existing metaphorical systems within the domain, and thus increase the number of cross-domain correspondences and mappings that characterise a given resemblance metaphor.
This capacity to infer, which emerges from the topological or gestaltic structure of conceptual (as opposed to linguistic) metaphors (Lakoff and Johnson 1980), follows a robust domain logic to create terms for marine organisms, as well as other terms. Accordingly, surgeonfish have scalpels; anglerfish use their baits; harvestfish harvest food for survival; pilotfish usually travel together with sharks (see picture in Table 1 in the Appendix); and a burrow of innkeeper worms is occupied by several commensals. These examples clearly reflect the role played by visually biased figurative language as an efficient instrument to organise thought through prolific inferential processes and evaluation (Caballero 2006: 3). More concretely, these examples show the importance of visual thinking in the creation of conceptually rich domain knowledge, which is enhanced by the high number of systematic cross-domain correspondences.
In this respect, the difference between both classes of metaphors lies in the type and nature of mappings involved since both have multiple mappings as well as polysemic and inferential conditions.
4.3. Mental images in primary metaphors and correlation metaphors
Grady (1999: 87) clearly highlights the role of imagery in primary metaphors:
Quantity, desire [ . . . ] may take place at the level of cognition whose operation is not directly accessible to consciousness. In order to manipulate them at the conscious level it may be necessary to tie these elements of mental experience to specific sensory images.
Grady (1997: 100) affirms that the direct bodily basis of primary source concepts is processed in our brains in the form of images (image content), which are paired with target concepts (response content) to build primary scenes. Although he states that primary scenes are not necessarily fleshed out by rich content, at the same time, he holds that primary scenes are local structures, motivated by particular moments in our experience. He thus argues for the participation of down-to-earth conceptual structure in the source domain that cannot be as abstract as image schemas⁵. This conceptual structure would therefore have a greater level of specificity than image schemas, and illustrate the need of our cognitive system to resort to more or less specific images when constructing (metaphoric) meaning. Grady's view is likewise endorsed by Lima (2006: 115):
For instance, all cases of containers can be included in the image schema of a container, but each case may involve many primary scenes, such as (a) going into a room or (b) taking something out of a box, which can generate distinct metaphors. Even if we can have a schematic mental representation that is abstract enough to include all cases, the experiences that generate the metaphors do not seem to be the same in all of them. For example, in scene (a) going into a room, we experience going into spaces with certain characteristics and certain limits; in (b) taking something out of a box, we experience interacting with a container and its contents.
It could be argued that high-level primary metaphors such as ORGANISATION IS PHYSICAL STRUCTURE are not based on images of a specific type.
5. Far from criticizing image schemas, we are underlining the important role of images in metaphoric thought that is not resemblance-based.
However, we should not forget that these are metaphoric formulations, namely, abstractions or generalisations of concrete linguistic and environmental situations. As research shows, language processing draws on location-specific perceptual images of entities and their attributes (Bergen et al. 2007: 734). Therefore, we can only have mental images of more or less specific types of physical structure (e.g., solids, liquids, etc.) if the abstraction is substantiated and situated. Evidently, the more information that we have about an entity, the richer its mental image will be. Moreover, mental images are generated by assembling the parts of the image one part at a time (Gibbs and Colston 2006: 247). In other words, we can speak of a procedural representation, in which the mental image of an entity is not built all at once, but rather sequentially by scanning its parts. It is our claim that mental images rather than solely image schemas are often the grounding for both resemblance metaphors and non-resemblance metaphors. According to Grady (1997), primary metaphors involve experiential correlation, which consists of establishing a strong conceptual link between two distinct events that iteratively co-occur. This phenomenon usually gives rise to correlation metaphors because after repeated co-occurrences in our experience of the world, we come to conceive one event in terms of another. This evidently makes correlation metaphors different from resemblance metaphors. These basic correlation-induced physical experiences are generally recurring events throughout our life. For example, the experience of one's body moving through space generates the metaphor ACTIONS ARE SELF-PROPELLED MOTIONS and expressions, such as I am moving right along on the project (Lakoff and Johnson 1999: 52). This expression is linked to the dynamic image of ourselves moving through space.
Since experiential correlation often implies a cause-effect event, it would seem that the retrieval of dynamic mental images for primary metaphors is at the center of this process. However, this is not always the case, because the range of sensorimotor domains activated in primary metaphors also includes static experiences, involving domains such as temperature (AFFECTION IS WARMTH), size (IMPORTANT IS BIG), physical configuration (UNINTERESTING IS FLAT) and location (STATES ARE LOCATIONS)⁶.
6. See Grady (1997) for an account of other primary metaphors, or Lakoff and Johnson (1999: 50–54), who also provide an inventory.
4.4. Similarities and differences between correlation metaphors and resemblance metaphors
In the field of marine biology, we have found resemblance metaphors based on cause-effect relationships, rather than physical similarity. The reliance on the cause-effect schema brings this type of resemblance metaphors close to correlation metaphors. In resemblance metaphors of this kind, the two entities involved share some facet of their behaviour or function (in the case of the source concept). In the case of the sea nettle, this property is a defence mechanism shared by both the plant and the marine organism. The experience involves a cause-effect experience of touching a nettle/sea nettle (cause/stimulus) and the subsequent perception of an itchy sensation (effect/response). The example of the sea nettle is a cause-effect metaphor that highlights active (kinaesthetic) sensorial experience, i.e., direct interaction. However, there are also cases in which cause-effect events are backgrounded because passive (visual or auditory) experience is primed. For example, the term ghost crab, which is based on a resemblance metaphor, only implies the visual experience of a motor action. The ghost crab (Ocypode) receives its name because of its ability to disappear from sight almost instantly by sprinting and scuttling at speeds up to 10 miles per hour while making sharp directional changes. This crustacean behaves this way when it establishes visual contact with a potential predator (correlation or cause-effect event). Consequently, marine biology resemblance metaphors could be classified in terms of the sensorimotor experiences that the terminological designations are based upon. The specification of the motivations for metaphorical transfer is essential for a clear classification of resemblance metaphors. Grady (1999: 98) acknowledges that an in-depth analysis is needed in order to further refine the different types of metaphor by carefully considering the motivations for these metaphors.
This emphasises the significance of our embodied conceptualisation system, which licenses the formation of any type of metaphor on the basis of mental images, and involves all manner of sensorimotor experiences. Cause-effect structure can also be at work in resemblance metaphors where behavioural comparison operates together with physical comparison. This is the case of the cookie-cutter shark (Isistius brasiliensis). This shark remains motionless at the sea bottom while its body emits a vivid, greenish phosphorescent gleam, except for a black band around its throat. The prey of this shark is usually large fast-swimming fish. They are lured by what appears to be the silhouette of a small fish, which is actually the shark's non-luminescent black collar. The shark behaves like a cookie-cutter in that once it has locked onto its lured prey (cause), the shark extracts cookie-shaped plugs of flesh (physical resemblance) from the victim (effect).
It should be noted that correlation in resemblance metaphors, while also entailing a cause-effect schema (see note 7), differs from correlation in primary metaphors in that in primary metaphors there is correlation between the source concept and the target concept, i.e., the source has a bearing on the target. Such source-target correlation does not exist in resemblance metaphor. For instance, in the sea nettle metaphor, the correlation between the hand touching the sea nettle (marine organism) or nettle (plant) and the resulting painful sensation is limited to either the source or the target. In other words, there is no co-occurrence of the events of touching the marine and the non-marine nettle, but only a resemblance between the overall cause-effect structure of source and target. What occurs in resemblance metaphors of this kind is that the type of correlation operating in the source concept is mapped onto the target concept. This mapping is sanctioned by the nature of the experience, which is the motivation for the correlation in the target concept. Accordingly, the stinging plant is an appropriate source for the sea nettle because the former does not violate the conceptual topology of the latter (Invariance Principle), and both share their overall cause-effect structure. In this way, the source helps to activate relevant aspects of the target, and even allows the perceiver to infer other properties about it.
Still another difference between resemblance metaphors grounded in a cause-effect schema and correlation metaphors is the metonymic nature of the latter. The prototypical example which illustrates this claim is the MORE IS UP metaphor. Radden (2002: 414) writes:
In order to correlate two variables, they have to be conceptually contiguous. The correlation of quantity and verticality provides a perfect example of conceptual contiguity in that both variables originate from the same experiential basis.
Radden argues that the metonymic grounding of correlation metaphors is based on a continuum ranging from literalness via metonymy to metaphor. This process ties in with the notions of conflation and deconflation, as well as the developmental model of primary scenes, which gives rise to primary metaphors (Grady 1997). The metaphor MORE IS UP comes into its own in four stages. The first stage involves UP being literally conceptualised. At the second stage, the variable or dimension quantity is conceptualised through partial metonymy (UP FOR UP). At the same time, quantity is linked to the dimension verticality by means of the experiential basis of conflation (i.e., UP + MORE). Conceptual conflation takes place in primary scenes, such as seeing the level of fluid in a container go up when more fluid is poured into it. As Radden (2002: 10) points out, the two manifestations of this highly frequent scene, rise of a level and rise of quantity, occur simultaneously and are intimately correlated. The third stage is deconflation, whereby UP and MORE become a full metonymy (i.e., UP FOR MORE). At the final stage of the continuum, this full metonymy becomes the primary metaphor MORE IS UP. Thus, it can be stated that the immediate basis of primary metaphors is metonymic in nature. Radden (2002: 414) further states that the causal relation between quantity and verticality strengthens the metonymic basis of both these variables and licenses the reversibility principle of metonymic relationships, according to which the flow of causation may be seen in either direction: something is more because its level is higher, or the level is higher because its quantity is more. In contrast, the pairs of cause-effect correlations that give rise to resemblance metaphors such as sea nettle occur independently (one pair in the source domain and another pair in the target domain). Therefore, they are not contiguous, and contiguity is the essential feature of metonymic mappings.
7. Correlation is not exclusively a matter of cause-effect links. In fact, Radden (2002: 414) argues that there are positive correlations, which tend to evoke a causal interpretation, and negative correlations, which do not invite a causal interpretation. However, in dealing with metaphor, we bind correlation to cause-effect structure because cause-effect correlation (i.e., positive correlation) is the only type of correlation that pertains to metaphor (Radden 2002: 414).
5. Conclusions
Within the category of resemblance metaphors there are certain very clear examples of image metaphors and behaviour-based metaphors. Prototypical image metaphors are based on visual stimuli in the form of static images. In contrast, prototypical behaviour-based metaphors are based on motion and dynamicity. However, other resemblance metaphors do not clearly belong to one category or the other. In our opinion, resemblance metaphor is a graded category, in which members are better or worse exemplars, based on the static or dynamic nature of the mental images underlying them. In other words, image metaphors and behaviour-based metaphors can be differentiated in terms of their degree of dynamicity. The existence of resemblance metaphors which combine behaviour and shape or colour seems to point to the need for a new, more refined classification.
Evidence from specialised fields, such as architecture and marine biology, shows that there is a more complex type of resemblance metaphor that is grounded in fictive dynamicity. Such dynamicity responds to psychological strategies rather than reflecting the actual state of affairs. In this case, it is the boundaries between static structure and dynamic structure that are fuzzy. Recent corpus-based research shows that resemblance metaphors and non-resemblance metaphors share features that up until now have been attributed exclusively to non-resemblance metaphor. The reason for this is the nature of mental imagery, which is an integral part of every stage of metaphor formation.
Received 10 December 2008
Revision received 9 September 2009
University of Granada
Appendix
Tables 1 and 2 show the marine biology resemblance metaphors referred to in this article. The metaphors in Table 1 represent separate, clear-cut categories, and are either image metaphors or behaviour/function-based metaphors. In contrast, Table 2 contains resemblance metaphors that are based both on behaviour/function and physical appearance. Pictures are provided to help understand the motivation for metaphorical transfer.
Table 1. Resemblance metaphors that are either image metaphors or behaviour/function-based metaphors
Metaphorical term | Type of metaphorical motivation | Picture
Image metaphors (motionless mental images)
Seahorse | Shape | (note 8)
Milkfish | Colour | (note 9)
Sea lettuce | Shape, colour | (note 10)
8. Picture provided at www.divegallery.com/seahorse_page1.htm. Last access 21 May 2009.
9. Picture provided by Bryan Harry at http://www.nps.gov/archive/npsa/NPSAfish/fish_pops/chanidae/milkfish01.htm. Last access 21 May 2009.
10. Picture provided by Guy Werner at http://home.vicnet.net.au/~earthcar/sgreport2.htm. Last access 21 May 2009.
Table 1 (Continued)
Metaphorical term | Type of metaphorical motivation | Picture
Behaviour/function-based metaphors (dynamic mental images)
Sea wasp | Behaviour | (note 11)
Pilot fish | Behaviour | (note 12)
Archerfish | Behaviour, function | (note 13)
Behaviour-based metaphor (motionless mental image)
Hawkfish | Behaviour | (note 14)
11. Picture provided by Dr. Zoltan Takacs at http://zoltantakacs.com/zt/pw/in/album.php?idx=18. Last access 21 May 2009.
12. Picture provided by Eric Orchin at http://photo.net/photodb/photo?photo_id=5270518. Last access 21 May 2009.
13. Picture provided by Stefan Anitei at http://news.softpedia.com/images/news2/Archerfish-Tunes-its-Shot-Power-to-the-Prey-Size-2.jpg. Last access 21 May 2009.
14. Picture provided by Mark Pidcoe at http://week.divebums.com/2006/Nov06-2006/index.html. Last access 21 May 2009.
Table 2. Resemblance metaphors that can be regarded as both image metaphors and behaviour/function-based metaphors
Metaphorical term | Type of metaphorical motivation | Picture
Metaphors based on dynamic mental images
Chameleon fish | Behaviour, colour | (note 15)
Boxer crab | Behaviour, shape | (note 16)
Anglerfish | Behaviour, function, shape | (note 17)
Metaphor based on a motionless mental image
Garden eel | Behaviour, shape | (note 18)
15. Pictures provided by LA Productions at http://aqualandpetsplus.com/Oddball,%20Badis%20badis.htm. Last access 21 May 2009.
16. Pictures provided by Linda Johnston at http://www.lembehresort.com/photo_image_pic_boxer_crab_by_linda_johnston_g8m38.html. Last access 21 May 2009.
17. Picture provided by Bruce Robison/Corbis at http://animals.nationalgeographic.com/animals/fish/anglerfish.html. Last access 21 May 2009.
18. Picture provided by the Gull Dive Center at http://www.gullboatsandrv.com/index.aspx/Dive_Shop. Last access 21 May 2009.
References
Alexiev, Boyan. 2005. Contrastive aspects of terminological metaphor. Sofia: University of Sofia PhD dissertation.
Barcelona, Antonio. 2003. On the plausibility of claiming a metonymic motivation for conceptual metaphor. In Antonio Barcelona (ed.), Metaphor and metonymy at the crossroads. Cognitive approaches, 31–58. Berlin & New York: Mouton de Gruyter.
Bergen, Benjamin, Shane Lindsay, Teenie Matlock & Srini Narayanan. 2007. Spatial and linguistic aspects of visual imagery in sentence comprehension. Cognitive Science 31. 733–764.
Caballero, María del Rosario. 2003a. Metaphor and genre: The presence and role of metaphor in the building review. Applied Linguistics 24. 145–167.
Caballero, María del Rosario. 2003b. How to talk shop through metaphor: Bringing metaphor research to the ESP classroom. English for Special Purposes Journal 22(2). 177–194.
Caballero, María del Rosario. 2006. Re-viewing space: Figurative language in architects' assessment of built space. Berlin & New York: Mouton de Gruyter.
Cienki, Alan. 1997. Some properties and groupings of image schemas. In Marjolijn Verspoor, Kee Dong Lee & Eve Sweetser (eds.), Lexical and syntactical constructions and the construction of meaning, 3–15. Amsterdam: John Benjamins.
Damasio, Antonio. 1994. Descartes' error. Emotion, reason and the human brain. New York: Grosset/Putnam.
Deane, Paul. 2005. Multimodal spatial representation: On the semantic unity of over. In Beate Hampe (ed.), From perception to meaning. Image schemas in Cognitive Linguistics, 235–282. Berlin & New York: Mouton de Gruyter.
Dewell, Robert. 2005. Dynamic patterns of CONTAINMENT. In Beate Hampe (ed.), From perception to meaning. Image schemas in Cognitive Linguistics, 370–393. Berlin & New York: Mouton de Gruyter.
Deignan, Alice. 2007. Image metaphors and connotations in everyday language. Annual Review of Cognitive Linguistics 5. 173–191.
Fauconnier, Gilles & Mark Turner. 1998. Conceptual integration networks. Cognitive Science 22(2). 133–187.
Fauconnier, Gilles & Mark Turner. 2002. The way we think: Conceptual blending and the mind's hidden complexities. New York: Basic Books.
Finke, Ronald. 1989. Principles of mental imagery. Cambridge: MIT Press.
Finke, Ronald, Thomas Ward & Steven Smith. 1992. Creative cognition: Theory, research, and applications. Cambridge: MIT Press.
Finke, Ronald, Steven Pinker & Martha J. Farah. 1989. Reinterpreting visual patterns in mental imagery. Cognitive Science 13. 51–78.
Gallese, Vittorio. 2005. Embodied simulation: From neurons to phenomenal experience. Phenomenology and the Cognitive Sciences 4. 23–48.
Gibbs, Raymond W. Jr. 2006. Embodiment and Cognitive Science. New York: Cambridge University Press.
Gibbs, Raymond W. Jr., Dinara Beitel, Michael Harrington & Paul Sanders. 1994. Taking a stand on the meaning of stand: Bodily experience as motivation for polysemy. Journal of Semantics 11. 231–251.
Gibbs, Raymond W. Jr. & Herbert L. Colston. 2006. The cognitive psychological reality of image schemas and their transformations. In Dirk Geeraerts (ed.), Cognitive Linguistics: Basic Readings, 239–268. Berlin: Mouton de Gruyter.
Gibbs, Raymond W. Jr., Jessica J. Gould & Michael Andric. 2006. Imagining metaphorical actions: Embodied simulations make the impossible plausible. Imagination, Cognition and Personality 25(3). 221–238.
Gibbs, Raymond W. Jr. & Marcus Perlman. 2006. The contested impact of cognitive linguistic research on the psycholinguistics of metaphor understanding. In Gitte Kristiansen, Michel Achard, René Dirven & F. J. Ruiz de Mendoza (eds.), Cognitive Linguistics: Current applications and future perspectives, 211–228. Berlin & New York: Mouton de Gruyter.
Grady, Joseph. 1997. Foundations of meaning: Primary metaphors and primary scenes. Berkeley: University of California PhD dissertation.
Grady, Joseph. 1999. A typology of motivation for conceptual metaphor: Correlation vs. resemblance. In Raymond W. Gibbs Jr. & Gerard Steen (eds.), Metaphor in Cognitive Linguistics, 79–100. Amsterdam & Philadelphia: John Benjamins.
Kimmel, Michael. 2005. Culture regained: Situated and compound image schemas. In Beate Hampe (ed.), From perception to meaning: Image schemas in Cognitive Linguistics, 285–342. Berlin & New York: Mouton de Gruyter.
Kövecses, Zoltán. 2002. Metaphor: A practical introduction. Oxford: Oxford University Press.
Kövecses, Zoltán. 2005. Metaphor in culture: Universality and variation. New York: Cambridge University Press.
Lakoff, George. 1987. Women, fire, and dangerous things. Chicago: University of Chicago.
Lakoff, George. 1990. The Invariance Hypothesis: Is abstract reason based on image-schemas? Cognitive Linguistics 1(1). 39–74.
Lakoff, George. 1993. The contemporary theory of metaphor. In A. Ortony (ed.), Metaphor and thought, 202–251. Cambridge: Cambridge University Press.
Lakoff, George & Mark Johnson. 1980. Metaphors we live by. Chicago: University of Chicago Press.
Lakoff, George & Mark Johnson. 1999. Philosophy in the flesh: The embodied mind and its challenge to Western thought. New York: Basic Books.
Lakoff, George & Mark Turner. 1989. More than cool reason: A field guide to poetic metaphor. Chicago: Chicago University Press.
Lima, Paula Lenz Costa. 2006. About primary metaphors. D.E.L.T.A. 22. 109–122.
Paivio, Alan. 1971. Imagery and verbal processes. New York: Holt, Rinehart and Winston.
Paivio, Alan. 1986. Mental representations: A dual coding approach. New York: Oxford University Press.
Popova, Yanna. 2005. Image schemas and verbal synaesthesia. In Beate Hampe (ed.), From perception to meaning: Image schemas in Cognitive Linguistics, 395–419. Berlin & New York: Mouton de Gruyter.
Peña, María Sandra. 2003. Topology and cognition. What image-schemas reveal about the metaphorical language of emotions. Munich: Lincom Europa.
Peña, María Sandra. 1999. Subsidiarity relationships between image-schemas: An approach to the force schema. Journal of English Studies 1. 187–207.
Pylyshyn, Zenon. 2003. Return of the mental image: Are there really pictures in the brain? Trends in Cognitive Science 7(3). 113–118.
Radden, Günter. 2002. How metonymic are metaphors? In René Dirven & Ralf Pörings (eds.), Metaphor and metonymy in comparison and contrast, 407–435. Berlin & New York: Mouton de Gruyter.
Ruiz de Mendoza Ibáñez, Francisco & María Sandra Peña Cervel. 2008. Grammatical metonymy within the action frame in English and Spanish. In María de los Ángeles Gómez González, J. Lachlan Mackenzie & Elsa M. González-Álvarez (eds.), Current trends in contrastive linguistics: Functional and cognitive perspectives, 251–280. Amsterdam & Philadelphia: John Benjamins.
Rohrer, Tim. 2005. Image schemata in the brain. In Beate Hampe (ed.), From perception to meaning: Image schemas in Cognitive Linguistics, 165–196. Berlin & New York: Mouton de Gruyter.
Talmy, Len. 1999. Fictive motion in language and ception. In Paul Bloom, Mary A. Peterson, Lynn Nadel & Merrill F. Garrett (eds.), Language and space, 211–276. Cambridge, MA: MIT Press.
Thomas, Nigel J. T. 1999. Are theories of imagery theories of imagination? An active perception approach to conscious mental content. Cognitive Science 23. 207–245.
Thomas, Nigel J. T. 2009. A note on schema and image schema. http://www.imageryimagination.com/schemata.htm (accessed 20 May 2009).
Watt, Roger. 1991. Understanding vision. New York: Academic Press.
Weisberg, Robert. 1986. Creativity: Genius and other myths. New York: Freeman.
Book reviews
Monica González-Márquez, Irene Mittelberg, Seana Coulson & Michael J. Spivey (eds.). Methods in Cognitive Linguistics. Amsterdam & Philadelphia: John Benjamins, 2007, xxviii + 452 pp. Paperback ISBN 978 90 272 2372 2 / EUR 25.00 / USD 37.95. Reviewed by Zhuo Jing-Schmidt, University of Oregon, USA. E-mail: <zjingsch@uoregon.edu>
Cognitive Linguistics 21–1 (2010), 151–179 DOI 10.1515/COGL.2010.005 0936–5907/10/0021–0151 © Walter de Gruyter
There was a time when the cognitive linguistic study of language and its implications for the mind generally relied on introspection. As is probably the case with any other discipline, new theories and methods would not have evolved without the insights gained from intuitions and introspections in the earlier stages of the quest. Leonard Talmy's foreword to Methods in Cognitive Linguistics (MCL) drives home this very point. Today, post-introspection trends are emerging (perhaps more or less too rapidly for the unprepared) in response to the demand for increased empiricism in cognitive linguistics. Just as Raymond Gibbs points out in the beginning chapter, methodological reliability and theoretical falsifiability are called for to re-profile cognitive linguistics as an empirical science. It's time to move on! The message delivered by MCL, an unprecedented blend of good intentions and patient instructions, is acute. The book's primary target audience is cognitive linguists. Nevertheless, it will serve to give linguists of all theoretical orientations an updated perspective on methodological developments in linguistic analysis.
MCL consists of 17 papers organized in five chapters. Chapter I, Methods and motivations, provides by way of four introductory articles a view of the challenges and chances confronting cognitive linguists in the post-introspection era. Gibbs alerts cognitive linguists to the skepticism of people outside their field and urges them to develop reliable and replicable methods and construct falsifiable hypotheses. He points to the fundamental differences between unconscious cognitive processes and the conscious mind. Such differences, he argues, render the cognitive unconscious unavailable to conscious introspection. Drawing on functionalist traditions of discourse analysis and corpus linguistics in an expansive attempt to fuse the old and the new, Mittelberg, Farmer and Waugh show the utility of usage data for cognitive linguistics. They then offer a useful guide to the terminology of corpus linguistics, a handy list of popular corpora and an informative discussion of the research potential of Internet search engines. González-Márquez, Becker and Cutting offer a practical thirty-page crash course in experimentation, taking us through the steps and procedures towards the mastery of the skills and knowledge necessary for experimental research. González-Márquez et al. contrast with Gibbs in their eagerness to turn cognitive linguists into experimenters. While Gibbs suggests that cognitive linguists are not expected to turn away from what they do best and try to be something that they are not, that is, experimental psychologists or neuroscientists, González-Márquez et al. intend to get you on your feet about doing experiments.
Núñez discusses inferential statistics as another tool for narrowing the gap between cognitive linguistics and hard science. No doubt, with large-scale corpora becoming ever more accessible to linguists, the ability to run statistical tests to determine rigorously the probability of hypotheses is crucial. Cognitive linguists with the conviction that meaning and grammar originate in usage cannot live up to that conviction unless they are able to quantitatively distinguish probability from chance with regard to observed uses. In addition to providing technical empowerment, Núñez also points out that mutual feeding of knowledge and insights between cognitive linguistics and its neighboring fields is a vital process for progress.
Chapter II, Corpus and discourse analysis, consists of two case studies that illustrate the importance and analytical potential of usage data, and especially corpus data, in revealing and helping to explain the complexity of patterns of language use in communication. Waugh, Fonseca-Greber, Vickers and Eröz's study of pronouns in spoken French discourse illustrates the analytical power of employing multi-layer empirical evidence as motivated by an ecological perspective on discourse. Grondelaers, Geeraerts and Speelman's analysis of the Dutch presentative sentence in two dialectal regions further demonstrates that corpus data and adequate analytical tools are needed to construct convincing descriptive models of usage. Since these are not easy tasks for any individual linguist, cognitive or otherwise, it is likely, as Grondelaers et al. predict, that linguistic research will become a collaborative, cumulative, if slower-going, science.
Chapter III, Sign language and gesture, articulates the immediate relevance of research on two explicitly embodied symbolic systems, signed language and gesture, to the study of conceptual structures in cognitive linguistics. Wilcox and Morford recognize the conceptual junctures at which signed language and cognitive linguistics convene and are mutually illuminating. The newfound link between the gestural-visual systems and speech provides a multimodality perspective on the one hand, as illuminated by Mittelberg. On the other hand, the discovery of such a profound connection uncovers a potential data source invaluable to cognitive linguists, which Sweetser keenly perceives and points out.
Chapter IV, Behavioral research, offers six articles to demonstrate how a variety of specialized task-based experiments which are common in cognitive and developmental psychology can be applied to test cognitive linguistic hypotheses. Carlson and Hill discuss diverse task-based experiments used for exploring the linguistic conceptualization of space. Bergen proposes a model of semantic processing based on mental simulation in terms of neural-level activations of perceptual and motor representations. He surveys a number of experimental methods that support the simulation theory, thus providing empirical evidence of the notion of embodied cognition as articulated by cognitive linguists. Hasson and Giora introduce the basic psycholinguistic measures of mental representations prompted by a sentence. They explain the rationale of each measure and illustrate its particular utility with examples, thereby providing beginners with a user's manual. Richardson, Dale and Spivey join the quest by showing cognitive linguists what visual attention can reveal about cognition and language. They measure visual attention by tracking eye movements in reading and listening comprehension.
This particular angle of research offers unique clues to real-time mental activities related to language processing and use. Brandone, Golinkoff, Pulverman, Maguire, Hirsh-Pasek and Pruden take a developmental perspective by studying concept formation as it occurs with preverbal infants. Their experimental paradigms help to reconstruct the transitional path along which preverbal concepts are transformed by early language acquisition. Gor is concerned with the cognitive mechanisms behind inflectional morphology processing. She shows how two task-based experiments can be conducted across different languages and populations to yield strong data on the effects of frequency and phonological similarity across two types of processing. It is exciting to see the analytical potential of such experiments in testing something as elusive as the intuition of a continuum between lexicon and grammar, an intuition captured in the foundational work of introspective cognitive linguistics.
Chapter V, Neural approaches, introduces cognitive linguists to the organic world of the brain and its neural activities as the biological basis
of language and cognition. The effort described in this chapter highlights the beginning of what Lieberman (2002: 17) envisages as a new era of biological linguistics, one involving cooperative research among linguists, cognitive scientists, and neurobiologists. From Coulson's introduction, the reader gets an idea of how the ERP technique could shed light on cognitive semantic hypotheses by providing a neural-level depiction of the conditions and processes of meaning comprehension as it takes place in the brain. On the other hand, the reader is guided by Edelman into the more nascent domains of computation and algorithm in relation to neurobiology. The reader gets a faint taste of abstract computational models of language that are presumed to simulate distributed knowledge representations in terms of cognition-general principles compatible with neural mechanisms and statistical inference. Given the rapid development in brain science and given the relative lack of communication between neurolinguists and cognitive linguists despite the rapid development in brain science, to use Ahlsén's (2006: vii) words, the message conveyed in this chapter is particularly valuable to cognitive linguists.
As the book unfolds through the chapters, conventional but inquisitive cognitive linguists may find themselves entering ever less familiar territories. At the same time, wading through every last chapter, they may find that the boundaries of what is possible in cognitive linguistics get stretched more or less to allow them to perceive somewhat richer sceneries that are thus far unseen. No doubt anxieties about one's own methodological inadequacies increase accordingly. Those looming anxieties, let us hope, will inspire cognitive linguists to acquire one of the tools revealed to them in this big bundle of a toolbox, or to find someone who already possesses some tools with whom to collaborate.
Recent years have witnessed desirable advances resulting from such integrative effort, as exemplified by Dąbrowska (2004), Gallese and Lakoff (2005), Pulvermüller (2005), Boroditsky and Gaby (2006), Glenberg (2008), and Barsalou et al. (2008), among many others. The most remarkable thing about MCL is that none of these contributions would have been as powerful had they been scattered in separate issues of discipline-specific journals. Yet together they send one startling message: Let's give each other a hand, for together we can leap over chasms!
References
Ahlsén, Elisabeth 2006 Introduction to Neurolinguistics. Amsterdam/Philadelphia: Benjamins.
Barsalou, L. W., A. Santos, W. K. Simmons, and C. D. Wilson 2008 Language and simulation in conceptual processing. In: Symbols and Embodiment, M. De Vega, A. M. Glenberg and A. C. Graesser (eds.). Oxford: Oxford University Press.
Boroditsky, L. and A. Gaby 2006 East of Tuesday: Representing time in absolute space. Proceedings of the 28th Annual Meeting of the Cognitive Science Society, Vancouver, Canada.
Dąbrowska, Ewa 2004 Language, Mind and Brain. Washington, D.C.: Georgetown University Press.
Gallese, V. and G. Lakoff 2005 The brain's concepts: The role of the sensory-motor system in conceptual knowledge. Cognitive Neuropsychology 22, 455–479.
Glenberg, Arthur M. 2008 Toward the integration of bodily states, language, and action. In: Embodied Grounding: Social, Cognitive, Affective, and Neuroscientific Approaches, G. R. Semin and E. R. Smith (eds.), 43–70. Cambridge: Cambridge University Press.
Lieberman, Philip 2002 Human Language and Our Reptilian Brain. Cambridge, MA: Harvard University Press.
Pulvermüller, Friedemann 2005 Brain mechanisms linking language and action. Nature Reviews Neuroscience 6(7), 576–582.
Mark Johnson. The meaning of the body: Aesthetics of human understanding. Chicago & London: The University of Chicago Press, 2007, xiii + 308 pp. ISBN: 9780226401928. Cloth $ 32.00. Paper $ 22.50. Reviewed by Heli Tissari, University of Helsinki, Finland. Email <heli.tissari@helsinki.fi>
Mark Johnson's new book continues to develop themes which he has already tackled in books well known to cognitive linguists: conceptual metaphors (Lakoff & Johnson 1980), the bodily basis of meaning (Johnson 1987), and the implications of the conceptual metaphor theory (henceforth CMT) for philosophy (Lakoff & Johnson 1999). Its first part deals with bodily meaning and felt sense (pp. 17–110), the second part relates the notion of embodied meaning to the sciences of the mind (pp. 111–206), and the third part discusses aesthetics and art (pp. 207–283).
The first chapter (pp. 19–32) elaborates on the claim with which it begins, that "[l]ife and movement are inextricably connected" (p. 19). Considering CMT, this is not a new claim, but potentially a shift in emphasis. Movement has been present in discussions of time and temporal aspects
of language, but there are also a number of metaphors, e.g., metaphors of containment, which I would rather associate with relative stability. More precisely, we might say that Lakoff and Johnson have been interested in a person's interaction with her or his surroundings (e.g., in Lakoff & Johnson 1999), but in practice, discussions of conceptual metaphors often solidify metaphors through labelling and classifying them, and through using nouns for both the target and source domains of metaphors (e.g., LOVE IS FIRE).
The second chapter discusses the development of a sense of self in infants, in particular their interaction with their surroundings through movement (pp. 33–51). A synopsis of this chapter can be found on page 36, which introduces the notions of communication, object perception and manipulation, and bodily motion. Johnson discusses these by referring to psychological literature, aiming to show that objects are mind-dependent and individuated relative to our conceptual systems and structures for meaning-making. The world does not come to us prepackaged with determinate objects with their determinate properties (p. 46). This agrees with what he and George Lakoff claimed as early as 1980.
Chapter three (pp. 52–68) presents a movement onwards in CMT, claiming in its heading that feeling is first. Johnson pays explicit tribute to Antonio Damasio and Joseph LeDoux, whom he calls two of the most distinguished researchers in the field of cognitive neuroscience today (p. 54). He emphasises the importance of our unconscious processing of what is happening as a basis both of emotion and action (p. 66). In other words, he claims that we usually do not make sense of the world by consciously reflecting on it. Rather, we feel it.
Chapter four introduces the concept of qualities of life (pp. 69–85), with particular focus on Dewey's pervasive qualities of experiences (pp. 71–78), and Gendlin's felt situations (pp. 79–85).
Johnson sums up the former by stating that entire situations are characterized by pervasive qualities, and we pick out particular qualities for discrimination within this unified situational whole (p. 72). His discussion of Gendlin then introduces the concepts of the felt sense and the formal expression, which he claims are two dimensions of a single, ongoing activity of meaning-making (p. 82). Putting it all together, we can both feel a particular quality and name it, although the latter task may be much more difficult than the former.
Continuing from this, chapter five claims that concepts emerge at the meeting-point between the felt and the named (pp. 86–110). Johnson bases this claim on the authority of William James's writings on thinking, underlining the difference between such an understanding of meaning and traditional accounts of the relationship between words and concepts.
In agreement with his earlier views, he says that "[h]uman thinking is a continuous feeling-thinking process that is forever tied to our body's monitoring its own states" (p. 98). These considerations lead us to the second part of the book which, according to Johnson, is required because we must explain conceptual thinking without introducing immaterial mind or a transcendent ego (p. 113). Chapter six begins this task by outlining a nonrepresentational view of mind (pp. 113–134), or what he also calls an interactionalist (or transactional or enactionist) view (p. 117, original emphasis). In this context, he briefly discusses the biology of multicellular organisms and the way they develop neural maps as reactions to new kinds of situations (pp. 123–130). Johnson begins chapter seven by restating that humans, like animals, have neural maps (p. 135). What he then goes on to say about image schemas is quite familiar from his previous writings (such as Johnson 1987). Towards the end of the chapter he discusses the social, intersubjective character of embodied cognition (pp. 147–152), defining social phenomena as those phenomena arising out of recurrent structural couplings that require the coordinated participation of multiple organisms (p. 148). The eighth chapter focuses on the brain's role in meaning (pp. 155–175); in other words, the sensorimotor system and its relationship to reasoning and language. In this chapter, Johnson defines concepts as patterns of interaction that are important enough for the ongoing experience of a person (or an animal) to merit being selected from the flow of experience (p. 159); cf. pervasive qualities of experiences, felt situations above. What I found particularly intriguing in this chapter is Johnson's statement that not only image schemas and conceptual metaphors and metonymies, but "[a]ll aspects of grammar . . . and all aspects of logical relations need to be accounted for through ties to body-based meaning" (p. 170).
If a feeling report of this book is allowed, I felt that this claim, following the description of qualities of life and the emergence of concepts in chapters four and five, led to considerable expectations as to what would follow. In that respect, the ninth chapter, with its traditional Lakoff-and-Johnson-type account of embodied meaning and abstract thought in terms of CMT, was a disappointment. It did not seem that much was added to what appeared in Lakoff and Johnson 1999. The third part of the book, on embodied meaning, aesthetics, and art (pp. 207–283), felt more welcome in these terms, bringing in more novel considerations. Chapter ten is a critique of the devaluation of aesthetics in the Western tradition and of Kant's subjectivizing of
aesthetics (pp. 209–219), followed by examples of meaning as occurring in poetry and visual images (pp. 219–234). Remembering Johnson's visual interests from his The Body in the Mind (1987), chapter eleven on music and the flow of meaning is even more interesting (pp. 235–262), describing the roles image schemas and metaphors play in pieces of music. In his chapter on music, Johnson beautifully returns to his original interest in movement, discussing, among other things, what he calls the moving music metaphor. In his explication of the metaphor, the target domain musical motion involves such issues as musical passages, their beginnings and ends, rests, and repeats (p. 248). What I missed in this chapter was a deeper involvement with the notion and intentions of a composer. Johnson concludes with a chapter entitled the meaning of the body (pp. 263–283), which includes a similarly, if not more, ambitious subheading, the meaning of meaning (pp. 265–274). Here he outlines the objectivist theory of meaning (p. 272), and an embodied, experientialist view of meaning (p. 273), in a manner already familiar from Lakoff and Johnson's earlier books. To compare, a similar list can be found in the preface to Lakoff (1987), where Lakoff lists objectivist views (pp. xii–xiii), and a very different view (pp. xiv–xv). To contextualise the book in question, it may pay to compare it with two other books, because I wonder whether it is also a spiritual book. Many of the ideas in it are also present in Varela, Thompson and Rosch's The Embodied Mind: Cognitive Science and Human Experience (1995 [1991]), which promotes an embodied view of human existence and Buddhist meditation. However, Johnson's aesthetics may also be seen to relate to many of the issues discussed by Evans in his Preserving the Person: A Look at the Human Sciences (1979 [1977]) which, apart from promoting a specifically Christian world-view, discusses the then current situation of the human sciences.
It is of interest to note that Varela, Thompson and Rosch say that the "inspiration for [their] book began in the late seventies" (1995: xi). In other words, it seems that some of the views which Johnson discusses in his book were already being debated at that time. I am not saying that they were not debated prior to that, but offering one way of seeing where Johnson comes from and where he may be going. In a footnote, Johnson writes:
When I was in graduate school in the mid-1970s, it was a commonplace prejudice of the culture of analytic philosophy to assume that the smartest, most serious students would do the intellectually rigorous work . . . while those who weren't up to this exalted task could entertain themselves with the mushier, subjective value fields. (p. 214.)
Compare with Varela, Thompson & Rosch's question: "How can . . . an attitude of all encompassing, decentered, responsive, compassionate concern be fostered and embodied in our culture?" (1995: 252.) Evans's book is of interest particularly in respect to the objectivist theory of meaning criticised by Johnson (p. 272), and by Lakoff, who summarises it by stating, among other things, that "[i]t is . . . incidental to the nature of meaningful concepts and reason that human beings have the bodies they have and function in their environment in the way they do" (1987: xiii). Evans wrote in the 1970s that "[t]he Zeitgeist has definitely been against the dualist. To many, including many philosophers, dualism just does not seem a live option today." (1979: 105.) Compare also Evans's presentation of the French Catholic Gabriel Marcel's perspectivalism with Johnson's aesthetics. According to Evans, Marcel sees lived experience as something in which
the person is seen as a presence not as object, and the difficulties encountered are mysteries to be explored, rather than problems to be solved. Marcel believes that the mind-body problem is not a problem but a mystery. My relationship to my body is too intimate to allow me to objectify it as a thing and then ask, "How am I related to this thing called my body?" (1979: 107, original emphasis.)
Although Johnson, quite unsurprisingly, does not use the word mystery, he refers to the "oh" of wonder (p. 71). One of his chapter titles comes from a poet who is known to have talked to his God ("since feeling is first", E.E. Cummings, quoted on p. 52). Varela, Thompson and Rosch enumerate folk meanings of the word meditation, among these a mystical state in which higher realities or religious objects are experienced (1995: 23). To round up, what I think Johnson does most beautifully in his The Meaning of the Body is to describe what living and thinking in a human body feels like, and I would recommend it to anyone wanting to relish such a text. Simultaneously, I doubt whether just anyone is really as aware as Johnson is of his embodiment, unless they have trained themselves, whether through meditation (or prayer), through the arts, or through sport. Not that he claims this, but I simply wanted to emphasise it. Varela, Thompson and Rosch advocate one form of such an awareness, claiming that "[e]xperience and scientific understanding are like two legs without which we cannot walk" (1995: 14). In this light, Johnson seems to be going from the embodiment of meaning to coupling embodied meaning with values and, potentially, towards spiritual experience. Indeed, Johnson reaches the notion of embodied spirituality on the last pages of the book under review (pp. 281–282). For him, it means
horizontal transcendence, as opposed to what he calls vertical transcendence. Such horizontal transcendence
. . . recognizes the inescapability of human finitude and is compatible with the embodiment of meaning, mind, and personal identity. From this human perspective, transcendence consists in our happy ability to sometimes go beyond our present situation in transformative acts that change both our world and ourselves. (P. 281.)
As a linguist, I would finally like to pick up three issues from this book. The first is the notion that "[l]ife and movement are inextricably connected" (p. 19). This is no news to historical linguists and, in general, we know that language changes. It is nevertheless still a challenge for CMT and cognitive linguistics to take this into account, and this is something that Johnson does not discuss in his book. The second issue is the claim which I summarised by saying that concepts emerge at the meeting-point between the felt and the named (pp. 86–110). In my view, cognitive linguists should pay more attention to the relationship between word and concept, and Johnson is thus right to pinpoint it as a critical issue. It is something that is too often mentioned in passing without considering what it means. For example, what does it mean to use a notation for conceptual metaphors which couples two nouns with a form of the verb be (e.g., love is fire)? Or, why do we say source-path-goal schema (as on pp. 141–142) rather than, for example, the linear schema, or the move schema? Or, what does it mean that the felt sense of an experience is not vague, mushy, empty, or chaotic, but extremely precise, as Johnson claims, going on to say that it is carried forward only by quite specific words or forms (p. 82, original emphasis)? (Consider the prototype theory.) The third issue is the role of the brain and the sensorimotor system in reasoning and language. There are potentially myriads of issues to research if we wish to come close to the goal set by Johnson in stating that "[a]ll aspects of grammar . . . and all aspects of logical relations need to be accounted for through ties to body-based meaning" (p. 170). It is a challenge for cognitive linguists to pose the right new questions to begin with.
References
Evans, C. Stephen. 1979 [1977]. Preserving the person: A look at the human sciences. Downers Grove, IL: InterVarsity Press.
Johnson, Mark. 1987. The body in the mind: The bodily basis of meaning, imagination and reason. Chicago: The University of Chicago Press.
Lakoff, George. 1987. Women, fire and dangerous things: What categories reveal about the mind. Chicago & London: The University of Chicago Press.
Lakoff, George & Mark Johnson. 1980. Metaphors we live by. Chicago: The University of Chicago Press.
Lakoff, George & Mark Johnson. 1999. Philosophy in the flesh: The embodied mind and its challenge to Western thought. New York: Basic Books.
Varela, Francisco J., Evan Thompson & Eleanor Rosch. 1995 [1991]. The embodied mind: Cognitive Science and human experience. Cambridge, MA & London: The MIT Press.
June Luchjenbroers (ed.). Cognitive Linguistics investigations. Across languages, fields and philosophical boundaries. (Human Cognitive Processing 15). Amsterdam & Philadelphia: John Benjamins, 2006, 334 pp., ISBN 978 90 272 2368 5. Hardbound. EUR 120.00 / USD 180.00.
Reviewed by Ignacy Nasalski, Institute of Oriental Philology, Jagiellonian University of Cracow, Poland. E-mail: <ignacy.nasalski@uj.edu.pl>
The book in review has no particular leading theme. It arose from a workshop held at the University of Queensland during the 4th Australian Linguistics Institute, in July 1998, in which researchers from around the world offered papers drawn from a number of areas from within the cognitive sciences. As is usual with such compilations, the volume accordingly presents a range of topics and captures a diversity of research activities from various parts of the world and across a range of European, Asian, Bantu, Austronesian and Australian Aboriginal languages. Despite differences in philosophical approach and applied methodology they all share a commitment to the view that human categorization involves mental concepts that have fuzzy boundaries and are culturally and situation-based (p. 3). Apart from the introduction by the editor which gives the reader a general outline of the volume, the book contains fourteen papers (there are in fact fifteen chapters, but I exclude the Introduction from the calculation, because it should not be numbered as chapter 1) that are supposed to illustrate how otherwise separate areas of linguistic concern can present a better clarification of the linguistic distributions in which units are produced in talk; as well as provide a deeper appreciation of the semantic richness of those linguistic units, not captured by Formalist approaches (p. 2). The book is divided into three parts.
Part one, "Cultural models and conceptual mappings", consists of four papers presenting investigations into cultural schemata, gesture, mental spaces manoeuvres and the like. It shows how conceptual mapping builds on specific types of knowledge and how cultural models sculpt verbal communication.
Gary Palmer, in the opening paper "When does cognitive linguistics become cultural? Case studies in Tagalog voice and Shona noun classifiers" (pp. 13–45), uses data from two unrelated languages to supply further evidence for the idea that grammar is governed by cultural schemata rather than universal cognitive schemata. Two case studies from Tagalog and Shona illustrate how lexical domains and grammatical construction link to linguistically determinant cultural models such as scenarios and polycentric categories and end with the conclusion that understanding the grammar and lexicon of a language requires grasp of cultural models and culturally defined imagery (p. 39). In a very interesting and instructive paper, "Purple persuasion. Deliberative rhetoric and conceptual blending" (pp. 47–65), Seana Coulson and Todd Oakley make use of Blending Theory to demonstrate how cognitive linguistics can contribute to better understanding of some political processes, particularly those involving influencing people. The authors show how simplified input models and categories are blended to produce categories with their own distinct internal structure and form integrated event scenarios. The conclusion is that an apposite choice of input frames can serve as an efficient persuasive means to encourage a particular construal of events that will result afterwards in the target actions and transform recipients into political activists. "Depicting fictive motion in drawings" (pp. 67–85) is a paper by Teenie Matlock, who analyzed so-called fictive motion sentences such as The road goes/runs along the coast in order to find out whether fictive motion plays a role in their comprehension. Drawing from picture experiments she provided evidence for a link between motion verbs and the mental simulation of the action conveyed by them. The presented results challenge standard psycholinguistic accounts of how words are represented and processed.
They prove that fictive motion sentences include dynamic construal as mentally simulated motion or linear extension, and that comprehension rests broadly on the embodied experience. June Luchjenbroers in the next paper, "Discourse, gesture and mental spaces manoeuvres" (pp. 87–105), investigated the dynamics of conversational gesture in the physical space (F-space) in which they occur during discourse. She argues that the boundaries of this space either amplify the information conveyed by the lexical component or provide additional aspects of speaker meaning in the form of clues about speaker cognition. The second part, "Computational models and conceptual mappings", is comprised of studies dealing with computational models that hypothesize different features of the cognitive programming and discuss how they can be used to describe cognitive processes associated with the mental lexicon in relation to morphology (Ping Li, pp. 110–137), to grammar (Joost
Schilperoord and Arie Verhagen, pp. 139–168) and the phonological system (Paul Warren, pp. 169–186). Ping Li ("In search of meaning: The acquisition of semantic structures and morphological systems") argues that the evolution and development of semantic representations as acquired by children result from simple probabilistic procedures as embodied in connectionist networks and analogous statistical learning mechanisms. These connectionist networks can capture, as demonstrated by Li, the representation of semantic structures which can be best viewed as emerging out of a continuously developing dynamical system that operates on statistical computations of the various form-form and form-meaning constraints. Schilperoord and Verhagen's interest ("Grammar and language production") concentrates on the organizational features of the mental lexicon and mental grammar, and in particular on the question of how function words, prepositions and articles are selected during language production. The authors apply a usage-based consideration of function words in order to explain how these words are cognitively processed. The traditional assumptions that function words are stored independently of their lexical heads and that individual content words are retrieved from the lexicon and assembled into larger structures by means of grammatical computation, are here questioned and presented as incorrect or at least incomplete. The final paper in this section, by Paul Warren ("Word recognition and sound merger"), deals with comprehension in the form of psycholinguistic models of spoken word recognition as well as with the need for a more cognitive account for variation arising from sound change. The particular case under consideration is the merger-in-progress of the front centring diphthong in New Zealand English as in ear/air neutralization.
Warren argues that it is the sentential or extralinguistic context that resolves homophone ambiguity in the case of the merged ear/air form, just as it does for other homophones. The last part, "Linguistic components and conceptual mappings", focuses on three areas of linguistic description: semantics, grammar and discourse. In the first paper in this section, "Verbal explication and the place of NSM semantics in cognitive linguistics" (pp. 189–218), Cliff Goddard argues that cognitive linguistics cannot approach verbal explication in a casual manner, because its familiar devices such as diagrams representing image schemata or conceptual metaphors often rely on complex culture-specific iconographic conventions which are smuggled in without the necessary acknowledgement or explanation (p. 209). Goddard therefore makes use of the natural semantic metalanguage framework (NSM),
because he sees it as the only well developed and empirically grounded theory of verbal explication that additionally makes it possible to avoid terminological ethnocentrism. The result is a convincing demonstration that NSM can be successfully employed within the cognitive linguistics paradigm. "How do you know she's a woman. Feature, prototypes and category stress in Turkish kadın and kız" is a scintillating paper presented by Robin Turner (pp. 219–234) in which he analyzes Turkish words for girl and woman, and investigates their semantic content in relation to culture and personal interaction. He comes to the conclusion that the inconsistency between the feature-based and prototype-based categorization results in what he calls category stress. This concept is his original idea; it can be seen as a kind of cognitive dissonance (p. 221) which results from either contextual factors, such as humour and alternative categorization, or from cultural factors such as social change. The next paper, by Iraide Ibarretxe-Antuñano, "Cross-linguistic polysemy in tactile verbs" (pp. 235–253), complements to some extent Turner's research. The author shows how the semantic content of the tactile verb touch in three genetically unrelated languages, viz. Basque, Spanish and English, interacts and contributes to the creation of each semantic extension. From these examples Ibarretxe-Antuñano draws more general conclusions and argues that different polysemes of a lexical item are obtained through the interaction of the semantic content of both the lexical item itself and its different co-occurring elements (the phenomenon she calls compositional polysemy). Maarten Lemmens in a paper "How experience structures the conceptualization of causality" (pp. 255–270) deals with variations in the conceptualization of causative verbs of killing such as suffocate, choke or kill in Old English corpus data. He argues that the syntactic choice between two different models of causative events, i.e.
the transitive or the ergative model, shapes the way the event is experienced. The conclusion, not immediately obvious from introspection, is that the more volitional a participant seems to be whilst engaged in some causative process, the more likely it is that s/he will surface as a volitional Actor in a transitive construction; the more autonomous the process, vis-à-vis its cause, the more likely an ergative conception will be triggered (p. 267). In the next paper, "Internal state predicates in Japanese: a cognitive approach" (pp. 271–291), Satoshi Uehara develops Langacker's framework on subjectivity and tries to explain the use of particular grammatical elements in discourse, such as the nominative particle ga being used for object marking. The author argues that internal state predicates in Japanese can best be characterized as deictic, since they profile the object of speaking from the speaker's point of view. Uehara argues that internal state predicates denoting speakers' internal states, feelings and emotional reactions contribute to the subjective event construal and thus provide support for the Cognitive Grammar theory of subjectivity. The last two papers in the volume deal with a very interesting aspect of human communication, to which cognitive linguistics has contributed too little, namely discourse and narration. Dave Gough's paper "Figure, ground and connexity: evidence from Xhosa narrative" (pp. 293–303) is a study of folk narrative discourse. Gough advocates an orientation to language which holds that its nature can and should be explained in terms of factors outside of language, i.e. in terms of discourse and cognitive factors, rather than seeing it as a set of taxonomic and internally describable separate modules. He argues that grammatical terms like mood and tense refer to quite different verbal categories and that the discourse concepts of grounding and connexity provide a more coherent explanation of the structure of the verbal system than the traditional analysis. Ming-Ming Pu in her paper "Discourse organization and coherence" (pp. 305–324) explores some narrative constraints that are usually the subject of textual analysis. She is primarily interested in the question of how the events in an episodic structure are related to each other. Using narrative data from English and Mandarin Chinese, drawn from a children's picture book, Pu comes to a (quite obvious, I must add) conclusion, that conversations and written texts are almost never unordered strings of utterances, but they are usually structured, more or less complex stories, the coherence of which is achieved in a systematic and even automatic process through establishing story frame, focusing on the central character, systematically tracking references and maintaining topic continuity (p. 323).
Interestingly, her paper provides strong support for an interactional character of communication. Namely, she shows that the process of text production, whether spoken or written, is guided by cognitive constraints upon speakers to accommodate their addressees' processing needs by signalling discourse units and prompting the retrieval of information. Prepared by researchers from universities in Australia, New Zealand, Spain, France, USA, Turkey and Holland, this volume constitutes a significant contribution to the field of cognitive and cultural linguistics. Just as the subtitle "Across languages, fields and philosophical boundaries" suggests, the fifteen chapters cover an extensive selection of concepts and notions that are of interest for everyone dealing with such fields as language acquisition, video data analysis, gesture, Blending Theory, fictive motion and the like. The volume presents well documented data from a spectrum of languages that empirically validate or challenge some of the hypotheses or theoretical models. Both those who prefer to focus on one
language or a variety at a time will find the collected texts equally attractive. An important merit of the book is the fact that some papers go beyond mere linguistic investigations, and provide revealing insights into some cultural (Palmer; Goddard; Turner), socio-political (Coulson & Oakley) and psychological (Uehara; Pu) phenomena. Thus, the volume can also be recommended to philosophers, anthropologists and even political scientists.
Günter Radden, Klaus-Michael Köpcke, Thomas Berg & Peter Siemund (eds.). Aspects of meaning construction. Amsterdam & Philadelphia: John Benjamins, 2007, x + 287 pp. EUR 110.00 / USD 165.00.
Reviewed by JoAnne Neff-van Aertselaer, Department of English Studies, Universidad Complutense de Madrid, Spain. E-mail: <nejoanne@hotmail.com>
A central presupposition of cognitive linguistics is that human beings construct meanings by accessing cognitive structures, which are themselves the result of embodied, encultured and imaginative dimensions of meaning (Fesmire 1994:150). Of the various cognitive structures used in meaning construction, this volume deals specifically with metonymy and metaphor (Part I) and mental spaces and conceptual blending (Part II). Most of the articles reflect, in different ways, concern with the research area of the continuum between metonymy and metaphor and consequently the volume provides a further conceptual up-date on recent work done in these areas. Thus, some of the thematic divergence found in festschrift volumes has been avoided here. The volume was dedicated to Klaus-Uwe Panther and, given his important work on metonymy, many of the articles included in Part I provide excellent links to Panther's significant body of research (Panther 2005; Panther and Thornburg 1998; Panther and Thornburg 2003a,b; Panther and Thornburg 2004). The Introduction begins with a useful overview of the types of linguistic underspecification (involving implicitness, indeterminacy, and incompatibility) which lead language users to mutually construct meanings. To be sure, the interpretative nature of language meaning is not a new idea, and not even particular to cognitive linguistics.
In the sub-fields of both cognitive psychology (Ausubel et al 1978; Bruner 1990; Mandler 1984; Rumelhart 1980; Shank 1977) and social psychology (Vygotsky 1978), the role of conceptual schemata, built up from real world knowledge, in utterance understanding has long been the focus of constructivist theories. However, most of these theorists have dealt with comprehension and the
building of schemata upon the reception of information and not with meaning making between interlocutors, i.e. two conceptualizers, which is the focus of this volume. However, for some time now, both psychology (Mandler 1992; Neisser 1976) and cognitive linguistics (Johnson 1987; Lakoff 1987; and many others) have begun to coalesce in viewing conceptualization as webs of relations emerging from the union of perception and environment. The remainder of the Introduction refers briefly to the work presented in each of the 13 articles which the volume comprises. The editors also provide a practical index of metonymies and metaphors. A brief examination shows that the metonymies most analyzed by the different authors are cause for effect, whole for part, part for whole and place for event; among the conceptual metaphors most examined are objects are human and time is space. Part I of the book deals with metonymy and metaphor, for the most part, from a macroscopic view (emphasis on large webs of systems of these two cognitive patterns). The number of articles that deal with the two conceptual patterns as underlying grammatical phenomena shows how far cognitive linguistics has progressed, from microscopic perspectives (presenting and discussing specific examples), toward metonymic and metaphorical explanations of grammar from a fundamentally different approach (Thornburg, Panther and Barcelona 2006; Verhagen 2005). The chapter by Raymond Gibbs, "Experimental tests of figurative meaning construction", explores more extensively central questions which he has previously examined (Gibbs 1993), namely the question of whether claims of cognitive linguistics afford psychologically plausible accounts of how people construe meaning in everyday discourse situations. More specifically, Gibbs studies how interlocutors integrate pragmatic knowledge with conceptual metonymies in order to create specific, contextually appropriate inferences.
Gibbs's first question is whether conceptual metonymies (e.g., part for whole and place for event) really do have psychological reality, as claimed by Panther and Thornburg (2003b). In order to shed light on this issue, Gibbs turns to recent corpus studies and notes that contrary to what is usually thought (that place for event metonymies are very frequent), Markert and Nissum (2003) actually found that place for people was much more common. He, therefore, cautions that cognitive linguists must be very cautious before claiming that any particular figure of speech is common (p. 22). Noting that much more psycholinguistic research has been carried out on people's use of conceptual metaphors, he calls for psycholinguists to empirically study conceptual metonymies through priming paradigms, i.e., using a primed conceptual
metonymy (e.g., object for user) to see if the priming facilitates people's reading of metonymic phrases supported by similar conceptual metonymies, such as place for event. In the remainder of the chapter, Gibbs provides empirical support for an important claim made by Panther and Thornburg (2003b), namely, that metonymies make targets accessible (in "the scalpel was sued for malpractice", the instrument for person metonymy activates the referent of surgeon) and thus allow for further elaboration. Finally, Gibbs also supports another claim made by Panther and Thornburg (2003b) regarding the adjustments in interpretations made by interlocutors through the use of pragmatic knowledge. In "High-level metaphor and metonymy in meaning construction", Francisco Ruiz de Mendoza and Ricardo Mairal argue that inferential activity constrains the possible interpretations made through these conceptual patterns and that said constraints are also operational in many grammatical processes, in particular in the domain of transitivity examined here. The authors review various types of constraints on metaphoric and metonymic interpretation (the Invariance Principle, the Extended Invariance Principle, the Correlation Principle, and the Mapping Enforcement Principle; the first is Lakoff's and the last three are Ruiz de Mendoza's), but given the extensive polysemy of natural language, even these principles may not be enough to make actual language use fully predictable without the inferential activity emerging from speakers' rhetorical construals during usage events, a statement with which these authors would not disagree. One of the chief contributions of Ruiz de Mendoza and Mairal's paper is their discussion of high-level metaphoric and metonymic activity performed on non-situational cognitive models (here, action and process), which underlie grammatical patterns such as transitivity.
Starting from a discussion of the high-level propositional idealized cognitive model, which they call the action frame, and Vendler's Aktionsart (action-process distinction), they propose additional criteria for the classification of modes of action (resultative, effectual, experiential and communicative) and subsume some processes (although not all) into the subdomain of resultative actions. The high-level metaphoric activity signaled here is that there may be shifts between action frames and process frames, in the sense that in "Peter laughed John out of the office", the verb has undergone a metaphorical mapping of the actor-goal transitive relationship (from laugh at someone to laugh someone). The authors are somewhat less successful in dealing with other examples of metaphorical mappings of transitivity onto intransitive verbs (namely example 10, "She laughed herself out to silence"), mainly because, at least as far as example 10 is concerned, the sentence seems highly unlikely to be a real token of language use. This
problem reminds one of Gibbs' request for more use of corpora in cognitive linguistics. Still within the action frame, the final section of the paper deals with high-level metonymy (process for action and process for action for result). These metonymic operations may profile different aspects of transitivity constructions, such as "the bread cuts easily" (process for action) and "the bread cuts well" (process for action for result) and, thus, the authors' explanations provide links to the notion of coercion (Michaelis 2003), a concept which is challenged in Chapter 5 of this volume.
Antonio Barcelona's chapter on "The role of metonymy in meaning construction at discourse level" is a welcome effort to lend more psychological validity to the function of metonymy in discovering readers' text-level implications. Barcelona first offers an explanation of the role of metonymy in triggering pragmatic reasoning, as also reflected in the work of Panther and Thornburg (2003a, b). Although in other works Barcelona (2002: 227-228) offers descriptions of three different types of metonymy (schematic, typical, and prototypical), here he defines only schematic metonymy, because it exhibits all of the properties shared by every type of metonymy. He offers an updated definition of this type of metonymy as: ". . . an asymmetrical mapping of a conceptual domain, the source, onto another domain, the target. Source and target are in the same functional domain and are linked by a pragmatic function, so that the target is mentally activated" (p. 53). Whether one agrees or not with Barcelona's definition of schematic metonymy, which would include almost all uses of a linguistic expression ("The book is very large": whole for [physical] part), what he presents in this article is not a discussion of this type of metonymy, but of text-level implicatures which are guided by conceptual metonymies (i.e., those with a higher degree of metonymicity than the schematic type).
In this study, he extends an earlier analysis of brief conversational exchanges to implicatures generated by a longer text (a six-sentence narrative on hiking in Colorado), which he uses to carry out an informal experiment with seven native speakers in order to check the inferences which Barcelona himself had already generated from his reading of the text. He then ties the conceptual metonymies (action for purpose, category for member, cause for effect, etc.) to the 16 textual implicatures, 14 of which were confirmed by the native-speaker participants. Although this experiment might have been more psychologically convincing if Barcelona had had at least two other people besides himself generate the initial inferences and if he had also added distractors, Barcelona has still provided an uncomplicated on-line method for verifying the inferences, which can thus prevent over-intellectualizing linguistic comprehension.
Chapter 4 of the volume deals with "Chained metonymies in lexicon and grammar" by examining body part terms from a cross-linguistic perspective. By using data from bilingual dictionaries to explore semantic extensions of body part terms, Martin Hilpert argues that, in different languages, there are systematic differences between lexical and semantic extensions involving chained metonymies. Hilpert also contributes to the literature on the interplay of metonymy and metaphor by showing that metonymic mappings precede the metaphoric ones. The plausibility of the chained metonymies that he studied is verified by experiential motivation, the existence of polysemous links and cross-linguistic attestation of serial metonymic mappings in 76 languages, representing all the known language families. These data contribute significantly to the reliability of the author's interpretations. However, Hilpert also rightly acknowledges that, to validate his assumptions, he would need historical data in order to discuss grammaticalization processes, and that some polysemous terms for body parts are very vague, e.g., the term for foot referring also to leg, which makes it difficult to identify which is the more basic term. He presents various strategies for associating body part terms with lexical meaning extensions and gives a detailed account of each type with interesting subsequent extensions: body parts associated with perceptual functions (as an instrument, as a container, as part of physical objects). As far as grammatical meaning extensions are concerned, Hilpert claims that all of these have in common the metaphor objects are human as the first conceptual mapping, and then body part terms are extended to mark spatial and temporal relations. He shows that although extensions onto grammatical meanings are less frequent than the lexical ones, most of the former extensions consist of a series of mappings, constituting grammatical processes (back with its extensions behind and after).
Hilpert's study confirms the notion put forward in previous research (Goosens 2002) that metonymies based on metaphors are infrequent. But, importantly, his data also suggest that there is a preference for initial metaphoric extension while chained extensions onto lexical meanings favour mainly metonymic processes.
In her study of coercion in the construction of meaning, Debra Ziegler proposes that this notion might be more fruitfully thought of as falling within the scope of pragmatic reasoning and reanalysis (echoing some concerns about the notion of coercion already expressed by Elizabeth C. Traugott 2007), and thus as a factor contributing to grammatical change. In order to show that the concept of coercion may actually be superfluous, Ziegler focuses on three representative groups of coercion: nominal coercion; complement or subject coercion; and aspectual coercion. She then provides a complete and compelling alternative account for each of the three.
For example, for nominal coercion, Ziegler offers explanations such as grammaticalization of the indefinite article, which, from Middle English onwards, began to encroach upon contexts in which a mass noun would have been used previously. As far as mass-to-count coercion is concerned, she offers a plausible account of metonymy at work ("She had a beer.") and of count-to-mass coercion ("There was rat all over the place."), aided by differences in construal of quantity boundedness, or the absence of determiners signaling boundedness. With equally competent arguments, Ziegler tackles the notion of coercion of complements and shows how the meaning is metaphorically constructed due to activation of an ICM in each case. To conclude this section, she examines aspectual coercion and concludes that this type of so-called coercion is the synchronic recognition of a prolonged and developing diachronic process (p. 117), whereby imperfective aspectual functions sometimes take on stable functions of some OE participles; she then compares these uses to McDonald's slogan "I'm lovin' it". As noted by both Ziegler and Traugott (2007), there appears to be a present-day expansion of stative verbs used with a progressive aspect with a durative interpretation. Throughout her discussion, Ziegler examines not only what happens but also why certain processes occur, i.e., the uses that real speakers make of metonymical adaptation. For this, she convincingly appeals to processes of metonymic changes (including a discussion of the limits of constructive meaning according to the ease of retrieval) and diachronic changes. Since the notion of coercion cannot be used to interpret metonymy because such models often involve conversational implicatures, Ziegler concludes that the term coercion is redundant.
In chapter 6, Brdar and Brdar-Szabó return to a topic previously examined by Barcelona (2004): that of metonymic meanings of proper nouns denoting humans, such as "Sarkozy is the Zidane of Finance".
They attempt to identify the metonymic processes, involving several tiers, and then the metaphoric steps in the construction of meaning with the use of proper names. They base part of their work on Barcelona's previous studies on the paragon as a metonymic model and come to the conclusion that this model is not sufficiently motivated. They argue that these meanings are not rigidly fixed but rather depend on complex matrices of domains and are also conditioned by the context of use. This comment again brings to mind Gibbs' (2006) advice to cognitive linguists on the need for independent evidence from experimental studies on the accessing and interpretation of metonymies and metaphors.
In regard to which properties are metaphorically projected between semantic domains, Anatol Stefanowitsch's paper, titled "Collocational overlap can guide metaphor interpretation", provides statistical data on
language processing. He examines the collocational overlap between literally and metaphorically used lexical items that co-occur in metaphorical expressions. He argues that the collocates with the highest combined probability show which properties (in this case, adjectives which may constitute the conceptual core of the node word, such as man, as metaphorically associated with wolf, gorilla and pig) would most likely be transferred from the source to the target domain. In order to ascertain the highest ranking lexical items, Stefanowitsch uses a three-part series of meticulous methodological steps to search in .uk and .us websites, with the results being given different weightings for the overlaps according to three variants: i) symmetrical (equal weighting to collocations in the source and target domains); ii) source-dominant collocate weighting; and iii) target-dominant collocate weighting. He first works with a frequency-based definition of collocation, and then with an association strength-based definition, which provides a slight improvement over the previous method. Even after this painstaking labor, Stefanowitsch readily admits to some remaining methodological dilemmas: a) the possibility that the association-based model may not be sufficiently sophisticated to be able to eliminate irrelevant adjectives as collocates; b) perhaps the strengths used in the comparison (based on Leech et al.'s 2001 study of word frequencies in the BNC) did not reflect the frequencies found in the Internet corpus collected by this researcher; and c) the text types collected may not have been representative (i.e., balanced). Furthermore, perhaps the most problematic drawback of the collocation overlap model is that even such laborious studies do not prove that this model has correctly identified the same attributes as human speakers would.
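The three weighting variants described above can be made concrete with a minimal sketch. It is not Stefanowitsch's procedure: the scoring formula (a weighted product of relative collocate frequencies), the weight values, and the collocate counts are all assumptions introduced here purely for illustration, and the function names are hypothetical.

```python
def relative_frequencies(counts):
    """Turn raw collocate counts into relative frequencies."""
    total = sum(counts.values())
    return {word: n / total for word, n in counts.items()}


def rank_overlap(source_counts, target_counts, w_source=1.0, w_target=1.0):
    """Rank adjectives that collocate with BOTH the source and the target word.

    Assumed score: p_source ** w_source * p_target ** w_target.
    Equal weights give the symmetrical variant; a larger w_source or
    w_target gives a source- or target-dominant variant.
    """
    p_src = relative_frequencies(source_counts)
    p_tgt = relative_frequencies(target_counts)
    shared = set(p_src) & set(p_tgt)  # only overlapping collocates count
    scores = {adj: p_src[adj] ** w_source * p_tgt[adj] ** w_target
              for adj in shared}
    # Highest score first; alphabetical order breaks ties deterministically.
    return sorted(scores, key=lambda adj: (-scores[adj], adj))


# Invented adjective counts for a source word ("wolf") and a target word ("man").
WOLF = {"hungry": 40, "lone": 30, "grey": 20, "big": 10}
MAN = {"old": 50, "hungry": 10, "lone": 5, "big": 25, "young": 10}

if __name__ == "__main__":
    print("symmetrical:    ", rank_overlap(WOLF, MAN, 1, 1))
    print("source-dominant:", rank_overlap(WOLF, MAN, 2, 1))
    print("target-dominant:", rank_overlap(WOLF, MAN, 1, 2))
```

With these invented counts the three variants produce three different rankings of the same overlapping adjectives, which illustrates why the choice of weighting is itself a methodological decision that needs checking against informant judgments.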
To remedy the possible lack of psychological reality, Stefanowitsch then carried out a small-scale informant test in which the 12 native-speaker participants chose the most relevant adjectival attributes metaphorically associating man to wolf, gorilla, pig and one non-conventional association, that man is a dolphin. Stefanowitsch unassumingly admits that there still exist many problems with the application of the collocational overlap model, whether based on frequencies or on association strength. However, given the all too frequent assumptions of some authors in associating their own cognitive processes with those of the population at large, i.e., without taking into account the question of psychological reality, Stefanowitsch seems to be almost too modest about his methodological procedures, which, at the very least, can be said to attend to a host of criticisms that have long plagued studies in cognitive linguistics (Gibbs 2006).
Section II of the volume groups together papers on mental spaces and conceptual blending. It opens with Ronald Langacker's "Constructing the meanings of personal pronouns", in which he returns (Langacker 1991a: 226-230) to the notion of subjectivity/objectivity, i.e., different possible construals of the speaker as included in the scene. Langacker first alludes to a superficial account of I as designating the speaker, you as designating the hearer and they as designating a group which does not include either of the former. He then builds up his arguments towards the much more sophisticated and dynamic model of high-level blending (as when first and second person pronouns incorporate both the speech scenario and the general viewing scenario, p. 183), i.e., both speaker and hearer are on stage in the blended spaces. To explain this blending, he refers to reference point phenomena by showing how, for the first and second person pronouns, the reference point (salient entity) is also the target and, for the underspecified third person pronouns, the reference point has been identified by the current discourse space, i.e., all the information available to the interlocutors. Langacker also provides an interesting discussion of how personal pronouns functioning as impersonal pronouns have essentially the same meanings as they do in personal uses. After a brief discussion of we, you and they, Langacker turns to the underspecification of the pronoun it, whose vague reference comprehends the scope of awareness as an undifferentiated abstract whole, as in expressions like "it is obvious that . . .". The final section of the paper returns to the profiled referents of I and you and the construal of objectivity/subjectivity. Langacker suggests that the ambivalent nature of the two pronouns in regard to subject vs. object conception is actually part of their meaning. Thus, he provides a compelling account of how cognitive linguistics can explain intersubjectivity (apprehension of other minds and what they apprehend, p. 182).
His contribution to this volume provides the necessary additional linguistic account of sense boundaries to Rumelhart's psychological explanation of the role of schemata in the construction of intersubjective understanding.
Kiki Nikiforidou's article on "The construction of meaning in relative clauses" also deals with underspecification, in this case regarding Greek relative clauses introduced by pu, which frequently show indeterminacy in the way(s) the meaning of the head is integrated with the content of the relative clause (p. 190), e.g., "I am the woman that the fight took place [for]". Since the relativizer here should function as the complement of a preposition, which is absent, the hearer must engage in a certain amount of semantic/pragmatic interpretation. The missing preposition might be for, over, with or against and, although these missing prepositions are lexically determined by the verb fight, the context is crucial in resolving which of the possible prepositions might be the adequate interpretation of the blended space, which allows for multiple construals.
Nikiforidou analyzes her examples from informal speech in terms of blending, in which the relative construction requires a conceptual integration of the meaning of the antecedent with that of the relativizer. She also examines a type of gapless relative clause, i.e., one in which there is no gap corresponding to the head noun, but instead a pronoun which cannot be considered resumptive. The example "She's preparing in the kitchen some dishes that you are crazy [for]" might be analyzed as simply missing a preposition, but a better explanation of how people make sense of these two clauses is to consider the relativizer pu as strongly priming a pragmatically evoked causative relationship emergent from the blended space. In the final section of the article, Nikiforidou discusses the different types of constraining principles which may affect possible interpretations, and she plausibly concludes that, at least as far as the Greek pu relative clause is concerned, the constraints are predominantly those that concern interpretability.
In the article comprising chapter 10, Christian Koops addresses the cross-linguistic variability of inferential constructions ("it is that") and finds, as have other researchers, that different languages license the use of this construction in a much less restricted way than English does; for example, inferences can be activated by the environment rather than triggered by the linguistic context, as is the case with the English construction. These restrictions for English lead him to investigate, with the use of real spoken data instead of constructed examples, how discourse constraints reduce the applicability of this construction to contexts in which inferences are highly accessible.
Further constraints for this construction in English involve certain types of modifiers such as the adverb just or certain grammatical contexts such as negation ("it's not that"), which act as facilitators for the hearer's presupposition that an unmentioned value (a more specific reason) is relevant to the current discourse. In the case of negation, the "it is not that" construction presents a presupposed reason that is then rejected, and when this construction occurs with an epistemic modal ("it may be that"), the clause is presented as the result of a process of deduction (p. 219). Koops predicts that languages like English, in which the unmarked form of the construction rarely occurs, will also show the "it is that" construction in combination with modifiers and grammatical constructions which reinforce its specificational function (p. 223). Languages which do not show such a restrictive use of the construction, like Spanish, will not make use of the modifiers and specific grammatical contexts. He finishes his interesting discussion of the "it is that" constructions in various languages by proposing that more research be carried out on a wide range of languages so that his hypothesis can be confirmed.
In "The construction of vagueness: Sort of expression in Romance languages", Wiltrud Mihatsch uses data from two Germanic languages (English and German) and four Romance languages (French, Italian, Portuguese and Spanish) to study two paths of pragmaticalization involved in emerging stages of taxonomic nouns towards approximative modifiers, i.e., loose meaning constructions, such as espèce de from espèce, "species". With the use of dictionaries, corpora and internet sources, the author provides detailed information on the implicatures which have triggered the pragmaticalization of discourse particles, which, up to now, have only been studied more comprehensively for English and French. She meticulously traces the historical determinologization, caused by different communicative needs and background knowledge of speakers and hearers in everyday contexts as opposed to scientific communities (p. 229). The more peripheral members of subcategorization are those for which there arises a need for a qualifier in order to permit inclusion into a specified class. As this use becomes more and more entrenched, the original meaning of the taxonomic nouns leads to enhancement of the possible meanings of the NP, whereby negative quantification and free choice may be reinforced and therefore serve as a basis for grammaticalization processes of indefinite pronouns; these, then, may become universal quantifiers. Mihatsch uses dictionary entries (Corde, DHLF, OED, etc.) to tie all these data to a productive type of metonymy in which the scalar endpoint is used to emphasize the whole scale (p. 230). The "sort of" expressions (meaning "species") developed a new approximative function and began to work as modifiers of the following noun, which is usually a specific or abstract noun, not a central basic-level noun.
There is one other path of pragmaticalization of modifiers in Romance languages (but not in Germanic languages); this one involves two nouns, the first, generic one being postmodified by a specifying taxonomic NP. These equivalents of genus become preposition-like elements (in French genre; in Spanish del género de . . . , "like"). The study shows how diachronic data can adequately motivate the pragmatic changes in "sort of" expressions over time.
Chapter 12, titled "Communication or memory mismatch? Towards a cognitive typology of questions", proposes that questions are reflexes of memory mismatches, a process which occurs when one is processing stimuli from the outer world. Wolfgang Schulze offers an account of the heterogeneity of interrogative constructions and reflects on the claim of cognitive linguistics, according to him, that many variations in form and function are variants of a single cognitive strategy (p. 248). Conceptualization is considered as inherently dynamic (Langacker 1991b, 2001) and although the processes may involve many emergent factors, Schulze
hypothesizes that, as far as interrogativity is concerned, both the linguistic and conceptual variation ultimately derive from a superficially simple cognitive strategy that has to do with memory mismatch (p. 250), more specifically with memory and linearization (i.e., resolution of mismatch and alignment). That is, when one is construing a given world stimulus, there also occurs an activation of stored analogies, which may or may not coincide with the new stimuli. If there is coincidence, there is an assertion; non-coincidence results in interrogativity, involving various mechanisms (phonological, morphosyntactic, etc.) for constructing either polar questions (which address the givenness of the world stimulus) or constituent questions (which address a specific part of the stimulus). Unfortunately, the article seems a bit disjointed, presenting a series of issues in one section, the uptake of which is not always clear in the following sections. For example, in one of the final sections, the author gives a brief account of the bleaching of question words, including tags, but it is not clear exactly what connection holds between this account and the discussions presented in previous sections. Schulze, from the framework of Radical Experientialism (p. 247), concludes that questions are not really part of a communication strategy but rather reactional patterns that reflect the given state of cognition (p. 262).
The final chapter of the book, "Brutal Brits and persuasive Americans. Variety-specific meaning construction in the into-causative" by Stefanie Wulff, Anatol Stefanowitsch and Stefan Gries, aims at revealing the cultural differences which may be reflected in the lexical fillers used in grammatical constructions. To do so, they use corpora from two prominent newspapers to examine the collexemes of verb pairs which occur in the two varieties of English in into-causative constructions ("talk into accepting" or "bully into accepting").
In order to show the strength of association for both the cause slots and the result slots, the authors use a very promising statistical measure, called distinctive collexeme analysis (previously proposed and tested in various other articles; Stefanowitsch and Gries 2003, 2005), and then, for the 3,908 pairs of cause-result predicates, they apply the Fisher-Yates exact test for added reliability of significance. They then group the distinctive collexemes into larger semantic categories (communication, negative emotion, physical force, etc., pp. 273-274). Their most significant findings, apart from testing the validity of the meticulous use of various measurement tools for strength of association, support notions proposed by Construction Grammar, namely the parallel between the pragmatic notions of interpretation and context and the lexico-syntactic notions of verb and construction (p. 277); that is, constructions seem to offer meaning potential in the nuances that are expressed by the lexical fillers, which may be influenced by the
speakers' cultural input (p. 278). In the verbs studied, the analysis shows the following for the two varieties. In British English, the data showed a predominance of the use of acts causing negative emotions (pressurize, bounce, panic, bully, etc., denoting cause set into motion, and with no real verb of communication in this slot), and more verbs of communication in the result slot. For American English, the analysis revealed that the cause predicates denote a restriction of movements for the causee (talk, pressure, prod, coax, etc.), with a predominance of light verbs for the result predicates. The authors propose an interpretation, albeit acknowledging the need for more research to verify their findings: the cause-slot appears to indicate a distinctive use of threatening actions in British English, while the result-slot predicates denote a confession frame (p. 279); in American English, the cause-slot appears to indicate a distinctive preference for verbs of communication or verbal persuasion, while the result-slot predicates indicate some type of unspecified action (light verbs).
The sophisticated analytical tools used by the various authors provide a welcome addition to ways of attending to some of the criticism that cognitive linguistics has received. These extra-linguistic measures are called for by Gibbs in chapter one, although these authors use linguistic (corpus analysis) and statistical measurements and not psychological ones. And by showing that speakers' cultural inputs can be reliably traced, they provide a bridge to work being carried out in critical discourse analysis, thereby establishing a possible common ground for these two rapidly developing fields in modern linguistics (Stockwell n/d: 1).
In summary, Radden, Köpcke, Berg and Siemund have edited a useful body of research on metaphor, metonymy and blends, which is valuable in bringing readers up-to-date on recent developments.
However, Raymond Gibbs' point in the initial chapter is well taken: there is still a lag between theoretical and empirical studies. Without corpus studies, including historical ones, we cannot be sure of how ubiquitous metonymic and metaphoric expressions are; our speaker intuitions are often notoriously incorrect. Corpus studies, besides providing real examples and not constructed ones, may also reveal significant speaker differences in patterns of use across language varieties. And, without experimental studies, we cannot be sure that our after-the-fact cognitive modeling of speakers' rhetorical construals actually does correspond to their on-line processes, be they in comprehension or production. Thus, the contribution of this book may not lie so much in the aspects studied (although these certainly add to our knowledge of the production and reception of meaning); rather, this volume distinguishes itself in that many of the articles presented here provide a clear pathway to more reliable measurement of cognitive processes.
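As an illustration of the kind of corpus-based measurement praised here, the distinctive collexeme comparison reported for the into-causative chapter can be sketched as a 2x2 exact test: how often a verb fills the cause slot in one variety versus the other, against all remaining verbs. This is a minimal sketch and not the chapter authors' implementation. The function names are hypothetical, the counts are invented, and the two-sided p-value convention used (summing all tables that are no more probable than the observed one) is an assumption.

```python
from math import comb


def table_probability(a, b, c, d):
    """Hypergeometric probability of one 2x2 table with fixed margins."""
    n = a + b + c + d
    return comb(a + b, a) * comb(c + d, c) / comb(n, a + c)


def fisher_exact_two_sided(verb_a, verb_b, other_a, other_b):
    """Two-sided Fisher exact p-value for the table
           variety A   variety B
    verb     verb_a      verb_b
    others   other_a     other_b
    """
    row1, row2, col1 = verb_a + verb_b, other_a + other_b, verb_a + other_a
    p_observed = table_probability(verb_a, verb_b, other_a, other_b)
    p_value = 0.0
    # Walk through every table that shares the observed margins.
    for x in range(max(0, col1 - row2), min(row1, col1) + 1):
        p = table_probability(x, row1 - x, col1 - x, row2 - (col1 - x))
        if p <= p_observed * (1 + 1e-9):  # small tolerance for float ties
            p_value += p
    return min(p_value, 1.0)


def distinctive_variety(verb_a, other_a, verb_b, other_b, labels=("BrE", "AmE")):
    """Name the variety in which the verb has the larger share of the slot."""
    share_a = verb_a / (verb_a + other_a)
    share_b = verb_b / (verb_b + other_b)
    return labels[0] if share_a > share_b else labels[1]


if __name__ == "__main__":
    # Invented counts: (tokens in variety A, tokens in variety B),
    # out of 1,000 cause-slot tokens per variety.
    hypothetical = {"bully": (30, 10), "talk": (12, 45)}
    for verb, (a, b) in hypothetical.items():
        p = fisher_exact_two_sided(a, b, 1000 - a, 1000 - b)
        side = distinctive_variety(a, 1000 - a, b, 1000 - b)
        print(f"{verb}: distinctive for {side}, p = {p:.4g}")
```

A small p-value only says that the verb's distribution across the two varieties is unlikely under independence; interpreting that difference culturally, as the chapter does, remains a separate step.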
References
Ausubel, D., J. Novak & H. Hanesian. 1978. Educational psychology: A cognitive view, 2nd edn. New York: Holt, Rinehart & Winston.
Barcelona, A. 2002. Clarifying and applying the notions of metaphor and metonymy within cognitive linguistics: An up-date. In R. Dirven & R. Pörings (eds.), Metaphor and metonymy in comparison and contrast, 207-278. Berlin & New York: Mouton de Gruyter.
Barcelona, A. 2004. Metonymy behind grammar: The motivation of the seemingly irregular grammatical behavior of English paragon names. In G. Radden & K-U. Panther (eds.), Studies in linguistic motivation, 357-374. Berlin: Mouton de Gruyter.
Bartlett, F. C. 1932. Remembering: An experimental and social study. Cambridge: Cambridge University Press.
Bruner, J. 1990. Actual minds, possible worlds. Cambridge, MA: Harvard University Press.
Fesmire, S. 1994. What is cognitive about cognitive linguistics? Metaphor and Symbolic Activity 9(2). 149-154.
Gibbs, R. 1993. Process and products in making sense of tropes. In A. Ortony (ed.), Metaphor and thought, 2nd edn, 252-276. Cambridge: Cambridge University Press.
Gibbs, R. 2006. Cognitive linguistics and metaphor research: Past successes, sceptical questions, future challenges. DELTA 22. www.scielo.br
Goosens, L. 2002. Metaphtonymy: The interaction of metaphor and metonymy in expressions for linguistic action. In R. Dirven & R. Pörings (eds.), Metaphor and metonymy in comparison and contrast, 349-377. Berlin & New York: Mouton de Gruyter.
Johnson, M. 1987. The body in the mind. The bodily basis of meaning, imagination, and reason. Chicago & London: Chicago University Press.
Lakoff, G. 1987. Women, fire and dangerous things. What categories reveal about the mind. Chicago: Chicago University Press.
Langacker, R. 1991a. Concept, image and symbol. The cognitive basis of grammar. Berlin & New York: Mouton de Gruyter.
Langacker, R. 1991b. Foundations of Cognitive Grammar, vol. 2, Descriptive application. Stanford: Stanford University Press.
Langacker, R. 2001. Dynamicity in grammar. Axiomathes 12. 7-33.
Leech, G., P. Rayson & A. Wilson. 2001. Word frequencies in written and spoken English: Based on the British National Corpus. London: Longman.
Mandler, J. 1984. Stories, scripts, and scenes: Aspects of schema theory. Hillsdale, NJ: Erlbaum.
Mandler, J. 1992. How to build a baby, II, Conceptual primitives. Psychological Review. 587-604.
Markert, K. & M. Nissum. 2003. Corpus-based metonymy analysis. Metaphor and Symbol 18. 175-188.
Michaelis, L. 2003. Headless constructions and coercion by construction. In E. Francis & L. Michaelis (eds.), Mismatch: Form-function incongruity and the architecture of grammar (Lecture Notes 163), 259-310. Stanford: CSLI Publications.
Neisser, U. 1976. Cognition and reality. San Francisco: W. H. Freeman.
Panther, K-U. 2005. The role of conceptual metonymy in meaning construction. In F. Ruiz de Mendoza & S. Peña Cervel (eds.), Cognitive Linguistics: Internal dynamics and interdisciplinary interaction, 353-386. Berlin & New York: Mouton de Gruyter.
Panther, K-U. & L. Thornburg. 1998. A cognitive approach to inferencing in conversation. Journal of Pragmatics 30. 755-769.
Panther, K-U. & L. Thornburg (eds.). 2003a. Metonymy and pragmatic inferencing. Amsterdam: John Benjamins.
Panther, K-U. & L. Thornburg. 2003b. Metonymies as natural inference and activation of schemas. In K-U. Panther & L. Thornburg (eds.), Metonymy and pragmatic inferencing. Amsterdam: John Benjamins.
Panther, K-U. & L. Thornburg. 2004. The role of metonymy in meaning construction. metaphoric.de 6. 91-116.
Rumelhart, D. E. 1980. Schemata: The building blocks of cognition. In R. J. Spiro, B. Bruce & W. F. Brewer (eds.), Theoretical issues in reading and comprehension. Hillsdale, NJ: Erlbaum.
Shank, R. C. 1977. Scripts, plans, goals and understanding. Hillsdale, NJ: Erlbaum.
Stefanowitsch, A. & S. Gries. 2003. Collostructions: On the interaction between verbs and constructions. International Journal of Corpus Linguistics 8(2). 209-243.
Stefanowitsch, A. & S. Gries. 2005. Covarying collexemes. Corpus Linguistics and Linguistic Theory 1(1). 1-43.
Stockwell, P. n/d. Towards a critical cognitive linguistics? eprints.nottingham.ac.uk
Thornburg, L., K-U. Panther & A. Barcelona (eds.). 2006. Metonymy and metaphor in grammar. Amsterdam: John Benjamins.
Traugott, E. C. 2007. The concepts of construction mismatch and type-shifting from the perspective of grammaticalization. Cognitive Linguistics 18(4). 523-557.
Verhagen, A. 2005. Constructions of intersubjectivity: Discourse, syntax and cognition. Oxford & New York: Oxford University Press.
Vygotsky, L. 1978. Mind in society. Cambridge, MA: Harvard University Press.
Space, language, and cognition: New advances in acquisition research1

HENRIETTE HENDRIKS*, MAYA HICKMANN and KATRIN LINDNER
Abstract In this introductory chapter to the present special issue about "Space, language and cognition: developmental perspectives", we introduce some of the main questions that are currently debated concerning the relationships between cognitive and linguistic representations in the domain of space. This collection of papers addresses these questions by bringing together contributions from different disciplines, theoretical perspectives, and methodological approaches. All papers start out with the assumption that spatial cognition is not indifferent to spatial language and aim at specifying how the two might be best related by examining the development of spatial representations in children and adults through language use and acquisition.
Keywords: spatial language, spatial cognition, co-verbal gestures, language acquisition, cognitive development, cross-linguistic comparisons, motion, typology.

1. Why space?
The ability to represent and to communicate spatial information accurately and rapidly is vital for the survival of all species, for example
1. The project to put together this special issue initially arose during a paper session on "Spatial cognition and its expression in language and gesture: a developmental view" organized in Munich by Katrin Lindner (LMU Munich) and Heike Behrens (University of Basle) in the context of the Second International Conference of the German Cognitive Linguistics Association (5–7 October 2006 check ICCLS). The names of the three editors are listed in alphabetical order.
* Address for correspondence: H. Hendriks, Research Centre for English and Applied Linguistics, University of Cambridge, English Faculty Building, 9 West Road, Cambridge CB3 9DP, UK. Email: henriette.hendriks@rceal.cam.ac.uk
Cognitive Linguistics 21–2 (2010), 181–188. DOI 10.1515/COGL.2010.006. 0936–5907/10/0021–0181 © Walter de Gruyter
enabling individuals or groups to flee from danger, to find food or to return home. A growing number of studies have examined this domain in various disciplines of the cognitive sciences that have brought complementary contributions stemming from different scientific traditions and methodologies (linguistics, cognitive psychology, developmental psychology, psycholinguistics, philosophy, neurosciences . . . ). This research has aimed at studying spatial behaviour in order to uncover the underlying processes involved in the construction of internal spatial representations, the resulting spatial categories that guide individuals' decisions, and the neural substrata that underlie spatial cognition. Studies have also examined how spatial representations evolve over time from different perspectives: across species from a phylogenetic perspective, across languages from a diachronic perspective, and during child development from an ontogenetic perspective. In comparison to other species, a specific and remarkable feature of our spatial cognition is the potential role of human language in determining properties of our mental spatial representations. Language provides a unique and powerful symbolic system that constitutes one of the means whereby we construct and categorize space. One of the central questions currently addressed in the cognitive sciences has been to determine the properties of spatial systems across languages of the world and the different ways in which they might contribute to how we construct space. In this respect, the last ten or twenty years have witnessed a blooming of studies in this domain focusing on major debates concerning the relationship between spatial language and spatial cognition. These debates revolve around a number of questions, for example: What are the universals of spatial language and cognition? Do various typological properties of linguistic spatial systems constrain our cognitive spatial representations?
What is the degree to which different (perceptual, cognitive, linguistic) components of spatial behaviour are autonomous or interact? This collection of papers addresses some of these questions by bringing together contributions from different disciplines and perspectives. They all start out with the assumption that spatial cognition is not indifferent to spatial language and aim at specifying how the two might be related. All papers are framed within a developmental perspective in which they examine the development of spatial representations through language use and acquisition. Taken together, this collection presents contributions that are representative of new advances in acquisition research and discuss different languages (English, French, German, Japanese), different phases of language acquisition (children, adolescents, adults), types of learners (first and second language acquisition), and types of spatial representations (verbal productions, co-verbal gestures).
2. What can spatial language tell us about spatial cognition?
Spatial language forms a system comprising means of talking about different aspects of the spatial universe that is available to our perception. This universe can be roughly described in terms of situations that define two main sub-spaces: static space and dynamic space. The first sub-space concerns the localization of entities in relation to other entities in space. It involves, for example, the expression of varied types of information in answer to questions such as "Where is X?". This information varies across languages: spatial relations (X is in, on, under, above . . . Y), posture (X lies, stands, sits . . . on Y), and other related information (figure and ground properties, functional properties . . . ). The second sub-space concerns different types of motion. Motion may occur within a given general location (to run around in the garden) or it may imply a change from one location to another (to run away) and it can be carried out voluntarily by an agent (to run, to leave), be involuntary (to fall) and/or result from the energy produced by an external force (to push something up). Irrespective of differences among them, all languages provide means of encoding static and dynamic information, both of which are central to our spatial representations. The study of spatial language therefore constitutes one means of approaching the general nature of human spatial representations and the processes whereby they are constructed for varied purposes. Linguistic organization in any semantic domain inherently involves distinctions and relations among them that enter into a coherent network of discrete categories available to speakers when they construct their spatial representations. Several questions therefore arise in the context of current debates concerning the relationship between language and cognition. One fundamental set of questions that is presently debated concerns the relatively autonomous or interactive nature of different dimensions of our spatial representations.
Research has begun to examine whether all languages follow similar organizational principles and the extent to which such common properties of linguistic systems might reflect deeper universal properties of human spatial cognition. Various claims in linguistics have revolved around whether or not all languages share a common clear distinction between two autonomous domains: reference to entities and spatial reference. Research in neurosciences has even provided evidence in support of these separate systems in the brain (What/Where systems). In addition, cognitive psychology and psycholinguistics have typically studied either the perceptual/cognitive processes involved in the construction of spatial representation or the verbal processes that might contribute to these representations, both of which are presumed to be universal, but little is known about how these two sets of processes might be related.
In this respect, the study of spatial language addresses a second related question concerning the universals of spatial representations and the typological constraints that might partially determine them. Linguistic research has indeed uncovered wide differences across the spatial systems of languages that touch on varied dimensions. Thus, languages provide different markers of spatial relations, make different distinctions along several spatial dimensions, rely on different spatial reference systems, and display different grammaticalization and lexicalization patterns for the encoding of varied information relative to location or to motion. A growing number of psycholinguistic studies have begun to show striking cross-linguistic differences in how speakers express spatial information, for example differences in the types of information they select and in the ways in which they organize this information in speech. On the basis of such results, some authors have revived the Whorfian hypothesis, according to which such differences reflect deeper differences in cognitive organization. According to this view, behavioural differences result from the ways in which each language (or language family) filters and channels incoming spatial information, thereby making some of this information more salient and more accessible to cognitive functioning. In contrast, others view such differences as merely reflecting superficial language-specific properties that do not affect deeper universal properties underlying cognitive organization. In this second view, then, cross-linguistic behavioural differences are only apparent in speech and do not tap underlying cognitive representations. Clearly, claims concerning the impact of language-specific factors on human spatial cognition must go beyond the study of speech behaviour in order to avoid the pitfalls of circularity.
Although available results concerning speech production are highly revealing in suggesting the possibility that particular types of linguistic organization might determine particular types of cognitive organization, this possibility still remains somewhat hypothetical in the absence of complementary evidence directly concerning non-linguistic representations. Going beyond linguistic production requires the study of both verbal and non-verbal behaviours in order to determine whether both types of behaviours differ across language groups and whether both could be accounted for by language-specific or typological constraints. In this respect, one way of supporting the Whorfian claim might be to study the relationship between speech and co-verbal gestures. Gestures during speech production provide another window onto speakers' representations in that they are produced on line in close temporal proximity to speech as well as through a different modality. The similarities and differences between simultaneous verbal and co-verbal behaviour might therefore uncover properties that are
common to both modalities and that both follow from predictions based on language properties.

3. Why acquisition?
In addition to descriptive studies of spatial systems in linguistics and to experimental studies of spatial cognition in general cognitive psychology, space is of great interest in several respects from a developmental point of view. Spatial cognition has been a central domain of study for cognitive developmental psychology, serving as a primary foundation of children's intelligence in ontogeny. For example, during the 20th century Piaget's most influential developmental theory proposed that cognitive development is based on sensori-motor activity through endogenous and biologically determined cognitive processes (accommodation, adaptation) that drive children through a series of universal stages (pre-operational, concrete operations, formal operations). Thus, it is by grounding his representations on displacements and locations in space, as well as on the manipulation of entities in the surrounding environment, that the child comes to construct the world in such a way as to reach rational forms of thought. Similarly, much previous developmental psycholinguistic research in the spatial domain has focused on presumably universal properties of semantic systems that might be reflected in recurrent and gradual progressions during children's first language acquisition. For example, some dimensions in the spatial domain seem to be perceptually most salient to children and this salience seems to determine the ways in which they learn to use spatial devices (such as the order in which spatial prepositions are acquired). Since the end of the 20th century, the development of new methodologies in psychology has led to a revolution in our knowledge about child development. In particular, infancy has been the locus of much debate concerning components of children's knowledge that seem to be more precocious than expected by previous theories.
Thus, we now know that infants during the pre-linguistic period display some surprisingly precocious knowledge about the world, comprising knowledge in a variety of domains that include space but also categories of entities, number, temporality, causation, and agentivity. In the spatial domain, infants as young as three months of age know about the properties of entities, about their spatial relations, and about the physical laws that account for their displacements in space. The interpretation of these results is still controversial and has led to several views of child development. Some view this precocious knowledge as organized in autonomous modules that are pre-programmed in infants' biological heritage, while others view it as
stemming from active perceptual and cognitive processes that occur very rapidly from birth on. With respect to language acquisition, all of these views propose more generally that children either inherit or develop percepts and concepts during the pre-linguistic period, on the basis of which they look for the adequate linguistic means of expression. This cognitive determinism, then, is the major force driving children to acquire language. At least two different views have been proposed, both of which consider that language plays a structuring role during cognitive development. Also during the 20th century, Vygotsky proposed that the advent of language results in a major reorganization of cognition whether on a phylogenetic or an ontogenetic scale. Since Vygotsky's writings, more recent studies also conclude that language input and acquisition result in particular types of cognitive functioning, leading young children to search for certain regularities that they would not discover otherwise. Yet a different view has been that children are sensitive not only to the general properties of language, but also to the specific properties of their language from the earliest age on, leading them to construct particular categories and/or particular ways of organizing them. Another development in psycholinguistics focuses on the later stages of child development, increasing the temporal span beyond early stages. Thus, studies have begun to examine later stages of language acquisition (pre-adolescence, adolescence), beyond the point of biological maturation considered by some approaches to be a major determinant of early acquisition. This research examines language use either in natural situations (for example conversations) or in tasks involving different discourse types (narratives, argumentation).
Results are enlightening in understanding complex language skills that are acquired during later phases of acquisition and are nonetheless fundamental for learners to become competent native speakers. Finally, during the last twenty years, a growing number of studies have examined different types of learners in order to shed further light on the process of language acquisition. The most common type of comparison has involved children's first language acquisition and adults' second language acquisition. Such comparisons make a contribution to our understanding of language acquisition by allowing us to separate factors that are normally confounded during child development. Thus, the child displays an increasing maturity in all domains (growing cognitive and linguistic maturity), whereas the adult learner comes to the task of second language acquisition with a mature cognitive system. By way of implication, such comparisons also allow us to understand how language and cognition relate across linguistic systems.
4. Language and cognition: present and future directions
The different papers in this special issue touch on all of these questions and open new directions for future research. A first series of papers examines first language acquisition in a cross-linguistic perspective. In their paper on "Typological constraints on the acquisition of spatial language in French and English", Hickmann and Hendriks compare children's early spontaneous utterances about motion events in English and in French on the basis of experimental and longitudinal corpora collected from the emergence of language on. The paper by Ochsenbauer and Hickmann on "Children's verbalizations of motion events in German" presents comparative analyses of experimentally elicited productions about voluntary motion. It discusses German data (adults and children between 3 and 11 years) in comparison to previous results in French and English. The paper by Gullberg and Narasimhan on "What gestures reveal about the development of semantic distinctions in Dutch children's placement verbs" examines the gestures produced by Dutch adults and children (of 4–5 years) in parallel with their uses of placement verbs. Three papers then focus on late acquisition phases. In their paper on "Changes in L1 encoding of path after exposure to an L2", Brown and Gullberg analyze how adult Japanese speakers acquiring English as a second language with different levels of proficiency talk about motion, discussing the effects of L2 on the expression of path in L1. In "I'm fed up with Marmite, I'm moving on to Vegemite", Graf discusses children's (6 to 10 years of age) uses of spatial language in spontaneous conversations, including static and dynamic references, as well as literal and non-literal uses. Finally, the article by Lemmens and Perrez, "On the use of posture verbs by French-speaking learners of Dutch: a corpus-based study", presents the results of a quantitative and qualitative corpus study of the use of the Dutch posture verbs staan ('stand'), liggen ('lie') and zitten ('sit') by French-speaking learners of Dutch.
As implied by the discussions across these papers, at least two questions will clearly deserve attention in future research. The first one concerns the relationships that may exist among processes that take place across semantic domains during language acquisition. In this respect, although this collection of papers focuses specifically on spatial language and cognition, many questions concerning the relationships between space, time, and the properties of entities remain surprisingly unattended. Second, perhaps the most difficult challenge to meet in the future concerns the extent to which language has an impact on speakers' internal representations. Much more research is still necessary to address this question, using a variety of situations and tasks (categorization of events and of relations, eye movements during the exploration of scenes, verbal
and visual memory). It is only through a variety of such behavioural measures that we can hope to uncover whether and how our spatial representations are influenced by language (and by specific languages) both in communication (verbal representations) and in problem-solving situations (mental representations). As shown by the present volume, studying the relationship between language and space is a task that requires multiple methods in an interdisciplinary perspective. It is our hope that the volume will contribute to promoting future studies of this kind by bringing together papers that cover both typological and developmental questions, by studying a variety of languages, and by relying on varied methodologies. It is only through the joint contribution of multiple perspectives and tools that we will ultimately make significant advances in the study of spatial cognition, of spatial language, and of the relationship between the two.
Typological constraints on the acquisition of spatial language in French and English

MAYA HICKMANN and HENRIETTE HENDRIKS*
Abstract Typological analyses (Talmy 2000) show that languages vary a great deal in how they package and distribute spatial information by lexical and grammatical means. Recent developmental research suggests that children's language acquisition is constrained by such typological properties from an early age on, but the relative role of such constraints in language and cognitive development is still much debated (Bowerman 2007; Bowerman and Choi 2003; Slobin 1996, 2003a, 2003b, 2006). In the context of this debate, we compare the expression of motion in two databases of child English vs. French: 1) experimentally induced productions about caused motion (adults and children of three to ten years); 2) spontaneous productions about varied types of motion events during earlier phases of acquisition (18 months to three years). The results of both studies show that the density of information about motion increases with age in both languages, particularly after the age of five years. However, they also show striking cross-linguistic differences. At all ages the semantic density of utterances about motion is higher in English than in French. English speakers systematically use compact structures to express multiple types of information (typically MANNER and CAUSE in main verbs, PATH in other devices). French speakers rely more on verbs and/or distribute information in more varied ways across parts of speech. The discussion highlights the joint impact of cognitive and typological factors on language acquisition, and raises questions to be addressed in further research concerning the relation between language and cognition during development.
Keywords: first language acquisition, French and English, longitudinal and experimental data, typology, path, manner, and cause.

* Address for correspondence: M. Hickmann, CNRS Laboratoire Structures Formelles du Langage, UMR 7023, 59 rue Pouchet, 75017, Paris, France. Email: maya.hickmann@sfl.cnrs.fr. H. Hendriks, Research Centre for English and Applied Linguistics, University of Cambridge, English Faculty Building, 9 West Road, Cambridge CB3 9DP, UK. Email: henriette.hendriks@rceal.cam.ac.uk
Cognitive Linguistics 21–2 (2010), 189–215. DOI 10.1515/COGL.2010.007. 0936–5907/10/0021–0189 © Walter de Gruyter

1. Introduction
A highly debated question in current research on child language concerns the relative impact of universal versus language-specific determinants in first language acquisition. In this respect, the spatial domain has been of particular interest. Space is a most basic domain of human cognition, traditionally presumed to be governed by universal principles and capacities, but nonetheless showing considerable variations across human languages. Such variations concern the linguistic expression of static space (e.g., spatial relations, location) and of dynamic events (e.g., motion, changes of location). Previous developmental research in this domain has led some researchers to postulate that particular typological properties may influence the rhythm and course of children's language acquisition, and perhaps even the ways in which they develop and organize spatial concepts. In the context of this debate, we present below two studies concerning the expression of motion events in English versus French, focusing on how adults and children between three and ten years of age express caused motion in experimentally controlled situations (Study 1), as well as on how young children express all kinds of motion events in early spontaneous productions between the ages of two and three years (Study 2). Both studies examine the semantic content denoted in relation to motion and the devices used to express this information, with particular attention to the impact of both typological constraints and cognitive determinants on both aspects of children's utterances.
2. Motion in language

2.1. Universals and cross-linguistic variability
Linguistic research on spatial systems shows that languages vary a great deal in how they represent spatial information. With respect to motion events, Talmy (2000) proposes that languages fall into different families depending on how they package and distribute spatial information by lexical and grammatical means. Satellite-framed languages (e.g., Germanic) typically encode the manner of motion in the verb root and its path in verbal satellites, whereas verb-framed languages (e.g., Romance) typically encode path in the verb root and express manner (if at all) by additional means at the periphery of the clause (examples (1) and (2)).
(1) She runs/crawls . . . up, down, away, across, into, out of . . .
(2) Elle monte, descend, part, traverse, entre, sort . . . en courant/à quatre pattes . . .
    (She ascends, descends, leaves, crosses, enters, exits . . . by running/on all fours . . . )
These typological differences correspond to strong paradigms that run through the language of motion, leading to speakers' strong preferences for particular means of expression in default cases, despite a number of other available options, such as options that are more marked pragmatically (e.g., those emphasizing manner and/or path in French (3)), that result from borrowings (e.g., English Latinate (4)), or that are remnants from diachronic linguistic evolution (e.g., French verbal prefixes such as (5)).¹
(3) Elle a couru jusqu'à la maison à cloche-pied.
    (She ran all the way home on one foot.)
(4) to enter, to ascend, to descend . . .
(5) accoster (to reach the coast), écrémer (to take cream off), emboîter (to in-fit) . . .
Furthermore, such preferences also have an impact on speakers' preferences beyond the spatial domain, for example affecting the expression of causal relations, which are typically encoded by means of compact causal-resultative constructions in Germanic languages (such as English (6)) vs. more distributed constructions in Romance languages (such as French (7)).
(6) She kicked the door open and kicked the dog out.
(7) Elle a ouvert la porte et fait sortir le chien à coups de pied.
    (She opened the door and made the dog go out by kicking).

2.2. Motion in first language acquisition
A large body of research has brought strong evidence showing that very young infants possess much knowledge about space. On the basis of an extensive review, Pruden et al. (2008) conclude that the body of evidence currently available suggests that infants are sophisticated observers of actions and relations, [ . . . ] and that the inherent problem in learning
1. As shown by Kopecka (2006), verbal prefixes such as those in (5) are remnants from an earlier stage in the diachronic development of French which evolved from a predominantly satellite-framed system (old French) to a predominantly verb-framed system (contemporary French).
[spatial] relational terms therefore appears not to be with conceptualizing events and actions in the world, but rather with mapping words onto actions (id: 10). However, as mentioned above, recent developmental research shows that children's performance in language production and comprehension is constrained by typological properties from an early age on. The role of such constraints in language acquisition and more generally in cognitive development is presently vividly debated. According to some authors, language-specific properties have a deep impact on cognition, influencing the rhythm and course of language acquisition, as well as the ways in which speakers of all ages from infancy to adulthood select and organize information when planning their utterances (Slobin 1996, 2003a, 2003b, 2006) and/or when constructing their cognitive categories (Bowerman 1996, 2007; Bowerman and Choi 2003; Choi and Bowerman 1991; Lucy and Gaskins 2001). Other authors propose that language-specific effects on cognition occur only at a later age (around 7 years), when the language is assumed to be fully settled as a system (Hohenstein 2005). Yet others view such effects as relatively superficial, eventually affecting verbal behaviors in some situations, but with no deeper implications for non-verbal cognition, itself assumed to follow universal progressions governed by language-independent perceptual and cognitive constraints (Clark 2003; Landau and Lakusta 2006; Munnich and Landau 2003). In the context of these debates, little information is available concerning the expression of motion in French child language, despite the interesting status of this language from a typological point of view (Kopecka 2006, see Note 1). Our previous research (Hendriks et al. 2008; Hickmann 2003; Hickmann and Hendriks 2006; Hickmann et al. 1998; Hickmann et al. 2009; Ochsenbauer and Hickmann 2008) compared how adults and children (between three and ten years) expressed motion in French vs.
other languages in controlled experimental situations. General results show first that utterances about motion are semantically less dense in French than in other languages (English, German, Chinese). In addition, although all languages show an increase with age in the semantic density of utterances about motion (voluntary or caused), this developmental progression is much more striking in French than in other languages. Finally, French utterances about motion are much more variable as a function of age and of event type in comparison to other languages where response patterns are rather systematic. When describing voluntary motion (Hickmann et al. 2009), French adults express both manner and path in some situations (e.g., with crossing events, such as a path verb and a manner phrase (8)), but they do not do so systematically. French children typically focus either on path or on manner, expressing one or
the other type of information in verb roots (e.g., path in (9), manner in (10)), but rarely expressing both simultaneously within the same utterance. A notable exception concerns some descriptions of upwards motion at all ages (e.g., in (11) the verb grimper 'to climb up' lexicalizes manner and path). In sharp contrast, most utterances in other languages simultaneously encode manner (in verb roots) and path (in satellites), regardless of age and of event type.
(8) Alors le petit garçon traverse la rivière à la nage . . . (FAD01)
    (then the little boy crosses the river by swimming [at/with a swim] . . . )
(9) et qui ensuite redescend le long du brin d'herbe . . . (FAD01)
    (and who then descends back along the blade of grass . . . )
(10) il y a un garçon qui nage (F0618)
    (there's a boy who is swimming)
(11) alors le petit écureuil grimpe à l'arbre . . . (FAD01)
    (then the little squirrel climbs up [at] the tree . . . )
The dierence that was observed with respect to overall semantic density follows from language-specic properties. As suggested by previous studies (cf. Slobin 1996), it is easier to stack semantic components in satellite-framed languages such as English, which encode path in satellites thereby freeing the verb to express manner, than in verb-framed languages such as French, which reserve the main verb for path information, making it necessary to use more linguistic tools to encode manner at the sentence periphery. However, the fact that density increases with age across all languages shows the role of cognitive developmental factors. It is cognitively less demanding for young children to process only one type of information than to process and to combine more. Finally, the combined eects of cognitive and language-specic factors explain the fact that utterance density undergoes a slower developmental progression in a verb-framed language such as French as compared to satellite-framed languages. French children must not only develop their general cognitive capacities, but also solve the problem of having to use additional linguistic tools in order to combine multiple semantic components. With respect to caused motion, two patterns emerge. In a rst experiment (Hickmann and Hendriks 2006), subjects had to describe object displacements carried out in front of them by the experimenter. In this situation English speakers of all ages typically used a general verb indicating caused motion combined with particles or prepositions encoding spatial information (e.g., to put on/into). In contrast, French adults used a great variety of specic verbs to express simultaneously caused motion and several other types of information, particularly manner of attachment, either
using no preposition at all or relying on general prepositions (e.g., accrocher le manteau [au portemanteau] 'to hook the coat [at the coathanger]'). Although children generally followed the same pattern as the adults in their language group, young French children used fewer specific verbs and more specific prepositions than older children or adults (e.g., mettre le manteau [sur le portemanteau] 'to put the coat [on the coathanger]'). In a second experiment (Hendriks et al. 2008) adult speakers of French and of English (as well as adult English-speaking learners of French) were asked to describe animated cartoons showing a man who displaced various entities. Native English speakers systematically encoded cause+manner in the verb root and path in satellites (to push the ball down, to roll the ball across), while French speakers expressed all kinds of information in different parts of speech, showing little systematicity in their responses. These results are in line with the typological properties of French and with previous studies concerning other Romance languages (e.g., Spanish, see Berman and Slobin 1994), suggesting that French presents a difficult system to learn when it comes to simultaneously expressing multiple types of information about motion.

The two studies reported below follow up on these previous studies. Study 1 used the same experimental paradigm as Hendriks et al. (2008) to elicit utterances about caused motion among children between the ages of three and ten years. The aim was to determine whether the language differences that had been observed among adult speakers could also be observed among children. Study 2 then further examined children's spontaneous productions at earlier ages (about two to three years) in order to pursue the question of language-specific determinants of acquisition from the emergence of language onward.
3. Study 1: Experiment on caused motion

3.1. Method
3.1.1. Subjects. Adults and children of four age groups participated in the study (12 subjects per group, total N = 60). Children were approximately three years (French mean 3;4, range 3;0 to 3;10, English mean 3;3, range 2;11 to 3;6), four years (French mean 4;5, range 3;11 to 4;10, English mean 4, range 3;7 to 4;5), five years (French mean 5;5, range 4;11 to 5;10, English mean 5;1, range 4;7 to 5;5), and ten years (French mean 10;2, range 9;8 to 11;1, English mean 10;0, range 9;6 to 10;6). Children were tested in kindergartens and schools in Paris and Cambridge. Adults were university students (Universities of Paris 5 and of Cambridge).
Table 1. Summary of main variables in test items for caused motion (Study 1)

I.    Manner of A motion    Manner in which Agent moved (walking in all items)
II.   Cause                 Causal relation between Agent and Object (in all items)
III.  Manner of cause       Action of Agent causing Object to move (push, pull)
IV.   Manner of O motion    Manner in which Object moved (roll, slide)
V.    Path of motion        Trajectory followed by Agent and Object (up, down, across, into)
3.1.2. Materials. A total of 32 test items was designed (see summary in Appendix 1). They consisted of short animated cartoons in colour, all of which showed the same human agent (called Popi in French and Hoppy in English) in motion (hereafter A) carrying out an action that caused the displacement of an object (hereafter O). Table 1 summarizes all information components that were relevant to motion, two of which were constant across all items (I, II), while others systematically varied across items (III to V). The crossing of variables III to V resulted in 16 possible combinations (2 × 2 × 4), each of which was presented by means of two exemplars (resulting in a total of 32 items). The session began with an additional training item of the same type which ensured that subjects would be comfortable with the task and showed them the different types of information that were most relevant (corresponding to variables III to V), namely A's action (manner of cause), the manner of O's motion (manner of O-motion), and the path followed by A and O (path). Finally, a set of seven distractor items showed different types of situations without any scenery (grey background) or human agent. In these items a ball caused the motion of an inanimate entity (e.g., rolling into a book and causing it to move forward), sometimes also causing another result (e.g., a ball rolling into a vase, causing it to fall over and to break).

3.1.3. Procedure. Subjects were seen individually in their school or university setting. They were shown a series of animated cartoons and asked to tell what had happened in each. Target items were presented in four random orders to which subjects were assigned randomly. In all orders distractor items were interspersed at regular intervals among test items (one after every four test items). The following procedure invited participants to be as complete as possible.
Young children (three to five years) were presented with a blindfolded doll and asked to tell her everything that had happened so that she could tell the story back. Older children (ten years) and adults were asked to narrate the cartoons for a fictitious listener who would not have seen the stimuli and would have to retell the stories only on the basis of the recordings.
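The stimulus design described in 3.1.2 and 3.1.3 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the dictionary keys, the distractor labels and the use of a seeded shuffle are hypothetical and are not taken from the study materials.

```python
import random
from itertools import product

# Variables III to V from Table 1 (I and II are constant across items).
MANNER_OF_CAUSE = ("push", "pull")            # III
MANNER_OF_OBJECT_MOTION = ("roll", "slide")   # IV
PATH = ("up", "down", "across", "into")       # V
EXEMPLARS_PER_COMBINATION = 2


def build_test_items():
    """Cross variables III to V (2 x 2 x 4) with two exemplars each."""
    items = []
    for cause_manner, object_manner, path in product(
        MANNER_OF_CAUSE, MANNER_OF_OBJECT_MOTION, PATH
    ):
        for exemplar in range(1, EXEMPLARS_PER_COMBINATION + 1):
            items.append(
                {
                    "manner_of_cause": cause_manner,
                    "manner_of_object_motion": object_manner,
                    "path": path,
                    "exemplar": exemplar,
                }
            )
    return items


def presentation_sequence(test_items, distractors, seed, every=4):
    """Shuffle test items (assumed seeded shuffle) and insert one
    distractor after every `every` test items until none remain."""
    shuffled = list(test_items)
    random.Random(seed).shuffle(shuffled)
    remaining = list(distractors)
    sequence = []
    for index, item in enumerate(shuffled, start=1):
        sequence.append(item)
        if index % every == 0 and remaining:
            sequence.append(remaining.pop(0))
    return sequence
```

With 32 test items and seven distractors, this scheme places a distractor after test items 4, 8, ..., 28 and leaves the final block of four test items without one.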
3.1.4. Coding. The analyses below focus on responses that were elicited with test items (32 responses per subject, a total of 384 responses per age group in each language). The coding procedure aimed at providing three complementary measures of each response: 1) information focus, i.e., the particular types of information expressed in relation to motion within a given utterance, with particular attention to the dimensions that defined our stimuli (variables I to V in Table 1 above); responses that did not express any of these dimensions (e.g., verbs such as to go or to move, that merely expressed motion, but not any particular path, manner, or cause of motion) were coded as not expressing any relevant information; 2) the resulting overall utterance density, i.e., the total number of semantic components expressed within the utterance, which corresponded to one of four density categories (none, one, two, three or more components, hereafter SD0, SD1, SD2, SD3); 3) information locus, i.e., the particular linguistic devices used to express this information (main verbs vs. other devices). The examples below show the uses of main verb roots vs. other devices (in italics) to express different information types (in brackets) in responses that varied in semantic density: (12) expresses motion but contains none of the semantic components that were coded (I to V in Table 1) and it was therefore coded as an SD0 utterance; (13) and (14) contain one component, (15) to (17) two components, (18) and (19) three components.

(12) That's a horsey. [ . . . ] He's moving [none]. (E0302)
(13) It's a car [ . . . ] going up and up and up and up [path]. (E0301)
(14) Et bah il est monté [path] avec la bouée. (F0305)
     (And well he ascended with the swimming ring.)
(15) He was bringing [cause] the trunk down [path]. (E0507)
(16) Il a fait rouler [cause+manner-of-o-motion] la roue. (F0305)
     (He made roll the wheel.)
(17) Et après il a monté [cause+path] le cadeau. (F0306)
     (And then he ascended the package.)
(18) He was pushing [cause+manner-of-cause] the present up [path] onto [path] the house. (E1009)
(19) Il l'a avancé [cause+path] avec ses pieds [manner-of-cause]. (F0306)
     (He moved it forward with his feet.)

3.2. Results
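The coding measures defined in 3.1.4, and group-level percentages of the kind reported in this section, can be illustrated with a minimal Python sketch. It assumes that each response has already been hand-coded as a list of component labels; the label strings and function names are hypothetical, and counting each information type once per response is an assumption suggested by example (18), where path is marked twice in a three-component utterance.

```python
from collections import Counter

# Component labels corresponding to variables I to V in Table 1 (hypothetical names).
CODED_COMPONENTS = {
    "manner_of_a_motion",
    "cause",
    "manner_of_cause",
    "manner_of_o_motion",
    "path",
}
CATEGORIES = ("SD0", "SD1", "SD2", "SD3")


def density_category(components):
    """Return SD0..SD3 for one response (SD3 means three or more components)."""
    distinct = set(components) & CODED_COMPONENTS
    return "SD%d" % min(len(distinct), 3)


def density_distribution(responses):
    """Percentage of a group's responses falling in each density category."""
    if not responses:
        return {category: 0.0 for category in CATEGORIES}
    counts = Counter(density_category(r) for r in responses)
    return {c: 100.0 * counts.get(c, 0) / len(responses) for c in CATEGORIES}


def information_focus(responses):
    """Percentage of each information type over all components in the group."""
    counts = Counter(c for r in responses for c in set(r) & CODED_COMPONENTS)
    total = sum(counts.values())
    return {c: 100.0 * n / total for c, n in counts.items()} if total else {}


# Codings of examples (12), (13), (15) and (18) above.
examples = [
    [],
    ["path"],
    ["cause", "path"],
    ["cause", "manner_of_cause", "path", "path"],
]
```

Applied to the four example codings, `density_distribution(examples)` assigns one response to each category, and `information_focus(examples)` gives path half of all expressed components.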
3.2.1. Utterance density. The overall semantic density of a given age group within each language was defined as the percentage of responses that fell in each density category (SD0, SD1, SD2, SD3) calculated
Figure 1. Utterance density (Study 1)
over the total number of responses for that group.2 Figure 1 shows utterance density within each age group in English (Figure 1a) and in French (Figure 1b). SD0 utterances were rare at all ages (in either language). As expected, utterances were denser in English than in French and this language difference could be observed at all ages. Almost all of the utterances produced by English adults contained three or more information components (SD3 92%). In comparison, French adults used such utterances to a lesser extent (SD3 79%) and produced more utterances of lower density (SD2 19%). As for children, the density of their utterances clearly differed across languages, particularly between three and five years. At these ages English responses were frequently of density SD3 (between 42% and 54%) or at least of density SD2 (between 25% and 36%) and almost all of the ten-year-olds' responses were of density SD3 (91%). In French, SD3 utterances mostly occurred at the age of ten years (58%), but SD2 utterances were frequent even at this age (32%). Furthermore,
2. The total possible number of responses for each age group in each language was 384 (12 subjects × 32 items), except in rare cases where children did not respond appropriately (e.g., static location).
SD1 utterances were quite frequent at three years (61%) and only gradually decreased thereafter (four years 40%, five years 49%, ten years 10%).

3.2.2. Information focus. Each response was further examined with respect to the particular types of information components that were expressed. Information focus for a given age group within each language was calculated as the percentage of each type of information that was expressed over all semantic components found among all responses within that group. Figure 2 shows the distribution of information components expressed as a function of age in English (Figure 2a) and in French (Figure 2b). No notable difference occurred in the adults' utterances across languages. In both languages adults systematically expressed cause (English 32%, French 33%), path (English 34%, French 30%), and manner of cause (English 28%, French 27%), and they expressed manner of motion less frequently (English 6%, French 9%), mostly focusing in these cases on manner of O-motion and rarely on manner of A-motion. At all ages children generally followed the same pattern as the adults, especially in English. In French, however, a notable exception concerned the three-year-old children, who did not express manner of cause as often as
Figure 2. Overall information expressed (Study 1)
other groups (3 years 5%, 4 years 15%) and who tended to express path more often than other groups (especially at 3 years 48%).

3.2.3. Information locus. For each response analyses also examined information locus in order to determine the means that were used in the utterance to express information relevant to motion (among the variables shown in Table 1). Particular attention was placed on whether cause, path, manner-of-cause, manner-of-motion (collapsing manner for A-motion and for O-motion) were expressed in main verb roots versus other devices (particles, prepositions, adverbials, subordinate verbs). Figure 3 first provides an overall summary of how semantic information was distributed across these parts of speech overall (collapsing across all ages within each language). For each type of information expressed, this figure shows the percentage of cases where it was encoded in main verbs versus other devices. English presents a clear complementary distribution whereby cause and both types of manner (manner of cause and manner of motion) were almost always encoded in main verbs (between 95% and 98%), but path in other devices (99%). In contrast, French speakers frequently
Figure 3. Distribution of information across main verbs vs. other devices collapsing over all age groups (Study 1)
Figure 4. Information expressed by verbs vs. other devices in English (Study 1)
relied on main verbs to express motion information, but also distributed all information types across verbs and other devices including cause (72% and 28%), manner-of-cause (63% and 37%), manner-of-motion (80% and 20%), and path (62% and 38%). Figures 4 and 5 (for English and French, respectively) show in more detail which particular types of information (if any among the components in Table 1) were expressed in main verbs vs. other devices within each age group. For each age within each language information locus in main verbs was defined as the percentage of cases where each information type was expressed over all components found in main verbs for that group (also shown are cases where no component was found). The same procedure served to define information locus in other devices for a given age group within each language, i.e., as the percentage of each information type expressed over all components found outside of main verbs for that group (also including cases where no component was expressed). English speakers (adults and children at all ages) relied on a systematic strategy, illustrated in (20), whereby they used main verbs to express manner-of-cause and/or cause (overall 38% and 50%), while using other devices to express path (overall 79%). In contrast, French speakers expressed all types of information in different parts of speech, despite
Figure 5. Information expressed by verbs vs. other devices in French (Study 1)
a tendency to use main verbs to encode cause and path (overall 35% and 30%), as illustrated in (21) to (23).

(20) Hoppy pushed the suitcase down the hill. (E1004)
(21) Popi descend la colline en faisant rouler le ballon jusqu'en bas. (FAD02)
     (Popi descends the hill making roll the balloon all the way to the bottom.)
(22) Il a tiré le sac jusqu'en haut du toit. (FAD02)
     (He pulled the bag all the way to the top of the roof.)
(23) Popi a fait rouler le gros pneu pour le rentrer dans le garage. (FAD02)
     (Popi made roll the big tyre to make it enter in[to] the garage.)
The same patterns occurred across all age groups within each language, with the notable exception of the younger children (ages three to five years), who did not always express any information outside of the main verb. In English, utterances of this type were more frequent at three years (40%) than at four and five years (25% and 23%), and they practically disappeared at ten years and at adult age (2% in both cases). In French most utterances were of this type between three and five years (three 89%, four
78%, five 56%), and such cases were not rare even at ten years (24%), decreasing substantially only at adult age (9%). Utterances in which no coded information occurred outside of the main verb root were of two types, illustrated below (relevant information in italics and in brackets, also see more examples in (14) and (16) above): either 1) no additional devices were used at all (examples (24) and (25)); or 2) prepositional phrases occurred but they did not encode information about motion per se, for example referring to general locations (loc in (26) and (27)) or to accompanying entities (ae in (26)).

(24) He is pushing [cause+manner-of-cause] it. (E0303)
(25) Il a poussé [cause+manner-of-cause] la table. (E0305)
     (He pushed the table.)
(26) Il est monté [path] sur le sable [loc] avec le sac à pommes de terre [ae]. (F0502)
     (He ascended the sand (dune) with the bag of potatoes.)
(27) He's pulling [cause+manner-of-cause] it on the road [loc]. (E0303)

3.3. Summary
Speakers' descriptions of caused motion events show different patterns in English vs. French. First, from the earliest age on, utterances were semantically denser in English than in French. In English, speakers (children and adults) expressed multiple types of information in compact structures when denoting motion. In contrast, French speakers of all ages produced utterances that were semantically less dense, although this density difference was most striking among the younger children. Second, English speakers produced utterances that followed a very systematic pattern from the youngest age on, whereby they expressed cause and manner information in the main verb root and path information in satellites outside of the main verb. In comparison, French speakers were much less systematic with respect to the locus of spatial information in their utterances. At all ages they produced much more varied types of utterances in which they distributed information across different parts of speech. Although utterance density increased with age in both languages, this developmental progression was more striking in French than in English. These cross-linguistic differences were expected and they indicate the impact of typological properties in English (satellite-framed) vs. French (verb-framed) on how children and adults expressed motion. We now turn to analyses of corpus-based spontaneous productions in order to determine whether the same language-specific effects can be observed during initial phases of acquisition.
4. Study 2: Longitudinal corpus of spontaneous productions at two to three years

4.1. Method
4.1.1. Data base. The data base (summarized in Table 2) consisted of early spontaneous productions from four children, two French-speaking (Clara, Gregoire) and two English-speaking (Sarah, Adam from the Brown corpus in CHILDES, see MacWhinney 2000). Children were recorded in natural settings during several developmental periods (P1 to P4) that were defined by mean length of utterance (mean MLU for P1 < 2,5; for P2 2,5-4; for P3 4,1-5; for P4 > 5).3 The analyses below focus on all utterances that expressed motion of any kind. For this purpose all utterances that contained an explicit motion verb were selected, resulting in subsamples of utterances that varied in size across children (see Table 2), i.e., very large for three children (over a thousand for Clara and Gregoire, close to two thousand for Adam) and smaller for one child (close to 300 for Sarah).

4.1.2. Coding. Utterances referring explicitly to motion were coded with respect to their semantic content. The list in (28) shows all of the information components that were expressed by children in relation to motion: path of motion, manner of motion, cause of motion, and manner of causing motion. General location was the only other type of information expressed in addition to this list.4

(28) Path of motion (path):     direction (up, monter 'ascend'), boundaries (into, entrer 'enter'), deixis (come, venir)
     Manner of motion (manm):   courir 'run', à quatre pattes 'on all fours'
     Cause of motion (cse):     bring, faire tomber 'make fall'
     Manner of cause (manc):    pull, pousser 'push'
3. All available French recordings were included in the corpora. The English corpora (borrowed from CHILDES) did not include any session corresponding to French period P4, but provided many more sessions for P1 to P3. A comparable subset of English recordings was therefore selected by 1) choosing the most comparable (lowest) ages in English within P1 to P3, since age was notably lower in French than in English for the corresponding mean MLUs of each period; 2) by randomly choosing every second or third recording among the remaining files of each period. Verb-less utterances were excluded (and decreased after P1 for all children).
4. Children's spontaneous productions also contained some occurrences of verbs denoting specifically changes of posture (e.g., sit down, s'asseoir) which were not elicited in the experimental data.
Table 2. Early spontaneous productions: data base for all children (Study 2)

                                      FRENCH                ENGLISH
                                      Clara     Gregoire    Sarah     Adam
Period 1    Mean MLU                  1,9       1,9         1,9       2,1
            Mean Age*                 1;9       1;9         2;6       2;4
            # of sessions             12        4           12        6
Period 2    Mean MLU                  3,3       3,2         3,1       3,4
            Mean Age                  2;9       2;3         3;5       3;2
            # of sessions             8         5           12        12
Period 3    Mean MLU                  4,7       4,6         4,1       4,5
            Mean Age                  3;3       2;7         4;6       4;6
            # of sessions             8         8           6         12
Period 4    Mean MLU                  5,3       4,9         n.a.**    n.a.**
            Mean Age                  4;1       3;4
            # of sessions             12        8
Total utterances about motion         1134      1146        299       1909

* Ages are shown in years;months.
** Not applicable (see Note 3).
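The period boundaries given in 4.1.1 can be expressed as a small lookup, sketched here in Python. The function names are hypothetical. The source leaves values between 4,0 and 4,1 unspecified; this sketch assumes that anything above 4.0 and up to 5.0 belongs to P3. Mean MLU values are written with a decimal comma in Table 2, so a helper converts them first.

```python
def parse_mlu(text):
    """Convert a decimal-comma value such as '3,3' to a float."""
    return float(text.replace(",", "."))


def mlu_period(mean_mlu):
    """Assign a developmental period from mean MLU.

    Boundaries follow 4.1.1: P1 < 2.5; P2 2.5-4; P3 4.1-5; P4 > 5.
    Assumption: values in the unspecified gap (4.0, 4.1) are treated as P3.
    """
    if mean_mlu < 2.5:
        return "P1"
    if mean_mlu <= 4.0:
        return "P2"
    if mean_mlu <= 5.0:
        return "P3"
    return "P4"


# Clara's mean MLU values for Periods 1 to 4, as listed in Table 2.
clara = ["1,9", "3,3", "4,7", "5,3"]
clara_periods = [mlu_period(parse_mlu(value)) for value in clara]
```

For Clara's four values the lookup yields P1, P2, P3 and P4 in order.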
Each utterance about motion was also coded with respect to its overall semantic density. Overall utterance density was defined as the total number of different information types that were expressed within each utterance (among the types listed under (28) above).5 Finally, each utterance was coded with respect to information locus in such a way as to identify whether the information components listed under (28) were expressed in main motion verbs or in any other co-occurring device within the utterance (particles, prepositional phrases, adverbial phrases). Examples (29) to (37) illustrate the coding of motion information in different loci (indicated in italics and in brackets) within utterances of varying levels of semantic density6.

SD1
(29) They all dance [verb = manm]. (Sarah P3)
(30) Elle est partie [verb = path]. (Clara P1)
     (She has left.)
(31) Et pis dehors [other = loc] j'ai fait du poney [verb = manm] dehors [other = loc]. (Gregoire, P2)
     (And then outside I did [rode] pony outside.)
(32) Il va très vite [other = manm]. (Clara P3)
     (He is going very fast.)

SD2
(33) I climb [verb = manm] up [other = path] the ladder. (Sarah P2)
(34) I'm going fly [verb = cse+manm] a kite. (Sarah P3)
(35) Elle rentre [verb = path] à la pointe des pieds [other = manm]. (Clara P4)
     (She's entering at/on the tip of her feet.)
(36) Faut faire nager [verb = cse+manm] ma voiture. (Gregoire P3)
     (Must make swim my car.)

SD3
(37) I just pushed [verb = cse+manc] it down [other = path]. (Sarah, P3)

5. Although general locations were coded, they were not considered to encode information that is inherent to motion per se. Since the measure of semantic density aimed at determining which different types of information were expressed, utterances expressing the same information content more than once were coded for this content only once, e.g., manner in Il court [manm] à quatre pattes [manm] 'He is running on all fours' or path in Je passe [path] par [path] les fleurs 'I am passing by/through the flowers'.

4.2. Results
Appendix 2 shows all verb roots that denoted motion in the corpora (with a frequency of 10 occurrences or more collapsing over all developmental periods). Verb types that were by far the most frequent (450 occurrences or more) were English go and French mettre ('put'), followed by a few very frequent verbs (English come, fall, put, take and French aller 'go', tomber 'fall', prendre 'take', donner 'give') some of which are considered to be light verbs in the literature because they express little or no semantic information (e.g., go to mean sheer motion). In both languages most verb roots denoted either voluntary motion or explicitly caused motion, and fewer verb types denoted involuntary motion, i.e., motion that was not voluntarily carried out by an agent and did not imply any explicit cause (mostly English intransitive fall and its French equivalent tomber).7
6. Rare cases included in this count consisted of utterances that come from nursery rhymes, the status of which may differ from spontaneous utterances, e.g., The mouse went up [Path] the clock. (Sarah P3).
7. Some transitive and intransitive uses of the same verb roots occurred (e.g., fly voluntary vs. caused), particularly in English (nine verb types) as compared to French (one verb type).
Figure 6. Overall utterance density for motion events (Study 2)
As shown in Figure 6, the overall semantic density of utterances in relation to motion was higher in English than in French. English utterances were frequently of density SD2 (52%) and fewer were SD1 (38%), while SD3 utterances also occurred (10%). In comparison, fewer French utterances were SD2 (20%), most were SD1 (78%), and very few were SD3 (2%). This density pattern held for each developmental period from P1 onwards. It also held for both French children (SD1 Clara 82%, Gregoire 74%; SD2 Clara 17%, Gregoire 22%), but some individual differences occurred in English between Adam (35% SD1, 56% SD2, 9% SD3) and Sarah (21% SD1, 62% SD2, 17% SD3). A further glance at the data shows that this difference in utterance density was related to the fact that English-speaking children frequently associated motion verbs with other devices (particles, prepositions, adverbial phrases). In this respect, as illustrated in examples (24) to (37) above, children's utterances fell into three types corresponding to the locus of motion information, which could be encoded in several ways: only in
Table 3. Utterance types expressing motion information in verbs and other devices (Study 2)*

Developmental period    Verb+Other (+V+X)    Verb only (+V-X)    Other only (-V+X)
English   P1            36%                  56%                 8%
          P2            34%                  54%                 12%
          P3            33%                  53%                 14%
French    P1            0%                   100%                0%
          P2            1%                   99%                 0%
          P3            <1%                  99.5%               0%
          P4            2.5%                 97.5%               0%

* Percentages are calculated within each period collapsing across children within each language, who showed no notable individual differences.
Figure 7. Motion information expressed in early spontaneous productions (Study 2)
verbs (hereafter +V-X, e.g., They all dance, Elle est partie 'She has left'), only in other devices (-V+X, e.g., The mouse went up, Il va très vite 'He is going very fast'), or in both (+V+X, e.g., I just pushed it down, Elle rentre à la pointe des pieds 'She's entering at/on the tip of her feet'). Table 3 summarizes the distribution of children's utterances among these three utterance types. In French the locus of motion information was clearly in verbs (+V-X 99%). In English this was less frequently the case (54%), because children also encoded information either jointly in verbs and in other devices (+V+X 35%) or in other devices only (-V+X 11%). This pattern held for all developmental periods in both languages.

Figure 7 shows in more detail the information components that were expressed in verb roots (Figure 7a) versus other devices (Figure 7b). Concerning verbs, in both languages cause was most frequent (Clara 59%, Gregoire 40%, Sarah 38%, Adam 42%), while manner-of-cause was least frequent (8% to 12% for all children). Path was equally frequent in both languages (English 20%, French 24%). Manner-of-motion tended to be more frequently encoded in English verb roots than in French ones (Gregoire 24%, Clara 12%, Sarah 33%, Adam 26%). As for other devices, they rarely expressed any information directly relevant to motion itself in
French. When they were used at all, they mostly encoded general locations (Clara 97%, Gregoire 96%). In contrast, other English devices expressed path more frequently (Sarah 61%, Adam 52%) than general locations (Sarah 31%, Adam 43%).

4.3. Summary
The longitudinal analyses of Study 2 focused on young children's early spontaneous productions in English versus French from under two years onwards. The results are consistent with those of Study 1. They first show that utterances were denser in English than in French with respect to the different types of motion information that were expressed. Second, children relied almost exclusively on verbs to express motion information in French, but frequently associated verb roots with other devices in English. In both languages verb roots expressed cause and path, but they tended to express manner more often in English. As for other devices, they expressed mostly general locations in French, but they encoded path more frequently than locations in English. These results held for all periods, notwithstanding other changes over developmental periods and individual differences within a given language.
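The Study 2 coding of density and locus can be illustrated with a short Python sketch. It assumes each utterance has been hand-coded as a list of (locus, information type) pairs using the abbreviations in (28); the function names and the string labels for the three utterance types (written here as "+V-X" for verb only, "-V+X" for other devices only, and "+V+X" for both) are choices made for this sketch. As in the coding scheme, general location (loc) is not counted, and a type expressed twice counts once.

```python
# Information types from list (28); "loc" is coded but not counted as motion information.
MOTION_TYPES = {"path", "manm", "cse", "manc"}


def utterance_density(coding):
    """Number of distinct motion information types in one utterance."""
    return len({info for _, info in coding if info in MOTION_TYPES})


def utterance_type(coding):
    """Classify an utterance by where motion information is encoded."""
    in_verb = any(locus == "verb" and info in MOTION_TYPES for locus, info in coding)
    in_other = any(locus == "other" and info in MOTION_TYPES for locus, info in coding)
    if in_verb and in_other:
        return "+V+X"
    if in_verb:
        return "+V-X"
    if in_other:
        return "-V+X"
    return "none"


# Codings of examples (31), (32) and (37) from Section 4.1.2.
ex31 = [("other", "loc"), ("verb", "manm"), ("other", "loc")]
ex32 = [("other", "manm")]
ex37 = [("verb", "cse"), ("verb", "manc"), ("other", "path")]
```

Under these assumptions, (31) has density 1 and type +V-X because its prepositional material only encodes location, (32) has density 1 and type -V+X, and (37) has density 3 and type +V+X.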
5. General discussion
Two complementary studies examined how speakers expressed motion events in English versus French. Particular attention was placed on the impact of two main factors: typological constraints that might result from language-specific properties (satellite-framed and verb-framed languages, respectively) and developmental changes across several age groups (adults and children between approximately two and ten years) that might result from universal cognitive factors. Study 1 elicited productions from adults and four groups of children (three, four, five, and ten years) in a controlled experimental situation focusing on caused motion. Study 2 further examined the expression of all types of motion events at earlier ages in order to determine whether any cross-linguistic differences could be observed from the emergence of language onward. Longitudinal analyses focused on all spontaneous utterances about motion events that were produced by four children (two learners of French, two learners of English) across several developmental periods between approximately the ages of two and three years.

Study 1 first showed some striking cross-linguistic differences at all ages. From three years on, utterances were semantically denser and showed a much more systematic distribution of expressed information in
English than in French. In English, adults and children in all age groups expressed multiple types of information (predominantly two to three or more components among children) using compact structures that typically encoded cause+manner in main verbs and path in other devices (mainly particles). In contrast, French speakers produced utterances that were less dense and showed more varied structural patterns. At all ages French children expressed less information in their utterances about motion (predominantly one to two components) than same-aged children in English. They also tended to rely on verb roots to express motion information and/or to distribute this information in varied ways across parts of speech within their utterances (main verbs and other devices). Following Talmy's typology, these results show the strong impact of language-specific factors on children's productions. As a verb-framed language, French typically uses main verbs to express the path of events that imply changes of location, thereby making it difficult to express this information together with cause and manner within a single clause and frequently requiring the use of complex structures that contain subordinate clauses. In contrast, satellite-framed languages such as English typically express path in satellites thereby leaving room for cause and manner information to be expressed in main verbs. These compact structures therefore allow more information to be distributed within simple clauses and they are systematically used by children from the youngest age onward.

Second, regardless of language, responses also showed an increase in semantic density with age. In both languages children's utterances increasingly encoded more types of information (cause, manner, path) between the ages of three and ten years. A striking increase in density was observed after five years, particularly with the advent of frequent utterances encoding three or more information components at ten years.
In addition, the density of children's utterances increased in both languages between the ages of three and five years, roughly going from one to two components in French and from two to three or more components in English. This common developmental progression indicates the impact of universal cognitive determinants of language acquisition that might include memory and processing limits, as well as more general discourse organizational skills. From a cognitive point of view, it is simpler for children to produce basic structures expressing fewer types of information than to produce more compact and more complex structures expressing more information types. However, despite such common cognitive determinants, density was strikingly higher in English than in French at all ages. Furthermore, the increase in utterance density was more striking in French. In this respect, note that, when French children expressed only one type of information,
they tended to encode the cause and/or the path of motion, typically in main verbs, at the expense of other information types, which require the use of subordinate clauses when expressed together with these components. These results are in line with our previous studies focusing on the expression of voluntary motion in French and English (Hickmann 2003; Hickmann et al. 1998) that showed a higher utterance density, as well as a greater focus on manner information in English than in French. They are also consistent with previous studies comparing a variety of verb-framed and satellite-framed languages (Berman and Slobin 1994; Slobin 2003b, 2006).

Study 2 further examined early spontaneous utterances that denoted all types of motion events (voluntary, caused, involuntary) produced by four children between approximately two and three years of age. The results also showed a higher utterance density in English (mostly two information components) than in French (mostly one component) during all developmental periods. This difference in density was shown to follow largely from the different ways in which children distributed information about motion across linguistic devices within their utterances. In particular, children learning French most frequently relied on main verbs alone to express all types of information relevant to motion, reserving prepositional phrases for general locations (if they used them at all). In contrast, children learning English used both main verbs and other devices, thereby expressing more varied information about motion, namely cause together with manner (particularly in main verbs), as well as path (typically in satellites). These results show that typological constraints on acquisition, such as the ones that were expected on the basis of Talmy's typology, hold from the earliest age onwards. These studies raise several questions.
Among them, a first technical question concerns our methodology in Study 2, and particularly the way in which the database was constituted for the cross-linguistic comparison among children across developmental periods from P1 onwards. Recall that the recordings from French and English children in this study were matched by their MLU within each period (with a mean MLU below 2.5 for P1, between 2.5 and 4 for P2, between 4.1 and 5 for P3, and above 5 for P4). In this respect, note that chronological age was systematically lower in French than in English for the corresponding mean MLU within each period (see Note 4). This discrepancy existed regardless of how MLU was calculated and it is presumably due to the properties of the two compared languages. It therefore cannot be excluded that the higher semantic density observed in the early spontaneous productions of English-speaking versus French-speaking children was at least partially due to their higher chronological age (English > French) in addition to typological constraints resulting from language-specific properties (satellite-framing in English versus verb-framing in French). Nonetheless, a similar difference in density was found in Study 1, in which children were matched by chronological age, as well as in our previous experimental studies including French in other experimental situations (Hickmann 2003; Hickmann et al. 1998, 2008; Ochsenbauer and Hickmann 2008) and in a number of studies involving other languages either in controlled experimental studies (Berman and Slobin 1994; Slobin 2003a, 2003b, 2006) or in longitudinal studies of early spontaneous productions (Bowerman 2007; Bowerman and Choi 2003). Second, a number of other more general questions remain open. In particular, a potential implication of our results concerns the extent to which the cross-linguistic differences that were observed in speakers' utterances might reflect deeper differences in their underlying conceptual representations. Although the present research does not allow us to draw any conclusions in this respect, its results are in line with those of a number of other studies (e.g., Slobin 2003, 2006) suggesting that language-specific properties may draw speakers' attention to different aspects of the space surrounding them. From a developmental point of view, children's exposure to these language-specific properties may lead them to construct different representations during the active process of acquiring their native language. Such a hypothesis constitutes a major challenge that will require future studies examining the extent to which typological constraints may affect not only speech production, but also non-verbal behaviors and more generally the processes that might lead to different types of cognitive organization.
At the same time, future research should investigate in more detail the precise nature of general and universal cognitive capacities (for example working and long-term memory, processing speed, categorization, and planning) that may drive the development in the expression of space in all child languages.

6. Conclusion
This research showed striking cross-linguistic differences in how speakers express motion in English versus French. Such differences follow from typological constraints related to how satellite-framed (English) versus verb-framed (French) languages grammaticalize or lexicalize information relevant to motion events, particularly when these events involve changes of location. They therefore indicate the impact of language-specific properties on language acquisition. Some common developmental progressions were also observed in both languages, particularly an increase with age in the semantic density of utterances denoting motion. This result
suggests the impact of cognitive factors that are presumably universal and independent of language, such as the development of children's processing or memory capacities necessary for the joint expression of multiple semantic components. However, typological constraints were also observed at all ages from the youngest age (18 months) to adult age. Furthermore, developmental progressions were shown to be much more striking in French than in English, suggesting the joint impact of cognitive and typological factors on language acquisition. Young learners of verb-framed languages have more problems to solve when expressing multiple types of information about motion as compared to same-aged learners of satellite-framed languages. Future research is necessary to address further questions raised by these results concerning the relationship between language and cognition during development.

Received 1 March 2009
Revision received 25 November 2009
University of Paris 8 / University of Cambridge
References
Berman, Ruth and Dan I. Slobin. 1994. Relating events in narrative: A crosslinguistic developmental study. New York: Lawrence Erlbaum Associates.
Bowerman, Melissa. 1996. The origins of children's spatial semantic categories: cognitive versus linguistic determinants. In John J. Gumperz and Stephen Levinson (eds.), Rethinking Linguistic Relativity, 145–176. Cambridge: Cambridge University Press.
Bowerman, Melissa. 2007. Containment, support, and beyond: Constructing typological spatial categories in first language acquisition. In Michel Aurnague, Maya Hickmann and Laure Vieu (eds.), The categorization of spatial entities in language and cognition, 177–203. Amsterdam: Benjamins.
Bowerman, Melissa and Soonja Choi. 2003. Space under construction: language-specific categorization in first language acquisition. In Dedre Gentner and Susan Goldin-Meadow (eds.), Language in Mind: Advances in the study of language and thought, 387–427. Cambridge, MA: MIT Press.
Choi, Soonja and Melissa Bowerman. 1991. Learning to express motion events in English and Korean: the influence of language-specific lexicalization patterns. Cognition 41. 83–121.
Clark, Eve V. 2003. Language and representations. In Dedre Gentner and Susan Goldin-Meadow (eds.), Language in Mind: Advances in the study of language and thought, 17–24. Cambridge, MA: MIT Press.
Hendriks, Henriette, Maya Hickmann and Annie-Claude Demagny. 2008. How adult English learners of French express caused motion: a comparison with English and French natives. Acquisition et Interaction en Langue Étrangère 27. 15–41.
Hickmann, Maya. 2003. Children's Discourse: Person, Space and Time across Languages. Cambridge: Cambridge University Press.
Hickmann, Maya and Henriette Hendriks. 2006. Static and dynamic location in French and in English. First Language 26 (1). 103–135.
Hickmann, Maya, Henriette Hendriks and Christian Champaud. 2009. Typological constraints on motion in French child language. In Jiansheng Guo, Elena Lieven, Susan Ervin-Tripp, Nancy Budwig, Keiko Nakamura and Şeyda Özçalışkan (eds.), Crosslinguistic approaches to the psychology of language: research in the tradition of Dan Isaac Slobin, 209–224. Hillsdale, NJ: Lawrence Erlbaum.
Hickmann, Maya, Françoise Roland and Henriette Hendriks. 1998. Référence spatiale dans les récits d'enfants français: perspective inter-langues. Langue Française 118: Numéro spécial sur l'Acquisition du français langue maternelle. 104–123.
Hickmann, Maya, Pierre Taranne and Isabelle Bonnot. 2009. Motion in first language acquisition: Manner and Path in French and English child language. Journal of Child Language 36 (4). 705–742.
Hohenstein, Jill. 2005. Language-related motion event similarities in English- and Spanish-speaking children. Journal of Cognition and Development 6. 402–425.
Kopecka, Annette. 2006. The semantic structure of motion verbs in French: typological perspectives. In Maya Hickmann and Stéphane Robert (eds.), Space across languages: linguistic systems and cognitive categories, 83–101. Amsterdam: John Benjamins.
Landau, Barbara and Laura Lakusta. 2006. Spatial language and spatial representation: autonomy and interaction. In Maya Hickmann and Stéphane Robert (eds.), Space in languages: linguistic systems and cognitive categories, 309–333. Amsterdam: John Benjamins.
Lucy, John and Suzanne Gaskins. 2001. Grammatical categories and the development of classification preferences: A comparative approach. In Stephen Levinson and Melissa Bowerman (eds.), Language Acquisition and Conceptual Development, 257–283. Cambridge: Cambridge University Press.
MacWhinney, Brian. 2000. The Childes Project: Tools for Analyzing Talk. Mahwah, NJ: Lawrence Erlbaum Associates.
Munnich, Edward and Barbara Landau. 2003. The effects of spatial language on spatial representation: setting some boundaries. In Dedre Gentner and Susan Goldin-Meadow (eds.), Language in mind: Advances in the study of language and thought, 113–155. Cambridge, MA: MIT Press.
Ochsenbauer, Anne-Katherine and Maya Hickmann. 2008. Voluntary motion in French and German child language. Paper presented at the Conference of the International Association for the Study of Child Language. Edinburgh, 28 July–1 August.
Pruden, Shannon M., Kathy Hirsch-Pasek and Roberta M. Golinkoff. 2008. Current events: How infants parse the world and events for language. In Thomas F. Shipley and Jeffrey M. Zacks (eds.), Understanding events: How humans see, represent, and act on events, 160–192. New York: Oxford University Press.
Slobin, Dan I. 1996. From thought to language to thinking for speaking. In John J. Gumperz and Stephen C. Levinson (eds.), Rethinking linguistic relativity, 70–96. Cambridge: Cambridge University Press.
Slobin, Dan I. 2003a. Language and thought online: cognitive consequences of linguistic relativity. In Dedre Gentner and Susan Goldin-Meadow (eds.), Language in Mind: Advances in the study of language and thought, 157–191. Cambridge, MA: MIT Press.
Slobin, Dan I. 2003b. The many ways to search for a frog. In Sven Strömqvist and Ludo Verhoeven (eds.), Relating events in narrative: Typological and contextual perspectives, 219–257. Hillsdale, NJ: Erlbaum.
Slobin, Dan I. 2006. What makes manner of motion salient? Explorations in linguistic typology, discourse, and cognition. In Maya Hickmann and Stéphane Robert (eds.), Space across languages: linguistic systems and cognitive categories, 59–81. Amsterdam: Benjamins.
Talmy, Leonard. 2000. Towards a cognitive semantics. Harvard: MIT Press.
Appendix 1. Summary of main features in the stimuli used to test caused motion (Study 1)*
Combination  Ground in    Ground in    Figure object  Manner    Manner of  Path of
             exemplar a   exemplar b                  of cause  O-motion   motion
1            roof         sand dune    swimming ring  push      roll       up
2            roof         sand dune    package        push      slide      up
3            roof         sand dune    toy car        pull      roll       up
4            roof         sand dune    bag            pull      slide      up
5            snow hill    grass hill   balloon        push      roll       down
6            snow hill    grass hill   suitcase       push      slide      down
7            snow hill    grass hill   wheelbarrow    pull      roll       down
8            snow hill    grass hill   trunk          pull      slide      down
9            road         street       wheel          push      roll       across
10           road         street       apple basket   push      slide      across
11           road         street       pram           pull      roll       across
12           road         street       rocking horse  pull      slide      across
13           cave         barn         tyre           push      roll       into
14           cave         barn         table          push      slide      into
15           cave         barn         trolley        pull      roll       into
16           cave         barn         chair          pull      slide      into
* The O figures changed across the 16 combinations. Subjects saw two exemplars (a and b) of each combination, both with the same O but with a different scenery and ground referent. Path was always the same for A and O. Two other information components were held constant across all test items: cause (causal relation between A and O) and manner of A-motion (walking).
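The stimulus set in Appendix 1 appears to be a full crossing of four paths, two manners of cause and two manners of object motion. The following minimal Python sketch, which is only an illustration and not the authors' procedure, enumerates that crossing in the order of the table. The names PATHS, CAUSE_MANNERS, OBJECT_MANNERS and build_combinations are hypothetical labels introduced here.

```python
from itertools import product

# Factor levels read off Appendix 1 (variable names are illustrative assumptions).
PATHS = ["up", "down", "across", "into"]
CAUSE_MANNERS = ["push", "pull"]
OBJECT_MANNERS = ["roll", "slide"]


def build_combinations():
    """Return (number, path, manner of cause, manner of O-motion) for all 16 cells."""
    crossed = product(PATHS, CAUSE_MANNERS, OBJECT_MANNERS)
    return [
        (number, path, cause, motion)
        for number, (path, cause, motion) in enumerate(crossed, start=1)
    ]


if __name__ == "__main__":
    for row in build_combinations():
        print(row)
```

Because path is the slowest-varying factor and manner of O-motion the fastest, the enumeration reproduces the numbering of the table (for instance, combination 6 is down, push, slide).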
Appendix 2. Verb roots used in early spontaneous productions (Study 2)*

English:
Level I: go
Level II: come, fall, put, take
Level III: drive, turn, walk, fly, run, jump, sit, stand, dance, step, swim, get, roll, move, drop, spill, throw, turn, push, bring, park, knock, hang, pick.

French:
Level I: mettre (put)
Level II: aller (go), tomber (fall), prendre (take), donner (give)
Level III: partir (leave), monter (ascend), venir (come), rouler[intr] (roll), sortir (exit), marcher (walk), passer (pass), sauter (jump), rentrer (enter), descendre (descend), revenir (come back), voler (fly), bouger (move), s'asseoir (sit down), faire du ski (ski), couler (drip), nager (swim), danser (dance), rattraper (catch up with), courir (run), s'en aller (leave), enlever (take off); faire (make[Infinitive]), ranger (put away), tourner (turn), remettre (put back), tirer (pull), coller (glue), mélanger (mix), jeter (throw), pousser (push), ramasser (pick up), amener (bring), poser (place).

* Levels of token frequencies are shown collapsing all developmental periods within each language. Verbs are shown in decreasing order of frequency levels (I: >450, II: 100–449, III: 10–99), as well as within each frequency level. Other verb roots were much less frequent (fewer than 10 occurrences overall).
Children's verbalizations of motion events in German

ANNE-KATHARINA OCHSENBAUER and MAYA HICKMANN*
Abstract

Recent studies in language acquisition have paid much attention to linguistic diversity and have begun to show that language properties may have an impact on how children construct and organize their representations. With respect to motion events, Talmy (2000) has proposed a typological distinction between satellite-framed (S) languages that encode PATH in satellites, leaving the verb root free for the expression of MANNER, and verb-framed (V) languages that encode PATH in the verb, requiring MANNER to be expressed in the periphery of the sentence. This distinction has led to the hypothesis (Slobin 1996) that MANNER should be more salient for children learning S-languages, who should have no difficulty combining it with PATH, as compared to those learning V-languages. This hypothesis was tested in a corpus elicited from German children and adults who had to verbalize short animated cartoons showing motion events, and the results are compared with previous analyses of French and English corpora elicited in an identical situation (Hickmann et al. 2009). As predicted, and as previously found for English, German children from three years on systematically express both MANNER (in the verb root) and PATH (in particles), in sharp contrast to French children, who rarely package MANNER and PATH together. These results suggest that, when they are engaged in communication, children construct spatial representations in accordance with the particular properties of their mother tongue. Future research is necessary to determine the extent to which cross-linguistic differences in production
* Address for correspondence: A.-K. Ochsenbauer, Ludwig-Maximilians-Universität, Institut für Deutsche Philologie, Schellingstraße 3/RG, 80799 München. Email: anne.ochsenbauer@lmu.de. M. Hickmann, CNRS Laboratoire Structures Formelles du Langage, UMR 7023, 59 rue Pouchet, 75017 Paris, France. Email: maya.hickmann@sfl.cnrs.fr.
Cognitive Linguistics 21–2 (2010), 217–238. DOI 10.1515/COGL.2010.008. 0936–5907/10/0021–0217 © Walter de Gruyter
may reflect deeper differences in the allocation of attention and in conceptual organization.

Keywords: cognitive linguistics, language acquisition, language typology, space, thinking for speaking, Whorfianism.
1. Introduction
During the last twenty years linguists and psycholinguists have postulated different ways of relating the process of children's language acquisition to their cognitive development. Predominant theories in psychology have put forth the existence of universal perceptual and cognitive constraints in language acquisition determining children's verbal production and comprehension (e.g., Piaget and Inhelder 1947, or Spelke 2003). However, recent studies (for example, Slobin 1996, 2003a, 2003b, 2006) indicate that our language seems to influence how we think when we speak, for example inviting us to focus on particular aspects of reality. These results suggest that children learn to verbalize situations in a certain way, which is most typical of their mother tongue, and that they organize incoming information accordingly. The study described below further tests this hypothesis by examining how German children and adults express voluntary motion events in controlled experimental situations. A comparison of our results with those of previous comparable studies concerning English and French supports the claim that the linguistic properties of spatial systems influence how children construct their spatial representations.

2. Space across languages
Talmy (1983, 1985, 1991, 2000) has shown that languages show strikingly different lexicalization patterns in the expression of motion events, which are reflected in different ways of combining semantic information in surface structure. For example, as illustrated in (1) and (2), satellite-framed languages (e.g., Germanic) encode manner in the verb stem (English/German swim/schwimmen, run/rennen) and path in verbal satellites1 such as particles (across/durch, away/weg). In contrast, as shown in (3),
1. In their draft, Croft et al. (2008) propose to expand Talmy's typology, taking into account lexicalization patterns that are less typical but occur regularly in a large number of languages, particularly three symmetrical types: coordination, serialization and compounding.
verb-framed languages (e.g., Romance) encode path in the verb stem (traverser 'to cross', partir 'to leave') and manner by peripheral constructions such as adverbial phrases (à la nage 'with a swim') or gerunds (en courant 'by running').

(1) The child swims across the river and runs away.
(2) Das Kind durchschwimmt den Fluss und rennt weg.
    'The child through-swims the river and runs away.'
(3) L'enfant traverse la rivière à la nage et part en courant.
    'The child crosses the river with a swim and leaves by running.'
When verbalizing a motion event, speakers choose among several means of expression those which are most typical for their language. One implication is that, while speaking, they are invited to focus on different aspects of reality, and therefore to foreground and background incoming information in different ways across languages. Slobin (1996, 2003a, 2003b, 2006) further tested some cognitive implications of Talmy's typology, pointing out three factors which increase the likelihood that speakers will express and/or combine particular semantic components. The first factor is finiteness. In satellite-framed languages (see German (4) and (5)) manner is normally expressed in the main inflected verb, or more precisely in that part of the main verb that carries verbal morphology. In contrast, speakers of verb-framed languages (see French (6) and (7)) have to use peripheral constructions that may include a non-finite verb. As a result, German (4) and (5) are of the same complexity, whereas (7) is more complex than (6) in French.

(4) Das Mädchen rennt über die Straße.
    'The girl runs across the street.'
(5) Das Mädchen geht über die Straße.
    'The girl goes across the street.'
(6) La fille traverse la rue.
    'The girl crosses the street.'
(7) La fille traverse la rue en courant.
    'The girl crosses the street by running.'
The second factor is lexeme frequency. In satellite-framed languages verbs expressing simultaneously motion and manner are extremely frequent and often used, even by young children. In verb-framed languages this kind of verb is less frequent. Finally, the last factor is the possibility of expressing information by means of a single (complex) morpheme rather than by a phrase or clause. Examples (8) to (10) illustrate several verb equivalents for some types of motion events in English, German and French, showing that German has many manner-verbs which have
no monolexematic equivalent in French and sometimes not even in English:

(8) schlurfen, 'to shuffle along', 'traîner les pieds'
(9) stapfen, 'to plod', 'marcher à pas lourds'
(10) tappen, 'to go falteringly', 'marcher d'un pas maladroit'
Each of these three factors makes it easier for speakers of satellite-framed languages to express manner and path together in one single clause as compared to speakers of verb-framed languages. As a result, one implication is that manner should be more salient in these languages than in verb-framed languages. In contrast, no difference in salience across languages is predicted for the semantic component path. From a developmental point of view, it might also be predicted that these typological differences should result in different developmental progressions during the acquisition of spatial language. Thus, Slobin suggests that each language should invite children to focus on some specific aspects of spatial representations. As a result, they may gradually take a particular perspective on the world, which may influence not only how they verbalize motion events, but perhaps also their cognitive organization more generally. Although German stands among other satellite-framed languages, few studies have examined in detail how motion is expressed in this language. One study (Tschander 1999) shows that available classifications of motion verbs (Talmy 1985; Landau et al. 1993) are too simplistic. Apart from German verbs containing either manner or path, this study postulates a third category, namely path-manner-verbs (e.g., humpeln 'to hobble') which describe a combined movement ('kombinierte Bewegung'). These verbs are used with different auxiliaries depending on the speaker's focus: with the auxiliary haben ('to have') they focus on manner; with the auxiliary sein ('to be') they focus on path. Example (11), taken from Tschander's article, demonstrates this phenomenon:

(11) Debbie hat/ist gehumpelt.
     'Debbie has/is hobbled.'
According to Tschander, these two concepts of movement, manner and path, need not constitute separate entries in the lexicon, so that these verbs should correspond to only one entry in the lexicon. Weber (1983) also proposes a more detailed classification of German motion verbs based on several recurrent semantic components. On the basis of a sample of 90 motion verbs, he extracts 35 semantic components, 20 of which actually correspond to some aspect of manner that characterizes most
verbs (94%). This analysis shows again that German encodes manner in the vast majority of its motion verbs. With respect to satellites, there is no consensus as to the nature of this class in German. According to Talmy (1991: 486), satellites belong to
"[ . . . ] the grammatical category of any constituent other than a nominal complement that is in a sister relation to the verb root. Satellites can be either a bound affix or a free word, and encompass very diverse grammatical forms (English verb particles, German separable and inseparable verb prefixes, [ . . . ])."
Haggblade's (1994) analysis of German satellites includes a variety of devices, among which the following four classes will be most relevant below: 1) prefixes, e.g., über- in überqueren ('to cross'); 2) particles, e.g., rauf- in raufklettern ('to climb up'); 3) prepositional phrases, e.g., auf den Baum ('on the tree'); 4) adverbs, e.g., hinauf ('up').2 A more detailed discussion of the problematic distinction between German prefix- and particle-verbs can also be found in Altmann and Kemmerling (2005: 63 ff.). For example, they propose a particle type called double-particle (e.g., drauf- 'up there' or herunter- 'down from there'), which adds some deictic (and sometimes local) information to its directional component.

3. Universal and language-specific determinants of children's spatial language
With respect to the relation between language and cognition during child development, one of the most important research questions is whether children construct universal pre-linguistic concepts that underlie language acquisition or whether their concepts are substantially structured or transformed with the emergence of language. In recent years, this question has been approached by linguists and psycholinguists by and large in three different ways. Proponents of a first position claim that the language ability is innate, modular, and domain-specific. In this view neither general cognitive faculties nor language acquisition have any substantial influence on this initial knowledge. According to Spelke (2003), human language only provides the opportunity to combine knowledge from different modules, allowing humans to build representations that are more complex than
2. In some verbs stress is a criterion to distinguish between prefix and particle verbs, e.g., überfáhren ('to run over') and ǘberfahren ('to cross over'), where accents on vowels mark stressed syllables in the verb.
those of other species. The second position is perhaps best illustrated by Piagetian theory, which argues that perceptual and cognitive constraints determine language acquisition. Many studies have indeed shown that universal perceptual and cognitive factors influence concept formation and determine which spatial dimensions are most salient (Antell et al. 1985; Mandler 1996). Such factors account for the recurrent order in which linguistic procedures are acquired and related concepts constructed by children across languages. For example, in a review of studies on the acquisition of spatial prepositions across several languages, Johnston and Slobin (1979) showed that all children first learn prepositions that encode containment (in), support (on) and occlusion (under), then those that encode proximity (next to), and finally those that refer to distinctions on the sagittal axis (behind, in front of). This recurrent order reflects the relative complexity of spatial markers and suggests that universal cognitive constraints influence acquisition. Finally, according to the position known as linguistic determinism (Whorf 1956; Bowerman 1996; Slobin 1996), our language influences how we think when we talk, getting us to focus our attention on particular aspects of reality. Thus, children learn to verbalize situations in the way that is most typical for their mother tongue. In particular, different lexicalization patterns across languages (e.g., pre- and postpositions, particles, morphologically complex forms or synonyms) influence how children acquire spatial language. Several studies (Bowerman 2003; Choi and Bowerman 1991; Hickmann 2006, 2007; Hickmann et al. 2009) have shown that children talk about space more like adults who speak the same mother tongue than like children of the same age learning a typologically different mother tongue.
For example, from a very early age on, English-speaking children express the manner and path of motion together in one single clause because their language possesses very compact structures allowing them to do so easily. In contrast, although it is possible to express manner and path together in French, and although French adults do combine these two types of information in some situations, they do so less frequently and less systematically than English adults. In addition, French children (three to ten years) rarely express both components together, presumably because this kind of response requires more complex structures in French than in English. Finally, at all ages French speakers' responses vary with event types: although they typically focus on path with most events, they also focus on manner with crossing events (children) or combine path and manner with upward motion (mostly using the verb grimper 'to climb up', which lexicalizes both). These results directly follow from the typological properties of English and French, suggesting that children learn very
early to express the types of information that are salient in their native language. Therefore, both general and language-specific determinants influence children's cognition and language in the domain of space. As noted above, despite some particular properties of German, its lexicalization pattern for motion events is similar to the one in English since most German motion verbs conflate motion and manner, while path is typically expressed in a wide range of satellites (particles, adverbs etc.). From a developmental point of view, German children should therefore talk about motion more like English-speaking children (frequent manner verbs and path satellites) than like French-speaking children (frequent path verbs, infrequent manner). This hypothesis is partially supported by some scant available evidence. One study (Bamberg 1994: 221) notes that German children make heavy use of varied motion verbs and satellites in narrative discourse, but provides no further information concerning how these devices are used. Evidence from a study (Gentner 1979) concerning early child English shows the frequent use of light verbs in combination with satellites, which may also be expected to occur among German children. Light uses need not involve the full meaning of verbs, which can be frequently reduced to sheer motion (e.g., gehen 'to go' rather than 'to walk'). Such uses presumably also involve a lower level of grammatical complexity since children often learn the finite forms of these verbs by rote and therefore do not actively inflect them. Example (12) illustrates a construction of this type (light verb gehen, verb particle rauf 'up') which is very frequent among young German children.

(12) Er geht rauf.
     'He goes up.'
With respect to German, surprisingly little is still known concerning children's uses of other devices outside of the main verb root, such as those illustrated in (13) to (17) below: spatial adverbs, spatial particles, prefixed verbs, and full prepositional phrases which govern either Dative or Accusative case to distinguish general locations from changes of locations, respectively.

(13) Der rennt hier.
     'He runs here.'
(14) Die geht rauf.
     'She goes up.'
(15) Die Frau überquert die Straße.
     'The woman crosses the street.'
(16) Der Affe klettert auf den Baum.
     'The monkey climbs on the[Acc] tree.'
(17) Das Kind spielt in der Küche.
     'The child is playing in the[Dat] kitchen.'
The present study aimed at further examining how German children represent motion events in a controlled experimental situation that was similar to the one previously used for English and French (Hickmann 2006; Hickmann et al. 2009). Given the properties of German, the following predictions were made. First, from the youngest age tested (three years) onwards, German children should express both manner and path, relying on structures that encode manner in the verb and path in other devices such as particles, which represent the typical typological pattern of satellite-framed languages. Second, they also should use a great number and variety of motion verbs expressing manner, since such verbs are frequent in the adult input (Talmy 1985, 2000; Weber 1983). Third, their uses of devices outside of the verb root should show some change with age as a function of grammatical complexity. In particular, children should produce these devices in the following order: first particles and adverbs, which are least difficult because they do not require any inflection3; then prefixed verbs, which are possible means of expressing motion; and finally, full prepositional phrases, which govern different case markings.

4. Method

4.1. Subjects
The results reported below concern 60 monolingual Germans in five age groups (12 subjects per age). Four groups of children, boys and girls, were tested in kindergartens and primary schools of Augsburg. Their ages were approximately three years (mean 3;8, range 3;4 to 4;4), four years (mean 4;7, range 4;6 to 5;4), six years (mean 6;7, range 6;4 to 7;2), and ten years (mean 10;5, range 10;4 to 10;11). A control group of adults involved students from the University of Munich.

4.2. Materials
Two sets of animated cartoons were constructed (see Appendix). In all cartoons characters carried out a displacement in a particular manner (e.g., walking, running, jumping, etc.), then left the scene. One set of target items (six up-targets and six down-targets) showed a scene with a vertical ground referent, along which displacements took place (e.g., a squirrel running up/down a tree and away). In another set of items (six control items) the characters entered onto one side of the scene against a blank screen, moved to the other side, and left. Manner corresponded to the types of actions that took place in the target items during the characters' departure (e.g., walking). These displacements were carried out in the absence of any scenery that could provide specific relevant ground entities for the expression of path. Upward and downward motions were selected as targets for the stimuli because they correspond to events that are most familiar to children4. Furthermore, the addition of control items provided a direct contrast between two conditions. Target items focused subjects' attention on location changes that involved relevant manner and path information, whereas control items minimized path information and highlighted manner. It was expected that German subjects should 1) express manner with both types of items, but 2) combine manner and path with target items and 3) do so more often with increasing age. Control items also provided a way of determining whether children were able to produce some manner information, particularly if they had not spontaneously mentioned this information when describing target items.

3. Particles are used very early in German (e.g., Auto rauf 'car up'). Later in development they are integrated into particle verbs and then form part of the Satzklammer ('sentence bracketing'), which is syntactically more complex and thus more difficult to learn (e.g., Ich schieb das Auto rauf 'I push the car up').

4.3. Procedure
Subjects were seen individually in their school or university setting. They were shown the cartoons on a computer screen and asked to narrate each cartoon as completely as possible. The entire session was audiotaped. Primary school children and adults were told that a future addressee, who would not be shown the cartoons, would have to reproduce the stories on the basis of the recordings. Younger children were introduced to a doll and were asked to blindfold her as part of a game in which they would be telling her secrets. They were reminded throughout to tell her everything that had happened because she could not see and would also like to tell the story. This procedure ensured that subjects produced full descriptions. Cartoons were presented in six different random orders in which target items always occurred before control items. A training item began the session.
4. The stimuli actually included six other cartoons that were interspersed among the target items, but are not discussed in the present paper because of space limitations. This additional set of stimuli showed events that involved crossing a boundary (e.g., a baby crawling across a street, a boy swimming across a river). All results concerning these events are entirely in line with those reported here for up/down events.
4.4. Coding
The analyses focused on utterances that described motion. These utterances contained several types of information relevant to motion that were encoded by a variety of linguistic devices, grouped below into two classes: main verbs vs. all other devices. With respect to main verbs, the coding first distinguished those that were potentially 'light', particularly all verbal forms of gehen ('to go'), from all others that had their full semantic meaning (e.g., rennen 'to run', hüpfen 'to jump'). Most uses of gehen presumably expressed sheer motion, rather than a particular manner of motion, given that the corresponding experimental item did not show walking at all (e.g., the cyclist in example (18)). Motion verbs other than gehen were further coded with respect to path (e.g., kommen 'to come') and manner (e.g., hüpfen 'to jump'). Devices outside of the main verb were of three types: 1) prepositional phrases (e.g., auf den Baum 'on the tree'); 2) spatial particles (e.g., rüber 'across'); 3) other relevant expressions such as adverbs (e.g., hier 'here'). They were further coded in terms of whether they expressed path (e.g., weg 'away'), manner (e.g., auf allen Vieren 'on all fours'), or other types of information, for example locations (e.g., da 'there').⁵

(18) Die Fahrradfahrerin, die geht da rüber.
     'The woman-cyclist, she goes there across.'

5. Results

5.1. Up and down motion
5.1.1. Main verbs. As expected, young children frequently produced the light verb gehen ('to go') in combination with various other devices, particularly at ages three to six (34%). Light verbs then decreased sharply at ten years (8%) and practically disappeared in the adult group (2%). As illustrated in example (19), the meaning of gehen as a light verb can be reduced to sheer motion.

(19) Der geht dahin und geht da hoch, frisst den ganzen Honig, wieder runter und geht dann weiter. (3 years)
     'He goes there and goes there up, eats all the honey, again down and goes then along.'
5. Our data showed no verbal root conflating both manner and path, nor any subordinate clauses expressing motion information.
Figure 1. Semantic content of main verbs for target events as a function of age
Figure 1 shows how frequently path and manner were expressed in the main verb as a function of age.⁶ Main verbs expressed manner more frequently than path at all ages (77% vs. 23% overall). Examples (20) and (21) show two typical sentences from German speakers who combine manner verbs with path devices. However, path verbs were not infrequent among young children (29% at three years, 30% at four years, 37% at six years, in comparison to 15% at ten years), as illustrated in (22). With increasing age, manner verbs clearly became most frequent, particularly at ten years (87%) and in the adult group (94%). Post-hoc tests (Bonferroni) comparing uses of manner verbs showed a significant difference between ages six and ten years (p < .05), but none between ages three to six years, nor between ten-year-old children and adults.

(20) Die hüpft da, krabbelt auf zum Käse rauf. (3 years)
     'She bounces there, crawls on to the cheese up.'
(21) Das Eichhörnchen springt zu dem Baum und klettert hinauf und kriecht in das Loch <und geht> [/] und hüpft wieder raus und runter. (6 years)⁷
     'The squirrel jumps to the tree and climbs up and creeps into the hole <and goes> [/] and bounces then out and down.'
(22) Die geht auf den Stängel und geht auf das Blatt und esst es ein bisschen und geht wieder runter. (6 years)
     'She goes on the stick and goes on the leaf and eats it a little bit and goes again down.'
6. Very few German verbs express Manner and Path simultaneously, as for example steigen ('to climb up') or tauchen ('to dive [down]'). Since these verbs were extremely scarce in our data, we did not distinguish them from simple Manner verbs.
7. The symbol [/] marks a self-correction concerning the passage shown between pointed brackets < >.
5.1.2. Other devices. We now turn to all linguistic devices that expressed spatial information outside of the main finite verb (included among 'other devices', see coding above). Table 1 shows the mean number of these devices that were used in each age group within one response. This number increased with age, particularly after six years. Age comparisons in this respect showed significant increases between ages six and ten (p < .05), as well as between ten-year-olds and adults (p < .05), but no significant differences between ages three and six.
Table 1. Number of devices outside of main verbs for target events as a function of age*

Age groups    Total number of devices    Number of devices per event
3 years       14,58                      1,21
4 years       14,42                      1,20
6 years       13,67                      1,14
10 years      17,08                      1,42
Adults        19,27                      1,75
* Number of devices per utterance.
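The relation between the two columns of Table 1 can be checked with a little arithmetic. The Python sketch below assumes that each per-event figure is the total divided by the 12 target events (six up, six down); this divisor, the names TOTALS, REPORTED and flag_mismatches, and the 0.01 tolerance are assumptions made for illustration and are not part of the original analysis. Decimal commas are written as decimal points.

```python
# Values copied from Table 1 (decimal commas converted to points).
TOTALS = {"3 years": 14.58, "4 years": 14.42, "6 years": 13.67,
          "10 years": 17.08, "Adults": 19.27}
REPORTED = {"3 years": 1.21, "4 years": 1.20, "6 years": 1.14,
            "10 years": 1.42, "Adults": 1.75}

# Assumption: 12 target events per subject (six up-targets, six down-targets).
N_TARGET_EVENTS = 12


def per_event(total, n_events=N_TARGET_EVENTS):
    """Mean number of devices per event under the assumed divisor."""
    return total / n_events


def flag_mismatches(totals, reported, n_events=N_TARGET_EVENTS, tol=0.01):
    """Return the age groups whose printed per-event value differs from
    total / n_events by more than the (hypothetical) tolerance."""
    return [group for group, total in totals.items()
            if abs(per_event(total, n_events) - reported[group]) > tol]


if __name__ == "__main__":
    for group, total in TOTALS.items():
        print(f"{group:>8}: computed {per_event(total):.3f}, "
              f"printed {REPORTED[group]:.2f}")
    print("Rows not reproduced:", flag_mismatches(TOTALS, REPORTED))
```

Under this assumption the four child groups reproduce the printed values, whereas the adult row does not (19,27 divided by 12 is about 1,61, against the printed 1,75). The adult figures may therefore rest on a different denominator or contain a misprint; the sketch only flags the mismatch and does not resolve it.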
In their verbalizations of up- and down-motion, participants used either one satellite, as in example (23), or more, as in (24) and (25). On average, adults used 1,75 devices per utterance, as compared to 1,21 at three years. Nevertheless, even some of the children at three years produced as many devices as adults.

(23) Die klettert auf die Blume. (3 years)
     'She climbs on the flower.'
(24) Das krabbelt hoch zum Baum. (3 years)
     'It crawls up to the tree.'
(25) Wir haben eine Raupe, die sich im Garten bewegt, und sich dann auf einen Stängel raufhangelt [...]. (adult)
     'We have a caterpillar that moves in the garden, and then clings on[on] a stalk.'
Figure 2 further shows the distribution of other devices within each age group. In all age groups particles were most frequent, but tended to decrease with age (from 75% at three years to 57% at adult age). Prepositional phrases tended to increase between ages three (9%) and six (24%), then to decrease until adult age (12%). However, no age differences were significant for either of these types of devices. As for the third residual category, consisting above all of adverbs, it was rather infrequent until six years (8% to 14%), but increased thereafter until adult age (31%). Age comparisons showed significant increases in these devices between six and ten years (p < .05), as well as between age ten and adults (p < .05). No other age differences were significant.

Figure 2. Types of devices outside of main verbs for target events as a function of age

With respect to the semantic information encoded in these linguistic devices, overall 80% expressed path (as in (18) to (20) above), as compared to only 1% manner and 19% other relevant information (for instance information about the setting). As expected, path devices were significantly more frequent than manner devices (t-test, df = 59, p < .001), and the remaining category mostly concerned information about the setting and the ground, e.g., hier ('here') or auf einer Wiese ('in a meadow'). As shown in Figure 3, the same pattern was observed within each age group.
Figure 3. Semantic information expressed in other devices for target events as a function of age
Age comparisons showed no significant differences in the frequencies of path devices, despite a slight increase from ages three to six (82% to 90%) and a decrease thereafter (78% at age ten, 67% among adults). The residual class of devices, for example those that provided general locations (other than particles and prepositional phrases), increased with age, showing more frequent uses by adults (33%) than by children in any age group (three years 15%, four years 12%, six years 10%, ten years 21%). As illustrated in examples (26) and (27), this difference between young children and adults mainly concerned information about the setting and the ground.

(26) Das krabbelt da hoch, dann holt sie den Käse und dann geht sie runter. (4 years)
     'It crawls up there, then it takes the cheese and then it goes down.'
(27) Die Katze ähm springt an einem Telefon(mast), nein an einem Strommast und ähm krabbelt dann hoch und klaut sich ein Ei aus dem Nest, also das stubst das Ei runter und das Ei fällt auf den Boden, das Ei bricht entzwei und die Katze springt runter und leckt dann das Ei auf. (adult)
     'The cat ehm jumps at a telephone (pole), no at a power pole and ehm then crawls up and nicks an egg from the nest, well it nudges the egg down and the egg falls to the ground, the egg breaks in two and the cat jumps down and then licks the egg.'

5.2. Control items
Figure 4 shows the semantic information that was expressed in motion verbs with control items across the different age groups. At all ages the great majority of verbs expressed manner rather than path (overall 91% vs. 9%). Despite slight variations in this respect, no age differences were significant. Examples (28) to (31) show a great variety of different manner verbs across ages. In comparison, regardless of age, very few responses contained any other device outside of the main verb, except for occasional locative expressions. No analysis of light verbs is presented for control items, since these verbs were very rare with these items.
Figure 4. Semantic information in main verbs used for control items as a function of age
(28) Die springt. (3 years)
     'She jumps.'
(29) Die ist so gekrabbelt. (4 years)
     'She has crawled like this.'
(30) Die Robbe, die robbt halt so ja, die Raupe, da. (10 years)
     'The seal, it crawls just like this, yes, the caterpillar, there.'
(31) Der Bär tapst. (adult)
     'The bear lumbers.'
A final analysis compared the types of motion verbs that were used in relation to target and control items. Table 2 shows the number of different lexemes (types) that were used for each item type. In both cases the number of lexeme types remained stable between ages three and six (8-9 types), then increased slightly with target items (10-11 types) and drastically with control items (17 types). However, although children's productive lexicon seems to undergo an explosion between ages six and ten, note that they produced some verbs that were not used by adults, e.g., robben ('to crawl') or schrubben ('to scrub') for the motion of the caterpillar. Moreover, three neologisms were found in the data: raupen, which might be derived from the noun Raupe ('caterpillar') to mean 'to move as a caterpillar'; kraupen, which probably fuses the verb krabbeln ('to crawl') and the noun Raupe ('caterpillar'); and tappeln, which probably combines the verbs tappen ('to go falteringly') and tippeln ('to trip').
Table 2. Number of different motion verb lexemes (types) as a function of age

Age groups    Target items    Control items
3 years       9               8
4 years       9               9
6 years       9               9
10 years      11              17
Adults        10              17
6. Discussion

6.1. Lexicalization patterns in German and other child languages
Our experiment examined how German children and adults described voluntary motion events, with particular attention to the expression of path and manner in relation to up and down motion. As expected, the results showed that speakers mostly encoded manner in finite verbs and path in other linguistic devices outside of the verb. This first result was observed at all ages from the youngest age (three years) to adult age. As predicted, these observed patterns are in line with the proposal that German (like other satellite-framed languages) invites speakers to simultaneously focus on both manner and path. Additional qualitative information shows that German speakers were greatly concerned with manner of motion. First, as illustrated in (32) and (33), self-corrections were observed at all ages from the youngest to the oldest age group, showing that in the majority of cases speakers were searching for the motion verb that exactly corresponded to particular motion events.

(32) [...] ähm jetzt die Raupe, die ist <auf den Stängel hochgegangen hat das> [/] auf den Stängel hochgekrabbelt <und die> [/] und hat das Blatt angebissen. (4 years)
     '[...] eh now the caterpillar she <went up on the stalk, has> [/] she climbed up on the stalk <and she> [/] and took a bite of the leaf.'
(33) Ein Eichhörnchen krabbelt, also klettert <einen Berg> [/] äh einen Baum rauf. (Adult)
     'The squirrel crawls, so climbs up <on a mountain> [/] eh on a tree.'
Second, speakers used a wide range of manner verbs in all age groups, sometimes involving very subtle nuances, for instance tappen, tippeln or trippeln (all of which may be translated into English as 'to go falteringly' or 'to trip'). Furthermore, they used a great number of different verbal particles, most of which expressed different aspects of path, for example contracted particles such as drauf or hinunter, which contain up to three different types of semantic information, as illustrated in (34):

(34) drauf = d(a) + (he)r + auf
       da ('there'): general location in which motion takes place
       her ('towards'): deixis
       auf ('up'): direction (along a vertical axis)
     hinunter = hin + unter
       hin ('towards'): deixis
       unter ('down'): direction of the movement (vertical axis)
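The decomposition in (34) can be sketched as a small lookup procedure. The Python sketch below is only an illustration: the tables PREFIXES and STEMS and the function decompose are hypothetical names, they cover only the forms glossed in (34) plus the reduced deictic forms n-/r- mentioned later in the paper, and they make no claim about how children or adults actually analyse these particles.

```python
# Hypothetical lookup tables; glosses follow example (34).
# Each prefix maps to the morphemes it contracts, with their semantic type.
PREFIXES = [
    ("dr",  [("da", "general location"), ("her", "deixis")]),  # d(a) + (he)r
    ("hin", [("hin", "deixis")]),
    ("her", [("her", "deixis")]),
    ("n",   [("hin", "deixis")]),   # assumed reduced form of hin-
    ("r",   [("her", "deixis")]),   # assumed reduced form of her-
]

# Directional stems along the vertical axis, as glossed in (34).
STEMS = {
    "auf": "direction (up)",
    "unter": "direction (down)",
}


def decompose(particle):
    """Split a contracted path particle into (morpheme, semantic type) pairs.

    Returns None when the particle is not covered by the toy tables.
    """
    for prefix, parts in PREFIXES:
        stem = particle[len(prefix):]
        if particle.startswith(prefix) and stem in STEMS:
            return parts + [(stem, STEMS[stem])]
    if particle in STEMS:
        return [(particle, STEMS[particle])]
    return None


if __name__ == "__main__":
    for form in ["drauf", "hinunter", "runter", "nauf", "weg"]:
        print(form, "->", decompose(form))
```

On these tables drauf yields three components (location, deixis, direction) and hinunter two (deixis, direction), matching the counts given in (34); forms outside the tables, such as weg, are simply returned as unanalysed.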
Finally, speakers encoded path information not only in particles, but also in many other linguistic devices, such as prepositional phrases or adverbs, thereby producing very detailed path descriptions, as illustrated in (35).

(35) [...] von rechts kommt eine Raupe ins Bild, bewegt sich auf einen Halm zu, klettert hinauf bis zum ersten Blatt [...]. (Adult)
     '[...] a caterpillar comes from the right into the screen, moves towards a blade, climbs up to the first leaf [...].'

A comparison of these results with those that were previously obtained in the same experimental situation for French and English (Hickmann et al. 2009) is consistent with our hypotheses. As predicted on the basis of Talmy's typology, our results concerning German are similar to those reported for English and differ significantly from those reported for French. Like German speakers, English-speaking adults and children predominantly express manner and path together within their utterances. They encode manner in the main verb and path in particles and prepositions (e.g., to crawl up/down). However, children also produce some path-only responses in which they express sheer motion in the verb and path in satellites (e.g., to go up/down). In contrast, although French adults frequently express manner and path together, they do so less frequently and less systematically than English-speaking adults. French children tend to focus on path alone, encoding this information in the main verb (e.g., monter 'to ascend', descendre 'to descend'), and they either do not express manner or express this information outside of the verb (in gerunds, e.g., descendre en courant 'to descend by running', or in adverbials, e.g., monter avec les pattes 'to ascend with the paws'). Differences also occur at all ages as a function of event type. Manner-only responses are rare in both languages and at all ages, with the exception of boundary-crossing events (see above). In addition, French provides a very frequent verb that simultaneously encodes manner and upwards direction (grimper 'climb up'), inviting French speakers to produce more responses combining manner and path with upward motion than with downward motion. A couple of additional points arose during our analyses of some German spatial devices and remain open.
First, with respect to motion verbs, previous linguistic analyses of German (particularly Haggblade 1994; Weber 1983 among others) may need substantial qualifications, particularly in relation to light verbs. Although the classification of gehen ('to go') as a light verb is probably uncontroversial in most cases, its semantic content may differ across contexts (also see Di Meola 1994, for a study of kommen and gehen). For example, as illustrated in (36), this verb may encode manner information, particularly when used to describe motion in relation to the control items. In other contexts such as (37), however, it can encode information about deixis, particularly in descriptions of departures from the screen (which were not analyzed in the present study).

(36) Experimenter: Und die kleine Maus?
     Child: Die geht. (4 years)
     'And the mouse?' 'She goes [walks].'
(37) Die Raupe frisst ein bisschen vom Blatt wo sie hochgegangen ist, dann geht [?] sie wieder runter und geht [geht weg]. (4 years)
     'The caterpillar eats a bit of the leaf where it went up, then it goes down again and goes [goes away].'
Second, following Haggblade (1994: 43), our analysis included information about the setting and the ground among the semantic information that was encoded by subjects' uses of linguistic devices outside of the verb. However, as suggested by some authors (e.g., Talmy 1985, 2000 or Slobin 1996, 2003b), this type of information is of a different nature and should not be included as part of the semantics of motion per se in the class of satellites. Excluding such spatial devices would imply a more conservative coding, resulting in fewer satellites overall but in proportionally more satellites expressing path. It would therefore not invalidate our analysis; on the contrary, it would strengthen the satellite-framed properties of German observed at all ages in our data.

6.2. Developmental progressions in German
Our results show the same lexicalization patterns among children and adults. From the earliest age tested onward (three years), German speakers express manner and path in compact utterances, encoding manner in the finite verb and path mostly in verbal particles. As predicted, manner is as salient to them as path, a result that follows from the typological properties of German as a satellite-framed language. Since German systematically encodes manner in the main verb, children seem to pay attention to this information from three years on and also encode it in their motion event descriptions. Nonetheless, several developmental progressions also occur, revealing a leap particularly between the ages of six and ten years. A first progression concerns an increase in the semantic and syntactic complexity of children's utterances. For instance, the complexity of linguistic devices encoding information outside of the main verb increased with age. As expected, young children most often used adverbs (e.g., da 'there', hier 'here') to locate the motion event and simple particles (e.g., rauf 'up', runter 'down') to describe path. It is only at around six years that prepositional phrases (e.g., auf dem Boden 'on the ground', auf den Baum 'on the tree') are used more frequently and with relative ease. This developmental progression was observed in relation to the devices that were used outside of the main verb. Although the semantics of these devices do not change over time, they are used increasingly with age. Children mostly use path particles that can be easily combined with a great number of different verbal stems. As children learn other devices, they first acquire particles as undifferentiated and whole linguistic entities that are not analysed (neither morphologically nor semantically) and therefore produce frequent contracted forms (drauf 'onto', nunter 'down', etc.). Qualitative analyses show that each child uses only one single form within a given set of particles differing with respect to the deictic element hin/her and their corresponding reduced forms n-/r- (e.g., nauf, nunter, nüber; rauf, runter, rüber; herauf, herunter, herüber). From very early on, young children use semantically complex satellite forms, even though we cannot assume that they know all semantic contrasts within a given paradigm (e.g., hinauf/herauf). Second, the data show some changes across age groups with respect to verb use. Although motion verbs are quite diverse in all age groups, reflecting in particular the highly salient nature of manner in German (as in other satellite-framed languages), children between three and six years also make frequent use of light motion verbs (particularly gehen 'to go'), thereby producing utterances that are morphologically and semantically simpler. Finally, as children get older, they show an increasing ability to organize discourse, as shown by the fact that they gradually learn to specify information about the setting and the ground. Unlike adults, children often do not provide sufficient information for their listener to reconstruct the spatial universe of discourse. They typically only describe motion itself, without any spatial anchoring, for example without specifying the general location in which these events occurred or the source and goal locations implied by some of these events, making it difficult for their listener to interpret changes of location. A very similar developmental progression was also observed in previous analyses of children's narratives (Hickmann 2003) across several different child languages (English, French, German, Chinese).
Thus, although language-specific factors strongly contribute to shaping German children's lexicalization patterns when they verbalize motion events, these factors alone cannot account for the fact that their responses show an increase in semantic density and in syntactic complexity. Other factors must clearly play a role in how these children's spatial language changes with age. In particular, children's cognitive system matures during language acquisition, and some of these changes presumably underlie some changes in their linguistic system, for example an increase in their memory, processing, reasoning, and planning capacities, all of which are involved in complex discourse activities. In addition, the most striking developmental changes were observed at around six years, which corresponds to the age at which German children start school and are challenged in the domain of language.
7. Conclusion
The pattern found for German children is consistent with the one reported in other satellite-framed languages such as English and quite different from the one reported in verb-framed languages such as Spanish or French. When describing motion events, young children learning satellite-framed languages systematically express both manner and path within their utterances. The typological properties of their mother tongue simplify this task by compactly packaging these two types of information in constructions comprising linguistic devices that are among the first morphemes to be mastered during language acquisition (verbal particles). Depending on their language, then, speakers within a given speech community choose to talk about or to ignore particular aspects of denoted situations. This process of selection presumably leads them to build up spatial representations that are partially characteristic of their language. In this sense, our results support the view that children partially construct the semantics of space in accordance with the language-specific characteristics of their mother tongue. Future research needs to address further questions concerning the depth of such typological constraints on speakers' representations beyond language use.

Received 1 March 2009
Revision received 25 November 2009

University of Munich / University of Paris
References
Altmann, Hans and Silke Kemmerling. 2005. Wortbildung fürs Examen. Göttingen: Vandenhoeck and Ruprecht.
Antell, Sue Ellen and Albert Caron. 1985. Neonatal perception of spatial relations. Infant Behaviour and Development 8. 15-23.
Bamberg, Michael. 1994. Development of linguistic forms: German. In Ruth Berman and Dan I. Slobin (eds.), Relating events in narrative: A crosslinguistic developmental study, 189-284. Hove: Erlbaum.
Bowerman, Melissa. 1996a. Learning how to structure space for language: A crosslinguistic perspective. In Paul Bloom (ed.), Language and space, 385-436. Cambridge: MIT Press.
Bowerman, Melissa. 1996b. The origins of children's spatial semantic categories: Cognitive versus linguistic determinants. In John J. Gumperz and Stephen Levinson (eds.), Rethinking linguistic relativity, 145-176. Cambridge: Cambridge University Press.
Bowerman, Melissa and Soonja Choi. 2003. Space under construction: Language-specific categorization in first language acquisition. In Dedre Gentner and Susan Goldin-Meadow (eds.), Language in mind: Advances in the study of language and thought, 387-427. Cambridge: MIT Press.
Choi, Soonja and Melissa Bowerman. 1991. Learning to express motion events in English and Korean: The influence of language-specific lexicalization patterns. Cognition 41. 83-121.
Croft, William, Jóhanna Barðdal, Willem Hollmann, Violeta Sotirova and Chiaki Taoka. 2008. Revising Talmy's typological classification of complex events. Unpublished draft: http://www.unm.edu/~wcroft/WACpubs.html
Di Meola, Claudio. 1994. Kommen und gehen. Eine kognitiv-linguistische Untersuchung der Polysemie deiktischer Bewegungsverben. Tübingen: Niemeyer.
Gentner, Dedre. 1978. On relational meaning: The acquisition of verb meaning. Journal of Child Development 49. 988-998.
Haggblade, Elisabeth. 1994. Die Lexikalisierung von semantischen Komponenten in den Bewegungsverben. Berlin: microfiche version.
Hickmann, Maya. 2006. The relativity of motion in first language acquisition. In Maya Hickmann and Stéphane Robert (eds.), Space across languages: Linguistic systems and cognitive categories, 281-308. Amsterdam/Philadelphia: John Benjamins.
Hickmann, Maya. 2007. Static and dynamic location in French: Developmental and crosslinguistic perspectives. In Michel Aurnague, Maya Hickmann and Laure Vieu (eds.), Spatial entities in language and cognition, 205-231. Amsterdam/Philadelphia: John Benjamins.
Hickmann, Maya, Pierre Taranne and Philippe Bonnet. 2009. Motion in first language acquisition: Manner and path in French and English child language. Journal of Child Language 36(4). 705-741.
Johnston, Judith and Dan Isaac Slobin. 1979. The development of locative expressions in English, Italian, Serbo-Croatian and Turkish. Journal of Child Development 6. 530-545.
Landau, Barbara and Ray Jackendoff. 1993. What and Where in spatial language and spatial cognition. Behavioral and Brain Sciences 16(2). 217-38.
Mandler, Jean M. 1996. Preverbal representation and language. In Paul Bloom (ed.), Language and space, 365-384. Cambridge: MIT Press.
Piaget, Jean and Bärbel Inhelder. 1947. La représentation de l'espace chez l'enfant. Paris: Presses Universitaires de France.
Slobin, Dan I. 1996. From thought and language to thinking for speaking. In John J. Gumperz and Stephen Levinson (eds.), Rethinking linguistic relativity, 70-96. Cambridge: Cambridge University Press.
Slobin, Dan I. 2003a. Language and thought online: Cognitive consequences of linguistic relativity. In Dedre Gentner and Susan Goldin-Meadow (eds.), Language in mind: Advances in the study of language and thought, 157-191. Cambridge, MA: MIT Press.
Slobin, Dan I. 2003b. The many ways to search for a frog. In Sven Strömqvist and Ludo Verhoeven (eds.), Relating events in narrative: Typological and contextual perspectives, 219-257. New Jersey: Erlbaum.
Slobin, Dan I. 2006. What makes manner of motion salient: Explorations in linguistic typology, discourse, and cognition. In Maya Hickmann and Stéphane Robert (eds.), Space across languages: Linguistic systems and cognitive categories, 59-81. Amsterdam: John Benjamins.
Spelke, Elizabeth S. 2003. What makes us smart? Core knowledge and natural language. In Dedre Gentner and Susan Goldin-Meadow (eds.), Language in mind: Advances in the study of language and thought, 277-311. Cambridge, MA: MIT Press.
Talmy, Leonard. 1983. How language structures space. In Herbert Pick and Linda Acredolo (eds.), Spatial orientation, 225-282. New York: Plenum.
Talmy, Leonard. 1985. Lexicalization patterns: Semantic structure in lexical forms. In Timothy Shopen (ed.), Language typology and syntactic description: Grammatical categories and the lexicon, Vol. 3, 57-149. Cambridge: Cambridge University Press.
Talmy, Leonard. 1991. Path to realization: A typology of event conflation. In J. Sutton and C. Johnson (eds.), Proceeding of the seventeenth annual meeting of the Berkeley Linguistic Society, 480-519. Berkeley.
Talmy, Leonard. 2000. Towards a cognitive semantics: Concept structuring systems. Cambridge: Cambridge University Press.
Tschander, Ladina. 1999. Bewegung und Bewegungsverben. In Ipke Wachsmuth and Bernhard Jung (eds.), Proceedings der 4. Fachtagung der Gesellschaft für Kognitionswissenschaft, 25-30. Bielefeld: Sankt Augustin.
Weber, Gerhard. 1983. Untersuchungen zur mentalen Repräsentation von Bewegungsverben. Merkmale, Dimensionen und Vorstellungsbilder. Dissertation. Hannover.
Whorf, Benjamin Lee. 1956. Language, thought and reality. Cambridge, MA: MIT Press.
Appendix: Stimuli

T: Target items (up and down)
(T1) A squirrel runs to a tree, up into and out of a hole in the tree, down, and away.
(T2) A caterpillar crawls to a plant, up the stalk to eat a piece of leaf, down, and away.
(T3) A bear walks to a tree, climbs up to a beehive to get some honey, climbs down to eat it, and walks away.
(T4) A cat runs to a telephone pole, jumps up to a bird's nest, drops an egg, jumps down to lick the egg, and runs away.
(T5) A mouse tiptoes to a table, climbs up to take a piece of cheese, slides down, and tiptoes away.
(T6) A monkey walks to a banana tree, climbs up to take a banana, then slides down and walks away.

C: Control items (manner maximally salient)
(C1) squirrel running; (C2) caterpillar crawling; (C3) bear walking; (C4) cat running; (C5) mouse tiptoeing; (C6) kitten running.
What gestures reveal about how semantic distinctions develop in Dutch children's placement verbs
MARIANNE GULLBERG and BHUVANA NARASIMHAN*
Abstract Placement verbs describe every-day events like putting a toy in a box. Dutch uses two semi-obligatory caused posture verbs (leggen lay and zetten set/stand) to distinguish between events based on whether the located object is placed horizontally or vertically. Although prevalent in the input, these verbs cause Dutch children diculties even at age ve (Narasimhan and Gullberg, accepted). Children overextend leggen to all placement events and underextend the use of zetten. This study examines what gestures can reveal about Dutch three- and ve-year-olds semantic representations of such verbs. The results show that children gesture dierently from adults in this domain. Three-year-olds express only the path of the caused motion, whereas ve-year-olds, like adults, also incorporate the located object. Crucially, gesture patterns are tied to verb use: those children who over-use leggen lay for all placement events only gesture about path. Conversely, children who use the two verbs dierentially for horizontal and vertical placement also incorporate objects in gestures like adults. We argue that childrens gestures reect their current knowledge of verb semantics, and indicate a developmental transition from a system with a single
* Address for correspondence: M. Gullberg, Centre for Languages and Literature, Lund University, PO Box 201, 221 00 Lund, Sweden. Email: marianne.gullberg@ling.lu.se. B. Narasimhan, University of Colorado at Boulder, Department of Linguistics, Hellems 290, 295 UCB, Boulder, CO 80309, U.S.A. Email: Bhuvana.Narasimhan@colorado.edu. Acknowledgements: We gratefully acknowledge the cooperation of the teachers and students of the Kindercentrum Dribbel (Molenhoek, the Netherlands). We are also grateful for funding from the Max Planck Institute for Psycholinguistics. We wish to express our thanks to Judith Bindels, Bregje Esmeijer, Marieke Hoetjes, Anke Jolink, Ilonka Petal, Anne Rijpma, Femke Uijtdewilligen, Sonja Wichert, and Arna Van Doorn for help with data collection, coding, and establishment of interrater reliability, and to Melissa Bowerman, Asifa Majid, Leah Roberts and two anonymous reviewers for feedback and helpful suggestions. Any remaining errors are solely ours. Cognitive Linguistics 21–2 (2010), 239–262. DOI 10.1515/COGL.2010.009. 0936–5907/10/0021–0239 © Walter de Gruyter
semantic component, (caused) movement, to an (adult-like) focus on two semantic components, (caused) movement-and-object.
Keywords: gesture, verb semantics, Dutch, language development, placement.
1. Introduction
How adult-like are children's verb meanings in the early stages of development? Prior work on how children tune in to semantic patterns in the input has investigated children's comprehension of verb meaning (e.g., Gentner 1978; Thomson and Chapman 1977) as well as their production of verbs in both elicited and spontaneous contexts (e.g., Choi and Bowerman 1991; Fisher et al. 1994; Gropen et al. 1991; Naigles and Hoff-Ginsberg 1998; Pye et al. 1996). However, with some notable exceptions (e.g., Anglin 1970; Bowerman 1978), surprisingly few studies have asked how adult-like children's semantic systems are once forms are in use in production. We therefore know remarkably little about the nature of the semantic representations children operate with, what changes take place in the system over the course of development, and when such changes occur. This study explores children's development of verb meaning and what semantic distinctions may underlie their extension patterns in the semantic domain of object placement by looking across modalities. More specifically, we ask what children's gestures about putting things in places can tell us about their developing semantic systems.
1.1. Placement, caused motion verbs, and their development
Children and adults talk frequently about the placement of objects such as putting a toy in a box. Object placement can be defined as events of caused motion where an object (a located or 'figure' object) is moved to a location (a reference object or 'ground') with (typically manual) control exerted over the located object until it reaches its end location (cf. Bowerman et al. 2002; Bowerman et al. 2004). Placement ('put') has long been a popular candidate for a cognitively and linguistically basic notion (Goldberg 1995; Pinker 1989), and children are assumed to acquire 'light' verbs such as put early and easily (Clark 1978; Pinker 1989). But there is also crosslinguistic variation, for instance in the number of verbs that populate this domain and their level of semantic granularity (cf. papers in Ameka and Levinson 2007; Kopecka and Narasimhan to appear; Levinson and Wilkins 2006). Patterns range from single, light, all-purpose verbs like English put, via systems with a small number of (caused posture) verbs
with more specific semantics and constrained extensions like set, stand and lay, to large inventories of very specific, classificatory placement verbs as in the Mayan languages. Moreover, languages sometimes have mixed systems with optionality between the use of light and more specific verbs, as in English where put co-exists with the rarer set, stand, and lay (cf. David 2003; Pauwels 2000). The acquisition of verbs in this domain also displays variation crosslinguistically (e.g., Chenu and Jisa 2006; Hansson and Bruce 2002; Hickmann and Hendriks 2006; Slobin et al. in press). But interestingly, neither the number of semantic distinctions made in a given semantic domain, nor optionality of use in the input seems to significantly delay verb acquisition (e.g., Brown 1998; Narasimhan and Gullberg 2006). Based on a broad crosslinguistic comparison of the acquisition of placement verbs in languages that lexicalise path in verbs (verb-framed) vs. in satellites (satellite-framed, Talmy 1985), it has instead been suggested that acquisition is determined by many factors, including the interaction between semantic distinctions made in the verb and other non-verbal forms (e.g., case marking, adpositions) expressing relevant spatial information (Slobin et al. in press). Dutch uses a small set of caused posture placement verbs, zetten 'set' and leggen 'lay'. In addition to caused change of location, these monomorphemic verbs encode information about figure objects and their end configuration in that location or ground. Among other factors, the choice of verb for a given event depends on the properties of the object being located: its shape, its orientation, and its disposition with respect to the ground. Specifically, the semantic distinctions concern the presence of a functional base and whether the figure object is resting on it, and whether the spatial extension or projected axis of the object is vertical or horizontal (Lemmens 2002, 2006; van Staden et al. 2006).
For figure objects resting on their base, often extending vertically, zetten 'set' is typically used, as in example (1). For figure objects lacking a functional base and/or extending horizontally, leggen 'lay' is preferred, as seen in (2).
(1) zij zet de kop/de fles op tafel
    'she sets the cup/the bottle on the table'
(2) zij legt de bal/de fles op tafel
    'she lays the ball/the bottle on the table'
Dutch caused posture verbs are semi-obligatory and frequent in adult usage, and they are also ubiquitous in the input to Dutch children (Narasimhan and Gullberg accepted). In line with claims that children tune in very early to the habitual patterns of encoding in their language (Choi and Bowerman 1991; Slobin et al. in press), Dutch children might
therefore be expected to acquire these verbs early, easily, and uniformly. However, these verbs cause unexpected difficulties for children as old as four and five, leading to non-adult-like verb use (Narasimhan and Gullberg accepted). When Dutch children's verb use is compared to that of adults for the same set of scenes, children are found to over-extend leggen 'lay' and under-extend zetten 'set', seemingly picking one default verb to apply to all placement events. The question arises as to what semantic distinctions children who use leggen 'lay' for all placement events actually operate with. One novel way to examine this question is to consider other available vehicles of meaning, namely speech-associated gestures, along with speech.
1.2. Gestures and language-specific meaning
Speech-associated gestures are closely linked to speech and language. Generally, speech and gesture are semantically, temporally and pragmatically coordinated such that the most meaningful part of a gesture, the stroke, typically is temporally coordinated with a part of speech expressing closely related meaning (Kendon 1980; McNeill 1992). Although theories about the speech-gesture relationship differ in their views on the locus and nature of the connection, the connection itself is undisputed (for a review, see De Ruiter 2007). Adults' gestural practices differ crosslinguistically for various reasons (cf. Kendon 2004). Recent research suggests that the variation is partially related to linguistic variation. Although gestures convey information in a different format from speech, they reflect the linguistic choices speakers make: what information is considered newsworthy and when (McNeill 1992; McNeill, Levy and Pedelty 1990). Insofar as languages select different information for expression, gestural forms and their timing relative to speech thus differ crosslinguistically. For instance, gestures have been shown to be influenced by how semantic components like path and manner of motion are lexicalised and packaged syntactically in a given language (e.g., Duncan 1996, 2005; Gullberg et al. 2008; Kita and Özyürek 2003; McNeill and Duncan 2000; Özyürek et al. 2005). Descriptions in languages like Turkish, which expresses path and manner of motion in separate spoken clauses (e.g., 'descend [path] while rolling [manner]'), also tend to be accompanied by gestures which express each component separately: one separate gesture for the path and another for the manner (e.g., Kita and Özyürek 2003). Gestures also appear to be influenced by verb semantics alone when information structure and syntactic packaging are kept constant. For instance, French and Dutch organise placement descriptions similarly
(agent-action-object-location) and the simple transitive placement verbs project similar structures. However, the semantics of the placement verbs differ. French has a general placement verb mettre 'put', which chiefly encodes the caused motion. French adults predominantly accompany placement descriptions by gestures expressing only the direction or path of the movement (Gullberg in press, submitted). This is in contrast to Dutch adults, who instead chiefly accompany their caused posture verbs by gestures incorporating the figure object with the direction of the gestural movement in hand shapes that reflect the imagined object. These object-incorporating gestures are not restricted to a specific verb, but occur with both caused posture verbs (viz. leggen as well as zetten). Since the information structure and syntactic packaging of placement descriptions is similar across the two languages, the difference in gesture patterns arguably stems from the different semantic specificity of the placement verbs. The Dutch gestural focus on objects seems to be prompted by the semantic distinction based on the object and its properties, viz. leggen 'lay' for objects without a base extended horizontally, and zetten 'set' for objects resting on their base, extending vertically. Conversely, the absence of a French gestural interest in objects seems to be influenced by the relatively less specific verb semantics in French. The observed coordination between speech and gesture, which includes crosslinguistic differences in semantic and syntactic distinctions, suggests that gestures can be seen as vehicles of language-specific meaning on a par with speech. They can therefore provide an additional window onto speakers' event-related, semantic representations.
1.3. Gestures in language development
A growing body of research indicates that gestures and speech develop in parallel in childhood (e.g., Bates and Dick 2002; Capirci and Volterra 2008; Nicoladis et al. 1999; Volterra et al. 2005). However, despite the integration of the modalities, a number of studies also show that gestures serve as precursors to speech (e.g., Bates and Dick 2002; Tomasello et al. 2007), carrying more communicative weight in younger children (e.g., Guidetti 2005; Stefanini et al. 2008). A particular research tradition focuses on how gestures foreshadow speech such that non-redundant meaning is expressed in gesture before it can be expressed in speech in so-called 'mis-matches' (Church and Goldin-Meadow 1986). The presence of such gestures has been seen as an indication of transitional knowledge states and of a readiness for both language learning (e.g., Capirci et al. 1996; Goldin-Meadow 2007; Özçalışkan and Goldin-Meadow 2005) and learning more generally (e.g., Alibali and Goldin-Meadow 1993; Goldin-Meadow 2003; Pine et al. 2004), even beyond younger childhood.
There is also evidence that gestures can be informative about the development of semantic representations in general (Capone 2007), and about the development of language-specific semantics in particular. This latter aspect has been examined in the domain of motion where the realisation of semantic components like path and manner has been explored in speech and gesture. A series of studies investigating descriptions of motion and causal events in English and Turkish have shown that children between three and nine learning these languages overall display general (universal) patterns in younger childhood, and language-specific patterns later on (e.g., Allen et al. 2003; Özyürek and Özçalışkan 2000). At age three, the children examined often display similar patterns crosslinguistically, conflating elements of path and manner, or cause and path of motion, in gesture and in speech. Language-specific patterns emerge around age five or six, depending on the study and construction examined. Particularly interesting is the observation that speech and gesture often express the same meaning even if neither modality is adult-like. For instance, the youngest Turkish children's gestures differed from those of Turkish adults in that they conflated cause and path of motion more often than Turkish adults did, but they were consistent with their own spoken descriptions, which also conflated these components more than adults (Furman et al. 2006). Similar findings come from a study of the expression of path and manner of motion in French. French children aged four and six were adult-like in their tendency to both talk and gesture predominantly about path (Gullberg et al. 2008). In sum, these studies of children's speech and gesture generally suggest that children's gestures reflect the meanings that they express in speech. This in turn suggests that children's gestures can be informative about their semantic representations at a given point in time.
2. This study
The aim of the present study is to examine the nature of children's semantic knowledge of placement verbs in more detail. We do this by considering how Dutch three- (N = 5) and five-year-olds (N = 7) use gestures in parallel with Dutch caused posture verbs to describe object placement events compared to Dutch adults (N = 10). We ask the following two questions: (1) Do Dutch children gesture like adults in the domain of placement? (2) If not, do Dutch children's placement gestures differ depending on their patterns of use of placement verbs, and if so, how? Previous research leads us to expect Dutch adults to produce the caused posture verbs leggen 'lay' and zetten 'set/stand' in the description
of placement events, and also to produce gestures that reflect the semantic importance of figure objects through a preference for object-incorporation with the direction or path of gestural motion. Our first analysis examines whether children's gestures accompanying placement descriptions look overall adult-like. Second, to determine whether children's use of placement verbs is adult-like, we investigate children's deployment of verbs to a set of target events, which systematically vary the orientation of figure objects, and compare them to adults. Finally, we examine whether gesture use differs between those children whose verb use is adult-like versus those whose verb use is not, in order to explore whether the information expressed in gesture can shed light on the semantic representations underlying the usage of placement verbs.
3. Method
To elicit natural speech and gesture data while maintaining control over the extensions of placement verbs, we used a referential communication task (Yule 1997) in the form of a Director-Matcher game. One participant, the Director, describes video clips depicting placement events to a confederate, the Matcher, who must then select the picture corresponding to the description from a set of possible options. The dyadic set-up as well as the information gap between the participants is conducive to gesture production despite the short, simple descriptions. We first examine children's and adults' overall gesture production and the frequencies of use of object-incorporating versus path-only gestures. We then compare children's and adults' verb use to describe the same scenes in a subset of contrastive target placement events. Finally, we explore the connections between gesture production and verb use in individuals.
3.1. Participants
Participants were 29 children acquiring Dutch (aged 3;1 to 6;0) recruited through a Dutch preschool (Molenhoek, the Netherlands). For the purposes of this analysis, we excluded all children who produced fewer than three gestures during the task, leaving 12 children in total for analysis. The children fell naturally into two groups of children aged 3;1–4;5 (M = 3;6, N = 5), and children aged 5;1–6;0 (M = 5;4, N = 7). For ease of exposition, the child groups are referred to as three-year-olds and five-year-olds. Additionally, 29 adult native speakers of Dutch were tested as
controls, 10 of whom produced more than three gestures and were therefore retained for analysis.
3.2. Materials
The stimuli, developed for a crosslinguistic comparison of placement event descriptions (cf. Narasimhan and Gullberg 2006; accepted), consisted of a set of video clips showing a female actor manually placing figure objects (henceforth simply objects) on a shelf or a table top. Sixteen target events showed eight objects (a doll, a monkey, a bear, a dog, a can, a book, a flashlight, and a picture frame) being placed either in a vertical or horizontal position at a location (see the Appendix, target events listed in boldface). Twenty filler events and 3 warm-up items showed a range of other objects being dropped, squeezed, etc. These were not expected to elicit placement verbs. The stimulus clips were randomized and organized into two orders. The presentation of the stimulus order was counterbalanced. A set of still photos of the objects in their end location was also produced.
3.3. Procedure
Participants were tested individually and given oral instructions that they were going to play a game where they had to help one person (Experimenter2) put a set of pictures in the right order. Participants saw one video clip at a time on a laptop screen manipulated by Experimenter1. Experimenter2, who could not see the video screen, asked the participants 'What did the woman do?'. Based on the participants' descriptions, Experimenter2 chose the correct still image from the set of stills depicting the placement scenes. If participants gave a simple locative expression or an intransitive description (e.g., 'the book is/lies on the table'), then Experimenter2 asked 'What happened?' or 'What did the woman do?' Adults controlled the computer themselves. The testing procedure was otherwise identical for adults and children. The session started with three warm-up items. The entire testing session was audio- and videotaped.
3.4. Data treatment
3.4.1. Speech. Native speakers of Dutch transcribed the first spontaneous transitive description of each video clip (cf. Plumert et al. 1995). An (adult) example is given in (3), with the first transitive description in boldface.
(3) ze pakt een dingetje . . . zo'n knuffelbeertje en die zet ze op tafel
    'she takes a thingy . . . a little teddy bear and that she sets on [the] table'
The placement verbs were selected for further analysis. Where two utterances described the same scene with different object labels, the first one was selected. Finally, in cases of self-corrections, the first immediately following complete and/or interpretable description was retained. A similar procedure was applied to uninterpretable utterances.
3.4.2. Gesture. The narrow focus on the first descriptions is particularly important for the gesture analysis. Gestures are sensitive to information structure and tend to co-occur with the most newsworthy element. In the first description of the placement event, that information is the placement act itself in conjunction with the ground. In contrast, in elaborations prompted by questions, other spatial information is often targeted, such as specific locations like 'at the right-hand corner on top'. Gestures accompanying such elaborations are often deliberately demonstrative, sometimes aligning with spoken deictic expressions referring to the gesture ('like this'). These gestures therefore target other information and are potentially driven by other mechanisms than gestures performed without any particular demonstrative intent. Also excluded from analysis, and for similar reasons, were gestures occurring with disfluencies or multiple hesitation phenomena (cf. Gullberg 1998). Using frame-by-frame analysis of digital video in video annotation software (ELAN, http://www.lat-mpi.eu/tools/elan/), we identified gestures occurring with the spontaneous first descriptions of the placement events. Specifically, we identified gestural strokes, that is, the expressive part of the gestural movement where the spatial excursion of the limb reaches its apex, and post-stroke holds, or cases where the hands are temporarily immobile in gesture space before moving on (Kendon 1972, 2004: 111–112; Kita et al. 1998; Seyfeddinipur 2006). All gestures thus identified were then coded for whether they encoded (figure) object information, or only direction or path of movement.
This coding was done with sound turned off and was based on the structural properties of the gestures alone to avoid circularity when gesture information was compared to speech information. Gestures were coded as expressing object information when they displayed a hand shape that reflected and incorporated the figure object into the movement. Gestures were coded as expressing only path of movement when they expressed a spatial excursion (cf. Kendon 2004) laterally, vertically or sagittally from the speaker's body and displayed no particular hand shape, that is, a relaxed, floppy hand or a pointing hand shape. Examples of these categories are displayed in Figure 1a (Object-incorporation) and Figure 1b (Path-only). Finally, in the same annotation software with sound turned back on, we also transcribed the speech that co-occurred exactly with the gesture
Figure 1a. Example of gesture coded as Object-incorporating displaying a hand shape indicating the presence of a figure object.
Figure 1b. Example of gesture coded as Path-only displaying a flat hand with no hand shape indicating the presence of a figure object.
stroke, although no detailed speech-gesture alignment analysis was performed for this study. Interrater reliability of the gesture coding was established by having a second coder judge the data. The interrater reliability for gesture identification was .94 (N = 235) and for form coding (object-incorporation vs. path-only) .92. In cases of discrepancy, the judgement of the second coder was retained. Table 1 summarises the total number of gestures per age group.
Table 1. Number of gestures per age group
              # speakers   # gestures
3-year-olds        5           66
5-year-olds        7           70
Adults            10           99
Total             22          235
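The reliability figures reported above are given without naming the agreement statistic. As a hedged illustration only, the sketch below computes two common candidates, raw percent agreement and Cohen's kappa, for two coders' form labels. The function names and the toy labels are hypothetical and are not taken from the study's data.

```python
from collections import Counter


def percent_agreement(coder_a, coder_b):
    """Share of items on which the two coders assigned the same label."""
    if len(coder_a) != len(coder_b) or not coder_a:
        raise ValueError("coders must label the same non-empty set of items")
    matches = sum(a == b for a, b in zip(coder_a, coder_b))
    return matches / len(coder_a)


def cohens_kappa(coder_a, coder_b):
    """Chance-corrected agreement (Cohen's kappa) for two coders."""
    n = len(coder_a)
    observed = percent_agreement(coder_a, coder_b)
    counts_a = Counter(coder_a)
    counts_b = Counter(coder_b)
    labels = set(counts_a) | set(counts_b)
    # Agreement expected by chance, from each coder's marginal label rates.
    expected = sum(counts_a[label] * counts_b[label] for label in labels) / (n * n)
    if expected == 1.0:
        # Both coders used one and the same label throughout.
        return 1.0
    return (observed - expected) / (1.0 - expected)


if __name__ == "__main__":
    # Hypothetical form codes for four gestures (illustrative only).
    first_coder = ["obj", "obj", "path", "path"]
    second_coder = ["obj", "path", "path", "path"]
    print(percent_agreement(first_coder, second_coder))
    print(cohens_kappa(first_coder, second_coder))
```

Percent agreement is easier to read, whereas kappa discounts the agreement that two coders would reach by chance when one category (here, path-only) is much more frequent than the other.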
3.5. Analyses
The dependent variables are proportions of gestures per participant expressing object-incorporation vs. path-only, and proportion of verb types used per participant. Because the dependent variables are proportions, they were arcsine transformed for statistical analysis (Howell 2002); however, non-transformed values are reported in tables, figures and text. Analyses of gesture data draw on non-parametric statistical tests, specifically Kruskal-Wallis for comparisons of multiple independent samples and Mann-Whitney for comparisons of two independent samples. Speech
data are analysed with parametric one-way ANOVAs followed by Tukey HSD tests for post-hoc comparisons.
4. Results
4.1. Overall gesture use
We first examine whether Dutch children produce the same gestures and to the same extent as Dutch adults, excluding warm-up items but including both target and filler events. Figure 2 summarises the mean proportion of gestures that express object-incorporation in the form of object-related hand shapes or path-only as a function of age (3 years, 5 years, adults).
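As described in Section 3.5, the per-participant proportions behind these means were arcsine transformed before statistical testing. A minimal sketch of that step follows, assuming the variance-stabilising form 2·arcsin(√p); the paper cites Howell (2002) without spelling out the formula, and some texts omit the factor 2. The function names and the gesture counts are hypothetical.

```python
import math


def object_proportion(n_object, n_path):
    """Proportion of one participant's gestures coded as object-incorporating."""
    total = n_object + n_path
    if total == 0:
        raise ValueError("participant produced no coded gestures")
    return n_object / total


def arcsine_transform(p):
    """Arcsine (angular) transform of a proportion: 2 * asin(sqrt(p)).

    Assumed form; maps [0, 1] onto [0, pi] and stretches values near 0 and 1.
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("a proportion must lie between 0 and 1")
    return 2.0 * math.asin(math.sqrt(p))


if __name__ == "__main__":
    # Hypothetical (object-incorporating, path-only) counts per participant.
    counts = [(0, 12), (3, 9), (8, 4)]
    for n_object, n_path in counts:
        p = object_proportion(n_object, n_path)
        print(f"p = {p:.2f}, transformed = {arcsine_transform(p):.3f}")
```

The transform matters most for participants near floor or ceiling, such as children who produce almost no object-incorporating gestures, because raw proportions have compressed variance at the extremes.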
Figure 2. Mean proportion of gestures expressing object-incorporation in hand shape (Obj) or path-only (Path) as a function of age. (Error bars = standard error).
Adult Dutch speakers show a clear preference for incorporating object information in gestures that accompany placement descriptions. They produce gestures with hand shapes that incorporate objects in the gestural movement. Moreover, the occurrence of these object-incorporating gestures is not restricted to a specific verb, but they occur with both verbs across the board. These data replicate previous findings showing a robust adult Dutch gestural preference for object-incorporation with placement descriptions (Gullberg in press, submitted). The child data look strikingly different. The youngest children in particular almost exclusively produce gestures that express only path. In order to investigate whether there was a difference in the overall pattern of gesture usage across the three age groups, a Kruskal-Wallis test was run on the mean proportion of object-incorporating gestures
(Obj = 1) with age (3, 5, adults) as the between-subject factor. The groups differed significantly in the mean proportion of object-incorporating gestures (χ²(2, N = 22) = 11.46, p < 0.01). Specifically, 3-year-olds produced significantly fewer object-incorporating gestures (M = 1%, SD = 2%) than both 5-year-olds (M = 47%, SD = 27%; z = 2.44, p = 0.02) and adults (M = 62%, SD = 15%; z = 3.09, p < 0.001), who did not differ from each other (z = 1.52, p = 0.13). The youngest children clearly prefer to express only path in their gestures accompanying placement descriptions, and only very rarely do they express object information. 5-year-olds express considerably more object-incorporation, although their preferences do not numerically match those of adult speakers. Children thus gesture differently from adults. They do not appear to imitate the adult gestural input, nor to imitate the practical placement actions by enacting a placement event with a symbolised, imagined object (cf. Capirci et al. 2005).
4.2. Verb use
We next investigate whether children use the same verbs to describe the same scenes as adults, that is, whether they have the same extension patterns as adults or convey the same meaning with the verbs as adults do. We focus on verb use for the 16 target items, which systematically vary object orientation. We group the target scenes by orientation into two groups of 8 scenes each: horizontal and vertical placement. All verb responses, including inappropriate forms for a given orientation, went into the analysis. For each age group, the mean proportion of responses per verb type (leggen, zetten, and OTHER) was computed (cf. Narasimhan and Gullberg accepted).¹ Figure 3 summarises the mean proportion of verbs used to describe horizontal (Figure 3a) and vertical items (Figure 3b), respectively, as a function of age. For horizontal items the typical adult verb choice is leggen 'lay'. All age groups overwhelmingly used the verb leggen for items placed horizontally. The three-year-olds also used a sprinkling of OTHER verbs. One-way ANOVAs for each verb type with age group as the between-subject factor² revealed no difference between the groups in use of leggen 'lay' (F(2,19) = 1.66, p = 0.22), or zetten 'set/stand' (F < 1). However, the
1. This analysis is similar to the one performed in Narasimhan and Gullberg (accepted), but is performed here on a sub-set of those data, viz. only on speech data from those participants who also gesture.
2. Because an items analysis on as few items as 8 is difficult to interpret, no items analysis was performed.
Figure 3a. Mean use of leggen, zetten, and OTHER in Dutch for 8 horizontal target scenes across age groups (error bars = standard error).
Figure 3b. Mean use of leggen, zetten, and OTHER in Dutch for 8 vertical target scenes across age groups (error bars = standard error).
groups did differ in their use of OTHER (F(2,19) = 4.94, p = 0.02), with three-year-olds using significantly more OTHER verbs (M = 13%, SD = 21%) than five-year-olds (M = 2%, SD = 6%; Tukey HSD p = 0.04) and adults
(M = 1%, SD = 4%; Tukey HSD p = 0.02), who did not differ.³ Critically, the groups did not differ in the use of leggen. Dutch children thus use leggen for horizontal items as often as adults already by the age of three and a half. For vertical items, the standard verb choice should be the verb zetten 'set/stand'. This is the only verb used by adults, but children behave surprisingly differently. The youngest children use leggen 'lay' for more than half of the vertical items and only rarely use zetten. One-way ANOVAs for each verb type with age group as the between-subject factor revealed a difference between the age groups in the use of zetten 'set/stand' (F(2,19) = 11.02, p < 0.001), with three-year-olds using zetten significantly less (M = 21%, SD = 39%) than adults (M = 100%; Tukey HSD p < 0.001), and five-year-olds also using zetten significantly less than adults (M = 73%, SD = 33%; Tukey HSD p = 0.03). Despite the numerical difference, the child groups did not differ statistically from each other. In contrast, the two child groups did differ in their use of leggen 'lay' for vertical items (F(1,11) = 6.31, p = 0.03), with the three-year-olds using significantly more leggen (M = 61%, SD = 31%) than five-year-olds (M = 9%, SD = 15%). The child groups did not differ in their use of OTHER, however (F(1,11) = 3.19, p = 0.11). The Dutch three- and five-year-olds in this sample thus differ from adults in their under-use of zetten 'set/stand' for vertical items, and both child groups differ from adults in that they use leggen 'lay' to describe vertical scenes, three-year-olds significantly more so than five-year-olds. Some children thus use leggen across the board for all placement scenes.
4.3. Gesture use with zetten and leggen
We finally turn to the question whether gesture use differs between those children who use both leggen 'lay' and zetten 'set/stand' and those who over-extend leggen to all placement regardless of orientation of the object. Figure 4 summarises the mean proportion of gestures that express object-incorporation in the form of object-related hand shapes or path-only as a function of whether children over-extend leggen and chiefly use one verb (N = 6), or whether they use two verbs (N = 6) to describe the 16 target scenes. The adult data are included for ease of comparison. When gestures are considered in parallel with speech, a binomial distribution is found such that children who mainly use only one verb, leggen
3. We could not perform an omnibus (repeated measures) ANOVA on all verb types across age groups since not all groups used all verbs. The same argument holds for the analysis of vertical items.
Figure 4. Mean proportion of gestures expressing object incorporation in hand shape (Obj), or path-only (Path) as a function of whether children use one verb or two verbs to describe placement events. (Error bars = standard error).
'lay', to describe all 16 target events also predominantly produce gestures that express path-only. In contrast, children who use two verbs, both leggen 'lay' and zetten 'set/stand', to describe horizontal and vertical placement, respectively, also produce object-incorporating gestures, even if their proportions do not quite match those of adults. Again, a Kruskal-Wallis test (Obj = 1) revealed that the groups differed significantly in the mean proportion of object-incorporating gestures (χ²(2, N = 22) = 12.15, p < 0.01). Specifically, children using mainly one verb produced significantly fewer object-incorporating gestures (M = 8%, SD = 16%) than both children using two verbs (M = 46%, SD = 26%; z = 2.25, p = 0.03) and adults (M = 62%, SD = 15%; z = 3.28, p < 0.001), who did not differ from each other (z = 1.47, p = 0.15). Finally, a correlation analysis was run on the mean proportion of object-incorporating gestures and the mean proportion of leggen 'lay' used for vertical scenes across the age groups. The analysis revealed that with decreasing use of leggen for vertical placement, the proportion of gestures expressing object-incorporation increases significantly (r(20) = −0.7475, t(20) = −5.03, p < 0.001). This indicates that as children cease to label vertical scenes with leggen 'lay' and shift to using zetten 'set/stand', they are also more likely to produce object-incorporating gestures.
5. General discussion
This study examined how Dutch three- and five-year-olds use gestures in parallel with caused posture placement verbs to describe object placement events compared to Dutch adults. There are two main findings. First,
children in these age groups gesture differently from adults. Dutch adults show a robust preference for expressing objects and direction or path of movement simultaneously in placement gestures, a result replicating previous findings (Gullberg in press; submitted). In contrast, three-year-olds show a strong bias towards gestures that express only the path of the (caused) movement, and although five-year-olds are more likely to produce object-incorporating gestures, they are numerically still not adult-like in their preferences. Second, children who use the placement verbs in non-adult-like ways in speech also gesture in non-adult-like ways. That is, children who overextend leggen 'lay' to all placement scenes express only the path of the movement in gesture. In contrast, children who use both leggen 'lay' and zetten 'set/stand' differentially for horizontal and vertical placement respectively also incorporate objects in gestures like adults. They are adult-like in both speech and gesture, targeting information about objects and movement in both modalities.
What can these findings tell us about the semantic distinctions that underlie children's extension patterns in the semantic domain of object placement? The gesture data suggest that children who use leggen 'lay' to label all placement events are only targeting one semantic component, namely the movement or motion component of the caused motion verbs, as seen in their gestures expressing only path. These children do not seem to care about the object being moved and its properties. Recall that in the adult system some attention to the object is necessary for the choice of a specific placement verb for a given scene, which is arguably what prompts adults to also gesture about objects. There is no evidence in the children's gestures that the objects matter at this stage. Consequently, it is as if the verb leggen has an over-general meaning for children, 'cause to move' or 'put'.
Similarly, the object-incorporating gestures produced by children who use both leggen and zetten differentially suggest that they have tuned their attention to encompass the object. Specifically, the gestural incorporation of the object suggests that these children have included the objects in their semantic representations of the caused posture verbs. We therefore argue that speech and gesture together indicate a developmental transition from a system with a single semantic component based on (caused) movement-only, reflected in the use of one single verb (leggen), to an (adult-like) focus on (caused) movement-and-object mirrored in adult-like use of two verbs (leggen and zetten).
Notice that the crucial issue is not whether children notice the objects more generally, but whether the object is included in the representations of the transitive caused posture verbs. Even children who use leggen for all transitive placement descriptions occasionally talk and gesture about
the objects. However, the objects are not included in the compact transitive caused posture verb descriptions, but appear elsewhere. In (4) a child comments on the object outside the transitive description for a scene where a monkey is placed in a standing position:
(4) nou die [staat] hij legt hem op de tafel
    'well it [stands] he lays him on the table' (DuCh18, aged 4;5)
The child first describes the end state of the object using a correct intransitive posture verb, staan 'stand'. She immediately follows this by a description of the placement event itself using leggen 'lay'. Interestingly, the gesture accompanying the intransitive verb staan, indicated in square brackets, still expresses path-only, and more specifically, movement towards the ground. There is no hand shape indicating an object-incorporation. The example nevertheless highlights (a) that the child is not confused about the object's orientation (vertical, standing), and (b) that she attempts to express both information about the object in its end state in staan and the caused motion in leggen. This may be a precursor stage to bringing the two elements together in one single adult-like representation.
One might wonder why not all children gesture about objects. Given the nature of placement events, children could have been expected to imitate the practical action as perceived and enact the placing of an object with a symbolised, imagined object in hand (cf. Capirci et al. 2005; McNeill 2005). They might also have been expected to imitate the gestural input provided by Dutch adults as they talk and gesture about placement. Finally, children could even have been expected to gesture about objects on theoretical grounds, given the documented tendency for speakers to make more fine-grained distinctions at the goal of a path of motion (e.g., Lakuta and Landau 2005; Regier and Zheng 2007). The object in its end configuration is arguably a goal-related part of the caused motion. The fact that children do not, and that there is a developmental trend from gesturing about movement-only towards adult-like gesturing about movement-and-object simultaneously, suggests that children's gestures in this domain are influenced by their linguistic activities, and, more specifically, by the semantic distinctions they operate with, even at young ages.
The converse question is why younger children so overwhelmingly target path or the direction of movement alone in their placement gestures. One possibility is that this is a reflection of communicative development, as suggested by Clark and Grossman (1998). The youngest children may interpret the communicative goal of the situation differently from older children, and focus on direction or path towards the goal ground. However, this does not explain the strong co-occurrence of such gestures with
children's overextension patterns with the verb leggen 'lay'. A compensatory account of gestures might suggest that children gesture about path because they do not talk about it.⁴ Although they use caused motion verbs, it could be argued that the path element is not explicit in these verbs. However, the data contain examples of children using the frequent path-prefixed forms of the Dutch caused posture verbs, such as neerleggen 'down.lay' and inleggen 'in.lay'. There is no difference between gestures accompanying such explicitly path-prefixed verbs and the bare caused posture verbs. It therefore does not seem to be a matter of path compensation in gesture. A third possibility is that the difference in gesturing has nothing to do with the semantics of Dutch specifically. Instead, the path component may be a more universally basic motion element, as suggested by Talmy (1985), which all children therefore target initially. More crosslinguistic data is needed in the placement domain to address that issue. An additional option is that children do not target path information per se but rather that the path gestures reflect a more basic focus on change.⁵ Kamp (1980) has argued that change is the fundamental basis for any event structure. In the study at hand, either a focus on change in general or a more specific focus on path would yield path gestures. It remains an issue for future research to disentangle these options.
Finally, there is no evidence in the data that gestures foreshadow speech such that children who use leggen 'lay' for both vertical and horizontal placement produce object-incorporating gestures, using gesture to indicate an interest in objects that they are not yet able to express in speech. These results seem to run counter to findings in the literature on cognitive development where children are found to express aspects of mathematical reasoning in gesture not yet accessible to them in speech (e.g., Alibali and Goldin-Meadow 1993; Goldin-Meadow 2003; Pine et al. 2004).
The absence of such mismatches in the current data does not invalidate the basic observation that gestures reflect children's current knowledge of placement verb semantics. First, their absence can simply be a sampling accident. Given the small number of children in this study, we cannot exclude the possibility that some children may gesture about objects while still over-extending leggen to all placement events. However, an alternative explanation is that children engaged in reasoning tasks have more room to express alternative, additional, or different meanings in gesture than do children who talk and gesture about as mundane and
4. This was suggested by an anonymous reviewer.
5. We thank an anonymous reviewer for this suggestion.
simple things as putting toys on tables. The nature of the task and the age range examined here may both contribute to the patterns observed. Moreover, support for non-compensatory gesture production in language development comes from other developmental studies in the domain of voluntary motion. Turkish three-year-olds have been shown to differ from Turkish adults in gesture and speech, with the children's gestures critically matching their own speech (Furman et al. 2006). Similarly, French children aged four and six talking about path and manner overwhelmingly gesture and talk about the same elements (Gullberg et al. 2008). This is despite the fact that the complex constructions for expressing manner in French might have led children wanting to express all aspects of motion to talk about path and instead gesture about manner. However, even the youngest children look adult-like and gesture about path when talking about path, and gesture about manner only when speaking about manner. Similar consistency is reported for bilingual French-English children (Nicoladis and Brisard 2002). The view of children's gestures as mainly compensatory at these ages thus receives little support here.
In conclusion, this study suggests that Dutch children's knowledge of placement verb semantics moves from a focus on (caused) movement-only to a focus on (caused) movement-and-object in conjunction. Part of children's difficulties with the Dutch caused posture placement verbs seems to be related to understanding the role of the object as a necessary semantic component in these monomorphemic, portmanteau verbs that conflate cause, motion, and properties of the object in one form (cf. Narasimhan and Gullberg accepted). The transition from a system based on one single semantic distinction to a system with an adult-like focus on movement-and-object is not necessarily complete by age five, but seems to continue to develop in later childhood.
How and exactly when this transition takes place is an empirical question. More generally, the study highlights the value of studying children's gestures as a means of examining semantic representations in development. Gestures, as vehicles of language-specific meaning, provide a window on the details of what semantic elements underpin non-adult-like use of forms in inappropriate contexts, as well as what elements undergo change in switches towards more adult-like speech. Gestures allow us to go beyond error analysis of speech, stating merely that children's verb meanings differ from those of adults, and allow us to explore how they differ.
Received 1 March 2009
Revised received 25 November 2009
Max Planck Institute for Psycholinguistics / University of Colorado
References
Alibali, Martha W., and Susan Goldin-Meadow. 1993. Gesture-speech mismatch and mechanisms of learning: What the hands reveal about a childs state of mind. Cognitive Psychology 25(4). 468523. Allen, Shanley, Asli Ozyurek, Amanda Brown, Reyhan Furman and Tomoko Ishizuka. 2003. Early speech about manner and path in Turkish and English: Universal or language-specic? In Barbara Beachley, Amanda Brown and Frances Conlin (Eds.), Proceedings of BUCLD 27, 6372. Somerville: Cascadilla Press. Ameka, Felix K., and Stephen C. Levinson (eds.). 2007. Locative predicates. [Special Issue]. Linguistics 45 (5/6). Anglin, Jeremy. 1970. The growth of word meaning. Cambridge, MA: MIT Press. Bates, Elizabeth and Fred Dick. 2002. Language, gesture, and the developing brain. Developmental psychobiology 40(3). 293310. Bowerman, Melissa. 1978. Systematizing semantic knowledge: Changes over time in the childs organization of word meaning. Child Development 49(4). 997987. Bowerman, Melissa, Penelope Brown, Sonia Eisenbeiss, Bhuvana Narasimhan, and Dan I. Slobin. 2002. Putting things in places. Developmental consequences of linguistic typology. In Eve V. Clark (ed.), Space in Language. Location, Motion, Path, and Manner, S1S122. Stanford: CLS. Bowerman, Melissa, Marianne Gullberg, Asifa Majid and Bhuvana Narasimhan. 2004. Put project: The cross-linguistic encoding of placement events. In Asifa Majid (ed.), Field Manual (Vol. 9, pp. 1018). Nijmegen: Max Planck Institute for Psycholinguistics. Brown, Penelope. 1998. Childrens rst verbs in Tzeltal: Evidence for an early verb category. Linguistics 36(4). 715753. Capirci, Olga, Annarita Contaldo, Maria C. Caselli and Virginia Volterra. 2005. From action to language through gesture: A longitudinal perspective. Gesture 5(1/2). 155177. Capirci, Olga, Jana M. Iverson, Elena Pizzuto and Virginia Volterra. 1996. Gestures and words during the transition to two-word speech. Journal of Child Language 23(3). 645 675. Capirci, Olga, and Virginia Volterra. 
2008. Gesture and speech. The emergence and development of a strong and changing partnership. Gesture 8(1). 2244. Capone, N. C. 2007. Tapping toddlers evolving semantic representation via gesture. Journal of Speech and Hearing Research 50. 732745. Chenu, Florence and Harriet Jisa. 2006. Caused motion constructions and semantic generality in early acquisition of French. In Eve V. Clark and Barb F. Kelly (Eds.), Constructions in acquisition, 233261. Stanford: CLS. Choi, Soonja, and Melissa Bowerman. 1991. Learning to express motion events in English and Korean: The inuence of language-specic lexicalization patterns. Cognition 41(13). 83 121. Church, R. Breckinridge, and Susan Goldin-Meadow. 1986. The mismatch between gesture and speech as an index of transitional knowledge. Cognition 23(1). 4371. Clark, Eve V. 1978. Discovering what words can do. In Farkas, Donka, Wesley Jacobsen and Karol Todrys (eds.), Papers from the parasession on the lexicon, 3457. Chicago: CLS. Clark, Eve V., and James B. Grossman. 1998. Pragmatic directions and childrens word learning. Journal of Child Language 25(1). 118. David, Caroline. 2003. Les verbs of putting: Typologie, schema syntaxique et organisation semantique des constructions prepositionnelles en anglais contemporain. Poitiers, Universite de Poitiers, PhD thesis.
De Ruiter, Jan-Peter. 2007. Postcards from the mind: The relationship between speech, gesture and thought. Gesture 7(1). 2138. Duncan, Susan. 1996. Grammatical form and thinking-for-speaking in Mandarin Chinese and English: An analysis based on speech-accompanying gesture. Chicago, University of Chicago, PhD dissertation. Duncan, Susan. 2005. Co-expressivity of speech and gesture: Manner of motion in Spanish, English, and Chinese. In Charles Chang et al. (eds.), Proceedings of the 27th annual meeting of the Berkeley Linguistic Society, 353370. Berkeley: BLS. Fisher, Cynthia, D. G. Hall, S. Rakowitz and Lila Gleitman. 1994. When it is better to receive than to give: Syntactic and conceptual constraints on vocabulary growth. Lingua 92. 333375. Furman, Reyhan, Asli Ozyurek and Shanley Allen. 2006. Learning to express causal events across languages: What do speech and gesture patterns reveal? In David Bamman, Tatiana Magnitskaia and Colleen Zaller (eds.), Proceedings of the 30th BUCLD, 190201. Somerville: Cascadilla Press. Gentner, Dedre. 1978. On relational meaning: The acquisition of verb meaning. Child Development 49(4). 988998. Goldberg, Adele. 1995. Constructions: A construction grammar approach to argument structure. Chicago: University of Chicago Press. Goldin-Meadow, Susan. 2003. Hearing gesture: How our hands help us think. Cambridge, MA: The Belknap Press. Goldin-Meadow, Susan. 2007. Pointing sets the stage for learning languageand creating language. Child Development 78(3). 741745. Gropen, J., Steve Pinker, M. Hollander and R. Goldberg. 1991. Syntax and semantics in the acquisition of locative verbs. Journal of Child Language 18(1). 115151. ` Guidetti, Michele. 2005. Yes or no? How young French children combine gestures and speech to agree and refuse. Journal of Child Language 32(4). 911924. Gullberg, Marianne. 1998. Gesture as a communication strategy in second language discourse. A study of learners of French and Swedish. Lund: Lund University Press. 
Gullberg, Marianne. in press. Language-specic encoding of placement events in gestures. In Eric Pederson and Ju rgen Bohnemeyer (eds.), Event representations in language and cognition. Cambridge: Cambridge University Press. Gullberg, Marianne. Submitted. Linguistic representations inuence speech-associated gestures in the domain of placement. Gullberg, Marianne, Henriette Hendriks, and Maya Hickmann. 2008. Learning to talk and gesture about motion in French. First Language 28(2). 200236. Hansson, Kristina and Barbro Bruce. 2002. Verbs of placement in Swedish children with SLI. International Journal of Communication Disorders 37(4). 401414. Hickmann, Maya and Henriette Hendriks. 2006. Static and dynamic location in French and English. First Language 26(1). 103135. Howell, David C. 2002. Statistical Methods for Psychology (5th ed.). Pacic Grove, CA: Duxbury. Kamp, Hans. 1980. Some remarks on the logic of change, part 1. In Christian Rohrer (ed.). Time, tense, and quantiers, 135180. Tuebingen: Niemeyer. Kendon, Adam. 1972. Some relationships between body motion and speech: An analysis of an example. In Aron W. Siegmann and Benjamin Pope (eds.), Studies in dyadic communication, 177210. New York: Pergamon. Kendon, Adam. 1980. Gesticulation and speech: Two aspects of the process of utterance. In Key, Mary R. (Ed.), The relationship of verbal and nonverbal communication, 207227. The Hague: Mouton.
Kendon, Adam. 2004. Gesture. Visible action as utterance. Cambridge: Cambridge University Press. Kita, Sotaro and Asli Ozyurek. 2003. What does cross-linguistic variation in semantic coordination of speech and gesture reveal?: Evidence for an interface representation of spatial thinking and speaking. Journal of Memory and Language 48(1). 1632. Kita, Sotaro, Ingeborg Van Gijn and Harry van der Hulst. 1998. Movement phases in signs and co-speech gestures, and their transcription by human coders. In Ipke Wachsmuth and Martin Frohlich (eds.), Gesture and Sign Language in human-computer interaction, 2335. Berlin: Springer. Kopecka, Annette and Bhuvana Narasimhan (eds.). To appear. Events of putting and taking: A crosslinguistic perspective. Amsterdam: Benjamins. Lakuta, Linda and Barbara Landau. 2005. Starting at the end: The importance of goals in spatial language. Cognition 96(1). 133. Lemmens, Maarten. 2002. Tracing referent location in oral picture descriptions. In Andrew Wilson, Paul Rayson and Tony McEnery (eds.), A rainbow of corpora. Corpus linguistics and the languages of the world, 7385. Munchen: Lincom-Europa. Lemmens, Maarten. 2006. Caused posture: experiential patterns emerging from corpus research. In Anatol Stefanowitsch and Stefan Gries (eds.), Corpora in cognitive linguistics. Corpus-based approaches to syntax and lexis, 263298. Berlin: Mouton. Levinson, Stephen C. and David Wilkins (eds.). 2006. Grammars of space. Explorations in cognitive diversity. Cambridge: Cambridge University Press. McNeill, David. 1992. Hand and mind. What the hands reveal about thought. Chicago: University of Chicago Press. McNeill, David. 2005. Gesture and thought. Chicago: University of Chicago Press. McNeill, David and Susan Duncan. 2000. Growth points in thinking-for-speaking. In McNeill, David (Ed.), Language and gesture, 141161. Cambridge: Cambridge University Press. McNeill, David, Elena Levy and Laura Pedelty. 1990. Speech and gesture. In Hammond, Georey R. 
(Ed.), Cerebral control of speech and limb movements, 203256. Amsterdam: North Holland. Naigles, Letitia R., and Erika Ho-Ginsberg. 1998. Why are some verbs learned before other verbs? Eects of input frequency and structure on childrens early verb use. Journal of Child Language 25(1). 95120. Narasimhan, Bhuvana and Marianne Gullberg. 2006. Perspective-shifts in event descriptions in Tamil child language. Journal of Child Language 33(1). 99124. Narasimhan, Bhuvana and Marianne Gullberg. Accepted. The role of input frequency and semantic transparency in the acquisition of verb meaning: Evidence from placement verbs in Tamil and Dutch. Journal of Child Language. Nicoladis, Elena and Frank Brisard. 2002. Encoding motion in gestures and speech: Are there dierences in bilingual childrens French and English? In Clark, Eve V. (Ed.), Space in Language. Location, Motion, Path, and Manner, 6068. Stanford: CLS. Nicoladis, Elena, Rachel I. Mayberry and Fred Genesee. 1999. Gesture and early bilingual development. Developmental Psychology 35(2). 514526. Ozcaliskan, Seyda and Susan Goldin-Meadow. 2005. Gesture is at the cutting edge of early language development. Cognition 96(3). B101B113. Ozyurek, Asli, Sotaro Kita, Shanley Allen, Reyhan Furman and Amanda Brown. 2005. How does linguistic framing of events inuence co-speech gestures? Insights from crosslinguistic variations and similarities. Gesture 5(1/2). 219240. Ozyurek, Asli, and Seyda Ozcaliskan. 2000. How do children learn to conate manner and path in their speech and gestures? In E. V. Clark (Ed.), Proceedings of the 30th Stanford Child Language Research Forum, 7785. Stanford: CLS.
Pauwels, Paul. 2000. Put, set, lay and place: A cognitive linguistic approach to verbal meaning. Munchen: Lincom Europa. Pine, Karen J., Nicola Lufkin and David Messer. 2004. More gestures than answers: Children learning about balance. Developmental Psychology 40(6). 10591067. Pinker, Steven. 1989. Learnability and cognition: The acquisition of argument structure. Cambridge, MA: MIT Press. Plumert, J., M., K. Ewert and Spear, S. J. 1995. The early development of childrens communication about nested spatial relations. Child Development 66(4). 959969. Pye, Clifton, Diane F. Loeb and Yin-Yin Pao. 1996. The acquisition of breaking and cutting. In Clark, Eve V. (Ed.), Proceedings of the 27th Annual Child Language Research Forum, 227236. Stanford: CLS. Regier, Terry and Mingyu Zheng. 2007. Attention to endpoints: A cross-linguistic constraint on spatial meaning. Cognitive Science 31(4). 705719. Seyfeddinipur, Mandana. 2006. Disuency: Interrupting speech and gesture. Nijmegen, Radboud University, PhD Dissertation. Slobin, Dan I., Melissa Bowerman, Penelope Brown, Sonia Eisenbeiss and Bhuvana Narasimhan. In press. Putting things in places: Developmental consequences of linguistic typology. In Eric Pederson and Jurgen Bohnemeyer (eds.), Event representations in language and cognition. Cambridge: Cambridge University Press. Stefanini, Silvia, Martina Recchia and Maria Cristina Caselli. 2008. The relation between spontaneous gesture production and spoken lexical ability in children with Down syndrome in a naming task. Gesture 8(2). 197218. Talmy, Leonard. 1985. Lexicalization patterns: Semantic structure in lexical forms. In Timothy Shopen (Ed.), Language typology and syntactic description, 57149. Cambridge: Cambridge University Press. Thomson, Jean, and Robin Chapman. 1977. Who is daddy revisited: The status of 2-yearolds over-extended words in use and comprehension. Journal of Child Language 4(3). 359375. Tomasello, Michael, Malinda Carpenter and Ulf Liszkowski. 2007. 
A new look at infant pointing. Child Development 78(3). 705722. van Staden, Miriam, Melissa Bowerman and Mariet Verhelst. 2006. Some properties of spatial description in Dutch. In Levinson, Stephen C. and David Wilkins (Eds.), Grammars of space. Explorations in cognitive diversity, 475511. Cambridge: Cambridge University Press. Volterra, Virginia, Maria Christina Caselli, Olga Capirci and Elena Pizzuto. 2005. Gesture and the emergence and development of language. In Michael Tomasello and Dan I. Slobin (eds.), Beyond nature-nurture: Essays in honor of Elizabeth Bates, 340. Mahwah, NJ: Erlbaum. Yule, George. 1997. Referential communication tasks. Hillsdale, N.J.: Erlbaum.
Appendix A:
Materials (target items in bold)
Warmup item 1
Warmup item 2
Warmup item 3
Agent_put_bear_lying
Agent_put_flashlight_lying
Agent_put_book_lying
Agent_put_doll_standing
Agent_put_paper_envelope
Agent_squeeze_wet_cloth
Agent_put_book_standing
Agent_put_can_lying
Agent_put_flashlight_standing
Agent_put_monkey_lying
Agent_put_can_standing
Agent_spin_disc
Agent_put_picframe_standing
Agent_put_bear_standing
Agent_drop_can_accidentally
Agent_put_doll_lying
Agent_drop_pencils_table
Agent_put_mouse_vase
Agent_drop_book_lying
Agent_drop_can_lying
Agent_put_napkin_floor
Agent_drop_doll_lying
Agent_put_cookiebatter_tray_spoon
Agent_flick_coin
Agent_put_piece_puzzle
Agent_put_dog_standing
Agent_put_rice_table
Agent_put_picframe_lying
Agent_put_pillowcase_pillow
Agent_put_arm_frame
Agent_put_monkey_standing
Agent_put_ring_pole
Agent_put_dog_lying
Agent_put_tomato_bag
Agent_drop_matchsticks_table
Agent_drop_monkey_lying
Changes in encoding of PATH of motion in a first language during acquisition of a second language
AMANDA BROWN and MARIANNE GULLBERG*
Abstract Languages vary typologically in their lexicalization of PATH of motion (Talmy 1991). Furthermore, lexicalization patterns are argued to affect syntactic packaging at the level of the clause (e.g., Slobin 1996b) and tend to transfer from a first (L1) to a second language (L2) in second language acquisition (e.g., Cadierno and Ruiz 2006). Crosslinguistic and developmental evidence suggests, then, that typological preferences for PATH expression are highly robust features of a first language. The current study examines the robustness of preferences for PATH encoding by investigating (1) whether Japanese follows patterns identified for other verb-framed languages like Spanish, and (2) whether patterns established in an L1 can change after acquisition of an L2. L1 performance of native speakers of Japanese with intermediate-level knowledge of English was compared to that of monolingual speakers of Japanese and English. Results showed that monolingual Japanese speakers followed basic lexicalization patterns typical of other verb-framed languages, but with different realizations of PATH packaging within the clause. Moreover, native Japanese speakers with knowledge of English displayed mixed patterns for lexicalization and expressed significantly more PATH information per clause than either group of monolinguals. Implications for typology and second language acquisition are discussed.
Keywords: motion events, PATH, Japanese, English, second language acquisition, crosslinguistic influence, attrition.
* Address for correspondence: A. Brown, Syracuse University, Department of Languages, Literatures and Linguistics, Office 323C, 340 H.B. Crouse Hall, Syracuse, N.Y. 13244-1160, U.S.A. Email: abrown08@syr.edu
M. Gullberg, Centre for Languages and Literature, PO Box 201, 221 00 Lund, Sweden. Email: marianne.gullberg@ling.lu.se
Cognitive Linguistics 21–2 (2010), 263–286 DOI 10.1515/COGL.2010.010 0936–5907/10/0021–0263 © Walter de Gruyter
1. Introduction
In human understanding of motion, the notions of Source (point of origin), path (trajectory), and Goal (destination) are core (Johnson 1987). All languages encode such concepts, and the ways in which these elements are mapped onto lexical items pattern remarkably systematically across languages (Talmy 1991). Typological preferences particularly for lexicalization of path appear so robust that they affect syntactic packaging at the level of the clause (Slobin 1996b, 1997) and tend to transfer from a first language (L1) to a second (L2) in second language acquisition (e.g., Cadierno 2004; Cadierno and Ruiz 2006; Navarro and Nicoladis 2005; Negueruela et al. 2004; Stam 2006). The current study examines the robustness of preferences for path encoding by investigating whether Japanese follows patterns identified for other verb-framed languages such as Spanish, and whether patterns established in an L1 can change after acquisition of an L2. Distinctive patterns in this crosslinguistic and developmental data would underscore the importance of taking individual language experiences into account in characterizations of languages on the basis of usage data, and would have further implications for our understanding of the relationship between languages in the multilingual mind.
2. Background
In influential work, Talmy (1991) has suggested that languages can be divided into two typological groups depending on how path of motion is lexicalized: in the verb (verb-framed) or outside the verb (satellite-framed). To illustrate, examples are given below for Japanese (verb-framed) and English (satellite-framed), with path expressions underlined.
(1) Tama-ga saka-o kudaru
    Ball-Nom hill-Acc descend¹
    'The ball descends the slope'
(2) The ball rolls down the hill
In (1), a prototypical example from Japanese, path is lexicalized in the verb kudaru 'descend'. In (2), a corresponding prototypical example from English, path is lexicalized in the so-called satellite (verb particle) down. Refinements of the typology (e.g., Slobin 2004b) notwithstanding, support for the prevalence of basic typological distinctions in lexicalization of path has been found in many empirical studies on different languages (e.g., Gennari et al. 2002; Naigles et al. 1998; Slobin 1996b).
1. Abbreviations used in examples are Nom = Nominative Case, Acc = Accusative Case, Gen = Genitive Case, Top = Topic Marker, Con = Connector.
Talmy's typology (1985, 1991, 2000) reflects characteristic preferences in a language, but there are often several options for path lexicalization in both satellite-framed and verb-framed languages. In addition to the preponderance of satellites, English, for example, possesses several path verbs such as descend, ascend, etc., although, as Talmy observed, most of these are borrowings from Latin, representing a more formal register, which is not characteristic of English. Japanese, however, has a number of rather more frequent options for path expression besides simple main verbs. Example (3) illustrates several of these.
(3) Tama-ga toi-kara detekite bouringu-jyou-made
    ball-Nom pipe-from exit.come.Con bowling-alley-to
    haitte itte
    enter.Con go.Con
    Lit: 'The ball comes exiting the pipe, and goes entering the bowling alley'
Example (3) displays three different kinds of possibilities for path expression in Japanese other than simple main verbs: postpositions, e.g., made 'until/to', kara 'from'; complex motion predicates, e.g., haitte itte 'go entering', consisting of hairu 'enter' and the deictic verb iku 'go'; and compound verbs, e.g., detekite 'come out', a combination of deru 'exit' and kuru 'come'. Such possibilities are not necessarily unique to Japanese. Spanish, for example, employs directional adpositions, which can be stacked within the clause,² as well as complex motion predicates.³ Compound verbs are also seen in other verb-framed languages such as Korean (Slobin 2004b).
2. Use of directional adpositional phrases in combination with verbs of manner of motion in verb-framed languages is argued to be restricted such that they cannot be used for telic events (Aske 1989) or events involving state changing boundary crossing (Slobin and Hoiting 1994). To some extent, Japanese may be similarly constrained, which may explain the ungrammaticality of *John-ga gakkoo-ni/e hashitta/aruita 'John walked/ran to school' (Tsujimura 1994, cited in Inagaki 2002: 119), although see Inagaki (2002: 191, footnote 11) for comments on variations in native speaker judgments of sentences such as these. However, John-ga gakkoo-made hashitta/aruita 'John walked/ran to school' (Inagaki 2002: 191) is commonly accepted, which may reflect semantic differences concealed in translation equivalents.
3. In Japanese, Matsumoto (1991; 1996) claims that such complex motion predicates are mono-clausal and contain a motion verb, either manner or path, with a connective -te suffix followed by a main tensed verb. He restricts the verbs that can appear in tensed/final positions in such constructions to deictic motion verbs, e.g., iku, 'go'; kuru, 'come'; irassharu, 'go'; kaeru, 'return'.
A. Brown and M. Gullberg
Basic differences in lexicalization patterns have been argued to have consequences at the level of the clause. In a corpus of literary translations, Slobin (1996b) illustrates possibilities in English and Spanish. He found that English texts tended to encode more information about path than Spanish texts through numerous mentions of Ground within individual clauses describing motion.4
(4) I went into the hall and through to the dining room.
    Entré en el hall y pasé al comedor.
    'I entered the hall and passed to the dining room'
    (Du Maurier 1938: 243, cited in Slobin 1996b: 216)
In the English sentence above, there are two Ground elements associated with a single path verb (went into the hall / through to the dining room). In Spanish, on the other hand, comparable information is spread across two clauses, each associated with different path verbs (entré en el hall; pasé al comedor). Slobin hypothesized that, as English generally locates path outside the verb root, many more path elements (that is, path particles and Ground elements expressing trajectory information such as into the hall) can be concatenated within a clause, thereby yielding a more extended path description. For Spanish speakers to do the same, each path expression would require a separate verb clause. And indeed, the analysis of novels revealed that English-speaking writers on average mentioned 2.24 Ground elements in each description of a motion event, in contrast with the 1.52 elements mentioned by Spanish-speaking writers. Thus, although they employed fewer clauses, writers of English ultimately added more path detail to their motion event descriptions than their Spanish-speaking counterparts. These observations lie at the heart of the concept of 'thinking for speaking' (Slobin 1996a), that is, the idea that speakers typically attend to the aspects of an event that their language has the readily available linguistic means to express, and that over time, this habitual attention leads to certain rhetorical styles. Thinking for speaking, then, would predict generally compact expression of complex trajectories in English.

The existence of crosslinguistic differences in lexicalization and encoding of path has also prompted the question of what happens when individuals acquire knowledge of a competing system, for example, in the case of second language learning. Studies of both intermediate and advanced L2 speakers have found traces of properties from the L1 in L2 production, generally known as transfer from the L1. In the domain of path expressions, examples of transfer include non-target-like use of path verbs, redundant use of path satellites, and acceptance of ungrammatical combinations of manner and path constructions (e.g., Cadierno 2004; Cadierno and Ruiz 2006; Inagaki 2001; Navarro and Nicoladis 2005; Negueruela et al. 2004; Stam 2006). Difficulties with such seemingly simple lexical items as up, down, enter, exit in English, even at high levels of L2 proficiency, are rather striking. Although this kind of data overwhelmingly suggests that typologically determined preferences for expression of path in the L1 are resistant to change, there is a small body of evidence indicating that patterns may shift in an L1 under the influence of the presence of an L2, even during L2 acquisition and in L2 speakers who are not functional bilinguals. To date, studies have focused on manner of motion in speech and gesture (Brown and Gullberg 2008) and gesture perspective in the expression of motion (Brown 2008), but little is known about whether an L2 influence on the L1 can also be found in the expression of path of motion.

In sum, given the variety of available morphosyntactic resources in Japanese outlined above, we may question whether Japanese really patterns like other verb-framed languages such as Spanish in terms of preference for expressing one path constituent per clause as opposed to concatenating several such expressions within the clause. Moreover, since expression of path is moderated by preference rather than governed by grammar in both English and Japanese, there is potential for effects of one language on another in the context of second language acquisition.

4. Observations about depiction of Ground and its relationship to path here should be distinguished from other observations in the literature regarding descriptions of Ground in the process of 'scene setting', i.e., descriptions of the context in which the motion took place prior to descriptions of the motion itself, which allow information about path to be inferred (cf. Slobin 1996b).
While effects of the L1 on the L2 have been found in L2 production in this domain, no study has examined concurrent effects of an L2 on the L1 (although see Hohenstein, Eisenberg and Naigles 2006 and Tatsumi 1997 for a discussion of bidirectional crosslinguistic influence in bilingualism in the domain of motion), especially at modest levels of proficiency in the L2.

3. This Study
The aim of this study is twofold. The first goal is to examine the extent to which Japanese conforms to the typical verb-framed pattern in language usage. If it does, monolingual speakers of Japanese should lexicalize path primarily in simple, main verbs, which diminish the possibility of stacking expressions within the clause. On the other hand, if speakers make use
of the full range of morphosyntactic resources available in Japanese, e.g., postpositions, compound verbs, and complex motion predicates, they may actually encode more information about path than speakers of other verb-framed languages, e.g., Spanish, through concatenation of expressions. The second aim is to test the robustness of typological preferences for expression of path by investigating whether acquisition of an L2 can influence patterns established in an L1. Since Japanese and English differ typologically in this domain, we observe native speakers of Japanese with knowledge of English as an L2 and compare performance in their native L1 to that of monolingual speakers of each language. If influence of an L2 on an L1 exists and is a normal part of L2 acquisition and not L1 loss, these non-monolingual Japanese speakers are predicted to display properties of English in fully grammatical production in Japanese, for example in lexicalization and concatenation of path.
4. Methodology

4.1. Participants
A total of fifty-seven adults aged between 18 and 48 participated in this study, distributed across four groups: monolingual Japanese speakers resident in Japan (16 speakers), monolingual English speakers resident in the USA (13 speakers), and native Japanese speakers with knowledge of English resident in Japan (15 speakers) or the USA (13 speakers). Biographical information and information on general language usage was gathered using a detailed questionnaire developed by the Multilingualism Project at the Max Planck Institute for Psycholinguistics (Gullberg and Indefrey 2003). The monolingual speakers of each language had had minimal exposure to an L2, were not engaged in active study of an L2, and did not use an L2 in their everyday lives; therefore, they were considered functionally monolingual. Further, all native Japanese speakers with knowledge of English were engaged in active use of their L2. Crucially, the L2 speakers in Japan had never lived in an English-speaking country, while those in the USA had been residents for between one and two years. This contrast in residence was designed to control for possible effects of L1 loss. Changes in path expression seen only in the L1 of those in the USA might be explained by attrition of the L1 due to residence in the L2 community. However, similar L1 patterns in both groups would render an explanation based on L1 attrition less likely. Even though this study is only concerned with L1 production, learners' L2 knowledge was carefully measured to ensure uniform proficiency in
English. Participants first rated their own proficiency in speaking, listening, writing, reading, grammar, and pronunciation. They then completed the first grammar section of the Oxford Placement Test (Allan 1992). Third, their oral proficiency was evaluated by consensus judgment of two certified examiners using the University of Cambridge Local Examinations Syndicate (UCLES) oral testing criteria for the First Certificate in English (FCE).5 Both the Oxford Placement and the FCE criteria placed the native Japanese speakers with knowledge of English resident in Japan and the USA within intermediate range. The groups did not significantly differ in proficiency as measured by the Oxford Placement Test (t(25) = 0.795, p = 0.434), but marginally differed in proficiency as measured by the Cambridge FCE criteria (t(26) = 1.982, p = 0.058), with the learners resident in Japan scoring slightly higher than those resident in the USA. Participants' biographical and language usage data as well as English proficiency data are summarized in Table 1.
Table 1. Summary of biographical and language usage/proficiency data

Language background            Non-monolingual Japanese    Non-monolingual Japanese
                               (Japan) (n = 15)            (USA) (n = 13)
Mean AoE (a): English          11.9 (range 9–13)           12.8 (range 12–14)
Mean usage (b): English        3 hrs (range .5–8.5)        6 hrs (range 1–12)
Mean self-rating (c): English  2.97 (range 2–4.17)         3.27 (range 1.8–4.3)
Mean Oxford Score              78% (range 60–88%)          75% (range 58–85%)
Mean FCE (d) Score             4.27 / 5 (range 2–5)        3.69 / 5 (range 2.3–5)

(a) Age of exposure; (b) Hours of usage per day; (c) A composite score of individual skill scores; (d) Cambridge First Certificate in English
4.2. Stimuli
Short narrative descriptions were elicited based on the six-minute, animated Sylvester and Tweety Bird cartoon, Canary Row (Freleng 1950), used in several studies on expression of motion in speech and gesture (e.g., Kita and Özyürek 2003; McNeill 1992; Stam 2006; inter al.). The cartoon contains numerous motion events, centering around Sylvester's repeated but failed attempts to catch Tweety. In order to get maximal information from participants and increase the likelihood of mention of motion events, the entire cartoon was broken down and shown in
5. More information can be found at http://www.cambridgeesol.org.
manageable scenes following McNeill (1992). Two different sequences of scenes were systematically varied in the presentation of the stimulus. From the stimulus material, four motion events consistently described by participants were selected for coding and analysis, yielding four different paths: climb through, roll down, clamber up, swing across.

4.3. Procedure
All participants narrated in their L1. The native Japanese speakers with knowledge of English also produced narratives in their L2, but only the L1 data are reported here. Note, however, that the language order in which the second language speakers gave descriptions was counterbalanced across participants with a minimum of three days between appointments. This minimized the likelihood of both the L1 and L2 being fully active at the same time, therefore controlling for the effects of language mode (Grosjean 1998). Depending on the language of the experiment, participants were tested individually by either a native English- or native Japanese-speaking confederate. The participant and experimenter first engaged in a brief warm-up, consisting of small talk in the target language, in order to put participants in monolingual mode. Next, the experimenter told participants that they would be watching a series of animated scenes from a cartoon on a computer screen and should retell what they had seen to the experimenter in as much detail as they could remember. The experimenter was trained to appear fully engaged in the participants' narratives, but to avoid asking questions and crucially to avoid supplying the target path.

4.4. Speech segmentation and coding
Narrative descriptions were transcribed from digital video by a native speaker of the relevant language. Descriptions were divided into clauses, defined as "any unit that contains a unified predicate . . . (expressing) a single situation (activity, event, state)," following Berman and Slobin (1994: 660). Clauses sometimes contained more than one verb. Infinitives or participles functioning as complements of modal or aspectual verbs, for example, were not segmented separately, e.g., [He wants to go], and neither were predicates that were narrator comments, e.g., [I think he went]. In Japanese, clausal segmentation presented some challenges due to the status of the connector morpheme, -te, which can connect a whole series of verbs. Linguists ascribe various semantics to -te, which might affect the placement of clausal boundaries (see, for example, Hasegawa 1996; Kuno 1973; Nakatani 2003). Following Kuno (1973) and Nakatani
(2003), in this analysis -te was considered primarily a simple connector of temporal sequence. Thus, all such inflected verbs were segmented as individual clauses, with the exception of those occurring in mono-clausal complex motion predicates, defined by Matsumoto (1991; 1996) as consisting of a motion verb, -te suffix, and a deictic verb. Examples of clausal segmentation of individual narratives by an English speaker and a Japanese speaker respectively are shown in (5) and (6).

(5) 1[okay so Sylvester decides to crawl inside the drainpipe up to the windowsill]
    2[Tweety sees]
    3[him coming]
    4[and puts a bowling ball down the drainpipe]
    5[and it fits]
    6[and it meets Sylvester]
    7[who ends up with a the ball inside of his stomach]
    8[and he runs]
    9[and rolls down the hill with it into a bowling alley]
    10[when you hear a strike]

(6) 1[amamizu-no kou ochiru]
    rainwater-Gen like descend
    '(the thing) the rainwater goes down like this'
    2[toi-ga arundesukedo]
    pipe-Nom exist.but
    'there is a drainpipe'
    3[soko-kara naka-ni neko-ga haitte-itte]
    there-from inside-to cat-Nom enter.Con-go.Con
    'from there, the cat went inside and'
    4[sono hiyoko-no tokoro-made ikouto-shitandesukedo]
    that bird-Gen place-to try.to.go-did.but
    'and tried to reach the place where that chick is'
    5[hiyoko-wa booringu-no booru-o soko-no toi-ni ue-kara otoshite]
    bird-Top bowling-Gen ball-Acc there-Gen pipe-to up-from drop.Con
    'the chick drops the bowling ball on the drainpipe from the top and'
    6[ee sono neko-ga haitteiru]
    um that cat-Nom is.inside
    'where that cat is inside'
    7[naka-ni otoshitande]
    inside-to drop.Con
    '(the bird) dropped (it) inside of (the drainpipe)'
    8[kou nanka neko-ga sore-o sono booringu-no booru-wo nonde-shimatte]
    like like cat-Nom that-Acc that bowling-Gen ball-Acc drink.Con-finish.Con
    'something like the cat swallowed that bowling ball and'
    9[de saka-o kou kudaru-youni]
    and hill-Acc like descend-like
    'and like goes down the slope like this'
    10[kou ochite-itte]
    like fall.Con-go.Con
    '(the cat) is falling down like this'
    11[booringu jyou-ga choudo atta-node]
    bowling place-Nom precisely existed-so
    'and there was a bowling alley just there and so'
    12[soko-ni haitte-shimaimashita]
    there-to enter.Con-finished
    '(he) got in there'
Next, clauses describing the four target motion events were identified and coded using Elan, a digital video tagging software program developed at the Max Planck Institute for Psycholinguistics (Wittenburg et al. 2006). In example (5), clauses 1 and 3 relate to the climb through event and 8 and 9 relate to the roll down event. In example (6), clauses 3 and 4 relate to the climb through event, and 9, 10 and 12 relate to the roll down event. Clause 10 illustrates a complex motion predicate, in which two verbs are joined within a single clause, ochiru 'fall' and iku 'go'. A coding scheme was employed whereby all lexical elements encoding information about the trajectory followed by the Figure object were coded as path, including directional adpositional phrases indicating source and goal of motion and deictic verbs indicating motion. This coding scheme largely followed schemes outlined in previous studies on motion events (e.g., Jensen 2002; Kita and Özyürek 2003; Slobin 1996b, 1997, 2004b; Weingold 1992, 1995).6 In addition, the following language-specific guidelines were employed. Morphologically complex words in Japanese composed of a manner component and path component, e.g., tobi-komu 'fly-enter.in', or two path components, e.g., toori-nukete 'go through-come out', were divided, and each path component was coded separately since each part of the lexical compound contributes independently to the meaning of the construction. Complex motion predicates consisting of a progressive motion participle with a deictic motion verb were treated similarly. The Japanese verbs, hairu 'enter' and deru 'exit', were not coded as motion verbs at all unless they were combined with kuru 'come' or iku 'go' as auxiliaries or adpositional phrases such as ni 'to', following Kita's (1999) claim that these verbs in their bare forms express discrete changes of state without motion semantics (although see Tsujimura 2002, for an alternative analysis of Japanese enter and exit verbs). Furthermore, in Japanese, we excluded all spatial nouns, e.g., ue 'top/upness' in ue-ni agaru 'rise to the top', as we considered these to encode location more than trajectory. We excluded comparable cases of locative expressions in English, e.g., climbed on the drainpipe, climbed the inside of the drainpipe, unless these were used adverbially to express motion, e.g., went in/inside/into.

6. Although they are included here in order to be inclusive with respect to specification of a trajectory, many coding schemes do not include source, goal or deictic expressions as path. Although this may appear controversial, the reader is reminded that the crucial comparisons in this study are within-language for which exactly the same coding scheme was applied.

The first level of analysis investigated lexicalization of path. Here, the repertoire of lexical items used and the distribution of path semantics across morphosyntactic resources were identified. Two possible morphosyntactic patterns were distinguished in this analysis: verbal and adverbial. The second level of analysis addressed concatenation of path by examining the number of path expressions of any type per clause. Examples of analysis of lexicalization and concatenation of path in descriptions of the roll down event in Japanese and English appear in (7) and (8), with clause boundaries marked by brackets and path expressions underlined.

(7) [Neko-wa sakamichi-o korogatte ikimashita]
    cat-Top hill-Acc roll.Con went
    Lit: 'The cat went rolling on the hill'

(8) [The ball rolled out of the drainpipe, down the hill and into the bowling alley]
Example (7) from Japanese contains only one overt path expression, a verb, embedded in a complex motion predicate with a manner and path component: korogatte iku 'go rolling'.7 In example (8) from English, however, there are three path expressions, all adverbials: out, down and into.
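The two levels of analysis can be sketched for examples (7) and (8) as follows. This is a minimal illustration only: the token labels (`manner_verb`, `path_verb`, `path_adverbial`, `other`) and the function names `count_path` and `concatenation` are assumptions introduced here, not the annotation format actually used in Elan.

```python
from collections import Counter

# Hypothetical hand-coded clauses as (token, label) pairs. The labels are
# illustrative assumptions, not the authors' actual annotation tiers.
CLAUSE_7 = [
    ("Neko-wa", "other"),
    ("sakamichi-o", "other"),       # Ground phrase without a directional particle
    ("korogatte", "manner_verb"),
    ("ikimashita", "path_verb"),    # deictic verb iku 'go'
]
CLAUSE_8 = [
    ("The ball", "other"),
    ("rolled", "manner_verb"),
    ("out", "path_adverbial"),
    ("down", "path_adverbial"),
    ("into", "path_adverbial"),
]

PATH_LABELS = {"path_verb", "path_adverbial"}


def count_path(clause):
    """Lexicalization: count path expressions in one clause by type."""
    return Counter(label for _, label in clause if label in PATH_LABELS)


def concatenation(clause):
    """Concatenation: number of path expressions of any type in the clause."""
    return sum(count_path(clause).values())


for name, clause in (("(7)", CLAUSE_7), ("(8)", CLAUSE_8)):
    print(name, dict(count_path(clause)), concatenation(clause))
```

Under these assumed labels, the Japanese clause yields one verbal path expression and the English clause three adverbial ones, matching the description in the text.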
7. Native Japanese speakers may argue that this utterance contains directional information other than that conveyed by iku 'go'. This may be due to the special status of korogaru 'roll', which in combination with a Ground phrase, e.g., saka-o 'hill-Acc', without a directional particle, may encode implicit directional semantics, e.g., saka-o korogatte 'roll on/down the hill'. However, since any additional directional information expressing descent in (7) is regarded as implicit, it has not been included in the coding, gloss or translation.
4.5. Reliability of speech coding
To establish reliability of data coding, 15% of the entire data set was segmented and coded by an independent second coder. 95% agreement was reached on selection of relevant clauses for coding, and of these, 100% agreement was reached on coding of lexicalization and concatenation. Disagreements were settled by accepting the judgment of the initial coder.

4.6. Analysis
Two different analyses were conducted to investigate the expression of path in L1 narrative production: first, we identified lexicalization patterns in each group, and second, we assessed concatenation patterns. For all quantitative analyses, the native Japanese speakers with knowledge of English resident in Japan were compared to their counterparts resident in the USA. When no differences were found between them, the data were collapsed to form a single group of non-monolingual speakers. Non-parametric statistical tests were employed throughout, specifically Kruskal-Wallis for multiple group analyses and Mann-Whitney for between group analyses.
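For readers unfamiliar with the two rank-based tests named above, the following Python sketch shows how their test statistics are computed from per-speaker scores. It is an illustration under stated assumptions, not the analysis pipeline used in this study: the data values are invented, the Kruskal-Wallis H is computed without a tie correction, and no p-values are derived. In practice a statistics package would be used (for example, `scipy.stats.mannwhitneyu` and `scipy.stats.kruskal`).

```python
from itertools import chain


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def mann_whitney_u(a, b):
    """Return (U_a, U_b) for two independent samples."""
    ranks = average_ranks(list(a) + list(b))
    rank_sum_a = sum(ranks[:len(a)])
    u_a = rank_sum_a - len(a) * (len(a) + 1) / 2
    return u_a, len(a) * len(b) - u_a


def kruskal_wallis_h(*groups):
    """Return the H statistic (no tie correction) for two or more groups."""
    pooled = list(chain.from_iterable(groups))
    ranks = average_ranks(pooled)
    n = len(pooled)
    total, start = 0.0, 0
    for g in groups:
        rank_sum = sum(ranks[start:start + len(g)])
        total += rank_sum ** 2 / len(g)
        start += len(g)
    return 12 / (n * (n + 1)) * total - 3 * (n + 1)


# Invented per-speaker means of path expressions per clause (hypothetical data).
mono_japanese = [1.2, 1.4, 1.3, 1.5]
non_mono_japanese = [1.6, 1.7, 1.5, 1.8]
mono_english = [1.0, 1.1, 1.2, 0.9]

print(kruskal_wallis_h(mono_japanese, non_mono_japanese, mono_english))
print(mann_whitney_u(mono_japanese, non_mono_japanese))
```

The omnibus H statistic corresponds to the multiple-group comparison, and the U statistics to the follow-up pairwise comparisons between two groups.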
5. Results

5.1. Lexicalization of PATH
In order to investigate lexicalization patterns, we first identified the lexical repertoire for path expression employed by speakers in each group to describe the four target motion events. Table 2 shows the range of verbal and adverbial path types used by monolingual Japanese, non-monolingual Japanese and monolingual English speakers. In this qualitative analysis, native Japanese speakers with knowledge of English resident in Japan are displayed separately from those in the USA in order to balance participant numbers across groups and minimize the likelihood of differences in the size of lexical repertoires arising from simple differences in group size. As Table 2 shows, all groups employed both verbs and adverbials to lexicalize path. However, the differing number of lexical types appearing in each language is a clear indication that lexicalization patterns vary crosslinguistically. As expected, monolingual speakers of English employed a greater variety of adverbial expressions, whereas lexical diversity in both monolingual and non-monolingual Japanese discourse was chiefly observed in verbs. In contrast to clear differences between languages, within-language patterns appeared rather more uniform, regardless of

Table 2. Lexical repertoire for PATH expression

Path verb types
  Mono Japanese (n = 16): agaru 'rise'; hairu 'enter'; iku 'go'; komu (only in compound form) 'into'; kudaru 'descend'; kuru 'come'; noboru 'climb'8; noru (only in compound form) 'onto'; ochiru 'fall'; shinnyuu-suru 'invade'; tai-suru 'go toward'; tooru 'go through'; tsutau 'go along'; tsutawaru 'go through'; utsuru 'move'; wataru 'cross'
  Non-mono Japanese: Japan (n = 15): agaru 'rise'; hairu 'enter'; idou-suru 'move'; iku 'go'; komu (only in compound form) 'into'; kuru 'come'; mezasu 'go toward'; mukau 'go toward'; noboru 'climb'; nukeru 'go through'; ochiru 'fall'; oriru 'descend'; shinnyuu-suru 'invade'; tadoritsuku 'arrive'; tooru 'go through'; tsutau 'go along'; tsutawaru 'be passed along'; ugoku/ugokasu 'move/be moved'; utsuru 'move'
  Non-mono Japanese: USA (n = 13): agaru 'rise'; chikazuku 'approach'; hairu 'enter'; iku 'go'; komu (only in compound form) 'into'; kuru 'come'; noboru 'climb'; ochiru 'fall'; shinnyuu-suru 'invade'; tadoritsuku 'arrive'; tooru 'go through'; toutatsu-suru 'arrive'; tsutau 'go along'; tsutawaru 'be passed along'; utsuru 'move'
  Mono English (n = 13): come; get; go

Path adverbial types
  Mono Japanese (n = 16): he 'to'; kara 'from'; made 'until/to'; ni 'to'
  Non-mono Japanese: Japan (n = 15): he 'to'; kara 'from'; made 'until/to'; ni 'to'
  Non-mono Japanese: USA (n = 13): he 'to'; kara 'from'; made 'until/to'; massigura 'toward'; ni 'to'
  Mono English (n = 13): across; along; back; behind; beyond; down; from; in; inside; into; on; out of; over; through; to; up

8. Japanese linguists (e.g., Matsumoto 1996) consider noboru 'climb.ascend' as a path verb because it can only encode upwards trajectory (ue-ni noboru 'climb up' / *shita-ni noboru 'climb down'), in contrast to its closest translation equivalent in English, climb, which is considered a manner verb as it can be paired with both upwards and downward trajectories (climb up / climb down). In Japanese, noboru also occupies the position of a path verb (second position) in a manner-path verb compound. However, Sugiyama (2005) discusses the problematic nature of this verb, explaining that it can be represented by three different Chinese characters, only two of which have a clear path reading. The third character, she argues, has a much stronger suggestion of manner, indicating use of one's hands or feet. Moreover, there is no clear way of knowing which meaning the speaker intended. However, as she observes, the addition of yojiru 'clamber' with noboru in the compound construction, yoji-noboru 'clamber.ascend', more clearly expresses the semantics of manner. Thus, all cases of noboru have been coded here as path.
language experience. There was complete overlap in adverbial types employed by monolingual and non-monolingual Japanese speakers and comparable numbers of verb types with a large degree of overlap. However, in order to fully explore lexicalization patterns given the possibilities for different verb constructions in Japanese, e.g., compound verbs and complex motion predicates, we calculated the number of path verbs versus adverbials per clause.

Figure 1 shows the mean number of verbs expressing path per clause in all clauses containing path information in each language group. As there was no significant difference between the non-monolingual Japanese speakers resident in Japan versus the USA (z = 1.322, p = 0.186), the data for the two groups were collapsed. There was a significant difference between the groups in mean number of path verbs per clause (χ²(2, N = 57) = 29.826, p < 0.001). Specifically, monolingual English speakers produced significantly fewer path verbs per clause than both monolingual Japanese speakers (z = 4.572, p < 0.001) and native Japanese speakers with knowledge of English (z = 5.111, p < 0.001), who did not significantly differ from each other (z = 0.356, p = 0.722).

Figure 1. Mean number of path verbs per clause: J (monolingual Japanese speakers), J (E) (L1 of native Japanese speakers with knowledge of English), and E (monolingual English speakers)

Figure 2. Mean number of path adverbials per clause: J (monolingual Japanese speakers), J (E) (L1 of native Japanese speakers with knowledge of English), and E (monolingual English speakers)

Figure 2 shows the mean number of adverbials expressing path per clause in all clauses containing path information. Again there was no significant difference between the non-monolingual Japanese speakers resident in Japan versus the USA (z = 0.278, p = 0.781), so the groups were collapsed. The groups again differed in mean number of path adverbials per clause (χ²(2, N = 57) = 26.775, p < 0.001). This time, monolingual English speakers produced significantly more path adverbials per clause
than native Japanese speakers with knowledge of English (z = 4.306, p < 0.001), who in turn produced marginally significantly more path adverbials per clause than monolingual Japanese speakers (z = 1.895, p = 0.058). In sum, these results illustrate between-language but also to some extent within-language differences. In line with previous crosslinguistic research showing differences between lexicalization patterns in satellite- versus verb-framed languages (e.g., Gennari et al. 2002; Naigles et al. 1998; Slobin 1996b), the native English speakers observed here lexicalized path in a wide range of adverbials, whereas the native Japanese speakers lexicalized path in a comparably wide range of verbs. Yet analyses also show that both English and Japanese speakers were not grammatically constrained by their typological classification and expressed path in alternative ways. Most striking, however, is the finding that native Japanese speakers with knowledge of English used marginally significantly more adverbials than their monolingual Japanese counterparts, which suggests an influence of knowledge of English. Crucially, given that performance among non-monolingual Japanese speakers did not differ according to their country of residence and lexicalization was fully grammatical, the higher adverbial usage did not appear to arise from loss of the L1.

5.2. Concatenation of PATH
From the analyses above, we see that Japanese and English speakers employ both verbs and adverbials for lexicalization of path. Given that adverbials can be concatenated, this may have repercussions for path expression at the level of the clause. Moreover, as noted previously, even stacking of path verbs is an available option in Japanese. Example clauses (9)–(13) from descriptions of the climb through and clamber up events demonstrate this range of options in monolingual and non-monolingual Japanese discourse as compared to monolingual English discourse.

(9) [Neko-ga amadoi-no naka-o tsutatte]
    cat-Nom drainpipe-Gen inside-Acc go.along.Con
    Lit: 'The cat goes along the inside of a drainpipe'

(10) [Tori-no tokoro-ni ikouto]
    bird-Gen place-to try.to.go
    Lit: '(The cat) tries to go to the bird's place'

(11) [Haisuikan-no naka-o toori-nukete]
    drainpipe-Gen inside-Acc go.through-go.through
    Lit: '(The cat) goes along going through the inside of the drainpipe'

(12) [Chiyou-kara Tweety-no tokoro-made nobotte itta]
    ground-from Tweety-Gen place-to climb.ascend.Con went
    Lit: '(He) went climbing from the ground to Tweety's place'

(13) [And goes rolling down the street into a bowling alley]
Example (9) from a monolingual Japanese speaker illustrates the typical verb-framed pattern, a clause with one path expression in the main verb tsutau 'go along'.9 Examples (10) and (11) from non-monolingual Japanese speakers with knowledge of English present clauses with two path expressions in each: in the first, the verb iku 'go' and the postposition ni 'to', and in the second, the compound verb combining tooru 'go through' and nukeru 'go through'. The example in (12), also from a non-monolingual Japanese speaker, however, contains four path expressions in a completely grammatical clause: two postpositions, kara 'from' and made 'to', and a complex motion predicate consisting of two verbs, noboru 'climb.ascend' and iku 'go'. The final example in (13) from a monolingual English speaker contains three path expressions: one verb, go, one adverb, down, and one preposition, into. These examples demonstrate clearly that with the full range of morphosyntactic devices, Japanese speakers can concatenate path expressions grammatically within the clause as easily as English speakers can. The remaining question is whether they actually do.

Figure 3 shows the mean number of path expressions of all types (verbs and adverbials) per clause in all clauses containing path information in each language group. Once again, there was no significant difference between the non-monolingual Japanese speakers resident in Japan versus the USA (z = 0.723, p = 0.470), so the data for the two groups were collapsed. There was a significant difference between the groups in mean number of path expressions per clause (χ²(2, N = 57) = 16.193, p < 0.001). Native Japanese speakers with knowledge of English stacked significantly more path expressions per clause than monolingual Japanese speakers (z = 2.010, p = 0.044), who packed significantly more path expressions per clause than monolingual English speakers (z = 2.079, p = 0.038).
In sum, results on concatenation of path expressions within the clause revealed surprising between- and within-language differences. First, not only did speakers of Japanese in general stack more path expressions per clause than would be expected from a verb-framed language, but they
9. The spatial noun naka 'inside' was not coded as path in Japanese for reasons outlined in the section on coding of speech.
Figure 3. Mean number of path expressions of all types per clause: J (monolingual Japanese speakers), J (E) (L1 of native Japanese speakers with knowledge of English), and E (monolingual English speakers)
also packed significantly more path expressions per clause than monolingual speakers of a satellite-framed language, English. Second, in their L1, Japanese speakers with intermediate knowledge of English concatenated significantly more path expressions per clause than their monolingual Japanese counterparts. Again, there was nothing ungrammatical about non-monolingual L1 production, as can be seen in examples (9)–(12), and non-monolingual speakers in the USA patterned in the same way as those in Japan, implying that this pattern was not the result of L1 loss.

6. Discussion
This study investigated the robustness of typological preferences for path expression by examining (1) the extent to which expression of path in Japanese follows patterns demonstrated in other verb-framed languages with respect to lexicalization and concatenation, and (2) whether patterns in an L1 can change after acquisition of an L2. Regarding the first research question, analyses of monolingual expression of path both confirm and challenge previously found typological differences in lexicalization patterns. In line with previous research, monolingual Japanese speakers encoded path primarily in a wide range of verbs, whereas monolingual English speakers encoded path primarily in a wide range of adverbials. However, the full range of morphosyntactic devices available in Japanese meant that monolingual speakers of this
verb-framed language were not restricted to the one path expression per clause seen in other verb-framed languages and instead concatenated significantly more path expressions than monolingual speakers of a satellite-framed language, English. Important to note, however, are differences in the semantics of the path expressions used. In English, adverbials can encode all components of a trajectory: the source, the goal and the intervening movement. Therefore, the stacking of adverbials within a clause, e.g., down the street into a bowling alley, can actually encode separate trajectories within a journey. In Japanese, on the other hand, adverbials only encode the source and goal of a trajectory. Hence, the stacking of adverbials, e.g., chiyou-kara Tweety-no tokoro-made 'from the ground to Tweety's place', only encodes different components of a single trajectory, specifically the starting and ending points. Thus, with a greater stacking of path expressions, native Japanese speakers were not necessarily encoding more complex trajectories within a clause than native English speakers, just greater specification of a single trajectory.
In addition to these semantic differences, methodological differences between this and prior studies might help account for the disparity between Japanese and other verb- or satellite-framed languages. For example, there is variation between studies in the number and nature of motion events described, and similar patterns may not hold for all motion events. Indeed, Matsumoto (p.c.) suggests that some of the motion events analyzed in this study may not have involved a journey complex enough to elicit maximal concatenation of path expression in English speech. Moreover, there may also have been differences in the biographical profiles of participants.
Data in previous studies came from native speakers of the languages, who may or may not have had varying degrees of proficiency in another language, whereas the monolingual participants employed in the current study were carefully selected on the basis of their limited foreign language experience. With the results of this study indicating that use of one's L1 can be subtly altered with even intermediate proficiency in an L2, it becomes crucial to control for second language knowledge in any investigation of the native speaker baseline.
The above differences in semantics and methodologies notwithstanding, the fact remains that the existence of postpositions as well as compound verbs and complex motion predicates allows Japanese speakers to concatenate path expressions to a surprisingly high degree. Furthermore, we must keep in mind that since speakers of satellite-framed languages typically reserve the verb slot for a manner verb and have few options for compound manner-path verbs, adverbials offer the best option for accumulation of path expressions. In short, it may not have been the
monolingual English speakers in this study who patterned differently from native English speakers in previous studies, but the monolingual Japanese speakers who did not pattern in a way generally predicted for speakers of verb-framed languages. This supports findings from at least one other verb-framed language, Basque, which also pays a lot of attention to source and goal of path in a range of morphosyntactic devices, and thus behaves rather like a satellite-framed language (Ibarretxe-Antuñano 2004). We conclude that in contrast to previous claims, typological classification does not necessarily restrict concatenation of path information within the clause. Moreover, these findings highlight the importance of distinguishing between what a language allows and what speakers of that language actually do.
Regarding the second research question, L1 preferences for path expression do not appear to be impervious to change. Native Japanese speakers with intermediate knowledge of English employed a mixed strategy for path lexicalization in their L1, Japanese, with frequent use of both verbs, like their monolingual Japanese counterparts, but also adverbials, like monolingual English speakers. These same speakers then produced the most highly specified trajectories of all, with significantly more path expressions per clause than either monolingual group. These results suggest that established typological patterns in the L1 might be influenced by patterns being acquired in the L2, even at intermediate levels of L2 proficiency. More specifically, non-monolingual speakers of Japanese with some knowledge of English appear to combine both Japanese and English lexicalization strategies for expression of path in their L1. In all likelihood, this strategy accounts for the highly specified and compact encoding of path, since a combination of verbs and adverbials can be easily stacked within the clause.
Importantly, as there were no differences between Japanese speakers residing in the L1 versus the L2 community and as increased L1 expression of path was completely grammatical, these results do not seem to indicate any kind of language loss. Instead, in this arena, characterized by linguistic preference as opposed to grammaticality, such patterns suggest a fully grammatical process of convergence between the L1 and L2, much as has been proposed for the linguistic systems of bilinguals (e.g., Bullock and Toribio 2004; Colantoni and Gurlekian 2004; Montrul 2004; Sánchez 2004; Tatsumi 1997).
If the patterns observed here do reflect the effects of acquisition of L2 English on use of L1 Japanese, the nature of the influence is rather more complicated than a simple matter of translation. As noted above, directionals function differently in English and Japanese. For example, the Japanese equivalent of the English adverbial up, which does not
specify an end point, would be ue-ni 'to the top/upness', which specifies a spatial noun as the goal of motion. Therefore, in using morphosyntactic resources comparable to those of an English speaker to lexicalize path, a non-monolingual speaker of Japanese communicates slightly different semantic information, e.g., the source and goal of motion as opposed to the intervening trajectory.
These findings have several theoretical and methodological implications. First, with respect to linguistic typology, monolingual baseline results reveal that the relationship between typology and discourse is not as simple as has been predicted, and that there is still a need for further empirical testing, in a wider range of languages, of predictions for language usage on the basis of typological distinctions. Furthermore, multilingual results suggest that studies of language usage should consider the impact of individual language experiences, particularly with respect to second languages as common as English, the effects of which might be seen across entire groups of speakers. Moreover, the synchronic changes observed here may offer predictions for more systemic diachronic shifts, for example of the kind seen after language contact between speech communities (see Slobin 2004b for a discussion of the impact of German on Italian in the domain of motion event language).
Second, in the field of second language acquisition, the relationship between a first language and a second is generally considered to be unidirectional, with features of the L1 influencing the L2. However, we argue instead that the relationship may be bidirectional, with features of the L2 concurrently influencing the L1. While this has long been acknowledged in the functional bilingualism literature, where proficiency levels in both languages are high (e.g., papers in Cook 2003; Dussias 2001; Hohenstein et al.
2006; Pavlenko and Jarvis 2002), effects of an L2 even at intermediate levels of proficiency on a supposedly stable L1 question the validity of benchmarks used in research on and assessment of second language acquisition. L2 production is typically compared to and assessed against that of a native speaker, whose established language is seen as a fixed target. The assumed stability, unity and invariability of this standard are likely to be an over-simplification (cf. Davies 2003). If another language, however imperfectly mastered, also influences the native language, this suggests that the native L1 is not an invariable entity, but rather a moving target. For this reason, we should be more wary of the term 'non-target-like' in regard to L2 production. As a consequence, we may then have reason to question findings on the limits on ultimate attainment in an L2 (cf. Birdsong 2005).
In conclusion, this paper argues that expression of path in monolingual Japanese does not completely follow patterns established in other
verb-framed languages and that encoding of path in the L1 may change after even partial acquisition of an L2. We need more usage data in a range of languages in order to fully explore typological preferences and their effects on discourse. We also need data from other L1–L2 pairings in order to distinguish more clearly between patterns arising from convergence of knowledge of particular languages and those arising from general effects of bilingualism. Much work thus remains to be done. Nevertheless, at this point we may conclude that although the linguistic expression of path exhibits considerable crosslinguistic differences among monolingual speakers, it does not seem to be as robust and impervious to change as expected.
Received 1 March 2009
Revision received 25 November 2009
Syracuse University/Max Planck Institute for Psycholinguistics, Nijmegen
References
Allan, David. 1992. Oxford Placement Test. Oxford: Oxford University Press.
Aske, Jon. 1989. Path predicates in English and Spanish: A closer look. Proceedings of the Fifteenth Annual Meeting of the Berkeley Linguistics Society. 1–14.
Berman, Ruth and Dan I. Slobin. 1994. Relating Events in Narrative: A Cross-linguistic Developmental Study. Mahwah, NJ: Lawrence Erlbaum.
Birdsong, David. 2005. Nativelikeness and non-nativelikeness in L2A research. International Review of Applied Linguistics 43(4). 319–328.
Brown, Amanda. 2008. Gesture viewpoint in Japanese and English: Cross-linguistic interactions between two languages in one speaker. Gesture 8(2). 256–276.
Brown, Amanda and Marianne Gullberg. 2008. Bidirectional crosslinguistic influence in L1–L2 encoding of manner in speech and gesture: A study of Japanese speakers of English. Studies in Second Language Acquisition 30(2). 225–251.
Bullock, Barbara E. and Almeida Jacqueline Toribio. 2004. Introduction: Convergence as an emergent property in bilingual speech. Bilingualism: Language and Cognition 7(2). 91–93.
Cadierno, Teresa. 2004. Expressing motion events in a second language: A cognitive typological perspective. In Michel Achard and Susanne Niemeier (eds.), Cognitive Linguistics, Second Language Acquisition, and Foreign Language Teaching, 13–49. Berlin: Mouton de Gruyter.
Cadierno, Teresa and Lucas Ruiz. 2006. Motion events in Spanish L2 acquisition. Annual Review of Cognitive Linguistics 4. 183–216.
Colantoni, Laura and Jorge Gurlekian. 2004. Convergence and intonation: Historical evidence from Buenos Aires Spanish. Bilingualism: Language and Cognition 7(2). 107–119.
Cook, Vivian (ed.). 2003. Effects of the Second Language on the First. Clevedon, UK: Multilingual Matters.
Davies, Alan. 2003. The Native Speaker: Myth and Reality. Clevedon: Multilingual Matters.
Freleng, Friz. 1950. Canary Row [Film, animated cartoon]. New York: Time Warner.
Dussias, Paola E. 2001. Sentence parsing in fluent Spanish-English bilinguals. In Janet Nicol (ed.), One Mind, Two Languages: Bilingual Language Processing, 159–176. Cambridge: Blackwell.
Gennari, Silvia P., Steven A. Sloman, Barbara C. Malt and W. Tecumseh Fitch. 2002. Motion events in language and cognition. Cognition 83(1). 49–79.
Grosjean, François. 1998. Studying bilinguals: Methodological and conceptual issues. Bilingualism: Language and Cognition 1(1). 131–149.
Gullberg, Marianne and Peter Indefrey. 2003. Language Background Questionnaire. Nijmegen: Max Planck Institute for Psycholinguistics. http://www.mpi.nl/research/projects/Multilingualism/questionnaire.
Hasegawa, Yoko. 1996. The (nonvacuous) semantics of TE-linkage in Japanese. Journal of Pragmatics 25(6). 763–790.
Hohenstein, Jill, Ann Eisenberg and Letitia Naigles. 2006. Is he floating across or crossing afloat? Cross-influence of L1 and L2 in Spanish–English bilingual adults. Bilingualism: Language and Cognition 9(3). 249–261.
Ibarretxe-Antuñano, Iraide. 2004. Motion events in Basque narratives. In Sven Strömqvist and Ludo Verhoeven (eds.), Relating Events in Narrative Volume 2: Typological and Contextual Perspectives, 89–111. Mahwah, NJ: Lawrence Erlbaum.
Inagaki, Shunji. 2001. Motion verbs with goal PPs in the L2 acquisition of English and Japanese. Studies in Second Language Acquisition 23(2). 153–170.
Inagaki, Shunji. 2002. Motion verbs with locational/directional PPs in English and Japanese. Canadian Journal of Linguistics 47(3/4). 187–234.
Jensen, Anne. 2002. Expressing motion events in a second language. Paper presented at EUROSLA, Basel, 18–21 September.
Johnson, Mark. 1987. The Body in the Mind: The Bodily Basis of Meaning, Imagination and Reasoning. Chicago: University of Chicago Press.
Kita, Sotaro. 1999. Japanese enter/exit verbs without motion semantics. Studies in Language 23(2). 307–330.
Kita, Sotaro and Asli Özyürek. 2003. What does cross-linguistic variation in semantic coordination of speech and gesture reveal?: Evidence for an interface representation of spatial thinking and speaking. Journal of Memory and Language 48(1). 16–32.
Kuno, Susumu. 1973. The Structure of the Japanese Language. Cambridge, Mass: MIT Press.
Matsumoto, Yo. 1991. On the lexical nature of the purposive and participial complex motion predicates in Japanese. Proceedings of the 17th Annual Meeting of the Berkeley Linguistics Society. 180–191.
Matsumoto, Yo. 1996. Subjective motion and English and Japanese verbs. Cognitive Linguistics 7(2). 183–226.
McNeill, David. 1992. Hand and Mind: What the Hands Reveal about Thought. Chicago: Chicago University Press.
Montrul, Silvina. 2004. Subject and object expression in Spanish heritage speakers: A case of morphosyntactic convergence. Bilingualism: Language and Cognition 7(2). 125–142.
Naigles, Letitia R., Ann R. Eisenberg, Edward T. Kako, Melissa Highter and Nancy McGraw. 1998. Speaking of motion: Verb use in English and Spanish. Language and Cognitive Processes 13(5). 521–549.
Nakatani, Kimiko. 2003. Analyzing -te. In William McClure (ed.), Japanese/Korean Linguistics, 377–387. Stanford: CSLI.
Navarro, Samuel and Elena Nicoladis. 2005. Describing motion events in adult L2 Spanish narratives. In David Eddington (ed.), Selected Proceedings of the 6th Conference on the Acquisition of Spanish and Portuguese as First and Second Languages, 102–107. Somerville, MA: Cascadilla Proceedings Project.
Negueruela, Eduardo, James P. Lantolf, Stephanie R. Jordan and Jaime Gelabert. 2004. The private function of gesture in second language speaking activity: A study of motion verbs and gesturing in English and Spanish. International Journal of Applied Linguistics 14(1). 113–147.
Pavlenko, Aneta and Scott Jarvis. 2002. Bidirectional transfer. Applied Linguistics 23(2). 190–214.
Sánchez, Liliana. 2004. Functional convergence in the tense, evidentiality, and aspectual systems of Quechua. Bilingualism: Language and Cognition 7(2). 147–162.
Slobin, Dan I. 1996a. From "thought and language" to "thinking for speaking". In John J. Gumperz and Stephen C. Levinson (eds.), Rethinking Linguistic Relativity, 70–96. Cambridge: Cambridge University Press.
Slobin, Dan I. 1996b. Two ways to travel: Verbs of motion in English and Spanish. In Masayoshi Shibatani and Sandra A. Thompson (eds.), Grammatical Constructions: Their Form and Meaning, 195–219. Oxford: Oxford University Press.
Slobin, Dan I. 1997. Mind, code and text. In Joan Bybee, John Haiman and Sandra A. Thompson (eds.), Essays on Language Function and Language Type: Dedicated to T. Givón, 437–476. Philadelphia: John Benjamins.
Slobin, Dan I. 2004b. The many ways to search for a frog: Linguistic typology and the expression of motion events. In Sven Strömqvist and Ludo Verhoeven (eds.), Relating Events in Narrative: Typological and Contextual Perspectives, 219–257. Mahwah, NJ: Lawrence Erlbaum.
Slobin, Dan I. and Nina Hoiting. 1994. Reference to movement in spoken and signed languages: Typological considerations. Proceedings of the Twentieth Annual Meeting of the Berkeley Linguistics Society. 487–505.
Stam, Gale. 2006. Thinking for Speaking about motion: L1 and L2 speech and gesture. International Review of Applied Linguistics 44(2). 143–169.
Sugiyama, Yuki. 2005. Not all verb-framed languages are created equal: The case of Japanese. In Proceedings of the Thirty-first Annual Meeting of Berkeley Linguistics Society.
Talmy, Leonard. 1985. Lexicalization patterns: Semantic structure in lexical forms. In Timothy Shopen (ed.), Language Typology and Syntactic Description Vol. 3, 57–149. Cambridge: Cambridge University Press.
Talmy, Leonard. 1991. Path to realization: A typology of event conflation. Proceedings of the Seventeenth Annual Meeting of the Berkeley Linguistics Society. 480–519.
Talmy, Leonard. 2000. Toward a Cognitive Semantics Vols. 1 and 2. Cambridge, MA: MIT Press.
Tatsumi, Tomoaki. 1997. The bilingual's thinking for speaking: Adaptation of Slobin's Frog story experiment to Japanese-English bilinguals. Japan: Sophia University, BA Thesis.
Tsujimura, Natsuko. 2002. Japanese enter/exit verbs revisited: A reply to Kita (1999). Studies in Language 26(1). 165–180.
Weingold, Götz. 1992. Up and down: On some concepts of path in Korean motion verbs. Ohak Yonku Language Research 28(3). 15–43.
Weingold, Götz. 1995. Lexical and conceptual structures in expressions for movement and space: With reference to Japanese, Korean, Thai and Indonesian as compared to English and German. In Urs Egli, Peter E. Pause, Christoph Schwarze, Arnim von Stechow and Götz Weingold (eds.), Lexical Knowledge in the Organization of Language, 301–340. Amsterdam/Philadelphia: John Benjamins.
Wittenburg, Peter, Hennie Brugman, Albert Russel, Alex Klassmann and Han Sloetjes. 2006. ELAN: A professional framework for multimodality research. Proceedings of the Fifth International Conference on Language Resources and Evaluation (LREC). Genoa, Italy.
"I'm fed up with Marmite – I'm moving on to Vegemite": What happens to the development of spatial language after the very first years?
EVA-MARIA GRAF*
Abstract
The present article examines children's spatial language during late phases of development. To this end, the spontaneous speech of American English-speaking children between 6 and 10 years of age was analyzed within an analytical framework developed in a previous study for spontaneous speech from speakers between 10 and 19 years of age (Graf 2006). The analysis defined five basic spatial categories and four levels of abstraction in spatial meaning in order to capture all spatial relations and their literal as well as non-literal uses. The results show that the spatial language of speakers between 6 and 10 years of age differs mainly with respect to speakers' preference for literal and metaphorical uses. Compared to the findings in Graf (2006), the age groups analyzed here may be viewed as the developmental phase during which children and adolescents will reach the end of their journey towards adult use of spatial reference.
Keywords: Space, language acquisition, developmental phase, literal meaning, metaphorical meaning.
1. Introduction: In search of a larger picture of spatial language development
The ubiquity of space in thought and language has inspired a vast amount of research. The ontogenetic path children take to become competent spatial communicators therefore is a prominent topic in cognitive
* Address for correspondence: E.-M. Graf, Alpen-Adria Universität, Institut für Anglistik und Amerikanistik, North Building, level 0, Zimmer I.0.35, Universitätsstraße 65–67, 9020 Klagenfurt, Austria. Email: eva-maria.graf@uni-klu.ac.at
Cognitive Linguistics 21–2 (2010), 287–314. DOI 10.1515/COGL.2010.011. 0936–5907/10/0021–0287 © Walter de Gruyter
science, developmental psychology and language acquisition research. "Any informed discussion of the relation between language and space should include an accounting of how young children come to represent the meaning of spatial terms" (Quinn 2005: 294). To apply a developmental perspective is particularly rewarding with respect to the close and multi-faceted interrelations of space, language, and cognition: The universal need for spatial knowledge, representation and communication results in an early mastering of basic spatial skills and leads to a high frequency of spatial reference in infants' linguistic experience and input. Whereas the former (due to, among other things, the relatively similar needs of young children) leads to the construction of universal spatial categories such as location and motion, the latter results in language-specific encodings of spatial relations and entities on the language level (for an account of the different grammars of space see Levinson and Wilkins 2006). (Spatial) language development follows what Gentner and Boroditsky (2001: 248) call a "division of dominance continuum" with respect to the influence of cognition and language on spatial representation and communication (see also Bowerman 1996; Bowerman and Levinson 2001; Choi 2006; Gentner and Goldin-Meadow 2003; Gumperz and Levinson 1996). The way children learn to refer to space in language is influenced both by general cognitive factors that account for universally similar acquisition paths and by language-specific factors that account for the particularities found in these developmental paths (e.g., whereas English speakers linguistically categorize spatial events on the basis of support (X is on the table) or containment (X is in the bowl), Korean speakers focus on tight vs. loose fit between objects) (cf. Hickmann 2007: 227; see also Choi and Bowerman 1991; Pruden et al. 2008).
These assumptions of the ubiquity of space in thought and language and of the division of dominance continuum represent the theoretical context in which the present analysis is embedded. Its goal is to analyze the literal and metaphorical¹ spatial language development of English-speaking
1. Metaphorical spatial language results from cognitive and socio-communicative processes. According to Conceptual Metaphor Theory, cognitive processes account for our human propensity to conceptualize the abstract, e.g., time, via the concrete, e.g., space, to make it intellectually accessible. Conceptual metaphors result from ontological mapping processes between concrete, bodily and experientially based source domains onto abstract target domains. These mapping processes include the extension of configurations from the source to the target, become entrenched as conventional cognitive patterns in the mind of the speakers and are, in turn, expressed in language via metaphorical items. At the same time, socio-communicative processes such as frequency of use of metaphorical expression account for the entrenchment and (lexical) conventionality of
children between 6 and 10 years of age. These age groups represent a research gap as the majority of both inter- and intralinguistic studies on spatial language acquisition concentrate on children's development during their very first years. The lack of interest in later stages of spatial language development is paralleled by a general disinterest in language development during school age and puberty (but see e.g., Berman 2004; Berman and Slobin 1994; Hickmann 2003; Karmiloff and Karmiloff-Smith 2001; Nippold 1998; Romaine 1984). In a previous study, the author therefore focused on English-speaking adolescents between 10 and 19 years in order to determine if and how their spatial language developed (Graf 2006). The study showed no clear indication of any developmental changes in spatial language as a whole for the analyzed period: Irrespective of their age, speakers showed great similarities with respect to the spatial categories and their uses at the different levels of abstraction. The acquisition of spatial language may therefore be considered finalized after age 10. Given the dramatic ongoing changes reported in studies of the earliest phases of spatial language acquisition, i.e., the years up to the age of six, and the reported lack of developmental changes after age 10, the years 6 to 10 seem to represent a crucial transition phase from apprentice to master of spatial language.
The analysis is based on the categorical framework developed for the previous study. It deviates from existing spatial language research in two respects. Firstly, in accordance with Levinson and Wilkins' (2006: 551) finding that "... we know that spatial language is not fully mastered until late childhood", it adopts a life-span perspective as suggested by Eckert (1998) (for a similar argument see Karmiloff and Karmiloff-Smith 2001: 1).
Spatial ontogenesis is viewed as a three-phase process², whereby each phase (children's preverbal spatial development, their early phases of spatial language acquisition (primary spatial language acquisition) and their late phases of spatial language acquisition (secondary spatial language acquisition)) builds on and further elaborates what has already been mastered (see Chapter 2 and footnote 3).
spatial metaphors (see Graf in press; Keysar et al. 2000; Svanlund 2007). With time and usage, such lexical metaphors often lose more and more of their transparency and, at least in some cases, seem "to be dying or already dead" (Rice, Sandra and Vanrespaille 1999: 124), as is the case with the grammaticalized items 'going to' future or existential 'there' (termed 'transmetaphorical' in the current approach (Graf 2006) (see Table 2)).
2. Reference to different developmental phases is found only implicitly in the literature.
Secondly, the study applies an integrative perspective. Most studies concentrate on the acquisition of specific literal tokens of spatial language in proper spatial contexts. Little research, in contrast, has examined children's use of space in non-spatial, metaphorical contexts. Studies interested in the acquisition of metaphorical language primarily analyze metaphors as instances of figurative language (Gardner 1974; Winner 1976, 1995; Pearson 1990), seldom as a basic human means of intellectually accessing abstract phenomena (but see for example Johnson 1997, 1999, 2001; Özçalışkan 2003, 2004, 2005). In view of the immense attention space has received as a primary ontological source domain for metaphorical extensions onto abstract domains such as TIME (e.g., Boroditsky 2000; Gentner 2001; Tenbrink 2007), it might be rewarding to bring this aspect more to the forefront in spatial language acquisition research. In this study, non-literal uses are analyzed alongside literal uses of spatial categories.
The categorical framework applied in this study follows from its integrative perspective. It concentrates on spatial categories, which are understood as linking elements between spatial cognition and spatial language, rather than on the linguistic correlates that represent these categories, such as spatial verb + spatial preposition, e.g., go down. Such a procedure allows for the integration of the various phases of spatial development given that spatial categories are dependent on pre-conceptual image-schemata as the pre-linguistic representation format of spatial knowledge and constitute more abstract representations on the language level (cf. Quinn 1998: 160; Tomasello 1998: XX). In addition, the focus on spatial categories allows for the consideration of different levels of abstraction in spatial meaning at which spatial configurations may be used.
For example, motion events may be represented at the literal spatial level (He goes to the station) or at the metaphorical spatial level (He always goes to extremes). This framework aims to capture all instances of spatial reference in the data and focuses neither exclusively on particular spatial relations such as motion events and their linguistic correlates nor on either literal or metaphorical spatial expressions. Instead, the aim is to analyze as wide a variety as possible of spontaneous spatial references in natural, everyday conversations of American English speakers between 6 and 10 years of age with the help of a qualitative analysis. The study will hopefully make a first contribution towards filling the existing research gap in this area of study. In addition, it will hopefully link the various developmental phases in search of the larger picture of spatial language acquisition.
2. Phases of spatial language development³
The focus of this article is on the phases of spatial language development, in particular on the later phase between 6 and 10 years of age. The very first years of spatial language acquisition, i.e., primary spatial language acquisition, have been extensively studied from a variety of theoretical and methodological backgrounds. These numerous, mainly microanalytic in-depth studies have created islands of differing research interests and knowledge and are characterized by a pervasive methodological heterogeneity (Hickmann 2003: 175). Well aware of such methodological and theoretical heterogeneity, Graf (2006) carried out a meta-analysis of 75 studies in spatial language development from the last 35 years in order to come up with general developmental trends of young children's spatial language acquisition (see also Hickmann 2003, and Nowak 2007, who gives a detailed outline of the meta-study methodology). Based on this meta-analysis, the following features characterize and define primary spatial language acquisition (in English) (cf. Graf 2006: 147ff.):
Children come a long way from their first spatial holophrases (cf. Dominey 2006: 138)⁴ such as 'up' to both the comprehension and production of practically all spatial linguistic tokens, however not yet in their full semantic and pragmatic scope and contrast. They show a preference for those spatial configurations already prominent during their pre-verbal spatial development such as dynamic events expressing goal or source, and they encode these with general motion verbs such as 'go' or 'come'. They also produce first metaphorical extensions of spatial
3. Spatial language development follows the general phases of language acquisition: The pre-linguistic phase thereby lays important cognitive and social foundations for the acquisition of language and forms the gestural basis for symbolic communication (Blake 2000; Lock 1980, 1999; Özçalışkan and Goldin-Meadow 2005; Tomasello 2003). Early phases of language acquisition mean the onset of verbal communication around a child's first birthday as a consequence of their universal human predilection for pattern-finding and intention-reading (Tomasello 2003). These first years of language acquisition witness dramatic changes from context-dependent holophrastic utterances to an impressive linguistic repertoire during kindergarten and pre-school; developmental changes happen on a daily basis and require researchers to work in narrow time frames to trace them. According to the few available studies on late phases of language acquisition, subsequent developments are in turn characterized by an enrichment and sophistication of already acquired systems (cf. Grimshaw and Holden 1976: 41). Such deepening and refinement happens at a much slower pace and requires the comparison of widely separate age groups to be able to trace developmental changes (cf. Nippold ²1998: 3).
4. In early "one-unit utterances" (Tomasello 2003: 39) or holophrases, defined as "unparsed holistic utterances that correspond directly to a meaning" (Dominey 2006: 138), a single lexical item forms an early construction through its fixed embedding in a specific context of situation.
structures in language and start combining spatial expressions to refer to complex spatial congurations such as up here. Therefore, spatial language acquisition is clearly on its way in the years leading up to childrens schooling. However, intracategorical and inter-categorical dierences and correlations must still be acquired, metaphorical and other non-literal uses must be fully mastered and the possibilities of combining spatial expressions must be further explored.
In accordance with the general features of later phases of language development, secondary spatial language acquisition should be characterized by a further elaboration and refinement of already acquired linguistic spatial competence. Spatial categories and their linguistic correlates should come to be used in a fully semantically and pragmatically contrastive manner, on all levels of abstraction as well as in combinations where necessary. In Graf (2006), the author assumed the phase of secondary spatial language to cover the adolescent years from 10 to 19 and compared spatial language development across three different age groups (10 to 13 years of age, 14 to 16 years of age, and 17 to 19 years of age)5. Yet, as the findings showed, there is neither an increase in the complexity of the spatial language nor any variation with respect to the spatial configurations expressed. Furthermore, no development with respect to literal and non-literal uses of spatial categories could be reported. Instead, the speech of the three age groups under scrutiny was strikingly similar: A highly similar distribution of spatial categories, a similar complexity with respect to the use of simple and compound spatial categories, and, finally, a nearly identical picture with respect to literal and non-literal uses of spatial language could be reported (for terminology see below). The following assumption therefore motivates this study: Given that spatial language acquisition is not finalized during the very first years and that there is hardly any developmental change found in the language of speakers older than 10, the years between 6 and 10 seem to be a decisive and separate period in spatial language development.
3. The method
3.1. The data
The data stems from the American English section of the CHILDES corpus, the Carterette transcripts. Unlike the material in the COLT corpus
5. Assuming secondary (spatial) language acquisition to cover these years is based on Nippold's (1998) indirect classification.
used in Graf (2006), i.e., the 10 to 19-year-olds, where informants wore a microphone and a Walkman and were asked to record all conversations they had for some time, the Carterette material was evoked in a simple social situation. A researcher sat around a table with three children of the same age group at a time and told them that she wanted to find out what children of their age were interested in. She encouraged them to talk about anything they wanted to with each other, but did not herself participate in the interaction. Whereas the warm-up period was discarded from the transcripts, the speech produced during the rest of the conversation showed all features of natural speech, i.e., a broad range of topics, interruptions, use of slang words and generally a natural turn-taking among the speakers (see the data manual for Carterette and Jones, page 17). Although the two sets of data were gathered in slightly different contexts, i.e., naturally occurring conversations in the COLT corpus and evoked conversations in the CHILDES corpus, both share the characteristics of naturally evolving speech among interaction partners, which clearly sets them apart from experimentally elicited responses produced under strictly controlled co- and contextual conditions. We therefore feel that comparison of the data is possible. The transcribed material contains speech from 54 first graders, 48 third graders and 48 fifth graders from junior college classes of a city college in California as well as the speech of 24 adults. For the present purpose only the children's parts were used, which divided into three age groups: 6-year-olds (Group I), 8-year-olds (Group II), and 10-year-olds (Group III). For each age group, well over 10,000 words were recorded and transcribed6. The aim is to analyze as wide a variety as possible of spontaneous spatial reference in natural, everyday conversations in English7.
This approach is in keeping with Ames and Learned's (1948: 63) important insight that observation of spontaneous verbalizations seems in many ways to be
6. One problem related to such a method of eliciting and observing spontaneous language is Labov's (1972) observer's paradox. The fact that children are aware of being recorded can be seen in the following utterance taken from the speech of eight-year-olds: "How much longer do we have to talk?".
7. All items identified by the researcher as spatial words were checked against other studies on spatial language for corroboration of their status as prepositions, dimensional adjectives, etc. The Oxford English Dictionary was consulted to assess the spatial semantics of a word whenever its classification was unclear to the researcher. Items that did not occur in other studies, or that could not be identified as spatial in the Oxford English Dictionary, were not considered in the study.
one of the most useful methods for studying the child's concepts of space.
3.2. The analytic framework
The categorical framework, developed in Graf (2006) for the analysis of spontaneous speech of speakers between 10 and 19 years of age, allows for the analysis of spatial language in the broadest sense possible, as it captures all spatial relations and their literal as well as their non-literal uses. At the same time, such an integrative endeavor can only be executed at the cost of detail. Such detail is found in the many microscopic in-depth studies of earliest spatial language acquisition, which provide valuable insights into the exact acquisition of certain spatial expressions or the exact onset of language-specific influences on spatial cognition etc. Bringing these two different research approaches together is one of the future research desiderata formulated in Section 5.
3.2.1. The spatial reference act. In accordance with the usage-based approach to language acquisition (Tomasello 2003), the framework centers on the communicative unit of Spatial Reference Act (SRA) and its components. The unit of analysis is defined as follows:
A speaker or Reference Origo (RO) refers to something, the Reference Entity (RE), usually in relation to something else, the Reference Relatum (RR) in space, the Reference Field (RF)8, from a particular point of view, the Reference Perspective (RP). In addition, Spatial Reference Acts are addressed to someone, the hearer or decoder of the spatial communicative act (Graf 2006: 79).
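The components named in this definition map naturally onto a small record type. The following Python sketch is only an illustration: the class name `SpatialReferenceAct`, its field names, the helper `has_explicit_relatum` and the sample values are assumptions introduced here and are not part of the framework in Graf (2006).

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class SpatialReferenceAct:
    """Hypothetical record of one Spatial Reference Act (SRA)."""
    origo: str                # Reference Origo (RO): the speaker
    entity: str               # Reference Entity (RE): what is referred to
    relatum: Optional[str]    # Reference Relatum (RR): None if left implicit
    reference_field: str      # Reference Field (RF): the surrounding space
    perspective: str          # Reference Perspective (RP): the point of view
    hearer: str               # addressee who decodes the act

    def has_explicit_relatum(self) -> bool:
        """True when the RE is placed in relation to a stated RR."""
        return self.relatum is not None


# Illustrative values only; they are not taken from the corpus coding.
sra = SpatialReferenceAct(
    origo="child A",
    entity="my keys",
    relatum="the table",
    reference_field="the room",
    perspective="speaker",
    hearer="child B",
)
print(sra.has_explicit_relatum())
```

The relatum is optional in this sketch because the definition says the entity is "usually", not always, referred to in relation to something else.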
3.2.2. The spatial categories. The spatial categories (and their subtypes) used to analyze the data differentiate a number of verbally encoded spatial configurations that result from different relations between the Reference Entity (RE) and the Reference Relatum (RR) as basic constituents of a Spatial Reference Act. Five simple spatial categories between Reference Entity and Reference Relatum are claimed: Reference Entity Distance, which encodes a deictic relation of distance or proximity (This is
8. The following definition of the Reference Field is given in Graf (2006: 84): Reference Field (RF) is understood as the implied surrounding space within which the speaker anchors the specific spatial configuration between the Reference Entity (RE) and the Reference Relatum (RR). The concept of Reference Field is closely related to the more frequently used concept of frame of reference (see e.g., Levinson 1996, 2003). The pragmatic focus on spatial language adopted here is however better captured in the concept of the Reference Field.
my new bike), Reference Entity Motion, which encodes dynamic relations between RE and RR (He ran down the hallway), Reference Entity Location, which encodes a static relation between RE and RR (I forgot my keys on the table), Reference Entity Dystation9, which encodes relations between RE and RR that are intermediate between location and motion (Peter lives across the street), and Reference Entity Dimension10, which encodes the dimensional Gestalt of the RE in (implicit) relation with RR (I want a big coke) (see Table 1 below). Based on a meta-analysis of existing studies on the acquisition of dimensional spatial expressions such as "big" or "tall" during the very first years, Graf (2006: 172) concluded that certain categories and their linguistic correlates such as proximity/distance (expressed via the spatial adverbs "here" and "there" or the demonstratives "this" or "that") are cognitively simpler, whereas others such as dimension are generally more complex (due to cognitive prerequisites, semantic complexity and context-dependency) and acquired at a later stage. The distribution of the basic categories within and across the three age groups may therefore offer valuable insight into their developmental states.
Table 1. Categories of space in the English language

Spatial categories                                       Token examples
Reference Entity Distance (distance/proximity)           This is my new bike.
Reference Entity Motion (dynamic category D)             He ran down the hallway.
Reference Entity Location (static category S)            I forgot my keys on the table.
Reference Entity Dystation (dystatic category DS)        Peter lives across the street.
Reference Entity Dimension (dimensional category DIM)    I want a big coke.
9. Such intermediate types are not a matter of language alone, but are also found in perception and conceptualization. See Talmy (1996: 211 ff.), (2000: 99 ff.) and 2003 for a detailed outline of the different types of fictive vs. factive spatial relations.
10. Whereas the spatial Gestalt is an additional factor in the sub-categorization of dynamic and static relations in language (e.g., The pictures are on the table vs. The pictures are in the drawer) (for detailed discussions of such functional features see Carlson and van der Zee (eds.) (2005) or Aurnague, Hickmann and Vieu (eds.) (2007)), the spatial category REDIM is centered on such spatial Gestalts.
In addition to these five simple spatial categories that represent one particular spatial configuration on the language level, we find the representation of complex spatial configurations, i.e., the combination of spatial relations within what is termed here a compound category. In the example "in this room" (age 8) we find the combination of the simple spatial category Reference Entity Location, realized on the language level via the (spatial) verb "be" + spatial preposition "in", and the simple spatial category Reference Entity Distance, realized on the language level via the demonstrative pronoun "this". The relation between reference entity and reference relatum comprises two types of spatial information, one of containment and one of proximity. Compared to "in the room", which is an example of the simple category containment, "in this room" is complex in that it also expresses proximity and is therefore considered a compound spatial category. Further examples of compound categories from the data are "a place up in the desert" (age 10), "big long rope" (age 10), "this big castle there" (age 6), "close to that time" (age 6), "these two little girls come over" (age 8), or "move this one big rock" (age 8)11. Thus, the Spatial Reference Act is based on the transformation of two (or more) simple categories into a compound one that transmits the overall spatial information (for similar arguments see Boers 1996; Gapp 1997; Vorweg and Rickheit 1999; Carstensen 2001). According to Stockman and Vaughn-Cooke (1992), the use of compound categories is cognitively and linguistically more complex and thus represents a developmental challenge for younger speakers. At the same time, their use often reflects a meta-pragmatic awareness in the speaker that their communicative partner may need more precise spatial information in order to decode the message correctly and understand the speaker's spatial intention (cf. Gapp 1997: 67; Plumert 1996: 376).
The use of compound spatial categories therefore seems to be another important measuring device for (cognitive and) linguistic development in the context of spatial language.
3.2.3. The levels of abstraction in spatial meaning. Part of the author's research interest is to trace 6 to 10 year old speakers' spontaneous use of non-literal spatial reference alongside literal uses. Therefore, the four different levels of abstraction in spatial meaning as defined in Graf (2006) are applied to the data (see Table 2 below). This scalarity of spatial meaning can be explained via the categorical framework of the present approach and from a general cognitive and pragmatic perspective. In accordance with the categorical framework presented here, the different levels of abstraction are a consequence of the surrounding space or Reference Field into which the spatial configurations between the RE and the RR are embedded by the speaker (RO). The Reference Field is embedded in the RO's Perceptual Space (PS) in case of literal spatial reference (It's over there), in his or her Conceptual Space (CS) in case of literal spatial reference (He lives in Berlin) or metaphorical spatial reference (The meeting is in June), in both, as is the case in metaliteral meaning (Go to bed now!), or even outside RO's Conceptual Space as is the case in transmetaphorical meaning (It's going to rain) (for more detail see Graf 2006: 96 ff.). At the same time, the scalarity must be explained in the context of communicative interaction which motivates spatial reference in the first place. Salient information as well as aspects of pragmatic strengthening, i.e., the conventionalization and entrenchment of situated implicatures that arise from communicative experience with a particular lexical item as new meaning components, help create different levels of abstraction over time (Evans 2003; Graf in press; Keysar et al. 2000; Svanlund 2007). These levels are differentiated in terms of degree of transparency and imageability as well as recoverability of the underlying spatial relation and the question of salient information. The ubiquity of spatial language in children's surrounding input as well as their own frequent use of spatial reference due to the importance of such information for (communicative) interaction set off such conventionalization and entrenchment processes at a very early stage.
11. Although some of the examples include the combination of two instances of the same category (e.g., big long rope), whereas others represent a combination of two different categories (e.g., these two little girls), both cases are treated here as representing compound spatial categories. As the primary explanation for the use of such compound categories is speakers' metapragmatic awareness that more precise spatial information is necessary, the type of categories combined is not considered an indicator of complexity.
The following levels of abstraction are differentiated here: literal spatial meaning (This is Jim.); metaliteral spatial meaning (Go to bed now!), where the spatial linguistic items convey non-spatial information such as, e.g., "It's late, you need to sleep" alongside the spatial information "change of location"; metaphorical spatial meaning (I'm moving on to Vegemite); and transmetaphorical spatial meaning (It's going to rain), i.e., dead or dying metaphors (Diewald 1997; Kuteva and Sinha 1994; Rice, Sandra and Vanrespaille 1999). Examples from the data include "in room eight" (age 8), "I am in here" (age 6), "go to Hawaii" (age 6), "live around the block from me" (age 6) or "at the Sahara hotel" (age 10) for literal spatial meaning, "buy something from the store" (age 10), "go to the show" (age 6), "get up" (age 8), "little boy" (age 10) or "get sth. from somebody" (age 8) as examples for metaliteral spatial meaning, "before that" (age 6), "stay over night" (age 10), "at the usual time" (age 6),
"on Jeepers Creepers" (age 8) or "on her birthday" (age 8) for metaphorical spatial meaning. Examples for transmetaphorical spatial meaning, i.e., grammaticalized and highly conventionalized items such as the "going to" future or existential "there", are found in the transcripts of all age groups12.
Table 2. Scalar nature of spatial meaning (cf. Graf 2006: 96)

Level of spatial abstraction         Example
literal spatial meaning              This is Jim.
metaliteral spatial meaning          Go to bed now!
metaphorical spatial meaning         I'm moving on to Vegemite!
transmetaphorical spatial meaning    It's going to rain.
Based on the general assumption that ages 6-10 years represent a proper phase in spatial language acquisition, it is hypothesized that this period shows the following intra-phasal developments and thus differs from earlier (0-6 years) and later (10-19 years) periods:
1) The distribution of simple and compound spatial categories, i.e., the frequency of use of compound spatial categories should be higher in the language of 10 year old speakers than in the language of 6 year old speakers.
2) The distribution of the five simple spatial categories; i.e., the frequency of use of more complex spatial categories (where hypothesized complexity is based on Graf 2006) such as Reference Entity Dimension should be higher in the language of 10 year old speakers than in the language of 6 year old speakers.
3) The preference for literal or non-literal uses of spatial categories; the frequency of use of spatial categories on non-literal levels of abstraction should be higher in the language of 10 year old speakers than in the language of 6 year old speakers.
3.2.4. The analysis. The method chosen here is a qualitative categorical interpretation, followed by a quantitative, yet not statistical, analysis of the classified material. In naturally evolving speech it is harder than, for example, in experimental data, to identify valuable measure points that can be quantified. This problem is even more prominent because the size of the data is harder to control for. Well conducted qualitative research in the context of these data allows the researcher to highlight many more interesting phenomena and explain them within the context in which they occur. However, "[w]ell conducted qualitative research is very labour-intensive and therefore qualitative studies typically use, of necessity, much smaller samples of participants than quantitative ones" (Dornyei 2007: 38). Due to this relatively small sample size a statistical analysis as a second step would not come up with reliable results. In order to nevertheless illustrate developmental trends (which need to be corroborated in the future with the help of more data), the current qualitative analysis of spontaneous data is followed by a simple quantitative counting approach. In a first step, the data of each age group was exhaustively searched for spatial linguistic tokens, which were marked and classified according to the categories "spatial category", "simple or compound category" and "level of abstraction". Despite the primary focus on the spatial categories, the categorical interpretation of the corpus data via contextual and co-textual information is based on a qualitative evaluation of the categories' linguistic correlates (e.g., spatial prepositions or motion verbs) as their representatives on the language level. However, such an analysis of the linguistic correlates of spatial categories is not straightforward as will be discussed in more detail in Section 5. In the next section, the results are given in relation to the (sub-)hypotheses put forth earlier. In order to give a fuller picture of late stages of spatial language acquisition as a whole, reference will be made to the findings of the previous study concerning years 10 to 19 where necessary.
12. Due to their grammaticalized status, the "going to" future and existential "there" as instances of transmetaphorical spatial meaning do not presuppose a higher cognitive complexity for speakers than e.g., metaphorical spatial meaning. Instead, they are learnt as separate items in their idiomatic meaning from early on (cf. Graf in press on the relationship between syntagmatic co-occurrence of multi-word units, metaphorical strength and levels of abstraction in spatial meaning).
4. Results
Does spatial language differ with respect to the distribution of simple and compound spatial categories?
Table 3 below gives the total number of spatial configurations found in the data and, more importantly for the present purpose, the number of simple and compound categories for each age group.
Table 3. Number of spatial categories and distribution across simple and compound categories

Age group                                  6                8                10
total number of spatial configurations     1,894            2,046            2,395
simple                                     1,605 (84.7%)    1,693 (82.7%)    1,971 (82.3%)
compound                                   289 (15.3%)      353 (17.3%)      424 (17.7%)
Whereas six-year-old speakers represent a total of 1,894 spatial configurations in their speech, the speech of eight-year-olds contains 2,046, and that of ten-year-olds contains 2,395. Altogether, 6,335 Spatial Reference Acts were analyzed. The steady increase in SRAs from the youngest to the oldest speakers must be interpreted with great care at this point. According to the Carterette manual, roughly the same amount of language data is coded for the three groups, which would suggest that older speakers refer more often to spatial configurations than younger speakers. A possible explanation for these varying numbers, however, is the free choice of topics, one characteristic feature of natural speech. Figure 1 below presents the simple and compound categories used by each age group as proportions.
Figure 1. Total number of spatial categories and their distribution across simple and compound categories (numbers in percentage)
We find a rather consistent distribution of simple and compound categories; more than 4/5 of all spatial configurations in the three age groups are represented via simple categories. There is, however, some development: Whereas the simple categories account for 84.7% and the compound categories for 15.3% of all spatial configurations expressed by the six-year-old speakers, the simple categories account for 82.7% and the compound categories for 17.3% in the language of the eight-year-old speakers. Finally, the oldest speakers apply simple categories in 82.3% of all cases and compound categories in 17.7% of all cases. Such clear dominance of the simple spatial categories was also found in the previous study of speakers between 10 and 19 years of age (see Table 4), where simple categories accounted for about 88%, and compound categories for about 12% of all instances of spatial reference throughout the data.
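The percentages reported here follow directly from the counts in Table 3. A minimal Python sketch of that arithmetic is given below; the names `TABLE_3` and `compound_share` are introduced for illustration only, and the sketch is plain descriptive counting, not a statistical test.

```python
# Counts copied from Table 3: age group -> (simple, compound).
TABLE_3 = {6: (1605, 289), 8: (1693, 353), 10: (1971, 424)}


def compound_share(simple: int, compound: int) -> float:
    """Percentage of compound categories among all spatial configurations."""
    return 100.0 * compound / (simple + compound)


# Share of compound categories per age group, rounded to one decimal.
shares = {
    age: round(compound_share(simple, compound), 1)
    for age, (simple, compound) in TABLE_3.items()
}

# Total number of Spatial Reference Acts across the three groups.
total_sras = sum(simple + compound for simple, compound in TABLE_3.values())

print(shares)
print(total_sras)
```

Because the three groups contributed different numbers of configurations, shares rather than raw counts are the appropriate basis for comparing them.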
Table 4. Total number of spatial categories and their distribution across simple and compound categories from Graf (2006)

Age group                                  10-13            14-16            17-19
total number of spatial configurations     2,289            2,142            2,000
simple                                     2,019 (88.2%)    1,917 (89.5%)    1,759 (87.9%)
compound                                   270 (11.8%)      225 (10.5%)      241 (12.1%)
Contrary to the very balanced findings in Graf (2006) for the 10 to 19 year olds, the distribution of simple and compound categories in the present data shifts towards a slightly larger proportion of compound categories towards the end of the developmental phase, as speakers at the age of ten most often refer to spatial configurations via compound categories, followed by speakers at the age of eight. Considering the higher cognitive and linguistic complexity of compound categories as well as their pragmatic function of clarifying referential ambiguity, it seems expectable that older speakers make more frequent use of such complex spatial reference in the form of compound categories. However, as the analysis of the 10 to 19 year old speakers' use of compound spatial categories evinced an overall lower, albeit balanced, proportion (simple categories: approx. 88% vs. approx. 82 to 85%), the current interpretation, i.e., the frequency of use of compound spatial categories as an indicator of more advanced language development, must be critically evaluated in future research (see Section 5).
Does spatial language differ with respect to the overall distribution of the five simple spatial categories?
The next possible indicator of spatial development is the distribution of the five simple spatial categories Reference Entity Distance (proximity/distance), Reference Entity Motion (D), Reference Entity Location (S), Reference Entity Dystation (DS) and Reference Entity Dimension (DIM) (see Table 5 and Figure 2 below). Speakers of all age groups most often refer to dynamic spatial relations: Around 50% of all simple spatial categories in each age group represent dynamic relations (e.g., "ride all around town" or "go home from school" (6 years)). This striking dominance of motion events is replicated in the findings from earliest phases of spatial language development (see Graf 2006: 161 for a summary of studies) and from the pre-linguistic phase where young infants already show a preference for moving visual displays over static displays as reported e.g., by Atkinson (1993). A further confirmation of the exceptional importance of dynamism for human cognition
Table 5. Types of simple spatial categories13

Age group             6              8              10
S                     321 (20.0%)    349 (20.6%)    408 (20.7%)
D                     791 (49.3%)    901 (53.2%)    996 (50.5%)
DS                    81 (5%)        51 (3.0%)      74 (3.8%)
DIM                   111 (6.9%)     61 (3.6%)      101 (5.1%)
Proximity/distance    301 (18.8%)    331 (19.6%)    392 (19.9%)
Figure 2. Types of simple spatial categories (numbers in percentage)
and language is illustrated in Table 6 below, where findings from the analysis of older children's and adolescents' use of the five basic spatial categories corroborate and support this clear trend (cf. Graf 2006: 185 ff.). Another interesting finding is the proportion of the static category and the deictic category that includes both distance and proximity. Both
Table 6. Types of simple spatial categories from Graf (2006)

Age group             10-13          14-16          17-19
S                     315 (15.6%)    344 (17.9%)    332 (18.8%)
D                     992 (49.2%)    983 (51.3%)    827 (47.1%)
DS                    100 (4.9%)     72 (3.8%)      80 (4.6%)
DIM                   85 (4.2%)      59 (3.0%)      47 (2.7%)
Proximity/distance    526 (26.1%)    459 (24.0%)    472 (26.8%)
13. D = Dynamic category (Reference Entity Motion); S = Static category (Reference Entity Location); DS = Dystatic category (Reference Entity Dystation); DIM = Dimensional category (Reference Entity Dimension).
amount to around 20% each for each age group. These findings contrast with those from the language of speakers between 10 and 19 years of age. While we also find a very consistent distribution, the older speakers used the category distance/proximity more often than the static category (around 25% vs. 17-19%). Finally, both the dimensional category and the dystatic category are, with only 3 to 7%, relatively little used. This trend again is replicated in the previous study.
Does spatial language differ with respect to the preference for literal or non-literal uses of spatial categories?
The last question focuses on the different levels of abstraction in spatial meaning. Table 7 below summarizes the findings of the distribution of the simple spatial categories across the four levels of abstraction, i.e., literal, metaliteral, metaphorical and transmetaphorical spatial meaning.
Table 7. Levels of abstraction in simple categories

Age group            6              8              10
literal              890 (55.5%)    837 (49.4%)    884 (44.9%)
metaliteral          183 (11.4%)    117 (6.9%)     123 (6.2%)
metaphorical         411 (25.6%)    580 (34.3%)    762 (38.7%)
transmetaphorical    121 (7.5%)     159 (9.4%)     202 (10.2%)
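As a quick plausibility check, the four level counts in Table 7 should add up, per age group, to the number of simple categories reported in Table 3. The Python sketch below performs that check and derives the combined non-literal share per age group; the variable and function names are assumptions made for this illustration.

```python
# Simple-category totals per age group, copied from Table 3.
SIMPLE_TOTALS = {6: 1605, 8: 1693, 10: 1971}

# Level counts per age group, copied from Table 7, in the order
# (literal, metaliteral, metaphorical, transmetaphorical).
LEVEL_COUNTS = {
    6: (890, 183, 411, 121),
    8: (837, 117, 580, 159),
    10: (884, 123, 762, 202),
}


def non_literal_share(counts):
    """Percentage of all non-literal levels taken together."""
    return round(100.0 * sum(counts[1:]) / sum(counts), 1)


# Every simple category should carry exactly one level of abstraction.
consistent = all(
    sum(LEVEL_COUNTS[age]) == SIMPLE_TOTALS[age] for age in SIMPLE_TOTALS
)
non_literal = {age: non_literal_share(c) for age, c in LEVEL_COUNTS.items()}

print(consistent)
print(non_literal)
```

Grouping the three non-literal levels is a simplification chosen for this sketch; the tables keep them apart.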
We observe that literal use outweighs all other uses in all three age groups, i.e., all speakers use the simple spatial categories most often to refer to truly spatial configurations. The meta-study of more than 75 studies dedicated to early phases of spatial language acquisition (cf. Graf 2006: 176 ff.) showed that children's development with respect to the use of spatial categories on non-literal levels is on its way, but that this development is in no way finalized. In the acquisitional phase analyzed here, this developmental trend continues: 6 year old speakers refer in more than 55% of all cases to literal spatial configurations, 8 year old speakers already less than 50% of the time and, finally, the oldest speakers' literal use decreases to just under 45%. At the same time, the metaphorical uses constantly rise across the three age groups from around 25% in the language of speakers aged 6 towards nearly 40% in the spontaneous speech of the ten year olds. These findings are especially sound when considered in the larger context of how literal and metaphorical spaces develop throughout the years. Although the majority of spatial acquisitional studies focus on literal space, indirect findings do confirm the early importance of spatial knowledge in the
conceptualization of abstract domains: Results from Clark and Carpenter (1989, 1994), Friedman and Seely (1976), Johnson (2001), Macrae (1976) and Weist (1991) illustrate that children first acquire demonstratives, spatial prepositions and spatial verbs in their literal spatial sense, before they begin, as early as age 2, to metaphorically extend them onto non-literal, abstract domains such as time. In contrast, looking at Table 8 below, the findings from speakers between 10 and 19 years of age clearly indicate that they have fully mastered all levels of abstraction in spatial meaning. Furthermore, metaphorical uses not only predominate across all three age groups, but we also witness a very similar distribution of all levels of abstraction in spatial meaning in that data.
Table 8. Levels of abstraction in simple categories from Graf (2006)

Age group            10-13          14-16          17-19
literal              727 (36.0%)    518 (27.0%)    547 (31.1%)
metaliteral          145 (7.2%)     169 (8.8%)     139 (7.9%)
metaphorical         847 (42.0%)    893 (46.6%)    756 (43.0%)
transmetaphorical    299 (14.8%)    337 (17.6%)    318 (18.0%)
5. Discussion
Spatial language acquisition can be considered from two different perspectives. Whereas the first perspective concentrates on the questions of when and how children acquire spatial reference, i.e., adopts a more process-oriented stance, the second perspective presupposes the acquisition of these items and concentrates on their actual use, i.e., on spontaneous spatial reference, thus adopting a more product-oriented stance. The process of spatial language acquisition is primarily tested with the help of experiments where the items under scrutiny are elicited in narrowly defined testing procedures; the product of spatial language acquisition is best investigated with the help of analyzing spontaneous, authentic language data that offers the widest possible co- and contextual variation. The majority of acquisitional studies focus on the procedural aspects, applying experiments in carefully controlled contexts such as for example toy play or picture book reading. The acquisition of spatial items is thereby predominantly tested in purely spatial contexts14 (for a critical evaluation of such
14. Of great interest in these studies has been how language-specific characteristics such as a satellite- or verb-framed typology influence the acquisition process of the category SPACE, whose importance and ubiquity in human thought and language is universally observable and acknowledged (Choi and Bowerman 1991; Bowerman and Choi 2001; Choi 2006; Hickmann and Hendriks 2006).
procedures see Crystal 1997; Yont et al. 2003). Product-oriented studies find a prominent forerunner in Ames and Learned's study from 1948, but their approach has not been frequently replicated. The present study as well as Graf (2006) take up this approach and look at the natural use of the different types of spatial reference in literal as well as non-literal senses. The results stem from a qualitative categorical interpretation of transcripts of spontaneous speech samples. Yet, such a method is not an unproblematic endeavor. Due to the characteristics of spoken language such as false starts, repetitions, ungrammatical structures etc., the analyzed material contains instances of spatial representations whose status is problematic. The uncontrollability of conversational topics in natural use of spoken language adds another challenge. Furthermore, the analyses above were based on the transcribed version of the face-to-face interaction that should be complemented by multimodal information (Arndt and Janney 1987). In this respect, the categorical interpretation is challenging in certain cases. A similar argument holds for the uncontrollability of the co- and context, which also renders decisions on category membership sometimes difficult. Last but not least, basing the categorical interpretation on the categories' linguistic correlates is not a straightforward endeavor. While closed-class tokens such as spatial prepositions or the demonstratives constitute a clear-cut group and pose no problems for the analysis, open-class items are often less easy to interpret and it is sometimes hard to draw the line between what counts as a spatial verb or noun or not: Whereas "come" and "go" and other examples of core spatial expressions are easily classifiable and are frequently dealt with in the literature on spatial language, prepositional and phrasal verbs such as "write down", "speak up" or "look at" are less clear-cut cases.
The same holds true for spatial nouns, where category membership is often fuzzy. To keep the subjective moment of such a qualitative approach at a minimum, independent coding by more than one researcher is a clear desideratum for further projects in order to achieve inter-coder reliability. Bearing such methodological problems in mind, the presented findings of this (exploratory) study do raise important issues of spatial language development. For the period under scrutiny here, i.e., the years 6 to 10, the distribution of simple vs. compound categories, the use of the five simple spatial categories as well as the use of these spatial categories on the four levels of abstraction were hypothesized to function as possible indicators of spatial development (see Chapter 3). Consequently, these years were hypothesized to represent a proper developmental phase within speakers' spatial development, bridging the phases of primary spatial language acquisition and the years after age 10, where, according to a prior
study in the same framework (Graf 2006), no further spatial development could be reported. As regards the distribution of simple and compound categories in the Carterette data, speakers rather consistently apply compound categories in about 1/5 of all instances of spatial reference. Only a small increase in the use of compound categories can be reported for the older speakers. To adequately apply compound categories in context, children have to come to terms with what Carstensen (2001: 73) calls Konzept-Unverträglichkeit (concept incompatibility): Spatial configurations that cannot be perceived and conceived together in a complex configuration cannot be represented together on the language level. At the same time, children have to learn that a certain spatial ambiguity for the communicative partner may be solved with the help of applying compound categories in one's Spatial Reference Act (e.g., Which one is yours? The one to your left). What is more, the use of compound spatial categories such as the tiny little opening through the couch between the wall (age 10) presupposes a higher cognitive and linguistic complexity. The slight increase in compound categories from 6 to 10 years of age then possibly implies an ongoing development for the phase under scrutiny based on such questions of cognitive complexity and communicative awareness. The overall dominance of simple categories, in turn, replicates the findings from the analysis of spatial reference in 10 to 19 year old speakers. However, on average the proportion of simple categories was greater in the previous study, with around 88% of simple categories across the three tested age groups. Bringing the findings from the two studies together, the issue of the general comparability of the two sets of data needs to be raised: Whereas experimental settings allow for an exact control of the testing parameters, spontaneous language material defies such control in general.
In addition, the Carterette material stems from semi-spontaneous interaction initiated by a researcher, whereas the COLT data represents truly spontaneous interaction. Still, the chosen data, based on its co- and (partly) contextual variation, is the best and only available material to undertake the endeavor of a qualitative interpretation of spontaneous spatial reference in a developmental context. A critical look at what types of spatial categories are combined within such compound categories as well as an assessment of the communicative co- and context are necessary. What is more, controlled experiment situations in which speakers' use of compound categories is tested in relation to their age, cognitive and communicative development etc. are needed to refine and further elaborate on these qualitatively based findings. As regards the five simple spatial categories and the question whether their distribution functions as an indicator of development, the following
trend can be reported: The most frequently used spatial category is Reference Entity Motion; speakers of all age groups refer to dynamic spatial relations in roughly 50% of all instances of spatial reference. Our human propensity towards dynamism as claimed by Talmy (2003: 12), already confirmed for the pre-linguistic spatial development where infants prefer dynamic over static relations and corroborated for the earliest stages of spatial language acquisition (cf. a meta-analysis in Graf 2006: 147ff), is also shaping the phase from 6 to 10 years of age and explains the lack of development. Distance/proximity relations as well as static spatial relations are consistently used; the same holds true for the categories Reference Entity Dimension and Reference Entity Dystation, which are rather consistently distributed across the data and are of overall minor importance. The slightly different proportions of Reference Entity Location and Reference Entity Distance in the present and the previous data may derive from the varying contextual situations of the data collection or the general freedom of topic choice in spontaneous speech. Whereas the speakers between six and ten from the Carterette files were sitting round a table while conversing with each other, the older informants in the COLT corpus were engaged in all kinds of activities such as walking home from school, playing football, going to the movies etc. (see Section 3). Such outdoor activities may require a more frequent use of deictic information; however, again, this assumption remains to be confirmed by further analysis. In addition, it is necessary in the future to focus on the various subtypes of the simple spatial categories, whose differing cognitive complexity may evince, after all, some developmental differences in the language of speakers of different ages. At this point, however, the distribution of the simple spatial categories cannot be considered an indicator for spatial development.
Last, but not least, the distribution of the four levels of abstraction was hypothesized as another possible indicator for spatial development. The proportion of literal and non-literal uses of spatial categories in the language of speakers between 6 and 10 years of age indeed points to clear developmental changes: There is a decrease in literal uses from the youngest speakers, where more than half of all instances of spatial categories are literal, to the oldest speakers, where less than half of all instances of spatial categories are literal. The decrease is paralleled by an increase of metaphorical uses across the three age groups, starting with one out of four in the language of the 6 year olds and ending with roughly one out of three in the language of the oldest, i.e., the 10 year olds. This developmental trend in the data is backed up by acquisition studies which claim that children first acquire spatial expressions in their literal spatial
meaning, before they learn to extend them onto non-literal uses (cf. Clark and Carpenter 1994; Friedman and Seely 1976; Johnson 2001; Macrae 1976; Weist 1991). Moreover, the rise of metaphorical uses and, at the same time, the decline of the literal uses is indirectly corroborated by the findings from the previous study. All age groups consistently showed more metaphorical than literal uses of all spatial categories under scrutiny (cf. Graf 2006: 189ff) (for a more detailed analysis of the types of spatial metaphors in the language of speakers between 10 and 19 years of age and the varying degree of their metaphorical strength see Graf in press). We can observe a clear developmental change then with respect to literal and non-literal uses of spatial categories in the phase analyzed here, i.e., literal and non-literal uses function as true indicators of spatial development. At the same time, this development seems to be finalized at around the age of 10 according to the previous study. This implies a relatively narrow acquisition frame for spatial metaphors, which may best be explained with the ubiquity of spatial reference in the linguistic input with which children grow up as well as their own need to spatialize abstract ideas to make them intellectually accessible. Other metaphorical extensions, based on more complex source domains, may take more time to fully develop: ". . . not surprisingly, the ability to understand more complex analogical and metaphorical mappings involving less familiar domains . . . or higher order relations . . . increases with age and achieves adult-like quality somewhere between ages 10;0–14;0" (Özçalışkan and Goldin-Meadow 2005: 233). However, as was already mentioned, ". . . there has been no systematic work on how children learn the metaphorical extensions of motion [and other spatial relations, the author] as they become native speakers of a particular language" (Özçalışkan 2005: 291f).
More research on the acquisition of spatial language in non-spatial contexts is required, especially with respect to those frequent incidents of highly conventionalized spatial metaphors (see Graf in press). What seems especially rewarding for future research then is the concept of conventionalization and how it influences children's acquisition of the different levels of abstraction in spatial meaning: According to Svanlund (2007: 577f), the conception of conventionalization as put forth in conceptual metaphor theory thereby . . . underestimates the social nature of conventions. It also underestimates the role of linguistic experience. A more usage-based, i.e., communicatively oriented approach is required. To sum up, the period from 6 to 10 years of age can indeed be considered a proper developmental phase in spatial language acquisition. Particularly with respect to literal and non-literal uses of spatial language, important developmental steps take place. The youngest speakers refer twice as often to literal as to metaphorical spaces, but even the eight-year-old children still show a clear preference for such literal uses. However, their metaphorical uses are clearly on the rise, and the oldest group of speakers, the ten year olds, shows a nearly balanced preference for both literal and metaphorical uses, with a slight preference still for the literal ones. This would mean, at the same time, that the period of secondary spatial language development covers the years 6 to 10, not the years between 10 and 19 as assumed at the beginning of Graf (2006).
6. Conclusion and Future Research Desiderata
A larger picture of spatial language development is slowly taking shape. The following developmental trend emerges: English spatial language development takes place during three separate, but related, phases, up to around the age of 10. After these years hardly anything develops, at least according to Graf (2006). The presence of spatial experience from the earliest moments of human life and its importance for human cognition and language could account for this relatively early mastering of linguistic reference to space. However, more integrative studies of spontaneous spatial reference in naturally occurring speech are needed. As assumed at the beginning of this paper, contrary to such a lack of spatial linguistic development after age 10, important developmental steps take place in the language of children between six and ten years of age, especially in the context of literal and metaphorical use of spatial categories. One possible explanation may lie in the dramatic changes that happen in children's lives around that age. In the language context under scrutiny here, they leave home and start school, they learn to read and to write, i.e., they acquire new media that supply them with additional (spatial) information, etc. All of this entails new cognitive and communicative challenges that contribute to the consolidation and elaboration of the already acquired spatial linguistic capacities. However, much remains to be done to confirm or challenge and further specify the big picture. Necessary next steps must include the analysis of adult spontaneous language with respect to the use of the various spatial categories as well as their use on the different levels of abstraction. At this stage, it can already be hypothesized that we will find a similar predominance of dynamic spatial categories as well as a majority of metaphorical uses of spatial categories, which would manifest, once more, the human propensity towards spatial metaphors and the human propensity towards dynamism.
In addition, children's earliest spontaneous use of spatial language should also be studied from an integrative perspective. As claimed by Tomasello (2003), children pass through a stage of isolated islands of
knowledge that are only linked later on due to more linguistic input and experience, the capacity to draw analogies and make abstractions as well as their general cognitive development. It should be rewarding to trace such linking of the various spatial categories and their linguistic forms with respect to questions of mutual influence, the acquisition of the various meaning components within certain spatial expressions in differing co- and contexts, as well as their use on the various levels of abstraction. Such assumed mutual influence of knowledge and use of certain spatial categories and their correlates on other spatial categories should thereby be both traced in spontaneous language and tested in experimental settings. The combination of a qualitative and integrative analysis of spontaneous spatial language use with experimentally based research on, e.g., age-preferential encodings of spatial configurations via simple or compound categories is of utmost importance for the reliability of the findings. Once such product- and process-oriented findings are available, the spatial development of speakers of English can be documented from the very beginning to its finalization in adolescent years as well as its further characteristics in adult years.
Received 1 March 2009
Revision received 25 November 2009
References
Ames, Louise and Janet Learned. 1948. Development of Verbalized Space in the Young Child. Journal of Educational Psychology 39. 101–116.
Arndt, Horst and Richard Janney. 1987. InterGrammar. Berlin: Mouton de Gruyter.
Atkinson, Jean. 1993. A neurobiological approach to the development of 'where' and 'what' systems for spatial representation in human infants. In Naomi Eilan, Rosaleen McCarthy and Bill Brewer (eds.), Spatial representation: Problems in philosophy and psychology, 325–339. Oxford: Blackwell.
Aurnague, Michel, Maya Hickmann and Laure Vieu (eds.). 2007. The categorization of spatial entities in language and cognition. Amsterdam: John Benjamins.
Berman, Ruth (ed.). 2004. Language development across childhood and adolescence. Amsterdam: John Benjamins.
Berman, Ruth and Dan I. Slobin. 1994. Relating events in narrative: A crosslinguistic developmental study. New York: Lawrence Erlbaum Associates.
Blake, Joanna. 2000. Routes to child language: Evolutionary and developmental precursors. Cambridge: Cambridge University Press.
Boers, Frank. 1996. Spatial prepositions and metaphor. Tübingen: Gunter Narr.
Boroditsky, Lera. 2000. Metaphoric structuring: understanding time through spatial metaphors. Cognition 75. 1–28.
Bowerman, Melissa. 1996. The origins of children's spatial semantic categories: cognitive vs. linguistic determinism. In John J. Gumperz and Stephen Levinson (eds.), Rethinking linguistic relativity, 145–176. Cambridge: Cambridge University Press.
Alpen-Adria University
Bowerman, Melissa and Stephen Levinson (eds.). 2001. Language acquisition and conceptual development. Cambridge: Cambridge University Press.
Bowerman, Melissa and Soonja Choi. 2001. Shaping meanings for language: Universal and language-specific in the acquisition of spatial semantic categories. In Melissa Bowerman and Stephen Levinson (eds.), Language acquisition and conceptual development, 475–511. Cambridge: Cambridge University Press.
Carlson, Laura and Emile van der Zee (eds.). 2005. Functional features in language and space: Insights from perception, categorization, and development. Oxford: Oxford University Press.
Carstensen, Kai-Uwe. 2001. Sprache, Raum und Aufmerksamkeit. Tübingen: Max Niemeyer.
Choi, Soonja. 2006. Influence of language-specific input on spatial cognition: Categories of containment. First Language 26(2). 207–232.
Choi, Soonja and Melissa Bowerman. 1991. Learning to express motion events in English and Korean: The influence of language-specific lexicalization patterns. Cognition 41. 83–121.
Clark, Eve and Kathie Carpenter. 1989. On children's use of from, by, and with in oblique noun phrases. Journal of Child Language 16. 349–364.
Clark, Eve and Kathie Carpenter. 1994. The notion of source in language acquisition. In Paul Bloom (ed.), Language acquisition, 251–284. Cambridge, MA: MIT Press.
Crystal, David. 1997. The Cambridge encyclopedia of the English language. Cambridge: Cambridge University Press.
Diewald, Gabriele. 1997. Grammatikalisierung. Eine Einführung in Sein und Werden grammatischer Formen. Tübingen: Max Niemeyer.
Dominey, Peter. 2006. From holophrases to abstract grammatical constructions: Insights from simulation studies. In Eve Clark and Barb Kelly (eds.), Constructions in acquisition, 137–161. Stanford: CSLI Publications.
Dörnyei, Zoltán. 2007. Research methods in applied linguistics. Oxford: Oxford University Press.
Eckert, Penelope. 1998. Age as a sociolinguistic variable. In Florian Coulmas (ed.), Handbook of Sociolinguistics, 151–167. Oxford: Oxford University Press.
Evans, Vyvyan. 2003. The structure of time: Language, meaning and temporal cognition. Amsterdam: John Benjamins.
Friedman, William and Pamela Seely. 1976. The child's acquisition of spatial and temporal word meaning. Child Development 47. 1103–1108.
Gapp, Klaus-Peter. 1997. Objektlokalisation: Ein System zur sprachlichen Raumbeschreibung. Wiesbaden: Deutscher Universitätsverlag.
Gardner, Howard. 1974. Metaphors and modalities: How children project polar adjectives onto diverse domains. Child Development 45. 84–91.
Gentner, Dedre. 2001. Spatial metaphors in temporal reasoning. In Meredith Gattis (ed.), Spatial schemas and abstract thought, 203–222. Cambridge, MA: MIT Press.
Gentner, Dedre and Lera Boroditsky. 2001. Individuation, relativity, and early word learning. In Melissa Bowerman and Stephen Levinson (eds.), Language acquisition and conceptual development, 215–256. Cambridge: Cambridge University Press.
Gentner, Dedre and Susan Goldin-Meadow (eds.). 2003. Language in the mind: Advances in the study of language and thought. Cambridge, MA: MIT Press.
Graf, Eva-Maria. 2006. The ontogenetic development of literal and metaphorical space in language. Tübingen: Gunter Narr.
Graf, Eva-Maria. In press. Adolescents' use of spatial time metaphors: A matter of cognition or socio-communicative practice? Journal of Pragmatics. Special Issue: The Language of Space and Time.
Gumperz, John J. and Stephen Levinson (eds.). 1996. Rethinking linguistic relativity. Cambridge: Cambridge University Press.
Hickmann, Maya. 2003. Children's discourse: Person, space and time across languages. Cambridge: Cambridge University Press.
Hickmann, Maya. 2007. Static and dynamic location in French. Developmental and crosslinguistic perspectives. In Michel Aurnague, Maya Hickmann and Laure Vieu (eds.), The categorization of spatial entities in language and cognition, 205–231. Amsterdam: John Benjamins.
Hickmann, Maya and Henriette Hendriks. 2006. Static and dynamic location in French and in English. First Language 26(1). 103–135.
Johnson, Christopher. 1997. Learnability in the acquisition of multiple senses: SOURCE Reconsidered. Proceedings of the 22nd annual meeting of the Berkeley Linguistics Society, 469–480. Berkeley: Berkeley Linguistics Society.
Johnson, Christopher. 1999. Metaphor vs. Conflation in the acquisition of polysemy: The case of see. In Masako K. Hiraga, Chris Sinha and Sherman Wilcox (eds.), Cultural, typological and psychological issues in cognitive linguistics: Current issues in linguistic theory, 155–169. Amsterdam: John Benjamins.
Johnson, Christopher. 2001. Constructional Grounding: On the relations between deictic and existential there-constructions in acquisition. In Alan Cienki, Barbara Luka and Michael Smith (eds.), Conceptual and discourse factors in linguistic structures, 123–136. Stanford: CSLI Publications.
Karmiloff, Kyra and Annette Karmiloff-Smith. 2001. Pathways to language: From fetus to adolescent. Cambridge: Cambridge University Press.
Keysar, Boaz, Yeshayahu Shen, Sam Glucksberg and William Horton. 2000. Conventional Language: How Metaphorical Is It? Journal of Memory and Language 43. 576–593.
Kuteva, Tania and Chris Sinha. 1994. Spatial and non-spatial uses of prepositions: Conceptual integrity across semantic domains. In David Mark and Andrew U. Frank (eds.), Cognitive and linguistic aspects of geographic space, 419–434. Dordrecht: Kluwer.
Labov, William. 1972. Sociolinguistic patterns. Oxford: Blackwell.
Levinson, Stephen. 2003. Space in language and cognition: explorations in linguistic diversity. Cambridge: Cambridge University Press.
Levinson, Stephen and David Wilkins (eds.). 2006. Grammars of space: Explorations in cognitive diversity. Cambridge: Cambridge University Press.
Levinson, Stephen and David Wilkins. 2006b. Patterns in the data: towards a semantic typology of spatial description. In Stephen Levinson and David Wilkins (eds.), Grammars of space: Explorations in cognitive diversity, 512–552. Cambridge: Cambridge University Press.
Lock, Andrew. 1980. The guided reinvention of language. London: Academic Press.
Lock, Andrew. 1999. Preverbal communication. In Gavin Bremner and Alan Fogel (eds.), Handbook of infancy research. Oxford: Blackwell.
Macrae, Alison. 1976. Movement and location in the acquisition of deictic verbs. Journal of Child Language 3. 191–204.
Mayer, Mercer. 1969. Frog, where are you? New York: Pied Piper.
Nippold, Marilyn A. 1998. Later language development: The school-age and adolescent years. Austin: Pro-Ed.
Nowak, Peter. 2007. Meta-Studien Methodik: ein neues Methodenparadigma für die Diskursforschung. Gesprächsforschung 8. 89–116.
Özçalışkan, Şeyda. 2003. Children's developing understanding of metaphors about the mind. In Barbara Beachley, Amanda Brown and Frances Conlin (eds.), Proceedings of the 27th Annual Boston University Conference on Language Development, 603–614. Somerville, MA: Cascadilla Press.
Özçalışkan, Şeyda. 2004. Encoding the manner, path and ground components of a metaphorical motion event. Annual Review of Cognitive Linguistics 2. 73–102.
Özçalışkan, Şeyda. 2005. On learning to draw the distinction between physical and metaphorical motion: is metaphor an early emerging cognitive and linguistic capacity? Journal of Child Language 32. 291–318.
Özçalışkan, Şeyda and Susan Goldin-Meadow. 2005. Gesture is at the cutting edge of early language development. Cognition 96. B101–B113.
Pearson, Barbara. 1990. The comprehension of metaphor by preschool children. Journal of Child Language 17. 185–203.
Plumert, Jodie M. 1996. Young children's ability to detect ambiguity in descriptions of locations. Cognitive Development 11. 375–396.
Pruden, Shannon M., Kathy Hirsh-Pasek and Roberta M. Golinkoff. 2008. Current events: How infants parse the world and events for language. In Thomas F. Shipley and Jeffrey M. Zacks (eds.), How humans see, represent, and act on events, 160–192. New York, NY: Oxford University Press.
Quinn, Paul C. 1998. Object and spatial categorization in young infants: 'what' and 'where' in early visual development. In Alan Slater (ed.), Perceptual development: Visual, auditory, and speech perception in infancy, 131–165. Hove: Psychology Press.
Quinn, Paul C. 2005. Developmental Constraints on the Representation of Spatial Relation Information: Evidence from Preverbal Infants. In Laura Carlson and Emile van der Zee (eds.), Functional features in language and space: Insights from perception, categorization, and development, 293–309. Oxford: Oxford University Press.
Rice, Sally, Dominiek Sandra and Mia Vanrespaille. 1999. Prepositional semantics and the fragile link between space and time. In Masako Hiraga, Chris Sinha and Sherman Wilcox (eds.), Cultural, psychological, and typological issues in cognitive linguistics, 107–127. Amsterdam: John Benjamins.
Romaine, Suzanne. 1984. The language of children and adolescents: The acquisition of communicative competence. New York: Basil Blackwell.
Stockman, Ida and Fay Vaughn-Cooke. 1992. Lexical elaboration in children's locative action expression. Child Development 63(5). 1104–1125.
Svanlund, Jan. 2007. Metaphor and convention. Cognitive Linguistics 18(1). 47–89.
Talmy, Leonard. 1996. Fictive motion in language and 'ception'. In Paul Bloom, Mary Peterson, Lynn Nadel and Merrill F. Garrett (eds.), Language and space, 211–276. Cambridge, MA: MIT Press.
Talmy, Leonard. 2000. Towards a Cognitive Semantics. Cambridge, MA: MIT Press.
Talmy, Leonard. 2003. Fictive Motion in Language and 'Ception'. Paper presented at the LIPP symposium, Ludwig-Maximilians-Universität, Munich.
Tenbrink, Thora. 2007. Space, time, and the use of language: An investigation of relationships. Berlin: Mouton de Gruyter.
Tomasello, Michael. 1998. Introduction. In Michael Tomasello (ed.), The new psychology of language: Cognitive and functional approaches to language structure, vii–xxii. Mahwah, NJ: Lawrence Erlbaum.
Tomasello, Michael. 2003. Constructing a language: A usage-based theory of language acquisition. Cambridge, MA: Harvard University Press.
Vorwerg, Constanze and Gert Rickheit. 1999. Richtungsausdrücke und Heckenbildung beim sprachlichen Lokalisieren von Objekten im visuellen Raum. Linguistische Berichte 178. 152–204.
Weist, Rudy. 1991. Spatial and temporal location in child language. First Language 11. 253–267.
Winner, Ellen. 1976. New names for old things: The emergence of metaphoric language. Journal of Child Language 6. 469–491.
Winner, Ellen (ed.). 1995. Developmental perspectives on metaphors. Hillsdale: Lawrence Erlbaum.
Yont, Kristine, Catherine Snow and Lynne Vernon-Feagans. 2003. The role of context in mother-child interactions: an analysis of communicative intents expressed during toy play and picture book reading with 12-month-olds. Journal of Pragmatics 35. 435–454.
On the use of posture verbs by French-speaking learners of Dutch: A corpus-based study

MAARTEN LEMMENS and JULIEN PERREZ*
Abstract This article presents the results of a quantitative and qualitative corpus study of the use of the Dutch posture verbs staan ('stand'), liggen ('lie') and zitten ('sit') by French-speaking learners of Dutch. In addition to providing a quantified insight into which uses of these verbs prove most problematic to the L2 learners, the study has also revealed three important tendencies. Firstly, in line with the typological differences between French and Dutch (where these verbs behave like noun classifiers), our analysis confirms the French-driven tendency of the learners to underuse these verbs. Secondly, and seemingly paradoxically given the previous point, these learners occasionally overuse these posture verbs in contexts where no such verb is allowed. Thirdly, our qualitative analysis of errors reveals that the learners operate on grammaticised semantic distinctions drawn from the target language. Even if the categories used by L2 speakers may not be the same as those exploited by native speakers, our analysis suggests that the L2 speakers are thus aware of the patterns in the input and exploit them in a fashion that may not differ all that much in kind from those in L1 acquisition.
Keywords: posture verbs, Dutch, second language acquisition, learner corpus, lexical, overgeneralisation.
* Address for correspondence: M. Lemmens, Professeur en linguistique et didactique des langues, UFR Angellier, Université de Lille 3, B.P. 60149, 59653 Villeneuve d'Ascq CEDEX, France. Email: maarten.lemmens@univ-lille3.fr; J. Perrez, Facultés Universitaires Saint-Louis (FUSL), Boulevard du Jardin botanique 43, 1000 Bruxelles, Belgium. Email: perrez@fusl.ac.be. Acknowledgements: The authors wish to thank the editors of this special issue (Henriette Hendriks, Maya Hickmann and Katrin Lindner), Sabine De Knop, Philippe Hiligsmann, Aliyah Morgenstern and the two anonymous reviewers for their constructive comments on an earlier version of this paper. The authors are responsible for any errors that remain.
Cognitive Linguistics 21–2 (2010), 315–347. DOI 10.1515/COGL.2010.012. 0936–5907/10/0021–0315 © Walter de Gruyter
1. Introduction
1.1. Scope and issues
Anyone familiar with the teaching of Dutch as a foreign language will know that the use of the three cardinal posture verbs zitten ('sit'), liggen ('lie') and staan ('stand') is often quite problematic for learners. In this paper, we present the results of a corpus-based study, in which we looked at how these verbs are used by Belgian francophone learners of Dutch.1 We approach the data from a quantitative as well as a qualitative perspective. The quantitative analysis allows us to evaluate in which uses the difficulties are mostly situated. The qualitative analysis discusses some of the mechanisms that lead L2 speakers to produce these errors. The present study is but a first (yet essential) step towards a more systematic analysis of the use of posture and location verbs in learner data (in Dutch as well as other languages) and will be followed by comparative research drawing on more controlled (spoken) data along the lines of earlier research in this domain (cf. Lemmens 2005a). Notwithstanding its modesty in scope, our corpus data support a usage-based model of (second) language acquisition, suggesting evidence for partially unit-based learning strategies as well as for systematic overgeneralisations of acquired patterns, much like what is known to occur in L1-acquisition. While the semantic categories with which the learners operate may not be the same as those of the native speakers, the learners' errors show that the learner language is a linguistic system, in which grammaticised semantic distinctions drawn from the target language do play an important role (Klein 2008; cf. also Klein and Perdue 1993; Hiligsmann 1997). If it weren't for this (partial) semanticisation, we could not explain the apparent paradox in the L2-data, i.e., the undeniable (typologically determined) underuse of the posture verbs in general, combined with posture verb overkill in many of the sentences in which these verbs do occur.
1.2. Typological background
In earlier work (Lemmens 2002), one of the authors has characterized the difficulty that francophone learners of Dutch have with posture verbs as being situated on three interrelated levels: (i) coding flexibility, (ii) coding variability and (iii) coding obligation. As the term suggests, coding flexibility refers to the wide range of semantic extensions (semasiological variation) that the posture verbs have in Dutch, since they have grammaticalised to basic locational verbs that are not only used to refer to the basic human postures, but also to the location of any entity in space or metaphorical extensions thereof (cf. section 1.3 below). The second difficulty concerns the coding variation, which represents the other side of the coding coin (onomasiological variation), since one and the same spatial configuration, such as for example in De boter ___ in de koelkast ('the butter ___ in the refrigerator') may be coded either with staan (in which case it metonymically refers to the butter dish standing on its base), with liggen (in which case it talks about the package typically lying on its longest side), or with zitten (an a-positional usage referring to containment only). Each of those verbs clearly imposes its own semantic profile on the scene; the choice of verb cannot be predicted with absolute certainty based on dimensions of the located object (although these dimensions may play a role in certain contexts). Often, (French) L2 speakers are misled by these dimensions, saying for example that a bed in a room or a plate on the table (entities with a salient horizontal dimension/orientation) are 'lying' whereas in Dutch staan ('stand') is to be used. The third level of difficulty, the coding obligation, concerns the fact that the use of a posture verb is obligatory in Dutch whenever an entity is located in space, whereas in English and in French, it is quite common (if not obligatory) to use a verb of EXISTENCE (such as be/être) in locative predications, as illustrated in example (1) below.2
(1) a. my keys are on the table / the car is in front of the house
    b. mes clés sont sur la table / la voiture est devant la maison
    c. mijn sleutels liggen (*zijn) op de tafel / de auto staat (*is) voor het huis
1. The present study focuses on Dutch as spoken in Flanders, the Dutch-speaking part of Belgium, which is also the variant with which the French-speaking learners in this study will be most regularly confronted (even if not exclusively). There are some interesting differences between Belgian and Netherlandic Dutch concerning the use of posture verbs (cf. also Lemmens 2006).
While in English one could still use lie and stand in these two contexts (even if often giving a more stilted formulation), this is quite infelicitous in French: *mes clés sont couchées sur la table / *la voiture est debout devant la maison. The coding obligation in Dutch also holds for many metaphorical uses (even if some leniency is to be attributed to these, cf. Section 3 below). In short, not only do francophone learners of Dutch have to go against their native speaker intuitions and use a posture verb instead of a neutral
2. Examples without a reference have been constructed by the authors (possibly varying on attested uses in other corpora); examples from the corpus are marked with an ID number, and examples found via Google have a URL reference.
verb, they are also confronted with considerable semasiological and onomasiological variation. While this typological difference, related to that between Verb-framed and Satellite-framed languages (see Talmy 2000), has been discussed in earlier work (see, e.g., Lemmens 2005a; Lemmens and Slobin 2008 and the references therein), similar observations have been made in some other recent publications. Of note is the special issue of Linguistics edited by Ameka and Levinson (2007), devoted to location and posture verbs in a typologically varied language sample. While French is not included in their comparative study, the typological distinction between French and Dutch would in their terminology be cast as that between a Type I language (using a single locative dummy verb) and a Type II language using a small set of locative verbs (typically, but not exclusively, posture verbs, as in Dutch). Before we turn to a more detailed discussion of the use of posture verbs in L2 productions, it is essential that we briefly review some of the main patterns of use for the three posture verbs in Dutch, presented in the next section. Restricted to patterns that are immediately relevant to the L2 data, the description is but a summary of more elaborate descriptions of the Dutch posture verbs presented elsewhere (Lemmens 2002, 2006).

1.3. A short overview of Dutch posture verbs
In line with the basic assumptions of Cognitive Grammar, the Dutch posture verbs liggen, zitten, and staan can safely be said to be structured around a prototype, the representation of the three basic human positions. As Newman (2002) correctly observes, these prototypes are experiential clusters of attributes and the extended uses can be explained drawing on the notion of image schemata based on our everyday experience of lying, standing, sitting. Classifying the extensive networks in broad strokes, we can distinguish three types of uses: postural uses, referring to human posture; locational uses, referring to the location of any entity in space; and metaphorical uses, referring to location in abstract space or location of abstract entities in concrete space. The following sections will look at some of the extensions in more detail.

1.3.1. Staan. The most important uses of staan can be summarized as in the schema below.

(i)   be on one's feet
(ii)  be on one's base
        extend upward from base (origin)
        extend from origin in any direction
(iii) have a vertical orientation (absence of base or not on base)
(iv)  be in canonical position
(v)   written text as 'standing'
The image of an object on its base, a logical extension of the prototype configuration of a human being on its feet, is undoubtedly the most productive one within the locational domain. In an earlier corpus study (Lemmens 2002), it was shown to account for almost 60% of the locational uses. Its conceptual importance is further reflected in the fact that the real dimensions of the object do not play a role anymore: for any object resting on its base, a coding with staan becomes the most likely candidate, even if it is more horizontal than vertical, as is the case for cars, plates or laptops, which are said to be standing when resting on their base. Considering cognitive processing, one could argue, as does Serra Borneto (1996) discussing German stehen ('stand'), that the conceptualisation of a base triggers a mental verticality, i.e., the mental image of an upward extension of an object taking the base as its origin.3 Typically, the situation involves a vertical extension (e.g., trees or grass growing upwards from their roots and thus 'standing'), but through image schematic transformation (rotation), the verb can also be applied in contexts where non-vertical direction is at issue, as in Er staan geen takken meer aan deze boom 'There stand no branches to this tree anymore'.4 Such uses of staan do not express verticality but a (moderate) form of perpendicularity.5 Verticality only comes in as a determinative factor in the absence of a base, as in (2a), or when the object is not resting upon its base and verticality is needed to identify its orientation, as in (2b).

(2) a. Het boek staat in het rek. / De golfstok staat in de paraplubak.
       'the book stands on the shelf / the golfclub stands in the umbrella holder.'
    b. De borden staan in de afwasmachine / De fles stond omgekeerd op tafel.
       'the dishes stand in the dish washer / the bottle stood upside-down on (the) table'
It is particularly in this case that staan provides a maximal opposition with liggen. Before continuing with liggen and zitten, however, we need
3. On German posture verbs, see also Fagan 1991 and Kutscher and Schultze-Berndt 2007.
4. The English glosses are but literal translations of the Dutch originals using as much as possible the English equivalents sit, lie, or stand.
5. Dutch is not isolated in this. Perpendicularity is also an important notion in, for example, Trumai, a genetic isolate spoken in Brazil (cf. Guirardello-Damian 2002).
to mention a few metaphorical extensions for staan that will be immediately relevant to the L2-data. The first (extension (iv) in the schema above) concerns a number of different uses that all relate to the idea of standing as the canonical position for human beings (cf. also Van Oosten 1984: 144). There are a number of (non-linguistic) arguments to justify this claim. First, standing upright is the position that most distinguishes the human being (homo erectus) from other species, esp. primates. Moreover, standing is the starting position for the proto-archetype of human, self-propelled movement, viz. walking or running (on two legs). Related to this is that, when standing, humans are physically stronger than when sitting or lying and generally have better control over their body movements. Humans in a standing position are also perceptually more distinguishable from their surroundings (cf. the metaphor stand out and outstanding). In short, human beings physically function best when in a standing position, feeding the idea of canonicity. But other sources can also serve to confirm the canonicity: if you ask someone to quickly draw a human being, they will typically draw a standing figure.6 Finally, returning to the domain of linguistic meaning, it can be seen that many extensions, locational and metaphorical, draw precisely on standing as the canonical position. This is especially true for Dutch where, as explained above, staan has become the conventionalized coding for any object resting on its base, the default position also being the object's optimal position, i.e., the functional position it has been designed for. In short, the fact that standing is the canonical position for human beings motivates the use of staan to refer to a human being's default posture, even when posture is backgrounded or even no longer at issue.7 This is reinforced by the use of the verb to refer to objects in their normal (i.e., functional) position. This pertains to our study in two different ways.
First, it may help to explain why staan is used in contexts where there may still be a reference to the standing posture as the most typical posture for the activity at hand, but where there is some non-postural reading as well. A typical case is that of working as a teacher or as a shopkeeper, where you would commonly say (at least in Belgian Dutch) ik sta in het onderwijs (I stand in the education) or Ik sta in een herenboetiek (I stand in a clothes
6. Notice that the standing position is also the way in which humans are represented in handbooks on human anatomy.
7. Clearly, motivation does not equal prediction, as the notion of canonical position can be overruled by, for example, cultural factors. In Ese Ejja, for example, an endangered language (Tacana family) spoken in Peru and Bolivia, for some contexts the default position for men is neki ('stand'), that for women, ani ('sit') (Vuillermet 2008).
boutique for men). Typically, one stands in front of the classroom when teaching or behind the counter when running a shop, but both sentences do more than just refer to that postural configuration, referring to the job as a whole, which involves many more kinds of activities than just standing (walking around, sitting and correcting exams, etc.). Notice that for other types of jobs, if one wants to use a posture verb at all, it will be zitten (e.g., Hij zit in de computerbranche 'He sits in the computer business'), but this is an a-postural use of the verb referring to containment (cf. section 1.3.3 below). Second, and more important than the above cases which are rather limited, there are cases where a standing posture is no longer at issue at all, and staan simply refers to the default position. This has given rise to a wide range of extended uses, as illustrated by the following examples:

(3) a. De politici staan tegenwoordig veel dichter bij de burgers.
       'politicians these days stand much closer to the citizens'
    b. Hoe sta jij tegenover de nieuwe spelling?
       'how do you stand against the new spelling' (= 'What's your position about . . .')
    c. Dit thema staat te ver van de leefwereld van het kind (DL1-S0278)8
       'this theme stands too far from the world of the child' (= 'is too remote from')
Such uses are clearly no longer postural or locational, as they concern one's ideological position on certain issues or simply the position of one entity vis-à-vis another. At the same time, the use of staan is well motivated here, as there still is a link with being in one's default position, particularly since it mostly conceptualises the located entities as being placed there. It cannot be denied, however, that the link that can be construed with one's default position is of variable strength in these uses, suggesting a gradient of metaphorisation. For example, while all uses in the example above are metaphorical, there is arguably a cline ranging from (a) (least metaphorical) to (c) (most metaphorical). The second metaphorical extension that plays an important role in both the L1 and L2 data is that of written text, which in Dutch is invariably coded with staan, as illustrated in the following examples:
8. References to corpus examples such as this one consist of 3 parts: (1) DL1 or DL2 identifying it as taken from the Dutch L1 or L2 corpus respectively, (2) the letters S, Z, L referring to resp. staan, zitten, and liggen, and (3) a number identifying the sentence in question.
(4) a. Wat staat er op deze pagina?
       'what stands there on this page?'
    b. Sommigen staan op een wachtlijst. (metonymy: NAME / PEOPLE)
       'some (people) stand on a waiting list'
The motivation behind this use is probably no longer transparent even to native speakers; nevertheless, two converging factors can still be attributed some motivating force (and this regardless of whether they are etymologically accurate). First, there is the image of text standing on the supporting paper, as if in relief. In order to be readable, letters must be placed on their flat side, which thus becomes their base. The mental scanning vector is thus from the paper upwards to the top surface of the printed letter. Second, letters can be seen as standing on (visible or invisible) horizontal lines on the paper. Hence, you write on the line, the letters have a height. The mental scanning vector is thus different, perpendicular to the previous one, going from the bottom of the line to the smallest top of the letter. We thus disagree with Serra Borneto (1996) who analyses similar uses of German stehen ('stand') as resulting from the metaphor written text as vertical ordering; we do agree with him, however, when he says that "the figurative extension, which started from a perceptual image, has established itself in the conventional knowledge of the speakers and is now active, independently from the original spatial image" (1996: 477). Whatever its motivation, it is clear that this usage has become highly entrenched to the extent that it has laid the basis for extensions to all kinds of imprints (including non-textual ones), of either temporary or permanent nature, such as pictures in a book, text or icons on a screen, marks on the body. All of these can, and often must, be coded with staan. Within the prototypically structured radial network encoded by staan, this usage could be characterized as a local prototype from which new uses extend. The discussion in section 3.2 below will consider the importance of this local prototype for the L2 data.

1.3.2. Liggen.
The following is an overview of the most important uses of liggen that will be briefly discussed here: (i) be on one's sides (human posture); not be on base with horizontal orientation (inanimate entities); not be on one's base (regardless of orientation); location of dimension-less entities; geotopographical location (cities, buildings, etc.); location of abstract entities
(ii) (iii) (iv)
Horizontality is much more important for liggen than verticality is for staan. This horizontality manifests itself in different types. Two large categories of horizontal objects can be distinguished, line types and sheet types, which are maximally distinct in their prototypes but share a transitional zone (small boards, for example, are conceivable as wide lines yet also as small elongated sheets). Within the sheet category are also included different kinds of fabrics (e.g., clothes, towels, etc.) and substances (e.g., liquids, sand, etc.), since they are non-rigid objects that naturally take a horizontal expansion under their own gravitational weight. The difference between Het zout ligt op tafel and Het zout staat op tafel ('The salt lies/stands on the table') is thus metonymical: in the first case, liggen refers to the salt as substance which, unconstrained by any fixed boundaries, will flatten out on the table; in the second case, staan shifts the focus from the substance itself to the saltshaker (itself left implicit however), positioned on its base, and thus in a standing position. One of the particularities of Dutch (but something one finds in other languages as well) is that it has conventionalized the verb liggen to encode the location of symmetrical entities (balls, cubes, wads, etc.). These can be characterized by a lack of dimensional salience, as Serra Borneto (1996) correctly observes for German liegen, perfectly similar to Dutch in this context. He points out how in the absence of dimensional differentiation there is no mental tracing away from the origin that one has with vertical objects or objects resting on their base. The dimension-less use of liggen motivates a number of metaphorical extensions concerning the location of abstract entities.
We are not referring here to the cases where these abstract issues are saliently associated with a particular horizontal form, as may be the case for example with frontiers conceived as lines, or foundations as horizontal supports. The abstract uses that we are concerned with here are those entities that seem to lack such imagery, as for example in De verantwoordelijkheid ligt bij jou 'The responsibility lies with you'. We will not go into detail as to what motivates this extension (see Lemmens 2006); for our present purposes it suffices to point out the entrenchment of liggen as the usual encoding for abstract entities. Another particularly well-entrenched usage of liggen in Dutch is that of geotopographical location, as Serra Borneto (1996) has called it. This concerns cases where buildings, cities, and the like are located geographically. Even when standing right in front of a quite saliently vertical building, like a church, that typically is thought of as standing (resting on its base), we can still felicitously say, e.g., De kerk lag pal voor ons 'the church lay right in front of us'; in that case, we would obviously not be talking about it as a building, but about its geographical location. As we
will detail below, some interesting (erroneous) patterns for this usage emerge from the L2-data.

1.3.3. Zitten. The most important uses of zitten can be summarized as follows:

(i)   be in a sitting posture (considerable postural variation)
        default posture of small animals
        default posture of insects
(ii)  (close) containment (locational usage)
(iii) (close) contact (locational usage)9
9. While this extension is quite important for zitten, where the verb expresses close contact, as for example in Er zit geen deurknop aan deze deur 'there sits no doorknob on this door', it turned out to be irrelevant to this study and will thus be ignored.

Strikingly, zitten shows considerably more variety in the postural domain than do liggen and staan, as it is used for a diversity of positions: (i) resting on the buttocks like on a chair (prototype posture for zitten), or (ii) with the legs crossed (yoga-position), or (iii) with legs stretched out; (iv) a squatting position; (v) on all fours; (vi) on hands and knees; or (vii) on one's knees. Interestingly, some of the extended uses of zitten can be explained from these postural variations. For instance, in a squatting position, the lower legs are bent, the body is close to the ground and often, there is an additional support with our hands on the ground. This postural configuration motivates the use of zitten to express the default position of lower animals such as rabbits, mice, frogs, etc., which usually are not said to stand. In the domain of animal postures, zitten has even gone further in that it is also the default verb for insects that, just as frogs and mice etc., only have a dual postural opposition zitten-liggen (the latter being used, for example, when they are dead). In the absence of postural variation, not much of the notion of posture is probably retained in these uses. The postureless nature of these uses, in combination with the postural variety sketched above, may explain the verb's non-commitment to posture and its productive extensions to other postureless uses. The most important one immediately relevant to the L2-data is what we conveniently label containment-zitten. In the case of containment-zitten the verb no longer encodes posture but merely situates the entity as (closely) contained by a container. Hence the use of zitten to refer to water in a bottle, money in your pocket, a key in the keyhole, dust in your hair, a CD in a CD-case, etc., but also people 'sitting' in prison or in a hotel room, etc. When used with inanimate entities, the contexts usually concern close containment or cases where the position of the contained entity depends on that of the container. As can be expected, zitten is also often used when metaphorical containment is at issue, such as suspense sitting in a race, or a bug sitting in a computer system, or the meaning sitting in a word or text. As we will show, the latter will be of particular interest for our L2-data. The productivity of containment-zitten is clearly illustrated by the L1-corpus used in our study: 64% of the cases refer to containment.10

As has become clear from the above discussion, the three cardinal posture verbs zitten, staan, and liggen have become basic location verbs in Dutch. While many uses of their extensive semantic networks have not been discussed here, the above summary has revealed the basic semantic mechanisms that underlie their most important uses. Considering their locational uses, and particularly the variations that may exist (such as a building said to lie or stand, or salt on the table as lying or standing, or butter lying, sitting, or standing in the fridge), we could say that the Dutch posture verbs actually function as noun classifiers, just as noun suffixes may do in more exotic languages, specifying that the noun in question refers to an entity that is liquid, oblong-shaped, pointed, rigid, sand-like, sticky, tubular, etc. Clearly, the Dutch categories are less refined than in many of these languages, yet the parallel with how the Dutch posture verbs indeed categorize the located entities cannot be denied. Interestingly, Gullberg's analysis of gestures confirms this idea, showing that Dutch speakers are significantly more likely to incorporate figure object information in their gestures than are French speakers (Gullberg to appear; see also Gullberg and Narashimhan, this volume). Using these posture verbs in an idiomatically correct way is quite hard to master for French-speaking learners of Dutch.
The study discussed here is a first attempt at clarifying these difficulties in more detail and preparing the ground for further research. Before turning to the actual quantitative and qualitative analysis, it is appropriate to say a few words about the corpora used (L1 and L2) and how we analysed them.

1.4. Corpus and corpus analysis
This study is based on two corpora: a learner corpus (DL2) and a control corpus (DL1). The learner corpus is a selection from the Leerdercorpus Nederlands (Learner corpus Dutch; see Perrez and Degand, in prep.).
10. This percentage lines up nicely with another corpus-based study (Lemmens 2002), where some 50% of the 4,311 sentences with zitten referred to containment.
This corpus is a collection of texts written by learners of Dutch from different L1-backgrounds (French, German, Polish, Indonesian and Hungarian). Our French selection is drawn from (i) a series of argumentative essays written by French-speaking learners of Dutch studying Dutch as a main option and (ii) writing tasks performed by French learners of Dutch in the context of the CNaVT-exam11. The latter texts show a greater diversity, ranging from essays, summaries and reports, to letters and e-mails. For each text, some meta-information has been recorded concerning the author (mother tongue, study level and orientation) and the text itself (type of text, year, CNaVT-profile). In total, the French DL2 subcorpus contains 1,247 texts amounting to 323,921 words. The control corpus (DL1) is composed of a range of argumentative essays written by native speakers of Belgian Dutch in the framework of a writing proficiency class (first year university students, Ghent University). The size of the corpus is admittedly rather limited (approximately 52,000 words) but its primary interest lies in its argumentative nature, which matches quite well the main type of texts in the learner corpus. Our study of posture verbs on the basis of these corpora is not without limitations. Firstly, argumentative texts are not really representative of the contexts in which posture verbs typically occur, which may result in a limited number of attestations. The planned follow-up studies on spoken data will surely overcome this limitation. Secondly, as indicated before, it is essentially restricted to Belgian Dutch; some of the uses mentioned here may not be common in Netherlandic Dutch. Thirdly, for written corpora it is not always possible to reconstruct the contexts in which they have been produced, which makes it occasionally difficult for the researcher to interpret the learner's intentions in having used a given posture verb.
11. The Certificaat Nederlands als Vreemde Taal (CNaVT) is an internationally recognized certificate for students of Dutch comparable to the Cambridge Certificate in Advanced English.

Finally, the absence of an objectively determined indication of the individual levels of proficiency did not allow a reliable investigation into the evolution over the different levels. Despite these limitations, our study has revealed relevant tendencies concerning the use of posture verbs by French-speaking learners of Dutch. In line with the above analysis of staan, liggen and zitten, we coded, for both the L1 and L2 corpus, the different semantic categories that these verbs are used in. This has been done at two levels of detail. At the highest level, a distinction was made between postural, locational and metaphorical uses of the verbs. In addition to these three categories, we distinguished two other categories at the highest level, viz. the use of the posture verb (i) as root of a particle verb construction and (ii) as part of an idiomatic expression. The former category refers to cases where one of the posture verbs is combined with a particle (such as opstaan 'get up (from bed)' or toestaan 'allow'), whereas the latter category refers to fixed collocational uses (such as bekend staan 'be famous' or onder stress staan 'be under stress') as well as cases where a posture verb is used as part of a fixed multiword unit (in combination with a preposition, an adverb, an adjective and/or a noun) whose global meaning cannot be derived from the meaning of its components in isolation.12 Examples are voor de hand liggen ('lie before the hand' = 'be evident'), op eigen benen staan ('stand on own legs' = 'be independent'), or het niet meer zien zitten ('not see it sit any longer' = 'not able to see one's way out of a situation'). The motivation for separating these two categories is, firstly, that the meaning of the verb in these constructions is often quite remote from its postural or locational meaning. Secondly, French-speaking learners of Dutch, when using such constructions, arguably do not really intend to use a posture verb, but rather directly translate a French construction (Il est évident que . . . , 'It is clear that') into a Dutch counterpart that simply happens to be built with a posture verb (Het staat vast dat . . . 'it stands fixed (= is) clear that'). In other words, the use by the learners of staan in vaststaan does not say anything about their ability to use the posture verb properly, but would rather be part of a unit-based learning strategy. At a more refined level, additional codes were used to further specify the use of the posture verb within the larger categories described above. This is particularly relevant to the locational and metaphorical uses of the verbs.
Such further specification allows us to determine what type of location the posture verb refers to (e.g., geotopographical location, documents on a desk, etc. for liggen or containment for zitten) or what type of metaphorical use was at issue (e.g., abstract entities or scales for liggen; canonical position or written text for staan; containment, or stuckness for zitten). Some of these labels probably deserve some further comments. For instance, the label containment for zitten has been applied to locational as well as metaphorical uses. The difference between these two lies
12. In order to avoid a subjective labeling, we have used as a reference the Van Dale Groot Woordenboek Nederlands-Frans bilingual dictionary to establish whether a given structure should be considered as an idiomatic expression or not. While obviously the dictionary cannot be taken as a flawless norm, we believe that the choice is further justified by the observation that this is the resource that French L2 learners are most likely to turn to. Only for a handful of cases, where it clearly concerned an omission in the dictionary, was the label a decision of the authors.
in the nature of the container, which is concrete in the former, as in example (5), but abstract in the latter, as in example (6).13

(5) locational:containment
    Ik zit namelijk op kot en heb geen kabelaansluiting in huis. (DL2-Z0042)
    'I sit (= live) in a student room and don't have any cable connection there.'
(6) metaphor:containment
    Na het treinongeval en de reis beseft hij waar hij in zijn leven zit. (DL2-Z-0053)
    'after the train accident and the journey, he realises where he sits (= is, stands) in his life'
The further specifications for staan also deserve some comments. In the description of the locational use of staan, we specified whether this location was canonical, referring to contexts where the location is concurrently related to the default position of a human being or of an object (resting on its base) as in example (7).

(7) Dus de producten zullen op een logische plek in de winkel staan en het zal gemakkelijker voor de klant zijn. (DL2-S-0040)
    'thus the products will stand in a logical location in the store and it will be easier for the customer.'
The label canonical has also been used to qualify some of the metaphorical uses of the verb. In these cases, it refers to metaphorical extensions of staan that can still be related to its postural or locational meaning (see example (3b) above). The other metaphorical uses concern cases where staan could not directly be linked with its postural or locational meaning (see example (3c) above). The coding scheme as described here has been applied to all of the sentences in the DL1-corpus and all correct uses in the DL2-corpus. The cases where the L2-users did not use the posture verb correctly were set apart at the highest level via the label error, so as to ensure that statistics on the distribution of usage only applied to correct cases. On a more refined level, we then coded a specification of the different types of errors made by the French-speaking learners of Dutch. These different types of
13. The examples from the DL2 corpus are reproduced in their original form; it may thus be that there are other language errors than the ones that we are interested in (such as the infelicitous use of in huis in this example). These errors will be ignored here.
errors will be discussed extensively in the subsequent sections. Before we turn to them, some important methodological observations have to be made concerning the corpus analysis as described here. Firstly, it should be clear that the coding scheme described above has its methodological limitations. The codes are set up as heuristic tools to allow qualitative and quantitative analysis, but the resulting categorization should not be taken as reflecting a final and fixed map of the meanings of these verbs. For one thing, it is not always easy to distinguish between certain categories, for example, with contexts where both a postural and locational reading could be entertained. The absence of clear-cut distinctions is also nicely illustrated by the degrees of metaphorisation illustrated above. Secondly, while we strongly prefer corpus-based analysis over intuition-based analysis, the latter cannot be excluded when evaluating L2 productions. To assess our data as objectively as possible, all fragments were first analyzed by both authors separately; the subsequent comparison showed a high degree of agreement. Problematic cases were further discussed and submitted to the judgment of at least two other native speakers of Belgian Dutch.
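The two-level coding scheme and the three-part example identifiers described in this section can be sketched in code. The following Python fragment is only an illustration under stated assumptions: the names TopLevel, ExampleId, Annotation and parse_example_id are hypothetical and are not the authors' actual tooling. The label strings follow the form used in examples (5) and (6), and the optional hyphen before the number is allowed because both DL2-Z0042 and DL2-Z-0053 occur in the examples.

```python
import re
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class TopLevel(Enum):
    """Highest-level categories named in the text (identifiers are hypothetical)."""
    POSTURAL = "postural"
    LOCATIONAL = "locational"
    METAPHORICAL = "metaphor"
    PARTICLE_VERB = "particle verb"
    IDIOM = "idiom"
    ERROR = "error"


# Letter codes for the three verbs, as given in the footnote on corpus references.
VERB_BY_LETTER = {"S": "staan", "Z": "zitten", "L": "liggen"}

# Corpus (DL1 or DL2), verb letter, optional hyphen, sentence number.
_ID_PATTERN = re.compile(r"^(DL[12])-([SZL])-?(\d+)$")


@dataclass(frozen=True)
class ExampleId:
    corpus: str   # "DL1" (control) or "DL2" (learner)
    verb: str     # "staan", "zitten" or "liggen"
    number: int   # sentence number within the corpus


def parse_example_id(raw: str) -> ExampleId:
    """Split a reference such as 'DL2-Z0042' into its three parts."""
    match = _ID_PATTERN.match(raw.strip())
    if match is None:
        raise ValueError(f"not a recognised example ID: {raw!r}")
    corpus, letter, digits = match.groups()
    return ExampleId(corpus, VERB_BY_LETTER[letter], int(digits))


@dataclass(frozen=True)
class Annotation:
    example: ExampleId
    top_level: TopLevel
    subcode: Optional[str] = None  # refined level, e.g. "containment"

    def label(self) -> str:
        """Render the code as 'top:sub', or just 'top' when no subcode is set."""
        if self.subcode is None:
            return self.top_level.value
        return f"{self.top_level.value}:{self.subcode}"
```

With these definitions, example (5) would be recorded as Annotation(parse_example_id("DL2-Z0042"), TopLevel.LOCATIONAL, "containment"), whose label() is "locational:containment". Keeping the error label at the top level mirrors the decision to set incorrect uses apart before computing distributions over correct uses.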
2. Quantitative analysis of L2 data

2.1. Overall frequencies
In total, 557 sentences have been extracted in which one of the three posture verbs occurred (407 fragments from the learner corpus and 150 from the control corpus). A first general observation is that staan is the most frequently used posture verb in both corpora. In the control corpus, it is followed by liggen and zitten respectively, whereas in the learner data, zitten is slightly more frequent than liggen (cf. Table 1)14. Further comparison between the L1 and L2 data shows that, in line with the typological differences between French and Dutch, the learners globally tend to underuse the posture verbs in their L2 productions (62.85 vs. 143.95
14. The DL1 frequency distribution (staan > liggen > zitten) does not exactly replicate the one given in Lemmens (2002) (staan > zitten > liggen), in which zitten even appeared to be almost as frequent as staan. This can probably be explained by the different types of texts analysed in the latter study, i.e., newspaper articles versus argumentative essays in the present study.
Table 1. Distribution of liggen, staan and zitten in the learner and control corpora

Verbs     Control (52,056 words)       Learner (323,921 words)
          Occ.     Freq./50,000        Occ.     Freq./50,000
liggen    55       52.8                88       13.6
staan     73       70.05               209      32.25
zitten    22       21.1                110      17
Total     150      143.95              407      62.85
Figure 1. Distribution of the posture verbs across their categories of use
occurrences per 50,000 words).15 This is more pronounced for liggen and staan than for zitten.16 A more detailed analysis of how the posture verbs are used by the L1 speakers and the learners respectively points to some interesting tendencies, as illustrated by Fig. 1. A first observation is that the learners appear
15. Given the unequal size of the learner and control corpora, the frequencies in Table 1 have been normalized to 50,000 words, the greatest common decimal factor; such a procedure is not uncommon in corpus linguistic studies for frequencies in large data sets (see, e.g., Newman and Rice 2004). In the smaller data sets (e.g., Tables 2 and 3) standard percentages are used.
16. The tendency of Francophone L2 speakers of Dutch to underuse posture verbs and overuse location verbs (the latter issue is not considered here) is confirmed by a pilot study (Lemmens 2001) comparing picture descriptions: native speakers of Belgian Dutch used posture verbs in 58% of the locational phrases, whereas the French learners only used them in 19% of the cases and resorted to a location verb in 63% of the cases (p < 0.0005).
to use posture verbs inappropriately in about 11% of the cases; these will be discussed more extensively in sections 2.2 and 3. While we have labelled them as errors, which from an L1 perspective they are, they may be quite motivated (and thus, in a sense, correct) within the learner language (see Section 3). A second observation which can be derived from Fig. 1 is that the learners tend to use the posture verbs more frequently in postural and locational contexts (respectively 15% and 17%) than the L1 speakers (respectively 4.7% of postural contexts and 10% of locational).17 Narrowing down these results to the three verbs individually (see Table 2), we see that the tendency towards a greater postural use by the learners especially holds for zitten (42.7% of the cases vs. 13.6% in the control corpus) and to a lesser extent for staan (7.2% of the cases vs. 4.1% in the control corpus).18 These cases refer to sentences where the sitting or standing position is prominent. As far as the locational contexts in the learner productions are concerned, they appear to be the most frequent for liggen (38.6% of the cases vs. 7.3% in the L1 corpus) and zitten (23.6% of the cases vs. 9.1% in the control corpus). The latter observation does not apply to staan, however, which is more frequently used in locational contexts by the native speakers (13.7% of the cases vs. 6.2% in the learner corpus). Considering the 34 locational uses encoded by liggen in the learner productions, one observes that in a large majority of the cases (73.5%), they refer to sentences expressing a geotopographical location. Other examples of this locational use of liggen concern papers (14.7% of the cases) or books (8.8% of the cases) lying on a desk. Locational zitten in the L2 data almost exclusively concerns sentences clearly expressing the notion of containment, as in example (8), or borderline cases taking an intermediate position between posture and location, as in (9).
(8) Aan het begin, zit [ . . . ] de hoofdfiguur alleen thuis als de telefoon rinkelt. (DL2-Z-0017)
    'at the beginning, [ . . . ] the main character is sitting home alone as the phone rings'
(9) Maar roken is niet alleen slecht de mens die rookt, maar ook voor de mensen die erbij zitten. (DL2-Z-0082)
    'but smoking is not only bad for the smoker, but also for the people sitting with him'

Table 2. Distribution of staan, liggen and zitten across the L1 and L2 corpora

                  Control             Learners
                  Occ.     %          Occ.     %
Postural          7        4.7        63       15.5
  liggen          1        1.8        1        1.1
  staan           3        4.1        15       7.2
  zitten          3        13.6       47       42.7
Locational        16       10.7       73       17.9
  liggen          4        7.3        34       38.6
  staan           10       13.7       13       6.2
  zitten          2        9.1        26       23.6
Metaphorical      69       46         123      30
  liggen          30       54.5       24       27.3
  staan           26       35.6       75       35.9
  zitten          13       59.1       24       21.8

17. One may wonder (as did one of the reviewers) whether the differences in frequency may not be due to the different topics of argumentative essays in the two corpora. As will be recalled, the learner corpus does indeed contain a larger variety of topics and text types (argumentative essays, summaries, emails and business letters), but none of the topics in either corpus are such that they are biased towards postural or locational uses. The variety in topics does not, in other words, invalidate the major claims made in this paper, such as the overall underuse of the posture verbs by L2-speakers as well as their overusing these verbs in some contexts. Furthermore, the different text types do not affect the qualitative analysis presented in Section 3.
18. The percentages in Table 2 represent the uses per verb; for example, of the 55 attestations for liggen in the control corpus (see Table 1), only 1 is postural (1.8%).
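The figures in Tables 1 and 2 rest on two simple computations: normalisation to 50,000 words (footnote 15) and per-verb percentages (footnote 18). The following is a minimal sketch in Python that assumes only the counts printed in the tables; the helper names `per_50k` and `share` are illustrative and not part of the study.

```python
# Corpus sizes and raw counts as printed in Table 1.
CORPUS_WORDS = {"control": 52_056, "learner": 323_921}
OCCURRENCES = {
    "control": {"liggen": 55, "staan": 73, "zitten": 22},
    "learner": {"liggen": 88, "staan": 209, "zitten": 110},
}
# Postural counts per verb as printed in Table 2.
POSTURAL = {
    "control": {"liggen": 1, "staan": 3, "zitten": 3},
    "learner": {"liggen": 1, "staan": 15, "zitten": 47},
}

def per_50k(count, corpus_words, base=50_000):
    """Frequency normalised to `base` words (hypothetical helper)."""
    return count / corpus_words * base

def share(count, total):
    """Percentage of `count` in `total`, rounded to one decimal."""
    return round(100 * count / total, 1)

for corpus in ("control", "learner"):
    for verb, occ in OCCURRENCES[corpus].items():
        freq = per_50k(occ, CORPUS_WORDS[corpus])
        pct = share(POSTURAL[corpus][verb], occ)
        print(f"{corpus:8s} {verb:7s} {freq:7.2f} per 50,000   postural {pct}%")
```

Recomputed frequencies can differ from the printed values in the second decimal; the per-verb percentages take the verb's own total as denominator, as footnote 18 describes.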
In sum, the learners tend to use posture verbs, especially liggen and zitten, to a greater extent than the native speakers in more prototypical contexts denoting posture or location. Conversely, the native speakers use the posture verbs more frequently in metaphorical contexts than the learners (46% of the cases vs. 30% in the learner productions). This, again, is more striking with liggen and zitten, which show a far greater proportion of metaphorical uses in the productions of the native speakers than in the learners' essays (54.5% vs. 27.3% for liggen; 59.1% vs. 21.8% for zitten). This tendency does not hold for staan, whose metaphorical usage is comparable in both corpora (35.9% of the cases in the learner corpus vs. 35.6% in the control corpus). When liggen is used metaphorically by the native speakers, it almost exclusively appears in contexts where it co-occurs with an abstract entity, such as problems, solutions, causes, etc. (93.3% of the cases). The remaining metaphorical uses of liggen include contexts where it refers to a conceptualized scalar entity (6.7% of the cases). Although metaphorical liggen is less frequent in the L2 data, its use by the learners seems to be similar to the native usage, appearing in the first place in combination with abstract entities (75% of the cases), and in the second place in contexts where it refers to a scale (25% of the cases). Of the 13 occurrences of metaphorical zitten in the productions of the L1-speakers, 12 refer to some abstract notion of containment (92.3%), whereas in the remaining example zitten is used as part of a progressive construction. Quite surprisingly, despite its lower frequency in the L2 productions, metaphorical zitten is used in a greater variety of ways by the learners. In addition to a majority of containment uses (66.6% of the cases) and two progressive uses (8.3% of the cases), zitten also appears in contexts where it encodes the notion of possession (8.3% of the cases, see (10)) and in contexts where it refers to the idea of being stuck (12.5% of the cases, see (11)). While this discrepancy between the learner and native usage of metaphorical zitten could probably be explained in terms of the relative size and scope of the control corpus, the observed variety of uses of metaphorical zitten among the learners tends to suggest that some learners correctly manage such very specific metaphorical uses of zitten.
(10) [ . . . ] ik zit met een klein probleempje (DL2-Z-0043)
     '[ . . . ] I sit with (= have) a small problem.'
(11) De kans bestaat dat de ambitie van de werknemer te hoog of te laag zit. (DL2-Z-0030)
     'the chance exists that the ambition of the employee sits (= is) too high or too low.'
Finally, even though metaphorical staan is as frequent in L2 as in L1, the native speakers and the learners use it in quite different ways. First of all, the native speakers use it more frequently in canonical position contexts, i.e., where the metaphorical use of staan can easily be linked up with its postural or locational meaning. This use of metaphorical staan accounts for 53.8% of the cases in the L1 corpus. A second context in which metaphorical staan is commonly used in the L1 corpus (34.6%) is related to the notion of written text as standing entity. Third, 11.5% of the cases in the control corpus concern metaphorical extensions without a clear link to postural or locational uses. The distribution of metaphorical uses of staan in the L2-corpus is slightly different: the notion of written text as standing entity accounts for 73% of the cases, followed by the uses referring to canonical position (21.3%) and other metaphorical ones (4%). The differences internal to the group of metaphorical extensions suggest that L2 speakers do have control of the standing text pattern (a point to which we shall return in 3.2), but their overall semantic map of
the metaphorical extensions clearly differs from that of the L1 speakers, as could be expected. The L2 speakers probably have not yet mastered the semantic motivations linking up the different uses, as shown by the low frequency of canonical and other metaphorical uses (i.e., other than the standing text pattern). To conclude the discussion of the overall distributional tendencies, some observations can be made concerning the use of the posture verbs as part of a particle verb and as part of an idiomatic expression. While the former cases are rather limited (4.7% in L1 and 3.4% in L2), all but one example being constructed with staan (vaststaan 'it is clear that', openstaan 'be open to', toestaan 'allow to', etc., but klaarliggen 'be ready'), the group of idiomatic expressions deserves a short discussion. Posture verbs occur as part of an idiomatic expression more frequently in the L1 than in the L2 productions (33.3% of the cases vs. 21.6%). This holds for staan (37% vs. 29.7%), liggen (34.5% vs. 22.7%) and zitten (18.2% vs. 5.5%), even though in both corpora such uses appear more regularly with staan and liggen. Frequent expressions with liggen in the L1 data are voor de hand liggen 'lie near the hand (= be evident)' (31.6%), aan de basis liggen 'lie at the basis of' (21%) and in iemands handen liggen 'lie in s.o.'s hands' (10.5%). In the learner data, the most frequent examples are voor de hand liggen (40% of the cases), ergens aan ten grondslag liggen 'lie at the basis of' (25%) and iemand na aan het hart liggen 'lie near to s.o.'s heart (= be very dear to someone)' (15%). As far as zitten is concerned, its idiomatic uses are quite limited. Both the native speakers and the learners use it in hoe zit het met . . . ? 'how sits it with (= what about . . . ?)' (50% of the cases in both L1 and L2) and iets zien zitten 'regard s.th. feasible' (50% in the L1 corpus vs. 16.6% in the L2 corpus). In addition, the learners also use zitten in iemand in het haar zitten 'annoy someone' (lit. 'sit s.o. in their hair') (33.3% of the cases). Finally, idiomatic uses of staan, in opposition to zitten and liggen, show a great diversity of examples among the native speakers (27 occurrences distributed across 17 different expressions; 0.63 Type/Token ratio), as well as among the learners' productions (62 occurrences distributed across 24 different expressions; 0.39 Type/Token ratio). The occurrences of these expressions seem to be quite equally distributed in the L1 corpus, more frequent examples including aan het hoofd staan van 'stand at the head of (= be in charge of)' (11%), centraal staan 'stand (= be) central' (11%) and op eigen benen staan 'stand on your own legs (= be independent)' (11%). On the other hand, in the L2 data, staan is extensively used as part of the expression centraal staan (29%), followed by ter beschikking staan 'stand (= be) at one's disposal' (9.7%) and in contact staan 'stand (= be) in contact with' (8%).
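The Type/Token ratios quoted for idiomatic staan follow directly from the reported counts of different expressions (types) and occurrences (tokens). A short sketch, with hypothetical function names and a toy list that is not corpus data:

```python
def type_token_ratio(n_types, n_tokens):
    """Number of distinct expressions divided by number of occurrences."""
    return n_types / n_tokens

def ttr_from_tokens(tokens):
    """Same ratio, computed from a list of attested expressions."""
    return len(set(tokens)) / len(tokens)

# Counts reported in the text for idiomatic staan.
print(round(type_token_ratio(17, 27), 2))  # L1 corpus: 0.63
print(round(type_token_ratio(24, 62), 2))  # L2 corpus: 0.39

# Toy list (hypothetical, for illustration only): 3 types over 4 tokens.
toy = ["centraal staan", "centraal staan",
       "ter beschikking staan", "in contact staan"]
print(ttr_from_tokens(toy))  # 0.75
```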
In sum, even though these expressions are somewhat less frequent in the L2 productions, overall these uses are correct, supporting the idea that these are learnt as fixed units.
2.2. Quantitative error analysis
The second step in the quantitative analysis focuses on the different types of errors made by the French-speaking learners when they use staan, liggen and zitten; in Section 3 we will then consider some of these errors from a more qualitative perspective. As shown in Fig. 1 above, the learners use the posture verbs incorrectly in approximately 11% of the cases (46 sentences in total). All in all, this is a relatively good result, but this may be attributed to the fact that the corpus consists of written data only, where learners have more time for reflection. It is expected that the error rate in spontaneous speech will be much higher. The highest proportion of errors occurs with staan (65.2%), followed by liggen (19.6%) and zitten (15.2%). The different types of errors which have been identified are summarized in Table 3. Recall that these errors all concern cases where a posture verb has been used incorrectly, either (i) because the wrong posture verb was chosen (posture verb confusion) or (ii) because a posture verb was not possible in the given context (posture verb panic). These are the two main categories in Table 3; they will be discussed in more detail below. The third group concerns a collection of miscellaneous cases (i) where it was not at all clear what the speaker was trying to say, (ii) where a posture verb was used instead of a phrasal verb (e.g., toestaan 'allow'),19 or (iii) where a given construction has not been reproduced correctly (hence, constructional contamination). The neat subdivisions in Table 3 in reality reflect a much more complicated interplay of factors, especially for the miscellaneous group. At the same time, the division allows us to identify the two main error patterns discussed below, i.e., posture verb confusion and posture verb panic. 
The subdivisions within these two groups represent an onomasiological perspective, as they identify the context to be encoded, and they do so via the verb that would have been used had the situation been coded correctly. For example, an error labelled staan:metaphor:text refers to a sentence expressing the idea of texts located on paper for which
19. These complex verbs probably contribute to the overall posture verb problem, but in line with the decision taken to treat these as a separate category for the correct sentences (see section 1.4 above) we have put these in a separate group here as well.
Table 3. Types of errors in the learner corpus

Context of error                              Occurrences    %
1. posture verb confusion                     20             43.5%
   liggen-context                             9              19.6%
     liggen:metaphor:abstract entity          1              2.2%
     liggen:locational:paper                  3              6.5%
     liggen:locational:geotopographical       5              10.9%
   staan-context                              6              13.1%
     staan:metaphor:text                      5              10.9%
     staan:locational:canonical               1              2.2%
   zitten-context                             5              10.8%
     zitten:metaphor:containment              2              4.3%
     zitten:locational:containment            2              4.3%
     zitten:progressive                       1              2.2%
2. posture verb panic                         16             34.8%
     existential verb                         10             21.7%
     neutral location                         5              10.9%
     copula                                   1              2.2%
3. miscellaneous
     constructional contamination             3              6.5%
     posture verb instead of particle verb    4              8.7%
     unclear                                  3              6.5%
Total                                         46             100%
staan should have been used but for which the learner chose another posture verb. The following section provides a more detailed analysis of the most common patterns in these two error groups.
3. Qualitative analysis
The quantitative analysis above has revealed a number of tendencies that could be summarized as follows:
i. the posture verbs are largely underused in L2 productions;
ii. the different posture verbs are often confused;
iii. staan is the most frequent verb in the incorrect sentences (30 / 46 or 65.2%);
iv. L2 speakers sometimes use posture verbs where a neutral verb is to be used.
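Observation (iii) and the overall error rate can be checked with a few lines of arithmetic. In the sketch below, the count for staan (30 of 46) is given in the text, while the counts for liggen and zitten are an assumption: they are derived here from the reported 19.6% and 15.2% of 46 errors. The helper name `pct` is hypothetical.

```python
TOTAL_POSTURE_VERB_SENTENCES = 407  # learner corpus, Table 1
TOTAL_ERRORS = 46                   # incorrect sentences in the learner corpus

# 30 is stated in the text; 9 and 7 are derived from 19.6% and 15.2% of 46
# (an assumption made for this illustration).
errors_by_verb = {"staan": 30, "liggen": 9, "zitten": 7}
assert sum(errors_by_verb.values()) == TOTAL_ERRORS

def pct(part, whole):
    """Percentage rounded to one decimal."""
    return round(100 * part / whole, 1)

# Overall error rate among the learners' posture verb sentences.
print(pct(TOTAL_ERRORS, TOTAL_POSTURE_VERB_SENTENCES))  # 11.3

# Share of each verb within the set of errors.
for verb, n in errors_by_verb.items():
    print(verb, pct(n, TOTAL_ERRORS))
```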
Given the typological differences between Dutch and French, observations (i) and (ii) do not really come as a surprise. The other two observations may not have been intuitively obvious, even if in retrospect they, too, are perhaps not so surprising after all. The high frequency of staan in the set of errors (regardless of the subdivision drawn up above) is in line with the verb expressing the canonical position of humans (and entities on their base or in their optimal or functional position). Looking at this from the learners' perspective, this can be phrased as follows: when learners have identified a context as a posture verb context, they will most likely choose staan as the default posture verb (especially when they have no idea which posture verb to use). Notice that this corresponds nicely with frequency of exposure: the fact that staan refers to canonical posture also makes it the most frequent posture verb in Dutch (cf. Table 1 and the results in Lemmens 2002, 2005b). While it is often said in L2-pedagogy that what you put in is not what you get out, the preference for staan as the default verb seems to indicate that L2 speakers do pick up dominant patterns in the target language without being explicitly told. (As a rule, pedagogical grammars do not mention frequency and/or prototypicality.) Finally, there is the somewhat surprising observation that L2 speakers use posture verbs where Dutch does not allow them. This can certainly in part be attributed to what we conveniently call a general posture verb panic, which incites L2 speakers to simply replace any form of locational or existential zijn 'be' with a posture verb (a form of hypercorrection); nevertheless, there is still some semantic logic in their behaviour, not unlike that exploited by native speakers, as will be detailed in section 3.1 below. Sections 3.2 and 3.3 will look at the cases where the wrong posture verb is chosen (group 1 in Table 3 above). 
A number of these errors actually centre around certain well-entrenched substructures (local prototypes); in these cases, the link with the postural prototype may no longer be transparent, but this well-entrenched usage motivates new extensions. There are two such structures that we consider here, viz. text as a standing entity (section 3.2) and geotopographical location (section 3.3).
3.1. Overuse of posture verbs
Given the strong obligation for using a posture verb in Dutch when one wants to express the location of an entity, L2 learners will most likely
realise the importance of using these verbs at a relatively early stage in their learning process. The high number of metaphorical extensions of these verbs (in the L1-control corpus, about 46% of the cases) will undoubtedly add to the initial confusion and may lead to some kind of posture verb panic, inciting learners to use a posture verb in contexts (often metaphorical ones) where no such verb is allowed. Consider the following cases:
(12) a. De vrouw *staat een beetje wanhopig omdat ze wilde dat haar man de tuintrap verft. (DL2-S-0205)20
        'the woman stands a bit desperate because she wanted her husband to paint the gardensteps'
     b. Geachte Vrouw, Hier *zit de resultaten van mijn verslag. (DL2-Z-0059)
        'dear woman (sic), here sits the result of my report'
In both cases, the use of a posture verb is inappropriate; the verb zijn 'be' has to be used. The grammaticalisation of posture verbs has not (yet) gone so far that a pure copular use (X BE ADJ) as in (12a) is generally possible, even if there are cases that come quite close (e.g., het huis staat leeg 'the house stands empty'). Yet even for the latter, a certain locative colour remains, whereas in the example here this is not the case. If a locative complement had been added or even a te V complement (expressing a progressive), staan would actually have been quite possible: zij staat er wanhopig bij 'she stands there desperately PREP' or zij staat wanhopig tegen haar man te roepen 'she stands desperately to her husband to yell (= is yelling at)'. For example (12b) on the other hand, the locative hier 'here' (expressing something like 'enclosed with this letter, herewith') is not locative enough to sanction a posture verb. The use of zitten may have been triggered by the idea of the report being attached to the letter; French joindre 'join' (expressing ATTACHMENT) often takes zitten as the Dutch equivalent. In some cases, the error may be attributed to a confusion of different idiomatic constructions:
(13) b. In de eerste tekst zoekt men als er een verband *staat tussen de witte massa (DL2-S-0094)
        'in the first text they (try to) find whether there stands a connection between the white matter'
20. For ease of identification, the verb errors in the cited learner examples have been marked with a *; as said before, other mistakes that may occur in the sentences have not been corrected nor have they been marked.
The two correct expressions, quite similar to each other, are either X staat in verband met Y (X stands in connection with Y) or Er is een verband tussen X en Y (there is a connection between X and Y); this seems to be a clear case of constructional cross-contamination. However, the latter example (like a handful of others) may also be due to a (phonological) confusion of staan and bestaan (exist). While bestaan is etymologically related to staan, this is no longer obvious (even native speakers are probably not aware of this) and the verb is often interchangeable with existential zijn. However, one of the L2-errors in our data illustrates that the interchangeability does not always hold:
(14) De kranten zijn meestal goed maar ik vind dat er ook een nadeel *staat . . . Er zijn bladen die de waarheid niet precies vertellen (DL2-S-0130)
     'the newspapers are usually good but I think that there stands also a disadvantage . . . There are papers that do not tell the exact truth'
Supposing that the L2-speaker confused staan with bestaan, this would still yield a coding that at best is highly marked, since the verb zijn is the most appropriate alternative. This intuition is confirmed by a Google search on er zijn/bestaan nadelen 'there are/exist disadvantages', yielding 17,500 vs. 3 hits respectively. The reason why bestaan is disfavoured is that it too strongly focuses on the idea of existence, whereas this seems uncalled for in the present context. While the differences between bestaan and zijn appear motivated, a full explanation of these goes beyond the scope of the present paper.21 Let us now look at two important subcases where the wrong posture verb is used, triggered by two well-entrenched uses, staan as used to refer to written text (3.2) and liggen as used to refer to geotopographical location (3.3).
3.2. Text as a standing entity
The L2 speakers seem to be sufficiently familiar with the Dutch convention of using staan to refer to written text, as it is used correctly in 55 occurrences, which amounts to 30.7% of their correct uses of staan. At the
21. It is to be expected that this difference will be at least partially similar to that in English between be and exist. Notice that the latter, too, is derived from a Latin verb that referred to standing (ex- 'out, forth' + sistere 'cause to stand'). Similarly, Spanish has estar 'be', which evolved from Latin stare 'stand', whose uses differ from those of the copula ser 'be'. The evolution of estar is actually in line with the claim that standing is the canonical position for humans.
same time, there are a number of cases where this context leads to mistakes, either because staan is not being used or because staan is used incorrectly (overextension). Let us begin with the first case; here is one of the L2-examples:
(15) Zotte mensen *zit ook tussen aanhalingstekens omdat het een uitdrukking van het meisje is. (DL2-Z-0016)
     'crazy people sits also between quotation marks because it is the expression of (= used by) the girl'
The student's choice for zitten is not without motivation, the word being closely contained by the quotation marks; however, talking about the graphemic representation of language renders a coding with staan absolutely compulsory. At the same time, the strong obligation can lead to (subtle) errors as well, as is the case for the following student, inappropriately overextending that use of staan:
(16) Er ? *staat een bijbedoeling in de zin die op een verschillende manier geïnterpreteerd zal worden (DL2-S-0160)
     'there stands a hidden intention in the sentence that will be interpreted in a different way'
At first glance, nothing seems wrong with this example, since the student is talking about the text that will be interpreted differently. However, upon second thought, the formulation just does not seem fully idiomatic, since a hidden intention is not really orthographically expressed, unlike straightforward meanings, where this metonymy does apply, e.g., Hun ideeën staan in het werkboek 'their ideas stand in the workbook' (Google example).22 A coding with zitten (expressing containment) would thus have been more idiomatic. Despite the fact that basically any text can be said to have a meaning sitting in it, there are certain conventionalised collocations, as becomes apparent from the following L2-error, where the learner is talking about the information in newspapers:
(17) Maar als je [die kranten] eens koopt, ontdekt je dat daar niets in *zit (DL2-Z-0055)
     'but when you buy [these newspapers], you discover that there sits nothing in them'
Apparently, this learner has mastered the usage of zitten to refer to the meaning inside texts, yet a native speaker of Dutch would immediately
22. http://www.zinweb.nl/content/leeszin/boek.asp?oId=359, last accessed Jan. 12, 2009.
replace the verb with staan to render the sentence more idiomatic.23 The reason for this is that in newspaper articles, and generally all other types of non-fictional prose, meanings should be directly derivable from the very words themselves since these texts are not supposed to have hidden meanings. In Dutch postural logic, their meaning thus stands on the paper, black on white. It is obviously not impossible for these texts to have a deeper layer of meaning (implications, humour, sensitivity, etc.), yet such is typically not associated with them.24 Rather, these are the things one finds in fictional prose, poems or song lyrics. Notice, however, that if the meaning is sufficiently evident from the words/text itself, staan remains a preferred coding even in these types of texts. In sum, what the above L2-examples reveal is that the learners are aware of certain common extensions of staan (printed text) and zitten ((close) containment), as further illustrated by the higher frequency of these uses (see section 2.1), which are generally also quite frequent in the L1 data. At the same time, the learners may not have fully mastered some of the collocational subtleties of the target language, which themselves are semantically well-motivated (which, unfortunately, we cannot afford to elaborate on here).
3.3. Geotopographical location
One of the extensions that is common for liggen is to express geotopographical location, which concerns the location of entities that are typically conceived of as locations themselves, such as buildings, cities, villages, etc. This usage is quite frequent in the learner data: there are 29 occurrences in the L2-corpus, which amounts to 33% of the total attestations for liggen (88); only 2 of these are incorrect (cf. below). Conversely, there are 5 contexts of geotopographical location where the L2-learner uses staan instead of liggen, of which 4 are given in example (18) below (2 were by the same speaker, so only one of those is given).
23. Notice that this sentence would be appropriate if the speaker were referring to extra things that one may find in a newspaper, such as loose advertisement brochures, a concert calendar or other loose sections/quires you can take out, free stickers or CDs enclosed with it, etc. Such physical containment is, however, not referred to in this particular context.
24. A simple Google search on in de krant zit yielded only 2 examples (as opposed to 8,160 for in de krant staat) referring to this context (and not to the one mentioned in the previous footnote), where the located entities were niet genoeg diepgang 'not enough depth' and zoveel emotie 'so much emotion'; they essentially confirm the tendencies described here.
(18) a. de landen die vlak bij de zee *staan . . . (DL2-S-0012)
        'the countries that stand close to the sea'
     b. terwijl Gosselies . . . verder van Charleroi *staat (DL2-S-0114)
        'while Gosselies stands further away from Charleroi'25
     c. Daar *staat een beautycenter met sauna, bubbelbad en massages. (DL2-S-0158)
        'there stands a beauty centre with sauna, jacuzzi and massages'
     d. De universiteit *staat in Luik en ik houd veel van deze stad (DL2-S-0200)
        'the university stands in Liège and I love this city very much'
For (18a) and (18b) there is absolutely no discussion: the location of land areas and cities must be coded with liggen. For (18c) and (18d), the situation is more complex, since the two buildings could be conceived of as standing (a vertical entity resting on its base). However, in the context at hand, such a focus on the building is rather infelicitous. Interestingly, one erroneous use of liggen in this context misses precisely on this point:
(19) ? Bauval [beweert] dat de piramiden van Gizeh op een bepaalde wijze *liggen in overeenstemming met het midden van het sterrenbeeld Orion. (DL2-L-0026)
     'Bauval claims that the Gizeh pyramids lie in a certain way in accordance with the middle of the Orion constellation'
Overall, this is a context of geotopographical location, and liggen is possibly acceptable; the reason why it has been marked as an error is that the sentence talks about the pyramids being deliberately positioned in a certain way, which renders their canonical position (resting/put on their base) again salient, and the use of staan would have been more idiomatic. There is one particular case where a learner seems to overextend the geotopographical context to one that is not really one:
(20) Voor een steeds betere dienstverlening zal de geldverdeler buiten het kantoor *liggen (DL2-L-0030)
     'to provide an increasingly better service the cash dispenser will lie outside the bank office'
Usually, cash dispensers are built into the wall and do not occupy a land surface of their own, which makes the use of liggen very marked; rather a coding with a more general location verb such as zich bevinden 'be located' is preferred. One could say that the use of liggen here creates a Google Maps effect which is inappropriate for entities not usually conceived of as locations having (x,y) coordinates. We can say, at least judging from the data used for this study, that L2-learners have mastered this use of liggen fairly well, interferences probably being most common in the context of buildings and the like for which a coding variation liggen/staan is mostly possible, even if the context usually guides the language user to a clear preference.
25. Gosselies and Charleroi are two cities in the French-speaking part of Belgium.
4. Conclusions and prospects
Our pilot study of the use of the Dutch posture verbs staan, liggen and zitten by French-speaking learners has uncovered some interesting tendencies. In our quantitative analysis of the data, we firstly observed that the learners underuse these verbs in their productions, as could be expected given the typological differences between French and Dutch regarding the expression of posture and location. Considering the specific contexts in which the posture verbs have been used correctly by the learners, we have further shown that they use them far more frequently in postural and locational contexts, whereas the native speakers tend to use them more frequently in metaphorical contexts. This observation suggests that the learners are more inclined to use the posture verbs in their basic contexts, being less at ease with their metaphorical extensions. These distributions lend support to the idea that the coding flexibility is a major difficulty for the learners when faced with the wide range of extensions of staan, liggen and zitten. Refining our analysis revealed, however, that the learners appear to master some specific patterns of the metaphorical uses of the posture verbs, such as zitten expressing containment or possession, staan referring to text as standing entity or liggen expressing the location of abstract entities. Our qualitative analysis even pointed out that the learners tend to overextend certain of these metaphorical patterns. Similarly, the qualitative analysis has revealed some other cases of overgeneralisation whereby the learners resorted to a posture verb in contexts in which a more general verb would have been more natural (posture verb panic). This overuse of certain patterns not only shows that the L2 user has probably mastered the logic of certain specific uses, but also that they are exploiting these insights to encode similar situations. 
In doing so, they will inevitably overgeneralise or ignore collocational patterns of the target language that L1 speakers have acquired through massive and repeated exposure to linguistic input. The absence of an L1 acquisition
control corpus calls for extreme caution, yet on the basis of the L2 patterns discussed above (and ignoring issues of cognitive development and maturity), we are inclined to conclude that, in general terms, L2 acquisition strategies may exploit the same principles as those observed for L1. Clearly, given the high frequency of posture verbs in Dutch and the problems these entail, the language learner may sometimes decide to play safe and blindly apply a posture verb in contexts where their native language might have guided them to using a location verb such as zijn ('be') or zich bevinden ('be found'). However, looking at the errors in question reveals that L2 speakers do follow general overextension strategies that characterise L1 acquisition as well (cf. Brown 1958; Clark 2003: 211–212; Tomasello 2003: 127–8).26 In other words, these errors are not blind but reveal at least a partial insight into the linguistic system. While some hypercorrection cannot be excluded, their insight is clearly in line with the input data, as shown, for example, by the high frequency of staan in the errors or by the recurrent use of some expressions such as centraal staan ('stand central').
Considering the (recurrent) use of certain specific metaphorical patterns of the posture verbs has allowed us to evaluate to what extent the learners master the semantic network of the verb in question. In line with a usage-based approach, our claim, partly supported by the observation that the learners seem to have a good control of expressions with posture verbs, is that the learners, when assimilating a new pattern of use of a posture verb, might rather learn it as a separate unit and miss insights as to how the different nuances of a given posture verb relate to each other, preventing them from integrating the new pattern into the verb's semantic network.
To put it another way, having mastered some specific metaphorical uses of a given posture verb does not mean that the learners master the whole semantic structure of the category. Some of the patterns that the L2 learner has to uncover may be relatively straightforward, such as 'use staan when coding the location of an entity on its base', or 'use liggen for a symmetrical object located in space', or 'use zitten when an entity is closely contained by another'. Other cases, some of which were discussed here (but there are many others), remain problematic, as it may not always be clear from an (encoding) point of view whether a neutral verb is to be used or rather a posture verb and, if so, which one. This is
26. As we have not yet looked at L1 acquisition of posture verbs, we are not claiming that the actual patterns of extension are necessarily the same; the suggestion is that the general extension mechanisms, as revealed through overextension, are exploited by both.
particularly true for certain collocational patterns for which the internal motivation may not be so easy to discover. In line with a usage-based view on language acquisition, it is expected that the L2 language learner may eventually unravel these via the same interplay of factors that the L1 language learner operates with, i.e., frequency of input, implicit and explicit negative input and statistical pre-emption (cf. Tomasello 2003; Goldberg 2006).
Summing up then, our analysis appears to support two important interrelated claims. The first one is that it is incorrect to consider the learner system as simply an imperfect version of the target language; rather, it is a linguistic system in its own right that follows a mixed logic: some of the errors are due to interference from the learners' native language (in our case, the underuse of posture verbs), yet others are due to overextensions of patterns they observe in the target language, as illustrated above. The second observation, which follows logically from the first and which, moreover, has important pedagogical consequences, is that input plainly matters, also for L2 acquisition: L2 speakers do pick up dominant patterns in the target language without being explicitly told (cf. also Rast 2008) and they apply these creatively.
The corpus study reported on here is obviously but a first (yet necessary) step towards unravelling the processes at work in L2 acquisition of Dutch posture verbs. Despite its limitations, mainly related to corpus size and the type of texts, our study has allowed us to discover some general patterns in the errors produced by the learners, which might have been more difficult to observe in a controlled experimental setting. This particularly concerns the metaphorical uses of the posture verbs. Further research is obviously warranted and will be pursued along two paths.
Firstly, extending the existing contrastive research for L1 as described in Lemmens (2005a), we will carry out elicitation experiments where francophone L2 speakers describe the location of entities as given by a controlled set of illustrations and compare these (semi-spontaneous) narrations to those produced by native speakers. Secondly, we will do follow-up experiments probing into intuitions of L1 and L2 speakers concerning some of the onomasiological variations described above. It is to be expected that this research will confirm the tendencies outlined here and provide further insight into the L2 language system. Finally, in order to fully evaluate the suggestion that the L1 and L2 acquisition strategies for posture verbs are comparable, a more systematic analysis of the acquisition of these verbs in L1 is warranted.
Received 1 March 2009
Revision received 25 November 2009
Université Lille 3 & CNRS
Facultés Universitaires Saint-Louis
References
Ameka, Felix K. and Stephen C. Levinson. 2007. The typology and semantics of locative predicates: posturals, positionals, and other beasts. Linguistics 45. 847–871.
Brown, Roger. 1958. How shall a thing be called? Psychological Review 65. 14–21.
Clark, Eve V. 2003. First language acquisition. Cambridge: Cambridge University Press.
Fagan, Sarah. 1991. The semantics of the positional predicates liegen/legen, sitzen/setzen, and stehen/stellen. Die Unterrichtspraxis 24. 136–145.
Goldberg, Adele E. 2006. Constructions at work: The nature of generalization in language. Oxford: Oxford University Press.
Guirardello-Damian, Raquel. 2002. The syntax and semantics of posture forms in Trumai. In John Newman (ed.), The linguistics of sitting, standing and lying (Typological Studies in Language 51), 141–177. Amsterdam and Philadelphia: John Benjamins.
Gullberg, Marianne. To appear. Language-specific encoding of placement events in gestures. In Eric Pederson and Jürgen Bohnemeyer (eds.), Event representations in language and cognition. Cambridge: Cambridge University Press.
Gullberg, Marianne and Bhuvana Narasimhan. This volume. Gestures and the development of semantic distinctions in Dutch.
Hiligsmann, Philippe. 1997. Linguïstische aspecten en pedagogische implicaties van de tussentaal van Franstalige M.O.-leerders van het Nederlands [Linguistic aspects and pedagogical implications of the interlanguage of French M.O. learners of Dutch]. Genève and Liège: Droz, Bibliothèque de la Faculté de Philosophie et Lettres de l'Université de Liège.
Klein, Wolfgang. 2008. From the learner's point of view. Paper presented at the International Conference on Language acquisition: Comparative perspectives. Paris, 5–6 December.
Klein, Wolfgang and Clive Perdue (eds.). 1993. Adult language acquisition: Cross-linguistic perspectives. Cambridge: Cambridge University Press.
Kutscher, Silvia and Eva Schultze-Berndt. 2007. Why a folder lies in the basket although it is not lying: the semantics and use of German positional verbs with inanimate Figures. Linguistics 45. 983–1028.
Lemmens, Maarten. 2001. LOCATION versus POSITION: Coding strategies for referent location. Paper presented at the ICLC7, University of California, Santa Barbara, California, 22–27 July.
Lemmens, Maarten. 2002. The semantic network of Dutch posture verbs. In John Newman (ed.), The linguistics of sitting, standing and lying (Typological Studies in Language 51), 103–139. Amsterdam and Philadelphia: John Benjamins.
Lemmens, Maarten. 2005a. Motion and location: toward a cognitive typology. In Geneviève Girard (ed.), Parcours linguistiques. Domaine anglais (CIEREC Travaux 122), 223–244. St. Etienne: Publications de l'Université St Etienne.
Lemmens, Maarten. 2005b. Aspectual posture verb constructions in Dutch. Journal of Germanic Linguistics 17. 183–217.
Lemmens, Maarten. 2006. Caused posture: experiential patterns emerging from corpus research. In Anatol Stefanowitsch and Stephan Gries (eds.), Corpora in cognitive linguistics: Corpus-based approaches to syntax and lexis, 261–296. Berlin: Mouton de Gruyter.
Lemmens, Maarten and Dan I. Slobin. 2008. Positie- en bewegingswerkwoorden in het Nederlands, het Engels en het Frans. In Philippe Hiligsmann, Mélanie Baelen, Anne-Lore Leloup and Laurent Rasier (eds.), Verslagen en mededelingen van de Koninklijke Academie voor Nederlandse Taal- en Letterkunde 118, 17–32.
Newman, John. 2002. A cross-linguistic overview of the posture verbs 'sit', 'stand', and 'lie'. In John Newman (ed.), The linguistics of sitting, standing and lying (Typological Studies in Language 51), 1–24. Amsterdam and Philadelphia: John Benjamins.
Newman, John and Sally Rice. 2004. Patterns of usage for English SIT, STAND, and LIE: A cognitively inspired exploration in corpus linguistics. Cognitive Linguistics 15. 351–396.
Perrez, Julien and Liesbeth Degand. In preparation. Het Leerdercorpus Nederlands (Cahiers du Cental X). Presses universitaires de Louvain.
Rast, Rebekah. 2008. Foreign language input: Initial processing. Clevedon: Multilingual Matters.
Serra Borneto, Carlo. 1996. Liegen and stehen in German: A study in horizontality and verticality. In Eugene Casad (ed.), Cognitive linguistics in the Redwoods, 458–505. Berlin: Mouton de Gruyter.
Talmy, Leonard. 2000. Towards a Cognitive Semantics (Vol. 1 and 2). Cambridge, MA: MIT Press.
Tomasello, Michael. 2003. Constructing a language: A usage-based theory of language acquisition. Cambridge, MA: Harvard University Press.
Van Oosten, Jeanne. 1984. Sitting, Standing and Lying in Dutch: A Cognitive Approach to the Distribution of the Verbs Zitten, Staan, and Liggen. In Jeanne van Oosten & Johan Snapper (eds.), Dutch linguistics at Berkeley, 137–160. UCB.
Vuillermet, Marine. 2008. Ese Ejja posture verbs do not just sit there: An inquiry into other ways they stand out. Paper presented at the Workshop for American Indigenous Languages, University of California, Santa Barbara, 23–24 May.
Book reviews
Steen, Gerard J., Finding Metaphor in Grammar and Usage. A Methodological Analysis of Theory and Research. Series Converging Evidence in Communication and Language Research 10. Amsterdam/Philadelphia: John Benjamins, 2007, 430 pp. ISBN 978 90 272 3897 9. Hardbound EUR 110.00 / USD 165.00
Reviewed by Alice Deignan, University of Leeds, UK. Email: <a.h.deignan@education.leeds.ac.uk>
Steen's detailed discussion of research into metaphor in language and thought is extremely timely in its tackling of methodological issues that have often been neglected in the excitement of rapid theoretical developments in the field. Encompassing much of the vast literature on conceptual and linguistic metaphor produced since Lakoff and Johnson's groundbreaking Metaphors We Live By (1980), the author describes recent theoretical developments and research projects in depth, and relates them to each other through a framework carefully developed in the first chapter of the book. The length of the book allows space for complex issues to be explored, and for detailed discussion of metaphor research; critical descriptions of single projects often run to a number of pages. The insightful and detailed commentary on a huge range of projects is one of the strengths of the book.
The book is divided into three parts, the first of which, 'Mapping the Field', sets out theoretical and methodological foundations, and includes a useful review of the currently dominant models of metaphor in language and thought. The second part, 'Finding metaphor in grammar', takes the reader in detail through the methodological processes involved in identifying metaphor in the language system, beginning with identifying linguistic candidates for metaphor and moving through the theoretical questions that need to be tackled. These concern domain identification, establishing mappings and considering the nature of processing performed by
Cognitive Linguistics 21–2 (2010), 349–370. DOI 10.1515/COGL.2010.013. 0936–5907/10/0021–0349 © Walter de Gruyter
language users. The third part, 'Finding metaphor in usage', describes the steps involved in identifying metaphor and related tropes in language in use. Each chapter in the second and third parts follows the same structure, which has been established in Part 1: firstly, descriptions of research exemplifying the issues at stake are presented, chosen from both synchronic and diachronic perspectives. Secondly, in a section entitled 'Conceptualisation', the theoretical problems are outlined, again with reference to a selection of studies, and then, in a section named 'Operationalisation', the methodological challenges of investigating these are explored. In what are usually the longest sections of each chapter, the author then shows how these issues can be tackled, through the three major methodological approaches reviewed in Part 1: introspection, observation and manipulation.
Chapter 1 sets up three two-way distinctions in the way that metaphor can be considered: as grammar (language as system) or as usage (language in use, performance); as existing in language or in thought; and as symbol or as behaviour. 'Grammar', as used here and in the book's title, may be an initially misleading term for some linguists: here it includes lexicogrammar, and all aspects of language that are part of the socioculturally conventionalized and cognitively entrenched part of the many concrete events of usage that occur in reality (p. 5). Grammar is, as Steen points out, derived and abstracted from usage. In his discussion of the distinction between language and thought, Steen cites cross-linguistic metaphor studies to show how metaphor can vary between languages at the level of thought and at the level of language. He points out that in some studies it is difficult to tell whether claims are about language or about thought. The third distinction raises the problem of whether language is fundamentally symbol or behaviour.
Steen argues that this distinction has been conflated by cognitive linguists, who tend to investigate language as if it were a symbolic system, but then treat their findings as having reality for human processing, or behaviour. He argues that the two aspects need to be disentangled. Metaphor can be investigated through each of these three prisms, which intersect to lead to eight different kinds of question. Steen argues that it is important for researchers to clarify which kind of question they are tackling. He discusses the problems of using evidence from one area of research to investigate hypotheses in another (for example, using evidence from language to investigate thought), and shows how this can lead to circularity. This chapter clears a way through a number of issues that have often been muddled and is in itself a very useful contribution to the field.
Chapter 2 begins with a description of the well-known scene in the film Mary Poppins, in which the characters become helpless with laughter
and gradually float upwards. Analysts see conceptual metaphors everywhere, and would tend to see this as a non-linguistic realisation of HAPPY IS UP; Steen asks whether they are right to do so. Caution about evidence is advised: evidence that is merely compatible with a hypothesis such as conceptual metaphor theory is not sufficient to demonstrate that the hypothesis is true. An important problem for the theory has been its proponents' lack of consideration of alternative explanations for the linguistic evidence. A number of developments in metaphor research are discussed as examples of deductive and inductive approaches, and as ways in which theory can be tested. The important message of the chapter is that researchers have often given evidence that is consistent with conceptual metaphor theory as evidence for it; this is not enough, and converging evidence is needed.
Chapter 3 tackles theoretical definitions of metaphor. Four approaches to metaphor are detailed: metaphor as cross-domain mapping (the conceptual metaphor approach of Lakoff and Johnson); metaphor as involving many spaces (Blending); Glucksberg's view of metaphor as class inclusion; and the 'career of metaphor' approach of Gentner and her co-researchers. These are very usefully overviewed and compared. Metonymy is also defined, using the notions of closeness and intimacy. Steen argues that any pair of domains can be both contiguous (metonymy) and similar (metaphor), and that degrees of each are therefore independent scales, rather than opposite ends of the same scale. In some cases, the relationship between domains will appear to be more strongly one of contiguity, and the resulting trope will be primarily metonymic. In other cases, the domains are similar, resulting in a trope that is primarily metaphorical. This argument allows for the same linguistic expression to be used more metaphorically at some times and more metonymically at others.
Detailed examples from language in use would have helped to support this interesting argument. Steen notes that Lakoff and Johnson attacked the notion of metaphor as similarity; he examines their objections and modifies a definition of similarity in this light. However, as he notes, establishing a linguistic or conceptual definition of metaphor in terms of similarity and comparison does not imply that language users process metaphors in this way. Steen's rigorous insistence on the separation between process and product is essential here. The chapter then outlines some problems that need to be tackled: the demarcation of source and target domains, whether highly conventionalised expressions can be considered as metaphorical, and whether metaphors have to be processed online.
In Chapter 4, Steen argues the need for tight conceptualisation. He develops criteria for metaphor identification with reference to the four
theories and eight aspects of metaphor originally defined. He looks at how each of the theories would identify metaphor in usage, and goes on to consider the differences in their positions. There is detailed analysis of examples to show how the choice of unit of analysis has an impact on eventual decisions about the source domain and nature of conceptual mapping. The pragglejaz procedure aims to identify metaphor in language, which, Steen argues, is only indirectly connected to metaphor in thought, psychological structures and processes in individuals. The pragglejaz procedure attempts to be atheoretical; Steen claims that although the term 'comparison' is used, the procedure could be employed by researchers who reject the notion of comparison, such as class inclusion theorists, to identify metaphor in text, because it makes no underlying claims about the nature of thought. The question of metaphoricity being perceived differently by different language users is mentioned, but this complex issue is not explored in detail. The dictionary and corpora are discussed as tools for metaphor analysis. Steen describes changes in the literal use of words such as ardent and fervent, though changes in the dictionary descriptions of such words may be due to changes in lexicographical practice over the last forty years, rather than, or as well as, etymological developments. Like much of the book, this chapter brings systematicity to the work on metaphor in different schools, offering some new perspectives and insights such as the proposed relationship between metaphor and metonymy.
Chapter 5 tackles the thorny question of what counts as data in metaphor studies. It begins with David Lodge's amusing send-up of a cognitive scientist trying to analyse his own thoughts in the contemporary novel Thinks, illustrating the problems of capturing and using introspective data. Metaphor studies use three kinds of data: verbal, non-verbal and meta.
Verbal data, or the analysis of text, are the most familiar to linguists. Non-verbal data are typically behavioural data from language users in action (p. 105), often used in psycholinguistic studies, and consist of studies such as the analysis of eye movements, pausing, gesture, and reading times. Meta data differ from verbal and non-verbal data in that they are not evidence of performance by language users, but reflections on performance, and include tests for ambiguity and polysemy. They are typically generated by experts, but in some cases, characterisations produced by non-expert users are helpful as evidence in themselves or are used as a stage in the development of experimental materials. Steen describes the collection of these different kinds of data in a sound, thoughtful discussion, of great use for research students, as well as for experienced researchers. Data are then reclassified into introspective, observed and manipulated, a division which is used throughout the rest of the book. The
advantages and disadvantages of using each of these collection techniques are discussed; these issues are returned to in more depth in later chapters. In this chapter, quantitative and qualitative data are also discussed, as is the use of statistical measures. This is the final chapter of Part 1 of the book, setting up categories and systems for researching metaphor.
Part 2 is concerned with applying these to finding metaphor in grammar. Chapter 6 looks at the first of the eight potential research questions concerning metaphor that were developed in Chapter 1, seeking linguistic metaphors in grammar. It begins with a systematic critique of some of the metaphors proposed by Lakoff and Johnson (1980), reiterating the point that while the linguistic evidence presented is consistent with the proposed conceptual metaphors, this is not sufficient. Steen shows that a more detailed analysis of the linguistic evidence for TIME IS MONEY, using dictionary data, considerably weakens the claims of Lakoff and Johnson. In his consideration of introspective evidence, Steen explores the issues of monosemy, polysemy and ambiguity, describing the traditional tests that have been developed to identify these. He argues that these tests are helpful for all but borderline cases, although it could be counter-argued that the borderline cases are exactly where tests are needed, central cases being easily resolved. The observation data that are described are mainly from corpora; the relatively recent access to very large corpora and fast search tools has given metaphor researchers a hugely powerful resource. Manipulation is less often used as a source of data for identifying linguistic metaphor in grammar; Steen explains why this is and reviews some of the few studies that attempt it.
Chapter 7 explores the very difficult issue of what constitutes a domain; Steen points out that domains need to be separated from each other in a methodologically responsible way (p.
171), as a preliminary to identifying the relationship between two domains, and then deciding whether cross-domain mapping is involved, which is the topic of the following chapter. Traditionally, the central distinction that has been posited between domains is concrete-abstract, but Steen shows what will be familiar to metaphor researchers who have tried to operationalise this distinction: although at first sight concrete and abstract would appear radically different, in practice domains seem to be ordered on a continuum from one to the other. Steen also describes the problem of connecting linguistic forms to domains. Further, the analyst needs to decide at what level to specify mappings; Lakoff and Johnson's domains appear to operate at an intermediate level, while Grady (1997) proposed analysing at a basic level. Steen considers in detail how decisions about domains impact on definitions of metaphor, and shows how this works in practice through discussion of dictionary data and problems such as how to deal with simile.
In Chapter 8, Steen follows up his discussion of domain identification with an exploration of the relationships between domains and how these can be classified. Kövecses' (2000) series of studies on emotion is discussed as an example of cross-domain mapping. It is stressed that it is vital to check that words really belong to the domains that are claimed for them, some previous research having been less than rigorous on this point. Steen shows how issues of identifying relationships between domains are dealt with (or, in a number of cases, ignored) in introspective work, in corpus studies, and in work in the psycholinguistic tradition. At this stage he reformulates the difference between metaphor and metonymy, managing to avoid using the notion of domain at all, talking instead about perceived similarities. As elsewhere in the book, it is informative even for researchers already familiar with the work reviewed to see it contrasted and discussed methodologically. Steen perceptively finds common problems across different traditions and shows how different data types and techniques of analysis can complement each other.
In Chapter 9, the focus shifts from language to the cognitive, and Steen considers ways of tackling two more of the questions raised in Chapter 1, concerning storing, acquiring and losing linguistic and conceptual metaphors. Two types of cognitive process are considered: usage and acquisition; that is, local and long term. There is a very interesting discussion of the acquisition of linguistic and conceptual metaphors, citing Johnson (1999), who presents evidence from transcripts of children's talk. It is argued that when children acquire a conceptual metaphor such as UNDERSTANDING IS SEEING, they learn metaphorical and literal meanings of SEEING together; this is not a metaphor for children, and the literal and metaphorical meanings are only separated later. Steen therefore concludes with Johnson that the acquisition process is fundamentally a metonymic one.
Earlier views on the topic were not evidence-based and, though they seemed commonsensical at the time, have been shown by Johnson's evidence to be incorrect. This underlines the importance of using evidence rather than or in addition to theorising. The discussion of the cognitive aspect of metaphor in grammar for adults is mostly centred round Gibbs' work (for example, 2006), partly because he is one of the very few researchers to have tackled this area at all and also because his work is groundbreaking. Steen asks whether introspection can be enough to answer questions about cognition and metaphor acquisition, storage and maintenance, and concludes that it is not possible to reliably introspect about long-term processes. Introspection generates hypotheses but not findings. Observation techniques such as using a corpus are also of limited use, because symbolic patterns that are abstracted from usage do not have a one-to-one relationship with mental representations. Steen argues convincingly that evidence gained by manipulation is needed to test hypotheses generated by introspective and observational data.
Part 3 of the book is concerned with finding metaphor in usage, and is divided into three chapters, dealing with metaphorical language, related linguistic figures such as simile, and cognitive processes. In Chapter 10, on the study of metaphorical language use, Steen argues that the range of metaphors found in usage is far greater than that found in grammar, because of the enriched contextual meaning, and because ellipted forms can also be included. It is pointed out that shared context makes the target domain well known and specific, and contributes a great deal to the meaning of the metaphor. Interesting and useful contrasts are drawn between diachronic studies which focus on the meanings generated by individual users, such as Chilton's (1996) study of cold war discourse spanning 50 years, and synchronic studies of typical user meaning, such as Koller's (2004) study of business discourse. Steen then returns to the pragglejaz procedure, previously mentioned in Chapter 4, for its explicit steps and listing of the various decisions that have to be made. Pragglejaz does not attempt to specify any underlying cross-domain mapping; Steen shows how Barcelona's (2002) two-step procedure for identifying domains and characterising mappings can take up where pragglejaz finishes. Steen discusses the use of corpora to observe metaphor in usage, and contrasts the approaches of several current writers. It is argued that manipulation, through using large numbers of informants to make judgements about carefully constructed cases, can illuminate issues such as degree of metaphoricity and semantic relatedness which are less reliably assessed through introspection and observation.
Chapter 11 deals with other forms of metaphor, mainly simile. Simile does not involve indirect language use but there is nonetheless some form of cross-domain mapping.
There is a long and interesting discussion of Shakespeare's 'Shall I Compare Thee to a Summer's Day', showing how linguistic metaphor is embedded within simile. It is argued that although this text is exceptionally complex, a degree of related complexity is also to be found in ordinary discourse. Chilton's cold war data (1996) also show some very interesting similes and metaphors, which are developed over a stretch of text. Mapping is involved even though not all the resultant language is strictly metaphorical. This kind of cross-domain activity realised through non-metaphorical language is evidence that metaphor is conceptual as well as linguistic. There are also some fascinating worked examples from Tennyson and the Iliad, showing how similes are developed over extended texts and in different genres. Allegory and parable are covered, described as extended texts from the source domain that
are difficult to identify as metaphorical. This raises the issue of implicit versus explicit metaphor use; different analysts take different positions on the issue, but if we only look for metaphorically used language then we lose these and some other cases. Steen advances a variant on the pragglejaz procedure designed to deal with such cases. This chapter might appear less interesting for some researchers, if a decision has been made not to study implicit metaphor, but it contains some useful insights and tackles important aspects of simile.
Chapter 12 turns to cognitive processes and products in usage. To date, most research on metaphor processing has been concerned with linguistic structure, not usage. Steen points out that conceptual metaphor theory does not set up a model of processing, so many psycholinguists are sceptical of it. An important exception is Gibbs, who works from the theory to devise deductive, experimental work. Steen uses Gibbs' definitions of different kinds of understanding to tease out four aspects of processing: recognition, comprehension, interpretation and appreciation. Comprehension is essential; the other three are optional, and their nature depends on the language use and event. This raises the question whether entailments of cross-domain mapping are found in automatic comprehension, or only in conscious understanding, a question especially relevant for education and text design. Giora's work on salience (for example, 2003) leads Steen to argue that metaphorical entailments are not always mapped automatically, online; if the metaphorical sense of a word is the most salient, then no literal sense is activated. Steen also notes that psycholinguists have largely ignored genre, though all language use is genre-regulated (p. 353). The question of when automatic comprehension stops and conscious understanding starts takes a different perspective depending on which of the four language skills we are considering.
In terms of data, Steen argues that it is impossible to get good introspective evidence for cognitive processes in usage, because we cannot introspect about our online production and comprehension, though we can consider the products of comprehension. This leads him to critique the use of introspective evidence in the standard pragmatic model of comprehension, and in conceptual metaphor theory and blending theory, arguing that all three hypotheses are founded purely in the researchers' intuitions. Observation techniques are little better: conversation and discourse analysts can get close to mental processes through the detailed analysis of spoken data, but they cannot give definitive answers. Steen especially cautions against the analysis of published texts as an insight into the composing process, because of the amount of editing and reflection that will have
gone into them, which means they cannot be seen as the product of online production. Manipulation is therefore the only technique that can hope to provide useful and reliable information about metaphor processing in usage.
Chapter 13 concludes the book. It begins with an overview of the attempt to establish methodological guidelines for empirical research in metaphor, drawing together and comparing different approaches and different but related kinds of questions. Steen comments that it would be useful for researchers to differentiate the stages of their research into the recognised categories used in science, for studies to be comparable. One of his conclusions is that the different models of metaphor have a reasonable amount in common, in that they all consider two domains. Steen again criticises many metaphor studies for not defining their categories tightly enough, and reiterates the need for caution and for careful distinctions, for instance between introspection as a source of data and as a tool for analysing data, and text as prompts for eliciting data from informants or as data themselves. In the latter part of the conclusion, Steen returns to the eight questions that were generated in Chapter 1, by the intersection of three distinctions to be made by metaphor researchers: metaphor in grammar versus usage, metaphor in language versus metaphor in thought, and metaphor as symbol or as behaviour. These pages usefully summarise the methodological points developed through the book and some of the principal procedures that have been described and developed. Gaps and possibilities for future research are pinpointed here, making for a thought-provoking research agenda for scholars of real world metaphor use.
Many studies of metaphor have extrapolated from language data to claim to have discovered new conceptual metaphors, with no discussion as to how the linguistic metaphors were identified, or how it was decided that they represent any underlying patterns of thought.
This book is a much-needed examination of all the issues involved. In a sense, it is not a book of answers; it is a book about how to tackle the questions, and it contains many thorough and perceptive critiques of other researchers' attempts to do so. It is a huge undertaking, both reviewing and developing ideas, and a review such as this one can only hope to select a few of the many topics covered, inevitably in a biased and partial manner. For me, the complexity of the prose, both at the level of sentence and of argumentation, meant that the book is not an easy read; perhaps this is inevitable given the range of topics covered and the ambitious depth. However, this is a minor criticism of what is a significant contribution to the field, one that will form core reading for current and future researchers of metaphor.
References
Barcelona, A. (2002) Clarifying and applying the notions of metaphor and metonymy within cognitive linguistics: An update. In R. Dirven and R. Pörings (eds.) Metaphor and Metonymy in Comparison and Contrast, pp. 207-277. Berlin: Mouton de Gruyter.
Chilton, P. (1996) Security Metaphors: Cold War Discourse from Containment to Common House. New York: Peter Lang.
Gibbs, R. W. (2006) Embodiment and Cognitive Science. New York: Cambridge University Press.
Giora, R. (2003) On our Mind: Salience, Context and Figurative Language. New York: Oxford University Press.
Grady, J. (1997) THEORIES ARE BUILDINGS revisited. Cognitive Linguistics, 8, 267-290.
Johnson, C. (1999) Metaphor vs conflation in the acquisition of polysemy: The case of SEE. In M. K. Hiraga, C. Sinha and S. Wilcox (eds.) Cultural, Typological and Psychological Issues in Cognitive Linguistics, pp. 155-169. Amsterdam: John Benjamins.
Koller, V. (2004) Metaphor and Gender in Business Media Discourse: A Critical Cognitive Study. Basingstoke and New York: Palgrave Macmillan.
Kövecses, Z. (2000) Metaphor and Emotion: Language, Culture and Body in Human Feeling. Cambridge: Cambridge University Press.
Lakoff, G. and Johnson, M. (1980) Metaphors We Live By. Chicago: University of Chicago Press.
L. David Ritchie, Context and Connection in Metaphor. How Simple Ideas Shape Human Experience. Houndmills: Palgrave Macmillan, 2006, 248 pp., ISBN-13: 978-1-4039-9766. Hardcover $ 74.95. Reviewed by Ksenya L. Filatova, Ekatherinburg, Russia, Ural State University. E-mail: ksenya.filatova@gmail.com
The book under review merges existing theories of metaphor into a highly consistent whole, drawing on current research in cognitive studies of the brain. Starting with illuminating and constructive criticisms, the author delivers a cogent synthesis of selected concepts from differing theories, to the general benefit of his Context-Limited Simulators (CLS) theory. Compositionally, Context and Connection in Metaphor consists of nine chapters, including an introduction. Its index system (by name, by metaphor, by subject) is comprehensive and reader-friendly.
The first chapter, the introduction, gives a vivid overview of the author's stance: metaphors are claimed to be simultaneously cognitive, in a thoroughly biological sense, and social (4). The introduction sets up the book's polemic tone with a debunking of meta-metaphors in which metaphor theorists are getting trapped (8), for example: mind as a machine,
language as a code. Lakoff and Johnson's definition of metaphor is accepted as a working one and is elaborated upon later.
Chapter 2 deals with attributional and relational models of metaphor, the two cornerstones of classical metaphorology. After careful examination of its subtle variants, Ritchie points out that both property attribution and superordinate category formation imply the existence of common salient features that have already been metaphorically interpreted, and thus, both theories are fundamentally circular (25).
Chapter 3 presents a thorough overview of conceptual metaphor theory, from the original insights (G. Lakoff and M. Johnson) to some of the recent extensions (J. Grady, S. Narayanan) and principal criticisms (J. Vervaeke and J.M. Kennedy) concerning the theory. Ritchie then proposes a detailed analysis of the classic ARGUMENT IS WAR metaphor, and shows that the concept of WAR is far from being directly grounded in physical and social experience, and that people come to understand it in the first place by metaphorical elaboration of their own embodied experiences of interpersonal conflict, athletic competitions, and role-playing (45). Considering one by one the numerous problems raised by the conceptual metaphor theory (multiple meanings versus metaphorical meanings, fixity of metaphorical meanings, sufficiency of metaphor in representing abstract, experienced-based concepts, and groundings of complex metaphors, influence of metaphors on thought, lexicalizing conceptual metaphors), the author comes up with the main inconsistency between Lakoff and Johnson's theory and the avalanche of research that followed. Ironically, the revolutionary denial of literal meaning and all the referential logic in their seminal 1980 work was rather neglected by their followers who established one-to-one correspondences between metaphorical expressions and underlying root metaphors (54).
According to Ritchie, the fundamental insight of Lakoff and Johnson's conception is that the co-occurrence of physical experiences . . . with each other and with more abstract emotional experiences . . . strengthens neurological connections that form the basis for subsequent understanding of concepts (55). He also makes a pertinent remark concerning language-driven versus embodied knowledge. He argues that the role of the latter has been exaggerated, to the detriment of admitting the importance of communication and language in the development of conceptual metaphors (57).
In Chapter 4 the author deals with G. Fauconnier and M. Turner's conceptual blending theory, criticisms of that theory and responses to these criticisms. His major objection lies in the unnecessary complexity of the theory: faithful to his iconoclastic demeanour, Ritchie shows that many elaborations derive from meta-metaphors (space, blending),
and without contesting the conceptual integration as a phenomenon, he promises a more coherent explanation thereof.
Chapter 5 offers an overview of some recent ideas about context and common ground. Beginning with relevance theory (D. Sperber and D. Wilson), Ritchie teases apart notions of context, cognitive environment and mutual cognitive environment, and accuses the theory of the same circularity vice: the decision whether to process a message in a particular context relies on the effects achieved by processing the message (83). Besides, he notes as unresolved the issues of infinite recursion and of biological grounding. Moving to the conversational model of H. H. Clark, the author dwells on the social and interactive dimension it offers: the analysis of the communicative situation itself is considered to be useful for describing the coordination of multiple contexts and their influence on the common ground.
In Chapter 6, the author gives a detailed synopsis of L. Barsalou's perception-based theory of cognition which he characterizes as an explicitly embodied approach (97) especially consistent with the limited processing capacity of the human neural system (123), and explains how it should be applied to language. Stating that, due to working memory limitations, simulations of concepts and schemas could never be complete, and that the previously activated contents of working memory must play a role in determining which features are simulated and which are omitted (114), Ritchie raises the question of how context, both limited and extended, can be incorporated into such a theory of cognition. That gives him an opportunity to re-analyze a number of theoretical constructs (experienced present, relevance, cognitive environment, social schemas, distributed cognition, cultural schemas, common ground) and to show that they fit into the previously described model.
Though numerous gaps in the perceptual simulators theory are identified (123), Barsalou's conception is proclaimed to be optimal for elaborating a new theory of metaphor.
Chapter 7, the very heart of the book, presents a new theory of metaphor use and interpretation: CLS simulators theory. The underlying ideas being already obvious to the reader, Ritchie takes his time to weave all the loose threads into smooth theoretical tissue. Though compatible with conceptual metaphor theory, CLS accords more importance to language as a direct source of concept development (132). Ritchie then revisits the main assumptions of conceptual metaphor theory in this new light (basic experience, types of metaphors, fields of meaning, conceptual fields). Getting back to conceptual blending, Ritchie proposes his solutions to the problems analysed in Chapter 4 (the monk climbing the mountain, Margaret Thatcher for President . . .). He also compares his
theory to B. Indurkhya's perceptual blending in poetic metaphors and to frame-shifting. The climax of the chapter is the new, extensive definition of metaphor in terms of CLS: Metaphor alters the way one concept (the topic) is experienced by suppressing context-irrelevant simulators associated with another (the vehicle), and connecting them with the topic. In many cases the power of metaphor, in comparison to an apparently equivalent literal statement, derives from the fact that the defining attributes of the vehicle are irrelevant to the topic, hence the perceptual simulators associated with the defining attributes are suppressed as context-irrelevant. That leaves only the secondary attributes, those that express the nuances of thought and feeling experienced by the originator of the message, activated in the hearer's working memory. Because they are the only attributes activated, these secondary attributes receive more cognitive processing and become more strongly associated with the topic (169). Ritchie then briefly addresses the questions of creating and using metaphors, of creativity, interpretation and analysis: the power of metaphors, in terms of CLS, is seen in activating the full range of the secondary simulators, all the proprioceptive and emotional connections that the literal translation invariably loses.
The concluding two chapters elaborate on implications of CLS theory. Chapter 8 dwells on the notion of context and its role in interpreting metaphors within the conversation and in creating metaphorical entailments. It addresses directly the question of figurative language interacting with concept simulators and social structure. In his insightful example ("By the time Mary had her fourteenth child, she'd finally run out of names to call her husband") he shows how, induced by the playful use of language, culturally approved frames resonate with subversive frames (185).
Notions of frame shifting, cultural fields of meaning, individual conceptual fields receive a thorough explanation. Factors that influence the power of metaphors (their reproductive fitness; epidemiology of representations) are described; generative metaphors, especially in science, are touched upon.
The last chapter of the book, Chapter 9, serves mostly as a conclusion, repeating the basic theoretical points. It then presents a coherent classification of metaphors according to the perceptual simulators that the vehicle activates in a particular context: external sensory perceptions, proprioception, introspection (209). The strength of metaphors and cultural restructuring through cultural metaphors are then discussed. The book closes with questions for further investigation (215-216) and implications for metaphor analysis (216-217). In his concluding remarks Ritchie enumerates the major advantages of CLS theory and posits that it has the potential to integrate metaphor theory with a more general theory of
communication that incorporates social and cultural processes along with the cognitive processes described by Barsalou (217).
Overall, Context and Connection in Metaphor is a serious interdisciplinary work of great interest to a wide range of research programmes. Among its greatest benefits one should cite, firstly, the highly coherent incorporation of biological grounding into metaphor theory, a feature which should appeal to a significant number of cognitive linguists. Secondly, the CLS theory turns out to be an extremely delicate instrument of purely phenomenological analysis. Starting with the assumption that experience is varied in a way that is continuous and subtle and that no code-like language can possibly express the full range of experience (125), Ritchie then speaks of triggering such perceptual / proprioceptive / introspective reactions that are sometimes visceral, intimate, indescribable, and constitute the subtlest nuances of meaning. From this perspective, CLS theory reveals its enormous potential for communication studies and textual analysis. Finally, there is a very interesting idea that is always present in the book, but never emphasized: Ritchie states that utterances do not necessarily fall into rigid categories of literal or metaphorical, nor are language users conscious of these distinctions (48). This rejection of strict literal versus metaphorical opposition is common to some Western semantic theories, and its appearance in cognitive linguistic apparatus is a welcome contribution.
Teun A. van Dijk, Discourse and Context: A Sociocognitive Approach. Cambridge: Cambridge University Press. xiv + 267 pp. ISBN 978-0-52189559-0. Reviewed by Gerard Steen, Department of Language and Communication, VU University Amsterdam. E-mail: gj.steen@let.vu.nl
This book is the first monograph dedicated entirely to the notion of context (p. ix), says the preface. It therefore classifies itself as exploratory, theoretical, and fragmentary. In the introduction, Teun van Dijk announces that he will design elements of a framework for a theoretical concept of context that can be used in theories of language, discourse, cognition, interaction, society, politics and culture (p. 15). The chapter titles partly reflect this: after an introduction, we get context and language, context and cognition, and context and discourse, plus a conclusion.
These are no modest ambitions. Yet contexts are defined from the beginning as subjective participant constructs, or mental models; this positions the book solidly inside cognitive and social psychology. Its subtitle is a sociocognitive approach. Discourse and context hence goes back to the original proposal of context models for discourse processing in van Dijk's well-known Strategies of discourse processing, co-authored with cognitive psychologist Walter Kintsch (Van Dijk & Kintsch, 1983). In that long-standing and influential model, language users simultaneously use several mental models during discourse production and reception, one of which includes a mental representation of the context of the discourse (participants, setting, and so on). This lineage makes it unclear how new the theory presented here really is, and whether the essentially psychological basis of the proposed concept of context is equally acceptable across the board in the humanities and the social sciences. Whether the book resolves the latter issue will be answered at the end of this review.
The claim that this is the first monograph dedicated entirely to the notion of context, and, more specifically, to context as a mental construct, is not quite true. One well-known counterexample is Givon's (2005) Context as other minds, the preface of which says:
The non-objective nature of context . . . has been conceded by pragmatists from Lao Tse to Aristotle to Kant to, more recently, Sperber and Wilson (1986). But affirming that context is a mental construct only opens up a vast research agenda. . . (p. xiii)
Givon is one of many who have been influenced by Van Dijk & Kintsch (1983). He gives his own spin to the mentalist view by defining the functional role of contexts as the bridging principle . . . , the one that would connect first-order framing of external reality, second-order framing of one's own mind, and third-order framing of other minds (ibid.). This bridging function of context between language users, other language users, and reality is theoretically crucial to more discourse analysts in different disciplines, and it is what Van Dijk keeps returning to throughout his book. But van Dijk does not aim to offer a systematic, balanced and exhaustive review of these fragments of theory, preferring to offer his own coherent account. Givon is given one page of discussion, in the chapter on discourse and cognition, which ends on a negative tenor: Note though that apart from knowledge Givon barely explores the other dimensions of context as complex representation of communicative situation (p. 97). Yet knowledge plays a central role in any sociocognitive approach to context; indeed, Van Dijk claims that the main function of
a mental context model is management of knowledge, as we shall see. A question arises, therefore, from the beginning, about the way van Dijk presents his view of the field. I have great admiration for van Dijk's work. His contributions to the field are many. But this book is not the first monograph entirely devoted to context, nor is it as revolutionarily new as is suggested by the first pages. These two notes of uneasiness triggered by the preface have an effect that persists, even though Discourse and context is an interesting endeavor to advance the theory of context. In what follows I will give an account of van Dijk's argument and comment on the way it is presented.
The introduction begins with an illustrative analysis of aspects of a Parliamentary debate in the British House of Commons, to show all sorts of ways in which the production and reception of discourse is dependent on context, in particular, context as mentally represented by the participants in the debate. Aspects of deixis as well as of the political situation in Britain are discussed from this perspective in order to demonstrate that when people communicate they also make inferences about their own and other participants' definition of the communicative situation. This suggests that we need a theory of this aspect of discourse, leading into a brief overview of the way context and contextualization have been dealt with in a broad range of disciplines. Against this background, a series of twenty tenets about context is then presented at the end of the chapter; they function as a sketch of the new framework to prepare the reader for what will be claimed and developed in the rest of the book.
The order of, and relation between, the twenty tenets about context is not transparent; indeed, not all of them are formulated as claims.
When it is claimed that contexts are mental models (claim number 3) and socially based (claim number 7), the question arises whether this suggests that all mental models are socially based (which is a claim that may mean many things to different types of readers, e.g. psychologists versus sociologists), or whether context models are mental models which are also socially based, which is a different story altogether. And it is hard to appreciate the difference between the claim that contexts are mental models (claim 3) and that context models are schematic (claim 5), since mental models are described in terms of schema properties under claim 3. The principles by which these claims can be distinguished and related and grouped as elements of a theoretical framework for context are not discussed, but they are not so self-evident and transparent that they naturally emerge from the initial manifesto.
These are nontrivial observations about a text which continues in the second chapter with a harshly formulated critique of the only theory of context in linguistics that is taken seriously by the author: Systemic Functional Grammar (SFG). According to van Dijk, this approach suffers from the following defects (pp. 29-30):
1. Too much linguistic (lexico-syntactic) sentence grammar;
2. Too few autonomous discourse-theoretical notions;
3. Anti-mentalism; a lack of interest in cognition;
4. Limited social theory of language;
5. Too much esoteric vocabulary;
6. Too little theoretical dynamism, development and self-criticism
SFG's contextual notions of field, tenor, and mode are negatively evaluated as hardly well-defined (p. 38), making up a rather strange list (ibid.) and a simple, heterogeneous and hardly theoretically consistent definition (ibid.), and formulated in rather idiosyncratic terminology (p. 39). The SFG definition of register is judged as being of the same kind: . . . , definitions are limited to rather vague and unsystematic lists of examples (p. 41). And the three well-known metafunctions of language in SFG, the ideational, interpersonal, and textual functions, are given the same treatment: . . . , but it need not surprise us that the arbitrariness of the contextual categories [field, tenor, and mode, GS] carries over to their linguistic correlates (p. 41). The conclusion at the end of the 7-page section of the chapter that centers on the work by Halliday is as follows:
Suffice it to say that the original theory of context, as limited to a heterogeneous collection of three vague categories, is indeed rather arbitrarily related to a functional typology that is equally misguided, or at least quite limited. That is, a bad theory of context also generates a bad theory of the very functions of language, language use or discourse. Or rather, SFL does not really offer a theory of context, but rather a theory of language focusing on grammar, and later also on text and discourse. (p. 42)
Even if one agrees with many or most of the objections advanced by van Dijk, the way they are presented is counterproductive to persuasion. In a 200-page work which aims to set up a framework for a theory of context, it is disproportionate that the single linguistic approach to context taken seriously by the author is dismissed on the basis of a theoretical critique that basically centers on a handful of pages of argumentation against its main proponent (Halliday) while the alternative pursued by the author is only available as a sketch, in a series of twenty implicitly related tenets. The consistently negative tone of his argument does not do justice either to the fact acknowledged at the end of this chapter, that there is a relevant body of SFG work which can be integrated into the author's own, more encompassing approach (p. 55). Radical criticism of SFG is clearly needed, but it should be pursued in other ways.
The last sentence of the above quotation suggests, moreover, that van Dijk has an explanation of the rather sorry state of the SF theory of context (p. 43): it is a theory of language where grammar is the core and everything else is context (p. 51). I would agree, but would like to add that this indeed looks like an understandable state of affairs, at least from a historical perspective of the development of functional grammar, functional linguistics, and discourse analysis in the latter half of the previous century. This historical perspective might have been employed by the author to present a more constructive picture of the role of SFG for his development of his own theory of context, which, by his own account, is concerned with another level, or phenomenon: discourse, not language and its use. The contrast between language and discourse is familiar to quite a few discourse analysts, but problematic to many linguists, and it might have been explained in more detail and perhaps at an even earlier stage. As it happens, this is deferred until chapter 4, on page 116 and following.
What is more, it appears that van Dijk's analysis focuses on an old version of SFG. He has missed Halliday's book from 1999, co-authored with Christian Matthiessen, entitled Construing experience through meaning: A language-based approach to cognition. It has its own agenda which lies outside the scope of this brief review. But it demonstrates that the modern Halliday is not anti-mentalist and that there is a new dynamism in his work (pace objections 3 and 6 above), and that Halliday looks at context in ways that in spirit are compatible with van Dijk's endeavor.
That this new development in Halliday's work focuses on language is not surprising for a linguist, and is moreover one of the three essential areas in the theoretical framework developed by van Dijk in the current monograph (context in language, cognition, and discourse). His lack of awareness of this and consequent developments in SFG is a sorry and undermining omission.
After this chapter on Context and language, we know that van Dijk holds that there are no serious theories of discourse context in linguistics except for SFG, and that SFG does not work. However, we do not know what his alternative view of discourse context regarding language is, as might have been expected from the chapter title. This raises the stakes for the next chapter, which is entitled Context and cognition. The chapter is twice as long (55 pp) as the chapter on context and language (28 pp), and is followed by a third central chapter, called context and discourse, which is twice as long yet again (106 pp). With the obvious
connection to the old work by Van Dijk and Kintsch, the chapter on context and cognition should fulfill some of the reader's expectations.
Chapter 3, then, presents van Dijk's main idea, that discourse behavior, including its individual and interactive processes and products, is regulated by people's mental models of context. A simple schema of the role of the context model in discourse production is offered at the end of the chapter, when all of the elements of this simple schema have been introduced and discussed in the preceding pages (p. 103). Most important in this connection are components of semantic or social memory (general, sociocultural knowledge; group, local knowledge; group ideologies; and group attitudes) as well as of episodic memory (event model; context model; and discourse representation). The context model is a mental representation of the observable or material communicative situation, while semantic memory is depicted next to the independent social structure and social situation, and the discourse representation in episodic memory feeds out again into observable discourse/interaction.
This chapter is an elaboration of van Dijk's earlier work on context models with Walter Kintsch (van Dijk & Kintsch, 1983; van Dijk, 1999) and presents new theoretical connections, details, and elaborations. The crucial addition is the designation of a so-called K-device, or knowledge device, which handles contextual knowledge management in order to monitor discourse production (and presumably reception) and its expression (or interpretation) in explicit linguistic structures. It is a coordination device for (joint) action and discourse (p. 94), an explicit link with the work by Herb Clark (e.g., 1996). The upshot of this approach, therefore, is that it locates language use and discourse in individual cognition and performance, in situated contexts.
People's multi-level cognitive representation of discourse is clearly one of the fundamental areas of investigation which has afforded a steadily developing line of research since the mid seventies of the previous century. This has been partly stimulated by van Dijk's own wide range of publications and emerged from psycholinguistic, sociolinguistic and anthropolinguistic offshoots of studies of language use in formal and functional linguistics. It has led to psychological approaches to discourse, including attention to the role of context models, extensively represented in the Society for Text and Discourse and its journal Discourse Processes. But it is also true that an explicit theoretical model of the structure, function, and effects of context models has remained largely extant: That is, psychological model theory is semantic, not pragmatic. It does not postulate an intermediary representation of the communicative situation in terms of mental models (p. 57). It is the main merit of van Dijk's monograph that he attempts to redress this situation.
This is also the reason why the book is potentially interesting to cognitive linguists. It presents a cognitive theory of language use, but with more attention to higher-level cognition and communication and to nonlinguistic aspects of usage events (general knowledge structures, social and cultural structures) than is customary in much of cognitive linguistics. Moreover, it is more solidly connected with the discourse psychological investigations carried out in cognitive and social psychology, and less oriented towards the microlinguistic experimental analyses of psycholinguistics that are more popular in cognitive linguistics. This book may therefore offer genuinely new ideas to many cognitive linguists interested in the behavioral complexities of situated language use.
But at this point two questions remain: firstly, there is the issue that context is restricted to its role as a mental model, which does not seem to allow for alternative conceptualizations of context. It is not clear, at least at this stage of the book, how van Dijk's approach relates to social, cultural or even linguistic conceptualizations of context, all of which are, in principle, equally viable until the opposite is demonstrated. True, in the introductory chapter, van Dijk has stressed that he does not reduce the theory of context to a mere cognitive account and that he will analyze context in relation to social cognition, social interaction, social structure and culture, respectively (p. 27) in a companion volume to the present monograph (Discourse and society). But this mere announcement does not alleviate the reader's problem when trying to understand the theoretical purport of chapter 3.
Secondly, given the lack of an alternative to SFG in the previous chapter, the question remains where the cognitive representation of language as language comes in, and how it relates to the cognitive representation of context in a context model.
The old van Dijk and Kintsch (1983) multidimensional model of discourse had a surface text as one other, separate model for the linguistic representation of discourse; but as far as I can tell this old model is only invoked once by means of a cursory reference in chapter three (p. 101). The relevant sentence suggests that this still is the operative model within which we need to understand van Dijk's comments about context models, but the relation newly developed in this book between surface text, context model, and context is not given any space in the first three chapters. A tip of the veil could at least have been lifted in order to clarify both of these issues and contextualize the notion of context model better, certainly after the strong criticism of SFG which comprises the first central chapter of the book. Perhaps this is what happens in chapter 4? For the first issue, the answer is positive: the first part of chapter 4 presents an account of the way in which a mental context model contrasts
with social and cultural models of context. The latter are seen as abstractions from concrete discourse events in which mental models mediate between the individual and social interaction. This is, then, the answer to the question how van Dijk's psychological theory of context can also be useful in the social sciences. Since he argues that social and cultural models of context that do not include mental mediation are abstractions and, essentially, reductions to statistical tendencies, it is a moot point for discussion whether all social scientists would accept this offer. The second issue, the relation between context, discourse, and language, is discussed next, first of all with reference to the notions of style, genre, and register. These are the concepts needed to describe linguistic variation from a functional perspective in relation to a model of discourse that takes context models as crucial. Appropriate style and register are guided by considerations of genre, which are somehow driven by the context model. Most of what is presented here is compatible if not identical with the assumptions and findings of Douglas Biber and his associates (e.g., Biber, 1989), which makes good sense, but might have been reflected on more explicitly in a volume that attempts to develop a new theoretical model against the background of a range of disciplines. The second half of van Dijk's treatment of the relation between context, discourse, and language moves away from the lexico-grammatical concerns that are typical of register and style in relation to genre. Genre, register and style, van Dijk claims, have to do with the general contextual conditions for language variation in discourse. The second half, by contrast, examines more local expressions and enactments of the relation with context in the language of text and talk (discourse). The crucial assumption is that the context model controls the possible variation in the production and reception of language and discourse.
Various levels of lexico-grammar are included (sounds, syntax, vocabulary), as well as various levels of, let us say, text (rhetorical structures, superstructures, and text types). The meaning, functions and supposed effects of these phenomena are discussed in terms of, for instance, topics and related to context models, often through the mediating format of distinct genres of discourse (illustrated by media texts, conversations, and so on). Spoken interaction is given the same type of treatment. The main point of this part is that there is more functional variation than is allowed for in traditional (sociolinguistic) studies of the effect of social context, simply because context models are not one-dimensional but complex. Chapter 4 is over one hundred pages long and does not contain a formal or explicit model. It presents a cursory, linear discussion of all sorts of linguistic phenomena of discourse without much schematization. This can be followed by the initiate, but requires hard work. I wonder why van
Dijk has not offered more help in the form of, for instance, a taxonomy or classification of context models: his comments about the notion of discourse genres show that he is sufficiently aware of the potential in that area. However, we find no discussion of the relevance of a number of well-known genre-analysts, who have related views of the role of people's mental representations of context as the driving force between discourse production and reception. John Swales, John Bazerman, Eric Paltridge, and Ken Hyland, for instance, are lacking from the bibliography, even though Bhatia has been included. This is a remarkable neglect in a volume of this focus, scope and ambition. In all, I find Discourse and context an interesting and provocative but somewhat uneven attempt at advancing the theory of context models. It presents and illustrates many theoretical insights that are worth attending to and developing. But the way they are presented in this monograph may slow down their incorporation into current research on discourse.

Author note

I am grateful to Alan Cienki and Wilbert Spooren for their helpful comments; they are not to be held responsible for the views expressed in the final version.

References
Biber, D. (1989). A typology of English texts. Linguistics, 27, 3-43.
Clark, H. H. (1996). Using language. Cambridge: Cambridge University Press.
Givón, T. (2005). Context as other minds: The pragmatics of sociality, cognition and communication. Amsterdam / Philadelphia: John Benjamins.
Halliday, M., & Matthiessen, C. M. I. M. (1999). Construing experience through meaning: A language-based approach to cognition. London: Cassell.
Van Dijk, T. A. (1999). Context models in discourse processing. In H. Van Oostendorp & S. Goldman (Eds.), The construction of mental models during reading (pp. 123-148). Hillsdale, NJ: Erlbaum.
Van Dijk, T. A., & Kintsch, W. (1983). Strategies of discourse comprehension. New York: Academic Press.
The English past tense: Analogy redux

STEVE CHANDLER*
Abstract

The debate over how best to characterize inflectional morphology has been couched largely in terms of the dual-mechanism approach described in Pinker (1999) versus single-mechanism connectionist approaches derived from Rumelhart and McClelland (1986). There are, however, other single-mechanism approaches. The exemplar-based or analogical models of Daelemans et al. (2002) and Skousen (1989) also model inflectional usage accurately within a single mechanism. The most striking theoretical claim peculiar to these purely analogical models is that they do not posit any resident linguistic generalizations for processing language. Instead they process new instances of usage by comparing them systematically to remembered instances of previous usage. Based on a comparison of their Minimal Generalization Model with an adaptation of Nosofsky's (1990) Generalized Context Model, Albright and Hayes (2003) argue that such purely analogical models are intrinsically inadequate for modeling the English past tense. This paper shows, however, that Skousen's (1989) Analogical Model performs as well as the Minimal Generalization Model. The implications of these results for cognitive linguistics are discussed.

Keywords: analogical modeling, exemplar-based modeling, inflectional morphology, English past tense
1. Introduction

After more than two decades of intense research, the debate over the proper theoretical characterization of inflectional morphology continues largely
* Address for correspondence: Steve Chandler, Dept. of English, University of Idaho, Moscow, Idaho 83844-1102, USA. Email: chandler@uidaho.edu. Cognitive Linguistics 21-3 (2010), 371-417. DOI 10.1515/COGL.2010.014. 0936-5907/10/0021-0371 Walter de Gruyter
unabated and unresolved. There appear to be at least two major reasons why this is so. Probably most important is that for many linguists nothing less is at stake than our proper understanding of and characterization of how language is represented in the brain. The other reason is that the modeling of the inflectional processes has proven to be computationally tractable, and consequently there are currently several computational models available which allow us to test the alternative theoretical models against actual human behavior more precisely and more extensively than is the case for almost any other aspect of language usage. For the most part the debate over the nature of the past tense has been couched in terms of the dual-mechanisms model described in Pinker's (1999) Words and Rules versus various single-mechanism connectionist implementations inspired by the work of Rumelhart and McClelland (1986). The former model posits that regular and irregular inflectional forms are divided between radically different representational and processing systems, the "dual mechanisms". The latter class of models posits that although regular and irregular verbs show important differences in behavior, those differences can, nonetheless, be accounted for within a single representational and processing system, the "single route". There are, however, other single-mechanism approaches which have also proven successful at modeling inflectional behavior, namely Skousen's Analogical Model (Skousen 1989, 1992) and the Memory-Based Models (especially the Tilburg Memory-Based Model) of Daelemans et al. (2002), and, to a lesser degree, an adaptation of Nosofsky's (1986, 1990) Generalized Context Model. The most striking difference between those analogical models, or exemplar-based models, as a group (described in more detail below) and the other two types of models is the status of resident linguistic generalizations.
Both the Words and Rules Model and the connectionist models posit resident mental (linguistic) structures that have been abstracted away from the tokens of linguistic usage that one has encountered during language acquisition. Those resident generalizations then become the basis for processing (interpreting and producing) subsequent instances of linguistic usage. The purely analogical models posit no such resident linguistic generalizations. Instead, they posit that speakers process new instances of linguistic usage by comparing the new instance to an ever-accumulating store of remembered linguistic experiences. If a sufficiently close match is found in memory, then that match becomes the basis for processing the new instance of usage. If no match is found in memory, then the analogical models provide explicit processes for comparing the new instance to similar instances in memory and using those remembered instances as the basis for operating on the new instance analogically. As will be described below, what distinguishes one analogical model from another are the details of the procedures for comparing new instances of usage to remembered instances and then selecting one or more of those
remembered instances as the basis for operating on the new instance analogically. In a series of studies, Albright and Hayes (2002, 2003) and Albright (2007) have proposed yet another alternative, a single-mechanism, rule-based model which they argue both disconfirms certain empirical claims of the Words and Rules approach and exhibits certain empirical and theoretical advantages over the purely analogical models as a class of models. The rules that they posit are resident linguistic generalizations, morphophonemic rules, which are abstracted away from instances of past-tense usage and which in turn become the basis for subsequent language processing. In this paper, I shall show that Albright and Hayes' claims regarding the empirical and theoretical advantages of their rule-based Minimal Generalization Model over purely analogical models are empirically incorrect. As I show below, Skousen's Analogical Model (Skousen 1989, 1992) accounts for both the verb-form rating data and the nonce-verb production data presented in Albright and Hayes (2003) as well as their Minimal Generalization Model does, and it does so without positing the intervening operation of resident linguistic generalizations. I then go on to argue that not only are Albright and Hayes incorrect in their claim regarding the theoretical inadequacy of purely analogical models, but that it is indeed their model which is both theoretically and empirically inadequate for explaining fully the phenomena of past-tense usage. Finally, I shall discuss some of the implications of these results and of exemplar-based models for the representation of linguistic generalizations within a cognitive linguistics framework.

2. The models

2.1. Words and Rules versus connectionist approaches
The Words and Rules Model of Pinker (1999) posits two largely independent linguistic mechanisms underlying performance on inflectional morphology such as the English past tense. One mechanism, linked closely to the mental lexicon, accounts for the usage of those 120 or so English verbs associated with irregular past-tense forms, such as sing~sang, eat~ate, and so forth. Presumably, those arbitrary but conventional usages are acquired through the same mechanisms and processes of associative learning thought to underlie lexical acquisition in general, and they indeed show the characteristics of associative learning. For example, they show the effects of frequency of usage in their acquisition and subsequent use, as well as in their propensities for errors. They also show other earmarks of associative learning in how words that resemble one another in form and meaning sometimes interfere with one another's usage, as seen in the common confusions between pairs such as sit~set, lie~lay, and ring~bring.
The other mechanism of inflectional morphology modeled within the Words and Rules account is characterized as part of the grammatical system of the language, a schematized, general linguistic procedure for inflecting any lexical item marked symbolically as <+VERB> and <+PAST> and not already associated with an irregular past-tense form. Since this general process operates on a category of lexical items (i.e., regular verbs) rather than on individually learned verbs, its operations are said to be largely immune to those characteristics of associative learning just mentioned. In other words, all regular verbs are supposed to be equally good examples of verb regularity. They should show neither the sorts of frequency effects seen in the usages of irregular verbs nor the interferences seen among irregular verbs of similar form and meaning. Moreover, since the regular process operates on a category of verbs rather than directly on the individually learned verbs themselves, the regular process becomes the default form of the past tense extended to new items and overgeneralized to normally irregular verbs whenever the brain, for whatever reason, fails to find and access the associated irregular past-tense form. Pinker and his colleagues have amassed an impressive array of observational, behavioral, experimental, and clinical data which, they argue, demonstrate a strong neuropsychological dissociation between the operations of these two mechanisms on past-tense inflection (see Pinker and Ullman 2002 for a more recent survey of these data and arguments). In 1986 Rumelhart and McClelland demonstrated that both regular and irregular past-tense forms could be represented and modeled within the common computational framework of a connectionist model, a single mechanism.
In Rumelhart and McClelland's original implementation, examples of verbs and their past-tense forms, both regular and irregular, were presented to an array of units representing the phonological features of those verbs. Those units were connected in turn to another array of phonological units representing the past-tense form of the input verb. Subsequent implementations of connectionist models have almost always added a set of hidden units between the input and output units which encode recurring patterns among the input units. During training, feedback adjusts the weights of the connections among those units selectively such that different patterns of units begin to emerge which come to represent different categories and subcategories of the input verbs and their associated past-tense forms, both regular and irregular. In most connectionist models, the individual tokens of verb forms presented to the network during training are not retained in the connectionist network. Each token influences the evolving patterns of connections that come to represent the different verb types, but the tokens themselves are not represented therein. In a detailed analysis and critique of Rumelhart and McClelland's (1986) connectionist model, Pinker and Prince (1988) identified a number of serious theoretical and empirical problems with that model. Over the ensuing years,
numerous other researchers such as Marchman and Callan (1995), Hare et al. (1995), and Joanisse and Seidenberg (1999) have sought to address those criticisms and have demonstrated improved connectionist models of past-tense usage. Chandler (1995, 2002) and Skousen (1995) have argued that connectionist models such as the ones just cited continue to exhibit certain inherent theoretical and empirical deficiencies in their attempts to model past-tense usage. For example, as demonstrated in Pinker and Prince (1988), connectionist models do not extend gracefully to unusual-looking nonce English verbs such as ploamph. The networks tend to return unnatural blends of alternative past-tense forms. As Eddington (2002) has shown, however, the Analogical Model predicts the past-tense forms of such nonce verbs easily and accurately, matching human performance almost perfectly. More importantly, though, the connectionist models most commonly in use today do not retain perceptual memories of the individual training items presented to them and are not able to model human performance on learning and applying nonlinearly-separable categories, that is, categories whose memberships overlap inextricably (Whittlesea, 1997).¹ Both of these issues are discussed in section 7.2.

2.2. Connectionist models versus exemplar-based models
As we shall see below, analogical models of language behavior come in several varieties. Strictly speaking, the connectionist models such as those cited above are analogical models because they determine output, or behavior, by comparing a new input item to a collective representation of the items that the system has already encountered and internalized in some sense. Most connectionist models, however, abstract, or schematize, information about the representation and behavior of forms away from the individual training exemplars, or tokens, that gave rise to the network representation in the first place. They then discard or otherwise ignore those training tokens and assign outputs to new input forms by comparing the new form to the composite network representations rather than directly to the training exemplars. Such connectionist approaches contrast sharply with the analogical models proper, or exemplar-based models, described below, in which a new input form is compared directly to one or more tokens of similar items that have been retained in a database of previous linguistic experiences. A number of studies, such as Nakisa et al. (2000), Mudrow (2002), and Eddington (2000), have compared the strengths and weaknesses of connectionist models of inflectional behavior vis-à-vis analogical models.
1. As discussed in Chandler (2002), connectionist models can learn and apply nonlinearly-separable categories if they include enough hidden units to represent each member in the overlapping parts of the categories uniquely. At that point, however, the connectionist models become formally equivalent to exemplar-based models.
When truly comparable versions of the two types of models are compared head-to-head on a given task, the exemplar-based analogical models almost always perform as well as or better than the competing connectionist models (e.g., on the Finnish past tense and Danish compound nouns in Mudrow, 2002; on Arabic plurals and the English past tense in Nakisa et al., 2000; on the English past tense in Eddington, 2000). As already noted above, however, there are other important empirical and theoretical reasons for preferring exemplar-based models over connectionist models. Since these issues have been discussed in detail elsewhere (Chandler 1995, 2002, 2009a; Skousen 1995), I shall not discuss connectionist models further in this paper. To date, at least three different exemplar-based models of analogical behavior have been applied to the modeling of the English past tense: Skousen's Analogical Model (Skousen 1989, 1992), the Tilburg Memory-Based Learning model of Daelemans et al. (2002), and an adaptation of Nosofsky's Generalized Context Model (derived from Nosofsky 1986, 1990). The defining characteristics that these exemplar-based models all share are (1) that they retain and refer to a collection of prior linguistic experiences in determining the output behavior of a new input form, (2) that they include some algorithm for evaluating the similarity of the new form to those remembered forms, and (3) that they have a decision rule for deciding which remembered exemplar or exemplars will form the basis for deriving the output analogically. A number of studies have shown that one or more of these purely analogical models can account accurately for findings reported in support of both the dual-mechanisms approach and the single-mechanism connectionist approaches to modeling inflectional morphology (e.g., Chandler 2002, 2009b; Eddington 2000, 2003; Daelemans 2002; Keuleers 2008; Nakisa et al. 2000).
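The three shared characteristics just listed can be illustrated with a deliberately small sketch. It is not an implementation of the Analogical Model, the Tilburg Memory-Based Learning model, or the Generalized Context Model; the toy memory, the right-aligned mismatch count, the exponential similarity function, and the names `MEMORY`, `distance`, `predict`, and `choose` are all assumptions made for illustration only.

```python
import math
from collections import defaultdict

# Hypothetical toy memory of (base form, past-tense pattern) pairs.
# Ordinary spellings stand in for phonological transcriptions.
MEMORY = [
    ("sing", "i->a"),
    ("ring", "i->a"),
    ("sink", "i->a"),
    ("walk", "+ed"),
    ("talk", "+ed"),
    ("link", "+ed"),
]

def distance(a, b):
    """Count mismatching positions after aligning both forms at the right edge."""
    n = max(len(a), len(b))
    a, b = a.rjust(n), b.rjust(n)
    return sum(x != y for x, y in zip(a, b))

def predict(new_form, memory=MEMORY, sensitivity=1.0):
    """Return each pattern's share of the summed similarity to new_form."""
    support = defaultdict(float)
    for form, pattern in memory:          # (1) consult the stored exemplars
        # (2) similarity decays exponentially with distance (an assumption)
        support[pattern] += math.exp(-sensitivity * distance(new_form, form))
    total = sum(support.values())
    return {pattern: s / total for pattern, s in support.items()}

def choose(new_form, memory=MEMORY):
    """(3) Decision rule: select the pattern with the largest share."""
    shares = predict(new_form, memory)
    return max(shares, key=shares.get)

print(choose("ping"), choose("balk"))
```

With this toy memory the nonce form "ping" is drawn toward the sing/ring exemplars and "balk" toward walk/talk. The real models differ precisely in how they realize steps (2) and (3).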
These three models differ among themselves, however, in how they compare a new input form to the remembered exemplars, how they select certain candidate examples from the database to provide the possible basis for an analogical response, and how they choose from among those possible candidate forms one or more alternatives to provide the basis for an actual analogical response. As will be shown below, those differences turn out to have important empirical consequences; indeed, they are precisely why both the Tilburg Memory-Based Learning model and the Analogical Model reproduce past-tense behavior so much more accurately than does the particular adaptation of the Generalized Context Model used by Nakisa et al. (2000) and by Albright and Hayes (2003).

2.3. Minimal generalization
Albright and Hayes (2002, 2003) and Albright (2007) have proposed their Minimal Generalization Model, which they claim accounts for certain empirical findings that contradict the Words and Rules approach to inflectional morphology and for certain other empirical findings not explicable by what they call "purely analogical" models. The Minimal Generalization Model formulates and abstracts morphophonemic rules for deriving the past-tense forms of verbs by the iterative comparison of pairs of verbs that take the same past-tense form, whether regular or irregular. For each pair-wise comparison, the model abstracts the phonological features common to the two base forms while discarding the features unique to each verb. Thus, comparing talk~talked and walk~walked would produce the past-tense rule ∅ → t / [ _ak]__, and comparing drive~drove and ride~rode yields the rule [aj] → [ow] / [ɹ__{v,d}]. The resulting abstraction provides a structural description for predicting the past tense of a new verb whose base form also shares that phonological structure. These are the "minimal generalizations" of the model. Subsequently, the model also compares pairs of such rules that predict the same past-tense form and extracts an even more abstract rule while retaining the more specific rule. Ultimately, the model arrives at a set of minimal generalizations that predict correctly the appropriate past-tense forms for larger sets of verbs. For several theoretical and technical reasons, Albright and Hayes (2003) chose to represent their rules as bundles of phonetic features (rather than as the segmental symbols used here for simplicity). Using features to represent natural classes of phones allowed them to capture certain phonologically-based generalizations about speakers' use of inflections more naturally than an inventory of fully specified phonetic segments would have. For example, it allowed them to describe the generalization that "every verb of English that ends in a voiceless fricative . . . is regular" (p. 
127), and it allowed them to account for other well-attested facts such as that speakers will extend the regular [-s] allomorphs of the plural and possessive morphemes to borrowed words ending with the voiceless velar fricative [x], as in Bach's [baxs], even though there is no analogical basis for doing so in English based on full segmental representations. Using feature representations also allowed them to model the phonological conditioning of the regular past-tense allomorphs in an intuitively more insightful way than could an inventory of word-final segments, a capability that they identified as canonical in distinguishing between the two types of models:
Locating the final consonant to determine the correct ending is a canonical case where structured similarity is required: the past tense allomorph depends solely on the final segment of the stem, in particular on just a few of its features. [Their adaptation of the Generalized Context Model], however, is inherently unable to focus on these crucial structural elements. (Albright and Hayes 2003: 151)
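The phonological conditioning referred to in this quotation, in which the regular allomorph depends only on the stem-final segment, can be sketched as follows. This is a minimal illustration of the familiar textbook generalization, not Albright and Hayes' feature-based rules; the ASCII stand-ins for phonetic symbols, the two segment sets, and the function name `regular_past_allomorph` are assumptions.

```python
# Assumed toy inventories, with ASCII stand-ins for phonetic symbols:
# S = "sh", T = voiceless "th", C = "ch", x = voiceless velar fricative.
VOICELESS = set("pkfsSTCx")
ALVEOLAR_STOPS = set("td")

def regular_past_allomorph(stem):
    """Pick the regular past-tense allomorph from the stem-final segment only."""
    final = stem[-1]
    if final in ALVEOLAR_STOPS:
        return "@d"   # "@" stands for schwa, as in "wanted"
    if final in VOICELESS:
        return "t"    # as in "walked"
    return "d"        # remaining (voiced) segments and vowels, as in "hugged"

for stem in ("wak", "skrajd", "bax"):
    print(stem, regular_past_allomorph(stem))
```

Only `stem[-1]` is inspected, which is the sense in which the choice "depends solely on the final segment of the stem". A feature-based version would replace the two sets with tests on voicing and place features.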
Finally, Albright and Hayes (2003) also used the notion of natural classes, represented as features, to solve the technical problem of aligning segments
properly within their computer implementation when comparing words with different phonological structures. Given two forms such as drive [dɹajv] and dive [dajv], for example, the initial obstruents should be compared to one another and the two nuclear vowels should be compared, while the system should not try, inappropriately, to compare the second-position approximant of drive with the second-position vowel of dive. In Albright and Hayes' implementation the [ɹ] will be compared to a null segment inserted at the phonotactically appropriate point in the representation of dive. All computational models comparing words phonologically have to solve this segment alignment problem in some way, and I describe below how the Tilburg Memory-Based Learning Model and the Analogical Model have addressed it. From time to time the phonological context corresponding to one rule within the Minimal Generalization Model will also correspond incorrectly to verbs with other past-tense forms. For example, the rule [ɪ] → [ʌ] / {l, ɹ} ___ŋ as in string~strung correctly predicts six past-tense forms (the rule's "hits") but incorrectly includes three other verbs such as bring~brought and ring~ringed ("encircled") for a "scope" of nine verbs in total. Dividing the number of hits by the number of verbs in the rule's scope yields the "raw confidence" value, the probability that the rule will predict the past-tense form correctly given any particular input verb that conforms to the rule's structural description. However, allowing the raw confidence score (or probability) alone to determine whether a rule will apply results in many overgeneralizations of the irregular forms, such as producing brang for brought. Therefore, Albright and Hayes (2003) calculated an "adjusted confidence" value for a given rule based on how many verbs in the training set it predicts correctly.
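The raw confidence calculation described here is simple arithmetic, sketched below for the string~strung rule with the figures given in the text. The function name `raw_confidence` is an assumption; the adjusted confidence value is not reproduced because its formula is not stated in this passage.

```python
from fractions import Fraction

def raw_confidence(hits, scope):
    """Raw confidence of one rule: hits divided by the size of its scope."""
    if scope <= 0 or not 0 <= hits <= scope:
        raise ValueError("need scope > 0 and 0 <= hits <= scope")
    return Fraction(hits, scope)

# Figures quoted in the text: 6 hits out of a scope of 9 verbs.
strung_rule = raw_confidence(6, 9)
print(strung_rule)  # 2/3
```

On these figures the rule would be expected to misapply to about one conforming verb in three, which is why raw confidence alone overgeneralizes forms such as brang.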
Their rationale was that rules that predict many past-tense forms correctly and few or none incorrectly inspire a greater sense of confidence and are, therefore, assigned higher adjusted confidence values than are rules having the same raw confidence value but which apply only to relatively few verbs. When rules with different structural descriptions (one being a more specific subset of the other) apply to a new verb, the rule with the higher adjusted confidence value applies. Albright and Hayes (2003) see highly accurate rules that also apply to many verbs, i.e., rules having high confidence values, as identifying what they call "islands of reliability" in past-tense usage. For example, as mentioned earlier, they noted that all verb stems in English ending with a voiceless fricative are regular. Thus, this more specific past-tense rule has the highest adjusted confidence value that their model permits, whereas the more general, but not always correct, rule that a verb ending with a voiceless obstruent other than [t] takes [t] as its past-tense marker evokes a somewhat lower level of confidence. Albright and Hayes identified this notion of islands of reliability as their most important theoretical departure from Pinker's (1999) Words and Rules approach because the influence of such islands implies that not all regular
verbs do indeed provide equally good examples of regularity. In particular, Albright and Hayes predicted, contrary to the predictions of the Words and Rules approach (Pinker and Ullman, 2002) and contrary to the findings reported in Prasada and Pinker (1993), that in a goodness-of-form rating task the participants would rate regular past-tense forms that corresponded to islands of regularity as better examples of the past tense than they would regular verbs that did not correspond to some island of regularity. They also predicted that when competing rules applied to a verb, the rule having the highest adjusted confidence level would apply. Thus, in a past-tense production task, the output ought to be more consistent for verbs corresponding to islands of regularity than for the more idiosyncratic verbs. Because the Minimal Generalization Model retains many verb-specific rules from its training set, its performance on real verbs is excellent. The much more interesting tests of the model arise, therefore, in extending its predictions to verbs not included in the training set. For this reason, Albright and Hayes (2003) proposed to test the predictions of their model against data elicited from the participants in a pair of new "wug" tests (à la Berko 1958), in which the participants provided past-tense forms for novel, or nonce, verbs. Albright and Hayes also proposed to compare the performance of their rule-based model against that of a pure analogical model. They argued that an analogical model that predicted past-tense forms by comparison to overall words (1) would not match the accuracy of the phonologically-based rules derived by the Minimal Generalization Model and (2) would neither be able to identify the islands of reliability that they considered so important nor predict the rating data associated with those islands.

2.4. Selecting an analogical model to test
As just discussed, Albright and Hayes (2003) proposed to demonstrate (1) that inflectional morphology is better characterized as rule-based behavior abstracted away from the phonological structure of words rather than as analogical processes that compare whole words and (2) that speakers form and recognize islands of reliability based on phonological similarities among groups of like-behaving verbs. Given that there are at least three different analogical models extant in the literature that have been applied to modeling the English past tense, the Tilburg Memory-Based Learning Model, the Analogical Model, and the Generalized Context Model, Albright and Hayes described the criteria that led them to adopt a version of Nosofsky's Generalized Context Model (Nosofsky 1990) over the other two as their representative of the analogical approaches to modeling inflectional morphology. Their choice turned out to be unfortunate, for while the criteria that they appealed to for choosing the Generalized Context Model (described in this section) are well-taken in principle,
their application of those criteria led them to test a version of the analogical model that had already been shown elsewhere to be less successful in modeling inflectional morphology than are the other two analogical models (e.g., Daelemans 2002; van den Bosch 2002).

Two issues were particularly important to Albright and Hayes (2003) in choosing an analogical model to compare to their Minimal Generalization Model. One was that both test models could be fully implemented computationally so as to ensure that the models themselves were doing all the work of predicting the past-tense forms, without human intervention. The other issue was that, for the reasons cited earlier, they wanted to represent their test forms, data sets, and rules with phonological features rather than simply as sequences of phonetic segments. The Generalized Context Model seemed to them to accommodate both requirements more readily than did the other two extant models.

Albright and Hayes (2003) discounted both the Tilburg Memory-Based Learning Model and the Analogical Model because, while those models identify the lexical sources for a given analogical response, at the time of their study neither model had actually been used to apply the analogy and generate a fully specified phonological output. As it happens, the Generalized Context Model, which Albright and Hayes did select as their analogical test model, also does not generate linguistic output. As a model of categorization, it only evaluates the relative probability that each of the alternative outcomes (in this case, alternative categories of past-tense forms) presented to it for a given verb will be chosen as the past-tense form of that verb. Albright and Hayes used a separate algorithm to generate all possible output forms for a given test verb by allowing every rule that conformed to the phonological structure of that test verb to produce a possible past-tense form for it.
They then fed those possible outcomes to the Generalized Context Model for it to evaluate. Since the supplementary algorithm that generated those past-tense forms overgeneralized the regular past-tense allomorphs, producing, for example, *[skɹajdd], *[skɹajdt], and [skɹajdəd] for the regular past tense of scride, Albright and Hayes used a phonotactic filter to block the first two outputs from further consideration.

The other important reason that Albright and Hayes (2003) cited for choosing to use a version of the Generalized Context Model over the other two analogical models was that they wanted, for the reasons described earlier, to represent their test forms, data sets, and rules as bundles of phonological features rather than simply as sequences of phonetic segments. The Generalized Context Model appeared to them to accomplish such feature-based comparisons more easily than the other two models could. While their arguments are well taken, they do not necessarily motivate the choice of the Generalized Context Model over the other two analogical models as strongly as Albright and Hayes
argued. Both the Tilburg Memory-Based Learning Model and the Analogical Model can operate, and have been used to operate, on forms represented as bundles of phonetic features when it appears important to do so. Eddington (2003) and Eddington and Lonsdale (2007) have shown that phonetic features may be used rather than segmental symbols with no significant change in the linguistic performance of the model. For reasons having to do with the computational efficiency of the algorithm, however, rather than with theoretical performance, researchers using the Analogical Model have usually chosen to represent their data as phonetic segments rather than as bundles of phonetic features.2

Albright and Hayes (2003) also found it especially problematic that neither the Tilburg Memory-Based Learning Model nor the Analogical Model, in the then extant version, could derive the three allomorphs of the English regular past tense correctly. Those models simply predicted the probability that the regular inflection would be applied to a given verb. Albright and Hayes cited the use of the phonologically conditioned allomorphs for the English regular past tense as prima facie evidence that inflectional processes are more properly described as governed by rules based on minimal phonological generalizations than as whole-word analogies. Therefore, any adequate model of English past-tense usage should be able to predict those allomorphs correctly. Again, it is true that the extant version of the Analogical Model does not derive the appropriate allomorph of the English regular past tense, and early attempts to do so were not successful (e.g., Derwing and Skousen 1994).
In its current implementations, the model simply indicates that a given verb form will be regular, but it does not generate a phonological form for that verb.3 A procedure has been proposed elsewhere (Chandler 2009b), but not yet implemented computationally, that can be appended to the Analogical Model to predict the pronunciation of the regular past-tense. I describe an adaptation of that procedure below. Meanwhile, Keuleers (2008) has recently demonstrated a new working version of the Tilburg Memory-Based Learning Model which does generate fully-specified phonological forms for the past tense of verbs presented to it.
2. The reason for this has to do with what is known within Analogical Model research as the exponential explosion problem. In the earlier versions of the computer implementation, every variable that was added to the data representations increased the amount of memory storage and processing time needed to run the program exponentially. Thus, researchers sought to minimize the number of variables that the program had to process. This problem has been ameliorated significantly in more recent implementations of the model.
3. Researchers using the Tilburg Memory-Based Learning Model and the Analogical Model do not generally predict the fully-specified phonological output of processes such as verb morphology because the models (and their computer implementations) are more general models of linguistic categorization intended to apply to a variety of linguistic behaviors at all levels of linguistic description. Unlike the Minimal Generalization Model, they were not designed specifically as models of inflectional morphology.
As noted earlier, the proper alignment of segments is a recurring technical problem in the implementation of all computational models of phonological comparison, and the approach proposed by Albright and Hayes (2003), aligning sounds according to their shared natural classes, is an interesting alternative to those used in analogical modeling. Nonetheless, researchers working with both the Tilburg Memory-Based Learning Model and the Analogical Model also have working solutions to the problem. It has been common practice, for example, for researchers working with the Analogical Model to insert null place holders into the phonological representations of their test items as they are encoded into program-readable form (cf., Eddington 2003, 2007). The result is the same as that achieved by the procedure described by Albright and Hayes (2003). More recently, Keuleers (2008) has implemented a different procedure which compares the information gain ratio of pairs of segments in the words being compared and then aligns the segments so as to maximize that gain. Skousen (2006) and Eddington (2007) have recently suggested a different approach to the problem of aligning forms within the Analogical Model. They propose aligning the vowel nuclei and then comparing all of the phonotactically legal onsets and phonotactically legal codas represented in the remembered exemplars that are compatible with any other segments specified as part of the given test word. This is the approach that I adopt below and illustrate there.

2.5. The Generalized Context Model

As an analogical model to test against their Minimal Generalization Model, Albright and Hayes (2003) chose a version of Nosofsky's (1990) Generalized Context Model as implemented in a previous study of inflectional morphology by Nakisa et al. (2000). The Generalized Context Model was developed by Nosofsky as an exemplar-based model of concept learning and categorization behavior and has been tested most extensively as such.
The model categorizes a target item (usually an artificially created visual stimulus) by comparing it feature by feature with the sum of the features of all the training exemplars given to the model. The Generalized Context Model then sums the similarity of the target item (the number of shared features) over all the members of a given category of interest and divides that value by the sum of the item's feature-by-feature comparison to all members of all the categories being compared. The result is the probability of that item being a member of the category of interest. In this study of past-tense forms, the alternative categories are defined by the alternative past-tense forms theoretically possible for a given nonce verb. Thus, a nonce verb such as spling could fall into the sing~sang category of verbs or the string~strung category or the bring~brought category. The outputs of the model are the probabilities of the nonce verb falling into each of those alternative past-tense-form categories.
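The ratio just described can be sketched in a few lines of Python. This is only an illustration of the unweighted calculation as it is worded above (shared features summed per category, divided by the sum over all categories); it is not Nosofsky's full model, which also includes a similarity metric and feature weightings. The function names and the toy (onset, vowel, coda) encoding are assumptions made for the example.

```python
from collections import defaultdict


def shared_features(a, b):
    """Count the positions at which two equal-length feature tuples agree."""
    return sum(1 for x, y in zip(a, b) if x == y)


def category_probabilities(target, exemplars):
    """Summed similarity to each category divided by summed similarity to all.

    `exemplars` is a list of (feature_tuple, category) pairs.
    """
    totals = defaultdict(float)
    for features, category in exemplars:
        totals[category] += shared_features(target, features)
    grand_total = sum(totals.values())
    if grand_total == 0:
        raise ValueError("target shares no features with any exemplar")
    return {category: s / grand_total for category, s in totals.items()}


# Hypothetical toy encoding: (onset, vowel, coda). Not the study's feature set.
toy_exemplars = [
    (("s", "ɪ", "ŋ"), "sing~sang"),
    (("r", "ɪ", "ŋ"), "sing~sang"),
    (("str", "ɪ", "ŋ"), "string~strung"),
    (("br", "ɪ", "ŋ"), "bring~brought"),
]
spling = ("spl", "ɪ", "ŋ")
probabilities = category_probabilities(spling, toy_exemplars)
```

With these four toy exemplars, each shares two of three features with spling, so the category with two members receives half of the probability mass and the other two categories a quarter each.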
Nosofsky's (1990) Generalized Context Model has proven especially successful in modeling experimental data on concept learning and categorization, but that success has depended crucially on the researcher including certain additional factors in the equation for determining the relative probability of the alternative responses. In particular, the equation must include values representing (1) the type of similarity metric used and (2) the different weightings to be given to the various stimulus features (phonological features in these linguistic studies) used to represent stimuli and exemplars. The feature weightings are especially important. In concept learning experiments, the researcher must establish the appropriate feature weightings beforehand, usually by means of pair-wise similarity ratings to arrive at a confusion matrix for the stimuli used in a given experiment. Not determining and using feature weightings degrades the performance of the model significantly.

When they chose to adapt the Generalized Context Model, both Nakisa et al. (2000) and Albright and Hayes (2003) understood, and acknowledged, that not using the feature weightings would degrade the performance of the model significantly. Their decision was not unreasonable. It was not immediately evident that they could derive feature weightings from the thousands of verbs included as the data base for their studies, and any weighting values that they could have derived would have been fleeting and highly variable. Nosofsky (1986, 1990), Cost and Salzberg (1993), Eddington (2000, 2003), Daelemans, Gillis, and Durieux (1997), and Keuleers (2008) have all shown that such feature weightings do not become permanent values in the data set. Instead they are variables, often changing when the makeup of the data set changes, as when new items are added or deleted.
Thus, for example, when Albright and Hayes chose to exclude verbs with a lemma frequency of less than 10 from their data base, the weightings of the phonological features that would predict the results most accurately would have changed and could have changed significantly. The studies just cited have also shown that the feature weightings derived for one experimental task do not necessarily transfer to other tasks. This means that they need to be determined anew for each task rather than stored as permanent feature values used to represent the items in the data set. The feature weightings that would best predict the naturalness ratings reported by Albright and Hayes, for example, might not be the same weightings that would also best predict the past-tense productions. The potential consequences of deciding to test the weakened version of the Generalized Context Model become evident when we compare its ability to predict German plural forms with that of the other two analogical models under discussion here. Nakisa et al. (2000) found their version of the Generalized Context Model to predict the plural forms of German nouns correctly about 74.3% of the time. In subsequent studies, however, Daelemans (2002) found the Analogical Model to reproduce the plural forms for those same German
nouns correctly about 92% of the time, and van den Bosch (2002) found the Tilburg Memory-Based Learning Model to reproduce them correctly about 94.8% of the time. Thus, in adopting the version of the Generalized Context Model used by Nakisa et al., Albright and Hayes (2003) not only were not testing the strongest possible version of the Generalized Context Model but were actually testing a version that at least two other analogical models had already outperformed significantly on the same test items.

2.6. The test models not chosen by Albright and Hayes (2003)
2.6.1. The Tilburg Memory-Based Learning Model. For the reasons already discussed, Albright and Hayes (2003) chose not to use the Tilburg Memory-Based Learning Model of Daelemans et al. (2002) as their analogical test model even though it had been used in a number of studies to predict linguistic behavior analogically by comparing a test form to exemplars of behavior stored in memory (e.g., Nakisa and Hahn 1996; Nakisa et al. 2000; Daelemans 2002; Eddington 2002). In its simplest form this model simply compares the number of features (usually phonological segments) shared and not shared between the test item and each of the exemplars in the corpus to arrive at the exemplar or set of exemplars that are most similar to the test item. Those exemplars then provide the basis for an analogical response to the target item. Unfortunately, as Skousen (1989) and others have demonstrated, models that simply select the nearest neighbor, the most similar exemplar or exemplars from memory, are empirically inadequate; they often predict incorrect responses. This turns out to be so in large part because not all features are equally predictive of the appropriate behavior (past-tense form in this case) for every word. Consequently, researchers using the Tilburg Memory-Based Learning Model have developed more sophisticated versions of the model in which they determine beforehand, in a separate procedure, the information gain for each of the features in each of the different contexts (different exemplars). Information gain is essentially the conditional probability that a given feature will predict a given outcome (past-tense form) within a given context (the other features of the word).
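As a rough illustration of how such weights can be obtained, the sketch below computes information gain per feature position and uses it in a weighted overlap count. It assumes the usual entropy-based definition (the entropy of the outcomes minus the expected entropy once the feature's value is known), which is more specific than the informal characterization given in the text; the function names and toy data are hypothetical.

```python
import math
from collections import Counter, defaultdict


def entropy(labels):
    """Shannon entropy (in bits) of a list of outcome labels."""
    counts = Counter(labels)
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def information_gain(exemplars, position):
    """Reduction in outcome entropy from knowing the feature at `position`."""
    labels = [outcome for _, outcome in exemplars]
    by_value = defaultdict(list)
    for features, outcome in exemplars:
        by_value[features[position]].append(outcome)
    remainder = sum(
        len(group) / len(exemplars) * entropy(group) for group in by_value.values()
    )
    return entropy(labels) - remainder


def weighted_overlap(a, b, weights):
    """Sum the weights of the positions at which two feature tuples agree."""
    return sum(w for x, y, w in zip(a, b, weights) if x == y)


# Hypothetical toy data: (onset, vowel, coda) with a two-way outcome.
toy = [
    (("s", "ɪ", "ŋ"), "irregular"),  # sing
    (("r", "ɪ", "ŋ"), "irregular"),  # ring
    (("s", "ɪ", "p"), "regular"),    # sip
    (("r", "ɪ", "p"), "regular"),    # rip
]
weights = [information_gain(toy, i) for i in range(3)]
```

In this toy set only the coda distinguishes the outcomes, so it receives all of the weight, while the onset and vowel receive none. Because the weights are computed from the exemplars themselves, adding or removing exemplars changes them.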
Using information gain to weight the predictiveness of the features differentially is crucial to the satisfactory performance of the model (cf., Daelemans 2002; Eddington 2002) and turns out to be formally equivalent to the feature weighting used by Nosofsky (1986) with his Generalized Context Model (cf., Keuleers 2008). Consequently, this feature-weighting procedure is subject to the same criticisms noted earlier with respect to Nosofsky's model. Such weightings are task specific and change as exemplars are added to or deleted from the data set (the corpus). As part of a larger study on Memory-Based Learning of inflectional morphology, Keuleers (2008) has recently reported a very detailed theoretical,
empirical, and methodological critique of the simulations published in Albright and Hayes (2003). More importantly, though, Keuleers has demonstrated that his adaptation of the Tilburg Memory-Based Learning Model, referred to as simply the Memory-Based Learning Model in his paper, predicted the rating performance of the participants in Albright and Hayes' experiments as well as their Minimal Generalization Model did. Keuleers also simulated very closely the likelihood ratings reported in Prasada and Pinker (1993).

Keuleers' (2008) implementation of the Memory-Based Learning Model just cited compared the segmental differences and similarities between the nonce verbs from Albright and Hayes (2003) and the exemplars of the monomorphemic monosyllabic verbs contained in the CELEX English data base (Baayen et al. 1995).4 To meet Albright and Hayes' objection that the Tilburg Memory-Based Learning Model did not specify the phonological output of its analogical operations, Keuleers appended a separate procedure to the model for specifying that output. He identified 41 Transformation Labels corresponding to an exhaustive list of the morphophonemic changes represented in the corpus, including the regular English allomorphs, for producing a past-tense form from a base form. He tagged each exemplar in the corpus with the appropriate transformation label. Once his Memory-Based Learning Model had identified a verb or set of verbs from the corpus as the basis for an analogical response, the follow-up procedure then applied the transformation associated with the chosen exemplars to the target word. The combined effects of the distance metric and the feature weighting that Keuleers used ensured that exemplars exhibiting the appropriate regular allomorph were always selected over any exemplars having one of the inappropriate allomorphs.
As already noted, Keuleers' (2008) simulations with his Memory-Based Learning Model performed comparably to the Minimal Generalization Model of Albright and Hayes (2003). This result alone vitiated Albright and Hayes' major theoretical claim regarding purely analogical models and accords with the performance of Skousen's Analogical Model (1989, 1992) reported below. In his analysis and discussion of his results, Keuleers also identified a major logical flaw in the design and implementation of Albright and Hayes' study. He showed that when the Memory-Based Learning Model incorporates the same distance metric and feature weighting values commonly used in the stronger versions of Nosofsky's Generalized Context Model (Nosofsky 1986, 1990) and compares a test item to all possible verbs contained in the data base, the Memory-Based Learning Model becomes formally equivalent to the stronger
4. Keuleers also included the polysyllabic verbs in the CELEX data base, but he encoded and compared only the final syllable for each of those verbs since all of the test items were monosyllabic.
version of the Generalized Context Model. This means that, in all likelihood, the stronger version of the Generalized Context Model, the version not tested by Nakisa et al. (2000) or by Albright and Hayes, would have performed comparably to the Minimal Generalization Model that was used by Albright and Hayes.

2.6.2. The Analogical Model

2.6.2.1. Predicting the inflectional category. The other purely analogical model not chosen by Albright and Hayes (2003), the model used below to simulate their study, is the Analogical Model described in Skousen (1989, 1992). As an exemplar-based model of categorization and choice behavior, Skousen's Analogical Model assigns a given form, e.g., a nonce verb, to a category of linguistic behavior, a particular past-tense form, by comparing the given form and its current context to remembered categorizations of previously encountered forms, or exemplars.5 Theoretically, the model draws on a data set of contextualized memories for tokens, or exemplars, of linguistic usages. For purposes of simulation, we choose some corpus or data set of usage examples representing the linguistic behavior of interest, past-tense forms in this case. Such exemplars are not just memory traces of raw, unanalyzed sensory experiences; rather they include the results of the cognitive and linguistic interpretations that were assigned to them at the time they were experienced and stored. Thus, they are conceptually very much like the notion of usage event described in Langacker (1987) except that we posit that exemplars do retain much sensory information about the event as it was originally experienced, filtered only by the effects of stimulus sampling and attention (Estes 1976, 1994). Since we make no effort to predict which features of a given exemplar might be more or less important for operating on some future instance of usage, the Analogical Model does not incorporate feature weighting.
The model also does not need to posit some sort of usage counter for keeping track of usage frequency. Instead, it simply assumes that tokens of usage are recorded in memory and that frequency effects fall out as a natural consequence of the number of tokens recorded in memory. It is customary in simulations using the Analogical Model to enter the data set with an imperfect memory factor set at 0.5 (see Skousen 1989 and 1992 for the theoretical and empirical justifications for this practice). This means that on any given occasion there is a 0.5 probability of accessing and recalling a given
5. The most recent computer implementation for the AM and example data sets are available as free downloads from the web page of the Analogical Modeling Research Group <http://humanities.byu.edu/am/>.
token in the data set. As a consequence of this, higher frequency forms (those having many tokens in the data set) will almost always be accessed, recalled, and applied whereas less frequently recurring forms are less likely to be remembered. A form stored in memory (or in the corpus) only once has only a fifty-fifty chance of being recalled on any given occasion.

The core of Skousen's Analogical Model is a procedure (an algorithm) for predicting the behavior-of-interest for a currently presented form by comparing it systematically to forms in memory also exhibiting that behavior. For example, given a nonce-English verb such as chool, phonologically /tʃul/ (from Albright and Hayes 2003), the Analogical Model predicts a past-tense form for it by comparing it to similar verbs in memory that have not been overlooked by the imperfect memory factor. Using the data set and procedure described below, for example, the model turns out to favor chooled /tʃuld/ about 91% of the time but admits chole /tʃol/ about 9% of the time due to the competing analogical influence of choose in memory. As a group, the participants in the two experiments reported by Albright and Hayes produced chooled 98% of the time and chole 2% of the time.

To predict the past-tense form of a spoken verb stimulus such as /tʃul/, the Analogical Model compares the target item to every token in the data set (memory) that shares a positionally-equivalent phonological segment with the target form, or a phonotactically legal sequence of segments that includes any segments specified by the target form. The forms are compared by removing the segments of the target form systematically and comparing the remaining segments with the words in the data set that share those segments with the target word. These subsets of features that represent the verb chool, with their serial positions preserved, are called the supracontexts of chool.
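The systematic removal of segments can be sketched as follows. This is a minimal illustration, assuming that every word has already been aligned to the same number of segment slots; it does not implement the phonotactic matching of onsets and codas mentioned above. The function names and the use of `None` for a removed segment are choices made for the example.

```python
from itertools import product


def supracontexts(segments):
    """Return every pattern formed by keeping or blanking each segment in place."""
    patterns = []
    for mask in product((True, False), repeat=len(segments)):
        patterns.append(
            tuple(seg if keep else None for seg, keep in zip(segments, mask))
        )
    return patterns


def matches(pattern, word):
    """True if `word` agrees with `pattern` at every position that is not blank."""
    return len(pattern) == len(word) and all(
        p is None or p == w for p, w in zip(pattern, word)
    )


chool = ("tʃ", "u", "l")
chool_supracontexts = supracontexts(chool)
```

A three-segment form yields eight supracontexts, from the fully specified form down to the pattern with every segment removed, which matches any three-segment word.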
Thus, /tʃul/ will be compared to all the words in the data set (not overlooked by imperfect memory) that share with it any of the phonological supracontexts shown in Figure 1.

Supracontext   Associated verbs selected from the data set
/tʃul/         ----
/tʃu_/         choose, chew
/tʃ_l/         chill
/_ul/          pull, bull, tool, duel, drool, cool, fool, rule
/tʃ_ _/        cheat, chip, chin, chill, choose, cheep, cheer, check, chair, chat, chuck, choke, chop, char, chime
/_u_/          poop, puke, pull, boot, boo, boom, bull, toot, tour, dupe, duke, doom, duel, coop, cool, cure, moo, moon, fume, fuel, suit, choose, hoot, loop, root, whoop
/_ _l/         peal, deal, kneel, [and 56 others]
/_ _ _/        (all 1,693 verbs in the data set)

Figure 1. Supracontexts for Albright and Hayes' (2003) English nonce verb chool /tʃul/ and the verbs from the data set associated with each of those supracontexts (irregular verbs shown in bold)

The Analogical Model algorithm does not simply identify all the forms in memory that share features with the target item. It selects for possible analogical extension only those forms that correspond to supracontexts that do not increase uncertainty, in an information-theoretic sense, about the alternative outcomes. Uncertainty within the Analogical Model is a function of the number of disagreements in behavior (or category membership) represented in the forms selected by a given supracontext. Using the equation derived in Skousen (1989, 1992), in which the number of disagreements in a supracontext equals 2 times the number of regular forms times the number of irregular forms (treating each different kind of irregularity separately), or $2 n_r n_e$, we find that the supracontext /tʃ_ _/ exhibits 28 disagreements (2 × 1 × 14). Supracontexts such as /_ul/ in Figure 1 are said to be homogeneous by definition because the verbs that they select from the data set represent only one category of past-tense form, regular in this case. Other supracontexts, such as /tʃu_/, are also said to be homogeneous even though they may predict more than one possible outcome if that supracontext does not select any words from the data set (remembered exemplars) that (1) share more features with the target word /tʃul/ than that particular supracontext alone does and (2) predict greater uncertainty about the outcome (past-tense form) than the words sharing only the features of that supracontext with the target word do. This is the case for /tʃu_/. As demonstrated mathematically in Skousen (1989, 1992), supracontexts that select forms from memory that do not meet these two criteria represent an increase in the uncertainty about the outcomes to be associated with the target word. For example, the supracontext /tʃ_ _/ illustrated in Figure 2 selects both choose and chew from the data set. Both verbs are more like (i.e., share more features with) the target verb chool than the supracontext itself does, and they disagree in how they form the past tense. Such supracontexts are dubbed heterogeneous and are discounted from further consideration.6 The tokens selected by them do not become part of the basis for an analogical response.

Figure 2 illustrates one procedure for determining whether a given supracontext is homogeneous or heterogeneous. We examine the subcontexts that correspond to a given supracontext. Subcontexts are similar to supracontexts in that they are derived by removing variables (phonological segments in this case) from the target item systematically. However, whereas the deleted
6. Homogeneous supracontexts represent a minimum amount of uncertainty about possible outcomes, not necessarily zero uncertainty. Heterogeneous supracontexts introduce an increase in uncertainty about the possible outcomes.
[Figure 2: diagram pairing each supracontext of /tʃul/ with its subcontexts and the verbs that each subcontext selects from the data set]

Figure 2. Supracontexts and subcontexts for Albright and Hayes' (2003) nonce English verb chool /tʃul/ with the words from the data set associated with each subcontext (irregular verbs in bold); X denotes a heterogeneous supracontext
*(all 1,693 verbs in the data set except those beginning with /tʃ/, containing /u/, or ending with /l/)
variables in a supracontext may be replaced with any value represented in the compatible items (words) in the data set, including the same value that was deleted from the target item to create the supracontext, in the subcontexts the values that were deleted from the target item are specifically disallowed. Thus, for /tʃul/ the words from the data set selected for the supracontext /tʃ_ _/ could include chill /tʃɪl/ and cheep /tʃip/ etc. For the subcontext |tʃu̶l̶|, however, the strikethroughs indicate that words from the data set that contain the corresponding /u/ or /l/ are specifically excluded from the subcontext. The effect, as demonstrated in Skousen (1992), is to search for words in memory (the data set) that are more like the target word itself than the supracontext is. If those subcontexts collectively represent more disagreements in behavior among themselves than the words corresponding to the whole supracontext do, then that supracontext represents an unwarranted increase in uncertainty as to the outcome and is discounted from further participation in determining the analogical set. As noted earlier, the supracontext /tʃ_ _/ exhibits 28 disagreements. When we examine the subcontexts of /tʃ_ _/ as shown in Figure 2, we see that those subcontexts exhibit a total of 32 disagreements (2 × 1 × 16). This increase in disagreements in going from the supracontext to its subcontexts means that that supracontext is heterogeneous.

Figure 2 shows the supracontexts and their associated subcontexts for the nonce verb /tʃul/ and the verbs that each subcontext selects from the data set. The heterogeneous supracontexts are marked with an X.7 The forms and outcomes (verbs and, although not shown here, their associated past-tense forms) from the homogeneous supracontexts shown in Figure 2 are compiled into an analogical set, the candidate forms which serve as the analogical basis for operating on the target form. Figure 3 shows the analogical set for predicting the past tense of /tʃul/.
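The arithmetic behind the 28 versus 32 comparison can be reproduced with a short sketch. It covers only the two-outcome case (regular versus one kind of irregular) and is not Skousen's full algorithm. The pooling of subcontext counts is one reading of the figures quoted in the text, and the function names and the (regular, irregular) count pairs are assumptions made for the example.

```python
def disagreements(n_regular, n_irregular):
    """Disagreement count for two outcomes: 2 * n_r * n_e."""
    return 2 * n_regular * n_irregular


def pooled_disagreements(groups):
    """Pool (n_regular, n_irregular) counts across groups, then count disagreements."""
    n_regular = sum(regular for regular, _ in groups)
    n_irregular = sum(irregular for _, irregular in groups)
    return disagreements(n_regular, n_irregular)


def is_heterogeneous(supracontext_counts, subcontext_counts):
    """True if the subcontexts show more disagreements than the supracontext."""
    return pooled_disagreements(subcontext_counts) > disagreements(*supracontext_counts)


# Counts as quoted in the text: 14 regular and 1 irregular verb for the
# supracontext; its subcontexts hold choose and chew, chill, and 14 other regulars.
supracontext = (14, 1)
subcontexts = [(1, 1), (1, 0), (14, 0)]

supra_total = disagreements(*supracontext)
sub_total = pooled_disagreements(subcontexts)
heterogeneous = is_heterogeneous(supracontext, subcontexts)
```

Here the supracontext yields 28 disagreements and its subcontexts 32, so the supracontext counts as heterogeneous under this comparison.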
A given verb appears in the analogical set once for each separate homogeneous supracontext that selects it, so it will appear more than once if it occurs in more than one homogeneous supracontext (there are no examples of this in Figures 2 and 3). Thus, while both choose and chew occur in more than one of the supracontexts for
7. As shown in Skousen (1989, 1992) this procedure for determining homogeneity is equivalent in power to the Pearson χ² statistic applied to a contingency table. Readers sometimes find this algorithm for determining homogeneity and heterogeneity highly implausible as a psychologically real process. That may or may not turn out to be the case (it requires further research), but it also misses the point. There exists ample evidence that psycholinguistic operations often do show the effects of uncertainty in an information-theoretic sense (e.g., Moscoso et al. 2004). We do not yet have any commonly accepted understanding of a possible neuropsychological basis for such effects. The procedure described here is simply an algorithm which models those information-theoretic effects, but as argued in Skousen (1998), the procedure accomplishes the effect simply by comparing remembered forms and their behaviors, something brains can do, without positing, implausibly, a neural ability to calculate χ².

Homogeneous supracontexts   Contributions to the analogical set
/tʃu_/                      choose, chew
/tʃ_l/                      chill
/_ul/                       pull, bull, tool, duel, drool, cool, fool, rule

Figure 3. The analogical set for the nonce verb chool /tʃul/ assembled from verbs in the data set (see Figure 1) associated with the homogeneous supracontexts shown in Figure 2 (irregular verbs shown in bold)
/tʃul/, only one of those supracontexts is homogeneous. If those verbs had occurred in more than one homogeneous supracontext, they would have appeared more than once in the analogical set. This means, in effect, that if there are groups of words in the data set (long term memory) that all share the same features with the target form (as do the words corresponding to /_ul/, for example), those words will collectively exert more influence on the behavior of the target form, the basis for so-called gang effects. However, even a less similar form occurring only once in the analogical set has a well-defined, albeit smaller, probability of becoming the model for an analogical response. This theoretical outcome reflects the fact that in tests of nonce verb inflection, people occasionally do respond with past-tense forms that clearly are not motivated by the verbs that are most similar to the target verb, i.e., its nearest neighbors (e.g., Albright and Hayes 2003; Chandler 1998; Keuleers 2008; Skousen 1989).

The model proceeds next to choose a response category (category of past-tense form) from among the alternatives represented in the analogical set. Research on choice behavior and decision theory shows that people will respond to forced-choice tasks in one of two ways depending upon task instructions and other circumstances and that they appear to have at least some strategic control over their choice of decision rule to apply (e.g., Ashby 1992; Estes 1994). Sometimes people simply choose one item randomly from among the alternatives available to them (the analogical set in the Analogical Model) to serve as the basis for an analogical response. This is the random selection rule. In these cases, the number of tokens representing a given response in an analogical set also represents the probability of that response relative to the probability of any alternative responses represented in the analogical set.
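Under random selection, the predicted probabilities follow directly from the token counts in the analogical set. The sketch below applies this to the analogical set listed in Figure 3, with choose as the only irregular member. It ignores the imperfect memory factor, and the outcome labels and function name are assumptions made for the example.

```python
from collections import Counter

# Analogical set for chool as listed in Figure 3 (one token per verb).
analogical_set = [
    ("choose", "irregular"),
    ("chew", "regular"),
    ("chill", "regular"),
    ("pull", "regular"),
    ("bull", "regular"),
    ("tool", "regular"),
    ("duel", "regular"),
    ("drool", "regular"),
    ("cool", "regular"),
    ("fool", "regular"),
    ("rule", "regular"),
]


def random_selection_probabilities(tokens):
    """Probability of each outcome when one token is drawn at random."""
    counts = Counter(outcome for _, outcome in tokens)
    total = sum(counts.values())
    return {outcome: count / total for outcome, count in counts.items()}


chool_probabilities = random_selection_probabilities(analogical_set)
```

Ten of the eleven tokens are regular, which gives a probability of 10/11 for the regular outcome, in line with the figure of about 91% quoted earlier for chooled.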
In other circumstances, people may opt to respond with the most frequently occurring response among the alternatives, the plurality selection rule. In this case, people respond consistently with the most common outcome, which allows them to maximize a particular type of outcome if they so desire. In the simulations reported below, I compared the probability of the alternative outcomes predicted for a given nonce verb to the proportion of alternative responses actually given by the
participants in Albright and Hayes' (2003) study. Those probabilities are equivalent to modeling the random selections of a group of participants. As will be discussed below, when Albright and Hayes repeated their experiment with a significant change in the procedure, that change clearly motivated a different response strategy in many of their participants, a strategy that the Analogical Model can replicate with only a minor change in the decision rules available to it.
2.6.2.2. Predicting the morphophonemic form. In practice, the Analogical Model selects a word (random selection), or the largest set of like-behaving words (plurality selection), to act as the model for an analogical response to an item of interest. Thus, the model only assigns the target verb to a category of past-tense forms; it does not itself apply the analogical process to the target form to specify its phonological form. To continue the example developed above, the Analogical Model does not tell us explicitly what the phonological form of the past tense of chool will be. It simply specifies that it will be irregular, on analogy with choose, or regular, on analogy with chew or chill, etc. Pinker and Ullman (2002: 458) were critical specifically of studies that used models that simply identified categories of responses rather than actually generating the phonological representation of the past tense for an output form. They argued that any supplementary mechanism that could generate the correct phonological output would amount to a rule. As noted earlier, Keuleers (2008) actually did append a system of morphophonemic rules to his Memory-Based Learning Model for specifying the phonological form of the outputs. Albright and Hayes (2003) chose not to test the Analogical Model in part because it does not actually produce a phonologically specified output for its analogies.
Although the extant implementation of the Analogical Model does not include an algorithm for producing phonological representations for the model's output, Chandler (2009b) has proposed an algorithm for doing so for both regular and irregular verb forms.8 The proposed algorithm posits that speakers apply the analogically motivated alternation to the supracontext that specifies the greatest number of segments (in this implementation) shared between the target form and the analogical source and then restore the segments of the target form that were deleted in arriving at that supracontext. Thus, chool /tʃul/ shares the phonological supracontext /tʃu_/ with choose /tʃuz/. Applying the analogy to the supracontext derives /tʃo_/, and restoring the deleted segment yields /tʃol/ as a response. Similarly, selecting chew /tʃu_/ as the basis for the analogical response would derive /tʃu_+D/ (where {+D} represents the regular past-tense morpheme), or selecting pull /_ul/ would derive /_ul+D/. Restoring the respective missing segment in each case would yield /tʃul+D/, realized as described below as chooled.
8. The current implementation of AM does not include subroutines for generating phonological forms because it is a general model of analogical behavior, not specifically a model for predicting past-tense forms.
Predicting the pronunciation (the allomorph) of the regular past-tense inflection requires an additional analogical step. Chandler (2008) showed how the Analogical Model can be used to predict pronunciation from spelling. It does so by first deriving the probabilities of the alternative pronunciations predicted for each individual letter seriatim and then calculating the conditional probabilities for the alternative pronunciations of the overall word predicted by those letter-by-letter probabilities. We can use the same procedure that Chandler described for predicting spelling pronunciation to predict the pronunciation of the regular English past-tense morpheme {+D} in the phonological context of the base verb to which it has been appended. For the purposes of predicting the pronunciation of the English regular past-tense form, we can restrict our consideration to the effects of the final segment of the base form of the verb on allomorph choice. We do this not because of any privileged status allotted to the final segment by an omniscient linguist, but simply because, for English, extending the contexts further into the base form has no additional effect on the outcome.9 Among the first-order supracontexts (those with only one variable removed from the target word), only the supracontext with the final segment removed might predict more than one possible outcome for the pronunciation of the past tense for most English verbs. All other first-order supracontexts will, of course, end with the same segment as the test word and will, therefore, select from the data set only verbs that share that final segment and therefore also share the appropriate allomorph.
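The apply-then-restore step proposed for deriving a phonological form from an analogical source (the chool and choose example earlier in this section) can be sketched as follows. This is only an illustration under strong simplifying assumptions: segments are aligned position by position, target and source have the same number of segments, and the ASCII labels such as "ch" are hypothetical stand-ins for phonemic symbols. The function names are likewise invented for the example.

```python
def analogical_irregular_past(target, source_base, source_past):
    """Apply the source verb's alternation to the shared supracontext, then
    restore the target's own segments at the positions that were deleted."""
    if not (len(target) == len(source_base) == len(source_past)):
        raise ValueError("this sketch assumes equal-length, position-aligned forms")
    # Positions where target and source agree form the largest shared supracontext.
    shared = [t == s for t, s in zip(target, source_base)]
    # Within the supracontext, copy the source's past-tense segments;
    # deleted positions are left open (None).
    derived = [past if keep else None for past, keep in zip(source_past, shared)]
    # Restore the target's segments in the open positions.
    return [d if d is not None else t for d, t in zip(derived, target)]

def analogical_regular_past(target):
    """A regular source only contributes the abstract suffix {+D}, so restoring
    the deleted segments returns the whole target stem plus that suffix."""
    return list(target) + ["+D"]

chool = ["ch", "u", "l"]
choose, chose = ["ch", "u", "z"], ["ch", "o", "z"]

print(analogical_irregular_past(chool, choose, chose))
print(analogical_regular_past(chool))
```

A limitation of this positional version is that an alternation located at a deleted position is lost; a fuller treatment would need an explicit alignment of target and source.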
For all other supracontexts, those with two or more variables removed, the test for heterogeneity will ensure that only those supracontexts ending with the same segment as the target word, and therefore exhibiting only one possible outcome for the choice of allomorph, will also be homogeneous and therefore contribute to the analogical set. Supracontexts with the final segment and one or more other segments deleted might correspond to words in the data set showing possible alternative pronunciations for the past-tense inflection, but if they do, those supracontexts will always be heterogeneous and will therefore not contribute those alternative pronunciations to the analogical set. The effect of this procedure is that for any verb assigned the regular past-tense morpheme by the initial analogical process, every regular verb in the language sharing the same stem-final segment will contribute to the overwhelming analogical preference for pronouncing that verb with the same allomorph as all other verbs in the language ending with the same segment.
9. While this happens to be true for English, it does not apply to languages showing such morphophonemic characteristics as vowel harmony across verb stems and affixes (e.g., Yucatec Maya) or umlauting as in German.
The one exception to the results just described involves the first-order supracontext in which only the final segment of the base form of the target word has been removed, and this case turns out to be, potentially, very interesting. Take, for example, the nonce verb from Albright and Hayes (2003) described earlier, /tʃul+D/. The supracontext /tʃu_+D/ happens to be deterministic, chew choosing only the /-d/ allomorph and therefore also predicting [tʃuld]. However, applying the algorithm to an English verb such as back seems to cause problems. For back, the homogeneous first-order supracontext /bæ_+D/ predicts [-əd] from batted and banded, [-d] from bagged and banned, and [-t] from banked. For all of the other homogeneous supracontexts (all necessarily having the base form end with /k/ and running potentially to many hundreds of verbs) the result will be [-t] deterministically. The predicted pronunciation will, therefore, be [bækt] overwhelmingly. However, that one first-order supracontext leaves open the slight possibility of predicting *[bækd] and *[bækəd].
These predictions for backed turn out to be not so much problematic, as they might appear at first glance, as potentially very interesting. The form *[bækd] violates a universal constraint on possible syllable structure (cf. Blevins 1995) and therefore cannot be realized. Invoking the procedure just described, however, leads the Analogical Model also to predict, occasionally, regular past-tense forms such as *[bætt] for batted and *[noldd] for Albright and Hayes' (2003) nonce verb nold. Neither form corresponds to a phonetic realization, but they do suggest a possible alternative interpretation of what have been called "no-marking errors" (e.g., Stemberger and MacWhinney 1986; Marchman 1997).
No-marking errors refer to instances in which children, and occasionally adult speakers, omit the past-tense suffix from a regular English verb ending in [d] or [t], such as lift or result, presumably because the verb sounds somewhat as though it is already marked for the past tense (notice that this account assumes some sort of feedback procedure for comparing the output of the process to some notion of a typical-sounding past-tense form). The analogical approach described here suggests that the affix might be appended psychologically but not realized phonetically, a possible explanation that must await further study.
The third predicted form noted above, *[bækəd], is also interesting because such errors, although rare, actually are heard occasionally. Children sometimes overuse the [-əd] allomorph (e.g., Cazden 1968; Kuczaj 1977), and both Parkinson's disease patients and Huntington's disease patients occasionally do so as well (Ullman et al. 1997). Such errors appear to represent an example of the kind of unidirectional leakage predicted by the Analogical Model, as described in Skousen (1989). Leakage refers to the very rare but predictable errors that
speakers occasionally do produce in violation of what appear to be categorical, i.e., exceptionless, rules.
3. The wug experiments
In order to derive a set of data against which to compare the predictions of their Minimal Generalization Model and the Generalized Context Model, Albright and Hayes (2003) designed and carried out a new "wug" study (after Berko 1958). They developed a set of 41 core verbs, monosyllabic nonce English verbs designed to test various predictions of interest to them. In particular, some of the core verbs closely resembled clusters of real English verbs that corresponded to a minimal generalization (a rule) with a high confidence value. Such verbs and rules were said to represent islands of reliability. One set of nonce verbs shared islands of reliability with regular verbs only and another set with irregular verbs only. A third set of core verbs shared islands of reliability with both regular verbs and irregular verbs, and a fourth set had no islands of reliability. Albright and Hayes expected that the islands of reliability (actually gangs of phonologically similar verbs) might correspond to higher acceptability values elicited for the different groups of nonce verbs. In addition to the core verbs, Albright and Hayes also created a set of 17 peripheral verbs, verbs that most closely resembled English verbs having unique past-tense forms, such as see~saw and come~came.
In a pretest, one group of participants rated the nonce verbs, both their stems and their alternative past-tense forms, on a scale of one to seven for naturalness or English-likeness. Then, in Experiment 1, another group of participants heard the nonce verbs presented in natural but semantically neutral discourse contexts before being asked to complete orally a statement that solicited a past-tense form for the nonce verb.
Issues of interest included not only how closely the predictions of the two models would approximate the past-tense productions of the participants but also how closely the productions of the participants and the predictions of the models would correlate with the pretest ratings of acceptability, and how the islands of reliability might correlate with those ratings. Experiment 2 replicated Experiment 1 except that after a participant had supplied a possible past-tense form for a given test item, Albright and Hayes (2003) then asked that participant to also rate the goodness of the regular past-tense form and one or two possible irregular past-tense forms modeled for that given verb. In other words, after responding to a test item, the participants in Experiment 2 heard one or more possible irregular past-tense forms modeled for them, which they then had to rate for goodness as a possible past-tense form. This difference in procedure turned out to have a major effect on performance in the second experiment as compared to the first experiment.
4. The simulations
4.1. The Minimal Generalization Model
Using the Minimal Generalization Model described above, Albright and Hayes (2003) trained their system on 4,253 English verbs in the CELEX data base (Baayen et al. 1995) having a lemma frequency of ten or greater, and they then applied it to the set of nonce English verbs described above. For each nonce verb, their model identified the set of alternative morphophonemic rules which would apply to that verb and ranked them according to the confidence value associated with each rule. The model then applied the rule with the highest confidence value.
4.2. The Generalized Context Model
To test the version of the Generalized Context Model adopted from Nakisa et al. (2000), Albright and Hayes (2003) first used the morphophonemic rules that had been abstracted from the CELEX data set to generate every past-tense form consistent with the phonological structure of the nonce verb. They then filtered out the forms that had regular allomorphs resulting in phonotactically impossible sequences such as *[drɪtt] for drit and *[noldd] for nold. Finally, they fed the remaining past-tense alternatives to the Generalized Context Model, which compared them to all of the past-tense forms in the CELEX data set and returned a relative probability for each alternative.
4.3. The Analogical Model
For the simulations reported here, I use the data set described in Chandler (2002), which consists of all the monosyllabic, monomorphemic verbs listed in Francis and Kučera's (1986) word frequency table, augmented with the same sorts of verbs listed in the Longman Dictionary of American English (Stern 1997) but not included in Francis and Kučera. The result is a data set of 1,693 verbs. This is largely a sample of convenience, but all of the nonce verbs tested by Albright and Hayes were monosyllabic, and those authors explicitly excluded derivational variants of the simple verbs, such as mistake from take, from their data set. Multisyllabic verbs, both basic verbs and verbs derived from basic verbs, can affect the results, but apparently not greatly (see Eddington 2003 and Keuleers 2008 for discussions of this issue). Moreover, it is not yet clear how best to represent such verbs in analogical models. Albright and Hayes (2003) eliminated all derivational variants of the base verbs included in the CELEX corpus. Keuleers, also using the CELEX corpus, compared only the final syllables of the multisyllabic verbs included in the corpus. As described earlier, the Analogical Model predicts the probability of assigning the alternative past-tense forms represented in the analogical sets to the
nonce English verbs that Albright and Hayes (2003) tested. Those probabilities are equivalent to the group data for a set of participants who are each selecting their responses randomly from the analogical sets.10
To model the acceptability ratings for the alternative past-tense forms reported in Albright and Hayes (2003), I used the following procedure, which had proven elsewhere (Chandler 2009b) to model the naturalness ratings reported in Prasada and Pinker (1993). Using the data set described above, the Analogical Model generates an analogical set for each nonce verb. The analogical sets will be larger or smaller depending on how many verbs in the data set are selected by the homogeneous supracontexts and on which English verbs the nonce verb resembles more or less closely. The number of verbs in a given analogical set that share a given past-tense form can be interpreted directly as a measure of the likelihood or naturalness of that form as the past tense of the given nonce verb. As shown below, these values turn out to correlate significantly with the ratings provided by the participants in Albright and Hayes' experiments.
5. Results and comparisons of the overall performance of the models
Table 1 shows how the Analogical Model performed overall in comparison to the results reported in Albright and Hayes (2003) for their Minimal Generalization Model and the Generalized Context Model. Overall, the Analogical Model performed appreciably better than the Generalized Context Model in predicting both the past-tense forms produced by the participants in the two experiments reported in Albright and Hayes and the ratings assigned to the alternative past-tense forms by those participants. The overall performance of the Analogical Model and that of the Minimal Generalization Model appear roughly comparable.
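The comparisons in this section rest on Pearson correlations between a model's values (for the Analogical Model, the number of verbs in the analogical set sharing a given past-tense form) and the participants' ratings. As a minimal sketch of that computation, the following Python function computes Pearson's r from two equal-length lists. The counts and ratings shown are invented placeholders for illustration only and are not data from either study.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson product-moment correlation between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length, at least 2 items")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when one sequence is constant")
    return sxy / sqrt(sxx * syy)

# Hypothetical values: analogical-set counts for four candidate past-tense
# forms and the mean acceptability rating (1 to 7) given to each form.
set_counts = [120, 45, 8, 3]
mean_ratings = [6.1, 5.0, 3.2, 2.5]

print(round(pearson_r(set_counts, mean_ratings), 3))
```

Because raw counts can be highly skewed, an actual analysis might transform them before correlating; nothing in this sketch should be read as the procedure used to obtain the reported coefficients.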
For responses to the 41 core verbs, the Analogical Model predicted the adjusted likelihood ratings reported in Albright and Hayes only slightly better than did the Minimal Generalization Model. In predicting both the regular past-tense productions and the naturalness ratings for those regular inflections, the Minimal Generalization Model performed somewhat better than the Analogical Model did. On the other hand, the Analogical Model predicted the irregular responses to the nonce verbs much better than the Minimal Generalization Model did, and the adjusted naturalness ratings for those irregular responses somewhat better than the Minimal Generalization Model did.
10. The implementation of the Analogical Model used for these simulations determined the probability that a particular irregular past-tense form or one of the regular verbs included in the analogical set would provide the basis for an analogical response. The model itself did not specify the actual phonological output of the analogy. As just described in the preceding section, however, a straightforward procedure has been proposed elsewhere for deriving the phonological output, although it has not yet been implemented computationally.
Table 1. Correlations (r) of participant responses to model predictions for core verbs
Model  Verbs       Pearson r for adjusted ratings  Pearson r for production probabilities
MGM    Overall     0.806*                          not given
MGM    Regulars    0.745                           0.678
MGM    Irregulars  0.570                           0.333
GCM    Overall     0.780                           not given
GCM    Regulars    0.448                           0.446
GCM    Irregulars  0.448                           0.517
AM     Overall     0.812                           0.954
AM     Regulars    0.659                           0.628
AM     Irregulars  0.620                           0.507
* For all correlations, p < 0.0001
Table 2. Correlations (r) of participant productions to model predictions for peripheral verbs
Responses            MGM                         AM
All responses        not given                   0.93852 (p < 0.0001)
Regular responses    0.44895 (p = 0.0353)        0.31155 n.s. (p = 0.1117)
Irregular responses  0.33357 n.s. (p = 0.0697)   0.61243 (p = 0.0016)
Albright and Hayes (2003) did not report separately the data for the performance of the Minimal Generalization Model and the Generalized Context Model on the peripheral verbs, the nonce verbs that resemble real English verbs having unique past-tense forms. The predictions of the Analogical Model, however, correlate highly with the overall productions of the participants reported for those verbs too (r = 0.939, p < 0.0001). When broken down into regular responses versus irregular responses, the Analogical Model did not perform as well as the Minimal Generalization Model at predicting the regular responses (see Table 2), but it predicted the irregular responses better. For the combined core verbs and peripheral verbs, the predictions of the Analogical Model again correlated highly with the overall productions of the participants (r = 0.948, p < 0.0001).
Albright and Hayes (2003) also showed that their participants tended to give higher naturalness ratings for the past-tense forms of nonce verbs that resembled the past-tense forms of clusters of similarly behaving verbs in the corpus, their so-called islands of reliability. The Analogical Model also predicts such gang effects when a nonce verb resembles clusters of real verbs in the data set.
For each nonce verb, the AM derives an analogical set of verbs which may serve as the basis for an analogical response. As described above, the number of verbs in the analogical set reflects the number of real verbs in the data set (corpus) that resemble the nonce verb without increasing uncertainty about the possible outcomes. For the 41 core verbs, the number of verbs in the analogical set motivating a given response for a nonce verb correlates strongly with the naturalness ratings assigned to those responses, r = 0.60316 (p < 0.0001), and for the combined set of core and peripheral verbs, the correlation is r = 0.58969 (p < 0.0001).
6. Discussion of these results
6.1. Predicting past-tense forms
6.1.1. The overall performance of the models. As can be seen from the results just reported, overall Skousen's Analogical Model (Skousen 1989, 1992) predicted the performance of the participants in assigning past-tense forms to nonce English verbs as well as did the Minimal Generalization Model of Albright and Hayes (2003). These results alone demonstrate that Albright and Hayes were incorrect in their assertion that a purely analogical model could not model speakers' behavior with past-tense morphology as accurately as a rule-based model such as theirs could. In doing so, these results confirm and complement a similar finding and conclusion reported in Keuleers (2008) with respect to his Memory-Based Learning Model. Moreover, Keuleers demonstrated that if Albright and Hayes had used the stronger version of Nosofsky's (1990) Generalized Context Model, one incorporating feature weighting, then that model would most likely have also performed comparably to the Minimal Generalization Model. Thus, the claim of Albright and Hayes regarding the empirical inadequacy of purely analogical models appears to have been thoroughly disconfirmed.
While the overall performance of Albright and Hayes' Minimal Generalization Model (Albright and Hayes 2003) and of Skousen's (1989, 1992) Analogical Model was very similar, there were, nonetheless, interesting differences. For example, the Minimal Generalization Model performed slightly better at predicting the regular inflections, and the Analogical Model performed slightly better at predicting the irregular forms. Given the differences in the implementations of the two models compared here, it is not possible to identify exactly what might account for the small differences seen in their performance. However, Albright and Hayes trained their model on 4,253 verbs taken from the CELEX data base for English (Baayen et al. 1995), while the Analogical Model simulations reported here were based on a data set of only 1,693 verbs.
Research on the Analogical Model shows that its performance improves with the
size of the data base (cf. Wulf 2002; Daelemans et al. 1999), and since most verbs beyond those included in my data set are regular, the larger data set would most likely have shifted the proportion of responses toward greater regularization.
Another potential source of difficulty in comparing the performance of the two models lies in how the data were reported. Baayen et al. (2002) and Baayen (2004) have described and illustrated the potential fallacies inherent in testing the predictions of models such as these against grouped data for participants. In particular, grouped data may obscure important systematic differences in the response strategies of individual participants. Thus, Baayen (2004) has advocated using multilevel regression to compare the influence of different predictor variables on the response patterns of individual participants. The systematic differences in the acceptability ratings reported for Albright and Hayes' first experiment versus their second experiment show why it is useful to be able to track how individual participants respond to each test item. The ratings reported for the second experiment showed the strong effects of a response bias, created almost certainly, as Albright and Hayes pointed out, by a key change in procedure introduced into their Experiment 2. In Experiment 2, after inflecting each nonce verb, the participants were given a set of possible past-tense forms for that nonce verb, including both possible regular forms and possible irregular forms, to rate for acceptability. The effect seems to have been to invite the participants to try to give more irregular responses. This result is consistent with the strong experimental-context effects demonstrated in Ramscar (2002), effects which neither model being compared here attempted to capture.
Consequently, the Analogical Model predicted the actual production probabilities of the participants in Experiment 1 more accurately than it did the production probabilities in Experiment 2, overestimating the probabilities in Experiment 1 by only 0.00032 compared to 0.03247 for those in Experiment 2. As described earlier, the Analogical Model attributes to participants some strategic control over the decision rule that they use to arrive at a response. The two decision rules described earlier predict different patterns of responses for individual participants. Reliance on the plurality selection rule leads to greater generalization in responses, usually regularization. In contrast, reliance on the random selection rule leads to responses that approximate more closely the probabilities of the alternative responses represented in the analogical sets. The response bias evidenced in Albright and Hayes' Experiment 2 (Albright and Hayes 2003), the significantly greater use of irregular responses, suggests a third possible strategy, namely that at least some of those participants may have been strategically passing over the first plurality in the analogical set (usually the regular response) in favor of the most frequent irregularity. An
analysis of the response patterns of individual participants would allow us to examine this possibility further.
6.1.2. Some selected comparisons of the results
6.1.2.1. Predicting the past-tense forms. While the two models paralleled one another closely in overall performance, a more detailed item analysis reveals some interesting differences in the claims and performance of the two models. For example, Albright and Hayes (2003) noted that all English verbs ending with voiceless fricatives are regular, and they cited this fact as illustrative of the sorts of structural generalizations that their model was better able to capture than could a purely analogical model. Interestingly, however, the Analogical Model appeared to capture the actual behavior of those verbs more accurately than the Minimal Generalization Model did. Of the 41 nonce verbs used by Albright and Hayes, four end with a voiceless fricative: drice, rife, blafe, and nace. As illustrated above in Figures 1, 2, and 3, the Analogical Model compares each supracontext of an input verb with every verb in the data set compatible with that supracontext. For every supracontext ending with the voiceless fricative of the target nonce verb (most of the supracontexts), the Analogical Model will select from the data set only verbs also ending with that voiceless fricative. Those English verbs will, of course, all be regular, and those supracontexts therefore always homogeneous. One consequence of this procedure is that those nonce verbs will wind up being compared with every verb in the language that also happens to end with the same voiceless fricative, creating potentially overwhelming pressure toward regular inflection. The Analogical Model, however, will sometimes admit a small possibility of an irregular inflection for such verbs. The set of first-order supracontexts (those with only one segment removed) will include the supracontext in which the final segment has been removed.
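To make the notion of first-order supracontexts concrete, the following Python sketch enumerates the supracontexts of a target with a given number of segments removed and finds the verbs in a toy data set compatible with each one. Several simplifications are assumed here: the transcriptions are hypothetical ASCII stand-ins, verbs are compared segment by segment at equal length, the three-verb data set is invented, and the test for homogeneity that the Analogical Model applies to each supracontext is deliberately left out.

```python
from itertools import combinations

def supracontexts(segments, order):
    """Yield every supracontext with exactly `order` positions removed (None)."""
    for removed in combinations(range(len(segments)), order):
        yield tuple(None if i in removed else seg
                    for i, seg in enumerate(segments))

def matches(supracontext, verb_segments):
    """A verb is compatible if it agrees with every retained position."""
    return (len(supracontext) == len(verb_segments)
            and all(s is None or s == v
                    for s, v in zip(supracontext, verb_segments)))

# Hypothetical segmentations and a toy data set of (verb, segments, outcome).
drice = ["d", "r", "ai", "s"]
toy_data = [
    ("drive", ["d", "r", "ai", "v"], "irregular"),
    ("price", ["p", "r", "ai", "s"], "regular"),
    ("dress", ["d", "r", "e", "s"], "regular"),
]

for supra in supracontexts(drice, order=1):
    selected = [verb for verb, segs, _ in toy_data if matches(supra, segs)]
    print(supra, selected)
```

In this toy setting, only the supracontext with the final segment removed can select a verb that does not end in the target's voiceless fricative, which is the opening for the irregular analogy discussed in the text.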
For example, the nonce verb drice includes the supracontext /dɹaj_/, which turns out to be homogeneous and which selects the irregular verb drive from the data set. All other supracontexts for drice that lack the final segment, those with the final segment and one or more other segments removed, turn out to be heterogeneous (unless the verbs from the data set happened to all show the same past-tense form, a rare event). This will always be the case, and those heterogeneous supracontexts do not contribute items to the analogical set. In this study, the Analogical Model predicted that the past-tense form droce /dɹos/ would occur about four percent of the time. The data reported by Albright and Hayes (2003) showed the subjects in their first experiment not using droce at all and those in their second experiment using it about nine percent of the time (an average of five percent). This appears to be another example of rule leakage, described earlier, a situation in which what appears to be a virtually deterministic rule will, on rare occasions, show leakage toward
some specified alternative outcome but not others, i.e., it is not just a random speech error; it is predicted by the model. For rife the Analogical Model predicted rofe nine percent of the time; Albright and Hayes' participants used it ten percent of the time. In the case of blafe, the Analogical Model failed to predict bleft, used once by one participant, but it predicted accurately that the participants would consistently inflect nace as naced.
Albright and Hayes (2003) also discussed briefly the occurrence of a puzzling anomaly in their data. A significant number of their participants, 17% in Experiment 1 and fully 50% in Experiment 2, responded with nold as the past tense of nold. Since the corpus contained no obvious examples motivating a zero past-tense marking for nold and since neither their model nor the Generalized Context Model that they tested predicted it, Albright and Hayes took the responses as anomalies. As discussed above, the responses actually resemble the no-marking errors described in Stemberger and MacWhinney (1986) and Marchman (1997), in which a base form that resembles some existing past-tense form is sometimes returned incorrectly as a past tense. The tentative analogical process described above for predicting the selection of a regular allomorph suggests a possible alternative explanation for such errors. The Analogical Model predicted that the participants should have applied the regular past-tense morpheme about 60% of the time. They actually did produce it, with the correct allomorph, about 53 percent of the time. The same tentative analogical process predicts that, based on verbs such as poll~polled, bowl~bowled, toll~tolled, etc., the regular allomorph [-d] could have been misapplied to nold about 26% of the time, yielding the unpronounceable *[noldd].
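The two percentages just given can be combined by simple multiplication. The sketch below is one reading of that arithmetic, under the assumption (not stated explicitly in the text) that whenever the regular morpheme is selected and [-d] is not misapplied, the pronounceable form nolded results, while a misapplied [-d] is left unrealized and surfaces as nold. The function name is invented for the illustration.

```python
def split_regular_outcomes(p_regular, p_misapplied_d):
    """Split the probability of choosing the regular morpheme into a realized
    form (nolded) and an unrealized, no-marking form (nold)."""
    p_unrealized = p_regular * p_misapplied_d        # *[noldd], heard as nold
    p_realized = p_regular * (1.0 - p_misapplied_d)  # nolded
    return p_realized, p_unrealized

# Figures from the text: regular morpheme about 60% of the time,
# [-d] misapplied about 26% of those times.
p_nolded, p_nold = split_regular_outcomes(0.60, 0.26)
print(f"nolded: {p_nolded:.3f}, nold: {p_nold:.3f}")
# prints "nolded: 0.444, nold: 0.156"
```

These products, roughly 44% and 16%, are in line with the rounded values of about 45% and 15% cited for the model.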
Thus, the Analogical Model would predict nolded about 45% of the time and nold about 15%, close to the 53% and 17% actually observed; and, as noted earlier, a larger data set would probably have shifted the proportion further toward nolded.
6.1.2.2. Predicting the acceptability ratings. One of the major goals that Albright and Hayes (2003) identified for their study was to demonstrate that the regular verbs were not entirely homogeneous in their behavior, contrary to what Pinker (1999) and Prasada and Pinker (1993) had predicted. In particular, Albright and Hayes sought to demonstrate that even regular verbs could show similarity effects and prototype effects. The Minimal Generalization Model derived confidence values for its different rules, and those values then determined which of the competing rules would apply to a given novel verb. As noted earlier, the confidence values are based in part on how many verbs share a given similarity in phonological structure and inflectional behavior. Albright and Hayes argue that if a similarity is shared by a lot of verbs, then that situation would give rise to a greater sense of prototypicality within a speaker's mind, an intuition about those verbs' forms that they called an island of reliability. This claim contradicts Pinker's position that all verbs not marked specifically as irregular are marked simply as a member of the abstract lexical category <verb> and that grammatical processes operate on the category label and remain indifferent to similarities and differences among the members of that category. Prasada and Pinker (1993), especially, reported data from an acceptability rating task which confirmed their claim that speakers would show the effects of form-similarity when rating the acceptability of nonce-irregular verbs but would not show such effects when rating the acceptability of nonce-regular verbs.
In their study, Albright and Hayes (2003) found that their participants did indeed rate the acceptability of both regular and irregular past-tense forms higher when the nonce verbs shared some similarity in form with a larger number of real verbs showing that past-tense form than when a nonce verb shared similarities with fewer real verbs. They argued that their model captured this intuition in the higher confidence values assigned to morphophonemic rules conforming to the phonological structure of some island of reliability. In the tests of the Analogical Model reported here, I compared the acceptability ratings reported by Albright and Hayes for each inflectional variant of a nonce verb with the number of real verbs sharing that variant that the Analogical Model assigned to the analogical set for that nonce verb. Recall that the analogical set is the set of verbs from the data set (the corpus) identified by the analogical process as candidates for providing the basis for an analogical response (as illustrated in Figure 3). If a nonce verb resembles many real English verbs, it may derive a large analogical set, a set of verbs that resemble it.
If a nonce verb does not resemble many real English verbs, then its analogical set will contain relatively few verbs. Thus, the number of verbs in an analogical set provides a basis for the same intuitive judgments of prototypicality that Albright and Hayes characterized as islands of reliability. Table 1 shows the correlation of the number of verbs sharing a given behavior in an analogical set with the acceptability ratings returned by the participants in Albright and Hayes's study. As is evident in Table 1, those numbers correlate very strongly overall with the acceptability ratings reported by Albright and Hayes. Thus, the Analogical Model provides a very straightforward, and intuitive, basis for judging acceptability. In Chandler (2009b), I report how the same procedure just described performed in predicting acceptability ratings for the nonce verbs tested in Prasada and Pinker (1993). Prasada and Pinker solicited ratings for groups of nonce verbs that resembled real English verbs closely, moderately, or little or not at all. They reported that the acceptability ratings given by their participants for irregularly marked past tenses correlated strongly with the degree of similarity of the base form to real irregular verbs in English. For the regular past-tense
forms, however, there was no significant correlation; the ratings were uniformly high. When we examine the analogical sets derived for each of the nonce verbs that Prasada and Pinker tested, however, we see the same basis for their results as just described for the results reported by Albright and Hayes (2003), that is, the makeup of the analogical sets for each verb. Of the 30 pseudo-regular verbs tested by Prasada and Pinker, 21 (70%) produced analogical sets that were 100% regular. The remainder produced analogical sets that were from 92% to 98% regular. These analogical sets contrasted sharply with those derived for the pseudo-irregular verbs, which were never more than about 70% regular. Thus, the Analogical Model accounts uniformly for both the differences in ratings among regular past-tense forms that Albright and Hayes reported and the lack of differences in ratings among regular past-tense forms that Prasada and Pinker reported. In conclusion, it appears that the Analogical Model predicts the data derived in the two experiments reported in Albright and Hayes (2003) as well as their Minimal Generalization Model does and much better than does the version of Nosofsky's (1990) Generalized Context Model that they tested. Together, the results reported here and the results reported in Keuleers (2008) show that Albright and Hayes's claim regarding the empirical superiority of their Minimal Generalization Model over purely analogical models is incorrect.

7. General discussion

7.1. Modeling the English past tense
Albright and Hayes (2003) developed their Minimal Generalization Model specifically to describe and explain what they argued to be the sole basis for past-tense-form production: phonologically conditioned morphophonemic rules abstracted away from the training exemplars that motivated those rules in the first place. In this respect, their model represents a continuation of the traditional position assumed by many structural linguists. They designed their model specifically to abstract generalizations away from the data and to represent them as structured generalizations about the language that in turn become the basis for processing new instances of linguistic usage. In other words, they become part of the grammar of the language. Albright and Hayes further argued that purely analogical models such as Nosofsky's (1990) Generalized Context Model could not capture such structured phonological regularities adequately because they compared and evaluated all of the sounds making up a word equally, resulting in what Albright and Hayes called "variegated similarities." Ironically, as Keuleers (2008) showed, it is exactly the feature of Nosofsky's model that Albright and Hayes chose to discard (the differential weighting of features based on how much information they contributed to the categorization of a form) that would have allowed the Generalized Context Model to capture those structural similarities as well as the Minimal Generalization Model did. Given such feature weightings, the structural regularities are inherent in the data and emerge from them in the process of comparing and evaluating forms.

In this paper, I have shown that Skousen's Analogical Model (Skousen 1989, 1992) can also capture the structural regularities described in Albright and Hayes (2003). In the Analogical Model, retaining only the homogeneous supracontexts means that those features and combinations of features that are inherently more predictive of behavior within a set of data will appear more often in an analogical set. Thus, for the Analogical Model as well, the structured similarities emerge as consequences of the analogical process of comparing and evaluating forms, but those emergent similarities are generated anew each time a new form is processed and do not become resident linguistic generalizations about the language that are then used to process subsequent instances of linguistic usage. Moreover, since the Analogical Model compares a test form to all of the remembered verbs that share features with it, test verbs sharing features with many verbs will generate larger analogical sets than will test items that do not share features with many extant verbs. The number of remembered verbs that occurred in the analogical sets for the test verbs turned out to correlate strongly with the acceptability ratings that participants assigned to different possible past-tense forms for those verbs.

In the studies reported in Albright and Hayes (2003) and in this paper, lexical semantics has played no role in predicting the choice of past-tense forms. Following the lead of Pinker and Prince (1988), Albright and Hayes appear to assume that inflectional morphology is purely a function of phonological form (p.
123), and they provided no mechanism for incorporating semantic effects into their morphophonemic model.11 In their experiments, they presented their nonce verbs in semantically neutral carrier sentences. This theoretical position is problematic, however, because, as both Chandler (1998) and Ramscar (2002) have shown, semantic similarities among words do influence significantly the probability of applying a regular or an irregular past-tense form to a given nonce verb. In the Chandler study, presenting a nonce verb such as traw within a semantic context that resembled one of the meanings of draw increased the probability of receiving trew as the past-tense form by as much as 60% over the probability of receiving it when the test item was presented in a semantic context unrelated to any of the meanings of draw. Ramscar reported a comparable influence of meaning on choosing to inflect nonce verbs such as
11. It is not clear to me how the Minimal Generalization Model treats homophonous verb pairs such as lie (recline) versus lie (prevaricate) or ring (peal) versus ring (encircle).
frink regularly or irregularly. While the relative weight of semantic similarity versus phonological similarity is, as yet, an unresolved issue, semantic similarity clearly does influence past-tense production, and Eddington (2002) and Skousen (2006) have explored how semantic features may be appended to the phonological representations within the Analogical Model. Other studies have also shown that variables other than phonological form alone influence inflectional behavior significantly. Keuleers (2008) found that both spelling (independently of pronunciation) and a participant's perception of a word as either native or borrowed influenced inflectional choices. Many studies of sociolinguistic variation in language usage have documented that nonlinguistic variables such as the ages and genders of interlocutors influence the choice of alternative linguistic forms such as the English present participle, [-i] versus [-n] (e.g., Wald and Shopen, 1983). Skousen (1989) has shown how sociolinguistic variables such as those influencing terms of address in Egyptian Arabic can be readily incorporated into Analogical Model analyses of linguistic performance. In focusing on linguistic structure alone, Albright and Hayes (2003) have provided no natural mechanism for incorporating such well-attested non-structural variables into their accounts of linguistic behavior. In doing so, they are perpetuating the tradition of isolating linguistic explanations from more general accounts of learning and cognitive processes.

7.2. The Analogical Model and the nature of linguistic generalizations
In this paper, I have shown that a purely analogical model, specifically Skousen's Analogical Model (Skousen 1989, 1992), predicts the performance of participants at inflecting English nonce verbs for the past tense as well as does the Minimal Generalization Model of Albright and Hayes (2003). Moreover, the Analogical Model can readily model factors other than phonological structure that also influence the choice of past-tense form significantly, variables which Albright and Hayes omit from their model and from their experiment. Keuleers (2008) has shown that his Memory-Based Learning model, based on Daelemans et al. (2002), and probably also a more conventional version of Nosofsky's Generalized Context Model (Nosofsky 1986, 1990), can model the inflection of English nonce verbs at least as well as the Minimal Generalization Model does. Demonstrating that one theoretical approach can accommodate a set of data as well as another can does not in and of itself allow us to prefer one theoretical approach over another. However, the fact that the three exemplar-based models discussed in this paper can also account for the influence of variables that have been shown elsewhere to affect significantly the probability of choosing one past-tense form over another, but that are excluded from the Minimal Generalization Model, does argue for rejecting the latter in its current form. It is conceivable that one could find a
way to incorporate those semantic and sociolinguistic variables into a rule-based model. They certainly could be incorporated into a connectionist, or connectionist-like, model such as those tested by Marchman and Callan (1995), Hare et al. (1995), and Joanisse and Seidenberg (1999), among others, but the Minimal Generalization Model of Albright and Hayes (2003) does not account for them.

Beyond providing for the straightforward empirical evaluation of competing theories, we also look to our theories to provide coherent accounts of how the narrow questions of interest fit explicitly into broader theoretical frameworks. In particular, those of us who subscribe to the tenets of usage-based, cognitive linguistics (after Croft and Cruse 2004; Goldberg 2006; Langacker 1987, 2009; Tomasello 2003) look to our theories to help us to describe explicitly the common neuropsychological structures and processes that we believe to underlie both linguistic and nonlinguistic cognitive behavior. We also expect that good theories will help us to recognize and come to understand previously unnoticed aspects of language behavior. In the tradition of the generative linguistic enterprise (e.g., Chomsky 1965, 1986), the Minimal Generalization Model was developed as a linguistic learning procedure posited specifically to explain the acquisition and use of a very narrow range of linguistic generalizations. Consequently, it is not immediately apparent how that model might be extended to other aspects of linguistic usage. On the other hand, the three exemplar-based models discussed in this paper were developed as more general accounts of cognitive and linguistic behavior, and they have all been applied to a variety of linguistic and nonlinguistic phenomena (see Chandler 2009b for a recent review of this work).
The theoretical framework that locates the phenomena of inflectional usage solidly within a larger understanding of linguistic usage and cognition is categorization, the cognitive operation of assigning tokens of linguistic usage to category memberships. Cognitive linguists such as Abbot-Smith and Tomasello (2006), Croft and Cruse (2004), Goldberg (2006), and Langacker (1987, 2009) have all singled out categorization as the quintessential cognitive act underlying linguistic usage. In espousing this view, however, these researchers also import the unresolved questions and theoretical controversies regarding the nature of categorization into linguistics and make them issues of linguistic theory as well. The central unresolved question within the literature of concept learning and categorization is whether categories are represented within the brain as resident schematic structures abstracted away from the experiences that led to the learning of those categories or whether they arise on the fly, fleetingly, as needed, as characterized by exemplar-based models of categorization. A reading of literature reviews on this issue over time, from Estes (1994) to Shanks (1995) to Whittlesea (1997) to Murphy (2002) reveals a clear theoretical trajectory away from the schematized-prototype models of categorization
toward the exemplar-based models. Whittlesea has articulated the central point most sharply:

. . . there is no positive evidence for a dedicated mental apparatus that abstracts general, summary properties across particular experiences, or that conceptual knowledge is represented in a separate memory system, organized according to different principles than those governing the encoding of particular events (Whittlesea 1997: 340, emphasis added).

This body of research in cognitive psychology is crucial to our understanding of exemplar-based models in linguistics because Nosofsky's Generalized Context Model (Nosofsky 1986, 1990) and the Memory-Based Learning Models of Daelemans and others (Daelemans et al. 2002) are direct outgrowths of those exemplar-based approaches to modeling concept learning and categorization. In all four of the works on cognitive linguistics cited just above, the authors conclude that there is evidence from linguistic usage which justifies invoking schema abstraction as a key component of language acquisition and use. Croft and Cruse (2004) take no explicit position on the issue of exemplar-based models versus schema-abstraction models, but their discussion of linguistic categories makes clear their position that schema abstraction underlies category formation. Abbot-Smith and Tomasello (2006), Goldberg (2006), and Langacker (2009) all argue that the nature of how exemplars are represented and used within exemplar-based models reduces to a kind of schematic generalization, and Abbot-Smith and Tomasello and Langacker have gone on to argue specifically that users must abstract linguistic generalizations into structural schemas that stand apart from the instances of remembered usages that motivated those generalizations in the first place:
. . . we would question the assumption [of the purely exemplar-based models] that more abstract prototype categories are only generalized online and leave no permanent representational change . . . if the mutual similarities of a particular collection of exemplars (such as transitive sentences) are summed over regularly, we believe this is highly likely to permanently change the user's linguistic representations in some way equivalent to the formation of some kind of more abstract representation. (Abbot-Smith and Tomasello 2006: 282).

The proposal, then, is that something reasonably called linguistic structure does exist, that it emerges from usage, that it has presence in cognition (as patterns of processing activity), and that it has a causal role in speaking and understanding. (Langacker 2009: 18).
I shall argue here briefly that in each of the three works just cited, works in which exemplar-based approaches are considered and then found wanting, the discussions reveal an incomplete understanding of how exemplar-based models are thought to represent exemplars and to derive structural generalizations from them, and, finally, I shall show that some of the specific objections
that those authors express regarding exemplar-based models in general actually do not apply to Skousen's Analogical Model (Skousen 1989, 1992).

All three of the exemplar-based models discussed in this paper posit, in theory, the continuous, life-long accumulation of episodic memories for one's experiences, what Langacker (1987) calls "cognitive events." Presumably, those episodic memory traces consist of a cross-sectional record of the sensorimotor activity (after Barsalou 1999), the affective mental activity (after Colzato et al. 2007), and possibly even a neural record of one's overall bodily state during the episode (after Damasio 1999). I shall use the term "sensory" as shorthand for the whole range of neural activity across the brain that has been bound into an episodic memory, and when focusing our discussion and analysis on linguistic usage, we can speak of linguistic exemplars. Both the endogenous eye blink (Burton 1990) and the attentional blink (Raymond et al. 1992) have been suggested as possible indicators of the chunking of continuous neural activity into temporal episodes. The important point is that the episodic traces record experience in as much detail as the sensory receptors and attentional mechanisms are capable of recording. Presumably, these traces become the basis for the embodied meaning of instances of language usage (after Lakoff and Johnson 1999). That our episodic memories record much incidental sensory information beyond what we might think important to the task at hand has long been demonstrated in the phenomena of latent learning (also known as perceptual learning) (McLaren et al. 1989). In perceptual learning, seemingly insignificant, incidental aspects of a stimulus and its context are shown later to influence some subsequent cognitive task, showing, thereby, that those seemingly incidental sensory features actually had been recorded as part of the trace of the original experience.
These effects are seen in linguistic usage in the influence of seemingly insignificant phonetic characteristics on subsequent phonetic form as demonstrated, for example, by Goldinger (1997), Johnson (1997), Pierrehumbert (2001), and Gahl (2008). Since our brains have no way of knowing ahead of time which features of a sensory episode might be important for interpreting some future experience, we simply record as much sensory detail about a current experience as our sensory system can take in, without necessarily weighting those details for their relative importance. This description of sensory memory for episodes of experience is not to be taken as implying that the memory traces contain complete, veridical records of a given episodic experience. On the contrary, one's memory will contain many imperfectly recorded exemplars due to the effect known as stimulus sampling (Estes 1950). For any given episode, only some subset of the perceptual features potentially recordable for that episode may actually be encoded in the memory trace of the episode. Thus, Goldberg (2006) and Langacker (2009) are correct in noting that the memory representation for a given token of linguistic experience may be (indeed, probably will be) incomplete, but surely, this
incomplete sensory representation of an exemplar due to stimulus sampling is not what we mean by generalization and schema abstraction. Rather than abstracting schematic representations away from shared sensory features, it could just as well be that the subsequent activation of multiple episodes of experience with a stimulus, each of which might be fragmented in a different way, could serve to combine those sensory representations into a more fully specified representation of a form, much as we hear a missing sound in the phenomenon of phonemic restoration (Warren 1970). While episodic memory traces may encode all of the sensory information noticed in perceiving that episode, it does not follow that exemplars are unanalyzed gestalts of sensory experiences as Langacker seems to envision them (Langacker 2009: 22). The exemplar-based models all assume that we encode both the sensory information associated with an episode and a record of our interpretation of what those features signify. Thus, we interpret walked as the past tense of the meaning denoted by walk and sang as the past tense of the meaning denoted by sing, and similarly for more complex expressions such as a simple transitive sentence. Where do these interpretations come from if there are no resident linguistic generalizations to guide them? They are immanent in the exemplars activated to process the new instance of usage. I use Langacker's term (Langacker 1987) advisedly. In the exemplar-based approaches, the linguistic generalizations are immanent in the collection of exemplars activated by comparison to the new instance of linguistic usage, but the models do not posit the additional step of abstracting resident linguistic generalizations, or structures, away from those remembered exemplars.
Presumably, these immanent but ephemeral generalizations are developed and elaborated over time through the comparison of new episodes with the traces of previous episodes, much as described by Langacker (1987, 2009). Following Pierrehumbert's lead (Pierrehumbert 2001), Langacker (2009) characterizes the members of a category as forming a cloud of remembered tokens:
In an exemplar model, each category is represented in memory by a cloud of remembered tokens of that category. These memories are organized in a cognitive map, so that memories of highly similar instances are closer to each other and memories of dissimilar instances are far apart . . . When a new token is encountered, it is classified . . . according to its similarity to the exemplars already stored. (Pierrehumbert 2001: 140-141, quoted in Langacker 2009: 21).
Unfortunately, the characterization of category representation in exemplar-based models that Langacker adopted (Langacker 2009) is inaccurate. The categories per se are not represented anywhere in the models explicitly. The misunderstanding seems to arise because, as Langacker notes, (some) exemplar-based models have adopted a spatial metaphor as their basis for comparing and measuring similarity and dissimilarity among exemplars. The practice dates ultimately from Shepard (1980) and has been passed down directly to Nosofsky's model (1986) and to Daelemans's (Daelemans et al. 2002).12 In these models, exemplars are represented metaphorically as points in a multidimensional psychological space. There is a dimension for each variable or feature, and, collectively, the values for the distances along each dimension define a point in that space. By representing exemplars in this way, one may measure the distance, i.e., the degree of dissimilarity, between one exemplar and another geometrically. Now, it is true that the points representing very similar stimuli would cluster close together in this psychological space and, when represented visually as points on a graph, might resemble a cloud of exemplars. However, that cloud does not represent or define the category, and category membership is not determined by whether an exemplar falls within one cloud rather than another. Indeed, in the case of nonlinearly separable categories, members of one category may fall well within another category's cloud (Whittlesea 1997).

Exemplar-based models of concept learning and categorization were developed in the first place because similarity to some prototype, whether implicit or explicit, often did not predict categorization behavior accurately (Medin and Schaffer 1978; Nosofsky 1986; Whittlesea 1997). For example, similarity to a specific member of a category, even if an atypical, outlier member of the category, often predicts categorization behavior better than similarity to some more distant prototype does. Moreover, members of different categories often overlap inextricably into nonlinearly separable categories in which some member of one category may resemble the members of a different category more closely than it does the members of its own category and vice versa.
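The spatial representation described above can be made concrete with a small sketch. It is an illustration under stated assumptions only: the attention-weighted city-block distance and exponentially decaying similarity follow the general form associated with Nosofsky's (1986) Generalized Context Model, but the two-feature coordinates, the weights, the sensitivity parameter `c`, and all function names are hypothetical choices, not values taken from any of the studies discussed.

```python
import math
from collections import defaultdict


def weighted_distance(x, y, weights):
    """Attention-weighted city-block distance between two points."""
    return sum(w * abs(a - b) for w, a, b in zip(weights, x, y))


def similarity(x, y, weights, c=1.0):
    """Similarity decays exponentially with distance (c is an assumed sensitivity)."""
    return math.exp(-c * weighted_distance(x, y, weights))


def category_probabilities(probe, exemplars, weights, c=1.0):
    """Sum the probe's similarity to every stored exemplar, per category label.

    No category representation is stored; the totals are computed on the fly.
    """
    totals = defaultdict(float)
    for point, label in exemplars:
        totals[label] += similarity(probe, point, weights, c)
    grand_total = sum(totals.values())
    return {label: s / grand_total for label, s in totals.items()}


# Hypothetical exemplars: (coordinates in a two-feature space, category label).
exemplars = [
    ((0.0, 0.0), "A"), ((0.2, 0.1), "A"),
    ((1.0, 1.0), "B"), ((0.1, 0.9), "B"),
]
probe = (0.1, 0.8)
equal = category_probabilities(probe, exemplars, weights=(0.5, 0.5), c=4.0)
first_only = category_probabilities(probe, exemplars, weights=(1.0, 0.0), c=4.0)
print(equal)
print(first_only)
```

With equal weights the probe is assigned mostly to category B, whereas attending only to the first feature shifts the preference to category A. The stored points never change; only the weighting of the comparison does, which is the sense in which the category is not defined by a fixed cloud.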
Consider, for example, the past-tense forms of English verbs such as ring (peal), ring (encircle), string, and bring. What this means is that verbs do not always behave like their nearest neighbors, their surrounding cloud. Nosofsky (1986) and Daelemans et al. (2002) (and Keuleers 2008) compensated for these situations by adopting feature weighting based on the differential predictiveness of different features in different contexts. Skousen's Analogical Model (Skousen 1989, 1992) was developed completely independently of the other two exemplar-based models discussed here, yet it too is ultimately a model of categorization (see Chandler 2002 for an extensive discussion of Skousen's model as a more general model of categorization). The Analogical Model does not incorporate the spatial metaphor of the other two models (and, thus, there are no clouds within the Analogical
12. The Memory-Based Learning Models sometimes employ an alternative similarity metric, the so-called overlap metric, which compares the number of features shared and not shared among exemplars.
Model). Instead, it compares sets of features (supracontexts) among exemplars directly, and it uses the test for homogeneity versus heterogeneity to discount exemplars of supracontexts that are more like the target item than the supracontext itself is and that also behave differently than some of the less similar items do, i.e., nonlinearly separable items.13 The exemplars in the data set consulted by the Analogical Model are simply individual tokens of experience. Some of them may be topologically close to one another in their neural representations, but we attach no special theoretical significance to that. The process of comparing the target form to the collection of previous experiences activates those that share features homogeneously with the target form and constitutes them into an analogical set. The generalizations about the communicative significance of the target form are immanent in the exemplars momentarily drawn into that analogical set and guide the interpretation of the new input item. When the sensory information associated with that new communicative event is stored, the cognitive interpretation of it is stored along with its sensory trace, and a new exemplar is added to the data set, i.e., to memory.

To date, the three exemplar-based models discussed in this paper have been tested most extensively with data on phonological and morphological behavior, on which they have proven especially successful. It is not yet clear, however, how such models might scale up to a more extensive model of language. There have been a few small-scale experiments at modeling syntactic phenomena such as part-of-speech tagging (Daelemans, van den Bosch, and Weijters 1997) and prepositional phrase attachment (Lonsdale and Manookin 2004), but we do not yet have a workable formalism for applying these exemplar-based models to the hierarchical structure and compositionality of more complex syntactic constructions.
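The construction of an analogical set described earlier in this subsection can be illustrated with a toy sketch. It is a simplification under stated assumptions, not Skousen's full algorithm: a supracontext is treated as homogeneous when its exemplars either all share one outcome or all come from a single subcontext (a commonly cited practical formulation of the homogeneity test, assumed here), and each exemplar in a retained supracontext receives one pointer per member of that supracontext. The three-slot feature coding (onset, vowel, coda), the outcome labels, the nonce item, and the function name are all hypothetical.

```python
from collections import Counter
from itertools import combinations


def analogical_set_pointers(test, exemplars):
    """Count pointers per outcome over the homogeneous supracontexts of `test`."""
    n = len(test)
    # Subcontext of an exemplar: the exact set of positions where it matches the test item.
    matched = [
        (frozenset(i for i in range(n) if feats[i] == test[i]), outcome)
        for feats, outcome in exemplars
    ]
    pointers = Counter()
    # A supracontext is a set of positions on which exemplars must agree with the test item.
    for size in range(n + 1):
        for positions in combinations(range(n), size):
            required = frozenset(positions)
            members = [(sub, out) for sub, out in matched if required <= sub]
            if not members:
                continue
            outcomes = {out for _, out in members}
            subcontexts = {sub for sub, _ in members}
            # Assumed simplified test: several outcomes spread over several
            # subcontexts makes the supracontext heterogeneous, so it is discarded.
            if len(outcomes) > 1 and len(subcontexts) > 1:
                continue
            for _, out in members:
                pointers[out] += len(members)
    return pointers


# Hypothetical memory of remembered verbs, coded as (onset, vowel, coda).
memory = [
    (("s", "i", "ng"), "ablaut"),   # sing ~ sang
    (("r", "i", "ng"), "ablaut"),   # ring (peal) ~ rang
    (("w", "i", "ng"), "regular"),  # wing ~ winged
    (("f", "i", "ll"), "regular"),  # fill ~ filled
    (("f", "a", "n"), "regular"),   # fan ~ fanned
    (("s", "i", "t"), "ablaut"),    # sit ~ sat
]
test_item = ("f", "i", "ng")  # hypothetical nonce verb "fing"
pointers = analogical_set_pointers(test_item, memory)
total = sum(pointers.values())
print({outcome: count / total for outcome, count in pointers.items()})
```

In this toy memory the fully general supracontext and the vowel-only supracontext are discarded as heterogeneous, so sit contributes nothing even though it shares a feature with the test item. No generalization is stored between calls: the pointer counts are rebuilt from the remembered exemplars each time a new form is processed.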
Work is currently in progress on applying Skousen's (1989, 1992) Analogical Model to the interpretation of word strings as specific constructions such as the Caused Motion construction and the Ditransitive construction described in Goldberg (1995). As Croft and Cruse (2004) have shown, usage of both of those constructions shows all the classic hallmarks of categorization behavior. To the extent that we can model that behavior by comparing new instances of usage to a corpus of previously recorded instances, we need not posit the abstraction, representation, and application of resident syntactic generalizations for those constructions. Indeed, the research reported and reviewed in this paper raises the question of whether language exists in the brain as a system of resident linguistic generalizations (a grammar) or as a system for operating on new instances of linguistic usage by
13. Comparisons of the performance of the Analogical Model and of Memory-Based Learning Models that incorporate feature weighting almost never result in statistically significant differences, but the feature weighting is crucial to the successful performance of the Memory-Based Learning Models (Eddington 2002; Daelemans 2002).
comparing them to previously encountered instances of similar usage. To echo Whittlesea (1997), quoted above, for those phenomena to which the exemplar-based models have been applied, there does not appear to be any positive evidence (i.e., empirical evidence) motivating the rule-generalization approach over the exemplar-based approach.

Received 11 August 2008
Revision received 6 October 2009
University of Idaho
References
Abbot-Smith, Kirsten and Tomasello, Michael. 2006. Exemplar-based learning and schematization in a usage-based account of syntactic acquisition. The Linguistic Review, 23. 275-290.
Albright, Adam. 2007. Modeling analogy as probabilistic grammar. Unpublished manuscript, Massachusetts Institute of Technology, Cambridge, MA.
Albright, Adam and Hayes, Bruce. 2002. Modeling English past tense intuitions with Minimal Generalization. In M. Maxwell (Ed.), Proceedings of the 6th meeting of the ACL Special Interest Group in Computational Phonology. New Brunswick, NJ: Association for Computational Linguistics.
Albright, Adam and Hayes, Bruce. 2003. Rules vs. analogy in English past tenses: a computational/experimental study. Cognition, 90. 119-161.
Ashby, F. Gregory. 1992. Multidimensional models of categorization. In F. Gregory Ashby (Ed.), Multidimensional models of perception and cognition, pp. 449-483. Hillsdale, NJ: LEA.
Baayen, R. Harald. 2004. Statistics in psycholinguistics: a critique of some current gold standards. Mental Lexicon Working Papers I January 3, Edmonton, Alberta: University of Alberta.
Baayen, R. Harald, Piepenbrock, Richard and Gulikers, Léon. 1995. The CELEX lexical database (CD-ROM). University of Pennsylvania, Philadelphia, PA: Linguistic Data Consortium.
Baayen, R. Harald, Tweedie, Fiona J., and Schreuder, Robert. 2002. The subjects as a simple random effect fallacy: subject variability and morphological family effects in the mental lexicon. Brain and Language 81. 55-65.
Barsalou, Lawrence. 1999. Perceptual symbol systems. Behavioral and Brain Sciences, 22. 577-660.
Berko, Jean. 1958. The child's learning of English morphology. Word 14. 150-177.
Blevins, Juliette. 1995. The syllable in phonological theory. In J. A. Goldsmith (ed.), The Handbook of Phonological Theory, pp. 206-244. Oxford: Blackwell Publishers.
Burton, Peter. 1990. A search for explanation of the brain and learning: elements of the psychonomic interface between psychology and neurophysiology. Psychobiology, 18. 119-161, 162-194.
Cazden, Courtney. 1968. The acquisition of noun and verb inflections. Child Development 39. 433-448.
Chandler, Steve. 1995. Nondeclarative linguistics: Some neuropsychological perspectives. Rivista di Linguistica 7. 233-247.
Chandler, Steve. 1998. Instance-based reference for past-tense verb forms: An experimental study. Paper presented at the First International Conference on the Mental Lexicon at the University of Alberta, Edmonton, Alberta, September 3-5.
Chandler, Steve. 2002. Skousen's analogical approach as an exemplar-based model of categorization. In Royal Skousen, Deryle Lonsdale and Dilworth B. Parkinson (Eds.), Analogical modeling: An exemplar-based approach to language, pp. 11–26. Amsterdam: John Benjamins.
Chandler, Steve. 2008. Predicting naming latencies with an analogical model. Journal of Psycholinguistic Research, 37. 259–268.
Chandler, Steve. 2009a. Exemplar-based models. In David Eddington (Ed.), Experimental and quantitative linguistics, pp. 100–158. Muenchen: Lincom Europa Press.
Chandler, Steve. 2009b. Analogical Modeling: A unified account of the regular-irregular dissociations seen in the inflectional morphology of English verbs. University of Idaho, Moscow, ID, manuscript submitted for publication.
Chomsky, Noam. 1965. Aspects of the theory of syntax. Cambridge, MA: MIT Press.
Chomsky, Noam. 1986. Knowledge of language: Its nature, origin, and use. New York: Praeger.
Colzato, Lorenza, van Wouwe, Nelleke and Hommel, Bernhard. 2007. Feature binding and affect: emotional modulation of visuo-motor integration. Neuropsychologia, 45. 440–446.
Cost, Scott and Salzberg, Steven. 1993. A weighted nearest neighbor algorithm for learning with symbolic features. Machine Learning, 10. 57–78.
Croft, William D. and Cruse, D. Alan. 2004. Cognitive linguistics. Cambridge: Cambridge University Press.
Daelemans, Walter. 2002. A comparison of analogical modeling to memory-based language processing. In Royal Skousen, Deryle Lonsdale and Dilworth B. Parkinson (Eds.), Analogical modeling: An exemplar-based approach to language, pp. 157–179. Amsterdam: John Benjamins.
Daelemans, Walter, Gillis, Steven and Durieux, Gert. 1997. Skousen's analogical modeling algorithm: a comparison with lazy learning. In D. B. Jones and H. L. Somers (Eds.), New methods in language processing, pp. 3–15. UCL Press.
Daelemans, Walter, van den Bosch, Antal and Weijters, Ton. 1997. IGTree: using trees for compression and classification in lazy learning algorithms. In D. W. Aha (Ed.), Lazy learning, pp. 407–423. Dordrecht: Kluwer.
Daelemans, Walter, van den Bosch, Antal and Zavrel, Jakub. 1999. Forgetting exceptions is harmful in language learning. Machine Learning, 34. 11–41.
Daelemans, Walter, Zavrel, Jakub, van der Sloot, K. and van den Bosch, Antal. 2002. TimBL: Tilburg Memory-Based Learner, version 4.3 reference guide. Tilburg: ILK.
Damasio, Antonio. 1999. The feeling of what happens. New York: Harcourt Brace.
Derwing, Bruce L. and Skousen, Royal. 1994. Productivity and the English past tense: testing Skousen's analogy model. In S. D. Lima, R. L. Corrigan, and G. K. Iverson (Eds.), The reality of linguistic rules, pp. 193–218. Amsterdam: John Benjamins.
Eddington, David. 2000. Analogy and the dual-route model of morphology. Lingua, 110. 281–298.
Eddington, David. 2002. A comparison of two analogical models: Tilburg Memory-Based Learner versus Analogical Modeling. In Royal Skousen, Deryle Lonsdale and Dilworth B. Parkinson (Eds.), Analogical modeling: An exemplar-based approach to language, pp. 141–156. Amsterdam: John Benjamins.
Eddington, David. 2003. Issues in modeling language processing analogically. Lingua, 114. 849–871.
Eddington, David. 2007. Flaps and other variants of /t/ in American English: Allophonic distribution without constraints, rules, or abstractions. Cognitive Linguistics, 18. 23–46.
Eddington, David and Lonsdale, Deryle. 2007. Analogical modeling: an update. Unpublished manuscript, Brigham Young University, Provo, UT.
Estes, William K. 1950. Toward a statistical theory of learning. Psychological Review, 57. 94–107.
Estes, William K. 1976. The cognitive side of probability learning. Psychological Review, 83. 37–64.

Estes, William K. 1994. Classification and cognition. Oxford: Oxford University.
Francis, W. Nelson and Kučera, Henry. 1986. Frequency analysis of English usage: Lexicon and grammar. Boston: Houghton Mifflin.
Gahl, Susanne. 2008. TIME and THYME are not homophones: the effect of lemma frequency on word duration in spontaneous speech. Language, 84. 474–496.
Goldberg, Adele E. 1995. Constructions: a Construction Grammar approach to argument structure. Chicago: Chicago University Press.
Goldberg, Adele E. 2006. Constructions at work. Oxford: Oxford University Press.
Goldinger, S. D. 1997. Words and voices: perception and production in an episodic lexicon. In K. Johnson and J. W. Mullennix (Eds.), Talker variability in speech processing, pp. 33–65. San Diego: Academic Press.
Hare, Mary, Elman, Jeffrey L. and Daugherty, Kim G. 1995. Default generalization in connectionist networks. Language and Cognitive Processes, 10. 601–630.
Joanisse, Marc F. and Seidenberg, Mark S. 1999. Impairments in verb morphology after brain injury: a connectionist model. Proceedings of the National Academy of Science, USA, 96. 7592–7597.
Johnson, K. 1997. Speech perception without speaker normalization: an exemplar model. In K. Johnson and J. W. Mullennix (Eds.), Talker variability in speech processing, pp. 145–165. San Diego: Academic Press.
Keuleers, Emmanuel. 2008. Memory-based learning of inflectional morphology. Unpublished doctoral dissertation, Universiteit Antwerpen, Antwerp.
Kuczaj, Stanley. 1977. The acquisition of regular and irregular past tense forms. Journal of Verbal Learning and Verbal Behavior, 16. 589–600.
Lakoff, George and Johnson, Mark. 1999. Philosophy in the flesh: the embodied mind and its challenge to Western thought. Basic Books.
Langacker, Ronald W. 1987. Foundations of cognitive grammar, vol. 1: Theoretical prerequisites. Stanford, CA: Stanford University Press.
Langacker, Ronald W. 2009. The emergence of structure from usage. Unpublished manuscript, University of California, San Diego.
Lonsdale, Deryle and Manookin, Michael. 2004. Combining learning approaches for incremental on-line parsing. In Proceedings of the Sixth International Conference on Cognitive Modeling, pp. 160–165. Mahwah, NJ: Lawrence Erlbaum.
Marchman, Virginia A. 1997. Children's productivity in the English past tense: The role of frequency, phonology, and neighborhood structure. Cognitive Science, 21(3). 283–304.
Marchman, Virginia A. and Callan, Daniel E. 1995. Multiple determinants of the productive use of the regular past tense suffix. Proceedings of the 17th annual Cognitive Science Society. Hillsdale, NJ: Erlbaum.
McLaren, I. P. L., Kaye, Helen and Mackintosh, N. J. 1989. An associative theory of the representation of stimuli: applications to perceptual learning and latent inhibition. In Robert G. M. Morris (Ed.), Parallel distributed processing: Implications for psychology and neurobiology, pp. 102–130. Oxford: Oxford University Press.
Medin, Douglas L., and Schaffer, Marguerite M. 1978. Context theory of classification learning. Psychological Review, 85. 207–238.
Moscoso del Prado Martín, Fermín, Kostić, Aleksandar and Baayen, R. Harald. 2004. Putting the bits together: an information theoretical perspective on morphological processing. Cognition, 94. 1–18.
Mudrow, Michael. 2002. Version spaces, neural networks, and analogical modeling. In Royal Skousen, Deryle Lonsdale and Dilworth B. Parkinson (Eds.), Analogical modeling: An exemplar-based approach to language, pp. 225–264. Amsterdam: John Benjamins.
Murphy, George J. 2002. The big book of concepts. Cambridge, MA: MIT Press.
Nakisa, Ramin and Hahn, Ulrike. 1996. Where defaults don't help: the case of the German plural system. In George W. Cottrell (Ed.), Proceedings of the eighteenth annual conference of the Cognitive Science Society, pp. 177–182. Mahwah, NJ: LEA.
Nakisa, Ramin, Plunkett, Kim and Hahn, Ulrike. 2000. Single- and dual-route models of inflectional morphology. In P. Broeder and J. Murre (Eds.), Models of language acquisition, pp. 201–222. Oxford: Oxford University Press.
Nosofsky, Robert M. 1986. Attention, similarity, and the identification-categorization relationship. Journal of Experimental Psychology: General, 115(1). 39–57.
Nosofsky, Robert M. 1990. Relations between exemplar similarity and likelihood models of classification. Journal of Mathematical Psychology, 34. 393–418.
Pierrehumbert, Janet B. 2001. Exemplar dynamics: word frequency, lenition and contrast. In Joan Bybee and Paul Hopper (Eds.), Frequency effects and emergent grammar, pp. 1–19. Amsterdam: John Benjamins.
Pinker, Steven. 1999. Words and rules: the ingredients of language. New York: Basic Books.
Pinker, Steven and Prince, Alan S. 1988. On language and connectionism: analysis of a parallel distributed processing model of language acquisition. Cognition, 28. 73–193.
Pinker, Steven and Ullman, Michael T. 2002. The past-tense debate. Trends in Cognitive Sciences, 6(11). 456–463.
Prasada, Sandeep and Pinker, Steven. 1993. Generalization of regular and irregular morphological patterns. Language and Cognitive Processes, 8. 1–56.
Ramscar, Michael. 2002. The role of meaning in inflection: Why the past tense does not require a rule. Cognitive Psychology, 45. 45–94.
Raymond, J. E., Shapiro, K. L., and Arnell, K. M. 1992. Temporary suppression of visual processing in an RSVP task: an attentional blink? Journal of Experimental Psychology: Human Perception and Performance, 18. 849–860.
Rumelhart, David E. and McClelland, James L. 1986. On learning past tenses of English verbs. In David E. Rumelhart and James L. McClelland (Eds.), Parallel distributed processing, Vol. 2: Psychological and biological models, pp. 216–271. Cambridge, MA: MIT.
Shanks, David. 1995. The psychology of associative learning. Cambridge: Cambridge University Press.
Shepard, Roger N. 1980. Multidimensional scaling, tree fitting, and clustering. Science, 210. 390–398.
Skousen, Royal. 1989. Analogical modeling of language. Dordrecht: Kluwer Academic.
Skousen, Royal. 1992. Analogy and structure. Dordrecht: Kluwer Academic.
Skousen, Royal. 1995. Analogy: a non-rule alternative to neural networks. Rivista di Linguistica, 7. 213–232.
Skousen, Royal. 1998. Natural statistics in language modeling. Journal of Quantitative Linguistics, 5. 246–255.
Skousen, Royal. 2006. Expanding Analogical Modeling into a general theory of language prediction. Paper presented to the Max Planck Institute, Leipzig.
Stemberger, Joseph P. and MacWhinney, Brian. 1986. Form-oriented inflectional errors in language processing. Cognitive Processing, 18. 329–354.
Stern, Karen (Ed.). 1997. Longman Dictionary of American English. New York: Longman.
Tomasello, Michael. 2003. Constructing a language: A usage-based theory of language acquisition. Cambridge, MA: Harvard University Press.
Ullman, Michael T., Corkin, Suzanne, Coppola, Marie, Hickok, Gregory, Growdon, John H., Koroshetz, Walter J., and Pinker, Steven. 1997. A neural dissociation within language: evidence that the mental dictionary is part of declarative memory, and that grammatical rules are processed by the procedural system. Journal of Cognitive Neuroscience, 9. 266–276.

van den Bosch, Antal. 2002. Expanding k-NN analogy with instance families. In Royal Skousen, Deryle Lonsdale and Dilworth B. Parkinson (Eds.), Analogical modeling: An exemplar-based approach to language, pp. 209–223. Amsterdam: John Benjamins.
Warren, R. M. 1970. Perceptual restoration of missing speech sounds. Science, 167. 392–395.
Whittlesea, Bruce W. A. 1997. The representation of general and particular knowledge. In Kenneth Lamberts and David Shanks (Eds.), Knowledge, concepts and categories, pp. 161–195. Cambridge, MA: MIT Press.
Wald, Benji and Shopen, Timothy. 1983. A researcher's guide to the sociolinguistic variable (ING). In T. Shopen and J. Williams (Eds.), Style and variables in English, pp. 219–249. Cambridge, MA: Winthrop.
Wulf, Douglas J. 2002. Applying analogical modeling to the German plural. In Royal Skousen, Deryle Lonsdale and Dilworth B. Parkinson (Eds.), Analogical modeling: An exemplar-based approach to language, pp. 109–122. Amsterdam: John Benjamins.
Who thinks that a piece of furniture refers to a broken couch? Count-mass constructions and individuation in English and Spanish
MARIA D. SERA and WHITNEY GOODRICH*
Abstract Differences between languages in count (e.g., cup) and mass (e.g., rice) nouns have been shown to impact cognition, but few studies have directly examined how the morphology associated with count and mass constructions is acquired and linked to differences in meaning. Two experiments examined the relation between English and Spanish plural morphology and the interpretation of nouns as individuated objects. In Experiment 1, English- and Spanish-speaking children and adults participated in two tasks. One task examined how participants produced plurals for nouns. Results from this task indicated that both language groups make a distinction in their use of plural morphology for count and mass nouns between 5 and 7 years of age. However, that morphological distinction was stronger and occurred earlier among English speakers. A second task examined whether the count (but not the mass) nouns in each language denoted individuated objects. Speakers of both languages tended to treat all nouns as referring to individuated objects at 5 years of age. Beginning at 7 years of age, English speakers made a reliable distinction between the referents of mass and count nouns. Speakers of Spanish, however, treated both types of nouns as referring to individuated objects in this task throughout development. Experiment 2 examined the interpretation of nouns by adult speakers of both languages using a different task, and the results offer converging evidence that Spanish speakers are more likely than English speakers
* Address for correspondence: Maria D. Sera, University of Minnesota, Institute of Child Development, 51 East River Road, Minneapolis, MN 55455 (e-mail: sera@umn.edu). Whitney Goodrich is currently at the University of California, Berkeley. Acknowledgements: This work was supported by a Multicultural Research Award from the University of Minnesota to Maria D. Sera. Thanks go to Milissa Tilton, Annie Ryman, Jennifer Reeves, and Alice Friedman for their assistance with collection of the data. Portions of Experiments 1 and 2 were Honors Theses by Annie Ryman and Whitney Goodrich, respectively. Parts of this work were presented at the Biennial Meetings of the Society for Research in Child Development in Minneapolis in April 2001 and in Tampa in April 2003.
Cognitive Linguistics 21–3 (2010), 419–442. DOI 10.1515/COGL.2010.015. 0936–5907/10/0021–0419 Walter de Gruyter
to interpret nouns as referring to individuated objects. The reason for the difference between English and Spanish is discussed in terms of the proportion of count and mass nouns acquired early in development and as a function of developing concepts.

Keywords: Individuation, English and Spanish, Count and Mass Noun Constructions, Plural Morphology, Semantics
1. Introduction

People make inferences about objects from how they are named. For example, from a very young age, English-speaking children pick out an individuated whole object if it is labeled "a blicket" and the material that it is made of if it is called "some blicket" (Brown, 1957). These findings illustrate that the constructions in which English nouns appear (with words such as "a", "the", and "some") lead to inferences about the nature of the entities to which the English nouns refer. So one process through which we obtain information about objects from language is by forming word classes and relating these word classes to referents. Different classes of nouns can be formed by keeping track of the determiners, articles, quantifiers, prefixes, suffixes, and other words with which the nouns occur. By this view, nouns that occur in a certain set of constructions belong in one class, while nouns that occur with different constructions belong to another class. Often, the different noun classes form coherent categories. Such seems to be the case with the English distinction between count and mass nouns. The constructions in Table 1 illustrate the patterns typically associated with the classes of English count and mass nouns. According to Stockwell et al. (1977), the most obvious differences between English count and mass nouns are (1) the lack of plural morphology such as [s] for mass nouns, and (2) that the indefinite determiner "a" appears only with count nouns. The following features also distinguish between the two English noun classes: words for the cardinal numbers such as "one", "two", and "three" may occur with count nouns such as "cup" but not with mass nouns such as "rice"; quantifiers such as "a few" and "several" occur with count but not mass nouns. As previously stated, count nouns also require plural morphology (the addition of a plural [s]) when referring to more than one item, while mass nouns do not.
To refer to more than one instance of a mass noun, a counter or measure word is pluralized, as for example in "Three cups of rice". In short, the nouns that co-occur with quantifiers such as "one", "two", "three", "some", "a few", and "several" and require plural morphology when referring to more than one item are called count nouns. The nouns that co-occur with quantifiers such as "some", "much", "less", and "little", that require measure words or counters such as "pieces of" and "grains of", and that do not allow plural morphology when referring to more than one item are called mass nouns. For these reasons, we use "count noun" to mean nouns that typically occur in count noun constructions and "mass noun" for nouns that typically occur in mass noun constructions.

Table 1. Examples of the constructions typically used to discriminate English count from mass nouns.

NOUN: CUP                          NOUN: RICE
A cup is on the table.             *A rice is on the table.
Two cups are on the table.         *Two rices are on the table.
Many cups are on the table.        *Many rices are on the table.
Some cups are on the table.        *Some rices are on the table.
Several cups are on the table.     *Several rices are on the table.
*A lot of cup is on the table.     A lot of rice is on the table.

*Constructions that do not occur in English.

Differences between nouns that appear in count and mass noun constructions have been linked to a conceptual distinction between individuated and non-individuated objects in two ways. One is based on linguistic analyses, the other on experimental evidence. The rationale linking the distributional difference with the conceptual contrast is as follows. According to Quine (1969), MacNamara (1982) and Langacker (1987), count nouns do not have to be explicitly unitized (that is, packaged into units of measure such as cups, kernels, and grains) when they are counted, because count nouns refer to individuated whole objects with shapes and clear boundaries. Mass nouns, in contrast, are thought to refer to shapeless collections of elements that need to have boundaries imposed on them before they can be counted. By this view, English count and mass nouns map onto an underlying conceptual distinction between individuated objects and aggregates or substances.

Another source of evidence linking differences in English mass and count noun constructions to reference is experimental. As previously stated, Brown (1957) demonstrated that 5-year-old English-speaking children appear to know that a nonsense word heard in a count-noun construction (e.g., "a fep") denotes a solid object, whereas the same nonsense word in a mass-noun construction (e.g., "some fep") denotes a substance. A number of more recent studies have also linked English count-mass constructions to a conceptual distinction between individuated objects and non-individuated ones. For example, Middleton et al. (2004) have shown that English-speaking adults make a conceptual distinction between count and mass nouns that refer to aggregates.
By aggregates, they mean multiple, co-occurring and relatively homogenous constituents that can be divided into objects. In English, aggregates can be named both by plural count nouns such as pebbles and mass nouns such as rice. In one experiment,
Middleton et al. (2004) report that English-speaking adults rate the referents of familiar count nouns such as "beans" as more perceptually distinguishable than the referents of familiar mass nouns such as "rice".

Recent studies have also focused on the relation between count-mass morphology and the conceptual distinction between individuated objects and continuous substances, both developmentally and cross-linguistically. Developmental work has focused on whether the morphological distinction precedes or follows the conceptual distinction. For example, Soja et al. (1991) argued that a distinction between solid objects and substances is made prior to the production of mass-count syntax among young English-speaking children. However, Samuelson and Smith (1999) showed that count nouns were not interpreted as referring to solid objects until approximately three years of age, after children had acquired many count nouns. Even at that age, however, children did not interpret mass nouns as referring to non-solid substances. Studying conceptual development, Keil (1979) found that it was not until about 7 years of age that children know that certain predicates (e.g., "is tall") are only allowed with nouns that refer to solid objects, in contrast with other predicates (e.g., "leaks out") that are allowed with nouns that refer to substances and aggregates. That is, young children, unlike older children and adults, are just as likely to accept sentences such as "The milk is tall" and "The chair leaks out of boxes" as "The chair is tall" and "Milk leaks out of boxes". Thus, the evidence from language and conceptual development suggests that the acquisition of count and mass nouns develops asynchronously among English speakers, and that the learning of the corresponding semantic distinctions between individuated entities and non-individuated ones is a protracted process that lasts well into childhood.
Cross-linguistic studies have primarily focused on whether differences across speakers of different languages in count-mass morphology accompany conceptual differences between the speakers (e.g., Lucy, 1992). Much of this cross-linguistic work has focused on differences between English speakers and speakers of classifier languages in which a distinction between count and mass nouns may not exist (e.g., Yucatec Mayan, Japanese, Chinese). In classifier languages, a word called a classifier or counter is required to appear with the noun when the noun occurs with a numeral and/or a demonstrative or certain quantifiers (Li and Thompson, 1981). For example, when saying "one leaf" in Mandarin one would say "yi pian yezi" [one-CLASSIFIER-leaf]. Because the analogy between English mass nouns (e.g., soap, rice, and furniture) and all nouns in classifier languages is often made, a central issue in these studies has been whether speakers of classifier languages represent the referents of all nouns in the same way English speakers represent the referents of mass nouns (e.g., Lucy, 1992; Imai and Gentner, 1997). For example, Lucy (1992) reported that adult speakers of Yucatec Mayan rely more on material similarity when
classifying entities than English speakers, who rely more heavily on shape similarity. Imai and Gentner (1997) reported similar findings among speakers of Japanese, another numerical classifier language. However, not all speakers of classifier languages rely less on shape than English speakers when classifying objects. In Mandarin Chinese, shape is an important basis for classifier use, and Kuo and Sera (2008) recently reported that speakers of Mandarin Chinese rely more heavily on shape than English speakers when categorizing solid objects. Thus, subtle differences across languages in count-mass constructions seem to contribute to the representation of entities in different ways.

In this paper, we focus on differences between English and Spanish in use of the plural form [s] for mass and count nouns, and on how this morphological difference is linked to speakers' interpretations of these nouns as referring to individuated objects. Differences and similarities between the two languages in their count-mass systems have been noted by several scholars (e.g., Stockwell et al., 1977; Gathercole, 1997; Colunga et al., 2002). However, to our knowledge there is no experimental evidence that documents these language differences. According to Stockwell et al. (1977), both English and Spanish have count and mass noun constructions, but there are two significant differences in the distinction between the languages. The first is that there are numerous discrepancies between the languages in the assignment of what appear to be semantically equivalent lexical items to a class. The most common example of this difference is that there seem to be numerous nouns that are mass nouns in English but count nouns in Spanish (e.g., furniture-muebles; jewelry-joyas; advice-consejos; news-noticias). According to Stockwell et al. (1977), the reverse pattern (mass nouns in Spanish that are count nouns in English) is very rare.
The second significant difference between English and Spanish count-mass distinctions, according to Stockwell et al., is the movement of Spanish mass nouns into count noun constructions. By their analysis, there are a number of Spanish nouns that occur only in mass noun constructions, a phenomenon that establishes the class. Many of these nouns refer to abstract entities (e.g., justicia-justice and paciencia-patience). However, Stockwell et al. (1977) argue that the ease with which Spanish mass nouns move into Spanish count constructions is not matched in English (p. 83). It should be noted, however, that nouns can move from one class to the other in both directions in both languages. In English, for example, one can "countify" a typical mass noun when referring to a standard serving (e.g., "one milk", "two coffees", "three beers"), to distinguish between types of the referents of the mass noun (e.g., "a new rice has been developed"), or for poetic or special effects (e.g., "the sands of time"). Similarly, count nouns can be "massified" in English ("there's egg on your face"; "that's a lot of house to clean"). However, Stockwell et al. (1977) argue that Spanish mass nouns jump more easily into Spanish count constructions than vice versa, and more so than they do in English.
Because of the observation that Spanish mass nouns seem to appear frequently in count noun constructions, some have made a more radical claim: that there is no distinction between Spanish mass and count nouns at all. For example, Mueller-Gathercole (1997) writes, "The structure of Spanish is quite distinct from English in this regard. In Spanish, there is no linguistic distinction that separates mass words from count words on the basis of distributional privileges" (p. 833). Consistent with this claim, Mueller-Gathercole reports that Spanish-English bilinguals do not honor the English count-mass distinction until 9 years of age, and then only if they are more dominant in English than Spanish. However, the reason for these findings is not clear: to our knowledge, no studies to date have systematically investigated the linguistic constructions across the two languages, or the relation between these constructions and the interpretation of nouns. Thus, the main goal of this work is to begin to ask whether there are differences between English and Spanish in count-mass constructions, and whether these linguistic constructions have any ramifications for how speakers of the languages construe the referents of count and mass nouns.

One sentence completion task and two experiments were conducted. Because some scholars have claimed that there is no distinction between count and mass nouns in Spanish, results from a sentence completion task in which English- and Spanish-speaking adults made plurals for nouns that were thought to be count and mass in each language are reported. The results from this task offer evidence that a distributional difference exists between count and mass nouns in Spanish based on plural morphology. In Experiment 1, we began to investigate the relation between plural morphology and reference in the two languages. In this experiment, English- and Spanish-speaking children and adults participated in two tasks.
One task examined the development of plural morphology. The other task examined whether the referents of count nouns are treated as individuated objects, unlike the referents of mass nouns. The results from this study suggest that speakers of both languages treat the referents of both count and mass nouns as individuated objects early in development. However, between 7 and 9 years of age English speakers begin to treat the referents of mass nouns as aggregates, unlike Spanish speakers. In Experiment 2, we provide converging evidence suggesting that Spanish speakers are more likely than English speakers to interpret nouns as referring to individuated objects.

2. Sentence completion task

Disagreement exists regarding whether or not there is any distinction in Spanish between count and mass nouns. Stockwell et al. (1977) argue in favor of a distinction in Spanish; Mueller-Gathercole (1997) argues against one. Importantly, there is to our knowledge no experimental evidence on how English and Spanish speakers treat nouns with respect to count-mass constructions. Thus, the goal of this sentence completion task was to offer some experimental evidence from English and Spanish speakers on their linguistic treatment of mass and count nouns. Because plural formation has been an important criterion by which nouns are classified as count or mass, adults were asked to produce plurals for 14 nouns in each language. The nouns were chosen to refer to every kind of solid entity identified by Keil (1979) and referred to a range of entities from aggregates such as sand to individuated objects such as bicycle. Sixteen native, monolingual speakers of English and 16 native, monolingual speakers of Spanish from Panama provided the contextually appropriate plural form of the noun by completing sentences that described a container that was full of an entity. Each sentence described a container that was large enough to hold the entity in question, so some containers were small, such as "a pot full of beans", while others were very large, such as "a land full of lakes". The sentences contained a blank for the subject to provide the appropriate form of the noun. The singular form of the noun appeared below each blank in parentheses. Two sample sentences that included the plural form in the blank were provided for the subject. One was an example of a count noun: "The pot was full of beans (bean)" in English, and "La olla estaba llena de frijoles (frijol)" in Spanish. The other sentence was an example of a mass noun: "The swimming pool was filled with water (water)" in English, and "La piscina se llenó de agua (agua)" in Spanish. Both language groups were instructed that some of the nouns might already be in their correct forms.
Table 2 shows the number of English and Spanish speakers that added a plural [s] to each noun, along with results from chi-square tests that indicate reliable differences between English and Spanish speakers in their treatment of the nouns. As is clear from Table 2, there are both similarities and differences between English and Spanish speakers in plural formation of the nouns. Speakers of both languages made a distinction between the nouns. Most English and Spanish speakers added an [s] to make the plurals of letter, lake, bicycle, pencil and horse and did not add an [s] to make plurals of sugar, sand, rice, garbage, and food. Neither group added an [s] when they made the plurals for nouns referring to fine-grained aggregates such as sugar and sand. Thus, as Stockwell et al. (1977) claimed, there is a count-mass distinction in English and Spanish that involves the addition of the plural [s] for count nouns but not for mass nouns. We also found some differences. Every reliable difference observed between English and Spanish speakers indicated that more Spanish speakers than English speakers added a plural [s] to the nouns. Moreover, as Stockwell et al. (1977) suggested, some English mass nouns were treated as count nouns in Spanish (furniture, soap, fruit, and jewelry) but the reverse pattern was not found.
Table 2. The number of English- and Spanish-speaking adults (out of 16) that added an [s] to make plurals in the sentence completion task.

NOUN                 Spanish   English   Chi Square      p
sugar/azúcar            0         1      χ²(1) = .50     n.s.
garbage/basura          3         0      χ²(1) = 6.0     p < .02
rice/arroz              4         0      χ²(1) = 8.0     p < .01
sand/arena              2         1      χ²(1) = 2.0     n.s.
food/comida             4         1      χ²(1) = 6.0     p < .02
letter/carta           16        16      χ²(1) = 0.0     n.s.
lake/lago              16        16      χ²(1) = 0.0     n.s.
bicycle/bicicleta      16        16      χ²(1) = 0.0     n.s.
pencil/lápiz           16        16      χ²(1) = 0.0     n.s.
horse/caballo          16        16      χ²(1) = 0.0     n.s.
furniture/mueble       16         1      χ²(1) = 26.5    p < .001
soap/jabón             14         2      χ²(1) = 24.0    p < .001
fruit/fruta            14         3      χ²(1) = 14.2    p < .001
jewelry/joya           16         1      χ²(1) = 26.5    p < .001
The implications of these findings are straightforward. Spanish speakers, like English speakers, treat some nouns, but not others, as count by adding a plural [s] when referring to more than one item. Thus, Spanish speakers do make a distinction between count and mass nouns, at least with respect to plural morphology. Experiment 1 addresses whether the morphological difference between Spanish count and mass nouns marks individuation as it does in English.
3. Experiment 1
To our knowledge, there are no studies that directly examine the relation between plural formation and the interpretation of nouns as individuated objects or aggregates in languages other than English. The goal of this experiment was to begin to examine this developing relation in English and Spanish speakers. The addition of a plural [s] in English has been associated with individuated objects while the lack of an [s] in plural forms has been associated with substances or aggregates among English speakers (e.g., Middleton et al., 2004). Thus, we asked whether the use of a plural [s] is similarly linked with individuated objects in Spanish speakers. We employed two tasks: a plural formation task and an individuation task. The Plural Formation Task examined the use of plural syntax for a set of nouns. The results from this task were used to identify count and mass nouns in each language. The Individuation Task was designed to determine whether the referents of the count nouns were construed as individuated objects in contrast to the referents of the mass nouns. The plural formation task generally followed the procedures of the classic "wug" test developed by Berko (1958) except that familiar nouns were used
instead of nonsense words. In this task, participants were shown a single object and asked to name it. Then they were shown two objects and asked to name them. We examined the degree to which nouns for which a plural [s] was added in the Plural Formation Task were represented as individuated objects in the Individuation Task. For the individuation task, we employed the idea that individuated objects, unlike aggregates, do not retain their identities when they are subdivided. For example, "a piece of cup" is supposed to refer to a part of a cup, and is not supposed to refer to a whole cup. However, "a piece of furniture" is supposed to refer to a whole object and not a fragment (Langacker, 1987). So if a correlation exists between plural formation and noun semantics, we should find that the nouns for which participants produced a plural [s] in the Plural Formation Task (e.g., cup-cups) would not retain their identities when subdivided in the Individuation Task. Conversely, we should find that nouns for which a plural [s] was not used in the Plural Formation Task (e.g., furniture) would retain their identities when subdivided in the Individuation Task.
3.1. Method
3.1.1. Participants. Native, monolingual, Spanish-speaking and English-speaking children and adults in four age groups participated (N = 96). There were 48 participants in each language group, and 12 participants in each age group within each language group. Within each age and language group approximately the same numbers of males and females participated. We divided the participants into the following age groups: 5-year-olds, 7-year-olds, 9-year-olds, and adults. The Spanish five-year-olds had a mean age of 5;6, and consisted of six girls and six boys. The English five-year-olds had a mean age of 5;5, and consisted of seven girls and five boys. The Spanish seven-year-olds had a mean age of 7;7, and consisted of six girls and six boys. The English seven-year-olds had a mean age of 7;6, and consisted of seven girls and five boys. The Spanish nine-year-olds had a mean age of 9;4, and consisted of six girls and six boys. The English nine-year-olds had a mean age of 9;6, and consisted of six girls and six boys. There were 12 Spanish-speaking adults (six men and six women) and 12 English-speaking adults (six men and six women). The Spanish-speaking children were recruited from a public school, Escuela La Palmera, in Trinidad-Beni, Bolivia. All of the Spanish-speaking children were monolinguals. The responses of one participant (a male in the 5-year age group) were not used because the child was unable to follow the instructions of the experimenter. The English-speaking children were from a metropolitan area in the Midwestern United States and were recruited to match the Spanish-speaking participants as closely as possible in age and gender. The English-speaking adults were students at a large public university in the Midwestern United States. The Spanish-speaking adults were students from Puerto Rico
who were participating in a summer program at the same university as the English speakers. Thus, all of the adults were university-educated and between 19 and 23 years of age. The Spanish-speaking adults were also fluent speakers of English. Although the adult speakers in this experiment came from a different country (Puerto Rico) than the adults in the sentence completion task (Panama) and the children (Bolivia), all Spanish dialects are thought to be equivalent with respect to plural formation.
3.1.2. Materials. We examined plural formation and individuation of 13 English and 13 Spanish nouns. The nouns were chosen to refer to objects of various sizes. They were: soap (jabón), rice (arroz), peanut (maní), chalk (tiza), pen (bolígrafo), jewelry (joya), paper (papel), cup (taza), fruit (fruta), celery (apio), carrot (zanahoria), bread (pan), and furniture (mueble). Thirteen sets of three 3-dimensional items were used. A set consisted of three platforms, each displaying different examples of the same test object: one platform held a single whole object, another held two of the same whole objects, and a third held a fragment of the object. In order to be clearly identifiable, fragmented examples were constructed to show approximately one half of the test object. The items in each set were displayed on matte gray foam-core platforms that were uniform in size for all the objects in each set, and placed on a small wooden table in front of each participant. The order in which the items were placed on the table was random across different participants. Figure 1 shows the stimulus set for furniture (mueble).
3.1.3. Design and Procedure. Stimulus objects were presented in four randomly ordered question sets. Prior to testing, participants were randomly and evenly assigned to one of the four question sets. Each participant was tested individually on the two tasks.
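The random and even assignment to question sets described above can be sketched as follows. This is only an illustration under stated assumptions: the text does not say how the assignment was carried out, so the shuffle-and-deal approach, the function name `assign_evenly`, the seed, and the choice to balance within a group of 12 participants are all hypothetical.

```python
import random
from collections import Counter

def assign_evenly(participant_ids, n_sets, rng):
    """Randomly assign participants to question sets with balanced set sizes."""
    ids = list(participant_ids)
    rng.shuffle(ids)  # put the participants in a random order
    # Dealing the shuffled participants out in turn keeps the sets the same size
    # (or within one of each other when the group does not divide evenly).
    return {pid: i % n_sets for i, pid in enumerate(ids)}

# Hypothetical group of 12 participants (for example, one age group in one language).
rng = random.Random(2024)  # fixed seed used here only so the sketch is repeatable
assignment = assign_evenly(range(12), 4, rng)
set_sizes = Counter(assignment.values())
print(dict(set_sizes))
```

With 12 participants and four question sets, each set receives three participants; which participants land in which set depends on the seed.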
We presented the Individuation Task before the Plural Formation Task to ensure that we would be accessing the representation of particular nouns. For example, we wanted to access the interpretation of furniture instead of couch for the stimuli shown in Figure 1. We describe the exact procedures of the two tasks below.
3.1.4. The Plural Formation Task. This test was designed to examine the production of plural syntax. For this portion, only two platforms were used: the platform with one whole object and the platform with two whole objects, with the single-item platform presented first. Participants were asked, "What is on this platform?" (¿Qué está en esta plataforma?). If a synonym (or alternate word) was used instead of the target noun, the experimenter would ask whether another word was known for the object. Specifically, the experimenter would say, "Do you know another word for this?" (¿Sabes otra palabra?). If the target noun was not given, the participant would be asked to
Figure 1. A sample stimulus set from Experiment 1.
remember the word used by the experimenter during the Individuation Task. The experimenter would say, "Do you remember what word I used?" (¿Recuerdas la palabra que usé?). If the target noun still was not used, the alternate word was recorded on the response sheet. Most participants did use the target noun to describe the single object on the platform. After they labeled the single object, the platform with two whole objects was presented, and the participant was prompted to label the two objects with, "Here are two. There are two ____" (Aquí hay dos. Hay dos ____). The words used to label the two objects were recorded on the response sheet.
3.1.5. The Individuation Task. Each participant sat at a table and was presented with the 13 sets of objects one at a time. Each set consisted of three platforms: (1) a platform with two objects; (2) a platform with a single object; and (3) a platform with a fragment of one object. The three platforms were set on a table in front of the participant. Then, the experimenter said, "Show me a piece of ____" (Enséñame un pedazo de ____). Participants then pointed to one of the platforms. For the sample stimulus set shown in Figure 1, if furniture (mueble) was represented as an individuated object, asking participants for "a piece of furniture" should lead them to choose the broken couch. If they chose the whole single object, this implies that they do not represent furniture (mueble) as an individuated object. Responses were recorded on a response sheet. No feedback was given with regard to the accuracy of a response. This portion of the test generally lasted 10 to 20 minutes.
3.2. Results
3.2.1. Plural Formation Task. We used the results from adult speakers of both languages in the Plural Formation Task to identify the count and mass nouns in each language. We used an 85% criterion such that nouns for which adults added an s or es 85% of the time or more were considered count nouns, while nouns for which adults added an s or es 15% of the time or less were considered mass nouns. By this criterion, carrot, cup, peanut, and pen were identified as the English count nouns (English-speaking adults added an s or es 100% of the time to these nouns); and bread, celery, chalk, fruit,
furniture, jewelry, paper, rice, and soap as the English mass nouns (English-speaking adults added an s or es less than 15% of the time to these nouns). By the same criteria, the Spanish count nouns were the Spanish equivalents of carrot, cup, pen, celery, chalk, fruit, furniture, jewelry, paper, and soap (Spanish-speaking adults added an s or es 90% of the time or more often to the Spanish equivalents of these nouns); and Spanish translations of peanut, bread, and rice were the Spanish mass nouns (Spanish-speaking adults added an s or es 12% of the time or less to the Spanish equivalents of these nouns). The percentage of times each participant added an s or es to the respective count and mass nouns in each language was then analyzed by a 3-way mixed design ANOVA with Language (English or Spanish) and Age (5-yr-old, 7-yr-old, 9-yr-old, or adult) as between-subjects factors and Noun (count or mass) as a within-subjects factor. The ANOVA yielded a main effect of Noun, F(1, 88) = 211.61, p < .001; a Noun × Language interaction, F(1, 88) = 104.67, p < .001; a Noun × Age interaction, F(3, 91) = 27.48, p < .001; and a Noun × Language × Age interaction, F(3, 88) = 9.51, p < .001. The main effect of Noun indicated that overall participants (of all ages and language groups) honored a distinction between count and mass nouns: an [s] was added 76.97% of the time to count nouns and 31.13% of the time to mass nouns. However, the higher-order interactions involving Noun indicated that the magnitude of the effect varied with language and age. The Noun × Language interaction indicated that the effect of Noun was larger within English speakers than Spanish speakers. Overall, English speakers added an [s] to make a plural of count nouns 94.27% of the time and to make a plural of mass nouns 16.19% of the time (p < .001, simple effects).
In contrast, Spanish speakers added an [s] to make a plural of count nouns 59.67% of the time and to make a plural of mass nouns 46.4% of the time (p < .03, simple effects). Thus, Spanish speakers made a contrast between count and mass nouns but not nearly as sharply as English speakers. Figure 2 shows the mean percentage of times English and Spanish speakers added an [s] to make plurals of mass and count nouns as a function of age. The Age × Noun interaction suggests that the morphological contrast between mass and count nouns reliably differed between 5 and 7 years of age for both language groups, with those 7 years and older making a sharper distinction than the five-year-olds. However, the Noun × Age × Language interaction further qualified this 2-way interaction. For English speakers, there was only a reliable difference between five and nine years of age in the tendency to add an s to make the plural of count nouns (p < .05, Tukey's HSD). There was no reliable difference in the way English speakers treated mass nouns as a function of age. For Spanish speakers, there was a reliable difference in the treatment of both count and mass nouns as a function of age: 5-year-olds reliably differed from participants who were 7 years old and older (p < .01, Tukey's
Figure 2. The mean percentage of times English and Spanish speakers added an [s] to make a plural of the respective count and mass nouns in their languages as a function of age.
HSD); and the adults differed reliably from all the children (p < .01, Tukey's HSD). It appears that between 5 and 7 years of age there is a strong tendency for Spanish speakers to add a plural [s] to both count and mass nouns. After this over-regularization period, around 9 years of age, the use of the plural [s] begins to decrease for mass nouns. In sum, the three main findings from these analyses are that (1) both English and Spanish speakers distinguish between count and mass nouns in their formation of plurals; (2) within speakers of English this distinction is sharper and develops earlier; and (3) there is a tendency for Spanish-speaking children between 5 and 7 years to over-regularize the plural [s], and to use it for more nouns than younger Spanish-speaking children and adults.
3.2.2. Individuation Task. We analyzed the data from the individuation task using the same strategy that we used to analyze the data from the plural formation task, by classifying nouns as count or mass in terms of the adult speakers' judgments in the plural formation task. We then examined whether the count and mass nouns in each language were represented as individuated objects by calculating the percentage of times fragments were chosen in response to "a piece of ____" (un pedazo de ____) for each kind of noun. These percentages
Figure 3. The percentage of times English and Spanish speakers chose the fragment when asked for a piece of count and mass nouns as a function of age.
appear in Figure 3 and were entered into a Language (English or Spanish) × Age (5, 7, 9, or adult) × Noun (count or mass) ANOVA. The analysis yielded reliable main effects of Language (F(1, 88) = 107.4, p < .001) and Noun (F(1, 88) = 29.4, p < .001); reliable two-way interactions between Noun and Age (F(1, 88) = 8.5, p < .001) and Noun and Language (F(1, 88) = 38.11, p < .001); and a reliable three-way interaction among Noun, Language, and Age (F(3, 88) = 38.11, p < .001). The main effect of Language and the Noun × Language interaction reflected the fact that speakers of Spanish almost always chose the object fragment in the task, suggesting that they represented all of the nouns as individuated objects, regardless of how their plurals were made. In contrast, the results from English speakers suggest that count nouns are more likely to be represented as individuated objects than mass nouns, but that the distinction between these nouns emerges as a function of age. We interpret the 2-way interactions between Age and Noun and between Noun and Language in the context of the 3-way interaction among all the factors. We found no difference in how the Spanish speakers treated the count versus the mass nouns at any age. The English-speaking 5-year-olds also did not distinguish count from mass nouns, and tended to interpret all the nouns as individuated regardless of plural formation, like the Spanish speakers. At 7 years of age there was a trend
for the English speakers to represent the count nouns as more likely to be individuated than the mass nouns (p = .086, t-test). The English-speaking 9-year-olds and adults distinguished count from mass nouns in this task (p < .01, t-test). They were more likely to treat the count nouns as individuated objects than the mass nouns. In sum, there was a tendency for speakers of both languages in this task to treat these nouns as referring to individuated objects. However, this tendency was greater for Spanish than for English speakers, and within English speakers the tendency decreased with development. In addition to the analyses of group data from each task above, we also examined the performance within individuals across the two tasks. For each set of nouns (count or mass as judged by adult plural formation in each language) we counted the number of times participants gave consistent responses across the two tasks. Participants could be consistent in the following four ways: (1) by choosing a fragment in response to the nouns in the Individuation Task and adding an [s] in the Plural Formation Task to those same nouns; (2) by choosing a fragment in the Individuation Task and consistently not adding an [s] in the Plural Formation Task to the nouns; (3) by choosing a whole object in the Individuation Task and adding an [s] in the Plural Formation Task; or (4) by choosing a whole object in the Individuation Task and not adding an [s] in the Plural Formation Task. Choices were classified as consistent if they were made 60% of the time or more in both tasks. The maximum total possible number of either consistent or inconsistent choices within each language and age group was 24. Table 3 shows the number of consistent choices as a function of age for each group.
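The four consistency patterns and the 60% criterion can be sketched as below, together with the proportion labelled Pr +[s] in Table 3. This is one possible reading rather than the scoring procedure actually used: it assumes that the criterion applies to a participant's response rates over one noun set, and the function names `consistency_pattern` and `pr_plus_s` are hypothetical. The only data used are the English totals printed in Table 3 (40 and 13 for fragment choices, 6 and 17 for whole-object choices).

```python
def consistency_pattern(fragment_rate, plural_s_rate, threshold=0.60):
    """Return 1-4 for the four consistent patterns, or None if inconsistent.

    fragment_rate: proportion of the noun set for which the fragment was chosen.
    plural_s_rate: proportion of the same nouns pluralized with [s].
    """
    chose_fragment = fragment_rate >= threshold
    chose_whole = (1 - fragment_rate) >= threshold
    added_s = plural_s_rate >= threshold
    no_s = (1 - plural_s_rate) >= threshold
    if chose_fragment and added_s:
        return 1  # fragment chosen, [s] added
    if chose_fragment and no_s:
        return 2  # fragment chosen, [s] not added
    if chose_whole and added_s:
        return 3  # whole object chosen, [s] added
    if chose_whole and no_s:
        return 4  # whole object chosen, [s] not added
    return None   # neither response dominates in at least one task


def pr_plus_s(plus_s_count, minus_s_count):
    """Proportion of consistent choices that involved adding [s]."""
    total = plus_s_count + minus_s_count
    if total == 0:
        return None  # undefined when there are no choices of this kind
    return plus_s_count / total


# English totals as printed in Table 3.
english_fragment_pr = pr_plus_s(40, 13)
english_whole_pr = pr_plus_s(6, 17)
print(round(english_fragment_pr, 4), round(english_whole_pr, 4))
```

Under this reading, the two English proportions agree with the totals reported in Table 3 (.754 and .261) to within about .001, and a cell with no choices at all has no defined proportion.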
So, for example, the English-speaking adults pointed consistently to the fragment in the Individuation Task and added an [s] when making a plural of these same nouns 12 times. They consistently pointed to the whole object and did not add an [s] to nouns 8 times. So their choices were consistent 20 out of 24 times. The Spanish-speaking adults were also consistent. They consistently added an [s] to make the plural for 13 of the nouns for which they selected the fragment. However, they also consistently did not add an [s] to 11 of the nouns for which they also chose the fragment. There were fewer consistent fragment choices within the Spanish speakers when they did not add an [s] to make plurals of the nouns, offering some evidence of a stronger link between the plural [s] and individuation than between the absence of the plural [s] and individuation (36 versus 53 choices) in Spanish, but these differences were not statistically reliable. From these numbers, one can also calculate the probability of adding an [s] to a noun whose referent loses its identity as a whole object when it is subdivided (i.e., adding a plural [s] to nouns that refer to individuated objects), and not adding an [s] to a noun that retains its identity as a whole object when its referent is subdivided (i.e., not adding a plural [s] to nouns that refer to aggregates). When English speakers pointed to fragments in response to particular
Table 3. The number of consistent choices made by English and Spanish speakers across the Individuation and Plural Formation Tasks of Experiment 1.

FRAGMENT CHOICES
               English                     Spanish
               +[s]   -[s]   Pr +[s]       +[s]   -[s]   Pr +[s]
5-yr-olds        7      3     .700          11      8     .570
7-yr-olds        9      6     .600          14      9     .600
9-yr-olds       12      4     .750          15      8     .652
Adults          12      0    1.00           13     11     .542
Total           40     13     .754          53     36     .596

WHOLE OBJECT CHOICES
               English                     Spanish
               +[s]   -[s]   Pr +[s]       +[s]   -[s]   Pr +[s]
5-yr-olds        3      2     .600           0      0
7-yr-olds        3      5     .375           0      0
9-yr-olds        0      2     .0             0      0
Adults           0      8     .0             0      0
Total            6     17     .261           0      0
nouns in the Individuation Task, the probability that they would add an [s] to the same nouns in the Plural Formation Task was .754. However, if they pointed to a whole object, the probability of adding an [s] was .261. Thus, they tended to add a plural [s] to the nouns that referred to individuated objects, and to not add a plural [s] to the nouns that referred to aggregates. At 5 years of age they did not seem to be distinguishing between the referents of nouns by plural morphology as strongly. When English-speaking 5-year-olds chose fragments, the probability of their adding an [s] to the nouns was .700, but when they chose the whole object their probability of adding an [s] was almost as high, at .600. In contrast, the unique link between plural [s] and individuation among Spanish speakers was weak in all age groups, ranging from .542 to .600. However, because these numbers are small and the Spanish speakers never selected the whole objects, none of these differences could be statistically confirmed. Taken together, the results from these analyses suggest the following regarding the developing relation between plural formation and individuation in English and Spanish. At five years of age, children in both language groups make a distinction between count and mass nouns when forming plurals that refer to aggregates. However, the distinction is stronger among English-speaking children. At the same age, speakers of both languages show the tendency to interpret the referents of both types of nouns as individuated objects. However, this
tendency is stronger among Spanish speakers. As the plural distinction between count and mass nouns becomes stronger among English speakers between 5 and 7 years of age, they begin to interpret mass nouns as not (always) referring to individuated objects. For Spanish speakers, the distinction between count and mass nouns in plural formation becomes weaker, and by the time they make a strong distinction between count and mass nouns in plural formation as adults, this morphological distinction does not lead them to interpret mass nouns as referring to anything but an individuated object. It should be noted that all of the stimuli that we used were aggregates, and the same pattern of results may not hold for other stimulus items. It should also be noted that our results might have been affected by the fact that the adults in this experiment were fluent in English. Bilingual Spanish-English speakers are probably more similar to monolingual English speakers than Spanish monolinguals would be. Thus, our results might underestimate the differences between monolingual English- and Spanish-speaking adults.
3.3. Discussion
The relation that holds in English between plural formation and individuation does not appear to hold in Spanish. Our results suggest that at least some Spanish nouns that refer to aggregates are represented as individuated objects irrespective of how their plurals are made. These results may explain why Gathercole (1997) found that Spanish-English bilingual children do not make use of the English count-mass distinction when establishing reference in English. The Spanish version of the distinction does not seem to have the same referential force, and Spanish-speaking children may be over-generalizing the relation between plural formation and individuation in Spanish to their knowledge of English. Our finding of a relatively later age of acquisition of count-mass syntax within monolingual Spanish speakers is also consistent with Gathercole's (1997) developmental finding that it is not until 9 years of age that Spanish-English bilinguals make any distinction between English count and mass nouns at all. More importantly, our findings suggest that speakers of English and Spanish may have different linguistic tools for making inferences about new concepts. If someone says that they bought soap (jabón), a speaker of English might infer that a powder or liquid was purchased whereas a speaker of Spanish would likely infer that a bar was purchased. However, another possible explanation for our results from this experiment may be differences between the two languages in the word "piece" (pedazo). Dictionaries define both "piece" and "pedazo" as having two meanings. The Oxford American Dictionary of Current English (2002) gives the following two meanings for "piece": (1) one of the distinct portions forming part of or broken off from a larger object; and (2) each of the parts of which a set or category is composed. Its Spanish
counterpart, El Diccionario Manual de la Lengua Española (2007), also offers two definitions for "pedazo": (1) "Parte separada de una cosa que se ha partido o roto" [a part of something that is separate and has cracked or broken]; and (2) "Parte de un todo o unidad, que se considera de manera independiente" [part of a whole or a unit that is considered in an independent manner]. The Spanish dictionary also defines the common expression "un pedazo de pan", which literally translates to "a piece of bread", as referring to a very generous person, so it seems possible to use "pedazo" in Spanish to refer to whole objects. Of course, it is not clear from these dictionary definitions how frequently each meaning is used in each language. The second meaning may be more frequent in English than in Spanish. Another factor that could be viewed as weakening these results is that we used a relatively small set of nouns, and the differences between English and Spanish may only apply to the particular nouns that were used. Consequently, we attempted to replicate the difference between English and Spanish speakers in Experiment 2 with a different task using novel nouns.
4. Experiment 2
In Experiment 1, we found that Spanish speakers were more likely than English speakers to treat the referents of mass nouns as individuated objects. Because nouns that refer to solid entities are either count or mass, Spanish speakers should be more likely than English speakers to interpret the referent of any noun that refers to a solid entity as an individuated object instead of an aggregate. To test this hypothesis, we presented adult native speakers of each language with novel nouns in neutral syntax, and asked them to pick whether they thought the noun referred to a single individuated object or to a collection of homogeneous constituents of the same kind (i.e., an aggregate).
Based on our results from Experiment 1, we predicted that the Spanish speakers would be more likely than English speakers to interpret the nouns as referring to individuated objects.
4.1. Method
4.1.1. Participants. Forty-six adult native English speakers who were not fluent in any other language (31 women and 15 men) and forty-six adult native Spanish speakers who were not fluent in any other language (13 women and 33 men) participated in the study. Each participant was paid $7. The English speakers were either participants in summer research programs at a midwestern university or were recruited after responding to flyers posted around the campus. The Spanish speakers were students in an introductory ESL class at CLUES (Chicanos y Latinos Unidos en Servicio) who had recently immigrated to the United States from Mexico.
Figure 4. A photograph of a sample stimulus set from Experiment 2.

Table 4. Descriptions of the stimuli used in Experiment 2.

Item   Form 1   Form 2   Object Description
1.     tiso     l        black, white, and red modeling foam shaped into three circles stacked on top of each other (aggregate: full circle; individuated: quarter circle)
2.     doff     musa     green pipe cleaner formed into pacman shape (aggregate: string of three shapes; individuated: single shape)
3.     peeva    tiso     blue pipe cleaner formed into square u-shape (aggregate: string of five shapes; individuated: single shape)
4.     cupa     peeva    silver wick holders used for candle-making (aggregate: five holders; individuated: single holder)
5.     brima    cupa     large bubble wrap, brightly painted (aggregate: nine bubbles; individuated: single bubble)
6.     mlo      brima    soup beans of various sizes, brightly painted (aggregate: collection of beans; individuated: one bean)
7.     syba     cuta     green, plastic aquarium plant (aggregate: whole plant; individuated: single leaf)
8.     musa     mlo      brightly colored, plastic, foot-shaped door stoppers (aggregate: two intertwined stoppers; individuated: single stopper)
9.     cuta     doff     small, brightly colored fuzzy balls (aggregate: round string of balls; individuated: single ball)
10.    l        syba     red and blue modeling foam shaped into two circles stacked on top of each other (aggregate: full circle; individuated: half circle)
4.1.2. Materials. Ten novel stimuli were constructed and mounted on cardboard; each item was created to have both an individuated and an aggregate form (see Table 4 for descriptions of the stimuli; Figure 4 shows a sample stimulus set). Each stimulus was randomly assigned to one of ten novel words that were constructed to sound plausible in both Spanish and English: l, musa, tiso, peeva, cupa, brima, cuta, mlo, doff, or syba. This process was repeated so that each item was assigned two different novel labels; during testing, the label used for each stimulus was alternated between subjects while the order of presentation remained the same.
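The labelling scheme described above (two random label assignments per item, alternated between subjects) can be sketched as follows. This is an illustration under assumptions, not the procedure actually used: the text does not say how the second assignment was kept different from the first, so re-drawing until no item repeats its label is an assumption, and the function names and the placeholder labels `word0` to `word9` are hypothetical stand-ins for the novel words.

```python
import random

def make_label_forms(labels, rng):
    """Draw two random item-to-label assignments that differ for every item."""
    form1 = list(labels)
    rng.shuffle(form1)
    while True:
        form2 = list(labels)
        rng.shuffle(form2)
        # Assumption: keep re-drawing until no item has the same label in both forms.
        if all(a != b for a, b in zip(form1, form2)):
            return form1, form2

def label_for(item_index, subject_index, form1, form2):
    """Alternate the form between successive subjects; item order stays fixed."""
    forms = (form1, form2)
    return forms[subject_index % 2][item_index]

# Hypothetical placeholder labels standing in for the ten novel words.
labels = [f"word{i}" for i in range(10)]
rng = random.Random(7)  # fixed seed used here only so the sketch is repeatable
form1, form2 = make_label_forms(labels, rng)
for subject in range(2):
    print(subject, [label_for(item, subject, form1, form2) for item in range(10)])
```

Even-numbered subjects hear the first set of labels and odd-numbered subjects the second, while the items are always presented in the same order.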
4.1.3. Design and procedure. Participants were tested individually. Each was seated across the table from the experimenter and given instructions in her or his native language. All participants were told that the experimenter would show them two objects and say a word, and were asked to point to the object that they thought the word referred to. They were then presented with the individuated and aggregate forms of the first stimulus, and the English speakers were asked, "Which is my ____ (tiso)?" while the Spanish speakers were asked, "¿Cuál es mi ____ (tiso)?" After they responded, the participants were similarly presented with the nine remaining stimuli.
4.2. Results
We compared the mean number of times out of the ten trials that English speakers chose the individuated form over the aggregate form (M = 4.783) to the mean number of times out of the ten trials that Spanish speakers chose the individuated form over the aggregate form (M = 5.478). This difference was statistically reliable (t = 1.997, one-tailed, p < .05). So on average, the Spanish speakers selected the individuated form more frequently than did the English speakers. The data from individual participants and individual items are also consistent with this finding. Twelve of the Spanish participants picked the individuated items 70% of the time or more, compared to seven English participants. The responses across the different stimulus items show a similar pattern. For 7 of the 10 items (items 2, 3, 4, 5, 6, 7, and 10), Spanish speakers chose the individuated form more often than the English speakers. For one item (item 8) both groups picked the individuated form an equal number of times. For 2 of the 10 items (items 1 and 9) English speakers chose the individuated form more often than the Spanish speakers.
4.3. Discussion
The findings from this experiment offer converging evidence for the results from Experiment 1. When presented with a novel word in neutral syntax, Spanish speakers view its referent as an individuated object more often than English speakers do. The different kinds of stimuli used as well as the pattern of responses from individual participants suggest that this difference is most likely due to a linguistic difference between the two groups rather than to the effect of some particular characteristic of the stimuli. Additionally, the fact that we used ten different stimulus sets makes it unlikely that a specific quality of one of the items (such as its similarity to an object that has different count/mass status or representation in Spanish and English) caused the observed difference between the frequencies of individuated-object choices made by speakers of the two languages. The results of this study support the hypothesis that speakers
of English and Spanish interpret nouns in different ways; namely, that Spanish speakers are more likely than English speakers to interpret both count and mass nouns as referring to individuated objects.

5. General discussion

We began this study by asking whether speakers of Spanish, like speakers of English, make a distinction between count and mass nouns in their formation of plurals, and whether this distinction corresponds to differences in how the nouns' referents are construed. Our results are clear. Adult Spanish speakers, like English speakers, distinguish between Spanish count and mass nouns in their formation of plurals. However, this morphological distinction is not as strong as it is in English, and perhaps more importantly, it does not seem to correspond to a distinction between individuated objects and aggregates as it does in English. In this discussion, we consider three possible reasons for the lack of correspondence between count-mass morphology and count-mass semantics in Spanish.

The most likely reason why Spanish speakers are more likely than English speakers to construe a noun as an individuated entity (regardless of how its plural is made) is the proportion and kind of nouns that move into count noun constructions in Spanish. Our finding that Spanish speakers at every age group make a weaker distinction between count and mass nouns in their use of the plural [s] in our sentence completion task and in Experiment 1 is consistent with previous reports that Spanish mass nouns move more easily into count noun constructions than English mass nouns (Stockwell et al., 1977). Yet this fact alone does not explain why nouns in Spanish tend to be construed as individuated objects as opposed to non-individuated ones. We believe that in addition to the acquisition of language-specific count and mass nouns, cognitive development also plays an important role.
Young English-speaking children's vocabularies are dominated by count nouns (Samuelson and Smith, 1999), and there is no reason to believe that the vocabularies of young Spanish speakers would be different in this regard. Because most of the Spanish nouns that do not occur in count noun constructions refer to abstract entities (e.g., justicia 'justice' and paciencia 'patience'), nouns that are typically learned by older children, early Spanish vocabularies may be completely dominated by nouns that appear exclusively in count noun constructions. This dominance of count nouns, especially in early Spanish vocabularies, may lead Spanish speakers to overgeneralize the semantics of count nouns to all nouns. While our findings suggest that young English speakers (like Spanish speakers of all ages) overgeneralize the interpretation of count nouns as individuated objects to mass nouns, unlearning this overgeneralization seems to be considerably more difficult for Spanish speakers.
In other words, our findings suggest that by 5 years of age, both English- and Spanish-speaking children interpret the referents of both count and mass nouns as individuated objects. Between 5 and 7 years of age, English-speaking children likely acquire more nouns than Spanish speakers that appear exclusively in mass noun constructions and begin to construe the referents of these nouns as non-individuated. For Spanish speakers, the acquisition of the nouns that exclusively occur in mass noun constructions likely begins later, perhaps around 9 years of age. However, by this age, the classes of individuated and non-individuated objects are well established on the basis of other semantic features (see e.g., Keil, 1979), and are thus much more difficult to reorganize.

A second reason for what seems to be a less semantically powerful class of mass nouns that refer to aggregates in Spanish may be the kind of morphological information needed to establish different classes of nouns in the first place. Another potentially important difference between English and Spanish mass-count constructions that has received little attention may involve the role of determiners. As noted by Stockwell et al. (1977), the English indefinite determiner, a, only appears with English count nouns. So mass nouns do not always require determiners in English. For example, in English, one can say 'Sugar is expensive'. However, in Spanish a determiner may be required in such cases. For example, *Azúcar es cara ('Sugar is expensive') is not acceptable in Spanish. One must say La azúcar es cara, literally meaning 'The sugar is expensive'. So a Spanish determiner may be required in certain mass noun constructions that do not require one in English, and this use of determiners in Spanish may be another feature that leads to the treatment of nouns in Spanish as count nouns are treated in English.
The main point of this example is that more morphological distinctions in addition to plural morphology may be necessary in order to form distinct noun classes in the first place. We know of no research that specifies how much contrasting morphological information must exist in order to form different noun classes, nor on the relative strength of inflexions versus determiners in establishing word classes.

The third possible explanation of these findings is that, as in all experimental work, our findings might be limited by the particular items and tasks that we used. For example, all of the entities we studied were aggregates that did not vary systematically. Different results might be found for entities across other areas of the continuum between solid objects and substances. Another possibility is that our 'pieces' task in Experiment 1 may not have measured the relevant knowledge in Spanish speakers. However, we offer converging evidence using a different task in Experiment 2. Because the results from two very different tasks point to the same conclusion (that Spanish speakers are more likely to interpret nouns as individuated objects than English speakers), we believe that our results are valid.
In sum, there seem to be many differences between languages in the ways they divide nouns into count and mass classes. Much recent work in cognition has focused on differences between speakers of such languages in categorization and word-learning (e.g., Lucy, 1992; Imai and Gentner, 1997; Mueller-Gathercole, 1997). Past work, however, fails to provide evidence on the underlying linguistic differences under study and the mechanisms by which such language effects are supposed to work. The research reported in this paper specifically focused on plural formation in English and Spanish speakers, and the relation between plural morphology and the interpretation of a noun as an individuated object. We found that Spanish speakers are more likely than English speakers to represent the meanings of nouns that refer to aggregates as individuated objects regardless of how their plurals are made. More generally, our research begins to offer evidence on the morphological processes that are associated with differences across languages in the interpretation of nouns.

Received 17 September 2007
Revision received 15 October 2009

References
Berko, J. 1958. The child's learning of English morphology. Word, 14, 150-177.
Brown, R. 1957. Linguistic determinism and the parts of speech. Journal of Abnormal and Social Psychology, 55, 1-5.
Colunga, E., Gasser, M. and Smith, L. B. 2002. Attention to different cues in noun learning: The effect of English vs. Spanish count/mass syntax. Proceedings of the 26th Annual Boston University Conference on Language Development, pp. 107-117.
El Diccionario Manual de la Lengua Española. 2007. French and European Publications Incorporated.
Imai, M. and Gentner, D. 1997. A crosslinguistic study of early word-learning: Universal ontology and linguistic influence. Cognition, 62(2), 169-200.
Keil, F. 1979. Semantic and Conceptual Development: An Ontological Perspective. Cambridge, MA: Harvard University Press.
Kuo, J. Y. and Sera, M. D. 2009. Classifier effects on human categorization: The role of shape classifiers in Mandarin Chinese. Journal of East Asian Linguistics, 8, 1-19.
Langacker, R. W. 1987. Foundations of Cognitive Grammar: Theoretical Perspectives. Stanford, CA: Stanford University Press.
Li, C. N. and Thompson, S. A. 1981. Mandarin Chinese: A Functional Reference Manual. Berkeley: University of California Press.
Lucy, J. 1992. Grammatical Categories and Cognition. Cambridge, UK: Cambridge University Press.
MacNamara, J. 1982. Names for Things: A Study of Human Learning. MIT Press.
Middleton, E. L., Wisniewski, E. J., Trindel, K. A., and Imai, M. 2004. Separating the chaff from the oats: Evidence for a conceptual distinction between count noun and mass noun aggregates. Journal of Memory and Language, 50, 371-394.
Mueller-Gathercole, V. 1997. The linguistic count/mass distinction as an indicator of referent categorization in monolingual and bilingual children. Child Development, 68(5), 832-842.
University of Minnesota
Quine, W. V. O. 1969. Ontological Relativity and Other Essays. New York, NY: Columbia University Press.
Samuelson, L. and Smith, L. B. 1999. Early noun vocabularies: Do ontology, category structure and syntax correspond? Cognition, 73, 1-33.
Soja, N., Carey, S., and Spelke, E. 1991. Ontological categories guide young children's inductions of word meanings: Object terms and substance terms. Cognition, 38, 179-211.
Stockwell, R. P., Bowen, J. D., and Martin, J. W. 1977. The Grammatical Structures of English and Spanish. University of Chicago Press.
The Oxford American Dictionary of Current English. 2002 edition. Oxford University Press.
Fields and settings: French il and ça impersonals in copular complement constructions

MICHEL ACHARD*
Abstract

This paper argues that in the context of the copular complement construction (est possible que 'is possible that', for example), French possesses two impersonal constructions respectively introduced by il 'it' and ça (c') 'this'. This analysis runs counter to most syntactic accounts, which structurally distinguish impersonals (il) from dislocated (ça) constructions. Two arguments are proposed in defense of the two-impersonals analysis. First, following the Cognitive Linguistics tradition, it shows that il should not be considered a meaningless dummy but a referential (albeit general) expression. Secondly, a comparison with ceci 'this', a pronoun with an unquestionable cataphoric sense, reveals that ça cannot be considered a cataphoric pronoun, and that its meaning within the context of the copular complement construction is very close to il's. Consequently, the syntactic competition between these two pronouns reflects their conceptual overlap, and their distribution in discourse is motivated by their semantic differences. Beyond its relevance to the understanding of the il/ça distribution, the analysis presented in this paper not only argues in favor of a broader account of impersonals than syntactic accounts generally advocate, but also provides a way of constraining the kinds of constructions which receive the impersonal label.

Keywords: Impersonal constructions; Cognitive Grammar; Demonstrative pronouns; Setting constructions
* Address for correspondence: Dept. of Linguistics, Rice University, MS23, 6100 Main Street, Houston, USA 77005-1892. Email: achard@rice.edu
Cognitive Linguistics 21-3 (2010), 443-500. DOI 10.1515/COGL.2010.016. 0936-5907/10/0021-0443. Walter de Gruyter
1. Introduction

In the French copular complement constructions illustrated in (1) and (2), the pronouns il 'it' and ça (c'/ce/cela) 'this/that' could easily be substituted for each other without any noticeable semantic distinction:1

(1) Pour la première fois je sentais qu'il était possible que ma mère vécût sans moi (Proust, M. À l'ombre des jeunes filles en fleurs: 648)
'For the first time I felt it was possible for my mother to live without me'

(2) bien sûr que la journée ne se passera pas sans pluie. Ce n'était pas possible que ça reste comme ça, il faisait trop chaud (Proust, M. Du côté de chez Swann: 101)
'Of course the day will not finish without rain. It [this] was not possible the weather would stay that way, it was too hot'

Despite their quasi-interchangeability, il and ce/ça are most frequently given different treatments in both traditional grammars and more recent syntactic analyses. In those accounts, il is viewed as semantically empty, a "dummy" or "pleonastic" element (Chomsky 1981: 26): "Intuitively, we know that il (. . .) does not refer to anything; it does not represent an argument of the verb which can be characterized in terms of a theta-role such as Agent, Experiencer, etc. Rather, its function appears to be purely syntactic, satisfying a requirement that all finite clauses must have a subject" (Jones 1996: 120). By contrast, the neuter pronouns ce and ça are treated as full-fledged demonstratives: "referential expressions which refer forward to the finite or infinitival clause" (Jones 1996: 128). The construction illustrated in (1) can therefore be called impersonal, while the one illustrated in (2) cannot.2 Jones clearly expresses the distinction between the two structures: "Impersonal sentences are constructions in which the subject position is occupied by a dummy pronoun il, which does not refer to anything" (Jones 1996: 120), whereas: "It is postulated that the constructions with ce or ça are not impersonal constructions, but dislocated constructions analogous to elle est arrivée, Marie 'she has arrived, Mary'" (Jones 1996: 128).3

This paper argues that the structural distinction between impersonal and dislocated constructions when il and ça are (largely) interchangeable is unwarranted, and that consequently, in the context illustrated in (1) and (2), both il and ça constructions should be considered impersonals. Two arguments are presented in defense of this position. The first one follows the tradition established both in French grammar (Wartburg and Zumthor 1958; Sechehaye 1950; Galichet 1947 inter alia) and more generally in functional/cognitive linguistics (Bolinger 1973, 1977; Moignet 1974; Kirsner 1979; Lakoff 1987; Langacker 2004, 2009; Smith 1985, 2006) in showing that impersonal il is not a dummy pronoun, but a referential (albeit general) expression. Secondly, a comparison of ça with the unquestionably cataphoric ceci 'this' reveals that these demonstratives are not cataphoric pronouns in the strict sense, but play a role very similar to il with respect to the copular complement construction.

1. The four demonstrative forms ce, c', ça, and cela will be considered together for the purposes of this paper. This position is justified, first, by the diachronic relation which exists between the forms. Cela (along with the less frequently used ceci) is a compound form of ce, and ça is a reduced form of cela (Brunot 1936, Wilmet 1997). Secondly, all four forms are attested in the constructions under investigation, even though their distribution is quite specific (Olsson 1986). Ce, along with ça and cela, only occurs preceding a consonant-initial form of the être 'be' copula (ce/ça/cela serait bien de les revoir 'it would be nice to see them again'). By contrast, c' alone is possible with a vowel-initial form of the copula (c'/*ça/*ce/*cela est bien de les revoir 'it is nice to see them again'). With all other verbs, ça and cela are often claimed to indicate register differences, ça being more colloquial. The four demonstratives therefore exhibit a common semantic core which justifies their being treated together. Because the investigation of the more fine-grained distinctions between them would necessitate the examination of sociolinguistic considerations which lie well beyond the scope of this paper, I will leave the matter for further research. For an analysis of French demonstratives which distinguishes ce and ça, see Moignet (1974).

2. French impersonals are commonly divided in two categories whose names vary extensively throughout the literature. The intrinsic impersonals (Jones 1996: 66) only occur preceded by il in the third person masculine singular. Derived impersonals alternate with personal counterparts (il est arrivé une dizaine d'enfants / une dizaine d'enfants sont arrivés 'approximately ten children arrived'). Impersonal verbs are analyzed as lacking an external argument, their theta-roles being assigned to their complements. For the intrinsic impersonals, the verb's lexical entry has an empty slot in subject position, to be filled at a later stage by the dummy il. For the derived impersonals, the dummy pronoun is inserted as a result of a movement of the real subject (jouer du piano lui plaît 'playing the piano pleases him' / il lui plaît de jouer du piano 'it pleases him to play the piano'). Jones's analysis is not representative of the state of the art in syntactic theories, but although the most current accounts reflect the increasing sophistication of the linguistic models, they have remained consistent in their basic assumption about the impersonal pronoun's dummy status (Radford 2004: 291 ff; Gledhill 2003: 131; Rowlett 2007: 133). Jones's analysis is selected as representative of the syntactic position because of its clarity, as well as its lack of overly technical language.

3. In addition to the semantic difference between il and ça, another argument in favor of their structural distinction is based on the two pronouns' distribution in other constructions. More specifically, since ça cannot replace il in the contexts illustrated in (i) and (ii), the two forms must be structurally distinct.
(i) #Il/*ça faut revenir nous voir. 'You must come back and see us'
(ii) #Il/*ça est arrivé deux enfants 'There [it] arrived two children'
This argument will not be considered in detail, since, given the pronouns' well-attested polysemy, it doesn't seem particularly problematic to call ça an impersonal in a specific construction, even if it is not interchangeable with il in other contexts.
The divergence of opinions with respect to the analysis of the ça construction in (2) illustrates the difficulty of properly characterizing impersonals not only in French but cross-linguistically. To date, syntactic accounts have tended to interpret the concept narrowly by categorizing a given construction as impersonal based solely on the nature of the subject of its predicate: "Impersonal constructions are seen to either lack a grammatical subject altogether or alternatively feature only a pleonastic (semantically empty) subject, be it an overt one or potentially a covert one" (Siewierska 2008: 4). Conversely, more functionally inspired analyses have taken a broader stance whereby any construction where the situation depicted is brought about by an unspecified or defocused agent can be considered a potential impersonal. Both types of analyses have encountered their share of difficulties. For instance, the remainder of this paper shows that syntactic accounts are incapable of capturing the commonalities between il and ça in examples such as (1) and (2). On the other hand, functional solutions have found it difficult to precisely define the range of impersonal constructions. For example, Siewierska (2008: note 3) notes that if all impersonals lack a definite human agent as subject, and may all be seen as a means of agent backgrounding or defocusing (see also Sansò 2005; Słoń 2007), these characteristics cannot adequately distinguish them from other constructions such as anticausatives or unaccusatives with similar functions.

Beyond the distribution of il and ça, the analysis presented in this paper lays the foundations for a more general treatment of impersonals which addresses i) the overly restrictive nature of syntactic accounts, and ii) the difficulties in deciding which constructions should be called impersonals.
With respect to i), it illustrates the need to broaden the scope of impersonals beyond the dummy subject constructions by challenging the justification for keeping il and ça structurally distinct. With respect to ii), it adopts as a working hypothesis the position that impersonal constructions are characterized by two essential properties. First, the agent of the profiled process must be backgrounded or defocused, and secondly, that process must be general enough to be available to a generalized conceptualizer, namely anyone in a position to experience it (Langacker 2009: 115). These two conditions are shown to restrict the scope of ça impersonals to the strict confines of the copular construction.4 It should be noted, however, that outlining a precise inventory of French impersonals is well beyond the scope of this paper. In fact, the analysis presented here shows that the very existence of an impersonal category should not be taken for granted, but carefully demonstrated, and that each potential candidate needs to be thoroughly examined before a final decision is made. This paper begins the investigation with the distribution of il and ça.

It is organized in the following fashion. Section 2 briefly reviews the Cognitive Grammar treatment of impersonal constructions. Section 3 provides an analysis of the il copular complement constructions. Section 4 explores ça's meaning, and shows that in the context of the copular complement construction, it fits our working definition of an impersonal. Section 5 shows that the semantic difference between il and ça directly accounts for the distribution of the two pronouns when they are mutually incompatible as well as when they overlap semantically. Section 6 concludes the paper and briefly introduces possible avenues to extend the methodology adopted for demonstratives to other potential impersonal constructions. The data on which the analysis is based come from two sources, namely the FRANTEXT database of 760 20th century French texts, and a corpus of journalistic prose composed of approximately 20.5 million words from the Agence France-Presse (AFP) news agency (1994-1995). The few manufactured examples are preceded by the # sign.

4. A similar claim could be made for weather verbs, which can also occur with il and ça. These constructions will not be considered in this paper.

2. The Cognitive Grammar view of impersonals
In Cognitive Grammar (henceforth CG, Langacker 1987, 1991, 2008), impersonal constructions exhibit some departure from the coding of a prototypical transitive clause. Their specificity can therefore only be understood if this prototypical coding is presented first.

2.1. Toward prototypical clause structure: Conceptual models
In the CG view of language and cognition, all language forms are ultimately grounded in human experience. Consequently, despite its complexity, the base relative to which clause structure is best described consists of a relatively small number of very general models and archetypes (Langacker 2008: Ch. 11) which represent specific aspects of our conceptual organization, and can be exploited for linguistic purposes. The first one of these archetypes pertains to the organization of a scene into "a global setting and any number of smaller, more mobile participants" (Langacker 2008: 355, emphasis in the original) which interact with one another, and may occupy different locations at different times. The notion of interaction between the participants makes necessary reference to the billiard-ball model, which represents our conception of "objects moving through space and impacting one another through forceful physical contact" (Langacker 2008: 355), as well as our knowledge that some objects possess the inner resources to provide the necessary energy, while others merely transmit or absorb it. Based on this model, the archetypal conception of an action chain represents "a series of forceful interactions, each involving the transmission of energy . . . from one participant to the next" (Langacker 2008: 355-356).
In addition, the roles of the participants in the described event are understood relative to different kinds of conceptual archetypal roles which can be exploited for linguistic purposes. To provide just two examples, an agent is "an individual who willfully initiates and carries out an action, typically a physical action affecting other entities. It is thus an 'energy source' and the initial participant in an action chain" (Langacker 2008: 356). By contrast, the opposite role of patient is defined as "something that undergoes an internal change of state . . . Typically inanimate and nonvolitional, a patient usually changes as the result of being affected by outside forces. It is then an 'energy sink' and the final participant in an action chain" (Langacker 2008: 356, emphasis in the original). The other roles will be considered as needed in the course of the analysis.

Another archetype required for the characterization of clausal structure pertains to the manner in which we perceive the different facets of the world around us. The analogy here is that the conceptualization of the scenes we apprehend for the purpose of linguistic expression proceeds in a manner analogous to the special case of spectators watching a play. This stage model captures how we may focus our attention on different aspects of a scene to enhance the perceptual experience. Langacker (2008: 356) expresses the analogy as follows: "the maximal field of view, the onstage region, and the focus of attention correspond respectively to an expression's maximal scope, immediate scope, and profile." Closely related is a group of archetypes which pertain to the speech event itself, and involve models of speaking, listening, and engaging in social interaction, as well as different models of viewing arrangement which pertain to the perceptual asymmetry that exists between the subject of perception (conception) and the perceived (conceptualized) object.
Specific configurations will be introduced at different points of the analysis, but the default viewing arrangement consists of two interlocutors being together in a fixed location, using a shared language to describe occurrences in the world around them (Langacker 2008: 357).

The canonical event model represents a way of integrating these different interconnected archetypes into what is arguably the most typical kind of occurrence (Langacker 2008: 357). More specifically, this occurrence is identified as "a bounded, forceful event in which an agent (AG) acts on a patient (PAT) to induce a change of state. This event is the focus of attention within the immediate scope or onstage region (IS), being apprehended from offstage by a viewer (V) not otherwise involved in it. All of this unfolds within some global setting" (Langacker 2008: 357). The canonical event model is illustrated in Figure 1, where MS stands for maximal scope of predication.

Figure 1. Canonical Event Model (from Langacker 2008: 357)

Finally, in order to describe a large number of perceptual, physical, social, or mental control events, Langacker (2002, 2004) described the control cycle. At any given moment (the baseline), an actor (A) has control over a certain number of entities which collectively constitute his dominion (D). In the next phase, a target (T) enters his field of potential interaction (F), thus creating a state of tension which needs to be resolved. One way of resolving this tension consists in bringing the target to the actor's dominion by exerting force onto it. The outcome of this action phase is a modified static dominion which incorporates the newly acquired element. The different phases of the model are illustrated in Figure 2.

Figure 2. The control cycle (from Langacker 2004: 536)

Although different manifestations of this cycle perpetually unfold in the different domains of human experience, for the purposes of this paper, we will pay particular attention to epistemic and social control. Epistemic control pertains to the acquisition of propositional knowledge (Langacker 2009: 131). The actor is a conceptualizer, the target a proposition which represents a facet of the world around her, and the dominion is her view of reality, composed of the propositions she holds true (Achard 1998, 2002). Social control pertains to the manipulation of other individuals' behavior, according to a set of expectations and obligations.

Our conceptualizations need to be linguistically coded for the purpose of expression, and because grammar is composed of conventionalized composite units, certain types of clauses are particularly well suited to express certain types of events. Among all the possible types, the transitive clause, which codes the interaction between an agent selected as the trajector of the profiled relation (the most focal figure) and a patient coded as its landmark (the second most focal figure), is perhaps the most common, because it allows the participants with the highest degree of cognitive salience (the agent and patient) to also be treated as the two most focal figures in the linguistic representation of the conceptualized event. This alignment between focal prominence and semantic roles is illustrated in (3) and represented in Figure 3:

(3) En quinze jours, la 1re armée a tué 10000 allemands, fait 18000 prisonniers, enlevé 120 canons. (Gaulle, C. de. Mémoires de guerre, le salut: 136)
'Within two weeks, the first army killed 10000 Germans, captured 18000 prisoners, removed 120 guns.'
Figure 3. Coding of a transitive clause (from Langacker 2008: 357)

A transitive clause might represent the unmarked, possibly even prototypical way of coding sentences such as (3), but it by no means constitutes the only alternative. Coding is a matter of construal, and thus of speaker choice, and speakers have the possibility of giving focal prominence to any entity they choose. Nothing inherent to the scene conceptualized imposes the selection of any entity as a focal participant, and conversely, the elements made prominent linguistically need not be the most salient on non-linguistic grounds (Langacker 2009: 112). The multiple clause types available in any language illustrate the flexibility of construal because they provide conventionalized alternatives for the description of scenes with the same conceptual content. Human beings most likely share with other species the cognitive flexibility to construe a scene in alternate ways, but they alone possess the symbolic resources to express this flexibility linguistically.

The constructions to be considered in the course of this paper, namely il and ça impersonals, middles, and on indefinites, all deviate from this prototypical clausal construal along two main dimensions, namely i) alternative profile assignment, i.e., the selection of the most focal figure in the profiled relation (the subject), and ii) the level of specificity at which the participants in the profiled process are described, or their delimitation (Langacker 2009: 123). Because this paper is specifically concerned with the il and ça constructions, the emphasis will be placed on alternative profile assignment.

2.2. Clause structure: Alternative codings
Speakers often have good reasons not to select the agent as the linguistically most salient figure in the profiled relation despite its inherent cognitive salience. They may want to protect it from undesirable consequences, find it irrelevant to their description of their conceptualization, or more simply do not know its real nature. Languages therefore provide a variety of ways of defocusing the agent (Shibatani 1985) by removing the spotlight that its focal role in the profiled process inherently shines on it. Cross-linguistically, passive and middle constructions represent two of the most frequently attested ways of defocusing the agent by selecting the patient as the most focal figure in the profiled relation (Langacker 1982, 2006). There are, however, no constraints on the nature of the entity which can be chosen as most focally prominent for linguistic purposes. In particular, the location within which the profiled event takes place may be selected as the focal figure, and thus marked as subject. Two alternative construals which results in the selection of different entities as the trajector of the profiled relation are illustrated in the examples in (4) and (5), respectively represented in Figures 4a and 4b. (4) . . . des ombres douteuses grouillaient sur les votes souilles. (Gracq, J. Le rivage des Syrtes: 299) . . . dubious shadows were crawling on the soiled vaulted ceilings. (5) La chambre, ou cellule, o il se trouvait, grouillait dhommes et de femmes de blanc vtus. (Beckett, Samuel. Malone meurt: 137) The bedroom, or cell, where he was located was crawling with men and women in white. The scenes the examples in (4) and (5) depict are similar in that a group of participants is perceived as crawling within a certain location. Their linguistic representations, however, reflect different structurations of that scene by selecting alternative entities as the trajector of the profiled relation. In the
452
M. Achard
Figure 4. Shift in construal from participant to setting (4b and 4c are from Langacker 2009: 118)
intransitive clause in (4), the participants (marked P in the diagram) are selected as trajector and thus marked as the subject. The location within which the crawling takes place is coded by the oblique complement introduced by the preposition sur 'on', and thus not considered a focal participant. In (5) by contrast, the location, namely la chambre, ou cellule, où il se trouvait 'the room or the cell where he was located', is selected as the trajector of the profiled relation and therefore coded as the subject, while the participants, coded as the oblique introduced by the preposition de, are not treated as focal figures. In the locational subject construction, the (solid) line between the location and the process indicates that the profiled process is somehow crucially associated with that location (to be considered in further detail shortly). In (5), the location (the bedroom) is a relatively restricted area. Compared to a location, a setting is a global expanse within which events unfold (the difference is one of degree) (Langacker 2009: 118). Settings can also function as clausal trajectors, as illustrated in (6), and represented in Figure 4c. Whereas voir 'see' normally selects an experiencer as its subject, the trajector in (6) is the spatial setting (France) within which the event coded by the object nominal can be experienced. Setting constructions do invoke a conceptualizer, but a generalized one (Langacker 2009: 118), or more precisely, in the case of (6), anyone with sufficient knowledge of art history to identify the rebirth of the interrupted trend.

(6) À partir du xviiie siècle, un sourd travail devient perceptible . . . La France voit une école bourgeoise renouer avec la tradition interrompue de ses propres peintres de la réalité . . . (Huyghe, René. Dialogue avec le visible: 154)
    Starting in the 18th century, an obscure current begins to surface . . . France sees a bourgeois school reconnect with the interrupted tradition of its own reality painters . . .
In the CG tradition, impersonals are closely related to setting constructions because, just like these constructions, they do not select participants as the main figure in the relation the predicate profiles. However, the entity impersonals select is considerably more abstract than the temporal or spatial setting within which the process is carried out. In order to characterize it properly, we
need to recall that in the Control cycle model presented in the previous section, agents can only exert their dominion over their target if the latter is within their reach, or more precisely, within their zone of potential interaction or field. In the physical domain, the field seems relatively easy to delineate (even though it is not often made specific linguistically) as the perceptual and kinesthetic range within which agents can exert the necessary force to capture their target. In the epistemic domain, it is perhaps more difficult to identify precisely because the analogical notion of mental reach (Langacker 2009: 139) is much more diffuse. Nevertheless, it seems reasonable to suggest that any concept we seek to understand can only be grasped within a specific mental area comprised of the knowledge structures (to be understood in the broadest sense, which includes the relevant perceptual and emotional input) which allow the conceptualizer to reach a conclusion about the epistemic status of the target. For instance, in the example in (7), the alternative edition of a book unknown to the hearer constitutes the target whose epistemic status is assessed by the speaker (underlined in the example). In order to conceptualize this target and assert its existence, the speaker relies on a knowledge base that includes at the very least the immediate circumstances which surround the publication of the work. In Cognitive Grammar, that knowledge base constitutes the field within which the profiled interaction can be conceptualized.

(7) le désespéré que vous avez eu tant de peine à vous procurer, dites-vous, est, sans doute, l'édition Soirat. Il en existe une autre qui vient de paraître, à mon insu et sans mon autorisation . . . (Bloy, L. Journal: 74)
    The desperate person you say you had such trouble finding is most likely the Soirat Edition.
    There exists another one which was just released unbeknownst to me and without my permission.

Langacker (2009) suggests that impersonal constructions select the field as the focal figure in the profiled relation, and that field is represented by the impersonal pronoun: "I thus propose, as a general characterization, that impersonal it profiles the relevant field, i.e., the conceptualizer's scope of awareness for the issue at hand" (Langacker 2009: 139). If we extend Langacker's proposal to French, il in (7) codes the knowledge base (the mental range) which allows the speaker to assert the existence of the alternative edition.5 The impersonal construction is represented in Figure 5:
5. Achard (1998: 282) suggests that il profiles "the immediate scope of the existential predication, that is the specific part of R [reality] immediately necessary for E's [the event] construal." Langacker's definition adopted here is perfectly compatible with this statement, but it represents an improvement in that it directly connects impersonals to more general issues of clause structure. For a review of several accounts of the meaning of impersonal pronouns, see Langacker (2009: Ch. 5).
Figure 5. Il impersonal construction
In Figure 5, the field is marked F and indicated with a broken rectangle. It is selected as the trajector of the profiled process and thus marked as the subject. The process [exister 'exist' in (7)] is also profiled. By contrast, the only participant, marked as P [the new edition in (7)], is left unprofiled. As previously indicated for setting constructions, the (broken) line between the process and the field indicates a tight association between the two entities. The nature of that association, however, deserves further attention. On the one hand, it seems fairly straightforward. The very definition of the field confers on the latter a high level of responsibility with respect to the process coded by the predicate, because that process could only be uncovered against its background. On the other hand, the field was described as a given conceptualizer's scope of awareness with respect to the profiled process, and it is therefore not immediately obvious how one individual's knowledge base concerning a process might be responsible for its occurrence. In order to reconcile these seemingly diverging observations, we need to remember that even though the conceptualizing experience which leads to the discovery of the missing address is indeed that of the speaker (or any relevant conceptualizer), it is not presented as such. In fact, one of the roles of impersonal constructions is precisely to present the conceptualized scene as the product of a generalized conceptualizer (Langacker 2009: 115). The basic idea is that any person in a similar position would invariably reach the same conclusion. For example, in (7), anyone with sufficient knowledge of the circumstances surrounding the publication of the new edition would invariably reach the same conclusion. In this sense, the missing address can be considered a property of the field.6
6. A possible objection to extending Langacker's analysis to French, and thus treating it and il in a similar manner, comes from the distribution of these pronouns in other contexts. Synchronically, it also functions as a neuter 3rd person singular pronoun, but il marks a 3rd person singular masculine agreement on the predicate. This apparent difficulty, however, is quickly solved diachronically. The pronoun il comes from the Latin neuter third person demonstrative illud, which might be translated in French by cela 'this, that'. In Old French, the most common impersonal marker is Ø, but beginning in the 12th century, il gradually becomes more and more grammaticalized, to become generalized by the 16th century (Brunot 1936: 285). During
Figure 6. Personal, field, and setting constructions (adapted from Langacker 2009: 143)
In order to summarize the alternative codings which can be imposed on the conceptualized scene and further emphasize the subtle distinction between the setting and the field, compare the three examples in (8) (adapted from Langacker 2009: 143):

(8) a. I am cold here in Chicago
    b. It's cold in Chicago
    c. Chicago is cold
The three constructions in (8), illustrated in Figure 6, present essentially the same conceptual content, namely an experiencer (E), who (by definition) possesses a scope of awareness (F) which includes the ambient environment where the sensation (cold) is experienced, within a specific locational setting (Chicago). In each case, the trajector is specified by the nominal which therefore functions as the subject. "The three constructions differ primarily in which facets of this situation they highlight through profiling and choice of trajector" (Langacker 2009: 143). In (8a), the experiencer and the experience itself are emphasized by the choice of the experiencer as the subject, but the cause is left implicit. The constructions in (8b) and (8c) shift the primary emphasis away
that time, il was a neuter pronoun which didn't exclusively mark masculine referents, but also more general propositional referents, as illustrated in (i):
(i) C'estoit jadis chose bien rare, Que de veoir un abbé ignare: Aujourd'huy il est si commun, Que cent mille, aussi bien comme un, Se trouveront . . . (Marot, C. [1526]. Les traductions: 171)
    It was in the past a very rare thing to see an uncultured abbot: Nowadays it is so common, that you can find one hundred thousand as easily as one
Since il was a neuter pronoun at an earlier stage of the language (and still exhibits frozen traces of this usage), it is possible to suggest that the pronoun became specialized in two opposite directions. On the one hand, it became a masculine 3rd person subject pronoun, and on the other hand, it became exclusively specialized as an impersonal. From this standpoint, even though the French and English systems synchronically diverge, their diachronic similarity, especially their shared characteristic of having a neuter pronoun used as an impersonal at some point in their history, justifies treating il in a way parallel to it.
from the experiencer and place it either on the scope of awareness (field) the pronoun it codes, or on the spatial setting expressed by Chicago. In both cases, the experiencer is generalized and unidentified. The shift in focus emphasizes the trajector's responsibility in causing the cold sensation. The contrast between (8b) and (8c) is a matter of whether the trajector is identified as the experiential field per se or as the spatial setting with which it is largely coextensive (Langacker 2009: 144). With respect to the specific issue of French impersonals, the difference between field and (abstract) setting constructions will be examined in detail in section 5.

3. Il impersonals in copular complement constructions
Achard (1998: Ch. 7; 2009) claims that the impersonal construction's semantic function is to present the existence or location of the post verbal entity in its relevant domain. In the most central cases illustrated in (7), it establishes the presence in reality of the entity the post verbal expression codes. As was considered in the previous section, il profiles the field, or in other words the scope of awareness within which the entity une autre [édition] 'another one' can be conceptualized. In this case, this scope represents the knowledge base necessary for the conceptualization of the new edition to be entertained.7 Impersonals can be divided between simple and complex constructions, depending on the nature of the post verbal expression they contain. Simple cases such as the one in (7) have a nominal in that position, whereas complex impersonals are characterized by the presence of a finite or infinitival clause.8 Complex impersonals can be further split up between non-copular and copular predicates, i.e., those which contain a form of the être 'be' copula followed by an adjective. These two kinds of impersonals are respectively illustrated in (9) and (10):9
7. The evidence for il's existential function comes from the distribution of predicates in the impersonal construction. Previous research (Hériau 1980, Achard 2009a) has shown that the verbs which consistently occur with impersonal il are those that most saliently include the field in their lexical semantic structure, namely être 'be', exister 'exist', venir 'come', passer 'pass', rester 'stay', pousser 'grow', etc. These verbs constitute a good fit with impersonal il because the relation they profile involves the interaction between the participants and the field within which the process is observed. Simple impersonals cannot be reduced to these central cases. Several extensions from this central case are also attested. The first one concerns the presence of an entity at a certain location. The second one concerns a change in the modality of existence. Pure existence naturally extends into a more aspectual notion of coming into existence, then to coming into range of consciousness (Achard 1998: Ch. 7). For an analysis of the distribution of intransitive predicates in active impersonal constructions, see Achard 2009a. Predicates which include passive morphology, such as for example il a été décidé 'it was decided' or il a été prouvé 'it was proven', were excluded from the sample because impersonal passives will not be considered in this paper.
8.
9.
(9) On me dira que la guerre n'est pas sûre. Je crois cependant qu'il faut ordonner sa vie comme si elle l'était. (Green, J. Journal. T. 5. 1946–1950: 64)
    One will object that war is not a certainty. I think, however, that it is necessary to organize our life as if it were.
(10) Pour le Hezbollah, ce crime barbare met en évidence le fait qu'aucun endroit au Liban n'est à l'abri du danger sioniste. Il est nécessaire de considérer désormais toutes les régions libanaises comme des zones d'affrontements, où le peuple mène une résistance permanente contre la menace juive. (AFP)
    For Hezbollah, this barbaric crime shows that no place in Lebanon is safe from the Zionist danger. It is necessary to consider, from now on, that all Lebanese regions are fighting areas, where people are engaged in permanent resistance against the Jewish threat.

The distinction between non-copular and copular predicates might seem artificial, especially since both kinds of verbs include largely overlapping semantic classes, as illustrated in (9) and (10) with falloir (il faut) and il est nécessaire 'it is necessary'. However, several factors justify their separation. First, the two categories differ widely in their distribution. While non-copular predicates are far less numerous, they nonetheless include the most frequently attested ones. For example, among the 1010 predicates considered in the corpus (to be described shortly), they provide the largest number of tokens (669 or 66.23% of all examples), but they only consist of 25 verbs, among which falloir (il faut) alone accounts for 409 instances (40.5% of all examples), and sembler 'seem' for 97 instances. Conversely, as the next section illustrates, copular predicates are so much more open that it is virtually impossible to provide an exhaustive list. Secondly, the non-copular and copular predicates exhibit different syntactic behaviors.
Generally speaking (some exceptions exist, which will not be considered in this paper), non-copular predicates do not possess a variant where the post verbal expression can appear in subject position. Copular impersonals, by contrast, almost invariably possess a personal variant where the post verbal expression occurs in subject position. This is illustrated in (11), where the example is manipulated from the attested form in (15):

(11) # Souvent il a raison, mais avoir raison de cette manière me semble un peu facile
    He is often right, but to be right in this manner seems a little too easy to me

Finally, while non-copular predicates can generally only be preceded by il, copular predicates can almost always be preceded by a form of the neuter demonstrative ce: ça, c' or cela. Because this paper is predominantly interested in
this contrast, only copular predicates will be considered from this point forward. The next section first presents the semantic range of copular predicates before proposing an analysis.

3.1. Copular impersonal predicates
In order to investigate the complex impersonal category, 1010 predicates were selected from two corpora (511 from FRANTEXT, 499 from AFP) and analyzed. The 341 instances of copular predicates are distributed over a large class of 89 verbs, only 18 of which are common to the two corpora. The relative frequency is quite evenly distributed between the different predicates. Although être possible 'be possible' is the most frequently attested, with 41 overall occurrences (32 in AFP and 9 in FRANTEXT), several other predicates can also be found in a similar frequency range, namely être probable 'be probable' (12), être vrai 'be true' (15), être nécessaire 'be necessary' (14), être clair 'be clear' (18), être impossible 'be impossible' (17), être difficile 'be difficult' (22). All the other predicates contribute fewer than 5 occurrences, with the large majority contributing only one example.

The large category of copular predicates can easily be divided into three semantically consistent classes. The first one subsumes deontic predicates such as être nécessaire 'be necessary', être indispensable 'be indispensable', être crucial 'be crucial', être vital 'be vital', être interdit 'be forbidden', illustrated in (10) and (12):

(12) Le petit volume que j'ai là contient des résumés et des extraits des tragédies les plus célèbres: Hamlet, Othello, Antoine et Cléopâtre. Il est indispensable que tu les connaisses au moins de cette manière-là. (Green, J. Moïra: roman: 189)
    The little volume I am holding contains summaries and extracts of the most famous tragedies: Hamlet, Othello, Anthony and Cleopatra. It is imperative that you be familiar with them at least in this manner.
The second semantic class contains epistemic predicates which describe some conceptualizer's degree of certainty with respect to the presence in reality of the event or proposition the post verbal expression codes. This class, illustrated in (13), includes être possible 'be possible', être impossible 'be impossible', être faux 'be false', être sûr 'be sure', être clair 'be clear', être vrai 'be true', être vraisemblable 'be likely', être certain 'be certain', être incertain 'be uncertain', être probable 'be probable', etc.

(13) Comment pouvait-il se faire qu'un homme tuât la femme qu'il aimait? On ne tuait que ses ennemis. Il est vrai que cela se passait dans un livre: c'était une histoire inventée, un mensonge. (Green, J. Moïra: roman: 195)
    How could it happen that a man killed the woman he loved? People only kill their enemies. It is true that it happened in a book: the story was a fabrication, a lie.

Finally, the third class includes evaluative predicates. Within this category, the largest group describes a conceptualizer's emotional reaction toward the event or proposition coded in the post verbal expression. It is illustrated in (14) and includes être regrettable 'be regrettable', être plaisant 'be pleasant', être agréable 'be agreeable', être amusant 'be amusing', être honteux 'be shameful', être décevant 'be disappointing', être scandaleux 'be scandalous', etc.

(14) 27 Octobre. Plaisir de relire Salammbô. Les phrases du début sont d'une résonance merveilleuse. Il est amusant de retrouver de vieilles connaissances. (Green, J. Journal. T. 5. 1946–1950: 216)
    October 27. Reread Salammbô with pleasure. The sentences at the beginning sound wonderful. It is amusing to encounter familiar characters again.

A smaller group of evaluative predicates, illustrated in (15), is more concerned with an objective assessment of the event coded in the post verbal expression than with the reaction it triggers. These predicates provide a global positive or negative assessment (être bon 'be good', être mauvais 'be bad'), or evaluate that event's degree of difficulty (être facile 'be easy', être aisé 'be easy', être difficile 'be difficult'):

(15) Souvent il a raison, mais il me semble un peu facile d'avoir raison de cette manière, je veux dire en faisant voir l'inanité des plaisirs du monde. (Green, J. Journal. T. 5. 1946–1950: 229)
    He is often right, but it seems to me that it is a little too easy to be right in this manner, I mean by pointing out the silliness of the world's pleasures.

It was mentioned in the previous section that impersonal il profiles the mental reach which enables the conceptualizer to assess the presence in reality of the entity the post verbal expression refers to.
As it stands, however, this definition is not quite sufficient to account for the copular construction cases illustrated in (12)–(15), because in those instances, the event or proposition coded in the complement has already been entertained and cannot thus be said to be introduced into existence. In the example in (13) for instance, the context makes it clear that the character is reacting to a scene he is reading in a book. The proposition cela se passait dans un livre 'this was happening in a book' is therefore de facto located in reality. Rather than profiling the unveiling of the facts the proposition codes, the construction objectifies the mental effort the
character makes in order to assert the latter's epistemic status. Even though its status is already known, the epistemic assertion of the proposition's location in reality is used to justify the protagonist's uncharacteristic behavior. Recall from the previous section that the Control cycle enables us to assert dominion over different entities in different domains. Epistemic impersonals allow us to express the whole range of decisions pertaining to the place in reality of various potential candidates in our never ending quest for the acquisition of propositional knowledge (Langacker 2009: 131). Individual predicates profile different parts of the cycle. Être possible 'be possible' and être probable 'be probable', for example, profile the formulation phase, while être sûr 'be sure', être certain 'be certain', and être vrai 'be true' profile the result phase.

In order to describe the difference between the purely existential constructions illustrated in (7) and the epistemic cases described in this section, Achard (1998) points to the need to recognize two separate (yet integrated) levels of reality. A basic level of reality R is composed of objects, the existence of which can be questioned or assessed. At this level, events also occur, and we can question or observe their occurrence. However, the report of the events of basic reality for the purposes of thought or communication necessarily entails their epistemic evaluation, or in other words, their assessment with respect to a more abstract level of reality (R′) where their occurrence can be characterized. In a slight amendment to Achard (1998), I suggest that in an epistemic construction, il profiles the field, that is, in this particular case, the subsection of R′ within which the assessment can be formulated. In this fashion, the epistemic evaluation can be made explicit, and the presence of the impersonal pronoun highlights the separation between R and R′ (Achard 1998).
In this sense, all instances of il impersonals can be called existential, and in all cases, the impersonal pronoun can be said to profile the field, i.e., the conceptualizer's scope of awareness for the issue at hand. The specific nature of that scope of awareness, however, critically depends on the circumstances with respect to which the post verbal entity is conceptualized. For the simple impersonals, it will merely include elements of R (possibly elements of observation), whereas in the case of the epistemic impersonals presented here, the conceptualizer's scope of awareness is expanded to include notions of R′, namely more evaluative notions that pertain to the assessment of facts, or the comparison between different situations. Regardless of the level of abstraction at which it is manifested, the conceptualizer's task nonetheless remains the introduction of an entity within its sphere of existence. Consequently, the epistemic copular il impersonals presented in this section will also be called existential predications. In order to further illustrate the distinction between different levels of reality, and hence between the different kinds of existential predications, let us compare the example in (7), repeated for convenience, to the one in (16):
(7) le désespéré que vous avez eu tant de peine à vous procurer, dites-vous, est, sans doute, l'édition Soirat. Il en existe une autre qui vient de paraître, à mon insu et sans mon autorisation . . . (Bloy, L. Journal: 74)
    The desperate person you say you had such trouble finding is most likely the Soirat Edition. There [it] exists another one which was just released unbeknownst to me and without my permission . . .
(16) À voir cela, il me semble que la révolte est plus loin de nous que je ne croyais d'abord. Il est vrai que je suis avec des montagnards, écartés des centres industriels et très fatalistes. (Rivière, J. Correspondance avec J. Rivière: 120)
    When I see this, it seems to me that the rebellion is farther from us than I first thought. It is true that I am with mountain men, remote from the industrial areas and very fatalistic

In (7), the speaker introduces the presence of a specific edition in reality. The field the pronoun profiles includes the immediate circumstances that pertain to his knowledge of that fact. There is no explicit epistemic evaluation, merely the statement of a fact. In this sense, we can talk about the speaker's scope of awareness reaching within the level of simple reality (R). Note the difference between (7) and (16), where the report of a fact as a proposition involves the explicit epistemic assessment of that proposition. This means that the conceptualizer's scope of awareness has to expand beyond R to encompass notions of assessment, evaluation, and comparison, all notions of R′. It is a particularity of French that existence at these two levels of reality is coded by the same pronoun. In English, for example, R and R′ are kept separate in impersonal constructions: presence at the R level is marked by there (as in there is a fly in my soup), while presence at the R′ level is marked with it (it is true that there is a fly in my soup).
In French, all il impersonal constructions are analyzed as existential constructions. Note finally that although the discussion was restricted to epistemic predicates for expository purposes, a similar analysis can be given of evaluatives. As a factive predicate (Kiparsky and Kiparsky 1970; Karttunen 1971), être amusant 'be amusing' in (14) does not seek to establish the event/proposition in the complement in reality (its reality is presupposed), but to express the speaker's evaluation of its effect on herself. In a manner similar to epistemic assessment, the determination of the emotional reaction a given event triggers critically involves elements of R′. The mere observation of the event is not sufficient; the more analytical evaluation of its advantages, disadvantages, and overall fit into the value system of the relevant conceptualizer is also required. In this sense, and even though the factive character of the evaluative predicates may make them appear quite removed from denoting existence, these predicates can still be called existential.
Recall that the structural distinction between the il and ça constructions illustrated in (1) and (2) rests on the dual claim that i) il is a meaningless dummy whose import is solely structural, and ii) ça is a cataphoric pronoun which refers forward to the complement clause. Contra the first part of the claim, this section showed that il is a meaningful (albeit general) expression which refers to the field, i.e., the mental range within which the event/proposition coded in the complement can be located. The next section addresses the second part of the claim.

4. From demonstrative to impersonal: ça
French demonstrative pronouns are traditionally analyzed as having three main senses. The following quote from Grevisse is provided as a representative of a large number of analyses:
Les pronoms démonstratifs désignent un être ou une chose en les situant dans l'espace, éventuellement avec un geste à l'appui (fonction déictique). Ils peuvent aussi renvoyer à un terme qui précède (fonction anaphorique) ou qui suit (fonction cataphorique) dans le contexte. (Grevisse 1986: 1054)

Demonstrative pronouns designate a person or a thing by situating them in space, sometimes with an accompanying gesture (deictic function); they can also refer to a preceding term (anaphoric function) or a following term (cataphoric function) in the context.
It is commonly accepted that all pronouns possess these three senses, so ça's analysis as a cataphoric pronoun in the copular complement construction straightforwardly conforms to the most traditional account of demonstratives. Contra this view, the analysis developed in this section shows that in the context illustrated in (2), ça does not behave as a cataphoric pronoun in the strictest sense, but in a manner very close to il, and that consequently, their syntactic overlap reflects their semantic similarities. The argument in favor of this position comes from an examination of ça's semantics, and in particular, from the pronoun's comparison with ceci 'this', whose strict cataphoric sense is uncontroversial. The structural and semantic differences observed between ça and ceci in its strict cataphoric sense present a serious challenge to the pronoun's cataphoric role in the copular complement construction.

The data for the comparison of the demonstratives come from the 91 post-1950 documents from the FRANTEXT database. For ceci, the search yielded 586 instances, and all were analyzed. For cela, 3679 instances were obtained, among which 594 random tokens were analyzed. For ça, the total search produced 4493 instances, 2057 of which occur in Simone de Beauvoir's novel Les mandarins. 523 random tokens of these 2057 were analyzed. The distribution of ceci, cela, and ça's different senses is presented in Table 1. Because cela is very similar to ça in its usage, the rest of this paper will strictly focus on the latter to avoid redundancy.10

Table 1. Distribution of anaphoric and cataphoric senses for ceci, cela, and ça

        Anaphoric   Cataphoric   Deictic/nominal   Total
Ceci    298         221          67                586
Cela    559         26           9                 594
Ça      420         101          2                 523

The most casual glance at Table 1 reveals that ceci is by far the most frequently attested cataphoric pronoun, but more than sheer frequency, the structural and semantic specificities the pronoun exhibits in its cataphoric sense truly set it apart from its counterparts.

4.1. Ceci's cataphoric sense
Perhaps the most striking characteristic of ceci in its cataphoric sense concerns the highly restricted set of formal and syntactic contexts in which it occurs. The 221 examples analyzed in the corpus only represent three types of constructions. In the first one (110 instances), the entity to which the pronoun refers is set apart from its immediate context by punctuation. In 44 instances, it is presented as directly reported speech, namely introduced by a colon and surrounded by quotation marks, as illustrated in (17) and (18):

(17) La voix du lecteur est si volontairement terne qu'il faut un effort pour le suivre. J'entends ceci: une incroyante me dit un jour: si j'avais la foi, votre bréviaire me brûlerait les mains. (Green, J. Journal: 11)
    The reader's voice is so deliberately dull that it takes an effort to follow. I hear this: a non-believer tells me one day: If I had faith, your breviary would burn my hands.
(18) Revenant l'autre soir du théâtre avec Robert, nous passons près d'un groupe d'agents qui causent entre eux à mi-voix, et ceci parvient jusqu'à nous dans le grand silence de la rue déserte: il lui a filé un coup de lame. (Green, J. Journal. T. 5. 1946–1950: 210)
10. Although it will not be considered in detail in this paper, cecis anaphoric sense is illustrated in (i), where the pronoun in bold print refers to the underlined entity: (i) Stepan: lorganisation tavait command de tuer le grandduc. Cest vrai. Mais elle ne mavait demand dassassiner des enfants. Annenkov Yanek a raison. Ceci ntait pas prvu. (Camus, A. Les justes: 334) Stepan: the organization had ordered you to kill the Great Duke. It is true, but it hadnt asked me to kill children. Annenkov Yanek is right. It [this] was not planned.
464
M. Achard On the way back from the theater the other night with Robert, we walked past a group of policemen talking to each other in low voices, and this comes to us in the silence of the deserted street: he cut him with a blade.
In 66 instances, the entity referred to is not presented as reported speech, but it is nonetheless set apart from the immediate context by punctuation, namely a colon in 55 instances, or a period in 11 examples. These two situations are respectively illustrated in (19) and (20):

(19) (Plus bas, mais fermement:) frères, je veux vous parler franchement et vous dire au moins ceci que pourrait dire le plus simple de nos paysans: tuer des enfants est contraire à l'honneur. (Camus, A. Les justes: 340)
(Lower, but in a firm voice:) brothers, I want to speak frankly and tell you at least this which the least educated of our peasants could say: killing children is dishonorable.

(20) Kirilov, se lève et semble réfléchir. – De quoi faudra-t-il me déclarer coupable? Pierre – vous le saurez. Kirilov – bon. Mais n'oubliez pas ceci. Je ne vous aiderai en rien contre Stavroguine. (Camus, A. Les possédés; pièce en trois actes: 1043)
Kirilov, stands up and seems to be thinking. – What will I have to confess I am guilty of? Pierre – you will know. Kirilov – good. But do not forget this. I will not help you in any way against Stavroguine.

In the second type of construction (7 instances), the phrase which contains the pronoun is surrounded by two commas and provides a kind of parenthetical commentary, as illustrated in (21):

(21) Il me dit encore, et ceci me paraît beaucoup plus juste, qu'il craint que tenir un journal ne nuise au romancier, n'ôte au roman son impulsion. (Green, J. Journal. T. 5. 1946–1950: 70)
He also tells me, and this seems to me much more to the point, that he fears keeping a journal may be detrimental to a novelist, that it may take away the novel's momentum.

Finally, in the third kind of construction in which cataphoric ceci is attested (101 instances), the pronoun is directly followed by a relative pronoun (overwhelmingly que). The latter is preceded by a comma in 52 cases, as illustrated in (22). In the 49 instances illustrated in (23), however, it is not preceded by a comma.

(22) Toutes ses difficultés avec les êtres lui venaient de ceci, qu'il ne pouvait leur faire comprendre l'extrême péril de leur situation. (Green, J. Moïra: roman: 122)
All his difficulties with people came from this, that he couldn't make them understand how extremely dangerous their situation was.

(23) Cette longue interruption a eu ceci de bon qu'elle m'a permis de prendre du recul. (Martin Du Gard, R. Souvenirs autobiographiques: cxxx)
This long break was beneficial in this that it allowed me to take a step back.

The syntactic distribution of cataphoric ceci yields two interesting observations. First, the highly specific character of the three constructions in which the pronoun participates reveals its narrow range of usage. Secondly, the first construction clearly stands out from the others with respect to the level of independence the entity to which the pronoun refers (henceforth the referred entity) enjoys from its immediate context. More specifically, in the 110 instances where the referred entity is preceded by a colon or period, there is nothing in the previous context that announces its semantic content. For example, in (18) repeated for convenience, nothing in the preceding context semantically anticipates the utterance il lui a filé un coup de lame, beyond the fact that an unknown policeman is responsible for it.

(18) Revenant l'autre soir du théâtre avec Robert, nous passons près d'un groupe d'agents qui causent entre eux à mi-voix, et ceci parvient jusqu'à nous dans le grand silence de la rue déserte: "il lui a filé un coup de lame." (Green, J. Journal. T. 5. 1946–1950: 210)
On the way back from the theater the other night with Robert, we walked past a group of policemen talking to each other in low voices, and this comes to us in the silence of the deserted street: "he cut him with a blade."
In ceci's first construction, the formal separation the presence of punctuation creates between the pronoun and the referred entity iconically represents the high level of independence that entity enjoys with respect to the surrounding discourse. Another facet of that independence is that ceci does not relate to any aspect of the immediate context, except to announce the entity it refers to. Because of this unique function, this sense of the pronoun will be called strictly cataphoric.

In the other two constructions ceci participates in, the referred entity exhibits a higher level of semantic integration in the overall discourse context. In (21) for example, also repeated for convenience, ceci refers forward to the phrase qu'il craint que tenir un journal ne nuise au romancier, n'ôte au roman son impulsion. Unlike what was observed in the strict cataphoric construction, this phrase is semantically integrated within the overall context. First, the presence of the adverb encore 'also' indicates the following entity is part of the ongoing argumentation, and can thus be expected to follow the same thematic lines. Secondly, the presence of beaucoup plus juste 'much more to the point' shows that the content of that entity is evaluated. The communicative goal of (21) is to assess the value of someone's arguments against a novelist's keeping a journal, and the proposition ceci refers to contributes to that goal by specifically isolating the part of the argument the author agrees with. In this sense, and despite the possibly isolating role of the preceding comma, the referred entity is considerably better integrated in the thematic context of the overall discourse than its counterpart in the strictly cataphoric construction.

(21) Il me dit encore, et ceci me paraît beaucoup plus juste, qu'il craint que tenir un journal ne nuise au romancier, n'ôte au roman son impulsion. (Green, J. Journal. T. 5. 1946–1950: 70)
He also tells me, and this seems to me much more to the point, that he fears keeping a journal may be detrimental to a novelist, that it may take away the novel's momentum.

A similar analysis can be given for ceci's third construction, namely when the referred entity is preceded by a relative pronoun. As was the case in the second construction, this entity is also embedded within the overall discourse, as illustrated in (24), where it represents the author's rectification and reduction of the scope of the previous assertion, and is thus clearly semantically incorporated within its immediate context.

(24) Un jeune écrivain m'envoie le texte d'un ouvrage qu'il a écrit sur moi et dans lequel il pense démontrer que tous mes romans ont été écrits sous l'influence du démon! Cette vue me paraît bien systématique . . . mais il y a ceci de vrai, que mes romans laissent entrevoir dans de grands remous ce que je crois être le fond de l'âme et qui échappe toujours à l'observation psychologique, la région secrète où Dieu travaille. (Green, J. Journal. T. 5. 1946–1950: 126)
A young author sends me the text of a book he wrote about me in which he thinks he shows that all my novels have been written under the devil's influence! This view seems very systematic . . . but it contains this truth [this that is true] that my novels reveal in great turmoil what I believe to be the bottom of the soul which always eludes psychological investigation, the secret region where God is at work.

Even though ceci does not participate in the copular complement construction, its investigation provides interesting insight into the semantic nature of French cataphoric pronouns. In particular, it reveals an interesting continuum where the level of semantic independence of the referred entity with respect to the surrounding discourse correlates with the pronoun's level of cataphoricity. The pronoun's first construction represents one end of the continuum, where the referred entity is semantically isolated from the neighboring discourse and the pronoun is strictly cataphoric. The other two constructions are less extreme, because the referred entity is more closely embedded within the surrounding context, and consequently, the pronoun is less cataphoric in the strictest sense. The next section evaluates ça's supposedly cataphoric sense with respect to this cataphoric cline, and shows that the pronoun cannot be called cataphoric in any meaningful way.

4.2. Ça's cataphoric sense: Shared lexical contact
Most importantly for the purposes of this paper, ça radically differs from ceci because i) it occurs in a far larger range of constructions and is thus less syntactically restricted, and ii) it does not have a strict cataphoric sense. As a general rule, and contrary to what was observed with ceci, the referred entity is closely integrated in the surrounding context. This shared thematic content manifests itself in a variety of ways. In perhaps the most obvious cases, the referred expression repeats a lexical item from the previous discourse, as illustrated in (25) and (26) where the shared lexical elements are underlined. In both instances, the referred entity is so well integrated in the surrounding context that it could be viewed as lexically redundant and possibly left out.

(25) Promenons-nous. Amusons-nous tant qu'il nous reste de la chair sur les os. Il haussa les épaules: tu sais bien que ça n'est pas si facile de s'amuser. (Beauvoir, S. de. Les mandarins: 93)
Let's go for a walk. Let's have fun while we still have flesh on our bones. He shrugged: you know that it [this] is not so easy to have fun.

(26) Josette soupira: il va falloir que je me montre un peu, pour ma publicité; alors je dois m'habiller. – Ça ne t'ennuie pas de t'habiller? (Beauvoir, S. de. Les mandarins: 281)
Josette sighed: I will have to show myself a little, for advertising purposes; so I need to dress up. – It [this] doesn't bother you to dress up?

In other instances, the referred entity expresses a reformulation of the previous context, which often involves reanalysis, as in (27), where the referred entity le péché par omission 'a sin by omission' constitutes a recapitulation and generalization of the situation described in the previous context:

(27) quand on pense à tout ce qu'on pourrait faire et qu'on ne fait pas! Toutes les occasions qu'on laisse échapper! On n'a pas l'idée, pas l'élan; au lieu d'être ouvert on est fermé; c'est ça le plus grand péché: le péché par omission. (Beauvoir, S. de. Les mandarins: 70)
when you think about all we could do and don't do! All these opportunities we waste! We lack the idea, the momentum; we are closed instead of being open; This is the greatest sin: a sin by omission
Finally, the shared information between the referred entity and the previous context can be provided by the set of inferences made available by well-established schemas, or the larger context of the sentence, as the examples in (28) and (29) respectively illustrate:

(28) elles avaient toutes des robes noires, des cheveux couleur d'oréal, des talons très hauts, de longs cils et une personnalité, différente pour chacune, mais fabriquée dans les mêmes ateliers. Si j'avais été homme ça m'aurait été impossible d'en préférer aucune, j'aurais été faire mon marché ailleurs. (Beauvoir, S. de. Les mandarins: 343)
they were all wearing black dresses, bottle colored hair, very high heels, long eyelashes and different personalities nonetheless crafted in the same salons. If I had been a man, it [this] would have been impossible to choose one of them. I would have done my shopping elsewhere.

(29) Robert avait senti que je n'avais guère envie de parler et lui il avait des tas de choses à me raconter: il racontait. Il était beaucoup plus gai qu'avant mon départ: ce n'est pas que la situation internationale lui parût brillante, mais il avait repris goût à sa vie. Ça comptait beaucoup pour lui de s'être réconcilié avec Henri . . . (Beauvoir, S. de. Les mandarins: 492)
Robert had sensed that I didn't feel like talking and he had a lot to tell me about: he was talking. He was much happier than before I left: it isn't because he was happy with the international situation, but he had regained his taste for life. It [this] mattered a lot to him to have reconciled with Henri.

In (28), the connection between the referred entity d'en préférer aucune 'to prefer one of them' and the preceding context is made available by the presence of a schema which structures the social interaction and possible seduction between men and women, and the knowledge of the criteria men usually use to select their potential partner. In (29), we know from previous context that Robert and Henri have reconciled, and the referred entity echoes their reconciliation.

The examples presented in this section clearly point out the difference between ceci and ça in both pronouns' cataphoric sense. In the overwhelming majority of the ça cases, and in a manner which radically contradicts the conclusions reached for ceci in the preceding section, the referred entity shares a great deal of semantic information with the immediately preceding context. Consequently, unlike ceci whose anaphoric and cataphoric senses are sharply delineated, ça's precise semantic import can only be appreciated if the pronoun's anaphoric sense is also considered. Ça's anaphoric uses are interesting and complex, but the next section only evokes the pronoun's characteristics most directly relevant to the purposes of this paper.
4.3. Ça's anaphoric sense
A strictly referential analysis, namely one that would consider ça's discourse function a direct extension of its deictic value, has long been considered untenable because the pronoun often does not anaphorically refer to an existing entity. This is notoriously observed in the referential shifts (Cadiot 1988) illustrated in (30) which distinguish ça's behavior from that of personal pronouns (straight anaphors). The examples in (30) are adapted from Carlier (1996):

(30) a. Les gosses, ils se lèvent tôt
        The kids, they get up early
     b. Les gosses, ça se lève tôt
        Kids, they get up early [kids, that gets up early]

In (30a), the strict anaphoric value of the personal pronoun ils 'they' is reflected by the agreement on the verb which matches the gender and number of the referent, but also by the specific reading of the utterance. In the preferred reading, the speaker has specific kids in mind, of which the predication of se lever tôt 'get up early' is made. Conversely, in (30b), the singular subject/verb agreement does not match the referent's number, and the reading of the utterance is necessarily generic. Furthermore, in numerous instances, ça does not refer to a nominal directly available in the context. This is illustrated in (31) taken from Achard (2000), where the only potentially available nominal dossier 'portfolio' cannot be considered the pronoun's referent because a portfolio cannot be said to be working or not.

(31) J'avais gardé de bons copains du temps de l'Opéra, dont un qui était passé chez Cuevas. Il m'a attrapé dans un bar de la rive gauche et m'a conseillé la mode, m'a envoyé présenter mes dessins. A l'époque, je m'habillais beaucoup, j'étais presque un personnage avec des avions, des nuages d'or et d'argent découpés sur le dos de mes blousons. Je ne savais rien, j'ai préparé un dossier, ça a marché. J'ai appris comme ça, et je n'ai pas arrêté.
I had kept good friends from the Opéra days, including one who had gone to Cuevas. He caught me in a bar on the left bank and advised me to get into the fashion industry, and sent me to show my drawings. At the time, I would dress up a lot, I was almost a character with planes, gold and silver clouds cut outs in the back of my jackets. I didn't know anything, I prepared a portfolio, it [this] worked out. I learned like that, and I never stopped.

In order to account for ça's behavior, Achard (2000: 2) suggests that the pronoun: "provides the hearer with the instructions to create a specific region within the space provided by the context of the utterance, and profiles that region." In CG, a region is defined as a set of interconnected entities. Consequently: "within the space of the context of the utterance, ça's presence instructs the hearer to construe a group of entities as interconnected" (Achard 2000: 2).

Importantly for the purposes of this paper, anaphoric ça possesses three main properties which motivate its use in cataphoric contexts. The first one is that because the entities to be considered together are not always specified, the hearer often needs to conceptually manipulate the immediate context in order to create the region the pronoun's presence requires. This is illustrated in (32) where the pronoun does not refer to the situation currently occurring in front of Henri's eyes, but to an alternative one which he is imagining, and is therefore confined to this specific mental space (Fauconnier 1985):

(32) Lucie se mordit la lèvre; soudain, elle ne crânait plus, et il eut peur qu'elle ne se mette à pleurer, ça devait être un spectacle écœurant. (Beauvoir, S. de. Les mandarins: 472)
Lucie bit her lip; suddenly, she wasn't showing off anymore, and he was afraid she would start crying, it [this] would certainly be a sickening show.

Ça's second property is that the region the pronoun profiles can be maximally large and abstract, to the point of being indistinguishable from the portion of reality within which the events or circumstances of interest can be observed, as illustrated in (33):

(33) Il y a quelque chose de pourri quelque part, se dit Henri. Tant de choses à faire! Et tant de types qui ne savaient que faire! Ça aurait dû coller: et puis ça ne collait pas. (Beauvoir, S. de. Les mandarins: 151)
There is something rotten somewhere, Henri said to himself. So many things to do! And so many guys who didn't know what to do! It [this] should have worked out: and then it [this] wasn't working out.

In (33), ça's referent is difficult to isolate precisely, but it may best be described as the region of reality which incorporates the traumatic chain of events which occurred in the aftermath of the Second World War in France, including the hopes and dreams of the survivors as well as their aspirations for a better future, and their disillusion with the reality of the political and social situation.

Finally, the pronoun's third property which motivates its use in cataphoric contexts is the subjective construal its presence imposes on the region it profiles. The notion of subjectivity as it is presented in CG (Langacker 1985, 1990) refers to the vantage point from which a linguistic expression is conceptualized, as well as the viewing organization that exists between the conceptualizer and her conceptualization. The analysis of an expression's viewing organization involves the investigation of the conceptual asymmetry in the construal relation between the conceptualizing subject and the object of conceptualization.
The subjective construal ça imposes on the entity it evokes is illustrated in (34) (from Achard 2000) and (35):

(34) Les Archaos ont investi le Cirque d'Hiver, et ça fait du vacarme.
The Archaos have invaded the Cirque d'Hiver, and it [that] makes a ruckus.

(35) Ça n'est pas réussi! Dit Henri. Il suivit des yeux Julien qui marchait avec dignité vers la porte; lui non plus, il n'était pas drôle, il tournait plutôt à l'aigre. Mais somme toute, pourquoi ça serait-il spécialement drôle, l'après-guerre? Oui, sous l'occupation, elle était bien belle: vieille histoire. Assez fredonné la chanson des lendemains; demain, c'était devenu aujourd'hui, ça ne chantait plus. (Beauvoir, S. de. Les mandarins: 159)
It [this] didn't succeed! Henri said. His eyes followed Julien who was walking toward the door with dignity; he also wasn't much fun, he was getting quite bitter. But in fact, why should it [this] be particularly fun, the after war? Yes, during the occupation, it was beautiful: old story. Enough sung the song of tomorrow; tomorrow had become today, it [this] wasn't singing anymore.

The subjective construal ça imposes on the scene it profiles clearly emerges from the comparison of (34) with its possible alternative #Les Archaos ont investi le Cirque d'Hiver, et ils font du vacarme 'the Archaos have invaded the Cirque d'Hiver, and they are making a ruckus', where the verb clearly profiles the relation between two well delineated entities, namely the source of the noise (ils) and their creation (le vacarme). Both entities are treated as strict objects of conceptualization, while the conceptualizing subject (the speaker in this case) remains off stage. In this viewing configuration, the subject and object of conceptualization are maximally differentiated. In (34) by contrast, the subject ça does not represent the source of the noise, but the all-encompassing abstract region (sub-section of reality) where it is taking place, which crucially includes at the same time the more diffuse source of the noise and the conceptualizing subject. In this configuration, ça's presence blurs the asymmetry between the object of conceptualization (the source and production of the noise) and the conceptualizing subject (the speaker) by making them both part of the profiled scene. The verb faire doesn't profile the creation of the source of the noise, but the sensation that fills the scene. The situation is similar in (35), where the scene is presented from within, without a clear delineation of the source of the singing or the participants conceptualizing it. This viewing configuration is frequently attested in less literary styles, as illustrated in the newspaper example in (36) (from Achard 2000) where the source of social unrest is clearly people in general, but is never presented as such. Rather, it is subjectively construed, implicitly situated within the abstract region ça profiles. Consequently, the main verb does not portray a clearly defined action with clearly identifiable sources, but the more global atmosphere that results from the subjective construal of the conceptualized scene:

(36) Vous avez vu un peu ce qui se passe dans ce pays? C'est la folie! Ça proteste, ça rouscaille, ça rouspète dans tous les coins.
Have you seen what's going on in this country? It's crazy! It [that] protests, it [that] complains, it [that] bitches in every corner.

4.4. Interplay between ça's anaphoric and cataphoric senses

These properties of anaphoric ça provide critical insight into the investigation of the pronoun's cataphoric sense because they shed light on the exact nature of the connection observed in the preceding section between the referred entity and the preceding context. More specifically, if we bear in mind what anaphoric ça does, it is often difficult to determine whether the pronoun has an anaphoric or cataphoric value. This is the case in 38 examples of the Les mandarins corpus illustrated in (37) and (38):

(37) Une morale de l'universel, on peut tâcher de l'imposer. Mais le sens qu'on donne à sa vie, c'est une autre affaire. Impossible de s'en expliquer en quatre phrases: il faudrait amener Lambert à voir le monde avec mes yeux. Henri soupira. C'est à ça que ça sert la littérature: montrer aux autres le monde comme on le voit . . . (Beauvoir, S. de. Les mandarins: 255)
One can try to impose a universal morality. But the meaning one gives to one's life, it is something else. Impossible to explain this in four sentences. I would have to bring Lambert to see the world through my eyes. Henri sighed. This is what literature is for: showing others the world as one sees it.

(38) Tu ne te promènes jamais? – Je n'ai pas le temps. – Qu'est-ce que tu fais donc? – Il y a toujours tant à faire; les cours de diction, les courses, le coiffeur: tu n'imagines pas quel temps ça prend, le coiffeur; et puis les thés, les cocktails. – Ça t'amuse tout ça? (Beauvoir, S. de. Les mandarins: 280)
You never go for a walk? – I don't have the time. – What do you do with your time? – There is always so much to do; enunciation classes, shopping, the hairdresser: you cannot imagine how long it [this] takes, the hairdresser; then tea parties, cocktail parties. – Does it [this] amuse you, all this?

In (37) and (38), a strict cataphoric analysis would obscure the pronoun's anaphoric role with respect to the preceding context. In fact, if the underlined entities were eliminated from all the examples, each of them would resemble the cases in the preceding section where anaphoric ça summarizes and generalizes the previous context. This is particularly obvious in (38). Since, on the one hand, the semantic import of tout ça 'all that' is to summarize and synthesize the elements of the previous context, and on the other hand, ça's cataphoric role refers to that entity, the pronoun's anaphoric and cataphoric roles necessarily greatly overlap.

The symbiotic character of ça's anaphoric and cataphoric values is further illustrated by a large number of cases where the entities which respectively precede and follow the pronoun can be reversed without noticeable change in meaning. This is illustrated in the pair in (39) and (40):

(39) Comment ça t'est venu, de vouloir écrire? – Oh! ça remonte loin, dit Henri. Ça remontait loin, mais il ne savait trop quelle importance accorder à ses souvenirs. – Quand j'étais jeune, ça me semblait magique un livre. (Beauvoir, S. de. Les mandarins: 92)
How did it come to you to want to write? – Oh! it was a long time ago, Henri said. It was a long time ago, but he didn't know how relevant his memories were. – When I was young, it [this] seemed magical to me, a book.

(40) Promenons-nous. Amusons-nous tant qu'il nous reste de la chair sur les os. Il haussa les épaules: tu sais bien que ça n'est pas si facile de s'amuser. – Essayons. Une grande balade dans les montagnes, ça serait bien, non? (Beauvoir, S. de. Les mandarins: 93)
Let's go for a walk. Let's have fun while we still have flesh on our bones. He shrugged: you know it isn't easy to have fun. – Let's try. A long hike in the mountains, it [this] would be fun wouldn't it?

The direction of the relation between the pronoun and its referent is reversed between the examples in (39) and (40). In (39), ça cataphorically refers to un livre 'a book', whereas in (40), the pronoun anaphorically refers to une grande balade dans les montagnes 'a long hike in the mountains'. In both cases, however, the referential relation could easily be reversed.
A possible alternative to (39) could be #un livre, ça me semblait magique 'a book, it seemed magical to me'; and the example in (40) could be manipulated to #ça serait bien, non, une grande balade dans les montagnes? 'it would be fun wouldn't it, a long hike in the mountains?' without a noticeable change in meaning. The possibility of these reversed construals clearly illustrates that the information conveyed by the referred entity is at least partially already available from the context, and that the speaker chooses to present it before or after the pronoun for expressive purposes. The semantic import of cataphoric ça therefore lies in the narrative benefit of presenting it after the pronoun. Some possible reasons for this choice will be considered in the following sections.

The results obtained in this section reveal a strong distinction between ceci's strict cataphoric sense where the referred entity is maximally separated from the immediate context, and all the other cases (i.e., the second and third configuration of ceci, but more importantly all the cases involving ça), where the indefinite pronoun shares varying degrees of lexical, conceptual, or contextual elements with the preceding context. The next section provides a preliminary attempt to integrate these different degrees of shared information into a more general account of cataphoric pronouns.

4.5. An integrated view of cataphoric pronouns

The starting point of a unified account of the types of constructions examined in this paper is provided by Smith's (2000) analysis of the cataphoric pronouns illustrated in (41) and (42) in English and German.11

(41) I despise it that John voted for the governor
(42) Wir bedauern (es) daß Hans so dumm ist
We regret [it] that Hans is so stupid
[respectively (1a) and (1b) in Smith (2000: 483)]

Smith argues that cataphoric pronouns serve a space designating function (Smith 2000: 486) by which they anticipate the mental spaces set up by space builders by designating the spaces themselves in the grammar (Smith 2000: 487). Smith's analysis is directly applicable to ceci's strict cataphoric sense, as the example in (18) repeated once again illustrates:

(18) Revenant l'autre soir du théâtre avec Robert, nous passons près d'un groupe d'agents qui causent entre eux à mi-voix, et ceci parvient jusqu'à nous dans le grand silence de la rue déserte: "il lui a filé un coup de lame." (Green, J. Journal. T. 5. 1946–1950: 210)
On the way back from the theater the other night with Robert, we walked past a group of policemen talking to each other in low voices, and this comes to us in the silence of the deserted street: "he cut him with a blade."

Consistent with Smith's analysis for other languages, the predicates which precede the pronoun in cases such as (18) are unquestionable space builders.
Ceci profiles the abstract nominal entity which announces the direct quote to follow, and the following quote elaborates this abstract setting. Strict cataphorics such as cecis first sense or Smiths examples in (41) and (42) constitute a limiting case, because the mental space which announces the upcoming entity is maximally abstract and devoid of specific contextual content. There are, however, no limits as to the possible internal structure of this space, and it seems reasonable to posit that it may be contextually elaborated at different degrees. I therefore suggest that cecis second and third construc11. In his paper, Smith also presents Russian cases which will not be considered here.
French il and a impersonals in copular complement constructions 475 tions, as well as the cases involving a only differ from the strict cataphorics in the level of contextual elaboration of the abstract setting the pronoun designates. Rather than empty and maximally abstract, it may contain a wide range of elements already present in the context or easily inferable from it. At the opposite end of the continuum from the strict cataphorics, this abstract setting may be fully inclusive of the discourse context within which the referred entity is extracted for specific purposes. For instance, in the example in (43), this purpose may include the need to synthesize and analyze the diverse elements of a globally construed context in order to present them in a way more suitable for communication: (43) il gagnait du terrain en province; et ce quil y avait de rconfortant, cest que les communistes ne lattaquaient plus: lespoir dune union durable se rveillait. Cest lunanimit que le comit dcida en no vembre de soutenir Thorez contre De Gaulle. a facilite bien la vie de se sentir en accord avec ses amis, ses allis, avec soimme, pensait Henri . . . (Beauvoir, S. de. Les mandarins: 231) he was gaining ground outside Paris; and what comforted him the most was that the communists werent attacking him any more: the hope of a lasting alliance was awakening. The committee unanimously decided to support Thorez against De Gaulle in November. It [this] makes life easier to feel in agreement with ones friends, ones allies, and oneself , Henri thought . . . In (43), the mental space a profiles is composed of the subjectively construed region which contains the previously mentioned events pertaining to Henris relations with his allies and himself. Because of its subjective, undifferentiated construal, the different elements which compose this space are not easily isolated and explicitly mentioned individually. 
The following (underlined) infinitival clause represents the objectification of some of these elements, in an analytical synthesis which makes them more suitable for communicative expression. The pronoun therefore does not profile the infinitival clause per se, but the undistinguished mass of the abstract contextual setting which contains it.12

12. Because the mental space the pronoun designates contains the segment of context from which a more specific entity is extracted, it is analogous (to a certain extent) to the more concrete setting previously illustrated in (8b) by 'Chicago is cold' and represented in figure (6c). Just as Chicago provides the boundaries within which the sensation of cold is experienced, ça in (43) designates the outlines of a section of discourse context (all the events which pertain to Henri's relationship with the members of his political world) from which the content of the post-verbal expression is extracted. The pronoun thus designates the contextual setting from which the entity which follows it is extracted.

It was claimed at the end of the preceding section that cataphoric pronouns should be considered along a continuum of thematic integration between the entity the pronoun refers to and its immediate context. It is now clear that this continuum reflects the level of contextual elaboration of the abstract setting which announces the pronoun's referent. The two poles of this continuum have been isolated. In ceci's strict cataphoric role on the one hand, the abstract space which announces the entity that follows (often a direct quote) is void of specific content. Consequently, that entity is thematically maximally separated from its immediate context. In cases with ça such as (43) on the other hand, that mental space critically includes a variety of events and circumstances, including the ones evoked by the entity which follows the pronoun. The latter is thus fully integrated in its surrounding context, in the sense that it represents the objectification of some circumstance previously only considered in an undistinguished, subjective fashion.

Since ça is situated at the most elaborated end of the cataphoricity continuum, we may legitimately wonder if the pronoun should even be considered to have a cataphoric sense at all. With respect to the issues discussed here, and because an exhaustive account of cataphoric pronouns does not constitute a goal of this paper, the answer is mainly definitional. On the one hand, as this section illustrated, ça's anaphoric and cataphoric senses both conceptually manipulate the preceding context in order to conjure up the region the pronoun profiles. On the other hand, cataphoric ça involves the extra step of singling out and objectifying a specific aspect of that region, so that the resulting entity elaborates the abstract setting (region) the pronoun designates.
It is certainly legitimate to decide that this extra step constitutes sufficient motivation to call this specific use of ça cataphoric, but it should be clear that this label does not, in any form, justify the structural distinction between demonstratives and impersonals that most syntactic accounts posit in their analyses of the distribution presented in (1) and (2).

Section 3 challenged the position that il is a meaningless dummy whose import is solely structural by showing that the pronoun is a referential expression which refers to the field, i.e., the mental range within which the event/proposition coded in the complement can be located. This section addressed the view that ça cataphorically refers to the complement clause by showing instead that the pronoun refers to a subjectively construed abstract region (setting), which contains the immediate circumstances articulated in the discourse from which the entity that follows the pronoun can be extracted. It should therefore be clear that the structural distinction between il and ça in examples such as (1) and (2) is unwarranted, and the next section argues that in the context of the copular complement construction, both il and ça should be considered impersonals.

4.6. Two impersonals in the copular complement construction

The emphasis has so far been placed on ça's cataphoric value in general, but we are now in a position to focus on the pronoun's possible impersonal sense, and ask under which conditions it may be considered an impersonal. Obviously, the answer to this question depends on the definition of impersonals, and solutions have been proposed which run the entire range of possibilities, from the narrowest syntactic accounts which systematically exclude demonstratives to much broader ones which include virtually all of them. For example, Olsson (1986: 29) recognizes as an impersonal "les verbes et les locutions précédés d'un pronom sujet neutre (il, ce, ou cela/ça) qui n'a aucun rapport avec ce qui précède dans le contexte, ni avec un mot particulier, ni avec le contenu total" 'the verbs and locutions preceded by a neuter subject (il, ce, or cela/ça) which has no connection with the preceding context, either with a specific word or the global content'. According to Olsson's broad definition, ça would undeniably be an impersonal in the example in (44):

(44) et toi, mauvais gredin, que je t'y reprenne à courir les routes en faisant le conspirateur! . . . ça t'étonne que je t'aie tiré de là, hein? . . . (Adam P. L'Enfant d'Austerlitz: 280)
'As for you, you good-for-nothing scoundrel, don't let me catch you running around doing mischief! It [this] surprises you that I got you out of this, doesn't it?'

In (44), ça displays all the characteristics we have observed in the preceding section. The pronoun profiles the section of current reality which contains the event described in the complement. Because this global subjective construal does not isolate the precise element which causes the hearer's surprise, the latter is later objectified and presented in the complement clause. As indicated earlier, the content of that clause is totally contained in the section of reality ça subjectively profiles, even before being objectified and singled out as the specific reason for the hearer's surprise.
However, despite this viewing organization, it seems counter-intuitive to call the construction in (44) an impersonal because the construal of the complement scene is specifically tied to the hearer, and therefore not potentially available to any conceptualizer. In other words, the construction lacks the degree of generality required to be considered an impersonal. In copular complement constructions, however, the event/proposition coded in the complement is not exclusively considered with respect to its effect on a specific conceptualizer, but evaluated relative to the general categories of reality (epistemic modals), necessity (deontic modals), or emotion (emotion reaction), usually available to anyone. Because any conceptualizer in the right position will invariably experience the surprise caused by the observation of the scene the complement clause profiles, the construction illustrated in (45) meets the definition of an impersonal. It is semantically very close to the one in (44), but the presence of the copula provides the additional level of generality required of impersonal constructions.
(45) D'ailleurs il n'est pas tout à fait vrai que le chemin de fer ait un tracé aussi raide, aussi indifférent et brutal qu'on veut bien le dire. Ainsi que tu me le faisais remarquer l'an dernier, en haut de La Sèche, c'est étonnant de voir comme il s'est incorporé au paysage (Rivière, J. Correspondance avec J. Rivière: 28)
'In any case, it is not quite true that the railroad track cuts such a steep, indifferent and brutish path as people have said. As you were indicating to me last year, at the top of La Sèche, it [this] is surprising to see how well it blends into the landscape'

The decision to restrict the impersonal label to the copular complement construction may seem somewhat arbitrary because, in addition to the constructions illustrated in (44), which specifically mention the experiencer (by an accusative or dative pronoun) and thus restrict the experience to that conceptualizer, there exist other cases in which the construal of the complement scene is not so directly tied to a specific character, as illustrated in (46).

(46) Dame, tu comprends, quand on se sent si loin de son pays, au milieu des sauvages, ça fait rudement plaisir de se retrouver. (Moselly, E. Terres lorraines: roman: 100)
'Well, you see, when one feels so far away from home, among savages, it [this] makes you very happy to get together'

Although the inclusion of constructions such as the one in (46) in the impersonal category would be unproblematic given the analysis presented here, it would nonetheless require a careful analysis of a large number of predicates, as well as a clear understanding of the conditions under which their level of generality meets the impersonal requirement. For the time being, and excluding the weather expressions not considered in this paper, it seems reasonable (and quite conservative) to reserve ça's impersonal label for its usage in the copular complement construction.13

13. The fact that il and ça are only in competition in the copular complement construction further emphasizes the conservative aspect of the analysis proposed here. This is illustrated in (i) and (ii). (i) shows that the predicate être étonnant 'be surprising', already attested with c' in (45), also occurs with il; (ii) shows that an attempt to replace ça by il with the semantically related predicate étonner 'surprise' yields ungrammatical results:
(i) n'est-il pas étonnant que la ruche que nous voyons ainsi confusément, du haut d'un autre monde, nous fasse, au premier regard que nous y jetons, une réponse sûre et profonde? (Maeterlinck M. La vie des abeilles: 46)
'Isn't it surprising that the hive that we see so indistinctly from the top of another world would give us, as soon as we look at it, such a positive and profound response?'
(ii) #*et toi, mauvais gredin, que je t'y reprenne à courir les routes en faisant le conspirateur! . . . il t'étonne que je t'aie tiré de là, hein?
'As for you, you good-for-nothing scoundrel, don't let me catch you running around doing mischief! It [this] surprises you that I got you out of this, doesn't it?'

In the copular complement construction, French therefore has two kinds of impersonals, respectively introduced by il and ça. As the preceding sections have indicated, the two pronouns are very close in meaning, and their syntactic overlap merely reflects the extreme closeness of two meaningful expressions. Il profiles the field, i.e., the conceptualizer's awareness or mental reach which permits the situation profiled in the complement to be assessed (Achard 1998, Langacker 2009, Smith 2006). Demonstrative impersonals profile the subjectively construed abstract setting, i.e., the immediate circumstances from which the event or proposition coded in the complement clause can be extracted. The complement clause represents the objectification of a specific part of that scene singled out for expressive reasons. In this specific discourse context, demonstrative impersonals are therefore true abstract setting constructions, where the setting, i.e., the abstract region the pronoun refers to, is composed of interconnected entities from the current discourse context.

The semantic difference between the two pronouns is thus subtle, and essentially pertains to il's emphasis on the conceptualizer's mental effort to assess the scene in the complement, while ça is more specifically concerned with the immediate context itself. Il is therefore larger in scope, because it includes the conceptualizer-centered considerations of assessment and analysis among others. However, this distinction is particularly tenuous within the confines of the copular complement construction, where the mental effort required to locate or evaluate the event/proposition in the complement is largely based on the examination of the circumstances which surround it. In this context, il and ça exhibit such a large amount of conceptual overlap that the competition in their distribution is inevitable. The next section considers this distribution more specifically.

5. Distribution of il and ça impersonals

The data for the investigation of il and ça's distribution come from the FRANTEXT corpus exclusively. First, the consideration of one single work, namely Simone de Beauvoir's Les mandarins, yields some general observations about the relative distribution of the two constructions. Secondly, all the 20th century texts available (760 texts) were searched for individual predicates, and an in-depth analysis is presented for vrai 'true'. These two sources of data serve different and complementary purposes. On the one hand, a complete novel provides a unique context from which the author's selections stand out more acutely, and each form's narrative function can be exploited more visibly. Furthermore, it constitutes an integrated whole with respect to which the relative distribution of whole classes of predicates can be evaluated. However, because the meaning distinction between il and ça impersonals is quite subtle, a great deal of author- and genre-specific variation can be expected, and the examination of a single work may yield results too idiosyncratic to be safely generalized to other works. Consequently, the results obtained from a single text need to be evaluated against the predicate-specific investigations conducted across many texts.14 The observation of the Les mandarins data recapitulated in Table 2 reveals that ça impersonals by far outnumber their il counterparts:
Table 2. Overall distribution of copular impersonals in Les mandarins

              il     ça
Deontic       1      -
Epistemic     8      40
Evaluative    26     167
Total         35     207
This distribution calls for some observations. First, even though deontic, epistemic, and evaluative forms are attested (at least for one form), the overwhelming majority of copular impersonal predicates are evaluative. The scarcity of deontics is explained by the quasi-monopoly in French of another impersonal form, namely il faut 'it is necessary' (401 instances in Les mandarins). Secondly, epistemic predicates are represented by a very small number of verbs, as illustrated in Table 3:
Table 3. Distribution of epistemic copular impersonals in Les mandarins

              il                 ça                 Total
              Number   %         Number   %         Number   %
Vrai          2        6.06      31       93.94     33       100
Possible      5        62.5      3        37.5      8        100
Impossible    -        0         5        100       5        100
Probable      1        100       -        0         1        100
Sûr           -        0         1        100       1        100
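The percentages in Table 3 can be rechecked from the raw counts. The short Python sketch below is offered only as an illustration and is not part of the original analysis: the names TABLE_3, shares and column_totals are hypothetical, and the cells left blank in the table are assumed to be zero, which is consistent with the row totals.

```python
# Counts transcribed from Table 3 (Les mandarins).
# Assumption: cells left blank in the table are treated as 0.
TABLE_3 = {
    "vrai": {"il": 2, "ça": 31},
    "possible": {"il": 5, "ça": 3},
    "impossible": {"il": 0, "ça": 5},
    "probable": {"il": 1, "ça": 0},
    "sûr": {"il": 0, "ça": 1},
}


def shares(counts):
    """Return each form's percentage of a predicate's total, rounded to 2 places."""
    total = sum(counts.values())
    if total == 0:
        return {form: 0.0 for form in counts}
    return {form: round(100 * n / total, 2) for form, n in counts.items()}


def column_totals(table):
    """Sum the counts of each form over all predicates in the table."""
    totals = {}
    for counts in table.values():
        for form, n in counts.items():
            totals[form] = totals.get(form, 0) + n
    return totals


if __name__ == "__main__":
    for predicate, counts in TABLE_3.items():
        print(predicate, shares(counts))
    print("totals", column_totals(TABLE_3))
```

Under these assumptions, the shares computed for vrai (6.06 for il, 93.94 for ça) agree with the table, and the column totals (8 for il, 40 for ça) agree with the epistemic cells of Table 2.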
14. It should also be noted that the conclusions reached in this paper concerning written French cannot be extended to the spoken language, where the proportion of il impersonals is considerably reduced.

Finally, the category of copular evaluative predicates, the largest of the three, is extremely varied. While the 26 il instances represent 17 separate predicates, the 167 ça examples distribute themselves among 67 different predicates. The only overlapping predicates are illustrated in Table 4:
Table 4. Overlapping copular impersonals

              il                 ça                 Total
              Number   %         Number   %         Number   %
Facile        3        20        12       80        15       100
Urgent        2        40        3        60        5        100
Important     2        33.3      4        66.7      6        100
Absurde       1        16.7      5        83.33     6        100
Inutile       1        25        3        75        4        100
Bon           2        33.3      4        66.7      6        100
Naturel       1        11.1      8        88.9      9        100
The relatively small number of overlapping predicates merely illustrates the great variety and eclecticism of the two constructions. It should not be interpreted as meaning that the non-attested forms are ungrammatical, but simply that they were not selected.15

15. The few predicates where the attested ça cases (a) have no corresponding example with il in the larger corpus, and (b) sound highly questionable when an il equivalent was manufactured are given in (i)-(iv):
(i) Quand on réussit, on a un tas de problèmes, mais on en a aussi quand on ne réussit pas. Ça doit être morne de parler et de parler sans jamais éveiller un écho. (Beauvoir, S. de. Les mandarins: 507)
'When you are successful, you always have problems, but you also have problems when you are not successful. It [this] must be gloomy to be talking and talking without ever getting a response.'
(ii) Il se leva: venez vous promener. La nuit sent si bon. - Il faut revenir chez ces gens, Lewis. Ils vont remarquer notre absence. - Et après? Je n'ai rien à leur dire ni eux à moi. - Mais ce sont des amis des Murray: ça ne serait pas gentil de disparaître comme ça. (Beauvoir, S. de. Les mandarins: 452)
'He stood up: Let's go for a walk. The night smells so good. - We need to go back inside, Lewis. These people will notice we are gone. - So what? There is nothing for us to talk about. - But they are friends of the Murrays: it [this] wouldn't be nice to disappear without saying anything.'
(iii) Je voudrais un café. J'ai peur d'avoir trop bu. Il sourit: une américaine demanderait un autre whisky, dit-il. Mais vous avez raison: ça serait moche si un de nous deux n'avait plus toute sa tête. (Beauvoir, S. de. Les mandarins: 73)
'I would like some coffee. I am afraid I drank too much. He smiled: an American would ask for another whisky, he said. But you are right: it [this] would be ugly if one of us didn't have her wits about her.'
(iv) Je contemplais avec détresse la table chargée de pâtés, de salades, de gâteaux: ça serait long d'en venir à bout! (Beauvoir, S. de. Les mandarins: 522)
'I looked with despair at the table loaded with pâtés, salads, cakes: it [this] would take a long time to finish all this.'
These adjectives seem very closely tied to the event itself (long 'long', morne 'gloomy', moche 'ugly'), or to the sensation their occurrence provokes (gentil 'nice'). If we compare them to others which are also felicitous with il, such as inconcevable 'inconceivable' or agréable 'pleasant', they seem to be less capable of providing the analytical judgment these adjectives convey. This observation is compatible with the analysis presented here, but it needs to be confirmed by further research.

The examination of Les mandarins provides a snapshot of il and ça's usage in copular complement constructions, but we cannot forget that this snapshot is critically shaped by the author's general narrative and esthetic purposes and, as such, cannot be considered fully representative of the pronouns' overall distribution in written French. This does not, however, question the validity of examining a single text. Any discrepancy between the usage observed in that text and more general tendencies can provide valuable insights into the meaning of the two pronouns if the discrepancies can be shown to correlate with the author's narrative strategies. A cursory glance at the comparison of the data from individual predicates in Les mandarins to larger sets taken from the 760 20th century documents of the FRANTEXT database reveals important differences. For example, Tables 5 and 6 respectively recapitulate the overall distribution of epistemic and some evaluative predicates in the larger corpus.

Table 5. Distribution of some epistemic predicates in the larger corpus

              il                ça                Ø                 Total
              Nb.      %        Nb.      %        Nb.      %        Nb.      %
Vrai          1256     77.25    370      22.75    -        0        1626     100
Possible      946      93.67    47       4.65     17       1.68     1010     100
Probable      224      95.32    0        0        11       4.68     235      100

Table 6. Relative distribution of copular impersonals with four evaluative predicates in the larger corpus

              il                ça                Ø                 Total
              Nb.      %        Nb.      %        Nb.      %        Nb.      %
Facile        481      80.1     115      19.2     4        0.7      600      100
Agréable      88       67.7     42       32.3     0        0        130      100
Ennuyeux      2        6.45     29       93.55    0        0        31       100
Dommage       15       10.87    37       26.81    86       62.32    138      100

If we compare them to Tables 3 and 4, we realize that, unlike what was observed in Les mandarins, il is much more frequent than ça. The difference is even more striking for individual predicates. For example, il occurs in only 6.06% of the cases involving vrai in Les mandarins, but in 77.25% of those in the overall corpus. Similarly, ça is attested in 80% of the instances with facile 'easy' in Les mandarins, whereas in the more general corpus it only represents 19.2% of the cases. The end of this section will show, however, that the discrepancy between the two corpora can be explained by the narrative strategies used in Les mandarins, and that the meaning of the two pronouns as it is described in this paper provides a possible motivation for the author's unconventional selection.

The data in Table 6 also reveal that the distribution of impersonal forms varies greatly depending on the meaning of individual predicates, even within the same general semantic classes. For example, even though the four predicates être facile 'be easy', être agréable 'be pleasant', être ennuyeux 'be annoying', and être dommage 'be a pity' are all evaluative, their relative distribution with the three impersonal forms varies considerably. Whereas facile 'easy' and agréable 'pleasant' favor il, ennuyeux 'boring' overwhelmingly prefers ça, and dommage 'pity' is most frequently attested with Ø.16 The individual idiosyncrasy of each predicate illustrated in Table 6 has made it difficult to find global strategies which determine the use of a particular pronoun, beyond general observations concerning the more familiar registers ça is assumed to cover. Even the more sophisticated accounts have had problems with the large amount of individual variation these predicates exhibit.
For example, Le Bidois and Le Bidois (1938: 116) essentially claim that impersonal demonstratives provide greater force to a statement: "Toutes les fois qu'on veut mettre de la force dans l'énonciation d'un jugement, c'est ce qui est aujourd'hui préféré." 'Every time additional force is needed in a judgment, ce is preferred nowadays.' Furthermore, "l'énonciation purement objective, rationnelle, se contente très bien de il; mais qu'il intervienne un élément subjectif, sentimental, on voit tout de suite paraître ce" 'rational, objective statements are perfectly content with il, but as soon as a subjective, emotional element occurs, ce immediately appears'. These observations are certainly sound, and the analysis presented in the remainder of this section is largely compatible with them, but they are meant to apply to the distribution as a whole, and therefore cannot adequately distinguish agréable from ennuyeux, for example, or explain their different distribution. The only way of dealing with this amount of idiosyncrasy within similar semantic classes consists in addressing each predicate individually. The next section will thus consider il and ça's (c') distribution with the epistemic copular predicate est vrai 'is true'. For an exhaustive analysis of the distribution of the two forms, other individual predicates need to be analyzed in a similar fashion before any general conclusion can be drawn.

16. It is interesting to note that Ø impersonals constitute the most frequent form with dommage 'pity', although they are seldom mentioned in the literature. The matter will not be pursued here.

5.1. Vrai
As an existential predication, il introduces the post-verbal entity with respect to its sphere of existence. As a demonstrative, ça pertains to the evaluation of the post-verbal entity (Achard 2009b). The use of the pronouns in the copular complement construction represents an extension of their more prototypical senses. Section 3 showed that the epistemic predications profile the conceptualizer's efforts to assess the location of the proposition coded in the complement with respect to R. In the case of vrai 'true', that proposition is recognized as part of some conceptualizer's dominion. By contrast, with impersonal demonstratives, the proposition previously considered is confirmed, i.e., evaluated as true. The notions of existence and evaluation which suitably describe il and ça respectively are therefore inherited from both pronouns' more prototypical senses, and in the context of the copular complement construction, the difference between the two is often too subtle to carry a substantial semantic distinction, as in the example in (47), where c' could be used as an alternative to the attested il with little difference in meaning:

(47) Il me suffira de rappeler comment M. Klein, dans une question relative aux surfaces de Riemann, a eu recours aux propriétés des courants électriques. Il est vrai que les raisonnements de ce genre ne sont pas rigoureux . . . (Poincaré H. La valeur de la Science: 154)
'It will be sufficient to remind you how M. Klein, in a question relative to Riemann surfaces, used the properties of electrical currents. It is true that such arguments are not rigorous'

However, these notions are reinterpreted within the domain of discourse coherence, where they essentially pertain to the management of argumentation, i.e., the articulation of the different propositions which constitute the overall discursive strategy of a given passage.
For instance, il is frequently attested if the proposition it introduces serves to temper a previously made statement by presenting a piece of information that challenges its force. This is illustrated in the examples in (48) and (49):

(48) À voir cela, il me semble que la révolte est plus loin de nous que je ne croyais d'abord. Il est vrai que je suis avec des montagnards, écartés des centres industriels et très fatalistes. (Rivière, J. Correspondance avec J. Rivière: 120)
'When I see this, it seems to me that the rebellion is further from us than I first thought. It is true that I am with mountain men, remote from industrial areas, and very fatalistic.'

(49) Au reste, vins, bières, ou cidres, il savait rendre justice à tout ce que le seigneur a créé d'excellent. Il n'était pas assez malavisé pour laisser sa raison dans son verre, et il gardait la mesure. Il est vrai que cette mesure était copieuse, et que dans son verre une raison plus débile se fût noyée. (Rolland, R. Jean-Christophe le matin: 123)
'Besides, wine, beer, or cider, he knew how to do justice to all the excellent things the Lord created. He didn't have the bad judgement to leave his reason in his glass, and he kept his measure. It is true that this measure was large, and that in his glass, a weaker reason would have drowned.'

In the examples in (48) and (49), the proposition il introduces presents a piece of information that challenges the generalizing force of the previous statement. In (48) for instance, the author's earlier position about the state of the rebellion is nuanced by his further consideration of the fatalistic nature of his companions. Il's selection is consistent with its meaning described in the previous section, because the mere statement of the existence of a fact which runs counter to the overall argument suffices to weaken the latter's scope and power. Furthermore, the proposition in the complement has not been presented before, and even though the speaker is obviously aware of the events he is reporting, the impersonal construction merely asserts its existence with respect to R. The presence of c' in this context is not impossible, but it would indicate that the proposition in the complement had somehow been previously established, and is now being evaluated.
In the context of (48) and (49), each passage would come across as a piece of internal dialogue, where the speaker plays the role of distinct protagonists and answers his own objections as if they came from other discourse participants. This construal is obviously marked, since such dialogic practices are more frequently reserved for interactive communication. Conversely, and also consistent with the pronoun's meaning, the presence of c' often indicates that the proposition it introduces directly corresponds to some element of the immediate context, as illustrated in (50) and (51).

(50) je sais: il a tué un pauvre vieil homme sans défense: Farnese était seul, pas un laquais, et le coup de revolver a été tiré par derrière. Je sais tout ça. . . . mais écoutez un peu: ce n'est pas vrai que Farnese était seul. (Farrère C. L'homme qui assassina: 280)
'I know: he killed a poor defenseless man: Farnese was alone, not a servant, and the shot was fired from behind. I know all that . . . but listen for a minute: It [this] is not true that Farnese was alone.'
(51) S'il te faut une confiance perpétuelle sache que tu l'as et que c'est elle qui s'inquiétait quand j'écrivais ma dernière lettre. Mais sache aussi que cette confiance est exigeante et demande qu'on la satisfasse. C'est vrai que je suis "près de mes intérêts". Plus je vais, plus je veux acquérir. (Rivière, J. Correspondance avec J. Rivière: 207)
'If you require everlasting trust, know that you have it and it was that trust that was getting worried when I wrote my last letter. Be also aware, however, that this trust is demanding and expects to be satisfied. It [this] is true that I am "close to my interests". I want to acquire more and more as time passes.'

In the examples in (50) and (51), the proposition c' introduces expresses agreement or disagreement with a statement made in the previous discourse. In (50), Farnese était seul 'Farnese was alone' repeats a section of the preceding discourse verbatim. In (51), the quotes surrounding près de mes intérêts 'close to my interests' indicate that this very expression was used in a previous letter. The examples in (50) and (51) are very different from the ones in (48) and (49). They describe multi-character interactions, where the impersonal demonstrative is used dialogically to manage the exchange between the participants. The literary register of those examples also suggests that the impersonal demonstratives' dialogic character, rather than their confinement to lower linguistic genres, is responsible for the often-made statements about their conversational nature.

If the content expressed in the complement proposition has already been established in the context, and the communicative purpose of the predicate vrai is merely to confirm it, c' alone is possible. For example, in (52), the presence of c' is expected because the speaker confirms a rumor about himself. Il would be awkward because it would imply that the speaker is stating the existence of what is already common knowledge.

(52) Mais je ne prendrai pas un coup, Maria, pas un seul! Il hésita un peu et demanda abruptement, les yeux à terre: peut-être . . . vous a-t-on dit quelque chose contre moi? - non. - c'est vrai que j'avais coutume de prendre un coup pas mal, quand je revenais des chantiers et de la drave; mais c'est fini. (Hémon L. Maria Chapdelaine: 93)
'But I won't have a drink, Maria, not a single one! He hesitated a little and asked suddenly, his eyes downcast: maybe someone told you something against me? - no. - it [this] is true that I used to drink quite a bit when I came back from working or cutting wood; but it's over.'

Because of the interactive value of the demonstrative impersonals, and the possibility for the pronoun to present speaker-internal arguments, the acceptability of c' greatly relies on the hearer's ability to get independent access to the proposition it introduces, so that she can anticipate the speaker's adjustments as if they were her own, and hence follow the general trend of discourse. This independent evidence may come from the context itself, or from global world knowledge. Consider the examples in (53) and (54):

(53) La reproduction des organismes unicellulaires consiste en cela même, l'être vivant se divise en deux moitiés dont chacune est un individu complet. Il /#c'est vrai que, chez les animaux plus complexes, la nature localise dans des cellules dites sexuelles, à peu près indépendantes, le pouvoir de produire à nouveau le tout. (Bergson, H. L'évolution créatrice: 12)
'The reproduction of single cell organisms consists in this very process, the being divides itself into two halves, each of which is a complete being. It/this is true that, with more complex animals, nature places in the quasi independent so-called sexual cells the power to reproduce the whole.'

(54) Les systèmes délimités par la science ne durent que parce qu'ils sont indissolublement liés au reste de l'univers. Il /#??c'est vrai que, dans l'univers lui-même, il faut distinguer, comme nous le dirons plus loin, deux mouvements opposés, l'un de descente, l'autre de montée. (Bergson, H. L'évolution créatrice: 10)
'The systems outlined by science only last because they are forever connected to the rest of the universe. It/??this is true that in the universe itself, we must distinguish, as we will say later, two opposite movements, one descending, the other one ascending.'

Both examples in (53) and (54) illustrate the pattern already presented in (48) and (49), where the proposition which follows the predicate tempers the overall force of the previous utterance. As mentioned earlier, the presence of c' in these cases is interpreted as some kind of internal dialogue where the speaker is construed as answering self-evoked, unexpressed dissenting arguments.
As indicated above, the difference in acceptability between (53) and (54) is imputable to the hearer's being able to access the proposition coded in the complement independently, so that she can follow the speaker's progress through his own arguments. This is certainly the case in (53), because people are generally aware of some rudiments of complex cell reproduction, and the proposition which follows the predicate is thus readily available as an objection, which the speaker considers (and agrees with) as such. In (54) on the other hand, the presence of comme nous le dirons plus loin 'as we will say later' indicates that the proposition il faut distinguer deux mouvements opposés, l'un de descente, l'autre de montée 'we must distinguish two opposite movements, one descending, the other one ascending' has never been introduced in the discourse before. It is also highly unlikely that the hearer will be acquainted with the concept independently, which makes the interpretation of the proposition as an
unmentioned objection the speaker agrees with difficult to accept given regular discourse conventions, and hence renders the presence of c' awkward.

The analysis presented in this section is strengthened by the presence of several lexical/syntactic environments which allow a more precise observation of the meaning of the pronouns. Two of them will be considered here. First, the split between the evaluative and argumentative functions respectively encountered for ça and il in the previous examples can also be observed when the proposition the pronoun introduces starts with the hypothetical si 'if'. In this context, every single one of the 10 attested instances of si c'est vrai 'if it is true' indicates the speaker's attempt to confirm the status of the proposition with respect to reality. The important point for our purposes is that in each case, the proposition's epistemic status is not established with certainty, but merely considered potentially true. The speaker's desire for epistemic confirmation may be prompted by a rumor as in (55), or by a suspicion arising from the interpretation of certain behaviors as in (56). In any case, however, the pronoun's import is clearly evaluative, because its primary function is to elicit the truth of the proposition coded in the complement.

(55) monsieur, dites-moi un peu si c'est vrai que vous faites des vers charmants? Je l'ai entendu dire en ville. (Colette, G. Claudine à l'école: 118)
'sir, please tell me if it [this] is true that you write charming poetry? I heard it in town.'

(56) si c'était vrai que t'es grosse, dit la mère, la seule chose, c'est de te laisser courir après par les deux Merlavigne. (Martin du Gard, R. Vieille France: 1096)
'if it [this] is true that you are with child, the mother said, the only thing to do is to let the two Merlavigne cozy up to you.'

This evaluative function of confirmation can also be observed with il, as illustrated in (57), but it is only attested in 13 of the 124 instances of s'il est vrai.
(57) je me souviens que je lui ai demandé s'il était vrai que son frère était parti. (Simenon, G. Les vacances de Maigret: 137)
'I remember that I asked him if it was true that his brother had left.'

In a way congruent with the tendencies observed earlier, the overwhelming majority of the s'il est vrai cases pertains to the management of the diverse facets of the speaker's argument, or more specifically to the establishment of specific relations between different propositions to advance that argument in the desired direction. With il, the si clause sets up a proposition as true and considers the ramifications for the rest of the argument. The relations between the proposition introduced by the si clause and other propositions which follow
are varied. The cases where the si clause presents the protasis in a hypothetical construction are the most frequently attested, with 57 instances, illustrated in (58):

(58) "Ignore-toi toi-même", c'est le premier précepte de la sagesse. S'il est vrai que Montaigne composa ses essais pour étudier son propre individu, cette recherche lui dut être plus cruelle que les pierres qui lui déchiraient les reins. (France, A. Le petit Pierre: 265)
'"Ignore thyself" is the first rule of wisdom. If it is true that Montaigne wrote his essays to study his own self, this search must have been more painful than the stones which broke his back.'

Other instances present different logical configurations between the different propositions which constitute the overall argument. For example, despite its reality, the scope of the si clause is reduced by the following proposition in (59):

(59) Mais nous avons pris l'individu à l'état isolé, sans tenir compte de la vie sociale. En réalité, l'homme est un être qui vit en société. S'il est vrai que l'intelligence humaine vise à fabriquer, il faut ajouter qu'elle s'associe, pour cela et pour le reste, à d'autres intelligences. (Bergson, H. L'évolution créatrice: 158)
'But we have considered an individual alone, without taking his social life into account. In reality, man is a social being. If it is true that human intelligence aims to build things, it must be added that to do this, as to do anything else, it collaborates with other intelligent beings.'

In (60), the si clause serves as a given to introduce another proposition of equal status, although it may not be so readily available:

(60) S'il est vrai que la mer ait été autrefois notre milieu vital où il faille replonger notre sang pour retrouver nos forces, il en est de même de l'oubli, du néant mental; on semble alors absent du temps pendant quelques heures (Proust, M.
À l'ombre des jeunes filles en fleurs: 820)
'If it is true that the sea once was our life environment where we need to immerse our blood to recover our strength, the same can be said about oblivion, about mental vacuum; we then seem to be gone from time for a few hours.'

The exact nature of the relation between the different propositions (as well as the number of different cases) is difficult to isolate precisely. The important point for our purposes is that in each case, the si clause constitutes the reference point with respect to which the following clause is considered. This is fully consistent with the pronoun's meaning as it was presented earlier. The
statement alone of the proposition (i.e., its existence), and not its evaluation, is sufficient to calculate the status of the other proposition with respect to which it is evaluated.

It should be noted that the evaluative and argumentative senses favored respectively by ça and il are not as separate as this presentation implied for expository purposes. In fact, the situation described in (56) does conform to the protasis/apodosis pattern in some way, since the course of action the mother advocates should only be carried out if her daughter's pregnancy is confirmed. However, despite the logical relation that exists between the two elements of the sentence, the pronoun's sense remains predominantly evaluative because the truth of the proposition coded in the si clause is still being debated. With il, the truth of the proposition does not constitute the speaker's primary focus. For example, in (58), we will never know the true reasons which prompted Montaigne to write, but this doesn't detract from the overall validity of the argument. The emphasis is placed on the resulting consequences should the hypothesis be true, or in other words on the logical relation which exists between the two propositions. In this environment again, the two pronouns behave in a way consistent with their meaning as it was presented in Sections 3 and 4.

Finally, in the context of n'en est pas moins vrai 'nonetheless remains true', only il appears. The sequence il n'en est pas moins vrai 'it is nonetheless true' occurs 99 times in the corpus (7.90% of the il est vrai cases), while no example is attested with ça. This construction is illustrated in (61) and (62):

(61) Un général victorieux et qui apportait de l'argent se rendait indispensable. Et la popularité de Bonaparte grandissait. Il n'en est pas moins vrai que bien des français se demandaient si l'on allait se battre toujours, enrôler toujours, conquérir toujours. (Bainville, J.
Histoire de France: 196)
'A victorious general who brought in money was making himself indispensable. So Bonaparte's popularity grew. It is nonetheless true that many French people were wondering if they would always be enrolling new soldiers, always be fighting, always be conquering new lands.'

(62) Voici qu'une fois encore un des grands sujets de mon cours, le George, a été saboté – pas tout à fait par ma faute cette fois-ci – à cause et de ma santé et de l'ouragan Keyserling; mais il n'en est pas moins vrai que depuis l'ouverture de ce cours il y a quatre ans, j'enregistre de plus en plus de désastres . . . (DuBos, C. Journal T. 3: 63)
'Once again, one of the most important topics of my course, the George, has been sabotaged – not quite by my fault this time – because of my health and the Keyserling hurricane; but it is nonetheless true that since this course opened four years ago, I have been the victim of an increasing number of disasters . . .'
In the examples in (61) and (62), the mitigating circumstances that surround the proposition introduced by the impersonal pronoun are not sufficient to obscure its reality. This structure matches up perfectly with the il construction, which codes the existence of the proposition. This function of the pronoun makes it ideally suited to express this sort of narrative bottom line.

This section has shown that the distribution of il and c' with the epistemic predicate est vrai conforms to specific tendencies, and that these tendencies are congruent with the two pronouns' respective meanings presented in Sections 3 and 4. C' is specifically concerned with the evaluation of the proposition coded in the si clause as real, while il focuses more directly on that proposition's sheer existence, or, alternatively, its being treated as real in order to evaluate the ensuing consequences for the remainder of the argument. However, the example in (63) serves to remind us that these tendencies are not absolute. Despite the fact that the complement content represents an almost verbatim repeat of a former statement, and that the communicative purpose of the sentence is to confirm the protagonist's statement, il is used. Other factors, in particular socio-linguistic ones and ultimately authorial choice, also bear on the selection of the impersonal pronoun.

(63) Tu as une grande fièvre. Tu es pétrie de tristesse. Ton âme est pétrie de tristesse. Ton oncle est allé la chercher, hein. Jeannette – il est vrai que mon âme est dans la tristesse. (Péguy, Ch. Le mystère de la charité de Jeanne d'Arc: 16)
'You have a great fever. You are immersed in sadness. Your soul is immersed in sadness. Your uncle went to fetch her, didn't he. Jeannette – it is true that my soul is in sadness.'
We can now come back to the discrepancy observed between the distribution of the two pronouns in the larger corpus and the one in Les mandarins, where ça (c') outnumbers il by 31 to 2. The dialogic nature of the novel, and the way in which the ideas emerge out of conversations between the protagonists, goes a long way toward explaining this frequency difference. Nineteen examples occur in conversation, and precisely follow the tendency observed throughout this section, namely the confirmation of the potential for the proposition coded in the si clause to be true, as illustrated in (64):

(64) C'est l'ensemble qui est moche: comme ils ménagent les fritz, y compris les nazis, et comment ils traitent les types des camps. – Je voudrais bien savoir si c'est vrai qu'ils interdisent les camps à la croix-rouge française, dit Henri. – C'est la première chose que je vais vérifier, dit Lambert. (Beauvoir, S. de. Les mandarins: 131)
'The whole picture is ugly: how they tiptoe around the Krauts, including the Nazis, and how they treat people in the camps. – I would like to
know if it [this] is true that they do not allow the French Red Cross into the camps, Henri said. – It is the first thing I will be checking, Lambert said.'
Furthermore, 12 of the 31 examples of c' involve speaker-internal dialogue, some of which, illustrated in (65), preserve the formal structure of dialogue with the quotation marks:

(65) Henri s'arrangerait pour l'associer de plus en plus étroitement à la vie du journal; Lambert se formerait politiquement, il se sentirait beaucoup moins perdu dans le monde, et une fois tout à fait dans le coup, il ne se demanderait plus que faire de sa peau. "C'est vrai que ce n'est pas commode d'être jeune en ce moment", se dit Henri. (Beauvoir, S. de. Les mandarins: 255)
'Henri was finding ways to integrate him more and more closely to the daily life of the paper; Lambert would learn politics, he would feel far less lost in the world, and once he was totally comfortable, he would no longer wonder what to do with himself. "It [this] is true that it is not easy to be young nowadays," Henri said to himself.'

Even when there are no quotation marks, the internal dialogue originates from a conversation, and the proposition whose truth the character confirms arises from that previous conversation, as if the protagonist were continuing the dialogue for herself, as illustrated in (66):

(66) Lewis se mit à rire: – Pauvre petite gauloise! Comme elle a l'air pitoyable dès qu'on ne fait plus ses quatre volontés! Je rougis. C'était bien vrai que Lewis ne pensait jamais qu'à me faire plaisir. (Beauvoir, S. de. Les mandarins: 438)
'Lewis started laughing: – My poor little Gaulish girl! How pitiful she looks when things don't go her way! I blushed. It [this] was really true that Lewis was always trying to make me happy.'

It is therefore clear that the difference in frequency between Les mandarins and the larger corpus does not present any challenge to the analysis developed in this section.
To the contrary, the systematic use of the demonstrative impersonal along with similar discursive strategies designed to highlight the interactive and conversational tone of the novel provides further validation for that pronoun's meaning as it is presented in this paper.

Unfortunately, the analysis of vrai is not directly expandable to other predicates because the adaptation of the pronoun's meaning to the discourse context is obviously mediated by the semantic import of the predicate. If the interactive, evaluative, and argumentative functions the pronouns have displayed in
this section represent well-motivated adaptations of the pronouns' meaning in other contexts, we shouldn't necessarily expect to see them reproduced with other predicates, especially with predicates from other semantic classes. For this reason, general statements about the overall distribution of il and impersonal demonstratives only stand a chance of being correct if they are based on fine-grained analyses of individual predicates.

6. Conclusion and possible avenues of research for a multidimensional view of impersonals
This paper argued that in the context of the French copular complement construction, the structural distinction between impersonal pronouns (il) and demonstratives (ça) that most syntactic accounts of impersonals posit is not warranted because i) the impersonal il is not a syntactic placeholder but a meaningful expression, and ii) ça cannot be considered to refer cataphorically to the complement clause. The competition between il and ça therefore doesn't involve two syntactically distinct structures, but two expressions very close in meaning. Il profiles the field, i.e., the conceptualizer's awareness or mental reach which permits the situation profiled in the complement to be assessed, and ça profiles the subjectively construed abstract setting or, in other words, the immediate circumstances from which the event or proposition coded in the complement clause will be extracted. Because of their similarity in meaning and syntactic function, both pronouns deserve to be called impersonals.

The syntactic distribution of the two pronouns reflects the large amount of overlap in their semantic structure. The properties il and ça display in the copular complement construction have been shown to be inherited from the pronouns' other senses. Il was argued to be predominantly existential, while ça remained evaluative. However, these properties were also shown to adapt to the particular semantics of a given predicate to yield specific discourse functions. With the predicate est vrai 'is true', ça is mostly used to confirm the truth of a proposition already considered as a potential candidate for inclusion in reality, while il pertains to the logical organization of the different propositions which constitute a complex argument. Because of the highly idiosyncratic distribution of the pronouns with individual predicates, further generalizations require the in-depth examination of a large number of them.
Beyond their relevance to the analysis of il and ça in copular complements, the results obtained in this paper not only argue in favor of a broader account of impersonals than syntactic accounts generally advocate, but they also provide a way of constraining the kinds of constructions which receive the impersonal label. The remainder of this conclusion proposes a brief and preliminary overview of some ways in which the analysis presented for il and ça could be extended to different kinds of constructions.
According to the working hypothesis presented in the introduction, impersonal constructions are characterized by i) the defocusing or backgrounding of the agent of the profiled process, and ii) the high degree of generality this process must possess so that it is available to a generalized conceptualizer, namely anyone in a position to experience it. These two conditions can be met in a variety of ways. The analysis presented in this paper illustrated one of them. With respect to condition i), the il and ça constructions were shown to diverge from a prototypical transitive clause by the selection of alternative entities as the trajector in the profiled relation. Rather than the agent, il and ça were shown to respectively select the field and the abstract setting within which the process is located. With respect to condition ii), impersonal demonstratives were restricted to the copular complement construction because the complement process only possesses a high enough degree of generality in this particular context. Middle constructions constitute another potential example because they profile the patient as the trajector of the profiled relation, as illustrated in (67), and they can thus be analyzed in the same way.

(67) Héro, brusquement: Tu m'as compris! (Il serrait son verre dans sa main, le verre se casse. Ils regardent le verre tous deux dans la main de Héro qui dit doucement.) Excuse-moi, mon vieux. J'aime casser. (Anouilh, J. La répétition: ou, l'amour puni: 75)
'Hero, suddenly: You understood me! (He was holding his glass in his hand, the glass breaks. Both of them look at the glass in the hand of Hero, who says softly:) Excuse me, old man. I like to break (things).'
The construction in (67) satisfies the first condition because the predicate casser 'break' is generally transitive, but the presence of the middle marker se serves to render the predicate intransitive.17 Consequently, the agent is not expressed, and the patient is selected as the focal figure (the trajector) in the profiled relation. Obviously, however, it fails the second condition because the process it profiles is highly specific. It occurs only once, between clearly delineated and uniquely identifiable participants, and is not likely to be reproduced in a similar fashion for other conceptualizers to experience. Other instances of middle constructions exhibit a much higher degree of generality, as the examples in (68)–(70) illustrate:

(68) Jetable, certes, mais indémodable: la fameuse pointe Bic qui a fait la fortune du baron Bich, décédé lundi, se vend toujours à 15 millions d'exemplaires par jour dans le monde. (AFP)
17. There exists another intransitive construction without the middle marker se (la branche casse 'the branch breaks'). For an analysis of the distribution of these constructions, see Achard (2008).
'Disposable, sure, but always fashionable: The famous Bic ballpoint pen that made Baron Bich, who died on Monday, rich still sells to the tune of 15 million a day in the world.'

(69) Il note que le pain sans levain est cuit sur des plaques de tôle et ressemble à de la galette ou aux crêpes de carnaval, que le saucisson d'Arles se fait avec de la viande de mulet. (Durry, M-J. Gérard de Nerval et le mythe: 82)
'He notes that yeast-free bread is cooked on flat metal sheets and resembles biscuits or carnival pancakes, that the sausage from Arles is made with mule meat.'

(70) Scapin. – Oh monsieur, les coups de bâton ne se donnent pas à des gens comme lui et ce n'est pas un homme à être traité de la sorte. (Claudel, P. Le ravissement de Scapin: 1344)
'Scapin: – Oh sir, cane strokes are not given to people like him and he is not a man to be treated in this manner.'

The examples in (68)–(70) indicate a cline in the generality of the profiled process. The event profiled in (68) is unquestionably more general than the one profiled in (67) because it is iterative, and thus not confined to a single occurrence. It can be observed every day, and occurs between a very large number of distinct participants. Furthermore, the sales figures are potentially available to any conceptualizer, which makes the construction a valid candidate for inclusion in the impersonal category. However, the processes profiled in (69) and (70) may be argued to be even more general because these essentially present definitional features of their environment. In (69), the process represents the very characteristic which distinguishes the members of the category saucisson d'Arles 'sausage from Arles' from all other instances of the saucisson category. In other words, the examination of each instance of the category (each saucisson d'Arles) will unfailingly reveal the presence of mule meat.
This cannot be said for the process in (68), where the examination of each ballpoint pen will not reveal how many have been sold that day. In a way similar to (69), Scapin's warning in (70) concerns not only the person under discussion, but more generally any individual of similar social status. It is offered as a codified rule of social behavior which governs all potential interactions with a specific section of the population and can thus be viewed as definitional for that social group.

Should the construction illustrated in (68) still be included in the impersonal category even though others are more general? A conclusive answer is certainly premature given the preliminary nature of the investigation of the middle category, but it is worth noting that we have already been confronted with this type of choice when we decided to reserve the impersonal label for ça in the copular complement construction. That decision was made easier by the presence of a specific construction which codes the difference in generality
syntactically, but we had already noticed that the comfortable safety this context provides may be overstated. The truth of the matter is that the different candidates for impersonal status can be evaluated along a continuum of generality, and that any cut-off point may involve some degree of arbitrariness. This situation is not particularly problematic in the view defended here. There is not very much at stake in calling a construction impersonal since impersonals are not structurally distinct from other constructions. Nonetheless, a possible methodology for empirically defining impersonals will be proposed at the very end of the paper.

The selection of an alternative trajector doesn't constitute the only possible avenue to express the defocusing of the subject and the very general construal of the conceptualized scene. For instance, the indefinite pronoun construction in (71) meets the two conditions for inclusion in the impersonal category because i) the emphasis is placed on the process itself, and ii) any person traveling through Spain is bound to encounter the kinds of difficulties the author describes.

(71) Comme dans n'importe quel pays, il ne faut pas coucher au bord de la route mais à quelque distance, pour éviter la curiosité des passants et le bruit des voitures. On ne trouve pas toujours facilement, en Espagne, un endroit pour camper, à cause de l'absence de bois, et de la culture intensive. (T'Serstevens, A. L'itinéraire espagnol: 16)
'As in any country, you shouldn't sleep directly at the side of the road, but some distance away, to avoid curious passers-by and the noise from cars. One doesn't always easily find a camping place in Spain, because woods are scarce and the fields heavily farmed.'
However, indefinite pronoun (on 'we') constructions do not differ from prototypical transitive clauses in terms of alternative trajector selection, since the agent is, indeed, selected as the trajector of the profiled relation, but in terms of delimitation, i.e., "how the profiled instance projects to the world (or the relevant universe of discourse)" (Langacker 2009: 123). In other words, delimitation pertains to "how much of the world the instance subsumes (or delimits), so that by referring to it we are limiting attention to a facet of the world as opposed to all others" (Langacker 2009: 123, emphasis in the original). In a prototypical transitive clause, the nominals respectively selected as trajector and landmark, as well as the process profiled by the verb, all profile grounded instances of their respective specific types (Langacker 2008: Chs. 10, 11), and therefore they limit attention to these specific entities. Some personal pronouns have very strict delimitation. Je 'I', for example, is restricted to the speaking subject. Others have a minimum, but virtually no upper limit in the range of individuals they may subsume. This is the case for on 'we'. Minimally, it includes the speaker and another entity, as illustrated in
(72), but it may also include groups of virtually any size in addition to the speaker, as illustrated in (73):

(72) J'ai senti qu'avec cet homme-là, on allait s'entendre. Ça n'a pas traîné. Buzard et moi, on est tout de suite tombé copains. (Aymé, M. Clérambard: 131)
'I felt that with this man, we would get along well. It was quick. Buzard and me, we became buddies right away.'

(73) "Quand on a fait le film, les images des massacres de notre siècle nous sont revenues en mémoire, on ne les a pas refusées", a ajouté le réalisateur. (AFP)
'"When we shot the film, the images of the massacres of our century came back to our minds, we didn't refuse them," the director added.'

The example in (73) describes the choices made during the shooting of a movie. In addition to the speaker (the director), every member of the crew who had a voice in determining the esthetic choices the movie made is also part of the group on refers to. In other cases, the precise nature of the group on delimits is much more difficult to delineate because it is too large or too general for its members to be individuated, as illustrated in (74):

(74) "Quand on joue, il faut se concentrer et se relaxer, il faut oublier les problèmes techniques", a déclaré le cinéaste, qui tourne actuellement La jeune fille et la mort. (AFP)
'"When you play, you have to be concentrated and relaxed, you must forget the technical problems," the director, who is currently shooting The young lady and death, declared.'

In (74), the director's statement is not only applicable to his crew, but pertains to the ways in which actors should behave in general. On therefore refers to a group so large and general that it applies to anyone in a position to perform the process profiled in the verb. Because it potentially pertains to all actors, and thus to an unlimited number of individuals within this group, the pronoun takes on a generic value.
The only restriction on the potential participants is provided by the nature of the verbal process. The group members on delimits must act (be actors). Any member of that acting group is part of the pronoun's collective reference. Some processes are even more inclusive, as in (75), where the universality of hope and its maximally inclusive scope confer on the pronoun an almost proverbial value with the unlimited reference of humanity in general:

(75) Dans un entretien avec la presse jeudi après-midi, le Pr. Grimaud a rappelé que tout pronostic concernant son état est extrêmement réservé, dans les heures qui viennent. "On ne peut jamais dire qu'il n'y a plus
d'espoir, cependant il y a toujours des risques de complications secondaires à craindre", a-t-il indiqué. (AFP)
'In an interview with the press on Thursday afternoon, Pr. Grimaud reminded us that any prognosis concerning his condition in the coming hours is extremely reserved. "One can never say that there is no hope, but there are always risks of secondary complications," he indicated.'

Because of its limitless delimitation, on is a perfect candidate to describe humans' universal characteristics, as illustrated in (76), which describes a philosophical stance valid for everyone.
(76) Il faudrait oser; on n'ose point. Mais sait-on bien? La doctrine du libre jugement est profondément enterrée. (Alain. Propos sur des philosophes: 20)
'We should be daring, we don't dare. But do we really know? The free judgment doctrine is deeply buried.'
The impersonal status of some of the examples in (74)–(76) is intuitively clear. Despite its selection as trajector in the profiled relation, the agent in the indefinite pronoun construction is extremely diffuse, and its delimitation is only determined by the predicate, which sets the restrictions on the nature (and number) of the referred group. If the predicate itself expresses a process applicable to people in general, the indefinite on construction allows the speaker to express her linguistic conceptualization at a level of generality comparable to the previously described impersonals and middles. The same problem remains, however, which we have already encountered at several junctures of this paper, namely the difficulty of determining which predicates profile processes general enough to be called impersonal.

A possible way out of this difficulty comes from the interplay of construction-internal and cross-constructional criteria. First, as was illustrated for il and ça, and to a far lesser extent for the middles and on, each potential construction (as determined by the two conditions presented in this paper) should be investigated, so that the different levels of generalization can be brought to light. Secondly, each construction should be compared to the others, so that the appropriate generalizations can be pointed out. The important point is that the categorization of a construction as an impersonal should be based on both construction-specific and cross-constructional criteria. The successful candidates should not only be demonstrably different (form a syntactic or semantic natural class) within their own construction, but also across constructions. As a consequence of this dual requirement, a comprehensive analysis of the impersonal category can only be undertaken when each participating construction has been successfully mapped out.
In the final analysis, however, because impersonals are not structurally different from other related constructions, the precise outlines of the category are probably less important than the general processes by which the construal of a situation becomes maximally general and inclusive.

Received 14 August 2009
Revision received 22 February 2010

References
Achard, Michel. 1998. Representation of Cognitive Structures: Syntax and Semantics of French Sentential Complements. Berlin and New York: Mouton de Gruyter.
Achard, Michel. 2000. French ça and the dynamics of reference. LACUS Forum: 112.
Achard, Michel. 2002. The meaning and distribution of French mood inflections. In Frank Brisard (ed.), Grounding: The Epistemic Footing of Deixis and Reference, 197–249. Berlin: Mouton de Gruyter.
Achard, Michel. 2008. Verbes de rupture simples et réfléchis: Deux constructions intransitives. In J. Durand, B. Habert, and B. Laks (eds.), Congrès Mondial de Linguistique Française CMLF'08: 2377–2388.
Achard, Michel. 2009a. The distribution of French intransitive predicates. Linguistics 47.3: 513–558.
Achard, Michel. 2009b. Existence and evaluation: French il and ça impersonals. LACUS Forum 35: 111.
Bolinger, Dwight. 1973. Ambient it is meaningful too. Journal of Linguistics 9: 261–270.
Bolinger, Dwight. 1977. Meaning and Form. London and New York: Longman.
Brunot, Ferdinand. 1936. La Pensée et la Langue. Paris: Masson.
Cadiot, Pierre. 1988. De quoi ça parle? A propos de la référence de ça pronom-sujet. Le français Moderne 65: 174–92.
Carlier, Anne. 1996. Les Gosses ça se lève tôt le matin: L'interprétation générique du syntagme nominal disloqué au moyen de ce ou ça. Journal of French language studies 6: 133–62.
Chomsky, Noam. 1981. Lectures on Government and Binding. Dordrecht: Foris.
Fauconnier, Gilles. 1985. Mental Spaces: Aspects of Meaning Construction in Natural Language. Cambridge, Mass.: MIT Press, and London: Bradford.
Galichet, Georges. 1947. Essai de Grammaire Psychologique de la Langue Française. Paris: Presses Universitaires de France.
Gledhill, Christopher. 2003. Fundamentals of French Syntax. LINCOM EUROPA.
Grevisse, Maurice. 1986. Le bon usage (12th edition). Paris: Duculot.
Hériau, Michel. 1980. Le verbe impersonnel en français moderne. Lille: Atelier de reproductions de thèses, Université de Lille III.
Jones, Michael A. 1996. Foundations of French Syntax. Cambridge University Press.
Karttunnen, Laurie. 1971. Some observations on factivity. Papers in Linguistics 4: 55–70.
Kiparsky, Paul, and Carol Kiparsky. 1970. Fact. In Manfred Bierwisch and Karl E. Heidolph (eds.), Progress in Linguistics, 143–173. The Hague: Mouton.
Kirsner, Robert. 1979. The Problem of Presentative Sentences in Modern Dutch. North-Holland Linguistic Series 43. Amsterdam: North-Holland.
Lakoff, George. 1987. Women, Fire, and Dangerous Things: What Categories Reveal about the Mind. Chicago and London: University of Chicago Press.
Langacker, Ronald W. 1982. Space grammar, analyzability, and the English passive. Language 58: 22–80.
Langacker, Ronald W. 1985. Observations and speculations on subjectivity. In John Haiman (ed.), Iconicity in Syntax, 109–50. Amsterdam and Philadelphia: John Benjamins.
Rice University
Langacker, Ronald W. 1987. Foundations of Cognitive Grammar. Vol. 1: Theoretical prerequisites. Stanford: Stanford University Press.
Langacker, Ronald W. 1990. Subjectification. Cognitive Linguistics 1: 5–38.
Langacker, Ronald W. 1991. Foundations of Cognitive Grammar. Vol. 2: Descriptive application. Stanford: Stanford University Press.
Langacker, Ronald W. 2002. The control cycle: Why grammar is a matter of life and death. Proceedings of the Annual Meeting of the Japanese Cognitive Linguistics Association 2: 193–220.
Langacker, Ronald W. 2004. Aspect of the grammar of finite clauses. In Michel Achard and Suzanne Kemmer (eds.), Language, Culture, and Mind, 535–577. Stanford: CSLI Publications.
Langacker, Ronald W. 2006. Dimensions of defocusing. In Masayoshi Shibatani and Taro Kageyama (eds.), Voice and Grammatical Relations; in honor of Masayoshi Shibatani, 115–137. Amsterdam and Philadelphia: John Benjamins.
Langacker, Ronald W. 2008. Cognitive Grammar: A Basic Introduction. Oxford University Press.
Langacker, Ronald W. 2009. Investigations in Cognitive Grammar. Berlin and New York: Mouton de Gruyter.
Le Bidois, Georges, and Robert Le Bidois. 1938. Syntaxe du Français Moderne. New York: Strechert.
Moignet, Gérard. 1974. Etudes de Psychosystématique Française. Paris: Klincksieck.
Olsson, Hugo. 1986. La Concurrence entre il, ce et cela (ça) comme Sujet d'Expressions Impersonnelles en Français Contemporain. Stockholm: Almqvist and Wiksell.
Radford, Andrew. 2004. Minimalist Syntax: Exploring the Structure of English. Cambridge University Press.
Rowlett, Paul. 2007. The Syntax of French. Cambridge University Press.
Sansò, Andrea. 2005. Semantic maps in action, a discourse-based approach to passive and impersonal constructions. In Annalisa Baicchi, Cristiano Broccias, and Andrea Sansò (eds.), Modelling Thought and Constructing Meaning, 89–106. Milan: Angeli.
Schehaye, Albert. 1950. Essai sur la Structure Logique de la Langue. Paris: Champion.
Shibatani, Masayoshi. 1985. Passive and related constructions: A prototype analysis. Language 61: 821–848.
Siewierska, Anna. 2008. Ways of impersonalizing. Pronominal vs. verbal strategies. In Maria de los Angeles Gómez González, J. Lachlan MacKenzie, and Elsa M. González Álvarez (eds.), Current Trends in Contrastive Linguistics, Functional and Cognitive Perspectives, 3–26. Amsterdam and Philadelphia: John Benjamins.
Słoń, Anna. 2007. The impersonal impersonal construction in Polish. A Cognitive Grammar analysis. In Dagmar Divjak and Agata Kochańska (eds.), Cognitive Paths through the Slavic Domain, 257–287. Berlin: Mouton de Gruyter.
Smith, Michael. 1985. An analysis of German dummy subject constructions in Cognitive Grammar. In Scott DeLancey and Russell Tomlin (eds.), Proceedings of the First Annual Meeting of the Pacific Linguistics Conference, 412–425. Department of Linguistics: University of Oregon.
Smith, Michael. 2000. Cataphors, spaces, propositions: Cataphoric pronouns and their function. Proceedings from the Meeting of the Chicago Linguistic Society 36.1: 483–500.
Smith, Michael. 2006. The conceptual structure of German impersonal constructions. Journal of Germanic Linguistics 17.2: 79–138.
Wartburg, von Walter, and Paul Zumthor. 1958. Précis de Syntaxe du Français Contemporain. Bern: Francke.
Wilmet, Marc. 1997. Grammaire critique du français. Louvain-la-Neuve: Duculot.
Differences in continuity of force dynamics and emotional valence in sentences with causal and adversative connectives
YURENA MORERA, MANUEL DE VEGA and JUAN CAMACHO*
Abstract

This research analyses the semantics of Spanish sentences with causal or adversative connectives using force dynamics, emotional valence and subjectivity parameters. Participants were given stimulus sentences, each followed by a connective, and were asked to generate meaningful continuation sentences. For each stimulus sentence, four versions were offered, differing only in the connective (two causal and two adversative). Participants' responses were encoded using a set of variables related to force dynamics, emotional valence, subjectivity, complexity and continuity. A discriminant analysis of the data found two main statistical functions. The Continuity-Discontinuity function (polarity) discriminated between causal and adversative sentences: force dynamics and emotional valences tended to be continuous in sentences linked by a causal connective and discontinuous in sentences with an adversative connective. The Internal-External function, orthogonal to continuity, discriminated between specific connectives: sentences with porque (because) and aunque (although) usually expressed internal or volitional events, and sentences with puesto que (given that) and a pesar de que (in spite of) were associated with external or factual social events.

Keywords: force dynamics, connectives, continuity, embodiment, subjectivity, discriminant analysis.
* Address for correspondence: Yurena Morera, Facultad de Psicología, Campus de Guajara, Universidad de La Laguna, 38205 La Laguna, Spain. E-mail: ymorera@ull.es. Acknowledgements: We would like to thank Vicente Moreno and Mabel Urrutia for their help in the inter-judge reliability analysis. We are also grateful to Lisa Collins and several anonymous reviewers for their useful comments on this article. This research was supported by the Spanish Ministry of Education and Science (Ministerio de Educación y Ciencia) through grant SEJ2004-02360 to the second author. Cognitive Linguistics 21–3 (2010), 501–536 DOI 10.1515/COGL.2010.017 0936–5907/10/0021–0501 © Walter de Gruyter
1. Introduction

Connectives are important guidelines for the construction of coherence during discourse comprehension. A primary function of connectives and other cohesion markers is to indicate to readers or listeners what kind of relation between adjacent clauses or sentences should be established to accomplish discourse coherence (Givón, 1995; Caron, 1997, among many others). For instance, in the sentence "The snowman melted because the sun was shining", the causal conjunction because instructs the reader to infer a causal relation between the event expressed by the first clause (the snowman melted) and the event in the second clause (the sun was shining).

The cognitive relevance of connectives has been demonstrated in a number of studies showing their role in the construction of discourse meaning at different levels. Compared with unconnected sentences, sentences with connectives are read faster (Cozijn, 2000), better recalled (Caron et al., 1988) and better understood (Haberlandt, 1982; Millis and Just, 1994), and they improve the overall understanding of expository texts (Degand et al., 1999). Studies using word recognition tasks have shown that connectives regulate inter-clausal activation (de Vega, 2005; Millis and Just, 1994), which suggests that they regulate the reader's attentional focus (Givón, 1992). However, the nature of the events depicted in the sentences determines how useful connectives are in facilitating comprehension (Mouchon et al., 1995). Thus, when the described events follow a continuous causal sequence, connectives may not be necessary to integrate the information ("John's older brother hit him / His body was covered with bruises").
By contrast, when the depicted events are only moderately related ("John went to play near the neighbour's house / His body was covered with bruises") or they entail a disruption in the causal order ("Ronny had little time to arrange a surprise for his girlfriend's visit / He bought her a beautiful bouquet of flowers"), connectives are needed to guide inter-clause integration (Murray, 1997).

As markers of inter-clausal relationships, connectives must be studied in the context of sentences and discourse. Although connectives signal a relation (e.g., causal) between clauses, they do not work as independent semantic pieces, but rather must be combined with other lexical elements, especially verbs, in the clauses they connect. The need for a semantic matching between the connective and the connected sentences can easily be observed in experiments in which an appropriate connective is replaced by an inappropriate one in a sentence context (Murray, 1997; de Vega, 2005). For instance, in a study by de Vega (2005), participants were given sentences involving verbs with an adversative semantic bias, such as "The pedestrian wanted to jump over a puddle in the street [but / because] he fell down in the water". When the connective because was inserted between the sentences, the reading time of the second sentence was longer and comprehension was worse than when the appropriate connective but was inserted between the sentences.

Semantic matching between connectives and their sentence context not only concerns the general categories of connectives (e.g., adversative vs. causal), but also involves fine-grained distinctions among connectives within the same category (e.g., causal). Most languages have several connectives within a given category and, given the economy of language, these connectives will presumably differ in their semantic profile to give speakers the opportunity to make specific semantic distinctions. For instance, de Vega et al. (2007) compared sentences with two connectives used in some cases to express simultaneity. Specifically, they explored the connectives while and when in Spanish (mientras / cuando) and German sentences (während / als) and found parallel differences in both languages. Thus, while involves a specific temporal metric of events: the event of longer duration has to be in the adverbial clause (while a > b). By contrast, when does not have this temporal restriction (when a < = > b). Consequently, while-sentences have a figure-ground structure (Talmy, 2001) and are strongly asymmetric, whereas when-sentences have neither property. For instance, the reversed version of "while I was writing the letter I heard a shot" is extremely odd ("while I was hearing a shot I wrote a letter"), whereas substituting when for while makes the original and the reversed sentences equally appropriate. As we will see in a further section, there are also causality connectives that involve different semantic constraints in sentences. For instance, the choice of one of the Dutch causality connectives (dus / daardoor / daarom) depends on the degree of the speakers' involvement in the causal events that they want to express (e.g., Pander Maat and Degand, 2001).
The research presented here explores the semantic fine-tuning between connectives and their sentence contexts. It analyzes the semantic profiles of sentences differing in the category of the connective used (causal vs. adversative), and also explores more fine-grained differences among sentences involving connectives of the same category. The next sections proceed as follows. First, the most widely accepted classifications of connectives are briefly reviewed. Second, the choice of the Spanish causal (porque and puesto que) and adversative connectives (aunque and a pesar de que) for the current study is justified. Third, several semantic factors that can contribute to the specificity of the target connectives are considered. Specifically, we consider the force dynamic framework as an analytical tool for connective relations, and we also review the notion of subjective involvement as a factor that may distinguish among specific connectives. Finally, we suggest the possible role of emotional valence in causal and adversative sentences.
1.1. Classification of connectives

There are several taxonomies of connectives which prove useful when organizing the field of study (e.g., Louwerse, 2001; Halliday and Hasan, 1976; Sanders et al., 1992, 1993). Louwerse (2001), for example, proposes a parametric analysis of connectives, according to which the combination of type, polarity and direction allows one to classify most of the connectives in English and other languages. First, type contains three categories of connectives differing in their semantic complexity: causal, temporal and additive. Causal connectives (e.g., because, although) involve time and causality relations, temporal connectives (e.g., before, while) involve only time, and additive connectives (e.g., and, moreover) involve neither time nor causal relations. Second, the polarity of a connective is considered positive when the relation established in the first situation continues in the conjoined situation (e.g., because, after); otherwise, the polarity is negative (e.g., although, until). Finally, direction involves the order of the events in the sentences (forward, backward, bidirectional). Forward connectives keep the iconic sequence of events in the world (e.g., "John was bored, so he went to watch a movie"); backward connectives change the order of the related events (e.g., "John went to watch a movie, because he was bored"); bidirectional connectives express symmetric relations that keep the meaning whatever the order of the events in the sentence (e.g., "John went to watch a movie and I went to visit my cousin").

Sanders et al. (1992, 1993) consider the source of coherence (semantic, pragmatic) as a fourth dimension used to classify coherence relations expressed by connective sentences. Semantic sentences describe relations between events in the world (e.g., "There had been an avalanche at Rogers Pass. As a result, the road was blocked"), and pragmatic sentences describe relations between speech-acts ("There is a good movie on. Did you already have plans for tonight?") or epistemic relations (Pander Maat and Sanders, 2001).

Beyond the above formal analysis, how much evidence exists as to the actual role of these categories in cognition? Sanders et al. (1992, 1993) designed a series of classification experiments to test their taxonomy of coherence relations. They asked discourse analysts and undergraduate students to label pairs of sentences according to their four dimensions or primitives; participants agreed most on the polarity and type of relation dimensions, less on the source of coherence dimension, and no agreement was found for order. Type of relation has also been confirmed as a relevant cognitive dimension by several studies employing on-line methodology. In general, the experiments have shown that causal relations are processed faster than temporal relations, which in turn are processed faster than additive relations (Louwerse, 2001; Caron et al., 1988; Sanders and Noordman, 2000). Concerning polarity, positive relations are generally processed faster than negative relations (Louwerse, 2001; Murray, 1997; Townsend, 1983). For instance, Murray (1997) explored the role of connectives differing in polarity, or continuity in his terminology. He ran an experiment using an on-line reading paradigm, in which participants were given sentence pairs that conveyed additive, causal, or adversative relations. In some cases, the appropriate connective was replaced by a connective inconsistent with the relation conveyed by the sentences. He confirmed that adversative connectives were associated with the greatest degree of facilitation when used in the appropriate context, and with the greatest disruption when used in the inappropriate context. This result is consistent with the continuity principle that governs discourse relations, as we will explain later.

Researchers have paid less attention to direction or order, and empirical results are controversial. For instance, de Vega (2005), in an on-line study with causal and adversative sentences in Spanish, manipulated the direction of the coherence relation by using forward and backward connectives (Experiment 3). Based on previous studies, his hypothesis was that readers would process sentences with forward connectives faster, because they would know in advance what type of inference (e.g., causal) they would have to make to integrate the connected segments. However, results did not show any significant difference: readers read sentences with forward connectives as fast as sentences with backward connectives (de Vega, 2005). By contrast, Noordman (2001) found that people are faster at processing causally related sentences that follow the basic order (cause-consequence) than sentences that follow the non-basic order (consequence-cause).

1.2. Causal and adversative connectives

In this study, two categories of backward connectives were chosen: causal and adversative. The choice was based on several factors.
First, causal and adversative connectives are semantically rich. Taxonomic analyses, as well as behavioural and developmental data, suggest that causal and adversative relations are more complex than the additive and temporal relations denoted by other connectives. Thus, according to Louwerse (2001), the semantic parameters underlying additive relations are the most basic; temporal relations involve both additive and temporal parameters, and causal (and adversative) relations involve additive, temporal and causal parameters together. Furthermore, the sequence of acquisition of connectives in children generally follows a pattern of growing complexity: additive, then temporal, then causal, then adversative (Bloom et al., 1980; Cain et al., 2005; Spooren and Sanders, 2008). A second reason to choose causal and adversative connectives is that they convey related but opposite meanings. In fact, they can be considered as belonging to the same general category of causal relations, differing only in their polarity or continuity, as explained above (Louwerse, 2001; Murray, 1997; Sanders et al., 1992, 1993). Continuity is a strong psychological principle in discourse comprehension, as readers assume by default that consecutive sentences refer to consecutively related events (Murray, 1997; Segal et al., 1991). Cognitive theories of discourse comprehension often emphasize that readers of narratives assume that sentences describe events that maintain causal, motivational, temporal, spatial, or co-referential continuity, unless a discontinuity marker is explicit in the texts (Gernsbacher, 1990; Zwaan et al., 1995). Finally, for any given language, there are several causal and adversative connectives, a fact which allows us to study fine-grained semantic distinctions within each category. Several studies have shown that individual causal and adversative connectives differ from one another in the degree of subjective involvement of the expressed relation. Most of these studies focus on sentences with causal connectives (Bestgen et al., 2006; Pander Maat and Degand, 2001; Pander Maat and Sanders, 2001; Pit, 2003, 2006; Spooren et al., in press), although a few studies have also examined the role of subjectivity in sentences with adversative connectives (Oversteegen, 1997; Pander Maat, 1998; Verhagen, 2005). In this study, we will explore subjective involvement in both causal and adversative conjunctions.

1.3. Connectives and force dynamics

Causality is a fundamental concept in human cognition. Many theories of causal representation have been developed in order to account for how we learn and induce causal relations and for how we use language to express causality. One of these theories is the so-called force dynamics approach, which conceptualizes causation in terms of patterns of forces (Talmy, 1988; Verhagen, 2002; Wolff, 2007). According to Talmy, force dynamics is a basic schema referring to the implicit forces operating among the events in a scene, which plays a semantic role in certain grammatical structures (Talmy, 1987, 1988).
The simplest case of force dynamics corresponds to the force relation that occurs between two interacting entities: the agonist and the antagonist. Both have an intrinsic force, which is a tendency toward either rest or motion. When the agonist and the antagonist interact with opposite forces, the force relation between the two entities is called resistance, whereas if the agonist's and antagonist's forces apply in the same direction, the force relation is called increment. The result of the force interaction depends on the balance of strengths. A resistance relation will give rise to one of two possible results: a) if the agonist maintains its original tendency in spite of the antagonist's force, the outcome of the force dynamic relation is called an overcome result; b) if the agonist's tendency changes because of the effect of the antagonist's force, the outcome is called a non-overcome result. In an increment relation, in which the agonist's tendency increases because of the addition of the antagonist's force, the outcome of the force dynamics is called reinforcement.

Within this theoretical framework, Talmy proposes that causal and adversative connectives are grammatical markers of force dynamic relations, and that readers should infer these patterns of force between consecutive sentences following the connective guidelines. Specifically, in a causal or adversative sentence, the main clause expresses the agonist's event and the outcome of the agonist-antagonist force dynamics ("The boxer went down"), and the subordinate clause (the one with the connective) describes the antagonist's event ("because of the punch"). According to Talmy, causal connectives express the role of a stronger antagonist, while adversative connectives announce the role of a weaker antagonist. If this holds true, then what is the difference between causal and adversative sentences? It depends on the force dynamics outcome. Causal sentences which express a resistance relation will be characterized by a non-overcome result ("The boxer went down because of the punch"), while adversative sentences which depict a resistance relation will be characterized by an overcome result ("The boxer stood up in spite of the punch"). As we will see later, in Talmy's theory, the antagonist is also the stronger element in reinforcement interactions, in which case these interactions should be expressed with causal connectives ("The boxer won because his experience helped him").

Force dynamics information is conveyed not only by causal or adversative connectives, but also by other lexical elements in sentences, especially verbs (Pinker, 1989; Talmy, 1988; Wolff and Song, 2003). According to Wolff (2003), languages provide two basic ways to express causal relations: through lexical causatives and periphrastic causatives.
The former are verbs that encode the notions of cause and result (such as kill, break, open, etc.), allowing speakers to describe a causal relation in a single-clause construction (e.g., "Sara opened the door"). Periphrastic causatives, on the other hand, entail the expression of causality in two-predicate constructions, corresponding to the cause and the result, respectively (e.g., "Sara caused the door to open"). Sentences with periphrastic causal verbs are semantically similar to sentences with causal/adversative connectives, because in both cases the force dynamic relations involve two clauses and express a force dynamic interaction. In fact, sentences with periphrastic causal verbs can be easily paraphrased as connective sentences (Stukker et al., 2008). For instance, the sentence "A few drops of rain made the festival organizers fear for the worse" can be rewritten as "Some drops of rain fell. Because of that the festival organizers feared for the worse". Several studies have reported empirical evidence that periphrastic causal verbs can be categorized in many languages in terms of coercion, permission, prevention, and other categories that fit well with the force dynamics framework (Shibatani, 1976; Talmy, 1988; Wolff, 2003; Wolff and Song, 2003). In particular, Wolff (2003) proposed that periphrastic causal verbs can be analyzed within the force dynamic approach, by extending the notion of cause to the complementary concepts of prevent and enable. In a series of experiments, he showed that the force dynamic model is the best predictor of people's conceptualization of direct and indirect causation when judging linguistic descriptions of events (Wolff, 2003). Furthermore, Wolff and Song (2003) conducted an extensive corpus search and constructed a list of 49 periphrastic causal verbs. They then asked a group of students to sort the causal verbs into groups according to their meanings. A multidimensional scaling (MDS) analysis applied to the data demonstrated that Talmy's distinction of causal, enabling and preventing types of causation provides a better account of the meaning of causal verbs than other models of causation.

As Stukker et al. (2008) claim, causal connectives and causal verbs involve a parallel categorization of causal relations, albeit with different linguistic scope: verbs mark within-clause relations and connectives signal discourse-level relations. As far as we know, there are no empirical studies examining the force dynamic relations conveyed by causal and adversative connectives. Some empirical results on connectives, however, show that there is a fine-grained semantic matching between connectives and their sentence context (e.g., Murray, 1997; de Vega et al., 2007). In the same vein, we expect a consistent pattern between the force dynamics guided by the connectives and the force dynamics encoded in the connected sentences (verbs and other lexical elements). So, according to Talmy, sentences with causal connectives are characterized by a dominant antagonist.
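As a reading aid, the force dynamic distinctions reviewed in this section (resistance vs. increment, the three possible outcomes, and the connective category expected for each) can be written down as a small decision rule. The Python sketch below is only an illustration under stated assumptions: the names Relation and classify and the flag agonist_keeps_tendency are hypothetical labels introduced here, and are not part of Talmy's framework or of the coding scheme used in this study.

```python
from enum import Enum


class Relation(Enum):
    """Hypothetical labels for the two force relations described in the text."""
    RESISTANCE = "resistance"  # agonist and antagonist forces oppose each other
    INCREMENT = "increment"    # both forces apply in the same direction


def classify(relation, agonist_keeps_tendency=False):
    """Return (outcome, expected connective category) for one pattern.

    agonist_keeps_tendency is an assumed flag; it is consulted only for
    resistance relations, where it records whether the agonist maintains
    its original tendency despite the antagonist's force.
    """
    if relation is Relation.INCREMENT:
        # The antagonist adds to the agonist's tendency:
        # "The boxer won because his experience helped him".
        return ("reinforcement", "causal")
    if agonist_keeps_tendency:
        # Weaker antagonist: "The boxer stood up in spite of the punch".
        return ("overcome", "adversative")
    # Stronger antagonist: "The boxer went down because of the punch".
    return ("non-overcome", "causal")


if __name__ == "__main__":
    for rel, keeps in [(Relation.RESISTANCE, False),
                       (Relation.RESISTANCE, True),
                       (Relation.INCREMENT, False)]:
        print(rel.value, keeps, classify(rel, keeps))
```

The rule returns plain strings rather than a richer structure because the text distinguishes only three outcomes and two connective categories; verb types in the second clause are deliberately left out of the sketch.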
Given the close relation between connectives and verbs in the clauses, we can expect two possibilities in sentences marked by backward causal connectives. When the agonist's final state is toward movement, as in the current study, the antagonist's force could be expressed either by a causal or by an enabling type verb in the second clause, as the following examples show:

(1) I went to the concert because my parents convinced me (causal).
(2) I went to the concert, because my parents allowed me (enabling).

Conversely, sentences with adversative connectives are characterized by a dominant agonist; that is, the agonist maintains its intrinsic tendency despite the antagonist's action. Therefore, if the agonist's final state is toward movement, as in the sentences of this study, the antagonist's force will be expressed by a preventing type verb in the second clause. For instance:

(3) I went to the concert, in spite of the fact that my parents had forbidden me to go (preventing).

As mentioned above, this study is limited to cases in which the agonist's final tendency is toward movement, expressed with causal verbs in the first clause provided as the stimulus sentence. The critical manipulation was the causal or adversative connective that prompted participants to write a continuation sentence. The participants' responses were analyzed paying special attention to the force dynamic features of verbs and other lexical elements like adjectives or nouns (e.g., in the case of copulative verbs: "I went to the concert in spite of not having any tickets"). In our opinion, this method may contribute to an integrated view of causal and adversative relations that presumably are encoded in both verbs and connectives (Talmy, 2001; Stukker et al., 2008). Table 1 shows a summary of the hypothetical force dynamics in sentences with backward causal and adversative connectives that will be analyzed in this paper.

Table 1. Force dynamics framework for the description of sentences with causal connectives (in which the antagonist is the dominant force) and adversative connectives (in which the agonist is the dominant force) in the current experiment.

Type of relation                     Resistance        Increment         Resistance
Agonist's initial state              Toward movement   Toward movement   Toward movement
Verb in the 1st clause (stimulus)    Causal verbs      Causal verbs      Causal verbs
Connective type                      Causal            Causal            Adversative
Verb in the 2nd clause (response)    Causal verbs      Enabling verbs    Preventing verbs
Outcome of the force dynamics        Non-overcome      Reinforcement     Overcome

1.4. Connectives and subjectivity

A relevant feature to distinguish among causal connectives is the notion of subjectivity, namely the protagonist's point of view denoted by the connective (e.g., Langacker, 1990; Verhagen, 2007). The subjectivity of connectives can be intuitively appreciated when an objective connective is inserted in a sentence describing a subjective relation. For instance, "I went home given that it was six o'clock" sounds odd. Beyond these intuitions, a number of studies support the view that subjectivity accounts for the semantic differences among connectives in several languages (Bestgen et al., 2006; Degand and Pander Maat, 2003; Pander Maat and Degand, 2001; Pander Maat and Sanders, 2001; Pit, 2003). For instance, Degand and Pander Maat (2003) analyzed the semantic context of the Dutch backward causal connectives doordat (because of the fact that), want (because, for), omdat (because), and aangezien (since) taken from a corpus of written language. Their results clearly supported the claim that Dutch backward causal connectives scale on a continuum of subjectivity.
Doordat is often used to express objective, factual information; want is preferably used in very subjective contexts, expressing epistemic and speech-act relations; and omdat
and aangezien would take a position in between, expressing both types of causal relations. Four types of causal relations differing in degree of subjectivity, sometimes marked with specific connectives, have been reported in the literature: (1) Nonvolitional (content or semantic) causal relations, presenting causality as an objective state of affairs in the world, i.e., either physical or social events (e.g., There had been an avalanche at Rogers pass. As a result, the road was blocked; Since John wasn't here, we decided to leave a note for him); (2) Volitional causal relations, presenting causality as an intentional act; that is, the protagonist's beliefs, intentions, evaluations, etc. are explicitly involved in the construal of the causal relation (e.g., It was six o'clock. So I went home); (3) Epistemic relations, which involve a greater degree of subjectivity, because they do not express the cause of the fact depicted in the first clause, but rather the protagonist's mental state or belief (e.g., They were large, grey birds that made a lot of noise. So they must have been cranes); and finally, (4) Speech-act relations, which are the most subjective causal relations, because coherence requires an inference based on illocutionary meaning (e.g., There is a good movie on. Did you already have plans for tonight?) (Caron, 1997; Pander Maat and Degand, 2001; Pander Maat and Sanders, 2001; Sanders and Spooren, 2007; Spooren et al., in press). Researchers have used a set of variables to measure the degree of subjectivity of sentences linked by causal connectives. First, in subjective sentences, causal connectives tend to co-occur with opinion words and personal pronouns, whereas in objective sentences, causal connectives co-occur significantly more often with factual and action words (Bestgen et al., 2006).
Second, the subjectivity of causal sentences has been associated with several categories of propositional attitudes ranked from less to more subjective: facts, general knowledge, intentional acts, individual knowledge, experiences, perceptions, or judgements (Spooren et al., in press). These authors reported that subjective connectives co-occur more frequently with judgements, and objective connectives are more likely to co-occur with the other, more objective propositional attitudes. Finally, the degree of subjectivity of sentences with causal connectives has been associated with the polarity of the utterance (affirmation vs. negation) and the tense of the predicate. Thus, Pit (2006) claims that the presence of a negation enhances the subjectivity of an utterance (e.g., I do not believe she is pregnant, because she drank a glass of wine vs. I believe she is pregnant, because she only drank orange juice). Present tense also enhances the subjectivity, whereas past tense, especially the pluperfect, denotes a higher objectivity (Pit, 2006). It is remarkable that the notion of subjectivity can easily be incorporated into the force dynamics framework. In fact, Talmy (1988) claimed that force dynamics notions can be extended from the physical domain to the social domain (e.g., She urged him to leave) and the intra-psychic domain (e.g., He
refrained from closing the door). A recent study by Wolff (2007) supports this hypothesis: he reported that subjects judged animations of realistic causal events based on the force relationships instantiated in each particular animation. This was supported in a series of experiments in which he presented animations depicting not only physical causation (Experiments 1–4), but also intentional causation (Experiment 5) and social causation (Experiment 6). This study investigates the possible differences in subjectivity between two backward causal connectives (porque; puesto que) and two backward adversative connectives (aunque; a pesar de que). We propose that the most frequently used Spanish causal connective, porque (3019 per million words)¹, could express both objective and subjective causal relations. By contrast, the less frequently used causal connective puesto que (450 per million words) is more restrictive and is preferably used to express objective or factual causal relations (Flamenco, 1999). Likewise, the most frequently used adversative connective, aunque (1687 per million words), expresses both objective and subjective causal relations, whereas we suspect that a pesar de que (404 per million words) is more restrictive, being associated with objective or factual causal relations (Galán, 1999). These differences in frequency of use and restrictiveness of meaning of the Spanish connectives may be related to their different degrees of grammatization (Heine and Kuteva, 2007; Hopper and Traugott, 1993). Thus, porque and aunque are phonetically eroded forms, and highly polysemic or desemanticized, as corresponds to a relatively later stage of grammatization. By contrast, a pesar de que and puesto que are more transitional forms that still preserve the original content words (pesar, puesto) and are semantically more restrictive.

1.5. Connectives and emotional valence

Cognitive psychologists have demonstrated that emotion is an important functional dimension of words, sentences and discourse. For instance, words with emotional valence are recalled better and recognized faster than neutral words. In addition, a negativity bias has been reported in the literature: unpleasant words tend to be recalled better than pleasant and neutral words (Ohira et al., 1998; Ortony et al., 1983), although they also produce more false alarms in a recognition task (Dewhurst and Parry, 2000). One possible explanation for these memory effects is that negative stimuli enhance attention due to their biological significance (Cacioppo and Garner, 1999). Beyond the influence of words' emotional valence on memory, psycholinguists have paid attention to the emotional features of sentences and texts.
1. These data were taken from Alameda and Cuetos's (1995) Spanish dictionary of written word frequency.
Several studies have shown that readers are sensitive to the emotional tone of the events in a story, deriving inferences about the protagonist's emotions (de Vega et al., 1996; 1997; Gernsbacher et al., 1992; León et al., submitted; Gygax et al., 2004). For instance, Gernsbacher et al. (1992) asked participants to read a story in which the main character stole money from a store where his best friend worked, and later learned that his friend had been fired. At the end of the story, participants read a critical sentence that described the protagonist as feeling either guilt (matching the implicit emotion) or pride (mismatching the implicit emotion). Participants read sentences containing the emotion word that matched the story faster than sentences containing the mismatching emotion word. In the same vein, readers are able to update the protagonist's emotions as new events are described in the story (de Vega et al., 1996). For instance, an initial paragraph could bias the protagonist's emotion towards envy of a secondary character's success, whereas a new paragraph describing disgraceful events subsequently suffered by that character makes the reader update the protagonist's emotion to pity. Readers seem to perform emotional inferences mandatorily and automatically as part of ordinary narrative comprehension (Graesser et al., 1994). In spite of the prominence of emotional valence in linguistic meaning, this dimension of human experience has been relatively neglected in the semantic analysis of language. An inspection of representative linguistics books shows that most have no entry for emotion in the subject index, and the few that do mention emotion deal with the subject in just a few lines. One possible reason for this is that most languages in the world do not grammatize emotions (Bickerton, 1981), but rather tend to express emotions lexically, metaphorically, by means of prosodic and other paralinguistic cues, or leave them implicit.
As far as we know, there has been no systematic analysis of the emotional dimension in the context of connective sentences. Here we propose a specific hypothesis: the polarity that distinguishes causal and adversative sentences could be associated with differences in valence continuity between the connected sentences. In other words, the general trend would be to keep the same emotional valence in sentences with causal connectives (e.g., Her business succeeded because of her talent), and to shift the emotional valence between sentences with adversative connectives (e.g., Her business failed in spite of her talent).

2. Hypotheses

In sum, the specific hypotheses of this study are:

Hypothesis 1. Force dynamics and emotional valence parameters play a role in establishing the continuity or polarity of sentences with causal and adversative connectives. Our specific predictions are:
a. Sentences with backward causal connectives are more likely to include force dynamic continuity in both clauses. Namely, the second clause (antagonist) keeps the same force dynamic parameters in the verb or other lexical units as the first clause (agonist).
b. Sentences with backward adversative connectives are more likely to include force dynamic discontinuity between both clauses. Namely, the second clause (antagonist) shifts the force dynamic parameters in the verb or other lexical units with respect to the first clause (agonist).
c. Sentences with backward causal connectives are more likely to keep the same emotional valence in both clauses.
d. Sentences with backward adversative connectives are more likely to involve a shift of emotional valence between both clauses.

Hypothesis 2. Sentences with causal and adversative connectives may differ in subjectivity depending on the particular connective employed. Specifically:

e. The causal porque and the adversative aunque might be followed by clauses expressing some degree of subjective involvement.
f. The causal puesto que and the adversative a pesar de que might be followed by more objective or descriptive clauses.

Hypothesis 3. Sentences with causal and adversative connectives may differ in formal complexity. Specifically, sentences with adversative connectives produce more elaborate responses, because they violate the default causal expectations:

g. Sentences with adversative connectives are more likely to involve larger numbers of words, including verbs, and the presence of other connectives.
h. Participants' responses are more variable in sentences with adversative connectives than in sentences with causal connectives. The reason for this is that, given an agonist and an adversative connective, the violation of causal expectations admits many explanations. By contrast, given an agonist and a causal connective, there is a reduced number of causal explanations.

3. Experiment

To test these hypotheses, participants performed a cloze task in Spanish. They were asked to complete a set of stimulus sentences, each ending with either a causal connective (porque [because] or puesto que [given that]) or an adversative connective (aunque [although] or a pesar de que [in spite of]). Participants had to write meaningful continuation sentences that were subsequently encoded according to a semantic protocol, which included force dynamic, emotional and subjectivity categories, among others. The stimulus sentences
were also encoded using these same categories to check whether the same values were kept (continuity) or shifted (discontinuity) between the first and the second clause. We predicted that participants would adjust their responses to the force dynamic relation marked by the connective. In other words, they would mention a stronger antagonist following a causal connective and a weaker antagonist after an adversative connective. Specifically, causal connectives would be associated with antagonists with causal or enabling forces, which could be expressed by causal or enabling verbs (e.g., cause, make, permit, allow, let) or other lexical elements (e.g., to give permission, to grant authorization, etc.). Adversative sentences, on the other hand, would be associated with antagonists with preventing forces, which could be expressed by preventing verbs (e.g., prevent, forbid, prohibit), negated enabling verbs (e.g., to not permit, to not allow, etc.) or other lexical elements with preventing meaning (e.g., to not give permission, to not grant authorization, etc.).

3.1. Method
3.1.1. Participants. One hundred and sixty students of Psychology from the University of La Laguna participated in the study. All participants were native speakers of Spanish, and they received extra credit in an introductory course for their participation.

3.1.2. Design. Four versions of 40 stimulus sentences were constructed, differing only in the backward connective used: porque, aunque, puesto que, a pesar de que. The four Spanish versions of a stimulus are shown below with their English translation:

Aurora empezó a estudiar en la biblioteca porque . . . [Aurora began to study in the library because . . .]
Aurora empezó a estudiar en la biblioteca puesto que . . . [Aurora began to study in the library given that . . .]
Aurora empezó a estudiar en la biblioteca aunque . . . [Aurora began to study in the library although . . .]
Aurora empezó a estudiar en la biblioteca a pesar de que . . . [Aurora began to study in the library in spite of . . .]

3.1.3. Materials and Procedure. Four booklets with 40 sentences each were put together. Each booklet contained 10 stimulus sentences for each connective. The assignment of the connectives to particular sentences was counterbalanced across the booklets in such a way that participants received each sentence only once with just one connective. Stimulus sentences were randomly ordered within each booklet. The experiment was conducted in a
classroom session, and 40 participants were randomly assigned to each of the four different booklets. Their task was to write a meaningful continuation for each stimulus sentence. An unlimited amount of time was allowed for this task, but most participants completed the booklet in about 35 minutes.

3.1.4. Analysis protocol. Participants' responses were transcribed directly from the questionnaire booklets. Each response was then classified according to the 28 variables listed in Table 2, which are organized into five conceptual clusters. To simplify the encoding process and facilitate the statistical analyses, most of these variables were dichotomic. The only exceptions were the complexity variables, which were interval-level variables (number of words, number of verbs, and number of different responses). An example of encoded responses is shown in the Appendix. Participants' responses were encoded using subjectivity variables related to the protagonist's involvement in the situation. Following the literature, we considered that continuation sentences with a negative particle, without new characters, with imperfective tenses and with intentional or mentalist verbs involved an internal locus or a high level of subjectivity, whereas affirmative sentences, the introduction of a new character, the use of perfective tenses and the use of temporal, deontic or illocutive verbs involved an external locus related to objective or factual situations. We expected that stimulus sentences marked with subjective connectives (porque, aunque) would yield sentences with more subjective parameters than stimulus sentences marked with objective connectives (puesto que, a pesar de que). The type of force dynamics variables encoded participants' responses as having causal, enabling, or preventing forces (Wolff and Song, 2003).
The domain of force dynamics was categorized as physical if the antagonist was a physical or external force; it was categorized as intra-psychic if the antagonist was an internal force of the protagonist; and it was considered interpersonal if the antagonist force came from another character. Finally, the linguistic locus of force dynamics encoded the antagonist force as being marked by a verb or by other grammatical elements, such as the direct object (e.g., there was a long queue). The emotional valence of responses was encoded as negative, neutral or positive. We did not have any specific hypothesis about valence per se. However, it was encoded to obtain the valence continuity, described below, which is related to certain theoretical predictions. The continuity variables are second-order variables, resulting from contrasting the force dynamic and emotional valence values in the stimulus sentence and the participants' responses. The continuity of emotional valence was thus encoded as 0 if the valence of the response was the same as the valence of the stimulus sentence, and as 1 if there was a valence shift. Likewise,
Table 2. Predictor variables included in the analysis protocol. Examples of verbal categories as well as the encoding values in dichotomic and interval variables are provided in parentheses.

1. Subjectivity variables
   Presence of the negative particle not (0, 1)
   Presence of a new character or agent (0, 1)
   Verb tense: Imperfective (0), Perfective (1)
   Subjectivity/objectivity of the verb:
      Temporal (start, begin) (0, 1)
      Intentional (want, wish, desire) (0, 1)
      Mentalist (think, believe, know, realize) (0, 1)
      Deontic (should, must, have to) (0, 1)
      Illocutive (talk, phone, argue, call for) (0, 1)
      Others (0, 1)

2. Force dynamics variables
   Type of force dynamics:
      Causal (cause, make, get) (0, 1)
      Enabling (allow, permit, facilitate) (0, 1)
      Preventing (hinder, block, prohibit) (0, 1)
   Domain of force dynamics:
      Physical (0, 1)
      Intra-psychic (0, 1)
      Interpersonal (0, 1)
   Linguistic locus of force dynamics:
      Verb (0, 1)
      Other lexical elements (0, 1)

3. Emotional valence variables
   Emotional valence:
      Neutral (0, 1)
      Positive (0, 1)
      Negative (0, 1)

4. Continuity variables
   Force dynamics shift (0, 1)
   Domain of the force dynamics shift (0, 1)
   Linguistic locus of the force dynamics shift (0, 1)
   Emotional valence shift (0, 1)

5. Formal complexity variables
   Number of words (1 or more)
   Number of verbs (1 or more)
   Presence of other connectives (0, 1)
   Number of different responses (1–40)
the continuity of the type of force dynamic variables and of the force dynamic domain was encoded as 0 when the response kept the same values as the stimulus sentence and as 1 when there was a shift in the corresponding values. The last set of variables was related to the formal complexity of responses. Most of these variables (number of words, number of verbs, presence of other connectives) are just gross cues of the cognitive complexity or elaborateness of the responses. The number of different responses given by participants, in turn, is a second-order parameter related to schematic knowledge. When most participants produce the same or very similar responses to a given stimulus, this suggests they share a conventional causal schema. As we have seen, a possible prediction is that causal connectives involve more similar or schematic responses among participants than adversative connectives. Once the participants' responses had been transcribed, the first author encoded them following the aforementioned protocol. Each response was encoded using a blind procedure, as the connectives had been suppressed in the transcriptions. Any doubts about the encoding were resolved between the first and the second author. Incomplete and nonsensical responses were discarded from the analysis (approximately 12 percent). The final analysis was run on a corpus of 2511 responses. To obtain the continuity variables, the force dynamics variables of the stimulus sentences were also encoded by the first author. In addition, the emotional valences of the stimulus sentences had been previously determined in a normative study, where twenty participants rated the emotional valence of each stimulus sentence on a scale from −2 (completely negative) to +2 (completely positive). Most of the stimulus sentences (36 out of 40) were evaluated as neutral (0) or positive (higher than 0). These stimulus sentence valence values were used to compute the valence continuity of the participants' responses, as explained above. To test the reliability of the force dynamic encoding, two independent judges encoded 30 percent of the participants' responses. The judges had previously been given a tutorial explaining the force dynamics categories (causal, enabling and preventing), the force dynamics domains (physical, intra-psychic and interpersonal), and the linguistic locus of force dynamics (verb and other lexical elements). This tutorial included a list of force dynamic verbs, based on Wolff and Song's (2003) proposal and the authors' own intuition.
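The 0/1 coding of the continuity variables described above can be illustrated with a minimal Python sketch. The field names, the function name `continuity_codes` and the two example codings are hypothetical, invented here for illustration only; they are not taken from the corpus or from the authors' materials.

```python
# Fields compared between stimulus and response (illustrative names, not the authors').
FIELDS = ("force_type", "force_domain", "linguistic_locus", "valence")


def continuity_codes(stimulus, response):
    """Return 0 for each field whose value is kept and 1 for each field that shifts."""
    return {field + "_shift": int(stimulus[field] != response[field]) for field in FIELDS}


# Hypothetical codings of one stimulus sentence and one continuation.
stimulus = {
    "force_type": "causal",
    "force_domain": "intra-psychic",
    "linguistic_locus": "verb",
    "valence": "neutral",
}
response = {
    "force_type": "preventing",
    "force_domain": "intra-psychic",
    "linguistic_locus": "other lexical elements",
    "valence": "negative",
}

print(continuity_codes(stimulus, response))
```

In this invented case the response shifts the type of force, the linguistic locus and the valence, but keeps the domain, so only the domain shift would be coded as 0.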
Prototypical examples for each response category were also given to the judges. The mean inter-judge agreement (Kappa index) was .82.

3.1.5. Discriminant analysis technique. Discriminant analysis (DA) is a statistical method used for determining which set of predictor variables best differentiates among categories defined as criteria by the researcher (Huberty, 1994; Tatsuoka, 1988; Tabachnick and Fidell, 2001). DA generates the linear combinations of the predictor variables that best differentiate the criterion categories (in the current study, the four connectives). These linear combinations of discriminant variables are called discriminant functions. They are analogous to a multiple regression equation, except that the coefficients in the DA equation are chosen to maximize the separation among the means of the criterion categories. Each criterion category mean is called the centroid, i.e., the mean discriminant scores for each of the criterion variable categories for each of the discriminant functions. For instance, in the current study, the
DA has four centroids, one for each connective. The closer an observation's discriminant function score falls to a category centroid, the more likely it is that the observation will be a member of that category. Also, when centroids are well apart from one another, the corresponding discriminant functions will clearly discriminate among cases. By contrast, when centroids lie close together, the discriminant functions will discriminate poorly among cases. The maximum number of functions that DA can provide is equal to the number of criterion categories minus one. In this study, with four connectives as criterion categories, we can obtain a maximum of three discriminant functions to account for the data. The first function explains the largest share of the overall variation, allowing the best discrimination among categories, the second function explains most of the remaining variation, and so on. The discriminant functions are independent or orthogonal, i.e., their contributions to the discrimination between categories do not overlap. Discriminant functions are interpreted by means of structure coefficients. The structure coefficients show the correlations between each predictor variable and each discriminant function. Those predictor variables with the highest correlations with a discriminant function contribute most to defining the dimension along which the studied categories differ from one another. By identifying the set of variables that correlate most strongly with a given dimension, we may infer a suitable label for that dimension. This labelling gives an evaluation of the content of the statistically significant discriminant functions, which in turn aids in the interpretation of the resulting category differences. In keeping with general practice, we included in the interpretation of the discriminant functions those variables whose structure coefficients exceeded .3 in absolute value (Camacho, 1995; Tabachnick and Fidell, 2001).
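The role of centroids and the limit on the number of functions can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the centroid coordinates below are placeholders invented for the example (they are not the values plotted in Figure 1), the function names are ours, and plain Euclidean distance stands in for whatever distance measure a statistical package actually applies.

```python
import math

# Placeholder centroids in the (function 1, function 2) space.
# These numbers are invented for illustration; they are NOT the Figure 1 values.
CENTROIDS = {
    "porque": (0.9, 0.2),
    "puesto que": (0.8, -0.2),
    "aunque": (-0.8, 0.2),
    "a pesar de que": (-0.9, -0.2),
}


def max_functions(n_categories, n_predictors):
    """Upper bound on the number of discriminant functions."""
    return min(n_categories - 1, n_predictors)


def nearest_centroid(scores, centroids=CENTROIDS):
    """Assign a case to the category whose centroid is closest to its scores."""
    return min(centroids, key=lambda name: math.dist(scores, centroids[name]))


print(max_functions(4, 28))
print(nearest_centroid((1.0, 0.5)))
```

With four connectives and 28 predictors the bound is three functions, as stated in the text. When two centroids lie close together, as the two causal placeholders do here, small changes in a case's scores are enough to move it from one category to the other.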
Once the discriminant functions have been obtained, another important application of DA is the prediction of case classification. In other words, it is possible to use the discriminant function scores to determine the probability that a given case will be assigned to its predicted category. The procedure consists of computing the distance between each case and each category centroid and then classifying the case as belonging to the category to which it is the closest. This procedure allows us to assess the predictive accuracy of the resulting discriminant model.

4. Results

A DA was performed using the statistical package SPSS 14 to examine how the 28 independent variables in the protocol predict differences among the four connectives used as categorical criteria (dependent variables). Three discriminant functions were found. The first function accounted for 91.93% of the explainable variance; the second function accounted for 4.42% of the variance;

Table 3. Discriminant functions and their eigenvalue, percentage of variance, canonical correlation, Wilks's lambda, chi-square, degrees of freedom and p value.

Function   Eigenvalue   % of Variance   Canonical correlation   Wilks's lambda   Chi-square   Df   P
1          .727         91.9            .649                    .544             1517.310     96   < .0001
2          .035         4.4             .184                    .939             156.360      62   < .0001
and the third function explained the remaining 3.64% of the variance. We will report on the first and the second discriminant functions, because the third lacks statistical and theoretical relevance. Table 3 presents statistical information on the two relevant functions. Specifically, it gives their eigenvalue (a measure of their relative discriminant power), the percentage of explained variance, the canonical correlation (a measure of the association between the criterion connectives and the discriminant function), Wilks's lambda (a value that varies from 0 to 1; the smaller the value, the more that function contributes to explaining group differences), and the chi-square test based on lambda, with its degrees of freedom and its p value, which express the statistical significance of the function. Because there were no compelling reasons to order the predictor variables a priori, we entered all the predictor variables directly into the DA. Table 4 shows the correlations between the predictor variables and the two significant functions. An examination of Function 1 (χ²(96) = 1517.31; p < 0.0001) reveals that it separates sentences with causal connectives (porque and puesto que) from sentences with adversative connectives (aunque and a pesar de que). Sentences with causal connectives are characterized by higher scores (positive correlations) on positive emotional valence and causal and enabling types of force dynamics, whereas sentences with adversative connectives (negative correlations) show the opposite pattern, that is, higher scores on negative emotional valence, preventing type of force dynamics, change of emotional valence and change of force dynamics. Function 2 (χ²(62) = 156.36; p < 0.0001) separates subjective connectives (porque and aunque) from objective connectives (puesto que and a pesar de que).
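As a cross-check on Table 3, the canonical correlation of a discriminant function is tied to its eigenvalue by the standard identity R = sqrt(eigenvalue / (1 + eigenvalue)). The short Python sketch below applies that identity to the two eigenvalues reported in Table 3; the function and variable names are ours and serve only this illustration.

```python
import math

# Eigenvalues of the two significant functions, as reported in Table 3.
EIGENVALUES = {1: 0.727, 2: 0.035}


def canonical_correlation(eigenvalue):
    """Canonical correlation implied by a discriminant function's eigenvalue."""
    return math.sqrt(eigenvalue / (1.0 + eigenvalue))


for function, eigenvalue in EIGENVALUES.items():
    print(function, round(canonical_correlation(eigenvalue), 3))
```

Rounded to three decimals, the results are .649 and .184, which agree with the canonical correlations listed in Table 3.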
Sentences with subjective connectives are characterized by higher scores on the intra-psychic forces domain, mentalist verbs and the presence of negative particles (positive correlations), whereas sentences with objective connectives are associated with higher scores on the interpersonal forces domain, illocutive verbs and the presence of a new character or agent (negative correlations). The correlation matrix in Table 4 must be complemented by the discriminant space in Figure 1 that depicts the centroids of the four connectives and their coordinate values in the two orthogonal discriminant functions. Sentences with
Table 4. Correlations between predictor variables and the significant discriminant functions.

Predictors                                     F. 1    F. 2
Preventing forces                              .755    .064
Negative valence                               .749    .108
Positive valence                               .679    .038
Emotional valence shift                        .661    .009
Causal forces                                  .416    .073
Enabling forces                                .398    .113
Force dynamics shift                           .395    .019
Intra-psychic domain                           .145    .534
Interpersonal domain                           .146    .450
Presence of negative particle                  .302    .448
Mentalist verbs                                .051    .402
Presence of a new character                    .085    .386
Illocutive verbs                               .135    .352
Other verbs                                    .026    .317
Number of verbs                                .031    .279
Verbal tense                                   .027    .261
Linguistic locus on other lexical elements     .040    .223
Number of words                                .098    .113
Number of different responses                  .066    .104
Intentional verbs                              .248    .105
Physical domain                                .006    .109
Deontic verbs                                  .048    .048
Temporal verbs                                 .002    .122
Linguistic locus on the verb                   .079    .023
Presence of connectives                        .070    .107
Neutral valence (a)                            .153    .079
Locus shift (a)                                .028    .088
Domain shift (a)                               .002    .101

Note. Weights > |.30| are in bold and gray background. (a) Variables not passing the tolerance criteria were not used in the analysis.
Figure 1. Group centroids for each criterion group in the discriminant space defined by the two significant discriminant functions. The coordinate values of the centroids are given in parentheses. At the left are the connectives at the discontinuous pole of the continuity-discontinuity function; at the top are the connectives at the internal pole of the internal-external function.

Table 5. Percent of predicted classifications that match the actual group categories according to the discriminant functions.

                    Predicted category
Actual category     Porque    Puesto que    Aunque    A pesar de que    Total
Porque              39.61     32.58         14.21     13.57             100
Puesto que          33.28     38.24         11.52     16.96             100
Aunque               6.90      6.27         49.13     37.67             100
A pesar de que       5.46      5.14         37.94     51.44             100

Note. Values in bold in the diagonal are hits. Values in the nearby cells in gray background are also sentences correctly classified within the general category of causal or adversative. For the purpose of classification, prior probabilities of group membership were based on sample probabilities for each group (these were based on sample size).
causal connectives (porque and puesto que) are situated at the continuity pole of the first discriminant function, whereas sentences with adversative connectives (aunque and a pesar de que) are at the discontinuity pole. Concerning the second discriminant function, sentences with subjective connectives (porque and aunque) are at the internal pole of the dimension, whereas the objective connectives (puesto que and a pesar de que) are at the external pole. Finally, a detailed account of the classifications predicted by the model is provided in Table 5. The discriminant analysis correctly classified 44.6% of the observations into the appropriate reference categories, well above the chance rate of 25%. The hit rate was 39.61% for porque ('because') sentences, 38.24% for puesto que ('given that') sentences, 49.13% for aunque ('although') sentences and 51.44% for a pesar de que ('in spite of') sentences. These percentages may seem lower than could be expected. However, a careful inspection of Table 5 shows that most of the misclassifications took place within the same category of connectives. In other words, about 72% of causal sentences were correctly categorized as causal, either in the porque or puesto que category, and only the remaining 28% were erroneously classified as adversative. The accuracy of the model was even better for adversative sentences: 88% were correctly categorized as adversative (aunque or a pesar de que) and only 12% were incorrectly classified as causal.

5. Discussion

Researchers generally agree that connectives work as processing instructions guiding the reader or hearer to construe a specific coherence relation (e.g., Givón, 1995; Louwerse, 2001; Pander Maat and Sanders, 2006; Pit, 2006; Sanders and Spooren, 2007). Thus, sentences with a causal connective invite one to establish a causal inference connecting two events. By contrast, sentences with an adversative connective announce that causal expectations
between the events in the sentences are violated. In addition, causal sentences follow the principle of continuity that governs discourse comprehension (Murray, 1997; Gernsbacher, 1990; Zwaan et al., 1995), whereas in adversative sentences, the continuity expectations are disrupted. The above characterizations of causal and adversative relations are very useful, and help us to understand that both kinds of relations are two poles of the same mental schema: causality. However, notions such as causality, continuity and polarity are rather opaque and deserve theoretical elaboration. Causality itself is an important epistemic principle in human cognition, and is subject to intensive investigation and debate in philosophy, linguistics and the cognitive sciences. Under the label causation, researchers understand different concepts: perceptual schemas (e.g., Michotte, 1963), covariation patterns (e.g., Cheng, 1997), counterfactual simulation (e.g., Kahneman and Tversky, 1982), differences in a causal field (e.g., Mackie, 1974), force dynamic relations (Talmy, 1988; Wolff, 2007), etc. The notions of continuity and polarity, for their part, are sometimes more descriptive than explanatory. The first goal of this study was to provide a possible characterization of the causality underlying sentences marked with causal or adversative connectives, exploring how their differences in polarity or continuity are associated with force dynamics and emotional valence parameters. The second goal was to explore the subjectivity of both causal and adversative connectives in Spanish.

5.1. Continuity in force dynamics

The results of this study showed a powerful statistical function that clearly discriminated between sentences with causal and adversative connectives, as well as a less conspicuous, albeit significant, function that discriminated between sentences with subjective and objective connectives.
The first discriminant function, which we called the continuity-discontinuity function, confirms that sentences with causal and adversative connectives mainly differ in their polarity or continuity (Sanders et al., 1992; 1993; Louwerse, 2001). However, these results go beyond a simple confirmation of previous taxonomies, providing a more fine-grained analysis of polarity in terms of force dynamics and emotional valence continuity. Sentences with the causal connectives porque and puesto que showed continuity in force dynamics, whereas sentences with the adversatives aunque and a pesar de que involved a shift in force dynamics. Given that the stimulus sentences always involved a causal force, the continuity of causal relations consisted of participants' responses that kept the same causal or enabling force, whereas the discontinuity of adversative relations corresponded to responses with a preventing force. Note that the presence of a preventing force is the strongest marker of adversative relations, with the highest weight in the continuity-discontinuity discriminant function (.755), as shown in Table 4.
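The continuity-discontinuity contrast described above can be made concrete with a small sketch. It assumes, as a simplification of the description in the text, that causal and enabling forces share one polarity and that a preventing force has the opposite polarity; the names `Force`, `polarity` and `is_continuous` are hypothetical and are not part of the coding scheme used in the study.

```python
from enum import Enum


class Force(Enum):
    CAUSAL = "causal"
    ENABLING = "enabling"
    PREVENTING = "preventing"


def polarity(force: Force) -> int:
    """Assumption: causal and enabling forces push towards the outcome (+1),
    whereas a preventing force pushes against it (-1)."""
    return -1 if force is Force.PREVENTING else 1


def is_continuous(stimulus: Force, response: Force) -> bool:
    """A response is continuous when it keeps the polarity of the stimulus."""
    return polarity(stimulus) == polarity(response)


# The stimulus sentences of the study always involved a causal force.
stimulus = Force.CAUSAL
for response in Force:
    label = "continuity" if is_continuous(stimulus, response) else "discontinuity"
    print(f"{stimulus.value} stimulus + {response.value} response -> {label}")
```

In this toy version, continuity is the pattern the text associates with porque and puesto que, and discontinuity the pattern it associates with aunque and a pesar de que. The function is symmetric, so a preventing stimulus followed by a preventing response also counts as continuous.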
However, it is unlikely that causal and enabling forces necessarily characterize the antagonist of causal sentences, or that preventing forces necessarily correspond to the antagonist of adversative sentences. If the stimulus sentences had described preventing forces, rather than causal forces (e.g., Carmen refrained from walking in the street because / although . . .), the continuity-discontinuity dimension would have predicted participants' responses with preventing forces following causal connectives (e.g., because she twisted her ankle) and responses with causal or enabling forces following adversative connectives (e.g., although she was in a hurry to arrive home). It is remarkable how close the semantic relation is between connectives and verbs that are combined in the same sentences. As shown in the introduction, previous studies have reported that connectives convey a variety of causal relations (Sanders et al., 1992; 1993; Louwerse, 2001; de Vega, 2005; Murray, 1997; Townsend, 1983; Pander Maat and Sanders, 2001; Bestgen et al., 2006; Caron, 1997; Segal and Duchan, 1997), and also that some verbs express force dynamic causal relations (Wolff, 2003; 2007; Wolff and Song, 2003). Moreover, Stukker et al. (2008) reported that the semantics of causal connectives and causal verbs partially overlap, although their syntactic range differs: verbs operate at the clause level, and connectives at the discourse level. The results of this study allow us to postulate a different view: that there is a fine semantic adjustment between the force dynamics of connectives and the verbs which compose a sentence. A possible interpretation of this is that sentences and connectives constrain each other: a given connective calls for a certain semantic profile in the connected sentences, mainly, although not exclusively, expressed by the verbs. The opposite constraint has also been reported in studies that employ a substitution paradigm.
When two clauses describe events with a causal bias, the insertion of an adversative connective (e.g., although) rather than an appropriate causal connective (e.g., because) between them disrupts on-line comprehension. Likewise, when adversative biased sentences include a causal rather than an adversative connective, comprehension also decreases (Murray, 1997; de Vega, 2005). Talmy's force dynamics theory is an example of embodiment theories of meaning that postulate that linguistic meaning is grounded in sensory-motor processes (see de Vega et al., 2008, for a debate on this issue). But how much embodiment takes place in force dynamics? The current experiment demonstrates that force dynamic continuity parameters are computed to understand and generate causal and adversative relations, but the exact representational nature of these force dynamics remains unclear. It might be the case that force dynamics involve abstract or symbolic representations of the agonist, antagonist and outcome force. If so, the semantic adjustment processes described above could be perfectly accomplished by means of purely propositional or symbolic computations. For instance, the sentence The horse stopped its
gallop because it met a fence could be reduced to a set of force dynamics propositions and rules:
P1: (GALLOP, HORSE)
P2: (INITIAL-STATE, P1)
P3: (STOP, HORSE)
P4: (FINAL-STATE, P3)
P5: (HORSE, AGONIST)
P6: (FENCE, ANTAGONIST)
P7: (ANTAGONIST > AGONIST)
FD: IF (AGONIST, INITIAL-STATE ≠ FINAL-STATE) and (ANTAGONIST > AGONIST) THEN (CAUSE-OF, FINAL-STATE, ANTAGONIST)
When the force dynamics rule FD is applied to the propositions P1 to P7, it can be concluded that the initial sentence is a well-formed force dynamic causal expression. However, ungrounded symbolic processes like the above proposal have some drawbacks. Their lack of grounding in perceptual experience makes purely symbolic systems inappropriate to explain referential meaning (e.g., de Vega et al., 2008; Harnad, 1999; Barsalou, 1999). Another possibility, more akin to the embodiment approach, would be that speakers and listeners of causal and adversative sentences actually activate sensory-motor processes (e.g., motor events or percepts) to simulate the force dynamic events. If this were the case, it could be expected that understanding or producing a causal sentence interacts with the simultaneous performance of an action that either matches or mismatches the meaning of the sentence. This sort of procedure, called the action-sentence compatibility effect (ACE) paradigm, has been applied to the comprehension of action sentences (Glenberg and Kaschak, 2002; de Vega, 2008). For instance, in some experiments, participants read sentences describing transfer actions away from the speaker (e.g., I gave you the book) or towards the speaker (e.g., You gave me the book), and simultaneously or immediately afterwards performed a hand motion away from or towards themselves. The typical result is that in the matching conditions (e.g., transfer away and motion away) responses are faster than in the mismatching conditions (e.g., transfer towards and motion away).
The current study did not perform an ACE test, but in another recent experiment, participants received causal and adversative sentences while they watched and responded to an animation that represented causal or adversative force dynamics (Morera, 2009; Experiment 5). The responses to the animation were faster for the matching than for the mismatching condition, suggesting that some sort of sensory-motor force dynamics underlie the comprehension of sentences with connective relations.
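The purely symbolic alternative discussed in this section, in which the horse example is reduced to propositions P1 to P7 and a rule FD, can be sketched in a few lines. This is only an illustration under stated assumptions: the condition on the agonist is read as "the initial state differs from the final state", and the class `ForceScene` and the function `apply_fd_rule` are hypothetical names introduced here, not part of the original proposal.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class ForceScene:
    """Propositional content of a sentence, loosely following P1 to P7."""
    agonist: str               # P5: (HORSE, AGONIST)
    antagonist: str            # P6: (FENCE, ANTAGONIST)
    initial_state: str         # P1/P2: the agonist's initial state
    final_state: str           # P3/P4: the agonist's final state
    antagonist_stronger: bool  # P7: (ANTAGONIST > AGONIST)


def apply_fd_rule(scene: ForceScene) -> Optional[Tuple[str, str, str]]:
    """Rule FD: if the agonist's state changes and the antagonist is stronger,
    conclude that the antagonist is the cause of the final state.
    Assumption: the state condition means initial_state != final_state."""
    if scene.initial_state != scene.final_state and scene.antagonist_stronger:
        return ("CAUSE-OF", scene.final_state, scene.antagonist)
    return None  # the rule does not fire


# "The horse stopped its gallop because it met a fence."
horse = ForceScene(
    agonist="HORSE",
    antagonist="FENCE",
    initial_state="GALLOP",
    final_state="STOP",
    antagonist_stronger=True,
)
print(apply_fd_rule(horse))
```

For the horse sentence the rule fires and yields the CAUSE-OF proposition linking the final state to the fence. Nothing in the computation refers to perception or action, which is precisely the grounding problem raised in the text.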
5.2. Continuity in emotional valence
Force dynamic parameters are not the only ones that help to distinguish between causal and adversative sentences. The inspection of Table 5 shows that emotional valence variables accumulate even more statistical weight than force dynamic parameters in the continuity-discontinuity function. Specifically, positive emotional valence and valence continuity are associated with causal sentences, whereas negative emotional valence and valence discontinuity are associated with adversative sentences. Again, we may note that positive emotional valence is not a necessary descriptor of causal relations, nor is negative valence a necessary descriptor of adversative relations. The observed pattern could be a consequence of the biases in our stimulus sentences, most of which were of positive valence. If the stimulus sentences had been of negative valence (Marc broke his leg because / although . . .), the continuity-discontinuity dimension would have predicted a negative valence response for causal connectives (because he got into a bike accident) and a positive valence response for adversative connectives (although he won a gold medal). Why does emotional valence or, more precisely, valence continuity play such a prominent role in causal and adversative relations? At least two possible explanations can be considered. First, connectives modulate the continuity of both force dynamics and emotional value as independent parameters. According to this view, continuity is a general dimension that distinguishes between sentences with causal and adversative connectives. When continuity is fleshed out, several semantic features could emerge, such as force dynamics and emotional valence, and probably others not examined in this study. Another plausible and more parsimonious explanation is that emotional valence and force dynamics are two related features.
For instance, a positive emotional valence could be considered a causal or enabling force (e.g., associated with performing actions), whereas a negative emotional valence would be a preventing force (e.g., associated with avoidance or inaction). Or, seen the other way around, causal and enabling forces would more likely involve positive emotional valence, whereas preventing forces would involve negative emotional valence. It is intuitively appealing to consider emotions as forces, and this view is supported by the fact that everyday language incorporates emotion-as-force metaphors and idioms. Thus, love is sometimes described as a magnetic or electric attraction; and anger is frequently described as exploding, burning or boiling (Lakoff, 1987).
5.3. Internal and external causality
The second discriminant function, which we called the internal-external dimension, is orthogonal to the continuity-discontinuity function. It involves a
gross distinction between sentences that describe internal causality (defined as involving the intra-psychic domain, the presence of a negation and a mentalist verb) and sentences that describe external causality (involving the interpersonal domain, the presence of another character and an illocutive verb). The internal pole of the discriminant function includes sentences with two connectives (the causal porque and the adversative aunque) and the external pole includes sentences with the other two connectives (the causal puesto que and the adversative a pesar de que). Our initial Hypothesis 2 was that porque and aunque more likely express subjective relations, and puesto que and a pesar de que tend to express objective relations. The second discriminant function statistically confirmed this classificatory criterion, although we preferred to call it the internal-external dimension rather than use the term subjectivity. Subjectivity is a very complex semantic factor that is still under theoretical discussion. Some authors claim that causal utterances are subjective to the extent that they are connected to the speaker's deictic perspective (Pander Maat and Sanders, 2001; Pander Maat and Degand, 2001; Pit, 2006; Sanders and Spooren, 2007; Spooren et al., in press). Deictic perspective is not an all-or-none variable. Rather, it must be considered on a continuous scale of the speaker's involvement. The subjective involvement can be explicitly declared in the sentence by means of mentalist verbs or other lexical devices (e.g., I think that John is sick because today he did not come to work), but notably more subjectivity could take place when the mental state is presupposed rather than directly mentioned.
For instance, the sentence It is raining because the streets are wet is highly subjective, because the apparent violation of a causality schema (wet streets do not cause rain) is resolved by inferring implicit epistemic information, shown in brackets in the rewritten example: [I think that] it is raining because [I saw that] the streets are wet. An empirical demonstration of the cost of subjectivity is that understanding such epistemic or diagnostic sentences takes longer than understanding an objective causal utterance (The streets are wet because it is raining), as demonstrated by Traxler et al. (1997). We avoided calling our second discriminant function subjectivity for several reasons. First, the current study used a restricted sample of stimulus sentences in which, for instance, epistemic or diagnostic sentences and sentences describing physical events were absent. Second, the sample of causal and adversative relations was also limited to four backward connectives, although there are a few more available in the Spanish repertoire. Therefore, we were not able to explore the whole range of subjectivity that potentially underlies causal and adversative relations in Spanish. Finally, an a posteriori reason is that the descriptors found for the two poles of the second discriminant function do not exactly match the descriptors usually attributed to the subjective and objective relations. Although sentences with porque and aunque share typical
features of subjectivity (intra-psychic domain, negation, mentalist verbs), the most subjective epistemic relations or speech-acts are absent in our corpus. Furthermore, sentences with puesto que and a pesar de que are quite hybrid, including relatively objective features (presence of another character and use of illocutive verbs), but also the interpersonal domain, which could hardly be considered entirely objective. Consequently, we chose the labels internal and external, which are rather accurate in describing the two poles of the dimension (see Halliday and Hasan, 1976 for a similar classification of cohesion relations). The internal pole refers to mental states or events, being basically subjective although excluding epistemic relations, whereas the external pole refers to factual events occurring in the social realm, involving objective interactions and explicit illocutive acts with other characters. One advantage of characterizing the dimension as internal-external is that it could be included in the force dynamics theoretical framework, a possibility that Talmy (1988) also considered. Thus, he claimed that causal or force dynamic relations could take place in the intrapsychic, the interpersonal or the physical domain. We confirmed that the intrapsychic and the interpersonal domain of force dynamics are selectively associated with two groups of connectives. The absence of physical domain events in the current study is explainable because the stimulus sentences were biased towards human activity rather than physical events. As noted above, the internal-external dimension, although statistically significant, is not as well-defined as the continuity-discontinuity dimension. In other words, sentences with internal and external connectives do not differ as much as sentences with causal and adversative connectives.
Looking at Figure 1, it seems that the semantic confusion is larger between internal and external causal connectives (porque and puesto que), which are very close in the discriminant space, whereas sentences with internal and external adversative connectives (aunque and a pesar de que) are more distinguishable, because they are further apart in the discriminant space. The simplest interpretation is that the connective porque is an all-purpose causal connective that can be used with both internal and external relations, whereas adversative connectives are more specific along the internal-external dimension. The method used in this study deserves some comment. The sentence completion task offers some advantages over other methods. It produces a rich corpus of responses that can be encoded according to a set of variables, and the data can easily be subjected to classificatory statistics methods like discriminant analysis. Unlike written or spoken corpus studies, the procedure used in this study involves an experimental manipulation of the target connectives with a careful control of the stimulus sentences, and as such is ideal for seeing how connectives themselves constrain sentence meaning. Finally, the completion task (actually a combination of comprehension and
production tasks) is rather intuitive for participants and allows us to avoid having to ask them to make semantic judgments that could on occasion demand sophisticated metacognitive processes. A drawback of the method is that the completion task is rather artificial and probably does not produce causal and adversative sentences as natural as those produced in spontaneous language (Gilquin and Gries, 2009). In addition, it requires researchers to perform the painstaking encoding of responses and appropriate reliability measures. In sum, this study confirmed and extended some previous findings on causal and adversative relations in Spanish sentences. The selected connectives served as discriminant criteria for two semantic dimensions. The first dimension discriminated sentences with causal (porque and puesto que) and adversative connectives (aunque and a pesar de que), revealing that they differ in the continuity-discontinuity function of force dynamic and emotional valence parameters. The continuity-discontinuity of force dynamics in connective sentences goes beyond the analysis of force dynamic verbs reported in other studies and thus provides a more integrative approach. Given the minimal manipulation of connectives attached to the same stimulus sentence, the results demonstrated that connectives themselves impose semantic constraints on sentences. In addition, this study incorporates for the first time the semantics of emotions into the analysis of connectives, as the continuity-discontinuity function has shown. The second internal-external function, related to subjectivity, cuts across causal and adversative connectives, discriminating between porque/aunque (internal) and puesto que/a pesar de que (external). The internal pole concerns the intra-psychic domain and the external pole the interpersonal domain. This study was based on a single language and cannot be generalized to other languages like English.
However, given the apparent one-to-one translatability of the Spanish conjunctions into English ones, similar patterns could be predicted in English and perhaps in other languages. Nevertheless, cross-linguistic analyses will be necessary to consolidate the current results and interpretations. Other research avenues could involve exploring how force dynamic and emotional valence continuity and the internal-external domain modulate on-line processing of causal and adversative sentences. In addition, it would be useful to test to what extent the semantics of force dynamics are really embodied, for instance by analyzing the activation of motor areas in the brain during comprehension of causal and adversative sentences. This could be done by collecting behavioural data, event-related potentials or neuroimaging data.
Received 3 July 2009
Revision received 7 December 2009
University of La Laguna
Appendix
Below, we show three different participants' responses for each of the four versions of a stimulus sentence and their codification according to the set of predictor variables considered (the stimulus sentences and participants' answers were translated into English; we present the original Spanish followed by the translation in brackets). We present first the encoding of variables 1–14 and then the encoding of variables 15–28 (for the same sentences). Legend for analysis categories:
1 = Presence of the negative particle not (0 = No, 1 = Yes)
2 = Presence of a new character or agent (0, 1)
3 = Verbal tense (0 = Imperfective, 1 = Perfective)
4–9 = Subjectivity/objectivity of the verb: 4 = Temporal (0, 1); 5 = Intentional (0, 1); 6 = Mentalist (0, 1); 7 = Deontic (0, 1); 8 = Illocutive (0, 1); 9 = Others (0, 1).
10–12 = Type of force dynamics: 10 = Causal (0, 1); 11 = Enabling (0, 1); 12 = Preventing (0, 1).
13–15 = Domain of force dynamics: 13 = Physical (0, 1); 14 = Intra-psychic (0, 1); 15 = Interpersonal (0, 1).
16–17 = Linguistic locus of force dynamics: 16 = Verb (0, 1); 17 = Other lexical elements (0, 1).
18–20 = Emotional valence: 18 = Neutral (0, 1); 19 = Positive (0, 1); 20 = Negative (0, 1).
21–24 = Continuity variables: 21 = Force dynamics shift (0, 1); 22 = Domain of force dynamics shift (0, 1); 23 = Linguistic locus shift (0, 1); 24 = Emotional valence shift (0, 1).
25 = Number of words (1 or more)
26 = Number of verbs (1 or more)
27 = Presence of other connectives (0, 1)
28 = Number of different responses (0–40)
El entrenador empezó a felicitar a su equipo porque . . . (The team manager began to congratulate the team because . . .)
R1. Habían causado gran admiración al público. (They had caused great admiration in the people).
R2. Habían jugado deportivamente. (They had played sportingly).
R3. Él pensaba que se lo habían merecido. (He thought they had deserved it).
Variable  1  2  3  4  5  6  7  8  9 10 11 12 13 14
R1        0  1  0  0  0  0  0  0  1  1  0  0  0  0
R2        0  0  1  0  0  0  0  0  1  1  0  0  0  0
R3        0  0  1  0  0  1  0  0  0  1  0  0  0  1
El entrenador empezó a felicitar a su equipo puesto que . . . (The team manager began to congratulate the team given that . . .)
R4. Habían ganado la copa. (They had won the cup).
R5. Habían trabajado duro. (They had worked hard).
R6. Habían hecho muy buen trabajo. (They had done a very good job).
Variable  1  2  3  4  5  6  7  8  9 10 11 12 13 14
R4        0  0  0  0  0  0  0  0  1  1  0  0  0  0
R5        0  0  0  0  0  0  0  0  1  1  0  0  0  0
R6        0  0  0  0  0  0  0  0  1  1  0  0  0  0
El entrenador empezó a felicitar a su equipo aunque . . . (The team manager began to congratulate the team although . . .)
R7. No obtuvieron el resultado que esperaban. (They did not get the expected result).
R8. No estaba satisfecho con el resultado. (He was not satisfied with the result).
R9. Estaban cansados después del esfuerzo que habían hecho. (They were tired after the effort they had made).
Variable  1  2  3  4  5  6  7  8  9 10 11 12 13 14
R7        1  0  1  0  0  1  0  0  0  0  0  1  0  1
R8        1  0  0  0  0  0  0  0  1  0  0  1  0  1
R9        0  0  0  0  0  0  0  0  1  0  0  1  0  0

El entrenador empezó a felicitar a su equipo a pesar de que . . . (The team manager began to congratulate the team in spite of . . .)
R10. Habían perdido el partido. (They lost the match).
R11. No habían ganado el partido. (They did not win the match).
R12. Habían jugado fatal. (They had played horribly).
Variable  1  2  3  4  5  6  7  8  9 10 11 12 13 14
R10       0  0  1  0  0  0  0  0  1  0  0  1  0  0
R11       1  0  1  0  0  0  0  0  1  0  0  1  0  0
R12       0  0  0  0  0  0  0  0  1  0  0  1  0  0
El entrenador empezó a felicitar a su equipo porque . . . (The team manager began to congratulate the team because . . .)
R1. Habían causado gran admiración al público. (They had caused great admiration in the people).
R2. Habían jugado deportivamente. (They had played sportingly).
R3. Él pensaba que se lo habían merecido. (He thought they had deserved it).
Variable 15 16 17 18 19 20 21 22 23 24 25 26 27 28
R1        1  1  0  0  1  0  0  0  0  0  4  1  0 20
R2        1  0  1  0  1  0  1  0  0  0  3  1  0 20
R3        0  1  0  0  1  0  0  0  1  0  4  2  0 20
El entrenador empezó a felicitar a su equipo puesto que . . . (The team manager began to congratulate the team given that . . .)
R4. Habían ganado la copa. (They had won the cup).
R5. Habían trabajado duro. (They had worked hard).
R6. Habían hecho muy buen trabajo. (They had done a very good job).
Variable 15 16 17 18 19 20 21 22 23 24 25 26 27 28
R4        1  1  0  0  1  0  0  0  0  0  0  1  0 16
R5        1  1  0  0  1  0  0  0  0  0  4  1  0 16
R6        1  0  1  0  1  0  0  0  1  0  7  1  0 16
El entrenador empezó a felicitar a su equipo aunque . . . (The team manager began to congratulate the team although . . .)
R7. No obtuvieron el resultado que esperaban. (They did not get the expected result).
R8. No estaba satisfecho con el resultado. (He was not satisfied with the result).
R9. Estaban cansados después del esfuerzo que habían hecho. (They were tired after the effort they had made).
Variable 15 16 17 18 19 20 21 22 23 24 25 26 27 28
R7        0  1  0  0  0  1  1  1  0  1  6  1  0 14
R8        0  0  1  0  0  1  1  1  1  1  7  1  0 14
R9        1  0  1  0  0  1  1  1  1  1  9  2  1 14
El entrenador empezó a felicitar a su equipo a pesar de que . . . (The team manager began to congratulate the team in spite of . . .)
R10. Habían perdido el partido. (They lost the match).
R11. No habían ganado el partido. (They did not win the match).
R12. Habían jugado fatal. (They had played horribly).
Variable 15 16 17 18 19 20 21 22 23 24 25 26 27 28
R10       1  1  0  0  0  1  1  0  0  1  4  1  0 14
R11       1  1  0  0  0  1  1  0  0  1  6  1  0 14
R12       1  0  1  0  0  1  1  0  1  1  4  1  0 14
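To illustrate how one coded response can be read against the legend, the sketch below decodes the 28 values for response R3 (He thought they had deserved it) into named features. The short variable names are paraphrases of the legend introduced here for illustration, and the code vector is R3's row as read from the appendix tables, so both should be treated as assumptions rather than as the authors' data format.

```python
# Short, hypothetical names for variables 1 to 28, paraphrasing the legend.
VARIABLE_NAMES = [
    "negation", "new character", "perfective tense",
    "temporal verb", "intentional verb", "mentalist verb",
    "deontic verb", "illocutive verb", "other verb",
    "causal force", "enabling force", "preventing force",
    "physical domain", "intra-psychic domain", "interpersonal domain",
    "locus: verb", "locus: other lexical elements",
    "neutral valence", "positive valence", "negative valence",
    "force dynamics shift", "domain shift",
    "linguistic locus shift", "valence shift",
    "number of words", "number of verbs",
    "other connectives", "number of different responses",
]

# Variables 1-24 and 27 are coded 0/1; 25, 26 and 28 are counts.
BINARY_VARIABLES = set(range(1, 25)) | {27}

# Codes for R3 as read from the appendix (variables 1-14, then 15-28).
R3_CODES = [0, 0, 1, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 1,
            0, 1, 0, 0, 1, 0, 0, 0, 1, 0, 4, 2, 0, 20]


def decode(codes):
    """Map a 28-value code vector onto the variable names."""
    if len(codes) != len(VARIABLE_NAMES):
        raise ValueError("expected one code per variable (28 values)")
    return dict(zip(VARIABLE_NAMES, codes))


def active_features(codes):
    """Return the names of the binary variables coded as 1."""
    decoded = decode(codes)
    return [name
            for number, (name, value) in enumerate(decoded.items(), start=1)
            if number in BINARY_VARIABLES and value == 1]


print(active_features(R3_CODES))
```

Read this way, R3 combines a mentalist verb, a causal force and the intra-psychic domain, which is the profile the Discussion associates with the internal pole of the second discriminant function.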
References
Alameda, Jose R. and Fernando Cuetos. 1995. Diccionario de frecuencias de las unidades lingüísticas del castellano. (Dictionary of frequencies of Castilian linguistic units). Oviedo: Servicio de Publicaciones de la Universidad de Oviedo.
Barsalou, Lawrence W. 1999. Perceptual Symbol Systems. Behavioral and Brain Sciences 22, 577–609.
Bestgen, Yves, Liesbeth Degand, and Wilbert Spooren. 2006. Towards automatic determination of semantics of connectives in large newspaper corpora. Discourse Processes 41(2), 175–193.
Bickerton, Derek. 1981. Roots of language. Ann Arbor: Karoma.
Bloom, Lois, Margaret Lahey, Lois Hood, Karin Lifter, and Kathleen Fiess. 1980. Complex sentences: acquisition of syntactic connectives and the semantic relations they encode. Journal of Child Language 7, 235–261.
Cacioppo, John T., and Wendi L. Gardner. 1999. Emotion. Annual Review of Psychology 50, 191–214.

Cain, Kate, Nikole Patson, and Leanne Andrews. 2005. Age- and ability-related differences in young readers' use of conjunctions. Journal of Child Language 32(4), 877–892.
Camacho, Juan. 1995. Análisis multivariado con SPSS/PC+. (Multivariate analysis with SPSS/PC+). Barcelona: EUB.
Caron, Jean, Hans C. Micko, and Manfred Thüring. 1988. Conjunctions and the recall of composite sentences. Journal of Memory and Language 27, 309–323.
Caron, Jean. 1997. Toward a procedural approach of the meaning of connectives. In J. Costermans and M. Fayol (Eds.), Processing interclausal relationships. Studies in the production and comprehension of text. New Jersey: Erlbaum, 95–119.
Cheng, Patricia W. 1997. From covariation to causation: A causal power theory. Psychological Review 104, 367–405.
Cozijn, Reinier. 2000. Integration and inference in understanding causal sentences. PhD dissertation, Tilburg University.
De Vega, Manuel, Arthur M. Glenberg, and Arthur G. Graesser. 2008. Symbols, Embodiment, and Meaning. Oxford: Oxford University Press.
De Vega, Manuel, Inmaculada León, and Jose M. Díaz. 1996. The representation of changing emotions in reading comprehension. Cognition and Emotion 10, 303–323.
De Vega, Manuel, Jose M. Díaz, and Inmaculada León. 1997. To know or not to know. Comprehending protagonists' beliefs and their emotional consequences. Discourse Processes 23, 169–192.
De Vega, Manuel, Mike Rinck, Jose M. Díaz, and Inmaculada León. 2007. Figure and ground in temporal sentences: the role of the adverbs when and while. Discourse Processes 43, 1–23.
De Vega, Manuel. 2005. El procesamiento de oraciones con conectores adversativos y causales. (The processing of sentences with adversative and causal connectives). Cognitiva 17(1), 85–108.
De Vega, Manuel. 2008. Levels of embodied meaning. From pointing to counterfactuals. In M. de Vega, A. Glenberg, and A. Graesser (Eds.), Symbols, Embodiment, and Meaning. Oxford: Oxford University Press, 287–308.
Degand, Liesbeth and Henk Pander Maat. 2003. A contrastive study of Dutch and French causal connectives on the Speaker Involvement Scale. In A. Verhagen and J. van de Weijer (Eds.), Usage Based Approaches to Dutch. Utrecht: LOT, 175–199.
Degand, Liesbeth, Nathalie Lefèvre, and Yves Bestgen. 1999. The impact of connectives and anaphoric expressions on expository discourse comprehension. Document Design 1, 39–51.
Dewhurst, Stephen A., and Lisa A. Parry. 2000. Emotionality, distinctiveness, and recollective experience. European Journal of Cognitive Psychology 12, 541–551.
Flamenco, Luis. 1999. Las construcciones concesivas y adversativas. (Concessive and adversative constructions). In I. Bosque and V. Demonte (dirs.), Gramática descriptiva de la lengua española. (Descriptive grammar of the Spanish language). Madrid: Espasa Calpe, 3805–3878.
Galán, Carmen. 1999. La subordinación causal y final. (The causal and final subordination). In I. Bosque and V. Demonte (dirs.), Gramática descriptiva de la lengua española. (Descriptive grammar of the Spanish language). Madrid: Espasa Calpe, 3600–3642.
Gernsbacher, Morton A. 1990. Language comprehension as structure-building. Hillsdale, NJ: Lawrence Erlbaum.
Gernsbacher, Morton A., H. Hill Goldsmith, and Rachel R. Robertson. 1992. Do readers mentally represent fictional characters' emotional states? Cognition and Emotion 6, 89–111.
Gilquin, Gaëtanelle and Stephan Gries. 2009. Corpora and experimental methods: A state-of-the-art review. Corpus Linguistics and Linguistic Theory 5(1), 1–26.
Givón, Tom. 1992. The grammar of referential coherence as mental processing instructions. Linguistics 30, 5–55.
Givón, Tom. 1995. Coherence in text vs coherence in mind. In M. A. Gernsbacher and T. Givón (Eds.), Coherence in spontaneous text. The Netherlands: John Benjamins, 59–115.
Glenberg, Arthur M. and Michael P. Kaschak. 2002. Grounding language in action. Psychonomic Bulletin and Review 9(3), 558–565.
Graesser, Arthur G., Murray Singer, and Tom Trabasso. 1994. Constructing inferences during narrative text comprehension. Psychological Review 101, 371–395.
Gygax, Pascal, Alan Garnham, and Jane Oakhill. 2004. Understanding emotions in text: readers do not represent specific emotions. Language and Cognitive Processes 19, 613–638.
Haberlandt, Karl. 1982. Reader expectations in text comprehension. In J. F. Le Ny and W. Kintsch (Eds.), Language and Comprehension. Amsterdam: Elsevier Science, 239–249.
Halliday, Michael A. K. and Ruqaiya Hasan. 1976. Cohesion in English. London: Longman.
Harnad, Stevan. 1999. The symbol grounding problem. Physica 42, 335–346.
Heine, Bernd and Tania Kuteva. 2007. The genesis of grammar. Oxford: Oxford University Press.
Hopper, Paul J. and Elizabeth C. Traugott. 2003. Grammaticalization. Cambridge, England: Cambridge University Press.
Huberty, Carl J. 1994. Applied Discriminant Analysis. New York: Wiley.
Kahneman, Daniel and Amos Tversky. 1982. The simulation heuristic. In D. Kahneman, P. Slovic, and A. Tversky (Eds.), Judgment under uncertainty: Heuristics and biases. Cambridge, England: Cambridge University Press, 201–210.
Lakoff, George. 1987. Women, fire, and dangerous things: What categories reveal about the mind. Chicago: University of Chicago Press.
Langacker, Ronald W. 1990. Subjectification. Cognitive Linguistics 1, 5–38.
León, Inmaculada, Jose M. Díaz, and Manuel de Vega (submitted). The impact of emotional coherence and valence in narratives. An ERP study.
Louwerse, Max. 2001. An analytic and cognitive parameterization of coherence relations. Cognitive Linguistics 21, 291–315.
Mackie, John L. 1974. The cement of the universe. Oxford, England: Oxford University Press.
Michotte, Albert E. 1963. The perception of causality. New York: Basic Books. (Original work published 1946).
Millis, Keith K. and Marcel A. Just. 1994. The influence of connectives on sentence comprehension. Journal of Memory and Language 33, 128–147.
Morera, Yurena. 2009. Force dynamics in causal and adversative sentences. PhD dissertation, University of La Laguna.
Mouchon, Serge, Michel Fayol, and Daniel Gaonac'h. 1995. On-line processing of links between events in narratives: Study of children and adults. Current Psychology of Cognition 14(1), 171–193.
Murray, John D. 1997. Connectives and narrative text: The role of continuity. Memory and Cognition 25(2), 227–236.
Noordman, Leo. 2001. On the production of causal-contrastive although sentences in context. In T. Sanders, J. Schilperoord, and W. Spooren (Eds.), Text representation: Linguistic and psycholinguistic aspects. Amsterdam: John Benjamins, 153–180.
Ohira, Hideki, Ward M. Winton, and Makiko Oyama. 1998. Effects of stimulus valence on recognition memory and endogenous eyeblinks: further evidence for positive–negative asymmetry. Personality and Social Psychology Bulletin 24, 986–993.
Ortony, Andrew, Terence J. Turner, and Stephen J. Antos. 1983. A puzzle about affect and recognition memory. Journal of Experimental Psychology: Learning, Memory, and Cognition 9, 725–729.
Oversteegen, Leonoor E. 1997. On the pragmatic nature of causal and contrastive connectives. Discourse Processes 24, 51–85.
Pander Maat, Henk and Liesbeth Degand. 2001. Scaling causal relations and connectives in terms of speaker involvement. Cognitive Linguistics 12(3), 211–245.

Pander Maat, Henk and Ted Sanders. 2001. Subjectivity in causal connectives: An empirical study of language in use. Cognitive Linguistics 12, 247273. Pander Maat, Henk. 1998. The classification of negative coherence relations and connectives. Journal of Pragmatics 30, 177204. Pinker, Steven. 1989. Learnability and cognition: the acquisition of argument structure. Cambridge, MA: The MIT Press. Pit, Mirna. 2003. How to express yourself with a causal connective: Subjectivity and causal connectives in Dutch, German and French. PhD dissertation, Utrecht University. Pit, Mirna. 2006. Determining subjectivity in text. The case of backward casual connectives in Dutch. Discourse Processes 41(2), 151174. Sanders, Ted and Leo Noordman. 2000. The role of coherence relations and their linguistic markers in text processing. Discourse Processes 29(1), 3760. Sanders, Ted and Wilbert Spooren. 2007. Discourse and text structure. In H. Cuyckens and D. Geeraerts (Eds.), Handbook of cognitive linguistics. Oxford: Oxford University Press, 1414 1446. Sanders, Ted, Wilbert Spooren, and Leo Noordman. 1992. Toward a taxonomy of coherence relations. Discourse Processes 15, 135. Sanders, Ted, Wilbert Spooren, and Leo Noordman. 1993. Coherence relations in a cognitive theory of discourse representation. Cognitive Linguistics 4, 93133. Segal, Erwin M. and Judith F. Duchan. 1997. Interclausal connectives as indicators of structuring in narrative. In J. Costermans and M. Fayol (Eds.), Processing interclausal relationships. Studies in the production and comprehension of text. New Jersey: Erlbaum, 95119. Segal, Erwin M., Judith F. Duchan, and Paula J. Scott. 1991. The role of interclausal connectives in narrative structuring: Evidence from adults interpretations of simple stories. Discourse Processes 14, 2754. Shibatani, Mayayoshi. 1976. The grammar of causative constructions: a conspectus. In Shibatani, M. (Ed.), Syntax and Semantics. The Grammar of Causative Constructions. 
New York: Academic press, 140. Spooren, Wilbert and Ted Sanders. 2008. The acquisition order of coherence relations: On cognitive complexity in discourse. Journal of Pragmatics 40(12), 20032026. Spooren, Wilbert, Ted Sanders, Mike Huiskes, and Liesbeth Degand (in press). Subjectivity and causality: A corpus study of spoken language. In S. Rice and J. Newman (Eds.), Empirical and Experimental Methods in Cognitive/Functional Research. CSLI University of Chicago Press. / Stukker, Ninke, Ted Sanders, and Arie Verhagen. 2008. Causality in verbs and in discourse connectives. Converging evidence of cross-level parallels in Dutch linguistic categorization. Journal of Pragmatics 40, 12961322 [doi: 10.1016/j.pragma.2007.10.005]. Tabachnick, Barbara G. and Linda S. Fidell. 2001. Using multivariate statistics (4 ed.). Boston: Allyn and Bacon. Talmy, Leonard. 1987. The relation of grammar to cognition. In B. Rudzka-Ostyn (Ed.), Topics in Cognitive Linguistics. Amsterdam: Benjamins, 165205. Talmy, Leonard. 1988. Force dynamics in language and cognition. Cognitive Science 12, 49100. Talmy, Leonard. 2001. Toward a cognitive semantics. Vol. 1: Concept structuring systems. Cambridge, MA: The MIT Press. Tatsuoka, Maurice. 1988. Multivariate Analysis: Techniques for Educational and Psychological Research. New York: Macmillan. Townsend, David J. 1983. Thematic processing in sentences and texts. Cognition 13, 223261. Traxler, Matthew J., Anthony J. Sandford, Joy P. Aked, and Linda M. Moxey. 1997. Processing causal and diagnostic statements in discourse. Journal of Experimental Psychology: Learning, Memory, and Cognition 23(1), 88101. Verhagen, Arie. 2002. From parts to wholes and back again. Cognitive Linguistics 13(4), 403439.
536
Verhagen, Arie. 2005. Constructions of Intersubjectivity. Discourse, Syntax, and Cognition. Oxford: Oxford University Press. Verhagen, Arie. 2007. Construal and perspectivisation. In D. Geeraerts and H. Cuyckens (eds.), Handbook of Cognitive Linguistics. Oxford: Oxford University Press, 4881. Wolff, Phillip and Grace Song. 2003. Models of causation and the semantics of causal verbs. Cognitive Psychology 47, 276332. Wolff, Phillip. 2003. Direct causation in the linguistic coding and individuation of causal events. Cognition 88, 148. Wolff, Phillip. 2007. Representing causation. Journal of Experimental Psychology: General, 36, 82111. Zwaan, Rolf A., Mark C. Langston, and Arthur C. Graesser. 1995. The construction of situation models in narrative comprehension: An event-indexing model. Psychological Science 6(5), 292297.
From premodal to modal meaning: Adjectival pathways in English

AN VAN LINDEN*
Abstract

This article approaches common topics in the diachronic literature on modal categories from the perspective of adjectives. It thus expands on what has been found for the better-studied category of modal auxiliaries as regards sources of modal meaning and pathways of change. Most importantly, it proposes two new pathways from premodal to (dynamic) modal meaning, one followed by essential and vital, and one followed by crucial and critical. It also shows that in the four cases the development of dynamic meaning depends on the emergence of two semantic properties, viz. relationality and potentiality. Finally, this study makes it clear that the mechanisms driving the various semantic changes are not new, but rather have proved useful in explaining a varied set of developments. For the final semantic extension of the adjectives from dynamic to deontic meaning, for instance, the process of subjectification (Traugott 1989) will be invoked.

Keywords: modality; semantic change; adjectives; sources; premodal > modal; dynamic > deontic
* Address for correspondence: Department of Linguistics; University of Leuven; Blijde-Inkomststraat 21; Postbus 3308; 3000 Leuven; Belgium. Author's e-mail address: <an.vanlinden@arts.kuleuven.be>. Acknowledgements: The research reported on in this article has been made possible by research grants OT/03/20/TBA, OT/04/12 and OT/08/011 of the Research Council of the University of Leuven, as well as the Interuniversity Attraction Poles (IAP) Programme - Belgian State - Belgian Science Policy, project P6/44 Grammaticalization and (inter)subjectification. In addition, it has been supported by the Spanish Ministry of Education and Science (grant no. HUM2007-60706/FILO) and the European Regional Development Fund. I would also like to thank Jean-Christophe Verstraete for the fruitful discussions of topics covered in this study, and for his helpful comments on earlier versions of this article. Finally, I am indebted to the two anonymous referees and one of the Associate Editors of Cognitive Linguistics for their very generous and insightful remarks. Needless to say, I am the only one responsible for remaining errors of thought in the final version. Cognitive Linguistics 21–3 (2010), 537–571. DOI 10.1515/COGL.2010.018. 0936–5907/10/0021–0537 © Walter de Gruyter
1. Introduction

In the diachronic literature on modal categories, much attention has been devoted to verbal forms, especially to modal auxiliaries (e.g., Goossens 1983, 1999; Plank 1984; Sweetser 1990: 49–75; Bybee et al. 1994; Hansen 1998, 2004; Van der Auwera and Plungian 1998; Diewald 1999; Traugott and Dasher 2002: Ch. 3). In general, accounts have focused on three topics, viz. (i) lexical sources of modal forms, (ii) pathways of change, and (iii) mechanisms of change, very often in the framework of grammaticalization. In this article, I will concentrate on another grammatical category expressing modal meaning, viz. adjectives.¹ It will become clear that the adjectival data add new findings to what has been observed for the lexical and semantic sources of modal elements and their pathways of change from descriptive to deontic meaning. However, it will also appear that the mechanisms driving the semantic changes of the adjectives are not that new, but have been invoked already for a diverse set of changes in distinct conceptual domains (cf. Geeraerts 1997: 93–102). Therefore, I will return to this topic only in the various case-studies of the adjectives (Section 4).

It is generally agreed that modal expressions ultimately derive from nonmodal elements (e.g., Traugott 2006: 107). The English modal auxiliary can, for instance, has developed from the main verb cunnan 'know (how to)', just like the modal shall has developed from the main verb sculan 'owe' (Bybee et al. 1994: 183, 190; Traugott and Dasher 2002: 119). Typological studies have shown that these lexical sources are cross-linguistically recurrent for expressions of ability and obligation respectively (e.g., Bybee et al. 1994; Van der Auwera and Plungian 1998; Heine and Kuteva 2002: 327, 333).
More specifically, for the latter notion, traditionally seen as part and parcel of deontic modality, the following sources have been proposed: (i) future-oriented need and desire (with lexical sources need, want), (ii) being or coming into being (with lexical sources be, sit, stand, [be] fall), (iii) possession (with lexical sources possess, have, get, obtain, catch, owe), (iv) positive evaluation (with lexical sources be fitting, good, mete [measure]) (Bybee et al. 1994: 182–183; Traugott and Dasher 2002: 118–119; Heine and Kuteva 2002: 333). Although the adjectives focused on in this article do not encode obligation, they can be used to express the conceptually related category of strong desirability, as in (1) (also included within the deontic domain by some authors, e.g., Nuyts 2006).
(1) But quite apart from mediation, it is essential that more explicit recognition is given in the Bill to the important role marriage counselling can play in exploring the possibility of reconciliation. (CB 1996, times)

1. Of course, as suggested by a referee, other scholars have already pointed out that adjectives can have modal meanings as well, such as Nuyts (2001) for Dutch waarschijnlijk and German wahrscheinlich (both 'probable'), and Shindo (2008) for clear. These adjectives, however, express epistemic meanings, whereas the adjectives focused on in this article express deontic meanings.
It will become clear that not only essential, but also vital, crucial and critical derive from very different sources than the ones found for obligation among verbal forms.

In addition to the semantic and lexical sources of modal categories, diachronic studies devoted to the modal domain have concentrated on pathways of change. As noted by Traugott (2006: 110), these 'pathways', 'paths', 'clines', or 'chains' should be interpreted as macro-schemas accommodating overarching types of change (cf. Andersen 2001). These schemas typically include focal points, which indicate micro-steps by which changes occur, as in ability > root possibility > epistemic possibility (for the English modal can, for instance). It is assumed that such micro-steps are instances of gradual change, with diachronic gradualness corresponding to synchronic gradience (see, e.g., Denison 2001). In principle, three types of pathways can be distinguished: (i) from premodal to modal meaning, (ii) from one modal to another modal meaning, and (iii) from modal to postmodal meaning. A review of the literature shows that paths of type (ii) in particular have been focused on. Language-specific as well as cross-linguistic accounts have generally adduced evidence for the agent-oriented/root² > epistemic pathway (e.g., Heine et al. 1991; Bybee et al. 1994; Van der Auwera and Plungian 1998; Traugott and Dasher 2002: Ch. 3).³ Paths of type (iii) have still received some attention (e.g., Bybee et al. [1994: 212–225] on the development of subordinating moods, and Van der Auwera and Plungian [1998: 104–110] on demodalization), but those of type (i) have hardly been investigated.

2. Agent-oriented or root meaning includes ability, desire, root possibility, obligation and necessity (Bybee et al. 1994: 177–179).
3. However, Narrog (2005) presents Japanese data which do not comply with the agent-oriented > epistemic pathway. Instead, he proposes an overarching tendency for modal expressions to change from event-oriented (largely identical with agent-oriented) to speaker-oriented meaning.

In this article, I will detail the semantic development of four adjectives from their original nonmodal meaning to dynamic meaning and further to deontic meaning. I will present two new pathways from premodal to (dynamic) modal meaning, each exemplified by two adjectives. The structure of this article is as follows. Section 2 briefly describes the modal notions referred to in this study and discusses the data and the corpora used. Section 3 concentrates on the lexical sources of deontic adjectives. Section 4 discusses the semantic developments of essential, vital, crucial and
critical. It distinguishes between the changes from premodal to modal meaning, and those within the modal domain. Whereas in the latter case the dynamic-deontic development mirrors that of modal verbs, the developments leading to dynamic meaning offer new insights. I argue that the development of this modal meaning involves (the emergence of) two properties in the semantic make-up of the adjectives, which are called relationality and potentiality. Relationality is needed to turn the adjective into a predicate of necessity that can link two concepts, for instance a part and a whole, or a condition and a goal. Potentiality is needed to ensure that the relationship established by the adjective is one of indispensability, which gives rise to dynamic meaning. Although all four adjectives differ in the way they develop these properties, it proves possible to generalize over the four cases and I propose two pathways of change from premodal to modal meaning. In Section 5, finally, I recapitulate the main findings and propose questions for further reflection.

2. Modal notions and data

The modal notions central to this article are dynamic and deontic modality. Dynamic modality traditionally involves ascribing an ability or capacity to the subject participant of a clause (e.g., von Wright 1951b: 28). However, this definition has been felt to be too narrow, and it has been extended to all indications of abilities/possibilities, or needs/necessities inherent in participants of actions (which are not necessarily syntactic subjects) or in situations (e.g., Palmer 1979: 3–4, Ch. 5–6, 1990: Ch. 5–6; Perkins 1983: 11–12; Nuyts 2005, 2006).⁴ It is especially the situational subtype that can be expressed by the adjectives studied here, as in (2).

(2) This should make you want to go to the toilet frequently. Although it may sting the first few times you go, this usually gets better the more water you pass. It is essential to keep emptying the bladder if you are to flush out the germs. (CB 1992, ukepehm)

Situational dynamic modality involves the indication of a potential or a necessity/inevitability inherent in the situation described in the clause as a whole (Nuyts 2006: 4). In (2), the speaker describes the need to keep emptying the bladder in order to flush out the germs (note the condition-goal paraphrase). Importantly, the speaker does not express his/her personal opinion, but rather a natural law-like truth: the need or necessity originates in the physical make-up of the human body. Example (2) thus expresses a necessity that is internal to the State of Affairs (SoA) described in the clause.
4. As suggested by a referee, Depraetere and Reed (2006: 281–282) provide a useful overview of how dynamic modality is treated in the literature.
Table 1. The corpora used for each subperiod and their number of words

Subperiod of English | Time span | Corpus | Number of words (million)
Middle English (ME) | 1150–1500 | Penn-Helsinki Parsed Corpus of Middle English, Second Edition (PPCME) | 1.16
Early Modern English (EModE) | 1500–1710 | Penn-Helsinki Parsed Corpus of Early Modern English (PPCEME) | 1.79
Late Modern English (LModE) | 1710–1920 | Corpus of Late Modern English texts (Extended version) (CLMETEV) (De Smet 2005, 2008) | 15.01
Present-day English (PDE) | roughly 1990–1998 | Collins COBUILD corpus (CB) (only British subcorpora) | 42.10
Deontic modality, in turn, has traditionally been defined in terms of obligation and permission (von Wright 1951a, 1951b: 36, 1971; Lyons 1977: 823–841; Kratzer 1978: 111; Palmer 1979: Ch. 4, 1986: 96–115; Goossens 1985: 204; Van der Auwera and Plungian 1998: 81). However, Nuyts (2006: 4) has proposed a more general definition as "an indication of the degree of moral desirability of the state of affairs expressed in the utterance", which does not necessarily involve obligation or permission. In fact, the adjectives studied never encode obligation or permission, but they can be used to express the commitment of someone (viz. the attitudinal source) to an SoA in terms of his/her moral principles. These principles are invariably external to the SoA assessed in the expression, as in (3).

(3) Herbert Daniels, the group's founder, believes that it is essential to overcome the social stigma of Aids, which often means that people with the virus lose their homes, jobs and families, and are effectively condemned to death by society. (CB 1990, bbc)
In (3), the (reported) speaker does not impose an obligation, but rather expresses his commitment to overcoming the social stigma of Aids on the basis of moral grounds. Unlike in (2), therefore, essential expresses SoA-external necessity in (3). In addition to essential, illustrated in (1) to (3), this article focuses on three other adjectives that were borrowed into English with a non-modal descriptive sense, viz. vital, crucial and critical. These adjectives were searched for in the electronic version of the Oxford English Dictionary (OED) to find information on their etymology. In addition, I used its general quotation database, in which nearly all quotations are precisely dated and thus very helpful in tracking the semantic development of the adjectives in question. In addition to the OED, I also used the set of diachronic and synchronic corpora presented in Table 1,
to corroborate the findings. It should be noted, though, that in some cases I found a considerable time lag between the first attestation of a use in the OED and its occurrence in the corpus data.

3. Sources of deontic modality

Several cross-linguistically recurrent sources have already been proposed for the notion of obligation (for an inventory, see Section 1). Among these, expressions of positive evaluation are relevant to English, as complex constructions with adjectives such as appropriate can be used to express deontic meaning, as in (4). However, they express a weaker degree of desirability than the adjectives studied here.

(4) You can indulge the shortcomings of a friend a certain number of times and then, unwittingly, they go over the limit. You tot everything up and, (...) there comes a point when you decide that in total they are unforgivable and can no longer be overlooked. (...) Sometimes it may be wholly appropriate not to forgive or forget. (CB 1993, ukmags)
A set of such weak(er) English adjectives can even be traced further back to other sources listed in Section 1. The adjectives proper and appropriate, for instance, etymologically involve the notion of possession (cf. OED, s.v. proper and appropriate).⁵ The adjectives fitting (and possibly also fit) and meet, in turn, relate to the notion of measure (cf. OED, s.v. fitting, fit and meet).⁶ However, it has also been acknowledged that the inventory of sources for the notion of obligation is not exhaustive (e.g., Van der Auwera and Plungian 1998: 91). The adjectives I am concerned with here, which express a strong degree of desirability in the deontic domain, derive from sources very different from the ones mentioned above. For example, essential, borrowed into English from Latin in the 15th century, is an adaptation of the Late Latin word essentialis, which in turn derives from the noun essentia 'essence' (OED, s.v. essential). Vital, which entered the English language in the 14th century, also derives from a noun, viz. Latin vita 'life' (note that according to the OED its precise etymology is not very clear: either it is adopted from Old French vital,
5. Likewise, the expressions for obligation in Chepang (Sino-Tibetan) and Temne (Niger-Congo) as well as the English semi-modals have to and have got to, and the Spanish forms haber de and tener que also derive from the notion of possession (Bybee et al. 1994: 182–184).
6. In addition, the expression for obligation in Danish (må) (Bybee et al. 1994: 182) and the English modal auxiliary *motan (Traugott and Dasher 2002: 122) also derive from the notion of measure, just like the Dutch past participle gepast and its German counterpart angemessen (both 'fitting'). In Dutch, the present participle of the same verb passen ('fit'), viz. passend, can also be used to translate fitting.
or it is an adaptation of the Latin form vitalis).⁷ Crucial and critical ultimately derive from nouns as well. Crucial, borrowed from French in the 18th century, is based on the Latin form crux 'cross' (OED, s.v. crucial). Critical, finally, goes back to the Greek noun κρίσις 'judgement, crisis', via a derived adjective (Liddell et al. 1951 [1924]: i 997a), which was first borrowed into Latin as criticus, and in the 16th century into English as critic (OED, s.v. critic and critical). Morphologically, the adjectives studied thus all are derivations from a nominal base. Semantically, the nominal sources refer to either abstract notions ('essence', 'life' and 'judgement/crisis') or concrete entities ('cross'), which are not included in the set of sources given above. The adjectives essential, vital, crucial and critical therefore present us with additional sources for deontic meaning.⁸

4. Pathways of semantic change⁹

This section concentrates on the semantic development of the four adjectives studied here. In Section 4.1, I will discuss changes from premodal to modal meaning, which have been largely under-researched so far (cf. Section 1). I will present two new pathways of change towards dynamic meaning, which share the emergence of two semantic properties, viz. relationality and potentiality. In Section 4.2, I will discuss a change within the modal domain common to all four adjectives, viz. that from dynamic to deontic meaning, which has also been observed for modal auxiliaries such as can, must and may. In general, I will show that the development of the lexical items cannot be dissociated from the constructions they appear in, which are broadly understood here as including patterns of co-occurrence and patterns of (prepositional or clausal) complementation. As with the modal auxiliaries, the lexicon-syntax interface thus plays an important role in the semantic development of the adjectives as well.
It will also turn out that the mechanisms driving the semantic changes are not specific to the adjectival developments either.

7. The Old French period is generally taken to last until 1350, so it is possible that vital was borrowed from continental Old French. However, in view of the sociolinguistic situation in Britain during the Middle English period, it is more likely that vital was borrowed from Anglo-Norman than from Old French, if the source is not Latin (see Rothwell 1998).
8. As noted by a referee, these sources are non-native from the perspective of English. Therefore, the adjectives belong to a different register than the English modal auxiliaries, which can be traced back to cross-linguistically recurrent sources. I will briefly return to this issue in Section 5.
9. This section is largely based on Van linden (2009: 77–107) and Van linden (forthcoming).

4.1. Changes from premodal to modal meaning

If we take a closer look at the semantic development of essential, vital, crucial and critical, we can distinguish between two pathways from premodal to
modal meaning. I will first discuss the developments of essential (Section 4.1.1) and vital (Section 4.1.2), and I will argue that these mark a first pathway to dynamic meaning (Section 4.1.3). Then I will present the developments of crucial (Section 4.1.4) and critical (Section 4.1.5), which came later into the English language (see Section 3), and propose a second pathway on the basis of these findings (Section 4.1.6). Both pathways will be described in terms of the properties of relationality and potentiality, which are the semantic prerequisites for dynamic meaning to emerge. In addition, as suggested by a referee, it will become clear that the adjectives comply with the tendency noted by Paradis (2001: 58) for non-gradable items to develop into gradable ones.

4.1.1. Essential from premodal to dynamic meaning.¹⁰ This section concentrates on the semantic developments of essential towards dynamic meaning. As proposed in Table 2 below, we can hypothesize three stages, which are the result of two semantic changes. The first change is that from its original meaning to a relational type of meaning, which is termed defining necessity. The second change is that to (situational) dynamic modal meaning, for which the development of the feature of potentiality is crucial. It will also become clear that the main driving factors of the changes are patterns of co-occurrence.

Table 2. The development of essential from premodal to modal meaning (cf. Van linden et al. 2008: 240, Table 2)

Stage | First attestation | Meaning and examples | Relationality | Potentiality
Stage 1: original meaning | c1440 | 'being such by its true nature' (5) | − | −
Stage 2: defining necessity | 1596 | 'constituting the true nature of' (6)–(8) | + | −
Stage 3: dynamic meaning | 1618 | 'indispensable for' (9)–(11) | + | +

10. This section is based on Van linden et al. (2008: 231–240).

As can be seen in Table 2, the original meaning of essential in English is not relational, nor potential. It can be paraphrased as 'being such by its true nature', or 'being such in the true sense of the word'. The OED gives 'that is such by essence, or in the absolute or highest sense' (OED, s.v. essential). An example is given in (5).

(5) For þe souerayne and þe Escencyalle Ioy es in þe lufe of
    for the sovereign and the essential joy is in the love of
    Godd by hym-selfe and for hym-selfe, and þe secundarye es in
    God by himself and for himself, and the secondary is in
    comonynge and byhaldynge of Aungells and gastely creaturs.
    communing and beholding of angels and ghostly creatures
    'For the sovereign and the essential joy is in the love of God by himself and for himself, and the secondary (joy) is in the communing and the beholding of Angels and ghostly creatures' (PPCME c1440 ?Rolle i ioy [Thrn] 17)

In (5), the adjective essential, like secondary, indicates a type of joy. In this sense, it functions as a classifier and not as an attribute of the noun joy.¹¹ Semantically, classifiers denote a subtype of the more general type referred to by the head noun, and tend to be organized in mutually exclusive and exhaustive sets of that general type (Halliday 1994: 185). In fact, the two types of joy in (5) are opposed to each other, and thus presented as mutually exclusive and exhaustive sets of joy: essential joy (meaning 'true, basic, substantial or primary joy') versus secondary joy (meaning 'derived, accidental joy').

The first semantic extension of essential on its pathway to dynamic meaning involves the development of relational meaning (cf. Table 2). This type of meaning is illustrated in (6) below.

(6) Sensibility and a locomotive faculty are essentiall to every living creature. (OED 1656 Bramhall, A replication to the bishop of Chalcedon i. 5)

In (6), sensibility and a locomotive faculty are said to constitute the essence of every living creature. This use of essential is relational because it does not indicate a type of something (e.g., a type of joy as in [5]), but serves to relate two concepts, viz. sensibility/locomotion and life. Whereas the original sense of essential is still taxonomic, in that it applies to types, the relational meaning is clearly partonomic, in that it applies to parts in relation to a whole.
11. Classifiers can be opposed to attributes, which assign a (typically gradable) quality to the instance referred to by the NP, as new in a new car, or beautiful in a beautiful car (Bolinger 1967: 14–20; Teyssier 1968: 225–249; Halliday 1994: 184–186). Unlike attributes, classifiers can only occur in prenominal position and never appear predicatively. Furthermore, since classifiers do not attribute a quality to the referent of the NP, but rather modify the reference of the head noun (Bolinger 1967: 14–15), they are not gradable, i.e., they do not accept degrees of comparison or intensity (Halliday 1994: 185).

This change is also reflected in the syntactic potential of the adjective: it is not a classifier, but now functions as an attribute in predicative position, and it can take a prepositional complement. In addition, it has changed from a non-gradable to
a gradable item: in examples such as (6) essential can take totality modifiers such as absolutely (cf. Paradis 2001: 50–53).

It has been shown that the diachronic bridge between the original classifier use in (5) and the later relational use in (6) can be found in structures in which the classifier co-occurs with relational nouns like property, attribute or part (Van linden et al. 2008: 232–234), as in (7) and (8) below.¹²

(7) Heate is the essentiall propertie of fire (OED 1620 Granger, Syntagma logicum, or the divine logike 66)
(8) Mercy as it is Radically in God and an essentiall attribute of his. (OED a1631 Donne, Sermons [1953] VI. 170)

In these examples, essential functions as a classifier with the relational nouns property and attribute, which denote a part within a larger whole. This pattern of co-occurrence with relational nouns is decisive for the development of relational meaning of the adjective itself: with relational nouns, the paraphrase proposed for the original use of essential in (5) cannot be applied anymore. In (7), for instance, essentiall propertie does not mean 'that is a property in the true sense of the word', or 'that is a property by its true nature'. Rather, the part-whole relationship in the background of property provides a better paraphrase: 'a property of the essence of fire', or 'a property constituting the essence of fire'. In this sense, it can be argued that relational nouns like property or attribute, which are based on a part-whole or inclusion relationship, are semantically permeable and therefore able to transfer their relational property to the adjectives that classify them. The semantic permeability of the relational nouns actually implies that the meaning of essential in expressions such as (7) and (8) is relational as well. In (7), for instance, essential links heat with fire, as heat is said to constitute the essence of fire, and in (8), it links mercy with God, as mercifulness is argued to constitute the essence of God.
Thus, heat is essential to fire, and mercy is essential to God. In (7) and (8), then, essential establishes a relation of inclusion between two concepts, just like in (6). As can be seen in (6), later relational uses of essential do not necessarily involve relational nouns: it has merely been argued that co-occurrence with relational nouns (i.e., classifier relational uses) is a facilitating factor that forms a diachronic bridge between classifier nonrelational uses and non-classifier relational uses.
12. Relational nouns like these are different from other nouns in that they make schematic reference to another thing (the whole), and have the conception of a relationship with this other thing as a background (the part-whole relationship), just like the noun father ('the male parent') makes schematic reference to offspring on the basis of the parent-offspring relationship (cf. Langacker 1991: 38–39).
As indicated in Table 2, the type of meaning expressed by relational uses of essential is called defining necessity (cf. Van linden et al. 2008: 234). In fact, if certain properties or attributes are said to constitute the essence of something, they are necessary to it, for otherwise we might be dealing with something else altogether. Importantly, this type of necessity differs from the classic dynamic-modal type of necessity, i.e., the necessity we experience when something is needed for a certain purpose.

The second semantic change in the development of essential then is the extension from the sense of defining necessity to that of dynamic necessity, in which the property of potentiality plays a key role (cf. Table 2). What distinguishes the two types of necessity is the notion of definition. The first type of necessity obviously is defining in nature, whereas the second type is not. Example (7), for instance, can be paraphrased as 'fire is (necessarily) hot', and (8) as 'God is (necessarily) merciful'. In these paraphrases, the predicates do not add any new information to the subject, but rather define it. Being hot, for example, is a defining feature of fire. In this sense, the paraphrases are analytical propositions, in which subject and predicate are linked by virtue of their intension. Furthermore, what is regarded as necessary in a defining way (e.g., mercy as necessary to God in [8]) is intrinsically present in it.¹³ Finally, defining necessity applies to all instances of the type designated by the head noun to which something is said to be necessary: all fires, for instance, are hot. Dynamic (modal) necessity, on the other hand, has very different semantic characteristics. Consider the following example.

(9) And practice, though essential to perfection, can never attain that to which it aims, unless it works under the direction of principle. (CLMETEV 1776 Reynolds, Seven discourses on art)
In (9), practice is not defining of perfection. The example is a synthetic proposition, in which the predicate is not linked to the subject by virtue of its intension, but adds new information about the subject. Furthermore, dynamic necessity does not really signal an inherent presence, such as the presence of mercy in God in (8), but rather the absence of something that is desirable for a particular purpose, such as practice in (9). The subtype of dynamic meaning involved here thus is situational in the sense defined in Section 2: the necessity of practice is inherent in the situation of reaching perfection, with the necessity being indicated on the basis of SoA-internal grounds. Finally, as this type of necessity is not defining in nature, it does not necessarily apply to all instances of the type designated by the head noun to which something is said to be essential.
13. Note that this is highly determined by the speaker's Weltanschauung. An ancient Greek speaker, for example, would not see mercy as an essential attribute of god (e.g., Zeus).
A. Van linden
The semantic extension of essential from the sense of defining necessity to that of dynamic necessity can be attributed to the emergence of an element of potentiality. Corpus examples such as (9) show that the potential element can originate in the fact that the element to which something is said to be essential is a potential action, viz. reaching the state of perfection. Clearly, the action representing the goal is potential: it has not yet been realized, but it can be realized at some point in the future. However, the earliest constructions in which potentiality emerges are expressions in which the element to which something is said to be essential is modified by an evaluative adjective. They appear in the early 17th century, not much later than the first relational (but non-potential) uses (1596). Examples are given below; example (10) is the first attestation in the OED.
(10) It is an essentiall property of a man truly wise, not to open all the boxes of his bosome. (OED a1618 Ralegh, Remains, viz. Maxims of state, Advice to his son [1664] 89)
(11) Government is essential to formed and regular Societies. (OED 1681–1686 Scott, The christian life [1747] III. 386)
In these examples, the nouns to which a particular feature is said to be essential (man in [10], societies in [11]) are modified by evaluative adjectives. These adjectives indicate that the predication of being essential does not apply to all instances of the type designated by those nouns, but only to a subjectively defined subset of them. The type of subjectivity intended here is the one involving the speaker's evaluation of an entity, i.e., the description of a content based in the speaker's subjective attitude towards the situation (De Smet and Verstraete 2006: 385).14 This type of subjective meaning gives rise to potential meaning. In (10), for instance, the property of not opening all the boxes of your bosom is said to be an essential property of truly wise men (only), and thus not of every man. The property actually serves as a criterion for a man to be included in the privileged subset of truly wise men, or, in other words, 'if you want to be considered a truly wise man, you should not open all the boxes of your bosom'. Example (11) can in turn be paraphrased as 'in order for a society to be considered formed and regular, it should have government', or 'it should be governed'. These condition-goal paraphrases make it clear that evaluative adjectives bring with them the notion of dynamic (situational) necessity. The examples (9) to
14. In their typology of subjectivity, De Smet and Verstraete (2006: 387) term this type of subjective meaning ideational semantic subjectivity. This type is different from the subjective meaning conveyed in deontic expressions, which involves the enactment of the speaker's position towards the situation, and is labelled interpersonal semantic subjectivity in De Smet and Verstraete (2006: 386) (see Section 4.2).
(11) thus show that the extension of essential to evaluative contexts and contexts of potential action implies a semantic extension of the adjective: the relationship established by it has been extended from one of intrinsic inclusion (in contexts of defining necessity) to one of indispensability (in contexts of dynamic necessity).
In conclusion, in the development from premodal to (dynamic) modal meaning, essential first acquired relational meaning through co-occurrence with relational nouns, and came to express defining necessity. Later on, co-occurrence with evaluative adjectives and potential actions drove the development of potential meaning, and the extension from intrinsic inclusion to indispensability or dynamic necessity. It is not surprising that the earliest examples of potential meaning were found in evaluative contexts, as these are still close to defining contexts because of the inclusion relationship between the two entities linked by essential (e.g., properties of men in [10]). Contexts of potential action, by contrast, are both diachronically (cf. [9]) and semantically further removed from defining contexts, because they have given up the inclusion relationship altogether.
4.1.2. Vital from premodal to dynamic meaning. In the semantic development of vital to dynamic meaning, we can also distinguish between three stages as the result of two changes. As shown in Table 3, the first stage involves its original meaning, which is already relational, but not yet modal. I will discuss three different subsenses; the first semantic change involves the generalization of one specific subsense, viz. that in the collocation vital parts, which gives rise to the meaning of defining necessity (like essential in its second stage). The second semantic change is that to dynamic meaning, in which again the property of potentiality emerges. As in the case of essential, this change occurs through the extension to contexts of evaluation and potential action.
However, the chronology of the first attestations of the senses in
Table 3. The development of vital from premodal to modal meaning
Stage | First attestation | Meaning and examples | Relationality | Potentiality
stage 1: original meaning | 1386 | 'associated with life or the heart'; 'essential to life' (12)–(15) | + | −
stage 2: defining necessity | 1647 | 'essential to'; 'constituting the essence of' (16)–(17) | + | −
stage 3: dynamic meaning | 1619 | 'indispensable for' (18)–(19) | + | +
stages 2 and 3 requires us to regard the development sketched here as merely a hypothesis.
In its earliest attestations in the OED and the historical corpora, vital is used in three distinct senses. The data do not provide a decisive answer as to which sense is the original one in English, or whether these senses developed out of one another. As these questions are not immediately relevant to the development of modal meaning, they are not discussed in further detail.
The first attestation of vital dates from 1386, and involves the general sense of 'associated with life'. The OED gives a more specific definition: 'consisting in, constituted by, that immaterial force or principle which is present in living beings or organisms and by which they are animated and their functions maintained' (OED, s.v. vital). The example is given in (12) below.
(12) In hise armes two The vital strengthe is lost, and al ago.
in his arms two the vital strength is lost, and all agone
'In his two arms the vital strength is lost and all gone.' (OED c1386 Chaucer, Knight's Tale 1994)
A second sense of vital appears not much later in the OED data (1450), and is also covered by the 'associated with life' paraphrase. In this case also, a more specific definition can be put forward, in which vital is associated with the physiology of the ancient Greek physician Galen (129–199 AD) (TLF XVI: 1210a). Building on Plato's tripartite nature of the soul, consisting of a vegetative, sensitive and rational soul (Knoeff 2004: 419), Galen distinguished between three systems, each of which is located in different organs and has a distinct set of virtues and faculties (Siraisi 1990: 107). In later Galenic thought, these systems were called the natural, vital and animal system, the principal parts of which are the liver, heart and brain respectively (Siraisi 1990: 107–108). Galenic physiology and pneumatology persisted into the 17th century (Forrester 2002), which is reflected in the OED data.
In the Middle and Early Modern English data, vital is found in collocation with nouns such as spirit(s), blood, heat, virtue and faculty, with the specific meaning of 'associated with the heart'. In these collocations, as in (13), vital does not assign a gradable quality, but rather functions as a classifier, as it indicates a specific subtype of a more general type (e.g., spirit), in opposition with natural and animal.
(13) The Spirit Vitall in the Hert doth dwell, The Spirit Naturall . . . in the Liver . . . , but Spirit Animall dwelleth in the Braine.
the spirit vital in the heart does dwell, the spirit natural in the liver, but spirit animal dwells in the brain
'The vital spirit dwells in the heart, the natural spirit in the liver, but the animal spirit dwells in the brain.' (OED 1477 Norton, The ordinall of alchimy [1652] 82)
It can be argued that in the sense of 'associated with life (or the heart)', vital already has a relational meaning (cf. Table 3): it evokes a relationship with life. In this sense, it can be paraphrased as 'essential to life', with essential used in a defining way. Vital strength, for instance, is a strength that is intrinsically present in life, or more specifically in living creatures and organisms. Likewise, the vital spirit is intrinsically present in life. At least in that particular Weltanschauung, it constitutes the essence of life, and every living human being has it by definition. The senses of vital in vital strength and vital spirit thus both imply a relationship of intrinsic inclusion.
The third non-modal sense of vital is found in collocations with the relational noun part(s), and its first example in the OED dates from 1565. Arguably, this collocation was used in a Galenic and a modern sense. In the Galenic sense, the term vital parts referred to the organs of the Galenic vital system, viz. the organs in the thoracic cavity and the arteries (Siraisi 1990: 107). This sense is illustrated in example (14) below, in which the vital parts are opposed to the parts of the natural system, which were also called the nourishing parts. Again, vital is used as a classifier, indicating a type of parts. In the modern sense, the referents of the collocation do not belong to the vital system only, but also to the animal and natural system. In this sense, vital also functions as a classifier. However, it is not opposed to natural/nourishing or animal, as in the Galenic sense, but rather to non-vital. Vital parts are organs without which we cannot live, such as the heart, lungs, brains and liver, whereas non-vital parts are those which can be missed, such as the milt, uterus and eyes.
This modern sense is illustrated in example (15), and is clearly of a later date than the Galenic example. However, both senses can be paraphrased by 'essential to life', with essential used in a defining way. According to the Galenic Weltanschauung on the one hand and that of modern medicine on the other, these parts are intrinsically present in life, or, to put it differently, without these parts, there is no life.
(14) There is a partition called diaphragma by the Græcians, which separateth the instruments of the vital partes, from the nourishing parts. (OED 1594 Bowes, De La Primaudaye's French academie II. 220)
(15) The Vital Parts are the Heart, Brain, Lungs and Liver. (OED 1696 Phillips, The new world of English words: or, a general dictionary [ed. 5] s.v. vital)
The first semantic change of vital involves semantic generalization, in which vital loses its connection with life and comes to express defining necessity
(cf. Table 3).15 This generalization starts from its collocation with parts, and extends the relationship of intrinsic inclusion within life to that of intrinsic inclusion within basically anything that is more or less composite in nature. The hypothesis that the generalization occurred prior to rather than simultaneously with the development of potential meaning is suggested by examples in which vital is found with nouns referring to abstract concepts that are fairly homogeneous in substance, much like the relational non-potential examples found with essential, viz. (7) and (8) above. Examples with vital are given in (16) and (17). We can note that they also show structural reflections of its relational meaning: the elements to which something is said to be vital are coded by of-PPs (the same goes for [18] below). In addition, vital has become gradable, as it can combine with totality modifiers such as absolutely (cf. Paradis 2001: 50–53).
(16) Their submiss Reverence to their Princes being a vital part of their Religion; (OED 1647 Clarendon, The history of the rebellion and civil wars in England I. 76)
(17) If these he has mentioned be the substantial and vital parts [of his theory, OED]. (OED 1698 Keill, An examination of Dr. Burnet's Theory of the earth [1734] 181)
In these examples, vital co-occurs with the relational noun part, but it bears no relation to life anymore. Instead, vital is used in its generalized sense, as it refers to essential parts of a religion or theory. It can be argued that vital is used here in a defining way, as the religion in (16) and the theory in (17) would not be the same anymore if the vital parts were changed or removed. In other words, these parts are intrinsically present in the religion or theory, and constitute their essence.
The second semantic change of vital involves the development of the property of potentiality (cf. Table 3).
As in the case of essential, this property (and hence dynamic meaning) first emerges in examples in which the noun to which something is said to be vital is modified by an evaluative adjective.16 As
15. In the data, the earliest instances of vital in this more general meaning of 'essential to' are few. Therefore, the semantic developments proposed here are not necessarily consistent with the chronology of the attestations (cf. dates of first attestations in Table 3).
16. However, as suggested by Hubert Cuyckens (pc), it might be argued that the collocation from which the process of generalization starts (viz. vital parts) provides a shortcut to potential meaning, as it already indexes the property of potentiality. More precisely, the collocation can also be paraphrased as 'parts that are necessary to life; performing the functions indispensable to the maintenance of life' (OED, s.v. vital). This potential element can be thought of as an invited inference, which is later semanticized (Traugott and Dasher 2002: 34–40). Paraphrases involving potentiality can also be used for examples which are comparable to those in (16) and (17) above, but which involve more concrete noun referents that are heterogeneous in substance. Examples are given below.
discussed in Section 4.1.1, such adjectives indicate that the predication does not apply to all instances of the type designated by that noun, but only to a subjectively defined subset of these. An example is given below.
(18) The three vital circumstances of a well-ordered Action, Person, Time and Place. (OED 1619 Lushington, The resurrection rescued from the soldiers calumnies [1659] 70)
This example is similar to those with essential and an evaluative adjective, as in (10) and (11) above. In (18), the three circumstances listed are essential or necessary only to a potential or subjectively defined subset of actions, viz. well-ordered actions. In other words, 'in order for an action to be considered well-ordered, it should be characterized by the circumstances of person, time and place'. This condition-goal paraphrase suggests that the evaluative adjective well-ordered imposes a potential interpretation on vital. It should also be noted that here the relationship established by vital is not one of intrinsic inclusion, but rather one of indispensability.
Later, the property of potentiality is also found in examples in which some element is said to be vital to a particular potential action. Example (19) bears a close resemblance to (9) above, in which essential is used with a potential action.
(19) Hence it was that the raising of the siege of Gibeon . . . was so vital to the conquest of Canaan. (OED 1856 Stanley, Sinai and Palestine in connection with their history iv. 215)
In (19), raising the siege of Gibeon is said to have been vital or necessary in order to conquer Canaan. Again, the condition-goal paraphrase and the SoA-internal character of the necessity make it clear that the type of meaning involved is situational dynamic modality. Clearly, the relation that vital establishes is one of indispensability. The meaning of vital has thus been extended from defining to dynamic necessity in the course of the 17th century.
(i) To preserve intact such vital parts as the machinery, magazines, and steering gear. (OED 1889 Welch, Naval Architecture 141)
(ii) Spring washers are less effective, but answer well enough for the less vital parts of the mechanism. (OED 1912 Motor Man. (ed. 14) 206)
In these examples, vital can be paraphrased as necessary to its proper working. However, the fact that such examples are attested rather late (i.e., after the instances with evaluative adjectives and potential actions), and the prior occurrence of defining examples such as (16) and (17) above together suggest that vital developed along the same lines as essential. Of course, the invited inference of potential meaning may have paved the way for the constructions discussed here to emerge.
To conclude, it can be hypothesized that the development of vital from premodal to modal meaning first involved semantic generalization. The three subsenses found in the earliest attestations of vital all already implied a relationship of intrinsic inclusion within life, which can be explained by the etymology of vital (ultimately based on Latin vita 'life'). The semantic generalization preserved this type of relationship and yielded the meaning of 'essential' used in a defining way. The connection with life, however, was lost. In a second change, driven by patterns of co-occurrence with evaluative adjectives and potential actions, vital developed dynamic meaning, involving the property of potentiality and a relationship of indispensability instead of intrinsic inclusion. As in the case of essential, evaluative contexts appeared earlier than those with potential actions. Again, therefore, the data have shown that the properties of relationality and potentiality are the semantic conditions of the development of dynamic meaning.
4.1.3. A first pathway to dynamic meaning: essential and vital. From the previous discussions, we can infer that the semantic developments of essential and vital show more similarities than differences. They thus allow us to propose a first pathway to dynamic meaning, which is visualized in Figure 1. As can be seen in the figure, the main difference between the two concerns the beginning of the pathway. Recall that in its original meaning essential is non-relational, whereas vital does not have a non-relational stage. However, both adjectives share a stage of defining necessity and later develop the meaning of dynamic necessity. In both cases, the development of this modal meaning involves first patterns of co-occurrence with evaluative adjectives, and later contexts of potential actions. The adjectives essential and vital therefore present us with a first pathway to dynamic meaning via the notion of defining necessity.
In the following sections, I will put forward a second pathway, for which the adjectives crucial and critical are exemplary. Both pathways mark a development not only from descriptive to modal items, but also from non-gradable to gradable items.
Figure 1. The first pathway to dynamic meaning: essential and vital
4.1.4. Crucial from premodal to dynamic meaning.17 This section focuses on the development from premodal to modal meaning of crucial, in which four stages can be recognized, which are, however, not as clear-cut as is the case for essential and vital. Table 4 hypothesizes what these stages may look like. It will become clear that the distinction between the second and third stage is not hard and fast. In fact, these two stages coincide temporally due to the first semantic change, which is a metaphorical projection. I have disentangled these stages to be able to assign the same configuration of semantic properties to the source and target meaning of the metaphor. The second change, which is driven by semantic generalization, leads to the development of dynamic meaning.
As the original meaning of crucial in English, the OED (s.v. crucial) mentions 'cross-shaped' or 'in the form of a cross'. Its first attestations in the OED are given below.
(20) Crucial Incision, the cutting or lancing of an Impostume or Swelling cross-wise. (OED 1706 Phillips, The new world of English words: or, a general dictionary [ed. Kersey] s.v. Incision)
(21) The bursal and crucial ligaments were in their natural order. (OED 1751 Phil. Trans. XLVII. xxxvii. 261)
Both in (20) and (21), crucial functions as a classifier. In (20), it indicates a specific type of incision in the form of a cross, as opposed to a linear incision. In (21), crucial denotes a sub-class of ligaments, viz. those in the knee-joint that cross each other in the form of a Saint Andrew's cross and connect the femur and tibia, as opposed to the bursal ligaments, which cross the bursa (OED, s.v. crucial and bursal). In both cases, crucial indicates a subtype of the general type of the head noun, and does not attribute a gradable quality to the
Table 4. The development of crucial from premodal to modal meaning (cf. Van linden et al. 2008: 244, Table 3)
Stage | First attestation | Meaning and examples | Relationality | Potentiality
stage 1: original meaning | 1706 | 'cross-shaped' (20)–(21) | − | −
stage 2: metaphorized meaning | 1830 | 'like (at) a finger-post' (22) | − | −
stage 3: collocational meaning | 1830 | 'necessary to decide between two hypotheses' (22) | + | +
stage 4: dynamic meaning | 1869 | 'decisive for'; 'important for' (23)–(24) | + | +
17. This section is based on Van linden et al. (2008: 240–244).
NP referent. More generally, the OED database does not contain any predicative or graded uses of crucial in its original meaning. It is clear that crucial in the sense of 'cross-shaped' or 'cross-like' is non-relational, since it does not link two concepts, and non-potential, since it does not involve a potential event or the potential presence of an entity (cf. Table 4).
The first semantic change of crucial on its way to dynamic meaning involves metaphorical projection. It is commonly accepted that the basis of this metaphorical extension was laid in the work of Francis Bacon (1561–1626) (OED, s.v. crucial; FEW II2: 1382b; TLF VI: 559; Klein 1971: 178; Barnhart 1988: 238a). In his very influential Novum Organum (1620), written in Latin, Bacon coined the phrase instantia crucis 'crucial instance', which he explained as a metaphor derived from crosses that are placed at bifurcations of the road and indicate where each road will lead to. Crucial instances are places where the scientist or thinker in general has to make a decision, just as finger-posts are places where the traveller has to decide which way to go18 (the Latin word crux at that time had developed the meaning of 'a guidepost that gives directions at a place where one road becomes two' [OED, s.v. crucial; FEW II2: 1380a]).19 Bacon thus mapped the more concrete domain of travelling onto the more abstract domain of thinking. Robert Boyle (1627–1691) and Isaac Newton (1642–1727) built on this metaphor and used the term experimentum crucis to refer to the experiment performed to decide between two rival hypotheses (OED, s.v. crucial). Although the studies of the scientists mentioned were written in the 17th or early 18th century (some in Latin), the specific phrases with the adjective crucial appeared in English only in the 19th century.20 The earliest example is given in (22) below.
(22) What Bacon terms crucial instances, which are phenomena brought forward to decide between two causes, each having the same analogies in its favour. (OED 1830 Herschel, A preliminary discourse on the study of natural philosophy II. vi. 150)
18. It can be argued that this metaphor has a metonymical basis, as the instances in question are not cross-like, but rather situated at crosses posted at bifurcations of the road. This relation of spatial contiguity thus serves as the base for the metaphor, which is in keeping with Barcelona's claim that 'the target and/or the source of a potential metaphor must be understood or perspectivized metonymically for the metaphor to be possible' (Barcelona 2000: 31; italics his). I thank Hubert Cuyckens for pointing this out to me.
19. The question whether the emergence of the metaphorized meaning in English is a language-internal development or the result of another borrowing does not concern us here.
20. The Latin phrases appeared in earlier scientific or philosophical English writings (e.g., The gradual removal of these suspicions at length led me to the Experimentum crucis (OED 1672 Newton, Light and Colours i); The Experimentum crucis or that Experiment, which points out the Way we should follow, in any Doubt or Ambiguity (OED 1751 Hume, An enquiry concerning the principles of morals V. ii. 84)).
The definitions of crucial instance (in [22]) and crucial experiment (described above) make it clear that these fixed phrases have relational and potential meaning as a whole, since the consideration of a finger-post-like type of instance or the performance of such a type of experiment is necessary in order to decide between rival hypotheses, and ultimately to resolve the intellectual crisis. (Note that crucial functions as a classifier of its collocates.) Arguably, it is only in the specific collocations with instance and experiment that crucial has relational and potential meaning, which is another reason why Table 4 distinguishes between stages 2 and 3. In any case, the condition-goal paraphrases imply that the collocations involve dynamic situational necessity, just like essential and vital in their third stage.
The second semantic change takes place when the use of crucial is extended to other contexts than the collocations with instance and experiment, and concomitantly, the specific meaning of 'necessary to decide between two hypotheses' is generalized to 'decisive for' or 'important for' (cf. Table 4). Whereas crucial only has this specific meaning in the collocations with instance and experiment, in which it functions as a classifier, it retains a more general meaning of 'important' or 'decisive' when used in modifying other nouns. Semantically, in such other contexts, it is crucial itself that has relational and potential meaning, and not the combination of the adjective and the noun. This is structurally reflected by the occurrence of complements (see [23] and [24]). Syntactically, it no longer functions as a classifier, but as an attribute: it is gradable, and it can be used in predicative position (see [24]). Example (23) illustrates the semantic generalization of crucial.
Even if it modifies the noun experiments, we can still argue for a general attribute reading, since the potential action to which the experiment is considered crucial needs to be expressed; if crucial experiments had been used in its specific collocational sense, the for-complement would have been redundant. The type of relationship established by crucial is one of decisive importance or determining influence.
(23) Crucial experiments for the verification of his theory. (OED 1869 Martineau, Essays philosophical and theological II. 134)
As in the case of essential and vital, potential contexts such as in (23) are a prerequisite for dynamic modal meaning. A similar example is given in (24).
(24) It is crucial that the blocking device, ( . . . ), is deposited at this point to ensure that the tubes are rendered impassable. (CB 1996, times)
In (24), which is construed with an extraposed that-clause, the blocking device has to be deposited at a certain point in order to ensure that the fallopian tubes are rendered impassable. The action of depositing is necessary on SoA-internal grounds, that is, for the proper blocking of the tubes (in a sterilization
operation). Examples (23) and (24) make it clear that after metaphorical projection and semantic generalization crucial can be used in dynamic utterances expressing a situation-internal necessity.
In summary, in its development from premodal to modal meaning, crucial starts with the non-relational and non-potential meaning of 'cross-shaped'. It then develops both types of meaning at once through metaphorical projection, brought about by Bacon's collocation crucial instance. In a process of semantic generalization, crucial loses the specific collocational meaning of 'necessary to decide between two hypotheses', and comes to mean 'decisive for'. In this meaning, it expresses dynamic necessity, like essential and vital in their third stage of development.
4.1.5. Critical from premodal to dynamic meaning. The last adjective studied in detail here is critical. I will propose that in its development from premodal to dynamic meaning two stages can be distinguished. These are the result of one semantic change, which involves semantic generalization and leads from its original meaning immediately to dynamic meaning. As put forward in Table 5, this change does not involve a shift in the configuration of the semantic properties of relationality and potentiality. It will also become clear that the development of critical has much in common with that of crucial discussed above.
The first attestation of critical in English dates from 1590, and is a derivation of the now obsolete adjective critic (OED, s.v. critical). Around the end of the 16th century, English critical has two distinct meanings. One is related to the act of judging, and can be paraphrased by 'given to judging', especially 'given to adverse or unfavourable criticism' (OED, s.v. critical). Its first attestation in the OED comes from Shakespeare and is given in (25). In its second sense, critical is a medical term and relates to the crisis or turning point of a
Table 5. The development of critical from premodal to modal meaning
Stage | First attestation | Meaning and examples | Relationality | Potentiality
stage 1: original meaning = collocational meaning | critic: 1544; critical: 1601 | 'necessary to determine the direction of the disease' (25)–(28) | + | +
stage 2: dynamic meaning | (1664) roughly 1990 (CB) | 'decisive for'; 'important for' (29)–(31) | + | +
disease (OED, s.v. critical; Barnhart 1988: 236a). An example of this medical sense is given in (26). This sense is also the meaning of critic in its first attestation, which is given in (27).
(25) That is some Satire keene and criticall. (OED 1590 Shakespeare, A midsommer night's dreame V. i. 54)
(26) Who will say that the Physition in his iudgement by vrine, by indicatorie and criticall daies, by Symptomes and other arguments . . . doeth intrude into the secret prouidence of God?
Who will say that the physician in his judgement by urine, by indicatory and critical days, by symptoms and other arguments . . . does intrude into the secret providence of God?
'Who will say that the physician in his judgement by urine, by indicatory and critical days, by symptoms and other arguments, intrudes into the secret providence of God?' (OED 1603 Heyden, An astrological discourse in justification of the validity of astrology. i. 19)
(27) If it appeare in the vj day, being a day iudiciall or creticke of the ague.
if it appear in the 6th day, being a day judicial or critic of the ague.
'If it [ jaundis, OED] appears on the sixth day, being a judicial or critic day of the ague [i.e., an acute or violent fever, AVL].' (OED 1544 Phaer, Goeurot's [J.] Regiment of life [1553] Gjb)
As the sense of critical in (25) does not play a role in its semantic development of modal meaning, I will not discuss it in more detail. The medical sense of critical (and critic), illustrated in (26) (and [27]), however, did play an important role in the development of deontic meaning, and it is taken here as the first stage (cf. Table 5).21 This sense originates in the writings of Hippocrates (c460–377 BC), and refers to a changing point of a disease, a sudden change for better or worse (Liddell et al. 1951 [1924]: i 997a).22 Hippocrates also introduced the concept of critical days ( ) as a prognostic tool (cf. [26]–[27]), with which he referred to days on which the illness reaches a crisis, and which afforded and required a judgement (also ) about its direction (Demaitre 2003: 768).
21. It is also in this sense that the adjective was borrowed first into Latin and later into French, English and German (FEW II2: 1354b–1355b; Koselleck 2006: 358–363).
22. Such a crisis usually involves the sudden excretion of bad humours, for instance through heavy sweat during fever, vomiting, diarrhea or menstruation (Siraisi 1990: 135).
A. Van linden
In his works De crisi and De diebus creticis, Galen provides the Hippocratic doctrine of critical days with a theoretical (astrological) foundation. He argues that critical days need to be calculated on the basis of a medicinal month, which derives from the orbit of the moon (Siraisi 1990: 135). Since Galen, therefore, the meaning of critical in the collocation critical days also involves an astrological component. Moreover, several studies have shown that the Galenic idea of iatromathematics or astrological medicine remained in use throughout the Middle Ages (e.g., Demaitre 2003), the Early Modern period (e.g., Roos 2000), and even the Late Modern period (e.g., Harrison 2000). Hence, it is not surprising that the first attestations of critic(al) in its medical (and astrological) sense typically collocate with days, as in (26) and (27) above. A later example is given below.

(28) Another time is called Intercidental, which is a time falls out between the Judicial dauyes and Critical. (OED 1651 Culpepper, Semeiotica Uranica; or, an astrologicall judgment of diseases 22)

In collocation with day(s), critical functions as a classifier, indicating a specific subtype of day, rather than attributing a gradable quality to its referent. Critical days are opposed to intercidental days and to judicial or indicatory days, as in (26) and (28). The explanation of critical days above has shown that this fixed phrase has relational and potential meaning as a whole (cf. Table 5), just like the phrases crucial instance and crucial experiment in Section 4.1.4 above. In fact, critical days can be paraphrased as 'days that are necessary to determine the direction of the disease', just as a crucial experiment is necessary to determine the direction of a scientific theory. This condition-goal paraphrase thus implies that the collocation studied here involves situational dynamic meaning.
The semantic change of critical that leads to its dynamic meaning involves semantic generalization through the expansion of the host-class. The data show that the use of critical is extended from the technical medico-astrological sense relating to the crisis in a disease to the more general meaning of 'decisive for' or 'important for' when it modifies other nouns, just like crucial after its semantic generalization. In this extended sense, it is critical itself that has relational and potential meaning, and not critical in combination with the noun it classifies. What is regarded as critical has a decisive impact on the following course of events or, in other words, will determine the outcome of the matter talked about. The relationship established by critical is thus one of decisive influence or determining importance. The semantic generalization of critical has structural correlates in that the adjective is able to take complements (see [29]–[31] below), unlike in its collocational sense, as has also been observed for crucial (cf. Section 4.1.4). Syntactically it does not function as a classifier anymore, but
rather as an attribute, since it can be graded, as is illustrated in (29), and used in predicative position, as is illustrated in (30) and (31).

(29) Acquaint them [tender-plants, OED] gradually with the Air for this change is the most critical of the whole year. (OED 1664 Evelyn, Kalendarium hortense [1729] 198)

(30) The short scenes are critical to providing continuity and maintaining suspense and eye-catching details include flickering/strobe lighting and even silhouetted shadows for the bedroom scene ( . . . ). (CB 1993, ukmags)

(31) The demands imposed by Formula One are greater than ever, he says ( . . . ). The cars too, have become more difficult to handle: It is critical to get the set-up right because it is so easy to lose it in a big way. (CB 1996, times)

In (29), critical modifies this change (presumably the change between two seasons) and it is graded. The adjective has the meaning of 'decisive for', but arguably the sense of necessity is not that clearly present. In fact, all Early and Late Modern English examples are similar to (29), with critical modifying a special occasion or period of time. It is only in Present-day English that critical appears in expressions in which the sense of necessity is foregrounded as well, as in examples (30) and (31). In (30), the use of short scenes is critical or necessary to provide continuity and maintain suspense in the play. This condition-goal paraphrase, typical of dynamic meaning, also applies to (31). Here, getting the set-up of a Formula One car right is critical or necessary to make a good start in a race (and ultimately, to win the race). Note that in this example, the condition is encoded by a clausal complement. In these two cases, some action is regarded as critical or necessary to the achievement of a particular goal, on the basis of SoA-internal grounds.
These examples hence show that the first modal meaning critical develops is that of situational dynamic meaning, much like essential, vital and crucial. In conclusion, critical develops dynamic modal meaning from its original medico-astrological meaning through semantic generalization. From its specific meaning of 'necessary to decide on the direction of the disease' in collocation with days, it develops the more general meaning of 'decisive for'. Both stages involve relational and potential meaning. In the dynamic modal stage, the meaning of critical is very similar to that of crucial in its fourth stage. It should be noted, though, that it is only in Present-day English that critical is used in clearly dynamic expressions, in which the necessity of SoAs is indicated on the basis of SoA-internal arguments.

4.1.6. A second pathway to dynamic meaning: crucial and critical.

From the sections above, we can understand that the semantic developments of
[Figure 2 labels: crucial: 'cross-shaped'; metaphorized meaning: 'like (at) a finger-post'; collocational meaning: 'necessary to resolve the crisis'; critical: 'decisive of the issue of a disease'; dynamic: 'decisive for']
Figure 2. The second pathway to dynamic meaning: crucial and critical
crucial and critical to (dynamic) modal meaning run parallel. They can be represented on a single pathway to dynamic meaning, as shown in Figure 2. As in the case of the first pathway, the main differences between the two adjectives pertain to the initial stages of the respective developments. Crucial first underwent metaphorical projection before it could be used in its collocational meaning (as in crucial experiment), whereas with critical the stage of collocational meaning (critical days) coincides with its original stage. Importantly, in both cases the collocational stage involves the notion of a crisis or turning point. In the case of crucial, the crisis relates to the development of a scientific theory, while in the case of critical, the crisis relates to the development of a disease. After the stage of collocational meaning, both adjectives develop dynamic meaning through generalization. Semantically, the type of relationship they establish then relates to the notion of a crisis, and can be paraphrased as 'decisive for'. This change is also reflected structurally in that the adjectives can take complements (unlike in their collocational stages) (cf. vital in its generalized meaning, see Section 4.1.2). Syntactically, the adjectives no longer function as classifiers, but as gradable attributes, typically allowing for totality modifiers and predicative alternation. Thus, as in the case of essential and vital, the properties of relationality and potentiality can be seen as the semantic conditions for dynamic meaning, indicating necessity inherent in a situation. More specifically, we can conclude that crucial and critical illustrate a second pathway from premodal to (dynamic) modal meaning via the notion of a crisis, which differs from the first pathway exemplified by essential and vital, involving the notion of defining necessity.

4.2. Change within the modal domain

Whereas essential, vital, crucial and critical show differences in their changes from premodal to modal meaning (although these could be grouped two by two in terms of two pathways), the adjectives all show the same change within the modal domain. From the dynamic meaning described as the endpoint of the pathways presented above, all cases develop deontic meaning. The dynamic-deontic development has also been noted for modal auxiliaries with some cross-linguistic frequency (cf. Bybee et al. 1994; Van der Auwera and Plungian 1998). Importantly, however, the type of dynamic meaning involved differs. Whereas the modal auxiliaries first undergo micro-changes within the dynamic domain from participant-inherent (ability) to participant-imposed meaning before developing deontic meaning (e.g., Van Ostaeyen and Nuyts 2004: 113),23 the adjectives studied here develop only one type of dynamic meaning which leads to deontic meaning, viz. situational meaning. While these dynamic adjectival expressions indicate necessities that are internal to the SoAs referred to, deontic expressions involve an attitudinal source (typically the speaker) in whose view a certain action is assessed as necessary or desirable on the basis of (moral) arguments that are external to the SoA (see Section 2). In this sense, the change from dynamic to deontic meaning involves the process of subjectification as defined by Traugott (1989: 35), in which 'meanings tend to become increasingly based in the speaker's subjective belief state/attitude toward the proposition'. Specifically, deontic expressions are subjective in that they enact the speaker's position with regard to the situation (cf. De Smet and Verstraete 2006: 387).24 The first deontic utterance appears with essential in the first half of the 19th century, and is shown in (32). Examples (33) to (35) illustrate deontic uses of vital, crucial and critical, only found in the Present-day English data. As can be seen in (35), in their deontic meaning the adjectives are still gradable, since they can combine with the totality modifier absolutely, just like in their dynamic meaning.
However, in their change from dynamic to deontic meaning, the adjectives also change in the type of opposition they imply (cf. Paradis 2001: 51–54). In dynamic uses, the adjectives are complementaries, conceptualized in terms of 'either . . . or' (Paradis 2001: 52) (either necessary or not necessary/avoidable). In deontic uses, by contrast, the adjectives are antonymic and they imply a scale (in this case one of [moral] desirability), on which they appear at one extreme (with adjectives such as unacceptable at the other end). According to the types of gradable adjectives proposed in Paradis (2001: 51–54), they thus change from limit adjectives to extreme
23. In their diachronic study of the Dutch modal kunnen ('can'), Van Ostaeyen and Nuyts (2004) argue on the basis of the distribution of ambiguous cases that deontic meaning seems to have developed from participant-imposed dynamic meaning, and epistemic meaning from situational dynamic meaning (2004: 52).
24. In De Smet and Verstraete's (2006: 386) typology of subjectivity, this type of subjective meaning is called interpersonal semantic subjectivity (cf. Section 4.1.1, note 13).
adjectives, which is a shift from non-scalar to scalar. Therefore, they confirm Paradis's (2001: 58) finding that adjectives tend to get scalar interpretations.

(32) The Anglo-Catholics consider it essential to be ordained by bishops receiving their appointment in regular succession from the apostles. (OED 1842 Gell, Serm. Visitation Archdeacon of Derby 33)

(33) It is vital that the European Community helps the process of transition to market economies, preparing these countries for eventual EC membership. (CB 1992, ukephem)

(34) With the scourge of illegal narcotics infecting every part of the world, it is crucial to educate young people about the dangers of drugs. (CB 1998, sunnow)

(35) The most important thing is to sharpen the focus of the young generation so that they are better able to identify racism and totalitarianism in its early stages, he said. In the battle against this fundamental evil of the twentieth century, it is absolutely critical to mount a timely resistance. (CB 1996, times)

As the diachronic corpora provide too few examples of deontic expressions, we can only start from the synchronic data to hypothesize how the change took place, i.e., how the process of subjectification worked. Since situational dynamic and deontic expressions merely differ in the presence or absence of an attitudinal source (see also Van linden 2009: 283209, forthcoming), it is the interpretation of the presence of an attitudinal source that must have arisen as an invited inference. Crucially, this presence need not be overtly or structurally marked in the complement constructions studied here. Therefore, it may be argued that the invited inference arose in contexts such as the following.

(36) This should make you want to go to the toilet frequently. Although it may sting the first few times you go, this usually gets better the more water you pass. It is essential to keep emptying the bladder if you are to flush out the germs. (CB 1992, ukephem)

(37) We must persuade our mps to support the Bill – it's a Private Member's Bill, and so it is essential that at least 100 MPs support it, or it will get thrown out without a second reading. (CB 1995, ukephem)

Example (36) repeats (2) above. In Section 2, I argued that the necessity expressed by its condition-goal structure resides in the nature of things, viz. the physical make-up of the urinary system. The structure in (37) indicates the need to have the support of 100 MPs in order to give the Wild Mammals (Protection) Bill a second reading (which is ultimately needed to have the bill passed). Here, the necessity expressed by the condition-goal structure
resides in a self-imposed system, viz. the parliamentary system of Great Britain. In both examples, however, it is possible to see involvement of an attitudinal source. Example (37) can be interpreted as 'within the parliamentary system it is necessary that at least 100 MPs support the bill to give it a second reading, and I think it is essential that this happens, because I feel it is highly desirable that we protect wild animals'. Example (36) can be understood as 'it is essential to keep emptying the bladder if you are to flush out the germs, and I think you should flush them out because you should keep yourself in good health'. In the clearly deontic examples given in (32) to (35), however, the necessities cannot be felicitously interpreted to reside in the nature of things or in a self-imposed system anymore. This is especially clear in (32), in which the necessity is related to the syntactic subject of the complex transitive matrix construction (the Anglo-Catholics). Note also that the expressions in (32) to (35) do not (implicitly or explicitly) refer to a concrete goal to which the SoAs expressed in the complements are said to be essential, vital, crucial or critical. Therefore, the examples show that the invited inference of the presence of an attitudinal source has semanticized or conventionalized (cf. Traugott 1989; Evans and Wilkins 2000: 549–550; Traugott and Dasher 2002: 34–40; Enfield 2003: 28–30): the more subjective scalar deontic meaning has become part of the meaning of the adjectives in addition to their non-scalar dynamic meaning.

5. Conclusion25

In this article on the diachrony of modal expressions, I have elaborated on two topics which have received much attention in that domain, viz. sources of modal forms and pathways of change. Whereas these topics have typically been investigated on the basis of modal auxiliaries, both in language-specific and in cross-linguistic accounts, I have taken the perspective of an underresearched category in this domain, viz.
adjectives. Even though the adjectives studied, viz. four borrowed items of Romance origin, belong to a different register than the modal auxiliaries, as rightly noted by a referee, they proved interesting and offered new insights, which are visualized in Figure 3. Firstly, the adjectives focused on in this article, viz. essential, vital, crucial and critical, add new items to the lists of sources of deontic meaning presented in typological studies. As indicated in Figure 3, the adjectives derive from nouns that express abstract notions (crisis, essence, and life) or concrete objects (cross). It was also noted that many adjectives expressing a weak degree of desirability in the deontic domain, such as proper, appropriate, fitting and
25. This section is based on Van linden et al. (2008: 244–245) and Van linden (2009: 108–112, forthcoming).
Figure 3. The sources and pathways of change of deontic adjectives
meet, by contrast, can etymologically be related to cross-linguistically recurrent sources of obligation or strong deontic meaning, such as possession and measure (Bybee et al. 1994: 182–183; Traugott and Dasher 2002: 118–119; Heine and Kuteva 2002: 333). Secondly, this study of adjectives has expanded on well-known pathways of change. For one thing, it has adduced further evidence for the diachronic validity of the dynamic-deontic pathway, which has been proposed for modal auxiliaries already (e.g., Bybee et al. 1994: 191–194; Goossens 1999; Traugott and Dasher 2002: Ch. 3). Although the dynamic stages may differ for the verbal and adjectival expressions, as discussed in Section 4.2, in both cases the process of subjectification re-orients the property of necessity from the situation (necessity imposed by or internal to a particular situation) to the attitudinal source (necessity as judged by someone, typically the speaker, on the basis of SoA-external, moral principles). It should be noted, though, that subjectification is a metonymically based semantic process that does not systematically correlate with certain formal or structural properties. The diachronic analysis presented here therefore suggests that the distinction between dynamic and deontic modal meaning (in the upper part of Figure 3) may not always be clear-cut, unlike the stages in the developments from premodal to modal meaning. What are even more interesting than the dynamic-deontic change within the modal domain are the very changes from premodal to modal meaning. It is especially in this subdomain that the investigation of adjectives fills a gap in the literature. More specifically, it has been shown that with adjectives the development of (situational dynamic) modal meaning is a matter of the properties of relationality and potentiality. Relationality is needed to turn the adjective into a predicate of necessity that can link two concepts, for instance a part and a whole, or a condition and a goal. Accordingly, it was shown that relational meaning is the semantic condition for the development of potential meaning. Potentiality, in turn, is needed to ensure that the relationship established by the adjective is one of indispensability or decisive influence rather than intrinsic inclusion, and hence, that the necessity involved is dynamic-modal rather than defining. We can therefore assume that the semantic properties of relationality and potentiality are the conditions of entry into the modal domain. In addition, it became clear that the adjectives show some substantial differences in the development of these properties. Although in their original stages, they all function as classifiers, they differ in terms of the configuration of semantic properties. In particular, essential and crucial start off with non-relational and non-potential meaning, whereas vital starts off with relational meaning and critical even with both relational and potential meaning.
It has also been shown that the factors driving the emergence of relationality can be quite different: patterns of co-occurrence with relational nouns in the case of essential, as opposed to metaphorical projection, metonymy and semantic generalization in the case of crucial. For the emergence of potential meaning, the same mechanisms were invoked in the case of crucial, whereas in the case of critical, only the mechanism of semantic generalization (through expansion of the host-class) applied. In the cases of essential and vital, by contrast, potential meaning emerged through patterns of co-occurrence with evaluative adjectives and potential actions. These mechanisms all indicate the importance of constructions in the development of a particular lexical item. We can thus conclude that the developments of the properties of relationality and potentiality, which themselves are new in the diachronic research of modal categories, involve more general mechanisms of change that are not new, but have already been invoked for a varied set of semasiological extensions in distinct conceptual categories (cf. Geeraerts 1997: 93–102).
Still with regard to the properties of relationality and potentiality, it can be argued that they function on different levels: the development of relationality seems to be mainly a lexical matter, while the development of potentiality seems to be on a constructional rather than a lexical level. In the cases of essential and crucial, for instance, the change from non-relational to relational meaning involves the largest semantic leap (from meanings that do not involve necessity to meanings that do). Moreover, the emergence of relationality precedes the development of potentiality, most clearly so in the semantic extension of essential and vital. The changes involving potential meaning, and further on to deontic meaning, by contrast, involve smaller semantic developments (from one type of necessity to another). Most importantly, this study has generalized over the differences in the development of the semantic properties of relationality and potentiality, and it has presented two distinct pathways of change from premodal to modal meaning, as shown in Figure 3. One pathway involves the notion of defining necessity and is followed by essential and vital. The second pathway involves the notion of a crisis, and is followed by crucial and critical. From a broader perspective, the two pathways can also serve as the basis of a more elaborate typology of pathways to deontic meaning. In this article, a few concepts were introduced that may prove useful in further explorations of the diachrony of modal categories, such as the features of relationality and potentiality. Apart from the borrowed adjectives discussed in this article, it may be interesting to look at native adjectives also, such as needful, which may present us with yet other pathways to deontic meaning (from poor, needy over necessary, indispensable [OED, s.v. needful] to morally desirable). It is hoped that further research can expand this preliminary typology of adjectival pathways to deontic meaning. 
However, before we can build on this typology, we need to strengthen its foundations by adducing quantitative evidence for the developments proposed in this article, drawn from larger datasets. Other questions for further reflection, as suggested by a referee, may include the application of the concepts of relationality and potentiality to other categories than adjectives, such as, for instance, the modal auxiliaries. If we take a closer look at the premodal stages of the modals, can we draw parallels with the adjectives? Or can we perhaps draw cross-linguistic parallels? Finally, another concept that deserves further investigation is that of gradability. In the developments of the adjectives, we noted a change from nongradable classifier uses to gradable attribute uses, which is consistent with the unidirectional tendency posited by Paradis (2001: 58). Within the modal domain, we noted a further change from limit adjectives to extreme adjectives, or from non-scalar to scalar gradable items. This finding also complies with the observation that adjectives tend to develop scalar interpretations (Paradis 2001: 58). Perhaps the study of adjectives in other modal domains, such as
certain or true in the epistemic domain, may reveal similar changes and thus present us with other contexts in which shifts from non-gradable to gradable take place.

Received 4 June 2009
Revision received 26 November 2009
University of Leuven

References

Andersen, Henning. 2001. Actualization and the unidirectionality of change. In Henning Andersen (ed.), Actualization: Linguistic Change in Progress (Amsterdam studies in the theory and history of linguistic science. Series 4: Current issues in linguistic theory 219), 225–248. Amsterdam: John Benjamins.
Barcelona, Antonio. 2000. Introduction: The cognitive theory of metaphor and metonymy. In Antonio Barcelona (ed.), Metaphor and Metonymy at the Crossroads: a Cognitive Perspective (Topics in English linguistics 30), 1–28. Berlin: Mouton de Gruyter.
Barnhart, Robert K. 1988. The Barnhart dictionary of etymology. Bronx, NY: Wilson.
Bolinger, Dwight. 1967. Adjectives in English: Attribution and predication. Lingua 18. 1–34.
Bybee, Joan L., Revere Perkins and William Pagliuca. 1994. The Evolution of Grammar: Tense, Aspect and Modality in the Languages of the World. Chicago: University of Chicago Press.
De Smet, Hendrik. 2005. A corpus of Late Modern English Texts. ICAME Journal 29. 69–82.
De Smet, Hendrik. 2008. Diffusional change in the English system of complementation. Gerunds, participles and for . . . to-infinitives. Leuven: University of Leuven dissertation.
De Smet, Hendrik and Jean-Christophe Verstraete. 2006. Coming to terms with subjectivity. Cognitive Linguistics 17. 365–392.
Demaitre, Luke. 2003. The Art and Science of Prognostication in Early University Medicine. Bulletin of the History of Medicine 77. 765–788.
Denison, David. 2001. Gradience and linguistic change. In Laurel J. Brinton (ed.), Historical Linguistics 1999. Selected papers from the 14th International Conference on Historical Linguistics, Vancouver, 9–13 August 1999, 119–144. Amsterdam: John Benjamins.
Depraetere, Ilse and Susan Reed. 2006. Mood and modality in English. In Bas Aarts and April McMahon (eds.), The handbook of English linguistics (Blackwell handbooks in linguistics 21), 267–290. Oxford: Blackwell.
Diewald, Gabriele. 1999. Die Modalverben im Deutschen: Grammatikalisierung und Polyfunktionalität (Reihe Germanistische Linguistik 208). Tübingen: Max Niemeyer Verlag.
Enfield, Nicholas J. 2003. Linguistic epidemiology: Semantics and grammar of language contact in Mainland Southeast Asia (RoutledgeCurzon Asian linguistics series). London: RoutledgeCurzon.
Evans, Nicholas and David Wilkins. 2000. In the mind's ear: The semantic extensions of perception verbs in Australian languages. Language 76. 546–592.
FEW: von Wartburg, Walther. 1922. Französisches Etymologisches Wörterbuch. Eine Darstellung des galloromanischen Sprachschatzes. Bonn: K. Schroeder.
Forrester, John M. 2002. The marvellous network and the history of enquiry into its function. Journal of the History of Medicine 57. 198–217.
Geeraerts, Dirk. 1997. Diachronic prototype semantics: A contribution to historical lexicology (Oxford studies in lexicography and lexicology). Oxford: Clarendon.
Goossens, Louis. 1983. Can and kunnen: Dutch and English potential compared. In Frans Daems and Louis Goossens (eds.), Een spyeghel voor G. Jo Steenbergen: huldealbum aangeboden bij zijn emeritaat [a mirror for G. Jo Steenbergen: Festschrift offered on his retirement], 147–158. Leuven: Acco.
Goossens, Louis. 1985. Modality and the modals: A problem for functional grammar. In A. Machtelt Bolkestein, Casper de Groot and J. Lachlan Mackenzie (eds.), Predicates and Terms in Functional Grammar (Functional grammar series 1), 203–217. Dordrecht: Foris.
Goossens, Louis. 1999. Metonymic bridges in modal shifts. In Klaus-Uwe Panther and Günter Radden (eds.), Metonymy in language and thought (Human cognitive processing 4), 193–210. Amsterdam: John Benjamins.
Halliday, Michael A. K. 1994. An introduction to functional grammar. 2nd ed. London: Edward Arnold.
Hansen, Björn. 1998. Modalauxiliaire in den Slavischen Sprachen. Zeitschrift für Slawistik 3. 249–272.
Hansen, Björn. 2004. Modals and the boundaries of grammaticalization: The case of Russian, Polish and Serbian-Croatian. In Walter Bisang, Nikolaus P. Himmelmann and Björn Wiemer (eds.), What makes grammaticalization? A look from its fringes (Trends in linguistics. Studies and monographs 158), 245–270. Berlin and New York: Mouton de Gruyter.
Harrison, Mark. 2000. From Medical Astrology to Medical Astronomy: Sol-Lunar and Planetary Theories of Disease in British Medicine, c. 1700–1850. The British Journal for the History of Science 33 (1). 25–48.
Heine, Bernd, Ulrike Claudi and Friederike Hünnemeyer. 1991. Grammaticalization: A conceptual framework. Chicago: University of Chicago Press.
Heine, Bernd and Tania Kuteva. 2002. World lexicon of grammaticalization. Cambridge: Cambridge University Press.
Klein, Ernest. 1971. A comprehensive etymological dictionary of the English language dealing with the origin of words and their sense development, thus illustrating the history of civilization and culture. Amsterdam, London and New York: Elsevier Publishing Company.
Knoeff, Rina. 2004. The reins of the soul: The centrality of the intercostal nerves to the neurology of Thomas Willis and to Samuel Parker's theology. Journal of the History of Medicine and Allied Sciences 59 (3). 413–440.
Koselleck, Reinhart. 2006. Crisis. Journal of the History of Ideas 67 (2). 357–400.
Kratzer, Angelika. 1978. Semantik der Rede: Kontexttheorie, Modalwörter, Konditionalsätze (Monographien Linguistik und Kommunikationswissenschaft 38). Königstein/Ts.: Scriptor.
Langacker, Ronald W. 1991. Foundations of Cognitive Grammar, Vol. 2. Descriptive application. Stanford: Stanford University Press.
Liddell, Henry G., Robert Scott, Henry S. Jones and Roderick MacKenzie (eds.). 1951 [1924]. A Greek-English lexicon. 9th ed. compl., repr. Oxford: Clarendon.
Lyons, John. 1977. Semantics. Vol. 2. Cambridge: Cambridge University Press.
Narrog, Heiko. 2005. Modality, mood, and change of modal meanings: A new perspective. Cognitive Linguistics 16 (4). 677–731.
Nuyts, Jan. 2001. Epistemic modality, language, and conceptualization: A cognitive-pragmatic perspective (Human cognitive processing 5). Amsterdam: John Benjamins.
Nuyts, Jan. 2005. The modal confusion: On terminology and the concepts behind it. In Alex Klinge and Henrik Høeg Müller (eds.), Modality: Studies in form and function, 5–38. London: Equinox.
Nuyts, Jan. 2006. Modality: Overview and linguistic issues. In William Frawley (ed.), The expression of modality (The expression of cognitive categories 1), 1–26. Berlin: Mouton de Gruyter.
Oxford English Dictionary: http://dictionary.oed.com/cgi/.
Palmer, Frank Robert. 1979. Modality and the English modals (Longman linguistics library). London: Longman.
Palmer, Frank Robert. 1986. Mood and Modality (Cambridge textbooks in linguistics). Cambridge: Cambridge University Press.
Palmer, Frank Robert. 1990. Modality and the English modals (Longman linguistics library). 2nd ed. London: Longman.
Paradis, Carita. 2001. Adjectives and boundedness. Cognitive Linguistics 12 (1). 47–65.
Perkins, Michael R. 1983. Modal expressions in English (Open linguistics series). London: Pinter.
Plank, Frans. 1984. The modals story retold. Studies in Language 8. 305–364.
Roos, Anna M. 2000. Luminaries in medicine: Richard Mead, James Gibbs, and solar and lunar effects on the human body in Early Modern England. Bulletin of the History of Medicine 74. 433–457.
Rothwell, William. 1998. Arrivals and departures: The adoption of French terminology into Middle English. English Studies 79 (2). 144–165.
Shindo, Mika. 2008. From visual adjective to modalized intensifier: A cross-linguistic study of grammaticalization. Paper presented at New Reflections on Grammaticalization (NRG) 4, University of Leuven, 16–19 July. (Abstract at http://wwwling.arts.kuleuven.be/nrg4/_pdf/shindo.pdf)
Siraisi, Nancy G. 1990. Medieval and early Renaissance medicine: An introduction to knowledge and practice. Chicago: University of Chicago Press.
Sweetser, Eve. 1990. From Etymology to Pragmatics (Cambridge studies in linguistics 54). Cambridge: Cambridge University Press.
Teyssier, Jacques. 1968. Notes on the syntax of the adjective in Modern English. Lingua 20. 225–249.
TLF: Trésor de la langue française. 1971–1994. Dictionnaire de la langue du XIXe et du XXe siècle, 16 vols. Paris: Centre National de la Recherche Scientifique.
Traugott, Elizabeth Closs. 1989. On the rise of epistemic meanings in English: An example of subjectification in semantic change. Language 65 (1). 31–55.
Traugott, Elizabeth Closs. 2006. Historical aspects of modality. In William Frawley (ed.), The expression of modality (The expression of cognitive categories 1), 107–139. Berlin: Mouton de Gruyter.
Traugott, Elizabeth Closs and Richard B. Dasher. 2002. Regularity in Semantic Change (Cambridge studies in linguistics 97). Cambridge: Cambridge University Press.
Van der Auwera, Johan and Vladimir Plungian. 1998. Modality's semantic map. Linguistic Typology 2. 79–124.
Van linden, An. 2009. Dynamic, deontic and evaluative adjectives and their clausal complement patterns: A synchronic-diachronic account. Leuven: University of Leuven dissertation.
Van linden, An. Forthcoming. The development of deontic and evaluative meanings in English adjectival constructions (Topics in English Linguistics). Berlin: Mouton de Gruyter.
Van linden, An, Jean-Christophe Verstraete and Hubert Cuyckens. 2008. The semantic development of essential and crucial: Paths to deontic meaning. English Studies 89 (2). 226–247.
Van Ostaeyen, Gert and Jan Nuyts. 2004. De diachronie van kunnen [the diachrony of kunnen ('can')]. Antwerp papers in linguistics 109. University of Antwerp.
von Wright, Georg H. 1951a. Deontic Logic. Mind 60 (237). 1–15.
von Wright, Georg H. 1951b. An Essay in Modal Logic (Studies in logic and the foundations of mathematics). Amsterdam: North-Holland Publishing Company.
von Wright, Georg H. 1971. Norm and Action: A logical enquiry (International library of psychology, philosophy and scientific method). London: Routledge and Kegan Paul.
The relation between iconicity and subjectification in Portuguese complementation: Complements of perception and causation verbs
RAINER VESTERINEN*
Abstract

The present paper examines the variation between finite and infinitive complements of the Portuguese perception/causation verbs ver 'see', ouvir 'hear', sentir 'feel', deixar 'let' and fazer 'make' from a cognitive grammar perspective. It is argued that the distribution of the structures main verb+finite/infinitive complement can be explained by iconicity and subjectification. The hypothesis is put forward that the structure perception verb+infinitive complement designates direct physical perception, while the structure perception verb+finite complement designates an inferential relation between the main verb and the complement event. In addition, the structure causation verb+infinitive complement designates direct causation, whereas causation verb+finite complement designates an indirect causation with inferential features. Further, it is claimed that the inferential and conceptually more complex character found in the structure main verb+finite complement represents a prime example of subjectification. This being so, it is argued that Portuguese complementation provides a remarkable connection between iconicity and subjectification.

Keywords: causation; cognitive grammar; complementation; iconicity; inference; perception; Portuguese; subjectification.
1. Introduction

Since Haiman's (1980, 1985) seminal work on iconicity in language, functional and cognitive linguistics have shown a major concern with establishing a relation between iconic principles, on the one hand, and different complement
* Address for correspondence: Department of Spanish, Portuguese and Latin American studies, Stockholm University, Universitetsvägen 10b, 106 91, Stockholm, Sweden. E-mail: rainer@isp.su.se
Cognitive Linguistics 21–3 (2010), 573–600. DOI 10.1515/COGL.2010.019. 0936–5907/10/0021–0573 © Walter de Gruyter
structures, on the other. A central claim holds that there is a correspondence between formal and conceptual complexity in the complement structure. Another claim is that a greater formal distance between the main verb and the complement verb matches a greater conceptual distance between the events described by these verbs (cf. Achard 2000, 2002; Givón 1993, 2001; Maldonado and Nava 2002; Verspoor 2000).¹

The aim of the current paper is to corroborate these claims, supplying evidence from the Portuguese language. On the basis of natural data, I will argue that the distribution of finite and infinitive complement structures of perception and causation verbs in Portuguese is highly motivated by conceptual differences, i.e., the conceptual distinction made between direct and indirect perception/causation. Furthermore, I will suggest that these conceptual differences entail a higher degree of subjectification in the structure main verb+finite complement than in the structure main verb+infinitive complement. In this sense, the present paper aims at drawing attention to a plausible connection between iconicity and subjectification in finite and infinitive complement structures. The notion of subjectification will be considered from both Traugott's (1989, 1995, 1996) and Langacker's (1990, 1999, 2003, 2006) frameworks.

The Portuguese perception verbs ver 'see', ouvir 'hear' and sentir 'feel', as well as the causation verbs deixar 'let' and fazer 'make', share a remarkable feature, namely that of allowing both infinitive and finite complement structures. Additionally, the infinitive complement structures of these verbs are divided into three different structures, depending on the position of the logical subject of the infinitive and on the use of the plain or the inflected infinitive.² This produces four different complement structures, illustrated in (1–4) with the causation verbs deixar and fazer:

(1) Os pais deixam/fazem brincar os meninos (VV)
    The parents let/make-pres:3p.p play-inf. the children
    'The parents let/make the children play'
(2) Os pais deixam/fazem os meninos brincar (VOV)
    The parents let/make-pres:3p.p the children play-inf.
    'The parents let/make the children play'
1. The term event and its different uses in the current paper should be understood in a broad sense of the word. The term will be used to designate an action, a process or a state (cf. Silva 2004).
2. The Portuguese inflected infinitive is a typical feature of the Portuguese language. The inflected infinitive agrees with the subject in person and number and is formed by adding a suffixal subject morpheme to the plain infinitive in the following way: -Ø (1p.s), -es (2p.s), -Ø (3p.s), -mos (1p.p), -des (2p.p), -em (3p.p).
(3) Os pais deixam/fazem os meninos brincarem (VSV)
    The parents let/make-pres:3p.p the children play-inf:3p.p
    'The parents let/make the children play'

(4) Os pais deixam que / fazem com que os meninos brinquem (finite complement)
    The parents let-pres:3p.p that / make-pres:3p.p with that the children play-pres.subj:3p.p
    'The parents let/make the children play'
The infinitive structures in (1–3) are frequently referred to as VV, VOV and VSV (cf. Silva 2004, 2005). In the first example (1), the main verb (deixar/fazer) is followed immediately by the plain infinitive, forming the VV structure in which the logical subject of the infinitive has a final position. In (2), the logical subject is inserted between the main verb and the plain infinitive. Nonetheless, the use of a plain infinitive, without any subject agreement, produces the VOV structure. In other words, the logical subject of the infinitive is considered as the grammatical object. Finally, in (3), the word order is the same but there is subject agreement on the infinitive, which results in the VSV structure.

Turning to the structure main verb+finite complement, illustrated in example (4), it is easy to verify some formal differences. Whereas the infinitive complements follow immediately after the main verb, the finite complements are normally introduced by the complementizer que 'that'. This is the case with complements of the perception verbs ver, ouvir and sentir, and with finite complements of the causation verb deixar. The finite complements of fazer diverge from this pattern, being introduced by the preposition com 'with' immediately followed by the complementizer. Another formal difference between the finite complements of perception and causation verbs is that the perception verbs take an indicative verb complement, while the causation verbs take a verbal complement in the subjunctive mood.³

Studies on complementation in Portuguese have traditionally been carried out from a generative approach to language. The main purpose of these studies has been to describe the grammatical contexts that allow infinitive complement structures, and the structure with the inflected infinitive in particular. In doing
3. Although the contrast between indicative and subjunctive complements is highly relevant in relation to the perception/causation verbs, it goes beyond the scope of the present paper. See Maldonado (1995) and Vesterinen (2006, 2007a) for detailed analyses of the semantics of the subjunctive from a cognitive grammar perspective.
so, the distribution of infinitive complements has frequently been explained by the Chomskyan theory of Government-Binding (cf. Brito 1995; Caetano et al. 1994; Raposo 1987). Needless to say, semantic concerns have been largely ignored within these frameworks. However, one exception to this tendency is found in Silva (2004, 2005). Starting from a cognitive grammar perspective (Langacker 1987, 1991), Silva distinguishes a continuum in which the initial position of the logical infinitival subject and the inflection of the infinitive render the event described in the complement more independent from the main clause event (cf. Silva 2004, 2005).

The aim of the present study is to go one step further. I will suggest that the structures main verb+finite/infinitive complement represent a prime example of iconicity. Thus, it will be shown that the structure perception verb+infinitive complement denotes a prototypical direct physical perception, while the structure with a finite complement implies an inferential relation between the perception verb and the complement. In the same manner, the infinitive complements of causation verbs tend to designate a prototypical direct causation, while the finite complements designate indirect causation. It will also be argued that the indirect causation attested in the finite complements is often of an inferential type. In sum, the structure perception/causation verb+finite complement provides evidence of the speaker's (the conceptualizer's) inferences about the events described by the main and the complement clause.

As indicated above, the inferential character found in perception/causation verb+finite complement will be explained by the notion of subjectification.
From Traugott's perspective (1989, 1995, 1996), I will show that the meaning of the finite structures is based entirely on the internal (evaluative, perceptual, cognitive) described situation; and further, I will propose that the finite structure designates the speaker's subjective belief state toward the proposition. From Langacker's perspective (1990, 1999, 2003, 2006), I will argue that the inferential character found in perception/causation verb+finite complement subsumes semantic bleaching and a change in perspective and locus of activity, i.e., from an active object of conception to the mental scanning of a subjectively construed conceptualizer. Thus, the analysis in the present paper provides an alternative interpretation to the claim that finite complements of causation verbs constitute a peripheral subcategory mainly expressing purely causal relations (cf. Verhagen 2005). In fact, the finite complements of perception and causation verbs seem to exhibit a striking parallelism regarding the notion of subjectification.

In the light of this analysis, I will ultimately argue that the finite complement structures provide evidence of a connection between iconicity and subjectification. The first step in this process is that a higher degree of indirectness between the main event and the complement event in the structure main verb+
finite complement designates a more inferential relationship between these two events than does the structure main verb+infinitive complement. The second step in this process is that the inferential relation in the structure main verb+finite complement evokes the conceptualizer to a higher degree than is the case in the structure main verb+infinitive complement.

The outline of the paper is as follows: The results of a quantitative analysis of complement patterns regarding direct and indirect perception/causation are presented in section 2. Sections 3 and 4 are dedicated to a qualitative analysis of the structures perception verb+complement and causation verb+complement respectively. Section 5 describes the relationship between iconicity and subjectification. The conclusions are presented in section 6.

2. Complement patterns: direct and indirect perception/causation

The linguistic material analysed consists of 600 natural examples extracted randomly from the Portuguese corpus Linguateca and from Portuguese Internet sites. The structure main verb+complement consists of 300 occurrences of the structure perception verb+complement and 300 occurrences of the structure causation verb+complement. Each verb type is further represented with 150 finite and 150 infinitive cases. That is, the category perception verbs consists of 150 finite complements and 150 infinitive complements, and the same holds for the category causation verbs.

In order to make a distinction between direct and indirect causation/perception, the following parameters were primarily considered: (1) direct perceptual experience of an event vs. inferential processes based on perceptual evidence, and (2) direct physical causation detectable in the outside world vs. causation as a mental experience. Regarding the perception verbs, the distinction is noteworthy in cases like: I saw the children playing football vs. I see that you are tired.
In the latter case, the conceptualizer does not actually see the event of someone being tired, but infers it on the basis of perceptual evidence. Hence, this case designates indirect perception. Likewise, mental causation in which the causer induces the causee to perform a certain action is considered to be a case of indirect causation. On the other hand, the direct manipulation of an object often refers to direct causation. Obviously, the cases discussed above are prototypical instances of direct and indirect perception/causation. The distinction drawn between direct and indirect perception/causation will be discussed in more detail in the qualitative analysis.

The quantitative analysis confirms a strong pattern in which the structure main verb+infinitive complement designates a direct perception/causation, while the structure main verb+finite complement designates a more indirect and inferential relation between the main verb event and the complement event. Accordingly, the structure perception verb+infinitive complement
Table 1. Perception verbs + finite and infinitive complements.

                      Main Vperception +    Main Vperception +    Totals
                      Finite compl.         Infinitive compl.
Direct perception      27                   130                   157
Indirect perception   123                    20                   143
Totals                150                   150                   300

Table 2. Causation verbs + finite and infinitive complements.

                      Main Vcausation +     Main Vcausation +     Totals
                      Finite compl.         Infinitive compl.
Direct causation       28                   116                   144
Indirect causation    122                    34                   156
Totals                150                   150                   300
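The chi-square values reported for Tables 1 and 2 can be checked directly from the cell counts. The sketch below is not part of the original analysis: it assumes a Pearson chi-square test on each 2×2 table with Yates' continuity correction, which the text does not name but which matches the reported figures. The function names are illustrative only.

```python
from math import erfc, sqrt

def yates_chi_square(a, b, c, d):
    """Chi-square with Yates' continuity correction for the 2x2 table [[a, b], [c, d]].

    Assumption: the correction is inferred from the reported values,
    not stated in the paper.
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    numerator = n * (abs(a * d - b * c) - n / 2) ** 2
    return numerator / (row1 * row2 * col1 * col2)

def p_value_df1(chi2):
    # With one degree of freedom, the upper-tail probability of the
    # chi-square distribution equals erfc(sqrt(chi2 / 2)).
    return erfc(sqrt(chi2 / 2))

# Cell order: direct/finite, direct/infinitive, indirect/finite, indirect/infinitive
tables = {
    "perception (Table 1)": (27, 130, 123, 20),
    "causation (Table 2)": (28, 116, 122, 34),
}

for label, cells in tables.items():
    chi2 = yates_chi_square(*cells)
    print(f"{label}: chi2 = {chi2:.2f}, p < .0001: {p_value_df1(chi2) < 0.0001}")
```

Under this assumption the statistic comes out at 139.02 for the perception verbs and 101.08 for the causation verbs, with df = 1 in both cases; without the continuity correction the values would be slightly higher.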
presents 130 cases of direct perception and only 20 cases of indirect (inferential) perception. In contrast, the structure perception verb+finite complement displays a completely different pattern: 123 occurrences designate indirect (inferential) perception and 27 cases are of a more direct character. The association between complement type and direct/indirect perception is highly significant, χ² = 139.02, df = 1, p < .0001 (two-tailed). The results of the structure perception verb+complement are shown in Table 1.

Turning to the structure causation verb+complement, it is noteworthy that the same pattern prevails. The structure causation verb+infinitive complement designates a more direct kind of causation (116 cases), while it only exhibits 34 cases of indirect causation. In contrast, the structure causation verb+finite complement shows the opposite tendency. In the latter, 122 cases designate indirect causation and no more than 28 cases designate direct causation. The association between complement type and direct/indirect causation is statistically significant, χ² = 101.08, df = 1, p < .0001 (two-tailed). The results of the structure causation verb+complement are shown in Table 2.

It is noteworthy that the finite cases diverging from the general pattern often designate a kind of categorisation, e.g., I saw that the car was white. Another type that diverges from the general pattern corresponds to evidentials with the verb ouvir 'to hear' of the type: I heard that Peter is in town. In both cases, however, one could easily argue that the finite structures display indirect features. In the first case, the categorization of an entity entails a more elaborated mental process than the mere registration of an event; and the second case corresponds to an information chain that certainly is indirect in nature. However,
they do not designate strictly inferential structures and, therefore, they are considered direct cases in the present paper. The infinitive cases with an indirect meaning, on the other hand, often correspond to the VSV structure with a human subject in both main and complement clause, e.g., They heard the opponent provoke the Prime Minister. That is, an occurrence in which the subject of the perception verb not only hears something, but also draws a conclusion on the basis of the perceptual experience. This latter tendency regarding the VSV structure is also visible in a small number of cases of inducive causation designated by the verb fazer 'to make'.

Tables 1 and 2 statistically corroborate the hypothesis that the two structures differ substantially regarding their conceptual content. The structure main verb+infinitive complement tends strongly to designate a direct perception/causation, while the structure main verb+finite complement designates a more indirect kind of perception/causation. In subsequent sections, I will analyse these conceptual differences more thoroughly.

3. Complements of the perception verbs ver, ouvir and sentir

The distinction made in the current paper between direct perception, on the one hand, and inferential structures, on the other, is based on the folk model of the mind and the notion that physical perception is an immediate and spontaneous phenomenon. That is, in spite of modern theories of perception, there is a widespread idea that what we see, hear or feel is caused directly by an external stimulus in the outside world. We hear or see something because it occurs in our environment; and if it occurs within our auditory or visual field we have only a limited possibility to avoid it. There is, so to speak, a general idea that perception is direct and uncontrollable.
Inference (or inferential structures), on the contrary, is related to our capacity for reasoning and to mental processes that lead us to certain belief states about how the outside world is shaped. As a result, inference is conceived of as more indirect and more controllable than perception (cf. D'Andrade 1987; Verhagen and Kemmer 1997).

Relating the distinction made above to infinitive and finite Portuguese complements of perception verbs, it is interesting to verify a rather clear-cut correlation. The structure main verb+infinitive complement designates a direct perceptual relation between the main verb perceiver and the process described in the complement. That is to say, an external stimulus in the outside world causes the perceiver to see, hear or feel something directly. In contrast, the structure main verb+finite complement does not merely describe a physical perceptual relation between the main verb subject and the complement event, but an inferential one. Thus, this structure describes inference,
reasoning and beliefs about the outside world.⁴ The infinitive cases in (5–7) illustrate the direct perceptual relation between the main verb subject and the event described in the complement:

(5) Em apenas vinte minutos que estive no café vi duas senhoras serem atacadas.
    In only twenty minutes that be-past:1p.s in the coffee bar see-past:1p.s two ladies be-inf:3p.p attacked.
    'I was only in the coffee bar for 20 minutes, but I saw two ladies being attacked.' [Diário de Aveiro-N2240-1]
(6) O repórter do Diário de Coimbra que na altura se deslocou ao local ouviu populares imputarem a prática do crime a alguém.
    The reporter of Diário de Coimbra that in the moment refl. move-past:3p.s to the place hear-past:3p.s people attribute-inf:3p.p the practice of the crime to someone.
    'The reporter of Diário de Coimbra, who at that moment was moving towards the crime scene, heard people attribute the crime to someone.' [Diário de Coimbra-N0859-1]

(7) A meio da noite senti a minha cama tremer, como um terramoto, afirmou um dos sobreviventes.
    To middle of the night feel-past:1p.s the my bed shake-inf:3p.s, like an earthquake, affirm-past:3p.s one of the survivors.
    'In the middle of the night I felt my bed shaking like an earthquake, said one of the survivors.' [Diário de Leiria-N0991-1]
In (5), the main verb subject, expressed by the first person singular preterit form vi 'I saw', personally witnessed the act of two ladies being attacked. In the following example (6), the reporter directly heard (ouviu) someone attribute a crime to someone. Finally, in (7), the main verb subject felt (senti) the bed shaking. Thus, the common denominator of these cases is a
4. See also Vesterinen (2007b) for a detailed examination of the matter.
direct perceptual relation between the main verb subject and the process described in the infinitive complement.

Turning to the structure main verb+finite complement, it is difficult to find this kind of perceptual relation between the main verb subject and the complement event. Instead, this structure designates an inferential relation in which the main verb subject, basing him-/herself on some perceptual experience, is drawn towards a conclusion about how the outside world is shaped. Examples (8–10) illustrate this phenomenon:

(8) Portanto, . . . fico sempre desolada quando vejo que a comunicação social dá a sensação de estar controlada.
    Therefore . . . get-pres:1p.s always distressed when see-pres:1p.s that the mass media give-pres:3p.s the impression of be-inf:3p.s controlled.
    'Therefore . . . I always get distressed when I see that the mass media give the impression of being controlled.' [Diário de Aveiro-N1448-1]

(9) ouvem que a limpeza étnica e o apartheid são a principal característica da política de Israel.
    hear-pres:3p.p that the ethnic cleansing and the apartheid be-pres:3p.p the typical trait of the politics of Israel.
    'They hear that ethnic cleansing and apartheid are typical traits of Israeli politics.' [http://www.unidadepopular.org/luta.htm]

(10) Mas gostava de ver outras mulheres, bastante mais novas na política, mas sinto que há um grande desinteresse.
     But like-past:1p.s of see-inf:1p.s other women, much more younger in the politics, but feel-pres:1p.s that be-pres:3p.s a great disinterest.
     'But I would like to see other, much younger women, in politics, but I feel that there is a great lack of interest.' [Diário de Aveiro-N1448-1]
The difference between these cases and the infinitive ones is that (8–10) describe events that are not detected directly by the main subject's perceptual apparatus. In other words, in (8) it is difficult to actually witness the event of mass media being controlled. Rather, the main verb subject draws a conclusion about the state of the mass media. In (9) the main verb subjects, expressed by the verb ouvem 'they hear', do not actually hear the event of ethnic cleansing and apartheid in Israeli politics, but something they heard made them realize that this is a typical trait of Israeli politics. And, in (10), the main verb subject does not feel the event of great disinterest in the same direct manner as in the infinitive example (7). In sum, these cases do not express a direct physical perceptual relation between the main verb subject and the complement event. Rather, they designate an inferential process.

This being so, a question to be asked is whether it is possible to insert an infinitive complement in the cases studied above and still have an inferential relation between the main verb subject and the complement event. This problem may be addressed by means of a substitution test that could verify if the finite cases in (8–10) can be modified into infinitive ones without any semantic consequences:

(11) ? Vejo a comunicação social dar a sensação de estar controlada
     * I see the mass media give-inf:3p.s the impression of be-inf:3p.s controlled⁵
(12) ? Ouvem a limpeza étnica e o apartheid serem a principal característica da política de Israel
     * They hear ethnic cleansing and apartheid be-inf:3p.p typical traits of Israeli politics
(13) ? Sinto haver um grande desinteresse
     * I feel be-inf:3p.s a great disinterest

In fact, these examples seem to highlight the semantic differences between the finite and the infinitive complements.
Whereas finite complements may express thoughts and propositions about the world, including cases like (8–10) that are of a more stative character, the infinitive complement seems more appropriate to describe processes. This produces a semantic conflict in (11–13) where the insertion of an infinitive complement creates expectations of a direct physical perceptual relation between a perceiver and a perceived process (cf.
5. The English translations of the modified and fabricated examples are given word for word in order to capture the original Portuguese structures. While they are not always grammatical, they are intended to support the semantic considerations under discussion.
Perini 1977: 48–51; Vesterinen 2007b: 271–273).⁶ This conflict is perhaps even better illustrated in the following examples:

(14) O Jorge vê que a Maria tem muito trabalho
     George sees that Mary have-pres:3p.s much work to do
(15) ? O Jorge vê a Maria ter muito trabalho
     * George sees Mary have-inf:3p.s much work to do
(16) O Jorge vê que a Maria está cansada
     George sees that Mary be-pres:3p.s tired
(17) ? O Jorge vê a Maria estar cansada
     * George sees Mary be-inf:3p.s tired
(18) O Jorge vê que a Maria é advogada
     George sees that Mary be-pres:3p.s a lawyer
(19) ? O Jorge vê a Maria ser advogada
     * George sees Mary be-inf:3p.s a lawyer

Once again, the peculiarity of the construction with infinitives (15, 17, 19) is related to the fact that propositions like she has much work to do, she is tired and she is a lawyer are not so easily detected in the outside world. On the contrary, the main verb subject infers these states from some prior perceptual evidence. Therefore, the finite complements can easily be followed by a causal clause, introduced by porque 'because', that explains why the main verb subject infers the complement event, e.g., vejo que estás cansado porque bocejas constantemente 'I see that you are tired because you are yawning constantly'. This is not the case with infinitive complements:

(20) ? Vi duas senhoras serem atacadas, porque um homem estava zangado com elas.
     * I saw two women be-inf:3p.p attacked, because a man was angry at them
(21) ? Ouviu populares imputarem a prática do crime a alguém, porque ele era o culpável.
     * He heard people attribute-inf:3p.p the crime to someone, because he was guilty
(22) ? Senti a minha cama tremer, porque houve um terramoto
     * I felt my bed shake-inf:3p.s, because there was an earthquake
6. The grammaticality/acceptability judgements in the present paper are based on my intuitions. As such, they are consistent with the approach used in Perini (1977) and Vesterinen (2007b, 2008).
The reason why examples (20–22) seem a bit odd is probably because they express an inferential relation that explains the cause of the complement event and not the reason for the main subject's perceptual experience. Thus, it may be true that the man in (20) attacked the women because he was angry at them, but this does not explain why the main verb subject saw the event. A more likely explanation would be that he saw the event because it occurred within his visual field. Likewise, (21) would be more felicitous if it explained the reason why the main verb subject heard someone say something and not why someone said it. Finally, it may be true that the bed in (22) shook because of the earthquake. However, the reason why the main verb subject felt it was that he was in bed when it shook. In sum, infinitive cases like (20–22) designate perceptual relations. Therefore, a causal clause explaining the perceptual event is more felicitous with the structure main verb+infinitive complement.

The cases studied above (5–22) seem to highlight the conceptual difference between finite and infinitive complement structures. The infinitive complement structures designate a prototypically direct sensorial perception in its relation to the main verb, whereas the finite complement structures designate an inferential relation. Nevertheless, a remaining question is how to explain this difference. In other words, why do the infinitive structures designate direct perception and the finite ones inference? A reasonable answer to this question can be found in the notions of event integration (Givón 1993, 2001) and grounding (Langacker 1990). A fundamental idea underlying the notion of event integration is that a complement that is formally simple, and is lacking morphological information, tends to be integrated into the event described by the main verb.
Likewise, a more elaborated complement gains a certain independence in its relation with the main verb and may be conceptualized as an event on its own. This difference is shown in the following examples:

(23) As pessoas vêem/ouvem/sentem que estas decisões trazem benefícios no futuro
     People see/hear/feel that these decisions bring-pres:3p.p benefits in the future
(24) ? As pessoas vêem/ouvem/sentem estas decisões trazerem benefícios no futuro
     * People see/hear/feel these decisions bring-inf:3p.p benefits in the future

The difference between these examples is that the finite complement in (23) may designate a future event while the main verb designates an event in the present. That is to say, the main verb event and the complement event can be conceptualized as two different events. This is not the case with the infinitive
complement in (24). The temporal expression no futuro 'in the future' has the effect of dislocating both the main and the complement event to the future. In other words, the absence of tense in the infinitive complement makes it more dependent on the temporal profile of the main verb.⁷ Further, the difference between a finite and infinitive complement pertains to the effect of grounding the complement event so that it bears some relationship to the speech event, i.e., the ground. Thus, the absence of tense in the infinitive complement implies that there is no such relation. Instead, the event described in the complement is reached through the main verb. This difference is illustrated in Figure 1:
Figure 1. The grounding of the complement event.
As indicated in Figure 1a, there is no direct relation between the ground and the infinitive complement event. This situation is illustrated with the absence of an arrow between the ground and the complement event. Even so, there is a relation in the sense that the ground has access to the event through the main verb. Simultaneously, a process that can be detected in the outside world does not need to be grounded in the same way as a proposition about the outside world. Therefore, it is not surprising that the infinitive complement designates a perceptual event, i.e., the physical perception of the main verb subject. On the other hand, the grounding of the complement event has the effect of locating the event in relation to the speech act participants' conception of reality (cf. Langacker 1991, 2004, 2008). Thus, the finite structure does not actually
7. The higher degree of dependence of the infinitive complement may be even more salient, taking the following case into consideration:
?Eles viram/ouviram/sentiram estas decisões trazerem benefícios no futuro
*They saw/heard/felt these decisions bring-inf: 3p.p benefits in the future
It is actually difficult to separate the complement event temporally from the main verb event without creating a sentence that seems extremely odd.
designate the perception of the main verb subject. Rather, it is the ground (the speaker/conceptualizer) who expresses his propositional attitude about how the world is shaped. Although the ground is always the primary conceptualizer of a linguistic expression, the situation described above reflects a shift in perspective and vantage point. The infinitive complement implies that the complement event is seen, heard or felt directly by the main verb subject. A finite complement clause, on the other hand, changes this viewing arrangement. Rather than describing the perception of the main verb subject, the finite complement designates the mental process of the ground (the primary conceptualizer). In this paper, this situation is considered to be a case of subjectification and will be further studied in Section 5. The following section, however, will focus on complements of causation verbs.
4. Complements of the causation verbs deixar and fazer
The notion that causation can be of a more direct or indirect nature is commonly based on a distinction between a physical and more direct kind of causation, on the one hand, and a mental and more indirect causation, on the other. Likewise, direct physical causation is often claimed to correlate with a situation in which the causer has a high degree of control over the causee, while the opposite is true for mental and indirect causation of a more inducing, permissive or enabling kind (cf. Shibatani 1976; Kemmer and Verhagen 1994; Shibatani 2002; Shibatani and Pardeshi 2002). Accordingly, Shibatani and Pardeshi (2002: 89) point out that direct causation prototypically reflects a causal relation with a patientive causee, whereas the causee is more agentive in indirect causation. This claim is also made by Verhagen and Kemmer (1997) and Song and Wolff (2004).
The former define indirect causation as a situation that is conceptualised in such a way that it is recognized that some other force besides the initiator is the most immediate source of energy in the effected event (Verhagen and Kemmer, 1997: 67). Additionally, Song and Wolff (2004) introduce the no-intervening criterion, according to which the causation is claimed to be direct if there is no intervening causer between the initial causer and the final patient. Another trait, discussed in Song and Wolff (2004), is the degree of intention of the caused event. If the causer intentionally brings about an event, the causation is said to be direct. If the resulting event is non-intentional, on the other hand, the causation is understood as indirect. This distinction is exemplified with the difference between "the girl broke the vase" and "the girl caused the vase to break". In the former case, the caused event is claimed to be intended, and therefore it designates direct causation. The latter case, in contrast, designates a non-intended causation, and correlates with indirect causation (cf. Song and Wolff, 2004: 240–241). Figure 2 summarizes the distinctions discussed above:
Figure 2. Direct and indirect causation.
The idea underlying the distinction made in Figure 2 is that indirect causation is of a more mediated kind. For example, direct physical manipulation of an object, e.g., "Joe threw away the stone", designates an event in which there is no intervening force between the causer and the caused event. On the other hand, mental manipulation typically involves a mind-to-mind relation in which the causer tries to achieve the caused event by changing the world-view of the causee. And as Verhagen and Kemmer (1997: 71) state: "one cannot reach into another person's mind and directly cause him or her to do, feel, or think something." In this case, the causee is of a more agentive kind, and responsible for his or her own actions. However, it may be difficult to sustain that intention and control are relevant features in the distinction between direct and indirect causation. It is, for example, easy to imagine a physical causation in which one person unintentionally pushes another person, and thereby causes him to fall. This would certainly qualify as direct causation without being intentional or displaying a high degree of control over the caused event. Another problem arises with the enabling or permissive mental causation. Verhagen and Kemmer (1997) seem to classify this causation as indirect since the initiator may be considered responsible for the consequences (Verhagen and Kemmer 1997: 68). The view taken in the present paper is that the enabling or permissive causation in many cases reflects a situation where the causee is more likely to be self-propelled into an action. Thus, the removal of a hindrance (social or physical) directly allows him to accomplish the action. This is, for example, the case in a sentence like: O pai deixa o seu filho sair (the father lets his son go out-inf. 3p.s). That being so, the distinctions made in Figure 2 are not to be seen as absolute, but as possible manifestations of a more basic parameter, elaborated by Shibatani
and Pardeshi (2002). This parameter considers the spatiotemporal profile of the causal event structure as a whole, and the conceptualization of the situation as one single event or two separate ones. If there is a spatiotemporal overlap between the causing event and the caused event (that is, where the causer's activity and the caused event are not so easily distinguished from one another), the causation is considered to be of a direct kind. However, if the caused event has some degree of autonomy, being conceptualized as an event on its own, the causation is of a more indirect kind. This difference is shown in Figure 3:
Figure 3. Direct and indirect causation (cf. Shibatani and Pardeshi, 2002: 90).
To begin with, Figure 3 has the advantage of explaining the prototypical differences between direct and indirect causation. Direct physical causation typically displays a temporal overlap between the causing and the caused event so that the two events are conceptualized as one. In a situation where someone pushes, kicks or hits something, for example, the causing event has immediate consequences, which lead to the conceptualization of a single event. The causing and the caused events are not clearly distinguishable. In contrast, mental inducive causation suggests that there are two separate events: the inducive and the action event. In other words, an agentive causee may decide to act after considering the causation event.8 Further, Figure 3 may account for the permissive and enabling side of causation. If the causal relation is conceptualized as a single event, i.e., if the removal of a hindrance has some immediate consequences, it is more likely to be categorized as direct causation. The causing and the caused event share the
8. Shibatani and Pardeshi's original figure (2002: 90) does not include the terms causer and causee, but the semantic roles agent and patient. The reason for not using the original terms is that causation, in the view of the present paper, does not need to imply an agentive causer. In a case like "a stone in the garden made him stumble and fall", for example, it is not really convincing to say that the stone is agentive, nor that the causee is fully patientive (cf. Vesterinen 2008).
same spatiotemporal profile. If the causing and caused events have different spatiotemporal profiles, on the other hand, this implies a more indirect kind of causation. This distinction seems to be rather straightforward with the enabling/permissive verb deixar:
(25) As janelas abertas deixam entrar o cheiro do campo.
The windows open let-pres: 3p.p enter-inf: 3p.s the scent of the countryside.
The open windows let the scent of the countryside enter. [http://www.quinta-de-s-lourenco.pt/]
(26) Em plena rua Ivens em Lisboa, uma jovem deixa cair uma pasta cheia de papéis em frente de outra jovem.
In middle street Ivens, in Lisbon, a young girl let-pres: 3p.s fall-inf: 3p.s a folder full of papers in front of another young girl.
In the middle of Ivens street, a young girl drops a folder full of paper in front of another young girl. [http://www.seleccoes.pt/Revista/detalhe.asp?tipo=detalhe&rea=16&ID=5632&Grupo=77]
Both (25) and (26) designate physical causal relations in which the causee has some intrinsic tendency towards motion. In (25), the scent of the countryside has a propensity for extension and, therefore, the open windows directly allow it to enter the room. The meaning of deixar in this context would be more or less "not impede".9 Likewise, the causer in (26) loses her grip on the causee (the folder) and, obeying the laws of gravity, the folder immediately starts falling. In other words, there is a spatiotemporal overlap between the causing and the caused events. This situation certainly reflects a direct causation, designated by the structure main verb + infinitive complement. In fact, a finite complement in these examples would produce some semantic concerns. In (25), for example, the semantic considerations are highly related to a conflict between a non-human causer and a more elaborated causation including mental processes and reasoning. A finite complement in (26), in turn, seems to designate a mental process. The modification of (25–26) into structures with finite complements illustrates this difference:
9. See Silva (1999) for a detailed examination of the semantics of deixar. Silva distinguishes three basic meanings of deixar with verbal complements: the permissive act of allowing something, the ending of an impediment, and not impeding an event.
(27) ?As janelas abertas deixam que entre o cheiro do campo
*The open windows let that enter-pres.subj: 3p.s the scent of the countryside
(28) Uma jovem deixa que uma pasta cheia de papéis caia em frente de outra jovem
*A young girl lets that a folder full of paper drop-pres.subj: 3p.s in front of another young girl
The use of a finite complement in (27) creates a sentence that, in fact, seems semantically strange. This strangeness is probably due to the relation between finite complements and mental processes (cf. Vesterinen 2008). If (27) designated a process in which a human causer opens a window in order to get some fresh air, the use of a finite complement would be expected. In the present case, however, neither the causer (the open windows) nor the causee (the scent) fulfils the qualification of being an entity capable of mental processes and reasoning. In (28), by contrast, the human causer is unproblematic, but the finite complement seems to change the semantics of the sentence drastically. The most plausible interpretation is that the causer (uma jovem) has the possibility to impede the folder from falling but decides not to do so. The examples studied above (25–28) illustrate that direct causation with the permissive and/or enabling verb deixar matches the structure main verb + infinitive complement. The following examples (29–30) designate a more mental and indirect causation. Thus, the use of a finite complement is not surprising:
(29) Às vezes as pessoas deixam que os mesmos problemas as tornem infelizes por anos.
Sometimes the persons let-pres: 3p.p that the same problems them make-pres.subj: 3p.p unhappy for years.
Sometimes people let the same problems make them unhappy for years.
(30) Como era uma pessoa simples e prestativa deixava que na altura as pessoas levassem da sua mercearia produtos por fiado.
Because be-past: 3p.s a person modest and helpful let-past: 3p.s that at the time the persons take-past.subj: 3p.p from the his/her grocer's shop products for trust-part.
Being a modest and helpful person he/she let people buy on credit in his/her grocer's shop. [http://www.folgosinho.com/albertino/historial.html]
The first case (29) designates a causal relation in which problems successively put the causee in a state of misery. This mental process is indeed of a more diffuse and indirect character than the direct physical causation attested in the cases with infinitive complements. Further, it involves the conceptualization of two different events: the emergence of a problem and the state of unhappiness. The following example (30) shares the feature of a more complex causal event structure. Indeed, knowing that there is a possibility to buy something on credit leads to the reconsideration of the offer and, further on, to the event of accepting, or not accepting, the offer. To summarize, a more diffuse and indirect causation is coded by the structure main verb + finite complement. The same pattern prevails regarding the causation verb fazer. Direct causation, often of a physical kind, is designated by infinitive complement structures, whereas mental and indirect causation seems to require the structure fazer + finite complement. The following examples will suffice to illustrate this difference:
(31) Aos 33 minutos, Bruno num pontapé de ressaca faz a bola sair rente ao poste.
To the 33 minutes, Bruno in a kick of hangover make-pres: 3p.s the ball go out-inf: 3p.s close to the goal.
In the 33rd minute, Bruno just misses the goal with a hangover-like toe-poke [Viana Diário-N0901-1]
(32) O aumento do contacto com cenas violentas faz com que a criança reaja mais tardiamente a pedir ajuda ou a intervir para apaziguar uma luta entre outras crianças.
The rise of the contact with scenes violent make-pres: 3p.s with that the child react-pres.subj: 3p.s more later to ask-inf: 3p.s help or to intervene-inf: 3p.s to calm-inf: 3p.s a fight between other children.
The increased contact with violent scenes makes the child react later when asking for help or intervening in order to calm down a fight
between other children. [http://www.medicosdeportugal.iol.pt/action/2/cnt_id/72/]
Thus, the physical direct causation of hitting a ball and thereby making it move in a certain direction in example (31) is designated by the structure fazer + infinitive complement. Conversely, in the following example (32), the structure fazer + finite complement designates a mental and more indirect type of causation. Obviously, the examples also differ in spatiotemporal profile. In (31), it is difficult to distinguish the causing and caused event from one another. There is, so to speak, a temporal overlap between them. In (32), on the other hand, it is obvious that a prior event (the contact with violent scenes) is said to result in a later event (a certain loss of empathy in witnessing authentic violence).10 It is also worthy of notice that the structures with an infinitive complement tend to designate emotional events that are understood as being more direct and uncontrollable. In these cases, the causing event creates a direct feeling, which seems to enter the causee before the acts of mental processes and reasoning (cf. D'Andrade 1987). The finite complements, on the contrary, designate actions that are consequences of a prior feeling:
(33) O primeiro dente, o primeiro cocó e xixi no bacio, as primeiras palavras, os primeiros passos e tudo o que faz os pais orgulharem-se dos filhos.
The first tooth, the first poo and pee in the diaper, the first words, the first steps and everything the that make-pres: 3p.s the parents proud-inf: 3p.p refl. of the children.
The first tooth, the first poo and pee in the diaper, the first words, the first steps and everything that make the parents proud of their children. [http://www.jornaldeleiria.pt/index.php?article=7186&visual=2]
10. Another piece of support for a correspondence between finite and infinitive complements, on the one hand, and direct and indirect causation, on the other, is found in Silva (2005). As Silva comments, examples like:
?O Jorge deixa que/faz com que a Maria parta neste momento
*George lets that/makes with that Maria leave-pres.subj: 3p.s in this moment
have a minor acceptance in Portuguese. This minor acceptance could easily be explained by the notion that an expression that signals temporal contiguity (neste momento) is not compatible with a structure that designates two different events, i.e., the structure deixar/fazer + finite complement.
(34) Em 1926 fora escolhido, pela primeira vez, para Ministro das Finanças, mas um desentendimento com o primeiro-ministro da altura fez com que se demitisse.
In 1926 be-perf.cont: 3p.s. elect-part., for the first time to Minister of the Finances, but a misunderstanding with the Prime Minister of the time make-past: 3p.s with that refl. resign-past.subj: 3p.s.
In 1926 he had been elected Minister of Finance, but a misunderstanding with the Prime Minister of the time made him resign. [http://www.arqnet.pt/portal/discursos/Abril01.html]
In (33), the parents experience an emotion of pride by witnessing events that are connected to early childhood. In this particular case, the emotional bond between the parents and the child, accompanied by the external stimulus that the child produces, renders any act of reasoning unnecessary. Instead, the stimulus creates a direct feeling of pride. In the following case (34), however, the causal relation is of a complex and indirect character. The misunderstanding between the Minister of Finance and the Prime Minister is most likely to produce a feeling of discontent in the latter. This feeling leads to a mental process in the Minister of Finance and, subsequently, to his decision to resign from his position. The indirect causation that the Portuguese verbs deixar/fazer + finite complement designate frequently seems to display another feature. The relation between the causing and the caused event is not as easily detected in the outside world as in direct causation. Rather, it is the conceptualizer who connects two different events and infers a causal relation between them. This certainly seems to be the case in (34) where a misunderstanding is said to cause the resignation of the Minister of Finance, or in (32) where increased contact with violent scenes is said to cause a loss of empathy in children. This also seems to happen in example (29) where problems are claimed to cause unhappiness. Finally, in (30), the conceptualizer may reach the conclusion that the owner of the grocer's shop allows her/his customers to buy on credit after witnessing this act on several occasions. The following cases (35–38) exemplify the inferential side of causation:
(35) São pessoas que deixam que as coisas lhes aconteçam em vez de assumirem o controlo da situação.
Be-pres: 3p.p persons that let-pres: 3p.p that the things them happen-pres.subj: 3p.p in turn of take-inf: 3p.p the control of the situation.
They are persons that let things happen to them, instead of taking control over the situation [http://www.maxima.pt/feminino/actriz.shtml]
(36) Seus parentes e antigos proprietários deixaram que a caldeira se entulhasse, que a casa da azenha se destelhasse e ficassem apenas as paredes em pé.
Their relatives and former owners let-past: 3p.p that the boiler refl. crammed, that the watermill refl. lose the tiles-past.subj: 3p.s and stay-past.subj: 3p.p only the walls in foot.
Their relatives and former owners let the boiler get overfilled, and the roof of the watermill lost the tiles, which left only the walls standing. [Diário de Aveiro-N0470-1]
(37) Temos três factores: pinhal, praia e mar, que fazem com que milhares de pessoas nos visitem durante todo o ano.
Have-pres: 1p.p three factors: pine forest, beach and sea that make-pres: 3p.p with that thousands of persons us visit-pres.subj: 3p.p during all the year.
We have three factors: the pine forest, beach and sea, which make thousands of people visit us all year round. [Diário de Aveiro-N3754-1]
(38) O facto de jogarmos com a Naval, atendendo à proximidade geográfica e alguma rivalidade existente faz com que os jogadores tenham alguma ansiedade.
The fact of play-inf: 1p.p with the Naval, pay attention-prog. to the closeness geographical and some competition existing make-pres: 3p.s with that the players have-pres.subj: 3p.p some anxiety.
The fact that we are playing against Naval, bearing in mind the geographical closeness and the rivalry between the teams, makes the players anxious. [Diário de Coimbra-N0685-1]
In (35), the conceptualizer construes a causal relation where some people mentally, probably in a subconscious way, allow bad things to happen to them. What seems to be the issue, then, is that he/she connects a mental state to the occurrence of bad things happening. In the same spirit, the conceptualizer of (36) connects the former owners of a mill to the actual state of the same. This connection entails a mental process where the conceptualizer compares the actual state of the building with the previous one. Thereafter, he/she infers that the former owners are responsible for the current condition of the mill. The following examples (37–38), with the structure fazer + finite complement, share this inferential feature. The first example (37) designates a causal relation in which people are attracted to a certain place. The beauty and pleasure of the pines, the beach and the sea are claimed to create a mental process in the causee and as a result he/she decides to visit the place. It is interesting to note, however, that there may be many different reasons to visit a place but, in this case, the conceptualizer connects the three factors mentioned above to the causee's action. Finally, in the subsequent case (38), the conceptualizer relates the cause of the players' anxiety to the fact that they will meet a team from a nearby region. This causal relation may very well be true, but the issue at stake is that the conceptualizer is the one responsible for creating this causal relation. In other words, the common denominator of examples (35–38) is that the conceptualizer connects two separate events in the outside world as having a causal relation.
As has been seen, the same principles that account for the relation between finite and infinitive complements of perception verbs may also explain the complements of causation verbs. The infinitive complement does not create any relation to the ground. Instead, the complement event is reached through the main verb and, as a consequence, the structure causation verb + infinitive complement designates a single event in which the causer's activity and the caused event are not clearly distinguishable from one another. This situation is indeed in contrast with the causation designated by the structure causation verb + finite complement. In these cases, the ground has access to the complement event and may conceptualize it as an event on its own. Further, the grounding of the complement makes it available for epistemic judgements and propositional attitudes (cf. Langacker 2004, 2008). Therefore, the structure causation verb + finite complement does not designate direct causation, but the conceptualizer's construal of causal relations in the world.
5. Iconicity and Subjectification
The conceptual differences between finite and infinitive complements of causation/perception verbs analysed in the previous sections are not coincidental. Rather, they are highly motivated and reflect the iconic character of language. Moreover, they seem to display a remarkable connection between iconicity and subjectification. To begin with, the infinitive complement designates a prototypical perception or causation in its relation to the main verb. The main verb subject of the perception verb sees, hears or feels something physically and directly, and the main verb subject of the causation verb directly causes the complement event. Thus, a lesser degree of formal complexity in the structure perception/causation verb + infinitive complement matches a lesser degree of conceptual complexity. Conversely, a formally more complex structure, i.e., main verb + finite complement, designates an indirect and more complex kind of perception/causation. The perception verb does not merely designate direct perception, but mental processes and inferential structures. Further, the causation designated by main verb + finite complement is of a more indirect and inferential kind. These conceptual differences are also reflected by the formal distance between the main verb and the complement verb. A smaller formal distance between the main verb and the infinitive complement verb reflects a conceptual closeness between the events described by these verbs. Therefore, it is not difficult to understand the reason why this structure designates direct perception and direct causation. Likewise, a greater formal distance between the main verb and the finite complement, effectuated by the complementizer que, and com que with the causation verb fazer, correlates with a greater conceptual distance between the events described by these linguistic units.
As a consequence, the structure main verb + finite complement does not designate direct perception or direct causation, but a more complex relation between the main event and the complement event, i.e., mental processes and inferential structures. Underlying these differences is the fact that finite complements create a relation to the ground and therefore may designate an event distinct from the main verb event. The complement event has its own spatiotemporal profile. Additionally, the grounding of the complement event makes it accessible for epistemic judgements and reasoning. Therefore, by attributing inference to the main verb subject of a perception verb, the conceptualizer gives evidence of his own inferences. In addition, the structure causation verb + finite complement designates the conceptualizer's reasoning and inferences about causal relations in the world. Moreover, the inferential character of the structure main verb + finite complement leads to a higher degree of subjectification. To begin with, the inference implied in the finite structures actually fits extremely well with Tendency 1 and Tendency 3 in Traugott's analysis of subjectification (cf. Traugott 1989). The perception verb does not describe the main verb subject's direct and physical perception, but rather the speaker's internal evaluation of the situation. Similarly, the causation expressed by the structure with a finite complement is not based on an external situation, but on an internal (evaluative) one. This is in accordance with Tendency 1. That being so, the structure main verb + finite complement describes an event based on the speaker's subjective belief state towards the proposition, that is, Tendency 3 in Traugott's framework. Turning to subjectification in the Langackerian framework, the internal evaluation of a two-event structure involves the conceptualizer's mental scanning between these two events. Therefore, in designating an inferential relation between the main verb and the complement event, the structure main verb + finite complement implies a change in perspective and in locus of activity. The perception verb does not designate the main verb subject's perception. Instead, it expresses this participant's inferences about the outside world. Needless to say, this entails an inferential process in the conceptualizer. When the conceptualizer claims that the main verb subject infers something, it is actually the conceptualizer who is the ultimate source of the inferential process. The same pattern holds for the causation verbs. In order to detect the cause of an event, the conceptualizer scans mentally between two events that might be, but do not have to be, connected in a causal relation. Indeed, the mental character of indirect causation implies that the events might not even be visible in the outside world. Therefore, in terms of a causal force, there is no actual movement in the situation.
The only movement is the mental scanning of the conceptualizer, i.e., the participant who is creating a causal relation between two events. Consequently, the structure perception/causation verb + finite complement may be equated with semantic bleaching or fading away. The conceptualizer emerges, or becomes more evident, when the objectively construed counterpart fades away (Langacker 2006: 21). This phenomenon is consistent with a situation where the main verb subject's perception becomes the conceptualizer's inference, and where direct causation detected in the outside world is replaced by indirect and inferred causation. This also entails attenuation, that is, a loss of subject control and a shift in domain from an active subject to the conceptualizer of the event (cf. Langacker 1999). Finally, the grounding of the complement event implies that the conceptualizer is subjectively construed. In sum, Portuguese complement structures display a rather straightforward connection between iconicity and subjectification. The structure main verb + finite complement, being more elaborated and complex, and exhibiting a greater distance between the main event and complement event, also designates a more indirect kind of perception/causation. Further, this indirect perception/causation, often with inferential features, seems to entail a higher degree of subjectification in the finite structures than in the infinitive ones.
6. Conclusion
The present paper has shown that the existence of finite and infinitive complement structures of perception and causation verbs in Portuguese is highly motivated by their conceptual differences. It has also been shown that these conceptual differences reflect the iconic nature of language: formal complexity and distance tend to correlate with conceptual complexity and distance. The structure perception/causation verb + finite complement does not designate direct perception or direct causation in prototypical cases, but mental and inferential processes. This phenomenon has been conceived of as a prime example of subjectification. From Traugott's perspective, it subsumes internalization, propositional attitude and subjective judgements about how the outside world is shaped. From Langacker's point of view, it presupposes the mental scanning of a subjectively construed conceptualizer. The present study has also provided evidence for a relation between linguistic iconicity and subjectification in Portuguese complement structures. This relation indeed raises further and more general questions regarding language structure, linguistic iconicity and subjectification.
Received 28 May 2008
Revision received 22 February 2010
References
Achard, Michel. 2000. Construal and complementation in French. In: Kaoru Horie (ed.), Complementation. Cognitive and Functional Perspective, 91–120. Philadelphia: John Benjamins Publishing Company.
Achard, Michel. 2002. Causation, constructions, and language ecology: An example from French. In: Masayoshi Shibatani (ed.), The grammar of causation and interpersonal manipulation, 127–155. Amsterdam/Philadelphia: John Benjamins Publishing Company.
Brito, Ana Maria. 1995. Algumas considerações sintácticas do português no quadro das línguas românicas: Sujeito nulo, Infinitivo Flexionado e Clíticos Nominativos [Some considerations regarding Portuguese syntax in the context of the romance languages: null subject, inflected infinitive and nominative clitics]. Lusorama 27. 17–27.
Caetano Silveira, Jane R., Luciene Simões, Sabrina Abreu, Gisela Collishonn and Delzimar Lima. 1994. O infinitivo flexionado em português: um reestudo de Raposo (87) [The Portuguese inflected infinitive: a restudy of Raposo (87)]. Letras de Hoje 96. 135–146.
D'Andrade, Roy. 1987. A folk model of the mind. In: Dorothy Holland and Naomi Quinn (eds.), Cultural Models in Language and Thought, 113–147. Cambridge: Cambridge University Press.
Givón, Talmy. 1993. English Grammar: A function-Based Introduction, Vol. II. Amsterdam/Philadelphia: John Benjamins Publishing Company.
Givón, Talmy. 2001. Syntax, Vol. 2. Amsterdam/Philadelphia: John Benjamins Publishing Company.
Haiman, John. 1980. The iconicity of grammar: isomorphism and motivation. Language 56 (3). 515–540.
Stockholm University
Iconicity and subjectification in Portuguese complementation 599

Haiman, John. 1985. Natural Syntax: Iconicity and erosion. Cambridge: Cambridge University Press. Kemmer, Suzanne and Arie Verhagen. 1994. The grammar of causatives and the conceptual structure of events. Cognitive Linguistics 5 (2). 115156. Langacker, Ronald W. 1987. Foundations of Cognitive Grammar, vol. 1Theoretical Prerequisites. Stanford: Stanford University Press. Langacker, Ronald W. 1990. Subjectification. Cognitive Linguistics 1 (1). 538. Langacker, Ronald W. 1991. Foundations of Cognitive Grammar, vol. 2Descriptive Application. Stanford: Stanford University Press. Langacker, Ronald W. 1999. Grammar and Conceptualization. Berlin/New York: Mouton de Gruyter. Langacker, Ronald W. 2003. Extreme subjectification: English tense and modals. In: Hubert Cuyckens, Thomas Berg, Ren Dirven and Klaus-Uve Panther (eds.), Motivation in language. Studies in honor of Gnter Radden, 326. Amsterdam/ hiladelphia: John Benjamins Publishing P Company. Langacker, Ronald W. 2004. Aspects of the Grammar of Finite clauses. In: Michel Achard and Suzanne Kemmer (eds.), Language, Culture and Mind, 535577. Stanford: CSLI Publications. Langacker, Ronald W. 2006. Subjectification, grammaticization and conceptual archetypes. In: Angeliki Athanasiadou, Costas Canakis and Bert Cornillie (eds.), Subjectification. Various Paths to Subjectivity, 1740. Berlin/New York: Mouton de Gruyter. Langacker, Ronald W. 2008. Cognitive grammar: a basic introduction. New York: Oxford University Press. Maldonado, Ricardo. 1995. Middle-Subjunctive Links. In: Peggy Hamispour, Ricardo Maldonado and Margaret Van Naerssen (eds.), Studies in language learning and Spanish Linguistics in honor of Tracy D. Terrell, 399418. New York: McGraw Hill. Maldonado, Ricardo and E. Fernando Nava L. 2002. Tarascan causatives and event complexity. In: Masayoshi Shibatani (ed.), The grammar of causation and interpersonal manipulation, 157 195. Amsterdam/Philadelphia: John Benjamins Publishing Company. Perini, Mrio A. 1977. 
Gramtica do infinitivo portugus [Grammar of the portuguese infinitive]. Petrpolis: Editora Vozes Limitada. Raposo, Eduardo. 1987. Case theory and Infl-to-Comp: The inflected infinitive in European Portuguese. Linguistic Inquiry 18 (1). 85109. Shibatani, Masayoshi. 1976. The grammar of causative constructions: A conspectus. In: Masayoshi Shibatani (ed.), Syntax and Semantics, Vol. 6, The Grammar of Causative Constructions, 140. New York/San Francisco/London: Academic Press. Shibatani, Masayoshi. 2002. Introduction. Some basic issues in the grammar of causation. In: Masayoshi Shibatani (ed.), The grammar of causation and interpersonal manipulation, 122. Amsterdam/ hiladelphia: John Benjamins Publishing Company. P Shibatani, Masayoshi and Prashant Pardeshi. 2002. The causative continuum. In: Masayoshi Shibatani (ed.), The grammar of causation and interpersonal manipulation, 85125. Amsterdam/ hiladelphia: John Benjamins Publishing Company. P Silva, Augusto Soares da. 1999. A Semntica de Deixar: uma Contribuio para a Abordagem Cognitiva em Semntica Lexical [The semantics of deixar: a contribution to the cognitive approach of lexical semantics]. Braga: Fundao Calouste Gulbenkian. Silva, Augusto Soares da. 2004. Imagery in Portuguese causation/perception constructions. In: Barbara Lewandowska-Tomaszczyk and Alina Kwiatkowska (eds.), Imagery in Language. Festschrift in Honour of Professor Ronald W. Langacker, 297319. Frankfurt/Main: Peter Lang Publishing Group. Silva, Augusto Soares da. 2005. Revisitando as construes causativas e perceptivas do Portugus: significado e uso [Revisiting the causation and perception constructions in Portuguese: meaning
600
R. Vesterinen
and use]. In: Ins Duarte and Isabel Leiria (eds.), Actas do XX Encontro Nacional da Associao Portuguesa de Lingustica, 855874. Lisbon: Associao Portuguesa de Lingustica. Song, Grace and Philip Wolff. 2004. Linking Perceptual Properties to Linguistic Expressions of Causation. In: Michel Achard and Suzanne Kemmer (eds.), Language, Culture and Mind, 237 250. Stanford: CSLI Publications. Traugott, Elizabeth. 1989. On the rise of epistemic meaning in English: An example of subjectification in semantic change. Language 65 (1). 3155. Traugott, Elizabeth. 1995. Subjectification in grammaticalisation. In: Dieter Stein and Susan Wright (eds.), Subjectivity and subjectivisation, 3155. Cambridge: Cambridge University Press. Traugott, Elizabeth. 1996. Subjectification and the development of epistemic meaning: The case of promise and threaten. In: Toril Swan and Olaf Jansen Westvik (eds.), Modality in Germanic Languages. Historical and Comparative Perspectives, 185210. Berlin/New York: Mouton de Gruyter. Verhagen, Arie. 2005. Constructions of intersubjectivity: discourse, syntax and cognition. New York: Oxford University Press. Verhagen, Arie and Suzanne Kemmer. 1997. Interaction and causation: Causative constructions in modern standard Dutch. Journal of Pragmatics 27 (1). 6182. Verspoor, Marjolijn. 2000. Iconicity in English complement constructions. Conceptual distance and cognitive processing levels. In: Kaoru Horie (ed.), Complementation. Cognitive and Functional Perspective, 205231. Philadelphia: John Benjamins Publishing Company. Vesterinen, Rainer. 2006. Subordinao adverbialum estudo cognitivo sobre o infinitivo, o cltico SE e as formas verbais finitas em proposies adverbiais do Portugus Europeu [Adverbial subordination: a cognitive study on the infinitive, the clitic SE and finite verb forms in European Portuguese Adverbial clauses]. Stockholm: Stockholm University dissertation. Vesterinen, Rainer. 2007a. 
A variao entre proposies adverbiais infinitivas e conjuntivas: subjectificao e espaos mentais [The variation between infinitive and subjunctive adverbial clauses: subjectification and mental spaces]. Diacrtica. Cincias da Linguagem 21 (1). 241 273. Vesterinen, Rainer. 2007b. Complementos finitos e infinitivos dos verbos perceptivos ver, ouvir e sentir: iconicidade lingustica e subjectificao [Finite and infinitive complements of the perception verbs ver, ouvir and sentir: linguistic iconicity and subjectification]. Revista Portuguesa de Humanidades. Estudos Lingusticos 11 (1). 251283. Vesterinen, Rainer. 2008. Direct, Indirect and Inferred Causation: Finite and Infinitive Complements of Deixar and Fazer. Journal of Portuguese Linguistics 7 (1). 2350.
Figurative Language Understanding in LCCM Theory

VYVYAN EVANS*
Abstract While cognitive linguists have been successful at providing accounts of the stable knowledge structures (conceptual metaphors) that give rise to figurative language, and the conceptual mechanisms that manipulate these knowledge structures (conceptual blending), relatively less effort has been thus far devoted to the nature of the linguistic mechanisms involved in figurative language understanding. This paper presents a theoretical account of figurative language understanding, examining metaphor and metonymy in particular. This account is situated within the Theory of Lexical Concepts and Cognitive Models (LCCM Theory). LCCM Theory (Evans 2006, 2009b) is a cognitively realistic model of lexical representation and semantic compositionality, providing, it is argued, an account of figurative language which complements the backstage cognition perspective of Conceptual Blending Theory. It also integrates the notion of conceptual metaphor within the account provided of figurative language understanding. The paper introduces the key mechanisms involved in figurative language understanding arising from language use. The paper also provides a programmatic account of how conceptual metaphors are integrated with linguistic knowledge in figurative language use. It is argued the present proposals flesh out a key aspect of the conceptual integration perspective promoted by Fauconnier and Turner, with which LCCM Theory is continuous. In part, the paper attempts to advance the prospect of a joined up cognitive linguistic account of figurative language understanding.
* Address for correspondence: School of Linguistics, Main Arts Building, Bangor University, College Road, Bangor, LL57 2DG, UK. Email: v.evans@bangor.ac.uk. Web: www.vyvevans.net. Acknowledgements: I am grateful for extremely helpful comments on an earlier version of this paper by three anonymous referees. I also gratefully acknowledge very helpful comments on two earlier versions of this paper by an anonymous Associate Editor for Cognitive Linguistics. Cognitive Linguistics 21–4 (2010), 601–662. DOI 10.1515/COGL.2010.020. 0936–5907/10/0021–0601 © Walter de Gruyter
Keywords: LCCM Theory, linguistic metaphor, conceptual metaphor, metonymy, frontstage cognition, backstage cognition, meaning construction, Conceptual Metaphor Theory, Conceptual Blending Theory, Cognitive Grammar.
1. Introduction

The cognitive linguistics enterprise has provided an approach to studying human imagination, arguing that language reveals systematic processes at work (Evans and Green 2006). Cognitive linguists have argued that such processes are central to the way we think (e.g., Coulson 2000; Evans 2004, 2009b; Fauconnier 1997; Fauconnier and Turner 2002; Lakoff and Johnson 1999; Turner 1996). One way in which cognitive linguists have approached the role of imagination in human thought has been by positing relatively stable knowledge structures which are held to inhere in long-term memory. These knowledge structures are termed conceptual metaphors (Lakoff and Johnson 1980, 1999) and are claimed to have psychological reality.1 In addition, conceptual metaphors are held to be manipulated by a dynamic meaning construction process: conceptual blending (Coulson 2000; Fauconnier and Turner 1998, 2002, 2008; Grady 2005).

These structures and processes have predominantly been studied by examining systematicities in figurative language, particularly within the framework of Conceptual Metaphor Theory (Lakoff and Johnson 1980, 1999). George Lakoff and Mark Johnson, the proponents of the study of conceptual metaphor, argue that figurative language is a consequence of the existence of a universal set of pre-linguistic primary metaphors (Lakoff and Johnson 1999; see also Grady 1997), and a language-specific set of conceptual metaphors, both of which map structure from more concrete domains of conceptual structure, referred to as source domains, onto less easily apprehended aspects of conceptual structure, referred to as target domains. Together these knowledge structures are held to give rise both to the productive use of figurative language and to more creative aspects, such as poetic metaphor (see Lakoff and Turner 1989).
More recently, it has been argued that conceptual metaphors have a neural instantiation (see discussion in Feldman 2006; Gallese and Lakoff 2005; Lakoff 2008; Lakoff and Johnson 1999).

While the success of both Conceptual Metaphor Theory and Conceptual Blending Theory provides the backdrop for the discussion in this paper, the analyses presented here are orthogonal to these approaches. Moreover, in certain respects, the present approach seeks to nuance the approaches (and theoretical constructs) developed by these theories (as discussed in detail later in the paper). In particular, Conceptual Metaphor Theory is not primarily (if at all) a theory about metaphor understanding in language. Rather, Conceptual Metaphor Theory has traditionally been concerned with the nature and the level of the various cognitive representations that serve to structure target domains in terms of source domains. That is, Conceptual Metaphor Theory is a theory concerned with backstage cognition: the role of the non-linguistic conceptual processes that facilitate meaning construction behind the scenes, so to speak.2 Analogously, Conceptual Blending Theory (Coulson 2000; Fauconnier and Turner 2002, 2008) is concerned with the conceptual processes involved in meaning construction, viewing language as providing impoverished prompts for semantic compositionality. For Fauconnier and Turner, what is really interesting about figurative language phenomena are the conceptual (rather than linguistic) processes that lie hidden from view, behind the scenes.

In addition to the backstage cognition perspective, (cognitive) linguists require, I suggest, a theoretical account that models how language deploys and interfaces with the non-linguistic knowledge structures (the conceptual metaphors) and the conceptual mechanisms of meaning construction (the process of conceptual integration or blending) during the process of figurative language understanding. That is, we require a theory that addresses frontstage cognition: an account that is concerned with the role of linguistic prompts and linguistic processes of semantic composition in figurative language understanding.

1. For discussion of the psychological reality of conceptual metaphors see, for example, Boroditsky (2000); Casasanto (2010); Casasanto and Boroditsky (2008); Evans (To appear); Gentner et al. (2002); Núñez et al. (2006); and Gibbs (1994).
Moreover, such an account must remain consonant with what is known about the structures and processes involved in figurative thought, in the light of the research programmes of Lakoff and Johnson, and Fauconnier and Turner, as well as others. That is, such an account of figurative language understanding must be psychologically plausible. I discuss this, below, in terms of findings concerning processing issues in figurative language comprehension.

In this paper I argue for a new (or at least a newly nuanced) perspective on the nature of semantic compositionality in figurative language. I do so by applying the Theory of Lexical Concepts and Cognitive Models (LCCM Theory for short) in order to provide the theoretical context for the account of figurative language understanding that I develop. The specific mechanisms I propose here are an attempt to model the interaction between linguistic knowledge and conceptual knowledge during the process of figurative language understanding. Another way of thinking about the proposals elaborated on below is that the present paper represents an attempt to provide the first detailed account of the processes involved in (linguistically-mediated) composition, in Fauconnier and Turner's (2002) terms, during conceptual blending.

Thus, while LCCM Theory (Evans 2006, 2009b) models lexical representation, it is also concerned with the way in which lexical concepts interface with non-linguistic knowledge. As such, it addresses the thorny issue of semantic compositionality. In general terms, the LCCM worldview holds that meaning arises through integration. Hence, it meshes with, and as I argue later, is continuous with, the conceptual blending research programme.

My purpose here is not to elaborate the LCCM perspective in detail (see Evans 2009b for a book-length treatment). In this paper, I apply LCCM Theory to figurative language understanding. Once I have introduced the key mechanisms provided by LCCM Theory in contributing to figurative language understanding, I return to the issue of how the LCCM perspective interfaces with conceptual metaphors. I also consider how LCCM Theory fleshes out one aspect of the semantic integration perspective advanced by Conceptual Blending Theory: the role of linguistic knowledge in semantic (and conceptual) composition. Finally, I consider the way in which LCCM Theory contrasts with Cognitive Grammar (Langacker 1987, 1991, 2008).

As the viewpoint I take in this paper is a frontstage cognition perspective, being concerned with semantic compositionality from the viewpoint of (figurative) language, rather than a backstage cognition perspective, concerned with the non-linguistic knowledge structures implicated (conceptual metaphors), I am primarily concerned with (figurative) language. My objects of study are termed linguistic metaphors and linguistic metonymies, to contrast them with non-linguistic knowledge structures, such as conceptual metaphors.3 A linguistic metaphor, as I use the term, relates to an utterance-specific metaphoric conception. That is, it is a metaphor that resides in (and emerges from) a situated (and hence contextualised) instance of language use. Linguistic metaphors may draw upon non-linguistic knowledge (including conceptual metaphors). As I shall argue in section 6, below, linguistic metaphors draw on other sorts of knowledge too.

2. It was Fauconnier who coined the term backstage cognition: see Fauconnier (1994, 1997). For detailed discussion of the distinction between frontstage cognition and backstage cognition see Evans (2009b).

3. This distinction is in fact well established in the literature. For instance, scholars in the psycholinguistic tradition (e.g., Gentner 2001; Gentner and Bowdle 2008; Glucksberg 2001, 2008) are primarily concerned with linguistic metaphors, although they are concerned with the comprehension (and hence conceptual) strategies involved in the understanding of linguistic metaphors. In contrast, Lakoff and Johnson (1980, 1999) are concerned with conceptual metaphors, a level of metaphoric representation that does not rely on language. The terms linguistic metaphor and mental metaphor have been used in the literature previously by Daniel Casasanto (2010), to distinguish between the divergences that abound in mental representation and language use, in the realm of figurativity.
The paper is structured as follows. In the next section I introduce the figurative language phenomena for which I present an account. In section 3, I provide an overview of the theoretical perspective which provides the basis for the analysis: LCCM Theory. In section 4, I present an analysis of the distinction between literal and figurative forms in language understanding from the frontstage cognition perspective of LCCM Theory. Section 5 addresses the distinction between linguistic metaphor and linguistic metonymy from the perspective of LCCM Theory. In section 6, I examine the way in which LCCM Theory complements other approaches in cognitive linguistics, before providing a brief conclusion in section 7.

2. Phenomena to be accounted for

In the present paper I am concerned with providing a theoretical account of two related issues. Firstly, I address the factors that give rise to figurative language, and pinpoint differences in terms of the linguistic mechanisms involved in figurative versus literal language understanding. To do so, I examine recent research on the processing of figurative and literal language from the perspective of psycho- and neurolinguistics. Findings here suggest that, in processing terms at least, the traditional view (e.g., Grice 1975; Searle 1979) of a neat distinction between literal and figurative language is untenable. I argue that the difference between figurative and literal language is a consequence of three distinct factors modelled by LCCM Theory, which account for the various findings to emerge on differences (and similarities) between the way in which literal and figurative language are processed by the mind/brain. Secondly, I am concerned with accounting for the distinction between two of the best studied types of figurative phenomena in cognitive linguistics, metaphor and metonymy.
My focus is less on the distinction between metaphor and metonymy as conceptual phenomena (a backstage cognition perspective) than on the way in which one might account for such figurative phenomena in terms of a theoretical account of language understanding (frontstage cognition). Hence, I am concerned with developing a theoretical account of how language users marshal linguistic and non-linguistic structures and mechanisms in the course of interpreting specific figurative utterances. In the remainder of this section I elaborate on the nature of literal versus figurative language, and metaphor versus metonymy, the sets of phenomena for which I develop an account.

2.1. Literal versus figurative language
The standard pragmatic view holds that there is a neat distinction between literal and figurative language (Grice 1975; Searle 1979). For instance, a putatively figurative expression such as My boss is a pussycat would first involve processing and then rejecting a literal interpretation (sentence meaning). A second stage would then be required, where communicative principles are deployed in order to interpret the speaker's intention (speaker meaning), giving rise to a figurative meaning. Such a view makes the following assumptions:

i) Literal language is processed more quickly than figurative language.
ii) Literal language is processed automatically while figurative language is not. If a literal conception is available no further processing is required.
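The two-stage procedure attributed to the standard pragmatic view can be made concrete with a small sketch. The Python fragment below is purely illustrative: the names Reading, Result and interpret_two_stage, the plausible flag, and the glosses are assumptions introduced here, not constructs taken from Grice, Searle or LCCM Theory. It shows only why this view predicts that a figurative interpretation always requires an additional processing stage.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Reading:
    gloss: str        # informal paraphrase of the reading (illustrative only)
    plausible: bool   # hypothetical flag: does the reading fit the context?


@dataclass(frozen=True)
class Result:
    gloss: str
    figurative: bool
    stages: int       # number of processing stages the model consumed


def interpret_two_stage(literal: Reading, figurative: Reading) -> Result:
    """Toy rendering of the standard pragmatic view (an assumption-laden sketch)."""
    # Stage 1: the literal reading (sentence meaning) is always computed first.
    if literal.plausible:
        # Assumption ii): an available literal conception ends processing.
        return Result(literal.gloss, figurative=False, stages=1)
    # Stage 2: only after the literal reading is rejected is speaker meaning derived.
    return Result(figurative.gloss, figurative=True, stages=2)


boss = interpret_two_stage(
    Reading("my boss is a small domestic cat", plausible=False),
    Reading("my boss is docile", plausible=True),
)
lions = interpret_two_stage(
    Reading("those animals are lions", plausible=True),
    Reading("those animals are fierce", plausible=True),
)
print(boss.stages, lions.stages)
```

On this toy rendering the figurative utterance can never be resolved in fewer stages than the literal one, which is the prediction that the processing findings reviewed in this section call into question.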
We now know that the standard pragmatic view, and the assumptions it makes, are in fact false. For instance, research on reading times associated with expressions that can be interpreted both idiomatically and literally (e.g., kick the bucket, spill the beans) has shown that the idiomatic meanings associated with expressions of this kind are understood more quickly than their literal meanings (Gibbs 1980, 1994; Gibbs et al., 1989; Giora et al., 2007). Other comprehension time tasks have shown that well-established metaphors are understood more rapidly than literal paraphrases (see Giora 2008 for a review). Moreover, even novel metaphors can be comprehended as rapidly as comparable literal expressions as long as the novel metaphors are contextually appropriate (Blasko and Connine 1993; see Glucksberg 2008 for discussion).

Further comprehension time tasks have found that just as figurative language can be processed as quickly as literal language, it is also processed automatically, contra the assumption made by the standard pragmatic view. One line of evidence for believing that literal language is processed automatically, without conscious control by the listener (Miller and Johnson-Laird 1976: 166), is due to the well-known Stroop Effect (Stroop 1935). In this classic experiment subjects are asked to identify the colour of coloured cards. When the cards also feature a printed colour word (e.g., red) that fails to correspond to the colour on the card, the word interferes with the processing of the correct colour response, as measured by reaction time. That is, even though the task does not ask subjects to do anything with the printed words, they are automatically processed. In order to test whether figurative language is also processed automatically, Goldvarg and Glucksberg (1998) presented subjects with noun-noun compounds. While some could only be paraphrased literally, others could be paraphrased either literally or metaphorically.
Such examples included shark lawyer, which can be interpreted literally (e.g., a lawyer who acts for an environmental group) or metaphorically (e.g., a lawyer who is predatory and aggressive). If literal meanings, but not metaphorical meanings, are processed automatically, then the literal meaning should be the preferred interpretation. However, when subjects were asked to explain the meaning of such compounds, 75% of the paraphrases produced were found to be metaphorical, even when a literal paraphrase existed. Goldvarg and Glucksberg argue that this finding demonstrates that metaphoric interpretations do indeed arise automatically.

In addition, findings from neurolinguistic research also support the view that metaphoric understanding begins as early in processing as literal understanding. One technique which has been employed to investigate differences between literal and figurative language processing is the measurement of event-related potentials (ERPs). An ERP is a small voltage fluctuation in brain activity that can be measured in a non-invasive way, by having subjects wear a cap fitted with electrodes that measure voltages as they are exposed to linguistic stimuli. A particularly important ERP element is the so-called N400, which peaks approximately 400 ms after exposure to a stimulus. ERPs are plotted on a graph in which the relative amplitude of a given ERP element corresponds to relative electrical activity. The N400 is associated with integration of words or expressions with preceding words. In general terms, the N400 is greater when semantic integration is more difficult, which is interpreted as being an indication of greater processing cost. For instance, in sentences such as those in (1) one would expect the amplitude of the N400 to increase from (1a) to (1d):

(1) a. The gazelles ran for cover when chased by lions
    b. The gazelles ran for cover when chased by rabbits
    c. The gazelles ran for cover when chased by bicycles
    d. The gazelles ran away when chased by jam tarts
The standard pragmatic model, recall, claims that literal language is processed first. When a literal meaning is found to be incongruous, a figurative interpretation commences. In neurolinguistic terms, this model predicts an initial effect of literal incongruity, which should result in an increased N400, followed by a later ERP effect when metaphoric interpretation is activated. Pynte, Besson, Robichon and Poli (1996) tested this prediction by exposing subjects to literal and metaphoric sentences of the sort given in (2):

(2) a. Those animals are lions [literal stimulus]
    b. Those fighters are lions [metaphoric stimulus]
They found that both types of stimuli elicited an N400, with that elicited by the metaphoric stimulus being slightly larger. However, they did not find a subsequent reliable ERP effect. This suggests that while metaphoric integration may involve a different type of processing, the time course is similar to that of literal sentences, contrary to the prediction made by the standard pragmatic model. In the same study, metaphorically true sentences such as (3a) evoked a smaller N400 than literal (but false) sentences such as (3b):
(3) a. The divorce is a nightmare
    b. The divorce is a table
This provides evidence that metaphoric interpretation occurs at least as early as literal processing and can, in fact, be easier to process. Other studies suggest that different types of literal and metaphoric interpretations involve different levels of complexity, in terms of processing. For instance, Coulson and Van Petten (2002) found that while the N400 of literal and metaphoric sentences was qualitatively the same, the amplitude increased as a function of metaphoricity. To illustrate, consider the following sentences:

(4) a. He knows whiskey is a strong intoxicant
    b. He has used cough syrup as an intoxicant
    c. He knows that power is an intoxicant
The first sentence provides a literal reading: whiskey is a strong intoxicant. The second sentence involves understanding cough syrup, which is not normally considered to have an intoxicating effect, as having the properties associated with intoxicants. Hence, the processing of this sentence involves integrating classes of entities that are not normally associated. Finally, the sentence in (4c) is metaphoric in nature, involving an abstract entity, power, which is being ascribed the properties of an intoxicant. Coulson and Van Petten found that the N400 increased from (4a) to (4c), which they interpreted as being a consequence of increased complexity of semantic integration.

The findings briefly discussed above argue against a straightforward distinction, in processing terms, between the literal and the figurative. Coulson (2008) argues that processing costs are a consequence of the relative complexity of the mappings involved in integrating semantic elements. This means that while metaphoric language is often associated with a larger N400, this is not inevitably the case. We saw above, for instance, that metaphorically true assertions are processed more quickly than literally false assertions. Complexity, then, presumably involves not just integration of content from different regions of conceptual space (e.g., from different inputs of an integration network, as in Blending Theory), but successfully integrating semantic content which is in certain respects incongruent. An important consequence of the claim that relative complexity determines processing cost is that there are degrees of complexity, as is evident in the work of Coulson and Van Petten (2002).

In her work, Giora (e.g., 2003, 2008) also argues against assuming a straightforward literal/figurative distinction. She proposes, instead, a salient/non-salient distinction.
Giora suggests that it is relative salience, rather than whether an expression is literal or figurative, which determines whether a particular meaning is processed more quickly. Empirical support for this perspective comes from the finding, discussed above, that idiomatic meanings are processed more quickly than their literal paraphrases. Moreover, novel metaphors (e.g., Her mind is an active volcano) take longer to process than more familiar metaphors (e.g., Children are precious gems) (Pexman et al., 2000), also in keeping with her salient/non-salient distinction.

Despite the foregoing, the fact that a straightforward literal/figurative distinction is not evident in terms of language processing does not rule out the possibility that the distinction holds at the level of knowledge representation. Indeed, I argue below that there is a distinction in terms of the types of knowledge to which words provide access. This corresponds to the literal/figurative distinction. One of the consequences of the perspective I present is that figurativity is seen as a graded phenomenon, which is continuous in nature: interpretations exhibit degrees of figurativity.

Of course, one of the challenges for a theoretical account of figurative language understanding is to deal successfully with the range of empirical findings discussed above. I argue that figurative language understanding is influenced by three factors: levels of knowledge representation, relative salience, and relative complexity. I propose that it is the interaction of these three factors that accounts for the processing findings described above.

2.2. Metaphor versus metonymy
I now turn to the second issue I discuss in this paper. A large body of research in the cognitive science literature has assumed that figurative language is a single monolithic category (Coulson 2008: 191; see Gagnon et al., 2003; Oliveri et al., 2004 for critical reviews). While there are reasons for believing that the distinction between different sorts of tropes (e.g., metaphor vs. metonymy) is slippery (see Barnden 2010), there are nevertheless sound reasons for thinking that the terms metaphor and metonymy relate to prototypes belonging to distinct (albeit overlapping) categories, exhibiting clear differences in terms of form, as well as communicative and discursive function (see, for instance, Gibbs 1994; Cameron 1999, Deignan 2005a, 2005b, Barcelona 2000, Radden and Kvecses 2007, Panther and Thornburg 2003, Steen 2007). My second objective in this paper is to provide an account of the meaning construction processes responsible for the figurative language phenomena often described as constituting metaphor and metonymy. These are exemplified by expressions of the following kind: Metaphor (5) My boss is a pussycat Metonymy (6) The ham sandwich has wandering hands In contemporary language science, metaphor is often understood as involving the interpretation (or conceptualisation) of one entity in terms of something
610
V. Evans
else, as in my boss in terms of a pussycat. Metonymy on the other hand is often taken to relate to a referent other than the one literally designated. For instance, in (6), ham sandwich refers to a customer in a restaurant who happened to order a ham sandwich. Traditionally, linguistic metaphor has been thought of as relating to an implicit comparison.4 Examples such as those in (5), which make use of the predicate nominative (X is a Y) construction, are the kinds of examples that are usually employed to support this perspective, although it is important to observe that metaphoric language with this form, while salient, is but a relatively small subset of the range of metaphoric language commonly used (see Deignan 2005a). Within cognitive linguistics, early research argued that in contrast to metaphor, metonymy is primarily referential in nature, highlighting a particular referent by virtue of activating a contextually salient entity closely associated with the referent in question, sometimes expressed in terms of conceptual contiguity (see Lakoff and Johnson 1980; Lakoff and Turner 1989). For instance, in (6) above, given a restaurant scenario, the food item ordered by a given customer (ham sandwich) is likely, among waiting staff, to be particularly salient, and thus an effective means of identifying a specific referent, in this instance, a particular customer. As this example demonstrates, linguistic metonymy can be referential in nature: it relates to the use of expressions to pinpoint entities in order to talk about them. This shows that (prototypical) metonymy may function differently from metaphor. Hence, while we might informally gloss the function of metonymy as the relation in which Y stands for X, by the same token, metaphor is the relation X understood in terms of Y.5 In this paper, I demonstrate the similarities and differences, in language understanding, between metaphor and metonymy (as prototypes).
3. LCCM Theory: An Overview
The account of figurative language understanding presented in this paper draws upon the Theory of Lexical Concepts and Cognitive Models, or LCCM Theory
4. See Evans and Green (2006) for a review.
5. It is important to note that a range of important work has been carried out on the linguistic function and conceptual basis of metonymy. Some of this work has emphasised other functions performed by metonymy. See in particular the collection of papers in Barcelona (2000), and Panther and Thornburg (2003). For important research on the conceptual basis of metonymy see Kövecses and Radden (1998), Peirsman and Geeraerts (2006). See also Gibbs (1994), and Barnden (2010) for a recent review of differences between metaphor and metonymy. Nevertheless, I will continue to emphasise what I consider to be the salient referential function of metonymy in the remainder of this paper.
Figurative Language Understanding in LCCM Theory 611
for short (see Evans 2006, 2007, 2009a, 2009b, 2010). LCCM Theory constitutes a model of lexical representation and semantic composition in language understanding. It models the nature of the symbolic units in language (and in particular semantic structure), the nature of conceptual representations, and the compositional mechanisms that give rise to the interaction between the two sets of representations, the semantic and the conceptual, in service of linguistically-mediated meaning construction. LCCM Theory derives its name from two theoretical constructs which are central to the model developed: the lexical concept and the cognitive model. In this section I present an overview of LCCM Theory.
3.1. Semantic Representation in LCCM Theory
The overarching assumption of the theory is that the linguistic system emerged, in evolutionary terms, much later than the conceptual system. The utility of a linguistic system, on this account, is that it provides an executive control mechanism facilitating the deployment of conceptual representations in service of linguistically-mediated meaning construction. Hence, semantic representations in the two systems are of a qualitatively distinct kind. I model semantic structure, the primary semantic substrate of the linguistic system, in terms of the theoretical construct of the lexical concept. A lexical concept is a component of linguistic knowledge: the semantic pole of a symbolic unit (in Langacker's terms, e.g., 1987), which encodes a bundle of various types of highly schematic linguistic content (see Evans 2006, 2009a, 2009b). In particular, linguistic content includes information relating to the selectional tendencies associated with a given lexical concept, that is, its range of collocational and collostructional behaviour (see Evans 2006, 2009b). While lexical concepts encode highly schematic linguistic content, a subset (those associated with open-class forms) are connected, and hence facilitate access, to the conceptual system. Lexical concepts of this type are termed open-class lexical concepts.6 Such lexical concepts are typically associated with multiple areas in the conceptual system, referred to as association areas. The range of association areas to which a given lexical concept facilitates access is termed an access site. LCCM Theory assumes that the access site for a given open-class lexical concept is unique. As lexical concepts facilitate access to a potentially large number of association areas in the conceptual system, any given open-class lexical concept, in principle, facilitates access to a large semantic potential. However, only a small subset of this semantic potential is
6. See Evans (2009b) for the rationale for this position.
typically activated in interpretation of a given utterance. I identify distinct lexical concepts by providing a gloss in square brackets that relates to salient aspects of a lexical concept's linguistic content, and its conceptual content: the conceptual representations that make up its semantic potential. While the linguistic system evolved in order to harness the representational power of the conceptual system for purposes of communication, the human conceptual system, at least in outline, is not far removed from that of other primates (Barsalou 2005), and shows some similarities with that of other species (Hurford 2007). In contrast to the linguistic system, the conceptual system evolved primarily to facilitate functions such as perception, categorisation, inference, choice and action, rather than communication. In LCCM Theory, conceptual structure, the semantic representational substrate of the conceptual system, is modelled by the theoretical construct of the cognitive model. A cognitive model is a coherent body of multimodal knowledge grounded in the brain's modal systems, and derives from the full range of experience types processed by the brain, including sensory-motor experience, proprioception and subjective experience including affect. The conceptual content encoded as cognitive models can become reactivated during a process referred to as simulation. Simulation is a general purpose computation performed by the brain in order to implement the range of activities that subserve a fully functional conceptual system. Such activities include conceptualisation, inferencing, choice, categorisation and the formation of ad hoc categories.7 In line with recent evidence in the cognitive science literature, LCCM Theory assumes that language can facilitate access to conceptual representations in order to prompt for simulations (see Glenberg and Kaschak 2002; Kaschak and Glenberg 2000; Pulvermüller 2003; Vigliocco et al., 2009; and Zwaan 2004.
For a review see Taylor and Zwaan 2009. For nuanced views on the role of simulations see Chatterjee 2010; Mandler 2010). As noted above, in LCCM Theory this is effected by a subset of lexical concepts, open-class lexical concepts, facilitating access to the conceptual system via a number of association areas. Each association area corresponds to a cognitive model, as captured in Figure 1. A summary of some of the key terms deployed in LCCM Theory is presented in Table 1.
7. For discussion and findings relating to the multimodal nature of conceptual representations and the role of simulation in drawing on such representations in facilitating conceptual function see, for instance, Barsalou (1999, 2008), Glenberg (1997), Gallese and Lakoff (2005), and references therein.
Figure 1. An association between an open-class lexical concept and a cognitive model
Table 1. Key terms deployed in LCCM Theory
Term | Description
Linguistic system | The collection of symbolic units comprising a language, and the various relationships holding between them
Symbolic unit | A conventional pairing of a phonological form and a semantic element
Lexical concept | The semantic element that is paired with a phonological form in a symbolic unit
Linguistic content | The type of content encoded by a lexical concept. This content is of a highly schematic type that can be directly encoded in language
Conceptual system | The body of non-linguistic knowledge captured from perceptual experience that is made of perceptual states. This knowledge derives from sensory-motor experience, proprioception and subjective experience
Cognitive model | The representational form that knowledge in the conceptual system takes, as modelled in LCCM Theory. Consists of multimodal information captured from brain states, which give rise to a potentially unlimited set of simulations
Conceptual content | The nature of the knowledge encoded by a cognitive model
Semantic structure | That part of semantic representation encoded by the linguistic system. Semantic structure is modelled, in LCCM Theory, by lexical concepts
Conceptual structure | That part of the semantic representation encoded by the conceptual system. Conceptual structure is modelled, in LCCM Theory, by cognitive models
I now briefly illustrate the distinction between the content encoded in the linguistic system by lexical concepts, and the content encoded in the conceptual system by cognitive models. To do so, consider the use of the lexical item red in the following examples, adapted from Zwaan (2004):
(7) a. The teacher scrawled in red ink all over the assignment
b. The red squirrel is in danger of becoming extinct in the British Isles
In the examples in (7), red designates two different sorts of sensory experience. That is, while the hue derived from the use of red in (7a) is quite a vivid red, the hue prompted for by (7b) is likely to be closer to a dun/browny colour.
Hence, what I refer to as the semantic potential of red is not there in the word itself. Whatever red designates, we are not dealing with purely linguistic knowledge. Rather, the word red provides access to (in this case) perceptual information and knowledge, which can be simulated, which is to say, reactivated. Put another way, the hue derived is not a function of linguistic knowledge, but relates to what I am referring to as conceptual content. This is not to say that red does not provide linguistic knowledge. The form red has an associated lexical concept that I gloss as [red]. This encodes schematic linguistic content, designating that an entity is being referred to, that the entity being referred to is a relation of some kind, and that the relation is specifically an attribute of a thing. In short, while linguistic content includes highly schematic semantic knowledge, conceptual content concerns richly detailed knowledge grounded in the information captured from multimodal brain states.
3.2. The Cognitive Model Profile
An important construct in LCCM Theory, and one that is essential to providing an account of figurative language understanding, as we shall see below, is that of the cognitive model profile. As an open-class lexical concept facilitates access to numerous association areas within the conceptual system, it facilitates access to numerous cognitive models. Moreover, the cognitive models to which a lexical concept facilitates access are themselves connected to other cognitive models. The range of cognitive models to which a given lexical concept facilitates direct access, and the range of additional cognitive models to which it therefore facilitates indirect access, is termed its cognitive model profile. To illustrate, consider the cognitive model profile for the lexical concept which I gloss as [france] associated with the form France. A partial cognitive model profile for [france] is represented in Figure 2. Figure 2 represents an attempt to capture the sort of knowledge that language users must have access to when speaking and thinking about France. As illustrated by Figure 2, the lexical concept [france] provides access to a potentially large number of cognitive models. As each cognitive model consists of a complex and structured body of knowledge which provides access to other sorts of knowledge, LCCM Theory distinguishes between cognitive models which are directly accessed via the lexical concept (primary cognitive models) and those cognitive models which form sub-structures of those which are directly accessed (secondary cognitive models). These secondary cognitive models are indirectly accessed via the lexical concept. The lexical concept [france] affords access to a number of primary cognitive models, which make up the primary cognitive model profile for [france]. These are hypothesised to include: geographical landmass, nation state and holiday destination. Each of these cognitive models provides access to
Figure 2. Partial cognitive model profile for [france]
further cognitive models. In Figure 2 a flavour of this is given by virtue of the various secondary cognitive models which are accessed via the nation state cognitive model: the secondary cognitive model profile. These include national sports, political system and cuisine. For instance, we may know that in France, the French engage in national sports of particular types, such as football, rugby, athletics, and so on, rather than others: the French don't typically engage in American football, ice hockey, cricket, and so on. We may also know that as a sporting nation they take part in international sports competitions of various kinds, including the FIFA football world cup, the Six Nations rugby competition, the rugby world cup, the Olympics, and so on. That is, we may have access to a large body of knowledge concerning the sorts of sports French people engage in. We may also have some knowledge of the funding structures and social and economic conditions and constraints that apply to these sports in France, France's international standing with respect to these particular sports, and further knowledge about the sports themselves including the rules that govern their practice, and so on. This knowledge is derived from a large number of sources including direct experience and cultural transmission (including language). With respect to the secondary cognitive model of political system, Figure 2 illustrates a sample of further secondary cognitive models which are accessed via this cognitive model. In other words, each secondary cognitive model has further (secondary) cognitive models to which it provides access. For instance, (french) electorate is a cognitive model accessed via the
cognitive model (french) political system. In turn the cognitive model (french) political system is accessed via the cognitive model nation state. Accordingly, nation state is a primary cognitive model while electorate and political system are secondary cognitive models.
3.3. Semantic Composition in LCCM Theory
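As a brief aside before turning to composition, the profile structure just described for [france] can be pictured as a small tree. The sketch below is only an illustration under stated assumptions: the class name CognitiveModel and the helper functions are hypothetical and are not part of LCCM Theory, and only the cognitive models named in the text are included.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class CognitiveModel:
    """A node in a cognitive model profile (hypothetical encoding)."""
    name: str
    accessed_models: List["CognitiveModel"] = field(default_factory=list)


def build_france_profile() -> List[CognitiveModel]:
    """Partial profile for [france], limited to models named in the text."""
    electorate = CognitiveModel("electorate")
    political_system = CognitiveModel("political system", [electorate])
    nation_state = CognitiveModel(
        "nation state",
        [CognitiveModel("national sports"), political_system, CognitiveModel("cuisine")],
    )
    # The returned list holds the primary (directly accessed) models.
    return [
        CognitiveModel("geographical landmass"),
        nation_state,
        CognitiveModel("holiday destination"),
    ]


def access_route(models: List[CognitiveModel], target: str, route=()) -> Optional[List[str]]:
    """Return the chain of model names leading to `target`, or None if absent."""
    for model in models:
        path = route + (model.name,)
        if model.name == target:
            return list(path)
        found = access_route(model.accessed_models, target, path)
        if found is not None:
            return found
    return None


def status(primary_models: List[CognitiveModel], target: str) -> Optional[str]:
    """Classify a model as primary (direct access) or secondary (indirect)."""
    route = access_route(primary_models, target)
    if route is None:
        return None
    return "primary" if len(route) == 1 else "secondary"
```

On this encoding, nation state comes out as primary, while electorate is secondary and is reached via nation state and then political system, in line with the description above.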
LCCM Theory is motivated, in large part, by the observation that word meanings vary across contexts of use in terms of the conceptualisation that they, in part, give rise to. To illustrate, consider the following examples which relate to the lexical form France:
(8) a. France is a country of outstanding natural beauty
b. France is one of the leading nations in the European Union
c. France beat New Zealand in the 2007 Rugby world cup
d. France voted against the EU constitution in the 2005 referendum
In the first example, France relates to a specific geographical landmass coincident with the borders of mainland France. In the second example, France relates to the political nation state, encompassing its political infrastructure, political and economic influence and its citizens, including those in French overseas territories. In the example in (8c) France relates to the team of 15 rugby players, drawn from the pool of rugby players of French citizenship, who represented the French nation in the 2007 rugby world cup. In the final example, France relates to the French electorate, and specifically that part of the electorate which voted against proceeding with ratification of a proposed EU constitution in a national referendum in 2005. These examples illustrate that a word form such as France appears to be protean in nature: its meaning is flexible, in part dependent upon the context of its use. LCCM Theory accounts for variation in word meaning by proposing two compositional mechanisms which integrate information deriving from context with linguistic content and conceptual content. These mechanisms facilitate the integration of words and other grammatical constructions such that an utterance-level simulation is derived. This utterance-level simulation (informally, what we might think of as utterance meaning) is termed a conception in LCCM Theory. The two compositional mechanisms are lexical concept selection and fusion. The first, lexical concept selection, serves to identify the most appropriate lexical concept associated with a given form, during the processing of an utterance. As the linguistic system consists of symbolic units (conventional pairings between phonological forms and lexical concepts), a form may potentially be associated with a large number of distinct lexical concepts. To illustrate, consider the lexical form in, which occurs in the following examples:
(9) a. The kitten is in the box
b. The flag is flapping in the wind
c. John is in love
In each of these examples, a distinct lexical concept is selected for. The lexical concepts for in selected are [enclosure] for (9a), [prevailing conditions] for (9b) and [psycho-somatic state] for (9c).8 Selection relies on a number of constraining factors to determine the appropriate lexical concept: the lexical concept which best fits the conception under construction.9 Once a lexical concept has been selected, it must be integrated with other selected lexical concepts of the utterance, and, if it is an open-class lexical concept, interpreted in the light of conceptual structure to which it affords access, and the other open-class lexical concept(s) with which it has been integrated. That is, the selected lexical concept undergoes the second compositional process: namely fusion. Fusion is the integrative process at the heart of semantic composition in LCCM Theory, and the second of the two constituent processes of meaning construction. It results in the construction of a conception. This is achieved by recourse to two sorts of knowledge: linguistic content and conceptual content. Fusion is itself made up of two constituent processes: lexical concept integration and interpretation. The first relates to the integration of linguistic content, in order to produce, informally, the scaffolding for the activation of conceptual content. Both sorts of information, and both types of processes, are necessary for the construction of meaning, and thus the formation of a conception. Lexical concept integration involves the integration of lexical concepts in order to produce a composite unit: a lexical conceptual unit. The output of this process is a semantic value, a situated semantic attribution associated with a lexical conceptual unit based on integration of linguistic content. Hence, the semantic contribution of the lexical conceptual unit is highly schematic in nature. The lexical conceptual unit then undergoes interpretation. 
That is, open-class lexical concepts within the lexical conceptual unit activate part(s) of the conceptual content (the semantic potential) to which they facilitate access. That part of the semantic potential that becomes activated is constrained by the nature of the semantic value for the lexical conceptual unit of which the open-class lexical concept(s) are part, and which emerges from integration. That is, interpretation (the activation of conceptual content) is constrained by integration (the unpacking of linguistic content). A diagrammatic
8. For discussion of the LCCM approach to polysemy see Evans (2010).
9. For further discussion of this issue see Evans (2009b).
Figure 3. Processes of semantic composition in LCCM Theory
representation of the processes of semantic composition in LCCM Theory is provided in Figure 3. As it is interpretation, the activation of conceptual content guided by unpacked linguistic content, that is the most relevant of the compositional mechanisms for the discussion of figurative language, I focus in the remainder of this section on a more detailed discussion of interpretation.
3.4. Interpretation
In a lexical conceptual unit it is only open-class lexical concepts that undergo interpretation. Interpretation results in the open-class lexical concepts achieving an informational characterisation, which is to say a semantic interpretation facilitated by simulation. This takes place by virtue of the relevant part of the semantic potential to which the lexical concepts facilitate access becoming activated. In the canonical case, when there are two (or more) open-class lexical concepts in the same lexical conceptual unit, these lexical concepts undergo interpretation simultaneously. In such cases, interpretation of the lexical concepts is constrained by a process termed matching. The purpose of matching is to ensure that a coherent informational characterisation emerges: one in which coherent parts of the cognitive model profile to which the distinct lexical concepts facilitate access are activated. Hence, interpretation is a constrained process. To provide an immediate illustration of how interpretation proceeds, consider the expressions in (10) and (11) in the light of the partial primary cognitive model profiles for [france] in Figure 4 (based on Figure 2), for [landmass] in Figure 5 and for [nation] in Figure 6.
Figure 4. Partial primary cognitive model profile for [france]
Figure 5. Partial primary cognitive model profile for [landmass]
Figure 6. Partial primary cognitive model profile for [nation]
(10) France, the landmass
(11) France, the nation
In each of these examples France receives a distinct informational characterisation. In (10) France relates to a geographical area, while in (11) it relates to a political entity. My purpose here is to illustrate how it is that each of these instances of France receives a distinct interpretation. As we have seen earlier, the lexical concept [france] affords access to conceptual content relating, at the very least, to France as a geographical region; to France as a political entity, including knowledge relating to the French political system, the French people and their social customs and practices, their history and language, the national sports engaged in, and so forth; and to France as a holiday destination, with, perhaps, knowledge relating to the sorts of holiday activities it is possible (or typical) to engage in, in France, such as skiing (in the Alps), seaside holidays (on the Mediterranean coast), and so on.
The lexical concept [landmass] (see Figure 5) facilitates access, at the very least, to primary cognitive models that relate to a physical terrain (a landmass can be hilly, mountainous, may consist of plains, woodland, and so on) or to a geographical area. Figure 6 relates to a very partial primary cognitive model profile for [nation]. This lexical concept, at the very least, facilitates access to cognitive models having to do with a political entity, a nation-state, and hence a particular political system, a people (with common customs, traditions, cuisine, and so on), a language (and/or languages), and a common (often complex) history. Interpretation works by virtue of the process of matching, which takes place between the cognitive model profiles accessed by the open-class lexical concepts which are subject to matching. In terms of the examples in (10) and (11), the relevant lexical concepts are [france], [landmass] and [nation]. Interpretation involves establishing a match between one (or more) cognitive models in the cognitive model profiles associated with the relevant lexical concepts. This process serves to activate the matched cognitive models. For instance, in the example in (10), a match is established between the primary cognitive model profile associated with [landmass], and one of the cognitive models to which [france] affords access. This of course is the cognitive model geographical region, accessed via the lexical concept [france], which becomes activated. In the second example, the match takes place between the primary cognitive model profile to which [nation] affords access and the nation state cognitive model to which [france] affords access. Hence, the readings of [france] differ in (10) and (11) because the lexical concept in each utterance receives a distinct informational characterisation. In (10) interpretation results in an informational characterisation for [france] relating to France as geographical landmass.
In (11) interpretation results in an informational characterisation of a political entity: France the nation-state. The compositional mechanisms in LCCM Theory, including matching, are subject to constraints. These constraints are formalised by a number of principles that govern the operation of semantic composition.10 The matching operation central to interpretation is constrained by the Principle of Conceptual Coherence. This can be stated as follows:
(12) Principle of Conceptual Coherence
Matching occurs between one or more cognitive models belonging to distinct cognitive model profiles, which share schematic coherence in terms of conceptual content.
10. See Evans (2009b) for detailed discussion.
This principle relies on a second principle, the Principle of Schematic Coherence:
(13) Principle of Schematic Coherence
The conceptual content associated with entities, participants and the relations holding between them must exhibit coherence in fusion operations.
What the two principles in (12) and (13) do is guarantee that matching takes place only when the cognitive models that undergo the matching process i) belong to different cognitive model profiles (and hence are accessed by different lexical concepts) and ii) exhibit coherence. To illustrate, consider the example in (14), which again employs the lexical concept [france]:
(14) France is beautiful.
The example in (14) provides what I will term a geographical region conception. A common conception arising from (14), without a further specifying linguistic or extra-linguistic context, might relate to an understanding of France as a geographical region which is physically beautiful, for instance in terms of its landscape, and so forth. This takes place by virtue of the lexical concepts [france] and [beautiful] undergoing matching, giving rise to an informational characterisation. The Principles of Conceptual and Schematic Coherence in (12) and (13) determine how the matching process is constrained and hence how, in general terms, the cognitive models across cognitive model profiles to be matched are selected. To make this clear, consider the partial cognitive model profile for the lexical concept [beautiful], given in Figure 7. The lexical concept [beautiful] facilitates access, at the very least, to cognitive models that have to do with multimodal knowledge relating to visual pleasure, non-visual pleasure (such as touch and sexual arousal, for instance), and aesthetic pleasure, relating, for instance, to our experience of pleasure arising from an appreciation of literature, music, language, and so on.
Figure 7. Partial primary cognitive model profile for [beautiful]
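The constraint that the principles in (12) and (13) place on matching for (14) can be pictured with a toy sketch. It rests on a strong simplifying assumption that is not part of LCCM Theory: each primary cognitive model is given a hand-assigned set of schematic tags, and two models from distinct profiles are treated as coherent when they share at least one tag. The tags and the function name match are hypothetical and serve only to make the example concrete.

```python
# Hypothetical schematic tags for the primary cognitive models of [france]
# and [beautiful]. The theory does not specify how coherence is computed;
# shared tags are an illustrative stand-in for schematic coherence.
FRANCE = {
    "geographical region": {"visible scene", "terrain"},
    "nation state": {"political entity"},
    "holiday destination": {"leisure activity"},
}

BEAUTIFUL = {
    "visual pleasure": {"visible scene"},
    "non-visual pleasure": {"bodily sensation"},
    "aesthetic pleasure": {"artistic work"},
}


def match(profile_a, profile_b):
    """Return pairs of models, one from each profile, that share a tag."""
    return [
        (name_a, name_b)
        for name_a, tags_a in profile_a.items()
        for name_b, tags_b in profile_b.items()
        if tags_a & tags_b  # non-empty intersection counts as coherence
    ]


if __name__ == "__main__":
    print(match(FRANCE, BEAUTIFUL))
```

Under these assumed tags the only pair returned is geographical region with visual pleasure, which corresponds to the geographical region conception for (14).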
Matching takes place by conducting what is referred to as a search in the primary cognitive model profiles of the two lexical concepts subject to matching, as guided by the principles in (12) and (13). That is, the primary cognitive models accessed by [france] (Figure 4) and [beautiful] (Figure 7) are searched in order to identify a match at the level of schematic coherence across conceptual content. Put another way, the match relates not to details of similarity, but rather to how schematically coherent the conceptual content is. In terms of the three primary cognitive models given for [france] in Figure 4, only that of geographical region achieves a match in terms of schematic coherence with one (or more) of the primary cognitive models for [beautiful]. After all, the holiday destination cognitive model has to do with the nature and types of holiday opportunities that exist in France, while the nation state cognitive model concerns the nature of France as a political entity. In contrast, the geographical region cognitive model might include knowledge relating to the physical beauty, particularly the visual pleasure, that derives from aspects of France as a geographical region. Hence, a match takes place between at least one of the primary cognitive models accessed via [beautiful] and the geographical region cognitive model accessed via the [france] lexical concept. For this reason, a match is established between the primary cognitive model profile of [beautiful] and the geographical region cognitive model of [france]. This results in an informational characterisation geographical region for [france].
4. Figurative Language in LCCM Theory
In this section I address figurative language from the perspective of LCCM Theory. I argue that distinct levels of knowledge representation (the distinction between primary versus secondary cognitive model profiles, as introduced above) give rise to a distinction in literal versus figurative language.
However, there are two further phenomena that are relevant for language understanding: salience and complexity. As we shall see, these three factors contribute to figurative language understanding, accounting for the psycholinguistic findings discussed earlier. Salience and complexity are also relevant for literal language understanding. Salience, in present terms, relates to how well entrenched a given lexical concept is in semantic memory. Language understanding makes use of a complex repertoire of lexical concepts which are integrated (the process of lexical concept integration). As some lexical concepts are likely to be better entrenched than others, this provides one way in which the distinction between the literal versus figurative arises in terms of language processing, as I will discuss. Complexity, in present terms, relates to the length of the access route through a cognitive model profile, as I shall discuss. In language understanding, greater

Table 2. Theoretical constructs for modelling factors involved in figurative language understanding
Phenomenon | How modelled in LCCM Theory?
Degree of literality/figurativity | Cognitive model profile structure (i.e., primary vs. secondary cognitive models)
Relative salience | Degree of entrenchment of lexical concept(s)
Relative complexity | Access route length (through the cognitive model profile)
processing effort, and hence greater complexity, is a consequence of the relative centrality of a conceptual unit of knowledge to a lexical concept's access site. The greater the access route length (which amounts to a greater number of cognitive models becoming activated in order to facilitate matching and hence interpretation), the more complex a given conception is. As with the notion of salience, complexity is a factor in language processing, which serves to blur the distinction between literal versus figurative language, as we shall see. LCCM Theory takes the view that literal and figurative language are probably idealised end-points on a continuum,11 resulting from the intersection of these three distinct types of phenomena (summarised in Table 2). These three factors intersect during the process of language understanding to give rise to degrees of literality and figurativity. Moreover, the mechanisms provided by LCCM Theory elegantly model, I argue, findings from psycho- and neurolinguistics, as described by Coulson (2008), Glucksberg (2008) and Giora (2008), amongst others.
4.1. Literal versus figurative language understanding
In this section I present the way in which the distinction between literal versus figurative language is modelled by LCCM Theory. In later sections I consider the notions of salience and complexity. The distinction between what I will refer to as a literal conception (the meaning associated with a literal utterance) on the one hand, and a figurative conception (the meaning associated with a figurative utterance) on the other, relates to that part of the semantic potential which is activated during the process of interpretation while constructing a conception. While a literal conception canonically results in an interpretation which activates a cognitive model, or cognitive models, within the primary, which is to say default, cognitive model profile, a figurative conception arises when there is a clash in the primary cognitive model profiles subject to matching. This is resolved by one of the
11. See also Sperber and Wilson (2008) who argue, albeit from a different perspective, that figurative language (e.g., metaphor) forms a continuum with other types of language use.
cognitive model profiles achieving a match in its secondary cognitive model profile. A figurative conception arises, therefore, when a match is achieved in the secondary cognitive model profile of one of the lexical concepts undergoing matching. To illustrate, consider the following examples, again making use of the lexical concept [france], which relate to a literal versus a figurative conception respectively:
Literal conception
(15) France has a beautiful landscape
Figurative conception
(16) France rejected the EU constitution
A literal conception arises for the first example, in (15), by virtue of a match occurring between the informational characterisation of the lexical concepts associated with the expression beautiful landscape (the result of a prior match between [beautiful] and [landscape]) and the primary cognitive model profile to which [france] affords access, these being the only expressions in this utterance which are associated with conceptual content. This occurs as follows. The informational characterisation for [beautiful] and [landscape] undergoes matching with the cognitive model profile to which the lexical concept [france] facilitates access. Hence, a search takes place in the primary cognitive model profile associated with [france]. The Principles of Conceptual Coherence and Schematic Coherence ensure that a match is achieved in the primary cognitive model profile of [france]. In terms of activation of cognitive models for [france] in the example in (15), the Principle of Conceptual Coherence ensures that the geographical landmass cognitive model for [france] is activated (recall the cognitive model profile for [france] presented in Figure 2). That is, it is this cognitive model which achieves a match with the informational characterisation associated with the lexical concepts associated with the expression beautiful landscape.
Hence, the conception which arises for (15) is literal, as activation occurs solely in the primary cognitive model profile (of [france]). In contrast to (15), the example in (16) is usually judged as being figurative in nature. While France in (15) refers to a specific geographical region (that identified by the term France), in the example in (16) France refers to the electoral majority who voted against implementing an EU constitution in a 2005 referendum. This figurative conception arises due to a clash between the primary cognitive model profile of [france], as represented by Figure 4, and the informational characterisation associated with the expression rejected the EU constitution. That is, none of the primary cognitive models to which [france]
facilitates access can be matched with the informational characterisation associated with the expression rejected the EU constitution, due to application of the Principles of Conceptual and Schematic Coherence given in (12) and (13). The failure of matching in the primary cognitive model profile for [france] requires establishing a wider search domain, namely matching in the secondary cognitive model profile, and hence in cognitive models to which the lexical concept [france] provides only indirect access. This process of clash resolution is constrained by the Principle of Ordered Search, which is given in (17):
(17) Principle of Ordered Search
If matching is unsuccessful in the default search domain, which is to say, a clash occurs, then a new search domain is established in the secondary cognitive model profile. The search proceeds in an ordered fashion, proceeding on the basis of secondary cognitive models that are conceptually more coherent with respect to the primary cognitive models (and hence modelled as being conceptually closer in the cognitive model profile) prior to searching cognitive models that exhibit successively less conceptual coherence.
In essence, the Principle of Ordered Search ensures the following. When there is a clash in the primary cognitive model profiles of the lexical concepts or informational characterisation(s) in question, as in (16), a larger search region is established which includes cognitive models in the relevant secondary cognitive model profile(s). This principle thus enables clash resolution by virtue of facilitating a search region beyond the default search region. With respect to the example in (16), due to application of the Principle of Ordered Search, a secondary cognitive model is identified which achieves schematic coherence, thereby avoiding a clash, and thus achieving a match. The cognitive model which achieves activation is the electorate cognitive model (see Figure 2).
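The search procedure described by the Principle of Ordered Search can be sketched informally in code. The following is only an illustrative toy under stated assumptions, not part of LCCM Theory itself: the class names, the feature sets, the numeric `distance` values, and the "nation state" and "political system" models are all hypothetical stand-ins, and matching is reduced to a simple subset test.

```python
from dataclasses import dataclass, field


@dataclass
class CognitiveModel:
    name: str
    # Hypothetical feature set standing in for the model's conceptual content.
    features: frozenset
    # Hypothetical conceptual distance from the access site (0 = primary).
    distance: int = 0


@dataclass
class LexicalConcept:
    label: str
    primary: list = field(default_factory=list)
    secondary: list = field(default_factory=list)


def interpret(concept, required):
    """Return (model name, 'literal' | 'figurative'), or (None, 'no match')."""
    # Default search domain: the primary cognitive model profile.
    for model in concept.primary:
        if required <= model.features:
            return model.name, "literal"
    # Clash: widen the search to the secondary profile, nearest models first.
    for model in sorted(concept.secondary, key=lambda m: m.distance):
        if required <= model.features:
            return model.name, "figurative"
    return None, "no match"


# Toy profile for [france]; all contents are invented for illustration.
france = LexicalConcept(
    label="[france]",
    primary=[
        CognitiveModel("geographical landmass", frozenset({"terrain", "scenery"})),
        CognitiveModel("nation state", frozenset({"borders", "government"})),
    ],
    secondary=[
        CognitiveModel("electorate", frozenset({"votes", "decides"}), distance=2),
        CognitiveModel("political system", frozenset({"institutions"}), distance=1),
    ],
)

print(interpret(france, frozenset({"scenery"})))  # cf. example (15)
print(interpret(france, frozenset({"decides"})))  # cf. example (16)
```

In this toy setting the first call finds a match in the primary profile, while the second succeeds only after the search is widened to the secondary profile, mirroring the literal/figurative contrast between (15) and (16).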
Hence, in (16), the process of interpretation results in an informational characterisation for [france] which is that of electoral majority. As the electorate cognitive model is a secondary cognitive model (recall the discussion relating to Figure 2 above), this means that the conception is figurative in nature. In order to summarise the main distinction between the construction of literal versus figurative conceptions, based on the mechanisms proposed by LCCM Theory, consider Figure 8. Figure 8 illustrates the following. At interpretation, the primary cognitive model profiles for lexical concepts which afford access to conceptual content undergo matching. The Principle of Conceptual Coherence requires that a clash in the cognitive model profiles of the two (or more) lexical concepts undergoing interpretation is avoided. The Principle of Ordered Search ensures that if there is no match in the primary cognitive models of the lexical concepts subject to matching then clash resolution is required. In order to achieve this, a
Figure 8. Meaning construction processes in LCCM Theory leading to literal versus figurative conceptions
search is initiated in the secondary cognitive model profile. The secondary cognitive model profile of a lexical concept relates to knowledge that is not directly associated with a given lexical concept, as it does not form part of a lexical concept's access site. As such, the secondary cognitive model profile constitutes a very large semantic potential available for search. The Principle of Ordered Search ensures that the search in the secondary cognitive model profile proceeds in a coherent way. That is, the secondary cognitive models are searched to facilitate a match based on their conceptual coherence with the primary cognitive models which form part of the lexical concept's access site. Hence, this principle ensures that secondary cognitive models are searched in
the order of their relative distance from the point of lexical access. Secondary activation continues upwards through the secondary cognitive model profile until a match is achieved, giving rise to activation of one or more secondary cognitive models. The consequence of this is that activation of a secondary cognitive model that is relatively further removed, in conceptual terms, from the default search region is likely to be judged as being more figurative in nature than activation of one that is relatively less removed. In sum, the defining feature of a literal conception is that matching occurs in the primary cognitive model profiles of the relevant lexical concepts. The defining feature of a figurative conception is a clash in the primary cognitive model profiles of the relevant lexical concepts, necessitating clash resolution, and hence activation of cognitive models in the secondary cognitive model profile of one (or more) of the relevant lexical concepts. Moreover, the further the conceptual distance required in the secondary cognitive model profile to achieve clash resolution by virtue of a successful match, the greater the access route length in the cognitive model profile, and hence the greater the figurativity of the expression (as discussed further below in terms of complexity).
4.2. Salience
While the situation described in section 4.1. relates to an idealised scenario, in practice language understanding is more complex than this. For one thing, semantic structure consists of a vast repertoire of lexical concepts (the semantic poles of linguistic forms, as described above). Moreover, lexical concepts exhibit degrees of complexity, as they can be internally open or internally closed. For instance, the ditransitive construction, as studied by Goldberg (e.g., 1995), and as exemplified in (18), involves a lexical concept that is internally open: the lexical concept in (18b) can be integrated with other lexical concepts, as exemplified by the lexical concepts conventionally paired with the forms in (19):
(18) a. Form: Subj verb Obj1 Obj2
b. Lexical concept: [entity x causes entity y to receive entity z]
(19) Sally, gave, John, a kiss
In addition, forms can be conventionally paired with more than one internally open lexical concept. Consider the expression in (20):
(20) I hit the roof
This expression potentially instantiates two distinct lexical concepts, given in (21):
(21) a. [x exerts transfer of energy with respect to z]
b. [x becomes very angry]
While the lexical concept in (21a) can be instantiated by a wide range of expressions, as in (22), which is a consequence of its form being lexically underspecified, the lexical concept in (21b) has a smaller range of instantiations, as illustrated in (23):
(22) a. I/he/she/we/they hit the nail/wall/box/floor, etc.
b. I/he/she/we/they kicked the wall/box/floor/man, etc.
c. I/he/she/we/they punctured the balloon/tyre/bubble/inflatable ring, etc.
and so on
(23) a. I/he/she/we/they hit the roof
b. I/he/she/we/they will hit the roof
c. I/he/she/we/they are bound to hit the roof
and so on
The instantiation in (20) of (21a) is normally described as being literal, while the instantiation in (20) of (21b) is normally described as idiomatic (or figurative). But from the perspective of LCCM Theory, both lexical concepts are, in a fundamental sense, idiomatic. They relate to distinct lexical concepts: each provides a schematic meaning that can be instantiated by the expression in (20). The different interpretations associated with (20), the literal ('I physically punched the roof') reading versus the idiomatic ('I flew into a rage') reading, are a consequence of two distinct lexical concepts which each encode a distinct semantic value: they are semantic units which are conventionally associated with a given form, and in this sense are idiomatic. For the present discussion, what is important to bear in mind is that the lexical concept in (21b) is more saliently associated with the form in (20) than the lexical concept in (21a). This follows as the form with which the lexical concept in (21b) is conventionally paired is partially lexically specified, and includes the obligatory elements hit the roof, as exemplified in (24):
(24) a. Form: Subj hit + TNS the roof
This being the case, LCCM Theory makes the claim that, as the expression in (20) so closely instantiates the form in (24) which is conventionally paired with the lexical concept in (21b), the most salient reading of (20) will correspond more closely to the idiomatic reading associated with the lexical concept in (21b) rather than (21a). In fact, LCCM Theory makes the further prediction that this reading should be processed more quickly than the literal reading, which is exactly what the psycholinguistic studies reported on above do indeed find. In cases such as (20), where an idiomatic reading is derived, the process of clash resolution described in section 4.1. doesn't apply. This is because the process of interpretation follows, and is guided by, the process of lexical concept integration. The lexical concept in (21b) provides a schematic semantic unit which guides the way in which the individual lexical concepts that are integrated with this internally open lexical concept are combined, and subsequently undergo interpretation. As there is a semantic unit that provides a holistic meaning, the entire expression functions as a single lexical concept for purposes of interpretation. That is, there is no matching to be done, and hence no clash to be resolved. And because there is no matching to be done, language understanding proceeds more quickly in the case of the lexical concept in (21b) than in the case of the lexical concept in (21a). I now turn to a slightly different manifestation of salience. In some accounts of figurative language phenomena, examples such as the italicised lexical items in each of the following are taken to be figurative (and specifically metaphoric) in nature:12
(25) a. That is a loud shirt
b. They have a close relationship
c. She is in love
d. That took a long time
In these examples, the use of loud refers to a brightly coloured shirt, close relates to emotional closeness, in relates to an emotional state while long relates to extended duration. From the perspective of LCCM Theory, such usages relate to distinct lexical concepts, rather than interpretations arising due to clash resolution (as described in section 4.1.). For instance, LCCM Theory predicts that long has at least two conventionally established lexical concepts associated with it: [extended in horizontal space], and [extended duration]. During lexical concept selection the [extended duration] lexical concept is selected, as this is the most salient lexical concept associated with long, in view of the lexical concept that is paired with the form time.13
12. For instance, some accounts of linguistic metaphor, such as the metaphor identification criteria as developed by the Pragglejaz Group (2007), would classify these examples as being instances of metaphor.
13. Note that, by claiming that conventional lexical concepts do not require clash resolution, I am not excluding the possibility that examples such as (25) may give rise, at the conceptual level, to distinct conceptual metaphors (e.g., deviant colours are deviant sounds for A loud shirt, or degree of affection is spatial connection for They have a close relationship, etc.), or that conceptual metaphors may have, in part, motivated the existence of the examples in the first place. I am simply making the point, from the perspective of a linguistically informed account of figurative language understanding, that there are likely to be highly conventional lexical concepts in addition to any putative conceptual metaphors (or metonymies). This is an issue I return to later in section 6 when I consider the status of conceptual metaphors within the LCCM account of figurative language understanding.
In processing terms, upon encountering the form long, both the [extended in horizontal space] and [extended duration] lexical concepts will receive background activation. However, upon encountering the form time, the [extended duration] lexical concept is selected for. And crucially, the [extended duration] lexical concept conventionally associated with long provides a different access site to that of the [extended in horizontal space] lexical concept: each facilitates access to a different set of primary cognitive models. The [extended duration] lexical concept for long and the [duration] lexical concept associated with time facilitate access to cognitive model profiles which can be matched in their primary cognitive model profiles. Hence, an example such as this does not lead to a clash in the primary cognitive model profiles undergoing matching. LCCM Theory is thus able to account for the finding that conventionalised metaphors such as these examples are processed as quickly as putatively non-metaphorical examples. In fact, in the examples in (25), the linguistic context makes salient an entrenched lexical concept. From this perspective, (25d), for instance, is only judged as being metaphoric if the [extended duration] lexical concept for long is judged by the analyst as, in some sense, less prototypical (or more abstract) than the [extended in horizontal space] lexical concept. In terms of the prediction made by LCCM Theory, in all other respects, these examples are no different, in processing terms, from those given in (26):
(26) a. That is a green shirt
b. They have a loving relationship
c. She experiences love
d. That took an extended period of time
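The selection process just described for long time can be illustrated with a small sketch. It is a toy under explicit assumptions rather than a claim about how LCCM Theory must be implemented: the `LEXICON` and `COMPATIBLE` tables, the form road, and the [path] lexical concept are hypothetical additions, and "matching in the primary cognitive model profiles" is reduced to a lookup in a hand-written compatibility table.

```python
# Hypothetical lexicon: each form is paired with its conventional lexical concepts.
LEXICON = {
    "long": ["[extended in horizontal space]", "[extended duration]"],
    "time": ["[duration]"],
    "road": ["[path]"],  # invented entry, for contrast only
}

# Hypothetical table of lexical concept pairs whose primary cognitive
# model profiles can be matched without a clash.
COMPATIBLE = {
    ("[extended duration]", "[duration]"),
    ("[extended in horizontal space]", "[path]"),
}


def select_concept(form, context_form):
    """Select the lexical concept for `form` that the context makes salient."""
    # All conventional lexical concepts for the form receive background activation.
    candidates = LEXICON[form]
    for candidate in candidates:
        for context_concept in LEXICON[context_form]:
            # Selection succeeds when the primary profiles can be matched.
            if (candidate, context_concept) in COMPATIBLE:
                return candidate
    return None


print(select_concept("long", "time"))
print(select_concept("long", "road"))
```

Because a conventional lexical concept is selected directly, no search of a secondary cognitive model profile is involved in this sketch, which is the sense in which such examples involve no clash resolution.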
The LCCM Theory account of expressions such as long, as in long time, is consonant with the approach developed in the Career of Metaphor Hypothesis (Bowdle and Gentner 2005). In the Career of Metaphor Hypothesis, highly conventionalised linguistic metaphors are treated as being polysemous sense-units which are conventionally associated with the base term, here long, and which are accessed via a lexical look-up process, rather than by establishing structural alignments and inference projections (mappings) between a base and target. From the LCCM perspective, the interesting question in such cases is not whether these cases are metaphoric or not: they do not involve clash resolution and hence are not figurative conceptions. Rather, the more interesting question concerns how an [extended duration] lexical concept became conventionally associated with the form long in the first place. Recent work on semantic change pioneered by Elizabeth Closs
Traugott (e.g., Traugott and Dasher 2004) has argued that situated implicatures (or invited inferences) can become detached from their contexts of use and reanalysed as being distinct sense-units (lexical concepts in present terms) which are associated with a given form. The [extended duration] lexical concept associated with long might be historically derived from contexts of communication in which reference to length can be understood as reference to duration without harming expression of the communicative intention, as in communication about long journeys. Through repeated use of this form in such bridging contexts (Evans and Wilkins 2000), which is to say, with the inferred meaning, it is plausible that long developed an [extended duration] lexical concept by virtue of decontextualisation (Langacker 1987).
4.3. Complexity
The third factor that I consider in figurative language understanding is complexity. This relates to the length of the access route in cases of clash resolution. Access route length gives rise to degree of figurativity. That is, figurative conceptions themselves exhibit degrees of figurativity and hence are graded. LCCM Theory claims that a longer access route corresponds to a more figurative conception. Moreover, it predicts that there is a greater processing cost associated with conceptions involving a greater access route length, for instance in terms of the amplitude of the N400 (in ERP terms). To illustrate, consider the following metaphoric conceptions:
(27) a. That soldier is a lion
b. That ballerina is a lion
LCCM Theory claims that figurative conceptions emerge for examples such as those in (27). Due to a failure to match in the primary cognitive model profiles to which [soldier] and [lion], and [ballerina] and [lion] facilitate access, clash resolution is initiated. This involves, in both cases, establishing a search region in the secondary cognitive model profile for [lion].14 Due to the Principle of Ordered Search, the search proceeds such that cognitive models that are conceptually closer to the access site are searched prior to those which are conceptually more distant. Due to the Principle of Conceptual Coherence, the search is only complete when a match is achieved between a cognitive
14. See discussion in section 5 for why it is that a search region is established in the cognitive model profile for [lion] rather than [soldier] or [ballerina].
Figure 9. Partial cognitive model profile for [lion]
model in the respective primary cognitive model profiles of [soldier] and [ballerina], on one hand, and the secondary cognitive model profile of [lion] on the other. To illustrate, consider a partial cognitive model profile for [lion] in Figure 9. The lexical concept [lion] facilitates access to a number of primary cognitive models: its access site. These include, at the very least, bodies of knowledge relating to a lion's physical attributes, including its bodily form (its morphology, the fact that lions have a mane, lionesses don't, and so on), its social behaviour (including social groupings, mating behaviour, and so on), its habitat (including the geographical regions where lions are found) and its hunting behaviour. The cognitive model hunting behaviour provides access to a range of secondary cognitive models, including information about prey types (buffalo, wildebeest, gazelle, and so on), which can often be larger than the lion, the behaviour it exhibits in stalking and subsequently subduing prey, including ferocity and strength, and the apparent fearlessness exhibited by lions in attacking prey often much larger than themselves. A further secondary model, which is presumably accessed from scenarios involving the stalking behaviour exhibited by lions, is that of the immense patience and persistence exhibited. Like all cats, lions have great acceleration but little stamina; hence they must get very close to their intended prey if they are to have a reasonable chance of catching and subduing the herbivores they prey upon before their prey can escape. Lions (and particularly lionesses) exhibit extreme patience in stalking prey in order to gain an opportunity to strike. Returning to the examples in (27), the kinds of scenarios in which soldiers may find themselves, in which they face a strong enemy and must risk their lives, may require displays of strength/ferocity and/or fearlessness.
Hence, when describing a soldier as a lion, LCCM Theory would predict that, without a further narrowing context, either (or both) of these secondary cognitive models become activated in service of facilitating clash resolution. The utterance involving a ballerina is slightly different: after all, a ballerina, as part of her professional duties, does not normally engage in situations which require displays of ferocity or fearlessness. However, ballet, by its very nature, requires a vast amount of practice. Moreover, it can require undergoing a great deal of discomfort, as evidenced by the physical deformities that experienced ballerinas can suffer due to the physically demanding nature of some of the techniques practised on a daily basis. In this context, describing a ballerina as a lion might activate the patience/persistence secondary cognitive model associated with [lion]. While fearlessness and ferocity are qualities that are perhaps self-evidently associated with lions, patience/persistence is less obviously associated with lions. Nevertheless, my claim is that some language users, especially zoologists and others who have detailed knowledge of lions, are likely to have knowledge relating to the displays of extreme patience exhibited by lions in stalking their prey. But the very fact that such a secondary cognitive model may require specialist knowledge of the hunting behaviour associated with lions demonstrates that the knowledge structure I gloss as patience/persistence is conceptually less close to the access site (the primary cognitive models) for [lion] than strength/ferocity or fearlessness. Put another way, to activate the patience/persistence secondary cognitive model involves a longer access route than that required to activate either the strength/ferocity or fearlessness secondary cognitive models.
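The notion of access route length can be made concrete with a small sketch. It rests on several assumptions that go beyond the text: the cognitive model profile is treated as a simple graph, route length is counted as the number of links from the lexical concept to the activated model, and the intermediate node "stalking behaviour" is a hypothetical way of encoding the suggestion that patience/persistence is reached via stalking scenarios.

```python
from collections import deque

# Toy encoding of the partial profile described for [lion]. The placement of
# "stalking behaviour" as an intermediate node is an assumption.
LION_PROFILE = {
    "[lion]": ["physical attributes", "social behaviour", "habitat", "hunting behaviour"],
    "hunting behaviour": ["prey types", "strength/ferocity", "fearlessness", "stalking behaviour"],
    "stalking behaviour": ["patience/persistence"],
}


def access_route_length(profile, access_point, target):
    """Breadth-first search: number of links from the lexical concept to a model."""
    queue = deque([(access_point, 0)])
    seen = {access_point}
    while queue:
        node, depth = queue.popleft()
        if node == target:
            return depth
        for child in profile.get(node, []):
            if child not in seen:
                seen.add(child)
                queue.append((child, depth + 1))
    return None  # the model is not reachable in this profile


soldier_route = access_route_length(LION_PROFILE, "[lion]", "fearlessness")
ballerina_route = access_route_length(LION_PROFILE, "[lion]", "patience/persistence")
print(soldier_route, ballerina_route)
```

On this toy encoding the route to patience/persistence is longer than the route to fearlessness, which is the ordering the discussion of the soldier and ballerina examples relies on.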
Thus, the prediction made by LCCM Theory is that the example in (27b) would be judged as exhibiting greater figurativity than the example in (27a). Moreover, the further prediction would be that this is due to the greater complexity involved in integrating the cognitive model profiles involved (that associated with [lion] with that accessed by [soldier], and [lion] with [ballerina]). Hence, in processing terms, the prediction is that there is a greater cognitive cost involved in processing (27b) than (27a). The neurolinguistic findings discussed by Coulson (2008) seem to support such a prediction.
5. Metaphor and Metonymy
In the light of the discussion in section 4.1., in this section I consider the nature of two specific types of figurative conceptions: those associated with metaphor and metonymy in language understanding. This section illustrates that, using the meaning construction mechanisms of LCCM Theory, it is possible to distinguish (at least prototypical) instances of metaphoric and metonymic language.
5.1. Metaphor
In this section I focus on metaphoric conceptions employing the predicate nominative (i.e., X is a Y) construction.15 This has traditionally been the kind of linguistic form par excellence that has been studied under the heading of metaphor, particularly by psycholinguists (e.g., Giora 2003; Glucksberg 2001; Gentner et al. 2001), philosophers of language (Leezenberg 2001; Stern 2000) and scholars in the pragmatics tradition (e.g., Carston 2002; Sperber and Wilson 1995, 2008).16 To illustrate, I will consider the metaphoric conception that emerges based on the example in (28):
(28) My boss is a pussycat
What is strikingly figurative about the example in (28) is that the entity designated by my boss is not normally taken as being a member of the class of pussycats. Nevertheless, the predicate nominative construction is normally taken as having a class-inclusion function associated with it:
(29) My boss is a beer drinker
This construction, exemplified by the utterance in (29), involves the copular or linking verb be, which combines with a nominal, e.g., a beer drinker. The nominal functions as the essential part of the clausal predicate: is a beer drinker. The function of the lexical concept conventionally paired with be in this symbolic unit is to signal a stative relation (Langacker 1991): namely, my boss is a member of the class of beer drinkers, a situation which persists through time. The same cannot hold for the example in (28) as, in the normal course of events, someone's boss cannot literally be a pussycat. That is, the entity designated by the expression my boss is not normally taken to be a member of the class of pussycats. The metaphoric conception which this utterance gives rise to is derived from a property which is usually associated with pussycats, namely that they are extremely docile and often affectionate, and thus not frightening or intimidating in any way.
In this utterance, we are being asked to understand the boss, not in terms of being a pussycat, but in terms of exhibiting some of the properties and behaviours often associated with pussycats as manifested towards their human owners, such as being docile, extremely friendly and thus non-forbidding and perhaps easy to manipulate.
15. I will consider other types of metaphoric language later when I discuss Conceptual Metaphor Theory.
16. It is important to note that this particular construction forms only a small subset of the way metaphor emerges in language use, cf. Jane is a weasel vs. Jane weaselled out of that. See Deignan (2005a) for a corpus-based analysis of the forms that metaphoric language takes.
The LCCM approach to figurative meaning construction allows us to see the similarities and differences between metaphor and literal predicate nominative examples such as (29). An important point of similarity relates to the process of fusion crucial for meaning construction, involving interpretation in particular. As noted in section 4.1., figurative language, of which (prototypical) metaphor is a sub-type, diverges from literal language use in terms of activation in the secondary cognitive model profile of the lexical concept which is undergoing clash resolution. In an utterance such as My boss is a beer drinker, the two relevant lexical concepts for interpretation are [boss] and [beer drinker]. This follows as these are the only two lexical concepts in the utterance which have access sites and thus provide direct access to conceptual content. Interpretation proceeds by attempting to match cognitive models in the primary cognitive model profiles associated with each of these lexical concepts, as guided by the Principle of Conceptual Coherence and application of the Principle of Ordered Search. A match is achieved in the primary cognitive model profiles of each lexical concept. That is, it is semantically acceptable to state that My boss is a beer drinker because the referent of my boss is a human, and humans can (and do) drink beer. Now let's consider how the metaphoric conception arises. In the example in (28), the process of interpretation leads to a clash in the primary cognitive model profiles of [boss] and [pussycat]. This is where metaphor differs from literal class-inclusion statements. A partial primary cognitive model profile for [boss] is provided in Figure 10. The primary cognitive model profile for [boss] includes, at the very least, cognitive models relating to the fact that a boss is, typically, a human being, and the complex body of knowledge we each possess concerning what is
Figure 10. Partial cognitive model profile for [boss]
involved in being a human being, that a boss has particular pastoral responsibilities with respect to those for whom he or she is line-manager, as well as managerial responsibilities and duties, both with respect to those the boss manages, the subordinate(s), and the particular company or organisation for whom the boss works. In addition, there are an extremely large number of secondary cognitive models associated with each of these, only a few of which are represented in Figure 10. In particular, by virtue of being a human being, a boss has a particular personality and exhibits behaviour of various sorts, in part a function of his/her personality, in various contexts and situations. In addition, each boss exhibits a particular managerial style, which includes interpersonal strategies and behaviours with respect to those the boss manages. The boss can, for instance, be aggressive or docile with respect to the subordinate. Moreover, there is a clichéd cultural model of a ferocious and aggressive boss who seeks to keep employees on their toes by virtue of aggressive and bullying interpersonal behaviour. By contrast, a boss who is relatively placid and can thus be treated as a colleague rather than a superior may be somewhat salient with respect to the stereotype.17 Just as the lexical concept [boss] has a sophisticated cognitive model profile to which it potentially affords access, so too the [pussycat] lexical concept provides access to a wide range of knowledge structures. A very partial cognitive model profile is provided in Figure 11. The lexical concept [pussycat] relates to cognitive models having to do with, at least, knowledge concerning physical attributes, including body shape and size, diet and eating habits, patterns of behaviour, and a pussycat's status in western culture as the household pet of choice for many people.
In terms of secondary cognitive models, there are a number that relate to our knowledge associated with the sorts of behaviours pussycats exhibit. For instance, pussycats exhibit motor behaviour of certain kinds including the particular manner of motion pussycats engage in. Pussycats also exhibit animal behaviours of certain kinds including hunting, reproduction and so forth. Finally, pussycats also exhibit social behaviour, including behaviour towards other conspecifics, and behaviour towards humans. Hence, social behaviour is a cognitive model relating to at least two primary cognitive models: those of patterns of behaviour and household pet. In the example in (28), a figurative conception arises due to a failure to establish a match in the primary cognitive model profiles associated with [boss] and [pussycat], the two lexical concepts relevant for interpretation. Hence, a
17. See Lakoff's (1987) discussion of the way in which what he refers to as idealised cognitive models (ICMs) can metonymically give rise to prototype effects, by serving as cognitive reference points.
Figure 11. Partial cognitive model profile for [pussycat]
clash occurs leading to a search in a secondary cognitive model profile. In LCCM Theory, the particular lexical concept selected for clash resolution, and hence, for activation in the secondary cognitive model profile, is contextually determined. This is formalised as the Principle of Context-induced Clash Resolution. This can be stated as follows:
(30) Principle of Context-induced Clash Resolution
In cases where clash resolution is required, the lexical concept whose secondary cognitive model profile is searched to resolve the clash is determined by context. This is achieved by establishing a figurative target and a figurative vehicle, on the basis of context. The lexical concept that is established as the figurative vehicle is subject to clash resolution.
In the utterance in (28), I am assuming a discourse context in which the speaker has been discussing their boss. In such a context, the figurative target (or target for short) is the boss, as this is the topic or theme of the utterance. Informally, the point of the utterance is to say something about the boss. From this it follows that the figurative vehicle (or vehicle for short) is the pussycat. Crucially, it is the secondary cognitive model profile of the vehicle, here [pussycat], rather than the target, which undergoes search in order to facilitate clash resolution. In other words, the principle in (30) serves to determine which of the lexical concepts' secondary cognitive model profiles is subject to search.
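The principle in (30) can be read procedurally. The following Python sketch is one hedged illustration and not part of the formal apparatus of LCCM Theory: the class `LexicalConcept`, the function `resolve_clash`, and the reduction of "matching" to a set intersection are all assumptions made for illustration, and the cognitive model labels are a toy subset of those discussed for [boss] and [pussycat].

```python
from dataclasses import dataclass, field


@dataclass
class LexicalConcept:
    """Toy stand-in for a lexical concept and its cognitive model profile."""
    name: str
    primary: set = field(default_factory=set)    # primary cognitive models
    secondary: set = field(default_factory=set)  # secondary cognitive models


def resolve_clash(concepts, topic, candidates):
    """Fix target and vehicle from context, then search the vehicle only.

    `topic` names the concept the discourse is about (the context).
    `candidates` is a hypothetical set of models that would resolve the
    clash; real matching is far richer than a set lookup.
    """
    target = next(c for c in concepts if c.name == topic)
    vehicle = next(c for c in concepts if c.name != topic)
    # Only the vehicle's secondary cognitive model profile is searched.
    activated = sorted(vehicle.secondary & candidates)
    return target.name, vehicle.name, activated


boss = LexicalConcept("boss",
                      {"human being", "managerial responsibilities"},
                      {"personality", "managerial style"})
pussycat = LexicalConcept("pussycat",
                          {"physical attributes", "household pet"},
                          {"motor behaviour", "social behaviour"})

result = resolve_clash([boss, pussycat], topic="boss",
                       candidates={"social behaviour", "managerial style"})
print(result)  # ('boss', 'pussycat', ['social behaviour'])
```

Note that "managerial style" is never activated in this sketch even though it is among the candidates: it belongs to the target's profile, and the principle restricts the search to the vehicle.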
Before concluding the discussion of the example in (28), a caveat is in order. In my discussion thus far I have assumed that the literal class inclusion statement, as in (29), involves the same symbolic unit (and hence the same lexical concept) as the metaphoric version of the predicate nominative construction in (28). I have done so for purposes of explicating the nature of figurative language conceptions, in order to contrast them with metonymic conceptions, below. Yet, as should by now be clear, as LCCM Theory assumes a constructional perspective on grammatical organisation (e.g., Goldberg 2006; Langacker 2008), a difference in form and/or meaning is indicative of a different symbolic unit and hence lexical concept. Accordingly, it is likely that the lexical concepts associated with the expressions in (28) and (29) are not, in fact, motivated by a single predicate nominative symbolic unit. Rather, the fact that human agents can have attributes of animals ascribed to them highly productively, as evidenced by examples such as (31), suggests that English speakers have an entrenched symbolic unit of the type indicated in (32):
(31) Sam is a wolf/pig/lion/fox/mouse, etc.
(32) a. Form: SUBJECT BE+TNS a ANIMAL TERM
     b. Lexical concept: [volitional agent x has functional attribute(s) of animal y]
From this perspective, the metaphoric reading resulting from (28) is due to the lexical concept given in (32b), rather than being due to a class inclusion lexical concept (cf. the example in (29)). LCCM Theory therefore predicts the following in terms of processing. The class inclusion lexical concept is plausibly better entrenched (and hence more salient without a specific context) than the lexical concept in (32b). That being so, when a language user is exposed to an example such as (28) they begin by processing the class inclusion lexical concept. Upon encountering the animal term, lexical concept selection revision takes place, such that a new lexical concept is selected: that provided in (32b). The prediction, therefore, is that there should be a slightly higher N400, in ERP terms, for examples such as (28) and (31) than those such as (29). In view of this caveat, how then should we interpret the discussion of the figurative conception for (28) given above? I assume that the class inclusion lexical concept associated with the predicate nominative form existed in the language prior to the emergence of the lexical concept in (32b). In fact, it is plausible that the lexical concept in (32b) emerged historically from the literal class inclusion lexical concept.¹⁸ This process of semantic change plausibly involves usage-based bridging contexts, and pragmatic strengthening as alluded to above in the discussion of the examples in (25). Hence, the discussion of how the metaphoric conception for (28) arises, described above, is likely to relate to an earlier stage in the language, before the lexical concept in (32b) had become conventionally associated with the form in (32a), i.e., before it had unit-like status.
18. For detailed discussion of the way in which metaphoric lexical concepts emerge from literal lexical concepts, see the discussion of the emergence of the state lexical concepts from the spatial senses for in, on and at in Evans (2010).
5.2. Metonymy
I now turn, briefly, to the LCCM account of metonymic conceptions. I do so in order to contrast this with the LCCM account of metaphoric conceptions. In this section I will consider the example in (33) in order to illustrate the way metonymic conceptions are derived.
(33) The ham sandwich asked for the bill
As we saw with the earlier analysis of the example in (16) and the analysis of metaphoric conceptions, one aspect of language understanding that is common to both metaphor and metonymy in the LCCM account is that language understanding involves activation of cognitive models in the secondary cognitive model profile of a particular lexical concept. Hence, clash resolution is required, which is the distinguishing feature of figurative as opposed to literal meaning construction (the other features, salience and complexity, are also involved, although these phenomena are also involved in literal language processing). In the utterance in (33) the lexical concept [ham sandwich] undergoes interpretation in conjunction with the informational characterisation asked for the bill. However, there is a clash between the informational characterisation and the primary cognitive model profile of [ham sandwich]. After all, a ham sandwich is not, normally, conceived of as an animate entity that can ask for the bill. Due to the Principle of Context-induced Clash Resolution, the customer who ordered the ham sandwich is identified as the figurative target, and the ham sandwich is identified as the figurative vehicle. Accordingly, it is the cognitive model profile associated with the lexical concept [ham sandwich] which becomes the site for clash resolution. Following the Principle of Ordered Search, the search region for clash resolution is expanded to take in secondary cognitive models associated with [ham sandwich]. A partial cognitive model profile for [ham sandwich] is provided in Figure 12. In this example, clash resolution is achieved by virtue of a search occurring in the secondary cognitive model profile of [ham sandwich]. The cognitive model which achieves activation is that of restaurant customer.
Figure 12. Partial cognitive model profile for [ham sandwich]
5.3. Metaphor versus metonymy
As observed earlier, it has often been pointed out that metonymy, but not metaphor, has a referential function: one entity serves to stand for, or identify, another, as in a ham sandwich serving to identify the particular customer who ordered the ham sandwich. In contrast, previous scholars have variously argued that metaphor serves to frame a particular target in terms of a novel category, e.g., My job is a jail (e.g., Glucksberg 2001; Carston 2002), or analogy, e.g., Juliet is the sun (e.g., Gentner et al., 2001). That is, the prototypical linguistic metaphor has what we might very loosely refer to as a predicative function. From the perspective of LCCM Theory, the distinction between the prototypical functions of metaphor and metonymy relates to whether the figurative target and figurative vehicle exhibit alignment, and hence whether the clash resolution site corresponds to the figurative target. To illustrate, let's reconsider the metaphoric conception of My boss is a pussycat. In this example, the figurative target is the lexical concept [boss] and the figurative vehicle is [pussycat]. Following the Principle of Context-induced Clash Resolution, the cognitive model profile for [pussycat], the figurative vehicle, is the clash resolution site: activation of a secondary cognitive model takes place here. This situation differs with respect to metonymy. In the ham sandwich example, the customer corresponds to the figurative target, as determined by the Principle of Context-induced Clash Resolution, and the figurative vehicle corresponds to the ham sandwich. However, both contextually salient elements
are accessed via the cognitive model profile associated with a single lexical concept: [ham sandwich]. In other words, there is alignment, in a single cognitive model profile, of the figurative target and vehicle. Hence, the site of clash resolution corresponds to the access route for the figurative target: customer. In sum, LCCM Theory reveals a divergence in the prototypical properties of metaphor and metonymy, which emerges as an outcome of the application of regular meaning construction mechanisms. Figurative conceptions which are labelled as metonymic arise due to the figurative vehicle facilitating direct access to the figurative target due to alignment of the figurative vehicle and target in the same lexical concept and cognitive model profile. In contrast, metaphoric conceptions arise due to a divergence between figurative vehicles and targets across two distinct lexical concepts. In the final analysis, metaphor and metonymy are terms that have been applied by different scholars to a range of overlapping and sometimes distinct figurative language phenomena. What emerges from the LCCM account is that the intuitions that lie behind the application of these terms to data of particular kinds are a function of a small set of compositional mechanisms that are guided by various sorts of constraints (the principles identified in this paper). Although only a small set of data has been considered in this paper, I argue that the application of these mechanisms and principles gives rise to a range of figurative conceptions which, in terms of discourse functions, are continuous in nature. That is, from the perspective of language understanding, while there are what might be thought of as symptoms of metaphor and metonymy, there is not always a neat distinction that can be made that serves to identify where metaphor ends and metonymy begins.
6. LCCM Theory in comparison and contrast
In this section I consider how LCCM Theory interfaces with two theories of backstage cognition. I argue that it refines how the theoretical construct of the conceptual metaphor is viewed, treating it as but one type of knowledge which is important in figurative language understanding. Some aspects of my claims, therefore, may be at odds with Conceptual Metaphor Theory as classically formulated. Nevertheless, I emphasise that the importance and status of the notion of conceptual metaphor as a theoretical construct is maintained in the present account. I also argue that LCCM Theory is continuous with Conceptual Blending, conceived here in terms of a research programme rather than a defined theory with a single overarching conceptual mechanism, i.e., blending, in the sense of Fauconnier and Turner (2002). I also consider how LCCM Theory differs from, and complements, what is arguably the best developed theory of grammar in cognitive linguistics: Cognitive Grammar.
6.1. Knowledge types involved in figurative language understanding
The LCCM Theory perspective assumes that figurative language understanding involves a number of different knowledge types. I therefore begin with this. One type of knowledge involves what have been termed primary conceptual metaphors (Grady 1997; Lakoff and Johnson 1999). These are hypothesised to be cross-domain conceptual primitives that arise automatically on the basis of pre-conceptual and universally-shared experience types. However, some of the proposed primary metaphors, e.g., what Lakoff and Johnson dub the Moving Observer and Moving Time metaphors, may not in fact be universal. Based on linguistic and gestural evidence, the Andean language Aymara appears not to have motion-based Ego-centred conceptual metaphors (Núñez and Sweetser 2006). While there are likely to be no more than a few hundred primary metaphors (Grady p.c.), much work still remains to establish the full set. A second knowledge type involves what have been referred to as complex metaphors (Lakoff and Johnson 1999) or compound metaphors (Grady 1997, 2005). These are, in effect, complex bodies of knowledge arising through processes of conceptual integration (in the sense of Fauconnier and Turner). Hence, they are a type of (often very complex) blend. Specific proposals as to how these arise have been made by Grady (1997, 2005) and indeed Fauconnier and Turner (e.g., 2008; see also Evans To appear). The common denominator in primary and complex metaphors is that they involve knowledge that is recruited from other regions of conceptual space, which is to say, from other domains of experience. In LCCM Theory I assume that primary and complex metaphors structure the cognitive models that make up a lexical concept's cognitive model profile, as we shall see below. Hence, on the present account, conceptual metaphors (whether primary or complex) form part of the knowledge to which an open-class lexical concept potentially facilitates access.
Hence, they form part of the conventional body of knowledge that is potentially invoked by any given lexical item during the process of figurative language understanding. In addition to knowledge of this type, lexical concepts facilitate what I refer to as semantic affordances. Semantic affordances (elaborated on in more detail below) are the knowledge types that are immanent in the cognitive model profile, prior to additional structuring via conceptual metaphor. For instance, the lexical concept associated with the form whizzed by provides a number of possible interpretations that arise purely on the basis of the cognitive models to which it facilitates direct access (primary cognitive models), and indirect access (secondary cognitive models). These inferences constitute semantic affordances. Moreover, semantic affordances are activated during the process of (figurative) language understanding due to the operation of the normal processes of lexical concept integration and interpretation, as mediated by context, as described above. For instance, semantic affordances potentially activated by the selection of the lexical concept [whizzed by] might include rapid motion, a distinct audible sound, lack of detail associated with the object of motion, and limited durational elapse to observe object of motion, as well as many others. I argue (below) that semantic affordances and relational structure recruited via conceptual metaphor are both important in giving rise to the interpretation associated with any given open-class lexical concept during figurative language understanding. In order to make more explicit the respective contribution of the types of knowledge just alluded to, I present below my assumptions regarding their respective contribution in figurative language understanding, before providing details of how this works in practice in the next section.
Assumption 1: Conceptual metaphors underdetermine (figurative) linguistic utterances.
Assumption 2: Figurative semantic affordances arise when a lexical concept facilitates activation of aspects of a secondary cognitive model profile, due to clash resolution.
Assumption 3: Linguistically-mediated meaning construction always involves a linguistically-informed process of interpretation. In figurative language understanding this may involve activation of conceptual metaphors and semantic affordances.
Assumption 4: Conceptual metaphors (in LCCM Theory) provide a special type of knowledge structure which holds at the level of cognitive models: they provide primary cognitive model profiles with a level of structure which complements existing cognitive models (within a cognitive model profile).
I briefly elaborate on each of these assumptions.
Assumption 1: There are good grounds for thinking that conceptual metaphors, while part of the story, actually underdetermine the linguistic metaphors that show up in language use. For instance, consider the conceptual metaphor STATES ARE LOCATIONS. As I argue in previous work (Evans 2010),¹⁹ this conceptual metaphor does not predict why there are different patterns in the sorts of states that can be encoded by different prepositions in English:
(34) a. She is in love (cf. *She is on love)
     b. The soldiers are on red alert (cf. *The soldiers are in red alert)
19. See Evans (2004: Ch. 4) for related arguments for the underspecification of linguistic temporal conceptions by conceptual metaphors for time.
That is, if the conceptual metaphor STATES ARE LOCATIONS directly motivated language use, we would expect both in and on to be able to encode states such as love and red alert. As I argue in detail in Evans (2010), the reason they cannot is due to the linguistic content of the lexical concepts specific to the forms in and on, and language use, rather than due to an over-arching conceptual metaphor. Of course, this does not preclude the existence of an overarching conceptual metaphor: STATES ARE LOCATIONS. And I assume the existence of conceptual metaphors, as discussed below. Let's take another example. In previous work (e.g., Evans 2004: Ch. 4) I showed that conceptual metaphors in the domain of time underdetermine conventional patterns evident in language. Consider the following examples by way of illustration. They all involve the lexical item time, and a verbal complement relating to a motion event:
(35) a. The time for a decision has come        [temporal moment]
     b. Time drags (when you're bored)          [protracted duration]
     c. Time flies (when you're having fun)     [temporal compression]
     d. Time flows on (forever)                 [temporal matrix]
I argued in Evans (2004; see also Evans 2005) that the form time in each of these examples is conventionally paired with a distinct lexical concept (indicated in square brackets). Not only does the grammatical encoding associated with the lexical concept vary across the examples in predictable ways, so do the semantic arguments. That is, the semantic value associated with time in each example is paired with a restricted range of semantic arguments. For instance, the [temporal moment] lexical concept for time can only collocate with motion events which involve deictic (and often terminal) motion. In contrast, the [temporal matrix] lexical concept, which relates to time as an ontological category (our conceptualisation of time as the event in which all other events occur), can only occur with non-terminal motion events. Only certain types of motion events can collocate with specific types of temporal concepts. Importantly, the various conceptual metaphors for time that have been proposed in the literature do not predict this fact.
Assumption 2: A semantic affordance is an inference that is specific to a given lexical concept. It arises during figurative (and indeed non-figurative) language understanding. It is due to activation of (part of) a cognitive model to which the lexical concept facilitates access. A lexical concept can, in principle, facilitate activation of a vast number of semantic affordances, only constrained by the cognitive model profile to which it facilitates access. Moreover, a lexical concept can give rise to more than one semantic affordance in any utterance, a consequence of the extra-linguistic context (venue, time, interlocutors), the linguistic context, and the processes of meaning construction which apply.
To illustrate, consider the following utterances:
(36) a. Christmas is approaching
     b. Christmas whizzed by (this year)
Conceptual Metaphor Theory, for instance, claims that the ego-centred conceptual metaphors for Moving Time (e.g., Lakoff and Johnson 1999; Moore 2006) allow us to understand (the passage of) time in terms of the motion of objects through space, thereby licensing these examples. While these examples are no doubt, in part, a consequence of conceptual metaphors for time (for instance, in terms of their location in time, as either being future, as with (36a), or past, as with (36b)), the forms approaching and whizzed by give rise to distinct and distinctive semantic affordances. These cannot be predicted solely on the basis of the common conceptual metaphor that is meant to license these examples (in Conceptual Metaphor Theory). For instance, the semantic affordance associated with the lexical concept [approaching] relates to relative imminence. The occurrence of the event in question, which in (36a) concerns Christmas, is construed as imminent. In contrast, the semantic affordance associated with [whizzed by] in (36b) has to do not with imminence, but with the perceived compressed durational elapse associated with the observer's experience of Christmas. In other words, the semantic affordance relates to the phenomenological experience that, on the occasion referred to in (36b), Christmas felt as if it lasted for a lesser period than is normally the case. While the Moving Time conceptual metaphor (I argue below) allows the language user to apply relational structure from our experience of objects moving in space, and so interpret Christmas metaphorically as an object, part of the interpretation that arises also involves semantic affordances that are unique to given lexical concepts for motion.
In other words, as the inferences just mentioned are specific to lexical forms, it is theoretically more accurate to assume that this aspect of meaning construction involves a bottom-up process: they arise due to activation of knowledge (i.e., semantic affordances) specific to the lexical concepts in question, rather than a top-down process of overarching conceptual metaphors.
Assumption 3: My third assumption is that conceptual metaphors and semantic affordances provide two complementary types of knowledge which are essential to figurative language meaning construction. LCCM Theory assumes that language use, and specifically figurative conceptions, draw on a number of different types of knowledge. These include purely linguistic knowledge, as well as conceptual knowledge. The semantic dimension of linguistic knowledge is modelled in terms of the theoretical construct of the lexical concept, which constitutes a bundle of different knowledge types as briefly described earlier (see Evans 2009b for full details). Conceptual knowledge takes different forms
and, as mentioned above, includes (at the very least) primary cognitive models, secondary cognitive models, and conceptual metaphors, which structure primary cognitive models in terms of structure recruited from other domains. As LCCM Theory takes a usage-based perspective, I assume that any utterance will always involve invocation of various knowledge types in producing a conception, including context of use. The difference, in terms of processing effort, associated with producing any given conception, is likely to be a consequence of the factors considered earlier in the paper, in particular salience and complexity.
Assumption 4: Finally, I assume that conceptual metaphors (in LCCM Theory) hold at the level of cognitive models. They structure the primary cognitive model(s) to which an open-class lexical concept facilitates access. This means that the cognitive model profile for a lexical concept such as [christmas] has enhanced conceptual structure. This lexical concept, for instance, potentially facilitates access to relational knowledge concerning the motion of objects through space. This allows language users to invoke inferences associated with objects in motion in order to understand temporal relations involving the relative location in time of the temporal event Christmas. I illustrate how this might work in practice in the next section.
6.2. The status of conceptual metaphors in LCCM Theory
Thus far in this paper I have been dealing with how figurative conceptions arise. And I have done so without recourse to conceptual metaphors: stable cross-domain mappings which inhere in long-term memory (Lakoff and Johnson 1980, 1999). In this section I detail the status of conceptual metaphors in LCCM Theory, and specifically in the LCCM approach to figurative language understanding. In so doing, I attempt to illustrate the respective role(s) of conceptual metaphors and semantic affordances (the latter arising via clash resolution, in terms of figurative language understanding). Nevertheless, a caveat is in order. The ensuing analysis is meant to be indicative rather than definitive. Ongoing research within LCCM Theory seeks to establish the nature of the intersection between semantic affordances and conceptual metaphors in the domain of time. The proposals below should therefore be viewed as being programmatic, and may be subject to revision as the interaction between linguistic and conceptual knowledge in figurative language understanding becomes better understood. To illustrate the interaction between conceptual metaphors and semantic affordances, I make use of (36a), which I revise as (37):
(37) Christmas is approaching (us)
Before discussing in more detail the conception associated with this utterance, and how this arises, I want to first focus on the cognitive model profile for [christmas]. In particular, I focus on the way in which this cognitive model profile is structured by a conceptual metaphor. The lexical concept [christmas] facilitates access to a number of primary cognitive models, as illustrated in Figure 13. These include knowledge relating to Christmas as a cultural festival, including the exchange of gifts and other cultural practices. The second type of knowledge relates to Christmas as a temporal event. This includes a whole host of temporal knowledge, as illustrated by the attributes and values associated with the temporal event cognitive model (attributes and values are subsets of the knowledge that make up a cognitive model; see Evans 2009b for detailed discussion). For instance, part of our knowledge relating to a temporal event is that it can be situated in the past, present, and future. A further attribute relates to the nature of the durational elapse associated with the event, which is to say its duration. This attribute has a number of values associated with it. Moving from right to left, the first is temporal compression: the underestimation of time, which is to say, the experience that time is proceeding more quickly than usual. The second is synchronous duration: the normative estimation of time, which is to say, the experience of time unfolding at its (culturally and phenomenologically)
Figure 13. Partial primary cognitive model profile for [christmas]
standard or equable rate. The final value is protracted duration. This relates to an overestimation of duration, which is to say the felt experience that time is proceeding more slowly than usual. The final primary cognitive model diagrammed in Figure 13 is that of Christmas as a religious festival. This relates to knowledge concerning the nature and status of Christmas as a Christian event, and the way in which this festival is enacted and celebrated. In addition, the primary cognitive models for [christmas] recruit structure from other cognitive models via conceptual metaphor. That is, as operationalised in LCCM Theory, a conceptual metaphor provides a stable link that allows aspects of conceptual content encoded by one cognitive model to be imported so as to form part of the permanent knowledge representation encoded by another. For instance, the primary cognitive model temporal event is structured via a conceptual metaphor in terms of a stable, long-term link holding between it and the cognitive model relating to an object in motion along a path. As such, the cognitive model, object in motion along a path, which is represented in Figure 13 by virtue of a circle located on a path, with the arrow indicating direction of motion, provides the temporal event cognitive model with relational structure concerning our knowledge of objects undergoing motion along a path. The conceptual content recruited via conceptual metaphor is indicated by the dashed lines. Specifically, relational structure from this cognitive model is inherited by the past, present, and future attributes, such that content relating to the region of the path behind the object serves to structure, in part, our experience of pastness, conceptual content relating to the object's present location serves to structure, in part, our experience of the present, and content relating to that portion of the path in front of the object serves to structure our experience of futurity.
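The stable links just described can be pictured as a fixed lookup from regions of the object in motion along a path cognitive model to attributes of the temporal event cognitive model. The following Python sketch is only a hedged illustration under that reading: the dictionary name `MOVING_OBJECT_TO_TEMPORAL_EVENT` and the function `temporal_attribute` are hypothetical, and a plain dictionary is an assumption that ignores the partial ("in part") nature of the structuring.

```python
# Hypothetical encoding of the stable, long-term links described in the text:
# regions of the 'object in motion along a path' cognitive model structure
# the past, present and future attributes of the 'temporal event' model.
MOVING_OBJECT_TO_TEMPORAL_EVENT = {
    "path behind the object": "past",
    "object's present location": "present",
    "path in front of the object": "future",
}


def temporal_attribute(path_region):
    """Return the temporal attribute structured by a given path region."""
    return MOVING_OBJECT_TO_TEMPORAL_EVENT[path_region]


print(temporal_attribute("path in front of the object"))  # future
```

Because the mapping is stored rather than computed, the sketch reflects the claim that a conceptual metaphor forms part of the permanent knowledge representation, available before any particular utterance is interpreted.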
This is indicated by the dashed lines which map the relevant portions of the path of motion from the object in motion along a path cognitive model onto the relevant attributes: future, present, past. In addition, content relating to the nature of motion is inherited by the duration attribute. Again this is captured by the dashed arrow, which links the arrow (signifying motion) with the duration attribute. Now I return to addressing the figurative conception that arises for the utterance in (37). In Conceptual Metaphor Theory, this expression is held to be motivated by a conceptual metaphor, the so-called Moving Time metaphor (see Lakoff and Johnson 1999). From the LCCM perspective, an expression such as this involves, first and foremost, a sentence-level lexical concept which encodes what I refer to as a temporal frame of reference, or TFoR for short. Akin to spatial frames of reference (e.g., Levinson 2003; see also Talmy 2000), TFoRs are complex symbolic units, involving a form and an internally open closed-class lexical concept. Being internally open, the TFoR lexical concept can be integrated with other lexical concepts, notably [christmas] and
Figure 14. Representation of the linguistic content encoded by [location of event in time, from perspective of event]
[approaching], each of which facilitates access to cognitive model profiles. As noted above, I assume that conceptual metaphors operate at the level of cognitive model(s), providing an additional level of knowledge which lexical concepts, e.g., the temporal nominal lexical concept [christmas], can activate during regular processes of meaning construction, as I explain below. First, let's briefly examine the nature of the TFoR lexical concept which sanctions the instance in (37). The TFoR symbolic unit for (37) is given in (38):
(38) a. Form: NP1 VERBAL COMPLEX OF DIRECTED MOTION (NP2)
     b. Lexical concept: [location of event in time, from perspective of event]
The lexical concept in (38b) encodes the following. There is an event (E) which is located in time with respect to an experiencer which serves as the reference point (RP). Additionally, the temporal location is viewed from the perspective point (PP) of the event. This can be represented diagrammatically as in Figure 14. The linguistic content encoded by the lexical concept illustrated in Figure 14 is highly schematic in nature. It does not relate to the phenomenological experience of what it feels like, for instance, to experience the passage of time. Nor does it encode phenomenologically rich notions relating to the experience of pastness or futurity. That is, this lexical concept simply encodes a relation holding between an event and the RP: the present. In other words, what gets into language, so to speak, in terms of linguistic content, is a highly parameterised version of temporal experience.²⁰ It says nothing about whether the event is located in the future or the past with respect to the RP. This rich inference emerges following interpretation, once open-class lexical concepts have been integrated with the TFoR lexical concept. For this reason, the time line in
20.
See Evans (2009b) for discussion on the notion of paramaterisation in language.
650
V. Evans
Figure 14 has no directionality. In addition to this schematic content, the lexical concept also encodes details as to what types of lexical concepts and forms can fill the various slots that make it up. This I refer to as its lexical profile. This includes the following: NP1 must be a temporal event of some kind, and the optional NP2 (signalled by the parentheses in (37)) must be an experiencer of some kind. The verbal complex of directed motion must relate to motion events that can be construed as facilitating arrival at the experiencer. These include verbs of deictic motion, such as come, verbs of terminal motion, such as approach, verbal complexes involving increase in proximity, such as get/move closer, or verbs of motion which are manner-neutral, such as move, but which are paired with a path satellite of directed motion, such as up on, to give the verbal complex move up on, and so on. In a typical conception arising on the basis of (37) three specific inferences arise which collectively make up the conception. These can be summarised as follows: i) ii) iii) The utterance relates to a temporal scenario rather than one involving veridical motion. The temporal event of Christmas is located in the future with respect to our understanding of the present which is implicit, although not explicitly mentioned, in the utterance. The future event of Christmas is interpreted as being relatively imminent with respect to the present.
Let's consider how the processes of meaning construction developed in LCCM Theory account for these. In so doing, we'll see the role conceptual metaphors play in the theory.

In terms of the first issue, I argue that the language user recognises the utterance as relating to a temporal scenario (rather than one involving motion) in precisely the same way as the idiomatic meaning of He hit the roof is instantly recognised. The TFoR lexical concept presented in (38b) is highly salient, in the sense discussed earlier: it is well entrenched in semantic memory. The lexical concept serves as a frame for interpreting the open-class lexical concepts (those associated with the forms Christmas and approaching), allowing them to achieve an informational characterisation relating to a temporal scene.

Turning now to the second issue, how is it that the utterance is understood as relating to a temporal event which is located in the future? The answer, I suggest, relates to the existence of the ego-centred conceptual metaphor time is motion of objects (along a path), also known as Moving Time, which structures the cognitive model profile of [christmas]. The inference arising from (37), that the event of Christmas is situated in the future, is due to matching between the primary cognitive model of [christmas], involving spatial content recruited via conceptual metaphor, and the primary cognitive model profile accessed via [approaching]. That is, the conceptual metaphor structures the primary cognitive model temporal event, providing it with relational structure recruited from a cognitive model relating to motion through space. Hence, in terms of the utterance in (37), matching is achieved in the primary cognitive model profiles of both [christmas] and [approaching]. After all, due to the conceptual metaphor, [christmas] facilitates access to relational structure derived from the motion scenario involving an object in motion. This knowledge forms part of the temporal event cognitive model. This is matched with the kind of terminal motion accessed via [approaching]. The cognitive model profile associated with [approaching] involves motion towards an entity, and hence the object in motion is in front of the entity which it is approaching. As the future attribute of the temporal event cognitive model accessed via [christmas] is structured in terms of that part of the motion trajectory that is in front, there is a match. The resulting match involves an interpretation in which the temporal event of Christmas is located in the future. In other words, this particular interpretation is a consequence of a special type of matching I refer to as conceptual metaphor matching.

Importantly, LCCM Theory assumes that in cases of conceptual metaphor matching, regular matching (as described in section 4.1) still takes place. In other words, conceptual metaphor matching involving primary cognitive models does not prohibit additional figurative semantic affordances arising on the basis of activation in the secondary cognitive model profile of one of the lexical concepts undergoing matching (and clash resolution).
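The conceptual metaphor matching just described can be illustrated with a minimal sketch, under strong simplifying assumptions. The reduction of a cognitive model to a flat dictionary, the names TEMPORAL_EVENT, APPROACHING and conceptual_metaphor_match, and the path regions assigned to the present and past attributes are all hypothetical conveniences introduced for illustration; they are not part of the formal apparatus of LCCM Theory.

```python
# Hypothetical, highly simplified rendering of conceptual metaphor matching.
# Cognitive models are reduced to plain dictionaries; the names below are
# illustrative assumptions, not constructs defined by LCCM Theory.

# Primary cognitive model accessed via [christmas]. The Moving Time metaphor
# structures each temporal attribute in terms of a region of a motion path.
# Only the "future" row is stated in the text; "present" and "past" are
# assumed by analogy.
TEMPORAL_EVENT = {
    "future": "in front of reference point",
    "present": "co-located with reference point",
    "past": "behind reference point",
}

# Primary cognitive model profile accessed via [approaching]: motion towards
# an entity places the moving object in front of that entity.
APPROACHING = {"moving_object_region": "in front of reference point"}


def conceptual_metaphor_match(temporal_model, motion_model):
    """Return the temporal attributes whose spatial structure matches the
    path region supplied by the motion lexical concept."""
    region = motion_model["moving_object_region"]
    return [attribute
            for attribute, structured_as in temporal_model.items()
            if structured_as == region]


matches = conceptual_metaphor_match(TEMPORAL_EVENT, APPROACHING)
print(matches)  # ['future']
```

On this toy rendering, the only attribute that matches is the future, which corresponds to the second inference for (37). The sketch deliberately leaves out regular matching and clash resolution, which LCCM Theory assumes continue to operate alongside conceptual metaphor matching.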
The third and final issue relates to the inference that the temporal event of Christmas in (37) is relatively imminent. This interpretation arises, I argue, due to the regular process of matching as described in section 4.1 above: the fact that conceptual metaphor matching has occurred does not preclude further matching. Matching, as guided by the previously introduced Principles of Interpretation, attempts to build an informational characterisation for [christmas] and [approaching] by first searching the primary cognitive models of both these open-class lexical concepts. As Christmas is a temporal, cultural, and religious event, and hence something that cannot undergo the sort of veridical motion implicated by the primary cognitive model profile associated with [approaching], a clash arises. This necessitates clash resolution. Due to the Principle of Context-induced Clash Resolution, introduced above, [christmas] is designated as the figurative target, and [approaching] the figurative vehicle. The consequence is that a search is established in the secondary cognitive model profile of [approaching]. A very partial cognitive model profile for [approaching] is provided in Figure 15.

Figure 15. Partial cognitive model profile for [approaching]

The cognitive model profile for [approaching] includes primary cognitive models for a target location, the directed motion of an entity, and the imminence of arrival of an entity. A consequence of the relative imminence of arrival of an entity is the imminence of occurrence of event, which is a secondary cognitive model. As a temporal event such as Christmas can occur, but not (literally) arrive, there is a match between the secondary cognitive model imminence of occurrence of event and the primary cognitive model profile of [christmas]. Hence, the interpretation of the imminence of the occurrence of Christmas is due to a semantic affordance, which results from clash resolution following regular matching.

This analysis reveals that the interpretation of (37) involves more than simply a conceptual metaphor. A number of different knowledge types are involved, and regular processes of meaning construction take place, as modelled by LCCM Theory. This involves understanding the temporal event as an object that can undergo motion (via conceptual metaphor), and hence as located in the future, and understanding, through clash resolution, that the type of motion involved implicates relative imminence of occurrence. The latter is achieved without recourse to conceptual metaphor: it is a semantic affordance.

6.3. The relationship between LCCM Theory and Conceptual Blending
Conceptual blending (Coulson 2000; Fauconnier and Turner 1998, 2002, 2008) is held to be a mechanism that is central to the way we think. It provides a means of integrating and compressing often very complex knowledge, typically in the process of ongoing meaning construction. Blending involves the setting up of an integration network, the purpose of which is to facilitate integration, and more precisely, the blending together of elements from a number of distinct mental spaces (known as inputs). Knowledge from the inputs is projected to the blend selectively, in service of the particular inference or meaning under construction. This leads to a process whereby inputs contribute some, but not all, of their content. The selectively projected knowledge is then integrated in the blended space in a process known as composition. Once this has happened, the composed elements may require further knowledge to be recruited to complete the blend that is emerging. This further process of knowledge recruitment is known as pattern completion. Finally, the blended space provides a means of allowing us to do inferential work. We can use the blend for ongoing reasoning, and can even extend and further elaborate the blend. This is known as running the blend.

The proposals provided in this paper can be construed as representing a detailed account of the linguistically-mediated mechanisms involved in composition: one of the central drivers of conceptual blending. After all, linguistically-mediated composition presumably involves the activation of knowledge in ways that facilitate a coherent interpretation. The process of clash resolution, one of the symptoms of figurativity described in this paper, presents a mechanism for achieving integration of knowledge leading to coherence, and hence satisfying, in principle, the various goals and subgoals of Blending, although the way in which this might be achieved hasn't been worked out here.

That said, meaning construction is exquisitely complex. While Blending Theory has attempted to provide a single well-articulated and coherent account of meaning construction, it is highly unlikely, to my mind, that the range of phenomena claimed to exhibit conceptual integration, in the terms of Fauconnier and Turner (e.g., 2002), in fact arise from a single mechanism.
For instance, conceptual blending, a single unified mechanism, is held to be responsible for phenomena as diverse as neurological binding, solving riddles, performing mathematical calculations, and the creation of novel words and compound coinages, as well as grammatical constructions. While these phenomena involve integration of some kind, it is far from clear that a single set of mechanisms and unified principles can adequately account for the range of knowledge types and neurological mechanisms involved.

In view of this, I suggest the following. If we allow blending to be interpreted more broadly as a research programme (rather than a theory), language (and cognitive) scientists are provided with a fresh and important perspective for investigating meaning construction. The truly notable finding that arises from Fauconnier and Turner's research on blending is that integration does indeed appear to be ubiquitous: it is central to the way we think. It is in this spirit that LCCM Theory is put forward. The LCCM perspective offered in this paper presents a reasonably detailed first pass at accounting for how knowledge accessed via linguistic inputs undergoes composition, in service of figurative meaning construction. Linguistically-mediated composition, as studied here, is one of the (probably many) compositional integration types that are necessary to produce meaning. The other salient integration type identified by Fauconnier and Turner is referred to as pattern completion (which itself is probably a complex category of different types of integration). Thus, LCCM Theory represents an attempt to model one specific type of composition, which is one type of integration. It forms part of what is envisaged to be a large-scale study of integration mechanisms involving linguistic and other types of knowledge, in producing meaning.

6.4. LCCM Theory and Cognitive Grammar
I now briefly consider the way in which LCCM Theory is distinct from Cognitive Grammar (e.g., Langacker 1987, 1991, 2008). I address two specific issues: theoretical focus, and encyclopaedic semantics. I argue that LCCM Theory has distinct (albeit complementary) theoretical foci. It also provides, I argue, a nuanced perspective on the approach to encyclopaedic semantics advocated by Cognitive Grammar.

Cognitive Grammar represents an attempt to develop a cognitively-realistic account of grammatical representation and structure. In so doing, Cognitive Grammar develops an account of the way linguistic units (what are referred to as symbolic units) are integrated in producing larger grammatical units. This account assumes a central role for semantics in grammatical compositionality. Langacker argues that grammatical structure arises due to a distinction between conceptually independent and conceptually dependent lexical structures. Conceptually dependent lexical structures are relational in the sense that they have schematic trajectors (TRs) and landmarks (LMs) which form part of their semantic representation. The distinction between a TR and an LM relates to a distinction in focal prominence in what Langacker refers to as a profiled relationship. Profiling concerns the attribution of attention to a particular entity or relationship by virtue of encoding in language. To illustrate, consider the utterance in (39):

(39) The boy smashed the vase

The TR relates to the participant in the relationship being profiled which receives focal prominence. That is, in (39) the TR is the participant designated by the boy. In contrast, the LM is the participant in the profiled relationship which receives secondary prominence. In (39) the LM corresponds to the entity designated by the vase.
One consequence of this is that what counts as a TR or an LM is encoded as part of linguistic content by the relational or conceptually dependent lexical concept (e.g., smashed), rather than the conceptually independent or nominal lexical concepts (e.g., boy, vase). To illustrate, consider (40):

(40) The vase fell

In this example the vase corresponds to the TR. This follows as it occupies the schematic TR slot encoded by the relational lexical concept associated with the form fell. Langacker refers to the schematic TRs and LMs encoded by conceptually dependent lexical concepts as elaboration sites (or e-sites for short), and the profiling of these e-sites as elaboration. From the perspective of Cognitive Grammar, then, compositionality is a consequence of conceptually dependent lexical concepts becoming elaborated by nominal lexical concepts which are conceptually autonomous.

This is not the whole story, of course. Any cognitively realistic account of compositionality must provide an account of how the level of semantic structure that is encoded by language, or that results from the integration of grammatical structures, as in the case of elaboration in the sense of Langacker, interfaces with what I refer to, in LCCM terms, as conceptual content. In Cognitive Grammar, this latter level of semantic representation is broadly referred to as encyclopaedic knowledge. Langacker argues that words directly encode what I operationalise in terms of conceptual content. Conceptual content is modelled, in Cognitive Grammar, in terms of a theory of conceptual domains, with a word designating a profile against some base, which relates to a subset of some domain or domains. Yet, not only is the notion of a domain not worked out in any great detail, it is not clear how the result of integration at the linguistic (or grammatical) level then interfaces with this encyclopaedic knowledge at the level of an utterance in order to produce an utterance-level meaning: a conception. That is, it is not clear how this level of knowledge representation interfaces with the linguistic or grammatical level, and what the mechanisms are whereby structure from the perceptually rich domains becomes incorporated with grammatical structures.
To be fair to the account developed by Langacker, the model he develops is not primarily concerned with the details of semantic composition. Rather, he is primarily exercised by the attempt to develop a semantically based account of linguistic organisation and structure (a grammar), which can account for issues such as constituency, and the combinatorial properties of the formal aspects of language. In view of this, LCCM Theory can be seen as complementing the research perspective provided by Cognitive Grammar's account of grammatical organisation. LCCM Theory diverges from Cognitive Grammar in that it is concerned precisely with the nature of semantic representation, as well as the mechanics of semantic composition. Moreover, given its foundational assumption that semantic structure and conceptual structure constitute distinct kinds of representation, it follows that I posit two distinct processes of composition: lexical concept integration, which relates to fusion of linguistic content, and interpretation, which concerns fusion of conceptual content.

I now briefly address the thesis of encyclopaedic semantics. More than any other researcher in cognitive linguistics, Langacker (1987, 1991, 2008) has been responsible for developing this thesis. He does this in adducing a conceptual semantics that underpins his theory of Cognitive Grammar. Langacker's view of encyclopaedic semantics is based on two assumptions: (i) that the semantic structure associated with words directly accesses conceptual structure, and (ii) that words and other symbolic units cannot be understood independently of the larger knowledge structures, the encyclopaedic domains of conceptual knowledge, to which words serve as points of access. In essence, Langacker's claim is that semantic structure is equivalent to conceptual structure; that is, the semantic structure associated with a lexical form is conceptual structure.

LCCM Theory offers a somewhat nuanced perspective. On my account, the thesis of encyclopaedic semantics oversimplifies matters. It blurs the boundaries between linguistic and conceptual knowledge. While marking such boundaries may not be necessary in Cognitive Grammar, for instance, which is ultimately concerned with accounting for formal properties of linguistic organisation, such a situation is unsatisfactory when attempting to account for the role of language in meaning construction, and specifically, figurative language understanding, as I am doing in this paper.
The claim at the heart of LCCM Theory, and one enshrined in the distinction between its two foundational theoretical constructs (the lexical concept and the cognitive model), is that what has, in cognitive linguistics, been treated as two qualitatively distinct, albeit related, aspects of semantic structure (schematic versus rich aspects of semantic content, as described, for instance, by Talmy (2000) in his distinction between content encoded by open and closed-class forms) in fact relates to very different types of representation that constitute different kinds of knowledge. While these two knowledge types interact in order to produce simulations, they nevertheless constitute different knowledge formats.

7. Conclusion

This paper has been concerned with an LCCM account of figurative language understanding. This account relates to the role of language in figurative language understanding and the way in which it interfaces with non-linguistic knowledge. A consequence of the meaning construction mechanisms proposed by LCCM Theory is the assumption that literal and figurative language arise from the same compositional mechanisms. They can be seen as points lying along a continuum of meaning construction, rather than being due to wholly different mechanisms. Analogously, metaphor and metonymy, as two particular exemplars of figurative language use, can be seen, from this perspective, as arising from similar meaning construction processes, differing in terms of the way meaning construction occurs. The key assumptions associated with the LCCM approach to figurative language can be summarised as follows:

i) there is continuity between figurative and literal language
ii) there is continuity between metaphor and metonymy
iii) figurative language understanding is a consequence of the nature of semantic representation and semantic composition, which is to say, essentially the same structures and processes as for literal language.
One of the motivations for the development of the LCCM Theory account in this paper has been to develop a joined-up cognitive linguistic account of figurative language understanding. This endeavour should be situated, I argue, within the perspective of seeking to account for conceptual integration in producing meaning. The two most influential theories of figurative language in cognitive linguistics are Conceptual Metaphor Theory and Conceptual Blending Theory. Yet both these approaches are concerned with (different aspects of) backstage cognition: stable knowledge structures in the conceptual system (in the case of conceptual metaphors), and dynamic aspects of meaning construction (in the case of conceptual blending). What is missing is a frontstage cognition perspective, one that takes account of the sophisticated nature of linguistic information encoded in language, and the way in which it interfaces with non-linguistic knowledge during meaning construction. This is what the LCCM project seeks to redress.

Moreover, the two backstage cognition accounts are sometimes presented as competing. For instance, Lakoff (2008) argues that there is not a dedicated process of Blending in the brain. For their part, Fauconnier and Turner (2008) claim that (even the most basic) conceptual metaphors may arise due to (a dedicated process of) Blending. For his part, Lakoff is probably right. The range of knowledge types and the processes involved in meaning construction are exquisitely complex. It is highly unlikely that the range and diversity of different types of knowledge, and the various ways in which they can be combined, follow from a single unified process, as proposed by Blending Theory. Yet, in identifying a programmatic framework, Fauconnier and Turner have made a significant contribution in focusing the challenge that lies ahead.
By developing their Blending framework, they have provided future researchers with a handle on the nature of the challenge, which allows us to begin to model the (probably) many different types of integration involved in meaning construction. And just as Lakoff is partially right, so too are Fauconnier and Turner. Save for a relatively small number of primitive conceptual metaphors, probably much of the (stable) knowledge that populates our conceptual systems is constructed through regular processes of meaning construction. The challenge remains to identify these processes. Fauconnier and Turner (2002), interpreted as providing programmatic proposals, have made an important start in this endeavour.

In the final analysis, the impoverished linguistic prompts that language users deploy in meaning construction are impressively sophisticated. The LCCM perspective attempts to reconcile the impulse to focus on backstage processes with an awareness of the complexity apparent in language. It also seeks to examine the linguistic processes involved in semantic composition, including how linguistic prompts signal which aspects of non-linguistic knowledge are activated in linguistically-mediated meaning construction: a frontstage cognition perspective. I argue that this perspective complements backstage accounts and is necessary to develop a fully-fledged science of integration. To build on the achievements of Lakoff, and Fauconnier and Turner (as well as others), holds out the possibility of a mature cognitive linguistic approach to the linguistic and non-linguistic mechanisms of integration. It is these mechanisms which underlie (figurative) language understanding, and which ongoing and future work must aim to model.

Received 13 May 2009
Revision received 8 March 2010

References
Barcelona, Antonio. 2000. Metaphor and metonymy at the crossroads. Berlin: Mouton de Gruyter.
Barnden, John. 2010. Metaphor and metonymy: Making their connections more slippery. Cognitive Linguistics, 21(1): 1-34.
Barsalou, Lawrence. 1999. Perceptual symbol systems. Behavioral and Brain Sciences, 22: 577-660.
Barsalou, Lawrence. 2005. Continuity of the conceptual system across species. Trends in Cognitive Sciences, 9: 309-311.
Barsalou, Lawrence. 2008. Grounded cognition. Annual Review of Psychology, 59: 617-645.
Blasko, Dawn and Cynthia Connine. 1993. Effects of familiarity and aptness on metaphor processing. Journal of Experimental Psychology: Learning, Memory and Cognition, 19: 295-308.
Bowdle, Brian and Dedre Gentner. 2005. The career of metaphor. Psychological Review, 112: 193-216.
Boroditsky, Lera. 2000. Metaphoric structuring: Understanding time through spatial metaphors. Cognition, 75(1): 1-28.
Cameron, Lynne. 1999. Identifying and describing metaphor in spoken discourse data. In L. Cameron and G. Low (eds.), Researching and applying metaphor, pp. 105-132. Cambridge: Cambridge University Press.
Carston, Robyn. 2002. Thoughts and utterances: The pragmatics of explicit communication. Oxford: Blackwell.
Casasanto, Daniel. 2010. Space for thinking. In V. Evans and P. Chilton (eds.), Language, cognition and space: The state of the art and new directions, pp. 453-478. London: Equinox Publishing.
Casasanto, Daniel and Lera Boroditsky. 2008. Time in the mind: Using space to think about time. Cognition, 106: 579-593.
Bangor University

Chatterjee, Anjan. 2010. Disembodying cognition. Language and Cognition, 21: 79116. Coulson, Seana. 2000. Semantic Leaps. Cambridge: Cambridge University Press. Coulson, Seana. 2008. Metaphor comprehension and the brain. In R. Gibbs (ed.), The Cambridge handbook of metaphor and thought, pp. 177196. Cambridge: Cambridge University Press. Coulson, Seana and Cyma Van Petten. 2002. Conceptual integration and metaphor: An ERP study. Memory and Cognition, 306: 958968. Deignan, Alice. 2005a. Metaphor and corpus linguistics. Amsterdam: John Benjamins. Deignan, Alice. 2005b. A corpus perspective on the relationship between metaphor and metonymy. Style, 7991. Evans, Nicholas and David Wilkins. 2000. In the minds ear: The semantic extensions of perception verbs in Australian languages. Language, 763: 546592. Evans, Vyvyan. 2004. The structure of time: Language, meaning and temporal cognition. Amsterdam: John Benjamins. Evans, Vyvyan. 2005. The meaning of time: Polysemy, the lexicon and conceptual structure. Journal of Linguistics, 411: 3375. Evans, Vyvyan. 2006. Lexical concepts, cognitive models and meaning-construction. Cognitive Linguistics 174, 491534. Evans, Vyvyan. 2007. Towards a Cognitive Compositional Semantics. In U. Magnusson, H. Kardela and A. Glaz (eds.), Further Insights in Semantics and Lexicography, pp. 1142. Poland: University Marie Curie University Press. Evans, Vyvyan. 2009a. Semantic representation in LCCM Theory. In V. Evans and S. Pourcel (eds.). New Directions in Cognitive Linguistics, pp. 2755. Amsterdam: John Benjamins. Evans, Vyvyan. 2009b. How words mean: Lexical concepts, cognitive models and meaning construction. Oxford: Oxford University Press. Evans, Vyvyan. 2010. From the spatial to the non-spatial: The state lexical concepts of in, on and at. In V. Evans and P. Chilton (eds.), Language, cognition and space: The state of the art and new directions, pp. 215248. London: Equinox Publishing. Evans, Vyvyan. To appear. A window on the mind. 
Oxford: Oxford University Press. Evans, Vyvyan and Melanie Green. 2006. Cognitive linguistics: An introduction. Edinburgh: Edinburgh University Press. Fauconnier, Gilles. 1994. Mental spaces. Cambridge: Cambridge University Press. Fauconnier, Gilles. 1997. Mappings in thought and language. Cambridge: Cambridge University Press. Fauconnier, Gilles and Mark Turner. 1998. Conceptual integration networks. Cognitive Science, 222: 33187. Fauconnier, Gilles and markMark Turner. 2002. The way we think: Conceptual blending and the minds hidden complexities. New York: Basic Books. Fauconnier, Gilles and Mark Turner. 2008. Rethinking metaphor. In R. Gibbs (ed.), The Cambridge handbook of metaphor and thought, pp. 5366. Cambridge: Cambridge University Press. Feldman, Jerome. 2006. From molecule to metaphor: A neural theory of language. Cambridge, MA: MIT Press. Gagnon L., P. Goulet, F. Giroux, and Y. Joanette. 2003. Processing of metaphoric and nonmetaphoric alternative meanings of words and right- and left-hemispheric lesion. Brain and Language, 87, 217226. Gallese, Vittorio and George Lakoff. 2005. The brains concepts: The role of the sensory-motor system in reason and language. Cognitive Neuropsychology, 22, 455479. Gentner, Dedre and Brian Bowdle. 2008. Metaphor as structure-mapping. In R. Gibbs (ed.), The Cambridge handbook of metaphor and thought, pp. 109128. Cambridge: Cambridge University Press.
660
V. Evans
Gentner, Dedre, Brain Bowdle, Phillip Wolff and Consuelo Boronat. 2001. Metaphor is like analogy. In D. Gentner, K. J. Holyoak and B. N. Kokinov (eds.), The analogical mind: Perspectives from cognitive science, pp. 199253. Cambridge, MA: MIT Press. Gentner, Dedre, Mutsumi Imai and Lera Boroditsky. 2002. As time goes by: Evidence for two systems in processing space time metaphors. Language and Cognitive Processes, 175: 537 565. Gibbs, Raymond W. Jr. 1980. Spilling the beans on understanding and memory for idioms in conversation. Memory and Cognition, 8, 449456. Gibbs, Raymond W. Jr. 1994. The Poetics of Mind. Cambridge: Cambridge University Press. Gibbs, Raymond W. Jr., N. P. Nayak and C. Cutting. 1989. How to kick the bucket and not decompose: Analyzability and idiom processing. Journal of Memory and Language, 28: 576 593. Giora, Rachel. 2003. On our mind: Salience, context, and figurative language. New York: Oxford University Press. Giora, Rachel. 2008. Is metaphor unique? In R. Gibbs (ed.), The Cambridgehandbook of metaphor and thought, pp. 143160. Cambridge: Cambridge University Press. Giora, Rachel, Ofer Fein, Keren Aschkenazi, Inbar Alkabets-Zlozover. 2007. Negation in context: A functional approach to suppression. Discourse Processes, 43, 153172. Glenberg, Arthur. 1997. What memory is for. Behavioral and Brain Sciences, 20, 155. Glenberg, Arthur and Kaschak, Michael. 2002. Grounding language in action. Psychonomic Bulletin and Review, 9, 558565. Glucksberg, Sam. 2001. Understanding figurative language: From metaphors to idioms. Oxford: Oxford University Press. Glucksberg, Sam. 2008. How metaphors create categoriesquickly. In R. Gibbs (ed.), The Cambridge handbook of metaphor and thought, pp. 6783. Cambridge: Cambridge University Press. Goldberg, Adele. E. 1995. Constructions: An argument structure approach to construction grammar. Chicago: University of Chicago Press. Goldberg, Adele E., 2006. Constructions at work. Oxford: Oxford University Press. 
Goldvarg, Yevgeniya and Sam Glucksberg. 1998. Conceptual combinations: The role of similarity. Metaphor and Symbol, 13: 243255. Grady, Joseph E. (1997). Foundations of meaning: Primary metaphors and primary scenes. Unpublished doctoral thesis, Linguistics dept. UC Berkeley. Grady, Joseph E. (2005). Primary metaphors as inputs to conceptual integration. Journal of Pragmatics, 37, 15951614. Grice, H. Paul. 1975. Logic and Conversation. In P. Cole and J. L. Morgan (eds.), Syntax and semantics, volume 3. Speech acts, pp. 4158. New York: Academic Press. Hurford, James. 2007. Origins of meaning. Oxford: Oxford University Press. Kaschak, Michael and Arthur Glenberg. 2000. Constructing meaning: The role of affordances and grammatical constructions in sentence comprehension. Journal of Memory and Language, 43, 508529. Kvecses, Zoltn and Gunter Radden 1998. Metonymy: Developing a cognitive linguistic view Cognitive Linguistics, 9, 1, 3777. Lakoff, George. 1987. Women, fire and dangerous things: What categories reveal about the mind. Chicago: Chicago University Press. Lakoff, George. 2008. The neural theory of metaphor. In R. Gibbs (ed.), The Cambridge handbook of metaphor and thought, pp. 1738. Cambridge: Cambridge University Press. Lakoff, George and Mark Johnson. 1980. Metaphors we live by. Chicago: University of Chicago Press. Lakoff, George and Mark Johnson. 1999. Philosophy in the flesh. New York: Basic Books.

Lakoff, George and Mark Turner. 1989. More than cool reason. Chicago: University of Chicago Press.
Langacker, Ronald W. 1987. Foundations of Cognitive Grammar: Volume I. Stanford: Stanford University Press.
Langacker, Ronald W. 1991. Foundations of Cognitive Grammar: Volume II. Stanford: Stanford University Press.
Langacker, Ronald W. 2008. Cognitive Grammar: A basic introduction. Oxford: Oxford University Press.
Leezenberg, Michiel. 2001. Contexts of metaphor. Oxford: Elsevier Science.
Levinson, Stephen. 2003. Space in language and cognition. Cambridge: Cambridge University Press.
Mandler, Jean. 2010. The spatial foundations of the conceptual system. Language and Cognition, 2(1): 21–44.
Miller, George and Philip Johnson-Laird. 1976. Language and perception. Cambridge, MA: Harvard University Press.
Moore, Kevin Ezra. 2006. Space-to-time mappings and temporal concepts. Cognitive Linguistics, 17(2): 199–244.
Núñez, Rafael, Benjamin Motz and Ursina Teuscher. 2006. Time after time: The psychological reality of the Ego- and Time-Reference-Point distinction in metaphorical construals of time. Metaphor and Symbol, 21, 133–146.
Núñez, Rafael and Eve Sweetser. 2006. Looking ahead to the past: Convergent evidence from Aymara language and gesture in the crosslinguistic comparison of spatial construals of time. Cognitive Science, 30, 401–450.
Olivieri, Massimiliano, Leonor Romero and Costanza Papagno. 2004. Left but not right temporal involvement in opaque idiom comprehension: A repetitive transcranial magnetic stimulation study. Journal of Cognitive Neuroscience, 16(5), 848–855.
Panther, Klaus-Uwe and Linda Thornburg. 2003. Metonymy and pragmatic inferencing. Amsterdam: John Benjamins.
Peirsman, Yves and Dirk Geeraerts. 2006. Metonymy as a prototypical category. Cognitive Linguistics, 17: 269–316.
Pexman, Penny, Todd Ferretti and Albert Katz. 2000. Discourse factors that influence online reading of metaphor and irony. Discourse Processes, 29(3): 201–222.
Pragglejaz Group. 2007. MIP: A method for identifying metaphorically used words in discourse. Metaphor and Symbol, 22(1): 1–39.
Pulvermüller, Friedemann. 2003. The neuroscience of language: On brain circuits of words and serial order. Cambridge: Cambridge University Press.
Pynte, J., M. Besson, F. H. Robichon and J. Poli. 1996. The time-course of metaphor comprehension: An event-related potential study. Brain and Language, 55, 293–316.
Radden, Günter and Zoltán Kövecses. 2007. Towards a theory of metonymy. In V. Evans, B. K. Bergen and J. Zinken (eds.), The Cognitive Linguistics Reader, pp. 335–359. London: Equinox.
Searle, John. 1979. Expression and meaning: Studies in the theory of speech acts. Cambridge: Cambridge University Press.
Stern, Josef. 2000. Metaphor in context. Cambridge, MA: MIT Press.
Stroop, J. R. 1935. Studies of interference in serial verbal reactions. Journal of Experimental Psychology, 18, 643–662.
Sperber, Dan and Deirdre Wilson. 2008. In R. Gibbs (ed.), The Cambridge handbook of metaphor and thought, pp. 84–108. Cambridge: Cambridge University Press.
Sperber, Dan and Deirdre Wilson. 1995. Relevance: Communication and cognition (second edition). Oxford: Blackwell.
Steen, Gerard. 2007. Finding metaphor in grammar and usage. Amsterdam: John Benjamins.
Talmy, Leonard. 2000. Towards a cognitive semantics (two volumes). Cambridge, MA: MIT Press.
Taylor, Lawrence J. and Rolf A. Zwaan. 2009. Action in cognition: The case of language. Language and Cognition, 1(1): 45–58.
Traugott, Elizabeth Closs and Richard Dasher. 2004. Regularity in semantic change. Cambridge: Cambridge University Press.
Turner, Mark. 1996. The Literary Mind. Oxford: Oxford University Press.
Vigliocco, Gabriella, Lotte Meteyard, Mark Andrews and Stavroula Kousta. 2009. Toward a theory of semantic representation. Language and Cognition, 1(2): 219–248.
Zwaan, Rolf A. 2004. The immersed experiencer: Toward an embodied theory of language comprehension. In B. H. Ross (ed.), The Psychology of Learning and Motivation, pp. 35–62. New York, NY: Academic Press.
A new look at metaphorical creativity in cognitive linguistics

ZOLTÁN KÖVECSES*
Abstract
Where do we recruit novel and unconventional conceptual materials from when we speak, think and act metaphorically, and why? This question has been partially answered in the cognitive linguistic literature but, in my view, a crucial aspect of it has been left out of consideration or not dealt with in the depth it deserves: the effect of various kinds of context on metaphorical conceptualization. Of these, I examine the following: (1) the immediate physical setting, (2) what we know about the major entities participating in the discourse, (3) the immediate cultural context, (4) the immediate social setting, and (5) the immediate linguistic context itself. I suggest that we recruit conceptual materials for metaphorical purposes not only from bodily experience but also from all of these various contexts. Since the contexts can be highly variable, the metaphors used will often be variable, novel, and unconventional. The phenomenon can be observed in both everyday forms of language and literary texts.
Keywords: metaphor, metaphorical creativity, motivation, context, pressure of coherence, embodiment, discourse
* Address for correspondence: Department of American Studies, Eötvös Loránd University, Budapest, Hungary. E-mail: zkovecses@ludens.elte.hu
Acknowledgements: I want to thank the three anonymous reviewers and one of the associate editors for Cognitive Linguistics for their extremely helpful and constructive criticism of this article. Part of the research for this work was conducted at the Institute of Advanced Study, Durham University, England, where I was a distinguished fellow for three months in 2008. My gratitude goes to Andreas Musolff, David Cowling, Ash Amin, and Patrick O'Meara for inviting me and providing ideal working conditions. I also thank my colleagues Enikő Bollobás and Réka Benczes for help of all sorts with this paper and my students Eszter Nucz and Tamás Tmr for their help with the research on poetry.
Cognitive Linguistics 21–4 (2010), 663–697. DOI 10.1515/COGL.2010.021. 0936–5907/10/0021–0663 © Walter de Gruyter
1. Introduction
The general issue I address in this paper is as follows: Where do we recruit conceptual materials from when we speak, think and act metaphorically, and why? This larger issue involves a more specific question that will be my focus: Where do we recruit novel and unconventional conceptual materials from when we speak, think and act metaphorically, and why? This question has, as we will see, been partially answered by several scholars in the cognitive linguistic literature, though it was asked in a different form. In this paper, I would like to answer the question in a new way that, I hope, gives us a more complete account of metaphorical creativity. By "metaphorical creativity" I mean the production and use of conceptual metaphors and/or their linguistic manifestations that are novel or unconventional (with the understanding that novelty and unconventionality are graded concepts that range from completely new and unconventional, through more or less new and unconventional, to well-worn, entrenched and completely conventional cases).
In recent years, a large number of scholars have criticized the theory of conceptual metaphor for a variety of reasons (for example, Cameron 2003, 2007; Clausner and Croft 1997; Deignan 1999, 2005; Dobrovolskij and Piirainen 2005; Gevaert 2001, 2005; Pragglejaz Group 2007; Rakova 2002; Ritchie 2003; Semino 2005; Steen 1999; Stefanowitsch 2007; Zinken 2007). Perhaps the most significant element of this criticism is the suggestion that conceptual metaphor theory ignores the study of metaphor in the contexts in which metaphorical expressions actually occur, namely, in real discourse. The claim is that the practitioners of "traditional" conceptual metaphor theory (i.e., Lakoff and Johnson and their followers) set up what they call conceptual metaphors and exemplify them with groups of (mostly) invented metaphors.
In this way, "traditional" researchers in conceptual metaphor theory fail to notice some essential aspects of metaphor and cannot account for phenomena that can only be accounted for if we investigate metaphors in real discourse. I have responded to several components of this criticism in previous publications (Kövecses 2005, 2008, 2009a, in print) and I do not wish to repeat my response here. Instead, I will take the advice of the critics seriously, look at some pieces of real discourse where metaphors are used, and see how "traditional" conceptual metaphor theory can and should be modified to accommodate at least some of the criticism.
One area that the study of real discourse can throw considerable light on is the issue of metaphorical creativity. Metaphorical creativity in discourse can take a variety of distinct forms. In Metaphor in Culture (2005), I distinguished two types: creativity that is based on the source domain and creativity that is based on the target. Source-related creativity can be of two kinds: "source-internal" and "source-external" creativity. Source-internal creativity
involves cases that Lakoff and Turner (1989) describe as elaboration and extending, where unused source-internal conceptual materials are utilized to comprehend the target. For example, given the conventional DEATH IS SLEEP metaphor, we find in Hamlet's soliloquy "To die to sleep? Perchance to dream!", where dreaming is an extension of the source domain (Lakoff and Turner 1989). "Source-external" cases of creativity operate with what I called the "range of the target" phenomenon, in which a particular target domain receives new, additional source domains in its conceptualization (Kövecses 2005). For instance, Ning Yu (1998) notes that the concept of happiness is conceptualized by means of the metaphor HAPPINESS IS FLOWERS IN THE HEART, which is additional to other, more conventional source domains that are present in both Chinese and English.
The type of creativity in discourse that is based on the target was also described by Kövecses (2005). In this type of creativity, a particular target that is conventionally associated with a source "connects back" to the source, taking further knowledge structures from it. Musolff (2001) provides several examples (reanalyzed by Kövecses 2005) where metaphorical expressions, such as "fire-exit", are selected from the source domain of BUILDING on the basis of target domain knowledge in the EUROPE IS A BUILDING metaphor, though they are not part of the conventional mappings. We can call this "target-induced" creativity.
In the present paper, I will suggest that there is yet another form of metaphorical creativity in discourse: creativity that is induced by the context in which metaphorical conceptualization takes place. This kind of creativity has not been systematically explored in the cognitive linguistic literature on metaphor.
I will term the creativity that is based on the context of metaphorical conceptualization "context-induced" creativity and the metaphors that result from the influence of the context on that conceptualization "context-induced" metaphors. In the paper, I will distinguish five contextual factors that commonly produce unconventional and novel metaphors: (1) the immediate physical setting, (2) what we know about the major entities participating in the discourse, (3) the immediate cultural context, (4) the immediate social setting, and (5) the immediate linguistic context itself. There are surely others, but I will limit myself to the discussion of these five. I will suggest, furthermore, that the same contextual factors that lead conceptualizers to produce unconventional and novel metaphors in everyday forms of language are also at work in poetry and literature in general.
The issue of metaphorical creativity was first studied systematically in the cognitive linguistic literature by George Lakoff and Mark Turner (1989) in their More Than Cool Reason. Lakoff and Turner make two very important claims. The first is that poets share with everyday people most of the conceptual metaphors they use in poetry; the second is that metaphorical creativity in poetry is the result of four common conceptual devices that poets use in manipulating otherwise shared
conceptual metaphors. These include the devices of elaboration, extension, questioning, and combining. However, others have shown that these cognitive devices, or strategies, exist not only in poetic language but also in more ordinary forms of language use, such as journalism (see, e.g., Jackendoff and Aaron 1991; Semino 2008). Moreover, it has been noticed that not all cases of the creative use of metaphor in poetry are the result of such cognitive devices. Mark Turner proposed that in many cases literature and poetry make use of what he and Fauconnier call "blends", in which various elements from two or more spaces, domains, or frames can be conceptually fused, or integrated (see, e.g., Turner 1996; Fauconnier and Turner 2002).
I will propose that in order to account for an even fuller range of metaphorical creativity in poetry, we need to go still further. I will suggest that a more complete account of the poetic use of metaphor requires that we look at the possible role of the context in which poets create poetry.
When ordinary people conceptualize an idea metaphorically, they do so under two kinds of pressure, jointly called the "pressure of coherence": the pressure of their bodily experiences and the pressure of the context that surrounds them (Kövecses 2005). In more recent studies (e.g., Kövecses 2008, 2009b), I have suggested that when we speak and think metaphorically, we are influenced by these two factors and that the effect of context on metaphorical conceptualization is just as pervasive as that of the body, if not more so. I claim that poets work under the same cognitive pressures and that the effect of context may in part be responsible for the creative use of metaphor in poetry.
This discussion of the role of the body and context in the use of metaphor leads us to the issue of universality and variation in metaphorical conceptualization.
This topic is important for the present purposes because variation in metaphorical conceptualization is directly related to metaphorical creativity. The cognitive linguistic view of metaphor (Kövecses 2002, 2006; Lakoff and Johnson 1980) that uses "primary metaphors" as its fundamental construct assumes that primary metaphors are based on correlations in bodily experience and, hence, that these metaphors are embodied (Grady 1997a, b; Lakoff and Johnson 1999). Since embodiment, such as the correlation between amount and verticality, purposes and destinations, similarity and closeness, anger and heat, and the like, characterizes all human beings, the corresponding primary metaphors will be, or at least can potentially be, universal. In this view, nonuniversal aspects of metaphor are accounted for by the various ways in which primary metaphors are put together in different cultures to form complex metaphors. The main focus of this kind of research is, however, on universal aspects of metaphor.
By contrast, another line of research within the cognitive linguistic paradigm takes as its point of departure the huge amount of variation we can find in metaphor both cross-culturally and within cultures, and it places a great deal
of emphasis on the attempt to account for such variation. As Kövecses (2005) observes, the major driving force behind variation is context. Context is defined by a variety of contextual factors, such as differences in the key concepts of a culture, in history, and in environment.
Thus, given conceptual metaphor theory, it appears that we can have two foci in our research interests, one primarily concerned with universality and another primarily concerned with variation. Taking into account the causes of universality and variation, we get two general lines of research:
Embodiment → Universality
Context → Variation
In line with the argument above, we can reconcile the two programs by making the claim that when we comprehend something metaphorically in particular situations, we are under two kinds of pressure: the pressure of our embodiment and the pressure of context. Metaphorical conceptualizers try to be coherent with both their bodies (i.e., correlations in bodily experience) and their contexts (i.e., various contextual factors). For the sake of a clearer exposition, I distinguish two basic kinds of context: global and local. By "global context" I mean the contextual factors that affect all members of a language community when they conceptualize something metaphorically. By "local context" I mean the immediate contextual factors that apply to particular conceptualizers in specific situations.
In sum, my major concern in this paper is not with the structure of novel conceptual metaphors, with the process of understanding novel metaphors, or with how people create complex novel blends online in discourse. My major concern is with where people recruit the conceptual source materials from when they are engaged with all these phenomena. In other words, my main interest is in the issue of motivation, and not in structure, process, or meaning construction in metaphor.
I define motivation as any of the bodily and contextual factors that trigger, prompt, or simply facilitate the selection and use of particular conceptual metaphors or their linguistic manifestations. In other words, I think of motivation as a graded phenomenon that can affect the conceptualizer with various degrees of strength.
2. Global contexts
Global contexts include a variety of different contextual factors. When we engage with the world and metaphorically conceptualize it, we unconsciously monitor and pick out certain details of it. This world consists of ourselves (our body), the physical environment, the physical and social aspects of the settings in which we act, and the broader cultural context. Since all of these aspects of
the world can vary in many ways, the metaphors we use can vary in many ways. Let us see some examples of this phenomenon. The survey and the examples are based on Kövecses (2005).
We can begin with the physical environment. There are differences in the physical environment in which people live, and because people are (mostly unconsciously) attuned to these differences, the metaphors used by people speaking different languages and varieties of languages will also vary. The physical environment includes the particular geography, landscape, fauna and flora, dwellings, other people, and so forth that speakers of a language or variety interact with on a habitual basis. A good test case of this suggestion is a situation in which a language that was developed by speakers living in a certain kind of natural and physical environment is moved by some of its speakers to a new and very different natural and physical environment. If this happens, we should expect to find differences between metaphorical conceptualization by speakers of the original language and conceptualization by people who speak the "transplanted" version of the same language. For example, (American) English is a language that was moved to a new and very different physical environment, that is, to North America, where it developed a unique metaphorical language patterned after the new environment (Kövecses 2000, 2005).
Social factors can play a similar role in shaping the overall metaphorical patterns of a community. One example is the distinction between men and women, found in all societies. Men's and women's metaphors may differ when they conceptualize aspects of the world. Annette Kolodny (1975, 1984) shows us that American men and women had significantly different metaphorical images of the frontier in the period between 1630 and 1860.
Her careful examination of hundreds of literary and non-literary documents from the period shows that men thought of the frontier as "a virgin land to be taken", whereas women thought of it as "a garden to be cultivated".
The cultural context means the unique and salient concepts and values that characterize particular (sub)cultures, together with the governing principles of a given culture or subculture. The governing principles and key concepts have special importance in (metaphorical) conceptualization because they permeate several general domains of experience for a culture or cultural group. This can be noticed in perfectly everyday concepts, which may have an important role in distinguishing people's habitual metaphorical thought across cultures or subcultures. For example, Frank Boers and Murielle Demecheleer (1997, 2001) suggested that the concepts of HAT and SHIP are more productive of metaphorical idioms in English than in French. Conversely, the concepts of SLEEVE and FOOD are more productive of metaphorical idioms in French than in English. They argue that this is because the former two concepts are relatively more salient for speakers of (British) English, while the latter two are relatively more salient for speakers of French.
An additional set of factors includes what I call "differential memory", that is, history: the major or minor events that occurred in the past of a society/culture, group, or individual. The memory of the events is coded into the language. Because of the past-oriented nature of language, many of the metaphors we use may reveal a certain "time-lag" between our experiences of the world today and the experiences associated with the source domain in the past (Deignan 2003). One of my students, Niki Köves (2002), did a survey of the metaphors Hungarians and Americans use for the concept of LIFE. Her survey showed that Hungarians primarily use the LIFE IS WAR and LIFE IS A COMPROMISE metaphors, whereas the Americans most commonly employ the LIFE IS A PRECIOUS POSSESSION and LIFE IS A GAME metaphors. The difference obviously has to do with the peculiarities of Hungarian and American history. Hungarians have been in wars throughout their more than one-thousand-year-old history as a nation and state, and had to struggle for their survival as they are wedged between powerful German-speaking and Slavic nations. Given this history, it is not surprising that for many Hungarians life is a struggle, and less of a game. With time, however, this habitual way of conceptualizing life, or any other concept, may change.
Finally, a set of causes that produces metaphor variation is what I termed "differential concerns and interests". An entire society may be characterized by certain concerns and interests. Americans, for example, are often said to be given to action, as opposed to passivity (a well-known example of this is the preference for "take" over "have" in American English, as opposed to the preference for "have" over "take" in British English, in phrases such as "take a shower" and "have a shower"; see Kövecses 2000).
This trait may explain the heavy use of sports and game metaphors by Americans (e.g., "to quarterback an operation", taken from American football). The claim here is not that only Americans have the game and sports metaphors, but that they have them for a more extensive range of target concepts than other nations. In other words, the reality (or maybe just the myth) of having a trait may give rise to a heavy reliance on a metaphorical source domain that is coherent with the trait.
3. Local contexts
Metaphorical conceptualization is also affected by more immediate local contexts. These include the immediate physical setting, the knowledge about main entities in the discourse, the immediate cultural context, the immediate social setting, and the immediate linguistic context. Local and global contexts are assumed here to form a continuum from the most immediate local contexts to the most general global ones. My strategy will be to first characterize the effect of these more local contexts on metaphorical conceptualization in everyday forms of language and then to turn to how the same contexts can influence
metaphorical conceptualization in poetry. Some of the discussion of the various contextual effects on everyday metaphor use is also found in Kövecses (2010).
3.1. The effect of immediate physical setting on metaphor use
The immediate physical setting can influence the selection and use of particular metaphors in discourse. The physical setting comprises, among possibly other things, the physical events and their consequences that make up or are part of the setting, the various aspects of the physical environment, and the perceptual qualities that characterize the setting. I'll briefly discuss an example of the first. We can find examples of the latter two in other studies (e.g., Boers 1999; Semino 2008), though their analyses are not formulated in terms of the present framework.
Physical events and their consequences are well demonstrated by a statement made by an American journalist who traveled to New Orleans to interview Fats Domino two years after the devastation wreaked by Hurricane Katrina, when the city of New Orleans was still struggling with many of the consequences of the hurricane. The journalist comments:
(1) The 2005 hurricane capsized Domino's life, though he's loath to confess any inconvenience or misery outside of missing his social circle . . . (USA TODAY, 2007, September 21, Section 6B)
The metaphorical statement "The 2005 hurricane capsized Domino's life" is based on the general metaphor LIFE IS A JOURNEY and its more specific version LIFE IS A SEA JOURNEY. The SEA JOURNEY source domain is chosen probably because of the role of the sea in the hurricane. More importantly, it should be noted that the verb "capsize" is used (as opposed to, say, "run aground"), though it is not a conventional linguistic manifestation of either the general JOURNEY or the more specific SEA JOURNEY source domain. I suggest that this verb was selected by the journalist as a result of the then (still) visible consequences in New Orleans of the hurricane as a devastating physical event.
The physical setting thus possibly triggers the extension of an existing conventional conceptual metaphor and causes the speaker/conceptualizer to choose a metaphorical expression that best fits that setting.
3.1.1. The physical context in poetry. Context can be used in poetry in two ways: poets may describe the context in which they create poetry, or they may use context as a means of talking about something else. When the first is the case, we get straightforward examples of describing a scene, such as in Matthew Arnold's "Dover Beach":
(2) The sea is calm to-night.
The tide is full, the moon lies fair
Upon the straits,- on the French coast, the light
Gleams and is gone; the cliffs of England stand,
Glimmering and vast, out in the tranquil bay.
Come to the window, sweet is the night-air!
(retrieved from http://www.artofeurope.com/arnold/arn1.htm)
However, from the perspective of poetic metaphors and the study of particular poems, much more interesting are the cases where this more or less literally conceived context is used metaphorically to express meanings that are not normally considered part of the meaning of the context as described. Using conceptual metaphor theory, we can say that the context can function as the source domain and the meanings to be expressed by means of the source domain function as the target. The exciting question in such cases is: What is the meaning (or, what are the meanings) that the dominantly literally-conceived source (i.e., the context) is intended to convey? Consider the continuation of the Arnold poem:
(3) Only, from the long line of spray
Where the sea meets the moon-blanched land,
Listen! You hear the grating roar
Of pebbles which the waves suck back, and fling,
At their return, up the high strand,
Begin, and cease, and then again begin,
With tremulous cadence slow, and bring
The eternal note of sadness in.
(retrieved from http://www.artofeurope.com/arnold/arn1.htm)
Although the description of the context continues, there is a clear sense in the reader that the poem is not primarily about depicting the physical location and events that occur around the poet/observer. Indeed, the last line ("and bring the eternal note of sadness in") makes this meaning explicit; the coming in and going out of the waves convey an explicitly stated sadness. And of course we know that waves cannot actually bring in sadness or notes of sadness; they can only be metaphorically responsible for our sad mood when we hear the "tremulous cadence slow".
And this sense of sadness is reinforced in the next stanza:
(4) Sophocles long ago
Heard it on the Aegean, and it brought
Into his mind the turbid ebb and flow
Of human misery; we
Find also in the sound a thought,
Hearing it by this distant northern sea.
(retrieved from http://www.artofeurope.com/arnold/arn1.htm)
In sum, then, a poet can describe a context (scene) in which s/he writes a poem, or s/he can use the context (scene), which functions as a source domain, to talk about things that go beyond or are outside the context (scene) s/he is involved in, which function as the target domain. My concern will be with this second use of context, or scene. Let us now continue with the Arnold poem:
(5) The sea of Faith
Was once, too, at the full, and round earth's shore
Lay like the folds of a bright girdle furled.
But now I only hear
Its melancholy, long, withdrawing roar,
Retreating, to the breath
Of the night-wind, down the vast edges drear
And naked shingles of the world.
(retrieved from http://www.artofeurope.com/arnold/arn1.htm)
At work in this stanza are two conceptual metaphors: HEALTH IS WHOLENESS and PERFECTION/COMPLETENESS IS ROUNDNESS, as indicated by the expressions "at the full" (wholeness) and "and round earth's shore" (roundness). The stanza, we understand, is about the health and perfection of the human condition until the coming of the changes that were happening at the time: the changes to the established order of the world in which religion played a major role. These two extremely general metaphors can be instantiated (and could have been instantiated by Arnold) in many different ways. The question arises why they are made conceptually-linguistically manifest in the particular way they are, that is, by the metaphor "the sea of Faith". This metaphor assumes the conceptual metaphors (CHRISTIAN) FAITH IS THE SEA and PEOPLE ARE THE LAND. The sea was once full and covered the land all around, and in the same way Christian faith provided people with spiritual health (HEALTH IS WHOLENESS) and a perfect state of the human condition (PERFECTION IS ROUNDNESS), unlike the situation in which Arnold wrote the poem. In addition, the full cover of faith protected people from the dangers of the new times that now threaten a faithless world.
These ideas were given expression in these particular ways, we can safely assume, because of what Arnold saw before him at the time of creating the poem: the ebb and flow of the sea. As the sea retreats, that is, as faith disappears, the world becomes a less healthy and less perfect place, unprotected by faith.
3.2. The effect of knowledge about major entities in the discourse on metaphor use
The main entities participating in discourse include the speaker (conceptualizer), the hearer (addressee/conceptualizer), and the entity or process we talk about (topic). These can all influence the use of metaphor in discourse. I'll discuss three such examples, involving the topic, the speaker/conceptualizer, and the addressee/conceptualizer, in this order.
Knowledge about the topic frequently leads to novel and unconventional metaphors. I use "topic" not in the sense in which it is commonly used in metaphor theory in general (i.e., as a theoretical concept corresponding to the "target domain" in conceptual metaphor theory), but in the sense of any kind of knowledge or information that is explicitly or implicitly conveyed by a piece of discourse. If we have some special knowledge or information about the elements of the discourse, we can utilize that knowledge or information for purposes of metaphorical creativity, that is, to metaphorically conceptualize a certain target domain. This means that what I call the topic here is very close to what is termed the source domain in conceptual metaphor theory. Particularly creative examples can be found in journalism. Consider the following newspaper headline: "Foot heads arms body". We get an explanation of what this could possibly mean from a short letter sent to the editor of The Times:
(6) Sir, The letters about odd headlines . . . reminded me of an all-time favourite. In the early 1980s Michael Foot became the leader of the Labour Party. He was also a co-founder of CND and pushed for nuclear disarmament. Mr Foot travelled to Brussels to chair a lobby group in the European Parliament to construct a plan to get rid of the bomb as part of the European election policy. From this came the headline "Foot heads arms body". (The Times, Letters to the Editor, Wednesday January 30, 2008, p. 16)
Since the topic involves several entities (a man named Foot, disarmament, and Mr. Foot's chairing of the committee that deals with the issue of disarmament), the speaker/conceptualizer had the opportunity to deliberately create a humorous headline.
In the previous case, the metaphor was selected and elaborated as a result of what the conceptualizer knows about the topic. It is also possible to find cases where the selection of a metaphor depends on knowledge about the conceptualizer himself or herself. What is especially intriguing about such cases is that the author's (conceptualizer's) knowledge about him- or herself does not need to be conscious. The next example, taken from my previous work (Kövecses 2005) but reanalyzed here, demonstrates this possibility. As one would expect,
one important source of such cases is the area of therapy or psychological counseling. In a therapeutic context people commonly create novel metaphors as a result of unique and traumatic life experiences. The metaphors that are created under these circumstances need not be consciously formed. The example comes from an article in the magazine A & U (March, 2003) about photographic artist Frank Jump.
Frank Jump photographs old painted mural advertisements in New York City. He has AIDS, but he has outlived his expected life span. His life and his art are intimately connected metaphorically. The conceptual metaphor operative here could be put as follows: SURVIVING AIDS DESPITE PREDICTIONS TO THE CONTRARY IS FOR THE OLD MURAL ADVERTISEMENTS TO SURVIVE THEIR EXPECTED LIFE SPAN. At first, Jump was not consciously aware that he works within the frame of a conceptual metaphor that relies on his condition. In his own words:
(7) In the beginning, I didn't make the connection between the subject matter and my own sero-positivity. I was asked to be part of the Day Without Art exhibition a few years ago and didn't think I was worthy – other artists' work was much more HIV-specific. . . . But my mentor said, "Don't you see the connection? You're documenting something that was never intended to live this long. You never intended to live this long." [p. 27; italics in the original]
The mentor made the conceptual metaphor conscious for the artist. I believe something similar is happening in many cases of psychotherapy and counseling. It could be argued that it is the mentor who conceptualizes the situation for the artist. This may be a possible interpretation, but, on the other hand, when Jump says "In the beginning, I didn't make the connection between the subject matter and my own sero-positivity," it is clear that the connection is in his unconscious and readily available to him as well.
Obviously, the metaphor SURVIVING AIDS DESPITE PREDICTIONS TO THE CONTRARY IS FOR THE OLD MURAL ADVERTISEMENTS TO SURVIVE THEIR EXPECTED LIFE SPAN is anything but a conventional conceptual metaphor. The metaphor is created by Frank Jump as a novel analogy: the unconscious but nevertheless real analogy between surviving one's expected life span as a person who has AIDS and the survival of the mural advertisements that were created to be visible on the walls of buildings in New York City for only a limited amount of time. In this case, (unconscious) self-knowledge leads the conceptualizer to find the appropriate analogy. The analogy is appropriate because the source and the target domains share a schematic structural resemblance; namely, an entity existing longer than expected. The resulting metaphor(ical analogy) is novel and creative and it comes about as a result of what the conceptualizer knows about himself.
Let us take another example of how the topic can influence the choice of novel metaphors in discourse. As we'll see, the example is additionally interesting because it gives us some idea how the addressee may also be involved in the selection of metaphors by the speaker/conceptualizer. In the Comment section of The Times (January 30, 2008, p. 14), the author congratulates and offers advice to the newly elected head coach of the England football team. His or her specific recommendation (the name is not indicated) is that Fabio Capello, the new Italian head coach, should play David Beckham against Switzerland in an upcoming game at Wembley Stadium, despite the fact that Beckham had not played top-class football for several months at the time. If Beckham is given a chance to play, he will have played on the English national team 100 times, and this would be a nice way of saying good-bye to him as regards his career on the national team. The author of the article explains that he or she is aware that Beckham is not fully prepared for this last game on the national team. S/he writes:
(8) Beckham is 32. He has not played top-class football since November. Los Angeles Galaxy are sardines not sharks in the ocean of footy.
How did the author arrive at the novel metaphors according to which the American football (soccer) team, the Los Angeles Galaxy, are "sardines not sharks in the ocean of footy"? In all probability, it is the author's knowledge about David Beckham, the main topic of the discourse, that gives rise to the metaphors. The author (together with us) knows that Beckham plays for the Los Angeles Galaxy, a team located in Los Angeles, which, in turn, is a city on the Pacific Ocean, and the Pacific Ocean contains sardines and sharks.
In somewhat more technical language, we could say that the frame for Beckham as a football player includes the name of the team that he plays for and the place where the team is located, which in turn evokes the frame of the Pacific Ocean. The frame for the Pacific Ocean in turn involves the various kinds of fish that live in that ocean. Of all these various kinds of fish, why are the Los Angeles Galaxy sardines and not sharks, and why is football an ocean? With this question, I wish to indicate that the author's knowledge about Beckham does not provide a full explanation of the novel metaphors used. It is a major part of the story, but probably not the whole story. What we have to take into account additionally are some highly schematic conventional conceptual metaphors, such as THE SIZE OF SOCIAL GROUPS IS THE SIZE OF PHYSICAL ENTITIES and SOCIAL COMPETITION IS THE SURVIVAL BEHAVIOR OF ANIMALS. The former conceptual metaphor is extremely general and probably functions only as a very general constraint on which linguistic expression can actually be selected; the idea of the vastness of the world of football and the many teams participating in it should be conveyed through reference to some huge physical entity (such as the ocean). The latter
conceptual metaphor seems to be a special case of the SOCIAL BEHAVIOR IS ANIMAL BEHAVIOR metaphor. In the world of business competition, English has the conventional metaphorical expression "big fish eat small fish." Similarly, in football some teams are very powerful (the sharks), but most of them are weak (the sardines) in relation to the powerful ones. The expression "big fish eat small fish" and the underlying conceptual metaphor may in part be responsible for the author using the words "sardines" and "sharks" for some of the strong teams and for the much larger number of weak teams in the world of football.
The same article also offers us a glimpse of how knowledge about the addressee can give rise to novel metaphors in discourse. There are two examples in the article that point in that direction. The first one reads: "Dear Signor Capello" (my italics, ZK). This is the first sentence of the article, with which the author addresses the intended recipient of the message, the new Italian head coach of the English team, Fabio Capello. Although the use of the word "Signor" could not be interpreted as a metaphor, the fact that the English author addresses the recipient (Signor Capello), an Italian, partly in Italian is an indication that, in general, the knowledge about the addressee plays a role in how we select linguistic items for our particular purposes in the discourse. The second example is as follows: "Beckham is a good footballer and a nice man: è una bella figura" (italics in the original, ZK). This example comes much closer to being a metaphor, in that a man (Beckham) is compared to a figure, a shape, which is a schematic word for geometric forms. In addition, the comparison is given in Italian, which shows that the language of the addressee must have influenced the choice of the metaphor. More generally, a part of what we know about the addressee in all probability plays a role in the selection of the metaphor.
3.2.1. Knowledge about the main entities of discourse in poetry.
We can also distinguish several major entities of poetic discourse: the speaker (poet), the topic, and the hearer, or addressee (audience). (In what follows I will ignore all the difficulties in identifying the speaker with the poet and the addressee with the real audience. Such distinctions are not directly relevant to the main argument of the present paper.)
Speaker/Poet. The idea that the general physical, biological, mental, emotional, etc. condition, or situation, of a poet can influence the way a poet writes poetry is well known and is often taken into account in the appreciation of poetry. Dickinson is a well-studied case, as discussed, for example, by Margaret Freeman (see, e.g., Freeman 1995, 2000, 2007) and James Guthrie (1998). Guthrie has this to say on the issue:
. . . I propose to concentrate on the fact of illness itself as a governing factor in Dickinson's development as a poet. We are already accustomed to thinking about ways in which illness or deformity modulate the registers of expression we hear while reading Milton, Keats, Emily Brontë, Lord Byron. For Dickinson, illness was a formative experience as well, one which shaped her entire poetic methodology from perception to inscription and which very likely shook the foundations of her faith. Reading Dickinson's poems in the full knowledge and belief that, while writing them, she was suffering acutely from a seemingly irremediable illness renders many of them recuperable as almost diaristic records of a rather ordinary person's courageous struggle against profound adversity. (Guthrie 1998: 45)
Along similar lines, I suggest that a poet's physical condition, especially poor health, can have an effect on the way he or she metaphorically conceptualizes the subject matter he or she writes about. In my terminology, this is how self-knowledge of one's situation as a contextual factor can often lead to the creative use of metaphors by poets. Let us take one of Dickinson's poems as a case in point:
(9) I reckon – when I count it all
First – Poets – Then the Sun
Then Summer – Then the Heaven of God
And then – the List is done
But, looking back – the First so seems
To Comprehend the Whole
The Others look a needless Show
So I write – Poets – All
Their Summer – lasts a Solid Year
They can afford a Sun
The East – would deem extravagant
And if the Further Heaven
Be Beautiful as they prepare
For Those who worship Them
It is too difficult a Grace
To justify the Dream
(retrieved from http://poetry.poetryx.com/poems/2520/)
The question that I'm asking here is how Dickinson's optical illness is transformed into metaphorical patterns in her poetry in general and in this poem in particular. I would propose the following analysis that fits my interpretation of the poem. (However, others may have a very different interpretation that may require a very different conceptual analysis.) In my interpretation, the poem is about poetic creativity, that is, the issue of what inspires a poet to write poetry. Dickinson uses the following conceptual metaphor to talk about it: POETIC CREATIVITY IS A NEW WAY OF SEEING (AS A RESULT OF THE SUMMER SUN). The mappings, or correspondences, that make up the metaphor are as follows (the mappings go from source to target):
summer → productive period
sun → inspiration
new way of seeing → being poetically creative (i.e., coming up with a poem)
An interesting property of the first mapping is that the literal summer stands metonymically for the literal year and the metaphorical summer stands for "always." Thus, poets are always creative; they have a year-long summer. A second metaphor that Dickinson relies on is POEMS ARE HEAVENS. In this metaphor, the mappings are:
further heaven → poem
worshippers → people reading poetry
God → poet
As an important additional mapping in this metaphor, we also have:
God's grace → poet's inspiration
Unlike the previous metaphor, where poetic inspiration is metaphorically equated with the sun, it is God's grace that corresponds to the poet's inspiration in this second metaphor. Dickinson's inspiration, however, is a difficult one: it is her optical illness. She writes her poetry by relying on, or making use of, her illness. This is a difficult grace to accept. In other words, her bodily condition of having impaired vision is put to use in an extraordinary way in this poem by Dickinson. Other poets may make use of their physical condition, or self-knowledge, in different ways. I believe it would be difficult to make generalizations about the precise ways in which self-knowledge of this kind is used by poets. At the same time, this contextual factor may explain some of the apparently strange uses of metaphor in the works of poets.
Topic and addressee. For an illustration of how the addressee and the topic can influence the choice of a poet's metaphors, let us turn to Sylvia Plath's poem, Medusa. Here are some relevant lines:
(10) Off that landspit of stony mouth-plugs,
Eyes rolled by white sticks,
Ears cupping the sea's incoherences,
You house your unnerving head – God-ball,
Lens of mercies,
Your stooges
Plying their wild cells in my keel's shadow,
Pushing by like hearts,
Red stigmata at the very center,
Riding the rip tide to the nearest point of departure,
Dragging their Jesus hair.
Did I escape, I wonder?
(retrieved from http://www.americanpoems.com/poets/sylviaplath/1412)
In this poem, the addressee is Sylvia Plath's mother. The question arises why the poet thinks metaphorically of her mother as a medusa, in both senses of this term (medusa as gorgon and as jellyfish). What we know about Sylvia Plath is that her relationship to her mother was strained and ambivalent. The strained and ambivalent nature of the relationship is one of the major topics, or subject matters, of the poem. In Greek mythology, Medusa is a gorgon with snakes for hair, who turns people who look at her to stone. We can thus suggest that the negative aspects of Plath's relationship to her mother are analogically reflected in the Medusa metaphor ("your unnerving head"). That is to say, the particular metaphorical image for the mother is provided by the broader cultural context; i.e., Greek mythology. Note, however, that the selection of this image is secondary to the poet's knowledge about the addressee and the topic of the discourse; if her mother had been different, Plath would not have picked the image of the Medusa but something else, an image that would have fit a different mother with different properties. In this sense, I propose that it is the addressee and the topic of the discourse (the poem) that primarily govern the choice of the image applied to the mother, though conveyed in the form of a culturally defined analogy.
3.3. The effect of the immediate cultural context on metaphor use
Consider the following example taken from the San Francisco Chronicle, in which Bill Whalen, a professor of political science at Stanford and an advisor to Arnold Schwarzenegger, uses metaphorical language concerning the actor who later became the governor of California:
(11) "Arnold Schwarzenegger is not the second Jesse Ventura or the second Ronald Reagan, but the first Arnold Schwarzenegger," said Bill Whalen, a Hoover Institution scholar who worked with Schwarzenegger on his successful ballot initiative last year and supports the actor's campaign for governor. "He's a unique commodity – unless there happens to be a whole sea of immigrant body builders who are coming here to run for office. This is Rise of the Machine, not Attack of the Clones." (San Francisco Chronicle, A16, August 17, 2003)
Of interest in this connection are the metaphors "He's a unique commodity" and particularly "This is Rise of the Machine, not Attack of the Clones." The first
one is based on a completely conventional conceptual metaphor: PEOPLE ARE COMMODITIES, as shown by the very word "commodity" to describe the actor. The other two are highly unconventional and novel. What makes Bill Whalen produce these unconventional metaphors and what allows us to understand them? There are, I suggest, two reasons. First, and more obviously, it is because Arnold Schwarzenegger played in the first of these movies. In other words, what sanctions the use of these metaphorical expressions has to do with the knowledge that the conceptualizer (Whalen) has about the topic of the discourse (Schwarzenegger), as discussed in a previous section. Second, and less obviously but more importantly here, he uses the metaphors because these are movies that, at the time of speaking (i.e., 2003), everyone knew about in California and the US. In other words, they were part and parcel of the immediate cultural context. Significantly, the second movie, Attack of the Clones, does not feature Schwarzenegger, but it is the key to understanding the contrast between individual and copy that Whalen is referring to.
Given this knowledge, people can figure out what Whalen intended to say, which was that Schwarzenegger is a unique individual and not one of a series of look-alikes. But figuring this out may not be as easy and straightforward as it seems. After all, the metaphor "Rise of the Machine" does not clearly and explicitly convey the idea that Schwarzenegger is unique in any sense. (As a matter of fact, the mention of machines goes against our intuitions of uniqueness.) However, we get this meaning via two textual props in the text. The first one is a series of statements by Whalen: "Arnold Schwarzenegger is not the second Jesse Ventura or the second Ronald Reagan, but the first Arnold Schwarzenegger" and "He's a unique commodity – unless there happens to be a whole sea of immigrant body builders who are coming here to run for office."
What seems to be the case here is that the speaker emphasizes the idea of individuality before he uses the machine metaphor. But not even this prior emphasis would be sufficient by itself. Imagine that the text stops with the words ". . . This is Rise of the Machine." I think most native speakers would be baffled and have a hard time understanding what Whalen intended to say in this last sentence. Therefore, in order to fully understand the discourse we badly need the second textual prop, which is: "not Attack of the Clones." It is against the background of this phrase that we understand what the metaphorical expression "Rise of the Machine" might possibly mean.
3.3.1. The cultural context in poetry. As we saw above (section 3.2.1.), the choice of the image of Medusa was in part motivated by the larger cultural context, of which the three gorgons of Greek mythology, including Medusa, form a part. The symbolic belief system is thus one aspect of Sylvia Plath's cultural system. The poem continues with the following lines:
(12) My mind winds to you
Old barnacled umbilicus, Atlantic cable,
Keeping itself, it seems, in a state of miraculous repair.
(retrieved from http://www.americanpoems.com/poets/sylviaplath/1412)
Another aspect of the cultural context involves the entities we find in a particular physical-cultural environment. In the lines, the relationship to her mother is conceptualized metaphorically both as the umbilicus and the Atlantic telephone cable. In the former case, the generic-level conceptual metaphor PERSONAL RELATIONSHIPS ARE PHYSICAL CONNECTIONS is fleshed out at the specific level as the umbilicus. This is of course motivated by human biology, not by cultural context. What gives a metaphorical character to it is that we know that the poet is no longer physically-biologically linked to the mother through the umbilicus. The metaphor is probably used to convey the naturalness and inevitability of a strong bond between mother and child. However, the adjacent metaphor "Atlantic cable" derives from the surrounding physical-cultural environment. The first transatlantic telephone cable system between Great Britain and North America was laid in the 1950s, making it possible for people to communicate directly with each other at a long distance. Through the metaphor, the strength of the biological bond is reinforced, and the Atlantic cable can be seen as the temporal (and metaphorical) continuation of the umbilicus.
The cultural context, among other things, includes, as we just saw, the belief system of a person and the physical-cultural environment. Both of these occur in various specific forms in a large number of other poems. The cultural belief system also involves the religious beliefs that are entertained in a given culture. Let us take the first stanza of a poem, Prayers of Steel, by Carl Sandburg.
(13) LAY me on an anvil, O God.
Beat me and hammer me into a crowbar.
Let me pry loose old walls.
Let me lift and loosen old foundations.
(retrieved from http://www.bartleby.com/134/39.html)
Here the poet evokes God and wants God to turn him into an instrument of social change. This making of an "old type of man" into a "new type of man" is conceptualized on the analogy of God's creation of man in the Bible. In other words, the source domain of the metaphor is the biblical act of man's creation, while the target domain is the making of a new type of man who can effect social changes in the world. This means that the source domain is provided by the religious belief system in the culture of the poet by virtue of an analogy between God's creation of man and the creation of a tool that metonymically
stands for the poet (INSTRUMENT USED FOR THE PERSON USING IT), who can thus function in a new role to effect social change.
A physical-cultural element, or entity, that is significant in Sandburg's poetry is the skyscraper. Consider the first stanza of the poem called Skyscraper:
(14) BY day the skyscraper looms in the smoke and sun and has a soul.
Prairie and valley, streets of the city, pour people into it and they mingle among its twenty floors and are poured out again back to the streets, prairies and valleys.
It is the men and women, boys and girls so poured in and out all day that give the building a soul of dreams and thoughts and memories.
(Dumped in the sea or fixed in a desert, who would care for the building or speak its name or ask a policeman the way to it?)
(retrieved from http://www.bartleby.com/165/55.html)
What makes the skyscraper such a significant symbol and what makes Sandburg choose it to talk about America? The poem was written in 1916 in Chicago. It was at the turn of the 20th century in the major American cities that skyscrapers began to be built on a large scale. The skyscraper became a dominant feature of the city skyline. Due to its perceptual and cultural salience, it became, for Sandburg and many others, a symbol of America. The symbol is based on a connection between a salient element (a kind of building) that characterizes a place and the place itself; hence the metonymy SKYSCRAPER FOR AMERICA, which is a specific-level version of the generic-level metonymy A CHARACTERISTIC PROPERTY FOR THE PLACE THAT IT CHARACTERIZES. In this case, the characteristic property is embodied in a type of building. What is additionally interesting about this example is that it is a metonymy, not a metaphor. It seems that metonymies are also set up in part as a result of the local cultural influence; the skyscraper was in Sandburg's time a salient feature of the American landscape that made it a natural choice for a metonymic symbol for the country.
3.4. The effect of the immediate social setting on metaphor use
When we use metaphors, we use them in social contexts as well. The social context can be extremely variable. It can involve anything from the social relationships that obtain between the participants of the discourse through the gender roles of the participants to the various social occasions in which the discourse takes place. Let us take an example for the last possibility from the American newspaper USA TODAY.
As mentioned above, in 2007 the newspaper carried an article about Fats Domino, one of the great living musicians based in flood-stricken New Orleans. In the article, the journalist describes in part Domino's life after Katrina, the hurricane that destroyed his house and caused a lot of damage to his life and that of many other people in New Orleans. The subtitle of the article reads:
(15) The rock 'n' roll pioneer rebuilds his life – and on the new album "Goin' Home," his timeless music. (USA TODAY, 2007, September 21, Section 6B)
How can we account for the use of the metaphor "rebuilds his life" in this text? We could simply suggest that this is an instance of the LIFE IS A BUILDING conceptual metaphor and that whatever meaning is intended to be conveyed by the expression is most conventionally conveyed by this particular conceptual metaphor and this particular metaphorical expression. But then this may not entirely justify the use of the expression. There are potentially other conceptual metaphors (and corresponding metaphorical expressions) that could also be used to achieve a comparable semantic effect. Two that readily come to mind include the LIFE IS A JOURNEY and the LIFE IS A MACHINE conceptual metaphors. We could also say that x "set out again on his/her path" or that after his/her life broke down, x "got it to work again" or "restarted it." These and similar metaphors would enable the speaker/conceptualizer and the hearer to come to the interpretation that the rebuilding idea activates.
However, of the potentially possible choices it is the LIFE IS A BUILDING metaphor that is selected for the purpose. In all probability, this is because, at the time of the interview, Domino was also in the process of rebuilding his house that was destroyed by the hurricane in 2005. If this is correct, it can be suggested that the social situation (rebuilding his house) triggered, or facilitated, the choice of the conceptual metaphor LIFE IS A BUILDING. In other words, a real-world instance of a source domain is more likely to lead to the choice of a source concept of which it is an instance than to that of a source domain of which it is not.
In this sense, the social setting may play a role in the selection of certain preferred conceptual metaphors, and hence of certain preferred metaphorical expressions in discourse.
In such cases, the emerging general picture seems to be as follows: There is a particular social setting and there is a particular meaning that needs to be activated. If the meaning can be activated by means of a metaphorical mapping that fits the actual social setting, speakers/conceptualizers will prefer to choose that mapping (together with the linguistic expressions that are based on the mapping). More simply, if the actual social setting involves an element that is an instance of an appropriate source domain, speakers are likely to use that source domain.
3.4.1. The social context in poetry. We have seen above in the analysis of the first stanza of the Sandburg poem that the poet conceptualizes the creation of a new type of man in the form of an implement on the analogy of the
creation of man. We can see the same conceptual process at work in the second stanza:
(16) Lay me on an anvil, O God.
Beat me and hammer me into a steel spike.
Drive me into the girders that hold a skyscraper together.
Take red-hot rivets and fasten me into the central girders.
Let me be the great nail holding a skyscraper through blue nights into white stars.
(retrieved from http://www.bartleby.com/134/39.html)
An important difference between the first and the second stanza is that the implement that is created in the first can be used to take apart a structure, whereas the object that is created in the second stanza can be used to put a structure together (steel spike, red-hot rivets, great nail). In other words, first an implement is made that is used to destroy a structure, and then the essential ingredients of a structure are made to construct a new structure. This process of work serves as the source domain for a target domain in which the old social structure is removed by means of a work implement and a new social structure is put in its place by means of a new type of man who can accomplish all this. The new type of man is the poet who does both jobs. In short, this is based on the conceptual metaphor THE CONSTRUCTION OF NEW SOCIAL STRUCTURE IS THE PHYSICAL MAKING OF NEW TOOLS AND BUILDING INGREDIENTS. In other words, it is the characteristically social situation of tool making and using that tool to make something else in the American context that inspires the analogy used by the poet.
3.4.2. The interaction of context-induced and conventional conceptual metaphors. It was noted in the section on cultural context that the skyscraper became one of America's symbols in the early 20th century. This was the result of the metonymy SKYSCRAPER FOR AMERICA.
It was also noted in the section on social context that the metaphor THE CONSTRUCTION OF NEW SOCIAL STRUCTURE IS THE PHYSICAL MAKING OF NEW TOOLS AND BUILDING INGREDIENTS plays a role in the general meaning of the poem by Sandburg. These context-induced conceptual patterns, however, interact with a conventional conceptual metaphor in the poem; it is SOCIETIES ARE BUILDINGS. This conventional conceptual metaphor is a specific-level version of the more general COMPLEX SYSTEMS ARE COMPLEX PHYSICAL OBJECTS metaphor (Kövecses, 2002). The SOCIETIES ARE BUILDINGS metaphor consists of a number of fixed, conventional mappings, including (again, going from source to target):
the builders → the persons creating society
the process of building → the process of creating society
the foundations of the building → the basic principles on which society is based
the building materials → the ideas used to create society
the physical structure of the building → the social organization of the ideas
the building → the society
Since America is a society, it is conceived of as a building, more specifically, as a skyscraper. The conventional conceptual metaphor A SOCIETY IS A BUILDING is evoked by the poem, but the poet goes way beyond it. He creates a complex image (a blend) with several changes in the basic metaphor: the building becomes a skyscraper, the builder becomes a God/blacksmith/poet/worker, and the building material and tools become the poet. Many of these changes are motivated by contextual factors. The building as skyscraper is motivated by the physical-cultural context, the builder as God by the religious belief system, the builder as blacksmith by the poet's personal history, and the builder as worker by a particular social model of work.
3.5. The effect of the immediate linguistic context on metaphor use
Sometimes it is the immediate linguistic context that plays a role in the selection of novel metaphors. Consider the following text:
(17) When the Electoral Commission came to make its choice between referring the case to the police and taking no action it was this defence, described by an authoritative source as showing contempt for the law, which helped to tilt the balance – and Mr Hain – over the edge. (The Times, Friday January 25, 2008, News 7)
The metaphorical expressions that are relevant here are "tilt the balance" and "[tilt] Mr Hain over the edge." The second metaphorical expression is elliptical in the text, but we can easily supply the word "tilt" to make the sentence complete. Why can we do this? We can do it, of course, because the word "tilt" that was used in the first expression also fits the second. We keep it in memory and since it fits, we can supply it again.
Let us look at some of the details of how this might happen. The metaphorical expression "tilt the balance" is a conventional one and is a linguistic example of the metaphor UNCERTAINTY IS BALANCE (OF THE SCALES) and CERTAINTY IS LACK OF BALANCE (OF THE SCALES). In the metaphor, making a choice (i.e., eliminating uncertainty) corresponds to tilting the balance. The second expression, "tilt someone over the edge," is much less conventional than the first. The question is why the word "tilt" gets selected in the second one besides the fact that it (the word form) is still in memory. Clearly, it has to fit, but why does it fit? In the second expression the relevant conceptual metaphor is LOSS OF RATIONAL/MORAL CONTROL IS LOSS OF PHYSICAL CONTROL, such as
physical fall (into a (deep) hole). The cause of the loss of rational/moral control is the same as the cause that made the commission arrive at a decision, namely, showing contempt for the law. There are many linguistic expressions that could be used to convey the idea "to cause someone to fall down (into a hole)", including push, drive, force, jolt, nudge, poke, prod, propel, shove, press, butt, and so on. Of these, the most conventional ones are certainly push and drive, both of which occur in the idiom "push/drive someone over the edge". However, in the discourse the author uses tilt, which is an additional but somewhat unmotivated possibility for expressing the idea of causing someone to physically fall down (into a hole). What makes it acceptable and natural, though, is that it fits the metaphor (no matter how unconventionally), on the one hand, and that it is elicited by the word used in the previous linguistic metaphor, on the other. In this manner, the phonetic shape of an expression in discourse can function as an elicitor of a metaphorically used expression in the same discourse, provided that the condition of fitting the required conceptual metaphor is also met.

3.5.1. The effect of the immediate linguistic context in poetry. Let us now return to the Plath poem. As the lines quoted above also suggest, the poet is trying to escape from the harmful influence of her mother. (This can be seen most clearly in the line "Did I escape, I wonder?") What is remarkable here is that, to convey this, the poet makes use of the other sense of medusa: the "jellyfish" sense ("Your stooges / Plying their wild cells in my keel's shadow"). She is trying to get away from an overbearing mother, and the mother is portrayed analogically as a jellyfish. Schools of jellyfish move about in the sea, and jellyfish stings can inflict pain and even cause death in humans.
Thus it can be suggested that the "jellyfish" meaning of medusa is used by the poet because the mythological Medusa was introduced early on in the poem (in the title). The word form medusa evokes all the knowledge structures associated with it (given as the two senses of the word), and the poet takes advantage of them, as they analogically fit the nature of the relationship with her mother. Another motivating factor for the use of the second sense is that, according to some commentators, Sylvia Plath developed a great deal of interest in marine biology at about the time she wrote "Medusa". The personal interests of a poet may also influence the choice of particular metaphorical images (in this case, the image for the addressee).

4. The combined effect of factors on metaphor use

For the sake of clarity of analysis, I have tried to show the relevance of each of the factors to the selection of discourse metaphors one by one. But this does not mean that in reality they occur in an isolated fashion. As a matter of fact, it
is reasonable to expect them to co-occur in real discourse. For example, a person's concerns, or interests, as a factor may combine with additional knowledge about himself or herself, as well as the topic of the discourse, and the three can, in this way, powerfully influence how the conceptualizer will express himself or herself metaphorically. The next and final example demonstrates this possibility in a fairly clear way.

4.1. The combined effect of contextual factors on the use of metaphors in everyday language
While I was doing research for the present article (January through March, 2008), there was heated debate in Hungarian society about whether the country should adopt a health insurance system, similar to that in the U.S.A., based on competing privately-owned health insurance companies, rather than stay with a single, state-owned and state-regulated health-care system. As part of the debate, many people volunteered their opinion on this issue in a variety of media, the Internet being one of them. As I was following the debate on the Internet, I found an article that can serve, in my view, as a good demonstration of a situation in which one's use of metaphors in real discourse is informed by a combination of contextual factors, not just a single one. A Hungarian doctor (Dr. Kullman Tamás) published a substantial essay on one of the Hungarian news networks about the many potential undesirable consequences of the proposed new privatized system. He outlines and introduces what he has to say in his essay in the following way (given first in the Hungarian original):

(18) Dolgozatom a gondolkodási időben született. Célkitűzése a törvény várható hatásainak elemzése. Módszereiben az orvosi gondolkodást követi. A magyar egészségügyet képzeli a beteg helyzetébe. Kezelőorvosnak a kormányt tekinti, s konzulensként a szakértőket, illetve a szerzőt magát kéri fel. A prognózis meghatározás feltételének tekinti a helyes diagnózist. Végül röviden megvizsgálja van-e alternatív kezelési lehetőség. (retrieved from http://mkdsz.hu/content/view/8480/207/, February 2, 2008)

Here's an almost literal translation of the text into English:

This paper was born in the period when people think about the issue. Its objective is to analyze the expected effects of the law. In its methods, it follows the way doctors think. It imagines Hungarian health care as the patient.
It takes the government as the attending physician, and invites experts and the author [of the article] himself to be the consultants. It considers the correct diagnosis to be the precondition for predicting the prognosis. Finally it briefly examines if there is an alternative possibility for treatment.

Unless the author of the article deliberately wishes to provide an illustration for the use of metaphors in real discourse (and I doubt that this is the case), this is a remarkable example of how a combination of contextual factors can influence the way we often speak/write and think metaphorically. The author of the article is a doctor himself, we can assume he has a great deal of interest in his job (he took the trouble of writing the article), and he is writing about Hungarian health care. The first of these is concerned with what I called knowledge about the speaker/conceptualizer; the second corresponds to one's personal concerns, or interests; and the third involves what was called the topic of the discourse. It seems that the three factors are jointly responsible for the way the author uses metaphors in the discourse (and, given this example, for how he, in addition, actually structures what he says). Needless to say, many other combinations of factors can be imagined and expected to co-occur in and influence the use of metaphors in real discourse.

4.1.1. The combined effect of factors in poetry. In many cases of the influence of contextual factors on metaphoric conceptualization in poetry, the kinds of contexts we have identified so far contribute jointly to the metaphorical conceptualization and expression of ideas. Let us consider the Sandburg poem again, as analyzed above. Here's the poem in full:

(19) LAY me on an anvil, O God.
Beat me and hammer me into a crowbar.
Let me pry loose old walls.
Let me lift and loosen old foundations.

Lay me on an anvil, O God.
Beat me and hammer me into a steel spike.
Drive me into the girders that hold a skyscraper together.
Take red-hot rivets and fasten me into the central girders.
Let me be the great nail holding a skyscraper through blue nights into white stars.
(retrieved from http://www.bartleby.com/134/39.html)

We have seen that both the cultural and social contexts motivate the choice of certain aspects of the language and conceptualization of the poem. The religious belief system (from the cultural context) serves as a means of thinking and talking about the making of a new man who can build a new social structure, and the model of work (from the social context) serves as a means of talking and thinking about the construction of the new social structure. But there is an additional type of context that needs to be discussed, as it clearly contributes to the poem's conceptual universe. This is the knowledge the speaker-poet has about himself or herself, as discussed above in connection with the Dickinson example. The knowledge a poet has about himself or herself includes not only the biological-physical condition that characterizes the poet but also his or her personal history. If we take into account Sandburg's personal history, we can account for why he writes "Lay me on an anvil, O God / Beat me and hammer me into a crowbar" (and "into a steel spike" in the second stanza). The likely reason is that his father was a blacksmith, and we can assume that the poet had some early childhood experience with the job of a blacksmith. It is a blacksmith who takes a piece of metal, heats it, puts it on an anvil, and shapes it into some useful object. This personal knowledge about the job may have led the poet to make use of this image. Although both images are simultaneously present and important, the image of the blacksmith overrides the image of God making man. In the Bible, God makes man by forming him from the dust of the ground and breathing life into his nostrils. In the poem, however, the man-object is created by God as a blacksmith. What emerges here is a complex picture in which the creation of the man-object is accomplished by a God-blacksmith and the resulting man-object is used according to the social model of work as source domain to conceptualize the creation of a new social structure. This is a complex case of conceptual integration, or blending, as proposed by Fauconnier and Turner (2002).
What this analysis adds to conceptual integration theory is that it makes the motivation for the particular input frames participating in the blend clear and explicit. My specific suggestion is that the integration network consists of the input spaces (frames) it does (biblical creation, job of a blacksmith, model of work, and creation of new social structure) because of the various contextual influences that were at work in the poet's mind in the course of the metaphorical construction of the poem.

5. What are the sources of metaphorical creativity?

The standard version of conceptual metaphor theory operates with largely uncontextualized or minimally contextualized linguistic examples of hypothesized conceptual metaphors. The conceptual metaphors are seen as constituted by sets of mappings between the source and the target domains. The mappings are assumed to be fairly static conceptual structures. The linguistic metaphors
that are motivated by such static correspondences are entrenched, conventional expressions that eventually find their way into good, detailed dictionaries of languages. Dictionaries and the meanings (either literal or figurative) they contain represent what is static and highly conventional about particular languages. In this view it is problematic to account for metaphorical creativity. How does this somewhat simplified and rough characterization of standard conceptual metaphor theory change in light of the work reported in this paper? Apart from some sporadic studies (such as Aitchison 1987; Benczes, in print; Koller 2004; Kövecses 2005; Semino 2008), the issue of context-induced metaphorical creativity has not been systematically investigated. A considerable portion of novel and unconventional metaphorical language seems to derive from such contextual factors as the immediate linguistic context, knowledge about discourse participants, physical setting, and the like. It remains to be seen how robust the phenomenon is and whether it deserves serious further investigation. Based on an informal collection of data from a variety of newspapers, it appears that the context provides a major source of motivation for the use of many novel metaphors. Many of these metaphors are clearly not, in Grady's (1999) classification, either resemblance-based or correlation-based cases. They seem to have a unique status, in that they are grounded in the context in which metaphorical conceptualization is taking place. Many of the examples of unconventional metaphoric language we have seen in this paper could simply not be explained without taking into account a series of contextual factors.
My claim is that in addition to the well-studied conceptual metaphors and metaphorical analogies used to convey meanings and achieve rhetorical functions in discourse, conceptualizers are also very much aware of, and take advantage of, the various factors that make up the immediate context in which metaphorical conceptualization takes place. (A similar idea can be found in the work of Brandt and Brandt 2005 and in the relevance-theoretic study of metaphor by Sperber and Wilson 2008.) We can imagine these contexts as frames that are nested in one another, such that the physical setting as the outermost frame includes the social frame that includes the cultural frame, where we find the speaker/conceptualizer, the hearer/conceptualizer, and the topic, as well as the diagram for the flow of discourse (functioning as the immediate linguistic context). This idea of contexts as nested frames bears resemblance to Langacker's construct of "current discourse space", which he defines as "everything presumed to be shared by the speaker and hearer as the basis for discourse at a given moment" (Langacker 2008: 281). The contextual factors I describe in this paper can all trigger, prompt, facilitate, or simply prime, singly or in combination, the use of conventional or unconventional and novel metaphorical expressions in the discourse. We can represent the joint workings of these factors in the Langackerian diagram below:
Figure 1. Nested contexts
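The nesting of contexts described in the text can also be written down as a small data structure. The Python below is only an illustrative sketch and not part of the original analysis: the class name `Frame`, the helper `path_to`, and the frame labels are hypothetical choices that follow the prose description (the physical setting contains the social frame, which contains the cultural frame, which in turn holds the conceptualizers, the topic, and the flow of discourse).

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Frame:
    """A context frame that may contain further, more local frames."""
    name: str
    children: List["Frame"] = field(default_factory=list)

    def path_to(self, target: str) -> Optional[List[str]]:
        """Return the chain of frame names from this frame down to `target`,
        or None if `target` is not nested anywhere inside this frame."""
        if self.name == target:
            return [self.name]
        for child in self.children:
            sub = child.path_to(target)
            if sub is not None:
                return [self.name] + sub
        return None


# Hypothetical labels, taken from the prose description of the nested contexts.
nested_contexts = Frame("physical setting", [
    Frame("social context", [
        Frame("cultural context", [
            Frame("speaker/conceptualizer"),
            Frame("hearer/conceptualizer"),
            Frame("topic"),
            Frame("flow of discourse"),
        ]),
    ]),
])

# Every inner element is reachable only through the frames that enclose it.
print(nested_contexts.path_to("topic"))
```

Read this way, any contextual factor that primes a metaphor is located inside all of the frames on its path, which is one way of picturing how several factors can act in combination.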
In some cases, the contextual factors will simply lead to the emergence and use of well-worn, conventional metaphorical expressions, but in others they may lead conceptualizers to choose genuinely novel or unconventional metaphorical expressions. The core idea is that we try to be coherent with most of the factors that regulate the conceptualization of the world. A major source of the pressure of coherence is our body, the body on which correlational metaphors are based. If there is a group of metaphors (such as HAPPY IS UP) that are dedicated to the activation of particular meanings and that are grounded in embodied experience, that embodiment may lead to the use of certain metaphorical expressions that can activate the intended meanings. Such embodied, correlation-based conceptual metaphors tend to be stable both across time and cultures. The second source of the pressure of coherence comes from the context in which metaphorical conceptualization takes place. People produce metaphors inspired by the contextual factors we have seen. This means that speakers try (and tend) to be coherent with various aspects of the communicative situation in the process of creating metaphorical ideas. Many context-induced metaphorical expressions appear to be novel and unconventional. This is because the (immediate) context of discourse varies from one discourse to another, and with it the linguistic metaphors that are based on the context will also vary.

6. Conclusions

Metaphorical creativity in discourse can involve several distinct cases: (a) the case where a novel source domain is applied or novel elements of the source are applied to a given target domain (source-induced creativity); (b) the case where elements of the target originally not involved in a set of constitutive
mappings are utilized and matching counterparts are found in the source (target-induced creativity); (c) the case of conceptual integration where elements from both source and target are combined in new ways; and (d) the case where various contextual factors lead to novel or unconventional metaphors (context-induced creativity). The paper has examined the interrelations among the notions of metaphor, discourse, and creativity. Several important connections have been found with respect to contextual factors in conceptual metaphor creation. Conceptualizers seem to rely on a number of contextual factors when they use metaphors in discourse. The ones that have been identified in the paper include (a) the immediate linguistic context, (b) the knowledge conceptualizers have about themselves and the topic, (c) the immediate cultural context, (d) the immediate social context, and (e) the immediate physical setting. Since all of these are shared by the speaker and hearer (the conceptualizers), the contextual factors facilitate the development and mutual understanding of the discourse. The view that many metaphors in real discourse emerge from context has implications for conceptual metaphor theory. The most recent and dominant version of conceptual metaphor theory emphasizes the importance of primary metaphors that arise from certain well-motivated correlations between bodily and subjective experiences (e.g., knowing as seeing) (see, for example, Grady 1997a; Lakoff and Johnson 1999). These metaphors are, in turn, seen as having a neural basis (see Feldman 2006; Lakoff 2008). In the view that I am proposing, in addition to such metaphors, there are what I call context-induced metaphors that derive not from some such correlations in experience but from the context of metaphorical conceptualization. To my mind, a good example of such a metaphor is the one used for Schwarzenegger: "The Rise of the Machine".
There is no resemblance between Schwarzenegger and the film title, and the metaphor is not based on some bodily correlation either; it derives from the cultural context, that is, it is a context-induced metaphor. In addition to forming a new class of metaphors, context-induced metaphors are important because they reveal a further aspect of human creativity in conceptualizing the world. This class of metaphors was clearly demonstrated in the case of poetry. To some, however, to say that such metaphors represent a new class may be overstating the results of this paper. It may be suggested that even though there is not always a bodily basis, there is always some resemblance on which metaphors are based. In this case, I would argue in the following way: potential resemblances between entities are legion, but what helps (triggers, prompts, etc.) us choose a source domain would be some contextual factor. If this is what is going on, the weaker conclusion would be that what I call context-induced metaphors constitute a subclass of resemblance metaphors. I believe that the analyses of metaphorical language in poetry I have presented in the paper have certain implications for a variety of issues in the study of poetry.
First, the analyses indicate that it is possible to go beyond some limited, and limiting, approaches to the interpretation of poetry. Poems and poetic language are sometimes studied from a purely hermeneutical-postmodernist perspective without any regard to the social-cultural-personal background to the creative process. Poems are, on the other hand, also sometimes studied from a purely social-historical perspective without any regard to the text-internal systematicity of the poem. The approach that I am advocating here provides a natural bridge between these two apparently contradictory views, in that context-induced metaphors can be seen as both resulting from the social-cultural-personal background and lending coherent meaning structures to particular poems. This view is supported by, for example, Guthrie, who claims:
Finally, I would add that I am only too well aware that readings based upon biographical evidence are apt to become excessively reductive and simplistic. Nevertheless, in the prevailing postmodernist critical climate, I think we actually stand at greater risk of underestimating the degree of intimacy existing between an author's literary productions and the network of experiences, great and small, that shapes an individual life. (Guthrie 1998: 5)
Second, a related implication of the analyses for the study of metaphor in poetry is that in many cases such analyses can point to an additional source of metaphorical creativity in poetry. The use of contextually-based, or context-induced, metaphors is often novel in poems, simply because the contexts themselves in which poems are created are often unique and/or specific to a particular poet. Just as importantly, although the particular situations (contexts) in which poets conceptualize the world may often be specific to particular poets and hence the metaphors they use may be unique, the cognitive process (i.e., the effect of context on conceptualization) whereby they create them is not. I have argued that context-induced metaphors are also used in everyday speech. In light of what we have seen in this paper, what seems to be unique to metaphorical conceptualization in poetry is the density and complexity of the process of contextual influence on poets. The poem "Prayers of Steel" by Carl Sandburg is a good illustration of how a variety of contextual factors can jointly shape a poet's metaphors within the space of a few lines. In other words, I do not claim in this paper that everyday discourse and poetry are not different. What I claim is that their difference does not come from conceptual metaphors (of whatever kind). Our felt sense of the difference (in addition to many other things, such as formal properties of poetry) derives in part from the density and complexity of context-induced and bodily-based metaphors we find in poetry.

Third, the view proposed here may have certain implications for the study of embodied cognition. If it is the case that, for instance, the physical-biological properties of a poet can influence his or her metaphorical conceptualization in the course of creating poems, as we saw in Dickinson's case, then embodied
cognition can be based on personal experiences as well, not only on universal correlations in experience, as assumed by the dominant view. If what I found is correct, embodied cognition may be based on a variety of different experiences in metaphorical conceptualization, including universal experiences, but also social, cultural, etc. experiences, and, importantly, unique personal ones.

All in all, then, in answer to the question posed at the beginning of the paper, I suggest that we recruit conceptual materials for metaphorical purposes not only from bodily experience but also from a variety of contexts in which we speak, think, and act metaphorically. Since the contexts can be highly variable, the metaphors used will often be variable, novel, and unconventional. The pressure of coherence affects us in both of these major ways. I hope to have made a case for the necessity of studying the influence of context in the cognitive linguistic study of metaphor. I feel this is a much needed area of research without which we cannot account for much of what is happening in the use of metaphors. As I have indicated above, several individual researchers have set out in a direction very similar to the present enterprise. My further hope is that others will join us from diverse disciplines, such as cognitive linguistics, relevance theory, cognitive poetics, cognitive psychology, cognitive anthropology, applied linguistics, multimodal communication and media studies, cognitive semiotics, and the like, in the study of figurative creativity within (and beyond) the framework proposed in the paper.

Received 23 October 2009
Revision received 17 February 2010

References
Aitchison, Jean. 1987. Words in the Mind. Oxford: Blackwell.
Benczes, Réka. In print. Setting limits on creativity in the production and use of metaphorical and metonymical compounds. In Sascha Michel and Alexander Onysko (eds.), Cognitive Approaches to Word Formation. Berlin and New York: Mouton de Gruyter.
Boers, Frank. 1999. When a bodily source domain becomes prominent. In Ray Gibbs and Gerard Steen (eds.), Metaphor in Cognitive Linguistics, 47–56. Amsterdam: John Benjamins.
Boers, Frank and Murielle Demecheleer. 1997. A few metaphorical models in (western) economic discourse. In W. A. Liebert, G. Redeker, and L. Waugh (eds.), Discourse and Perspective in Cognitive Linguistics, 115–129. Amsterdam: John Benjamins.
Boers, Frank and Murielle Demecheleer. 2001. Measuring the impact of cross-cultural differences on learners' comprehension of imageable idioms. ELT Journal, 55, 255–262.
Brandt, Line and Per Aage Brandt. 2005. Making sense of a blend. A cognitive semiotic approach to metaphor. Annual Review of Cognitive Linguistics, 3, 216–249.
Cameron, Lynne. 2003. Metaphor in Educational Discourse. London: Continuum.
Cameron, Lynne. 2007. Patterns of metaphor use in reconciliation talk. Discourse and Society, 18, 197–222.
Clausner, Timothy and William Croft. 1997. Productivity and schematicity in metaphors. Cognitive Science, 21(3), 247–282.
Eötvös Loránd University

Deignan, Alice. 1999. Corpus-based research into metaphor. In Lynne Cameron and Graham Low (eds.), Researching and Applying Metaphor, 177–199. Cambridge: Cambridge University Press.
Deignan, Alice. 2003. Metaphorical expressions and culture: An indirect link. Metaphor and Symbol, 18(4), 255–271.
Deignan, Alice. 2005. Metaphor and Corpus Linguistics. Amsterdam: John Benjamins.
Dobrovol'skij, Dmitrij and Elisabeth Piirainen. 2005. Figurative Language. Cross-cultural and Cross-linguistic Perspectives. Amsterdam: Elsevier.
Fauconnier, Gilles and Mark Turner. 2002. The Way We Think. New York: Basic Books.
Feldman, Jerome A. 2006. From Molecule to Metaphor. A Neural Theory of Language. Cambridge, Massachusetts/London, England: The MIT Press.
Freeman, Margaret H. 1995. Metaphor making meaning: Dickinson's conceptual universe. Journal of Pragmatics, 24, 643–666.
Freeman, Margaret H. 2000. Poetry and the scope of metaphor: Toward a cognitive theory of metaphor. In Antonio Barcelona (ed.), Metaphor and Metonymy at the Crossroads, 253–281. Berlin: Mouton de Gruyter.
Freeman, Margaret H. 2007. Cognitive linguistic approaches to literary studies: State of the art in cognitive poetics. In Dirk Geeraerts and Hubert Cuyckens (eds.), The Oxford Handbook of Cognitive Linguistics, 1175–1202. Oxford: Oxford University Press.
Gevaert, Caroline. 2001. Anger in Old and Middle English: a hot topic? Belgian Essays on Language and Literature, 89–101.
Gevaert, Caroline. 2005. The ANGER IS HEAT question: detecting cultural influence on the conceptualization of anger through diachronic corpus analysis. In N. Delbecque, J. Van der Auwera, and D. Geeraerts (eds.), Perspectives on Variation: Sociolinguistic, Historical, Comparative, 195–208. Berlin: Mouton de Gruyter.
Grady, Joseph. 1997a. Foundations of Meaning: Primary Metaphors and Primary Scenes. Doctoral dissertation, University of California at Berkeley.
Grady, Joseph. 1997b. THEORIES ARE BUILDINGS revisited. Cognitive Linguistics, 8, 267–290.
Grady, Joseph. 1999. A typology of motivation for conceptual metaphors. Correlations vs. resemblance. In Ray W. Gibbs and Gerard J. Steen (eds.), Metaphor in Cognitive Linguistics, 79–100. Amsterdam and Philadelphia: John Benjamins.
Guthrie, James R. 1998. Emily Dickinson's Vision. Illness and Identity in her Poetry. Gainesville, FL: The University Press of Florida.
Jackendoff, Ray and David Aaron. 1991. Review Article: More than cool reason: A field guide to poetic metaphor by George Lakoff and Mark Turner. Language, 67(2), 320–328.
Koller, Veronika. 2004. Metaphor and Gender in Business Media Discourse: a Critical Cognitive Study. Basingstoke and New York: Palgrave.
Kolodny, Annette. 1975. The Lay of the Land. Metaphor as Experience and History in American Life and Letters. Chapel Hill: The University of North Carolina Press.
Kolodny, Annette. 1984. The Land Before Her. Fantasy and Experience of the American Frontiers, 1630–1860. Chapel Hill: The University of North Carolina Press.
Kövecses, Zoltán. 2000. American English. An Introduction. Peterborough, CA: Broadview Press.
Kövecses, Zoltán. 2002. Metaphor. A Practical Introduction. Oxford: Oxford University Press.
Kövecses, Zoltán. 2005. Metaphor in Culture. Universality and Variation. Cambridge: Cambridge University Press.
Kövecses, Zoltán. 2006. Language, Mind, and Culture. A Practical Introduction. New York: Oxford University Press.
Kövecses, Zoltán. 2008. Conceptual metaphor theory: some criticisms and some alternative proposals. Annual Review of Cognitive Linguistics, 6, 168–184.
Kövecses, Zoltán. 2009a. Metaphor, culture, and discourse: the pressure of coherence. In Andreas Musolff and Jörg Zinken (eds.), Metaphor and Discourse, 11–24. Palgrave Macmillan.
Kövecses, Zoltán. 2009b. The effect of context on the use of metaphor in discourse. Ibérica (Journal of the European Association of Languages for Specific Purposes), Número/Number 17, Primavera/Spring, 11–23.
Kövecses, Zoltán. In print. Methodological issues in conceptual metaphor theory. In Hans-Joerg Schmid and Sandra Handl (eds.), Windows to the Mind. Metaphor, Metonymy, and Conceptual Blending. Berlin: Mouton de Gruyter.
Kövecses, Zoltán. 2010. Metaphor. A Practical Introduction. Second Edition. New York: Oxford University Press.
Köves, Nikoletta. 2002. Hungarian and American dreamworks of life. Term paper. Department of American Studies, Eötvös Loránd University, Budapest.
Lakoff, George. 2008. The neural theory of metaphor. In Ray Gibbs (ed.), The Cambridge Handbook of Metaphor and Thought, 17–38. New York: Cambridge University Press.
Lakoff, George and Mark Johnson. 1980. Metaphors We Live By. Chicago: University of Chicago Press.
Lakoff, George and Mark Johnson. 1999. Philosophy in the Flesh. New York: Basic Books.
Lakoff, George and Mark Turner. 1989. More Than Cool Reason. Chicago: The University of Chicago Press.
Langacker, Ronald W. 2008. Cognitive Grammar. A Basic Introduction. Oxford/New York: Oxford University Press.
Musolff, Andreas. 2001. Political imagery of Europe: a house without exit doors? Journal of Multilingual and Multicultural Development, 21(3), 216–229.
Pragglejaz Group. 2007. MIP: A method for identifying metaphorically used words in discourse. Metaphor and Symbol, 22(1), 1–39.
Rakova, Maria. 2002. The philosophy of embodied realism: A high price to pay? Cognitive Linguistics, 13(3), 215–244.
Ritchie, David. 2003. ARGUMENT IS WAR – Or is it a game of chess? Multiple meanings in the analysis of implicit metaphors. Metaphor and Symbol, 18(2), 125–146.
Semino, Elena. 2005. The metaphorical construction of complex domains: The case of speech activity in English. Metaphor and Symbol, 20(1), 35–70.
Semino, Elena. 2008. Metaphor in Discourse. Cambridge: Cambridge University Press.
Sperber, Dan and Deirdre Wilson. 2008. A deflationary account of metaphor. In Ray Gibbs (ed.), The Cambridge Handbook of Metaphor and Thought, 84–105. New York: Cambridge University Press.
Steen, Gerard. 1999. From linguistic to conceptual metaphor in five steps. In R. Gibbs and G. Steen (eds.), Metaphor in Cognitive Linguistics, 57–77. Amsterdam: John Benjamins.
Stefanowitsch, Anatol. 2007. Words and their metaphors. In Anatol Stefanowitsch and Stefan Th. Gries (eds.), Corpus-based Approaches to Metaphor and Metonymy, 64–105. Berlin: Mouton de Gruyter.
Turner, Mark. 1996. The Literary Mind. New York: Oxford University Press.
Zinken, Joerg. 2007. Discourse metaphors: the link between figurative language and habitual analogies. Cognitive Linguistics, 18(3), 445–466.
Yu, Ning. 1998. The Contemporary Theory of Metaphor: A Perspective from Chinese. Amsterdam: John Benjamins.
Literary sources
Arnold, Matthew. Dover Beach. Retrieved from http://www.artofeurope.com/arnold/arn1.htm.
Dickinson, Emily. I reckon – when I count it all. Retrieved from http://poetry.poetryx.com/poems/2520/.
Plath, Sylvia. Medusa. Retrieved from http://www.americanpoems.com/poets/sylviaplath/1412.
Sandburg, Carl. Prayers of Steel. Retrieved from http://www.bartleby.com/134/39.html.
Sandburg, Carl. Skyscraper. Retrieved from http://www.bartleby.com/165/55.html.
Paradigm structure: Evidence from Russian suffix shift

TORE NESSET and LAURA A. JANDA*
Abstract
In this article we apply one of the key concepts in cognitive linguistics, the radial category, to inflectional morphology. We advance the Paradigm Structure Hypothesis, arguing that inflectional paradigms are radial categories with internal structure primarily motivated by semantic relationships of markedness and prototypicality. It is possible to construct an expected structure for a verbal paradigm, facilitating an empirical test for our hypothesis. Data tracking an on-going morphological change in Russian documents the distribution of conservative vs. innovative forms across the cells of the verbal paradigm. A logistic regression model that takes into account the sources of variation (the frequencies of individual verbs and paradigm slots, and individual verb preferences) shows that the language change is implemented differently across the paradigm forms, confirming the expected structure. In addition to markedness and prototypicality, we investigate the impact of frequency and show that there is a good, albeit not perfect match between the expected hierarchy and frequency. We conclude that the diachronic change analyzed in this article gives evidence for the structure of paradigms modeled on the radial category.
Keywords: radial category, prototypicality, paradigm, language change, Russian
* Address for correspondence: Tore Nesset: Dept. of Languages and Linguistics, University of Tromsø, NO-9037 Tromsø, Norway. Email: tore.nesset@uit.no. Laura A. Janda: Dept. of Languages and Linguistics, University of Tromsø, NO-9037 Tromsø, Norway. Email: laura.janda@uit.no. The authors gratefully acknowledge support from the University of Tromsø and the Norwegian Research Council. We received numerous valuable comments and suggestions made by the Editor, Associate Editor and two anonymous peers in the review process, which helped us to make substantial improvements in this article. Of course we retain responsibility for any remaining errors or shortcomings. Cognitive Linguistics 21–4 (2010), 699–725 DOI 10.1515/COGL.2010.022 0936–5907/10/0021–0699 © Walter de Gruyter
1. Introduction
Taking cognitive linguistics as its point of departure, this article analyzes the specific details of an on-going language change, a suffix shift among Russian verbs, with an eye toward the theoretical implications this change has for the structure of inflectional paradigms as radial categories. Section 2 presents the facts of the Russian suffix shift, showing which forms in the paradigm it affects and what changes in the grammar it entails. Section 3 is devoted to issues surrounding the paradigm and its structure. Theoretical arguments concerning the status of the paradigm come first (3.1), followed by a discussion of whether the paradigm has internal structure (3.2). Assuming that the paradigm has both status and structure, we offer the radial category as a likely model for hierarchically-structured paradigms (3.3). This section concludes with an expected structure for the Russian verbal paradigm based on rankings of markedness and prototypicality (3.4). This expected structure provides a concrete test case for the Paradigm Structure Hypothesis proposed in Section 4. A statistical analysis of data documenting the Russian suffix shift confirms the expected structure and thus the hypothesis (4.1). In Section 5 we discuss the role of frequency. Conclusions are drawn in Section 6.
2. Russian suffix shift
The Russian verbal suffix shift provides us with the opportunity to witness a language change in progress and analyze its mechanisms in detail. Graudina et al. (2001: 283) show that suffix shift started several centuries ago, and that the change is still in progress (see also Kiparsky 1967: 208ff.). According to Andersen (1980: 297), the Russian verbal suffix shift has been in evidence for the past millennium and has all the earmarks of a change in progress. This change involves a shift from a non-productive verbal suffix to a productive one, yielding regularization in the grammar.
Though this change is well-known, no previous attempt has been made to investigate its distribution across the paradigm. Data culled from the Russian National Corpus (henceforth RNC, at www.ruscorpora.ru) reveal that the Russian suffix shift is progressing through the verbal paradigm in an uneven fashion.1
1. The data was collected from the RNC in April–July 2006. We gratefully acknowledge the assistance of Hyug Ahn and John Korba in extracting these data. The Russian National Corpus contains approximately 140 million words collected from a wide variety of genres and authors. Though the bulk of material is written and recent (post 1950), spoken Russian and earlier sources are also represented.
Table 1. Forms of kapat' 'drip' with the original -a suffix and the innovative -aj suffix

                             -a suffix      -aj suffix
Non-Past 1sg                 kaplju         kapaju
Non-Past 2sg                 kapljoʃ        kapajoʃ
Non-Past 3sg                 kapljot        kapajot
Non-Past 1pl                 kapljom        kapajom
Non-Past 2pl                 kapljotje      kapajotje
Non-Past 3pl                 kapljut        kapajut
Present Active Participle    kapljuʃjːij    kapajuʃjːij
Imperative                   kaplji(tje)    kapaj(tje)
Gerund (Verbal Adverb)       kaplja         kapaja
Infinitive                   kapatj         kapatj
Past masculine sg            kapal          kapal
Past feminine sg             kapala         kapala
Past neuter sg               kapalo         kapalo
Past pl                      kapalji        kapalji
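The comparison that Table 1 invites can be sketched mechanically. The following minimal Python sketch, which is an illustration and not part of the authors' analysis, lists the cells in which the two suffixes yield different forms and those in which they coincide. The names `TABLE_1` and `split_cells` are hypothetical, and the forms are entered in phonemic transcription on the assumption that the sibilants are written with IPA symbols.

```python
# Forms of kapat' 'drip' in phonemic transcription: (cell, -a form, -aj form).
# The IPA sibilant symbols are an assumption about the transcription.
TABLE_1 = [
    ("Non-Past 1sg", "kaplju", "kapaju"),
    ("Non-Past 2sg", "kapljoʃ", "kapajoʃ"),
    ("Non-Past 3sg", "kapljot", "kapajot"),
    ("Non-Past 1pl", "kapljom", "kapajom"),
    ("Non-Past 2pl", "kapljotje", "kapajotje"),
    ("Non-Past 3pl", "kapljut", "kapajut"),
    ("Present Active Participle", "kapljuʃjːij", "kapajuʃjːij"),
    ("Imperative", "kaplji(tje)", "kapaj(tje)"),
    ("Gerund (Verbal Adverb)", "kaplja", "kapaja"),
    ("Infinitive", "kapatj", "kapatj"),
    ("Past masculine sg", "kapal", "kapal"),
    ("Past feminine sg", "kapala", "kapala"),
    ("Past neuter sg", "kapalo", "kapalo"),
    ("Past pl", "kapalji", "kapalji"),
]


def split_cells(table):
    """Split cells into those where the two suffixes give different forms
    (affected by the shift) and those where the forms are identical."""
    affected, homophonous = [], []
    for cell, a_form, aj_form in table:
        if a_form == aj_form:
            homophonous.append(cell)
        else:
            affected.append(cell)
    return affected, homophonous


affected, homophonous = split_cells(TABLE_1)
print("Affected by suffix shift:", affected)
print("Identical for both suffixes:", homophonous)
```

On these forms, the cells that come out as identical are the infinitive and the four past tense forms; the remaining nine cells are the ones in which the shift is visible.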
With few exceptions, Russian verbs are suffixed.2 The Russian suffix shift involves thirty-seven verbs that are in the process of shifting from the nonproductive -a suffix to the productive -aj suffix. Table 1 illustrates this change.3 There are two observations to be made from Table 1. The first observation is that the forms of the past tense and infinitive are identical for the two suffixes. The homophony of past tense and infinitive forms likely serves as the motive for an abductive change (Andersen 1973), whereby the productive suffix is replacing the non-productive one. This change can thus be compared to the change in English of verbs from the strong to the weak pattern. However, unlike English, where there are some examples of verbs shifting in the opposite direction, as in the case of sneak/sneaked shifting to sneak/snuck (cf. Bybee and Slobin 1982), the Russian suffix shift is unidirectional. The second observation is that the suffix shift eliminates a consonant alternation. Verbs that are undergoing suffix shift have a consonant alternation in their -a suffixed forms that is absent in their -aj suffixed replacements. In addition to /p/ ~ /plj/, verbs undergoing suffix shift show the following alternations: /m/ ~ /mlj/, /b/ ~ /blj/, /k/ ~ /tʃj/, /ɡ/ ~ /ʒ/, /x/ ~ /ʃ/, /t/ ~ /tʃj/, /d/ ~ /ʒ/, /sk/ ~ /ʃjː/, /st/ ~ /ʃjː/. This simplification is discussed in detail in Nesset 2008. In the case of kapat', the consonant alternation can be seen throughout all Non-Past forms, plus the Present Active Participle, Imperative, and Gerund, where /p/ alternates with /plj/ in the -a suffixed forms, but does not alternate in the -aj suffixed forms (where all forms contain only /p/). The Russian suffix shift therefore entails regularization and simplification of the grammar because it moves verbs from a nonproductive to a productive class and removes a morphophonemic alternation.
The Russian suffix shift is well attested in reference works such as Zaliznjak (1977) and Švedova et al. (1980) and has been examined from the perspectives of language acquisition, psycholinguistics, stylistic variation, sociolinguistics and dialectology (cf. e.g., Andersen 1980; Gagarina 2003; Gor and Chernigovskaya 2001, 2003a, b, 2005; Graudina et al. 2001; Kiebzak-Mandera et al. 1997; Krysin (ed.) 1974; Tkachenko and Chernigovskaya 2006 and references therein). However, there has not been any study that examines differences in the realization of this change across forms of the verbal paradigm. As reflected in Table 1, the Russian suffix shift affects the Non-Past, Present Active Participle, Imperative, and Gerund forms. In Section 4 we present an empirical study of how the suffix shift is progressing through these forms of the verbal paradigm. Before turning to the data and analysis, however, we need to discuss the notion of the paradigm and its theoretical implications, which is the topic of Section 3.
2. The exceptions involve a handful of non-productive conjugation types. Townsend (1975: 98–99) gives an inventory of the non-suffixed verbal stems in Russian, totaling 67, which is meager in comparison with the estimated 20,000 verbal stems (Divjak 2004) in the language.
3. Russian verbs are conjugated for only two tenses: Past and what is traditionally called Non-Past. The latter usually expresses present tense for imperfective verbs and future tense for perfective verbs. The forms in Table 1 are cited in a phonemic transcription in order to show the presence of /j/ in the -aj suffixed forms. Both Cyrillic and standard transliterations fail to overtly mark the presence of an intervocalic jod. Phonemic transcription is also used in Figure 1 and Tables 2 and 4, since they refer to forms in Table 1. Elsewhere in the text of the article we use a standard transliteration, except when phonemic transliteration is required for clarity (and marked with //). However, a compromise had to be made in Figure 4, since the statistical software is not compatible with diacritics or the use of apostrophes.
3. The paradigm and its structure
A number of theoretical questions surrounding the concept of the paradigm remain open, such as: Do paradigms have a theoretical status? If paradigms do have such a status, do they have internal structure? If paradigms have internal structure, what kind of structure do they have? What factors motivate the internal structure of paradigms? In this section we briefly review the theoretical arguments that have been made concerning the existence and structuring of paradigms, limiting discussion to arguments relevant to our analysis in Section 4. If we assume that paradigms do exist and do have structure, the next step is to propose what sort of structure the paradigm should have. It should then be possible to suggest a structure for the Russian verbal paradigm that is relevant for our data on the Russian suffix shift. This section follows this line of reasoning in order to establish a concrete hypothesis that can be tested against our data.
3.1. Does the paradigm have theoretical status?
Let us offer an informal definition of paradigm as the set of inflected forms that share a single stem of a lexeme. Although few theoretical terms have a longer history in linguistics, there is considerable disagreement about the status of the inflectional paradigm. Some linguistic theories deny the paradigm any status whatsoever, and among theories that accept the paradigm there are opposing views concerning whether paradigms are unstructured inventories or coherent networks. Paradigms are a hallmark of our legacy from the classical grammarians of Greece and Rome. Plank (1990b: 161) notes that the earliest extant grammatical texts are paradigms that date from about 1600 BC and present the inflected forms of Sumerian. Robins (1979: 25) characterizes the central role of paradigms in this tradition as follows: "The framework of grammatical description in western antiquity was the word and paradigm model." The listing of paradigms is a standard feature of reference grammars to this day.
In contemporary linguistics, however, the status of the paradigm has been challenged and paradigmatic relations have been neglected (van Marle 1985: 15). Morpheme-based approaches to morphology dispensed with the paradigm altogether. Such frameworks include classical Item and Arrangement (IA) and Item and Process (IP; both described in Hockett 1958), as well as more recent approaches such as Distributed Morphology (Halle and Marantz 1993; Noyer URL). These frameworks propose a lexicon that contains morphemes and a grammar that specifies rules for combining those morphemes, obviating the need for the paradigm. The paradigm is accordingly regarded as an epiphenomenon lacking status as a theoretical object.
The paradigm has not, however, been unanimously rejected. Morphological models such as Word and Paradigm (Matthews 1972, 1991; Anderson 1992) and Paradigm Function Morphology (Stump 2001) argue in favor of the paradigm. Stump (2001: 32) claims that paradigms play a central role in the definition of a language's inflectional morphology and that paradigms are not the epiphenomenon that they are often assumed to be in other theories, but constitute a central principle of morphological organization. Fundamental to Stump's framework are paradigm functions that relate a root to a cell in a paradigm.
The morpheme-based approaches and the word-based approaches both proceed from a set of theoretical assumptions to build their models, differing radically in whether they choose to deny or recognize the paradigm. An alternative way to probe the theoretical value of the paradigm is by asking whether there are linguistically significant generalizations that cannot be stated without referring to the paradigm. If this is the case, it follows that the paradigm has theoretical status. Both the Paradigm Economy Principle (Carstairs 1987) and the No Blur Principle (Carstairs-McCarthy 1994) have featured generalizations that support the recognition of relationships between inflectional classes. These principles provide at least indirect evidence for the paradigm since inflectional classes rely on the existence of paradigms of individual lexical items (but cf. Müller 2007 for a dissenting opinion).
Another kind of evidence in favor of the paradigm is syncretism. If the notion of paradigm enables us to state generalizations about syncretism, we have an argument for the paradigm as more than an epiphenomenon. McCreight and Chvany (1991), following Jakobson's (1958) lead, argue that geometrical representations of paradigms facilitate more insightful descriptions of syncretism than syntactic features. For somewhat different approaches, which also concern geometrical representations of paradigms, see Plank (1990b) and Trosterud (2006).
Additional evidence for the psychological reality of paradigms comes from experimental linguistics. Milin et al. (2008) report on a psycholinguistic experiment with Serbian nouns showing that increased complexity of paradigms and inflectional classes yielded longer response times. Milin et al. (2008: 21) conclude that their results support the theoretical concepts of paradigms and inflectional classes.
Our aim is to continue the line of reasoning that presents arguments in favor of the paradigm. In Section 4 we present empirical evidence that cannot be explained without reference to the paradigm. If we accept the paradigm as a theoretically significant entity, the question arises as to whether the paradigm has structure.
3.2. Does the paradigm have structure?
The null hypothesis is that the paradigm has no structure, being an unordered list of equiprobable items. Many modern theories take the null hypothesis for granted. A prominent recent example is McCarthy's (2005) theory of Optimal Paradigms, which treats paradigms as unstructured entities where the same symmetrical relationships hold among the members so that no form enjoys a privileged position. Alternatively, the notion of structured paradigms has a long tradition and continues to enjoy support among a variety of scholars. Bybee (1985: 49) and Karlsson (1985: 137) both provide detailed criticisms of the assumption that a paradigm has no structure, and their arguments are backed up by empirical evidence.
In traditional grammars, certain items were recognized to have privileged status as leading or base forms from which other items in a paradigm are formed. This asymmetric relationship implies that the paradigm has a structure in which certain forms play a more central role. Matthews' original version of the Word and Paradigm Model lacked formal mechanisms that could accommodate paradigm structure. However, Matthews (1972: 86 et passim; cf. also Morin 1990) himself was aware of what he calls parasitic formations, i.e., relationships within a paradigm where one stem seems to be derived from a stem of supposedly identical status.4 In later versions of the model, Matthews (1991: 201) invokes metarules, which are closely related to the rules of referral in Stump's (1993, 2001) Paradigm Function Morphology. Both metarules and rules of referral involve relationships among certain forms of a paradigm, and thus imply that inflectional paradigms have structure.
Paradigms as structured sets are pivotal in Bybee's (1985) and Wurzel's (1984, 1989) approaches to inflection. Bybee (1985: 50) discusses asymmetrical basic–derived relations holding among the members of a paradigm that resemble the implicational relationships investigated in Wurzel (1984: 116–124, 1989: 112–121). In these approaches, paradigms are structured networks in the sense that the members have different statuses because asymmetrical relationships hold between them. According to Bybee (1985: 57) a basic member of a paradigm has a high degree of autonomy, which is the product of three factors, viz. semantics, frequency and irregularity (cf. also Enger 2004). Plank (1990a: 35) likewise analyzes paradigms as structured sets, concluding that paradigms must be recognized to be more than mere unstructured collections of labeled forms. As important factors for paradigm structure, Plank (ibid.) mentions the ordering and grouping of terms realizing inflectional categories, the hierarchical ranking of categories thus realized, the markedness evaluation of paradigmatic oppositions, and the singling out of one or more paradigm slots as formally more characteristic than the others.
The studies reviewed in this section suggest that paradigms have internal structure. The next step is to explore what kind of structure we find in paradigms, how this structure is organized and motivated.
4. We will return to the phenomenon of parasitic formations in Section 4.1. Similar paradigmatic relations appear to play a crucial role in derivational morphology as well, as argued by Booij (1997).
3.3. Paradigm structure as modeled by the radial category
We propose that paradigm structure can be modeled using one of the cornerstones of cognitive linguistics, namely the radial category. In a radial category, membership is defined in terms of an element's similarity to a central member or subcategory, the prototype. In other words, radial categories are structured around one or several prototypes that are involved in asymmetrical relationships with peripheral members of the category. We suggest that analyzing inflectional paradigms as radial categories has several advantages.
The first advantage pertains to cognitive science. Prototypes and radial categories have been applied in psychology (cf. Rosch 1973, 1978), and appear to emerge from general principles of cognition. The radial category thus offers an explanatory account of paradigm structure that eliminates the need to invoke any additional mechanisms that would be valid only for language. Radial categories and prototypes have proven fruitful in morphological analysis, as observed inter alia by Bybee and Moder (1983), Janda (1993) and Dąbrowska (1997), and yield precisely the type of structured network that has been proposed for paradigms by Bybee (1985) and Wurzel (1984 and 1989). Karlsson (1985, 1986) observed a skewed distribution of Finnish paradigm forms in corpus data and concluded that paradigms must be structured. Because he found a correlation between semantic classes of nouns and the frequency profiles of their paradigms, Karlsson further deduced that meaning properties motivate paradigm structure and that the paradigm is dominated by a few stereotypic forms that are the morphological analogues of the prototypes in Rosch's theory of word meaning (Karlsson 1985: 150). Arppe (2005) and Janda and Solovyev (2009) have confirmed the connection between meaning and the frequency distribution of paradigm forms in their studies of synonymy. These studies support the notion that a paradigm may be structured as a radial category and that this structure can be probed empirically.
A second advantage is that radial categories have been applied successfully in a variety of areas of linguistics (see particularly Lakoff 1987, Geeraerts 1995, Croft and Cruse 2004, and Lewandowska-Tomaszczyk 2007). The radial category provides conceptual unification across subdisciplines of linguistics, and thus enables us to relate paradigm structure directly to findings in e.g., lexicon, syntax and phonology.
As a third point, we would like to mention that the radial category accounts for the markedness effects observed in paradigms (Andersen 1989: 378, see also Andersen 2001). The radial category has an internal structure based on the asymmetrical relationship between the central (unmarked) prototype and (marked) peripheral members.
This structure comports with the known phenomena of markedness, such as universal ordering and productivity patterns, Brøndal's Principle of Compensation, the allo-eme relationship and neutralization, as well as markedness reversal. Universal ordering patterns result from the fact that the internal structure of a radial category is dependent upon the unmarked prototype and category members related to it. For the category to exist, the prototype must be present and a peripheral member is dependent on the prior existence of the prototype and any intervening members. Brøndal's Principle finds more differentiation among unmarked members than among marked members of a relationship. This Principle is a natural byproduct of the privileged position of the unmarked prototype, which has the densest set of relationships to other members, whereas the marked periphery bears a high cost of contextualization, restricting density of expansion. The relative cost of contextualization at the periphery yields the symptomatic allo-eme relationship, where marked allophones and allomorphs exist only in specific contexts in relation to the unmarked members of the relationship that appear in the zero context of neutralization. Janda (1995) details these parallels and illustrates the role of radial categories in six case studies of markedness alignment (aka markedness reversal).
The purpose of this article is not to prove or disprove the validity of radial categories, but to suggest that radial categories offer a good fit for the phenomenon we observe in our data. If our interpretation is correct, it has important implications for the paradigm, because we can identify the type of structure paradigms have and make a connection between paradigm structure and other types of structure that are pervasive in language and in cognition.
3.4. Paradigm structure and the Russian verbal paradigm
Studies of language change, language typology and psycholinguistics indicate that paradigms have a hierarchical structure and suggest how this structure is arranged. We will restrict this discussion to a sample of works that give evidence of structure within verbal paradigms, yielding a ranking of the categories that are investigated in the present paper. We explore the verbal inflectional endings as symptoms of speakers' conceptualization of relevance (in the sense of Bybee 1985). On this basis, we advance a hierarchy, which is tested against a different part of Russian verbal morphology, viz. the derivational suffixes -a and -aj. Thus there is no circularity involved in our line of argumentation.
The distinction of finite vs. non-finite is fundamental to the verbal paradigm. We adopt a traditional definition of finiteness, according to which participles and gerunds are non-finite because they cannot express mood, whereas other forms of the paradigm are finite.
For the Russian verbs undergoing suffix shift, this entails a distinction between the present active participle (/kapljuʃjːij/kapajuʃjːij/ 'dripping') and the gerund (/kaplja/kapaja/ 'while dripping') as non-finite vs. the remaining forms of the paradigm.
Psycholinguistic experiments such as those reported in Bybee and Pardo (1981) and Bybee (1985: 66) reveal asymmetric inference patterns for Spanish. In this study, Spanish speakers were shown nonce verb forms that mimicked the alternation of diphthongs with mid vowels found in verbs such as contár 'count' with 3sg Present Indicative cuénta, but 3sg Preterite contó. Subjects were more likely to use a mid vowel in a past tense form of a nonce verb (1sg Preterite ponzé) if they had already heard a past tense form with such a vowel (3sg Preterite ponzó), whereas a mid vowel in the infinitive (ponzár) did not have as strong an impact. Bybee (1985: 67) states that "[T]his is support, then, for the hypothesis that there are different degrees of relatedness among forms, and that semantic relatedness determines the formal structure of members of a paradigm." On a more abstract level, this finding suggests that finite verb forms are more prototypical than non-finite forms, since inferences are more likely to be based on the former and the direction of inference in a radial category is from the prototype to the periphery (Rosch 1975, 1983). Typological evidence supports a distinction between finite as prototypical vs. non-finite as non-prototypical, since there are languages in which verbs lack infinitives altogether (Joseph 1983).
The indicative is likewise the prototypical mood, insofar as it represents the simplest relationship of a situation to reality. Conditionals are more marked since they depict a situation as outside actual reality. Imperatives are about getting people to do things. As pointed out by Nesset (1998: 168), this means that the speaker wants something that is outside actual reality to become part of actual reality. Note that whereas verbs rarely lack indicative forms, some verbs (for example modal verbs, and in some languages verbs of perception) regularly lack imperative forms. Joseph (1983: 24, 110–113) even argues that imperatives should be treated as non-finite on the grounds of reduced person opposition and clitic placement. While we do not subscribe to this position, we acknowledge imperatives as non-indicative and therefore less prototypical than indicative forms. For our Russian verbs, this means that the imperatives (/kaplji(tje)/kapaj(tje)/) are more peripheral than the indicative non-past forms.
A cross-linguistic study of zero expression among indicative verb forms reported by Bybee (1985: 52) shows that typologically zero expression is by far more common for third person than for first or second person. This finding suggests that the third person, as the unmarked form, is more prototypical than first and second person. This distinction is also supported by Lyons' (1977: 638) observation that the third person is unmarked because it is negatively defined with respect to first person and second person: it does not correlate with any positive participant role.
Zwicky (1977: 718) corroborates this markedness designation, claiming that third person is maximally unmarked, since it refers to what is left over when the participants in the speech situation (first and second person) have been referred to. The relevant distinction for the Russian verbs is of third person forms (3sg /kapljot/kapajot/ and 3pl /kapljut/kapajut/) vs. first and second person forms.
Cross-linguistically singular is unmarked, and therefore prototypical, in relation to plural. This relationship finds support from Janda (1995) on semantic grounds, from Corbett (2000) on typological grounds, and from Lyashevskaya (2004) on the grounds of morphological and syntactic evidence. Given the fact that person selects third person (as opposed to first and second), the number distinction is most crucial for third person singular (/kapljot/kapajot/) vs. third person plural (/kapljut/kapajut/).
Taken together, the various pieces of evidence cited in this subsection suggest the following distinctions in terms of prototypicality (where ">" means that the category to the left is more prototypical than the category to the right). Corresponding examples from the kapat' 'drip' paradigm are provided for illustration:
finite > non-finite
    ex: all other forms > /kapljuʃjːij/kapajuʃjːij/; /kaplja/kapaja/
indicative > non-indicative
    ex: non-past forms > /kaplji(tje)/kapaj(tje)/
third person > first and second person
    ex: /kapljot/kapajot/; /kapljut/kapajut/ > all other non-past forms
singular > plural
    ex: /kapljot/kapajot/ > /kapljut/kapajut/

Figure 1. Hypothetical radial category of paradigm for Russian kapat' 'drip'
Figure 1 displays the hypothetical radial category that the prototypicality relations discussed above suggest for the forms of Russian kapat' 'drip' that are undergoing -a > -aj suffix shift. It is possible to collapse this radial category into a one-dimensional scale ranging from the most prototypical to the least prototypical verb form: 3sg > 3pl > 1&2person > imperative > gerund/participle.
It is also possible to confirm the relative importance of the above-mentioned categories for prototypicality. Categories differ as to their effect on the meaning of the verb stem. In the terminology of Bybee (1985: 15), a category is relevant to the stem to the extent that the meaning of the category affects the lexical content of the stem. If a category receives a high score on the relevance scale, there is a large semantic distance between the members of this category. Therefore the marked members of a highly relevant category are far away from the prototypical verb form. We suggest that finiteness is most relevant to the verb stem. The marked members of the category, the participles and gerunds, are verbal adjectives and adverbs. Since these non-finite forms are hybrids of verbs and other parts of speech, the participles and gerunds are the least prototypical verb forms. Mood is higher on the relevance scale than the agreement categories of person and number. While agreement only specifies the participants of the situation, mood affects the verbal action itself (although, as Bybee 1985: 22f. is careful to point out, mood has the whole proposition in its scope and does not modify the verb alone). Since mood receives a higher score for relevance than agreement, the marked mood (in our case the imperative) is further away from the prototypical verb form than the marked members of the person and number categories.
The relationship between person and number is interesting. Bybee (1985) ranked number higher than person, but the evidence does not appear to be very strong. She predicted that across languages, more relevant categories are more frequently attested in derivational or inflectional morphology. A survey of 50 languages suggests that number is more frequent as a derivational/inflectional category, but Bybee (1985: 33) concludes that the differences are probably not highly significant. Another prediction concerns the relative order of morphemes: more relevant categories are expected to be closer to the stem than categories of lesser relevance. However, while this prediction was borne out by the facts for most categories, the relative order of person and number could not be tested because "in a large majority of languages [markers of number] occur in portmanteau expression with person markers and an ordering of elements is impossible to determine" (Bybee 1985: 35).
As pointed out by Bybee (1985: 58ff.), one way to test the relationship between two categories is to consider paradigms where both categories occur. We expect forms that are semantically close to be similar. Since members of highly relevant categories are semantically more different from each other than are members of less relevant categories, we predict that the members of highly relevant categories are more different formally. Consider the paradigms of Russian personal pronouns, possessive pronouns and verbs (non-past tense), which all display both person and number:
Table 2. Paradigms of personal pronouns (left), possessive pronouns (middle) and non-past tense of verbs (right)

       Personal pronouns       Possessive pronouns        Verbs (non-past)
       Sg             Pl       Sg                 Pl      Sg       Pl
1st    ja             my       moj                naʃ     -u       -V1m
2nd    ty             vy       tvoj               vaʃ     -V1ʃ     -V1tje
3rd    on, ona, ono   onji     jevo, jejo, jevo   ix      -V1t     -V2t
If person receives the higher score for relevance, we predict paradigms to be divided according to person rather than number. If, on the other hand, we assume number to score higher, we would expect the paradigms to be divided between singular and plural. In personal pronouns, the major division line goes between 3rd person and the rest of the paradigm, insofar as the 3rd singular and plural share the stem /on/ (with palatalization in the plural). This stem sets the 3rd person apart from the rest of the paradigm.
The possessive pronouns reveal a similar, but slightly more complex picture. The 1st and 2nd singular form one group in /oj/, and the 1st and 2nd plural form a group of stems in /aʃ/. In this way, the 1st and 2nd persons are different from the 3rd person. It is worth pointing out that the 1st and 2nd persons are inflected for number and case, while the 3rd person forms lack inflection. In this sense, the major division line in the possessive pronouns goes between the 3rd person and the rest of the paradigm.5
The person and number endings for the non-past tense of verbs also show a division line between the 3rd person forms and the rest of the paradigm, since the 3rd singular and plural end in the same consonant. Notice that this generalization holds across Russian dialects although different dialects have different consonants. In northern dialects the 3rd person forms end in /t/, while southern dialects display /tj/ (cf. Kasatkin (ed.) 1989: 151, Požarickaja 2004: 137). The three cases in Table 2 suggest that person receives a higher score for relevance than number.6
Figure 2 summarizes the discussion of finiteness, mood, person and number. Collectively these distinctions support the following expected hierarchy: 3sg > 3pl > 1&2 person > imperative > gerund/participle. The remainder of this article takes this hierarchy as its point of departure.
It should be noted that this hierarchy is entirely based on the relative status of forms within a paradigm and does not take into account other possible factors (grammatical constructions, genre, pragmatics). Furthermore, this hierarchy does not address the issue of register difference, despite the fact that gerunds and participles are relatively more frequent in literary as opposed to spoken production and may therefore be more subject to prescriptivism. This second factor is addressed in Section 5.
Figure 2. A hierarchy of categories
5. The reader may ask whether these facts about pronouns bear on the status of person and number in verbs. Booij (1993: 30) has pointed out that the status of number is different in nouns and verbs. Whereas in nouns number specifies whether we are dealing with one or many referents of the noun itself ("inherent inflection" in Booij's terminology), in verbs number represents agreement (i.e., what Booij calls "contextual inflection"). Pronouns are in an intermediate position between nouns and verbs. When pronouns are used deictically, they resemble nouns, but in anaphoric use they behave like verbs. Barlow (1992: 134–153) and Corbett (1991: 112 and 2006: 21) demonstrate that important typological generalizations can only be captured if the relationship between a pronoun and its anaphor is considered a type of agreement. Since person and number are involved in agreement in both pronouns and verbs, the facts about pronouns mentioned above bear on the status of person and number in verbs.
6. Our argument is based on Russian. Bybee (1985: 13) states that relevance depends on cognitive and cultural salience, so it is possible that relevance may vary across languages (see Carstairs-McCarthy 1992: 177 for discussion of this point). However, since the present paper focuses on an analysis of Russian, we will not discuss the cross-linguistic implications of the hierarchy we propose.
4. The Paradigm Structure Hypothesis
Section 3 gave an overview of a variety of theoretical and empirical studies that support paradigm structure, and it established an expected hierarchical structure for the Russian verbal paradigm on the basis of several types of independent evidence. In this section we state our hypothesis. We then give an operational description of what findings in the Russian suffix shift data would support or refute the hypothesis. The findings are then presented and analyzed.
Paradigm Structure Hypothesis: Paradigms are radial categories with prototypical and peripheral members.
This hypothesis has direct consequences for language change since we expect language change to progress from peripheral to prototypical forms. Thus a diachronic change like the Russian suffix shift should affect the peripheral forms more than the prototypical forms. The null hypothesis, by contrast, assumes that all forms of a paradigm have equal status and therefore should be affected by diachronic change to the same extent. If paradigms were unordered lists, there would be no reason for different forms in a paradigm to behave differently when undergoing a language change.
The Paradigm Structure Hypothesis builds on the logic that a change will begin at the periphery of a radial category, which will show earlier and stronger
evidence of the change than the core. Analogical leveling tends to eliminate items that are peripheral. The histories of languages show ample evidence of changes that begin at or are limited to the peripheries of classes; here we cite a few examples from the history of the Slavic languages.7
In the history of Slavic, the dual number was peripheral in relation to singular and plural, and the dual became more and more restricted to use with paired objects, with its paradigm reduced to a few syncretic forms. By Late Common Slavic, there were only three dual forms in the nominal paradigm: one for the nominative, accusative and vocative (-a/-ě for o-stems; -ě for a-stems), one for the genitive and locative (-u for both o- and a-stems), and one for the dative and instrumental (-oma for o-stems; -ama for a-stems). Ultimately the dual number was lost in all of Slavic except for Slovene and Sorbian (Janda 1996: 1758). The peripheral athematic verb class was limited to use with five lexical items at the dawn of the Slavic era (věděti 'know', ěsti 'eat', iměti 'have', dati 'give', byti 'be'), and today no Slavic language maintains it as a distinct verb class (Janda 1996: 913).
In relation to the Russian suffix shift, the Paradigm Structure Hypothesis can be tested by examining the distribution of -a vs. -aj suffixed forms across the paradigm. We expect the distribution to follow the independently established hierarchy. Thus the third singular form, which is most prototypical, will be most resistant to the change, retaining -a instead of shifting to -aj. The third plural form should be somewhat less resistant, followed by the first and second person forms, which should be even less resistant to suffix shift. The imperative form should be relatively more receptive to the innovative -aj suffix, and the gerund and participle should show the greatest use of -aj.
If this pattern is supported by a statistical analysis, the Paradigm Structure Hypothesis is confirmed and the null hypothesis can be rejected.
4.1. Testing the Paradigm Structure Hypothesis
An empirical study of corpus data was conducted in order to test the Paradigm Structure Hypothesis. Ultimately a logistic regression model was designed to handle the individual preferences for -a vs. -aj forms for each verb and its paradigm slots, as well as the factor of root-final consonant.8 This model confirms the expected hierarchy, with one exception: the present active participle, which uses less -aj than expected. We argue below that this discrepancy is well-motivated on the grounds of formal similarities within the paradigm. On the whole, however, the statistical model confirms the Paradigm Structure Hypothesis.
7. Of course it is possible for seemingly peripheral items to suddenly become very productive rather than being eliminated. For an in-depth study of several such cases, see Janda (1996), which shows that this kind of expansion requires special circumstances that support the interpretation of a marker as prototypical rather than peripheral in the given context.
8. The authors gratefully acknowledge the generous assistance of R. Harald Baayen in designing this model and guiding our interpretation of the results.
All sentences containing relevant forms of the thirty-seven verbs that are undergoing the Russian suffix shift were extracted from the RNC, yielding 11,460 verb forms. The distribution in terms of percentage of -a (as opposed to -aj) forms is presented in Figure 3.
Figure 3. Distribution of -a suffix across paradigm forms (in percent)
Whereas a chi-square analysis of the distributional differences between the various paradigm forms suggests that these differences are significant,9 chi-square is an unsatisfactory measure for a number of reasons. For one thing, chi-square requires that all observations be independent, but in this data some verbs had more votes to cast than others: one verb, ščepat' 'chip', contributed only two forms to the data, whereas most others had many more, with the maximum reached by prjatat' 'hide', with 1343 forms. Since different verbs have different frequencies, if they also have different preferences for -a vs. -aj, a chi-square analysis would inflate the statistical significance. It is indeed the case that the various verbs show individual preferences, as visualized in Figure 4, where the vertical dimension in each verb's graph represents preference for -a (toward the top) as opposed to -aj (toward the bottom).
9. Chi-square values both for the entire distribution and pairwise between the parts of the paradigm (3sg vs. 3pl, 3pl vs. 1&2 person, etc.) are all significant (p < 0.001) with the exception of 1&2 person vs. present active participle.
Figure 4. The log odds (of -a versus -aj) for each of the six paradigm slots (s: third person singular, p: third person plural, f: first and second person, i: imperative, a: active participle, g: gerund).
Figure 4 represents the log odds for each form of each verb. The log odds ratio measures the odds of the -a suffix vs. the -aj suffix, so a higher value indicates higher odds for -a, while a lower value indicates higher odds for -aj. Log odds were calculated after backing off from zero by adding 1 to all counts according to this formula: $\log\left(\frac{n_{\text{-a}} + 1}{n_{\text{-aj}} + 1}\right)$ (log of the number of -a forms plus one, divided by the number of -aj forms plus one). This calculation makes it possible to compare across data with non-uniform numbers of results, since it weights the proportions for the number of contributing observations (Baayen 2008: 196). This use of log odds is important because the frequencies of verbs are very different. A log odds greater than zero (yielding a dot above the midline) indicates a preference for -a, a log odds smaller than zero (yielding a dot below the midline) indicates a preference for -aj. For kapat' 'drip', for example, in the lower left of Figure 4, we see that the log odds for the third person singular (s) are positive, indicating a preference for the -a suffixed /kapljot/ over the -aj suffixed /kapajot/. However, the log odds for the third person plural (p) are negative, indicating a preference for the -aj suffixed /kapajut/ over the -a suffixed /kapljut/.
This analysis, via mixed-effects modeling, takes the verb-specific preferences into account by using random intercepts for verbs, as well as by-verb random contrasts for paradigm slot. This means that the model contains adjustments that take into account the specific preferences of each verb and paradigm slot. These adjustments are important because the observations are not independent, and independence would be an absolute requirement for a model such as the chi-square test. In other words, since our data do not consist of observations of 11,460 different verbs, but rather of 11,460 forms of only 37 verbs with varying frequencies and preferences, a mixed-effects model is needed to represent this data responsibly. Whereas prjatat' 'hide' clearly prefers -a, other verbs, such as čerpat' 'scoop', show the opposite preference. Furthermore, verbs also differ in terms of which paradigm forms are used most frequently. žaždat' 'thirst' and maxat' 'wave' are of overall similar frequency (with 1255 total forms for the former and 1232 for the latter), but their distribution over the paradigm is quite different: žaždat' 'thirst' provides 542 present active participles as opposed to only 49 for maxat' 'wave', and while there are only 15 gerunds formed from žaždat' 'thirst', the figure for maxat' 'wave' is 237.
Figure 5. Distribution of -a suffix across root-final consonants (in percent)
In addition to these individual preferences, there is another confounding factor, namely preferences dependent upon the root-final consonant, which for these verbs can be a dental, velar, or labial. Figure 5 presents the differing probabilities of the -a vs. -aj suffix according to the place of articulation of the root-final consonant. A logistic regression model makes it possible to responsibly account for these factors and discover whether there are indeed differences in the use of -a vs. -aj across the paradigm forms. The mixed-effects model is designed to analyze the contributions of paradigm slot, individual verbs' preferences, and preferences associated with place of articulation.10 Table 3 presents the coefficients resulting from this model.
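The add-one log odds used for Figure 4 can be illustrated with a minimal Python sketch. The function name and the example counts are hypothetical and are not taken from the corpus data; the natural logarithm is assumed, since the text does not state the base.

```python
import math


def add_one_log_odds(n_a: int, n_aj: int) -> float:
    """Log odds of -a vs. -aj after adding 1 to both counts.

    Implements log((n_a + 1) / (n_aj + 1)); the natural log is an assumption.
    """
    if n_a < 0 or n_aj < 0:
        raise ValueError("counts must be non-negative")
    return math.log((n_a + 1) / (n_aj + 1))


# Hypothetical counts (n_a, n_aj) for one verb, keyed by paradigm slot code.
example_counts = {"s": (40, 3), "p": (5, 12), "g": (0, 7)}

for slot, (n_a, n_aj) in example_counts.items():
    value = add_one_log_odds(n_a, n_aj)
    if value > 0:
        preference = "-a"
    elif value < 0:
        preference = "-aj"
    else:
        preference = "neither"
    print(f"{slot}: log odds = {value:+.3f} (prefers {preference})")
```

Adding 1 to both counts keeps the value finite when a slot has no attested -a or no attested -aj forms, which matters for low-frequency verbs.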
10. The code for this model in the R statistical software package is: lmer(cbind(a, aj)~ Paradigm + Place + (1+Paradigm|Verb), data=dat, family=binomial). For an in-depth discussion of the random effects structure of this mixed-effects model, and its role in accounting for the non-interdependence between the observations for a given verb across its paradigm, see Janda, Nesset and Baayen (2010).

Table 3. Coefficients of the mixed-effects model, with associated z- and p-values

                              Estimate   Standard Error   z-value   p-value
a, dental (intercept)         3.738      0.915            4.085     <0.0001
f - a (contrast)              0.075      0.483            0.154     0.8774
g - a (contrast)              2.591      0.707            3.663     0.0002
i - a (contrast)              0.988      0.693            1.425     0.1542
p - a (contrast)              0.998      0.366            2.731     0.0063
s - a (contrast)              1.513      0.432            3.502     0.0005
labial - dental (contrast)    3.382      1.164            2.906     0.0037
velar - dental (contrast)     2.562      0.967            2.649     0.0081
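As a reading aid for Table 3, the following Python sketch shows how a log odds estimate maps onto a probability, using only the intercept row. It assumes that the printed intercept is a positive log odds of -a for the baseline cell, and it ignores the by-verb random effects, so the result is at best a rough value for a typical verb. The names inv_logit, paradigm_levels and place_levels are illustrative.

```python
import math


def inv_logit(log_odds: float) -> float:
    """Convert log odds to a probability (inverse of the logit link)."""
    return 1.0 / (1.0 + math.exp(-log_odds))


# Level codes as used in Table 3; the baseline is assumed to be the
# alphabetically first level of each factor, matching the intercept row.
paradigm_levels = ["s", "p", "f", "i", "a", "g"]
place_levels = ["dental", "labial", "velar"]
baseline = (sorted(paradigm_levels)[0], sorted(place_levels)[0])

# Intercept estimate from Table 3, taken at face value (assumption).
INTERCEPT = 3.738
p_baseline = inv_logit(INTERCEPT)

print(f"baseline cell: {baseline}")
print(f"estimated probability of -a in the baseline cell: {p_baseline:.3f}")
```

The contrast rows are deliberately left out of the sketch: combining them with the intercept would require the sign of each estimate, which the sketch does not assume.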
The statistical software selects the active participle (coded a) and the dental place of articulation as its baseline (= intercept) and measures contrasts from that reference point.11 The p-values in the right-hand column indicate that, even when all other sources of variation are taken into account, most of these contrasts are still significant. The two exceptions are the contrast between the present active participle and the 1&2 person and the contrast between the present active participle and the imperative.
11. The R package selects the level used as the intercept alphabetically according to the level names. For our data, this yields a (as opposed to f, g, i, p and s) and dental (as opposed to labial and velar) as the levels for the intercept.
Figure 6 gives a graphical representation of the relationships among the paradigm slots and how the mixed-effects model has accounted for them. The purpose of Figure 6 is to visualize how well the statistical model fits the observed data. The x-axis in Figure 6 represents the observed values aggregated across all verbs, and the y-axis represents the values of our mixed-effects model. We see that the two values are highly, but not perfectly correlated, since the values of the mixed-effects model include a correction introduced to account for the verb-specific and place of articulation-specific variation. If the model provided a perfect fit, all points (indicated by letters) would fall exactly on the diagonal. Figure 6 indicates a good fit of the model to the data.
Figure 6. The log odds ratios for the data aggregated by paradigm slot compared to the corresponding log odds ratios as estimated by the mixed model (s: third person singular, p: third person plural, f: first and second person, i: imperative, a: active participle, g: gerund)
It is important to note that the separation between the paradigm slots is preserved in our model, and that, with one exception, this distribution follows the expected cline. Thus the third singular shows the strongest retention of the -a suffix, followed by the third plural, then the first and second person forms, and after those come the imperative and finally the gerund with the highest implementation of the -aj suffix. The exception is the present active participle, which is nearly juxtaposed with the first and second persons in Figure 6.
The behavior of the present active participle is likely strongly influenced by formal factors, since this participle is a parasitic formation (cf. Maiden 1992, 2005) that is dependent upon the third person plural form. We can hypothesize that if two cells in a paradigm are formally closely related, they will display similar behavior with regard to the suffix shift. Table 4 shows the close formal relationship between the two forms, illustrated by the verbs kapat' 'drip' and govorit' 'talk'.

Table 4. The parasitic relationship between the Present Active Participle suffix and the 3pl ending

                                    1st conjugation verbs                2nd conjugation verbs
3pl ending:                         /ut/; /kapljut/, /kapajut/           /at/; /govorjat/
Present Active Participle suffix:   /uʃj/; /kapljuʃjij/, /kapajuʃjij/    /aʃj/; /govorjaʃjij/

For any given verb, the third person plural and present active participle forms always include the same vowel, and furthermore, /t/ and /ʃj/ are associated by means of consonant alternations (cf. /otvratjitj/ ~ /otvraʃju/ 'repel' [infinitive ~ first singular]). It is possible to state a rule of referral (or metarule) according to which the participle is formed on the basis of the third person plural form by replacing the /t/ with its alternant /ʃj/. Since the present active participle is parasitic on the third person plural form, we would expect the present active participle to be influenced by the third person plural. This expectation is borne out by the facts, since the suffix shift data show that the participle ranks between the third person plural and the first and second persons. It appears that for the present active participle, the semantic factors that would favor -aj (peripherality in terms of the paradigm) are moderated by the formal factors that favor -a (due to the parasitic relationship to the more central third plural form). In terms of the Russian suffix shift, the present active participle is located midway between the peripheral position of non-finite forms and the third person plural form that motivates its parasitic formation.
The statistical analysis confirms the Paradigm Structure Hypothesis. Data on the Russian suffix shift indicate that the verbal paradigm is structured hierarchically and that diachronic change is implemented most at the periphery of the paradigm. In the following section, we will explore what role frequency plays in suffix shift.
5. Frequency
The relationship between frequency and semantic markedness/prototypicality is controversial. Some researchers argue that high frequency is a symptom of unmarkedness or prototypicality (Andersen 1989: 28–30, Andersen 2001: 51, Andrews 1990: 136–165, Comrie 1983: 85, and Maiden 1992: 287), whereas others ascribe a more explanatory role to frequency (Bybee 2001: 129, Haspelmath 2006). In this section we compare the ranking of paradigm forms established on semantic grounds and confirmed by the distribution of the suffix shift with the ranking of forms by frequency. We show that there is a good, but not perfect match between frequency and the predicted hierarchy. We speculate that the mismatches may be due to the bias toward written language in the Russian National Corpus, although on the basis of the available data it is not possible to pinpoint exactly the role of frequency in suffix shift.
In discussing frequency effects in language change, Bybee (2008: 956–957) distinguishes between change due to automatization and change motivated by analogy (which is, according to Bybee, a result of imperfect learning). In the case of automatization, high token frequency promotes change since high frequency facilitates a high degree of automatization. Analogy, by contrast, shows the opposite effect, where high frequency elements tend to resist change because they support good mastery of a pattern that is thus resistant to change (cf. also Bybee 1985: 51 and Mańczak 1980). The Russian suffix shift is an example of analogy since the -aj suffix yields regularization and simplification of the inflectional paradigm (removing a consonant alternation and preserving the suffix in the present; replacing a nonproductive pattern with a productive one). Thus we expect that the forms with the highest frequency should resist suffix shift, while the forms with the lowest frequency should embrace the innovation.
Table 5 compares the suffix shift hierarchy observed in the data and supported by semantic and formal considerations with the hierarchy that would be predicted on the basis of frequency alone based on the counts in our database. Table 5 indicates that frequency correctly predicts the order of four of the forms, namely: 3sg > 3pl > 1&2 Person > Imperative. However, frequency does not account for the ranking of the gerund, which is quite frequent, but nevertheless by far the most innovative form. Frequency also gives an incorrect ranking for the present active participle, which is more than twice as frequent as the 1&2 person forms, but has almost the same rate of retention of the -a suffix. In Section 4.1, we suggested that the ranking of the present active participle is due to its formal relationship with the third person plural. In a comparison of frequency and semantic factors, the participle should therefore arguably be set aside. The main difference between the factors we compare, then, is that semantic markedness and prototypicality yield better predictions for the gerund.
One could argue that the frequencies observed in our database are biased because they are based on a mainly written corpus, and that gerunds and participles are likely to be less frequent in spoken Russian. Though a corpus of spoken Russian is under development on the RNC site, it is far smaller than the written corpus, so at this point it is not possible to make a meaningful comparison between spoken and written frequencies for the verbs undergoing suffix shift. However, a survey of the frequencies of gerunds of all verbs in the oral part of the corpus shows that gerunds are less frequent in oral Russian compared to the corpus as a whole.

Table 5. Comparison of rankings according to semantics vs. frequency

Ranking according to semantic and formal           Ranking according to frequency
considerations, confirmed by statistical model
form                  % of -a retained             form                  frequency count
3sg                   89.9%                        3sg                   4501
3pl                   83.5%                        Pres Act Participle   2105
Pres Act Participle   79.0%                        3pl                   1894
1&2 Person            78.7%                        Gerund                1685
Imperative            66.2%                        1&2 Person            964
Gerund                49.5%                        Imperative            311
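The comparison of the two rankings in Table 5 can be made concrete with a short Python sketch. The values are copied from Table 5; the Spearman rank correlation is an illustration added here and is not a statistic reported in the article, and the helper names are hypothetical.

```python
# Values from Table 5: percent of -a retained and frequency count per form.
retention = {
    "3sg": 89.9, "3pl": 83.5, "Pres Act Participle": 79.0,
    "1&2 Person": 78.7, "Imperative": 66.2, "Gerund": 49.5,
}
frequency = {
    "3sg": 4501, "Pres Act Participle": 2105, "3pl": 1894,
    "Gerund": 1685, "1&2 Person": 964, "Imperative": 311,
}


def ranking(scores):
    """Return the forms ordered from highest to lowest score."""
    return sorted(scores, key=scores.get, reverse=True)


def spearman_rho(order_x, order_y):
    """Spearman rank correlation for two tie-free orderings of the same items."""
    n = len(order_x)
    rank_y = {form: i for i, form in enumerate(order_y)}
    d_squared = sum((i - rank_y[form]) ** 2 for i, form in enumerate(order_x))
    return 1 - 6 * d_squared / (n * (n * n - 1))


by_retention = ranking(retention)
by_frequency = ranking(frequency)

# Setting aside the gerund and the participle leaves the four forms whose
# order frequency is said to predict correctly.
set_aside = {"Gerund", "Pres Act Participle"}
core_retention = [f for f in by_retention if f not in set_aside]
core_frequency = [f for f in by_frequency if f not in set_aside]
rho = spearman_rho(by_retention, by_frequency)

print("same order for the four remaining forms:", core_retention == core_frequency)
print(f"Spearman rho over all six forms: {rho:.3f}")
```

With only six forms, such a coefficient is descriptive at best; the point of the sketch is simply to show where the two orderings diverge.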
As can be seen from Table 6, the percentage of gerunds in the corpus as a whole is 3.7%, whereas in the oral part of the corpus we found 1.1% gerunds. This difference is statistically significant (chi-square = 2408.749, p-value < 2.2e-16), but the effect size is very small (Cramér's V value = 0.0130782; an effect size of 0.1 is considered small).

Table 6. Frequency of gerunds

                       Lemma frequency   Gerund frequency   % Gerunds
Whole corpus           13581979          501036             3.7
Oral part of corpus    135326            1522               1.1
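The percentages in Table 6, and the general relationship between a chi-square statistic and Cramér's V, can be sketched in Python as follows. The layout of the contingency table (gerund vs. non-gerund forms, whole corpus vs. oral part, no continuity correction) is an assumption made for illustration; the article does not specify how its test was set up, so the sketch is not expected to reproduce the reported statistic exactly.

```python
import math

# Counts from Table 6.
whole = {"lemma": 13_581_979, "gerund": 501_036}
oral = {"lemma": 135_326, "gerund": 1_522}


def gerund_share(row):
    """Percentage of gerunds among all forms counted in the row."""
    return 100.0 * row["gerund"] / row["lemma"]


def chi_square(table):
    """Pearson chi-square (no continuity correction) for a table of counts."""
    row_totals = [sum(r) for r in table]
    col_totals = [sum(c) for c in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    return chi2, n


def cramers_v(chi2, n, n_rows=2, n_cols=2):
    """Cramér's V effect size derived from a chi-square statistic."""
    return math.sqrt(chi2 / (n * min(n_rows - 1, n_cols - 1)))


# Assumed design: rows = corpus part, columns = gerund vs. non-gerund.
table = [
    [whole["gerund"], whole["lemma"] - whole["gerund"]],
    [oral["gerund"], oral["lemma"] - oral["gerund"]],
]
chi2, n = chi_square(table)
v = cramers_v(chi2, n)

print(f"whole corpus: {gerund_share(whole):.1f}% gerunds")
print(f"oral part:    {gerund_share(oral):.1f}% gerunds")
print(f"chi-square = {chi2:.1f}, Cramér's V = {v:.4f}")
```

Because V divides the statistic by the number of observations, a very large corpus can yield a highly significant chi-square together with a very small effect size.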
The results in Table 6 suggest that a better match between suffix shift and frequency could have been obtained if it had been possible to investigate suffix shift in a corpus of spoken Russian. All we can say is that although frequency might play a role in suffix shift, it is not possible to determine exactly the effect of frequency at this point. Hopefully, future developments in corpus linguistics will make it possible to shed more light on the complex relationship between frequency and semantic markedness/prototypicality.
6. Conclusion
This article explores inflectional morphology from the point of view of cognitive linguistics. Based on previous research on paradigms and their structures, we propose that paradigms are a valid construct and that they have internal structure. Since the observed differences among paradigm forms involve asymmetric relationships based on markedness and prototypicality, we propose that the structure of the paradigm conforms to that of the radial category, with a central prototype related to more peripheral members. Given this structure and known markedness and prototypicality relationships among members of the verbal paradigm, it is possible to establish the following expected structure, with the more prototypical members toward the left: 3sg > 3pl > 1&2 > Imperative > Gerund/Participle.
The expected structure gives us a concrete opportunity to test the Paradigm Structure Hypothesis using data documenting the Russian suffix shift. Our prediction is that the most prototypical forms resist the language change, whereas the less prototypical forms implement it. A logistic regression model designed to account for the various sources of variation shows that paradigm slot is indeed a robust predictor of the implementation of the language change, and the overall order of the slots is confirmed.
The one exception is the participle, which is less likely to participate in the language change than one would expect on the grounds of semantic factors such as markedness and prototypicality. The present active participle, however, is a parasitic form derived directly from the third person plural form, and it appears that this close formal relationship has motivated a reduced implementation of the language change, since the third person plural is more prototypical than the participle. We show that there is a good, albeit not perfect match between suffix shift and frequency. While the mismatches may be caused by the bias toward written language in our data, it is not possible to pinpoint exactly what role frequency plays in suffix shift.
Our empirical study supports the Paradigm Structure Hypothesis that paradigms are a valid concept, that they have structure, that their structure is motivated on semantic grounds, and that this structure comports with that of the radial category. We show that the radial category facilitates a principled account of paradigm structure and morphological change, and our study thus provides empirical evidence in favor of radial categories and cognitive linguistics. Further studies of how on-going diachronic changes progress through inflectional paradigms are needed in order to corroborate this hypothesis and further explore the relationship between frequency and semantic factors in structuring paradigms. It is furthermore possible that there are additional factors at work (syntagmatic, stylistic, pragmatic) that have not been explored in this analysis.
Received 21 October 2009
Revision received 18 March 2010
References
Andersen, Henning. 1973. Abductive and deductive change. Language 49. 765–793.
Andersen, Henning. 1980. Russian conjugation: Acquisition and evolutive change. In Elizabeth C. Traugott et al. (eds.), Papers from the 4th international conference on historical linguistics, 285–301. Amsterdam: John Benjamins.
Andersen, Henning. 1989. Markedness theory – the first 150 years. In Olga M. Tomić (ed.), Markedness in synchrony and diachrony, 11–47. Berlin: Mouton de Gruyter.
Andersen, Henning. 2001. Markedness and the theory of linguistic change. In Henning Andersen (ed.), Actualization: Linguistic Change in Progress, 21–58. Amsterdam and Philadelphia: John Benjamins.
Anderson, Stephen R. 1992. A-Morphous morphology. Cambridge: Cambridge University Press.
Andrews, Edna. 1990. Markedness theory: The union of asymmetry and semiosis in language. Durham, NC: Duke University Press.
Arppe, Antti. 2005. Morphological features as context in distinguishing semantically similar words. Proceedings from the Corpus Linguistics Conference Series, Vol. 1. Third Biennial Corpus Linguistics 2005 Conference, July 14–17, 2005, Birmingham, UK.
Baayen, R. Harald. 2008. Analyzing Linguistic Data. Cambridge: Cambridge University Press.
Barlow, Michael. 1992. A situated theory of agreement. New York: Garland.
Booij, Geert. 1993. Against split morphology. In Geert Booij and Jaap van Marle (eds.), Yearbook of Morphology 1993, 27–50. Dordrecht: Kluwer.
Booij, Geert. 1997. Autonomous morphology and paradigmatic relations. In Geert Booij and Jaap van Marle (eds.), Yearbook of morphology 1996, 35–53. Dordrecht: Kluwer Academic Publishers.
Bybee, Joan. 1985. Morphology. Amsterdam/Philadelphia: John Benjamins.
Bybee, Joan. 2001. Phonology and language use. Cambridge: Cambridge University Press.
Bybee, Joan. 2008. Diachronic linguistics. In Dirk Geeraerts and Hubert Cuyckens (eds.), The Oxford handbook of cognitive linguistics, 945–987. Oxford: Oxford University Press.
Bybee, Joan and Carol L. Moder. 1983. Morphological classes as natural categories. Language 59. 251–270.
University of Tromsø

Bybee, Joan and Elly Pardo. 1981. On lexical and morphological conditioning of alternations: Changes in Provençal and Spanish preterite forms. Linguistics 19. 937–968.
Bybee, Joan and Dan I. Slobin. 1982. Rules and Schemas in the Development and Use of the English Past Tense. Language 58. 265–289.
Carstairs, Andrew. 1987. Allomorphy in inflection. London: Croom Helm.
Carstairs-McCarthy, Andrew. 1992. Current Morphology. London: Routledge.
Carstairs-McCarthy, Andrew. 1994. Inflection classes, gender, and the principle of contrast. Language 70. 737–787.
Comrie, Bernard. 1983. Markedness, grammar, people, and the world. In Fred R. Eckman, Edith A. Moravscik and Jessica R. Wirth (eds.), Markedness, 85–106. New York: Plenum Press.
Corbett, Greville. 1991. Gender. Cambridge: Cambridge University Press.
Corbett, Greville. 2000. Number. Cambridge: Cambridge University Press.
Corbett, Greville. 2006. Agreement. Cambridge: Cambridge University Press.
Croft, William and D. Alan Cruse. 2004. Cognitive Linguistics. Cambridge: Cambridge University Press.
Dąbrowska, Ewa. 1997. Cognitive semantics and the Polish dative. Berlin and New York: Mouton de Gruyter.
Divjak, Dagmar. 2004. Degrees of verb integration: Conceptualizing and categorizing events in Russian. Leuven: Katholieke Universiteit Leuven PhD dissertation.
Enger, Hans-Olav. 2004. A possible constraint on non-affixal inflection. Lingua 114. 59–75.
Gagarina, Natalija. 2003. The early verb development and demarcation of stages in three Russian-speaking children. In Dagmar Bittner, Wolfgang U. Dressler and Marianne Kilani-Schoch (eds.), Development of verb inflection in first language acquisition: A cross-linguistic perspective, 131–170. Berlin and New York: Mouton de Gruyter.
Geeraerts, Dirk. 1995. Representational formats in cognitive semantics. Folia Linguistica 29. 21–41.
Gor, Kira and Tatiana Chernigovskaya. 2001. Rules in the processing of Russian verbal morphology. In Gerhild Zybatow, Uwe Junghanns, Grit Melhorn and Luka Szucsich (eds.), Current issues in formal Slavic linguistics, 528–536. Frankfurt am Main: Peter Lang.
Gor, Kira and Tatiana Chernigovskaya. 2003a. Mental lexicon structure in L1 and L2 acquisition: Russian evidence. GLOSSOS 4. 1–31.
Gor, Kira and Tatiana Chernigovskaya. 2003b. Generation of complex verbal morphology in first and second language acquisition: Evidence from Russian. Nordlyd 31(6). 819–833.
Gor, Kira and Tatiana Chernigovskaya. 2005. Formal instruction and the acquisition of verbal morphology. In Alex Housen and Michel Pierrard (eds.), Current issues in instructed second language learning, 103–136. Berlin and New York: Mouton de Gruyter.
Graudina, Ludmila K., Viktor A. Ickovič and Lija P. Katlinskaja. 2001. Grammatičeskaja pravil'nost' russkoj reči. Stilističeskij slovar' variantov. [Grammatical correctness in spoken Russian: Stylistic dictionary of variants.] Moscow: Nauka.
Halle, Morris and Alec Marantz. 1993. Distributed morphology and the pieces of inflection. In Kenneth Hale and Samuel J. Keyser (eds.), The view from building 20, 111–176. Cambridge, MA: MIT Press.
Haspelmath, Martin. 2006. Against markedness (and what to replace it with). Journal of Linguistics 42. 25–70.
Hockett, Charles F. 1958. Two models of grammatical description. In Martin Joos (ed.), Readings in Linguistics, 386–399. Chicago: University of Chicago Press.
Jakobson, Roman O. 1958. Morfologičeskie nabljudenija nad slavjanskim skloneniem (sostav russkix padežnyx form) [Morphological observations on Slavic declension (structure of Russian case forms)]. In American contributions to the fourth international congress of Slavicists, Moscow, September 1958, 127–156. The Hague: Mouton.
724
Janda, Laura A. 1993. Cognitive linguistics as a continuation of the Jakobsonian tradition: the semantics of Russian and Czech reflexives. In Robert A. Maguire and Alan Timberlake (eds.), American Contributions to the eleventh international congress of Slavists in Bratislava, 310 319. Columbus: Slavica. Janda, Laura A. 1995. Unpacking markedness. In Eugene Casad (ed.), Linguistics in the redwoods: The expansion of a new paradigm in linguistics, 207233. Berlin: Mouton de Gruyter. Janda, Laura A. 1996. Back from the brink: A study of how relic forms in languages serve as source material for analogical extension. Munich: Lincom Europa. Janda, Laura A., Tore Nesset and R. Harald Baayen. 2010. Capturing correlational structure in Russian paradigms: A case study in logistic mixed-effects modeling. Cognitive Linguistics and Linguistic Theory 6(1). 2948. Janda, Laura A. and Valery Solovyev. 2009. What Constructional Profiles Reveal About Synonymy: A Case Study of Russian Words for sadness and happiness. Cognitive Linguistics 29(2). 367 393. Joseph, Brian. 1983. The synchrony and diachrony of the Balkan infinitive. Cambridge: Cambridge University Press. Karlsson, Fred. 1985. Paradigms and word forms. Studia gramatyczne VII. Ossolineum. 135154. Karlsson, Fred. 1986. Frequency considerations in morphology. Zeitschrift fr Phonetik, Sprachwissenschaft und Kommunikationsforschung 39. 1928. Kasatkin, Leonid L. (ed.). 1989. Russkaja dialektologija [Russian dialectology]. 2. ed. Moscow: Prosveenie. Kiebzak-Mandera, Dorota, Magdalena Smoczynska and Ekaterina Protassova. 1997. Acquisition of Russian verb morphology: the early stages. In Wolfgang U. Dressler (ed.), Studies in pre- and protomorphology, 101114. Wien: Verlag der sterreichischen Akademie der Wissenschaften. Kiparsky, Valentin. 1967. Russische historische Grammatik (Band 2). Heidelberg: Carl Winter Universittsverlag. Krysin, Leonid P. (ed.). 1974. 
Russkij jazyk po dannym massovogo obsledovanija [The Russian language according to data from empirical investigation]. Moscow: Nauka. Lakoff, George. 1987. Women, Fire, and Dangerous Things. Chicago: University of Chicago Press. Lewandowska-Tomaszczyk, Barbara. 2007. Polysemy, prototypes and radial categories. In Dirk Geeraerts and Hubert Cuyckens (eds.), The Oxford handbook of cognitive linguistics, 139169. Oxford: Oxford University Press. Lyashevskaya, Olga N. 2004. Semantika russkogo isla [The semantics of Russian number]. Moscow: Jazyki slavjanskoj kultury. Lyons, John. 1977. Semantics. Cambridge: Cambridge University Press. Maiden, Martin. 1992. Irregularity as a determinant of morphological change. Journal of Linguistics 28. 285312. Maiden, Martin. 2005. Morphological autonomy and diachrony. In Geert Booij and Jaap van Marle (eds.), Yearbook of morphology 2004, 137175. Dordrecht: Springer. Maczak, Witold. 1980. Laws of analogy. In Jacek Fisiak (ed.), Historical morphology, 283288. The Hague, Paris and New York: Mouton Publishers. Marle, Jaap van. 1985. On the paradigmatic dimension of morphological creativity. Dordrecht and Cinnaminson: Foris Publications. Matthews, Peter H. 1972. Inflectional morphology. A theoretical study based on aspects of Latin verb conjugation. Cambridge: Cambridge University Press. Matthews, Peter H. 1991. Morphology. 2nd ed. Cambridge: Cambridge University Press. McCreight, Katherine and Catherine V. Chvany. 1991. Geometric representation of paradigms in a modular theory of grammar. In Frans Plank (ed.), Paradigms. The economy of inflection, 91 111. Berlin and New York: Mouton de Gruyter.

McCarthy, John J. 2005. Optimal paradigms. In Laura J. Downing, T. Alan Hall and Renate Raffelsiefen (eds.), Paradigms in phonological theory, 170210. Oxford: Oxford University Press. Milin, P., V. Kuperman, A. Kostic and R. Harald Baayen. 2008. Paradigms bit by bit: An information-theoretic approach to the processing of paradigmatic structure in inflection and derivation. In Blevins, James P. and Juliette Blevins (eds.), Analogy in grammar: form and acquisition, 214252. Oxford: Oxford University Press. Morin, Yves-Charles. 1990. Parasitic formation in inflectional morphology. In Wolfgang U. Dressler, Hans C. Luschtzky, Oskar E. Pfeiffer and John R. Rennison (eds.), Contemporary morphology, 197202. Berlin and New York: Mouton de Gruyter. Mller, Gereon. 2007. Notes on paradigm economy. Morphology 17. 138. Nesset, Tore. 1998. Russian conjugation revisited: A cognitive approach to aspects of Russian verb inflection. Oslo: Novus Press. Nesset, Tore. 2008. Objasnenie togo, to ne imelo mesto: Blokirovka suffiksalnogo sdviga v russkix glagolax [An explanation of what did not happen: Blocking of suffix shift in Russian verbs]. Voprosy jazykoznanija 6. 3548. Noyer, Rolf. URL (http://www.ling.upenn.edu/~rnoyer/dm/), accessed December 17, 2008. Plank, Frans. 1990a. Of abundance and scantiness in inflection: a typological prelude. In Frans Plank (ed.), Paradigms. The economy of inflection, 139. Berlin and New York: Mouton de Gruyter. Plank, Frans. 1990b. Rasmus Rasks dilemma. In Frans Plank (ed.), Paradigms. The economy of inflection, 161196. Berlin and New York: Mouton de Gruyter. Poarickaja, Sofja K. 2004. Russkaja dialektologija [Russian dialectology]. Moscow: Akademieskij proekt. Robins, Robert H. 1979. A Short History of Linguistics (2nd edition). London and New York: Longman. Rosch, Eleanor. 1973. Natural categories. Cognitive psychology 4. 32850. Rosch, Eleanor. 1975. Cognitive reference points. Cognitive psychology 7. 53247. Rosch, Eleanor. 1978. 
Principles of categorization. In Eleanor Rosch and Barbara B. Lloyd (eds.), Cognition and Categorization, 2748. Hillsdale, NJ: Lawrence Erlbaum Associates. Rosch, Eleanor. 1983. Prototype classification and logical classification: The two systems. In Elin Scholnick (ed.), New trends in cognitive representation: Challenges to Piagets theory, 7386. Hillsdale, NJ: Lawrence Erlbaum Associates. Stump, Gregory T. 1993. On rules of referral. Language 69(3). 449479. Stump, Gregory T. 2001. Inflectional morphology. A theory of paradigm structure. Cambridge: Cambridge University Press. vedova, Natalija Ju. (ed.). 1980. Russkaja grammatika [Russian grammar] (vol. 1). Moscow: Nauka. Tkachenko, Elena and Tatiana Chernigovskaya. 2006. Focus on form in the acquisition of inflectional morphology by L2 learners: Evidence from Norwegian and Russian. Paper presented at The Second Biennial Conference on Cogntive Science, St. Petersburg, June 913 2006. Townsend, Charles E. 1975. Russian word-formation. Columbus, OH: Slavica Publishers. Trosterud, Trond. 2006. Homonymy in the Uralic two-argument agreement paradigms. Helsinki: FinnoUgrian Society. Wurzel, Wolfgang U. 1984. Flexionsmorphologie und Natrlichkeit. Berlin: Akademie-Verlag. Wurzel, Wolfgang U. 1989. Inflectional morphology and naturalness. Dordrecht, Boston and London: Kluwer Academic Publishers. Zaliznjak, Andrej A. 1977. Grammatieskij slovar russkogo jazyka [Grammatical dictionary of Russian]. Moscow: Izdatelstvo Russkij Jazyk. Zwicky, Arnold M. 1977. Hierarchies of person. In W. Beach, S. Fox and S. Philosoph (eds.), Papers from the thirteenth regional meeting, Chicago linguistic society, 714733. Chicago: The University of Chicago.
Using corpus methodology for semantic and pragmatic analyses: What can corpora tell us about the linguistic expression of emotions?
ULRIKE OSTER*
Abstract The aim of this paper is to explore some of the possibilities, advantages and difficulties of corpus-based analyses of semantic and pragmatic aspects of language in one particular field, namely the linguistic expression of emotion concepts. For this purpose, a methodological procedure is proposed and an exemplary analysis of the emotion concept "fear" in English is performed. The procedure combines Kövecses' lexical approach and Stefanowitsch's metaphorical pattern analysis with additional concepts from corpus linguistics such as semantic preference and semantic prosody. The results of the study show that such a corpus-based analysis of emotion words offers several advantages. Firstly, by exploring the surroundings of the search word in a vast amount of text, we are not only able to find evidence of the conceptual metaphors and metonymies that structure the emotion concept and of related emotion concepts, but can also enrich the description of the emotion concept with information from a series of dimensions and add a pragmatic viewpoint by revealing an explicit or implicit evaluation of the emotion. The second advantage offered by a corpus-based approach lies in the possibility of quantifying results, i.e., comparing the frequency, productivity and creative use of individual metaphors and metonymies, which is especially interesting in view of contrastive studies.
* Address for correspondence: Departament de Traducció i Comunicació, Universitat Jaume I, Campus del Riu Sec, 12071 Castelló, Spain. Email: oster@trad.uji.es. Acknowledgements: I am grateful to two anonymous reviewers for their constructive criticism of an earlier version of this paper, and would also like to thank Heike van Lawick for assisting me with the analysis of metaphorical expressions, and José Luis Martí and Ignasi Navarro for their helpful comments on statistical analysis and prepositional constructions. The study has been supported by research project FFI2009-09544, funded by the Spanish Ministry for Science and Innovation (MICINN). Cognitive Linguistics 21–4 (2010), 727–763 DOI 10.1515/COGL.2010.023 0936–5907/10/0021–0727 © Walter de Gruyter
Keywords: Emotion concepts, corpus methodology, conceptual metaphor, conceptual metonymy, semantic preference, semantic prosody
1. Introduction

Corpus linguistics has grown over recent decades into a well-established field of research, and its methods are now increasingly being applied in cognitive linguistics as well. Present-day corpora contain very large quantities of data and are thus especially useful for quantitative studies of grammatical aspects of language, which are more easily analysed through (semi-)automatic search processes than semantic or pragmatic issues. Nevertheless, corpus linguistics has also developed a number of conceptual tools (for example co-occurrence and collocation, semantic preference and semantic prosody) that help to pinpoint semantic and pragmatic aspects of lexical units. The main objective of this paper is to apply these tools to one of the central themes of cognitive linguistic research: the linguistic expression of emotions and their conceptualization through conceptual metaphor and metonymy.

In this field, Kövecses (2005: 32) distinguishes between cognitively-oriented studies, which typically use elicited data, and language-use-oriented researchers, who tend to use corpus data. The former follow a top-down approach that is directed at a supraindividual level and aim to propose conceptual metaphors on the basis of linguistic expressions. The goal of bottom-up corpus-based studies, on the other hand, is the systematic identification of linguistic metaphors in natural discourse (Kövecses 2008a: 169). These aims and methods can be considered complementary (Kövecses 2008a: 181–182), and thus the question pursued in this paper is how a qualitative approach can be complemented by quantitative data from electronic text corpora. In particular, the following questions will be addressed:

- How can corpus-linguistic concepts be applied in order to establish the conceptual metaphors and metonymies that structure our understanding of emotions?
- Do they reveal additional semantic or pragmatic facets of the words expressing these emotions?
- How does this methodology differ from other ways of analysing emotion concepts?
- In what way might these differences influence obtainable results?

In order to answer these questions, a method for studying emotion concepts with the help of electronic corpora is proposed and an exemplary analysis of the emotion concept "fear"¹ in English is performed.
1. For the sake of clarity, concepts will be marked by inverted commas ("pride"), lexical units used as search words by single inverted commas ('pride'), and co-occurrences from the corpus by italics (heart).
2. Corpus-based analyses of the linguistic expression of emotions: A methodological proposal

2.1. Background: Linguistic approaches to the study of emotions

The interdisciplinary field of language and emotion has been the subject of numerous studies in the fields of psychology and anthropology (cf. Wilce 2009: Ch. 2 and 9). Within linguistics, interest in the interrelation between language and emotions is more recent and sometimes still seen as neglected (e.g., Schwarz-Friesel 2007). However, a series of approaches has been developed from different linguistic perspectives (cf. Bednarek 2008a: 6–12 for an overview), and especially in recent years a number of international conferences and theme sessions as well as specific publications on the subject (e.g., Fussell 2002, Schwarz-Friesel 2007, Bednarek 2008a, to name but a few) show a renewed interest in the study of emotion language. For the purpose of this work, I shall briefly review three of the main methodological approaches used in this area.

Within the framework of natural semantic metalanguage (NSM), emotions are described through a metalanguage that consists of universal semantic primitives (Wierzbicka 1990, 1992a, 1992b, 1999). In accordance with this aspiration to universality, natural semantic metalanguage is used especially for contrastive descriptions of emotions that aim either to differentiate them intralinguistically or to compare them across different languages and cultures. In this approach, in order to arrive at the description of an emotion, a pragmatic analysis of prototypical situations in a given culture is carried out, the researcher's intuition being crucial.

In contrast to Wierzbicka's methodology, the lexical approach developed mainly by Zoltán Kövecses (1986, 1990, 1998, 2000, etc.) starts with the idea that ". . . language, particularly its lexicon, is a reflection of our conceptual system" (Kövecses 1990: 41). For this reason, it bases the description of the complex structure of emotion concepts on an analysis of conventionalized linguistic expressions, such as metaphors, metonymies, idioms, clichés, proverbs and collocations (Kövecses 1990: 43). The data used for studies in the lexical approach is either elicited (enquiries among students) or collected from lexicographical sources like Roget's University Thesaurus (Kövecses 1986: 50). The conceptual structure of the emotion is described on four levels: a system of conceptual metonymies associated with the emotion concept, a system of conceptual metaphors associated with the emotion concept, a set of concepts related to the emotion concept, and a prototypical cognitive model.

In the field of cognitive linguistics, as well as in linguistics in general, there is a growing trend towards using a corpus-based methodology. This is especially true with regard to metaphor research (cf., for example, Deignan 1999, 2005; Charteris-Black 2004; Gevaert 2001, 2005; Stefanowitsch 2005;
Stefanowitsch and Gries 2006) and also to the study of emotions (Stefanowitsch 2006, Bednarek 2008a). Within the framework of conceptual metaphor theory, Stefanowitsch (2006) argues strongly for such an approach. The method he proposes (metaphorical pattern analysis) consists of choosing a lexical item from the target domain, extracting a random sample of its occurrences in the corpus, identifying all metaphorical expressions that the search word is a part of, and grouping them according to general mappings. Stefanowitsch's results show that most of the metaphors described in introspective studies (using Kövecses [1998] as reference point) can be identified through a corpus-based analysis and that additional metaphors can be found. Furthermore, an important advantage of a corpus-based approach is the possibility of quantifying results. In Stefanowitsch's study, quantification is used in order to find out which metaphors are most strongly associated with an emotion.

2.2. Some methodological considerations
a) Dealing with large quantities of data. Corpus-based analysis of metaphor is not methodologically simple. First there is the question of how to deal with the sheer amount of data obtained. With a bottom-up approach on a very large electronic corpus, either a vast number of concordances has to be analysed or some form of pre-selection has to be established. The procedure proposed by Charteris-Black (2004) consists of two steps: a qualitative study of the metaphors, which comprises identification, interpretation and explanation, followed by a quantitative concordance analysis. Stefanowitsch (2006) follows a similar approach (a qualitative analysis of part of the corpus and subsequent quantification). However, this author also complements his method with a top-down approach by carrying out specific searches for previously described metaphor types that have not been found in the original sample of 1000 hits (Stefanowitsch 2006: 79). This is plausible within the context of Stefanowitsch's aim of demonstrating that all the metaphor types identified through introspection can also be found through a corpus-based study. However, it poses a methodological problem regarding sample size if the aim is simply to provide a corpus-based description of the metaphorical expression of the emotion word. Supposing that there are no reliable detailed previous studies, how big should the sample be so as to ensure that no relevant metaphor types are left out? What the contributions of the authors cited above show is that we need to achieve a balance between the necessity of applying coherent methodological criteria and the limitations imposed on the research by corpus size. While it is not possible, when working with very large corpora, to analyse every single co-occurrence, the method should be defined coherently in a way that makes it
possible to trace the metaphorical expressions as exhaustively as possible. I will deal with this issue in some detail in Section 3.

b) Retrieving and analysing the data. A corpus-based approach has certain limitations but it also offers additional possibilities. As Deignan puts it: "There is no way of, say, entering speaker meaning or a conceptual metaphor into a computer and being provided with a list of lexical items realising that particular meaning or metaphor" (1999: 197). While in the lexical approach the analysis starts with a conceptual domain (for example "pride") and the data analysed consists of conventionalized expressions, in a corpus-based approach the starting point is necessarily different: as in Stefanowitsch's (2006) approach, in this study I start from a lexical unit ('fear'). This procedure yields data in the form of co-occurrences (for example heart, swallow, palpable). When looking at the resulting combinations of the search word and its co-occurrences, many of these might be the same as the conventionalized expressions analysed in the lexical approach (e.g., strike fear into someone's heart, swallow one's pride). However, the overall picture is bound to be slightly different for two main reasons:

- The list of co-occurrences only includes lexical units that occur in the vicinity of the search word ('fear' in this case). Expressions that are used figuratively to describe the emotion without naming it cannot be detected with this method. For example, we would not find expressions like make one's blood curdle as an expression of "fear", unless the lexical unit 'fear' occurred within the span that has been defined for the search. It would of course be possible to design a second phase of the analysis, elaborating on every individual metaphor and metonymy. I have not done this here and feel it would not be feasible with such a complex object of study as the linguistic expression of emotions. However, taking into account the finding of Fussell and Moss (1998) that speakers tend to use figurative expressions in addition to the literal ones and not instead of them, this limitation of the data does not seem to be a substantial drawback of the method.
- A much wider range of contexts can be analysed because these are not limited to conventionalized expressions.

2.3. Potential results
The data that can be obtained through corpus analysis is thus more limited than that of the lexical approach in one respect and more wide-ranging in another. By means of analysing the data provided by the corpus analysis, we arrive at conclusions about the structure of the emotion concept. This structure is again similar to that of the lexical approach (it tells us something about conceptual metaphors and metonymies), but it also opens up some different possibilities.
We are therefore going to look in more detail into the additional results that can be expected.

2.3.1. Conceptual proximity. Emotion words seem to have a strong tendency to co-occur with other lexical units expressing feelings, either similar or contradicting ones (for example love–tenderness vs. love–hate). Corpus analysis allows us to quantify the strength of the collocational bonds between the search word and the co-occurring items. However, is it also possible to use these data on co-occurrences to make inferences about how the complex category of an emotion concept is structured? We would probably be expecting too much if we tried to provide a description of the whole category and its internal relationships between central and peripheral members. Nevertheless, analysing the co-occurring emotion words can give us an idea about which elements belong to the emotion category, which are most strongly connected to the emotion word that was used as starting point, as well as about other emotions that are frequently associated with it.

2.3.2. Evaluation and description. Apart from allowing us to have a closer look at well-known aspects of the structure of emotion concepts like metaphor and metonymy, the main advantage of a corpus-based approach is the possibility of making use not only of the material devices (the corpora themselves, the lists of co-occurrences, and the statistical measures) but also of conceptual tools developed in corpus linguistics. The concepts I am proposing to employ are two key notions of corpus studies: semantic preference and semantic prosody, which go back to John Sinclair's work on collocations. Together with collocational profile (i.e., lexical realisation) and colligational patterns (lexicogrammatical realisations), semantic preference and semantic prosody bind words tightly into their contexts and into linguistic convention, forming extended units of meaning (Sinclair 1996).
Of these four points, semantic preference and prosody are probably the most difficult to describe, as they focus on connotational and evaluative aspects. In recent years, these concepts have been examined critically (e.g., Whitsitt 2005), renamed (cf. Hoey's (2005) "semantic association" and Stubbs' (2001) "discourse prosody"), re-explained (Morley and Partington 2009), re-examined (Hunston 2007, Bednarek 2008b) and also increasingly used in cross-linguistic contexts (Tognini-Bonelli 2001, Xiao and McEnery 2006, Stewart 2009, Munday in press). Like Bednarek (2008b), I consider that it is useful to maintain the distinction between the two concepts and, as they have not always been defined homogeneously, I will clarify their use here:

a) Semantic preference refers to the semantic subsets a word's collocates predominantly belong to. In this study, it is used to determine the way the emotion is described and how it combines with (groups of) other lexical units. For this purpose, the semantic fields of the collocates of 'fear' are analysed and classified, which leads to identifying the main descriptive dimensions of the concept and of typical co-occurrence partners from a functional point of view (causes of fear, experiencers of fear, etc.).

b) Semantic prosody, on the other hand, is a connotation that can be transferred to a word if it co-occurs frequently with words carrying a positive or negative evaluative load. Classic examples of semantic prosody include:

- happen, for which Sinclair (1987) showed that things that happen are usually negative;
- utterly, which Louw (1993) showed to be overwhelmingly combined with adjectives expressing something negative; and
- cause, which according to Stubbs (1995) collocates most frequently with negative nouns like harm, alarm, quarrel, danger, etc.
Semantic prosody thus reveals an evaluative potential of the extended unit of meaning that is not always obvious (cf. Channell 2000) and which takes the analysis to a pragmatic level (Sinclair 1996: 87). Several aspects of semantic prosody have been controversial in recent years. One of them concerns the possibility of an evaluative connotation (the "good" or "bad" semantic prosody) being transferred from one lexical unit to another (cf. Whitsitt 2005). Hunston (2007: 266) points out that the idea of the carrying over of attitudinal meaning from one context to another provides a good explanation for implied meaning, but that this does not mean that it necessarily always happens. The issue is clearly beyond the scope of this article; however, in the case of emotion concepts, which inherently already bear an important evaluative load, it is probably not the most important question. What does matter in the context of analysing emotion concepts is the idea that the typical surroundings of a lexical unit may reveal attitudinal meaning and evaluative connotations that relate to the concept. Another open question is whether an analysis of semantic prosodies should concentrate essentially on positive or negative evaluation (Morley and Partington 2009) or whether it is also useful to identify more differentiated aspects of attitudinal meaning (Hunston 2007, Bednarek 2008b). Possibly the two options are more compatible than they seem, e.g., through the Linnaean-style binomial notation proposed by Morley and Partington (2009: 141), such as [good: pleasurable], [good: being in control], [bad: difficult], and [bad: not being in control]. For the purpose of this study, those co-occurrences of the emotion word that present an additional evaluative load will be recorded separately with the aim of assessing a potential positive or negative connotation of some kind.
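As a rough illustration of how evaluative co-occurrences might be "recorded separately", the following Python sketch stores each collocate with a binomial label of the kind proposed by Morley and Partington and computes the frequency-weighted share of each polarity. It is a minimal sketch under stated assumptions, not the procedure used in this study: the function names `parse_label` and `polarity_shares`, the record layout and the sample values are all hypothetical, and the labels themselves would still have to be assigned manually by the analyst.

```python
from collections import Counter


def parse_label(label):
    """Split a binomial label such as '[bad: difficult]' into (polarity, nuance)."""
    polarity, _, nuance = label.strip("[]").partition(":")
    return polarity.strip(), nuance.strip()


def polarity_shares(records):
    """Return the frequency-weighted share of each polarity.

    `records` is an iterable of (collocate, frequency, label) triples;
    this layout is an assumption made for the sketch.
    """
    totals = Counter()
    for _collocate, frequency, label in records:
        polarity, _nuance = parse_label(label)
        totals[polarity] += frequency
    grand_total = sum(totals.values())
    if grand_total == 0:
        return {}
    return {polarity: freq / grand_total for polarity, freq in totals.items()}


# Invented toy records (NOT corpus results), for illustration only.
toy_records = [
    ("collocate_a", 3, "[bad: not being in control]"),
    ("collocate_b", 1, "[good: being in control]"),
]
print(polarity_shares(toy_records))
```

Keeping the finer nuance alongside the polarity means that the same records can serve both a simple positive/negative count and a more differentiated analysis of attitudinal meaning.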
3. Applying the method: A corpus-based analysis of "fear"

3.1. Corpus choice

For the purpose of this study, there are no special needs regarding genre, mode or time of the texts included in the corpus. Size, however, is important, as emotion words are not high-frequency words and a very large amount of text material is needed in order to draw conclusions about their behaviour in context. The best choice, therefore, is to use a very large, general-purpose corpus of several hundreds of millions of words rather than to compile a corpus tailor-made to the needs of the analysis and necessarily smaller. For English, there are several large and freely available corpora, from which I have chosen the Corpus of Contemporary American English (Davies 2008), containing approximately 300 million words at the time of the search. This corpus is continually updated, which means that results may change slightly when a search is carried out at different moments in time.

3.2. Search process
Out of the range of possibilities offered by corpus-analysis software (including word lists, statistics of word frequencies, concordances and key words in context), the most useful tool for semantic analysis seems to be that of extracting a list of co-occurrences for a given search word. This is the (semi-)automatic part of the analysis, although this does not mean there are no decisions to be taken in this phase of the process. There are large differences in the search facilities offered and the type of data provided. Even if this does not influence the results of the study directly, the availability or absence of certain facilities is bound to make the search process either easier or more cumbersome. The following are a few of the features that are offered by most corpora and that I have found useful for streamlining the search process:

Regarding the search word (node):
- Searching by lemmas, which makes it possible to find different forms of the lexical unit in one search. This might not be relevant in our case because the result of the search for 'fear' will automatically include the occurrences of the plural noun 'fears'. However, it does matter in languages in which the plural or other inflectional forms differ more substantially from the singular form (e.g., the German Angst/Ängste) or when the search word is a verb.²
2. It has been shown that different forms of the same lemma can behave differently. Although this is also true for fear/fears, these differences have not been analysed in depth since the focus of this research is on the overall behaviour of the concept, taking together its occurrences in singular and in plural form.
- Limiting the search by the part of speech (POS) of the search word. Collocations may differ considerably between the noun 'fear' and the verb 'fear'. This is especially important in contrastive studies, as in most languages verb and noun do not coincide in their form.

Regarding texts:
- Choosing particular sub-corpora according to genre or time. For the purpose of this study, the only restriction in this sense was the limitation to written sources. In contrastive studies this can also be used to make the corpora more comparable by, for example, excluding genres that are not present in another corpus.

Regarding co-occurrences:
- Differentiating the search with respect to the POS of the co-occurring word. This is useful in order to facilitate the subsequent phase of classifying the co-occurrences. For example, conceptual metaphors and metonymies are often expressed through verbs and nouns, while an adjective search will reveal many things about how the emotion is described. It also makes it possible to adjust the search span according to POS. For example, the classic 4-word span is adequate when looking for nouns and verbs. However, searching for adjectives and prepositions with the same span produces a lot of noise, and I chose to reduce it to 2 for these.
- Searching not only for co-occurrences but also for compounds containing the search word (i.e., *fear, fear*). This is especially interesting in view of contrastive studies that combine languages whose tendencies towards compounding are not equally strong. I have found it useful, however, to exclude lexicalised compounds (e.g., fearful), as their extremely high frequency may distort the results considerably.

Regarding results:
- Grouping results by lemmas. This means that different forms of a word are counted together and shown in their uninflected form. Examples include: paralyze, paralyzed → paralyze; terror, terrors → terror.
- Using different possibilities for sorting the results. For example, it can be useful to carry out a search twice, sorting it once by absolute frequency and once by one of the statistical measures offered by the corpus that are intended to establish the relevance of a collocation.³ In this way, one finds both the most frequent collocates and those that appear especially often together with the node.
- Establishing a minimum frequency of co-occurrence in order to exclude extremely rare combinations. This is even more important when using a
3. Commonly used measurements of this type are the MI-index (mutual information), log-likelihood or t-score.
measure like the MI-index.⁴ If no minimum frequency is set, very strange words (or even typos) score very highly on this index despite the fact that they only occur once or twice in the whole corpus, because they always do so in combination with the search word!
- Establishing a maximum number of hits. This can be used as a means to keep the amount of data that has to be analysed on a manageable level without having to limit the analysis to random samples.

Table 1 shows the searches that were carried out applying these criteria:⁵
Table 1. Searches carried out in the COCA corpus

No.  Search string  POS of co-occurrence  Sorted by             Minimum frequency  Left/right span  Maximum no. of hits
1    fear.[nn*]     noun                  frequency             2                  4/4              400
2    fear.[nn*]     noun                  relevance (MI-index)  5                  4/4              400
3    fear.[nn*]     verb                  frequency             2                  4/4              400
4    fear.[nn*]     verb                  relevance             5                  4/4              400
5    fear.[nn*]     adjective             frequency             2                  2/2              400
6    fear.[nn*]     adjective             relevance             5                  2/2              not necessary^6
7    fear.[nn*]     adverb                frequency             2                  2/2              not necessary
8    fear.[nn*]     preposition           frequency             2                  0/2^7            not necessary
9    fear.[nn*]     preposition           frequency             2                  2/0              not necessary
10   fear*          -                     frequency             2                  -                not necessary
11   *fear          -                     frequency             2                  -                not necessary
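The result-handling criteria listed above (grouping by lemma, a minimum frequency, a maximum number of hits) can be illustrated with a short Python sketch. This is not the COCA interface itself: the lemma lookup, the function name and the sample tokens are hypothetical, and a real study would rely on the corpus's own lemmatisation.

```python
from collections import Counter

# Hypothetical lemma lookup; the corpus interface normally does this step.
LEMMAS = {"paralyzed": "paralyze", "terrors": "terror"}

def rank_collocates(tokens, min_freq=2, max_hits=400):
    """Group collocate tokens by lemma, drop rare ones, cap the list length."""
    counts = Counter(LEMMAS.get(t.lower(), t.lower()) for t in tokens)
    kept = [(lemma, n) for lemma, n in counts.items() if n >= min_freq]
    # Sort by descending frequency, then alphabetically for a stable order.
    kept.sort(key=lambda pair: (-pair[1], pair[0]))
    return kept[:max_hits]

# Invented sample of collocate tokens found near the node "fear".
hits = ["paralyze", "paralyzed", "terror", "terrors", "terror", "factor"]
print(rank_collocates(hits))
```

With a minimum frequency of 2, the single occurrence of factor drops out, while the inflected forms are counted together under their lemma.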
3.3. Classification process
Following the search process described above, the corpus provides us with a list of co-occurrences (or, rather, several lists if complex searches are carried out). In the second, qualitative phase, these lists are then analysed and classified. Classifying the co-occurrences means deciding whether they are relevant for one (or more) of the following points:
a) The metaphorical understanding of the emotion (e.g., overwhelmed by fear → FEAR IS AN ATTACKER)
b) Evidence for metonymic usages (e.g., choke → Disturbed breathing as a sign for fear)
c) Relationships to other emotion concepts (e.g., loathing)
d) A description of the emotion (e.g., ever-present)
e) An evaluation of the emotion (e.g., hidden)

4. The mutual information index compares the observed frequency of co-occurrence of two items to what could be statistically expected if they only co-occurred by chance.
5. Additionally, for all searches the sections of the corpus were limited to fiction, magazine, newspaper and academic and the results were always grouped by lemmas.
6. In these cases there was no need to determine a maximum number of hits because there were fewer than 400 co-occurrences with a minimum frequency of two.
7. It proved useful to search separately for prepositions occurring before and after the noun (e.g., fear in one's heart vs. to be in fear).
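The mutual information index described in footnote 4 can be sketched in Python as the base-2 logarithm of observed over expected co-occurrence frequency. This is the common textbook form, not necessarily the exact variant implemented by the corpus interface (some interfaces also adjust for the search span). The corpus size and all frequencies below are invented for illustration.

```python
import math

def mutual_information(joint, node_freq, collocate_freq, corpus_size):
    """log2 of observed co-occurrences over those expected by chance."""
    expected = node_freq * collocate_freq / corpus_size
    return math.log2(joint / expected)

CORPUS_SIZE = 1_000_000  # hypothetical corpus size in tokens
NODE_FREQ = 1_000        # hypothetical frequency of the node word

# A typo that occurs once in the whole corpus, next to the node.
rare = mutual_information(1, NODE_FREQ, 1, CORPUS_SIZE)
# A common collocate: 10,000 occurrences, 100 of them near the node.
common = mutual_information(100, NODE_FREQ, 10_000, CORPUS_SIZE)
print(round(rare, 2), round(common, 2))
```

The one-off typo scores far higher than the well-attested collocate, which is why a minimum frequency matters when sorting by this index.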
Of course, not every co-occurrence is relevant for the analysis. Even highly frequent items may not reveal anything about any of the five categories. In the case of fear, some of these unrevealing co-occurrences were, for example, factor, way, or day. In a second step, this kind of qualitative filtering is also used to check the quantitative results. This means that when a collocate is classified into one of the categories, a closer examination of the contexts is carried out in order to count only those instances that are actually relevant. For example, in the COCA corpus bad and fear co-occur 59 times, but only 5 of the contexts can be considered relevant in the sense that bad is used as an intensifier of fear, as in the following examples.
Year  Corpus section  Title               Concordance line
1994  NEWS            WashingtonPost      I just felt totally trapped. I did not know what to do and the fear got so bad I was unable to function like in my normal life.
1991  FIC             BkSF:TemptationsSt  get hurt. A safe way, I figured, of overcoming a real bad fear. Besides, you probably also recall how I reacted when the Old Guy told
On the other hand, the following contexts are left out in the count of relevant examples.
Year  Corpus section  Title      Concordance line
2008  MAG             GolfMag    , powerful swing. MANY OF US get caught up in swing thoughts and our fear of hitting bad shots, a combination that results in, what else - bad shots
2006  NEWS            WashMonth  decide not to hold as many lobbyist-sponsored fundraising events, especially in Washington, for fear of bad publicity (and possible indictment against themselves) and raise much more of
3.3.1. The identification of metaphorical expressions. The metaphorical conceptualisation of the emotion concept is the main focus of this analysis and has also been the object of extensive previous research. I will therefore explain the identification and classification process of the metaphorical expressions in more detail.
The identification process follows the core of the procedure proposed by the Pragglejaz Group (2007). As we are not dealing with linear textual analysis, though, but starting with a list of co-occurrences, the order of steps has to be different:
a) The list of co-occurrences is scanned for possible candidates that might be part of a metaphorical expression.
b) Contexts of these are retrieved (concordance lines).
c) It is established whether there is a contrast between contextual meaning and the more basic meaning of the lexical unit (i.e., a more concrete or precise meaning, one that is related to bodily action or that is historically older, cf. Pragglejaz Group 2007: 3).^8
d) Only those instances in which a contrast exists are counted.
The qualitative context-based filtering mentioned above is especially relevant with respect to metaphor identification. It is used to establish not only whether an expression is used metaphorically but also whether it is used in more than one sense. For example, it can happen that some instances of a co-occurring lexical item are classified in one category, while others fall into another. Thus, some instances of the co-occurrence override were classified as an expression of the metaphor FEAR IS SOMETHING THAT DOMINATES.
Year  Corpus section  Title           Concordance line
2002  ACAD            ForeignAffairs  to signal that the United States was not prepared to defend local governments, the fear of Soviet control overrode these concerns. In 1957, in response to increased instability
Others, like the following, belong to the category FEAR IS SOMETHING THE SELF FIGHTS BACK AGAINST.
Year  Corpus section  Title        Concordance line
1990  FIC             BkSF:Iris    It was all fascinating, but not quite fascinating enough to override the growing fear in her. Despite the fact that it was yet to come on formally,
2003  FIC             LiteraryRev  having told him I'd been a teacher? # And if I had overridden fear and dared ask him, Donny, please close the curtain, wouldn't I
8. In case of doubt, the Merriam-Webster Online dictionary was used for establishing this contrast. When there were more than 50 contexts for a co-occurrence, the total number of metaphorical contexts was estimated on the basis of the proportion found in the first 50 instances.
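The estimation step mentioned in footnote 8 amounts to a simple extrapolation, sketched here in Python. The function name, the rounding to the nearest integer and the example figures are assumptions made for illustration; the paper does not say how fractional estimates were handled.

```python
def estimate_metaphorical(total_contexts, sample_hits, sample_size=50):
    """Extrapolate the share of metaphorical contexts found in the inspected sample."""
    if total_contexts <= sample_size:
        # All contexts were inspected, so the count is exact.
        return sample_hits
    # Assumption: the estimate is rounded to the nearest whole context.
    return round(total_contexts * sample_hits / sample_size)

# Hypothetical collocate: 120 contexts, 30 of the first 50 judged metaphorical.
print(estimate_metaphorical(120, 30))
```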
As a certain degree of subjectivity in the classification of metaphorical expressions is inevitable, this process was carried out in several steps. First, all the expressions were classified independently by two researchers. The discussion of divergences in the classifications led to the establishment of specific criteria for problematic cases, which were then applied consistently to the whole data set. The final result was then checked again by both researchers. The criteria are the following:
- In the case that an expression could be interpreted according to more than one metaphor (e.g., gripped by fear as an instance of FEAR IS AN ATTACKER or FEAR IS A HUMAN BEING), the analysis is first done on the highest level of the typology. Gripped by fear, for instance, seems more indicative of the conceptualisation of fear as an ANTAGONIST than as an AUTONOMOUS FORCE. It is therefore classified in the category FEAR IS AN ATTACKER.
- If an expression is already metaphorical in the source domain, only the last link in this metaphorical chain is taken into account. For example, the co-occurrence of fear with severe or stricken with is classified as an instance of FEAR IS AN ILLNESS, irrespective of the fact that in combination with an illness these could in turn be interpreted as metaphorical (THE ILLNESS IS A HUMAN BEING / A WEAPON).

3.3.2. The classification of metaphorical expressions. The starting point for the classification of the metaphorical expressions identified in this way is a typology that was itself the result of a corpus-based contrastive analysis of several emotion concepts, including fear, envy and pride (Oster 2010). This study confirmed that most metaphors found in the corpus data could be accounted for by one of the metaphor types described in the lexical approach, but it also introduced some additional subtypes.

THE EMOTION IS:
A. SOMETHING INSIDE THE BODY
B. A FORCE
   i. AN ANTAGONIST
   ii. AN AUTONOMOUS FORCE
   iii. AN AUTONOMOUS BEING, THOUGH STILL PART OF THE PERSON
C. AN ILLNESS / INSANITY
D. AN OBJECT
E. A PLACE / A CONTAINER
This typology is of course heavily influenced by Kövecses' studies, but it groups the types of metaphor differently. Furthermore, the conceptual metaphors at the highest level of abstraction are formulated in a slightly more general way, in order to account for minor differences in metaphorical expressions between languages without postulating conceptual metaphors that differ at the highest level. For example, type A (THE EMOTION IS SOMETHING INSIDE THE BODY) is similar to Kövecses' conceptual metaphor THE EMOTION IS A FLUID IN A PRESSURIZED CONTAINER. The more general formulation used here is due to the fact that in Spanish and German few examples were found that evidence the conceptualization of the emotion as a fluid and even fewer that would make one think of a pressurized container.

In the case of conceptual metaphor B (THE EMOTION IS A FORCE),^9 what the three subtypes have in common is that the emotion is characterised by being a distinct entity that is autonomous, up to the point of a conceptualization as a person or an animal. I distinguish three subtypes that highlight different features.
(i) AN ANTAGONIST: The emotion is an (external) force that attacks the self and is fought against.
(ii) AN AUTONOMOUS FORCE: The emotion is a force that acts independently and is not controlled by the self.
(iii) AN AUTONOMOUS BEING, THOUGH STILL PART OF THE PERSON: Here, there is a stronger link to the self. The emotion is conceived as a person inside the person, which can be passive (she hurt my pride), or active if it acts instead of the self (Spanish: mi orgullo se subleva).

Depending on the kind of emotion we are looking at, some types are more relevant than others. For example, in the case of negative emotion concepts like fear or envy, there is a predominance of metaphorical expressions that represent the emotion as coming from the outside and attacking the person.
Emotions like pride, on the other hand, which can be seen as positive, are more frequently seen as an entity which belongs to the person even if it is autonomous.
9. We could also understand types A, B and C according to a generic-level master metaphor EMOTION IS FORCE, as suggested by Kövecses (2000: 61–80), and consider them specific-level instantiations of this metaphor. However, EMOTION IS FORCE is used here in a more restricted sense, applying it only to the three subtypes of B (OPPONENT, AUTONOMOUS FORCE AND AUTONOMOUS BEING INSIDE THE PERSON), thus stressing the differentiating traits of the three major metaphors (A SOMETHING INSIDE, B FORCE, C ILLNESS/MADNESS). On the other hand, it does not seem possible to include types D (THE EMOTION IS AN OBJECT) and E (THE EMOTION IS A PLACE/CONTAINER) in the EMOTION IS FORCE generic-level metaphor.
3.3.3. Finding evidence for conceptual metonymy. In the lexical approach to the study of emotions, we speak of conceptual metonymy when an emotion is represented by its physiological effects or by the behavioural reactions it generates. This makes the limitation of the corpus-based method as explained above especially relevant: Owing to the methodological necessity of including the node fear in the corpus search, it is only possible to find instances of physiological effects or behavioural reactions if the emotion itself is also mentioned, whereas there does not seem to be a straightforward way of tracing instances in which the physical effect actually stands for the emotion. Nevertheless, the co-occurrences include many expressions that evidence various physical or behavioural effects of the emotion. An analysis of these co-occurrences can probably provide us with interesting insights into the question of which effects are prevalent in the conceptualization of an emotion in a given language/culture.

3.4. Ways of quantifying the relevance of results
Many corpus-based semantic studies make use of corpus data in order to obtain only qualitative information on their subject. For example, in his account of corpus studies in lexical semantics and the role of co-occurrence analysis, Stubbs (2002: 73) chooses not to present any figures on the statistical significance of the data. The two main reasons presented for this decision are that variables in natural language texts are not randomly distributed and that the levels of co-occurrence are far above what one might expect by chance. However, even if the linguistic data in text corpora is not randomly distributed, this does not mean that statistical analysis cannot be performed or that its results are not meaningful. In fact, the possibility of quantifying results is one of the main advantages of the corpus-based approach. Only this will enable us to assess the productiveness of certain metaphors and to make comparisons either intralinguistically, for example, for questions like "Are there metaphors that are more frequent for some emotions than for others?" (cf. Stefanowitsch 2006) or "Do we use the same type of metaphors now as we did two hundred years ago?", or interlinguistically ("Is a certain type of metaphor more frequent in one language than in another?"). One common way of quantifying the relevance of collocations is the mutual information index. However, this index tends to give high scores to very rare words. In view of these difficulties, I have adopted an approach that is based on Kövecses' reflection that ". . . the more frequent and numerous the linguistic metaphorical expressions linking the two domains, the more stable the connections between them in the brain may be" (2005: 34). Kövecses' hypothesis actually seems to link the two observable parameters (frequency and quantity of metaphorical expressions) to the cognitive entrenchment of the metaphors
in the human brain. This is a plausible hypothesis, but I do not want to take the conclusions from a corpus-based study this far. Nevertheless, I think that these parameters are useful in order to quantify the relevance of certain metaphorical usages in a given corpus and in the section of language it was designed to represent (in the case of our study: contemporary written American English). The following parameters were thus adopted as a measurement of relevance of the results:
- The number of different linguistic expressions co-occurring with fear in the corpus that are a realisation of a certain metaphor or metaphor subtype. For example, there are eight different linguistic expressions for the metaphor FEAR IS A BURDEN (outweigh, carry, alleviate, weight, fraught with, heavy, fear-charged, fear-laden).
- The absolute frequency of linguistic expressions, i.e., how many times an item co-occurs with fear. For example, the absolute frequency of outweigh as co-occurrence partner of fear in our corpus is 13, and for the metaphor FEAR IS A BURDEN it is 59 (the sum of outweigh [13], carry [12], alleviate [12], weight [10], fraught with [4], heavy [4], fear-charged [2] and fear-laden [2]).^10
- The relative frequency, i.e., the absolute frequency divided by the total number of classified metaphorical co-occurrences. For example, the relative frequency of outweigh as co-occurrence partner of fear is 0.2% out of a total of 5516 metaphorical co-occurrences.
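The three parameters can be recomputed from the figures quoted for FEAR IS A BURDEN. The following Python sketch only restates that arithmetic; the variable and function names are illustrative.

```python
# Absolute frequencies for FEAR IS A BURDEN, as quoted in the text.
burden = {
    "outweigh": 13, "carry": 12, "alleviate": 12, "weight": 10,
    "fraught with": 4, "heavy": 4, "fear-charged": 2, "fear-laden": 2,
}
TOTAL_METAPHORICAL = 5516  # total classified metaphorical co-occurrences

# Parameter 1: number of different linguistic expressions.
number_of_expressions = len(burden)
# Parameter 2: absolute frequency of the metaphor as a whole.
absolute_frequency = sum(burden.values())

def relative_frequency(count, total=TOTAL_METAPHORICAL):
    """Parameter 3: absolute frequency as a percentage of all metaphorical co-occurrences."""
    return 100 * count / total

print(number_of_expressions, absolute_frequency)
print(f"{relative_frequency(burden['outweigh']):.1f}%")
```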
4. Results

4.1. Conceptual metaphors
4.1.1. Metaphor types and subtypes. When looking at the results from a qualitative point of view, we find that all the major types of conceptual metaphor but one are used to express the concept of fear. Fear is described as a FORCE that is either hostile towards the self (Bi) or out of its control (Bii), but
10. In parallel to the corpus-linguistic concepts "type" and "token", some researchers (Gevaert 2001, 2005; Koller 2006, 2008) have called "type" what is called "number of different expressions" here and "token" what I call "absolute frequency". In order to avoid confusion between the well-established notion of type-token ratio as a measure of the lexical variety of a text and the "creativity ratio" I am going to introduce, I prefer to maintain the more transparent terminology explained above.
there is no evidence for metaphors that stress that the emotion is part of the person (Biii). Furthermore, the metaphorical expressions found for metaphor types A, B, C, D and E show distinct features, which allows a further classification into subtypes. I will now briefly explain the most important aspects of every metaphor and refer the reader to Appendix 1 for a complete list of linguistic expressions (with quantities in square brackets) for every metaphor and metaphor subtype.

A. FEAR IS SOMETHING INSIDE THE BODY
On a general level, among the most pervasive signs for the conceptualization of fear as something located inside the human body are prepositional or adverbial structures like fear in . . . (persons, body parts), strike fear into . . . , fear moves through . . . , fear inside, or fear within. Furthermore, we find a great variety of co-occurrences that evidence that fear is located in or affects specific body parts (heart, stomach, blood, etc.) or the soul. Additionally, there are many metaphorical expressions that allow us to subdivide the basic conceptual metaphor into subtypes. For example, many of the metaphorical expressions (but by no means all of them) are related to the conceptualization of the emotion as a liquid and/or to more general metaphors like MORE IS UP: Fear is a liquid inside the body (wave of fear, trickle of fear, fear drains from s/o, etc.) and when it becomes stronger it tends to go up in the body like a liquid in a container (rise (in/inside) s/o, swallow, force down one's fear, etc.). Other aspects are that the emotion comes from the outside (fill, inspire, instil(l), etc.), that an emotion that is strong is deep inside the body (deep, deep-seated, ingrained, profound, etc.) and that it emanates from the body and is thus perceptible (smell, fear-scent).

Bi. FEAR IS AN ANTAGONIST
Fear is also very frequently understood as an external force that acts against the experiencer. This can take the form of an attack (grip, seize, overwhelming, etc.), domination (haunt, take hold of, dominate, etc.) or destruction (gnawing, all-consuming, nagging, etc.). It may cause pain (twinge, numb one's fear, tortured by fear) or be a burden (be outweighed by, carry, alleviate, etc.). The attacker can also be conceptualized as an evil force (haunt, possessed by). On the other hand, there is the possibility of counterattack (conquer one's fear, overcome, fight (back), etc.).

Bii. FEAR IS AN AUTONOMOUS FORCE
Expressions relating to the general idea of fear being an autonomous force acting independently and not controlled by the person are spread, uncontrollable and powerful. More specifically, the emotion can be seen as a human being or animal (arouse, awaken, creep, etc.) or as a plant (sow, deep-rooted). It is also
described as a liquid^11 (wave of fear, sweep, wash over s/o, etc.) or fire (spark, fuel, stoke, etc.).

C. FEAR IS AN ILLNESS/INSANITY
Another widespread conceptualization of fear is that of an illness (suffer, develop a fear, sick with, etc.) or madness (crazy with fear, mad with, insane with/from).^12

D. FEAR IS AN OBJECT
Fear can be perceived as a physical object in general (palpable, push away, struck with fear), as a piercing object (edge, sharp), a possession (lose, bring, take, etc.), as food (feed on fear, to be fed fear) or an obstacle (break through fear, get over/past fear).

E. FEAR IS A PLACE/CONTAINER
The conceptualization of fear as a place or container is mainly expressed through the use of prepositions (in one's fear, out of fear, through one's fear, over one's fear). Some expressions also convey the idea of fear as something that surrounds people (amid fear, thick in the air).

4.1.2. Frequency of occurrence. A quick glance at the most and least frequent co-occurrences (cf. Table 2) shows two things: the importance of the conceptualisation of fear as SOMETHING INSIDE THE BODY (in, into, heart, fill, inspire), and the fact that prepositions enjoy a special status. Four out of the ten most frequent co-occurring items are prepositional constructions and these are extremely frequent in comparison to other metaphorical expressions. In spite of the fact that there are rather few different prepositions expressing a metaphorical understanding of an emotion (in, into, out of, past, through, over), this important numerical difference between prepositional constructions and other word classes leads to a considerable impact of the former on the overall distribution of the different metaphors. If prepositions are taken into account (cf. the first column of each parameter in Table 3), the most prevalent metaphor is FEAR IS SOMETHING INSIDE THE BODY (2072 instances, i.e., 37.6% out of a total of 5516), followed by FEAR IS A PLACE/CONTAINER (26.5%) and, with less incidence, FEAR IS AN ANTAGONIST (14.5%), FEAR IS AN OBJECT (11.8%) and FEAR IS AN AUTONOMOUS FORCE (7.3%). In comparison, FEAR IS AN ILLNESS/INSANITY (2.3%) can almost be considered marginal.
11. In contrast to one of the subtypes of A (A LIQUID INSIDE THE BODY), the liquid here is outside the person, surrounding it.
12. In both cases, contexts dealing with fear as an actual mental illness, i.e., not in a metaphorical sense, were not taken into account.

Table 2. The ten most and least frequent co-occurrences

Co-occurrences                      Absolute subtype frequency
in one's fear                       1000
fear in ... (persons, body parts)   800
out of fear                         400
have                                287
overcome                            203
heart                               170
lose                                150
fill                                114
(strike, etc.) fear into ...        100
inspire                             95
...                                 ...
lungs                               2
leak fear                           2
stricken with                       2
constrained by fear                 2
take over                           2
driven                              2
beget fear                          2
fear-engendering                    2
ignite                              2
thick                               2
Table 3. Overall distribution of conceptual metaphors (with and without taking prepositional constructions into account)

Conceptual metaphor: FEAR IS  Absolute frequency            Relative frequency            Number of different expressions
                              With prep.     Without prep.  With prep.     Without prep.  With prep.     Without prep.
A. SOMETHING INSIDE THE BODY  2072           1087           37.6%          35.0%          51             46
B i. AN ANTAGONIST            801            796            14.5%          25.7%          52             51
B ii. AN AUTONOMOUS FORCE     401            401            7.3%           12.9%          36             36
C. AN ILLNESS/INSANITY        127            127            2.3%           4.1%           18             18
D. AN OBJECT                  651            651            11.8%          21.0%          15             15
E. A PLACE/CONTAINER          1459           36             26.5%          1.2%           9              4
Total                         5516           3103           100%           100%           181            170
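The effect of including or excluding prepositional constructions can be checked by recomputing the relative frequencies from the absolute frequencies in Table 3. The Python sketch below uses the totals as printed in the table as denominators; the data structure and names are illustrative only.

```python
# Absolute frequencies (with prep., without prep.) as given in Table 3.
TABLE_3 = {
    "A. SOMETHING INSIDE THE BODY": (2072, 1087),
    "B i. AN ANTAGONIST": (801, 796),
    "B ii. AN AUTONOMOUS FORCE": (401, 401),
    "C. AN ILLNESS/INSANITY": (127, 127),
    "D. AN OBJECT": (651, 651),
    "E. A PLACE/CONTAINER": (1459, 36),
}
TOTALS = (5516, 3103)  # totals as printed in Table 3, not recomputed sums

def shares(column):
    """Percentage share of each metaphor for one column (0 = with prep., 1 = without)."""
    return {name: round(100 * freqs[column] / TOTALS[column], 1)
            for name, freqs in TABLE_3.items()}

with_prep, without_prep = shares(0), shares(1)
for name in TABLE_3:
    print(f"{name:30} {with_prep[name]:5.1f}% {without_prep[name]:5.1f}%")
```

The PLACE/CONTAINER row shows the largest shift between the two columns, which is the contrast the discussion of prepositional constructions turns on.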
However, if we only count those word classes traditionally thought of as having stronger lexical weight (nouns, verbs, adjectives), which results in a total of 3103 instances, the picture is somewhat different, with a more even distribution among the three major metaphor types A (35%), B (38.6% taking together the two subtypes) and D (21.0%). The largest difference is to be found in the PLACE/CONTAINER metaphor. While it is the second most important metaphor when taking into account prepositions, there are very few expressions that include nouns, verbs or adjectives, which makes the frequency of this metaphor drop to 1.2% if we only count the latter. This might be interpreted as indicating that the conceptualisation of an emotion as a place or container is a very basic and conventionalized metaphor, but barely ever consciously employed.

Because of the large differences in frequency between prepositions and other word classes, it seems convenient to treat prepositional expressions separately when comparing the quantitative results for individual metaphors in more detail. Table 4 below (Section 4.1.3) therefore shows the frequency of every subtype without taking into account prepositional expressions in order to be able to appreciate the differences more clearly. From the point of view of frequency, the major subtypes are:
- FEAR IS A POSSESSION (525 instances, 16.9% of a total of 3103)
- IT IS SOMETHING THE SELF FIGHTS BACK AGAINST (345, 11.1%)
- IT IS LOCATED IN OR AFFECTS SPECIFIC BODY PARTS (329, 10.6%)
- IT COMES FROM THE OUTSIDE (414, 9.9%)
- IT IS A HUMAN BEING OR ANIMAL (217, 7.0%)
- IT IS UNSPECIFICALLY LOCATED INSIDE THE BODY (216, 7.0%)
- IT IS AN ATTACKER (162, 5.2%)

4.1.3. Productivity and creative use. Frequency of occurrence alone is not enough to describe the metaphorical expression of an emotion, though. There are some metaphors that are highly frequent but only very few different linguistic expressions can be found for them. Apart from the FEAR IS A PLACE/CONTAINER metaphor, which is almost exclusively present in rather few prepositional structures, there are other metaphor subtypes whose materialisation is limited to almost stereotyped expressions, for example fill with and its synonyms for the metaphor FEAR IS SOMETHING THAT COMES FROM THE OUTSIDE. In contrast, there are other metaphors that are not only frequent but also show a high degree of productivity and seem to be explored creatively in language use. This is the case of FEAR IS FIRE, for which we find nine different co-occurrences (spark, fuel, flash, stoke, flare, extinguish, burn, fire, ignite).

Table 4. Distribution of metaphor subtypes

FEAR IS                                                               ASF    %       NDE   PI      CR
A. SOMETHING INSIDE THE BODY
  SOMETHING THAT IS UNSPECIFICALLY LOCATED INSIDE THE BODY            216    7.0%    4     16.8    0.3
  SOMETHING THAT IS LOCATED IN OR AFFECTS SPECIFIC BODY PARTS         329    10.6%   15    95.8    0.8
  SOMETHING THAT AFFECTS THE SOUL                                     29     0.9%    1     0.6     0.6
  SOMETHING THAT COMES FROM THE OUTSIDE                               414    9.9%    8     53.5    0.5
  SOMETHING THAT IS DEEP INSIDE THE BODY WHEN IT IS STRONG            75     2.4%    7     10.2    1.7
  SOMETHING THAT TENDS TO GO UP IN THE BODY WHEN IT BECOMES STRONGER  66     2.1%          7.7     1.6
  A LIQUID INSIDE THE BODY                                            11     0.4%    3     0.6     4.9
  SOMETHING THAT EMANATES FROM THE BODY AND IS THUS PERCEPTIBLE       47     1.5%    2     1.8     0.8
  TOTAL                                                               1087   35.0%   46    970.7   0.8
B i. AN ANTAGONIST
  AN ATTACKER                                                         162    5.2%    8     25.2    0.9
  SOMETHING THAT DOMINATES                                            95     3.1%    15    27.7    2.9
  A BURDEN                                                            59     1.9%    8     9.2     2.5
  SOMETHING THAT DESTROYS                                             38     1.2%    4     3.0     1.9
  SOMETHING THE SELF FIGHTS BACK AGAINST                              345    11.1%   7     46.9    0.4
  SOMETHING THAT CAUSES PAIN                                          33     1.1%    4     2.6     2.2
  AN EVIL FORCE                                                       45     1.5%    3     2.6     1.2
  DARKNESS                                                            19     0.6%    2     0.7     1.9
  TOTAL                                                               796    25.7%   51    788.1   1.2
Table 4 (Continued)

FEAR IS                                 ASF    %       NDE   PI      CR
B ii. AN AUTONOMOUS FORCE
  AN UNSPECIFIC AUTONOMOUS FORCE        75     2.4%    5     7.3     1.2
  A PLANT                               17     0.5%    2     0.7     2.1
  A HUMAN BEING OR ANIMAL               217    7.0%    15    63.2    1.2
  A LIQUID OUTSIDE THE BODY             38     1.2%    6     3.7     2.4
  FIRE                                  54     1.7%    9     9.4     3.0
  TOTAL                                 401    12.9%   37    280.3   1.6
C. ILLNESS/INSANITY
  ILLNESS                               106    3.4%    14    28.8    2.4
  INSANITY                              21     0.7%    4     1.6     3.4
  TOTAL                                 127    4.1%    18    44.4    2.6
D. AN OBJECT
  A PHYSICAL OBJECT                     32     1.0%    2     1.2     1.1
  A PIERCING OBJECT                     30     1.0%    3     1.7     1.8
  A POSSESSION                          525    16.9%   5     51.0    0.2
  FOOD                                  16     0.5%    2     0.6     2.3
  AN OBSTACLE                           48     1.5%    3     2.80    1.1
  TOTAL                                 651    21.0%   15    189.6   0.4
E. A PLACE
  A PLACE OR CONTAINER                  36     1.2%    4     1.0     1.3
  SOMETHING THAT SURROUNDS THE PERSON   27     0.87%   2     1.1     1.3
  TOTAL                                 27     0.87%   6     2.8     2.7

If we want to appreciate these differences, it is therefore not enough to look at a metaphor's frequency or its number of different linguistic expressions in isolation; rather these two parameters should be seen in relation to each other, because this tells us something about how productive a metaphor is and how creatively it is used. This is why I introduce two additional parameters that will be called productivity index and creativity ratio. The productivity index is defined as the product of absolute subtype frequency (ASF) and number of different expressions (NDE). In order to give equal weight to both parameters, percentages are used rather than absolute
figures.^13 This operation provides us with an index that yields very high values for items that score high both on absolute subtype frequency and number of different expressions (for example FEAR IS LOCATED IN OR AFFECTS SPECIFIC BODY PARTS has a productivity index of 95.5), whereas metaphor subtypes with few different expressions and low frequency score very low (for example FEAR IS A PLANT with an index of 0.7).

The creativity ratio, on the other hand, is the ratio between the two parameters.^14 This means that the higher the number of different expressions for a metaphor with respect to its overall frequency (i.e., the more creatively it is used), the higher the ratio will be. On the other hand, the more conventionalized a metaphor (with few, highly frequent, different expressions), the lower its score.

Table 4 shows the five quantitative indicators absolute subtype frequency (ASF), percentage of total metaphorical expressions, number of different expressions (NDE), productivity index (PI) and creativity ratio (CR) for all the metaphor subtypes explained in Section 4.1.1. Looking at the parameters productivity index and creativity ratio, we can locate every metaphor on a cline from the most conventional to the most extravagant. Subdividing this continuum is to a certain extent subjective; nevertheless I want to give examples for each of three possible categories between which there are no clear-cut boundaries:

a) Highly conventional metaphors (CR is low or very low 1, PI is high^15, cf. Table 5)
These metaphors present high frequencies and a comparatively small number of different linguistic expressions. There can be a certain degree of variability in the words these metaphors are expressed through. However, they are characterised by a relatively small number of highly conventionalized fixed expressions.

Table 5. Highly conventional metaphors

FEAR IS                                                     ASF   NDE   PI     CR
SOMETHING THAT COMES FROM THE OUTSIDE                       414   8     53.5   0.5
  (fill, inspire, instil(l), engender, fear-inspiring, fear-inducing)
SOMETHING THE SELF FIGHTS BACK AGAINST                      345   7     46.9   0.4
  (conquer one's fear, overcome, fight, confront, override, combat, banish)
A POSSESSION                                                525   5     51.0   0.2
  (have, lose, bring, take, get rid of)

b) Creatively used metaphors (CR is medium or high, PI is low to medium, cf. Table 6)
These metaphors have a relatively high number of different expressions combined with a medium to low frequency. The higher the creativity ratio, the more creatively a metaphor is used. This does not mean that individual metaphorical expressions have to be especially unusual, but rather that the conceptual metaphor is explored in a creative way in language use. Examples can be found in all of the four major metaphor types.

Table 6. Creatively used metaphors

FEAR IS                                                     ASF   NDE   PI     CR
FIRE                                                        54    9     9.4    3.0
  (spark, flash, fuel, stoke, flare, burn, extinguish, fire, ignite)
SOMETHING THAT IS DEEP INSIDE THE BODY WHEN IT IS STRONG    75    7     10.2   1.7
  (deep, deep-seated, deepen, ingrained, profound, deep-rooted, entrenched)
ILLNESS                                                     106   14    28.8   2.4
  (develop, sick with, morbid, struck with fear, fear-stricken, pathological, chronic, latent, infect, contagious, severe, stricken with, fear-struck)
SOMETHING THAT DOMINATES                                    95    15    27.7   2.9
  (take hold of, dominate, overtake, spur, overcome by, force, fear-ridden, fear-driven, constrained by, inhibited by, take over, succumb to, override, compel, driven)

c) Rare metaphors (CR is high or very high, PI is very low, cf. Table 7)
Rather infrequent metaphors are characterised by an extremely high creativity ratio and low productivity index. This is due to the combination of very low frequencies with a very small number of different expressions.

Table 7. Rare metaphors

FEAR IS                                                     ASF   NDE   PI     CR
A LIQUID INSIDE THE BODY                                    11    3     0.6    4.9
  (wave of fear (inside the person), trickle of fear, fear drains from s/o)

13. Productivity index: $PI = \frac{ASF}{\text{total } ASF} \times \frac{NDE}{\text{total } NDE}$
14. Creativity ratio (The absolute ratio is divided by its mean in order to normalize the result distributed around 1): $CR = \frac{NDE / \overline{NDE}}{ASF / \overline{ASF}}$, where the overline denotes the mean.
15. Applying the k-means clustering algorithm in order to take into account the clustering of the results, the values of CR and PI are distributed across five groups. PI below 5 is considered very low, 1 to 16 as low, 16 to 40 as medium, 40 to 75 as high, and above 75 as very high. A CR below 0.8 is rated as very low, 0.8 to 1.7 as low, 1.7 to 2.7 as medium, 2.7 to 4.2 as high, and above 4.2 as very high.
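The two indices can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not a reproduction of the published figures. The total of 170 different expressions is taken from Table 3 and may not be the denominator behind Table 4. The means for the creativity ratio are computed over the three sample rows only, so the resulting values differ from the table. The handling of values that fall exactly on a band boundary is also an assumption, and all names are illustrative.

```python
from bisect import bisect_right
from statistics import mean

def productivity_index(asf, nde, total_asf, total_nde):
    """Product of the two percentage shares (frequency share x expression share)."""
    return (100 * asf / total_asf) * (100 * nde / total_nde)

def creativity_ratios(subtypes):
    """(NDE / mean NDE) / (ASF / mean ASF) for each subtype in the mapping."""
    mean_asf = mean(asf for asf, _ in subtypes.values())
    mean_nde = mean(nde for _, nde in subtypes.values())
    return {name: (nde / mean_nde) / (asf / mean_asf)
            for name, (asf, nde) in subtypes.items()}

# CR bands as quoted in the footnote on clustering; a boundary value is
# assigned to the upper band here (assumption).
CR_BOUNDS = [0.8, 1.7, 2.7, 4.2]
CR_LABELS = ["very low", "low", "medium", "high", "very high"]

def cr_band(cr):
    """Map a creativity ratio onto one of the five verbal bands."""
    return CR_LABELS[bisect_right(CR_BOUNDS, cr)]

# (ASF, NDE) pairs for three subtypes, taken from Table 4.
sample = {"FIRE": (54, 9), "ILLNESS": (106, 14), "A POSSESSION": (525, 5)}
ratios = creativity_ratios(sample)
# Assumed denominators: 3103 instances and 170 expressions (Table 3 totals).
pi_fire = productivity_index(54, 9, total_asf=3103, total_nde=170)

print(round(pi_fire, 1), {k: round(v, 2) for k, v in ratios.items()})
print(cr_band(0.2), cr_band(3.0), cr_band(4.9))
```

Even on this reduced sample the ordering matches the tables: A POSSESSION has the lowest creativity ratio and FIRE the highest.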
4.2. Evidence for conceptual metonymy
As explained above (cf. Section 3.3.3), working with corpora does not really allow us to track conceptual metonymies. However, we do find a great number of expressions that evidence various physical effects of the emotion and it seems to be useful to analyse these co-occurrences in order to see which effects are most frequently used to describe in a graphic way the state of fear of a person. Table 8 shows quantitative data on the physical effects of fear as found in the corpus (for the complete details cf. Appendix 2).
Table 8. Physical effects of fear as evidenced in the corpus
Physical effect | ASF | NDE | PI | CR
Fear causes agitation | 363 | 20 | 382.9 | 1.0
Fear causes immobilisation or contraction | 179 | 13 | 122.7 | 1.3
Fear causes body temperature to sink | 151 | 5 | 39.8 | 0.6
Fear causes screaming or crying | 117 | 6 | 37.0 | 0.9
Fear affects one's voice | 107 | 2 | 11.3 | 0.3
Fear disturbs breathing | 87 | 9 | 41.3 | 1.9
Fear shows in the face | 214 | 1 | 11.3 | 0.1
Fear shows in the eyes | 380 | 14 | 280.6 | 0.7
Fear causes dilation of the eyes | 80 | 6 | 25.3 | 1.4
Fear causes an unpleasant taste or smell | 50 | 3 | 7.9 | 1.1
Fear causes a change of colour | 41 | 5 | 10.8 | 2.2
Fear causes weakness/incapability | 39 | 10 | 20.6 | 4.7
Fear causes sweating | 31 | 3 | 4.9 | 1.8
Fear causes a prickling sensation | 13 | 3 | 2.1 | 4.2
Fear causes body temperature to rise | 7 | 2 | 0.7 | 5.2
Fear causes loss of control over body functions | 2 | 1 | 0.1 | 9.0
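As a rough check on how the two indices relate to the raw counts, the following Python sketch recomputes them from the ASF and NDE columns of Table 8. It assumes that the productivity index is the product of the two relative shares (ASF over total ASF, NDE over total NDE) and that the creativity ratio is NDE/ASF divided by the ratio of the column means; the function name `indices` and the short row labels are illustrative. The PI is left unscaled here, because the published PI values appear to be on a larger scale whose factor is not stated in this section. Under these assumptions the recomputed CR values agree with the published ones to within about 0.1.

```python
# Rows of Table 8: (short label, ASF, NDE, published CR).
TABLE_8 = [
    ("agitation", 363, 20, 1.0),
    ("immobilisation", 179, 13, 1.3),
    ("temperature sinks", 151, 5, 0.6),
    ("screaming/crying", 117, 6, 0.9),
    ("voice", 107, 2, 0.3),
    ("breathing", 87, 9, 1.9),
    ("face", 214, 1, 0.1),
    ("eyes", 380, 14, 0.7),
    ("dilation of eyes", 80, 6, 1.4),
    ("taste/smell", 50, 3, 1.1),
    ("colour", 41, 5, 2.2),
    ("weakness", 39, 10, 4.7),
    ("sweating", 31, 3, 1.8),
    ("prickling", 13, 3, 4.2),
    ("temperature rises", 7, 2, 5.2),
    ("loss of control", 2, 1, 9.0),
]


def indices(rows):
    """Return {label: (unscaled PI, CR)} for rows of (label, ASF, NDE, _)."""
    total_asf = sum(asf for _, asf, _, _ in rows)
    total_nde = sum(nde for _, _, nde, _ in rows)
    # The ratio of the column means equals the ratio of the column totals,
    # since both columns have the same number of rows.
    mean_ratio = total_nde / total_asf
    result = {}
    for label, asf, nde, _ in rows:
        pi = (asf / total_asf) * (nde / total_nde)  # unscaled (assumption)
        cr = (nde / asf) / mean_ratio
        result[label] = (pi, cr)
    return result


values = indices(TABLE_8)
for label, asf, nde, published_cr in TABLE_8:
    pi, cr = values[label]
    print(f"{label:20s} PI(unscaled)={pi:.5f}  CR={cr:.2f}  published CR={published_cr}")
```

The sketch makes the contrast between the two measures visible: PI grows with both frequency and variety, whereas CR rewards variety relative to frequency, so a metonymy attested only twice with a single expression still receives the highest CR.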
From the point of view of frequency, agitation (with co-occurrences like tremble, shake, jump, etc.), immobilisation or contraction (paralyze, stiff, shrink, etc.) and falling body temperature (freeze, cold, icy, etc.) are the effects that are mentioned most often. If we also take into account the number of different expressions and calculate the productivity index and creativity ratio as explained in Section 4.1, we can distinguish three main groups:
a) Highly conventional conceptual metonymies (CR is low or very low, PI is high)
These are characterised by a low number of different expressions in relation to their frequency.
Fear causes body temperature to sink: freeze, cold, etc.
Fear shows in the eyes: eyes, look of fear, stare
b) Creatively used conceptual metonymies (CR is medium or high, PI is medium or low)
Fear causes a change of colour: white, tinge, pale, blanch, yellow
Fear causes weakness: cripple, blind with, mute with, weak from, stumble, weak legs, disabling, speechless, debilitating, limp with
c) Rare conceptual metonymies (CR is high or very high, PI is very low)
Fear causes body temperature to rise:16 hot, blaze
Fear causes loss of control over body functions:17 wet oneself

4.3. Conceptual proximity
With respect to the conceptual proximity to other emotion words, fear is most frequently found in combination with other negative emotions (in 85% of cases).18 The most frequent co-occurrences by far are those from the conceptual domain of fear itself (26.4%). These can be grouped further into subdomains:
Strong fear: fright, anguish, horror, dread, terror, panic
Lack of trust: wariness, mistrust, distrust, suspicion
Slight fear: worry, apprehension, concern
Hopelessness: despair, desperation, anxiety
Pathological forms of fear: phobia, paranoia
Other negative emotions include the following:
negative emotions oriented towards others: anger (frustration, rage, disgust); hate (loathing, hatred, hostility); others (greed, envy, jealousy)
insecurity (uncertainty, confusion, doubt)
negative emotions oriented towards oneself (guilt, shame, remorse)
sadness (grief, sorrow, hopelessness)
pain (pain, hurt)
inability to act (apathy, indecision, incapacity)
16. Another way of dealing with these strange examples is suggested by Kövecses (2005: 288–290). He introduces the concept of cognition over embodiment override to explain the apparent contradiction between the overall conceptual metonymy FEAR IS COLD and individual metaphors involving heat as an indicator of fear (like hot or blaze among our examples). This means that highly entrenched generic conceptual metaphors like INTENSITY IS HEAT can be applied even if the metaphor does not fit the embodied notion FEAR IS COLD.
17. The surprisingly low frequency of this metonymy is possibly due to the fact that the corpus was restricted to written genres of a rather formal kind (fiction, magazines, newspapers and academic papers).
18. The complete quantitative data can be found in Appendix 3.
On the other hand, fear is also found in combination with positive emotions, either feelings oriented towards others (love, respect) or feelings that are a reaction to good things in the present or future (hope, excitement, joy). Some of the co-occurring emotion words are ambivalent or neutral (pity, curiosity). By themselves, these results perhaps do not tell us much more than intuition might predict. However, in contrastive studies (interlinguistic or diachronic) this kind of analysis can be important for highlighting differences between languages that are not apparent unless backed by large quantities of corpus data. In a similar study that compares the German and Spanish words for the concept 'pride' (Stolz and orgullo), the analysis showed Stolz to be more often found in the vicinity of the more positively oriented lexical units Selbstachtung ('self-esteem') and Würde ('dignity'), whereas orgullo was closer to vanidad ('vanity'), arrogancia ('arrogance') and soberbia ('haughtiness') (Oster 2010).

4.4. Semantic preference and semantic prosody: Description and evaluation
As happens with conceptual proximity, the results of an analysis of description and evaluation are appreciated best in a contrastive context, which will enable us to point out usage differences between languages. However, even in a monolingual study, looking more closely at the semantic fields of the collocates of fear (semantic preference) and the data relating to evaluative expressions (semantic prosody) allows us to gain interesting insights into how the emotion is predominantly described and evaluated. If we look at the semantic subsets that the co-occurrences belong to from the point of view of syntagmatic relations between the emotion concept and its surroundings in text, we will find information on causes of fear (fear of sth.), on its objects (fear for sth.) and on who experiences it (cf. Appendix 4). On closer inspection, the causes are related to punishment of some kind (retribution, retaliation), physical harm (death, injury), social consequences of one's acts (rejection, ridicule), persons or groups of persons (stranger, police), insecurity or violence (crime, war), natural things (night, heights), risks of modern life (weight gain, contamination), dangers in general (flying, risk), political, social or religious groupings (communism, immigrants) or dangers related to the economy (layoff, inflation). On the other hand, fear is experienced for a close relative (kid, husband) or for one's personal well-being (money, food). The experiencers can be classified into individuals in general (man, parent), members of specific groups (student, investor) or collective bodies (community, public). Examination of the results also reminds us that corpus data always has to be handled with care and that, however big a corpus, it will only represent the reality of the texts it contains. In this case, the section of the COCA corpus
used in the analysis reflects present-day American written language and even highlights some peculiarities of 20th century American culture. For example, there are a striking number of references to the fear of liability, litigation, lawsuits or prosecution. Also, genre and/or subject matter can have an unexpected influence on the results when certain collocations acquire a term-like status and are used recurrently in specific publications. In this case, the rather unexpected frequency of gymnast as experiencer of fear and fear of victimization is almost exclusively due to their respective prolific use in the academic journals Sport Behavior and Adolescence.
With respect to descriptive and evaluative aspects of the emotion concept fear, it is probably not surprising that adjectives and, to a lesser extent, verbs proved to be the most revealing co-occurrences (cf. Appendix 5). From the point of view of description, intensity of the emotion is by far the most important aspect (35.5% of the describing or evaluating items). These expressions refer to the size of the emotion (big, greatest, exaggerated), its being bad or dangerous (worst, terrible, dreadful), its strength (intense, strong, extreme) and, in a few cases, its weakness (faint, certain, slight). As to the quality of the emotion, there is a very large group of words that stress its being pure or real (real, genuine, naked, etc.). Another very frequent aspect is that of the origin of the emotion. In many cases, it is described as something that is not rational (irrational, instinctive, superstitious, etc.) or old (primal, innate, ancestral, etc.). Conversely, in very few cases (0.3%) is it presented as rational. Other descriptive aspects refer to the duration of fear as long (constant, persistent, lifelong, etc.), its extension (pervasive, widespread, common) or its formlessness (vague, shapeless, unarticulated).
As to the analysis of evaluative co-occurrences, it proved useful to classify further the positive and negative evaluations into more differentiated aspects. This shows that the emotion is frequently presented as being either justified (well-founded, justify, healthy) or unjustified (unfounded, unreasonable, unwarranted). Only one (though very strong) collocation with a purely negative evaluation (abject) has been found.19 The most noticeable finding is probably that there is rather strong evidence (12.4% of all descriptive or evaluative expressions) that fear is seen as something shameful (hide, betray, confess). This, again, is more interesting as contrastive data: results from a similar corpus-based study indicate that the same is true for Spanish but much less so for German (Oster 2008, 2010).
19. Kövecses (1998: 142) describes positive/negative evaluation of emotions through the use of certain metaphors. For example, an emotion that is seen as an illness is being conceptualized in a negative way, or happiness is evaluated positively through the metaphors HAPPINESS IS LIGHT / IS FEELING LIGHT / IS UP / IS BEING IN HEAVEN (Kövecses 2008b: 136).
5. Conclusions

Corpus analysis has proved to be a powerful tool in many areas of linguistic research. With respect to the quantification of results, there have probably been more advances in areas like morphology or syntax, which are more easily formalized and quantified. There are also promising conceptual tools for semantic or pragmatic analyses (including collocation and co-occurrence, semantic prosody and semantic preference), but for these the methodological difficulty of finding an adequate balance between semi-automatic and manual analysis has to be overcome. As in every semantic study, a manual analysis, that is, a careful examination and classification of the examples, is necessary, but it is the corpus tools that make it possible to find the relevant stretches of text, process them efficiently, and thus keep the intervention of intelligent analysis at a manageable level. What I have tried to do in this paper is to explore and explain a few of the possibilities, advantages and difficulties of corpus-based analyses of semantic and pragmatic aspects of language in one particular field, namely the linguistic expression of emotion concepts. The methodological procedure that has been proposed combines Kövecses' lexical approach and Stefanowitsch's metaphorical pattern analysis with additional concepts from corpus linguistics such as semantic preference and semantic prosody. In my view, such a corpus-based analysis of emotion words offers two advantages. Firstly, by exploring the surroundings of the search word fear in a truly vast amount of text and analysing the words it tends to co-occur with, we are able not only to find evidence of the conceptual metaphors and metonymies that structure the emotion concept, as well as evidence of related emotion concepts, but also to gain two further benefits.
We can enrich the description of the emotion concept with information about a series of dimensions (intensity, quality, form, origin, duration and extension) and about the semantic subsets the co-occurrences typically belong to. We can also add a pragmatic viewpoint to the analysis by revealing an explicit or implicit evaluation of the emotion. The second advantage offered by a corpus-based methodology lies in the possibility of quantifying results, i.e., comparing the frequency, productivity and creative use of individual metaphors and metonymies. Apart from taking into account the absolute and relative frequency of co-occurrences as well as the number of different expressions, the parameters productivity index and creativity ratio have been introduced as a tool for differentiating between different degrees of conventionality and creative use. Here we have seen that some highly frequent metaphors are expressed through a number of strongly conventionalized lexical items (this is the case of FEAR IS SOMETHING THAT COMES FROM THE OUTSIDE, for example) whereas others, although less frequent in absolute terms, are explored creatively by language users through a
larger number of different linguistic expressions (FEAR IS FIRE or FEAR IS ILLNESS). Being able to quantify the results in such a way is especially important with respect to interlinguistic contrastive studies. Due to intercultural exchange and mutual influence (in literature, philosophy, religion, science and film, for instance), which for many cultures and languages has been going on for thousands of years and is steadily increasing, languages have been continuously incorporating new features. It is highly probable that this is also the case for metaphorical systems. This may partly explain the high degree of coincidence in conceptual metaphors between languages and cultures (especially Western cultures) and even in individual metaphorical expressions. However, if we are able to look at quantitative results alongside qualitative data, this will give us a finer-grained picture and enable us to comprehend the subtler differences.

Received 4 August 2009
Revision received 8 April 2010
Universitat Jaume I
Appendix 1: Conceptual metaphors

A. FEAR IS SOMETHING INSIDE THE BODY
Something that is unspecifically located inside the body: fear in . . . persons, body [800], physical [94], body [63], fear moves through . . . [60], full of fear [45], fear inside [17], fear within [8], harbor [14]
Something that is located in or affects specific body parts: heart [170], stomach [39], blood [38], chest [27], throat [9], visceral [9], mouth [8], spine [7], vein [5], belly [4], muscles [3], neck [3], nerves [2], lungs [2], skin [3]
Something that affects the soul: soul [29]
Something that comes from the outside: fill [114], strike, etc. fear into . . . [about 100], inspire [95], instill [65], engender [22], fear-filled [6], fear-inspiring [5], fear-inducing [5], fear-engendering [2]
Something that is deep inside the body when it is strong: deep [37], deep-seated [16], deepen [6], ingrained [5], profound [5], deep-rooted [4], entrenched [2]
Something that tends to go up in the body when it becomes stronger: rise in/inside s/o [40], swallow [11], force down one's fear [4], gulp back/down [4], escalating [3]
A liquid inside the body: wave of fear [5], trickle of fear [4], fear drains from s/o [2], ebb [4]
Something that emanates from the body and is thus perceptible: smell [45], fear-scent [2]
B i. FEAR IS AN ANTAGONIST
An attacker: grip [60], overwhelm [39], seize [34], clutch [10], escape fear [8], plagued by fear [5], stab [4], fear-gripped [2]
Something that dominates: dominate [15], overtake [14], spur [12], overcome by [10], take hold of [9], fear forces [8], fear-ridden [5], under fear [5], fear-driven [4], constrained by fear [2], inhibited by fear [3], fear takes over [2], succumb to [3], fear overrides [3], compel [3], driven by [2]
A burden: outweigh [13], carry [12], alleviate [12], weight [10], fraught with [4], heavy [4], fear-charged [2], fear-laden [2]
Something that destroys: consume [17], nagging [12], fear eats s/o [12], gnawing [5], all-consuming [4]
Something the self fights back against: overcome [203], conquer one's fear [61], fight back [50], confront [20], override [3], combat [6], fear banished [2]
Something that inflicts pain: suffer [22], twinge [5], numb one's fear [3], tortured by fear [3]
An evil force: haunt [37], possessed by [4], demon [4]
Darkness: dark [12], shadow [7]

B ii. FEAR IS AN AUTONOMOUS FORCE
An unspecific autonomous force: spread [50], fear sweeps [10], uncontrollable [9], powerful [6], recede [5]
A plant: sow [13], deep-rooted [4]
A human being or animal: grow [63], arouse fear [30], lurk [28], engender [22], awaken [13], breed fear [10], born of fear [9], lurking [9], stir [8], fear-arousing [3], beget fear [2], creep [26], wild [6], fear feeds on sth. [5], crawl [3], fear-engendering [2]
A liquid outside the body: wave of fear [20], fear washes over s/o [6], evaporate [5], ebb [4], undercurrent of fear [3]
Fire: spark [14], flash [10], fuel [7], stoke [7], flare [4], burn [4], extinguish [3], fire [3], ignite [2]

C. FEAR IS AN ILLNESS/INSANITY
Illness: develop a fear [30], sick with [10], morbid [10], struck with fear [10], fear-stricken [5], pathological [8], chronic [7], latent [6], infect [5], contagious [4], severe [4], stricken with [2], fear-struck [2], epidemic [3]
Insanity: crazy with fear [9], mad with [5], fear-crazed [4], insane with/from [3]

D. FEAR IS AN OBJECT
A physical object: palpable [17], push away [15]
A piercing object: edge [20], acute [6], sharp [4]
A possession: have [287], lose [150], bring [60], take one's fear [20], get rid of [8]
Food: feed on fear [15], to be fed fear
An obstacle: get over one's fear [10], get past one's fear [7], move, shove, etc. past one's fear [5], break through fear [2]

E. FEAR IS A PLACE/CONTAINER
A place or container: in one's fear [about 1000], out of fear [about 400], verb of movement (sink, plunge, etc.) into fear [6], beyond fear [18], verb of movement (move, push, etc.) toward [3]
Something that surrounds people: push, move, etc. through one's fear [25], amid fear [5], thick in the air [2]
Appendix 2: Conceptual metonymies

Fear causes agitation: tremble [110], shake [82], shiver of fear [39], thrill [22], jump [18], quiver [15], trepidation [14], quake [12], tremor [12], shudder [9], frisson [5], knee-jerk [4], jolt of fear [4], frantic with fear [4], heart leaps [3], jittery [2], frenzied [2], flicker of fear [2], ripple of fear [2], shaky [2]
Fear causes immobilisation or contraction: paralyze [101], cower [19], stiff [12], shrink [11], rigid [8], immobilize [6], cringe [6], petrified [5], taut with [2], heart-stopping [2], immobility [2], tense with [2], curdle [3]
Fear causes the temperature to sink: freeze [76], cold [42], chill [26], icy [5], ice [2]
Fear causes screaming or crying: scream [35], cry [32], tear [31], howl [7], sob [7], whimper [6]
Fear affects the voice: voice [105], shrill [2]
Fear disturbs breathing: take a quick breath [26], breathy voice [2], breathe hard/fast etc. [18], choke [16], gasp [13], lump of fear in chest/stomach/throat [5], knot of fear [3], suffocating [2], breathless [2]
Fear shows in the face:
  It shows in the face: in/on face [180], mouth [20], twisted face [8], contorted [8]
  It shows in the eyes/way of looking: eye [334], look of fear [32], stare [14]
  It causes dilation of the eyes: wide with fear [42], wide-eyed [11], eyes widen [16], dilate [4], wide-eyed [5], fear-widened [2]
Fear causes a change of colour: white with fear [15], tinge [11], pale [7], blanch [6], yellow [2]
Fear causes an unpleasant taste or smell: smell [45], taste [3], sour [2]
Fear causes weakness/incapability: cripple [15], blind with [4], mute with [4], weak from [4], stumble [2], make legs weak [2], disabling [2], speechless [2], debilitating [2], limp with [2]
Fear causes sweating: sweat [21], clammy [6], fear-sweat [4]
Fear causes the temperature to rise: hot [4], blaze [3]
Fear causes a prickling sensation: prickle [9], spine-tingling [2], tingle [7]
Fear causes loss of control over body functions (bladder/bowels): wet oneself [2]
Appendix 3: Conceptual proximity
Negative emotions
fear: anxiety [248], panic [108], terror [73], concern [69], suspicion [60], dread [50], distrust [46], apprehension [45], horror [42], despair [39], mistrust [39], worry [29], desperation [28], paranoia [27], phobia [19], anguish [16], fright [8], wariness [4]
negative emotions oriented towards others:
  anger: anger [380], frustration [67], rage [60], disgust [40], resentment [39], fury [23], bitterness [16], wrath [15], irritability [5]
  hate: loathing [102], hatred [99], hate [66], hostility [25], revulsion [21], antipathy [6], repulsion [5], repugnance [2], detestation [2]
  others: greed [55], envy [32], jealousy [17]
insecurity: uncertainty [82], confusion [104], insecurity [54], doubt [48], helplessness [25], shyness [11], bewilderment [9], self-doubt [6], disorientation [5], timidity [4]
negative emotions oriented towards oneself: guilt [127], shame [95], remorse [9], self-pity [5], self-blame [3]
sadness: sadness [77], grief [61], sorrow [41], hopelessness [11], dejection [2], aloneness [2], homesickness [2]
pain: pain [225], hurt [7]
inability to act: apathy [9], indecision [5], weariness [5], incapacity [2]

Positive emotions
positive emotions oriented towards others: love [97], awe [48], respect [4], reverential [2]
positive feelings as a reaction to good things in the present or future: hope [80], excitement [63], joy [54], pleasure [37], relief [35], euphoria [7], elation [6], exhilaration [6], gladness [3]

Ambivalent or neutral emotions
desire [72], pity [60], passion [23], curiosity [6]
Appendix 4: Semantic subsets of the co-occurrences

Causes (fear of)
Punishment of some kind: retribution [51], retaliation [48], punishment [42], reprisal [70], repression [5], repercussion [13], litigation [21], persecution [38], prosecution [32], lawsuit [25], intervention [15], liability [19], damnation [7], deportation [15], recrimination [3], annihilation [3], lawsuit [23], discovery [21], arrest [18], entrapment [5]
Physical harm: death [268], injury [61], harm [53], disease [23], cancer [37], doctor [3], pregnancy [8], illness [15], castration [11], contagion [9], sterility [5], insanity [3], blindness [5], germ [5], epidemic [4], flu [4], needle [12], AIDS [33]
Social consequences of one's acts: rejection [59], victimization [68], humiliation [9], isolation [2], abandonment [24], exposure [18], censure [6], stigma [13], ridicule [13], censorship [6], discrimination [14], scandal [10], exclusion [4]
A person or group of persons: mother [3], man [25], stranger [18], husband [3], enemy [6], name [8], teacher [4], blacks [4], foreigner [8], thief [3], witch [4], police [9]
Insecurity or violence: crime [194], violence [122], war [102], abuse [7], conflict [10], gun [5], anarchy [7], unrest [4], terrorism [30], harassment [8], aggression [5], casualty [8], assassination [4], gang [4], chaos [18]
Natural things: night [14], heights [68], darkness [6], nature [5], snake [24], dark [31], dog [25], fire [16], spider [12]
Unavoidable dangers: future [55], loss [102]
Risks of modern life: weight gain [22], food [10], technology [18], success [37], contamination [10]
Dangers: danger [39], risk [9], drowning [10], flying [72], airplane [4]
Personal inadequacies: mistake [18], failure [240]
Political, social or religious groupings: communism [19], immigrant [12], fundamentalism [5], Islam [5]
Dangers related to the economy: layoff [8], unemployment [12], inflation [8]

Experiencer
Individuals in general: man [90], child [80], woman [85], parent [55], mother [6], father [21], kid [15], boy [32], girl [29], victim [18]
Member of a specific group: American [40], officials [30], student [30], blacks [2], voter [12], gymnast [10], opponent [8], prisoner [5], teacher [5], politician [8], worker [15], leader [15], soldier [18], investor [15], patient [15]
Groups: community [25], public [32], population [22], nation [21], family [21]

Object (fear for)
A close relative: kid [5], husband [5], son [8], baby [17], wife [3]
Personal well-being: money [13], food [2], safety [46], health [9]
Appendix 5: Descriptive and evaluative aspects

Description
Intensity:
- big: big [151], greatest [127], great [112], exaggerated [13], inordinate [12], enormous [6], excessive [6], ultimate [6], absolute [5], overblown [4], huge [4], outsized [2], preternatural [2]
- bad/dangerous: worst [88], terrible [26], tremendous [15], horrible [11], dreadful [10], awful [6], dark [6], bad [5]
- strong: intense [29], strong [29], extreme [19], stark [6]
- weak: certain [14], faint [10], slight [8], small [4], quiet [4], gentle [2], mild [2]
Quality:
- pure: real [108], genuine [22], pure [14], naked [8], raw [8], outright [5], sheer [7], utter [5], plain [3], plain [2]
Form:
- vague: vague [6], formless [4], shapeless [2], dull [4], unarticulated [2]
Origin:
- not rational: irrational [58], instinctive [12], superstitious [9], unreasoning [7], subconscious [5], instinctual [5]
- very old/innate: primal [41], innate [10], primitive [8], atavistic [4], ancestral [4], inborn [3]
- rational: rational [5]
Duration:
- long: constant [94], always [51], persistent [19], ever-present [11], lifelong [10], abiding [9], perpetual [8], long-standing [5], eternal [3]
Extension:
- widespread: pervasive [34], widespread [36], common [28], generalized [13], spreading [8]

Evaluation
- positive: justified: well-founded [26], justify [19], healthy [18], understandable [9], realistic [9], justified [8], legitimate [2]
- negative: abject [15]
- negative: unjustified: unfounded [19], unreasonable [9], unwarranted [8], inexplicable [6], unrealistic [6], childish [4], ungrounded [4]
- negative: shameful: hide [61], acknowledge [22], betray [22], confess [18], suppressed [15], admit [15], secret [12], conceal [11], mask [11], unspoken [10], cover [10], hidden [4], bespeak [2]
References
Bednarek, Monika. 2008a. Emotion Talk across Corpora. Houndmills: Palgrave Macmillan.
Bednarek, Monika. 2008b. Semantic preference and semantic prosody re-examined. Corpus Linguistics and Linguistic Theory. 4(2), 119–139.
Channell, Joanna. 2000. Corpus-Based Analysis of Evaluative Lexis. In Hunston, Susan and Geoff Thompson (eds.), Evaluation in Text. Authorial Stance and the Construction of Discourse, 38–55. Oxford: Oxford University Press.
Charteris-Black, Jonathan. 2004. Corpus Approaches to Critical Metaphor Analysis. Houndmills: Palgrave Macmillan.
Davies, Mark. 2008. The Corpus of Contemporary American English (COCA). Available online at: http://www.americancorpus.org/.
Deignan, Alice. 2005. Metaphor and Corpus Linguistics. Amsterdam: John Benjamins.
Deignan, Alice. 1999. Corpus-based research into metaphor. In Cameron, Lynne and Graham Low (eds.), Researching and Applying Metaphor, 177–199. Cambridge: Cambridge University Press.
Fussell, Susan R. (ed.). 2002. The Verbal Communication of Emotions: Interdisciplinary Perspectives. Mahwah, New Jersey: Lawrence Erlbaum.
Fussell, Susan R. and Mallie M. Moss. 1998. Figurative language in emotional communication. In Fussell, Susan R. and Roger J. Kreuz (eds.), Social and Cognitive Aspects of Interpersonal Communication, 113–143. Mahwah, New Jersey: Lawrence Erlbaum.
Gevaert, Caroline. 2001. Anger in Old and Middle English: A Hot Topic? Belgian Essays on Language and Literature. 89–101.
Gevaert, Caroline. 2005. The ANGER IS HEAT Question: Detecting Cultural Influence on the Conceptualization of Anger through Diachronic Corpus Analysis. In Delbecque, Nicole, Johan van der Auwera and Dirk Geeraerts (eds.), Perspectives on Variation: Sociolinguistic, Historical, Comparative, 195–208. Berlin and New York: Mouton de Gruyter.
Hoey, Michael. 2005. Lexical Priming. A new theory of words and language. London and New York: Routledge.
Hunston, Susan. 2007. Semantic prosody revisited. International Journal of Corpus Linguistics. 12(2), 249–268.
Koller, Veronika. 2006. Of critical importance: Using electronic text corpora to study metaphor in business media discourse. In Stefanowitsch, Anatol and Stefan Th. Gries (eds.), Corpus-Based Approaches to Metaphor and Metonymy, 237–266. Berlin and New York: Mouton de Gruyter.
Koller, Veronika. 2008. Brothers in arms. Contradictory metaphors in contemporary marketing discourse. In Cameron, Lynne, Mara Sophia Zanotto and Marilda C. Cavalcanti (eds.), Confronting Metaphor in Use: an applied linguistic approach, 103–126. Amsterdam and Philadelphia: John Benjamins.
Kövecses, Zoltán. 1986. Metaphors of Anger, Pride, and Love: A Lexical Approach to the Structure of Concepts. Amsterdam: John Benjamins.
Kövecses, Zoltán. 1990. Emotion Concepts. New York: Springer-Verlag.
Kövecses, Zoltán. 1998. Are there any emotion-specific metaphors? In Athanasiadou, Angeliki and Elzbieta Tabakowska (eds.), Speaking of Emotions. Conceptualization and Expression, 127–151. Berlin and New York: Mouton de Gruyter.
Kövecses, Zoltán. 2000. Metaphor and Emotion: Language, Culture, and Body in Human Feeling. Cambridge: Cambridge University Press.
Kövecses, Zoltán. 2005. Metaphor in Culture. Universality and Variation. Cambridge: Cambridge University Press.
Kövecses, Zoltán. 2008a. Conceptual metaphor theory: Some criticisms and alternative proposals. Annual Review of Cognitive Linguistics. 6, 168–184.
Kövecses, Zoltán. 2008b. The Conceptual Structure of Happiness. In Tissari, Heli, Anne Birgitta Pessi and Mikko Salmela (eds.), Happiness: Cognition, Experience, Language, 131–143. Helsinki: Helsinki Collegium for Advanced Studies.
Louw, Bill. 1993. Irony in the Text or Insincerity in the Writer? The Diagnostic Potential of Semantic Prosodies. In Baker, Mona, Gill Francis and Elena Tognini-Bonelli (eds.), Text and Technology: In Honour of John Sinclair, 240–251. Amsterdam and Philadelphia: Benjamins.
Morley, John and Alan Partington. 2009. A few Frequently Asked Questions about semantic – or evaluative – prosody. International Journal of Corpus Linguistics. 14(2), 139–158.
Munday, Jeremy. in press. Looming large: A cross-linguistic analysis of semantic prosodies in comparable reference corpora. In Kruger, Alet and Kim Walmach (eds.), Corpus-Based Translation Studies. Manchester: St. Jerome.
Oster, Ulrike. 2008. Angst and fear in contrast: A corpus-based analysis of emotion concepts. Paper presented at the International Conference on Cognitive Linguistics between Universality and Variation, Dubrovnik, 30 September–1 October.
Oster, Ulrike. 2010. Metáforas conceptuales y emociones: El análisis de corpus como herramienta de la enseñanza de la traducción. In Emsel, Martina and Annette Endruschat (eds.), La metáfora en la traducción, 153–172. München: Martin Meidenbauer.
Pragglejaz Group. 2007. MIP: A method for identifying metaphorically used words in discourse. Metaphor and Symbol. 22(1), 1–39.
Schwarz-Friesel, Monika. 2007. Sprache und Emotion. Tübingen and Basel: A. Francke.
Sinclair, John. 1987. Collocation: a progress report. In Steele, Ross and Terry Threadgold (eds.), Language Topics. Essays in Honour of Michael Halliday, 319–332. Amsterdam and Philadelphia: Benjamins.
Sinclair, John. 1996. The Search for Units of Meaning. TEXTUS. IX(1), 75–106.
Stefanowitsch, Anatol. 2005. The function of metaphor: Developing a corpus-based perspective. International Journal of Corpus Linguistics. 10(2), 161–198.
Stefanowitsch, Anatol and Stefan Thomas Gries (eds.). 2006. Corpus-based Approaches to Metaphor and Metonymy. Berlin: Mouton de Gruyter.
Stefanowitsch, Anatol. 2006. Words and their metaphors: A corpus-based approach. In Stefanowitsch, Anatol and Stefan Thomas Gries (eds.), Corpus-Based Approaches to Metaphor and Metonymy, 63–105. Berlin and New York: Mouton de Gruyter.
Stewart, Dominic. 2009. Safeguarding the lexicogrammatical environment: Translating semantic prosody. In Beeby, Allison, Patricia Inés Rodríguez and Pilar Sánchez-Gijón (eds.), Corpus Use and Translating: Corpus use for learning to translate and learning corpus use to translate, 29–46. Amsterdam: Benjamins.
Stubbs, Michael. 1995. Collocations and semantic profiles. On the cause of the trouble with quantitative studies. Functions of Language. 2(1), 23–55.
Stubbs, Michael. 2001. Words and Phrases. Corpus Studies of Lexical Semantics. Oxford: Blackwell.
Stubbs, Michael. 2002. Words and Phrases. Corpus Studies of Lexical Semantics. Oxford: Blackwell.
Tognini-Bonelli, Elena. 2001. Corpus linguistics at work. Amsterdam: John Benjamins.
Whitsitt, Sam. 2005. A critique of the concept of semantic prosody. International Journal of Corpus Linguistics. 10, 283–305.
Wierzbicka, Anna. 1990. The semantics of emotion: fear and its relatives in English. Australian Journal of Linguistics (Special issue on the semantics of emotions). 10(2), 359–375.
Wierzbicka, Anna. 1992a. Defining emotion concepts. Cognitive Science. 16, 539–581.
Wierzbicka, Anna. 1992b. Semantics, Culture and Cognition: Universal human concepts in culture-specific configurations. New York: Oxford University Press.
Wierzbicka, Anna. 1999. Emotions across Languages and Cultures. Cambridge: Cambridge University Press.
Wilce, James M. 2009. Language and Emotion. Cambridge: Cambridge University Press.
Xiao, Richard and Tony McEnery. 2006. Collocation, Semantic Prosody and Near Synonymy: A Cross-Linguistic Perspective. Applied Linguistics. 27(1), 103–129.
Metaphor in usage
GERARD J. STEEN*, ALETTA G. DORST, J. BERENIKE HERRMANN, ANNA A. KAAL and TINA KRENNMAYR
Abstract This paper examines patterns of metaphor in usage. Four samples of text excerpts of on average 47,000 words each were taken from the British National Corpus and annotated for metaphor. The linguistic metaphor data were collected by five analysts on the basis of a highly explicit identification procedure that is a variant of the approach developed by the Pragglejaz Group (2007). Part of this paper is a report of the protocol and the reliability of the procedure. Data analysis shows that, on average, one in every seven and a half lexical units in the corpus is related to metaphor defined as a potential cross-domain mapping in conceptual structure. It also appears that the bulk of the expression of metaphor in discourse consists of non-signalled metaphorically used words, not similes. The distribution of metaphor-related words, finally, turns out to be quite variable between the four registers examined in this study: academic texts have 18.5%, news 16.4%, fiction 11.7%, and conversation 7.7%. The systematic comparative investigation of these registers raises new questions about the relation between cognitive linguistic and other approaches to metaphor. Keywords: BNC-Baby, metaphor, metaphor identification, text annotation, register, simile
* Address for correspondence: Gerard Steen, Department of Language and Communication, Faculty of Arts, VU University Amsterdam, De Boelelaan 1105, 1081 HV Amsterdam, Netherlands. T: ++31-20-5986433. F: ++31-20-5986500. E-mail: Gj.steen@let.vu.nl. Acknowledgements: All authors gratefully acknowledge the financial support of this research by NWO, the Netherlands Organization for Scientific Research, Vici grant 277-30-001, "Metaphor in discourse: Linguistic forms, conceptual structures, cognitive representations". We are also extremely grateful for the comments by two anonymous referees and the editors of this journal on a previous version of this article; we hope that the present revision removes the most important weaknesses. Finally, many thanks to Alan Cienki, for running a final check on the English and the text as a whole. Cognitive Linguistics 21–4 (2010), 765–796 DOI 10.1515/COGL.2010.024 0936–5907/10/0021–0765 © Walter de Gruyter
1. Introduction
1.1. Basic questions about metaphor in cognitive linguistics
The cognitive linguistic approach to metaphor launched by Lakoff and Johnson (1980; cf. 1999) has not only been essential for the development of cognitive linguistics as a school of linguistics itself but has also affected many other disciplines concerned with the study of metaphor, including philosophy, poetics, psycholinguistics and psychology, discourse analysis and communication studies, and anthropology (e.g., Gibbs 2008). Attention to metaphor in discourse is crucial for the realization of the cognitive linguistic objective of constructing a usage-based grammar, in which lexical elements and other lexicogrammatical constructions may be motivated by metaphor in thought, or cognition. Yet this research programme is also controversial, and fundamental questions have been raised about central cognitive-linguistic tenets about metaphor in usage, for instance by Steen (1994), Chilton (1996), Cameron and Low (1999), Eubanks (2000), Cameron (2003), Charteris-Black (2004), Koller (2004), Musolff (2004), Deignan (2005), Caballero (2006), Stefanowitsch and Gries (2006), Steen (2007), Cienki and Müller (2008), Müller (2008), Semino (2008), and Musolff and Zinken (2009), to mention just the most familiar book publications. Three issues appear to stand out.
One question concerns the relationship between the on-going psychological processes and their products, on the one hand, and the linguistic forms and conceptual structures of metaphor analyzed as signs or symbols, on the other. The problem here is that what is analyzed in cognitive linguistics as metaphorical in the linguistic and conceptual structures of discourse does not have to be a one-on-one reflection of the psychological processes of human verbal and cognitive behaviour in discourse (e.g., Cameron and Low 1999; Charteris-Black 2004; Gibbs 2006; Steen 2007; Müller 2008).
Metaphor in the structure of language does not always have to be directly driven by metaphor in the on-line processes of thought. Moreover, these processes only pertain to the short-term discourse processes of production, reception and interaction that characterize performance; how the relationship between metaphor and these discourse processes is connected with long-term psycholinguistic processes such as language acquisition, maintenance and attrition is an even more complex issue. The entire question impacts on the way in which cognitive-linguistic and psycholinguistic approaches to metaphor can be seen as identical or distinct; various psycholinguists and psychologists have been fairly vociferous in dissociating themselves from the cognitive-linguistic position about metaphor as thought; for some examples and references, see Steen (2007) and Gibbs (2008).
Another issue has to do with the social and cultural variation of metaphor in usage. What is metaphorical to some language users does not have to be metaphorical to other language users (e.g., Shore 1996; Cameron 2003; Kövecses 2005; Steen 2007). The cognitive linguistic idea is that our thought and language are metaphorical in roughly the same ways for everyone because of a number of constant parameters in human experience; this is an important and exciting proposition, but it also is a gross idealization when it comes to observing variation in usage. Even if linguistic forms may be the same across a number of contexts of usage, this does not mean that they necessarily function in the same way to the concrete individual participants in these various usage events. This issue affects the way in which cognitive linguistic approaches can be said to be the same as sociolinguistic, discourse-analytical and cultural-linguistic ones; here, too, it should be pointed out that various scholars from the other camps, who are mostly known as applied linguists, have again dissociated themselves from a uniform role of metaphor between groups of language users.
A third question concerns the precise relationship between the linguistic forms and conceptual structures of metaphor themselves. The problem here is the adequate and accurate identification and demarcation of conceptual metaphors, or even less systematic cross-domain mappings, in relation to the concrete linguistic expressions in situated events or documents (e.g., Cameron 2003; Ritchie 2004; Steen 2007). Fundamental problems have been pointed out with respect to distinguishing between competing variants or alternatives of conceptual metaphors. Moreover, their nature as complex or primary metaphors is unclear. Furthermore, the way they can or must play a role in concrete instantiations of meaning in discourse is open to debate. This third issue therefore has an impact on the general linguistic validity of cognitive linguistic proposals for metaphor and the question as to whether they are acceptable to the linguistic community at large.
Critical perspectives from other schools in linguistics have been formulated by, for instance, Wierzbicka (1986), Jackendoff and Aaron (1991) and Sperber and Wilson (2008). There are hence fundamental theoretical and empirical questions about the cognitive-linguistic approach to metaphor. This is not surprising, for the cognitive linguistic approach has presented a boldly new and far-reaching theory, which has even been advertised as the contemporary theory of metaphor (Lakoff 1993). The non-converted and sceptical have hence been provoked to raise critical questions about the cognitive linguistic position on the one hand and psycholinguistic, sociolinguistic, and general linguistic approaches to metaphor on the other. This has only deepened our appreciation of the complexities and richness of metaphor in usage. The present contribution aims to address some of these issues by adopting a novel methodological perspective which has been applied to a large set of materials. It is inspired by cognitive linguistics, but does not follow all of its current practices. Our alternative is inspired by aspects of discourse analysis,
as is apt for the study of metaphor in usage. This is not incompatible with the cognitive linguistic approach to metaphor and presents an alternative for consideration. The empirical findings of our research, moreover, are novel and can be interpreted within the three-dimensional framework for metaphor in usage developed in Steen (2008). They offer a new, usage-based perspective for the debate about metaphor in cognitive linguistics, and raise new questions for future collaboration between cognitive linguists and other linguists, psycholinguists, and sociolinguists.
1.2. A methodological answer
One practical approach to addressing the critical issues surrounding metaphor in cognitive linguistics is via methodology, in particular, of the analysis of metaphor in the documents and transcripts of concrete events of usage. When the object of study is metaphor in discourse, the problem arises how metaphor can be identified reliably and validly in the linguistic forms and conceptual structures of large sets of verbal data, since discourse analysis by definition sets out from a given set of language data. One primary question for most cognitive linguists therefore is how they can get from these language data to the underlying conceptual mappings in such a way that their findings form a solid basis for the study of metaphor in discourse (e.g., Steen 1999, 2007, 2009; Semino et al. 2004). When this question is answered by carefully attending to the methodology of analysis, precise distinctions and relations emerge between various aspects of metaphor research as described by Steen (2007); this also throws into relief some of the issues above about metaphor raised by psycholinguists, sociolinguists, and other linguists. Two major types of answers to the question of metaphor identification have been proposed in the literature (Steen 2007). Firstly, metaphor identification in discourse can be done deductively, which means that a set of conceptual metaphors is assumed by the analyst and used for the detection of related linguistic expressions of these metaphors in a set of materials (e.g., Koller 2004). But, secondly, metaphor may also be identified inductively, moving from the available linguistic structures towards a set of reconstructed conceptual structures that constitute cross-domain mappings; since this approach is inductive and not deductive, the resulting cross-domain mappings are not necessarily identical with the conceptual metaphors proposed in cognitive linguistics (e.g., Cameron 2003). 
If it is the aim of the analyst to describe all metaphor in discourse, as opposed to examining a specially selected set of conceptual metaphors and their expressions in a specific set of discourse data, then the deductive method is in greater difficulty since it does not have an established adequate list of well-defined conceptual metaphors that is exhaustive: George Lakoff and his associates worked on a Master Metaphor List in the 1990s, but the
project appears to have been abandoned. A top-down approach from conceptual metaphor to linguistic expression may as a result miss many metaphors in discourse. When an inductive approach is followed, this does not mean that all we know about conceptual metaphors should be ignored, for that would be throwing out the baby with the bath water. What it does mean is that we need an explicit, systematic, and reliable tool for finding linguistic expressions that may be related to metaphor in conceptual structure, and that this tool should at least lead to the inclusion of the obvious cases which have been so successfully revealed by the deductive approach that is characteristic of the cognitive linguistic approach to metaphor.
Such a reliable inductive tool has been advanced by the Pragglejaz Group (2007), of which the first author of the present paper was the founding coordinator. The Pragglejaz Group was a group of ten metaphor researchers, namely Peter Crisp, Ray Gibbs, Alan Cienki, Graham Low, Gerard Steen, Lynne Cameron, Elena Semino, Joe Grady, Alice Deignan, and Zoltán Kövecses. They developed a tool called MIP, standing for Metaphor Identification Procedure, which consists of a brief set of instructions for the discourse analyst who aims to find metaphorically used words in a stretch of discourse (Pragglejaz Group 2007: 3):
1. Read the entire text/discourse to establish a general understanding of the meaning.
2. Determine the lexical units in the text/discourse
3a. For each lexical unit in the text, establish its meaning in context, i.e., how it applies to an entity, relation or attribute in the situation evoked by the text (contextual meaning). Take into account what comes before and after the lexical unit.
3b. For each lexical unit, determine if it has a more basic contemporary meaning in other contexts than the one in the given context. For our purposes, basic meanings tend to be:
    - more concrete; what they evoke is easier to imagine, see, hear, feel, smell, and taste.
    - related to bodily action.
    - more precise (as opposed to vague).
    - historically older.
    Basic meanings are not necessarily the most frequent meanings of the lexical unit.
3c. If the lexical unit has a more basic current/contemporary meaning in other contexts than the given context, decide whether the contextual meaning contrasts with the basic meaning but can be understood in comparison with it.
4. If yes, mark the lexical unit as metaphorical.
This set of instructions was developed and tested over five years. It now produces fairly reliable results between sets of individual analysts consisting of as many as six scholars at the same time, who display relatively high levels of agreement between their independent analyses of texts (Pragglejaz Group 2007).
According to the Pragglejaz Group, metaphorical meaning in usage is defined as indirect word meaning and arises out of a contrast between the contextual meaning of a lexical unit and its more basic meaning, the latter being absent from the actual context but observable in others. For instance, when a lexical unit like "attack" or "defend" is used in a context of argumentation, its contextual meaning has to do with verbal exchange. However, this is an indirect meaning, in the sense of Lakoff (1986, 1993) and Gibbs (1993, 1994), because it can be contrasted with the more basic meaning of these words in other contexts, which involves physical engagement or even war between people. Since the basic meaning can afford a relation with the contextual meaning on the grounds of some form of nonliteral comparison, all uses of "defend" and "attack" in contexts of argumentation can be analyzed as metaphorical. This procedure therefore provides an operational way of finding all conventional metaphor in actual usage (technical details about the notion of lexical unit will be discussed later in this article).
Novel metaphorical usage is accommodated as follows. When the linguistic form "wipe out" is used in the context of argumentation, as in Lakoff and Johnson's example "If you use that strategy, he'll wipe you out", its contextual sense is clear. However, that contextual sense, having to do with argumentation, has not become highly conventionalized. For instance, it has not ended up in the Macmillan English Dictionary for Advanced Learners (Rundell 2002).
Yet MIP does not have a problem with this: the ad hoc or situation-specific contextual sense that may be constructed for "wipe out" may simply be contrasted with and compared to the basic sense of wiping out, which has to do with cleaning. As a result, "wipe out" is also identified as metaphorical language use.
By contrast, historical metaphor is not identified as metaphorical by MIP. For instance, the words "fervent" and "ardent" used to have two senses, one for temperatures and one for emotions. This may, for instance, be gathered from the Concise Oxford Dictionary published in 1974 (McIntosh 1974). However, in contemporary British English both terms have lost their original temperature sense: in the Macmillan dictionary, for instance, they only have their present-day emotion senses. Hence expressions like "ardent lover" are not metaphorical when analyzed by MIP, because there is no contrast between the contextually appropriate emotion sense and the historically older and more basic temperature sense, simply because the latter is no longer available to the typical contemporary language user (Deignan 2005).
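As a rough illustration of how steps 3b to 4 of MIP separate the conventional, novel and historical cases discussed above, the decision can be sketched in Python. This is a minimal sketch under stated assumptions and not part of MIP itself: the `Sense` record, the `more_basic` and `comparable` flags, and all glosses are hypothetical stand-ins for judgements that analysts make with a dictionary, and contrast is crudely approximated by inequality of glosses.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Sense:
    gloss: str                  # hypothetical paraphrase of a sense
    contemporary: bool = True   # still listed in a present-day dictionary?
    more_basic: bool = False    # analyst judgement (more concrete, bodily, precise, older)


def basic_sense(other_senses: List[Sense]) -> Optional[Sense]:
    """Step 3b: return a more basic sense that is still contemporary, if any."""
    for sense in other_senses:
        if sense.more_basic and sense.contemporary:
            return sense
    return None


def is_metaphorical(contextual: Sense, other_senses: List[Sense], comparable: bool) -> bool:
    """Steps 3c and 4: the contextual sense must contrast with the basic
    sense and be understandable in comparison with it."""
    basic = basic_sense(other_senses)
    if basic is None:
        return False  # no basic contemporary sense, so no contrast is possible
    contrasts = basic.gloss != contextual.gloss  # crude proxy, an assumption
    return contrasts and comparable


results = {
    # conventional: contextual sense is in the dictionary
    "defend": is_metaphorical(
        Sense("support a claim in an argument"),
        [Sense("protect against physical attack", more_basic=True)],
        comparable=True,
    ),
    # novel: contextual sense is constructed ad hoc by the analyst
    "wipe out": is_metaphorical(
        Sense("defeat someone completely in an argument"),
        [Sense("clean something by wiping", more_basic=True)],
        comparable=True,
    ),
    # historical: the temperature sense is no longer contemporary
    "ardent": is_metaphorical(
        Sense("feeling a particular emotion very strongly"),
        [Sense("hot, burning", contemporary=False, more_basic=True)],
        comparable=True,
    ),
}
print(results)
```

In this sketch "defend" and "wipe out" come out as metaphorical and "ardent" does not, mirroring the three cases in the text; the real work lies in the dictionary-based judgements that the flags merely record.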
When we take a closer look at the rationale of the Pragglejaz Group's procedure, we soon find important connections with the fundamental issues about metaphor research evoked above. Thus, "metaphor" is always a relational term, and short for "metaphorical to some language user". In the research to be reported below, we have adopted the position that our language user is the idealized native speaker of English as represented in the description of English by the dictionary of a particular period. This facilitates our application of MIP. It simultaneously makes explicit our position towards one of the three abovementioned issues, the sociolinguistic question of "metaphorical to whom?" Idealized native speakers, for instance, are abstractions glossing over a good deal of variation on a number of parameters among real native speakers of the same language.
Another important advantage of MIP is its independence from conceptual analysis: linguistic forms are identified as related to metaphor on the basis of shallow lexical-semantic analysis that only involves distinct and comparable meanings. Findings may subsequently be analyzed for their possible relations to one or more cross-domain mappings in conceptual structure, but this is not required for the identification of metaphor in the language data. Such an approach goes against received practice in cognitive linguistics, but not in other schools of linguistics, as referenced above. It is indeed precisely one aim of the development of MIP to make available reliable, generally acceptable linguistic analyses of metaphor in usage for subsequent exploration of related conceptual structures, so that cognitive linguistic approaches to metaphor in thought can be pursued with more methodological rigour.
Our adoption of MIP hence also clarifies our position towards another of the above-mentioned three fundamental issues, pertaining to the relation between cognitive linguistic and other linguistic approaches to metaphor at the levels of linguistic forms and conceptual structure. A third important assumption is our restriction to identifying metaphor in language at the level of semiotic analysis. We study the linguistic forms of metaphor at the level of the sign system and its manifestation in meaningful expressions and defer examining their assumed relation to conceptual structures, which can also be studied in either semiotic or behavioural ways (Steen 2007). This means that we do not make detailed assumptions about the processes that may be associated with these expressions, and we certainly do not claim that we can describe the details of these processes on the basis of the formal and semantic analysis that we present here. Evidence about the processing side of metaphor in usage can only be gathered on the basis of behavioural data, which involves the observation of people doing things with language in real time. We are not concerned with that type of analysis in this paper. Again, we only wish to present an analysis of metaphor in usage from a semiotic perspective so that independent research on metaphor in behaviour can make use of our insights.
This paper hence presents the findings of a large-scale application of an extended and refined version of MIP, called MIPVU, about which more in the methods section (cf. Steen, Dorst, et al., 2010). It is a report of a first analysis of four samples of discourse from BNC-Baby, a sample from the British National Corpus containing four registers: academic discourse, conversation, fiction and news. Our sample totals some 190,000 words, which were annotated at the level of lexical units for their relation to metaphor. This may contribute to a better view of the role of the linguistic form of metaphor in discourse, and in a usage-based grammar. It moreover has interesting implications for the relation between cognitive linguistic approaches to metaphor on the one hand and psycholinguistic, sociolinguistic and generally linguistic approaches to metaphor on the other.
1.3. Goals
Our theoretical framework makes a distinction between three more specific research goals. These are primarily related to linguistic, psycholinguistic, and sociolinguistic aspects of the cognitive linguistic approach to metaphor in usage. (1) First of all, we will show that it is possible and informative to collect metaphor data at a linguistic level alone. We have deliberately left aside the question of how these linguistic metaphors are related to which conceptual metaphors in conceptual structure, if any. Indeed, our linguistic findings are precisely meant to provide a good starting point for further cognitive linguistic analysis of the conceptual structures of the words related to metaphor. They can also lead to better-motivated research on their cognitive processing during comprehension and understanding. (2) A second objective is to gain insight into the competition between some of the linguistic and rhetorical forms of metaphor. In particular, we will look at the linguistic expression of metaphor as either metaphor proper or as simile (cf. Gentner and Bowdle 2001, 2008; Bowdle and Gentner 2005; Glucksberg and Haught 2006; Glucksberg 2008). Since this opposition between distinct figures has become crucial for the debate about competing psycholinguistic models of metaphor, it will be informative to have a more solid view of the frequency and importance of the most familiar types of these figures in actual usage. (3) Our third major objective in this study has to do with the relation between metaphor and register. The materials were sampled from the four-million word BNC-Baby, taken from the 100-million word British National Corpus. BNC-Baby was chosen because it offers a set of language materials that are parallel with the phenomena described in the university grammar of spoken and written English produced by Douglas Biber and colleagues
(Biber et al. 1999). Our alignment with this research facilitates the description of metaphor in four specific registers that have been well studied from a lexico-grammatical point of view. The project is hence in tune with the new cognitive linguistic interest in sociolinguistics and language variation (e.g., Geeraerts 2005; Kristiansen and Dirven 2008).
In sum, we aim to identify and analyze the various forms of lexical metaphor in natural discourse on a relatively large and systematic scale. It is our intention to interpret these findings against the background of sociolinguistic variation between registers and some issues that are central to current psycholinguistic models of metaphor processing. The purpose of this research is to add to the cognitive linguistic description and explanation of metaphor in usage. We do this by addressing the question of which forms of metaphor are used in which ways in which registers.
2. Method
The Pragglejaz Group have presented a list of methodological items they recommend should be reported in any study involving the identification of metaphor in discourse (2007: 14). Some are not applicable here, such as the question whether contemporary meanings were retained in the case of historical texts: our texts are from the latter part of the twentieth century. Also, none of our texts counts as an allegory. Another question asks whether we used text-external indications by the author during the analysis; we did not. An iterative procedure in which higher-level units such as metaphorical idioms were coded after words were done was not applied either. Yet we did follow another kind of iterative procedure, in which independent analyses were checked and discussed by all members of the research team (see protocol below). Not applicable either was the question about transcription decisions for oral (or dialectal) data, since all of these have been made for us in the text files of the British National Corpus.
The remaining items in the Pragglejaz Group list are treated at more length in the following subsections. For full details about our procedure and protocol, we refer to our book publication (Steen et al. 2010).
2.1. Materials
All files were taken from BNC-Baby. The original plan of the project was to annotate ten percent of each file in this corpus. However, because the number of lexical units related to metaphor was higher than expected, and because our protocol was more time-consuming than planned, we managed to analyze excerpts from only half of the files. Details of the corpus can be found in the Appendix.
Selected fragments were randomly taken from the beginnings, middles, and ends of the complete BNC-Baby files. The selection of the files was prepared by splitting up all files into separate fragments defined by the highest division into sections (such as chapter sections in fiction and academic writing, or separate newspaper articles in news) in the texts. A small number of files were discarded because their content was too difficult: it is impossible to identify metaphorical lexical units if the contextual meaning of too many stretches of discourse is unclear to the analysts. Other files were discarded because they were too short and therefore too deviant from the average length of the excerpts. Even though these criteria were clear from the start, they were applied intuitively, causing a lack of complete consistency; however, we have no reason to believe that this has had great effects on the findings.
2.2. Technique
Procedure
An explicit set of instructions was developed at the beginning of the research. The starting point of this set was provided by MIP, the Metaphor Identification Procedure published by the Pragglejaz Group (2007). The main additions and alterations to MIP involve the following two features:
1. the detailed explication of many aspects of the decision-making process regarding lexical units and the identification of metaphorically used lexical units;
2. the addition of new sections on other forms of metaphor (direct and implicit metaphor), novel compounds, and signals for metaphor (MFlags).
Our variant of MIP is called MIPVU, with VU being the abbreviation of Vrije Universiteit Amsterdam, the university at which our work was carried out. MIPVU comprises a brief manual of some 16 pages. It has been described and demonstrated in detail in Steen et al. (2010). The most important issues are summarized below.
Annotation of words related to metaphor
On the basis of the procedure, all words were examined and when relevant annotated for their relation to metaphor. This terminology is employed to suggest that a lexical unit can be related to a metaphorical idea in conceptual structure in three different ways, producing three types of metaphors in usage:
1. it can be used metaphorically itself, that is, indirectly ("He defends his claims well");
2. it can be the direct expression of a conceptual domain that functions as a source domain in a cross-domain mapping that is explicitly expressed as some form of comparison ("And he wings up high, like an eagle", said of a bicycle racer who has escaped from the pack and races up a steep mountain);
3. or it can be an implicit expression of a metaphorically used source domain, as in "Naturally, to embark on such a step is not necessarily to succeed immediately in realizing it" (where "it" substitutes the metaphor-related antecedent "step" and thus is used metaphorically in an implicit way); implicit metaphor is always based on substitution or ellipsis.
When metaphors are expressed directly, as in (2), they are also often signalled by some lexical flag, such as "like" in the example in (2), which we code as MFlag (for Metaphor Flag). Such direct and signalled metaphors are often, but not always, similes (cf. Goatly 1997, for extensive discussion).
One preliminary issue in our procedure for metaphor identification concerns the delimitation of lexical units. Although most lexical units are single words, there are some notorious borderline cases including polywords, idioms, phrasal verbs, and compounds (cf. Pragglejaz Group 2007). Some of these problems come with their own solutions in the BNC, such as polywords, which are multi-word expressions like "of course" or "in fact" that are treated as single lexical items by the Part-of-Speech tagging programme in BNC-Baby; this is fine for MIP and our variant. Similarly, idioms consist of a number of distinct lexical units, which may be kept as such in the database, following the strategy of the Pragglejaz Group. Phrasal verbs and compounds are also split into their component words in BNC-Baby, each of them receiving a separate Part-of-Speech tag, but this does require an alternative treatment during metaphor analysis. This is because phrasal verbs and compounds function as single lexical units in our theoretical framework: they have a unitary, single conceptual and referential function in discourse, where they designate distinct entities, attributes, or relations.
Therefore we have given all phrasal verbs and compounds an additional annotation in our database, showing that they are single but complex lexical units, as opposed to all other lexical units that are simple. With one group of exceptions, all of these complex lexical units are typically conventional and can be found in the dictionary. The group of exceptions concerns novel compounds. Since they are novel, they are by definition not listed in the dictionary. We argue that this absence can also be taken to reflect their absence from the mental lexicon of the idealized contemporary language user, who therefore has to (a) analyze novel compounds into their component parts, (b) presumably activate the two related concepts and (c) set up some referential relation between them (e.g., Estes 2003). Novel compounds are therefore analyzed as comprising two lexical units, each of which will have to be judged for metaphorical usage in the regular way of MIP (cf. Giegerich 2004). They do count as single cases in our database, however, since they are new coinages. Thus, state-masonry is a novel
compound of which masonry has been coded as related to metaphor, while the whole word counts as one case in the sample. Another issue is the recognition of borderline cases. These have been explicitly marked up as such by the code WIDLII, When In Doubt, Leave It In. In our protocol, this code was assigned to those cases that, after initial independent annotation by one analyst, and subsequent online commenting by colleagues, were not speedily resolved by live group discussion between all analysts. Cases eventually marked as WIDLII represent the problematic cases in our data. Their annotation as controversial explicitly signals them as an interesting group for further research. Furthermore, a small group of lexical units (n = 401) were discarded for metaphor analysis because their contextual meaning was completely unclear. Almost all of these cases came from the conversation sample. They represent about 1% of the data in the conversation sample. Finally, all cases of for (n = 1384) and of (n = 4796) were treated as nonmetaphorical on the basis of the argument that they were delexicalised prepositions exhibiting a problematic distinction between basic and other senses. Together these two prepositions comprise 33.8% of all prepositions, and 3.3% of the entire data set. Tools The Macmillan English Dictionary for Advanced Learners (Rundell 2002) was the main tool we used for making decisions about lexical units, contextual meanings, basic meanings, and distinctness of contextual and basic meanings. The reasons for using this type of dictionary, and Macmillan in particular, are that they are recent and corpus-based (cf. Pragglejaz Group 2007). We also used a second dictionary to have a second opinion about specific types of problems, the Longman Dictionary of Contemporary English. An informal test at the beginning of the project, comparing about 100 lexical units, showed that there was no essential or systematic difference between the two dictionaries for our purposes. 
We therefore fixed Macmillan as our first dictionary, to be supplemented by Longman only in cases of doubt.
2.3. Details about the analysis
Analysts. All data were analyzed by the PhD students on the project. In the first year, these were Ewa Biernacka, Lettie Dorst, Anna Kaal, and Irene López Rodríguez. From the second year on, Berenike Herrmann and Tina Krennmayr replaced Ewa Biernacka and Irene López Rodríguez. The PhD students received training in MIPVU from the principal investigator.
Protocol. MIPVU is the basis of our identification procedure, but it should be seen in the context of our overall approach to the materials. We handled the texts according to the following protocol:
1. Excerpts were selected from BNC-Baby by the principal investigator and entered into an administrative database;
2. PhD students selected the excerpts assigned to them and produced an individual set of annotations; care was taken that all analysts saw materials from each register in order to attune them to differences between phenomena that had to be solved consistently with the same procedure;
3. The individual sets of annotations were posted on an intranet website for comments by the other PhD students;
4. The other PhD students went through the work of their colleagues and posted comments and queries;
5. All PhD researchers and the principal investigator had group meetings about the comments, referring to the details of the procedure and to previous decisions about specific cases, which had been recorded in a special lexical database; they made final verdicts about problematic cases, which were recorded;
6. The annotations in the individually analyzed files were subsequently corrected on the basis of the web version;
7. The final annotations were then stored in a separate folder;
8. Any decisions about problematic cases were recorded in a special lexical database, for future reference.
A slightly simplified example of a web version after discussion is presented below (from academic text AS6, fragment 01).
The essays in </mrw> this </mrw> book do not amount </mrw> to </mrw> a programme: but they are intended to provide a springboard </mrw> for <mrw type = met status = UNCERTAIN morph = n TEIform = seg> one </mrw>.
I think we should actually mark this deictic marker as well
3.2 one: I'm not sure, maybe only if the word it refers to is M; in this case it refers to programme, right? So not M because programme is not M? L
3.2.1 perhaps you are all right. not M. AIC
All text in italics is from the BNC-Baby text, with annotations added in angular brackets: the code mrw stands for Metaphor-Related Word, and includes indirect, direct, and implicit metaphor. The underlined comments inserted in between the annotated BNC-Baby fragment are queries posted by the individual analyst into the annotated document; they alert the other PhD students to potential problems and are meant to elicit discussion. Underneath the annotated text, new, numbered comments made by the other analysts can be found about specific lexical units. Comments are signed by the initial of the analyst who posts the comment. They are numbered by utterance number and responses to comments can be added by other members of the group, with further indentation, another number, and signature being added. In this case,
one comment can be seen, which uses M for metaphorical; the Analyst In Charge (AIC) positively responds to the comment.
Reliability before discussion. Five reliability tests were conducted throughout the entire period of annotation, to examine the extent of agreement between analysts when they had analyzed their materials independently of each other (before discussion). The smallest test included 713 words, the largest 1940, with a total number of 6659 cases in all tests together. Since the incidence of all of the special cases, such as direct metaphor, implicit metaphor, and WIDLII, was extremely low, both in the overall data and in the reliability tests, the analysis of the reliability data was only concerned with one type of classification: related to metaphor, or not. Error margins for the other phenomena were estimated in a different way (see troubleshooting below). An example from the materials and data is given below (from academic text CLP):
From[1234] the narrow[1234] accountancy viewpoint[1234], people are a cost[23] and it is desirable to keep[1234] this[1234] cost[2] as low[1234] as possible. In[1234] these[1234] terms[3] it is very difficult to justify[1], for example, sending[2] a member[134] of staff on[1234] a training[1] course[1234]. The training[1] requires expenditure and so also does the replacement for the person away[3]. Where[124] is the return[1234]? The return[1234] is actually in[1234] the improved human resource[23] but this[1234] is not readily measurable[2] in[1234] terms[3] which accountants use[1234].
The digits in brackets indicate which word was marked as related to metaphor by which analyst. Reliability was good (a full report is offered in Steen et al. 2010). Measured by Fleiss' kappa, which looks at agreement on a case-by-case basis, the mean value was about 0.85. On average, the analysts achieved unanimous agreement about the question whether a lexical unit is related to metaphor or not in some 92% of all cases.
This is unanimous agreement between four independent analysts, that is, before discussion. It also holds between two differently composed teams, with two analysts remaining constant. It should be noted that this is substantially higher than the figures for the Pragglejaz Group (2007), which moreover concerned a much smaller data set. Measurements of Cochran's Q, which looks at analyst bias while ignoring what happens between cases, were often significant. This suggests that, when working independently, one or two analysts often scored fewer or more items in all than the others, per test. This problem was subsequently alleviated by the overall protocol of analysis, as described above: most of the errors or too generous inclusions were filtered out. That this reduction is also dependent on group dynamics is acknowledged here, but it should also be appreciated that
the basis of our identification procedure lies in the reliable individual case-by-case analyses, as was shown by Fleiss' kappa. Therefore, what we are dealing with here is the further increase of consistency against the systematic and explicit set of instructions.
Troubleshooting after completion. When all of the metaphor data had been collected, a separate round of post-hoc troubleshooting was carried out (again, a detailed report can be found in Steen, Dorst et al. 2010). Selected samples of features that we had experienced as troublesome were inspected in order to judge how great the damage might be. Some systematic errors were detected and removed, and remaining margins of error were estimated. The upshot of this exercise is as follows:
1. For the prior identification of phrasal verbs, compounds, and polywords by the BNC, a margin of error of 0.3% should be taken into account.
2. One percent of all lexical units in the conversations has been discarded for metaphor analysis on account of their lack of intelligibility in context.
3. There may be a 20% error margin for the group of lexical units classified as metaphorical on the basis of WIDLII (When In Doubt, Leave It In).
4. For the class of lexical units flagging the presence of metaphor (MFlags), agreement was about 95%.
5. The error margin for classifying lexical units related to metaphor as direct expressions of metaphor was not separately examined since the behaviour of these words is closely connected to the behaviour of MFlags (see previous point).
6. The error margin for classifying lexical units related to metaphor as implicit expressions of metaphor was separately examined and led to a separate round in which we re-analyzed all potential cases in all of the data. We did so by checking all cases of a list of about 30 potentially cohesive words. This list included modal verbs, primary verbs, expressions such as one, another, and so on, and comprised about 16% of all data. We decided whether each token of these types was indeed used for cohesion or not, and if it was, whether its cohesive use was implicitly metaphorical. Reliability estimates between pairs of raters of truly cohesive use of these potentially cohesive devices in a test sample of over 2,000 words yielded kappas of on average 0.79. For all samples of written text, agreement about the subsequent decision for implicitly metaphorical use between four analysts was 100%, but for conversation it was substantially lower. On the basis of this test, more explicit instructions were formulated, all data were divided by register, and then re-analyzed by one analyst each. A final sample of about 1000 cases per register was analyzed by the principal investigator, and this led to the same reliability results between an individual analyst and the principal investigator. In all, then, reliability of implicit metaphor is roughly equal to reliability for indirect metaphor.
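The two agreement statistics used in the reliability tests can be made concrete with a small worked example. The sketch below is in Python, chosen here purely for illustration. It recomputes Fleiss' kappa and Cochran's Q for the first sentence only of the CLP excerpt quoted earlier, reading off the analyst digits attached to each word. Treating every orthographic word as one lexical unit is a simplifying assumption, and the names `SENTENCE`, `fleiss_kappa` and `cochran_q` are hypothetical; the outcome is not one of the reported test values.

```python
# Analysts (1-4) who marked each word of the first CLP sentence as related to
# metaphor; "" means nobody did. One word = one lexical unit is an assumption.
SENTENCE = [
    ("From", "1234"), ("the", ""), ("narrow", "1234"), ("accountancy", ""),
    ("viewpoint", "1234"), ("people", ""), ("are", ""), ("a", ""),
    ("cost", "23"), ("and", ""), ("it", ""), ("is", ""), ("desirable", ""),
    ("to", ""), ("keep", "1234"), ("this", "1234"), ("cost", "2"),
    ("as", ""), ("low", "1234"), ("as", ""), ("possible", ""),
]
ANALYSTS = "1234"

# Binary rating matrix: one row per word, one column per analyst (1 = MRW).
ratings = [[1 if a in marks else 0 for a in ANALYSTS] for _, marks in SENTENCE]


def fleiss_kappa(matrix):
    """Fleiss' kappa for binary ratings (rows = items, columns = raters)."""
    n_items, n_raters = len(matrix), len(matrix[0])
    # Per-item agreement: proportion of rater pairs that agree on the item.
    p_items = []
    for row in matrix:
        yes = sum(row)
        no = n_raters - yes
        p_items.append((yes * yes + no * no - n_raters)
                       / (n_raters * (n_raters - 1)))
    p_bar = sum(p_items) / n_items
    # Chance agreement from the overall proportions of "yes" and "no".
    p_yes = sum(sum(row) for row in matrix) / (n_items * n_raters)
    p_exp = p_yes ** 2 + (1 - p_yes) ** 2
    return (p_bar - p_exp) / (1 - p_exp)


def cochran_q(matrix):
    """Cochran's Q for binary ratings; refer to chi-square with k - 1 df."""
    k = len(matrix[0])
    col_totals = [sum(row[j] for row in matrix) for j in range(k)]
    row_totals = [sum(row) for row in matrix]
    total = sum(row_totals)
    numerator = (k - 1) * (k * sum(c * c for c in col_totals) - total ** 2)
    denominator = k * total - sum(r * r for r in row_totals)
    return numerator / denominator


kappa = fleiss_kappa(ratings)
q = cochran_q(ratings)
# Share of words on which all four analysts gave the same verdict.
unanimous = sum(1 for row in ratings if sum(row) in (0, len(row))) / len(ratings)
print(f"Fleiss' kappa = {kappa:.3f}, Cochran's Q = {q:.2f}, "
      f"unanimous = {unanimous:.1%}")
```

On this single sentence the sketch yields a kappa of roughly 0.87 and unanimous agreement on 19 of the 21 words, which is in the same range as the reported means. Q would be compared with a chi-square distribution with three degrees of freedom (one fewer than the number of analysts).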
2.4. Preparation of final database
After all annotated files had been corrected for errors discovered during the stage of troubleshooting, they were converted into an SPSS database. Separated lexical units that needed to be treated as single units (compounds, phrasal verbs, and polywords) were collapsed into single cases. All contractions such as he'd for he would were treated as two distinct lemmas in the SPSS database. By contrast, all separate POS-tags for genitive 's or simply ' have been ignored as separate cases in the statistical analyses. The total number of cases (lexical units) that remain in the SPSS database is 186,688.
3. Results
All lexical units were annotated for the categories that are functionally most important when it comes to the variable relation to metaphor: (a) non-Metaphor-Related Words (non-MRWs), (b) Metaphor-Related Words (MRWs), and (c) words that function as Metaphor Flags (MFlags). A subdivision was made between clear cases of MRWs and those cases that were doubtful but included in order to keep a broad scope (coded as WIDLII, When In Doubt, Leave It In). We will now first report the overall division of the data across these four categories. Of all 186,688 lexical units in the sample, 161,105 (86.3%) are not related to metaphor. It is not true that a cognitive linguistic approach to metaphor turns everything into metaphor, a complaint sometimes heard when examples of this approach are offered to novice and lay audiences. By contrast, when we include the borderline cases called WIDLII, a total of 25,442 lexical units (13.6%) are related to metaphor. In other words, our corpus analysis suggests that on average one in every seven and a half words is related to metaphor. Finally, 141 lexical units, or less than 0.1%, function as metaphor flags. This class includes signals such as like and as, and suggests that less than one in a thousand words functions as a signal for metaphor.
Since MFlags typically signal similes and comparable figures, this points to the extremely low frequency of these rhetorical figures in comparison with the number of all metaphor-related words. The division between clear cases of metaphor versus doubtful cases of metaphor yields a figure of 1,831 cases classified as WIDLII, out of a total of 25,442 MRWs. This is about 7.2% of all cases marked as related to metaphor. Intuitively this looks like an acceptable band of borderline cases in as complex a field as metaphor identification. When we take into account the 20% error margin calculated for the WIDLII category as such, which was reported in Section 2, the true maximum value of this band of borderline cases may be estimated to be in fact slightly higher, amounting to some 8 or 9 percent. In the overall picture, however, the entire group of uncertain cases constitutes just less than one percent of all of the 186,688 lexical units that have been analyzed. The set of lexical units marked as WIDLII therefore seems to constitute a valid if small group of borderline cases that may warrant further investigation in the future.
Separate investigation showed that, of all lexical units in our data, 1.6% (2990 cases) is complex. This includes all polywords marked as such by BNC-Baby, which in fact comprise almost half of all complex lexical units in our data (1458 cases). The other half includes phrasal verbs and compounds, which did not come with a prior code in BNC-Baby but could be identified as such on the basis of the dictionary. In all, however, multi-word units defined in the way we have done, with respect to their distinct referential role in the discourse, are infrequent. Since the set of complex lexical units only comprises 1.6% of all data, we will ignore this distinction for the rest of this analysis.
We can now establish the relation of the four metaphor categories with the four registers. A two-way frequency table is presented in Table 1, which crosses the variables of register (with the levels of conversation, fiction, news, and academic texts) with relation to metaphor (with the levels of not related to metaphor, WIDLII, related to metaphor, and metaphor flag). For each cell, we present observed counts, expected counts, row percentages, and standardized residuals; the latter are most important for interpreting the findings: when they exceed a positive or negative value of 2.58, they suggest that the observed frequency significantly deviates from the count expected by mere chance at alpha = 0.01.
A chi-square analysis shows that there is a significant association between the two variables (χ²(9) = 3,044, p < 0.001; Cramér's V = 0.07, p < 0.001). The four registers indeed vary with regard to frequency of MRWs. The academic sample has the highest proportion of clear MRWs (17.5%), conversation the lowest (6.8%), and news (15.3%) and fiction (10.9%) are in between. The distribution of the non-MRWs exhibits the mirror image of this pattern: conversation has the highest number of non-metaphorical words, then fiction and news, and the academic sample has the lowest proportion of non-metaphor related words. These two complementary patterns account for 99% of all data. They show that there is an interaction between relation to metaphor and register, to the effect that academic and news texts are rather metaphorical in terms of density of metaphor-related words, whereas fiction and conversation are not. The interaction does not involve the distribution of the doubtful cases of metaphor labelled WIDLII. These do not appear to interact with register: with
Table 1. Lexical units in relation to metaphor, divided by register

                               Relation to metaphor
Register      Measure          Non-MRW     Unclear MRW   Clear MRW   MFlag    Total
                                           (WIDLII)
Academic      Obs Count        40,174      496           8,624       20       49,314
              Exp Count        42,556.2    483.7         6,236.9     37.2     49,314
              % in register    81.5%       1.0%          17.5%       0.0%     100%
              Std Residual     -11.5       0.6           30.2        -2.8
News          Obs Count        37,413      488           6,854       37       44,792
              Exp Count        38,653.9    439.3         5,665.0     33.8     44,792
              % in register    83.5%       1.1%          15.3%       0.1%     100%
              Std Residual     -6.3        2.3           15.8        0.5
Fiction       Obs Count        39,281      410           4,883       74       44,648
              Exp Count        38,529.6    437.9         5,646.8     33.7     44,648
              % in register    88.0%       0.9%          10.9%       0.2%     100%
              Std Residual     3.8         -1.3          -10.2       6.9
Conversation  Obs Count        44,237      437           3,250       10       47,934
              Exp Count        41,365.3    470.1         6,062.4     36.2     47,934
              % in register    92.3%       0.9%          6.8%        0.0%     100%
              Std Residual     14.1        -1.5          -36.1       -4.4
Total         Obs Count        161,105     1,831         23,611      141      186,688
              Exp Count        161,105     1,831         23,611      141      186,688
              % in register    86.3%       1.0%          12.7%       0.1%     100%
alpha at 0.01 because of the large number of observations, the unclear cases are evenly distributed across the four registers in relation to the number of nonmetaphorical words, metaphorical words, and MFlags. In this way the WIDLIIs appear to fill out almost all of the final percent of the classified data missing from the picture after examination of the 99% represented by clear MRWs and non-MRWs. The last metaphor category, MFlags, does contribute substantially to the overall interaction between relation to metaphor and register. This is true even though it is extremely small in absolute numbers. MFlags has one insignificant interaction with register, for news, but all other registers interact in significant ways with it, with academic texts ending just under alpha at 0.01. Given the association between MFlags and simile, this suggests that simile and related figures are not evenly distributed across academic texts, fiction, and conversation. The pattern of distribution for MFlags is radically different to the one for the contrast between the two categories of non-MRWs and MRWs, which are so dominant. For non-MRWs and MRWs, the corpus appears to be divided into two opposite sets of texts that are more metaphorical (academic and news)
versus less metaphorical (fiction and conversation). For MFlags, by contrast, there are fewer metaphor signals in academic texts and in conversation than may be expected by chance, while there are many more in fiction than may be expected by chance. All of these tendencies are significant at the 0.01 level. News is the only register where the observed number of metaphor flags corresponds with the expectations based on chance. MFlags are presumably related to the group of direct metaphors; the findings for MFlags hence naturally lead on to an examination of the distribution of the three types of metaphor which we have distinguished in our analysis: indirect metaphor, which is the classic case of metaphorically used words; direct metaphor, which is typically represented by simile; and implicit metaphor, which is expressed by substitution or ellipsis (see Section 2.2 above). We have collapsed all clear and unclear cases for the following analysis. The question is therefore how the three types of metaphor (indirect, direct, and implicit) are distributed across the four registers. Since the overall distribution of metaphor-related words and non-metaphor-related words is uneven, the relation between the three metaphor categories on the one hand and the group of words that are not related to metaphor on the other needs to be included: the proportion between metaphorical versus non-metaphorical word use may exert an effect on preferences for the type of metaphor chosen. For instance, if a text has 80% non-metaphorical words, it has more space for the expression of metaphor than if it has 95% non-metaphorical words; the increase in space might privilege the use of explicit and extended metaphorical comparisons (direct metaphor) whereas the reduction in space might boost the use of implicit metaphor. These are just theoretical, basically numerical possibilities, but they motivate the inclusion of non-metaphor related words in the analysis.
A two-way frequency table was therefore constructed that crosses the variables of metaphor type (with four levels: non-metaphor related, indirect, direct, and implicit) and register (with four levels: conversation, fiction, news, and academic texts). The category of non-metaphor related words also includes the cases treated as metaphor flags in the previous analysis (Table 1). Table 2 offers details about observed and expected frequencies, percentages of observations per register, and standardized residuals. No cells have an expected frequency lower than 5, which means that a chi-square analysis is allowed. It shows that there is a significant association between the two variables (χ²(9) = 3,045, p < 0.001; Cramér's V = 0.07, p < 0.001). All observed frequencies are significantly different from the expected frequencies at alpha = 0.01, with the exception of implicit metaphor for news and fiction. Indirect metaphor is the predominant group. It is responsible for the complementary pattern of distribution between metaphor-related words and non-metaphor related words observed in Table 1. Indirect metaphor accounts
Table 2. Types of lexical metaphor, divided by register

                               Relation to metaphor
Register      Measure          Non-MRW     Indirect MRW  Direct MRW  Implicit MRW  Total
Academic      Obs Count        40,192      8,961         40          121           49,314
              Exp Count        42,592.9    6,555.5       88.8        76.9          49,314
              % in register    81.5%       18.2%         0.1%        0.2%          100%
              Std Residual     -11.6       29.7          -5.2        5.0
News          Obs Count        37,450      7,145         112         85            44,792
              Exp Count        38,687.2    5,954.3       80.6        69.8          44,792
              % in register    83.6%       16.0%         0.3%        0.2%          100%
              Std Residual     -6.3        15.4          3.5         1.8
Fiction       Obs Count        39,355      5,074         165         54            44,648
              Exp Count        38,562.9    5,935.2       80.4        69.6          44,648
              % in register    88.1%       11.4%         0.4%        0.1%          100%
              Std Residual     4.0         -11.2         9.4         -1.9
Conversation  Obs Count        44,247      3,637         19          31            47,934
              Exp Count        41,401.0    6,372.0       86.3        74.7          47,934
              % in register    92.3%       7.6%          0.0%        0.1%          100%
              Std Residual     14.0        -34.3         -7.2        -5.1
Total         Obs Count        161,244     24,817        336         291           186,688
              Exp Count        161,244     24,817        336         291           186,688
              % in register    86.4%       13.3%         0.2%        0.2%          100%
for 13.3% of all data, whereas implicit metaphor and direct metaphor each comprise 0.2%. Of all the 25,444 metaphor-related words together, implicit and direct metaphor each comprise a little over 1%. Implicit metaphor follows the main pattern of indirect metaphor. It is most frequent in academic discourse, followed by news and fiction, and least frequent in conversation. The standardized residuals show that these differences are significant only for overuse in academic discourse and underuse in conversation, though. Direct metaphor does not follow this pattern. It is overused, relatively speaking, in fiction and news, but underused in conversations as well as in academic texts. All of these observations are significantly deviant from what is expected by chance. As expected, the pattern observed here for direct metaphor corresponds with the above findings for MFlags. In fiction, the overuse of direct metaphor corresponds with an overuse of metaphor flags. In conversations and in academic writing, direct metaphor and metaphor flags are not used as much as might be expected. And in news, direct metaphor is used more often than could be predicted on the basis of chance alone, and it is signalled fairly often, but not as much as might be expected.
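The test statistics reported for Tables 1 and 2 follow from the observed counts alone. As a hedged sketch, the following recomputes the expected counts, standardized residuals, chi-square and Cramér's V for both tables, and the share of each metaphor type among all metaphor-related words. Python is used purely for illustration (the data themselves were stored in an SPSS database), and the names `TABLE_1`, `TABLE_2` and `chi_square_summary` are hypothetical.

```python
import math

# Observed counts per register (rows: academic, news, fiction, conversation).
# Table 1 columns: non-MRW, unclear MRW (WIDLII), clear MRW, MFlag.
TABLE_1 = [
    [40174, 496, 8624, 20],
    [37413, 488, 6854, 37],
    [39281, 410, 4883, 74],
    [44237, 437, 3250, 10],
]
# Table 2 columns: non-MRW, indirect MRW, direct MRW, implicit MRW.
TABLE_2 = [
    [40192, 8961, 40, 121],
    [37450, 7145, 112, 85],
    [39355, 5074, 165, 54],
    [44247, 3637, 19, 31],
]


def chi_square_summary(observed):
    """Return expected counts, std residuals, chi-square, df and Cramér's V."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    # Expected count under independence: row total * column total / N.
    expected = [[r * c / n for c in col_totals] for r in row_totals]
    # Standardized (Pearson) residual: (observed - expected) / sqrt(expected).
    residuals = [
        [(o - e) / math.sqrt(e) for o, e in zip(obs_row, exp_row)]
        for obs_row, exp_row in zip(observed, expected)
    ]
    chi2 = sum(r * r for row in residuals for r in row)
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    smaller_dim = min(len(row_totals) - 1, len(col_totals) - 1)
    cramers_v = math.sqrt(chi2 / (n * smaller_dim))
    return expected, residuals, chi2, df, cramers_v


for name, table in (("Table 1", TABLE_1), ("Table 2", TABLE_2)):
    expected, residuals, chi2, df, v = chi_square_summary(table)
    print(f"{name}: chi-square({df}) = {chi2:.0f}, Cramér's V = {v:.2f}")
    for row in residuals:
        print("  " + "  ".join(f"{r:6.1f}" for r in row))

# Share of each metaphor type among all metaphor-related words (Table 2).
indirect, direct, implicit = [sum(col) for col in zip(*TABLE_2)][1:]
mrw_total = indirect + direct + implicit
shares = {
    "indirect": indirect / mrw_total,
    "direct": direct / mrw_total,
    "implicit": implicit / mrw_total,
}
print({kind: f"{share:.1%}" for kind, share in shares.items()})
```

Run as is, the sketch should reproduce the reported chi-square values of about 3,044 and 3,045 with nine degrees of freedom and a Cramér's V of 0.07. The signs of the residuals show the direction of each deviation, and the final line shows that direct and implicit metaphor each make up a little over 1% of the 25,444 metaphor-related words.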
The general conclusion is that direct and implicit metaphor clearly each have their own role to play in the make-up of the metaphorical register profiles that emerge in this section. Both are fairly small categories, each constituting a little more than one percent of all lexical units related to metaphor. However, implicit metaphor exhibits a different pattern than direct metaphor: implicit metaphor is comparable to indirect metaphor, whereas direct metaphor is distributed in a way of its own. We shall now turn to an interpretation of these findings.
4. Discussion
The availability of a reliable method for metaphor identification in discourse has opened up new possibilities for research on metaphor in usage. We manually annotated four samples of natural discourse from the British National Corpus, with a total number of 187,971 words, for metaphor. We were able to show that 13.6% of all words were related to metaphor. In addition, the number of signals for similes, analogies, comparisons, and other explicitly flagged forms of metaphor was low: signals such as like comprised less than one pro mille of all lexical units in the data. This figure is related to the number of words we found in similes and other alternative forms of metaphor, which was about two pro mille. What is more, the distribution of these phenomena across the four registers in our study (academic texts, news texts, fiction, and conversation) has revealed unexpected and complex patterns. There are two stories. The first story is about the most encompassing pattern, involving 99% of the metaphorical data; it concerns the classic case of metaphorically used words.
Thus, when people use terms like defend, attack, position, strategy, manoeuvre and so on to communicate about discussions and argument, they may be said to use these terms metaphorically: according to cognitive linguists, they do so on the basis of some underlying comparison (cross-domain mapping) between the concrete war senses of these words and their abstract argument senses (argument is war, Lakoff and Johnson 1980). The concrete sense is called the basic sense, the use of which is assumed to be direct, whereas the abstract sense is the metaphorical sense, which is indirect (Pragglejaz Group 2007). These words are hence used indirectly to evoke (by non-literal comparison) another referent than the one designated by their basic meaning (Lakoff 1986, 1993; Gibbs 1993, 1994). This type of metaphor we therefore label as indirect metaphor. Odd as this may seem, this is only one story. For there is a second story, about the small group of direct metaphors, which are expressed as similes, figurative analogies, comparisons, and extended variants of all of these. This group is characterized by the typical presence of a signal for comparison and
by the fact that its source domains are directly expressed as such in the discourse. When Shakespeare asks 'Shall I compare thee to a summer's day?', we have a direct expression of a metaphorical comparison which includes direct reference to the source domain of the summer's day (Steen and Gibbs 2004). In utterances about defending or attacking a position, however, the presumed source domain is evoked indirectly, via the contrast between the appropriate contextual sense of argumentation and the more basic sense of war or physical conflict. Moreover, a signal for comparison is typically lacking. The group of direct metaphor is small but functionally important. It forces addressees to access source domains as relatively autonomous semantic or conceptual spaces. What is more, it exhibits a radically different pattern of distribution in discourse than indirect metaphor. In particular, there is a split between fiction and news which use direct metaphor most often, while academic texts and conversation seem to avoid it. This pattern is in much greater agreement with popular assumptions about the relation between register on the one hand and metaphorical meaning on the other than the pattern reported just now for classic, indirect metaphor (which however comprises the bulk of all cases). We propose to interpret the distinction between these two patterns with reference to another distinction that has been neglected in the study of metaphor in discourse, the one between deliberate and non-deliberate metaphor (Steen 2008, in press), as we shall now explain. Direct metaphor, such as simile, is a class of metaphor characterized by a particular linguistic form: in referential terms, it directly evokes a source domain. As a result, it may also be argued to have a typical communicative property, that it is deliberate: it intentionally and explicitly instructs addressees to set up a cross-domain comparison between the referents designated by the words in the discourse.
This is precisely what Shakespeare's first line of Sonnet XVIII quoted above asks us to do: we have to set up a referent that pertains to the 'you', and another referent that has to do with a summer's day. It is virtually impossible to interpret such direct expressions of metaphor in discourse without using some form of comparative processing between two relatively independent referents, and without postulating some intention on the part of the assumed sender to use metaphor as metaphor. Direct metaphor, including simile, is almost by definition deliberate. Indirect metaphor, by contrast, does not necessarily have this communicative feature of deliberateness. Indeed, it has been one of the main tenets of contemporary metaphor research that indirect metaphor is mostly used unconsciously, that is, without any awareness of its dependence on cross-domain mapping. Some psycholinguists have even argued that indirect metaphor may also be resolved by categorization instead of comparison (Gentner and Bowdle 2001, 2008; Glucksberg 2001, 2008), especially when it is conventional, a characteristic which holds for an estimated 99% of all of our cases of indirect
metaphor. Another processing option, which has received less attention but is equally feasible, is that much indirect metaphor is resolved in even more shallow fashion, by lexical disambiguation, and does not get to any stage of conceptual mapping at all, whether by comparison or by categorization (Steen 1994, 2008; Giora 2003). Indirect metaphor may therefore be well resolvable without comparison or cross-domain mapping. This, in turn, produces a paradox of metaphor: most metaphor may typically not be processed metaphorically (Steen 2008). Indirect metaphor consequently may either be used non-deliberately, which we think is the typical case, or deliberately, which also happens. When indirect metaphor is used deliberately, and is hence meant to trigger processing by comparison, it typically alerts addressees in some way that they are meant to process the metaphorical expression as a cross-domain mapping. This may be done, for instance, by using marked constructions such as an 'A is B' format, as in Karl Marx's 'Religion is the opium of the people'. Other constructions may involve the use of a set of words from a source domain within a number of phrases, or even an extended comparison across utterances. All of these considerations about the deliberate versus non-deliberate use of metaphor have to do with their communicative function, which is a discourse aspect of metaphor that has been neglected in many current cognitive-scientific models of metaphor (Steen 2007, 2008, in press). These ideas may account for the contrasting register profiles for indirect versus direct metaphor mentioned above. People's general experience of fiction as typically metaphorical may be based on their association of fiction with deliberate metaphor, typically in the form of simile (e.g., Lodge 1977; Semino and Steen 2008), which may have impinged on their awareness more often than in any other register.
By contrast, the avoidance of such figures in academic writing may also have registered in their minds, and contributed to the overall idea that scientific writing is non-metaphorical. That this in fact is not true is shown by our findings for indirect metaphor: from that perspective, academic writing is heavily metaphorical, and this may have to do with the abstract nature of the topics of many scientific texts. These indirect metaphors in academic texts, however, are not typically deliberate but mostly automatic and unconscious, which is why they probably do not register as metaphorical. To be sure, metaphors are also used deliberately in academic discourse, for instance in the form of extended analogies and comparisons; but apparently they are not used as often as may be assumed, and even when they are used, they may not be experienced as very typical of academic writing. These proposals can be connected to existing work on the nature of distinct registers such as Biber et al. (1999), and elaborated and interpreted in the framework of a three-dimensional discourse model of metaphor which makes a distinction between metaphor in language, thought, and communication (Steen 2008, in
press): direct and indirect metaphor are forms of expression in language of assumed (but for the moment unspecified) cross-domain mappings in thought which may or may not be deliberately used as specific rhetorical devices in communication. In this framework, another small group of metaphor in usage, implicit metaphor, may also be given its own place. It is like indirect metaphor, but does not display any lexical manifestation of a source domain. Instead, this is replaced by substituting lexis, or left altogether unexpressed. In either case, such utterances are metaphorical on the grounds of underlying metaphorical propositions that need to be reconstructed when language users process the surface text and produce a conceptual model of the discourse. How such expressions are used in communication needs further study. Metaphor in usage is a complex affair. It may be manifested by three different types of metaphor (indirect metaphor, direct metaphor and implicit metaphor), which display widely differing frequencies. The bulk of metaphor in usage involves indirect metaphor, or metaphorically used words, that are not signalled by a metaphor flag such as 'like'. Moreover, the distribution of these different classes of metaphor interacts with different registers, as we have shown for academic discourse, news, fiction and conversation. These interactions yield complex register profiles for metaphor that may be sensibly interpreted with reference to the linguistic forms, conceptual structures and communicative functions of metaphor in usage.

5. Conclusion

In this article we have approached metaphor in usage as a research problem that may benefit from a methods-driven approach. We have attempted to show that metaphor can be reliably identified in large-scale linguistic research without having to resort to assumptions about conventionalized conceptual metaphors. 
We have also raised the question whether the experimental focus on metaphor versus simile in psycholinguistic processing models of metaphor is justified by the distribution of metaphor and simile in natural language use. And we have finally queried whether metaphor is evenly used in the same ways across different varieties of language, or whether a usage-based approach needs a sociolinguistic component next to a cognitive-linguistic one. Our method-driven research has been inspired by the cognitive-linguistic approach but we do not agree with all of its typical assumptions (Steen 2007). In particular, we do not postulate a one-on-one connection between the metaphorical use of words such as 'defend' and 'attack' on the one hand and presumably underlying conceptual metaphors such as ARGUMENT IS WAR on the
other. In fact, we agree with various critics of the cognitive linguistic approach (e.g., Vervaeke and Kennedy 1996; Ritchie 2004) that the relation between metaphor-related words in language and cross-domain mappings in thought requires much further study. It has precisely been one of the aims of the present research programme to make available a new resource for the systematic and large-scale study of the connection between metaphor in language and thought. This is possible because we do theoretically define metaphor at the conceptual level, as a cross-domain mapping in conceptual structure. We also agree that metaphor in language can eventually be seen as a reflection of metaphor in thought, albeit perhaps in many different ways. We have consequently examined the expression in language of metaphor in thought as a relatively independent phenomenon, with the help of a lexical-semantic as opposed to conceptual analysis of word use, along the lines developed and tested by the Pragglejaz Group (2007). Our own variant of this approach, called MIPVU, has enabled us to achieve a uniquely high level of reliability in annotating words that are expressions of metaphor, which we refer to as metaphor-related words (Steen et al. 2010). Our objective in focussing on the linguistic aspect of metaphor in usage was to show that our variant of MIP (Pragglejaz Group 2007), MIPVU, works and, what is more, is even more reliable than MIP itself. We also demonstrated that it is possible to be maximally explicit and systematic about issues that have to do with the demarcation of lexical units. And MIPVU finally caters to other forms of metaphor than just metaphorically used words, the focus of MIP. The latter are a form of indirect language use, as noted above, but our approach can also identify other forms of metaphor (such as simile) which are typically direct. 
All of these are general linguistic variations on the cognitive linguistic view of metaphor in usage, measured at the level of lexical units only. The bulk of metaphor in language in our data is indeed of the kind illustrated by words like 'defend' and 'attack' in the context of argumentation. This is the typical example of metaphor employed in cognitive linguistic analyses of metaphor, and it involves the indirect use of words which on other occasions display direct, presumably more basic senses that have to do with physical conflict or even war. This indirect, metaphorical use of words consequently requires some form of non-literal comparison for semantic interpretation. This type of linguistic form of metaphor comprises up to 99% of all metaphor in discourse in our data. One more detailed question of our research had to do with the number of 'A is (like) B' expressions, which have formed the basis of the experimental work reported by psycholinguists. Casual observation suggests that these formats are rather rare and marked in authentic usage, but we have now been able to
offer more precise estimates of their incidence, which will be of help when we aim to evaluate the contributions by psycholinguistic research to the overall picture of metaphor in usage. This revaluation has led to new questions about the adequacy of current two-dimensional models of metaphor in usage that limit their attention to metaphor in language and thought; in our view, the communicative function of metaphor as a deliberate or non-deliberate rhetorical device is essential in accounting for the presumed processes of cross-domain mapping and other types of processing attending the processing of metaphor versus simile. Another more detailed question that we have posed is: is there a clear difference between the registers of academic texts, conversation, fiction and news texts when it comes to their degree of metaphorical meaning? For instance, is it true that fiction, being literary, is the most metaphorical of the four? And are academic texts, being scientific, indeed the least metaphorical? These two traditional views of the relation between metaphor and register have now been relativized and made more precise by our usage data. We have found that the distribution of indirect metaphor across academic texts, conversation, fiction and news texts is not even. In fact, the uneven distribution of metaphorically used words does not follow the most natural expectations that might be entertained about the relation between metaphor and these four registers. For instance, it is not the case that metaphor occurs most often in fiction, and least often in academic texts. Instead, it turns out that academic texts have a high metaphor density, with no less than 18.6% of their lexical units being used in a metaphorical way (clear and less clear cases combined); news texts are comparable (with 16.4%), whereas fiction has only 11.8%, and conversation a mere 7.7%. Again, these are patterns for indirect metaphor, accounting for 99% of all of our metaphorical data. 
The picture becomes more complex when direct and implicit metaphor are included. These are new findings which enrich the grammatical picture of register variation presented by Biber et al. (1999) with an important lexical-semantic component. In all, then, new questions can be formulated for the study of metaphor in usage. Cognitive linguistic approaches to metaphor have set the scene for some ground-breaking work since the 1980s, but current work can go beyond that framework. In making these advances, productive collaboration can be sought with other linguists, including corpus linguists, as well as with more behaviourally oriented scholars of metaphor in psycholinguistics and sociolinguistics. Their joint effort may help to bring out more details about the way metaphor works in usage.

Received 25 May 2009
Revision received 21 May 2010
VU University Amsterdam
Appendix A
Overview of annotated files from BNC-Baby

Academic

File ID  Words in file  Divisions in file  Division coded  Lexical units
A6U      27,329         6                  2               2,814
ACJ      37,678         2                  1               4,189
ALP      25,632         4                  1               2,253
AMM      39,563         2                  2               3,866
AS6      30,938         4                  1               3,366
AS6      id             id                 2               2,840
B17      34,305         3                  2               1,608
B1G      38,559         2                  2               3,006
CLP      40,742         2                  1               3,368
CLW      38,714         1                  1               3,748
CRS      40,250         3                  1               2,044
CTY      34,131         5                  3               3,434
EA7      25,531         3                  3               2,771
ECV      40,343         7                  5               3,847
EW1      41,695         2                  1               3,708
FEF      26,854         4                  3               2,703
Total    522,264        NA                 NA              49,561
Conversation

File ID  Words in file  Divisions in file  Division coded  Lexical units
KB7      103,997        60                 10              3,072
KB7      id             id                 31              3,161
KB7      id             id                 45              2,830
KB7      id             id                 48              2,983
KBC      31,337         13                 13              3,641
KBD      58,087         25                 7               3,124
KBD      id             id                 21              2,779
KBH      47,995         63                 1               436
KBH      id             id                 2               1,227
KBH      id             id                 3               165
KBH      id             id                 4               1,838
KBH      id             id                 9               714
KBH      id             id                 41              616
KBJ      11,137         26                 17              1,083
KBP      27,179         15                 9               2,666
KBW      115,332        62                 4               1,712
KBW      id             id                 9               1,351
KBW      id             id                 11              1,670
KBW      id             id                 17              2,295
KBW      id             id                 42              2,655
KCC      5,311          2                  02              836
KCF      21,898         30                 14              1,305
KCU      49,751         9                  02              3,347
KCV      32,714         50                 42              2,495
Total    504,738        NA                 NA              48,001
Fiction

File ID  Words in file  Divisions in file  Division coded  Lexical units
AB9      42,247         8                  3               4,221
AC2      37,662         10                 6               3,045
BMW      42,584         9                  9               4,584
BPA      37,769         19                 14              2,920
C8T      41,117         2                  1               2,877
CB5      41,727         2                  2               2,818
CCW      40,408         4                  3               2,083
CCW      id             id                 4               1,958
CDB      38,169         6                  2               2,703
CDB      id             id                 4               1,907
FAJ      42,500         23                 17              4,058
FET      35,526         7                  1               4,222
FPB      41,894         1                  1               4,119
G0L      43,292         1                  1               3,377
Total    484,895        NA                 NA              44,892
News

File ID  Words in file  Divisions in file  Division coded  Lexical units
A1E      9,916          17                 1               584
A1F      8,909          20                 6               87
A1F      id             id                 7               269
A1F      id             id                 8               111
A1F      id             id                 9               223
A1F      id             id                 10              178
A1F      id             id                 11              222
A1F      id             id                 12              62
A1G      10,242         31                 26              405
A1G      id             id                 27              593
A1H      3,108          6                  5               935
A1H      id             id                 6               724
A1J      13,981         40                 33              813
A1J      id             id                 34              605
A1K      1,905          3                  2               1,012
A1L      1,849          2                  1               1,074
A1M      4,910          5                  1               1,113
A1N      14,770         49                 9               698
A1N      id             id                 18              812
A1P      2,595          4                  1               647
A1P      id             id                 3               653
A1U      4,198          5                  4               1,892
A1X      3,322          5                  3               145
A1X      id             id                 4               194
A1X      id             id                 5               279
A2D      1,042          7                  5               1,039
A31      3,492          3                  3               699
A36      6,173          9                  7               546
A38      3,254          3                  1               756
A39      2,355          3                  1               257
A3C      8,522          13                 5               1,031
A3E      1,858          4                  2               233
A3E      id             id                 3               778
A3K      3,500          11                 11              1,227
A3M      3,007          6                  2               887
A3P      8,032          14                 9               947
A4D      3,167          4                  2               1,246
A5E      5,411          8                  6               1,080
A7S      5,414          8                  3               848
A7T      8,720          16                 1               951
A7W      25,255         55                 1               1,734
A7Y      10,862         9                  3               895
A80      10,608         26                 15              585
A8M      3,595          7                  2               313
A8N      12,014         19                 19              653
A8R      6,735          7                  2               851
A8U      8,816          18                 14              832
A98      6,769          12                 3               593
A9J      3,705          2                  1               1,505
AA3      9,084          15                 8               757
AHB      17,314         52                 51              901
AHC      39,523         82                 60              1,116
AHC      id             id                 61              1,097
AHD      4,236          10                 6               303
AHE      1,236          3                  3               315
AHF      27,457         73                 24              1,202
AHF      id             id                 63              1,311
AHL      2,552          5                  2               447
AJF      6,472          14                 7               669
AL0      5,143          9                  6               532
AL2      9,361          50                 16              410
AL2      id             id                 23              413
AL5      2,523          5                  3               827
Total    356,912        NA                 NA              45,116
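As a reading aid, the register totals in Appendix A can be combined with the metaphor densities quoted in the Conclusion (18.6% academic, 16.4% news, 11.8% fiction, 7.7% conversation) to get a rough sense of scale. The sketch below is only illustrative: the function names are hypothetical, and the derived counts are approximations that assume the rounded densities apply to the full register samples. They are not figures reported in the article.

```python
"""Rough per-register estimates of metaphor-related lexical units.

Inputs are the register totals from Appendix A and the densities quoted in
the Conclusion; the derived counts are illustrative, not reported results.
"""

# (total lexical units in data, reported metaphor density in percent)
REGISTERS = {
    "academic": (49_561, 18.6),
    "news": (45_116, 16.4),
    "fiction": (44_892, 11.8),
    "conversation": (48_001, 7.7),
}


def estimated_metaphor_units(total_units: int, density_percent: float) -> int:
    """Approximate count of metaphorically used lexical units.

    Assumption: the rounded density applies to the whole register sample.
    """
    return round(total_units * density_percent / 100)


def rank_by_density(registers: dict) -> list:
    """Register names ordered from highest to lowest metaphor density."""
    return sorted(registers, key=lambda name: registers[name][1], reverse=True)


if __name__ == "__main__":
    for name in rank_by_density(REGISTERS):
        total, density = REGISTERS[name]
        estimate = estimated_metaphor_units(total, density)
        print(f"{name:<13}{density:>5.1f}%  ~{estimate:,} of {total:,} lexical units")
```

Ranking by density rather than by estimated count keeps the comparison independent of the slightly different sample sizes of the four registers.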
References
Biber, Douglas, Stig Johansson, Geoffrey Leech, Susan Conrad, and Edward Finegan. 1999. The Longman grammar of spoken and written English. London: Longman.
Bowdle, Brian F., and Gentner, Dedre. 2005. The career of metaphor. Psychological Review 112(1). 193–216.
Caballero, María d. R. 2006. Re-viewing space: Figurative language in architects' assessment of built space. Berlin: Mouton de Gruyter.
Cameron, Lynne. 2003. Metaphor in educational discourse. London and New York: Continuum.
Cameron, Lynne, and Low, Graham (eds.). 1999. Researching and applying metaphor. Cambridge: Cambridge University Press.
Charteris-Black, Jonathan. 2004. Corpus approaches to critical metaphor analysis. London: Palgrave Macmillan.
Chilton, Paul. 1996. Security metaphors: Cold war discourse from containment to common house. New York: Peter Lang.
Cienki, Alan, and Müller, Cornelia (eds.). 2008. Metaphor and gesture. Amsterdam and Philadelphia: John Benjamins.
Deignan, Alice. 2005. Metaphor and corpus linguistics. Amsterdam and Philadelphia: John Benjamins.
Estes, Zach. 2003. Attributive and relational processes in nominal combination. Journal of Memory and Language 48. 304–319.
Eubanks, Philip. 2000. A war of words in the discourse of trade: The rhetorical constitution of metaphor. Carbondale and Edwardsville: Southern Illinois University Press.
Geeraerts, Dirk. 2005. Lectal variation and empirical data in Cognitive Linguistics. In Francisco J. Ruiz de Mendoza Ibáñez and M. Sandra Peña Cervel (eds.), Cognitive linguistics: Internal dynamics and interdisciplinary interaction (pp. 163–189). Berlin and New York: Mouton de Gruyter.
Gentner, Dedre, and Bowdle, Brian F. 2001. Convention, form, and figurative language processing. Metaphor and Symbol 16(3 and 4). 223–248.
Gentner, Dedre, and Bowdle, Brian F. 2008. Metaphor as structure-mapping. In Raymond W. Gibbs, Jr. (ed.), The Cambridge handbook of metaphor and thought (pp. 109–128). New York: Cambridge University Press.
Gibbs, Raymond W., Jr. 1993. Process and products in making sense of tropes. In Andrew Ortony (ed.), Metaphor and thought: Second edition (pp. 252–276). Cambridge: Cambridge University Press.
Gibbs, Raymond W., Jr. 1994. The poetics of mind. Cambridge: Cambridge University Press.
Gibbs, Raymond W., Jr. 2006. Embodiment and cognitive science. New York: Cambridge University Press.
Gibbs, Raymond W., Jr. (ed.). 2008. The Cambridge handbook of metaphor and thought. New York: Cambridge University Press.
Giegerich, Hans J. 2004. Compound or phrase? English noun-plus-noun constructions and the stress criterion. English Language and Linguistics 8(1). 1–24.
Giora, Rachel. 2003. On our mind: Salience, context, and figurative language. New York: Oxford University Press.
Glucksberg, Sam. 2001. Understanding figurative language: From metaphors to idioms. Oxford and New York: Oxford University Press.
Glucksberg, Sam. 2008. How metaphors create categories – quickly. In Raymond W. Gibbs, Jr. (ed.), The Cambridge handbook of metaphor and thought (pp. 67–83). New York: Cambridge University Press.
Glucksberg, Sam, and Haught, Catherine. 2006. On the relation between metaphor and simile: When comparison fails. Mind and Language 21(3). 360–378.
Goatly, Andrew. 1997. The language of metaphors. London: Routledge.
Jackendoff, Ray, and Aaron, David. 1991. Review article of Lakoff and Turner's More than cool reason. Language 67(2). 320–338.
Koller, Veronika. 2004. Metaphor and gender in business media discourse: A critical cognitive study. Basingstoke and New York, NY: Palgrave Macmillan.
Kövecses, Zoltán. 2005. Metaphor in culture: Universality and variation. Cambridge and New York: Cambridge University Press.
Kristiansen, Gitte, and Dirven, René (eds.). 2008. Cognitive sociolinguistics: Language variation, cultural models, social systems. Berlin/New York: Mouton de Gruyter.
Lakoff, George. 1986. A figure of thought. Metaphor and Symbolic Activity 1(3). 215–225.
Lakoff, George. 1993. The contemporary theory of metaphor. In Andrew Ortony (ed.), Metaphor and thought: Second edition (pp. 202–251). Cambridge: Cambridge University Press.
Lakoff, George, and Johnson, Mark. 1980. Metaphors we live by. Chicago: Chicago University Press.
Lakoff, George, and Johnson, Mark. 1999. Philosophy in the flesh: The embodied mind and its challenge to western thought. New York: Basic Books.
Lodge, David J. 1977. The modes of modern writing. London: Arnold.
McIntosh, Ed (ed.). 1974. The concise Oxford dictionary of current English. London: Book Club Associates.
Müller, Cornelia. 2008. Metaphors dead and alive, sleeping and waking: A dynamic view. Chicago: University of Chicago Press.
Musolff, Andreas. 2004. Metaphor and political discourse: Analogical reasoning in debates about Europe. Houndmills, Basingstoke: Palgrave Macmillan.
Musolff, Andreas, and Zinken, Jürgen (eds.). 2009. Metaphor and discourse. Basingstoke and New York, NY: Palgrave Macmillan.
Pragglejaz Group. 2007. MIP: A method for identifying metaphorically used words in discourse. Metaphor and Symbol 22(1). 1–39.
Ritchie, David. 2004. Common ground in metaphor theory: Continuing the conversation. Metaphor and Symbol 19(3). 233–244.
Rundell, M. (ed.). 2002. Macmillan English dictionary for advanced learners. Oxford: Macmillan Publishers.
Semino, Elena. 2008. Metaphor in discourse. Cambridge: Cambridge University Press.
Semino, Elena, Heywood, John, and Short, Mick H. 2004. Methodological problems in the analysis of metaphors in a corpus of conversations about cancer. Journal of Pragmatics 36(7). 1271–1294.
Semino, Elena, and Steen, Gerard J. 2008. Metaphor in literature. In Raymond W. Gibbs, Jr. (ed.), The Cambridge handbook of metaphor and thought (pp. 232–246). New York: Cambridge University Press.
Shore, Brad. 1996. Culture in mind: Cognition, culture, and the problem of meaning. Oxford: Oxford University Press.
Sperber, Dan, and Wilson, Deirdre. 2008. A deflationary account of metaphor. In Raymond W. Gibbs, Jr. (ed.), The Cambridge handbook of metaphor and thought (pp. 84–105). New York: Cambridge University Press.
Steen, Gerard J. 1994. Understanding metaphor in literature: An empirical approach. London: Longman.
Steen, Gerard J. 1999. From linguistic to conceptual metaphor in five steps. In Raymond W. Gibbs, Jr. and Gerard J. Steen (eds.), Metaphor in cognitive linguistics (pp. 57–77). Amsterdam: John Benjamins.
Steen, Gerard J. 2007. Finding metaphor in grammar and usage: A methodological analysis of theory and research. Amsterdam: John Benjamins.
Steen, Gerard J. 2008. The paradox of metaphor: Why we need a three-dimensional model of metaphor. Metaphor and Symbol 23(4). 213–241.
Steen, Gerard J. 2009. From linguistic metaphor to conceptual structure in five steps: Analyzing metaphor in poetry. In Geert Brône and Jeroen Vandaele (eds.), Cognitive poetics: Goals, gains and gaps (pp. 197–226). Berlin: Walter de Gruyter.
Steen, Gerard J. In press. When is metaphor deliberate? In Nils-Lennart Johannesson, Christina Alm-Arvius and David C. Minugh (eds.), Selected papers from the Stockholm 2008 Metaphor Festival. Stockholm: Acta Universitatis Stockholmiensis.
Steen, Gerard J., Dorst, Aletta G., Herrmann, J. Berenike, Kaal, Anna, Krennmayr, Tina, and Pasma, Trijntje. 2010. A method for linguistic metaphor identification: From MIP to MIPVU. Amsterdam/Philadelphia: John Benjamins.
Steen, Gerard J., and Gibbs, Raymond W., Jr. 2004. Questions about metaphor in literature. European Journal of English Studies 8(4). 337–354.
Stefanowitsch, Anatol, and Gries, Stefan T. (eds.). 2006. Corpus-based approaches to metaphor and metonymy. Berlin, New York: Mouton de Gruyter.
Vervaeke, John, and Kennedy, John M. 1996. Metaphors in language and thought: Falsification and multiple meanings. Metaphor and Symbol 11(3). 273–284.
Wierzbicka, Anna. 1986. Metaphors linguists live by: Lakoff and Johnson contra Aristotle. Papers in Linguistics 19(2). 287–313.

